2.1.5

Following the task requirements, fill in the correct code in the blanks below.

Dataset description
File name: 健康咨询客户数据集.csv
| Timestamp | Your name | Your gender | Your age | How important is exercise to you ? | How do you describe your current level of fitness ? | How often do you exercise? | What barriers, if any, prevent you from exercising more regularly? (Please select all that apply) | What form(s) of exercise do you currently participate in ? (Please select all that apply) | Do you exercise ___________ ? | What time if the day do you prefer to exercise? | How long do you spend exercising per day ? | Would you say you eat a healthy balanced diet ? | What prevents you from eating a healthy balanced diet, If any? (Please select all that apply) | How healthy do you consider yourself? | Have you ever recommended your friends to follow a fitness routine? | Have you ever purchased a fitness equipment? | What motivates you to exercise? (Please select all that applies ) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2019/07/03 11:48:07 PM GMT+5:30 | Parkavi | Female | 19 to 25 | 2 | Good | Never | I don't have enough time;I can't stay motivated | I don't really exercise | I don't really exercise | Early morning | I don't really exercise | Not always | Ease of access to fast food;Temptation and cravings | 3 | Yes | No | I'm sorry ... I'm not really interested in exercising |
| 2019/07/03 11:51:22 PM GMT+5:30 | Nithilaa | Female | 19 to 25 | 4 | Very good | Never | I don't have enough time;I'll become too tired | Walking or jogging;Swimming | With a group | Early morning | I don't really exercise | Not always | Ease of access to fast food;Temptation and cravings | 4 | Yes | No | I want to be fit;I want to be flexible;I want to relieve stress;I'm sorry ... I'm not really interested in exercising |
| 2019/07/03 11:56:28 PM GMT+5:30 | Karunya v | Female | 15 to 18 | 3 | Good | 1 to 2 times a week | I can't stay motivated | Walking or jogging | Alone | Early morning | 30 minutes | Not always | Temptation and cravings | 4 | Yes | Yes | I want to be fit |
| 2019/07/04 5:43:35 AM GMT+5:30 | Anusha | Female | 15 to 18 | 4 | Good | 3 to 4 times a week | I don't have enough time | Walking or jogging;Gym;Lifting weights | Alone | Evening | 1 hour | Yes | Temptation and cravings | 4 | Yes | No | I want to be fit;I want to lose weight |
| 2019/07/04 5:44:29 AM GMT+5:30 | Nikkitha | Female | 19 to 25 | 3 | Unfit | Never | I can't stay motivated | I don't really exercise | I don't really exercise | Evening | I don't really exercise | Yes | Ease of access to fast food;Temptation and cravings | 4 | Yes | No | I want to be fit |
| 2019/07/04 6:23:37 AM GMT+5:30 | Girija | Female | 40 and above | 5 | Average | 3 to 4 times a week | I exercise regularly with no barriers | Walking or jogging;Yoga | With a group | Evening | 1 hour | Not always | Temptation and cravings | 3 | Yes | No | I want to be flexible |
| 2019/07/04 6:33:21 AM GMT+5:30 | Srinivasan | Male | 40 and above | 3 | Good | 1 to 2 times a week | I don't really enjoy exercising | Walking or jogging | Alone | Early morning | 30 minutes | No | Temptation and cravings | 3 | No | No | I want to be flexible |
| 2019/07/04 7:40:51 AM GMT+5:30 | Ranjani | Female | 15 to 18 | 3 | Unfit | Never | I can't stay motivated;I don't really enjoy exercising | Walking or jogging | Alone | Early morning | I don't really exercise | Not always | Temptation and cravings | 2 | Yes | Yes | I want to be fit;I'm sorry ... I'm not really interested in exercising |
| 2019/07/04 8:06:17 AM GMT+5:30 | Bupesh R | Male | 19 to 25 | 5 | Unfit | 3 to 4 times a week | I don't have enough time;I can't stay motivated;I'll become too tired;I don't really enjoy exercising | Gym;Team sport | With a friend | Evening | 1 hour | No | Temptation and cravings | 2 | Yes | No | I want to be fit;I want to increase muscle mass and strength;I want to lose weight |
| 2019/07/04 8:09:02 AM GMT+5:30 | Sudhan | Male | 15 to 18 | 5 | Very good | Everyday | I don't have enough time;I exercise regularly with no barriers | Gym | With a friend | Early morning | 1 hour | Not always | Ease of access to fast food | 3 | Yes | Yes | I want to be fit;I want to lose weight;I want to relieve stress |
551 rows in total; only the first 10 are shown.
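Two traits of the preview affect any cleaning code: the `Your age` values are range labels such as `19 to 25`, and the multi-select answers are joined with semicolons. The sketch below illustrates both on four rows copied from the preview. Taking the lower bound of each age range is only one possible workaround and is an assumption, not something the exercise prescribes.

```python
import pandas as pd

MOTIVATION_COL = "What motivates you to exercise? (Please select all that applies )"

# Four rows copied from the preview, restricted to two columns.
sample = pd.DataFrame({
    "Your age": ["19 to 25", "15 to 18", "15 to 18", "40 and above"],
    MOTIVATION_COL: [
        "I'm sorry ... I'm not really interested in exercising",
        "I want to be fit",
        "I want to be fit;I want to lose weight",
        "I want to be flexible",
    ],
})

# Range labels are not numbers, so coercion yields NaN for every row.
as_numbers = pd.to_numeric(sample["Your age"], errors="coerce")
print(as_numbers.isna().sum())

# Possible workaround (assumption): keep the first number of each label.
lower_bound = sample["Your age"].str.extract(r"(\d+)", expand=False).astype(int)
print(lower_bound.tolist())

# Split the semicolon-joined answers so each choice is counted on its own.
motivation_counts = sample[MOTIVATION_COL].str.split(";").explode().value_counts()
print(motivation_counts)
```

If the age ranges are better treated as categories, leaving the column as text and encoding it later may be the simpler choice.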
Fill in the code
import pandas as pd

# Load the dataset
data = ______

# Show basic information about the table structure
print(______)

# Show the number of missing values in each column
print(______)

# Drop rows that contain missing values
data_cleaned = ______

# Convert the 'Your age' column to an integer type and handle invalid values
data_cleaned.loc[:, 'Your age'] = ______(______, errors='coerce')
data_cleaned = data_cleaned.dropna(subset=['Your age'])
data_cleaned = data_cleaned[data_cleaned['Your age'] >= 0]
data_cleaned.loc[:, 'Your age'] = data_cleaned['Your age'].______
print(data_cleaned['Your age'].dtype)

# Check for and remove duplicate rows
duplicates_removed = data_cleaned.duplicated().sum()
data_cleaned = ______
print(f"Removed {duplicates_removed} duplicate rows")

from sklearn.preprocessing import LabelEncoder

# Encode the 'How do you describe your current level of fitness ?' column as integer labels
label_encoder = LabelEncoder()
data_cleaned[______] = ______
print(data_cleaned['How do you describe your current level of fitness ?'].unique())

from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt

# Strip leading and trailing whitespace from the column names
data.columns = data.columns.str.strip()

# Show the dataset's column names
print(data.columns)

# Drop rows with a missing exercise frequency
data_cleaned = data.dropna(subset=['How often do you exercise?'])

# Count how often each exercise frequency occurs
exercise_frequency_counts = data_cleaned['How often do you exercise?'].value_counts()

# Draw a pie chart
plt.figure(figsize=(10, 6))
______(autopct='%1.1f%%', startangle=90, colors=plt.cm.Paired.colors)
plt.title('Distribution of Exercise Frequency')
plt.ylabel('')
plt.show()

import pandas as pd
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Fill missing values with each column's mode
data_filled = data.apply(lambda x: x.fillna(x.mode()[0]))

# Split the data (the test set is 20%)
train_data, test_data = ______(______, random_state=42)

# Save the processed data
cleaned_file_path = '______'
______(______, index=False)
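One way the blanks might be filled is sketched below. It is not the official answer. To stay self-contained it replaces `pd.read_csv` with a small made-up frame whose ages are plain numbers. With the range labels shown in the preview, `pd.to_numeric(..., errors='coerce')` would turn every age into NaN and the following `dropna` would remove all rows. The output path is hypothetical, and `plot.pie` is only one candidate for the blank before `(autopct=...)`. Columns are assigned directly rather than through `.loc[:, ...]` so that the dtype actually changes.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

FITNESS_COL = "How do you describe your current level of fitness ?"
FREQ_COL = "How often do you exercise?"

# Made-up stand-in for pd.read_csv('健康咨询客户数据集.csv').
data = pd.DataFrame({
    "Your name": ["A", "B", "C", "D", "E", "E", "F"],
    "Your age": ["21", "17", "unknown", "-3", "45", "45", None],
    FITNESS_COL: ["Good", "Unfit", "Good", "Average",
                  "Very good", "Very good", "Good"],
    FREQ_COL: ["Never", "Everyday", "Never", "Never",
               "1 to 2 times a week", "1 to 2 times a week", "Never"],
})

data.info()                 # table structure (info() prints by itself)
print(data.isnull().sum())  # missing values per column

# Drop rows with any missing value.
data_cleaned = data.dropna().copy()

# Convert ages to integers; unparseable or negative entries are dropped.
data_cleaned["Your age"] = pd.to_numeric(data_cleaned["Your age"], errors="coerce")
data_cleaned = data_cleaned.dropna(subset=["Your age"])
data_cleaned = data_cleaned[data_cleaned["Your age"] >= 0].copy()
data_cleaned["Your age"] = data_cleaned["Your age"].astype(int)
print(data_cleaned["Your age"].dtype)

# Count, then remove, duplicate rows.
duplicates_removed = int(data_cleaned.duplicated().sum())
data_cleaned = data_cleaned.drop_duplicates()
print(f"Removed {duplicates_removed} duplicate rows")

# Replace the fitness labels with integer codes.
label_encoder = LabelEncoder()
data_cleaned[FITNESS_COL] = label_encoder.fit_transform(data_cleaned[FITNESS_COL])
print(data_cleaned[FITNESS_COL].unique())

# Pie chart of exercise frequency, drawn from the unencoded data.
exercise_frequency_counts = data.dropna(subset=[FREQ_COL])[FREQ_COL].value_counts()
plt.figure(figsize=(10, 6))
exercise_frequency_counts.plot.pie(autopct="%1.1f%%", startangle=90,
                                   colors=plt.cm.Paired.colors)
plt.title("Distribution of Exercise Frequency")
plt.ylabel("")
plt.close()  # a notebook would call plt.show() here

# Fill missing values with each column's mode, then split 80/20.
data_filled = data.apply(lambda x: x.fillna(x.mode()[0]))
train_data, test_data = train_test_split(data_filled, test_size=0.2, random_state=42)

# Hypothetical output location; the exercise leaves the path blank.
cleaned_file_path = os.path.join(tempfile.gettempdir(), "cleaned_health_data.csv")
data_filled.to_csv(cleaned_file_path, index=False)
```

Which frame the exercise expects to be saved is not stated. The sketch writes `data_filled`, but saving `data_cleaned` would be equally plausible.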