2.1.1

Following the task requirements, fill in the correct code in the blanks below.

Dataset description

File name: auto-mpg.csv

mpg	cylinders	displacement	horsepower	weight	acceleration	model year	origin	car name
18	8	nan	130	3504.0	12.0	70	1	chevrolet chevelle malibu
15	8	350.0	165	3693.0	11.5	70	1	buick skylark 320
18	8	318.0	150	3436.0	11.0	70	1	plymouth satellite
16	8	304.0	150	3433.0	12.0	70	1	amc rebel sst
17	8	302.0	140	3449.0	10.5	70	1	ford torino
15	8	429.0	198	4341.0	10.0	70	1	ford galaxie 500
14	8	454.0	220	4354.0	9.0	70	1	chevrolet impala
14	8	440.0	215	4312.0	8.5	70	1	plymouth fury iii
14	8	455.0	225	nan	10.0	70	1	pontiac catalina
15	8	390.0	190	3850.0	8.5	70	1	amc ambassador dpl

398 rows in total; only the first 10 are shown.

Code fill-in

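import pandas as pd

# Load the dataset and display its first five rows (1 point)
data = ____
print("First five rows of the dataset:")
print(____)

# Display the data type of each column
print(data.dtypes)

# Check for missing values and drop the rows that contain them (2 points)
print("\nMissing values:")
print(____)
data = ____

# Convert the 'horsepower' column to a numeric type and drop the rows whose values fail to convert (1 point)
data['horsepower'] = ____(data['horsepower'], errors='coerce')
data = ____

# Display the data type of the 'horsepower' column
print(data.horsepower.dtypes)

# Check for missing values after cleaning
print("\nMissing values after cleaning:")
print(data.isnull().sum())

from sklearn.preprocessing import StandardScaler

# Standardize the numerical features (1 point)
numerical_features = ['displacement', 'horsepower', 'weight', 'acceleration']
scaler = StandardScaler()
data[numerical_features] = ____

from sklearn.model_selection import train_test_split

# Select the features, the independent variables, and the target variable (2 points)
selected_features = ____
X = ____
y = ____

# Split the dataset into a training set and a test set (training set is 80%) (1 point)
X_train, X_test, y_train, y_test = ____(____, random_state=42)

# Combine the features and the target variable into one DataFrame
cleaned_data = X.copy()
cleaned_data['mpg'] = y

# Save the cleaned and processed data (without storing an extra index column) (1 point)
____('2.1.1_cleaned_data.csv', ____)

# Print a message indicating that the file has been saved
print("\nCleaned data saved to 2.1.1_cleaned_data.csv")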
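One possible way to fill the blanks is sketched below. It is not an official answer key. To stay self-contained, it reads the ten preview rows from an in-memory string rather than from auto-mpg.csv, so `SAMPLE_CSV` and the use of `io.StringIO` are assumptions made for illustration. The choice of `selected_features` is also an assumption, since the exercise does not say which columns to keep.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: the ten preview rows shown above, rewritten as comma-separated text.
# With the real file, this would be: data = pd.read_csv("auto-mpg.csv")
SAMPLE_CSV = """mpg,cylinders,displacement,horsepower,weight,acceleration,model year,origin,car name
18,8,nan,130,3504.0,12.0,70,1,chevrolet chevelle malibu
15,8,350.0,165,3693.0,11.5,70,1,buick skylark 320
18,8,318.0,150,3436.0,11.0,70,1,plymouth satellite
16,8,304.0,150,3433.0,12.0,70,1,amc rebel sst
17,8,302.0,140,3449.0,10.5,70,1,ford torino
15,8,429.0,198,4341.0,10.0,70,1,ford galaxie 500
14,8,454.0,220,4354.0,9.0,70,1,chevrolet impala
14,8,440.0,215,4312.0,8.5,70,1,plymouth fury iii
14,8,455.0,225,nan,10.0,70,1,pontiac catalina
15,8,390.0,190,3850.0,8.5,70,1,amc ambassador dpl
"""

# Load the data and show the first five rows and the column types.
data = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(data.head())
print(data.dtypes)

# Count missing values per column, then drop every row that has one.
print(data.isnull().sum())
data = data.dropna()

# Coerce 'horsepower' to numeric: entries that cannot be parsed become NaN,
# and the second dropna() removes those rows.
data["horsepower"] = pd.to_numeric(data["horsepower"], errors="coerce")
data = data.dropna()

# Standardize the numerical columns to zero mean and unit variance.
numerical_features = ["displacement", "horsepower", "weight", "acceleration"]
scaler = StandardScaler()
data[numerical_features] = scaler.fit_transform(data[numerical_features])

# Assumption: keep every numeric column except the target as a feature.
selected_features = ["cylinders", "displacement", "horsepower", "weight",
                     "acceleration", "model year", "origin"]
X = data[selected_features]
y = data["mpg"]

# 80% training, 20% test, with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Recombine features and target, then save without the index column.
cleaned_data = X.copy()
cleaned_data["mpg"] = y
cleaned_data.to_csv("2.1.1_cleaned_data.csv", index=False)
```

Passing `index=False` to `to_csv` is what keeps the row index out of the saved file. Two of the ten preview rows contain `nan`, so on this sample eight rows survive the cleaning step.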