2.2.3 健身分析随机森林与XGBoost回归

请根据题目要求，在下方空白处填入正确的代码（点击 💡 按钮查看提示）

数据集说明

文件名：fitness analysis.csv

Timestamp	Your name	Your gender	Your age	How important is exerc...
2019/07/03 11:48:...	Parkavi	Female	19 to 25	2
2019/07/03 11:51:...	Nithilaa	Female	19 to 25	4
2019/07/03 11:56:...	Karunya v	Female	15 to 18	3
2019/07/04 5:43:3...	Anusha	Female	15 to 18	4
2019/07/04 5:44:2...	Nikkitha	Female	19 to 25	3
2019/07/04 6:23:3...	Girija	Female	40 and above	5
2019/07/04 6:33:2...	Srinivasan	Male	40 and above	3
2019/07/04 7:40:5...	Ranjani	Female	15 to 18	3
2019/07/04 8:06:1...	Bupesh R	Male	19 to 25	5
2019/07/04 8:09:0...	Sudhan	Male	15 to 18	5

共 545 条数据，仅展示前 10 条

代码填空

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
import pickle
from sklearn.metrics import mean_squared_error, r2_score
import xgboost as xgb
# 加载数据集
df = 
# 显示前五行数据
print()
# 去除所有字符串字段的前后空格
df = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)
# 检查和清理列名
df.columns = df.columns.str.strip()
# 选择相关特征进行建模
X = df[['Your gender', 'How important is exercise to you ?', 'How healthy do you consider yourself?']]
X = (X)  # 将分类变量转为数值变量
# 将年龄段转为数值变量
y = (lambda x: int(x.split(' ')[0]))  # 假设年龄段为整数
# 将数据集划分为训练集和测试集（测试集占比20%）
X_train, X_test, y_train, y_test = (, random_state=42)
# 创建随机森林回归模型（创建的决策树的数量为100）
rf_model = (, random_state=42)
# 训练随机森林回归模型
# 保存训练好的模型
with open('2.2.3_model.pkl', 'wb') as model_file:
    pickle.
# 进行结果预测
y_pred = 
results_df = pd.DataFrame(y_pred, columns=['预测结果'])
results_df.to_csv('2.2.3_results.txt', index=False)
# 使用测试工具对模型进行测试，并记录测试结果
train_score =    #训练集分数
test_score =     #测试集分数
mse =   #均方误差
r2 =   #决定系数
with open('2.2.3_report.txt', 'w') as report_file:
    report_file.write(f'训练集得分: {train_score}\n')
    report_file.write(f'测试集得分: {test_score}\n')
    report_file.write(f'均方误差(MSE): {mse}\n')
    report_file.write(f'决定系数(R^2): {r2}\n')
# 运用工具分析算法中错误案例产生的原因并进行纠正
# 初始化XGBoost回归模型（构建100棵树）
xgb_model = (, random_state=42)
# 训练XGBoost回归模型
# 使用XGBoost回归模型在测试集上进行结果预测
y_pred_xgb = 
results_df_xgb = pd.DataFrame(y_pred_xgb, columns=['预测结果'])
results_df_xgb.to_csv('2.2.3_results_xgb.txt', index=False)
with open('2.2.3_report_xgb.txt', 'w') as xgb_report_file:
    xgb_report_file.write(f'XGBoost训练集得分: {}\n')
    xgb_report_file.write(f'XGBoost测试集得分: {}\n')
    xgb_report_file.write(f'XGBoost均方误差(MSE): {}\n')
    xgb_report_file.write(f'XGBoost决定系数(R^2): {)}\n')