Linear Regression

2017-12-14  E_H_I_P

Linear regression is one of the simplest algorithms in machine learning. In this article we will explore the algorithm and implement it in Python.

Two variants: simple linear regression and multiple linear regression.

Simple Linear Regression

Model representation:

y = b0 + b1 * x

where b0 is the intercept and b1 is the slope of the fitted line.

Implementing the model in code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20.0,10.0)

data = pd.read_csv('headbrain.csv')
#print(data.shape)
#print(data.head())

#Collecting X and Y
X = data['Head Size(cm^3)'].values
Y = data['Brain Weight(grams)'].values

#Mean X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)

m = len(X)

# Least-squares estimates:
# b1 = sum((x - mean_x)*(y - mean_y)) / sum((x - mean_x)^2),  b0 = mean_y - b1*mean_x
numer = 0
denom = 0
for i in range(m):
    numer += (X[i]-mean_x)*(Y[i]-mean_y)
    denom += (X[i]-mean_x)**2
b1 = numer / denom
b0 = mean_y - (b1*mean_x)
print(b1,b0)

max_x = np.max(X) + 100
min_x = np.min(X) - 100

x = np.linspace(min_x,max_x,1000)
y = b0 + b1*x

plt.plot(x,y,color='#58b907',label='Regression line')
plt.scatter(X,Y,c='#ef5432', label='Scatter Plot')

plt.xlabel('Head Size in cm3')
plt.ylabel('Brain Weight in gram')
plt.legend()
plt.show()

The code above has been tested and runs.
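The explicit loop above can also be written in vectorized NumPy. Here is a minimal sketch of the same closed-form fit; since headbrain.csv is not bundled with this article, it uses synthetic data with an assumed slope and intercept purely for illustration:

```python
import numpy as np

# Synthetic stand-in for headbrain.csv (assumed coefficients, for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(2500, 4500, size=200)            # head size, cm^3
Y = 0.26 * X + 325 + rng.normal(0, 50, size=200) # brain weight, grams, plus noise

# Vectorized closed-form least squares (same formulas as the loop version)
mean_x, mean_y = X.mean(), Y.mean()
b1 = np.sum((X - mean_x) * (Y - mean_y)) / np.sum((X - mean_x) ** 2)
b0 = mean_y - b1 * mean_x
print(b1, b0)

# np.polyfit with degree 1 solves the same problem and should agree
b1_ref, b0_ref = np.polyfit(X, Y, 1)
assert np.isclose(b1, b1_ref) and np.isclose(b0, b0_ref)
```

The vectorized form avoids the Python-level loop and makes the formulas read almost exactly like the math.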

Evaluating the Model

Two methods: root mean square error (RMSE) and the coefficient of determination (R² score).

Root mean square error:

RMSE = sqrt( (1/m) * Σᵢ (yᵢ - ŷᵢ)² )

where ŷᵢ is the predicted output value.

# RMSE to evaluate models:
rmse = 0
for i in range(m):
    y_pred = b0 + b1 * X[i]
    rmse += (Y[i] - y_pred) ** 2
rmse = np.sqrt(rmse/m)
print(rmse)

Coefficient of determination (R² score):

R² = 1 - SS_res / SS_tot

where SS_res = Σ(yᵢ - ŷᵢ)² is the residual sum of squares and SS_tot = Σ(yᵢ - ȳ)² is the total sum of squares.

#Coefficient of Determination(R^2 Score):
ss_t = 0
ss_r = 0
for i in range(m):
    y_pred = b0 + b1 * X[i]
    ss_t += (Y[i] - mean_y) ** 2
    ss_r += (Y[i] - y_pred) ** 2
r2 = 1 - (ss_r / ss_t)
print(r2)
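Both evaluation loops can likewise be collapsed into a few vectorized lines. A small self-contained sketch on toy data (the values below are made up for illustration; headbrain.csv is not included here):

```python
import numpy as np

# Toy data roughly on the line y = 2x (illustrative, not the headbrain dataset)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit with the same closed-form least squares as in the article
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

# Vectorized RMSE and R^2
y_pred = b0 + b1 * X
rmse = np.sqrt(np.mean((Y - y_pred) ** 2))
ss_r = np.sum((Y - y_pred) ** 2)      # residual sum of squares
ss_t = np.sum((Y - Y.mean()) ** 2)    # total sum of squares
r2 = 1 - ss_r / ss_t
print(rmse, r2)
```

Because the toy data is nearly linear, R² comes out close to 1 and the RMSE is small; on real data these metrics quantify how much variance the line explains and the typical size of its prediction error.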