Machine Learning: Linear Regression (Python)

2017-09-28 songcmic

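After studying machine learning for a while, I had been looking for a chance to implement an algorithm in code. Because I was still not very familiar with Python, I first learned the basics of the language and the simple usage of a few packages. Now I have finally, with some stumbling, written a linear regression implementation.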

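Linear regression

I will not review the theory behind the algorithm here, since I studied it some time ago.

First, import the packages we need: pandas, numpy, and matplotlib, used for reading and writing data, processing data, and visualizing data, respectively.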

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Next, implement the following cost function:
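For reference, the code that follows matches the usual squared-error cost for linear regression. Written out, assuming $m$ training examples and the hypothesis $h_\theta(x) = \theta^T x$ (this notation is an assumption, not taken from the original post):

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
```

In the code, `len(X)` plays the role of $m$.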

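def computeCost(X, y, theta):
    # X*theta is a matrix product because X (m, n) and theta (n, 1) are np.matrix
    res = np.power((X*theta) - y, 2)
    # Half the mean squared error over the m examples
    return np.sum(res) / (2*len(X))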
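As a quick check of the cost function, here is a minimal sketch on a tiny made-up dataset (the three points and the name `compute_cost` are illustrative assumptions, not from the original). It uses plain NumPy arrays with the `@` operator instead of `np.matrix`, which NumPy's documentation no longer recommends for new code.

```python
import numpy as np

def compute_cost(X, y, theta):
    """Half the mean squared error of the linear model X @ theta."""
    residual = X @ theta - y          # shape (m, 1)
    return float(np.sum(residual ** 2) / (2 * len(X)))

# Made-up data lying exactly on the line y = x; first column is the intercept term
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [2.0], [3.0]])

cost_zero = compute_cost(X, y, np.zeros((2, 1)))            # (1 + 4 + 9) / 6
cost_exact = compute_cost(X, y, np.array([[0.0], [1.0]]))   # perfect fit
print(cost_zero, cost_exact)
```

With all-zero parameters the cost is the sum of squared targets divided by `2m`; with the exact parameters it drops to zero.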

Preprocess the data so that it matches what the functions expect.

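# `data` is assumed to be an already loaded DataFrame with columns "Population" and "Profit"
# Prepend a column of ones so that theta[0] acts as the intercept
data.insert(0, 'ones', 1)

# Select by integer position: features (ones, Population) and target (Profit)
X = data.iloc[:, 0:2]
y = data.iloc[:, 2:3]

# Convert to np.matrix so that * means matrix multiplication
X = np.matrix(X.values)
y = np.matrix(y.values)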
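To see what this preprocessing produces, here is a small sketch on a made-up three-row DataFrame (the numbers are invented for illustration; only the column names `Population` and `Profit` come from the code in this post). It uses `.iloc` for positional selection and plain arrays via `to_numpy()`.

```python
import numpy as np
import pandas as pd

# Invented sample rows; the real dataset is not reproduced here
data = pd.DataFrame({"Population": [6.1, 5.5, 8.5],
                     "Profit": [17.6, 9.1, 13.7]})

# Column of ones so the first parameter acts as the intercept
data.insert(0, "ones", 1)

X = data.iloc[:, 0:2].to_numpy(dtype=float)   # columns: ones, Population
y = data.iloc[:, 2:3].to_numpy(dtype=float)   # column: Profit, kept 2-D

print(X.shape, y.shape)
```

Keeping `y` two-dimensional (one column) means `X @ theta - y` has matching shapes when `theta` is a column vector.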

Then implement gradient descent, whose update rule is as follows:
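The update rule implemented below can be written out as follows, assuming learning rate $\alpha$, $m$ training examples, and $h_\theta(x) = \theta^T x$ (notation assumed here, chosen to match the code's `alpha` and `theta`). All $\theta_j$ are updated simultaneously from the previous values:

```latex
\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```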

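def gradientDecrease(X, y, theta, alpha, iters):
    temp = np.matrix(np.zeros(theta.shape))  # holds the updated parameters
    cost = np.zeros(iters)                   # cost after each iteration
    para = theta.shape[0]                    # number of parameters
    for i in range(iters):
        # Residuals use the old theta, so all parameters update simultaneously
        error = X*theta - y
        for j in range(para):
            term = np.multiply(error, X[:,j])
            temp[j,0] = theta[j,0] - (alpha/len(X))*np.sum(term)
        theta = temp.copy()  # copy so theta and temp stay separate objects
        cost[i] = computeCost(X, y, theta)
    return theta, cost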
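The inner loop over `j` can also be expressed as a single matrix product. The following is a sketch of that vectorized variant, not the code used in this post; the function name `gradient_descent`, the synthetic data, and the hyperparameters are assumptions chosen so the result is easy to verify.

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, iters):
    """Batch gradient descent for linear regression using plain ndarrays."""
    m = len(X)
    cost = np.zeros(iters)
    for i in range(iters):
        error = X @ theta - y                        # residuals, shape (m, 1)
        theta = theta - (alpha / m) * (X.T @ error)  # update all parameters at once
        cost[i] = np.sum((X @ theta - y) ** 2) / (2 * m)
    return theta, cost

# Synthetic, noise-free data on the line y = 1 + 2x (made up for this sketch)
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = (1.0 + 2.0 * x).reshape(-1, 1)

theta, cost = gradient_descent(X, y, np.zeros((2, 1)), alpha=0.5, iters=2000)
print(theta.ravel(), cost[0], cost[-1])
```

Because `X.T @ error` computes every partial sum in one step, the simultaneous update holds automatically.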

Then initialize the parameters, call the function to train the model, and finally visualize the data and the model.

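alpha = 0.01   # learning rate
iters = 1000   # number of gradient descent iterations
# Start from all-zero parameters, one per column of X
theta = np.matrix(np.zeros((X.shape[1], 1)))

theta, cost = gradientDecrease(X, y, theta, alpha, iters)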

inputs = np.linspace(data["Population"].min(), data["Population"].max(),100)
outs = theta[0,0] + theta[1,0]*inputs

fig, ax = plt.subplots(figsize = (12,8))
ax.plot(inputs, outs, 'r', label = "prediction")
ax.scatter(data["Population"],data["Profit"], label = "Training data")
ax.set_xlabel("Population")
ax.set_ylabel("Profit")
ax.legend(loc = 2)

fig2, ax = plt.subplots(figsize = (12,8))
ax.plot(range(iters), cost, 'r')
ax.set_xlabel("iters")
ax.set_ylabel("cost")

The resulting plots:
Figure: the linear model after training.

Figure: change in the cost after each iteration during training.
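
Beyond inspecting the plots, one way to check a trained `theta` numerically is to compare it with the closed-form least-squares solution. The sketch below does this with `np.linalg.lstsq` on made-up data; this is a different technique from the gradient descent used in this post, and the data, learning rate, and iteration count are assumptions for illustration.

```python
import numpy as np

# Made-up noisy data around the line y = 3 + 0.5x
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = (3.0 + 0.5 * x + 0.1 * rng.standard_normal(x.size)).reshape(-1, 1)

# Closed-form least-squares fit
theta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)

# Batch gradient descent on the same data
theta_gd = np.zeros((2, 1))
alpha, iters = 0.1, 5000
for _ in range(iters):
    theta_gd -= (alpha / len(X)) * (X.T @ (X @ theta_gd - y))

# Largest absolute difference between the two parameter estimates
max_diff = float(np.max(np.abs(theta_gd - theta_exact)))
print(max_diff)
```

If gradient descent has converged, the two estimates should agree closely; a large gap suggests too few iterations or an unsuitable learning rate.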

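Summary

This time I finally managed to implement a simple algorithm on my own, half by consulting references and half by thinking it through. I am still not very fluent, so I will keep practicing. Later I will gradually implement other machine learning algorithms, but only after I have finished studying the theory, since the math involved is fairly intricate and takes time to work through.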