Linear Regression in R
2019-08-06
Whuer_deng
1. Import the data
library(haven)
OAP <- read_sav("C:/Users/deng/Desktop/OAP.sav")
str(OAP)
## Classes 'tbl_df', 'tbl' and 'data.frame': 38 obs. of 3 variables:
## $ NO : num 1 2 3 4 5 6 7 8 9 10 ...
## ..- attr(*, "format.spss")= chr "F8.0"
## $ DON: num 0 0 0 0 0 ...
## ..- attr(*, "format.spss")= chr "F8.2"
## $ OAP: num 14.15 11.13 7.25 5.19 4.15 ...
## ..- attr(*, "format.spss")= chr "F8.2"
print(OAP) # view the data
NO DON OAP
1 0 14.15
2 0 11.13
3 0 7.25
4 0 5.19
5 0 4.15
6 0 3.29
7 0 2.26
8 0 0.01
9 28.76 3.27
10 48.54 3.34
11 57.94 4.28
12 69.18 7.2
13 225.41 14.16
14 187.89 7.2
15 74.78 9.27
16 74.67 14.1
17 86.09 9.26
18 75.89 2.2
19 116.33 5.27
20 128.58 5.26
21 178.42 9.19
22 177.38 13.24
23 204.63 16.15
24 215.99 14.16
25 206.9 0.03
26 247.29 5.17
27 289.54 11.18
28 306.31 19.1
29 327.23 11.15
30 358.32 11.13
31 389.22 19.12
32 419.35 20.05
33 426.85 21.33
34 426.9 19.18
35 458.04 17.09
36 468.34 20.01
37 577.52 24.24
38 588.95 19.06
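For readers who do not have the original OAP.sav file, the data frame can be rebuilt from the listing above. This is a minimal sketch rather than the author's method: the values are copied from the printed table, and the SPSS format attributes shown by str() are not reproduced.

```r
# Rebuild the 38-row data set from the printed listing (no .sav file needed).
# Assumption: the printed values are the full stored values.
OAP <- data.frame(
  NO  = 1:38,
  DON = c(0, 0, 0, 0, 0, 0, 0, 0, 28.76, 48.54, 57.94, 69.18, 225.41,
          187.89, 74.78, 74.67, 86.09, 75.89, 116.33, 128.58, 178.42,
          177.38, 204.63, 215.99, 206.9, 247.29, 289.54, 306.31, 327.23,
          358.32, 389.22, 419.35, 426.85, 426.9, 458.04, 468.34, 577.52,
          588.95),
  OAP = c(14.15, 11.13, 7.25, 5.19, 4.15, 3.29, 2.26, 0.01, 3.27, 3.34,
          4.28, 7.2, 14.16, 7.2, 9.27, 14.1, 9.26, 2.2, 5.27, 5.26, 9.19,
          13.24, 16.15, 14.16, 0.03, 5.17, 11.18, 19.1, 11.15, 11.13,
          19.12, 20.05, 21.33, 19.18, 17.09, 20.01, 24.24, 19.06)
)
str(OAP)  # 38 obs. of 3 variables
```

With this data frame in place, the remaining steps can be run unchanged.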
2. Scatter plot of OAP score against DON content
library(ggplot2)
ggplot(OAP, aes(x = DON, y = OAP)) +
geom_point() +
theme_bw()
[Figure: scatter plot of OAP score against DON content]
3. Compute the Pearson correlation coefficient between OAP and DON
with(OAP, (cor(OAP, DON, method = 'pearson')))
## [1] 0.7863221
4. Test the statistical significance of the correlation coefficient
with(OAP, cor.test(OAP, DON, method = 'pearson'))
## Pearson's product-moment correlation
##
## data: OAP and DON
## t = 7.6365, df = 36, p-value = 4.89e-09
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.6233270 0.8838328
## sample estimates:
## cor
## 0.7863221
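- The Pearson correlation test gives p < 0.05 (p = 4.89e-09), so there is a statistically significant linear correlation between OAP score and DON content;
- The correlation coefficient is 0.7863221, so OAP score and DON content are positively correlated.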
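The numbers reported by cor.test() can be recovered from r and the sample size alone: the test statistic is t = r·sqrt(n − 2)/sqrt(1 − r²), and the confidence interval comes from Fisher's z transformation. The following is a minimal sketch that plugs in the values from the output above; the variable names are illustrative.

```r
# Recompute the cor.test() results from r and n (values taken from the output above).
r  <- 0.7863221
n  <- 38
df <- n - 2

t_stat  <- r * sqrt(df) / sqrt(1 - r^2)    # t statistic for H0: true correlation = 0
p_value <- 2 * pt(-abs(t_stat), df = df)   # two-sided p-value

# 95% confidence interval via Fisher's z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

t_stat
p_value
ci
```

The results should agree with the t value, p-value and confidence interval printed by cor.test(), up to rounding.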
5. Fit a linear model of OAP score on DON content
fit.lm <- lm(OAP ~ DON, data = OAP)
fit.lm
## Call:
## lm(formula = OAP ~ DON, data = OAP)
##
## Coefficients:
## (Intercept) DON
## 4.78563 0.02969
6. Summary statistics of the linear model
summary(fit.lm)
## Call:
## lm(formula = OAP ~ DON, data = OAP)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.8995 -2.9493 -0.1378 2.7526 9.3644
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.785631 1.021456 4.685 3.92e-05 ***
## DON 0.029695 0.003889 7.636 4.89e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.197 on 36 degrees of freedom
## Multiple R-squared: 0.6183, Adjusted R-squared: 0.6077
## F-statistic: 58.32 on 1 and 36 DF, p-value: 4.89e-09
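- The p-values for both the intercept and the coefficient of the explanatory variable DON are below 0.05, so both are statistically significant.
- The intercept is 4.785631, meaning the predicted OAP score is 4.785631 when DON content is 0. The DON coefficient is 0.029695, meaning each one-unit increase in DON is associated with an average increase of 0.029695 in OAP score.
- The adjusted R² is 0.6077, so DON content explains about 60.77% of the variation in OAP score (the unadjusted R² is 0.6183), and the remaining 39.23% is attributable to other factors.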
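To make the interpretation of the coefficients concrete, the fitted equation can be applied by hand. This is a minimal sketch using the estimates and standard error copied from the summary above; `predict_oap` is a hypothetical helper, not part of the original analysis.

```r
# Estimates and standard error copied from summary(fit.lm) above.
intercept <- 4.785631
slope     <- 0.029695
slope_se  <- 0.003889
df_resid  <- 36

# Hypothetical helper: predicted OAP score for a given DON content.
predict_oap <- function(don) intercept + slope * don

predict_oap(c(0, 100, 500))  # predictions at DON = 0, 100 and 500

# 95% confidence interval for the slope: estimate +/- t quantile * standard error
slope_ci <- slope + c(-1, 1) * qt(0.975, df_resid) * slope_se
slope_ci
```

With the fitted model object available, predict(fit.lm, newdata = data.frame(DON = 100)) and confint(fit.lm) should give matching values up to rounding.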
7. ANOVA table
anova(fit.lm)
## Analysis of Variance Table
##
## Response: OAP
## Df Sum Sq Mean Sq F value Pr(>F)
## DON 1 1027.21 1027.21 58.316 4.89e-09 ***
## Residuals 36 634.13 17.61
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
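The ANOVA table and the summary statistics in step 6 are linked by a few formulas: R² = SS_reg/SS_total, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1) with p = 1 predictor, residual standard error = sqrt(MS_resid), and F = MS_reg/MS_resid. The sketch below recomputes them from the sums of squares printed above; the variable names are illustrative.

```r
# Sums of squares and degrees of freedom copied from anova(fit.lm) above.
ss_reg   <- 1027.21
ss_resid <- 634.13
df_reg   <- 1
df_resid <- 36
n        <- df_reg + df_resid + 1   # 38 observations

r2      <- ss_reg / (ss_reg + ss_resid)               # multiple R-squared
adj_r2  <- 1 - (1 - r2) * (n - 1) / df_resid          # adjusted R-squared
rse     <- sqrt(ss_resid / df_resid)                  # residual standard error
f_stat  <- (ss_reg / df_reg) / (ss_resid / df_resid)  # F statistic
p_value <- pf(f_stat, df_reg, df_resid, lower.tail = FALSE)

round(c(r2 = r2, adj_r2 = adj_r2, rse = rse, f = f_stat), 4)
p_value
```

The values should match the last three lines of summary(fit.lm) up to the rounding of the printed sums of squares.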
8. Draw the fitted line and annotate the plot with the regression equation and adjusted R²
library(ggpmisc)
ggplot(OAP, aes(x = DON, y = OAP)) +
geom_point() +
theme_bw() +
geom_smooth(method = 'lm', se = FALSE, color = 'red') + # fitted line, no confidence band
stat_poly_eq(aes(label = paste(..eq.label.., ..adj.rr.label.., sep = '~~~~')), formula = y ~ x, parse = TRUE) # equation and adjusted R² labels
[Figure: scatter plot with the fitted regression line, regression equation and adjusted R²]
export::table2doc(fit.lm, add.rownames = TRUE) # export the model table to a Word document
[Figure: Rplot.png]
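Reference: "Linear regression with ggplot2, adding the regression equation and ANOVA table" (original title: 用ggplot2进行直线回归并添加回归方程和方差分析表)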
9. Appendix: SPSS results
[Figures: scatter plot; correlation coefficient and its test; linear regression results]