Linear Regression in R

2019-08-06  Whuer_deng

1. Import the data

library(haven)
OAP <- read_sav("C:/Users/deng/Desktop/OAP.sav")
str(OAP)
## Classes 'tbl_df', 'tbl' and 'data.frame':    38 obs. of  3 variables:
##  $ NO : num  1 2 3 4 5 6 7 8 9 10 ...
##   ..- attr(*, "format.spss")= chr "F8.0"
##  $ DON: num  0 0 0 0 0 ...
##   ..- attr(*, "format.spss")= chr "F8.2"
##  $ OAP: num  14.15 11.13 7.25 5.19 4.15 ...
##   ..- attr(*, "format.spss")= chr "F8.2"
print(OAP) # view the data
NO  DON     OAP
1   0       14.15
2   0       11.13
3   0       7.25
4   0       5.19
5   0       4.15
6   0       3.29
7   0       2.26
8   0       0.01
9   28.76   3.27
10  48.54   3.34
11  57.94   4.28
12  69.18   7.2
13  225.41  14.16
14  187.89  7.2
15  74.78   9.27
16  74.67   14.1
17  86.09   9.26
18  75.89   2.2
19  116.33  5.27
20  128.58  5.26
21  178.42  9.19
22  177.38  13.24
23  204.63  16.15
24  215.99  14.16
25  206.9   0.03
26  247.29  5.17
27  289.54  11.18
28  306.31  19.1
29  327.23  11.15
30  358.32  11.13
31  389.22  19.12
32  419.35  20.05
33  426.85  21.33
34  426.9   19.18
35  458.04  17.09
36  468.34  20.01
37  577.52  24.24
38  588.95  19.06
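For readers who do not have OAP.sav at hand, the 38 rows printed above can be typed in directly. This is only a sketch that copies the printed values; it drops the SPSS format attributes that read_sav() attaches, and NO is stored as an integer rather than a double, neither of which the later steps appear to depend on.

```r
# Rebuild the data set from the values printed above (no .sav file needed).
OAP <- data.frame(
  NO  = 1:38,
  DON = c(0, 0, 0, 0, 0, 0, 0, 0,
          28.76, 48.54, 57.94, 69.18, 225.41, 187.89, 74.78, 74.67,
          86.09, 75.89, 116.33, 128.58, 178.42, 177.38, 204.63, 215.99,
          206.9, 247.29, 289.54, 306.31, 327.23, 358.32, 389.22, 419.35,
          426.85, 426.9, 458.04, 468.34, 577.52, 588.95),
  OAP = c(14.15, 11.13, 7.25, 5.19, 4.15, 3.29, 2.26, 0.01,
          3.27, 3.34, 4.28, 7.2, 14.16, 7.2, 9.27, 14.1,
          9.26, 2.2, 5.27, 5.26, 9.19, 13.24, 16.15, 14.16,
          0.03, 5.17, 11.18, 19.1, 11.15, 11.13, 19.12, 20.05,
          21.33, 19.18, 17.09, 20.01, 24.24, 19.06)
)

# Quick structural check: 38 observations of 3 variables
str(OAP)
```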

2. Scatter plot of OAP score against DON content

library(ggplot2)
ggplot(OAP, aes(x = DON, y = OAP)) + 
  geom_point() + 
  theme_bw()
[Figure: scatter plot of OAP score against DON content]

3. Compute the Pearson (linear) correlation coefficient between OAP and DON

with(OAP, cor(OAP, DON, method = 'pearson'))
## [1] 0.7863221
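For reference, the value above is the sample Pearson coefficient. Writing x for DON and y for OAP (symbols introduced here purely for illustration), with n = 38 observations, the standard definition is:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Because the expression is symmetric in x and y, cor(OAP, DON) and cor(DON, OAP) give the same value.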

4. Test the statistical significance of the correlation coefficient

with(OAP, cor.test(OAP, DON, method = 'pearson')) 
##  Pearson's product-moment correlation
## 
## data:  OAP and DON
## t = 7.6365, df = 36, p-value = 4.89e-09
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6233270 0.8838328
## sample estimates:
##       cor 
## 0.7863221
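- Pearson correlation test: p < 0.05 (p = 4.89e-09), indicating a statistically significant linear correlation between OAP score and DON content.
- The correlation coefficient is 0.7863221 (95% confidence interval 0.623 to 0.884), so OAP score and DON content are positively correlated.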
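The test statistic and confidence interval in the output above can be reproduced by hand from r and n alone. The sketch below assumes the usual textbook formulas (a t statistic with n - 2 degrees of freedom, and a Fisher z transformation for the interval); the variable names are chosen here for illustration.

```r
# Values reported by cor.test() above
r <- 0.7863221   # sample correlation
n <- 38          # number of observations

# t statistic and two-sided p-value for H0: true correlation = 0
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# 95% confidence interval via Fisher's z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(t_stat, 4)    # compare with t = 7.6365 in the output above
signif(p_value, 3)  # compare with p-value = 4.89e-09
round(ci, 4)        # compare with the reported 95 percent interval
```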

5. Fit a linear model of OAP score on DON content

fit.lm <- lm(OAP ~ DON, data = OAP)
fit.lm
## Call:
## lm(formula = OAP ~ DON, data = OAP)
## 
## Coefficients:
## (Intercept)          DON  
##     4.78563      0.02969
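The printed coefficients define the fitted line OAP = 4.78563 + 0.02969 × DON. As a rough illustration of how it is used, the sketch below evaluates that line at a few DON values inside the observed range (0 to about 589). predict_oap is a hypothetical helper written for this example, not part of the original analysis, and it uses the rounded coefficients shown above.

```r
# Coefficients as printed by fit.lm above (rounded)
b0 <- 4.78563   # intercept
b1 <- 0.02969   # slope for DON

# Hypothetical helper: predicted OAP score for a given DON content
predict_oap <- function(don) b0 + b1 * don

predict_oap(c(0, 100, 500))
```

With the fitted object available, predict(fit.lm, newdata = data.frame(DON = c(0, 100, 500))) should give the same values up to rounding of the coefficients.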

6. Summary statistics for the linear model

summary(fit.lm)
## Call:
## lm(formula = OAP ~ DON, data = OAP)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.8995  -2.9493  -0.1378   2.7526   9.3644 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 4.785631   1.021456   4.685 3.92e-05 ***
## DON         0.029695   0.003889   7.636 4.89e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.197 on 36 degrees of freedom
## Multiple R-squared:  0.6183, Adjusted R-squared:  0.6077 
## F-statistic: 58.32 on 1 and 36 DF,  p-value: 4.89e-09
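- The intercept and the coefficient of the explanatory variable DON both have p-values below 0.05, so both are statistically significant.
- The intercept is 4.785631: when DON content is 0, the predicted OAP score is 4.785631. The DON coefficient is 0.029695: for each one-unit increase in DON content, the OAP score increases by 0.029695 on average.
- The adjusted R² is 0.6077, indicating that DON content explains about 60.77% of the variation in OAP score; the remaining 39.23% is attributable to other factors.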
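The figures quoted in these notes can be recomputed from the printed summary. A minimal sketch, assuming the standard adjusted R² formula with one predictor; the variable names are illustrative and the inputs are the rounded values from the output above.

```r
r2 <- 0.6183   # Multiple R-squared from summary(fit.lm)
n  <- 38       # number of observations
p  <- 1        # number of predictors (DON)

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Share of variation not explained by DON, on the adjusted scale
unexplained <- 1 - adj_r2

# t value of the slope = estimate / standard error
t_slope <- 0.029695 / 0.003889

round(adj_r2, 4)       # compare with Adjusted R-squared above
round(unexplained, 4)
round(t_slope, 3)      # compare with the t value reported for DON
```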

7. Generate the ANOVA table

anova(fit.lm)
## Analysis of Variance Table
## 
## Response: OAP
##           Df  Sum Sq Mean Sq F value   Pr(>F)    
## DON        1 1027.21 1027.21  58.316 4.89e-09 ***
## Residuals 36  634.13   17.61                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
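The ANOVA table ties together several numbers from the model summary. The sketch below recomputes them from the two sums of squares printed above, assuming the usual decomposition for simple linear regression; the names are illustrative.

```r
# Sums of squares and degrees of freedom from the ANOVA table above
ss_reg <- 1027.21; df_reg <- 1    # explained by DON
ss_res <- 634.13;  df_res <- 36   # residual

ms_reg <- ss_reg / df_reg
ms_res <- ss_res / df_res

f_stat  <- ms_reg / ms_res                               # F value
p_value <- pf(f_stat, df_reg, df_res, lower.tail = FALSE)
r2      <- ss_reg / (ss_reg + ss_res)                    # Multiple R-squared
rse     <- sqrt(ms_res)                                  # residual standard error

round(c(F = f_stat, R2 = r2, RSE = rse), 4)
signif(p_value, 3)
```

With a single predictor, the F statistic is the square of the slope's t value (7.6365² ≈ 58.316), which is why both tests report the same p-value.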

8. Draw the fitted line and annotate the plot with the regression equation and adjusted R²

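library(ggpmisc)  # provides stat_poly_eq()
ggplot(OAP, aes(x = DON, y = OAP)) + 
    geom_point() + 
    theme_bw() + 
    # fitted least-squares line, without the confidence band
    geom_smooth(method = 'lm', se = FALSE, color = 'red') + 
    # print the regression equation and adjusted R2 on the plot
    stat_poly_eq(aes(label = paste(..eq.label.., ..adj.rr.label.., sep = '~~~~')),
                 formula = y ~ x, parse = TRUE)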
[Figure: scatter plot with the fitted regression line, regression equation, and adjusted R²]
export::table2doc(fit.lm, add.rownames = TRUE)  # export the model table to a Word document
[Figure: Rplot.png]
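If ggpmisc is not available, a similar label can be assembled by hand from the fitted values and placed on the plot as plain text. This is a sketch of an alternative, not the approach used above; the numbers are the estimates reported earlier in this article and the label format is an arbitrary choice.

```r
# Estimates reported by summary(fit.lm) earlier in this article
b0     <- 4.785631   # intercept
b1     <- 0.029695   # slope for DON
adj_r2 <- 0.6077     # adjusted R-squared

# Build a plain-text label for the plot
eq_label <- sprintf("y = %.2f + %.4f x, adj. R2 = %.2f", b0, b1, adj_r2)
eq_label
```

The string could then be added to the ggplot with something like annotate("text", x = 100, y = 22, label = eq_label, hjust = 0), where the coordinates are placeholders to adjust by eye.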

Reference: 用ggplot2进行直线回归并添加回归方程和方差分析表 (Linear regression with ggplot2, adding the regression equation and ANOVA table)

9. Appendix: SPSS results

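[Figure: SPSS scatter plot]
[Figure: SPSS correlation coefficient and its significance test]
[Figure: SPSS linear regression results]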