R与生物信息学科研学习生物信息与数据分析

数量生态学笔记||冗余分析(RDA)

2018-09-24  本文已影响818人  周运来就是我

上一节数量生态学笔记||冗余分析(RDA)概述中,我们回顾了RDA的计算过程,不管这个过程我们有没有理解透彻,我希望你能知道的是:RDA是响应变量矩阵与解释变量之间多元多重线性回归的拟合值矩阵的PCA分析。本节我们就是具体来看一个RDA的分析案例,来看看里面的参数以及结果的解读。

# CHAPTER 6 - CANONICAL ORDINATION
# ********************************
# 载入所需程序包
library(ade4)
library(vegan)
library(packfor)# 可在http://r-forge.r-project.org/R/?group_id=195下载,但是好像在R 3.5.1上加载,所以这篇我用R3.4来做的。
# 如果是MacOS X系统,packfor程序包内forward.sel函数的运行需要加载
# gfortran程序包。用户必须从"cran.r-project.org"网站内选择"MacOS X",
# 然后选择"tools"安装gfortran程序包。
rm(list = ls())
setwd("D:\\Users\\Administrator\\Desktop\\RStudio\\数量生态学\\DATA")

library(MASS)
library(ellipse)
library(FactoMineR)
# 附加函数
source("evplot.R")
source("hcoplot.R")
# 导入CSV数据文件
spe <- read.csv("DoubsSpe.csv", row.names=1)
env <- read.csv("DoubsEnv.csv", row.names=1)
spa <- read.csv("DoubsSpa.csv", row.names=1)
# 删除没有数据的样方8
spe <- spe[-8, ]
env <- env[-8, ]
spa <- spa[-8, ]
# 提取环境变量das(离源头距离)以备用
das <- env[, 1]
# 从环境变量矩阵剔除das变量
env <- env[, -1]
# 将slope变量(pen)转化为因子(定性)变量
pen2 <- rep("very_steep", nrow(env))
pen2[env$pen <= quantile(env$pen)[4]] = "steep"
pen2[env$pen <= quantile(env$pen)[3]] = "moderate"
pen2[env$pen <= quantile(env$pen)[2]] = "low"
pen2 <- factor(pen2, levels=c("low", "moderate", "steep", "very_steep"))
table(pen2)
# 生成一个含定性坡度变量的环境变量数据框env2
env2 <- env
env2$pen <- pen2
# 将所有解释变量分为两个解释变量子集
# 地形变量(上下游梯度)子集
envtopo <- env[, c(1:3)]
names(envtopo)
#水体化学属性变量子集
envchem <- env[, c(4:10)]
names(envchem)
# 物种数据Hellinger转化
spe.hel <- decostand(spe, "hellinger")
使用vegan包运行RDA

vegan包运行RDA有两种不同的模式。第一种是简单模式,直接输入用逗号隔开的数据矩阵对象到rda()函数:

simpleRDA = rda(Y,X,W)
式中Y为响应变量矩阵,X为解释变量矩阵,W为偏RDA分析需要的协变量矩阵。

此公式有一个缺点:Y,W不能有因子变量(定性变量)。如果有因子变量,建议使用第二种模式:

formulaRDA<-rda(Y \sim var1 + factorA + var2*var3+Condition(var4),data=XWdata )

式中,Y为响应变量矩阵。解释变量矩阵包括定量变量(var1)、因子变量(factorA)以及变量2和变量3的交互作用项,协变量(var4)被放到Condition()里。所用的数据都放在XWdata的数据框里。

这个公式与lm()函数以及其他回归函数一样,左边是响应变量,右边是解释变量。

# 基于Hellinger 转化的鱼类数据RDA,解释变量为对象env2包括的环境变量
# 关注省略模式的公式
spe.rda <- rda(spe.hel~., env2) 
summary(spe.rda) # 2型标尺(默认)
#这里使用一些默认的选项,即 scale=FALSE(基于协方差矩阵的RDA)和#scaling=2

RDA结果的摘录:

RDA formula :

Call:
rda(formula = spe.hel ~ alt + pen + deb + pH + 
dur + pho + nit + amm + oxy + dbo, data = env2) 

方差分解(Partitioning of variance):总方差被划分为约束和非约束两部分。约束部分表示响应变量Y矩阵的总方差能被解释变量解释的部分,如果用比例来表示,其值相当于多元回归的R^2。在RDA中,这个解释比例值也称作双多元冗余统计。然而,类似多元回归的未校正的R^2,RDA的R^2是有偏差的,需要进一步校正。

Partitioning of variance:
              Inertia Proportion
Total          0.5025     1.0000
Constrained    0.3654     0.7271
Unconstrained  0.1371     0.2729

特征根以及对方差的贡献率(Eigenvalues, and their contribution to the variance ):当前这个RDA分析产生了12个典范轴(特征根用RDA1 至RDA12表示)和16个非约束轴(特征根用PC1至PC16表示)。输出结果不仅包含每轴特征根同时也给出累积方差解释率(约束轴)或承载轴(非约束轴),最终的累计值必定是1.12 个典范轴累积解释率也代表响应变量总方差能够被解释变量解释的部分。

两个特征根的重要区别:典范特征根RDAx是响应变量总方差能够被解释变量解释的部分,而残差特征根RCx响应变量总方差能够被残差轴解释的部分,与RDA无关。

Eigenvalues, and their contribution to the variance 

Importance of components:
                        RDA1   RDA2    RDA3    RDA4     RDA5     RDA6     RDA7     RDA8     RDA9    RDA10     RDA11     RDA12     PC1     PC2     PC3
Eigenvalue            0.2281 0.0537 0.03212 0.02321 0.008707 0.007218 0.004862 0.002919 0.002141 0.001160 0.0009134 0.0003406 0.04580 0.02814 0.01529
Proportion Explained  0.4539 0.1069 0.06393 0.04618 0.017328 0.014363 0.009676 0.005809 0.004260 0.002308 0.0018176 0.0006778 0.09115 0.05600 0.03042
Cumulative Proportion 0.4539 0.5607 0.62467 0.67085 0.688176 0.702539 0.712215 0.718025 0.722284 0.724592 0.7264100 0.7270878 0.81824 0.87424 0.90466
                          PC4      PC5      PC6      PC7      PC8      PC9     PC10     PC11      PC12      PC13      PC14      PC15      PC16
Eigenvalue            0.01399 0.009841 0.007676 0.004206 0.003308 0.002761 0.002016 0.001752 0.0009851 0.0005921 0.0004674 0.0002127 0.0001004
Proportion Explained  0.02784 0.019584 0.015276 0.008371 0.006583 0.005495 0.004013 0.003486 0.0019604 0.0011783 0.0009301 0.0004233 0.0001998
Cumulative Proportion 0.93250 0.952084 0.967360 0.975731 0.982314 0.987809 0.991822 0.995308 0.9972684 0.9984468 0.9993768 0.9998002 1.0000000

累积约束特征根(Accumulated constrained eigenvalues)表示在本轴以及前面所有轴的典范轴所能解释的方差占全部解释方差的比例累积。

Accumulated constrained eigenvalues
Importance of components:
                        RDA1   RDA2    RDA3    RDA4     RDA5     RDA6     RDA7     RDA8     RDA9    RDA10     RDA11     RDA12
Eigenvalue            0.2281 0.0537 0.03212 0.02321 0.008707 0.007218 0.004862 0.002919 0.002141 0.001160 0.0009134 0.0003406
Proportion Explained  0.6242 0.1470 0.08792 0.06352 0.023832 0.019755 0.013308 0.007990 0.005859 0.003174 0.0024999 0.0009322
Cumulative Proportion 0.6242 0.7712 0.85913 0.92265 0.946483 0.966237 0.979545 0.987535 0.993394 0.996568 0.9990678 1.0000000

物种得分(Species scores)双序图和三序图内代表响应变量的箭头的顶点坐标。与PCA相同,坐标依赖标尺Scaling的选择。

Scaling 2 for species and site scores
* Species are scaled proportional to eigenvalues
* Sites are unscaled: weighted dispersion equal on all dimensions
* General scaling constant of scores:  1.93676 


Species scores

        RDA1     RDA2      RDA3      RDA4      RDA5      RDA6
CHA  0.13383  0.11623 -0.238180  0.018611  0.043221 -0.029737
TRU  0.64238  0.06648  0.123713  0.181572 -0.009691  0.029793
VAI  0.47475  0.07015 -0.010218 -0.115369 -0.045317 -0.030033
LOC  0.36260  0.06972  0.041240 -0.190586 -0.046881  0.006448
OMB  0.13079  0.10709 -0.239224  0.043603  0.065881  0.003458
BLA  0.06587  0.12475 -0.216900 -0.004157  0.021793 -0.004195
HOT -0.17417  0.06778 -0.008426 -0.016419 -0.079730  0.044706
TOX -0.12683  0.16052 -0.035733 -0.016087 -0.089768 -0.001880
VAN -0.07963  0.04200  0.007636 -0.059179 -0.033596 -0.121440
CHE -0.10903 -0.17552 -0.090099 -0.168373  0.019444  0.008745
BAR -0.18528  0.21154 -0.073087 -0.006879 -0.012995  0.040484
SPI -0.16064  0.15513 -0.014309 -0.002488 -0.060810  0.011045
GOU -0.20537  0.02484 -0.007973 -0.017742 -0.049137 -0.096231
BRO -0.10734  0.02848  0.090055  0.012324  0.075184 -0.057088
PER -0.09164  0.10506  0.070393 -0.057443  0.013870 -0.009906
BOU -0.20907  0.16002  0.025500  0.012078 -0.011477  0.022035
PSO -0.22799  0.11121  0.018800 -0.009474 -0.027431  0.024517
ROT -0.16098  0.01348  0.041628  0.032398  0.054117 -0.094582
CAR -0.17384  0.14901  0.022262  0.009534  0.004991 -0.005396
TAN -0.14025  0.10632  0.078290 -0.122627  0.054162  0.031256
BCO -0.18594  0.12222  0.053881  0.026170  0.044015  0.014577
PCH -0.14630  0.08894  0.061880  0.034763  0.083530  0.004396
GRE -0.30881  0.01606  0.039366  0.029254 -0.011141 -0.052412
GAR -0.31982 -0.16601 -0.018225 -0.115454  0.054341  0.064772
BBO -0.23897  0.09090  0.051627  0.010224  0.007004  0.036497
ABL -0.43215 -0.22639 -0.108190  0.138807 -0.083920  0.008460
ANG -0.19442  0.14149  0.033659  0.017387  0.008110  0.017638

样方得分(Site scores (weighted sums of species scores))物种得分的加权和:使用响应变量矩阵Y计算获得的样方坐标。

Site scores (weighted sums of species scores)

       RDA1      RDA2     RDA3      RDA4      RDA5      RDA6
1   0.40151 -0.154306  0.55539  1.600773  0.191866  0.916893
2   0.53523 -0.025084  0.43389  0.294615 -0.518456  0.458860
3   0.49430 -0.014605  0.49409  0.169038 -0.246166  0.163421
4   0.33452  0.001173  0.51626 -0.321009  0.088716 -0.219837
5   0.02794 -0.194357  0.44612 -0.559210  0.853768 -1.115654
6   0.24422 -0.130778  0.41372 -0.696264  0.181514 -0.273473
7   0.46590 -0.125982  0.31674 -0.137834 -0.548635 -0.061703
9   0.03662 -0.605060 -0.07022 -1.260916  0.669108  1.164986
10  0.31381 -0.198654  0.10764 -0.635139 -0.741448 -0.990236
11  0.48116 -0.039598 -0.37851  0.181924  0.221494  0.254511
12  0.49162  0.014263 -0.37983  0.163103  0.223730  0.324672
13  0.49848  0.212367 -0.67408  0.518823  0.400091  0.221622
14  0.38202  0.229538 -0.75771  0.223651  0.515712 -0.139740
15  0.28739  0.218713 -0.71887 -0.210821  0.176392 -0.553185
16  0.09129  0.400192 -0.34443 -0.376097 -0.366868 -0.575230
17 -0.05306  0.423994 -0.41009 -0.188492 -0.726152  0.151876
18 -0.14185  0.385926 -0.36814 -0.217143 -0.644298 -0.001052
19 -0.28204  0.275528 -0.01877 -0.371457 -0.691725 -0.062230
20 -0.39683  0.209468  0.11547 -0.177972 -0.387121  0.048690
21 -0.42851  0.278256  0.22010 -0.005993 -0.027083 -0.042209
22 -0.46553  0.251819  0.22784  0.040192  0.152965  0.032185
23 -0.28123 -1.145599 -0.50543  0.300015 -0.004403  1.157206
24 -0.40893 -0.752909 -0.26785  0.428851 -0.645606  0.643084
25 -0.35205 -0.770380 -0.12186  0.459170  0.078892 -1.725973
26 -0.46923  0.093958  0.23058  0.107865  0.310432  0.132556
27 -0.47021  0.213521  0.24887  0.084219  0.331685  0.125439
28 -0.47266  0.233922  0.27053  0.105776  0.381436  0.093719
29 -0.37457  0.393260  0.10423  0.202115  0.282621  0.021834
30 -0.48932  0.321417  0.31431  0.278218  0.487541 -0.151031

样方约束——解释变量的线性组合(Site constraints (linear combinations of constraining variables)):使用解释变量矩阵X计算获得的样方坐标,是拟合的(fitted)样方坐标。

Site constraints (linear combinations of constraining variables)

       RDA1      RDA2     RDA3      RDA4      RDA5     RDA6
1   0.55135  0.002395  0.47774  0.626878 -0.210700  0.31511
2   0.29737  0.105715  0.64862  0.261161 -0.057741  0.09322
3   0.36834 -0.185376  0.59788  0.324212 -0.002385  0.31090
4   0.44348 -0.066086  0.33260 -0.344463 -0.279591 -0.37079
5   0.27004 -0.366721  0.17992 -0.453274  0.716614 -0.06545
6   0.37148 -0.255624  0.40847  0.217259  0.023374  0.34360
7   0.53874 -0.179999  0.06845 -0.424896  0.024884 -0.33454
9  -0.04438 -0.362632  0.12371 -1.180662  0.348188  0.26352
10  0.16289 -0.154212  0.22252 -0.241425 -0.573048 -0.02867
11  0.29912  0.176150 -0.08233  0.003924  0.164806 -0.44603
12  0.36843  0.197492 -0.41095  0.300566 -0.053922 -0.01139
13  0.42626  0.190107 -0.59764  0.100988  0.118714 -0.21022
14  0.34373  0.134362 -0.80378  0.063879  0.665797  0.48126
15  0.21385  0.237182 -0.56341 -0.001099 -0.028564  0.01655
16  0.03056  0.352192 -0.12110 -0.202316  0.058413 -0.43542
17 -0.10499  0.178587 -0.26925  0.046988 -0.608314  0.21237
18 -0.11204  0.221631 -0.24024 -0.302957 -0.251346 -0.01448
19 -0.05479  0.311860 -0.30701  0.010366 -0.481829 -0.12855
20 -0.25684  0.303770 -0.06768 -0.036587 -0.562578  0.13698
21 -0.39177  0.196355  0.01877 -0.281086 -0.383524  0.39310
22 -0.21361  0.180414  0.01066  0.074301 -0.036849 -0.02429
23 -0.21654 -1.016853 -0.57298  0.548175  0.182594  0.51443
24 -0.52578 -0.645438 -0.11182 -0.240149 -0.512492  0.32420
25 -0.38886 -0.867381 -0.08079  0.482839 -0.106743 -1.21305
26 -0.48440  0.031510  0.14065 -0.114545  0.425712 -0.17989
27 -0.61221  0.138191  0.32316 -0.015795  0.232397  0.38288
28 -0.46921  0.459843  0.22002  0.078870  0.278747 -0.36504
29 -0.38450  0.344487  0.20622  0.353008  0.504693  0.17747
30 -0.42572  0.338080  0.24960  0.345839  0.404692 -0.13777

解释变量双序图得分(Biplot scores for constraining variables):排序图内解释解释变量箭头的坐标,按照下面的过程获得:运行解释变量与拟合的样方坐标之间的相关分析,然后将所有相关系数转化为双序图内坐标。所有的变量包括k个水平的因子口可以有自己的坐标对因子变量在排序轴的坐标,用各个因子的形心表示更合适。

Biplot scores for constraining variables

                 RDA1      RDA2     RDA3     RDA4      RDA5     RDA6
alt            0.8239 -0.203257  0.46604 -0.16936  0.003229  0.10407
penmoderate   -0.3592 -0.008729 -0.21727 -0.18278  0.157934  0.50094
pensteep       0.2791  0.156027 -0.06876  0.01927  0.176390 -0.15469
penvery_steep  0.6129 -0.148496  0.45379  0.03618 -0.191046 -0.04715
deb           -0.7770  0.254952 -0.17470  0.30995  0.227580 -0.11938
pH             0.1023  0.178431 -0.30131  0.03959  0.298584  0.04848
dur           -0.5722  0.044963 -0.56040 -0.14813  0.275617 -0.24527
pho           -0.4930 -0.650488 -0.19868  0.29286  0.055893 -0.39345
nit           -0.7755 -0.203836 -0.23285  0.19744 -0.078849 -0.35073
amm           -0.4744 -0.687577 -0.16648  0.28470 -0.051768 -0.33852
oxy            0.7632  0.575528 -0.16425  0.08026 -0.136143  0.13748
dbo           -0.5171 -0.791805 -0.15652  0.22064  0.075568 -0.09105

因子解释变量形心(Centroids for factor constraints):因子变量各个水平形心点的坐标,即每个水平所用标识为一的样方的形心。

Centroids for factor constraints

                 RDA1      RDA2     RDA3     RDA4     RDA5     RDA6
penlow        -0.2800  0.005530 -0.09025  0.07614 -0.07860 -0.18390
penmoderate   -0.2093 -0.005086 -0.12660 -0.10650  0.09203  0.29189
pensteep       0.1965  0.109867 -0.04842  0.01357  0.12420 -0.10892
penvery_steep  0.3908 -0.094679  0.28933  0.02307 -0.12181 -0.03006

在rda()函数中大家感兴趣的典范特征系数(即每个解释变量与每个典范轴之间的回归系数),可用coef()函数获得:

#如何从rda()输出结果中获得典范系数
coef(spe.rda)
alt            0.0004482548  7.559499e-05  0.0005205825  0.0003883845  0.001857206 -6.313946e-05 -0.001355362  0.001120849 -0.0002530083  0.001189659
penmoderate   -0.0123961693 -1.660194e-02  0.0160069104 -0.0278562534  0.276128846  1.310695e-01 -0.022918427  0.018830063 -0.3113354204 -0.278006278
pensteep       0.0478390238  4.893302e-02  0.1022577908  0.1347997771  0.393812929 -1.795824e-01  0.046319741  0.123642821  0.0963820046 -0.447099975
penvery_steep  0.0180005587 -5.691933e-02  0.2322637617  0.1002359565  0.036814635 -1.742060e-01  0.517299284  0.067564014 -0.2262450630 -0.590022907
deb           -0.0014063766  4.440084e-03  0.0089298486  0.0164715901  0.013318449  2.705757e-03 -0.002419805  0.010394632 -0.0006430624  0.004427139
pH             0.0188227657 -3.163592e-02 -0.0482021704  0.1141647498  0.412886847  1.091403e-01  0.139806409 -0.436295510 -0.0215841003 -0.904063764
dur            0.0025580344 -1.955496e-03 -0.0065901935 -0.0093556696  0.005228707 -6.098098e-03  0.002195518  0.010536248 -0.0006844877  0.003353110
pho            0.1031541920  4.583584e-02 -0.1000153096 -0.1050243435  0.422991234 -3.694132e-01  0.035874664 -0.701043138 -0.2865315085  0.245917738
nit           -0.0123824749  1.041485e-01  0.0625396719  0.0774218297  0.234401221 -3.541252e-02 -0.240428544  0.128403162 -0.0686364968  0.113090041
amm           -0.1084411839 -4.407786e-01  0.0057247742  0.0538542716 -1.812468883  4.798631e-03  0.281937862  1.068013480  0.3084215704 -1.217501183
oxy            0.0686692124  1.980446e-02 -0.0894153251  0.1200884061  0.032052566  3.880445e-02 -0.058026043  0.061374900 -0.0196444146  0.089881042
dbo            0.0108322463 -2.696114e-02 -0.0253225230  0.0745175780  0.067082880  9.276548e-02 -0.019719504  0.047865971  0.0359365102  0.065416035
                      RDA11         RDA12
alt            0.0006826822  0.0009471677
penmoderate    0.0398080898 -0.2974896027
pensteep       0.2445035939 -0.3475535793
penvery_steep  0.2457103975 -0.1848717482
deb           -0.0022565029  0.0064858596
pH             0.0696045266  0.5756301035
dur            0.0007758175  0.0062181193
pho           -0.0015544897 -0.6309008260
nit            0.3983543655  0.0942246573
amm           -1.5964107514  0.8979015239
oxy            0.0627415675  0.0258937429
dbo            0.1113928484  0.0403158432

提取。解读和绘制vegan包输出的RDA结果

校正R^2

R^2_{adj}=1-\frac{n-1}{n-m-1}\left( 1-R^2 \right)

# 提取校正R2
# **********
# 从rda的结果中提取未校正R2 
(R2 <- RsquareAdj(spe.rda)$r.squared)
[1] 0.7270878
# 从rda的结果中提取校正R2
(R2adj <- RsquareAdj(spe.rda)$adj.r.squared)
[1] 0.5224036
#可以看出,校正R2总是小于R2。校正R2作为被解释方差比例的无偏估计,后#面的变差分解部分所用的也是校正R2。
# RDA三序图

现在绘制RDA的排序图。如果一张排序图中有三个实体:样方、响应变量、解释变量,这种排序图称为三序图(triplot)为了区分响应变量和解释变量,定量解释变量用箭头表示,响应变量用不带箭头的线表示。

# RDA三序图
# **********
# 1型标尺:距离三序图
plot(spe.rda, scaling=1, main="RDA三序图:spe.hel~env2 - 1型标尺- 加权和样方坐标")
#此排序图同时显示所有的元素:样方、物种、定量解释变量(用箭头表示)
#和因子变量的形心。为了与定量解释变量区分,物种用不带箭头的线表示。
spe.sc <- scores(spe.rda, choices=1:2, scaling=1, display="sp")
arrows(0, 0, spe.sc[, 1], spe.sc[, 2], length=0, lty=1, col="red")
plot(spe.rda, main="RDA三序图:spe.hel~env2 - 2型标尺- 加权和样方坐标")
spe2.sc <- scores(spe.rda, choices=1:2, display="sp")
arrows(0, 0, spe2.sc[, 1], spe2.sc[, 2], length=0, lty=1, col="red")
# 样方坐标是环境因子线性组合 
# 1型标尺
plot(spe.rda, scaling=1, display=c("sp", "lc", "cn"), 
    main="RDA三序图:spe.hel~env2 - 1型标尺- 拟合的样方坐标")
arrows(0, 0, spe.sc[, 1], spe.sc[, 2], length=0, lty=1, col="red")
# 2型标尺
plot(spe.rda, display=c("sp", "lc", "cn"), 
    main="RDA三序图:spe.hel~env2 - 2型标尺- 拟合的样方坐标")
arrows(0, 0, spe2.sc[,1], spe2.sc[,2], length=0, lty=1, col="red")
RDA 结果的置换检验
# RDA所有轴置换检验
anova.cca(spe.rda, step=1000)   

Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(formula = spe.hel ~ alt + pen + deb + pH + dur + pho + nit + amm + oxy + dbo, data = env2)
         Df Variance      F Pr(>F)    
Model    12  0.36537 3.5522  0.001 ***
Residual 16  0.13714                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 每个典范轴逐一检验
anova.cca(spe.rda, by="axis", step=1000)
Permutation test for rda under reduced model
Forward tests for axes
Permutation: free
Number of permutations: 999

Model: rda(formula = spe.hel ~ alt + pen + deb + pH + dur + pho + nit + amm + oxy + dbo, data = env2)
         Df Variance       F Pr(>F)    
RDA1      1 0.228081 26.6098  0.001 ***
RDA2      1 0.053696  6.2646  0.003 ** 
RDA3      1 0.032124  3.7478  0.361    
RDA4      1 0.023207  2.7075  0.763    
RDA5      1 0.008707  1.0159  1.000    
RDA6      1 0.007218  0.8421  1.000    
RDA7      1 0.004862  0.5673  1.000    
RDA8      1 0.002919  0.3406  1.000    
RDA9      1 0.002141  0.2497  1.000    
RDA10     1 0.001160  0.1353  1.000    
RDA11     1 0.000913  0.1066  1.000    
RDA12     1 0.000341  0.0397  1.000    
Residual 16 0.137141                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#每个典范轴的检验只能输入由公式模式获得的rda结果。有多少个轴结果是
#显著的?

# 使用Kaiser-Guttman准则确定残差轴
spe.rda$CA$eig[spe.rda$CA$eig > mean(spe.rda$CA$eig)]
        PC1         PC2         PC3         PC4         PC5 
0.045802781 0.028143080 0.015288209 0.013987518 0.009841239 
#很明显,还有一部分有意思的变差尚未被目前所用的这套环境变量解释。
偏RDA分析
 偏RDA:固定地形变量影响后,水体化学属性的效应
# 简单模式:X和W可以是分离的定量变量表格
spechem.physio <- rda(spe.hel, envchem, envtopo)
spechem.physio

Call: rda(X = spe.hel, Y = envchem, Z = envtopo)

              Inertia Proportion Rank
Total          0.5025     1.0000     
Conditional    0.2087     0.4153    3
Constrained    0.1602     0.3189    7
Unconstrained  0.1336     0.2659   18
Inertia is variance 

Eigenvalues for constrained axes:
   RDA1    RDA2    RDA3    RDA4    RDA5    RDA6    RDA7 
0.09137 0.04591 0.00928 0.00625 0.00387 0.00214 0.00142 

Eigenvalues for unconstrained axes:
    PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8 
0.04642 0.02072 0.01746 0.01325 0.00974 0.00588 0.00512 0.00400 
(Showed only 8 of all 18 unconstrained eigenvalues)

summary(spechem.physio)
# 公式模式;X和W必须在同一数据框内
class(env)
spechem.physio2 <- rda(spe.hel ~ pH + dur + pho + nit + amm + oxy + dbo 
 + Condition(alt + pen + deb), data=env)
spechem.physio2

Call: rda(formula = spe.hel ~ pH + dur + pho + nit + amm + oxy + dbo + Condition(alt + pen + deb), data = env)

              Inertia Proportion Rank
Total          0.5025     1.0000     
Conditional    0.2087     0.4153    3
Constrained    0.1602     0.3189    7
Unconstrained  0.1336     0.2659   18
Inertia is variance 

Eigenvalues for constrained axes:
   RDA1    RDA2    RDA3    RDA4    RDA5    RDA6    RDA7 
0.09137 0.04591 0.00928 0.00625 0.00387 0.00214 0.00142 

Eigenvalues for unconstrained axes:
    PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8 
0.04642 0.02072 0.01746 0.01325 0.00974 0.00588 0.00512 0.00400 
(Showed only 8 of all 18 unconstrained eigenvalues)


#上面两个分析的结果完全相同。
偏RDA检验(使用公式模式获得的RDA结果,以便检验每个轴)
anova.cca(spechem.physio2, step=1000)

Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(formula = spe.hel ~ pH + dur + pho + nit + amm + oxy + dbo + Condition(alt + pen + deb), data = env)
         Df Variance      F Pr(>F)    
Model     7  0.16024 3.0842  0.001 ***
Residual 18  0.13360                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

anova.cca(spechem.physio2, step=1000, by="axis")
Permutation test for rda under reduced model
Forward tests for axes
Permutation: free
Number of permutations: 999

Model: rda(formula = spe.hel ~ pH + dur + pho + nit + amm + oxy + dbo + Condition(alt + pen + deb), data = env)
         Df Variance       F Pr(>F)    
RDA1      1 0.091373 12.3108  0.001 ***
RDA2      1 0.045907  6.1851  0.010 ** 
RDA3      1 0.009277  1.2499  0.964    
RDA4      1 0.006251  0.8422  0.993    
RDA5      1 0.003866  0.5209  0.996    
RDA6      1 0.002142  0.2886  1.000    
RDA7      1 0.001425  0.1920  0.997    
Residual 18 0.133599                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 偏RDA三序图(使用拟合值的样方坐标)
# 1型标尺
plot(spechem.physio, scaling=1, display=c("sp", "lc", "cn"), 
main="RDA三序图:spe.hel~chem ︳Tope- 1型标尺- 拟合的样方坐标")
spe3.sc <- scores(spechem.physio, choices=1:2, scaling=1, display="sp")
arrows(0, 0, spe3.sc[, 1], spe3.sc[, 2], length=0, lty=1, col="red")
# 2型标尺
plot(spechem.physio, display=c("sp", "lc", "cn"), 
main="RDA三序图:spe.hel~chem ︳Tope- 2型标尺- 拟合的样方坐标")
spe4.sc <- scores(spechem.physio, choices=1:2, display="sp")
arrows(0, 0, spe4.sc[,1], spe4.sc[,2], length=0, lty=1, col="red")

解释变量向前选择

每个变量的共线性程度可以用变量的方差膨胀因子(variance inflation factor,VIF)度量,VIF是衡量一个变量的回归系数的方差由共线性引起的膨胀比例。如果VIF值超过20,表示共线性很严重。实际上,VIF超过10则可能会有共线性问题,需要处理。

# 两个RDA结果的变量方差膨胀因子(VIF)
# ********************************************
# 本章第一个RDA结果:包括所有环境因子变量
vif.cca(spe.rda)
         alt   penmoderate      pensteep penvery_steep           deb            pH           dur           pho           nit 
    20.397021      2.085921      2.987679      4.594983      6.684187      1.363575      3.049937     30.614913     18.953092 
          amm           oxy           dbo 
    40.114746      6.854703     17.980889 
vif.cca(spechem.physio) # 偏RDA
     alt       pen       deb        pH       dur       pho       nit       amm       oxy       dbo 
16.188416  1.873974  6.711460  1.205235  3.268620 25.363359 16.080319 30.685907  6.904214 17.782727 
# 使用双终止准则(Blanchet等,2008a)前向选择解释变量
# 1.包含所有解释变量的RDA全模型 
spe.rda.all <- rda(spe.hel ~ ., data=env)
Call: rda(formula = spe.hel ~ alt + pen + deb + pH + dur + pho + nit + amm + oxy + dbo, data = env)

              Inertia Proportion Rank
Total          0.5025     1.0000     
Constrained    0.3689     0.7341   10
Unconstrained  0.1336     0.2659   18
Inertia is variance 

Eigenvalues for constrained axes:
   RDA1    RDA2    RDA3    RDA4    RDA5    RDA6    RDA7    RDA8    RDA9   RDA10 
0.22803 0.05442 0.03382 0.03008 0.00749 0.00566 0.00443 0.00281 0.00138 0.00079 

Eigenvalues for unconstrained axes:
    PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8 
0.04642 0.02072 0.01746 0.01325 0.00974 0.00588 0.00512 0.00400 
(Showed only 8 of all 18 unconstrained eigenvalues)

# 2.全模型校正R2
(R2a.all <- RsquareAdj(spe.rda.all)$adj.r.squared)
[1] 0.5864353
# 3.使用packfors 包内forward.sel()函数选择变量
# library(packfor) #如果尚未载入packfors包,需要运行这一步
forward.sel(spe.hel, env, adjR2thresh=R2a.all)
Testing variable 1
Testing variable 2
Testing variable 3
Testing variable 4
Procedure stopped (adjR2thresh criteria) adjR2cum = 0.594764 with 4 variables (superior to 0.586435)
  variables order         R2     R2Cum  AdjR2Cum         F  pval
1       alt     1 0.32806549 0.3280655 0.3031790 13.182488 0.001
2       oxy     9 0.16402853 0.4920940 0.4530243  8.396715 0.001
3       dbo    10 0.09733024 0.5894243 0.5401552  5.926448 0.001
4       pen     2 0.06323025 0.6526545 0.5947636  4.368924 0.007
#注意,正如这个函数的提示信息所示,选择最后一个变量其实违背了
#adjR2thresh终止准则,所以pen变量最终不应该在被选变量列表中。
# 使用vegan包内ordistep()函数前向选择解释变量。该函数可以对因子变量进# 行选择,也可以运行解释变量的逐步选择和后向选择。
step.forward <- ordistep(rda(spe.hel ~ 1, data=env), scope = formula(spe.rda.all ), 
direction="forward", pstep = 1000)
Start: spe.hel ~ 1 

      Df     AIC       F Pr(>F)   
+ alt  1 -28.504 13.1825  0.005 **
+ oxy  1 -27.420 11.7086  0.005 **
+ deb  1 -26.872 10.9840  0.005 **
+ nit  1 -26.716 10.7795  0.005 **
+ dbo  1 -23.172  6.4340  0.005 **
+ dur  1 -22.499  5.6673  0.005 **
+ pho  1 -22.189  5.3200  0.005 **
+ amm  1 -22.047  5.1620  0.005 **
+ pen  1 -20.155  3.1305  0.005 **
+ pH   1 -17.489  0.4839  0.815   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step: spe.hel ~ alt 

      Df     AIC      F Pr(>F)   
+ oxy  1 -34.620 8.3967  0.005 **
+ dbo  1 -32.103 5.5373  0.005 **
+ amm  1 -30.777 4.1281  0.010 **
+ pho  1 -30.560 3.9032  0.010 **
+ nit  1 -29.451 2.7810  0.035 * 
+ pen  1 -29.049 2.3847  0.040 * 
+ deb  1 -27.972 1.3504  0.170   
+ dur  1 -27.954 1.3332  0.290   
+ pH   1 -27.426 0.8403  0.515   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step: spe.hel ~ alt + oxy 

      Df     AIC      F Pr(>F)   
+ dbo  1 -38.789 5.9264  0.005 **
+ pho  1 -37.052 4.1280  0.010 **
+ amm  1 -36.527 3.6055  0.015 * 
+ pen  1 -36.399 3.4797  0.015 * 
+ dur  1 -34.740 1.8964  0.095 . 
+ deb  1 -34.388 1.5714  0.125   
+ nit  1 -33.474 0.7474  0.620   
+ pH   1 -33.035 0.3605  0.950   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step: spe.hel ~ alt + oxy + dbo 

      Df     AIC      F Pr(>F)  
+ pen  1 -41.639 4.3689  0.015 *
+ dur  1 -39.394 2.2555  0.025 *
+ deb  1 -38.436 1.4019  0.190  
+ pho  1 -37.789 0.8420  0.455  
+ nit  1 -37.577 0.6611  0.655  
+ amm  1 -37.583 0.6656  0.700  
+ pH   1 -37.316 0.4399  0.910  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step: spe.hel ~ alt + oxy + dbo + pen 

      Df     AIC      F Pr(>F)
+ dur  1 -41.502 1.5255  0.130
+ deb  1 -41.058 1.1528  0.380
+ pho  1 -40.822 0.9570  0.470
+ amm  1 -40.587 0.7641  0.630
+ nit  1 -40.560 0.7418  0.635
+ pH   1 -40.375 0.5912  0.760

> RsquareAdj(rda(spe.hel ~ alt, data=env))$adj.r.squared
[1] 0.303179
> RsquareAdj(rda(spe.hel ~ alt+oxy, data=env))$adj.r.squared
[1] 0.4530243
> RsquareAdj(rda(spe.hel ~ alt+oxy+dbo, data=env))$adj.r.squared
[1] 0.5401552
> RsquareAdj(rda(spe.hel ~ alt+oxy+dbo+pen, data=env))$adj.r.squared
[1] 0.5947636
> #简约的RDA分析
> # **************
> spe.rda.pars <- rda(spe.hel ~ alt + oxy + dbo, data=env)
> spe.rda.pars
Call: rda(formula = spe.hel ~ alt + oxy + dbo, data = env)

              Inertia Proportion Rank
Total          0.5025     1.0000     
Constrained    0.2962     0.5894    3
Unconstrained  0.2063     0.4106   25
Inertia is variance 

Eigenvalues for constrained axes:
   RDA1    RDA2    RDA3 
0.21802 0.05088 0.02729 

Eigenvalues for unconstrained axes:
    PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8 
0.05835 0.04357 0.02568 0.01740 0.01451 0.01230 0.00787 0.00646 
(Showed only 8 of all 25 unconstrained eigenvalues)

> anova.cca(spe.rda.pars, step=1000)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(formula = spe.hel ~ alt + oxy + dbo, data = env)
         Df Variance      F Pr(>F)    
Model     3  0.29619 11.963  0.001 ***
Residual 25  0.20632                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova.cca(spe.rda.pars, step=1000, by="axis")
Permutation test for rda under reduced model
Forward tests for axes
Permutation: free
Number of permutations: 999

Model: rda(formula = spe.hel ~ alt + oxy + dbo, data = env)
         Df Variance       F Pr(>F)    
RDA1      1 0.218022 26.4181  0.001 ***
RDA2      1 0.050879  6.1651  0.001 ***
RDA3      1 0.027291  3.3069  0.002 ** 
Residual 25 0.206319                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> vif.cca(spe.rda.pars)
     alt      oxy      dbo 
1.223386 3.544201 3.402515 
> (R2a.pars <- RsquareAdj(spe.rda.pars)$adj.r.squared)
[1] 0.5401552

# 1型标尺
plot(spe.rda.pars, scaling=1, display=c("sp", "lc", "cn"), 
    main="RDA三序图:spe.hel~alt+oxy+dbo - 1型标尺 - 拟合的样方坐标")
spe4.sc = scores(spe.rda.pars, choices=1:2, scaling=1, display="sp")
arrows(0, 0, spe4.sc[, 1], spe4.sc[, 2], length=0, lty=1, col="red")
# 2型标尺
plot(spe.rda.pars, display=c("sp", "lc", "cn"), 
    main="RDA三序图:spe.hel~alt+oxy+dbo - 2型标尺 - 拟合的样方坐标")
spe5.sc = scores(spe.rda.pars, choices=1:2, display="sp")
arrows(0, 0, spe5.sc[,1], spe5.sc[,2], length=0, lty=1, col="red")
#如果第三典范轴也显著,可以选择绘制轴1和轴3、轴2和轴3的三序图。
变差分解
# 变差分解说明图
par(mfrow=c(1,3))
showvarparts(2) # 两组解释变量
showvarparts(3) #三组解释变量
showvarparts(4) #四组解释变量
# 1.带所有环境变量的变差分解 
spe.part.all <- varpart(spe.hel, envchem, envtopo)
spe.part.all

Partition of variance in RDA 

Call: varpart(Y = spe.hel, X = envchem, envtopo)

Explanatory tables:
X1:  envchem
X2:  envtopo 

No. of explanatory tables: 2 
Total variation (SS): 14.07 
            Variance: 0.50251 
No. of observations: 29 

Partition table:
                     Df R.squared Adj.R.squared Testable
[a+b] = X1            7   0.60579       0.47439     TRUE
[b+c] = X2            3   0.41526       0.34509     TRUE
[a+b+c] = X1+X2      10   0.73414       0.58644     TRUE
Individual fractions                                    
[a] = X1|X2           7                 0.24135     TRUE
[b]                   0                 0.23304    FALSE
[c] = X2|X1           3                 0.11205     TRUE
[d] = Residuals                         0.41356    FALSE
---
Use function ‘rda’ to test significance of fractions of interest
plot(spe.part.all, digits=2) 
#这些图内校正R2是正确的数字,但是韦恩图圆圈大小相同,未与R2的大小成比例。
> # 2.分别对两组环境变量进行前向选择
> spe.chem <- rda(spe.hel, envchem)
> R2a.all.chem <- RsquareAdj(spe.chem)$adj.r.squared
> forward.sel(spe.hel, envchem, adjR2thresh=R2a.all.chem, nperm=9999)
Testing variable 1
Testing variable 2
Testing variable 3
Testing variable 4
Procedure stopped (adjR2thresh criteria) adjR2cum = 0.487961 with 4 variables (superior to 0.474388)
  variables order         R2     R2Cum  AdjR2Cum         F   pval
1       oxy     6 0.30247973 0.3024797 0.2766456 11.708553 0.0001
2       dbo     7 0.09015052 0.3926303 0.3459095  3.859122 0.0043
3       nit     4 0.11522115 0.5078514 0.4487936  5.852965 0.0005
4       amm     5 0.05325801 0.5611094 0.4879610  2.912325 0.0083
> spe.topo <- rda(spe.hel, envtopo)
> R2a.all.topo <- RsquareAdj(spe.topo)$adj.r.squared
> forward.sel(spe.hel, envtopo, adjR2thresh=R2a.all.topo, nperm=9999)
Testing variable 1
Testing variable 2
Testing variable 3
Procedure stopped (alpha criteria): pvalue for variable 3 is 0.228900 (superior to 0.050000)
  variables order         R2     R2Cum  AdjR2Cum         F   pval
1       alt     1 0.32806549 0.3280655 0.3031790 13.182488 0.0001
2       pen     2 0.05645105 0.3845165 0.3371717  2.384674 0.0469
> # 解释变量简约组合(基于变量选择的结果)
> names(env)
 [1] "alt" "pen" "deb" "pH"  "dur" "pho" "nit" "amm" "oxy" "dbo"
> envchem.pars <- envchem[, c(4,6,7)]
> envtopo.pars <- envtopo[, c(1,2)]
> # 变差分解
> (spe.part <- varpart(spe.hel, envchem.pars, envtopo.pars))

Partition of variance in RDA 

Call: varpart(Y = spe.hel, X = envchem.pars, envtopo.pars)

Explanatory tables:
X1:  envchem.pars
X2:  envtopo.pars 

No. of explanatory tables: 2 
Total variation (SS): 14.07 
            Variance: 0.50251 
No. of observations: 29 

Partition table:
                     Df R.squared Adj.R.squared Testable
[a+b] = X1            3   0.50785       0.44879     TRUE
[b+c] = X2            2   0.38452       0.33717     TRUE
[a+b+c] = X1+X2       5   0.66351       0.59036     TRUE
Individual fractions                                    
[a] = X1|X2           3                 0.25318     TRUE
[b]                   0                 0.19561    FALSE
[c] = X2|X1           2                 0.14156     TRUE
[d] = Residuals                         0.40964    FALSE
---
Use function ‘rda’ to test significance of fractions of interest
> plot(spe.part, digits=2)
> # 所有可测部分的置换检验
> # [a+b]部分的检验
> anova.cca(rda(spe.hel, envchem.pars), step=1000)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(X = spe.hel, Y = envchem.pars)
         Df Variance      F Pr(>F)    
Model     3  0.25520 8.5992  0.001 ***
Residual 25  0.24731                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> # [b+c]部分的检验
> anova.cca(rda(spe.hel, envtopo.pars), step=1000)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(X = spe.hel, Y = envtopo.pars)
         Df Variance      F Pr(>F)    
Model     2  0.19322 8.1216  0.001 ***
Residual 26  0.30929                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> # [a+b+c]部分的检验
> env.pars <- cbind(envchem.pars, envtopo.pars)
> anova.cca(rda(spe.hel, env.pars), step=1000)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(X = spe.hel, Y = env.pars)
         Df Variance      F Pr(>F)    
Model     5  0.33342 9.0704  0.001 ***
Residual 23  0.16909                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> # [a]部分的检验
> anova.cca(rda(spe.hel, envchem.pars, envtopo.pars), step=1000)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(X = spe.hel, Y = envchem.pars, Z = envtopo.pars)
         Df Variance      F Pr(>F)    
Model     3  0.14020 6.3565  0.001 ***
Residual 23  0.16909                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> # [c]部分的检验
> anova.cca(rda(spe.hel, envtopo.pars, envchem.pars), step=1000)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(X = spe.hel, Y = envtopo.pars, Z = envchem.pars)
         Df Variance      F Pr(>F)    
Model     2 0.078219 5.3197  0.001 ***
Residual 23 0.169091                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> #各个部分置换检验有不显著的吗?
> # 3.无变量nit(硝酸盐浓度)的变差分解
> envchem.pars2 <- envchem[, c(6,7)]
> (spe.part2 <- varpart(spe.hel, envchem.pars2, envtopo.pars))

Partition of variance in RDA 

Call: varpart(Y = spe.hel, X = envchem.pars2, envtopo.pars)

Explanatory tables:
X1:  envchem.pars2
X2:  envtopo.pars 

No. of explanatory tables: 2 
Total variation (SS): 14.07 
            Variance: 0.50251 
No. of observations: 29 

Partition table:
                     Df R.squared Adj.R.squared Testable
[a+b] = X1            2   0.39263       0.34591     TRUE
[b+c] = X2            2   0.38452       0.33717     TRUE
[a+b+c] = X1+X2       4   0.65265       0.59476     TRUE
Individual fractions                                    
[a] = X1|X2           2                 0.25759     TRUE
[b]                   0                 0.08832    FALSE
[c] = X2|X1           2                 0.24885     TRUE
[d] = Residuals                         0.40524    FALSE
---
Use function ‘rda’ to test significance of fractions of interest
> plot(spe.part2, digits=2)
RDA 作为多元方差分析(MANOVA)的工具
# 基于RDA的双因素MANOVA
# **************************
# 生成代表海拔的因子变量(3个水平,每个水平含9个样方)
alt.fac <- gl(3, 9)
# 生成近似模拟pH值的因子变量
pH.fac <- as.factor(c(1, 2, 3, 2, 3, 1, 3, 2, 1, 2, 1, 3, 3, 2, 1, 1, 2, 3, 
  2, 1, 2, 3, 2, 1, 1, 3, 3))
# 两个因子是否平衡?
table(alt.fac, pH.fac)
table(alt.fac, pH.fac)
# 用Helmert对照法编码因子和它们的交互作用项
alt.pH.helm <- model.matrix(~ alt.fac * pH.fac, 
 contrasts=list(alt.fac="contr.helmert", pH.fac="contr.helmert"))
alt.pH.helm

  (Intercept) alt.fac1 alt.fac2 pH.fac1 pH.fac2 alt.fac1:pH.fac1 alt.fac2:pH.fac1 alt.fac1:pH.fac2 alt.fac2:pH.fac2
1            1       -1       -1      -1      -1                1                1                1                1
2            1       -1       -1       1      -1               -1               -1                1                1
3            1       -1       -1       0       2                0                0               -2               -2
4            1       -1       -1       1      -1               -1               -1                1                1
5            1       -1       -1       0       2                0                0               -2               -2
6            1       -1       -1      -1      -1                1                1                1                1
7            1       -1       -1       0       2                0                0               -2               -2
8            1       -1       -1       1      -1               -1               -1                1                1
9            1       -1       -1      -1      -1                1                1                1                1
10           1        1       -1       1      -1                1               -1               -1                1
11           1        1       -1      -1      -1               -1                1               -1                1
12           1        1       -1       0       2                0                0                2               -2
13           1        1       -1       0       2                0                0                2               -2
14           1        1       -1       1      -1                1               -1               -1                1
15           1        1       -1      -1      -1               -1                1               -1                1
16           1        1       -1      -1      -1               -1                1               -1                1
17           1        1       -1       1      -1                1               -1               -1                1
18           1        1       -1       0       2                0                0                2               -2
19           1        0        2       1      -1                0                2                0               -2
20           1        0        2      -1      -1                0               -2                0               -2
21           1        0        2       1      -1                0                2                0               -2
22           1        0        2       0       2                0                0                0                4
23           1        0        2       1      -1                0                2                0               -2
24           1        0        2      -1      -1                0               -2                0               -2
25           1        0        2      -1      -1                0               -2                0               -2
26           1        0        2       0       2                0                0                0                4
27           1        0        2       0       2                0                0                0                4
attr(,"assign")
[1] 0 1 1 2 2 3 3 3 3
attr(,"contrasts")
attr(,"contrasts")$alt.fac
[1] "contr.helmert"

attr(,"contrasts")$pH.fac
[1] "contr.helmert"

#检查当前对照法产生的表格,哪一列代表海拔因子、pH值和交互作用项?
# 检查Helmert对照表属性1:每个变量的和为1

apply(alt.pH.helm[, 2:9], 2, sum) 
# 检查Helmert对照表属性2:变量之间不相关
cor(alt.pH.helm[, 2:9]) 
                alt.fac1 alt.fac2 pH.fac1 pH.fac2 alt.fac1:pH.fac1 alt.fac2:pH.fac1 alt.fac1:pH.fac2 alt.fac2:pH.fac2
alt.fac1                1        0       0       0                0                0                0                0
alt.fac2                0        1       0       0                0                0                0                0
pH.fac1                 0        0       1       0                0                0                0                0
pH.fac2                 0        0       0       1                0                0                0                0
alt.fac1:pH.fac1        0        0       0       0                1                0                0                0
alt.fac2:pH.fac1        0        0       0       0                0                1                0                0
alt.fac1:pH.fac2        0        0       0       0                0                0                1                0
alt.fac2:pH.fac2        0        0       0       0                0                0                0                1
# 使用函数betadisper()(vegan包)(Marti Anderson检验)验证组内协方差矩阵# 的齐性
spe.hel.d1 <- dist(spe.hel[1:27,])
# 海拔因子
(spe.hel.alt.MHV <- betadisper(spe.hel.d1, alt.fac))
    Homogeneity of multivariate dispersions

Call: betadisper(d = spe.hel.d1, group = alt.fac)

No. of Positive Eigenvalues: 26
No. of Negative Eigenvalues: 0

Average distance to median:
     1      2      3 
0.5208 0.5175 0.3881 

Eigenvalues for PCoA axes:
 PCoA1  PCoA2  PCoA3  PCoA4  PCoA5  PCoA6  PCoA7  PCoA8 
6.5329 1.7407 1.2269 1.0591 0.6117 0.4683 0.3987 0.3207 

anova(spe.hel.alt.MHV)
Analysis of Variance Table

Response: Distances
          Df Sum Sq  Mean Sq F value Pr(>F)
Groups     2 0.1032 0.051602  0.8763 0.4292
Residuals 24 1.4133 0.058889       

permutest(spe.hel.alt.MHV) # 置换检验

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 999

Response: Distances
          Df Sum Sq  Mean Sq      F N.Perm Pr(>F)
Groups     2 0.1032 0.051602 0.8763    999  0.439
Residuals 24 1.4133 0.058889                

> # pH值因子
> (spe.hel.pH.MHV <- betadisper(spe.hel.d1, pH.fac))

    Homogeneity of multivariate dispersions

Call: betadisper(d = spe.hel.d1, group = pH.fac)

No. of Positive Eigenvalues: 26
No. of Negative Eigenvalues: 0

Average distance to median:
     1      2      3 
0.6658 0.6716 0.7019 

Eigenvalues for PCoA axes:
 PCoA1  PCoA2  PCoA3  PCoA4  PCoA5  PCoA6  PCoA7  PCoA8 
6.5329 1.7407 1.2269 1.0591 0.6117 0.4683 0.3987 0.3207 
> anova(spe.hel.pH.MHV)
Analysis of Variance Table

Response: Distances
          Df  Sum Sq   Mean Sq F value Pr(>F)
Groups     2 0.00676 0.0033802  0.1587 0.8542
Residuals 24 0.51124 0.0213018               
> permutest(spe.hel.pH.MHV) #置换检验

Permutation test for homogeneity of multivariate dispersions
Permutation: free
Number of permutations: 999

Response: Distances
          Df  Sum Sq   Mean Sq      F N.Perm Pr(>F)
Groups     2 0.00676 0.0033802 0.1587    999  0.855
Residuals 24 0.51124 0.0213018                     
> #组内协方差齐性,可以继续分析。

# 首先检验交互作用项。海拔因子和pH因子构成协变量矩阵 
interaction.rda <- rda(spe.hel[1:27, ], alt.pH.helm[, 6:9], alt.pH.helm[, 2:5]) 
anova(interaction.rda, step=1000, perm.max=1000) 
#交互作用是否显著?显著的交互作用表示一个因子的影响依赖另一个因子
#的水平,这将妨害主因子变量的分析。
# 检验海拔因子的效应,此时pH值因子和交互作用项作为协变量矩阵 
factor.alt.rda <- rda(spe.hel[1:27, ], alt.pH.helm[, 2:3], alt.pH.helm[, 4:9]) 
anova(factor.alt.rda, step=1000, perm.max=1000, strata=pH.fac) 
#海拔因子影响是否显著?
#检验pH值因子的效应,此时海拔值因子和交互作用项作为协变量矩阵
factor.pH.rda <- rda(spe.hel[1:27, ], alt.pH.helm[, 4:5], 
  alt.pH.helm[, c(2:3, 6:9)]) 
anova(factor.pH.rda, step=1000, perm.max=1000, strata=alt.fac) 
#pH值影响是否显著?
# RDA和显著影响的海拔因子三序图
alt.rda.out <- rda(spe.hel[1:27,]~., as.data.frame(alt.fac))
plot(alt.rda.out, scaling=1, display=c("sp", "wa", "cn"), 
    main="Multivariate ANOVA, factor altitude - scaling 1 - wa scores")
spe.manova.sc <- scores(alt.rda.out, choices=1:2, scaling=1, display="sp")
arrows(0, 0, spe.manova.sc[, 1], spe.manova.sc[, 2], length=0, col="red")
基于距离的RDA分析
> # 基于距离的RDA分析(db-RDA)
> # ****************************
> # 1.分步计算
> spe.bray <- vegdist(spe[1:27, ], "bray")
> spe.pcoa <- cmdscale(spe.bray, k=nrow(spe[1:27, ])-1, eig=TRUE, add=TRUE)
> spe.scores <- spe.pcoa$points
> # 交互作用的检验。从协变量矩阵获得Helmert对照编码海拔因子和pH值因子
> interact.dbrda <- rda(spe.scores[1:27, ], alt.pH.helm[, 6:9], alt.pH.helm[, 2:5])
> anova(interact.dbrda, step=1000, perm.max=1000)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(X = spe.scores[1:27, ], Y = alt.pH.helm[, 6:9], Z = alt.pH.helm[, 2:5])
         Df Variance     F Pr(>F)
Model     4 0.021857 0.441      1
Residual 18 0.223016             
> #交互作用是否显著?如果没有,可以继续检验主因子的效应(此处未显示)
> # 2.直接使用vegan包内capscale()函数运行。只能以模型界面运行。响应变量
> #可以是原始数据矩阵。
> interact.dbrda2 <- capscale(spe[1:27,] ~ alt.fac*pH.fac + Condition(alt.fac+pH.fac), distance="bray", add=TRUE)
> anova(interact.dbrda2, step=1000, perm.max=1000)
Permutation test for capscale under reduced model
Permutation: free
Number of permutations: 999

Model: capscale(formula = spe[1:27, ] ~ alt.fac * pH.fac + Condition(alt.fac + pH.fac), distance = "bray", add = TRUE)
         Df SumOfSqs      F Pr(>F)
Model     4   0.4667 0.4811      1
Residual 18   4.3658              
> # 或者响应变量可以是相异矩阵。
> interact.dbrda3 <- capscale(spe.bray ~ alt.fac*pH.fac + Condition(alt.fac+pH.fac), add=TRUE)
> anova(interact.dbrda3, step=1000, perm.max=1000)
Permutation test for capscale under reduced model
Permutation: free
Number of permutations: 999

Model: capscale(formula = spe.bray ~ alt.fac * pH.fac + Condition(alt.fac + pH.fac), add = TRUE)
         Df SumOfSqs      F Pr(>F)
Model     4   0.4667 0.4811      1
Residual 18   4.3658            
非线性关系的RDA分析
> # 二阶解释变量的RDA
> # *******************
> # 生成das和das正交二阶项(由poly()函数获得)矩阵
> das.df <- poly(das, 2)
> colnames(das.df) <- c("das", "das2")
> # 验证两个变量是否显著
> forward.sel(spe, das.df)
Testing variable 1
Testing variable 2
  variables order         R2     R2Cum  AdjR2Cum         F  pval
1       das     1 0.44777219 0.4477722 0.4273193 21.892865 0.001
2      das2     2 0.07870749 0.5264797 0.4900550  4.321662 0.005
> # RDA和置换检验
> spe.das.rda <- rda(spe ~ ., as.data.frame(das.df))
> anova(spe.das.rda)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(formula = spe ~ das + das2, data = as.data.frame(das.df))
         Df Variance      F Pr(>F)    
Model     2   36.304 14.454  0.001 ***
Residual 26   32.652                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> # 三序图(拟合的样方坐标,2型标尺)
> plot(spe.das.rda, scaling=2, display=c("sp", "lc", "cn"), 
+   main="RDA三序图:spe~das + das2 - 2型标尺 - 拟合的样方坐标")
> spe6.sc <- scores(spe.das.rda, choices=1:2, scaling=2, display="sp")
> arrows(0, 0, spe6.sc[, 1], spe6.sc[, 2], length=0, lty=1, col="red")

# 4种鱼类的分布地图 
# ******************
par(mfrow=c(2, 2))
plot(spa$x, spa$y, asp=1, col="brown", cex=spe$TRU, 
    xlab="x (km)", ylab="y (km)", main="褐鳟")
lines(spa$x, spa$y, col="light blue")
plot(spa$x, spa$y, asp=1, col="brown", cex=spe$OMB, 
    xlab="x (km)", ylab="y (km)", main="鳟鱼")
lines(spa$x, spa$y, col="light blue")
plot(spa$x, spa$y, asp=1, col="brown", cex=spe$ABL, 
    xlab="x (km)", ylab="y (km)", main="欧鮊鱼")
lines(spa$x, spa$y, col="light blue")
plot(spa$x, spa$y, asp=1, col="brown", cex=spe$TAN, 
    xlab="x (km)", ylab="y (km)", main="鲤鱼")
lines(spa$x, spa$y, col="light blue")

自写代码角
为了能够正确自写RDA分析代码,有必要参考Legendre和Legendre
(1998)第11.1节相关内容。下面是计算步骤(基于协方差矩阵的RDA)
1.计算中心化的物种数据矩阵与标准化解释变量矩阵的多元线性回归,获得拟合值矩阵;
2.计算拟合值矩阵的PCA;
3.计算两类样方坐标;
4.结果输出。
下面代码解释部分使用的公式编码与Legendre和Legendre(1998)一致。
下面的代码集中在RDA约束部分,目的是使读者对RDA数学过程感兴趣,而不是最优化程序。

myRDA <- function(Y,X) {
    # 1.数据的准备
    # *************
    Y.mat <- as.matrix(Y)
    Yc <- scale(Y.mat, scale=FALSE)
    X.mat <- as.matrix(X)
    Xcr  <- scale(X.mat)
    # 2.多元线性回归的计算
    # *********************
    # 回归系数矩阵 (eq. 11.4)
    B <- solve(t(Xcr) %*% Xcr) %*% t(Xcr) %*% Yc
    # 拟合值矩阵(eq. 11.5)
    Yhat <- Xcr %*% B
    # 残差矩阵
    Yres <- Yc - Yhat
    #维度
    n <- nrow(Y)
    p <- ncol(Y)
    m <- ncol(X)
    # 3. 拟合值PCA分析 
    # ******************
    # 协方差矩阵 (eq. 11.7)
    S <- cov(Yhat)
    # 特征根分解
    eigenS <- eigen(S)
    # 多少个典范轴?
    kc <- length(which(eigenS$values > 0.00000001))
    # 典范轴特征根
    ev <- eigenS$values[1:kc]
    # 矩阵Yc(中心化)的总方差(惯量)
    trace = sum(diag(cov(Yc)))
    # 正交特征向量(响应变量的贡献,1型标尺)
    U <- eigenS$vectors[,1:kc]
    row.names(U) <- colnames(Y)
    # 样方坐标(vegan包内'wa' 坐标,1型标尺eq. 11.12)
    F <- Yc %*% U
    row.names(F) <- row.names(Y)
    # 样方约束(vegan包内'lc' 坐标,2型标尺eq. 11.13)
    Z <- Yhat %*% U
    row.names(Z) <- row.names(Y)
    # 典范系数 (eq. 11.14)
    CC <- B %*% U
    row.names(CC) <- colnames(X)
    # 解释变量
    # 物种-环境相关
    corXZ <- cor(X,Z)
    # 权重矩阵的诊断
    D <- diag(sqrt(ev/trace))
    # 解释变量双序图坐标
    coordX <- corXZ %*% D    #1型标尺 
    coordX2 <- corXZ         #2型标尺
    row.names(coordX) <- colnames(X)
    row.names(coordX2) <- colnames(X)
    # 相对特征根平方根转化(为2型标尺)
    U2 <- U %*% diag(sqrt(ev))
    row.names(U2) <- colnames(Y)
    F2 <- F %*% diag(1/sqrt(ev))
    row.names(F2) <- row.names(Y)
    Z2 <- Z %*% diag(1/sqrt(ev))
    row.names(Z2) <- row.names(Y)
    # 未校正R2 
    R2 <- sum(ev/trace)
    # 校正R2
    R2adj <- 1-((n-1)/(n-m-1))*(1-R2)
    # 4.残差的PCA 
    # *******************
    # 与第5章相同,写自己的代码,可以从这里开始... 
    #     eigenSres <- eigen(cov(Yres))
    #     evr <- eigenSres$values
    # 5.输出Output
    result <- list(trace, R2, R2adj,ev,CC,U,F,Z,coordX, U2,F2,Z2,coordX2)
    names(result) <- c("Total_variance", "R2", "R2a","Can_ev", "Can_coeff", 
    "Species_sc1", "wa_sc1", "lc_sc1", "Biplot_sc1", "Species_sc2", 
    "wa_sc2", "lc_sc2", "Biplot_sc2") 

    result
}
#将此函数应用到Doubs鱼类数据和环境数据的RDA分析
doubs.myRDA <- myRDA(spe.hel,env)
summary(doubs.myRDA)

               Length Class  Mode   
Total_variance   1    -none- numeric
R2               1    -none- numeric
R2a              1    -none- numeric
Can_ev          10    -none- numeric
Can_coeff      100    -none- numeric
Species_sc1    270    -none- numeric
wa_sc1         290    -none- numeric
lc_sc1         290    -none- numeric
Biplot_sc1     100    -none- numeric
Species_sc2    270    -none- numeric
wa_sc2         290    -none- numeric
lc_sc2         290    -none- numeric
Biplot_sc2     100    -none- numeric

置换检验(R语言实现)

上一篇下一篇

猜你喜欢

热点阅读