14.5 Type I, Type II, and Type III ANOVAs
It turns out that there is not just one way to calculate ANOVAs. In fact, there are three different types, called Type 1, 2, and 3 (or Type I, II, and III). These types differ in how they calculate variability (specifically, the sums of squares). If your data are relatively balanced, meaning that there are relatively equal numbers of observations in each group, then all three types will give you essentially the same answer. However, if your data are unbalanced, meaning that some groups of data have many more observations than others, then you need to use Type II (2) or Type III (3).
A balanced design is one in which every experimental condition has the same number of participants. Any other design is an unbalanced design.
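- Type I (sequential). Effects are computed in the order in which they appear in the formula. For example, with y ~ A + B + A:B:
  - A: the effect of A on y. A is not adjusted for anything.
  - B: the effect of B on y when controlling for A. B is adjusted for A.
  - A:B: the interaction of A and B when controlling for both main effects. A:B is adjusted for A and B.
  aov() in R uses Type I by default.
- Type II (hierarchical). Each effect is adjusted for the effects at the same or a lower level. A is adjusted for B, B is adjusted for A, and A:B is adjusted for both A and B.
- Type III (marginal). Each effect is adjusted for every other effect in the model. A is adjusted for B and A:B, B is adjusted for A and A:B, and A:B is adjusted for A and B. SPSS and SAS use Type III by default, so reproducing their results in R requires Type III. Because aov() uses Type I by default, use the Anova() function in the car package for Type III; see help(Anova, package = "car") for details.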
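The adjustments in the list above can be read as comparisons between nested models. The following is only a sketch under stated assumptions: the data frame d, its columns A, B, and y, and the cell counts are all invented for illustration, the model is additive (no interaction), and only base R is used. It computes the sum of squares for A once without adjustment (the Type I value when A is entered first) and once adjusted for B (the Type II value).

```r
# Hypothetical unbalanced two-factor data (all names and numbers are invented)
set.seed(42)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(30, 10))),
  B = factor(c(rep(c("b1", "b2"), times = c(25, 5)),   # cell counts within a1
               rep(c("b1", "b2"), times = c(2, 8))))   # cell counts within a2
)
d$y <- rnorm(nrow(d)) + 0.8 * (d$A == "a2") + 0.8 * (d$B == "b2")

# Residual sum of squares of a linear model fitted to d
rss <- function(formula) deviance(lm(formula, data = d))

# A unadjusted: how much y ~ A improves on the intercept-only model
ss_A_unadjusted <- rss(y ~ 1) - rss(y ~ A)

# A adjusted for B: how much y ~ A + B improves on y ~ B
ss_A_given_B <- rss(y ~ B) - rss(y ~ A + B)

c(unadjusted = ss_A_unadjusted, adjusted_for_B = ss_A_given_B)
```

Because the cell counts differ, A and B overlap in what they explain, so the two numbers disagree. In a balanced design they would coincide.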
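Because Type I is sequential, the more unbalanced the sample sizes are, the more the order of the effect terms influences the results. In general, the more fundamental an effect is, the earlier it should appear in the formula. Specifically:
- Covariates come first, then main effects, then two-way interactions, then three-way interactions, and so on.
- Among the main effects, the more fundamental effects should be placed earlier in the formula.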
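To see this order dependence concretely, here is a small sketch with simulated, deliberately unbalanced data. The data frame d2 and its columns are invented for illustration. anova() on an lm() fit reports sequential (Type I) sums of squares, so swapping the order of the two factors changes the sum of squares attributed to each factor, while the residual row stays the same.

```r
# Hypothetical unbalanced data (names and cell counts are invented)
set.seed(7)
d2 <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(30, 10))),
  B = factor(c(rep(c("b1", "b2"), times = c(25, 5)),
               rep(c("b1", "b2"), times = c(2, 8))))
)
d2$y <- rnorm(nrow(d2)) + 0.8 * (d2$A == "a2") + 0.8 * (d2$B == "b2")

# Same model, two term orders; anova() gives sequential (Type I) tables
tab_A_first <- anova(lm(y ~ A + B, data = d2))
tab_B_first <- anova(lm(y ~ B + A, data = d2))

# Compare the sums of squares credited to each term
tab_A_first[, c("Df", "Sum Sq")]
tab_B_first[, c("Df", "Sum Sq")]
```

Whichever factor is entered first is credited with the variability it shares with the other factor, which is why the more fundamental effect is usually placed first.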
The standard aov() function in base-R uses Type I sums of squares. Therefore, it is only appropriate when your data are balanced. If your data are unbalanced, you should conduct an ANOVA with Type II or Type III sums of squares. To do this, you can use the Anova() function in the car package. The Anova() function has an argument called type that allows you to specify the type of ANOVA you want to calculate.
In the next code chunk, I’ll calculate 3 separate ANOVAs from the poopdeck data using the three different types. First, I’ll create a regression object with lm(). As you’ll see, the Anova() function requires you to enter a regression object as the main argument, and not a formula and dataset. That is, you need to first create a regression object from the data with lm() (or glm()), and then enter that object into the Anova() function. You can also do the same thing with the standard aov() function.
# Step 1: Calculate regression object with lm()
# (the poopdeck data come from the yarrr package)
library(yarrr)
time.lm <- lm(formula = time ~ type + cleaner,
              data = poopdeck)
Now that I’ve created the regression object time.lm, I can calculate the three different types of ANOVAs by entering the object as the main argument to either aov() for a Type I ANOVA, or Anova() in the car package for a Type II or Type III ANOVA:
# Type I ANOVA - aov()
time.I.aov <- aov(time.lm)
# Type II ANOVA - Anova(type = 2)
time.II.aov <- car::Anova(time.lm, type = 2)
# Type III ANOVA - Anova(type = 3)
time.III.aov <- car::Anova(time.lm, type = 3)
As it happens, the data in the poopdeck dataframe are perfectly balanced, so we’ll get exactly the same result for each ANOVA type. However, if they were not balanced, then we should not use the Type I ANOVA calculated with the aov() function.
To see if your data are balanced, you can cross-tabulate the factors with the table() function:
# Are observations in the poopdeck data balanced?
with(poopdeck,
     table(cleaner, type))
##        type
## cleaner parrot shark
##       a    100   100
##       b    100   100
##       c    100   100
As you can see, in the poopdeck data, the observations are perfectly balanced, so it doesn’t matter which type of ANOVA we use to analyse the data.
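If you would rather have a single TRUE/FALSE answer than read the table by eye, a check along these lines is one option. This is a minimal sketch: is_balanced() is a hypothetical helper written for this example (it is not part of base R or car), and the balanced and unbalanced data frames are invented stand-ins for a 3 x 2 design like the one above.

```r
# Hypothetical helper: TRUE when every cell of the design has the same count
is_balanced <- function(data, factors) {
  counts <- table(data[factors])          # cross-tabulate the chosen columns
  length(unique(as.vector(counts))) == 1  # a single distinct cell count
}

# Invented data: 3 cleaners x 2 types, 100 observations per cell
balanced <- expand.grid(cleaner = c("a", "b", "c"),
                        type = c("parrot", "shark"),
                        rep = 1:100)
is_balanced(balanced, c("cleaner", "type"))    # TRUE

# Dropping a few rows leaves unequal cell counts
unbalanced <- balanced[-(1:10), ]
is_balanced(unbalanced, c("cleaner", "type"))  # FALSE
```

Note that this sketch treats only exactly equal cell counts as balanced. Deciding how much imbalance is tolerable is left to the analyst.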
For more detail on the different types, check out https://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/.
Reference: https://bookdown.org/ndphillips/YaRrr/type-i-type-ii-and-type-iii-anovas.html