
R Language Study Notes (7): Factors

2021-01-21  Akuooo

Reference video: https://www.bilibili.com/video/BV19x411X7C6?p=23

I. Factor concepts

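  1. Types of variables in R
    (1) Nominal variables (e.g. city or province names; the values are independent of one another and have no inherent order)
    Commonly stored as: character strings
    (2) Ordinal variables (the values have an order but are not continuous, e.g. good-better-best)
    (3) Continuous variables (e.g. amounts of money or population counts; can take any value within a range)
    Commonly stored as: numeric values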


    Figure: variable classification (Mendel's peas)
  2. Definition

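In R, nominal and ordinal variables are called factors. Each possible value of such a categorical variable is called a level.
For example, good, better, and best are each a level.

A vector whose elements take these level values is a factor (a factor is itself a vector; R stores it as integer codes plus a levels attribute).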
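
To make the relationship between a factor, its levels, and the underlying vector concrete, here is a minimal sketch. The variable name `rating` is made up for illustration; the values come from the good-better-best example above.

```r
# A small factor built from the good/better/best example in the text
rating <- factor(c("good", "best", "better", "good"))

levels(rating)      # the distinct levels, sorted alphabetically by default
nlevels(rating)     # how many levels there are
as.integer(rating)  # the internal integer codes, which index into levels(rating)
is.factor(rating)   # TRUE
```

Because the levels are sorted alphabetically by default, "good" ends up as the last level here, which is why an explicit `levels` argument is needed when the order carries meaning.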

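  3. What factors are used for
    A factor can record the treatment level each subject received in a study, or any other categorical variable.
    Applications: frequency counts, tests of independence, correlation tests, analysis of variance, principal component analysis, factor analysis, and so on.
    For example: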


    Figure: the mtcars dataset
> table(mtcars$cyl)  # the cyl column can be treated as a factor with levels 4, 6, and 8
 4  6  8 
11  7 14 
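
Beyond frequency counts, a factor typically serves as a grouping variable. The sketch below is one possible illustration, not something from the reference video: it uses `tapply()` to average `mpg` within each cylinder level. The names `cyl_f`, `counts`, and `mean_mpg` are chosen here for the example.

```r
# Treat the cylinder count as a factor and use it as a grouping variable
cyl_f <- factor(mtcars$cyl)

counts   <- table(cyl_f)                       # frequency of each level
mean_mpg <- tapply(mtcars$mpg, cyl_f, mean)    # mean mpg within each level

counts
round(mean_mpg, 2)
```

The result of `tapply()` is named by the factor levels, so each group mean can be looked up by level name, for example `mean_mpg[["4"]]`.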

II. Defining factors

  1. The factor() function
> f <- factor(c("red","red","green","blue"))
> f
[1] red   red   green blue 
Levels: blue green red

# Specify the factor levels and their order
> week <- factor(c("Mon","Fri","Thu","Wed","Mon","Fri","Sun"), ordered = TRUE, levels = c("Mon","Tue","Wed","Thu","Fri","Sat","Sun"))
> week
[1] Mon Fri Thu Wed Mon Fri Sun
Levels: Mon < Tue < Wed < Thu < Fri < Sat < Sun
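
What the `<` signs in the output buy you can be seen in a short sketch. It rebuilds the same `week` factor (the helper vector `days` is introduced here only to keep the call readable) and then compares, sorts, and tabulates it.

```r
days <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
week <- factor(c("Mon", "Fri", "Thu", "Wed", "Mon", "Fri", "Sun"),
               levels = days, ordered = TRUE)

week[2] > week[1]   # Fri comes after Mon in the declared order: TRUE
sort(week)          # sorted by level order, not alphabetically
table(week)         # unused levels (Tue, Sat) still appear, with a count of 0
max(week)           # the latest day present: Sun
```

Comparison operators and `max()` are only defined for ordered factors; on an unordered factor they return `NA` with a warning or raise an error.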

# Convert a vector directly into a factor
> fcyl <- factor(mtcars$cyl)
> fcyl
 [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
Levels: 4 6 8
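> plot(mtcars$cyl)
> plot(factor(mtcars$cyl))
Figure: plot of mtcars$cyl
Figure: plot of factor(mtcars$cyl)

As the plots show, the numeric vector is drawn as a scatter plot, while the factor is drawn as a bar chart.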
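
The difference comes from the class of the object that `plot()` receives. The following sketch checks the classes without drawing anything, and also shows a common pitfall when converting a factor back to numbers. The names `bar_heights`, `codes`, and `values` are introduced here for illustration.

```r
fcyl <- factor(mtcars$cyl)

class(mtcars$cyl)   # "numeric": plot() draws the values against their index
class(fcyl)         # "factor": plot() draws a bar chart of level counts

# The bar heights are the same counts that table() reports
bar_heights <- as.vector(table(fcyl))

# Converting back to numbers: as.numeric() alone returns the internal codes
codes  <- as.numeric(fcyl)                 # 1, 2, 3 rather than 4, 6, 8
values <- as.numeric(as.character(fcyl))   # the original 4, 6, 8
```

Going through `as.character()` first recovers the original values, because the level labels, not the integer codes, hold them.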

  2. cut()
    Group the numbers 1 to 100 into the intervals 1-10, 11-20, and so on.
> num <- 1:100
> cut(num, seq(0, 100, 10))  # assigns each number to its interval, which makes frequency counts easy
  [1] (0,10]   (0,10]   (0,10]   (0,10]   (0,10]   (0,10]   (0,10]   (0,10]   (0,10]  
 [10] (0,10]   (10,20]  (10,20]  (10,20]  (10,20]  (10,20]  (10,20]  (10,20]  (10,20] 
 [19] (10,20]  (10,20]  (20,30]  (20,30]  (20,30]  (20,30]  (20,30]  (20,30]  (20,30] 
 [28] (20,30]  (20,30]  (20,30]  (30,40]  (30,40]  (30,40]  (30,40]  (30,40]  (30,40] 
 [37] (30,40]  (30,40]  (30,40]  (30,40]  (40,50]  (40,50]  (40,50]  (40,50]  (40,50] 
 [46] (40,50]  (40,50]  (40,50]  (40,50]  (40,50]  (50,60]  (50,60]  (50,60]  (50,60] 
 [55] (50,60]  (50,60]  (50,60]  (50,60]  (50,60]  (50,60]  (60,70]  (60,70]  (60,70] 
 [64] (60,70]  (60,70]  (60,70]  (60,70]  (60,70]  (60,70]  (60,70]  (70,80]  (70,80] 
 [73] (70,80]  (70,80]  (70,80]  (70,80]  (70,80]  (70,80]  (70,80]  (70,80]  (80,90] 
 [82] (80,90]  (80,90]  (80,90]  (80,90]  (80,90]  (80,90]  (80,90]  (80,90]  (80,90] 
 [91] (90,100] (90,100] (90,100] (90,100] (90,100] (90,100] (90,100] (90,100] (90,100]
[100] (90,100]
10 Levels: (0,10] (10,20] (20,30] (30,40] (40,50] (50,60] (60,70] (70,80] ... (90,100]
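
Since `cut()` returns a factor, the frequency count mentioned above is a single `table()` call. The sketch below also passes the `labels` argument to replace the interval notation; the label strings in `labs` are an assumption made for this example to match the "1-10, 11-20" wording.

```r
num    <- 1:100
breaks <- seq(0, 100, 10)

bins <- cut(num, breaks)   # default intervals are open on the left, closed on the right
table(bins)                # each of the 10 intervals holds 10 numbers

# Hypothetical labels for this sketch: "1-10", "11-20", ..., "91-100"
labs <- paste(breaks[-length(breaks)] + 1, breaks[-1], sep = "-")
bins_labeled <- cut(num, breaks, labels = labs)
levels(bins_labeled)[1:3]
```

Starting the breaks at 0 rather than 1 matters: with the default `right = TRUE`, the value 1 would fall outside `(1,10]` and become `NA`.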
# The built-in state.division dataset is a factor
> state.division
 [1] East South Central Pacific            Mountain           West South Central
 [5] Pacific            Mountain           New England        South Atlantic    
 [9] South Atlantic     South Atlantic     Pacific            Mountain          
[13] East North Central East North Central West North Central West North Central
[17] East South Central West South Central New England        South Atlantic    
[21] New England        East North Central West North Central East South Central
[25] West North Central Mountain           West North Central Mountain          
[29] New England        Middle Atlantic    Mountain           Middle Atlantic   
[33] South Atlantic     West North Central East North Central West South Central
[37] Pacific            Middle Atlantic    New England        South Atlantic    
[41] West North Central East South Central West South Central Mountain          
[45] New England        South Atlantic     Pacific            South Atlantic    
[49] East North Central Mountain          
9 Levels: New England Middle Atlantic South Atlantic ... Pacific
# So is state.region
> state.region
 [1] South         West          West          South         West          West         
 [7] Northeast     South         South         South         West          West         
[13] North Central North Central North Central North Central South         South        
[19] Northeast     South         Northeast     North Central North Central South        
[25] North Central West          North Central West          Northeast     Northeast    
[31] West          Northeast     South         North Central North Central South        
[37] West          Northeast     Northeast     South         North Central South        
[43] South         West          Northeast     South         West          South        
[49] North Central West         
Levels: Northeast South North Central West
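
These two built-in factors are convenient for trying out the frequency counts described earlier. A minimal sketch, using only base R and the datasets shown above (the name `xt` is chosen here for the example):

```r
is.factor(state.region)   # TRUE
levels(state.region)      # the four regions, in their stored order
table(state.region)       # number of states in each region

# Cross-tabulate two factors: which divisions fall in which region
xt <- table(state.division, state.region)
dim(xt)                   # 9 divisions by 4 regions
```

A two-way table like `xt` is also the usual input for a test of independence such as `chisq.test()`.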