2021-05-12 Using the dplyr package: selecting columns with select

2021-05-12 NAome

select: choosing columns

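select() takes column names as arguments and returns the matching subset of columns.
Syntax: select(.data, ...)
Helper functions: dplyr provides selection helpers that work inside select() to pick variables, including starts_with, ends_with, contains, matches, one_of, num_range, and everything. For renaming, select() keeps only the columns given in its arguments, whereas rename() keeps all columns and renames only the given ones. Row names of the original data set are dropped once it is converted to a tibble, as is done below.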
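
The difference between select() and rename() when renaming can be seen in a small sketch. It assumes the built-in iris data set; the new name petal_length and the result variables are chosen here purely for illustration.

```r
library(dplyr)

# select() with new_name = old_name keeps only the listed column
renamed_subset <- select(iris, petal_length = Petal.Length)
names(renamed_subset)

# rename() keeps every column and changes only the given name
renamed_all <- rename(iris, petal_length = Petal.Length)
names(renamed_all)
```

The first call returns a single column called petal_length, while the second returns all five columns with only Petal.Length renamed.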

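# Load the package
library(dplyr)
library(tidyverse) # tidyverse is a meta-package that includes dplyr, so loading either one is enough

# Convert to a tibble for more readable printing
# (tbl_df() is deprecated since dplyr 1.0.0; as_tibble() is its replacement)
iris <- as_tibble(iris)
iris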
# Select columns directly by name
select(iris, Petal.Length, Petal.Width)
select(iris, c(Petal.Length, Petal.Width))
# Return every column except Petal.Length and Petal.Width
select(iris, -Petal.Length, -Petal.Width)
select(iris, -c(Petal.Length, Petal.Width))
select(iris, !c(Petal.Length, Petal.Width))
# Join two column names with a colon to select a contiguous range of columns
select(iris, Sepal.Length:Petal.Width)
# Negate the range to drop those columns instead (only Species remains)
select(iris, !Sepal.Length:Petal.Width)
# Select columns whose names start with "Petal"
select(iris, starts_with("Petal"))
# Select columns whose names do not start with "Petal"
select(iris, -starts_with("Petal"))
select(iris, !starts_with("Petal"))
# Select columns whose names end with "Width"
select(iris, ends_with("Width"))
# Select columns whose names do not end with "Width"
select(iris, -ends_with("Width"))
select(iris, !ends_with("Width"))
# Select columns whose names contain "etal"
select(iris, contains("etal"))
# Select columns whose names do not contain "etal"
select(iris, -contains("etal"))
select(iris, !contains("etal"))
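# Select columns whose names start with "Petal" AND end with "Width" (here: Petal.Width)
select(iris, starts_with("Petal") & ends_with("Width"))
# Select columns whose names start with "Petal" OR end with "Width"
select(iris, starts_with("Petal") | ends_with("Width"))
# Regular-expression match: ".t." matches names containing a t with at least one character on each side
# (matching is case-insensitive by default)
select(iris, matches(".t."))
# Regular-expression match: return the columns whose names do not match ".t."
select(iris, -matches(".t."))
select(iris, !matches(".t."))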
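# Select the columns named in a character vector. Passing the bare vector to select() is ambiguous
# and discouraged, so wrap it in all_of(); one_of() still works but is superseded
vars <- c("Petal.Length", "Petal.Width")
select(iris, one_of(vars))
select(iris, all_of(vars))
# Return every column except those named in the character vector
select(iris, -one_of(vars))
# Return all columns; mostly useful when reordering the variables of a data set
select(iris, everything())
# Reorder columns: move Species to the front
select(iris, Species, everything())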
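
Two helpers are not exercised in the examples above: any_of(), the lenient counterpart of all_of(), and num_range(), which is listed in the introduction. The following is a minimal sketch, assuming dplyr 1.0.0 or later; the name vector wanted (with a deliberately non-existent Petal.Color column) and the scores data frame are hypothetical and exist only for illustration.

```r
library(dplyr)

# Hypothetical name vector: "Petal.Color" is not a column of iris
wanted <- c("Petal.Length", "Petal.Color")

# any_of() silently skips names that do not exist
lenient <- select(iris, any_of(wanted))
names(lenient)

# all_of() signals an error when a name is missing; catch it for illustration
strict <- tryCatch(
  select(iris, all_of(wanted)),
  error = function(e) "error: missing column"
)
strict

# Hypothetical data frame with numbered columns (iris has none)
scores <- data.frame(id = 1:3, x1 = c(5, 6, 7), x2 = c(1, 2, 3), x3 = c(9, 8, 7))

# num_range("x", 1:2) matches the columns x1 and x2
picked <- select(scores, num_range("x", 1:2))
names(picked)
```

all_of() is the safer choice when every listed column must be present, whereas any_of() is convenient when the vector may name columns that some data sets lack.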

Reference

https://dplyr.tidyverse.org/reference/select.html
https://blog.csdn.net/wltom1985/article/details/54973811
