Getting and Cleaning Data: Week 3
2021-10-23
Chamberzero
Subsetting and Sorting
Subsetting

Logical ands and ors
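A minimal sketch of row subsetting with logical conditions. The data frame `df` and its columns `var1`, `var2`, `var3` are made up for illustration. It also shows why `which()` is handy when the column contains NA.

```r
# Hypothetical data; var2 contains missing values
df <- data.frame(
  var1 = c(2, 1, 3, 5, 4),
  var2 = c(NA, 10, NA, 6, 9),
  var3 = c(15, 11, 12, 14, 13)
)

# AND: both conditions must hold
and_rows <- df[df$var1 <= 3 & df$var3 > 11, ]

# OR: at least one condition must hold
or_rows <- df[df$var1 <= 3 | df$var3 > 13, ]

# A plain logical index keeps an all-NA row for every NA in the condition
with_na <- df[df$var2 > 8, ]

# which() returns only the positions that are TRUE, so the NA rows are dropped
without_na <- df[which(df$var2 > 8), ]
```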


Sorting

Ordering
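Note that order() returns positions, not values: its i-th element is the current position of the value that belongs in slot i after sorting. For example, in descending order the first slot should hold the maximum; if the maximum currently sits at position 23, the first value returned by order() is 23.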
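The behaviour described above can be checked on a small vector. The vector `x` is an invented example, not data from the course.

```r
x <- c(10, 50, 30, 40, 20)

# Positions of the values in descending order
idx <- order(x, decreasing = TRUE)
idx      # 2 4 3 5 1: the maximum (50) currently sits at position 2

# Indexing with those positions gives the sorted vector
x[idx]   # 50 40 30 20 10

# The same idea sorts the rows of a data frame by one of its columns
df <- data.frame(id = c("a", "b", "c", "d", "e"), value = x)
sorted_df <- df[order(df$value), ]
```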

Ordering with plyr

Adding rows and columns
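A short base R sketch of the usual ways to add columns and rows. The data frame and the column names (`var1`, `var4`, `flag`) are illustrative assumptions.

```r
df <- data.frame(var1 = 1:3)

# Add a column by assignment
df$var4 <- c(0.1, 0.2, 0.3)

# cbind() binds a new column onto the side
wide <- cbind(df, flag = c(TRUE, FALSE, TRUE))

# rbind() appends rows; the new row must have the same column names
tall <- rbind(df, data.frame(var1 = 4L, var4 = 0.4))
```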

Summarizing data
Look at a bit of the data
head()
tail()
Make summary
summary()

More in-depth information
str()

Quantiles of quantitative variables
quantile()
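A small sketch of quantile() on an invented vector with one missing value. Without `na.rm = TRUE` the call stops with an error when NA is present.

```r
scores <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)

# Default probabilities: 0%, 25%, 50%, 75%, 100%
q <- quantile(scores, na.rm = TRUE)
q   # 1 3 5 7 9

# Custom probabilities
q_custom <- quantile(scores, probs = c(0.5, 0.75, 0.9), na.rm = TRUE)
```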

Make table
table()

table() can also produce a two-dimensional table.
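As a sketch, here are a one-way and a two-way table on made-up data (the columns `zip` and `council` are assumptions). By default table() silently drops NA; `useNA = "ifany"` adds an NA category so missing values stay visible.

```r
df <- data.frame(
  zip = c("21205", "21205", "21231", NA, "21231", "21205"),
  council = c(1, 2, 1, 2, 1, 1)
)

# One-way table, counting missing values as their own category
one_way <- table(df$zip, useNA = "ifany")

# Two-way table: rows are council districts, columns are zip codes
two_way <- table(df$council, df$zip)
```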

Check for missing values
sum()
any()
all()
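These three functions are typically combined with is.na() or a logical condition. A minimal sketch on an invented vector:

```r
x <- c(4, NA, 7, 0, NA)

n_missing <- sum(is.na(x))    # how many values are missing: 2
has_missing <- any(is.na(x))  # is anything missing at all: TRUE

# all() returns NA when a comparison involves NA and nothing is FALSE
all_nonneg_raw <- all(x >= 0)             # NA
all_nonneg <- all(x >= 0, na.rm = TRUE)   # TRUE
all_positive <- all(x > 0, na.rm = TRUE)  # FALSE, because of the 0
```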

Row and column sums
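One common use, sketched here on made-up data, is counting missing values per column or per row by summing the logical matrix returned by is.na().

```r
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6), c = c(7, 8, 9))

col_missing <- colSums(is.na(df))   # a = 1, b = 2, c = 0
row_missing <- rowSums(is.na(df))   # 1, 2, 0

# TRUE only if no column has any missing value
complete <- all(col_missing == 0)   # FALSE
```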

Values with specific characteristics
%in%
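A sketch of %in% for picking rows whose value belongs to a set. The data frame and the zip codes are illustrative.

```r
df <- data.frame(
  zip = c("21212", "21213", "21205", "21212"),
  name = c("A", "B", "C", "D")
)

# Logical vector: TRUE where zip is one of the listed codes
keep <- df$zip %in% c("21212", "21213")

# Count matches, then use the same vector to subset rows
n_match <- sum(keep)
subset_rows <- df[keep, ]
```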


Cross tabs


Flat tables
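A sketch of cross tabs and flat tables using the built-in `warpbreaks` data set. The added `replicate` column is an assumption, introduced only to get a third dimension.

```r
wb <- warpbreaks

# Cross tab: total breaks for each wool x tension combination
ct <- xtabs(breaks ~ wool + tension, data = wb)

# Add a third variable, then cross-tabulate over all other columns
wb$replicate <- rep(1:9, length.out = nrow(wb))
ct3 <- xtabs(breaks ~ ., data = wb)

# ftable() flattens the 3-way table into one compact 2-d layout
flat <- ftable(ct3)
```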

Size of a data set

Creating New Variables
Creating sequences
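Sequences are often used as indices. A minimal sketch of three common forms:

```r
s_by <- seq(1, 10, by = 2)           # 1 3 5 7 9
s_len <- seq(1, 10, length.out = 3)  # 1.0 5.5 10.0

# One index per element of an existing vector
x <- c(1, 3, 8, 25, 100)
s_idx <- seq_along(x)                # 1 2 3 4 5
```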

Subsetting variables

Creating binary variables
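A sketch using ifelse() to flag suspicious values. The vector `zip` and the rule "negative zip codes are wrong" are assumptions for illustration.

```r
zip <- c(21205, -21226, 21231, -1)

# TRUE where the zip code is negative, FALSE otherwise
zip_wrong <- ifelse(zip < 0, TRUE, FALSE)

# Cross-check the new variable against the condition it encodes
check <- table(zip_wrong, zip < 0)
```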

Creating categorical variables

Easier cutting
Hmisc::cut2

Creating factor variables

Levels of factor variables
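By default factor() orders levels alphabetically; relevel() moves a chosen level to the front. A small sketch on an invented vector:

```r
yesno <- c("yes", "no", "yes", "yes", "no")

f <- factor(yesno)
levels(f)                      # "no" "yes"

# Make "yes" the reference (first) level
f2 <- relevel(f, ref = "yes")
levels(f2)                     # "yes" "no"

# The integer codes follow the level order
codes <- as.numeric(f2)        # 1 2 1 1 2
```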

Cutting produces factor variables
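A sketch of cut() with quantile breaks, on an invented vector. Note `include.lowest = TRUE`: without it the minimum falls outside the first interval and becomes NA.

```r
values <- c(2, 5, 7, 11, 13, 17, 19, 23)

# Break points at the 0%, 25%, 50%, 75%, 100% quantiles
groups <- cut(values, breaks = quantile(values), include.lowest = TRUE)

is.factor(groups)        # TRUE: cut() returns a factor
counts <- table(groups)  # two values fall in each of the four intervals

# Without include.lowest the smallest value is not assigned to any interval
groups_open <- cut(values, breaks = quantile(values))
```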

Using the mutate function

Common transforms
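A few base R transforms that come up often, sketched on simple inputs:

```r
x <- 3.475

abs(-2)         # absolute value: 2
sqrt(16)        # square root: 4
ceiling(x)      # round up: 4
floor(x)        # round down: 3
signif(x, 2)    # two significant digits: 3.5
log2(8)         # base-2 logarithm: 3
exp(0)          # exponential: 1
```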

Reshaping data
Start with reshaping

Melting data frames

Casting data frames

Averaging values
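A minimal sketch with tapply(), which applies a function to one vector within the groups defined by another. It uses the built-in `InsectSprays` data set, and `sum` could be swapped for `mean` to get averages.

```r
# Sum of insect counts within each spray type
spray_sums <- tapply(InsectSprays$count, InsectSprays$spray, sum)

# Mean count within each spray type
spray_means <- tapply(InsectSprays$count, InsectSprays$spray, mean)
```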

Another way
spIns = split(InsectSprays$count,InsectSprays$spray)
sprCount = lapply(spIns,sum)
unlist(sprCount)
sapply(spIns,sum)
Another way 2
library(plyr)
ddply(InsectSprays, .(spray), summarize, sum = sum(count))
dplyr
Verbs
- select: return a subset of the columns of a data frame
- filter: extract a subset of rows from a data frame based on logical conditions
- arrange: reorder rows of a data frame
- rename: rename variables in a data frame
- mutate: add new variables/columns or transform existing variables
- summarise / summarize: generate summary statistics of different variables in the data frame, possibly within strata
Functions
select()
filter()
arrange()
rename()
mutate()
group_by()
%>%
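The functions above can be chained with %>%. This is a sketch that assumes the dplyr package is installed; it uses the built-in `mtcars` data set, and the new names (`weight`, `heavy`, `mean_mpg`, `share_heavy`) are made up for the example.

```r
library(dplyr)

result <- mtcars %>%
  select(mpg, cyl, wt) %>%          # keep three columns
  filter(wt < 4) %>%                # keep lighter cars only
  rename(weight = wt) %>%           # new_name = old_name
  mutate(heavy = weight > 3) %>%    # add a derived column
  group_by(cyl) %>%                 # one group per cylinder count
  summarise(
    mean_mpg = mean(mpg),
    share_heavy = mean(heavy)
  ) %>%
  arrange(desc(mean_mpg))           # best fuel economy first
```

Each step receives the result of the previous one as its first argument, so no intermediate variables are needed.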

Merging data
Merging data - merge()
- Merges data frames
- Important parameters: x, y, by, by.x, by.y, all
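A sketch of how these parameters interact, on two invented data frames. `by.x` and `by.y` name the key column in each input when the names differ, and `all = TRUE` keeps unmatched rows from both sides, filling the gaps with NA.

```r
reviews <- data.frame(
  id = 1:4,
  solution_id = c(3, 1, 2, 9),
  score = c(5, 4, 3, 2)
)
solutions <- data.frame(
  id = c(1, 2, 3, 4),
  answer = c("a", "b", "c", "d")
)

# Outer join: keys 1, 2, 3 match; 4 and 9 appear with NA on one side
merged <- merge(reviews, solutions,
                by.x = "solution_id", by.y = "id", all = TRUE)

# Inner join (the default, all = FALSE): only matching keys are kept
inner <- merge(reviews, solutions, by.x = "solution_id", by.y = "id")
```

If `by` is not given, merge() joins on all column names the two data frames share, which here would be `id` and is usually not what is intended.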
Using join in the plyr package
