Study Group Day 6 Notes - 朱殊璇
2020-07-08
朱殊璇
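Learning R packages: the dplyr package
dplyr is mainly used for cleaning and tidying data. Its main features include row selection, column selection, summary statistics, window functions, and combining data frames (for example, taking their intersection). It is an efficient, user-friendly data-processing package.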
I. Installing the dplyr package
![](https://img.haomeiwen.com/i23987107/c703064e1b534d65.jpg)
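For readers who cannot view the screenshot, here is a minimal sketch of installing and loading the package. The CRAN mirror URL is an assumption; any mirror works.

```r
# Install dplyr only if it is missing, then load it for this session.
# The mirror URL below is an assumption; substitute whichever CRAN mirror you prefer.
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr", repos = "https://cloud.r-project.org")
}
library(dplyr)
```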
II. Using dplyr: five basic functions
1. mutate(): add a new column
![](https://img.haomeiwen.com/i23987107/0211651005a46176.jpg)
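A minimal sketch of mutate(), assuming a six-row subset of the built-in iris data set as example data (the data frame name `test` and the column name `new` are illustrative):

```r
library(dplyr)

# Example data (an assumption): two rows from each iris species.
test <- iris[c(1:2, 51:52, 101:102), ]

# mutate() keeps all existing columns and appends the new one.
test_new <- mutate(test, new = Sepal.Length * Sepal.Width)
ncol(test_new)  # 6: the five original columns plus `new`
```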
2. select(): select columns
(1) Select by column number
![](https://img.haomeiwen.com/i23987107/4f4c611273958f80.jpg)
(2) Select by column name
![](https://img.haomeiwen.com/i23987107/c6dd188984feb703.jpg)
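Both ways of selecting columns can be sketched as follows, again assuming a small iris subset as example data. The variable `vars` is illustrative, and `all_of()` requires dplyr 1.0.0 or later.

```r
library(dplyr)

# Example data (an assumption): two rows from each iris species.
test <- iris[c(1:2, 51:52, 101:102), ]

# By position
by_pos1 <- select(test, 1)
by_pos2 <- select(test, c(1, 5))

# By name
by_name1 <- select(test, Sepal.Length)
by_name2 <- select(test, Petal.Length, Petal.Width)

# By a character vector of names stored in a variable
vars <- c("Petal.Length", "Petal.Width")
by_vars <- select(test, all_of(vars))
```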
3. filter(): select rows
![](https://img.haomeiwen.com/i23987107/239732801544a743.jpg)
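A minimal sketch of row filtering on an assumed iris subset. Conditions are ordinary logical expressions evaluated on the columns.

```r
library(dplyr)

# Example data (an assumption): two rows from each iris species.
test <- iris[c(1:2, 51:52, 101:102), ]

# Keep rows matching a single condition
f1 <- filter(test, Species == "setosa")

# Combine conditions with &
f2 <- filter(test, Species == "setosa" & Sepal.Length > 5)

# Match against several values with %in%
f3 <- filter(test, Species %in% c("setosa", "versicolor"))
```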
4. arrange(): sort the whole table by one or more columns
![](https://img.haomeiwen.com/i23987107/f1c43de8f7bbc55d.jpg)
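Sorting can be sketched like this on an assumed iris subset. The default order is ascending; wrapping a column in `desc()` reverses it.

```r
library(dplyr)

# Example data (an assumption): two rows from each iris species.
test <- iris[c(1:2, 51:52, 101:102), ]

# Ascending (default): smallest Sepal.Length first
asc <- arrange(test, Sepal.Length)

# Descending: largest Sepal.Length first
dsc <- arrange(test, desc(Sepal.Length))

# Several columns: sort by Species, then by Sepal.Length within each species
multi <- arrange(test, Species, Sepal.Length)
```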
5. summarise(): summarize
![](https://img.haomeiwen.com/i23987107/31a973423794edee.jpg)
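A minimal sketch of summarise(), assuming the same kind of iris subset. It is most useful together with group_by(), which yields one summary row per group. The output column names `mean_len` and `sd_len` are illustrative.

```r
library(dplyr)

# Example data (an assumption): two rows from each iris species.
test <- iris[c(1:2, 51:52, 101:102), ]

# Whole-table summary: a single row
overall <- summarise(test,
                     mean_len = mean(Sepal.Length),
                     sd_len   = sd(Sepal.Length))

# Grouped summary: one row per species
by_species <- summarise(group_by(test, Species),
                        mean_len = mean(Sepal.Length),
                        sd_len   = sd(Sepal.Length))
```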
III. Two handy dplyr features
1. The pipe operator
![](https://img.haomeiwen.com/i23987107/fd116de2402bc504.jpg)
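The pipe `%>%` passes the result on its left as the first argument of the function on its right, so nested calls can be written as a left-to-right chain. A minimal sketch on an assumed iris subset:

```r
library(dplyr)

# Example data (an assumption): two rows from each iris species.
test <- iris[c(1:2, 51:52, 101:102), ]

# Nested form
nested <- summarise(group_by(test, Species), mean_len = mean(Sepal.Length))

# Equivalent piped form: read it top to bottom
piped <- test %>%
  group_by(Species) %>%
  summarise(mean_len = mean(Sepal.Length))
```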
2. Counting the unique values in a column
![](https://img.haomeiwen.com/i23987107/8b19ce4106367f88.jpg)
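One way to do this is with count(), which returns each distinct value together with how many rows contain it. The sketch below assumes an iris subset as example data; distinct() is shown as a related option when only the values themselves are needed.

```r
library(dplyr)

# Example data (an assumption): two rows from each iris species.
test <- iris[c(1:2, 51:52, 101:102), ]

# One row per unique Species, with the row count in column `n`
counts <- count(test, Species)

# Only the unique values, without counts
uniq <- distinct(test, Species)
```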
IV. Relational data with dplyr: joining two tables
![](https://img.haomeiwen.com/i23987107/5a64c2561aafd2be.jpg)
1. inner_join: inner join, takes the intersection
![](https://img.haomeiwen.com/i23987107/34b4615a8bd54c6b.jpg)
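A minimal sketch of an inner join. The two small tables `test1` and `test2` are hypothetical example data sharing a key column `x`; they are not taken from the screenshots.

```r
library(dplyr)

# Hypothetical example tables sharing the key column `x`.
test1 <- data.frame(x = c("b", "e", "f", "x"),
                    z = c("A", "B", "C", "D"),
                    stringsAsFactors = FALSE)
test2 <- data.frame(x = c("a", "b", "c", "d", "e", "f"),
                    y = c(1, 2, 3, 4, 5, 6),
                    stringsAsFactors = FALSE)

# Keep only the keys present in both tables (b, e, f)
inner <- inner_join(test1, test2, by = "x")
```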
2. left_join: left join
![](https://img.haomeiwen.com/i23987107/02eaa66b74b85826.jpg)
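A left join keeps every row of the first table and fills in NA where the second table has no match, so swapping the argument order changes the result. A sketch with hypothetical tables `test1` and `test2`:

```r
library(dplyr)

# Hypothetical example tables sharing the key column `x`.
test1 <- data.frame(x = c("b", "e", "f", "x"),
                    z = c("A", "B", "C", "D"),
                    stringsAsFactors = FALSE)
test2 <- data.frame(x = c("a", "b", "c", "d", "e", "f"),
                    y = c(1, 2, 3, 4, 5, 6),
                    stringsAsFactors = FALSE)

# All 4 rows of test1; y is NA for the unmatched key "x"
left12 <- left_join(test1, test2, by = "x")

# All 6 rows of test2; z is NA for the unmatched keys a, c, d
left21 <- left_join(test2, test1, by = "x")
```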
3. full_join: full join
![](https://img.haomeiwen.com/i23987107/8fe2cb143382cd96.jpg)
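A full join keeps the rows of both tables (the union of keys) and fills unmatched cells with NA. A sketch with hypothetical tables `test1` and `test2`:

```r
library(dplyr)

# Hypothetical example tables sharing the key column `x`.
test1 <- data.frame(x = c("b", "e", "f", "x"),
                    z = c("A", "B", "C", "D"),
                    stringsAsFactors = FALSE)
test2 <- data.frame(x = c("a", "b", "c", "d", "e", "f"),
                    y = c(1, 2, 3, 4, 5, 6),
                    stringsAsFactors = FALSE)

# 7 distinct keys in total: a, b, c, d, e, f, x
full <- full_join(test1, test2, by = "x")
```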
4. semi_join: semi join, returns all rows of table x that have a match in table y
![](https://img.haomeiwen.com/i23987107/0a8f8c2ad8ef5985.jpg)
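Unlike inner_join, a semi join only filters the first table and adds no columns from the second. A sketch with hypothetical tables `test1` and `test2`:

```r
library(dplyr)

# Hypothetical example tables sharing the key column `x`.
test1 <- data.frame(x = c("b", "e", "f", "x"),
                    z = c("A", "B", "C", "D"),
                    stringsAsFactors = FALSE)
test2 <- data.frame(x = c("a", "b", "c", "d", "e", "f"),
                    y = c(1, 2, 3, 4, 5, 6),
                    stringsAsFactors = FALSE)

# Rows of test1 whose key also appears in test2; columns of test1 only
semi <- semi_join(test1, test2, by = "x")
```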
5. anti_join: anti join, returns the rows of table x that have no match in table y
![](https://img.haomeiwen.com/i23987107/d86dc87f174781b1.jpg)
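An anti join is the complement of a semi join. A sketch with hypothetical tables `test1` and `test2`:

```r
library(dplyr)

# Hypothetical example tables sharing the key column `x`.
test1 <- data.frame(x = c("b", "e", "f", "x"),
                    z = c("A", "B", "C", "D"),
                    stringsAsFactors = FALSE)
test2 <- data.frame(x = c("a", "b", "c", "d", "e", "f"),
                    y = c(1, 2, 3, 4, 5, 6),
                    stringsAsFactors = FALSE)

# Rows of test1 whose key is absent from test2: only "x"
anti12 <- anti_join(test1, test2, by = "x")

# Rows of test2 whose key is absent from test1: a, c, d
anti21 <- anti_join(test2, test1, by = "x")
```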
6. Simple binding: bind_rows() and bind_cols()
![](https://img.haomeiwen.com/i23987107/f56ead8835644a55.jpg)
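A minimal sketch of both functions with hypothetical tables `t1`, `t2`, and `t3`. bind_rows() stacks tables and matches columns by name; bind_cols() places tables side by side and requires them to have the same number of rows.

```r
library(dplyr)

# Hypothetical example tables.
t1 <- data.frame(x = 1:4, y = c(10, 20, 30, 40))
t2 <- data.frame(x = 5:6, y = c(50, 60))
t3 <- data.frame(z = c(100, 200, 300, 400))

# Stack t2 under t1 (same column names)
rows <- bind_rows(t1, t2)

# Put t3 next to t1 (same number of rows)
cols <- bind_cols(t1, t3)
```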