R Language Study Notes (6): Data Frames
2021-01-19
Akuooo
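1. Data Frames
A data frame is a tabular data structure designed to model a dataset; it corresponds to the dataset concept in other statistical software such as SAS or SPSS.
It is usually a rectangular array of data in which rows represent observations and columns represent variables.
- Features: a data frame is actually a list.
The elements of the list are vectors, and these vectors form the columns of the data frame. Every column must have the same length, which is why a data frame is rectangular, and every column must be named.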
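Comparison with a matrix:
① They have a similar shape.
② A data frame is a fairly regular list.
③ A matrix must hold a single data type;
in a data frame each column must hold a single type, but different columns may hold different types, so a single row can mix types.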
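The points above can be checked directly in the console. The following is a minimal sketch using a small made-up data frame `df` (its column names and values are illustrative and do not come from the original notes). It shows that a data frame is a list of equal-length columns, each with its own type, while a matrix coerces mixed input to one type.

```r
# Hypothetical example data, not from the original notes
df <- data.frame(name = c("a", "b", "c"),
                 score = c(90.5, 80, 72),
                 passed = c(TRUE, TRUE, FALSE),
                 stringsAsFactors = FALSE)

is.list(df)                      # a data frame is a list under the hood
sapply(df, length)               # every column has the same length
col_types <- sapply(df, class)   # each column keeps its own type
col_types

# A matrix holds a single type, so mixed input is coerced to character
m <- cbind(c("a", "b", "c"), c(90.5, 80, 72))
typeof(m)
```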
- Built-in R datasets stored as data frames
iris: iris flower measurements
mtcars: data on 32 cars
rock: measurements on 48 rock samples
- Creating a data frame
> ?data.frame
> state <- data.frame(state.name,state.abb,state.region,state.x77)
> state
[Figure: state.png]
To store data in R for analysis, save each variable as a separate vector and then combine the vectors with data.frame().
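As a small illustration of that workflow, the sketch below builds a data frame from three separate vectors. The vector names (`id`, `height_cm`, `group`) and their values are hypothetical placeholders, not data from the original notes.

```r
# Hypothetical vectors standing in for data you have collected
id        <- c(1, 2, 3)
height_cm <- c(160, 172, 168)
group     <- c("A", "B", "A")

# Combine the vectors column by column
people <- data.frame(id, height_cm, group, stringsAsFactors = FALSE)

str(people)     # structure: one line per column
dim(people)     # number of rows and columns
names(people)   # column names are taken from the vector names
```

All vectors must have the same length (or a length that recycles evenly); otherwise data.frame() stops with an error.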
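2. Accessing a Data Frame
A data frame can be built from vectors, matrices, and lists, and it supports both list-style and matrix-style indexing.
- Accessing data by index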
> state[1]
state.name
Alabama Alabama
Alaska Alaska
Arizona Arizona
Arkansas Arkansas
California California
Colorado Colorado
Connecticut Connecticut
Delaware Delaware
Florida Florida
Georgia Georgia
Hawaii Hawaii
Idaho Idaho
Illinois Illinois
Indiana Indiana
Iowa Iowa
Kansas Kansas
Kentucky Kentucky
Louisiana Louisiana
Maine Maine
Maryland Maryland
Massachusetts Massachusetts
Michigan Michigan
Minnesota Minnesota
Mississippi Mississippi
Missouri Missouri
Montana Montana
Nebraska Nebraska
Nevada Nevada
New Hampshire New Hampshire
New Jersey New Jersey
New Mexico New Mexico
New York New York
North Carolina North Carolina
North Dakota North Dakota
Ohio Ohio
Oklahoma Oklahoma
Oregon Oregon
Pennsylvania Pennsylvania
Rhode Island Rhode Island
South Carolina South Carolina
South Dakota South Dakota
Tennessee Tennessee
Texas Texas
Utah Utah
Vermont Vermont
Virginia Virginia
Washington Washington
West Virginia West Virginia
Wisconsin Wisconsin
Wyoming Wyoming
> state[c(2,4)]  # return the second and fourth columns
> state[,"state.abb"]  # select a column by its name
[1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "FL" "GA" "HI" "ID" "IL" "IN" "IA" "KS" "KY" "LA"
[19] "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH" "NJ" "NM" "NY" "NC" "ND" "OH" "OK"
[37] "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT" "VT" "VA" "WA" "WV" "WI" "WY"
> state["Alabama",]  # select a row by its row name
state.name state.abb state.region Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama Alabama AL South 3615 3624 2.1 69.05 15.1 41.3 20 50708
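The same indexing forms can be tried on a much smaller table. The sketch below uses a made-up data frame `df` with row names (all names and values are illustrative assumptions) so the result of each form is easy to inspect.

```r
# Hypothetical data frame with row names, for illustration only
df <- data.frame(x = c(1, 2, 3),
                 y = c("a", "b", "c"),
                 z = c(TRUE, FALSE, TRUE),
                 row.names = c("r1", "r2", "r3"),
                 stringsAsFactors = FALSE)

df[1]         # list-style: first column, still a data frame
df[c(1, 3)]   # first and third columns
df[, "y"]     # matrix-style: column "y" as a plain vector
df["r2", ]    # the row named "r2", as a one-row data frame
df[2, "x"]    # a single cell: row 2, column "x"
```

Note the difference in return type: a single index such as `df[1]` keeps the data frame structure, while `df[, "y"]` drops to a vector.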
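For example, the women dataset records the heights and weights of women.
When plotting, individual columns can be accessed with $: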
> plot(women$height,women$weight)
[Figure: women.png]
> lm(weight ~height,data = women)
Call:
lm(formula = weight ~ height, data = women)
Coefficients:
(Intercept)       height
     -87.52         3.45
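To make the role of `$` and the `data =` argument more explicit, here is a short sketch on the same built-in women dataset. The variable names `h`, `w`, and `fit` are introduced only for this illustration.

```r
# women is a built-in dataset with columns height and weight
h <- women$height          # $ extracts one column as a vector
w <- women$weight
length(h) == nrow(women)   # one value per observation

# The same model as above; data = women lets the formula use bare column names
fit <- lm(weight ~ height, data = women)
coef(fit)                  # named vector with (Intercept) and height
```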
- attach()
(attaches the data frame to R's search path)
> attach(mtcars)  # once attached, columns can be accessed without $
> mpg
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4 10.4 14.7
[18] 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7 15.0 21.4
> hp
[1] 110 110 93 110 175 105 245 62 95 123 123 180 180 180 205 215 230 66 52 65 97
[22] 150 150 245 175 66 91 113 264 175 335 109
> detach(mtcars)  # detach() removes the data frame from the search path
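A minimal sketch of what attaching does, using search() to inspect the search path before and after. The variable `mean_mpg` is introduced only for this example.

```r
attach(mtcars)
"mtcars" %in% search()    # TRUE while the data frame is attached
mean_mpg <- mean(mpg)     # mpg is found through the search path
detach(mtcars)
"mtcars" %in% search()    # FALSE after detaching

# Same result as the explicit $ form
identical(mean_mpg, mean(mtcars$mpg))
```

Attached column names can mask, or be masked by, other objects with the same name, so pairing every attach() with a detach() is a common precaution.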
- with()
(also works without $)
> with(mtcars,{hp})
[1] 110 110 93 110 175 105 245 62 95 123 123 180 180 180 205 215 230 66 52 65 97
[22] 150 150 245 175 66 91 113 264 175 335 109
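with() evaluates an expression with the data frame's columns in scope, without changing the search path. The sketch below is one way to see this; `ratio` and `tmp_hp` are hypothetical names used only here.

```r
# Evaluate an expression using mtcars columns, without $ or attach()
ratio <- with(mtcars, mean(hp / wt))
identical(ratio, mean(mtcars$hp / mtcars$wt))

# Assignments inside with() stay local to the call
with(mtcars, { tmp_hp <- hp * 2 })
exists("tmp_hp")   # FALSE unless tmp_hp was already defined elsewhere
```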
- Single and double square brackets
[Figure: 双中括号.png]
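The figure for this point is not recoverable, so the following is only a sketch of the usual distinction, assuming that is what the figure illustrated: single brackets return a data frame, while double brackets return the column itself. The names `one` and `two` are introduced for this example.

```r
one <- mtcars["mpg"]      # single brackets: a one-column data frame
two <- mtcars[["mpg"]]    # double brackets: the column itself, a numeric vector

class(one)
class(two)
identical(two, mtcars$mpg)   # [[ ]] and $ return the same vector
```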