Bioinformatics: A Beginner's Journey (生物信息学-小白成长记)

R from Scratch 1: Data Structures

2020-04-17  lietobrain

title: R Language Basics
date: 2018-11-22 22:45:19
tags:
- R

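I'm going back through my old notes again. These cover some R basics and will be split across several posts, which I'll update gradually.
The original notes turned out to be a bit thin, so I've added some material here. It's easy to follow, so let's get started.

If you find this helpful, you can follow the column: Bioinformatics: A Beginner's Journey (生物信息学-小白成长记)

R learning series
R from Scratch 1: Data Structures
R from Scratch 2: Subsetting
R from Scratch 3: Loops, Sorting, and Data-Processing Functions
R from Scratch 4: Practice 1: Common Operations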

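R Data Structures

1. Five basic types

Character

x <- "233"

Numeric

x <- 3.14
x <- 1     # still numeric (double); only the L suffix makes an integer

Integer

x <- 3L

Complex

x <- 1+2i

Logical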

x <- TRUE
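
To check the five types above, one option is to ask R for the class and storage type of each value. This is only a minimal sketch; the variable names `values`, `classes`, and `types` are made up for illustration.

```r
# One value of each basic type from the section above
values <- list("233", 3.14, 1, 3L, 1+2i, TRUE)

# class() gives the high-level type, typeof() the internal storage type
classes <- vapply(values, class, character(1))
types   <- vapply(values, typeof, character(1))

print(classes)
# [1] "character" "numeric"   "numeric"   "integer"   "complex"   "logical"
print(types)
# [1] "character" "double"    "double"    "integer"   "complex"   "logical"
```

Note that `1` and `3.14` share the same type: a plain number is stored as a double unless it carries the `L` suffix.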

2. Vectors

# Vector
# Three ways to create one
x <- vector("character", length=10)
x1 <- 1:4
x2 <- c(1,2,3,4)

# x
# [1] "" "" "" "" "" "" "" "" "" ""
# x1
# [1] 1 2 3 4
# x2
# [1] 1 2 3 4

x3 <- c(TRUE,10,"a")
x4 <- c("a","b","c")
x5 <- c(TRUE,FALSE)

# x3
# [1] "TRUE" "10"   "a"   
# x4
# [1] "a" "b" "c"
# x5
# [1]  TRUE FALSE

# Type conversion
as.numeric(x2)
# [1] 1 2 3 4
as.logical(x5)
# [1]  TRUE FALSE
as.character(x4)
# [1] "a" "b" "c"

# Check the type
class(x1) 
# [1] "integer"

# Name the elements
names(x2) <- c("a","b","c","d")
x2
# a b c d 
# 1 2 3 4 
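
The `x3` example shows that a vector holds a single type, so mixed inputs are coerced to the most general one. The sketch below spells out that coercion order and shows lookup by name; all variable names are illustrative.

```r
# Coercion order when types are mixed: logical < integer < numeric < character
mixed_classes <- c(
  class(c(TRUE, 2L)),   # logical + integer
  class(c(2L, 3.5)),    # integer + numeric
  class(c(3.5, "a"))    # numeric + character
)
print(mixed_classes)
# [1] "integer"   "numeric"   "character"

# Strings that are not numbers become NA (R also emits a warning)
converted <- suppressWarnings(as.numeric(c("1", "x")))
print(converted)
# [1]  1 NA

# Named elements can be looked up by name
scores <- c(a = 1, b = 2, c = 3, d = 4)
picked <- scores[c("b", "d")]
print(picked)
# b d 
# 2 4 
```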

3. Matrices

Matrix = vector + dimensions

# matrix

# Method 1
x <- matrix(1:6, nrow = 3, ncol = 2)
#      [,1] [,2]
# [1,]    1    4
# [2,]    2    5
# [3,]    3    6

# Method 2: set dim on an existing vector
y <- 1:6
dim(y) <- c(2,3)
y
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# Inspect attributes and dimensions
dim(x)
# [1] 3 2
attributes(x)
# $dim
# [1] 3 2

# Combine matrices by row (rbind) or by column (cbind)
y2 <- matrix(1:6, nrow = 2, ncol = 3)
rbind(y,y2)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
# [3,]    1    3    5
# [4,]    2    4    6

cbind(y,y2)
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    1    3    5    1    3    5
# [2,]    2    4    6    2    4    6
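
All the matrices above are filled column by column, which is R's default. As a small sketch (variable names are illustrative), `byrow = TRUE` fills row by row instead, and the result equals the transpose of a column-filled matrix:

```r
# Default: values go down the columns first
by_col <- matrix(1:6, nrow = 2)
by_col
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# byrow = TRUE: values go across the rows first
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

# Same result as transposing a 3 x 2 column-filled matrix
same <- identical(by_row, t(matrix(1:6, nrow = 3)))
print(same)
# [1] TRUE
```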

4. Arrays

Array = a matrix generalized to n dimensions

# Array

x <- array(1:24, dim = c(4,6))
x
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    1    5    9   13   17   21
# [2,]    2    6   10   14   18   22
# [3,]    3    7   11   15   19   23
# [4,]    4    8   12   16   20   24

x2 <- array(1:24, dim = c(2,3,4))
x2
# , , 1
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# , , 2
#      [,1] [,2] [,3]
# [1,]    7    9   11
# [2,]    8   10   12

# , , 3
#      [,1] [,2] [,3]
# [1,]   13   15   17
# [2,]   14   16   18

# , , 4
#      [,1] [,2] [,3]
# [1,]   19   21   23
# [2,]   20   22   24
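
To read values back out of the 2 x 3 x 4 array printed above, index with one subscript per dimension. The sketch below also works out where an element sits in the underlying vector; `i`, `j`, `k`, and `pos` are names chosen for this example.

```r
x2 <- array(1:24, dim = c(2, 3, 4))

# Row 1, column 2, slice 3
elem <- x2[1, 2, 3]
print(elem)
# [1] 15

# Leaving a subscript empty keeps that whole dimension: slice 2 is a 2 x 3 matrix
slice <- x2[, , 2]
print(dim(slice))
# [1] 2 3

# Column-major layout: the first index varies fastest
i <- 1; j <- 2; k <- 3
pos <- i + (j - 1) * 2 + (k - 1) * 2 * 3
print(x2[pos])
# [1] 15
```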

5. Lists

# List
# Elements can be of different types
listvalue <- list("a", 2, 10L, 1+2i, TRUE)
listvalue
# [[1]]
# [1] "a"
# [[2]]
# [1] 2
# [[3]]
# [1] 10
# [[4]]
# [1] 1+2i
# [[5]]
# [1] TRUE

listvalue2 <- list(c(1,2,3), c("a","b","c"))
listvalue2
# [[1]]
# [1] 1 2 3

# [[2]]
# [1] "a" "b" "c"

# Name the rows and columns of a matrix (dimnames takes a list)
x <- matrix(1:6, nrow = 2, ncol = 3)
dimnames(x) <- list(c("a","b"), c("c","d","e"))
x
#   c d e
# a 1 3 5
# b 2 4 6
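
Getting elements back out of a list is where `[ ]` and `[[ ]]` differ. This is a minimal sketch with a hypothetical named list `person`; single brackets return a smaller list, while double brackets (or `$`) return the element itself.

```r
# A named list holding elements of different types and lengths
person <- list(name = "a", scores = c(1, 2, 3))

# [ ] keeps the list wrapper
sub_list <- person["scores"]
print(class(sub_list))
# [1] "list"

# [[ ]] and $ extract the element itself
sub_elem <- person[["scores"]]
print(class(sub_elem))
# [1] "numeric"

second_score <- person$scores[2]
print(second_score)
# [1] 2
```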

6. Factors

# Factor
# An integer vector plus labels (levels)

x <- factor(c("female","female","male","female","male"))
x
# [1] female female male   female male  
# Levels: female male

y <- factor(c("female","female","male","female","male"), levels = c("male", "female"))
y
# [1] female female male   female male  
# Levels: male female

table(x)
# x
# female   male 
#      3      2 
table(y)
# y
#   male female 
#      2      3 

unclass(x)
# [1] 1 1 2 1 2
# attr(,"levels")
# [1] "female" "male"

# The class is factor
class(x)
# [1] "factor"

class(unclass(x))
# [1] "integer"
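
Because a factor is stored as integer codes, converting one straight to numbers returns the codes, not the labels. The sketch below uses a made-up factor `f` to show this and one common workaround (go through character first).

```r
# A factor whose labels happen to look like numbers
f <- factor(c("10", "20", "10"))
print(levels(f))
# [1] "10" "20"

# as.numeric() on a factor returns the internal codes
codes <- as.numeric(f)
print(codes)
# [1] 1 2 1

# Convert to character first to recover the original values
original <- as.numeric(as.character(f))
print(original)
# [1] 10 20 10
```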

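7. Missing values

NA: a missing value ("not available")

NaN: "not a number", for example the result of 0/0

# Missing values: NA vs. NaN, loosely like "" vs. NULL (just kidding)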

x <- c(1, NA, 2, NA, 3)
is.na(x)
# [1] FALSE  TRUE FALSE  TRUE FALSE
is.nan(x)
# [1] FALSE FALSE FALSE FALSE FALSE

y <- c(1, NaN, 2, NaN, 3)
is.na(y)    # NaN also counts as missing
# [1] FALSE  TRUE FALSE  TRUE FALSE
is.nan(y)
# [1] FALSE  TRUE FALSE  TRUE FALSE
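
In practice the next step after detecting missing values is usually to count them or skip them. A minimal sketch, reusing the same `x` as above (the other names are illustrative):

```r
x <- c(1, NA, 2, NA, 3)

# Count the missing values
n_missing <- sum(is.na(x))
print(n_missing)
# [1] 2

# Most summaries return NA unless told to drop missing values
plain_mean <- mean(x)
clean_mean <- mean(x, na.rm = TRUE)
print(plain_mean)
# [1] NA
print(clean_mean)
# [1] 2

# Keep only the observed values
observed <- x[!is.na(x)]
print(observed)
# [1] 1 2 3
```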

8. Data frames

Stores tabular data; can be viewed as a list whose elements all have the same length

# Data frame

df <- data.frame(id = c(1,2,3,4), name = c("a","b","c","d"), gender=c(TRUE, TRUE, FALSE, FALSE))
df
#   id name gender
# 1  1    a   TRUE
# 2  2    b   TRUE
# 3  3    c  FALSE
# 4  4    d  FALSE

nrow(df)
# [1] 4
ncol(df)
# [1] 3

# Convert the data frame to a numeric matrix (character columns become integer codes)
data.matrix(df)
#      id name gender
# [1,]  1    1      1
# [2,]  2    2      1
# [3,]  3    3      0
# [4,]  4    4      0
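
The "list of equal-length columns" view can be checked directly. In this sketch, `stringsAsFactors = FALSE` is set explicitly as an assumption so that the result is the same across R versions (the default changed in R 4.0); `col_classes` and `flagged` are illustrative names.

```r
df <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("a", "b", "c", "d"),
  gender = c(TRUE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# A data frame is a list with one element per column
print(is.list(df))
# [1] TRUE
print(length(df))
# [1] 3

# Each column keeps its own type
col_classes <- vapply(df, class, character(1))
print(col_classes)
#          id        name      gender 
#   "numeric" "character"   "logical" 

# Filter rows with a logical column, then pick one column
flagged <- df[df$gender, "name"]
print(flagged)
# [1] "a" "b"
```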

9. Dates and times

# Dates and times

# character
x <- date()
x
# [1] "Fri Apr 17 07:57:43 2020"
class(x)
# [1] "character"

# Date
x2 <- Sys.Date()
x2
# [1] "2020-04-17"
class(x2)
# [1] "Date"

x3 <- as.Date("2018-11-23")
x3
# [1] "2018-11-23"
class(x3)
# [1] "Date"

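weekdays(x3)
# [1] "星期五"    (locale-dependent; "Friday" in an English locale)

months(x3)
# [1] "十一月"    (locale-dependent; "November" in an English locale)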

quarters(x3)
# [1] "Q4"

julian(x3)
# [1] 17858
# attr(,"origin")
# [1] "1970-01-01"

# Compute the difference between two dates
x4 <- as.Date("2018-04-25")
x3-x4
# Time difference of 212 days
as.numeric(x3-x4)
# [1] 212

# -----------------------
x <- Sys.time()
x
# [1] "2020-04-17 08:01:36 CST"
class(x)
# [1] "POSIXct" "POSIXt" 

p <- as.POSIXlt(x)
p
# [1] "2020-04-17 08:01:36 CST"

class(p)
# [1] "POSIXlt" "POSIXt" 

# List the component names of p
names(unclass(p))
#  [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"   "isdst"  "zone"   "gmtoff"

# Get the value of one component of p
p$sec
# [1] 36.0718
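
Two details are easy to trip over here: dates that are not in the default `YYYY-MM-DD` form need an explicit `format`, and several `POSIXlt` components are offsets rather than calendar values. A small sketch under those assumptions, using a fixed timestamp and the UTC time zone so the output does not depend on when or where it runs:

```r
# Parse a date written as day/month/year
d <- as.Date("23/11/2018", format = "%d/%m/%Y")
iso <- format(d, "%Y-%m-%d")
print(iso)
# [1] "2018-11-23"

# Express a date difference in weeks instead of days
weeks <- as.numeric(difftime(as.Date("2018-11-23"), as.Date("2018-04-25"), units = "weeks"))
print(round(weeks, 2))
# [1] 30.29

# POSIXlt components: year counts from 1900, mon from 0, wday from Sunday = 0
p <- as.POSIXlt("2018-11-23 08:01:36", tz = "UTC")
parts <- c(year = p$year + 1900, month = p$mon + 1, wday = p$wday, hour = p$hour)
print(parts)
#  year month  wday  hour 
#  2018    11     5     8 
```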

10. Data structures: summary

[Image: data structures summary]
