R for Data Science

[R] The magrittr package: pipe operations 《R for data

2020-04-28  半为花间酒

Study notes from working through Chapter 18, Pipes, of R for Data Science
Reference: R for Data Science

library(magrittr)

Piping alternatives

- Intermediate steps

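Saving each intermediate step as a new object looks memory-hungry, but R will share columns across data frames where possible, so the extra copies cost little.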

diamonds <- ggplot2::diamonds
diamonds2 <- diamonds %>% 
  dplyr::mutate(price_per_carat = price / carat)

pryr::object_size(diamonds)
#> Registered S3 method overwritten by 'pryr':
#>   method      from
#>   print.bytes Rcpp
#> 3.46 MB
pryr::object_size(diamonds2)
#> 3.89 MB
pryr::object_size(diamonds, diamonds2)
#> 3.89 MB

# If one of the columns is modified, that column is no longer shared between the data frames
diamonds$carat[1] <- NA
pryr::object_size(diamonds)
#> 3.46 MB
pryr::object_size(diamonds2)
#> 3.89 MB
pryr::object_size(diamonds, diamonds2)
#> 4.32 MB

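pryr::object_size() reports the memory used by the objects it is given and accepts several objects at once, so it can account for shared columns.
The built-in object.size() accepts only a single object.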
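The same comparison can be tried on a small made-up data frame. This is only a sketch: df1, df2 and the column names are illustrative, not from the book, and the pryr call is guarded because pryr may not be installed.

```r
# Hypothetical data: two numeric columns of 100,000 values each.
df1 <- data.frame(x = runif(1e5), y = runif(1e5))

# Adding a column gives a new data frame; x and y themselves are not modified.
df2 <- df1
df2$z <- df2$x / df2$y

size1 <- object.size(df1)
size2 <- object.size(df2)
print(size1, units = "MB")
print(size2, units = "MB")

# object.size() looks at one object at a time, so adding the two
# numbers counts the columns x and y twice.
naive_total <- as.numeric(size1) + as.numeric(size2)

# pryr::object_size() accepts several objects and counts shared memory once.
if (requireNamespace("pryr", quietly = TRUE)) {
  print(pryr::object_size(df1, df2))
}
```

If the columns are shared as described above, the combined pryr figure should come out below naive_total.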

- Function composition

bop(
  scoop(
    hop(foo_foo, through = forest),
    up = field_mice
  ), 
  on = head
)

The Dagwood sandwich problem:
The disadvantage is that you have to read from inside-out, from right-to-left, and that the arguments end up spread far apart.

- Use the pipe

foo_foo %>%
  hop(through = forest) %>%
  scoop(up = field_mice) %>%
  bop(on = head)

# This is essentially equivalent to:
my_pipe <- function(.) {
  . <- hop(., through = forest)
  . <- scoop(., up = field_mice)
  bop(., on = head)
}
my_pipe(foo_foo)
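The snippets above cannot be run as they stand, because hop(), scoop(), bop() and foo_foo are never defined. A minimal runnable sketch, assuming hypothetical stand-in functions that simply append a description of each step (and strings in place of forest, field_mice and head), shows that the nested call, the pipe and the my_pipe() rewrite give the same result:

```r
library(magrittr)

# Hypothetical stand-ins: each step appends a description to a character vector.
hop   <- function(x, through) c(x, paste("hop through", through))
scoop <- function(x, up)      c(x, paste("scoop up", up))
bop   <- function(x, on)      c(x, paste("bop on", on))

foo_foo <- "little bunny Foo Foo"

# Function composition: read from the inside out.
nested <- bop(
  scoop(
    hop(foo_foo, through = "forest"),
    up = "field mice"
  ),
  on = "head"
)

# The pipe: read from top to bottom.
piped <- foo_foo %>%
  hop(through = "forest") %>%
  scoop(up = "field mice") %>%
  bop(on = "head")

# The explicit rewrite with "." as the running value.
my_pipe <- function(.) {
  . <- hop(., through = "forest")
  . <- scoop(., up = "field mice")
  bop(., on = "head")
}

identical(nested, piped)
#> [1] TRUE
identical(piped, my_pipe(foo_foo))
#> [1] TRUE
```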

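Under this rewrite, the pipe does not work with two classes of functions.

(1) Functions that use the current environment, such as assign, load and get: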

assign("x", 10); x
# [1] 10

"x" %>% assign(100); x
# [1] 10

env <- environment()
"x" %>% assign(100, envir = env); x
# [1] 100
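The behaviour in (1) follows from how assign() picks its target: with no envir argument it binds the name in the environment it is called from. The base R sketch below imitates this with a hypothetical helper() function standing in for the pipe's temporary environment; assign_default() and assign_explicit() are illustrative names, not part of magrittr.

```r
# helper() plays the role of the temporary environment.
assign_default <- function() {
  helper <- function() assign("y", 1)   # no envir: y is bound inside helper()
  helper()
  exists("y", inherits = FALSE)         # is y bound in this function's frame?
}

assign_explicit <- function() {
  env <- environment()                  # this function's own frame
  helper <- function() assign("y", 1, envir = env)
  helper()
  exists("y", inherits = FALSE)
}

assign_default()
#> [1] FALSE
assign_explicit()
#> [1] TRUE
```

Passing envir explicitly is the same fix used in the piped example above.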

(2) Functions that use lazy evaluation, such as most condition-handling functions:
tryCatch try suppressMessages suppressWarnings

tryCatch(stop("!"), error = function(e) "An error")
#> [1] "An error"

stop("!") %>% 
  tryCatch(error = function(e) "An error")
#> Error in eval(lhs, parent, parent): !
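The difference in (2) is one of timing: tryCatch() can only catch an error raised while it is evaluating its argument. A base R sketch of the two cases, using the hypothetical helpers handle() and eager() (eager() forces its input first, in the spirit of the my_pipe() rewrite):

```r
# handle() evaluates its argument lazily, inside tryCatch().
handle <- function(expr) tryCatch(expr, error = function(e) "An error")

# eager() computes the value first and only then hands it on.
eager <- function(.) {
  force(.)
  handle(.)
}

# Lazy: the error is raised inside tryCatch() and is caught.
handle(stop("!"))
#> [1] "An error"

# Eager: the error is raised before tryCatch() is reached, so it escapes.
result <- try(eager(stop("!")), silent = TRUE)
inherits(result, "try-error")
#> [1] TRUE
```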

When not to use the pipe

It is just as important to know when not to use the pipe.

Pipes are most useful for rewriting a fairly short linear sequence of operations.

Other tools from magrittr

When working with more complex pipes, it’s sometimes useful to call a function for its side-effects. Maybe you want to print out the current object, or plot it, or save it to disk. Many times, such functions don’t return anything, effectively terminating the pipe.

library(magrittr)

rnorm(100) %>%
  matrix(ncol = 2) %>%
  plot() %>%
  str()
#  NULL

rnorm(100) %>%
  matrix(ncol = 2) %T>%
  plot() %>% 
  str()
# num [1:50, 1:2] -0.351 -1.751 0.666 0.516 -0.686 ...
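The two runs above differ only in %T>%, which passes on its left-hand side rather than the result of the call on its right. A small sketch without plotting, where report() is a hypothetical side-effect function that returns NULL much as plot() does:

```r
library(magrittr)

# Hypothetical side-effect function: prints a summary and returns NULL.
report <- function(x) {
  cat("n =", length(x), "\n")
  invisible(NULL)
}

# With %>%, report()'s NULL would be passed on; %T>% passes on the input instead.
result <- c(1, 4, 9) %T>%
  report() %>%
  sqrt()

result
#> [1] 1 2 3
```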

# %$% exposes the columns of a data frame so they can be referred to by name
mtcars %$%
  cor(disp, mpg)
#> [1] -0.8475514

# The same can be written with with(), which makes the variables explicit
with(mtcars, cor(disp, mpg))

# The assignment pipe %<>% replaces code like this:
mtcars <- mtcars %>% 
  transform(cyl = cyl * 2)

# with this:
mtcars %<>% transform(cyl = cyl * 2)
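To check that the two forms do the same thing without changing mtcars twice, here is a short sketch on an illustrative vector (scores and explicit are made-up names):

```r
library(magrittr)

scores <- c(3, 1, 2)   # illustrative data

# Explicit form: pipe, then assign the result back.
explicit <- scores
explicit <- explicit %>% sort()

# Compound form: %<>% pipes and overwrites the left-hand side.
scores %<>% sort()

identical(scores, explicit)
#> [1] TRUE
```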