ggplot | Visualizing Data Distributions

2023-01-01  生命数据科学

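In bioinformatics data analysis, knowing how the data in each sample are distributed helps you choose the analysis pipeline and methods. How to draw that distribution clearly and effectively is therefore a question worth some thought.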

1. Required packages

library(ggdist)
library(tidyquant)
library(tidyverse)
library(ggsci)
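
If one of these packages is not installed, library() stops with an error. A minimal sketch for checking up front, assuming all four packages are installed from CRAN (the names pkgs, installed, and missing_pkgs are only illustrative):

```r
# Packages loaded in this post
pkgs <- c("ggdist", "tidyquant", "tidyverse", "ggsci")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not available
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install them:
  # install.packages(missing_pkgs)
} else {
  message("All packages are available.")
}
```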

2. Conventional plots

The most common ways to plot a data distribution are the box plot and the violin plot.

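2.1 Example data

The example data is the mpg dataset that ships with the ggplot2 package. We use the categorical variable cyl and the continuous variable cty.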

> head(mpg)
# A tibble: 6 × 11
  manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class  
  <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>  
1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compact
2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compact
3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compact
4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compact
5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compact
6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compact
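
Before plotting, it can be worth checking how many cars fall into each cyl group, because very small groups give unstable box plots and density curves. A quick sketch using base R's table() (the name cyl_counts is illustrative):

```r
library(ggplot2)  # provides the mpg dataset

# Number of cars per cylinder count
cyl_counts <- table(mpg$cyl)
cyl_counts

# Groups with fewer than 10 observations
small_groups <- names(cyl_counts)[cyl_counts < 10]
small_groups
```

In mpg the 5-cylinder group is very small, which is presumably why the raincloud code in section 3 keeps only the 4-, 6-, and 8-cylinder cars.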

2.2 Box plot

The simplest option is the box plot, which shows the outliers, the quartiles, and the interquartile range of the data.

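# Box plot
library(ggplot2)
library(ggsci)

p <- ggplot(mpg, aes(x = factor(cyl), y = cty, fill = factor(cyl))) +
  geom_boxplot() +
  scale_fill_lancet()
p
# scale_fill_lancet() from the ggsci package supplies the fill colors
[Figure: box plot of cty for each cyl group]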
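
The quantities a box plot encodes can also be computed directly, which is a handy cross-check on the figure. The sketch below uses the 1.5 × IQR rule for outliers, which matches the default of geom_boxplot() (coef = 1.5); the helper box_summary() is a hypothetical name introduced only for this illustration.

```r
library(ggplot2)  # provides the mpg dataset

# Quartiles, IQR, and number of outliers for one group of values
box_summary <- function(x) {
  q <- quantile(x, c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]
  # Points beyond 1.5 * IQR from the box are drawn as outliers
  is_outlier <- x < q[1] - 1.5 * iqr | x > q[3] + 1.5 * iqr
  c(q1 = q[1], median = q[2], q3 = q[3], iqr = iqr,
    n_outliers = sum(is_outlier))
}

# One row per cylinder count
box_stats <- t(sapply(split(mpg$cty, mpg$cyl), box_summary))
box_stats
```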

2.3 Violin plot

Compared with a box plot, a violin plot shows one extra piece of information: the density of the data.

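# Same data as above
# Violin plot
library(ggplot2)
library(ggsci)

p <- ggplot(mpg, aes(x = factor(cyl), y = cty, fill = factor(cyl))) +
  geom_violin() +
  scale_fill_lancet()
p
# scale_fill_lancet() from the ggsci package again supplies the fill colors
[Figure: violin plot of cty for each cyl group]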
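
A violin plot on its own no longer shows the quartiles. One common way to keep both pieces of information, sketched here as an optional variation rather than part of the original workflow, is to draw a narrow box plot inside each violin (the name p_combo is illustrative):

```r
library(ggplot2)

# Violin for the density shape, narrow white box plot inside for the quartiles
p_combo <- ggplot(mpg, aes(x = factor(cyl), y = cty, fill = factor(cyl))) +
  geom_violin(alpha = 0.5) +
  geom_boxplot(width = 0.1, fill = "white")
p_combo
```

The raincloud plot in the next section takes the same idea one step further by also showing the individual observations.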

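3. Raincloud plot

Here is the final result first:

[Figure: the finished raincloud plot]

The plot is made up of three parts:

  1. Box plot
  2. Cloud: a half-violin density curve
  3. Rain: the individual observations drawn as dots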

3.1 Draw the cloud first

mpg %>%
  filter(cyl %in% c(4,6,8)) %>%
  ggplot(aes(x = factor(cyl), y = cty, fill = factor(cyl),color = factor(cyl))) +
  # add half-violin from {ggdist} package
  ggdist::stat_halfeye(
    ## custom bandwidth
    adjust = 0.5,
    ## move geom to the right
    justification = -.2,
    ## remove slab interval
    .width = 0,
    point_colour = NA
  ) 
[Figure: the cloud, a half-violin density for each cyl group]
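
The adjust argument scales the bandwidth of the kernel density estimate: values below 1 give a more detailed, bumpier cloud, and values above 1 give a smoother one. The sketch below illustrates the idea with base R's density() as a stand-in. This is an assumption made for illustration only, since ggdist may use a different density estimator and bandwidth rule by default.

```r
library(ggplot2)  # provides the mpg dataset

# City mileage of the 4-cylinder cars
cty4 <- mpg$cty[mpg$cyl == 4]

d_default <- density(cty4)                # adjust = 1
d_narrow  <- density(cty4, adjust = 0.5)  # half the default bandwidth

# The bandwidth actually used by each estimate
c(default = d_default$bw, narrow = d_narrow$bw)
```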

3.2 Cloud + box plot

mpg %>%
  filter(cyl %in% c(4,6,8)) %>%
  ggplot(aes(x = factor(cyl), y = cty, fill = factor(cyl),color = factor(cyl))) +
  # add half-violin from {ggdist} package
  ggdist::stat_halfeye(
    ## custom bandwidth
    adjust = 0.5,
    ## move geom to the right
    justification = -.2,
    ## remove slab interval
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(
    width = .15,
    ## remove outliers
    outlier.color = NA,
    alpha = 0.5
  ) 
[Figure: cloud with a narrow box plot]

3.3 Cloud + rain + box plot

mpg %>%
  filter(cyl %in% c(4,6,8)) %>%
  ggplot(aes(x = factor(cyl), y = cty, fill = factor(cyl),color = factor(cyl))) +
  # add half-violin from {ggdist} package
  ggdist::stat_halfeye(
    ## custom bandwidth
    adjust = 0.5,
    ## move geom to the right
    justification = -.2,
    ## remove slab interval
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(
    width = .15,
    ## remove outliers
    outlier.color = NA,
    alpha = 0.5
  ) +
  # Add dot plots from {ggdist} package
  ggdist::stat_dots(
    ## orientation to the left
    side = "left",
    ## move geom to the left
    justification = 1.1,
    ## adjust grouping (binning) of observations
    binwidth = .25
  ) 
[Figure: cloud, box plot, and rain (dots), still in vertical orientation]

3.4 The orientation is off, so flip it

mpg %>%
  filter(cyl %in% c(4,6,8)) %>%
  ggplot(aes(x = factor(cyl), y = cty, fill = factor(cyl),color = factor(cyl))) +
  # add half-violin from {ggdist} package
  ggdist::stat_halfeye(
    ## custom bandwidth
    adjust = 0.5,
    ## move geom to the right
    justification = -.2,
    ## remove slab interval
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(
    width = .15,
    ## remove outliers
    outlier.color = NA,
    alpha = 0.5
  ) +
  # Add dot plots from {ggdist} package
  ggdist::stat_dots(
    ## orientation to the left
    side = "left",
    ## move geom to the left
    justification = 1.1,
    ## adjust grouping (binning) of observations
    binwidth = .25
  ) +
  # swap the axes so the rain falls downward
  coord_flip()
[Figure: raincloud plot after flipping the coordinates]
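
An alternative to coord_flip(), sketched here as an assumption rather than the author's method, is to map the continuous variable to x and the group to y. Both ggdist and recent versions of ggplot2 (3.3.0 or later) detect the horizontal orientation from the aesthetics. Note that side becomes "bottom" in this layout, and the names mpg_468 and p_horizontal are illustrative.

```r
library(ggplot2)
library(ggdist)

# Keep the 4-, 6-, and 8-cylinder cars, as in the code above
mpg_468 <- subset(mpg, cyl %in% c(4, 6, 8))

p_horizontal <- ggplot(
  mpg_468,
  aes(x = cty, y = factor(cyl), fill = factor(cyl), color = factor(cyl))
) +
  # cloud: half-violin drawn above each baseline
  stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0,
               point_colour = NA) +
  # box plot: orientation is inferred from the discrete y aesthetic
  geom_boxplot(width = 0.15, outlier.color = NA, alpha = 0.5) +
  # rain: dots drawn below each baseline
  stat_dots(side = "bottom", justification = 1.1, binwidth = 0.25)
p_horizontal
```

With this layout the axis labels already match what is displayed, so later calls to labs() refer to the axes as they appear on screen.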

3.5 Add some color

mpg %>%
  filter(cyl %in% c(4,6,8)) %>%
  ggplot(aes(x = factor(cyl), y = cty, fill = factor(cyl),color = factor(cyl))) +
  # add half-violin from {ggdist} package
  ggdist::stat_halfeye(
    ## custom bandwidth
    adjust = 0.5,
    ## move geom to the right
    justification = -.2,
    ## remove slab interval
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(
    width = .15,
    ## remove outliers
    outlier.color = NA,
    alpha = 0.5
  ) +
  # Add dot plots from {ggdist} package
  ggdist::stat_dots(
    ## orientation to the left
    side = "left",
    ## move geom to the left
    justification = 1.1,
    ## adjust grouping (binning) of observations
    binwidth = .25
  ) +
  coord_flip() +
  # Lancet color palette from {ggsci}
  scale_fill_lancet() +
  scale_color_lancet()
[Figure: raincloud plot with the Lancet color palette]

3.6 Remove the background

mpg %>%
  filter(cyl %in% c(4,6,8)) %>%
  ggplot(aes(x = factor(cyl), y = cty, fill = factor(cyl),color = factor(cyl))) +
  # add half-violin from {ggdist} package
  ggdist::stat_halfeye(
    ## custom bandwidth
    adjust = 0.5,
    ## move geom to the right
    justification = -.2,
    ## remove slab interval
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(
    width = .15,
    ## remove outliers
    outlier.color = NA,
    alpha = 0.5
  ) +
  # Add dot plots from {ggdist} package
  ggdist::stat_dots(
    ## orientation to the left
    side = "left",
    ## move geom to the left
    justification = 1.1,
    ## adjust grouping (binning) of observations
    binwidth = .25
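  ) +
  coord_flip() +
  # Lancet color palette from {ggsci}
  scale_fill_lancet() +
  scale_color_lancet() +
  # theme_classic() is a complete theme (white background, no grid lines),
  # so a preceding theme_bw() would simply be overridden and is dropped here
  theme_classic()
[Figure: raincloud plot with the gray background and grid lines removed]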

3.7 Tweak the titles

p <- mpg %>%
  filter(cyl %in% c(4,6,8)) %>%
  ggplot(aes(x = factor(cyl), y = cty, fill = factor(cyl),color = factor(cyl))) +
  # add half-violin from {ggdist} package
  ggdist::stat_halfeye(
    ## custom bandwidth
    adjust = 0.5,
    ## move geom to the right
    justification = -.2,
    ## remove slab interval
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(
    width = .15,
    ## remove outliers
    outlier.color = NA,
    alpha = 0.5
  ) +
  # Add dot plots from {ggdist} package
  ggdist::stat_dots(
    ## orientation to the left
    side = "left",
    ## move geom to the left
    justification = 1.1,
    ## adjust grouping (binning) of observations
    binwidth = .25
  ) +
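  # Lancet color palette from {ggsci}
  scale_fill_lancet() +
  scale_color_lancet() +
  # complete theme: white background, no grid lines
  theme_classic() +
  # plot title, axis label, and legend titles
  labs(title = "Raincloud_plot",
       x = "cyl",
       fill = "cyl_Type", color = "cyl_Type") +
  coord_flip()
p
[Figure: final raincloud plot with title and legend title]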

That is basically it. Finally, save the plot:

ggsave("raincloud_plot.jpg", plot = p, height = 4, width = 5)
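
ggsave() picks the graphics device from the file extension, and width and height are in inches by default. For figures that will be resized later, a vector format such as PDF may be a better choice than JPEG. A minimal sketch, using a stand-in plot and a temporary directory so that it runs on its own (p_demo and pdf_path are illustrative names):

```r
library(ggplot2)

# Stand-in plot so the snippet is self-contained;
# in practice, pass the raincloud plot p instead
p_demo <- ggplot(mpg, aes(x = factor(cyl), y = cty)) +
  geom_boxplot()

# Write to a temporary directory; replace with your own output path
pdf_path <- file.path(tempdir(), "raincloud_plot.pdf")

# Vector output scales without pixelation
ggsave(pdf_path, plot = p_demo, width = 5, height = 4)
```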

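4. Summary

As the saying goes, a sword's edge comes from the whetstone, and the plum blossom's fragrance comes from the bitter cold.

Plotting works the same way: a simple plot takes three lines of code, but a polished, good-looking one always takes more work.

Why can I produce plots so quickly?

Because I keep a ggplot2 cheat sheet on hand.

[Figure: ggplot2 cheat sheet]

It covers the syntax for almost every common plot type, so most plotting requests are easy to handle.

Thanks for reading. If this was useful, please like, follow, and share!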
