Plotting in R

Learning to make figures from Molecular Systems Biology: R

2021-12-07  小明的数据分析笔记本

Paper

A genome-scale TF-DNA interaction network of transcriptional regulation of Arabidopsis primary and specialized metabolism

https://www.embopress.org/doi/full/10.15252/msb.202110625


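The paper provides the data and code for the four bar charts in Figure 1. This post walks through the bar chart code and then shows how to combine several ggplot2 plots into a single figure with the R package patchwork.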



Download link for the bar chart data and code

https://github.com/melletang/ccp_y1h

First, read in the data

library(tidyverse)
library(readxl)
network <- readxl::read_excel("MSB-2021-10625-DatasetEV3-Network.xls",
                              sheet = "CC_Y1H_network")
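
Before reading a workbook it can help to check which sheets it actually contains. The sketch below does this on the example workbook bundled with readxl, not on the paper's dataset, so the sheet name "iris" here merely stands in for "CC_Y1H_network".

```r
library(readxl)

# Example workbook that ships with readxl (NOT the paper's dataset)
path <- readxl_example("datasets.xlsx")

# List the sheet names before choosing one to read
sheets <- excel_sheets(path)
print(sheets)

# Read one sheet by name, the same way the code above reads "CC_Y1H_network"
demo <- read_excel(path, sheet = "iris")
str(demo)
```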

Code for tidying the data

# Number of distinct promoters (genes) per pathway
binding_summary <- network %>% select(Promoter_AGI, Target_Pathway) %>% unique() %>% group_by(Target_Pathway) %>% 
  tally() %>% rename(num_gene = n)

# Add the number of distinct TFs per pathway
binding_summary <- left_join(binding_summary, 
                             network %>% select(TF_AGI, Target_Pathway) %>%
                               unique() %>% group_by(Target_Pathway) %>% tally() %>%
                               rename(num_tf = n),
                             by = "Target_Pathway")

# Add the number of distinct TF-promoter interactions per pathway
binding_summary <- left_join(binding_summary, 
                             network %>% select(TF_AGI, Promoter_AGI, Target_Pathway) %>%
                               unique() %>% group_by(Target_Pathway) %>% tally() %>%
                               rename(num_int = n),
                             by = "Target_Pathway")
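
The three counts can also be computed in a single pass with summarise() and n_distinct(). This is an alternative I would consider, not the code from the paper's repository, and the toy_network table below is made up purely for illustration.

```r
library(dplyr)

# Hypothetical toy network: one row per TF-promoter interaction
toy_network <- tibble::tribble(
  ~TF_AGI, ~Promoter_AGI, ~Target_Pathway,
  "TF1",   "G1",          "TCA",
  "TF1",   "G2",          "TCA",
  "TF2",   "G1",          "TCA",
  "TF2",   "G1",          "TCA",        # duplicated row, counted only once
  "TF1",   "G3",          "Glycolysis"
)

toy_summary <- toy_network %>%
  group_by(Target_Pathway) %>%
  summarise(
    num_gene = n_distinct(Promoter_AGI),         # distinct promoters
    num_tf   = n_distinct(TF_AGI),               # distinct TFs
    num_int  = n_distinct(TF_AGI, Promoter_AGI)  # distinct TF-promoter pairs
  )
print(toy_summary)
```

On this toy table TCA ends up with 2 genes, 2 TFs and 3 interactions, because the duplicated TF2/G1 row is counted once.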

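A function that was new to me here is tally(), which comes from the dplyr package. It counts the number of rows in each group. Here is a quick demonstration with the iris dataset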

iris %>% group_by(Species) %>% tally()
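
For comparison, here is a small sketch of three equivalent ways to get the same per-group counts in dplyr. iris has 50 rows per species, so every variant returns 50 for each group.

```r
library(dplyr)

# group_by() + tally(): the form used in the paper's code
by_tally <- iris %>% group_by(Species) %>% tally()

# count() groups and tallies in one call
by_count <- iris %>% count(Species)

# The explicit form with summarise() and n()
by_summarise <- iris %>% group_by(Species) %>% summarise(n = n())

print(by_tally)
```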

Next comes the code for the four bar charts

library(ggplot2)

# Panel B: genes per pathway, with pathways ordered by gene count
panel_b <- ggplot(binding_summary, aes(reorder(Target_Pathway, num_gene), num_gene)) +
  geom_bar(stat = "identity", fill = "black") + coord_flip() + theme_bw() +
  ylab("Number of genes") + xlab("Pathway") + theme(
    axis.text = element_text(color = "black", size = 10),
    axis.title = element_text(color = "black", size = 10)
  )
panel_b


# Panel C: TFs per pathway; still ordered by num_gene so the pathway order matches panel B
panel_c <- ggplot(binding_summary, aes(reorder(Target_Pathway, num_gene), num_tf)) +
  geom_bar(stat = "identity", fill = "black") + coord_flip() + theme_bw() +
  ylab("Number of TFs") + xlab("Pathway") + theme(
    axis.text = element_text(color = "black", size = 10),
    axis.title = element_text(color = "black", size = 10),
    plot.margin = unit(c(0, 0.5, 0, 0), "cm")
  )

panel_c

# Panel D: interactions per pathway, same pathway order as panels B and C
panel_d <- ggplot(binding_summary, aes(reorder(Target_Pathway, num_gene), num_int)) +
  geom_bar(stat = "identity", fill = "black") +
  coord_flip() + theme_bw() + ylab("Number of interactions") + xlab("Pathway") + theme(
    axis.text = element_text(color = "black", size = 10),
    axis.title = element_text(color = "black", size = 10))

panel_d

# Number of distinct pathways targeted by each TF
num_path <- network %>% select(TF_AGI, Target_Pathway) %>% unique() %>% group_by(TF_AGI) %>% tally()

# Number of TFs (nn) for each pathway count (n); the output column is named explicitly
numpathbar <- num_path %>% count(n, name = "nn")

# Panel E: distribution of the number of pathways per TF
panel_e <- ggplot(numpathbar, aes(n, nn)) + geom_bar(stat = "identity", fill = "black") + theme_bw() +
  ylab("Number of TFs") + xlab("Number of Pathways") + theme(
  axis.text = element_text(color = "black", size = 10),
  axis.title = element_text(color = "black", size = 10)) + scale_x_continuous(breaks = seq(0, 12, 1))
panel_e
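
Panels B, C and D differ only in the plotted column and the axis label, so the repeated code could be wrapped in a small function. pathway_bar() below is a hypothetical helper of my own, not something from the paper's repository, and toy_summary is made-up data standing in for binding_summary.

```r
library(ggplot2)

# Hypothetical helper: horizontal bar chart of one count column per pathway,
# always ordered by num_gene so that all panels share the same pathway order
pathway_bar <- function(data, y_col, y_label) {
  ggplot(data, aes(reorder(Target_Pathway, num_gene), .data[[y_col]])) +
    geom_col(fill = "black") +   # geom_col() is shorthand for geom_bar(stat = "identity")
    coord_flip() +
    theme_bw() +
    labs(x = "Pathway", y = y_label) +
    theme(
      axis.text = element_text(color = "black", size = 10),
      axis.title = element_text(color = "black", size = 10)
    )
}

# Made-up numbers, for illustration only
toy_summary <- data.frame(
  Target_Pathway = c("TCA", "Glycolysis", "Flavonoid"),
  num_gene = c(12, 30, 7),
  num_tf = c(40, 55, 20)
)

toy_b <- pathway_bar(toy_summary, "num_gene", "Number of genes")
toy_c <- pathway_bar(toy_summary, "num_tf", "Number of TFs")
toy_b
```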

Finally, combine the panels

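Panel A was most likely drawn in PowerPoint. My approach here is to let an empty ggplot2 plot hold its place, export the combined figure to PowerPoint, and then draw panel A there.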

First, make an empty plot

ggplot()+
  theme_void() -> pA

Code for combining the panels

library(patchwork)
(pA + (panel_b/panel_c))/(panel_d+panel_e)
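
To see how the patchwork operators behave, here is a minimal sketch with throwaway plots built from mtcars (these are not the paper's panels). `+` and `|` place plots next to each other, `/` stacks them, and parentheses control the nesting.

```r
library(ggplot2)
library(patchwork)

# Throwaway plots, for illustration only
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(factor(cyl))) + geom_bar()
p3 <- ggplot(mtcars, aes(hp)) + geom_histogram(bins = 10)
blank <- ggplot() + theme_void()   # empty placeholder, like pA above

side_by_side <- p1 | p2            # one row, two columns
stacked <- p1 / p2                 # one column, two rows

# Top row: placeholder next to two stacked plots; bottom row: one wide plot
nested <- (blank + (p1 / p2)) / p3
nested
```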

Add the A, B, C, D, E panel labels

library(patchwork)
(pA + (panel_b/panel_c))/(panel_d+panel_e)+
  plot_layout(heights =c(2,1) )+
  plot_annotation(tag_levels = "A")
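
plot_annotation() can also produce other label styles. The sketch below again uses throwaway mtcars plots and labels the panels as (a), (b), (c) through the tag_levels, tag_prefix and tag_suffix arguments. The relative heights of 2:1 mirror the code above.

```r
library(ggplot2)
library(patchwork)

# Throwaway plots, for illustration only
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point()
p3 <- ggplot(mtcars, aes(factor(gear))) + geom_bar()

tagged <- (p1 + p2) / p3 +
  plot_layout(heights = c(2, 1)) +      # top row twice as tall as the bottom row
  plot_annotation(
    tag_levels = "a",                   # lowercase letters instead of "A"
    tag_prefix = "(",
    tag_suffix = ")"
  )
tagged
```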

Export to PowerPoint

library(patchwork)
(pA + (panel_b/panel_c))/(panel_d+panel_e)+
  plot_layout(heights =c(2,1) )+
  plot_annotation(tag_levels = "A") -> x

library(export)
export::graph2ppt(x=x,file="figure1.ppt",
                  width=10,
                  height=10,
                  aspectr=3/2)
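
If the export package is not available, one fallback is to save the combined figure as a PDF with ggplot2's ggsave(), which accepts patchwork objects, and finish panel A in a vector graphics editor. This is only a sketch with throwaway plots, and the output path in tempdir() is an arbitrary choice.

```r
library(ggplot2)
library(patchwork)

# Throwaway plots, for illustration only
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(factor(cyl))) + geom_bar()
combined <- p1 / p2 + plot_annotation(tag_levels = "A")

# Arbitrary output location; width and height are in inches by default
out_file <- file.path(tempdir(), "figure1_demo.pdf")
ggsave(out_file, plot = combined, width = 10, height = 10)
```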

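You are welcome to follow my WeChat official account

小明的数据分析笔记本

The account mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics, and population genetics papers about horticultural plants; 3. introductory bioinformatics study materials and my own study notes.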
