
Learning to plot from Nature: a complete ggplot2 stacked bar chart example in R

2022-06-30  小明的数据分析笔记本

Paper

A global reptile assessment highlights shared conservation needs of tetrapods

https://www.nature.com/articles/s41586-022-04664-7#Sec33

Data and code link

https://github.com/j-marin/Global-reptile-assessment-

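Today's post reproduces the stacked bar chart in Figure 1a of the paper. I could not find the plotting code for this figure, but I did find the source data set, and with the source data we can write our own code to draw it.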

[Image: Figure 1a of the paper]

[Image: partial screenshot of the plotting data set]

Read the data set

library(readxl)
dat01<-read_excel("data/20220630/41586_2022_4664_MOESM3_ESM.xlsx",
                  sheet = "Fig 1a")
head(dat01)

The most basic stacked bar chart

library(ggplot2)
ggplot(data = dat01,aes(x=className,y=n,fill=rlCodes))+
  geom_bar(stat = "identity",
           position = "stack")
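As a side note, ggplot2 also provides geom_col(), which draws bars from values already present in the data, the same job geom_bar(stat = "identity") does here. Below is a minimal sketch on a made-up table: the column names follow the post, but the counts are invented for illustration and are not taken from the paper.

```r
library(ggplot2)

# Toy stand-in for dat01: same column names as in the post, invented counts.
toy <- data.frame(
  className = c("Birds", "Birds", "Mammals", "Mammals"),
  rlCodes   = c("LC", "CR", "LC", "CR"),
  n         = c(80, 20, 200, 100)
)

# geom_col() uses the y values as bar heights and stacks them by default.
p_col <- ggplot(toy, aes(x = className, y = n, fill = rlCodes)) +
  geom_col()

# Inspect the computed bar segments: ymin/ymax give the bottom and top
# of every segment after stacking.
built <- layer_data(p_col)
max(built$ymax)  # height of the tallest stack
```

With these toy counts the tallest stack is the Mammals bar, whose height is the sum of its two segments.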

Adjust the order of the x axis and of the legend

table(dat01$className)
table(dat01$rlCodes)

dat01$className<-factor(
  dat01$className,
  levels = c("Amphibians","Mammals","Reptiles","Birds")
)

dat01$rlCodes<-factor(
  dat01$rlCodes,
  levels = rev(c("EX","EW","CR","EN","VU","DD","NT","LC")))
ggplot(data = dat01,aes(x=className,y=n,fill=rlCodes))+
  geom_bar(stat = "identity",
           position = "stack")+
  scale_fill_discrete(limits=c("EX","EW","CR",
                               "EN","VU","DD","NT","LC"))

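The small tip here: the order of the legend entries can be set with scale_fill_discrete(limits=c("EX","EW","CR", "EN","VU","DD","NT","LC")), while the factor levels set above control the order of the x axis and of the stacked segments.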
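To see the two ordering mechanisms separately, here is a small sketch. The data frame is a made-up stand-in for dat01 (same column names, invented counts, only two of the eight Red List codes), so treat it as an illustration rather than the author's data.

```r
library(ggplot2)

# Toy stand-in for dat01: invented counts, two Red List codes only.
toy <- data.frame(
  className = c("Birds", "Birds", "Mammals", "Mammals"),
  rlCodes   = c("LC", "CR", "LC", "CR"),
  n         = c(80, 20, 200, 100)
)

# Without explicit levels, factor() orders the values alphabetically.
default_levels <- levels(factor(toy$rlCodes))

# Explicit levels fix the order used for stacking the bar segments.
toy$rlCodes <- factor(toy$rlCodes, levels = c("LC", "CR"))
levels(toy$rlCodes)

# limits sets the order of the legend entries independently of the levels.
p <- ggplot(toy, aes(x = className, y = n, fill = rlCodes)) +
  geom_col() +
  scale_fill_discrete(limits = c("CR", "LC"))
p
```

Here the factor levels are LC then CR, while the legend is drawn as CR then LC, which mirrors the rev() plus limits combination used in the post.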

The stacked bar chart currently shows the raw counts; next, convert it to proportions

ggplot(data = dat01,aes(x=className,y=n,fill=rlCodes))+
  geom_bar(stat = "identity",
           position = "fill")+
  scale_fill_discrete(limits=c("EX","EW","CR",
                               "EN","VU","DD","NT","LC"))

All that is needed is to change position = "stack" to position = "fill"
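What position = "fill" does can be reproduced by hand: every count is divided by the total of its own bar, so each bar ends at 1. A minimal base R sketch on a made-up table (column names as in the post, invented counts):

```r
# Toy stand-in for dat01: invented counts, two Red List codes only.
toy <- data.frame(
  className = c("Birds", "Birds", "Mammals", "Mammals"),
  rlCodes   = c("LC", "CR", "LC", "CR"),
  n         = c(80, 20, 200, 100)
)

# Divide every count by the total of its own bar (its className group).
toy$prop <- ave(toy$n, toy$className, FUN = function(v) v / sum(v))
toy

# Within each class the proportions add up to 1,
# which is the full height of a "fill" bar.
bar_heights <- tapply(toy$prop, toy$className, sum)
bar_heights
```

Because the rescaling happens inside ggplot2, the n column in the data itself is left unchanged.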

Add the text labels at the top

library(tidyverse)
dat01 %>% 
  group_by(className) %>% 
  summarise(total_number=sum(n)) %>% 
  ungroup() %>% 
  mutate(ratio=total_number/sum(total_number)) %>% 
  mutate(ratio=scales::percent(ratio)) -> dat02

ggplot(data = dat01,aes(x=className,y=n,fill=rlCodes))+
  geom_bar(stat = "identity",
           position = "fill")+
  scale_fill_discrete(limits=c("EX","EW","CR",
                               "EN","VU","DD","NT","LC"))+
  geom_text(data=dat02,
            aes(x=className,y=1,
                label=paste0(total_number,"\n","(",ratio,")")),
            inherit.aes = FALSE,
            vjust=-0.2)+
  scale_y_continuous(expand = expansion(mult=c(0,0.1)))
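The label strings built in dat02 can be checked on a small table before plotting. Note that the ratio computed above is each class's share of the grand total across all classes. The sketch below uses base R (aggregate() and sprintf()) as a swapped-in alternative to dplyr and scales::percent(), and the counts are invented for illustration.

```r
# Toy stand-in for dat01: invented counts, two Red List codes only.
toy <- data.frame(
  className = c("Birds", "Birds", "Mammals", "Mammals"),
  rlCodes   = c("LC", "CR", "LC", "CR"),
  n         = c(80, 20, 200, 100)
)

# Total count per class (the number printed above each bar).
tot <- aggregate(n ~ className, data = toy, FUN = sum)

# Share of each class in the grand total, as a whole-number percentage.
tot$ratio <- sprintf("%.0f%%", 100 * tot$n / sum(tot$n))

# Same layout as in the post: total on the first line, share in brackets below.
tot$label <- paste0(tot$n, "\n", "(", tot$ratio, ")")
tot
```

In the plot, these labels sit at y = 1 (the top of a "fill" bar), and vjust = -0.2 together with the extra expansion of the y axis leaves room for the two lines of text.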

Change the colours and other theme settings

ggplot(data = dat01,aes(x=className,y=n,fill=rlCodes))+
  geom_bar(stat = "identity",
           position = "fill")+
  scale_fill_manual(values = c("LC"="#98d09d","NT"="#d7e698",
                               "DD"="#dadada","VU"="#fbf398",
                               "EN"="#f7a895","CR"="#e77381",
                               "EW"="#9b8191","EX"="#8f888b"),
                    limits=c("EX","EW","CR","EN","VU","DD","NT","LC"))+
  geom_text(data=dat02,
            aes(x=className,y=1,
                label=paste0(total_number,"\n","(",ratio,")")),
            inherit.aes = FALSE,
            vjust=-0.2)+
  scale_y_continuous(expand = expansion(mult=c(0.01,0.1)),
                     labels = scales::percent_format())+
  theme(panel.background = element_blank(),
        axis.line = element_line(),
        legend.position = "bottom")+
  labs(x=NULL,y="Species threatened (%)")+
  guides(fill=guide_legend(title = NULL,nrow = 1,byrow = FALSE))
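In scale_fill_manual() the values vector is named, so colours are matched to categories by name rather than by position, and limits only decides the legend order. A small sketch with two of the colours from the post on a made-up table (invented counts):

```r
library(ggplot2)

# Toy stand-in for dat01: invented counts, two Red List codes only.
toy <- data.frame(
  className = c("Birds", "Birds", "Mammals", "Mammals"),
  rlCodes   = c("LC", "CR", "LC", "CR"),
  n         = c(80, 20, 200, 100)
)

# Named values: the order of this vector does not matter.
pal <- c("LC" = "#98d09d", "CR" = "#e77381")

p <- ggplot(toy, aes(x = className, y = n, fill = rlCodes)) +
  geom_col(position = "fill") +
  scale_fill_manual(values = pal, limits = c("CR", "LC"))

# Look up which colours the bar segments actually received.
built <- layer_data(p)
unique(built$fill)
```

Using names also guards against colours silently shifting if a category is missing from a subset of the data.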

Make the cover image

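library(patchwork)
# p1 and p2 are ggplot objects; the post never assigns them, so first save
# two of the plots above to these names (p1 <- ggplot(...) + ...).
p2+p1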
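Since the post does not show how p1 and p2 were created, the following is only a sketch of how two plots can be combined with patchwork. Which panels the author actually used is an assumption; here p1 holds the counts and p2 the proportions, both built from a made-up table with invented counts.

```r
library(ggplot2)
library(patchwork)

# Toy stand-in for dat01: invented counts, two Red List codes only.
toy <- data.frame(
  className = c("Birds", "Birds", "Mammals", "Mammals"),
  rlCodes   = c("LC", "CR", "LC", "CR"),
  n         = c(80, 20, 200, 100)
)

base <- ggplot(toy, aes(x = className, y = n, fill = rlCodes))

# Hypothetical choice of panels: raw counts in p1, proportions in p2.
p1 <- base + geom_col(position = "stack")
p2 <- base + geom_col(position = "fill")

# patchwork overloads `+` so that two plots are placed side by side.
combined <- p2 + p1
combined
```

The left-hand operand is drawn first, so p2 + p1 places the proportion plot on the left.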

The example data can be downloaded from the paper, and the example code can be copied from this post.

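You are welcome to follow my WeChat official account, 小明的数据分析笔记本. It mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics and population genetics papers related to horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.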
