ggpubr Package Tutorial Series (Part 1)
ggpubr: 'ggplot2' Based Publication Ready Plots
ggpubr is a visualization package built on ggplot2 that can produce publication-quality figures with a single command.
ggplot2, by Hadley Wickham, is an excellent and flexible package for elegant data visualization in R. However, the default plots require some formatting before they can be sent for publication. Furthermore, the syntax for customizing a ggplot can be opaque, which raises the difficulty for researchers without advanced R programming skills. The 'ggpubr' package provides easy-to-use functions for creating and customizing 'ggplot2'-based publication-ready plots.
1. Installing and loading the package
ggpubr can be installed from either CRAN or GitHub.
Install from CRAN as follows:
install.packages("ggpubr")
library(ggpubr)
Or, install the latest version from GitHub as follows:
if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")
library(ggpubr)
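2. Drawing common basic plots
Distribution plots
01. Density plot with mean lines and marginal rug
# Build the dataset
set.seed(1234)
df <- data.frame(sex = factor(rep(c("F", "M"), each = 200)),
                 weight = c(rnorm(200, 55), rnorm(200, 58)))
# Preview the data
head(df)
# Draw the density plot
# rug = TRUE adds the marginal rug; add accepts "mean" or "median"
ggdensity(df, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
          palette = c("#00AFBB", "#E7B800"))
Figure 1. Density plot of weight grouped by sex. The x-axis is weight and the y-axis is the density, which is estimated automatically. The rug along the x-axis shows where the individual samples fall. Both the outline color and the fill are mapped to sex, and palette sets the custom colors. The result is easy on the eyes.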
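The add argument mentioned in the comment above also accepts "median". As a minimal sketch, the same data can be redrawn with per-group median lines and the rug switched off. The object name p_median is only an illustrative choice, and the data frame is rebuilt here so the snippet runs on its own.

```r
library(ggpubr)

# Rebuild the example data so this snippet is self-contained
set.seed(1234)
df <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 55), rnorm(200, 58))
)

# Same density plot, but with median lines and without the rug
p_median <- ggdensity(df, x = "weight", add = "median", rug = FALSE,
                      color = "sex", fill = "sex",
                      palette = c("#00AFBB", "#E7B800"))
p_median
```

For roughly symmetric data such as this simulated sample, the mean and median lines should land close to each other; they diverge more on skewed data.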
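02. Histogram with mean lines and marginal rug
gghistogram(df, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
            palette = c("#00AFBB", "#E7B800"))
Figure 2. Histogram with mean lines and marginal rug. It shows the same data as Figure 1, except that the y-axis is the raw count rather than the density.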
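Box plots and violin plots (boxplot/violinplot)
01. Box plot + group shapes
# Load the ToothGrowth dataset
data("ToothGrowth")
df1 <- ToothGrowth
head(df1)
# add = "jitter" overlays jittered points; their shape is mapped to dose
p <- ggboxplot(df1, x = "dose", y = "len", color = "dose",
               palette = c("#00AFBB", "#E7B800", "#FC4E07"),
               add = "jitter", shape = "dose")
p
Figure 3. Box plot colored by group. Giving the sample points a different shape per group makes groups or batches easier to tell apart.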
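02. Box plot + group shapes + statistics
# Add p-values between groups; the pairs to compare can be chosen freely
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
p + stat_compare_means(comparisons = my_comparisons) +  # pairwise comparisons between groups
  stat_compare_means(label.y = 50)                      # global p-value across all groups, placed at y = 50
Figure 4. stat_compare_means adds comparison brackets between groups along with their p-values.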
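The numbers printed on the plot can also be inspected as a table. The sketch below uses ggpubr's compare_means() to list the pairwise tests, then redraws the plot with the test chosen explicitly through the method argument. The choice of "t.test" and "anova" here is only an illustration, not a recommendation for this dataset.

```r
library(ggpubr)
data("ToothGrowth")

# Pairwise Wilcoxon tests of len between dose levels, returned as a data frame
res <- compare_means(len ~ dose, data = ToothGrowth, method = "wilcox.test")
res

# The same comparisons drawn on the plot, with the tests chosen explicitly
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
p_tests <- ggboxplot(ToothGrowth, x = "dose", y = "len", color = "dose",
                     palette = c("#00AFBB", "#E7B800", "#FC4E07")) +
  stat_compare_means(comparisons = my_comparisons, method = "t.test") +  # pairwise t-tests
  stat_compare_means(method = "anova", label.y = 50)                     # global ANOVA p-value
p_tests
```

Looking at the table first makes it easier to decide which pairs are worth annotating before the brackets are added to the figure.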
03. Violin plot with inner box plot + significance stars
ggviolin(df1, x = "dose", y = "len", fill = "dose",
         palette = c("#00AFBB", "#E7B800", "#FC4E07"),
         add = "boxplot", add.params = list(fill = "white")) +
  stat_compare_means(comparisons = my_comparisons, label = "p.signif") +  # label = "p.signif" shows significance stars
  stat_compare_means(label.y = 50)
Figure 5. ggviolin draws the violin plot, and add = "boxplot" places a box plot inside each violin. Setting label = "p.signif" in stat_compare_means labels the comparison brackets with significance stars instead of numeric p-values.
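To make the star labels concrete, here is a small base-R sketch of the mapping from p-values to symbols. It assumes the cutoffs documented for ggpubr's default "p.signif" labels (0.0001, 0.001, 0.01 and 0.05); the helper name p_to_stars is hypothetical and is not part of ggpubr.

```r
# Hypothetical helper: map p-values to significance symbols.
# Cutoffs are an assumption based on the documented ggpubr defaults.
p_to_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 1e-4, 1e-3, 1e-2, 0.05, 1),
                   labels = c("****", "***", "**", "*", "ns"),
                   include.lowest = TRUE))
}

p_to_stars(c(0.2, 0.03, 0.005, 0.0005, 0.00001))
```

If different thresholds are needed on the plot itself, stat_compare_means() takes a symnum.args list with cutpoints and symbols.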
Bar plots (barplot)
data("mtcars")
df2 <- mtcars
df2$cyl <- factor(df2$cyl)
df2$name <- rownames(df2)  # add a name column
head(df2[, c("name", "wt", "mpg", "cyl")])
ggbarplot(df2, x = "name", y = "mpg", fill = "cyl", color = "white",
          palette = "npg",         # Nature Publishing Group palette
          sort.val = "desc",       # sort values in descending order
          sort.by.groups = FALSE,  # do not sort within groups
          x.text.angle = 60)
Figure 6. Bar plot of fuel economy (mpg, miles per gallon) for each car. Bars are filled by cyl using the Nature palette (the ggsci palettes are supported, e.g. "npg", "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and "rickandmorty") and sorted by value in descending order.
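# Sort within groups
ggbarplot(df2, x = "name", y = "mpg", fill = "cyl", color = "white",
          palette = "aaas",       # Science (AAAS) palette
          sort.val = "asc",       # ascending order, as opposed to "desc"
          sort.by.groups = TRUE,  # sort within each group
          x.text.angle = 60)      # rotate the x-axis labels; 90 also works
Figure 7. The same plot as above, switched to the Science palette. Bars are sorted in ascending order within each group, and the x-axis labels are rotated by 60 degrees to prevent overlap.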
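Deviation graphs
A deviation graph shows how far each value deviates from a reference value.
# z-score standardization: subtract the mean, then divide by the standard deviation
df2$mpg_z <- (df2$mpg - mean(df2$mpg)) / sd(df2$mpg)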
df2$mpg_grp <- factor(ifelse(df2$mpg_z<0, "low", "high"), levels = c("low", "high"))
head(df2[, c("name", "wt", "mpg", "mpg_grp", "cyl")])
ggbarplot(df2, x="name", y="mpg_z", fill = "mpg_grp", color = "white",
palette = "jco", sort.val = "asc", sort.by.groups = FALSE,
x.text.angle=60, ylab = "MPG z-score", xlab = FALSE, legend.title="MPG Group")
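Figure 8. Bar plot of z-scores, i.e. the raw value minus the mean, divided by the standard deviation. It uses the "jco" (Journal of Clinical Oncology) palette and is sorted in ascending order without grouping.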
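As a quick check on the z-score step, the manual formula can be compared with base R's scale(), which centers and scales by the sample standard deviation by default. This is a minimal sketch on the same mtcars column; the variable names z_manual and z_scale are illustrative.

```r
data("mtcars")

# Manual z-score: subtract the mean, divide by the standard deviation
z_manual <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)

# scale() returns a one-column matrix, so flatten it to a plain vector
z_scale <- as.numeric(scale(mtcars$mpg))

all.equal(z_manual, z_scale)  # TRUE: both approaches agree
sd(z_manual)                  # standardized values have standard deviation 1
```

Because the z-scores are centered on zero, splitting them at 0 into "low" and "high" is the same as splitting the raw mpg values at their mean.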
# Swap the axes
ggbarplot(df2, x = "name", y = "mpg_z", fill = "mpg_grp", color = "white",
          palette = "jco", sort.val = "desc", sort.by.groups = FALSE,
          x.text.angle = 90, ylab = "MPG z-score", xlab = FALSE,
          legend.title = "MPG Group", rotate = TRUE, ggtheme = theme_minimal())  # rotate = TRUE swaps the x and y axes
Figure 9. rotate = TRUE flips the axes, turning the vertical bar plot into a horizontal one.
Lollipop charts
A lollipop chart can be used in place of a bar plot.
ggdotchart(df2, x = "name", y = "mpg", color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "ascending",
           add = "segments", ggtheme = theme_pubr())
Figure 10. When a document already has many bar plots, a lollipop chart adds some variety.
Setting other parameters
ggdotchart(df2, x = "name", y = "mpg", color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending", add = "segments", rotate = TRUE,
           group = "cyl", dot.size = 6,
           label = round(df2$mpg),
           font.label = list(color = "white", size = 9, vjust = 0.5),
           ggtheme = theme_pubr())
Figure 11. A lightly customized lollipop chart: rotate = TRUE swaps the axes, dot.size = 6 enlarges the dots, label = round(df2$mpg) prints the rounded value inside each dot, and font.label controls the label styling.
Lollipop deviation chart
ggdotchart(df2, x = "name", y = "mpg_z",
           color = "cyl",                                      # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),       # Custom color palette
           sorting = "descending",                             # Sort values in descending order
           add = "segments",                                   # Add segments from y = 0 to dots
           add.params = list(color = "lightgray", size = 2),   # Change segment color and size
           group = "cyl",                                      # Order by groups
           dot.size = 6,                                       # Large dot size
           label = round(df2$mpg_z, 1),                        # Add z-scores as dot labels, rounded to one decimal
           font.label = list(color = "white", size = 9, vjust = 0.5),  # Adjust label parameters
           ggtheme = theme_pubr()                              # ggplot2 theme
)+
  geom_hline(yintercept = 0, linetype = 2, color = "lightgray")
Figure 12. As with the deviation bar plot, the z-scores are plotted in place of the raw values.
Cleveland dot plot
ggdotchart(df2, x = "name", y = "mpg",
           color = "cyl",                                 # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),  # Custom color palette
           sorting = "descending",                        # Sort values in descending order
           rotate = TRUE,                                 # Rotate vertically
           dot.size = 2,                                  # Small dot size
           y.text.col = TRUE,                             # Color y text by groups
           ggtheme = theme_pubr()                         # ggplot2 theme
)+
  theme_cleveland()                                       # Add dashed grids
Figure 13. theme_cleveland() applies the Cleveland dot plot style.
3. Common plotting functions and parameters
Basic plotting functions
gghistogram    Histogram plot
ggdensity      Density plot
ggdotplot      Dot plot
ggdotchart     Cleveland's Dot Plots
ggline         Line plot
ggbarplot      Bar plot
ggstripchart   Stripcharts
ggboxplot      Box plot
ggviolin       Violin plot
ggpie          Pie chart
ggqqplot       QQ Plots
ggscatter      Scatter plot
ggmaplot       MA-plot from means and log fold changes
ggpaired       Plot Paired Data (paired-sample plot, not a scatter plot matrix)
ggerrorplot    Visualizing Error
Helper functions and graphical parameters
ggtext              Text
border              Set ggplot Panel Border Line
grids               Add Grids to a ggplot
font                Change the Appearance of Titles and Axis Labels
bgcolor             Change ggplot Panel Background Color
background_image    Add Background Image to ggplot2
facet               Facet a ggplot into Multiple Panels
ggpar               Graphical parameters
ggparagraph         Draw a Paragraph of Text
ggtexttable         Draw a Textual Table
ggadd               Add Summary Statistics or a Geom onto a ggplot
ggarrange           Arrange Multiple ggplots
gradient_color      Set Gradient Color
xscale              Change Axis Scale: log2, log10 and more
add_summary         Add Summary Statistics onto a ggplot
set_palette         Set Color Palette
rotate              Rotate a ggplot Horizontally
rotate_axis_text    Rotate Axes Text
stat_stars          Add Stars to a Scatter Plot
stat_cor            Add Correlation Coefficients with P-values to a Scatter Plot
stat_compare_means  Add Mean Comparison P-values to a ggplot
theme_transparent   Create a ggplot with Transparent Background
theme_pubr          Publication ready theme
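Two of the helpers listed above, ggpar and ggarrange, are often used together when assembling a multi-panel figure. The following is a minimal sketch under that assumption: it builds a box plot and a violin plot from ToothGrowth, adjusts the first with ggpar, and places both side by side. The axis titles, panel labels and object names (bp, vp, combined) are illustrative choices.

```r
library(ggpubr)
data("ToothGrowth")

# Two individual plots built from the same data
bp <- ggboxplot(ToothGrowth, x = "dose", y = "len", color = "dose", palette = "jco")
vp <- ggviolin(ToothGrowth, x = "dose", y = "len", fill = "dose", palette = "jco",
               add = "boxplot", add.params = list(fill = "white"))

# ggpar() changes graphical parameters of an existing plot
bp <- ggpar(bp, xlab = "Dose", ylab = "Tooth length", legend = "none")

# ggarrange() lays the plots out on one page with panel labels
combined <- ggarrange(bp, vp, ncol = 2, labels = c("A", "B"))
combined
```

Building each panel separately and arranging them at the end keeps the individual plots reusable, which helps when a journal asks for panels to be reordered.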
Sources:
https://www.rdocumentation.org/packages/ggpubr/versions/0.1.4
https://mp.weixin.qq.com/s/ZKxzKZ4NBTcsJ6vFimxoGA
http://blog.sciencenet.cn/blog-3334560-1091714.html