Drawing a polished Venn diagram with ggplot2
2021-04-28
R语言数据分析指南 (R Data Analysis Guide)
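A Venn diagram rarely matters as much in a paper as other kinds of figures, so the main goal of this post is not to teach you how to draw a fancy one. Instead, it uses the code from the plotting process to introduce some important plotting functions. I hope you enjoy it.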
library(ggvenn)
library(tidyverse)
library(ggtext)
Create the data
P <- tibble(
  A = sample(LETTERS, 15),
  B = sample(LETTERS, 15),
  C = sample(LETTERS, 15),
  D = sample(LETTERS, 15))
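Because sample() draws at random, every run produces different sets, so element labels typed by hand later on will only match one particular draw. A minimal sketch of making the draw repeatable with base R's set.seed(); the seed value 2021 and the variable names are arbitrary choices for illustration, not taken from the original post.

```r
# Fixing the seed makes sample() return the same letters on every run.
# The seed value (2021) is an arbitrary assumption, not from the original post.
set.seed(2021)
first_draw <- sample(LETTERS, 15)

set.seed(2021)
second_draw <- sample(LETTERS, 15)

# Same seed, same draw
identical(first_draw, second_draw)  # TRUE
```

Calling set.seed() once, right before building P, would be enough to make the whole post reproducible on a given R version.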
Draw the Venn diagram
list(A = P$A, B = P$B, C = P$C, D = P$D) %>%
  ggvenn(show_percentage = TRUE, show_elements = FALSE, label_sep = ",",
         digits = 1, stroke_color = "white",
         fill_color = c("#E41A1C", "#1E90FF", "#FF8C00",
                        "#4DAF4A", "#984EA3"),
         set_name_color = c("#E41A1C", "#1E90FF", "#FF8C00", "#984EA3"))
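First convert the character vectors to data frames, because we want to try the inner_join() function first
# Give every data frame the same column name so the joins share an explicit key
A <- tibble(element = P$A)
B <- tibble(element = P$B)
C <- tibble(element = P$C)
D <- tibble(element = P$D)
Find the elements shared by all four groups
inner_join(A, B, by = "element") %>%
  inner_join(C, by = "element") %>%
  inner_join(D, by = "element")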
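The chained joins above are a left-to-right fold of one pairwise operation over the groups, which can be written once for any number of groups. A minimal sketch, assuming base R's merge() as a stand-in for inner_join() so that the snippet needs no packages; the sets and the names sets, frames and common are hypothetical and only serve the illustration.

```r
# Hypothetical fixed sets, chosen so the result is easy to check by eye
sets <- list(
  A = c("a", "b", "c", "d"),
  B = c("b", "c", "d", "e"),
  C = c("c", "d", "e", "f"),
  D = c("d", "c", "g")
)

# One single-column data frame per set
frames <- lapply(sets, function(s) {
  data.frame(element = s, stringsAsFactors = FALSE)
})

# Reduce() applies the pairwise inner join from left to right:
# ((A join B) join C) join D
common <- Reduce(
  function(left, right) merge(left, right, by = "element"),
  frames
)

common$element  # "c" "d"
```

With the tidyverse loaded, purrr::reduce() together with inner_join() should express the same fold.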
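As you can see, inner_join() is convenient for finding the elements shared by two groups, but how should we handle combinations across several groups?
Let's solve this with a few custom functions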
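# Intersection of every vector in the list x, computed recursively
Intersect <- function (x) {
  if (length(x) == 1) {
    unlist(x)
  } else if (length(x) == 2) {
    intersect(x[[1]], x[[2]])
  } else if (length(x) > 2) {
    intersect(x[[1]], Intersect(x[-1]))
  }
}
# Union of every vector in the list x, computed recursively
Union <- function (x) {
  if (length(x) == 1) {
    unlist(x)
  } else if (length(x) == 2) {
    union(x[[1]], x[[2]])
  } else if (length(x) > 2) {
    union(x[[1]], Union(x[-1]))
  }
}
# Elements present in every set of x and in none of the sets of y
# Note: this definition masks base R's diff() in the global environment
diff <- function (x, y) {
  xx <- Intersect(x)
  yy <- Union(y)
  setdiff(xx, yy)
}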
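The recursive helpers above fold a two-argument set operation over a list, which base R's Reduce() can express directly. A minimal sketch of equivalent versions; the names intersect_all, union_all and only_in are hypothetical and not from the original post, and the demo sets are made up so the results can be checked by eye.

```r
# Hypothetical Reduce()-based equivalents of the recursive helpers
intersect_all <- function(sets) Reduce(intersect, sets)
union_all <- function(sets) Reduce(union, sets)

# Elements found in every set of `present` and in no set of `absent`
only_in <- function(present, absent) {
  setdiff(intersect_all(present), union_all(absent))
}

# Made-up demo sets
demo <- list(
  A = c("a", "b", "c"),
  B = c("b", "c", "d"),
  C = c("c", "d", "e")
)

intersect_all(demo)                     # "c"
union_all(demo)                         # "a" "b" "c" "d" "e"
only_in(demo[c("B", "C")], demo["A"])   # "d"
```

Choosing a name such as only_in also avoids masking base R's diff().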
xx <- list(A = P$A, B = P$B, C = P$C, D = P$D)
Elements shared by all four groups
Intersect(xx)
Present in both C and D, absent from A and B
diff(xx[c("C", "D")], xx[c("A", "B")])
Present in B, C and D, absent from A
diff(xx[c("B", "C", "D")], xx["A"])
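The calls above pick one region of the diagram at a time. The same idea can be looped over every non-empty combination of set names to list the contents of every region at once. A minimal sketch in base R; exclusive_region, combos and regions are hypothetical names, and the demo sets are made up rather than taken from P.

```r
# Hypothetical helper: elements that belong to exactly the sets named in `inside`
exclusive_region <- function(sets, inside) {
  outside <- setdiff(names(sets), inside)
  shared <- Reduce(intersect, sets[inside])
  if (length(outside) == 0) {
    return(shared)
  }
  setdiff(shared, Reduce(union, sets[outside]))
}

# Made-up demo sets
demo <- list(
  A = c("a", "b", "c"),
  B = c("b", "c", "d"),
  C = c("c", "d", "e")
)

# Every non-empty combination of set names: A, B, C, A&B, A&C, B&C, A&B&C
combos <- unlist(
  lapply(seq_along(demo), function(k) combn(names(demo), k, simplify = FALSE)),
  recursive = FALSE
)

regions <- lapply(combos, function(inside) exclusive_region(demo, inside))
names(regions) <- vapply(combos, paste, character(1), collapse = "&")

regions[["A&B&C"]]  # "c"
regions[["B&C"]]    # "d"
```

Every element lands in exactly one region, so the region sizes add up to the size of the overall union, which is a handy sanity check.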
This way we can get the element names for any combination of groups. It is worth mentioning, though, that ggvenn can display the elements of each region itself
list(A = P$A, B = P$B, C = P$C, D = P$D) %>%
  ggvenn(show_percentage = TRUE, show_elements = TRUE, label_sep = ",",
         digits = 1, stroke_color = "white",
         fill_color = c("#E41A1C", "#1E90FF", "#FF8C00",
                        "#4DAF4A", "#984EA3"),
         set_name_color = c("#E41A1C", "#1E90FF", "#FF8C00", "#984EA3"))
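Setting show_elements = TRUE displays the elements of each region, but long element text spoils the look. Since the plot supports ggplot2 syntax, we can add custom annotations to it instead
Now for a neat trick
Define the text positions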
txt <- data.frame(
  x = c(1.3, 2, -2),
  y = c(1.5, -0.2, -0.2),
  label = c("Y", "D,F,U,X", "Q,M"))
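The labels in txt are typed by hand from one particular random draw, so they go stale as soon as the data is resampled. A minimal sketch of building label strings from the sets themselves in base R; the sets below are made-up stand-ins for P$A to P$D, the names in_all, only_cd and txt_auto are hypothetical, and the coordinates are simply reused from the first two rows of txt.

```r
# Made-up sets standing in for P$A ... P$D
sets <- list(
  A = c("Q", "M", "Y", "D"),
  B = c("Y", "D", "F"),
  C = c("Y", "D", "F", "U"),
  D = c("Y", "F", "U", "X")
)

# Elements shared by all four sets
in_all <- Reduce(intersect, sets)

# Elements in both C and D but in neither A nor B
only_cd <- setdiff(
  Reduce(intersect, sets[c("C", "D")]),
  Reduce(union, sets[c("A", "B")])
)

# Collapse each region into one comma-separated label
txt_auto <- data.frame(
  x = c(1.3, 2),
  y = c(1.5, -0.2),
  label = c(paste(in_all, collapse = ","), paste(only_cd, collapse = ",")),
  stringsAsFactors = FALSE
)

txt_auto$label  # "Y" "U"
```

The x and y positions still have to be chosen by eye, since they depend on where each region sits in the drawn diagram.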
Use geom_curve() to draw the curved arrows and geom_richtext() to add the text
list(A = P$A, B = P$B, C = P$C, D = P$D) %>%
  ggvenn(show_percentage = TRUE, show_elements = FALSE, label_sep = ",",
         digits = 1, stroke_color = "white",
         fill_color = c("#E41A1C", "#1E90FF", "#FF8C00",
                        "#4DAF4A", "#984EA3"),
         set_name_color = c("#E41A1C", "#1E90FF", "#FF8C00", "#984EA3")) +
  geom_curve(aes(x = 0, y = 0.2, xend = 1.3, yend = 1.5),
             arrow = arrow(length = unit(0.07, "inch"),
                           ends = "first"), size = 0.3,
             color = "grey30", curvature = 0.2) +
  geom_curve(aes(x = 0.1, y = -0.6, xend = 2, yend = -0.3),
             arrow = arrow(length = unit(0.07, "inch"),
                           ends = "first"), size = 0.3,
             color = "grey30", curvature = -0.1) +
  geom_curve(aes(x = -0.5, y = -0.4, xend = -2, yend = -0.3),
             arrow = arrow(length = unit(0.07, "inch"),
                           ends = "first"), size = 0.3,
             color = "grey30", curvature = -0.1) +
  geom_richtext(
    data = txt,
    aes(x, y, label = label),
    hjust = 1, vjust = 1, angle = 30)