Plotting the population structure for a single K value
2022-02-14
宗肃書
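Stayed up late again. Early to bed next time!
Anyway, here is the plotting code from today.
The original script for plotting population structure is on GitHub: https://github.com/speciationgenomics/scripts/blob/master/plotADMIXTURE.r
However, the plots it produces are far too garish, and it cannot draw each K value as a separate figure.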
The original script's output looks like this:
image.png
So I modified the source code for the K=15 case, as follows:
#!/usr/bin/Rscript
# Usage: plotADMIXTURE.r -p <prefix> -i <info file, 2-column file with ind name and population/species name>
# -m <K value to plot, default 15> -k <max K value> -l <comma-separated list of populations/species in the order to be plotted>
# This modified R script reads the admixture files for min K (-m) through max K (-k) and makes a single barplot for min K only. It labels the individuals
# and splits them into populations or species according to the individual and population/species names in the 2-column file specified with -i.
# The order of populations/species follows the list of populations/species given with -l.
# Usage example: plotADMIXTURE.r -p fileXY -i file.ind.pop.txt -m 15 -k 15 -l pop1,pop2,pop3
# In this example, the script would use the file fileXY.15.Q to make one barplot for the three populations.
# file.ind.pop.txt should contain one line for each individual in the same order as in the admixture files e.g.
# ind1 pop1
# ind2 pop1
# ind3 pop2
# ind4 pop3
# Author: Joana Meier, September 2019
# CoAuthor: jychu, February 2022
# randomcoloR generates visually distinct colors; 15 colors cover every cluster at K=15
library(randomcoloR)
palette <- distinctColorPalette(15)
# Read in the arguments
library("optparse")
option_list = list(
make_option(c("-p", "--prefix"), type="character", default=NULL,
help="prefix name (with path if not in the current directory)", metavar="character"),
make_option(c("-i", "--infofile"), type="character", default=NULL,
help="info text file containing for each individual the population/species information", metavar="character"),
make_option(c("-k", "--maxK"), type="integer", default=NULL,
help="maximum K value", metavar="integer"),
make_option(c("-m", "--minK"), type="integer", default=15,
help="minimum K value", metavar="integer"),
make_option(c("-l", "--populations"), type="character", default=NULL,
help="comma-separated list of populations/species in the order to be plotted", metavar="character"),
make_option(c("-o", "--outPrefix"), type="character", default="default",
help="output prefix (default: name provided with prefix)", metavar="character")
)
opt_parser = OptionParser(option_list=option_list)
opt = parse_args(opt_parser)
# Check that all required arguments are provided
if (is.null(opt$prefix)){
print_help(opt_parser)
stop("Please provide the prefix", call.=FALSE)
}else if (is.null(opt$infofile)){
print_help(opt_parser)
stop("Please provide the info file", call.=FALSE)
}else if (is.null(opt$maxK)){
print_help(opt_parser)
stop("Please provide the maximum K value to plot", call.=FALSE)
}else if (is.null(opt$populations)){
print_help(opt_parser)
stop("Please provide a comma-separated list of populations/species", call.=FALSE)
}
# If no output prefix is given, use the input prefix
if(opt$outPrefix=="default") opt$outPrefix=opt$prefix
# Assign the first argument to prefix
prefix=opt$prefix
# Get individual names in the correct order
labels<-read.table(opt$infofile)
# Name the columns
names(labels)<-c("ind","pop")
# Add a column with population indices to order the barplots
# Use the order of populations provided with -l (list separated by commas)
labels$n<-factor(labels$pop,levels=unlist(strsplit(opt$populations,",")))
levels(labels$n)<-c(1:length(levels(labels$n)))
labels$n<-as.integer(as.character(labels$n))
# read in the different admixture output files
minK=opt$minK
maxK=opt$maxK
tbl<-lapply(minK:maxK, function(x) read.table(paste0(prefix,".",x,".Q")))
# Prepare spaces to separate the populations/species
rep<-as.vector(table(labels$n))
spaces<-0
for(i in 1:length(rep)){spaces=c(spaces,rep(0,rep[i]-1),0.5)}
spaces<-spaces[-length(spaces)]
# Plot the cluster assignments as a single bar for each individual for each K as a separate row
tiff(file=paste0(opt$outPrefix,".tiff"),width = 4800, height = 1600,res=180)
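par(mfrow=c(1,1),mar=c(2,2.8,0,0),oma=c(2,1,1,2),mgp=c(0.5,0.2,0),xaxs="i",cex.lab=1.2,cex.axis=0.8) # mfrow=c(1,1) means a 1x1 layout (one row, one column); mar sets the inner margins in the order bottom, left, top, right (clockwise from the bottom); oma sets the outer margins in the same order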
# Plot the barplot for K = minK (15 by default); tbl[[1]] holds the Q matrix for minK
bp<-barplot(t(as.matrix(tbl[[1]][order(labels$n),])), col=palette ,xaxt="n", border=NA,ylab=paste0("K=",minK),yaxt="n",space=spaces)
axis(1,at=bp,labels=labels$ind[order(labels$n)],las=2,tick=FALSE,cex=0.8) # side 1 draws the labels below the plot; side 3 would draw them above
dev.off()
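The part of the script that orders individuals and builds the `spaces` vector is easy to misread, so here is a minimal sketch of the same logic on toy data. The individual and population names are made up, and `pop_order`, `plot_order` and `counts` are names chosen for this illustration only; they do not appear in the script.

```r
# Toy info table (hypothetical names): individuals are in file order, not plot order
labels <- data.frame(
  ind = c("ind1", "ind2", "ind3", "ind4", "ind5"),
  pop = c("popB", "popA", "popB", "popC", "popA")
)
# Desired plotting order, as it would be passed with -l
pop_order <- c("popA", "popB", "popC")

# Convert population names to integer ranks that follow pop_order
labels$n <- as.integer(factor(labels$pop, levels = pop_order))

# Row order that groups individuals by population
plot_order <- order(labels$n)

# Number of individuals per population, in plotting order
counts <- as.vector(table(labels$n))

# barplot() reads space[i] as the gap BEFORE bar i:
# 0 inside a population, 0.5 before the first bar of each later population
spaces <- 0
for (n_ind in counts) {
  spaces <- c(spaces, rep(0, n_ind - 1), 0.5)
}
spaces <- spaces[-length(spaces)]  # drop the trailing gap after the last population

print(as.character(labels$ind[plot_order]))
print(spaces)
```

The resulting `spaces` vector has one entry per individual, so it can be passed directly as `space=` to `barplot()` together with the Q matrix rows reordered by `plot_order`.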
The plot for K=11 looks like this:
image.png
It looks somewhat better. If you know of a nicer color scheme, feel free to recommend one in the comments.
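On the color question, one option worth trying is a qualitative HCL palette from base R, which gives softer colors and the same result on every run. This is only a sketch, not something tested on the data above: it assumes R 3.6.0 or later (where `hcl.colors()` is available in grDevices), and the variable names `k` and `palette_soft` are made up for the example.

```r
# Number of clusters to color; 15 matches the K used in the script above
k <- 15

# hcl.colors() ships with base R (grDevices, R >= 3.6.0); "Set 3" is one of its
# qualitative palettes and gives soft, evenly spaced hues
palette_soft <- hcl.colors(k, palette = "Set 3")

# Show the hex codes; the vector can be passed as col= to barplot()
print(palette_soft)
```

To try it in the script, `col=palette_soft` would replace `col=palette` in the `barplot()` call. Other qualitative palette names can be listed with `hcl.pals("qualitative")`.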