使用circos绘制变异分布图

2022-10-22  本文已影响0人  花生学生信

一、数据准备:

使用SURVIVOR将得到的vcf文件合并,

SURVIVOR merge filter_vcf_files.txt 1000 0 1 1 0 0  sample.vcf

####最后一个参数是保留的最小长度,这里用到SNP数据,所以用0

将变异个数按照1MB窗口进行滑窗统计变异个数


# variation_number_in_bin.py

import sys

input_file = sys.argv[1]
step_len = 1000000
result = {}

chrom_stats = {
    "Chr01":44717177,
    "Chr02":37244950,
    "Chr03":39727707,
    "Chr04":37171045,
    "Chr05":31106453,
    "Chr06":32139443,
    "Chr07":31006940,
    "Chr08":30241818,
    "Chr09":24981782,
    "Chr10":25400072,
    "Chr11":33669327,
    "Chr12":26652165
    }

for chrom in chrom_stats.keys():
    step= 0
    step_start = 0
    result[chrom]={}
    result[chrom][step]=0
    for line in open(input_file):
        if line.startswith(chrom):
            pos = line.split()[1]
            if step_start < int(pos) < (step_start + step_len):
                result[chrom][step] += 1
            if int(pos) >= (step_start + step_len):
                step += 1
                step_start += step_len
                result[chrom][step] = 0
                result[chrom][step] += 1

    for steps,nums in result[chrom].items():
        print(chrom, steps, nums, sep="\t")

circos需要准备的数据:

>**基因组信息数据**  rice1.txt
#第一列:染色体,表明这是一条染色体
#第二列:占位符,定义所属关系
#第三列:ID,跟身份证号码一样,唯一不能重复的标识(这一列很重要,后面其他作图数据都是要和ID对应)
#第四列:  图中显示的文本  
#第五列:起始。以0开始
#第六列:终止。第五列和第六列定义了染色体的实际大小
#第七列:颜色     (这一列我写的是Chr1,Chr2,Chr3... 其实在这里他们代表的是颜色,因为在配置文件colors.ucsc.conf
 #               中将不同的色号赋值给了Chr1,Chr2,Chr3... 所以就可用Chr1,Chr2,Chr3... 来代替颜色,当然你可以直接
   #             写色号,或是颜色,都可以)
####
chr - hs01 Chr1 0 43270923 Chr1
chr - hs02 Chr2 0 35937250 Chr2
chr - hs03 Chr3 0 36413819 Chr3
chr - hs04 Chr4 0 35502694 Chr4
chr - hs05 Chr5 0 29958434 Chr5
chr - hs06 Chr6 0 31248787 Chr6
chr - hs07 Chr7 0 29697621 Chr7
chr - hs08 Chr8 0 28443022 Chr8
chr - hs09 Chr9 0 23012720 Chr9
chr - hs10 Chr10 0 23207287 Chr10
chr - hs11 Chr11 0 29021106 Chr11
chr - hs12 Chr12 0 27531856 Chr12
###配置文件  circos.conf


karyotype = rice1.txt
chromosomes_units =100000
<plots>

<plot>
type    = line
file    = num/SNP
color   = blue
#color=div-5-spectral
r1      = 0.90r
r0      = 0.81r
#type      = line
thickness = 2
max_gap = 1u


</plot>


<plot>
type    = heatmap
#type      = line
thickness = 2
max_gap = 1u
#color   = redv
file    = num/DEL
color   = reds-9-seq
#color=div-5-spectral
r1      = 0.80r
r0      = 0.71r
</plot>

<plot>
type    = heatmap

#type      = line
thickness = 2
max_gap = 1u
#color   = redv
file    = num/INS
color   = reds-9-seq
#color=div-5-spectral
r1      = 0.70r
r0      = 0.61r
</plot>

#<plot>
#type    = heatmap
#file    = num/INS
#color   = reds-9-seq
#color=div-5-spectral
#type      = line
#thickness = 2
#max_gap = 1u
#color   = redv
#r1      = 0.60r
#r0      = 0.51r
#</plot>

#<plot>
#type    = heatmap
#file    = num/DEL
#color   = reds-9-seq
#color=div-5-spectral
#type      = line
#thickness = 2
#max_gap = 1u
#color   = redv
#r1      = 0.50r
#r0      = 0.41r
#</plot>


#<plot>
#type = scatter
#fill_color       = black # 填充色
#stroke_color     = black
#glyph            = circle
#glyph_size       = 5 # 元素大小
#file = INV.txt
#r1   = 0.80r
#r0   = 0.71r
#</plot>

#<plot>
#type = histogram
#file = INV.txt
#fill_color = blue # 填充色
#r1   = 0.89r
#r0   = 0.81r
#</plot>


</plots>


#<links>

#<link>
#file          = links.txt
#radius        = 0.61r
#color         = blue_a4
#ribbon = yes
#</link>

#</links>


# Includes content from ideogram.conf (included file path is relative
# to the file that included it). Conventionally, I separate the 
# contents of the <ideogram> block from circos.conf and include
# it via ideogram.conf.
<<include ideogram.conf>>
 
# Similarly, I put the <ticks> block in ticks.conf
#<<include ticks.conf>>

<image>
<<include etc/image.conf>>                
</image>

<<include etc/colors_fonts_patterns.conf>> 
<<include etc/housekeeping.conf>> 

circos需要的文件格式
####ideogram.conf
<ideogram>

<spacing>
default = 0.001r
<pairwise hs01;hs12>
# 以下设置两条染色体之间的间隙为圆的20度角
spacing = 40r
#radius=0.1r
</pairwise>


</spacing>

#<spacing>
# 也可设置指定的两条染色体之间的间隙

#</spacing>

# Ideogram position, fill and outline
radius           = 0.90r
thickness        = 20p
fill             = yes
stroke_color     = dgrey
stroke_thickness = 2p

# Minimum definition for ideogram labels.

show_label       = yes
# see etc/fonts.conf for list of font names
label_font       = default 
label_radius     = dims(image,radius)-60p
label_size       = 30
label_parallel   = yes

</ideogram>

最后在终端直接运行circos,

可以得到一张png和一张svg格式的图片,最后还可以用PS或者AI进行适当修改,

不说了,开组会了

上一篇 下一篇

猜你喜欢

热点阅读