Circos plotting, part 3 (Arabidopsis thaliana as the example)
2020-02-14
多啦A梦的时光机_648d
The final circos.conf from the previous step is as follows:
karyotype=karyotype.txt
chromosomes_units = 100000
chromosomes_display_default = yes
<ideogram>
<spacing>
default = 0.005r
</spacing>
radius = 0.80r
thickness = 6p
fill = yes
stroke_color = dgrey
stroke_thickness = 2p
show_label = yes
label_font = default
label_radius = dims(ideogram,radius) + 0.05r
label_size = 46
label_parallel = yes
label_format = eval(sprintf("%s",var(chr)))
</ideogram>
show_tick_labels = yes
show_ticks = yes
<ticks>
color = black
multiplier = 1e-6
radius = 1r
thickness = 2p
<tick>
size = 10p
spacing = 5u
</tick>
<tick>
color = black
format = %d
label_offset = 10p
label_size = 25p
show_label = yes
size = 15p
spacing = 10u
thickness = 4p
</tick>
</ticks>
<image>
dir* = .
radius* = 500p
<<include etc/image.conf>>
</image>
<<include etc/colors_fonts_patterns.conf>>
<<include etc/housekeeping.conf>>
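1: Download the Arabidopsis data
Remember to use the Ensembl GFF; the two files differ as shown below (for instance, Ensembl names sequences 1, 2, ..., whereas NCBI uses RefSeq accessions such as NC_003070.9).
[Figure: Ensembl GFF]
[Figure: NCBI GFF]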
# download
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/735/GCF_000001735.4_TAIR10.1/GCF_000001735.4_TAIR10.1_genomic.gff.gz
Extract the gene positions
zgrep '[[:blank:]]gene[[:blank:]]' GCF_000001735.4_TAIR10.1_genomic.gff.gz | cut -f 1,4,5 | awk '{print "chr"$1"\t"$2"\t"$3}' > genes.bed
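To make it easier to see what this pipeline keeps, here is a minimal sketch that runs the same filter on a made-up two-line GFF fragment. The sample records, the variable names sample_gff and sample_bed, and the use of plain grep instead of zgrep (the sample is not compressed) are all assumptions for illustration.

```shell
# Hypothetical GFF3 sample: one gene line and one mRNA line (tab-separated).
sample_gff=$(printf '1\tsrc\tgene\t3631\t5899\t.\t+\t.\tID=gene:g1\n1\tsrc\tmRNA\t3631\t5899\t.\t+\t.\tID=transcript:t1\n')

# Same filter as above: keep lines whose type column is "gene",
# take seqid, start and end, then prefix the seqid with "chr".
sample_bed=$(printf '%s\n' "$sample_gff" \
  | grep '[[:blank:]]gene[[:blank:]]' \
  | cut -f 1,4,5 \
  | awk '{print "chr"$1"\t"$2"\t"$3}')

printf '%s\n' "$sample_bed"
```

Only the gene line survives, because the mRNA line has no "gene" token surrounded by whitespace. Note that the start coordinate is copied unchanged from the GFF, which is 1-based, while BED is 0-based; for 500 kb windows the one-base shift should rarely matter.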
Next, use bedtools to create 500 kb windows along the chromosomes
cut -d ' ' -f 3,6 karyotype.txt | tr ' ' '\t' > tair10.genome
bedtools makewindows -g tair10.genome -w 500000 > tair10.windows
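As a rough illustration of the window layout that ends up in tair10.windows, the following sketch imitates fixed-size windowing with awk on a single made-up chromosome. The 1,200,000 bp length and the variable names are assumptions; the sketch is not a replacement for bedtools makewindows.

```shell
# Hypothetical chromosome of 1,200,000 bp, in the "name<TAB>length"
# layout of tair10.genome.
sample_genome=$(printf 'chr1\t1200000')

# Emit 500 kb windows; the last window is clipped at the chromosome end.
sample_windows=$(printf '%s\n' "$sample_genome" | awk -v w=500000 'BEGIN { OFS = "\t" } {
  for (s = 0; s < $2; s += w) {
    e = (s + w < $2) ? s + w : $2
    print $1, s, e
  }
}')

printf '%s\n' "$sample_windows"
```

Each output line is chromosome, window start, window end, so the final window of a chromosome is usually shorter than 500 kb.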
Finally, count the genes in each window
bedtools coverage -a tair10.windows -b genes.bed | cut -f 1-4 > genes_num.txt
The resulting genes_num.txt is the data used for plotting in the next step.
[Figure: summary]
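Circos data tracks expect lines of the form chromosome, start, end, value, so a quick format check before plotting can save a failed run. The helper below is a hypothetical sketch (check_track is not part of Circos or bedtools) and assumes whitespace-separated fields with a plain decimal value in column 4.

```shell
# check_track FILE: hypothetical helper. Succeeds only if every line has
# exactly four fields, integer start and end with start <= end,
# and a plain decimal number as the value.
check_track() {
  awk 'NF != 4 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 > $3 + 0 ||
       $4 !~ /^-?[0-9]+(\.[0-9]+)?$/ { bad = 1; exit }
       END { exit bad + 0 }' "$1"
}
```

For example, `check_track genes_num.txt && echo "format looks fine"` prints the message only when every line passes.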
2: Start plotting
Keep using the circos.conf from the previous step and add a <plots> section
...
<plots>
<plot>
type = line
thickness = 2
max_gap = 1u
file = genes_num.txt
color = red
r0 = 0.51r
r1 = 0.60r
</plot>
<plot>
type = heatmap
file = genes_num.txt
color = spectral-5-div
r1 = 0.70r
r0 = 0.61r
</plot>
<plot>
type = scatter
fill_color = grey
stroke_color = black
glyph = circle
glyph_size = 10
file = genes_num.txt
r1 = 0.80r
r0 = 0.71r
</plot>
<plot>
type = histogram
file = genes_num.txt
r1 = 0.89r
r0 = 0.81r
</plot>
</plots>
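All four tracks read the same genes_num.txt, so it can help to know the range of the value column, for instance if you decide to set min and max explicitly in a <plot> block. The helper below is a hypothetical sketch (track_range is an invented name) that assumes the value is in column 4.

```shell
# track_range FILE: hypothetical helper that prints "min max" of column 4.
track_range() {
  awk 'NR == 1 { min = max = $4 + 0 }
       { v = $4 + 0; if (v < min) min = v; if (v > max) max = v }
       END { if (NR) print min, max }' "$1"
}
```

Running `track_range genes_num.txt` prints the two numbers on one line; an empty file prints nothing.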
Result:
[Figure: Arabidopsis thaliana]