一. 染色体不同区域基因表达水平可视化(bwa, samtool
1. bwa index ref.fa;
eg: $ bwa index Genome.fas;
2. bwa mem ref.fa read1.fq read2.fq > aln-pe.sam 2>1.log;
eg: $ nohup bwa mem Genome.fas > Root_1.sam 2>Root1.log&
3. samtools view -bS abc.sam > abc.bam;
eg:$ ll |grep sam|awk '{print"nohup samtools view -bS "$NF" >"$NF".bam&"}'|sh
4. samtools sort abc.bam abc.bam.sort;
eg: $ ll |grep bam|awk '{print"nohup samtools sort "$NF" "$NF".sort&"}'|sh
5. samtools merge sort.bam sort1.bam sort2.bam sort3.bam ...
eg:$ nohup samtools merge Root.sort.bam Root_1.sam.bam.sort.bam Root_2.sam.bam.sort.bam Root_3.sam.bam.sort.bam&;
6. samtools index abc.sort.bam;
eg:$ nohup samtools index Root.sort.bam&;
7. bedtools makewindows -g genome.txt -w 10000000 -s 1000000 > windows.bed;#genome.txt,格式:LG1 53860269;
eg:$ nohup bedtools makewindows -g genome.txt -w 10000000 -s 1000000 > windows.bed&;
8. bedtools coverage -a windows.bed -b xxx.sort.bam > xxx.depth.txt; #xxx.depth.txt文件格式: LG11 14300000 15300000 5835 28598 1000000 0.0285980 ,共7列:染色体、区间起始位点、区间结束位点、该区间内的reads数、该区间内的碱基数、区间大小、该区间的平均覆盖度。
eg:$ nohup bedtools coverage -a windows.bed -b Root.sort.bam > Root.depth.txt;
eg:$ more xxx.depth.txt|awk '$6==1000000{print}'|awk '{print$1"\t"$2"\t"$3"\t"$NF}' >xxx.depth;# 去除末端不足1000000区间;
9. TBtools Advanced Circos 可视化作图,xxx.depth作为输入文件,由内到外:基因密度,Expression1,2,3; #具体参照:TBtools | 全基因组 - 基因密度统计,充实你的图片 - 简书
Expression_Circos_w1ms100k.pdf