生物信息学

探针寻找之旅(9)——BWA-aln比对短序列探针到基因组

2022-03-02  本文已影响0人  嗒嘀嗒嗒嘀嗒嘀嘀

BWA-aln简介

项目需求

序列准备

基因组比对

$ conda create -n bwa python=3.7
$ conda activate bwa
$ conda install bwa
# bwa建立索引
bwa index csi.chromosome.fa
# 比对到基因组,生成二进制文件(-t 指定线程数)
bwa aln -t 2 -f ./probe1.sai ./genome.fa ./probe1.fa
# 转为sam文件,因为是一条序列的比对,使用单端测序的解读方式
bwa samse -f ./probe1.sam ./genome.fa ./probe1.sai
# .sai文件可以删除了

探针比对结果统计

$ cat stat2.sh
#! /bin/sh
#
cd $(pwd)
echo -e "probe\tchr1\tchr2\tchr3\tchr4\tchr5\tchr6\tchr7\tchr8\tchr9" > title.txt
for i in $(ls *.sam)
do
        echo ${i%.*} > ${i%.*}.prob
        for j in 'chr1' 'chr2' 'chr3' 'chr4' 'chr5' 'chr6' 'chr7' 'chr8' 'chr9'
        do
                less $i | awk '$3 == "'$j'" {print $0}' | wc -l >> ${i%.*}.prob
        done
done
for i in $(ls *.prob)
do
        paste -s $i > tmp.row
        cat title.txt tmp.row > probedistribution.csv
        cat probedistribution.csv > title.txt
done
$ cat probedistribution.csv
probe   chr1    chr2    chr3    chr4    chr5    chr6    chr7    chr8    chr9
chr1a   1200    0       0       0       0       0       0       0       0
chr1b   1003    0       0       0       0       0       0       0       0
chr2a   0       1292    0       0       0       0       0       0       0
chr2b   0       1685    0       0       0       0       0       0       0
chr3a   0       0       1065    0       0       0       0       0       0
chr3b   0       0       1097    0       0       0       0       0       0
chr4a   0       0       0       1330    0       0       0       0       0
chr4b   0       0       0       1047    0       0       0       0       0
chr5a   0       0       0       0       1344    0       0       0       0
chr5b   0       0       0       0       1177    0       0       0       0
chr6a   0       0       0       0       0       875     0       0       0
chr6b   0       0       0       0       0       832     0       0       0
chr7a   0       0       0       0       0       0       1106    0       0
chr7b   0       0       0       0       0       0       1110    0       0
chr8a   0       0       0       0       0       0       0       849     0
chr8b   0       0       0       0       0       0       0       1112    0
chr9a   0       0       0       0       0       0       0       0       1134
chr9b   0       0       0       0       0       0       0       0       1182

后续


参考文章

上一篇 下一篇

猜你喜欢

热点阅读