Filtering ribosomal RNA (rRNA)

2021-01-05  宗肃書
1. Download and install
cd /public/jychu/soft/
wget http://bioinfo.lifl.fr/RNA/sortmerna/code/sortmerna-2.1-linux-64-multithread.tar.gz
tar -xvf sortmerna-2.1-linux-64-multithread.tar.gz
mv sortmerna-2.1b sortmerna-2.1
cd sortmerna-2.1
mkdir -p ../bin    # make sure the target directory exists
cp indexdb_rna sortmerna ../bin/
export PATH="$PATH:/public/jychu/soft/bin"
sortmerna -h    # check that the installation succeeded
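
The check above only confirms that sortmerna starts. A minimal sketch of a slightly broader check follows, assuming both binaries were copied into a directory on PATH as in the commands above; the function name check_tools is hypothetical and not part of SortMeRNA.

```shell
# Report whether each named tool can be found on PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'found    %s -> %s\n' "$tool" "$(command -v "$tool")"
        else
            printf 'MISSING  %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (after the export PATH line):
#   check_tools sortmerna indexdb_rna
```

The export only lasts for the current shell session, so a MISSING line in a new terminal usually means the PATH line has to be repeated or added to the shell startup file.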

2. Build indexes for the databases
indexdb_rna --ref \
./rRNA_databases/silva-bac-16s-id90.fasta,./index/silva-bac-16s-db:\
./rRNA_databases/silva-bac-23s-id98.fasta,./index/silva-bac-23s-db:\
./rRNA_databases/silva-arc-16s-id95.fasta,./index/silva-arc-16s-db:\
./rRNA_databases/silva-arc-23s-id98.fasta,./index/silva-arc-23s-db:\
./rRNA_databases/silva-euk-18s-id95.fasta,./index/silva-euk-18s-db:\
./rRNA_databases/silva-euk-28s-id98.fasta,./index/silva-euk-28s:\
./rRNA_databases/rfam-5s-database-id98.fasta,./index/rfam-5s-db:\
./rRNA_databases/rfam-5.8s-database-id98.fasta,./index/rfam-5.8s-db   # takes about 30 min
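
The --ref value is a colon-separated list of FASTA,INDEX pairs, and typing it by hand is error-prone. The sketch below generates such a string from a list of database names. The function build_ref_arg and its naming rule (index name = FASTA base name plus "-db") are assumptions for illustration; they do not reproduce the shorter index names used above, so whichever string is used for indexdb_rna must also be passed unchanged to sortmerna.

```shell
# Build the colon-separated FASTA,INDEX list that --ref expects.
# Arguments: database directory, index directory, then FASTA base
# names without the .fasta extension.
build_ref_arg() {
    local db_dir=$1 idx_dir=$2 ref="" name pair
    shift 2
    for name in "$@"; do
        pair="${db_dir}/${name}.fasta,${idx_dir}/${name}-db"
        ref="${ref:+$ref:}$pair"      # prepend ":" only after the first pair
    done
    printf '%s\n' "$ref"
}

# Usage:
#   REF=$(build_ref_arg ./rRNA_databases ./index silva-bac-16s-id90 silva-bac-23s-id98)
#   indexdb_rna --ref "$REF"
```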

3. Merge the paired-end read files
cd /public/jychu/Lishaomei/goose-cleandata-neck/norRNA
gzip  -d *.fq.gz
/public/jychu/soft/sortmerna-2.1/scripts/merge-paired-reads.sh neck-1-1.clean.fq neck-1-2.clean.fq merged_neck-1.clean.fq     # merge the paired-end reads into one file
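
The command above handles a single sample. With several samples in the directory, the pairs can be listed first and then fed to the merge script. This is a minimal sketch that assumes every sample follows the <sample>-1.clean.fq / <sample>-2.clean.fq naming seen above and that paths contain no spaces; list_read_pairs is a hypothetical helper.

```shell
# Print "R1 R2 MERGED" for every complete read pair in a directory.
list_read_pairs() {
    local dir=$1 r1 r2 base sample
    for r1 in "$dir"/*-1.clean.fq; do
        [ -e "$r1" ] || continue                    # glob matched nothing
        base=${r1##*/}                              # strip the directory
        sample=${base%-1.clean.fq}                  # e.g. neck-1
        case $sample in merged_*) continue ;; esac  # skip earlier merge outputs
        r2="${dir}/${sample}-2.clean.fq"
        [ -e "$r2" ] || continue                    # mate file is missing
        printf '%s %s %s\n' "$r1" "$r2" "${dir}/merged_${sample}.clean.fq"
    done
}

# Usage:
#   list_read_pairs . | while read -r r1 r2 merged; do
#       /public/jychu/soft/sortmerna-2.1/scripts/merge-paired-reads.sh "$r1" "$r2" "$merged"
#   done
```

The merged_* check matters because a file such as merged_neck-1.clean.fq also matches the *-1.clean.fq pattern and would otherwise be treated as a forward-read file.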

4. Identify and filter rRNA
cd /public/jychu/soft/sortmerna-2.1
nohup ./sortmerna --ref \
rRNA_databases/silva-bac-16s-id90.fasta,index/silva-bac-16s-db:\
rRNA_databases/silva-bac-23s-id98.fasta,index/silva-bac-23s-db:\
rRNA_databases/silva-arc-16s-id95.fasta,index/silva-arc-16s-db:\
rRNA_databases/silva-arc-23s-id98.fasta,index/silva-arc-23s-db:\
rRNA_databases/silva-euk-18s-id95.fasta,index/silva-euk-18s-db:\
rRNA_databases/silva-euk-28s-id98.fasta,index/silva-euk-28s:\
rRNA_databases/rfam-5s-database-id98.fasta,index/rfam-5s-db:\
rRNA_databases/rfam-5.8s-database-id98.fasta,index/rfam-5.8s-db \
--reads /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1.clean.fq \
--aligned /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1_aligned_rRNA \
--fastx --sam --num_alignments 1 \
--other /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1_filtered_non_rRNA \
--paired_in --log -v &
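
The same command has to be repeated for each merged sample, with only the sample name changing. The sketch below prints (it does not run) the command for one sample so it can be reviewed first. It assumes the merged_<sample>.clean.fq naming from step 3 and uses only the flags from the command above; sortmerna_cmd is a hypothetical helper.

```shell
# Print the sortmerna command for one merged sample.
# Arguments: the --ref string, the directory holding the merged reads,
# and the sample name (e.g. neck-1).
sortmerna_cmd() {
    local ref=$1 dir=$2 sample=$3
    printf './sortmerna --ref %s --reads %s --aligned %s --other %s %s\n' \
        "$ref" \
        "${dir}/merged_${sample}.clean.fq" \
        "${dir}/merged_${sample}_aligned_rRNA" \
        "${dir}/merged_${sample}_filtered_non_rRNA" \
        "--fastx --sam --num_alignments 1 --paired_in --log -v"
}

# Usage, with REF set to the same --ref string used for indexing:
#   for s in scale-7 scale-8; do
#       sortmerna_cmd "$REF" /public/jychu/Lishaomei/goose-cleandata-neck/norRNA "$s"
#   done > run_all.sh
```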

A successful run produces the following files and log:
(base) [jychu@localhost norRNA]$ ls
merged_neck-1_aligned_rRNA.fq   merged_neck-1_aligned_rRNA.sam         merged_scale-10.clean.fq  merged_scale-12.clean.fq  merged_scale-8.clean.fq
merged_neck-1_aligned_rRNA.log  merged_neck-1_filtered_non_rRNA.fq.gz  merged_scale-11.clean.fq  merged_scale-7.clean.fq   merged_scale-9.clean.fq

(base) [jychu@localhost norRNA]$ less -S merged_neck-1_aligned_rRNA.log
     Minimal SW score based on E-value = 61
    Index: index/rfam-5s-db
     Seed length = 18
     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
     Gumbel lambda = 0.616694
     Gumbel K = 0.342032
     Minimal SW score based on E-value = 59
    Index: index/rfam-5.8s-db
     Seed length = 18
     Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
     Gumbel lambda = 0.617555
     Gumbel K = 0.343861
     Minimal SW score based on E-value = 57
    Number of seeds = 2
    Edges = 4 (as integer)
    SW match = 2
    SW mismatch = -3
    SW gap open penalty = 5
    SW gap extend penalty = 2
    SW ambiguous nucleotide = -3
    SQ tags are not output
    Number of threads = 1
    Reads file = /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1.clean.fq

 Results:
    Total reads = 37894820
    Total reads passing E-value threshold = 592626 (1.56%)
    Total reads failing E-value threshold = 37302194 (98.44%)
    Minimum read length = 150
    Maximum read length = 150
    Mean read length = 150
 By database:
    rRNA_databases/silva-bac-16s-id90.fasta             0.15%
    rRNA_databases/silva-bac-23s-id98.fasta             0.19%
    rRNA_databases/silva-arc-16s-id95.fasta             0.03%
    rRNA_databases/silva-arc-23s-id98.fasta             0.12%
    rRNA_databases/silva-euk-18s-id95.fasta             0.33%
    rRNA_databases/silva-euk-28s-id98.fasta             0.73%
    rRNA_databases/rfam-5s-database-id98.fasta          0.00%
    rRNA_databases/rfam-5.8s-database-id98.fasta                0.00%

 Wed Jan  6 12:35:55 2021
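
The key figure in this log is the share of reads passing the E-value threshold (here 592626 of 37894820, about 1.56%). When several samples are processed, that number can be pulled out of each log with awk. The sketch below assumes the two "Total reads" lines are formatted exactly as in the log above; rrna_summary is a hypothetical helper.

```shell
# Read a SortMeRNA log on stdin and print: total reads, aligned reads,
# and the aligned percentage recomputed from the two counts.
rrna_summary() {
    awk -F' = ' '
        /Total reads = /                        { total = $2 }
        /Total reads passing E-value threshold/ { split($2, a, " "); aligned = a[1] }
        END {
            if (total > 0)
                printf "%d %d %.2f\n", total, aligned, 100 * aligned / total
        }
    '
}

# Usage:
#   rrna_summary < merged_neck-1_aligned_rRNA.log
```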
