Canu: Correcting, Trimming, and Assembling Third-Generation Sequencing Reads
2020-03-04
胡童远
Introduction
Canu can assemble third-generation sequencing data from PacBio or Nanopore. Below is how to process PacBio data with Canu one stage at a time.
Paper: Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation
Journal: Genome Res.
Year: 2017
Citations: 1200+ (as of 2019)
Official site: canu
GitHub: canu github
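Main parameters:
useGrid: run under grid control (true) or locally (false)
-p: prefix for output file names
-d: output directory
genomeSize: <number>[g|m|k], the estimated genome size
-pacbio-raw: input, raw PacBio reads
-pacbio-corrected: input, corrected PacBio reads
-nanopore-raw: input, raw Nanopore reads
-nanopore-corrected: input, corrected Nanopore reads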
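Since genomeSize is what Canu uses to judge how much read coverage it has, it can help to estimate the coverage of the input before starting. The following is a minimal sketch, not part of Canu: the function names are hypothetical, and it assumes an uncompressed FASTA file and a size written in the <number>[g|m|k] form shown above.

```shell
# Hypothetical helpers for illustration; they are not part of Canu.

# Convert a size such as 2.1m, 500k or 3g into a number of bases.
genome_size_to_bases() {
    awk -v s="$1" 'BEGIN {
        unit = tolower(substr(s, length(s), 1))
        mult = 1
        if (unit == "k") mult = 1000
        else if (unit == "m") mult = 1000000
        else if (unit == "g") mult = 1000000000
        # Drop the unit suffix only when one was recognised.
        if (mult > 1) s = substr(s, 1, length(s) - 1)
        printf "%.0f\n", s * mult
    }'
}

# Sum the sequence lengths in an uncompressed FASTA file (header lines skipped).
fasta_total_bases() {
    awk '!/^>/ { n += length($0) } END { printf "%.0f\n", n }' "$1"
}

# Approximate coverage = total read bases / genome size.
estimate_coverage() {
    bases=$(fasta_total_bases "$1")
    genome=$(genome_size_to_bases "$2")
    awk -v b="$bases" -v g="$genome" 'BEGIN { printf "%.1f\n", b / g }'
}
```

For the data used in this post, the call would be `estimate_coverage clean_data/1B.pacbio.fa 2.1m`.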
1. Correction (Correct)
time canu useGrid=false -correct -p run1 -d canu_1 genomeSize=2.1m \
-pacbio-raw clean_data/1B.pacbio.fa # 18 min
Results:
2. Trimming (Trim)
time canu useGrid=false -trim -p run1 -d canu_2 genomeSize=2.1m \
-pacbio-corrected canu_1/run1.correctedReads.fasta.gz # 9 min
Results:
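As a side note on the hand-off between stages: each stage reads the gzipped FASTA written by the previous one (run1.correctedReads.fasta.gz, then run1.trimmedReads.fasta.gz), so it may be worth checking that file before launching the next stage. The sketch below is a hypothetical helper, not a Canu feature; it assumes gzip and grep are available.

```shell
# Hypothetical helper for illustration; it is not part of Canu.
# Prints the number of FASTA records and succeeds only if the gzipped
# file exists, is non-empty, and contains at least one record.
count_reads_gz() {
    file="$1"
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    # grep -c exits non-zero when the count is zero; keep the count anyway.
    n=$(gzip -cd "$file" 2>/dev/null | grep -c '^>') || true
    if [ "${n:-0}" -eq 0 ]; then
        echo "no FASTA records in: $file" >&2
        return 1
    fi
    echo "$n"
}

# Example use before the trim stage (not executed here):
#   count_reads_gz canu_1/run1.correctedReads.fasta.gz && \
#   canu useGrid=false -trim -p run1 -d canu_2 genomeSize=2.1m \
#        -pacbio-corrected canu_1/run1.correctedReads.fasta.gz
```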
3. Assembly (Assemble)
Try different values of the correctedErrorRate parameter.
The default is 0.045 for PacBio reads, and 0.144 for Nanopore reads
time canu useGrid=false -assemble -p run1 -d canu_3_erate-0.039 genomeSize=2.1m \
correctedErrorRate=0.039 \
-pacbio-corrected canu_2/run1.trimmedReads.fasta.gz # 42 min
time canu useGrid=false -assemble -p run1 -d canu_3_erate-0.045 genomeSize=2.1m \
correctedErrorRate=0.045 \
-pacbio-corrected canu_2/run1.trimmedReads.fasta.gz # 52 min
time canu useGrid=false -assemble -p run1 -d canu_3_erate-0.075 genomeSize=2.1m \
correctedErrorRate=0.075 \
-pacbio-corrected canu_2/run1.trimmedReads.fasta.gz # 148 min
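The three commands above differ only in the error rate and the output directory, so they can be generated in a loop. This is a sketch under assumptions: assemble_sweep and run_or_print are hypothetical names, the RUN variable is an invented switch, and the timing done with `time` above is left out. By default the commands are only printed.

```shell
# Hypothetical wrapper for illustration; it is not part of Canu.
# Prints each command by default; set RUN=1 to execute it instead.
run_or_print() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Usage: assemble_sweep <trimmed_reads> <erate> [<erate> ...]
assemble_sweep() {
    reads="$1"
    shift
    for erate in "$@"; do
        # One output directory per error rate, named as in the commands above.
        run_or_print canu useGrid=false -assemble -p run1 \
            -d "canu_3_erate-${erate}" genomeSize=2.1m \
            "correctedErrorRate=${erate}" \
            -pacbio-corrected "$reads"
    done
}
```

Calling `assemble_sweep canu_2/run1.trimmedReads.fasta.gz 0.039 0.045 0.075` prints the three assembly commands; prefixing the call with `RUN=1` would run them one after another.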
Results for correctedErrorRate=0.045:
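To compare the assemblies produced with different error rates, a few summary numbers per run are usually enough. The sketch below computes contig count, total length, and N50 from a FASTA file. fasta_stats is a hypothetical helper, and the path in the usage note assumes the contigs are written as run1.contigs.fasta inside each output directory.

```shell
# Hypothetical helper for illustration; it is not part of Canu.
# Prints: number of contigs, total length, N50 (tab-separated).
fasta_stats() {
    # First pass: emit one length per record (sequences may span several lines).
    awk '/^>/ { if (len) print len; len = 0; next }
         { len += length($0) }
         END { if (len) print len }' "$1" |
    sort -rn |
    # Second pass: walk lengths from longest to shortest until half the
    # total length is covered; that contig length is the N50.
    awk '{ lens[NR] = $1; total += $1 }
         END {
             half = total / 2; run = 0; n50 = 0
             for (i = 1; i <= NR; i++) {
                 run += lens[i]
                 if (run >= half) { n50 = lens[i]; break }
             }
             printf "%.0f\t%.0f\t%.0f\n", NR, total, n50
         }'
}
```

For example, `for d in canu_3_erate-*; do echo "$d"; fasta_stats "$d/run1.contigs.fasta"; done` would list the three runs side by side.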