
Usage notes for common bioinformatics software

2019-07-09  11的雾

Notes on some of the bioinformatics tools I have used.

FastX (FASTA/FASTQ) format processing

$ seqtk sample

Usage:   seqtk sample [-2] [-s seed=11] <in.fa> <frac>|<number>

Options: -s INT       RNG seed [11]
         -2           2-pass mode: twice as slow but with much reduced memory

Example: randomly sample 5242880 reads with seed 100 and compress the output with pigz on 4 threads:
seqtk sample -s100 test.fq.gz 5242880 | pigz -p 4 > test.clean.fq.gz
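
For paired-end data, the seqtk README advises using the same seed for both files so that the sampled reads stay paired. A minimal sketch of that, where `subsample_pair` and its argument order are hypothetical names chosen for illustration and are not part of seqtk:

```shell
# Hypothetical helper (not part of seqtk): subsample both mates with one seed.
# Arguments: seed, number (or fraction) of reads, R1 input, R2 input,
#            R1 output, R2 output.
subsample_pair() {
    # Same -s value for both files, so the same records are picked from each.
    seqtk sample -s"$1" "$3" "$2" > "$5" &&
    seqtk sample -s"$1" "$4" "$2" > "$6"
}
```

Usage might look like `subsample_pair 100 5242880 test_R1.fq.gz test_R2.fq.gz sub_R1.fq sub_R2.fq`; the file names here are placeholders. This relies on both input files listing the mates in the same order.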

$ seqtk trimfq
Usage:   seqtk trimfq [options] <in.fq>
Options: -q FLOAT    error rate threshold (disabled by -b/-e) [0.05]
         -l INT      maximally trim down to INT bp (disabled by -b/-e) [30]
         -b INT      trim INT bp from left (non-zero to disable -q/-l) [0]
         -e INT      trim INT bp from right (non-zero to disable -q/-l) [0]
         -L INT      retain at most INT bp from the 5'-end (non-zero to disable -q/-l) [0]
         -Q          force FASTQ output

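Example: reads are 400 bp long and only the first 150 bp are needed. Setting -e 250 trims 250 bp from the 3' end, which leaves the first 150 bp. (If read lengths vary, -L 150 from the usage above keeps at most 150 bp from the 5' end regardless of the original length.)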
seqtk trimfq -e 250 RP01G9E1L1_R1.fq.gz >trimed_RP01G9E1L1_R1.fq
Example: reads are 400 bp long; to drop the first 30 bp and keep the remaining 370 bp, use -b:
seqtk trimfq -b 30 G19E1L1_1.fq.gz > test.fq
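
A quick way to confirm that a trimming command produced the expected lengths is to tabulate the read lengths in its output. The sketch below assumes a 4-line-per-record FASTQ on stdin (sequence on one line); `fq_lengths` is a hypothetical helper name, not a seqtk command:

```shell
# Hypothetical helper: print "length count" for every read length seen in a
# 4-line-per-record FASTQ read from stdin, sorted by length.
fq_lengths() {
    # Line 2 of each 4-line record is the sequence.
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len, n[len] }' | sort -n
}
```

For instance, `seqtk trimfq -e 250 RP01G9E1L1_R1.fq.gz | fq_lengths` should list a single length of 150 if every input read really is 400 bp; more than one line in the output means the input lengths were not uniform.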

Converting FASTQ to FASTA

seqkit fq2fa ../02.align/RP01G9E1L3_R1.fq.gz >RP01G9E1L3_R1.fa
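
To make the conversion concrete, here is a rough awk sketch of what it amounts to for a simple 4-line-per-record FASTQ: keep the header (with `@` replaced by `>`) and the sequence, and drop the `+` and quality lines. `fq_to_fa` is a hypothetical name; this is not how seqkit is implemented, it does no validation, and it writes each sequence on a single line:

```shell
# Hypothetical helper: convert a 4-line-per-record FASTQ on stdin to FASTA.
fq_to_fa() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header: "@id" becomes ">id"
         NR % 4 == 2 { print }                      # sequence line, unchanged
        '
}
```

For an uncompressed file this could be run as `fq_to_fa < reads.fq > reads.fa` (placeholder file names); for real data seqkit remains the safer choice since it also handles gzip input and multi-line records.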

Alignment software:

Single-cell sequencing

Some tools I wrote myself:
