2020 Bioinformatics

Transcriptome Live-Stream Study Notes: Day 2

2020-03-30  焱黎

Data filtering and quality control
Software: fastp
Purpose: check the quality of the sequencing reads

Install the software
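The notes do not record the install command. A minimal sketch, assuming a conda setup with the bioconda channel reachable (the route listed in the fastp README); `check_tool` is a hypothetical helper written for this note, not part of fastp. It only reports what it finds and prints a suggested command, so nothing gets installed by accident:

```shell
# check_tool NAME HINT
# Report whether NAME is on PATH; if not, print HINT as a suggested install command.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found; try: $2"
    fi
}

# Assumption: conda with the bioconda channel is the chosen install route.
check_tool fastp "conda install -c bioconda fastp"
```

Once fastp is installed, `fastp --version` should print its version number.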

Filter and quality-check the data

Work on the fastq files downloaded on Day 1.

A simple fastp command
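$ fastp -i "SRR2176358_RNA-seq_of_Blondee_fruit_skin_with_fiesh_at_stage_ I_Rep._I.fastq.gz" \
    -o "clean_data/SRR2176358_RNA-seq_of_Blondee_fruit_skin_with_fiesh_at_stage_ I_Rep._I.fastq.gz" \
    -h "SRR2176358_RNA-seq_of_Blondee_fruit_skin_with_fiesh_at_stage_ I_Rep._I.html" \
    -j "SRR2176358_RNA-seq_of_Blondee_fruit_skin_with_fiesh_at_stage_ I_Rep._I.json"
(The file name contains a space, so it has to be quoted. The output is written to clean_data/, which must already exist, so that the input file is not overwritten.)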
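The downloaded file names are long and contain a space, so shorten them step by step with rename. The line under each command shows an example of a resulting name.
$ rename 's/SRR.*_RNA-seq_of_//' *.gz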
Blondee_fruit_skin_with_fiesh_at_stage_ I_Rep._I.fastq.gz

$ rename 's/_fruit_skin_with_fiesh//' *.gz
Blondee_at_stage_ I_Rep._I.fastq.gz

$ rename 's/at_stage_ IV/S4/' *.gz
Blondee_S4_at_harvest_Rep._I.fastq.gz

$ rename 's/at_stage_ III/S3/' *.gz
Blondee_S3_Rep._I.fastq.gz

$ rename 's/at_stage_ II/S2/' *.gz
Blondee_S2_Rep._I.fastq.gz

$ rename 's/at_stage_ I/S1/' *.gz
Blondee_S1_Rep._I.fastq.gz

$ rename 's/Blondee/BLO/' *.gz
BLO_S1_Rep._I.fastq.gz

$ rename 's/Kidds-D_8/KID/' *.gz
KID_S1_Rep._II.fastq.gz

$ rename 's/at_harvest_//' *.gz
BLO_S4_Rep._I.fastq.gz

$ rename 's/\._III/3/' *.gz
BLO_S1_Rep3.fastq.gz

$ rename 's/\._II/2/' *.gz
BLO_S1_Rep2.fastq.gz

$ rename 's/\._I/1/' *.gz
BLO_S1_Rep1.fastq.gz
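Because each rename changes the files in place, it can help to preview the whole chain on one name before touching anything. A minimal sketch that feeds a name through the same substitutions with sed; `preview_name` is a hypothetical helper, and the dots in the last three patterns are escaped here so that they match a literal period only:

```shell
# preview_name NAME
# Apply the rename chain to a single name and print the result; no file is touched.
preview_name() {
    printf '%s\n' "$1" | sed \
        -e 's/SRR.*_RNA-seq_of_//' \
        -e 's/_fruit_skin_with_fiesh//' \
        -e 's/at_stage_ IV/S4/' \
        -e 's/at_stage_ III/S3/' \
        -e 's/at_stage_ II/S2/' \
        -e 's/at_stage_ I/S1/' \
        -e 's/Blondee/BLO/' \
        -e 's/Kidds-D_8/KID/' \
        -e 's/at_harvest_//' \
        -e 's/\._III/3/' \
        -e 's/\._II/2/' \
        -e 's/\._I/1/'
}

preview_name 'SRR2176358_RNA-seq_of_Blondee_fruit_skin_with_fiesh_at_stage_ I_Rep._I.fastq.gz'
# prints BLO_S1_Rep1.fastq.gz
```

The order matters: IV is handled before III, II and I, and ._III before ._II and ._I, so that a shorter pattern never eats part of a longer one.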
After the renames above, the files have their final names (figure: renamed file names).
$ ls *.gz > fastq.lst
$ head -3 fastq.lst
BLO_S1_Rep1.fastq.gz
BLO_S1_Rep2.fastq.gz
BLO_S1_Rep3.fastq.gz

$ awk '{print "fastp -i "$1}' fastq.lst
fastp -i BLO_S1_Rep1.fastq.gz
fastp -i BLO_S1_Rep2.fastq.gz
fastp -i BLO_S1_Rep3.fastq.gz
…

$ awk '{print "fastp -i "$1" -o clean_data/"$1}' fastq.lst
fastp -i BLO_S1_Rep1.fastq.gz -o clean_data/BLO_S1_Rep1.fastq.gz
…
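The same command list can also be produced with a plain shell loop. A minimal sketch, not the method used in the live stream: `make_fastp_cmds` is a hypothetical helper that quotes every file name and strips the .fastq.gz suffix from the report names, which is a deviation from the awk version (that one yields names such as BLO_S1_Rep1.fastq.gz.html):

```shell
# make_fastp_cmds LIST
# Print one fastp command per file name in LIST; nothing is executed.
make_fastp_cmds() {
    while IFS= read -r fq; do
        base=${fq%.fastq.gz}    # BLO_S1_Rep1.fastq.gz -> BLO_S1_Rep1
        printf 'fastp -i "%s" -o "clean_data/%s" -h "%s.html" -j "%s.json"\n' \
            "$fq" "$fq" "$base" "$base"
    done < "$1"
}
```

Typical use would be `mkdir -p clean_data` followed by `make_fastp_cmds fastq.lst > run_fastp.sh`. The output directory has to exist before fastp runs.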

$ awk '{print "fastp -i "$1" -o clean_data/"$1" -h "$1".html -j "$1".json &"}' fastq.lst
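fastp -i BLO_S1_Rep1.fastq.gz -o clean_data/BLO_S1_Rep1.fastq.gz -h BLO_S1_Rep1.fastq.gz.html -j BLO_S1_Rep1.fastq.gz.json &
…
(The trailing & sends each command to the background so that all of them run in parallel. On your own computer you can leave it out, since a personal machine usually does not have that many threads.)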
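A middle ground between running everything at once with & and running one file at a time is to cap the number of simultaneous jobs. A minimal sketch using `xargs -P` (supported by GNU and BSD xargs, not required by POSIX); `run_capped` and its arguments are hypothetical names introduced here:

```shell
# run_capped LIST JOBS [PROGRAM]
# Run PROGRAM (fastp by default) once per file name in LIST,
# with at most JOBS processes running at the same time.
run_capped() {
    # -I{} substitutes the whole input line for every {} in the command
    xargs -P "$2" -I{} "${3:-fastp}" \
        -i {} -o clean_data/{} -h {}.html -j {}.json < "$1"
}
```

For example, `run_capped fastq.lst 2` would process two files at a time. Passing `echo` as the third argument prints the arguments instead of running fastp, which is a cheap way to check the command lines first.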

$ awk '{print "fastp -i "$1" -o clean_data/"$1" -h "$1".html -j "$1".json &"}' fastq.lst > run_fastp.sh
$ nohup sh run_fastp.sh & 
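Since every line of run_fastp.sh ends in &, the script itself returns almost immediately while the fastp jobs keep running in the background. A minimal sketch for checking afterwards that each input produced a non-empty cleaned file and JSON report, assuming the file layout from the commands above; `check_outputs` is a hypothetical helper:

```shell
# check_outputs LIST
# Print "missing: NAME" for every file in LIST that lacks a non-empty
# clean_data/NAME or NAME.json in the current directory.
check_outputs() {
    while IFS= read -r fq; do
        if [ ! -s "clean_data/$fq" ] || [ ! -s "$fq.json" ]; then
            echo "missing: $fq"
        fi
    done < "$1"
}
```

Running `check_outputs fastq.lst` after the jobs have finished should print nothing when every sample went through. A file that is still being written can pass this check, so run it only once no fastp process is left.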