Linux and Bioinformatics

Continued from the previous post: detecting convergence in amino acid properties with PCOC

2023-11-27  多啦A梦的时光机_648d

PCOC detection

cd /home/lx_sky6/yt/1-top/4-conv_AR/4-Converg/06_pcoc

step1: Prepare the tree files. Copy the 02_Trees folder generated in step 2 into the current directory, then rename the files so they end in .txt

cp -r ../02_Trees/ ./00.trees/
ls ./00.trees/ | sed 's/\.ntree$//' >list
for id in $(cat list); do mv "00.trees/$id.ntree" "00.trees/$id.txt"; done
The tree files then look like this:
ortholog14728.phy.txt
ortholog14729.phy.txt
..............

step2: Prepare the sequence files

mkdir prank
cp ../prank/*.fas prank/

The alignment files look like this:
ortholog14720.best.fas
ortholog14721.best.fas
...........
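
Before launching PCOC it may be worth checking that every tree has a matching alignment. The sketch below assumes, based only on the two listings above, that a tree named <id>.phy.txt pairs with an alignment named <id>.best.fas. The function name check_pairs is made up for illustration and is not part of PCOC.

```shell
#!/usr/bin/env bash
# check_pairs TREE_DIR ALN_DIR
# Prints the ortholog IDs that have a tree but no alignment.
# Assumption: trees are <id>.phy.txt and alignments are <id>.best.fas.
check_pairs() {
    local tree_dir=$1 aln_dir=$2 tree id
    for tree in "$tree_dir"/*.phy.txt; do
        [ -e "$tree" ] || continue      # no trees at all: the glob stayed literal
        id=$(basename "$tree" .phy.txt)
        [ -e "$aln_dir/$id.best.fas" ] || printf '%s\n' "$id"
    done
}
```

Called as check_pairs 00.trees prank, it prints nothing when every tree has a partner.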

step3: Prepare the file listing the species to be tested

Carex_breviculmis/Achnatherum_splendens

Run the pcoc.1.0.py script. Be sure to run it with the environment under /home/lx_sky6/yt/soft/miniconda3/envs/py276/bin/, because the script is written for Python 2.

##/home/lx_sky6/yt/script/pcoc_motified_script/pcoc.1.0.py
nohup python pcoc.1.0.py --path=/home/lx_sky6/yt/1-top/4-conv_AR/4-Converg/06_pcoc/prank  --tree=/home/lx_sky6/yt/1-top/4-conv_AR/4-Converg/06_pcoc/ortholog10114.phy.ntree --scenario=scenario.txt --cpu=30 &
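
Since the command above relies on whatever python happens to be first on PATH, a small guard can catch the case where Python 3 gets picked up by mistake. This is only a sketch: require_py2 is a made-up helper, and the interpreter name bin/python inside the py276 environment is an assumption (the post only gives the bin/ directory).

```shell
#!/usr/bin/env bash
# require_py2 INTERPRETER
# Succeeds only if INTERPRETER runs and reports major version 2.
require_py2() {
    local py=$1 major
    # print(...) with a single argument behaves the same in Python 2 and 3
    major=$("$py" -c 'import sys; print(sys.version_info[0])' 2>/dev/null) || return 1
    [ "$major" = "2" ]
}

# Usage sketch (path and executable name are assumptions):
#   PY2=/home/lx_sky6/yt/soft/miniconda3/envs/py276/bin/python
#   require_py2 "$PY2" && nohup "$PY2" pcoc.1.0.py [same arguments as above] &
```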

This generates the output_pcoc_det folder:

run1
run2
............

## Summary statistics
cd /home/lx_sky6/yt/0729_Carex/20-Conv/5-Converg/06.pcoc
ls output_pcoc_det/*/*/*.filtered_results.tsv >pcoc.out ## 355 convergent genes
while read -r id; do grep -vc '^Site' "$id"; done <pcoc.out >gene2sitesnumber
paste gene2sitesnumber pcoc.out >pcoc.out1

2       output_pcoc_det/run1002/RUN_20200320_225941/ortholog19997.best.filtered_results.tsv
2       output_pcoc_det/run1008/RUN_20200320_230023/ortholog24397.best.filtered_results.tsv
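
The counting and paste steps can also be folded into one pass, which avoids the two files drifting out of line order. A minimal sketch, assuming each filtered_results.tsv has a single header line starting with "Site"; count_sites is a hypothetical name.

```shell
#!/usr/bin/env bash
# count_sites < pcoc.out
# Reads one result-file path per line and prints "<site count><TAB><path>".
count_sites() {
    local path
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        # grep -vc counts the lines that do NOT start with "Site" (the data rows)
        printf '%s\t%s\n' "$(grep -vc '^Site' "$path")" "$path"
    done
}
```

Running count_sites <pcoc.out >pcoc.out1 should give the same two-column layout as shown above.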

Use a script to extract the convergent amino acid sites for each gene (one gene can have several sites):

python pcoc_result_YT.py pcoc.out1 >gene2sites

ortholog19997.best.filtered_results.tsv 310     0.0     0.0     0.9940532148768447      0.10249518202468676     0.99941029547

ortholog19997.best.filtered_results.tsv 318     0.0     0.0     0.9990523179270168      0.4381105911411674      0.99932570244

ortholog24397.best.filtered_results.tsv 220     0.0     0.0     0.9999274973052947      0.9496206162432255      0.99902952545
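
pcoc_result_YT.py itself is not shown in this post. Judging only from the output above, it seems to prefix every data row with the name of the result file it came from. The following is a rough shell stand-in under that assumption, not the author's script; flatten_sites is a made-up name.

```shell
#!/usr/bin/env bash
# flatten_sites < pcoc.out1
# Input lines:  "<count><TAB><path to *.filtered_results.tsv>"
# Output lines: "<file name><TAB><original data row>"
flatten_sites() {
    local count path name
    while read -r count path; do
        [ -n "$path" ] || continue
        name=$(basename "$path")
        # skip the header line, keep every data row unchanged
        awk -v name="$name" 'BEGIN { OFS = "\t" } !/^Site/ { print name, $0 }' "$path"
    done
}
```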

cut -d '/' -f4 pcoc.out | cut -d '.' -f1 | sed 's/$/.fasta/' >og.list
for id in $(cat og.list); do grep 'Cbre' "../../ortholog/$id" | sed 's/>Cbre-//g' >>Cbre.id; done
for id in $(cat Cbre.id); do grep -w "$id" ../../../10-eggnog/ID2genename.txt >>CbreID2genename.id; done

paste og.list  CbreID2genename.id >0g2id2name.txt
## Finally, match each ortholog group to its gene name
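
One caveat with paste: it pairs lines purely by position, so if an ortholog file has no Cbre sequence, or an ID has no entry in ID2genename.txt, every later row shifts. The sketch below keeps each row together and writes NA for missing values. It assumes FASTA headers of the form >Cbre-<id> and that the ID appears as a whole word in the mapping file; map_og is a hypothetical helper.

```shell
#!/usr/bin/env bash
# map_og OG_DIR ID2NAME < og.list
# Prints "<og file><TAB><Cbre id or NA><TAB><matching ID2NAME line or NA>".
map_og() {
    local og_dir=$1 id2name=$2 og id name
    while IFS= read -r og; do
        [ -n "$og" ] || continue
        # first Cbre header in the ortholog file, with the ">Cbre-" prefix removed
        id=$(grep -m 1 '^>Cbre-' "$og_dir/$og" 2>/dev/null | sed 's/^>Cbre-//')
        name=
        if [ -n "$id" ]; then
            # -F: literal match, -w: whole word, so g001 does not hit g0011
            name=$(grep -m 1 -w -F -- "$id" "$id2name" || true)
        fi
        printf '%s\t%s\t%s\n' "$og" "${id:-NA}" "${name:-NA}"
    done
}
```

With the paths used above this would be called as map_og ../../ortholog ../../../10-eggnog/ID2genename.txt <og.list.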

## Copy all scripts into the script directory
cp /home/lx_sky6/yt/0729_Carex/20-Conv/5-Converg/convCal.pbs /home/lx_sky6/yt/script/pcoc_motified_script/

cp /home/lx_sky6/yt/0729_Carex/20-Conv/5-Converg/convgent_1.2.py /home/lx_sky6/yt/script/pcoc_motified_script/
cp /home/lx_sky6/yt/0729_Carex/20-Conv/5-Converg/05_convCal/probCal.py /home/lx_sky6/yt/script/pcoc_motified_script/

cp /home/lx_sky6/yt/0729_Carex/20-Conv/5-Converg/06.pcoc/pcoc.1.0.py /home/lx_sky6/yt/script/pcoc_motified_script/
cp /home/lx_sky6/yt/0729_Carex/20-Conv/5-Converg/06.pcoc/pcoc_result_YT.py /home/lx_sky6/yt/script/pcoc_motified_script/