OrthoDB / Orthologer
2023-05-29
LET149
Orthologer is the application software of OrthoDB.
1. Installation
Install Orthologer with Docker:
docker pull ezlabgva/orthologer:v2.3
The Orthologer image is hosted at:
https://hub.docker.com/r/ezlabgva/orthologer
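Error encountered (raised when the image is pulled without a tag or with the tag latest; pulling an explicit tag such as v2.3, as above, avoids it):
- Error response from daemon: manifest for ezlabgva/orthologer:latest not found: manifest unknown: manifest unknown
- Fix: https://blog.csdn.net/qq_20042935/article/details/105043400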
2. Set up the Orthologer working environment
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/test/Sequences$ id -u #show the UID of the current user
1000 #the UID of the current user is 1000
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/test/Sequences$ mkdir Orthologer_env #create the Orthologer working directory
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/test/Sequences$ cd Orthologer_env/
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/test/Sequences/Orthologer_env$ pwd
/home/zhiyong/Application/test/Sequences/Orthologer_env
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/test/Sequences/Orthologer_env$ sudo docker run -u 1000 -v /home/zhiyong/Application/test/Sequences/Orthologer_env:/odbwork ezlabgva/orthologer:v2.3 setup_odb.sh #create the Orthologer working environment in the current directory
Once the working environment has been created successfully, the directory contains the following:
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/test$ ll
total 84
drwxrwxr-x 12 zhiyong zhiyong 4096 Mar 29 19:32 ./
drwxrwxr-x 7 zhiyong zhiyong 4096 Mar 29 19:31 ../
drwxr-xr-x 2 zhiyong root 4096 Mar 29 19:31 bin/
drwxr-xr-x 2 zhiyong root 4096 Mar 29 19:31 Cluster/
-rwxr-xr-x 1 zhiyong root 12846 Mar 29 19:31 common.sh*
drwxr-xr-x 2 zhiyong root 4096 Mar 29 19:31 docs/
drwxr-xr-x 6 zhiyong root 4096 Mar 29 19:32 JobLog/
-rw-rw-r-- 1 zhiyong zhiyong 124 Mar 29 19:32 mydata.txt
-rwxr-xr-x 1 zhiyong root 880 Mar 29 19:31 orthologer.sh*
drwxr-xr-x 6 zhiyong root 4096 Mar 29 19:32 PWC/
drwxr-xr-x 2 zhiyong root 4096 Mar 29 19:31 Rawdata/
drwxr-xr-x 2 zhiyong root 4096 Mar 29 19:31 Results/
lrwxrwxrwx 1 zhiyong root 31 Mar 29 19:31 sbin -> /usr/local/ORTHOLOGER-2.3.0/bin
drwxr-xr-x 2 zhiyong root 4096 Mar 29 19:31 .scratch/
drwxr-xr-x 2 zhiyong root 4096 Mar 29 19:46 Sequences/
-rw-r--r-- 1 zhiyong root 1332 Mar 29 19:31 setup.log
-rw-r--r-- 1 zhiyong root 756 Mar 29 19:31 .setup_project_uid1000.sh
-rwxr-xr-x 1 zhiyong root 1951 Mar 29 19:31 setup_project_uid1000.sh*
drwxr-xr-x 2 zhiyong root 4096 Mar 29 19:32 todo/
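The docker run commands in this post hard-code the UID and the absolute path of the working directory. As a convenience, here is a minimal sketch of a wrapper that fills both in from id -u and pwd. The names odb_run, ODB_IMAGE and ODB_DRY_RUN are made up for this example and are not part of Orthologer; prepend sudo to the docker call if your Docker setup needs it, as it does on the machine used here.

```shell
# Hypothetical helper (not part of Orthologer): run a command in the container,
# mounting the current directory as /odbwork and using the current user's UID.
ODB_IMAGE="${ODB_IMAGE:-ezlabgva/orthologer:v2.3}"

odb_run() {
    if [ "${ODB_DRY_RUN:-0}" = "1" ]; then
        # Dry run: only print the command that would be executed.
        echo "docker run -u $(id -u) -v $(pwd):/odbwork $ODB_IMAGE $*"
    else
        docker run -u "$(id -u)" -v "$(pwd):/odbwork" "$ODB_IMAGE" "$@"
    fi
}

# Example, from inside the working directory:
#   odb_run setup_odb.sh
```

Setting ODB_DRY_RUN=1 first is a cheap way to check that the UID and the mounted path are the ones you expect before anything is actually run.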
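3. Move the FASTA files into the working directory
- The FASTA files may contain either DNA or protein sequences
- Create a new directory inside the Orthologer working environment (make sure its name does not clash with an existing directory)
- Move the FASTA files to be compared into this new directory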
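4. Create the FASTA guide file
- Create this file inside the working environment
- The file lists a species label (taxid) and the location of the corresponding FASTA file, in the following format:
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/OrthoDB_env$ cat mydata.txt #this is the FASTA guide file
+Amel kkkk/Amel_protein.fa #column 1 is the species label (taxid); note the leading "+" (see the help documentation for the reason); column 2 is the location of the corresponding FASTA file, where kkkk is a directory inside the working environment; the two columns are separated by a space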
+Bimp kkkk/Bimp_protein.fa
+Bter kkkk/Bter_protein.fa
+Dmel kkkk/Dmel_protein.fa
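note: when writing the path of a FASTA file, give only the one directory level that directly contains the file (kkkk in the example above); anything else causes an error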
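With more than a handful of species, writing the guide file by hand gets error-prone. The following is a minimal sketch that generates it from the contents of the FASTA directory. It assumes that each file ends in .fa and that the species label is the part of the file name before the first underscore or dot (Amel_protein.fa gives Amel); make_guide is a name invented for this example.

```shell
# make_guide DIR: print one "+Label DIR/file" line per *.fa file in DIR.
# Assumption: the label is the file name up to the first "_" or ".".
make_guide() {
    dir=$1
    for f in "$dir"/*.fa; do
        [ -e "$f" ] || continue        # the glob matched nothing
        base=$(basename "$f")
        label=${base%%[_.]*}           # strip from the first "_" or "." onward
        printf '+%s %s/%s\n' "$label" "$dir" "$base"
    done
}

# Example, from the working-environment directory:
#   make_guide kkkk > mydata.txt
```

Check the generated labels before importing, since two files with the same prefix would end up with the same label.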
5. Import the FASTA files listed in the guide file
sudo docker run -u 1000 -v /home/zhiyong/Application/test/Sequences/Orthologer_env:/odbwork ezlabgva/orthologer:v2.3 ./orthologer.sh manage -f mydata.txt
note: run this command from the top level of the working environment
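Since a wrong path in the guide file only shows up as an error during import, it can help to verify the paths first. Below is a minimal sketch of such a check, assuming the two-column format shown in step 4; check_guide is a hypothetical helper, not an Orthologer command.

```shell
# check_guide FILE: verify that every FASTA path listed in the guide file exists.
# Prints the missing entries to stderr and returns 1 if any are absent.
check_guide() {
    missing=0
    while read -r label path rest || [ -n "$label" ]; do
        [ -n "$label" ] || continue    # skip blank lines
        if [ ! -f "$path" ]; then
            echo "missing FASTA for $label: $path" >&2
            missing=1
        fi
    done < "$1"
    return "$missing"
}

# Example, from the working-environment directory:
#   check_guide mydata.txt && echo "all FASTA files found"
```

The paths are resolved relative to the current directory, so the check has to be run from the top level of the working environment as well.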
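6. Edit the .todo file
- The .todo file lives in the todo directory and has the extension .todo
- The .todo file lists the species labels (taxid) to be compared
- The name of the .todo file does not matter, but the extension does
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/OrthoDB_env/todo$ cat Amel_Bter.todo #this file drives the comparison of Amel and Bter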
AMEL
BTER
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/OrthoDB_env/todo$ cat Amel_Bter_Dmel.todo #this file drives the comparison of Amel, Bter and Dmel
AMEL
BTER
DMEL
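A .todo file is just one label per line, so it is easy to generate. Here is a minimal sketch; write_todo is a name invented for this example, and the upper-casing is an assumption based only on the AMEL/BTER form of the labels in the two files shown above.

```shell
# write_todo NAME LABEL...: create todo/NAME.todo with one label per line.
# Assumption: labels are written in upper case, as in the examples above.
write_todo() {
    name=$1
    shift
    mkdir -p todo
    printf '%s\n' "$@" | tr '[:lower:]' '[:upper:]' > "todo/$name.todo"
}

# Example, from the working-environment directory:
#   write_todo Amel_Bter Amel Bter
```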
7. Run the Orthologer container
sudo docker run -u 1000 -v /home/zhiyong/Application/test/Sequences/Orthologer_env:/odbwork ezlabgva/orthologer:v2.3 ./orthologer.sh -t todo/mydata.todo -r all
7.1 OrthoDB-pro-ID - gtf-translation-ID
- location: Rawdata/.fs.maptxt
(base) zhiyong@zhiyong-OptiPlex-7050:~/Application/OrthoDB_env/Rawdata$ cat ACER.fs.maptxt|head
ACER:000000 NP_001315405.1 icarapin-like precursor [Apis cerana]
ACER:000001 NP_001315406.1 odorant receptor coreceptor [Apis cerana]
ACER:000002 NP_001315407.1 odorant receptor 30a-like [Apis cerana]
ACER:000003 NP_001315409.1 major royal jelly protein 5 precursor [Apis cerana]
ACER:000004 NP_001315410.1 period circadian protein [Apis cerana]
ACER:000005 NP_001315411.1 opsin, ultraviolet-sensitive [Apis cerana]
ACER:000006 NP_001315412.1 opsin, blue-sensitive [Apis cerana]
ACER:000007 NP_001315413.1 vitellogenin precursor [Apis cerana]
ACER:000008 NP_001315414.1 rhodopsin, long-wavelength [Apis cerana]
ACER:000009 NP_001315415.1 phosphoglycerate mutase 2 [Apis cerana]
- Column 1: OrthoDB-pro-ID
- Column 2: gtf-translation-ID
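To go from a gtf-translation-ID back to its OrthoDB-pro-ID, a one-line awk lookup on the map file is enough. This is a minimal sketch that assumes the whitespace-separated layout shown above; map_lookup is a hypothetical helper name.

```shell
# map_lookup FILE ACCESSION: print the OrthoDB-pro-ID (column 1) of the line
# whose gtf-translation-ID (column 2) equals ACCESSION.
map_lookup() {
    awk -v acc="$2" '$2 == acc { print $1 }' "$1"
}

# Example, from the Rawdata directory:
#   map_lookup ACER.fs.maptxt NP_001315413.1
```

Nothing is printed when the accession is not present in the file.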
7.2 OG - gtf-translation-ID
- location: Results/mydata_orthogroups.txt
cat mydata_orthogroups.txt|head -20
0 XP_016906112.1 1 84 1 84 100 304.17 4.146e-63
0 XP_016767410.1 1 84 1 84 100 304.17 4.146e-63
1 XP_016910209.1 1 79 1 79 100 292.4 1.732e-56
1 XP_006567534.1 1 79 1 79 100 292.4 1.732e-56
2 XP_028519923.1 1 286 1 286 100 288.99 3.081e-215
2 XP_006564532.1 1 286 1 286 100 288.99 3.081e-215
3 XP_016908651.1 1 2874 1 2874 99.3 286.28 0
3 XP_016908652.1 9 2874 1 2874 -1 100 0
3 XP_003250904.2 1 2876 1 2876 99.3 286.28 0
4 XP_016918707.1 1 232 1 232 99.5 285.78 1.028e-171
4 NP_001165850.1 1 232 1 232 99.5 285.78 1.028e-171
5 XP_016915986.1 1 336 1 336 100 285.25 8.206e-253
- Column 1: OG
- Column 2: gtf-translation-ID
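Because each line holds a single member, it is often handier to collapse the file to one line per OG. The following minimal sketch does this with awk, using only the first two columns described above; og_members is a name invented for this example, and the remaining columns are ignored.

```shell
# og_members FILE: print "OG<TAB>member1,member2,..." for each orthogroup,
# keeping the groups in the order in which they first appear in FILE.
og_members() {
    awk '{
        if (!($1 in m)) { order[++n] = $1; m[$1] = $2 }
        else            { m[$1] = m[$1] "," $2 }
    }
    END {
        for (i = 1; i <= n; i++) printf "%s\t%s\n", order[i], m[order[i]]
    }' "$1"
}

# Example, from the Results directory:
#   og_members mydata_orthogroups.txt | head
```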