SRA Data Download Self-Rescue Guide
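Still envying friends across the strait whose SRA downloads fly? Still upset that wget often leaves your downloads incomplete? Tried the official download tool and found it unbearably slow? Here is an SRA download self-rescue guide for your reference.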
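Two tools are needed:
- SRA Toolkit
- IBM aspera, a high-speed file transfer tool
Because this is a minimalist self-rescue guide, nothing is explained and only links are given. Anything you don't understand is yours to study (or not, your call).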
SRA Toolkit website: https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc
aspera website: https://support.asperasoft.com/hc/en-us
aspera's official notes on downloading NCBI data
SRA Toolkit's official notes on using aspera:
https://www.ncbi.nlm.nih.gov/books/NBK242625/
https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=prefetch
Quick prelude to the self-rescue
- Download aspera (choose the Linux version)
https://downloads.asperasoft.com/en/downloads/8?list
- Install aspera
wget https://download.asperasoft.com/download/sw/connect/3.8.1/ibm-aspera-connect-3.8.1.161274-linux-g2.12-64.tar.gz
# The version number may have changed, so do not copy the command above verbatim
tar zxvf ibm-aspera-connect-3.8.1.161274-linux-g2.12-64.tar.gz
bash ibm-aspera-connect-3.8.1.161274-linux-g2.12-64.sh
# Default install path: /home/user/.aspera
- Install sra toolkit. The exact commands are omitted; be sure to install the latest version :)
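Since the version number in the Aspera commands above may change, one option is to keep the download URL in a single variable and derive the file names from it. The following is only a minimal sketch: `aspera_installer_names` is a hypothetical helper (not part of Aspera), and it assumes the installer script inside the tarball shares the tarball's base name, as it does for the 3.8.1 release shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the tarball and installer-script names from a
# download URL, so the version number only has to be edited in one place.
aspera_installer_names() {
    local url="$1"
    local tarball="${url##*/}"            # strip everything up to the last "/"
    local script="${tarball%.tar.gz}.sh"  # assumed: script shares the tarball's base name
    printf '%s\n%s\n' "$tarball" "$script"
}

# Usage sketch (commented out so nothing is downloaded by accident):
# url="https://download.asperasoft.com/download/sw/connect/3.8.1/ibm-aspera-connect-3.8.1.161274-linux-g2.12-64.tar.gz"
# { read -r tarball; read -r script; } < <(aspera_installer_names "$url")
# wget "$url" && tar zxvf "$tarball" && bash "$script"
```

If the naming assumption does not hold for a newer release, list the tarball contents with `tar ztf` and use the script name it reports instead.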
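The self-rescue proper
The few Chinese-language tutorials on downloading sra data with aspera are all long-winded and messy. Stop reading them!
Remember, the real self-rescue takes only two steps; any article that spells out a long list of steps is just messing with you.
- Write the SRR accessions you want to download into a file srr.txt, one SRR id per line
- Download with prefetch from SRA toolkit, specifying ascp as the transfer method. The command is as follows; look up the meaning of each parameter in the documentation (or don't, your call)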
prefetch -t ascp -a "/home/user/.aspera/connect/bin/ascp|/home/user/.aspera/connect/etc/asperaweb_id_dsa.openssh" --option-file srr.txt -O /opt/user/ncbi
The -a parameter must give the absolute paths of ascp and of the private key. With a default installation, you only need to replace user with your own username.
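Because the `-a` value depends on two absolute paths and on srr.txt being well formed, a quick check before launching prefetch can save a failed run. The following is a minimal sketch, not part of either tool: `check_aspera_setup` is a hypothetical helper, the paths follow the default install location mentioned above, and the accession pattern assumes every line is `SRR` followed by digits.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: verify the ascp binary, the private key and
# the accession list, then print the "ascp|key" string expected by prefetch -a.
check_aspera_setup() {
    local aspera_home="$1"   # e.g. /home/user/.aspera/connect
    local list_file="$2"     # e.g. srr.txt
    local ascp="$aspera_home/bin/ascp"
    local key="$aspera_home/etc/asperaweb_id_dsa.openssh"

    [ -x "$ascp" ] || { echo "ascp not found or not executable: $ascp" >&2; return 1; }
    [ -r "$key" ] || { echo "private key not readable: $key" >&2; return 1; }
    [ -s "$list_file" ] || { echo "accession list missing or empty: $list_file" >&2; return 1; }

    # Assumption: one SRR id per line and nothing else (no blanks, no comments).
    if grep -qvE '^SRR[0-9]+$' "$list_file"; then
        echo "unexpected line in $list_file (expected one SRR id per line)" >&2
        return 1
    fi

    printf '%s|%s\n' "$ascp" "$key"
}

# Usage sketch (commented out so nothing is downloaded by accident):
# ascp_arg=$(check_aspera_setup "$HOME/.aspera/connect" srr.txt) &&
#     prefetch -t ascp -a "$ascp_arg" --option-file srr.txt -O /opt/user/ncbi
```

Building the `-a` string from `$HOME` avoids editing the username by hand, while the prefetch options themselves are exactly those of the command above.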
Testing the results
Eight SRR files were downloaded, averaging about 5 GB each. The timings were as follows:
2018-09-05T14:14:33 prefetch.2.9.2: 1) Downloading 'SRR******'...
2018-09-05T14:14:33 prefetch.2.9.2: Downloading via fasp...
SRR******
2018-09-05T14:16:58 prefetch.2.9.2: fasp download succeed
2018-09-05T14:16:58 prefetch.2.9.2: 1) 'SRR******' was downloaded successfully
2018-09-05T14:17:01 prefetch.2.9.2: 2) Downloading 'SRR******'...
2018-09-05T14:17:01 prefetch.2.9.2: Downloading via fasp...
SRR******
2018-09-05T14:19:25 prefetch.2.9.2: fasp download succeed
2018-09-05T14:19:25 prefetch.2.9.2: 2) 'SRR******' was downloaded successfully
2018-09-05T14:19:28 prefetch.2.9.2: 3) Downloading 'SRR******'...
2018-09-05T14:19:28 prefetch.2.9.2: Downloading via fasp...
SRR******
2018-09-05T14:22:31 prefetch.2.9.2: fasp download succeed
2018-09-05T14:22:31 prefetch.2.9.2: 3) 'SRR******' was downloaded successfully
2018-09-05T14:22:35 prefetch.2.9.2: 4) Downloading 'SRR******'...
2018-09-05T14:22:35 prefetch.2.9.2: Downloading via fasp...
SRR******
2018-09-05T14:25:14 prefetch.2.9.2: fasp download succeed
2018-09-05T14:25:14 prefetch.2.9.2: 4) 'SRR******' was downloaded successfully
2018-09-05T14:25:17 prefetch.2.9.2: 5) Downloading 'SRR******'...
2018-09-05T14:25:17 prefetch.2.9.2: Downloading via fasp...
SRR******
2018-09-05T14:26:46 prefetch.2.9.2: fasp download succeed
2018-09-05T14:26:46 prefetch.2.9.2: 5) 'SRR******' was downloaded successfully
2018-09-05T14:26:49 prefetch.2.9.2: 6) Downloading 'SRR******'...
2018-09-05T14:26:49 prefetch.2.9.2: Downloading via fasp...
SRR******
2018-09-05T14:28:13 prefetch.2.9.2: fasp download succeed
2018-09-05T14:28:13 prefetch.2.9.2: 6) 'SRR******' was downloaded successfully
2018-09-05T14:28:16 prefetch.2.9.2: 7) Downloading 'SRR******'...
2018-09-05T14:28:16 prefetch.2.9.2: Downloading via fasp...
SRR******
2018-09-05T14:29:56 prefetch.2.9.2: fasp download succeed
2018-09-05T14:29:56 prefetch.2.9.2: 7) 'SRR******' was downloaded successfully
2018-09-05T14:30:00 prefetch.2.9.2: 8) Downloading 'SRR******'...
2018-09-05T14:30:00 prefetch.2.9.2: Downloading via fasp...
SRR******
2018-09-05T14:31:58 prefetch.2.9.2: fasp download succeed
2018-09-05T14:31:58 prefetch.2.9.2: 8) 'SRR******' was downloaded successfully
See: even under (you know what I mean) this kind of network conditions, a 5 GB file takes only about one and a half to three minutes.
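The per-file times can be read straight off the timestamps in the log. The following is a minimal sketch, assuming the prefetch output was saved to a file and that all timestamps fall on the same day, as they do above; `download_durations` and the file name `prefetch.log` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a prefetch log on stdin and print, for each file,
# its index and the seconds between "N) Downloading" and "fasp download succeed".
download_durations() {
    awk '
        # Convert a timestamp like "2018-09-05T14:14:33" to seconds since midnight.
        function secs(ts,    hms) {
            split(substr(ts, index(ts, "T") + 1), hms, ":")
            return hms[1] * 3600 + hms[2] * 60 + hms[3]
        }
        /\) Downloading /       { start = secs($1) }
        /fasp download succeed/ { n++; printf "%d\t%d s\n", n, secs($1) - start }
    '
}

# Usage sketch:
# download_durations < prefetch.log
```

Applied to the eight entries above, the differences range from 84 seconds (file 6) to 183 seconds (file 3).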
Self-rescue successful. Best wishes!