VEP, a VCF annotation tool

2019-07-02  大魔王鱼鱼鱼

I. Introduction

Variant Effect Predictor

The VEP is a software suite that performs annotation and analysis of most types of genomic variation in coding and noncoding regions of the genome. From disease investigation to population studies, it is a critical tool to annotate variants and prioritize a subset for further analysis.

Tutorial overview: https://asia.ensembl.org/info/docs/tools/vep/script/vep_tutorial.html

Download and installation details: http://asia.ensembl.org/info/docs/tools/vep/script/vep_download.html

II. Download and installation

1. Download

git clone https://github.com/Ensembl/ensembl-vep.git

2. Installation

cd ensembl-vep

perl INSTALL.pl 

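The installer asks which cache databases and plugins you want, then downloads and unpacks them. Enter 0 to select none, all to select everything, or a number from the list to download that particular entry (for example a specific species and assembly version).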
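The prompts can also be skipped by passing the choices on the command line. The sketch below only assembles and prints such a command; it assumes you are inside the cloned ensembl-vep directory and want the human GRCh37 cache, and RUN_INSTALL is a hypothetical switch invented for this example. Check the flags against perl INSTALL.pl --help for your version before relying on them.

```shell
# Assumptions for this sketch: human data on GRCh37; adjust to match your VCF.
species="homo_sapiens"
assembly="GRCh37"

# --AUTO takes letters for what to install: a = API, c = cache, f = FASTA.
install_cmd="perl INSTALL.pl --AUTO acf --SPECIES $species --ASSEMBLY $assembly"

# Print the command first so it can be reviewed.
echo "$install_cmd"

# RUN_INSTALL is a made-up guard for this example; set RUN_INSTALL=1 to execute.
# The unquoted expansion is intentional: none of the arguments contain spaces.
if [ "${RUN_INSTALL:-0}" = "1" ]; then
  $install_cmd
fi
```

Printing before running makes it easy to confirm that the assembly matches the one your VCF was called against.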

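If no cache file exists for your species or assembly, that is not a problem: one can be built from GTF and FASTA files with a script. The VEP package also includes a script, gtf2vep.pl, to build custom cache files. This requires a local GFF or general transfer format (GTF) file that describes transcript structures and a FASTA file of the genomic sequence. Recent ensembl-vep releases can also read a bgzip-compressed, tabix-indexed GFF or GTF file directly through --gff or --gtf together with --fasta, so check whether gtf2vep.pl still ships with your version.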

3. Test

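The example VCF uses the GRCh37 assembly. If you have not downloaded the cache that matches the assembly of your VCF, add --port 3337, which points VEP at the port of Ensembl's public database server that hosts GRCh37 data.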

/home/shaoyu/software/ensembl-vep/vep -i /home/shaoyu/software/ensembl-vep/examples/homo_sapiens_GRCh37.vcf --cache --port 3337

Output files: variant_effect_output.txt and variant_effect_output.txt_summary.html
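The choice between a local cache and the GRCh37 database port can be scripted. The following is a minimal sketch, assuming caches sit in VEP's default layout ~/.vep/<species>/<release>_<assembly>/; the function name pick_vep_source and the variable VEP_CACHE_ROOT are hypothetical names introduced here, not part of VEP.

```shell
# VEP_CACHE_ROOT is a made-up override for this sketch; ~/.vep is VEP's default cache directory.
cache_root="${VEP_CACHE_ROOT:-$HOME/.vep}"

# pick_vep_source SPECIES ASSEMBLY
# Prints the VEP flags to use: the local cache if one is found, else the public database.
pick_vep_source() {
  for d in "$cache_root/$1"/*_"$2"; do
    if [ -d "$d" ]; then
      echo "--cache --dir_cache $cache_root --assembly $2"
      return 0
    fi
  done
  # No local cache found: fall back to the database; GRCh37 data is served on port 3337.
  if [ "$2" = "GRCh37" ]; then
    echo "--database --port 3337"
  else
    echo "--database"
  fi
}

# Show the command that would be run for the bundled example file.
echo "vep -i examples/homo_sapiens_GRCh37.vcf $(pick_vep_source homo_sapiens GRCh37)"
```

The unquoted $(...) relies on word splitting, so this only works when the cache path contains no spaces. Database mode is much slower than a local cache and is best kept for small test files.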


III. Usage

/home/shaoyu/software/ensembl-vep/vep -i /home/shaoyu/software/ensembl-vep/examples/homo_sapiens_GRCh37.vcf --cache --port 3337 # basic run

/home/shaoyu/software/ensembl-vep/vep -i /home/shaoyu/software/ensembl-vep/examples/homo_sapiens_GRCh37.vcf --cache --port 3337 --sift b -o test2.sift.txt # SIFT predicts whether an amino acid change is likely to be deleterious to the function of the protein; "b" requests both the prediction and the score.

/home/shaoyu/software/ensembl-vep/filter_vep -i test2.sift.txt -filter "SIFT is deleterious" -o test2.sift.filter.txt # keep only the variants predicted deleterious
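To see what that filter keys on, it helps to look at the Extra column (column 14) of the default output, where SIFT appears as a key=value pair such as SIFT=deleterious(0.01). The sketch below mimics the filter with awk on a tiny made-up file. The rows are invented for illustration, and awk here is a stand-in for, not a description of, how filter_vep works internally.

```shell
# Build a small sample in VEP's default tab-delimited layout (all values are made up).
sample=$(mktemp)
{
  printf '## made-up header line\n'
  printf '#Uploaded_variation\tLocation\tAllele\tGene\tFeature\tFeature_type\tConsequence\tcDNA_position\tCDS_position\tProtein_position\tAmino_acids\tCodons\tExisting_variation\tExtra\n'
  printf 'var1\t1:100\tA\tGENE1\tTX1\tTranscript\tmissense_variant\t10\t10\t4\tR/H\tcGc/cAc\t-\tIMPACT=MODERATE;SIFT=deleterious(0.01)\n'
  printf 'var2\t1:200\tT\tGENE1\tTX1\tTranscript\tmissense_variant\t20\t20\t7\tA/V\tgCg/gTg\t-\tIMPACT=MODERATE;SIFT=tolerated(0.45)\n'
  printf 'var3\t2:300\tG\tGENE2\tTX2\tTranscript\tsynonymous_variant\t30\t30\t10\tL\tctA/ctG\t-\tIMPACT=LOW\n'
} > "$sample"

# Keep header lines, plus data rows whose Extra column carries SIFT=deleterious(...).
kept=$(awk -F'\t' '/^#/ || $14 ~ /(^|;)SIFT=deleterious\(/' "$sample")
echo "$kept"

rm -f "$sample"
```

Matching on the opening parenthesis keeps predictions such as deleterious_low_confidence out of the result, which is one reason to prefer filter_vep's own operators for real data.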

/home/shaoyu/software/ensembl-vep/vep -i /home/shaoyu/software/ensembl-vep/examples/homo_sapiens_GRCh37.vcf --cache --port 3337 --everything -o test3.everything.txt  # --everything switches on a broad set of annotations

--everything

Shortcut flag to switch on all of the following:

--sift b, --polyphen b, --ccds, --uniprot, --hgvs, --symbol, --numbers, --domains, --regulatory, --canonical, --protein, --biotype, --tsl, --appris, --gene_phenotype, --af, --af_1kg, --af_esp, --af_gnomad, --max_af, --pubmed, --variant_class
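In the default output format, most of these flags show up as extra key=value pairs in the last column (Extra); --symbol, for instance, adds a SYMBOL key. As a rough sketch, a single key can be pulled out with awk. The helper name extra_value is made up for this example, and the demo row is invented rather than real VEP output.

```shell
# extra_value KEY
# Reads VEP default-format output on stdin and prints "<variant><TAB><value>"
# for every data row whose Extra column (14) contains KEY=value.
extra_value() {
  awk -F'\t' -v key="$1" '
    /^#/ { next }                       # skip header lines
    {
      n = split($14, pairs, ";")        # Extra holds key=value pairs separated by ";"
      for (i = 1; i <= n; i++) {
        eq = index(pairs[i], "=")
        if (eq > 0 && substr(pairs[i], 1, eq - 1) == key)
          print $1 "\t" substr(pairs[i], eq + 1)
      }
    }'
}

# Demo on one made-up row (14 tab-separated columns).
printf 'var1\t1:100\tA\tGENE_ID\tTX_ID\tTranscript\tmissense_variant\t10\t10\t4\tR/H\tcGc/cAc\t-\tIMPACT=MODERATE;SYMBOL=GENE1;SIFT=deleterious(0.01)\n' | extra_value SYMBOL
```

On real data the call would look like extra_value SYMBOL < test3.everything.txt. The same helper works for any other key in the Extra column, such as SIFT or IMPACT.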

More options:

https://asia.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_species

Filter options:

IV. Interpretation

1. Transcript annotation

[Figure: transcript annotation]

2. Protein annotation

V. Output files

The output format can be set with a flag (text, VCF, or JSON; for example --vcf or --json). The default is text.
