Coordinate conversion with transvar

2022-05-06  XuningFan

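Once we have the genomic coordinates of a variant, we often want its transcript (cDNA) or amino-acid coordinates so that we can compare it with known variants. transvar makes this coordinate conversion easy.
Some examples follow: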

Deletion

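Suppose we have the following variants, each given as chromosome, position, reference allele, alternate allele:

Single-base deletion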

13 32954022 CA C
Here an A is deleted at position 32954023 on chromosome 13.
It can be queried with either of the following commands:
transvar ganno -i 'chr13:g.32954022_32954023delinsC' --ucsc
or
transvar ganno -i 'chr13:g.32954023del' --ucsc
The query results are:

chr13:g.32954022_32954023delinsC    NM_000059 (protein_coding)  BRCA2   +   chr13:g.32954030delA/c.9097delA/p.T3033Lfs*29   inside_[cds_in_exon_23] CSQN=Frameshift;left_align_gDNA=g.32954023delA;unaligned_gDNA=g.32954023delA;left_align_cDNA=c.9090delA;unalign_cDNA=c.9090delA;source=UCSCRefGene
chr13:g.32954023del NM_000059 (protein_coding)  BRCA2   +   chr13:g.32954030delA/c.9097delA/p.T3033Lfs*29   inside_[cds_in_exon_23] CSQN=Frameshift;left_align_gDNA=g.32954023delA;unaligned_gDNA=g.32954023delA;left_align_cDNA=c.9090delA;unalign_cDNA=c.9090delA;source=UCSCRefGene
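Note that the reported coordinate (g.32954030delA) differs from the queried one (g.32954023del), while the info field still lists left_align_gDNA=g.32954023delA. This is consistent with the deleted A sitting in a run of identical bases, where HGVS convention places the variant at the 3'-most equivalent position. The sketch below illustrates that shift in Python. The function name and the toy sequence are made up for illustration; this is neither the real chr13 reference nor transvar's own code.

```python
def shift_deletion_3prime(seq: str, start: int, length: int) -> int:
    """Return the 3'-most 1-based start of a deletion equivalent to the input.

    seq    : reference sequence (a toy string here), addressed 1-based
    start  : 1-based position of the first deleted base
    length : number of deleted bases
    """
    pos = start
    # Shifting the deleted window one base to the right yields the same
    # sequence whenever the base leaving the window equals the base entering it.
    while pos + length <= len(seq) and seq[pos - 1] == seq[pos + length - 1]:
        pos += 1
    return pos


# Hypothetical toy sequence: a run of eight A's at positions 4..11.
toy = "GTC" + "A" * 8 + "CTG"
shifted = shift_deletion_3prime(toy, start=4, length=1)
print(shifted)  # 11: the deletion moves to the last A of the run
```

In the toy sequence the deletion moves 7 bases to the right, the same offset as between g.32954023 and g.32954030 in the output above.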
Multi-base deletion

13 32912089 CTG C
transvar ganno -i 'chr13:g.32912089_32912091delinsC' --ucsc
or
transvar ganno -i 'chr13:g.32912090_32912091del' --ucsc

Insertion

13 32937354 T TA
transvar ganno -i 'chr13:g.32937354_32937354delinsTA' --ucsc
or
transvar ganno -i 'chr13:g.32937354_32937355insA' --ucsc

SNV

17 41246245 C A
transvar ganno -i 'chr17:g.41246245C>A' --ucsc
or
transvar ganno -i 'chr17:g.41246245_41246245delinsA' --ucsc
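Writing these identifiers by hand is error-prone, so here is a minimal Python sketch that derives the g. identifier from a chromosome/position/REF/ALT record like the ones above. The function name vcf_to_hgvs_g is hypothetical and not part of transvar. It assumes REF and ALT are plain base strings and that indels carry the usual leading anchor base. It does no 3' shifting.

```python
def vcf_to_hgvs_g(chrom: str, pos: int, ref: str, alt: str) -> str:
    """Build an HGVS-style genomic identifier from CHROM, POS, REF, ALT."""
    if not ref or not alt or ref == alt:
        raise ValueError("REF and ALT must be non-empty and different")
    if not chrom.startswith("chr"):
        chrom = "chr" + chrom  # the examples here use UCSC-style names

    # Drop the shared leading bases (the anchor base of an indel).
    while ref and alt and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1

    end = pos + len(ref) - 1  # last reference base affected
    if len(ref) == 1 and len(alt) == 1:
        change = f"{pos}{ref}>{alt}"  # SNV
    elif not alt:
        # Pure deletion of ref[pos..end]
        change = f"{pos}del" if pos == end else f"{pos}_{end}del"
    elif not ref:
        # Pure insertion between the anchor base (pos - 1) and pos
        change = f"{pos - 1}_{pos}ins{alt}"
    else:
        # Anything else is written as a deletion-insertion
        span = f"{pos}" if pos == end else f"{pos}_{end}"
        change = f"{span}delins{alt}"
    return f"{chrom}:g.{change}"


print(vcf_to_hgvs_g("13", 32954022, "CA", "C"))   # chr13:g.32954023del
print(vcf_to_hgvs_g("13", 32937354, "T", "TA"))   # chr13:g.32937354_32937355insA
print(vcf_to_hgvs_g("17", 41246245, "C", "A"))    # chr17:g.41246245C>A
```

The resulting strings can be passed to transvar ganno -i as shown above.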

Batch conversion

Create a sites.txt file like the following:

CHROM   POS REF ALT id
1   46714092    C   T   chr1:g.46714092C>T
1   46714198    TC  T   chr1:g.46714199del
1   46714231    C   T   chr1:g.46714231C>T
1   46714263    G   A   chr1:g.46714263G>A
1   46714267    A   G   chr1:g.46714267A>G
1   46714272    T   C   chr1:g.46714272T>C
1   46714273    G   A   chr1:g.46714273G>A
1   46714274    A   T   chr1:g.46714274A>T
1   46714275    G   T   chr1:g.46714275G>T

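transvar ganno -l sites.txt -m 5 --ucsc > transvar_result.bed
Here -m 5 tells transvar that the variant identifier is in the fifth column (id). The result file has the following format: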

chr2:g.47607092C>A  NM_002354 (protein_coding)  EPCAM   +   chr2:g.47607092C>A/c.842C>A/p.A281D inside_[cds_in_exon_7]  CSQN=Missense;codon_pos=47607091-47607092-47607093;ref_codon_seq=GCT;source=UCSCRefGene
chr22:g.29095861T>C NM_001005735 (protein_coding)   CHEK2   -   chr22:g.29095861T>C/c.1102A>G/p.K368E   inside_[cds_in_exon_10] CSQN=Missense;codon_pos=29095859-29095860-29095861;ref_codon_seq=AAG;source=UCSCRefGene
chr22:g.29095861T>C NM_001257387 (protein_coding)   CHEK2   -   chr22:g.29095861T>C/c.310A>G/p.K104E    inside_[cds_in_exon_10] CSQN=Missense;codon_pos=29095859-29095860-29095861;ref_codon_seq=AAG;source=UCSCRefGene
chr22:g.29095861T>C NM_007194 (protein_coding)  CHEK2   -   chr22:g.29095861T>C/c.973A>G/p.K325E    inside_[cds_in_exon_9]  CSQN=Missense;codon_pos=29095859-29095860-29095861;ref_codon_seq=AAG;source=UCSCRefGene
chr22:g.29095861T>C NM_145862 (protein_coding)  CHEK2   -   chr22:g.29095861T>C/c.973A>G/p.K325E    inside_[cds_in_exon_9]  CSQN=Missense;codon_pos=29095859-29095860-29095861;ref_codon_seq=AAG;source=UCSCRefGene
chr11:g.108200964A>T    NM_000051 (protein_coding)  ATM +   chr11:g.108200964A>T/c.7331A>T/p.E2444V inside_[cds_in_exon_50] CSQN=Missense;codon_pos=108200963-108200964-108200965;ref_codon_seq=GAG;source=UCSCRefGene
chr17:g.59761513A>G NM_032043 (protein_coding)  BRIP1   -   chr17:g.59761513A>G/c.2906-12T>C/.  inside_[intron_between_exon_19_and_20]  CSQN=IntronicSNV;source=UCSCRefGene
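To compare against known variants we usually only need the cDNA and protein fields from each row. Below is a minimal Python sketch for pulling them out. It assumes the real result file is tab-separated with seven columns (query, transcript, gene, strand, gDNA/cDNA/protein coordinates, region, info), and that the spacing in the listing above was simply flattened by the page. TransvarRow and parse_transvar_line are hypothetical names, not part of transvar.

```python
from typing import Dict, NamedTuple


class TransvarRow(NamedTuple):
    query: str
    transcript: str
    gene: str
    strand: str
    gdna: str
    cdna: str
    protein: str
    region: str
    info: Dict[str, str]


def parse_transvar_line(line: str) -> TransvarRow:
    """Split one tab-separated transvar result row into named fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 tab-separated columns, got {len(fields)}")
    query, transcript, gene, strand, coords, region, info = fields

    # The coordinates column holds gDNA/cDNA/protein joined by "/".
    gdna, cdna, protein = coords.split("/", 2)

    # The info column is a ";"-separated list of key=value pairs.
    info_dict = dict(item.split("=", 1) for item in info.split(";") if "=" in item)

    # Keep only the accession, dropping the "(protein_coding)" annotation.
    accession = transcript.split()[0]
    return TransvarRow(query, accession, gene, strand,
                       gdna, cdna, protein, region, info_dict)


example = "\t".join([
    "chr17:g.59761513A>G", "NM_032043 (protein_coding)", "BRIP1", "-",
    "chr17:g.59761513A>G/c.2906-12T>C/.",
    "inside_[intron_between_exon_19_and_20]",
    "CSQN=IntronicSNV;source=UCSCRefGene",
])
row = parse_transvar_line(example)
print(row.gene, row.transcript, row.cdna, row.protein, row.info["CSQN"])
```

A protein field of "." means there is no amino-acid change to report, as for the intronic BRIP1 variant here.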