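Bacterial genomes: identifying plasmid sequences with PlasmidFinder
Once a bacterial genome has been sequenced, how do you find out whether it contains any plasmids?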
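Plasmids are widespread in the living world: they occur in bacteria, actinomycetes, filamentous fungi, macrofungi, yeasts and plants, and reportedly even in the human body. By molecular composition there are DNA plasmids and RNA plasmids; by conformation there are linear plasmids and circular plasmids; their phenotypes are just as varied. Bacterial plasmids are the most commonly used vectors in genetic engineering.
Plasmids are DNA molecules that exist outside the chromosome (or nucleoid) in organisms such as bacteria, yeasts and actinomycetes. They reside in the cytoplasm (yeast is the exception: its 2 μm plasmid is found in the nucleus) and replicate autonomously, which keeps their copy number stable in daughter cells and lets them express the genetic information they carry. Most are closed circular double-stranded DNA molecules. Plasmids are not essential for bacterial growth and reproduction; they can be lost spontaneously or eliminated by treatments such as high temperature or ultraviolet light. The genetic information they carry can confer particular biological traits on the host, helping the bacterium survive under specific environmental conditions.
Like most bacterial chromosomes, plasmids are typically circular double-stranded DNA (covalently closed circular DNA, cccDNA).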
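About PlasmidFinder
PlasmidFinder identifies plasmid sequences in bacterial genome sequencing data. It works from a manually curated database of plasmid replicons.
An online version is also available: https://cge.cbs.dtu.dk/services/PlasmidFinder/
It needs no installation: upload your sequences and the results come back quickly.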
Installing PlasmidFinder
git clone https://bitbucket.org/genomicepidemiology/plasmidfinder.git
cd plasmidfinder
Downloading and installing the PlasmidFinder database
# Clone database from git repository (develop branch)
git clone https://bitbucket.org/genomicepidemiology/plasmidfinder_db.git
cd plasmidfinder_db
PLASMID_DB=$(pwd)
# Install PlasmidFinder database with executable kma_index program
python3 INSTALL.py kma_index
If kma_index is not installed, see https://bitbucket.org/genomicepidemiology/kma and build it as follows:
git clone https://bitbucket.org/genomicepidemiology/kma.git
cd kma && make
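Before installing the database it can help to confirm that the required executable is actually reachable. The snippet below is a minimal sketch, not part of PlasmidFinder: the helper name find_executable is hypothetical, and it only assumes that the kma build directory has been added to PATH.

```python
import shutil


def find_executable(names=("kma_index",)):
    """Return the full path of the first name found on PATH, or None."""
    for name in names:
        path = shutil.which(name)  # None when the name is not on PATH
        if path:
            return path
    return None


# Example: report whether kma_index can be found before running INSTALL.py.
location = find_executable(("kma_index",))
print("kma_index:", location if location else "not found on PATH")
```

If nothing is found, passing the full path of the compiled binary instead of the bare name is the usual workaround.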
Using PlasmidFinder:
View the help text
$ python3 plasmidfinder.py -h
usage: plasmidfinder.py [-h] [-i INFILE [INFILE ...]] [-o OUTDIR]
                        [-tmp TMP_DIR] [-mp METHOD_PATH] [-p DB_PATH]
                        [-d DATABASES] [-l MIN_COV] [-t THRESHOLD] [-x] [-q]

optional arguments:
  -h, --help            show this help message and exit
  -i INFILE [INFILE ...], --infile INFILE [INFILE ...]
                        FASTA or FASTQ input files.
  -o OUTDIR, --outputPath OUTDIR
                        Path to blast output
  -tmp TMP_DIR, --tmp_dir TMP_DIR
                        Temporary directory for storage of the results from
                        the external software.
  -mp METHOD_PATH, --methodPath METHOD_PATH
                        Path to method to use (kma or blastn)
  -p DB_PATH, --databasePath DB_PATH
                        Path to the databases
  -d DATABASES, --databases DATABASES
                        Databases chosen to search in - if non is specified
                        all is used
  -l MIN_COV, --mincov MIN_COV
                        Minimum coverage
  -t THRESHOLD, --threshold THRESHOLD
                        Minimum threshold for identity
  -x, --extented_output
                        Give extented output with allignment files, template
                        and query hits in fasta and a tab seperated file with
                        allele profile results
  -q, --quiet
Run the command:
$ python3 plasmidfinder.py -i test/test.fsa -o testout/ -p plasmidfinder_db -x
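When many genomes need to be screened, the same command can be assembled from a script. The following is a minimal sketch that uses only the flags listed in the help text above (-i, -o, -p, -l, -t, -x); the function names build_plasmidfinder_cmd and run_plasmidfinder are hypothetical, and creating the output directory beforehand is a precaution added here, not something the help text states.

```python
import os
import subprocess


def build_plasmidfinder_cmd(infile, outdir, db_path, min_cov=None,
                            threshold=None, extended=True,
                            script="plasmidfinder.py"):
    """Assemble the argument list for one PlasmidFinder run."""
    cmd = ["python3", script, "-i", infile, "-o", outdir, "-p", db_path]
    if min_cov is not None:
        cmd += ["-l", str(min_cov)]    # minimum coverage, passed through as given
    if threshold is not None:
        cmd += ["-t", str(threshold)]  # minimum identity, passed through as given
    if extended:
        cmd.append("-x")               # request the extended output files
    return cmd


def run_plasmidfinder(infile, outdir, db_path, **options):
    """Create the output directory (a precaution), then run PlasmidFinder."""
    os.makedirs(outdir, exist_ok=True)
    cmd = build_plasmidfinder_cmd(infile, outdir, db_path, **options)
    subprocess.run(cmd, check=True)    # raises CalledProcessError on failure
    return cmd


# Prints the same command line as the one shown above.
print(" ".join(build_plasmidfinder_cmd("test/test.fsa", "testout/",
                                       "plasmidfinder_db")))
```

Calling run_plasmidfinder("test/test.fsa", "testout/", "plasmidfinder_db") from the plasmidfinder directory would then execute that command; looping over a list of assemblies only requires a distinct output directory per sample.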
Inspect the output folder:
$ ls testout
data.json Hit_in_genome_seq.fsa Plasmid_seqs.fsa results_tab.tsv results.txt tmp
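The tab-separated file in this listing, results_tab.tsv, is the easiest one to post-process. The sketch below filters hits by identity; it assumes the file has a header row with columns named Plasmid and Identity, matching the column titles printed in results.txt, which should be checked against your own output. The function name read_hits is hypothetical.

```python
import csv
import io


def read_hits(tsv_text, min_identity=0.0):
    """Return the rows (as dicts) whose Identity is at least min_identity."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        try:
            identity = float(row["Identity"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows that lack a numeric Identity value
        if identity >= min_identity:
            hits.append(row)
    return hits


# Small inline sample, assuming the column names described above.
sample = (
    "Plasmid\tIdentity\tContig\n"
    "IncHI1B(R27)\t100\tIncHI1B(R27)_1_R27_AF250878\n"
)
for hit in read_hits(sample, min_identity=95):
    print(hit["Plasmid"], hit["Identity"])
```

For a real run, read the file first, for example with open("testout/results_tab.tsv") and its read() method, and pass the text to read_hits.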
$ more testout/results.txt
plasmidfinder Results
Organism(s): Enterobacteriaceae,Gram Positive
****************************************************************************************
Enterobacteriaceae
**********************************************************************************************************************************
Plasmid       Identity  Query / Template length  Contig                       Position in contig  Note  Accession number
************************************************************************************************************************
IncHI1B(R27)  100       540 / 540                IncHI1B(R27)_1_R27_AF250878  1..540              R27   AF250878
==================================================================================================================================
****************************************************************************************
Gram Positive
****************************************************************************************************************
Plasmid       Identity  Query / Template length  Contig                       Position in contig  Note  Accession number
****************************************************************************************************************
- - - No hit found - - -
================================================================================================================
Extended Output:
# IncHI1B(R27)_AF250878
template: ATTCCAGAAAACCGATCTCTTTAAGCTGGCCCAGCGCCTTTTTAACCGTGGCATTCTGGT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: ATTCCAGAAAACCGATCTCTTTAAGCTGGCCCAGCGCCTTTTTAACCGTGGCATTCTGGT
template: TACCGAGGTGTGATGACAGTTGGAGTCGTCCACGAAGCCGATCGAATCCGATGCGGTAAA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: TACCGAGGTGTGATGACAGTTGGAGTCGTCCACGAAGCCGATCGAATCCGATGCGGTAAA
template: AGGTGCTCGGCAGCTCAGCCAGATACAGGTACAGGGCCTGTGCGGACTCCTTACGGGCCA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: AGGTGCTCGGCAGCTCAGCCAGATACAGGTACAGGGCCTGTGCGGACTCCTTACGGGCCA
template: GTTTTTGCAATGTCTTCAGGTAGAGTCGGGTTTTACCGTCGACGCGATACAGCGTATTGA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: GTTTTTGCAATGTCTTCAGGTAGAGTCGGGTTTTACCGTCGACGCGATACAGCGTATTGA
template: GCTTCGAATTTGGCTTGATGATGATTTTTCCCGTGGAACTGTCGTAATACGTCGATTCCA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: GCTTCGAATTTGGCTTGATGATGATTTTTCCCGTGGAACTGTCGTAATACGTCGATTCCA
template: CCAGGTGCATGTTTATCGTTATCTGATCATCTGTACCGGGTATTTTCTTAATAAATGAAA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: CCAGGTGCATGTTTATCGTTATCTGATCATCTGTACCGGGTATTTTCTTAATAAATGAAA
template: TGTTGGTCCGGGCTATACGCGTCAGCGAAGCATCAAAGCGCTCTTTCAGTTGTTTATCAA
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: TGTTGGTCCGGGCTATACGCGTCAGCGAAGCATCAAAGCGCTCTTTCAGTTGTTTATCAA
template: TGCGCTTGGTATCAAACCCACAAAATTTTGCAAACTCCGGAAAATTCAGCTCCAGCTGAC
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: TGCGCTTGGTATCAAACCCACAAAATTTTGCAAACTCCGGAAAATTCAGCTCCAGCTGAC
template: CTTCTGAATCAAGCGGCCGGTTAGACAACGCATAAACGATCCCACACCATGATTTGAAAT
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
query: CTTCTGAATCAAGCGGCCGGTTAGACAACGCATAAACGATCCCACACCATGATTTGAAAT
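The Identity column can be sanity-checked against an alignment like the one above. The sketch below assumes identity is defined as matching columns divided by alignment length, expressed as a percentage; PlasmidFinder's own value comes from the underlying search method (kma or blastn) and may be computed differently. The function name percent_identity is hypothetical.

```python
def percent_identity(template, query):
    """Percentage of alignment columns where template and query agree.

    Both strings must be the aligned sequences (equal length); a gap
    character '-' never counts as a match.
    """
    if len(template) != len(query):
        raise ValueError("aligned sequences must have equal length")
    if not template:
        raise ValueError("empty alignment")
    matches = sum(1 for t, q in zip(template, query) if t == q and t != "-")
    return 100.0 * matches / len(template)


# First alignment block from the extended output above.
template = "ATTCCAGAAAACCGATCTCTTTAAGCTGGCCCAGCGCCTTTTTAACCGTGGCATTCTGGT"
query = "ATTCCAGAAAACCGATCTCTTTAAGCTGGCCCAGCGCCTTTTTAACCGTGGCATTCTGGT"
print(percent_identity(template, query))
```

To check a whole hit, concatenate all template lines and all query lines before calling the function, so that the result covers the full aligned length rather than a single 60-base block.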
Reference: PlasmidFinder and pMLST: in silico detection and typing of plasmids. Carattoli A, Zankari E, Garcia-Fernandez A, Volby Larsen M, Lund O, Villa L, Aarestrup FM, Hasman H. Antimicrob. Agents Chemother. 2014. April 28th.