比较基因组基因组组装便捷工具

工具 | 比较基因组 | WGDI

2022-07-05  本文已影响0人  shwzhao

看我不如看【参考】

参考:
WGDI | https://github.com/SunPengChuan/wgdi
WGDI | https://wgdi.readthedocs.io/en/latest/Introduction.html
bilibili | WGDI的简单使用(一)
bilibili | WGDI的简单使用(二)
简书 | xuzhougeng | 如何用WGDI进行共线性分析(上)
简书 | xuzhougeng | 如何用WGDI进行共线性分析(中)
简书 | xuzhougeng | 如何用WGDI进行共线性分析(下)
简书 | xuzhougeng | WGDI之深入理解blockinfo输出结果
简书 | xuzhougeng | 使用jcvi基于wgdi的共线性结果(-icl)绘制点图
简书 | 生信石头 | 点阵图 | 比较基因组分析之基石 - 手把手入门教程
简书 | 生信石头 | WGDI | 比较基因组分析神器,要啥有啥!
公众号 | 生信媛 | 使用WGDI分析全基因组复制事件

1. 下载

$ conda create -c bioconda -c conda-forge -n wgdi wgdi
$ conda activate wgdi

2. 所需文件

3. 运行

$ wgdi -d help >> total.conf
$ wgdi -d total.conf

每个参数都是【三步走】方式,运行简单

4. 逐个参数

原图地址:https://wgdi.readthedocs.io/en/latest/Introduction.html

后面古基因组和核型分析这块,我还不是很明白。

4. 直接生成全部配置文件

可以把关键信息传参导入,不用每次修改文件名了。

$ python3 createWGDIconf.py -h
usage: createWGDIconf.py [-h] [--prefix PREFIX] --blast BLAST --gff1 GFF1 [--gff2 [GFF2]] --len1 LEN1 [--len2 [LEN2]] [--genome_name1 GENOME_NAME1] [--genome_name2 GENOME_NAME2] [--cds CDS]
                         [--pep PEP]

OK

optional arguments:
  -h, --help            show this help message and exit
  --prefix PREFIX
  --blast BLAST
  --gff1 GFF1
  --gff2 [GFF2]
  --len1 LEN1
  --len2 [LEN2]
  --genome_name1 GENOME_NAME1
  --genome_name2 GENOME_NAME2
  --cds CDS
  --pep PEP
#!python3

import argparse


def create_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--prefix", default="test")
    parser.add_argument("--blast", required=True)
    parser.add_argument("--gff1", required=True)
    parser.add_argument("--gff2", nargs='?')
    parser.add_argument("--len1", required=True)
    parser.add_argument("--len2", nargs='?')
    parser.add_argument("--genome_name1", default="genome1")
    parser.add_argument("--genome_name2", default="genome2")
    parser.add_argument("--cds", default="cds.fa")
    parser.add_argument("--pep", default="pep.faa")
    return parser

def create_conf():
    d = """[dotplot]
blast = {blast}
gff1 =  {gff1}
gff2 =  {gff2}
lens1 = {len1}
lens2 = {len2}
genome1_name =  {genome_name1}
genome2_name =  {genome_name2}
multiple  = 1
score = 100
evalue = 1e-5
repeat_number = 10
position = order
blast_reverse = false
ancestor_left = none
ancestor_top = none
markersize = 0.5
figsize = 10,10
savefig = {prefix}.pdf

[dotplot]
blast = {blast}
gff1 =  {gff1}
gff2 =  {gff2}
lens1 = {len1}
lens2 = {len2}
genome1_name =  {genome_name1}
genome2_name =  {genome_name2}
multiple  = 1
score = 100
evalue = 1e-5
repeat_number = 10
position = order
blast_reverse = false
ancestor_left = none
ancestor_top = none
markersize = 0.5
figsize = 10,10
savefig = {prefix}.pdf

[collinearity]
gff1 = {gff1}
gff2 = {gff2}
lens1 = {len1}
lens2 = {len2}
blast = {blast}
blast_reverse = false
multiple  = 1
process = 8
evalue = 1e-5
score = 100
grading = 50,25,10
mg = 25,25
pvalue = 1
repeat_number = 10
positon = order
savefile = {prefix}.icl.collinearity.txt

[ks]
cds_file = {cds}
pep_file = {pep}
align_software = muscle
pairs_file = {prefix}.icl.collinearity.txt
ks_file = {prefix}.ks

[blockinfo]
blast = {blast}
gff1 =  {gff1}
gff2 =  {gff2}
lens1 = {len1}
lens2 = {len2}
collinearity = {prefix}.icl.collinearity.txt
score = 100
evalue = 1e-5
repeat_number = 20
position = order
ks = {prefix}.ks
ks_col = ks_NG86
savefile = {prefix}.block.csv

[correspondence]
blockinfo =  {prefix}.block.csv
lens1 = {len1}
lens2 = {len2}
tandem = false
tandem_length = 200
pvalue = 0.2
block_length = 5
multiple  = 1
homo = -1,1
savefile = {prefix}.block.c.csv

[blockks]
lens1 = len1
lens2 = {len2}
genome1_name =  {genome_name1}
genome2_name =  {genome_name2}
blockinfo = {prefix}.block.c.csv
pvalue = 0.2
tandem = false
tandem_length = 200
markersize = 1
area = 0,2
block_length =  5
figsize = 8,8
savefig = {prefix}.ks.c.pdf

[kspeaks]
blockinfo = {prefix}.block.c.csv
pvalue = 0.2
tandem = false
block_length = 5
ks_area = 0,10
multiple  = 1
homo = 0,1
fontsize = 9
area = 0,3
figsize = 10,6.18
savefig = {prefix}.kp.pdf
savefile = {prefix}.kp.medain.ks

[peaksfit]
blockinfo = {prefix}.block.c.csv
mode = median
bins_number = 200
ks_area = 0,10
fontsize = 9
area = 0,3
figsize = 10,6.18
shadow = true
savefig = {prefix}.pf.pdf

""".format(blast=args.blast,
    gff1=args.gff1,
    gff2=args.gff2,
    len1=args.len1,
    len2=args.len2,
    genome_name1=args.genome_name1,
    genome_name2=args.genome_name2,
    prefix=args.prefix,
    cds=args.cds,
    pep=args.pep)

    return(d)

if __name__ == "__main__":
    parser = create_parser()
    args = parser.parse_args()
    if args.len2 is None:
        args.len2=args.len1
    if args.gff2 is None:
        args.gff2=args.gff1
    conf = create_conf()
    print(conf)
上一篇下一篇

猜你喜欢

热点阅读