Modifying the Exomiser yml file

2023-08-30  重拾生活信心

exomiser-cli-13.2.0/examples/exome-analysis.yml

## Exomiser Analysis Template.
# These are all the possible options for running exomiser. Use this as a template for your own set-up.
---
analysis:
    # hg19 or hg38 - ensure that the application has been configured to run the specified assembly otherwise it will halt.
    genomeAssembly: hg19
    vcf: 
    ped:  # PED file; the first six columns carry the pedigree information
    proband: CY0619_02
    hpoIds: ['HP:0000407', 'HP:0001181', 'HP:0001249', 'HP:0001531', 'HP:0001824', 'HP:0002014', 'HP:0002017', 'HP:0002019', 'HP:0002027', 'HP:0002251', 'HP:0004322', 'HP:0005214', 'HP:0012719', 'HP:0100031', 'HP:0100806', 'HP:0200008']


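Inheritance modes: if the disease's mode of inheritance is known in advance, keep only that one.
The modes include autosomal dominant (AD) and autosomal recessive (homozygous alternate, HOM_ALT, or compound heterozygous, COMP_HET).
Each value is the maximum MAF (minor allele frequency), in percent.
That is, a variant can only be considered a candidate causative variant if its minor allele frequency is not too high.
Variants whose MAF exceeds the cutoff are filtered out, which reduces false positives.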
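
A minimal sketch of the single-mode case, assuming the disease is already known to be autosomal recessive: only the two recessive entries are kept, since the template's own comment says both HOM_ALT and COMP_HET must be present for AUTOSOMAL_RECESSIVE. The cutoffs shown are simply the template defaults, not recommendations.

```yaml
analysis:
    # Illustrative fragment only; the rest of the analysis block is omitted.
    # Values are maximum minor allele frequencies in percent.
    inheritanceModes: {
      AUTOSOMAL_RECESSIVE_HOM_ALT: 0.1,
      AUTOSOMAL_RECESSIVE_COMP_HET: 2.0
    }
    # To apply no frequency cut-offs at all, the template suggests an empty map instead:
    # inheritanceModes: {}
```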

    # These are the default settings, with values representing the maximum minor allele frequency in percent (%)
    # permitted for an allele to be considered as a causative candidate under that mode of inheritance.
    # If you just want to analyse a sample under a single inheritance mode, delete/comment-out the others.
    # For AUTOSOMAL_RECESSIVE or X_RECESSIVE ensure *both* relevant HOM_ALT and COMP_HET modes are present.
    # In cases where you do not want any cut-offs applied an empty map should be used e.g. inheritanceModes: {}

    inheritanceModes: {
      AUTOSOMAL_DOMINANT: 0.1,
      AUTOSOMAL_RECESSIVE_HOM_ALT: 0.1,
      AUTOSOMAL_RECESSIVE_COMP_HET: 2.0,
      X_DOMINANT: 0.1,
      X_RECESSIVE_HOM_ALT: 0.1,
      X_RECESSIVE_COMP_HET: 2.0,
      MITOCHONDRIAL: 0.2
    }
    # FULL or PASS_ONLY
    # PASS_ONLY keeps only the variants that pass all filters
    analysisMode: PASS_ONLY

  # Possible frequencySources:
      # Thousand Genomes project - http://www.1000genomes.org/ (THOUSAND_GENOMES)
      # TOPMed - https://www.nhlbi.nih.gov/science/precision-medicine-activities (TOPMED)
      # UK10K - http://www.uk10k.org/ (UK10K)
      # ESP project - http://evs.gs.washington.edu/EVS/ (ESP_)
      #   ESP_AFRICAN_AMERICAN, ESP_EUROPEAN_AMERICAN, ESP_ALL,
      # ExAC project http://exac.broadinstitute.org/about (EXAC_)
      #   EXAC_AFRICAN_INC_AFRICAN_AMERICAN, EXAC_AMERICAN,
      #   EXAC_SOUTH_ASIAN, EXAC_EAST_ASIAN,
      #   EXAC_FINNISH, EXAC_NON_FINNISH_EUROPEAN,
     #   EXAC_OTHER
     # gnomAD - http://gnomad.broadinstitute.org/ (GNOMAD_E, GNOMAD_G)
    
    frequencySources: [
        THOUSAND_GENOMES,
        TOPMED,
        UK10K,

        ESP_AFRICAN_AMERICAN, ESP_EUROPEAN_AMERICAN, ESP_ALL,

        EXAC_AFRICAN_INC_AFRICAN_AMERICAN, EXAC_AMERICAN,
        EXAC_SOUTH_ASIAN, EXAC_EAST_ASIAN,
        EXAC_FINNISH, EXAC_NON_FINNISH_EUROPEAN,
        EXAC_OTHER,

        GNOMAD_E_AFR,
        GNOMAD_E_AMR,
#        GNOMAD_E_ASJ,
        GNOMAD_E_EAS,
        GNOMAD_E_FIN,
        GNOMAD_E_NFE,
        GNOMAD_E_OTH,
        GNOMAD_E_SAS,

        GNOMAD_G_AFR,
        GNOMAD_G_AMR,
      #        GNOMAD_G_ASJ,
        GNOMAD_G_EAS,
        GNOMAD_G_FIN,
        GNOMAD_G_NFE,
        GNOMAD_G_OTH,
        GNOMAD_G_SAS
    ]
  # Possible pathogenicitySources: (POLYPHEN, MUTATION_TASTER, SIFT), (REVEL, MVP), CADD, REMM

  # REMM is trained on non-coding regulatory regions
  # *WARNING* if you enable CADD or REMM ensure that you have downloaded and installed the CADD/REMM tabix files
  # and updated their location in the application.properties. Exomiser will not run without this.
    
    pathogenicitySources: [ REVEL, MVP ]

This is the standard Exomiser order.
All steps are optional.

Filter by chromosomal interval: intervalFilter
Filter by quality: qualityFilter
Filter by variant effect [INTERGENIC_VARIANT, ...]: variantEffectFilter
Filter out known variants: knownVariantFilter
Filter by MAF: frequencyFilter
...
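
As a sketch of how these optional steps combine, assuming only variant filtering is wanted and no phenotype prioritisation: the fragment below keeps a few filters in the standard order and leaves out every prioritiser, which the template comments allow. The shortened variantEffectFilter list is an arbitrary illustration.

```yaml
analysis:
    # Illustrative fragment only; other analysis settings are omitted.
    steps: [
        failedVariantFilter: { },
        variantEffectFilter: {
          remove: [ UPSTREAM_GENE_VARIANT, DOWNSTREAM_GENE_VARIANT, INTERGENIC_VARIANT ]
        },
        frequencyFilter: {maxFrequency: 2.0},
        pathogenicityFilter: {keepNonPathogenic: true},
        # inheritanceFilter runs after all other filters, as in the template
        inheritanceFilter: {}
    ]
```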

    steps: [
      #intervalFilter: {interval: 'chr10:123256200-123256300'},
      # or for multiple intervals:
      #intervalFilter: {intervals: ['chr10:123256200-123256300', 'chr10:123256290-123256350']},
      # or using a BED file - NOTE this should be 0-based, Exomiser otherwise uses 1-based coordinates in line with VCF
      #intervalFilter: {bed: /full/path/to/bed_file.bed},
      #genePanelFilter: {geneSymbols: ['FGFR1','FGFR2']},
        failedVariantFilter: { },
      #qualityFilter: {minQuality: 50.0},
        variantEffectFilter: {
          remove: [
              FIVE_PRIME_UTR_EXON_VARIANT,
              FIVE_PRIME_UTR_INTRON_VARIANT,
              THREE_PRIME_UTR_EXON_VARIANT,
              THREE_PRIME_UTR_INTRON_VARIANT,
              NON_CODING_TRANSCRIPT_EXON_VARIANT,
              NON_CODING_TRANSCRIPT_INTRON_VARIANT,
              CODING_TRANSCRIPT_INTRON_VARIANT,
              UPSTREAM_GENE_VARIANT,
              DOWNSTREAM_GENE_VARIANT,
              INTERGENIC_VARIANT,
              REGULATORY_REGION_VARIANT
          ]
        },
        #knownVariantFilter: {}, #removes variants represented in the database
        frequencyFilter: {maxFrequency: 2.0},
        pathogenicityFilter: {keepNonPathogenic: true},
        #inheritanceFilter and omimPrioritiser should always run AFTER all other filters have completed
        #they will analyse genes according to the inheritanceModes specified above; UNDEFINED will not be analysed.
        inheritanceFilter: {},
        #omimPrioritiser isn't mandatory.
        omimPrioritiser: {},
        #priorityScoreFilter: {minPriorityScore: 0.4},
        #Other prioritisers: Only combine omimPrioritiser with one of these.
        #Don't include any if you only want to filter the variants.
        hiPhivePrioritiser: {},
        # or run hiPhive in benchmarking mode: 
        #hiPhivePrioritiser: {runParams: 'mouse'},
        #phivePrioritiser: {}
        #phenixPrioritiser: {}
        #exomeWalkerPrioritiser: {seedGeneIds: [11111, 22222, 33333]}
    ]
outputOptions:
    outputContributingVariantsOnly: false
    #numGenes options: 0 = all or specify a limit e.g. 500 for the first 500 results  
    numGenes: 0
    # Path to the desired output directory. Will default to the 'results' subdirectory of the exomiser install directory
    #outputDirectory: results
    # Filename for the output files. Will default to {input-vcf-filename}-exomiser
    outputFileName: Pfeiffer-hiphive-exome-PASS_ONLY
    #out-format options: HTML, JSON, TSV_GENE, TSV_VARIANT, VCF (default: HTML)
    outputFormats: [HTML, JSON, TSV_GENE, TSV_VARIANT, VCF]