Genome estimation software: GenomeScope

2021-12-22  王梓维

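The web version of GenomeScope appears to be down: jobs stay in the running state and never return a result, so I installed it locally instead.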

conda install -c bioconda genomescope2
### This produced the following error
Package python conflicts for:
python=3.6
genomescope2 -> r-argparse -> python[version='2.7.*|3.4.*|3.5.*|3.6.*']
genomescope2 -> python[version='>=3.2|>=3.6']

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.31=0
  - feature:|@/linux-64::__glibc==2.31=0

Your installed version is: 2.31

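After some searching, this appears to be caused by a conflict between the conda-forge and defaults channels. Using conda-forge instead of defaults (alongside bioconda) resolved the problem: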

conda install -c conda-forge -c bioconda jellyfish genomescope2
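After installation, it may be worth confirming that both executables are actually on the PATH, since a package name does not always map to the program one expects. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, not part of conda, jellyfish, or GenomeScope.

```shell
# missing_tools TOOL...: print the name of every tool that is NOT found on PATH.
# (Hypothetical helper for illustration only.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Prints nothing if both programs used in this post are available.
missing_tools jellyfish genomescope2
```

If either name is printed, the corresponding program was not installed into the active environment.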

Use jellyfish to count 21-mers (the software recommends k = 21, but other values can be tried as well):

jellyfish count -C -m 21 -s 1000000000 -t 10 *.fastq -o 21.reads.jf
jellyfish histo -t 10 21.reads.jf > reads.21.histo
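Before handing the histogram to GenomeScope, a quick sanity check can be useful. The sketch below assumes the usual two-column `jellyfish histo` output (k-mer depth, number of distinct k-mers at that depth) and applies the classic naive estimate: total k-mer occurrences past the error valley divided by the depth of the main peak. The function name `naive_kmer_estimate` is hypothetical, and this is not the model GenomeScope fits; for a heterozygous genome the tallest peak may be the heterozygous one, in which case the size is overestimated.

```shell
# naive_kmer_estimate HISTO: rough genome size from a k-mer histogram.
# (Hypothetical helper; a back-of-the-envelope check, not GenomeScope's model.)
naive_kmer_estimate() {
  awk '
    { depth[NR] = $1; freq[NR] = $2 }
    END {
      if (NR == 0) { print "empty histogram" > "/dev/stderr"; exit 1 }
      # The first local minimum separates error k-mers from genomic k-mers.
      valley = 1
      for (i = 2; i <= NR; i++) {
        if (freq[i] > freq[i - 1]) { valley = i - 1; break }
      }
      # The main peak is the most frequent depth at or after the valley.
      peak = valley
      for (i = valley; i <= NR; i++) {
        if (freq[i] > freq[peak]) peak = i
      }
      # Sum k-mer occurrences from the valley onward, then divide by peak depth.
      total = 0
      for (i = valley; i <= NR; i++) total += depth[i] * freq[i]
      printf "valley=%d peak=%d size=%.0f\n", depth[valley], depth[peak], total / depth[peak]
    }
  ' "$1"
}

# Run on the histogram from this post if it exists in the current directory.
if [ -f reads.21.histo ]; then
  naive_kmer_estimate reads.21.histo
fi
```

If the reported peak depth is very low (close to the valley), the coverage is probably too shallow for a reliable model fit.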

Help output:

usage: /mnt/d/conda1/bin/genomescope2 [-h] [-v] [-i INPUT] [-o OUTPUT]
                                      [-p PLOIDY] [-k KMER_LENGTH]
                                      [-n NAME_PREFIX] [-l LAMBDA]
                                      [-m MAX_KMERCOV] [--verbose]
                                      [--no_unique_sequence] [-t TOPOLOGY]
                                      [--initial_repetitiveness INITIAL_REPETITIVENESS]
                                      [--initial_heterozygosities INITIAL_HETEROZYGOSITIES]
                                      [--transform_exp TRANSFORM_EXP]
                                      [--testing] [--true_params TRUE_PARAMS]
                                      [--trace_flag] [--num_rounds NUM_ROUNDS]

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         print the version and exit
  -i INPUT, --input INPUT
                        input histogram file
  -o OUTPUT, --output OUTPUT
                        output directory name
  -p PLOIDY, --ploidy PLOIDY
                        ploidy (1, 2, 3, 4, 5, or 6) for model to use [default
                        2]
  -k KMER_LENGTH, --kmer_length KMER_LENGTH
                        kmer length used to calculate kmer spectra [default
                        21]
  -n NAME_PREFIX, --name_prefix NAME_PREFIX
                        optional name_prefix for output files
  -l LAMBDA, --lambda LAMBDA, --kcov LAMBDA, --kmercov LAMBDA
                        optional initial kmercov estimate for model to use
  -m MAX_KMERCOV, --max_kmercov MAX_KMERCOV
                        optional maximum kmer coverage threshold (kmers with
                        coverage greater than max_kmercov are ignored by the
                        model)
  --verbose             optional flag to print messages during execution
  --no_unique_sequence  optional flag to turn off yellow unique sequence line
                        in plots
  -t TOPOLOGY, --topology TOPOLOGY
                        ADVANCED: flag for topology for model to use
  --initial_repetitiveness INITIAL_REPETITIVENESS
                        ADVANCED: flag to set initial value for repetitiveness
  --initial_heterozygosities INITIAL_HETEROZYGOSITIES
                        ADVANCED: flag to set initial values for nucleotide
                        heterozygosity rates
  --transform_exp TRANSFORM_EXP
                        ADVANCED: parameter for the exponent when fitting a
                        transformed (x**transform_exp*y vs. x) kmer histogram
                        [default 1]
  --testing             ADVANCED: flag to create testing.tsv file with model
                        parameters
  --true_params TRUE_PARAMS
                        ADVANCED: flag to state true simulated parameters for
                        testing mode
  --trace_flag          ADVANCED: flag to turn on printing of iteration
                        progress of nlsLM function
  --num_rounds NUM_ROUNDS
                        ADVANCED: parameter for the number of optimization
                        rounds

Based on this usage:

genomescope2 -i reads.21.histo -o 21 -k 21

The results are written to a directory named 21.
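To look at the numeric estimates without opening the plots, the text summary can be printed from the shell. This sketch assumes the output directory contains a file named `summary.txt` (or `PREFIX_summary.txt` when `-n` is given); that file name is an assumption not confirmed by this post, and `show_summary` is a hypothetical helper.

```shell
# show_summary DIR: print any *summary.txt files found in DIR.
# (Hypothetical helper; the summary file name is an assumption.)
show_summary() {
  for f in "$1"/*summary.txt; do
    # If the glob matched nothing, $f is the literal pattern and is skipped.
    if [ -f "$f" ]; then
      cat "$f"
    fi
  done
}

# Prints nothing if the directory or the summary file does not exist.
show_summary 21
```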

For more details, see the software's manual: GitHub - schatzlab/genomescope: Fast genome analysis from unassembled short reads
