Crack4: A detailed walkthrough of working out the bacterial Genome from a genome sequencing report

2021-05-23  RashidinAbdu

Background:

1. Concept:

To calculate genome coverage, divide the number of bases sequenced by the estimated genome size, then multiply the result by the fraction of reads placed in contigs, as in the following example:

1,514,603,088 / 2,100,000 x (96% of reads placed) = 692x 
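
The arithmetic in this example can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `genome_coverage` and its parameter names are illustrative choices, not taken from any sequencing tool, and the input checks are an added assumption about what counts as a sensible value.

```python
def genome_coverage(bases_sequenced: int, genome_size: int, fraction_placed: float) -> float:
    """Return (bases sequenced / genome size) * fraction of reads placed in contigs."""
    # Hypothetical sanity checks, not part of the original formula
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    if not 0.0 <= fraction_placed <= 1.0:
        raise ValueError("fraction_placed must be between 0 and 1")
    return bases_sequenced / genome_size * fraction_placed


# Numbers from the example: 1,514,603,088 bases, 2,100,000 bp genome, 96% of reads placed
example = genome_coverage(1_514_603_088, 2_100_000, 0.96)
print(f"{example:.0f}x")  # prints "692x"
```

Note that the percentage has to be passed as a fraction (0.96, not 96), otherwise the result is off by a factor of 100.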
2. How to calculate:
# Values taken from the genome sequencing report
number_of_bases_sequenced = 1377024000
estimated_genome_size = 4109798
# Fraction of reads placed in contigs, computed from two values in the report
reads_placed_in_contigs = 1317332892 / 1377024000

# Coverage = (bases sequenced / genome size) * fraction of reads placed in contigs
genome_coverage = (number_of_bases_sequenced / estimated_genome_size) * reads_placed_in_contigs

# Print the placed fraction as a percentage, with 2 decimal places
print("%reads_placed_in_contigs=", "{:.2f}".format(reads_placed_in_contigs * 100), "%")
# Final genome coverage, with 2 decimal places
print("genome_coverage=", "{:.2f}".format(genome_coverage))

Running this gives the following result (shown as a screenshot in the original post):

[image.png]

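3. So the question is: how do you find the corresponding values in the genome report?

They are located as shown below (three screenshots in the original post):

[image.png] [image.png]
[image.png]

The calculation based on these values is therefore: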


# To calculate the genome coverage, divide the number of bases sequenced by the estimated genome size,
# then multiply by the fraction of reads placed in contigs.
# Example: 1,514,603,088 / 2,100,000 x (96% of reads placed) = 692x

number_of_bases_sequenced = 1377024000
estimated_genome_size = 4109798
reads_placed_in_contigs = 1317332892 / 1377024000

genome_coverage = (number_of_bases_sequenced / estimated_genome_size) * reads_placed_in_contigs

# Print both values with 2 decimal places
print("%reads_placed_in_contigs=", "{:.2f}".format(reads_placed_in_contigs * 100), "%")
print("genome_coverage=", "{:.2f}".format(genome_coverage), "x")
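
One thing worth noticing about the calculation above: because the placed fraction is itself divided by the number of bases sequenced, that term cancels, and the coverage reduces to the numerator of the fraction divided by the estimated genome size. The sketch below checks this with the same report values. The variable name `placed` is a hypothetical label for the value 1317332892, which the post does not name.

```python
import math

bases_sequenced = 1377024000  # number of bases sequenced, from the report
genome_size = 4109798         # estimated genome size, from the report
placed = 1317332892           # numerator of the placed fraction (hypothetical name)

# Two-step form used in the post
two_step = (bases_sequenced / genome_size) * (placed / bases_sequenced)
# Simplified form: the bases_sequenced term cancels out
direct = placed / genome_size

# Both forms agree up to floating-point rounding
assert math.isclose(two_step, direct, rel_tol=1e-12)
print(f"genome_coverage = {direct:.2f}x")  # prints "genome_coverage = 320.53x"
```

The two-step form is still useful in practice, since it also reports the percentage of reads placed, which is a quick quality indicator for the assembly.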