
CNV Detection Literature Notes (CODEX)

2018-09-26  井底蛙蛙呱呱呱
[Figure: biases in CNV detection]
Steps:

Sample selection and target filtering

Read depth normalization

Due to the extremely high level of systemic bias in WES data, normalization is crucial in WES CNV calling.
CODEX’s multi-sample normalization model takes three inputs: the WES depth of coverage, the exon-wise GC content, and the sample-wise total number of reads.
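To make the three inputs concrete, here is a minimal sketch of how they could be assembled from a small exon-by-sample count matrix. It is not CODEX's code: the function names (`gc_fraction`, `sample_totals`, `baseline_expected`) and the toy counts are hypothetical, and the "expected" depth below is a deliberately simplified baseline (sample total times a per-exon median rate) that leaves out the GC-bias and latent-factor terms of the real model.

```python
from statistics import median


def gc_fraction(seq):
    """Fraction of G/C bases in an exon sequence (case-insensitive)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def sample_totals(depth):
    """depth[i][j] is the read count of exon i in sample j.
    Returns the total number of reads N_j for each sample."""
    n_samples = len(depth[0])
    return [sum(row[j] for row in depth) for j in range(n_samples)]


def baseline_expected(depth):
    """Simplified expected counts: lambda_ij = N_j * beta_i, where beta_i is
    the median of Y_ij / N_j across samples (an assumption for illustration;
    GC bias and latent factors are intentionally omitted)."""
    totals = sample_totals(depth)
    expected = []
    for row in depth:
        beta = median(y / n for y, n in zip(row, totals))
        expected.append([n * beta for n in totals])
    return expected


# Toy data: 3 exons x 3 samples (made-up numbers).
depth = [
    [100, 200, 150],
    [50, 100, 75],
    [50, 100, 25],  # the third sample is low at this exon
]
totals = sample_totals(depth)
expected = baseline_expected(depth)
```

Comparing `depth[2][2]` (25 reads) with `expected[2][2]` shows the kind of observed-versus-expected gap that the later CNV calling step works on.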

Poisson latent factors and choice of K

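Some sources of bias in CNV detection can be measured directly (e.g. GC content, mappability, exon size). Others are hard to measure directly, such as biases introduced by capture, library preparation, sequencing, or the samples themselves; these are called latent factors.
The number of latent factors K is a critical parameter: if it is too large, the model tends to suppress the signal from real CNVs; if it is too small, it fails to remove artifacts, which then interfere with the results.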
CODEX evaluates the choice of K with two statistics, the Akaike information criterion (AIC) and the Bayes information criterion (BIC):

$$\mathrm{AIC} = 2k - 2\ln L, \qquad \mathrm{BIC} = k\ln n - 2\ln L$$

where L is the likelihood for the estimated model, k is the number of parameters in the model and n is the number of data points.

In the end, the BIC is used to determine K.
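As a rough illustration of choosing K by BIC, the sketch below uses the textbook definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, under which lower values are better (some sources flip the sign, so check the convention of whatever tool you use). The log-likelihoods and parameter counts in `fits` are invented numbers, and `choose_k_by_bic` is a hypothetical helper, not part of CODEX.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayes information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


def choose_k_by_bic(fits, n):
    """fits maps K -> (log-likelihood, number of parameters).
    Returns the K with the smallest BIC."""
    return min(fits, key=lambda K: bic(fits[K][0], fits[K][1], n))


# Hypothetical fits for K = 1, 2, 3 latent factors on n = 1000 data points.
fits = {1: (-5600.0, 110), 2: (-5000.0, 220), 3: (-4990.0, 330)}
n_points = 1000
best_k = choose_k_by_bic(fits, n_points)
```

In this made-up example the jump from K = 1 to K = 2 improves the likelihood a lot, while K = 3 adds many parameters for little gain, so the penalty term makes K = 2 the choice.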

Both CoNIFER and XHMM (28) use latent factor models to remove systemic bias, but their models assume continuous measurements with a Gaussian noise structure. CODEX is instead based on a Poisson log-linear model, which is more suitable for modeling the discrete counts in WES data, especially when there is high variance in depth of coverage between exons.
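One way to see why the count model matters: for a Poisson count the variance equals the mean, so the relative noise depends on the depth of each exon, whereas a single Gaussian noise level would treat all exons alike. The small calculation below only illustrates this property of the Poisson distribution; `poisson_relative_sd` is a hypothetical helper name.

```python
import math


def poisson_relative_sd(mean_depth):
    """For a Poisson count the variance equals the mean,
    so SD / mean = sqrt(mean) / mean = 1 / sqrt(mean)."""
    return math.sqrt(mean_depth) / mean_depth


# Relative noise shrinks as expected depth grows.
for mean_depth in (10, 100, 1000):
    print(mean_depth, round(poisson_relative_sd(mean_depth), 3))
```

An exon expected at 10 reads fluctuates by roughly 32% of its mean, one at 1000 reads by roughly 3%, so the same absolute deviation carries very different evidence at different exons.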

CNV detection and copy number estimation

Proper normalization sets the stage for accurate segmentation and CNV calling. For germline CNV detection in normal samples, many CNVs are short and extend over only one or two exons. In this case, simple gene- or exon-level thresholding is sufficient.
For longer CNVs and for copy number estimation in tumors, where the events are expected to be large and exhibit nested structure, the authors propose a Poisson likelihood-based recursive segmentation algorithm.
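The idea of scoring a candidate segment with a Poisson likelihood can be sketched as follows. This is a single-pass scan over short windows in one sample, not the recursive algorithm from the paper; a recursive version would repeat the search on the regions left after each call. The function names, the `max_width` parameter, the toy counts, and the "copy number ≈ 2 × ratio" reading (which assumes a diploid baseline) are all assumptions made for illustration. A window of width 1 corresponds to the simple exon-level case.

```python
import math


def segment_lr(observed, expected):
    """Poisson log-likelihood ratio for one segment.
    H0: Y_i ~ Poisson(lambda_i); H1: Y_i ~ Poisson(mu * lambda_i),
    with mu estimated by its MLE sum(Y) / sum(lambda).
    Returns (log-likelihood ratio, mu)."""
    y, lam = sum(observed), sum(expected)
    mu = y / lam
    log_term = y * math.log(mu) if y > 0 else 0.0
    return log_term - (mu - 1.0) * lam, mu


def best_segment(observed, expected, max_width=5):
    """Scan all windows up to max_width exons and return
    (lr, start, end, mu) for the highest-scoring window [start, end)."""
    best = (0.0, None, None, 1.0)
    n = len(observed)
    for start in range(n):
        for end in range(start + 1, min(start + max_width, n) + 1):
            lr, mu = segment_lr(observed[start:end], expected[start:end])
            if lr > best[0]:
                best = (lr, start, end, mu)
    return best


# Toy example: six exons, each expected at 100 reads after normalization;
# exons 2 and 3 show roughly half the expected depth.
expected = [100.0] * 6
observed = [98, 103, 52, 47, 101, 99]
lr, start, end, mu = best_segment(observed, expected)
copy_number = 2 * mu  # assumes a diploid baseline
```

Here the scan picks exons 2 to 3 with a ratio close to 0.5, which under the diploid assumption reads as about one copy, i.e. a candidate heterozygous deletion.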

Discussion

The distinguishing features of CODEX compared to existing methods are:

Reference: CODEX: a normalization and copy number variation detection method for whole exome sequencing.
