
Hi-C literature guide 3: Hi-C analysis

2020-10-06  小贝学生信

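The article first reviews why 3D genomics was bound to emerge, how the field moved from 3C, 4C and 5C to Hi-C, and the basic steps of a Hi-C experiment. These points are not repeated here; see the earlier literature guides.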

1. Current characteristics of Hi-C

1.1 Multiple scales

(1) large scale: A/B compartment

The genome is organized in two distinct compartments.

(2) fine scale: TAD

Regions characterized by high intradomain contact frequency and reduced interdomain contacts.

(3) finer scale: loop

1.2

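As described above, moving from compartments to TADs and then to loops requires progressively higher resolution (that is, a smaller bin size).
How to improve resolution is therefore an important question. The article mainly discusses it from the following two angles:

(1) the restriction enzymes
(2) sequencing depth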

The contact maps at 40 kb resolution can clearly highlight topological domains
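
To make the link between resolution and bin size concrete, here is a minimal sketch. The function names and the 100 Mb chromosome length are illustrative assumptions, not values from the article.

```python
def bin_index(position, resolution):
    """Return the 0-based bin that a 0-based genomic position falls into."""
    return position // resolution


def n_bins(chrom_length, resolution):
    """Number of bins needed to cover a chromosome (the last bin may be partial)."""
    return -(-chrom_length // resolution)  # ceiling division


# Hypothetical 100 Mb chromosome, used purely for illustration.
chrom_length = 100_000_000
for resolution in (1_000_000, 40_000, 5_000):
    bins = n_bins(chrom_length, resolution)
    # A dense intra-chromosomal matrix has bins * bins cells.
    print(resolution, bins, bins * bins)
```

Because the number of matrix cells grows quadratically as the bin size shrinks, the same number of read pairs is spread ever more thinly, which is one way to see why sequencing depth limits the achievable resolution.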

1.3 Future

As more and more datasets become available, it will become increasingly important to establish common and standardized procedures to assess data quality and reproducibility of replicates.

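2. Hi-C analysis workflow

Put simply, Hi-C analysis can be split into two major steps: raw data → Hi-C contact matrix → downstream analysis
(In that sense it resembles RNA-seq: first derive an expression matrix from the raw sequencing data, then run the downstream analysis.)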


2.1 Raw data (fastq) → Hi-C contact matrix

(1) Alignment: reads are aligned to the reference genome
(2) Quality control: read pairs are filtered to remove spurious signal
(3) Binning: set the bin size
(4) Normalization
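
The binning and normalization steps can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the implementation of any specific tool: `pairs` is assumed to hold already aligned and filtered read pairs on a single chromosome as 0-based positions, the function names are hypothetical, and `balance` is an iterative-correction-style matrix balancing that omits the bin filtering real pipelines apply and is not guaranteed to converge on every input.

```python
def build_contact_matrix(pairs, resolution, n_bins):
    """Count read pairs per pair of bins into a symmetric n_bins x n_bins matrix."""
    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in pairs:
        i, j = pos1 // resolution, pos2 // resolution
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


def balance(matrix, n_iter=50):
    """Repeatedly divide each cell by the relative coverage of its row and column bins."""
    n = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy
    for _ in range(n_iter):
        sums = [sum(row) for row in m]
        nonzero = [s for s in sums if s > 0]
        if not nonzero:
            break  # empty matrix, nothing to balance
        mean = sum(nonzero) / len(nonzero)
        # Bias of each bin relative to the mean coverage; empty bins are left untouched.
        bias = [s / mean if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                m[i][j] /= bias[i] * bias[j]
    return m


# Toy input: three read pairs binned at a made-up resolution of 40 bp.
raw = build_contact_matrix([(10, 50), (10, 15), (45, 90)], resolution=40, n_bins=3)
print(raw)

# Toy matrix whose bins differ only by a per-bin bias (1, 2 and 4).
biased = [[1.0, 2.0, 4.0], [2.0, 4.0, 8.0], [4.0, 8.0, 16.0]]
balanced = balance(biased)
print([round(sum(row), 6) for row in balanced])
```

After balancing, the rows of the second toy matrix have equal sums, which is the property matrix-balancing normalization aims for: every bin ends up with the same total visibility.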

A still-open problem is the normalization of Hi-C data originating from genomes with copy number alterations.
An earlier work proposed a solution with an additional scaling factor, applied on top of
ICE normalization, to correct for aneuploidies with whole chromosome duplications or deletions.
Recent publications instead proposed more generalizable solutions that add a correction factor to matrix-balancing normalization to model and adjust for the effect of local copy number variations.

2.2 Hi-C contact matrix → Downstream analysis

These are the three levels of analysis described in 1.1: compartments, TADs and loops.

(1) Tools to call compartments
(2) TAD callers
(3) Interaction callers
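
As an illustration of what a TAD caller looks for, the sketch below uses an insulation-score-style heuristic: for each junction between neighbouring bins it averages the contacts that cross the junction, and a local minimum suggests a domain boundary. This is one common idea, not the method of any particular tool covered by the article; the function names, the window size and the toy matrix are assumptions made for the example.

```python
def insulation_scores(matrix, window):
    """Mean contact frequency crossing the junction just before each bin.

    For bin b, contacts between the `window` bins upstream of the junction
    (b-window .. b-1) and the `window` bins downstream (b .. b+window-1) are
    averaged. Positions too close to the matrix edge get None.
    """
    n = len(matrix)
    scores = []
    for b in range(n):
        if b - window < 0 or b + window > n:
            scores.append(None)
            continue
        total = 0.0
        for i in range(b - window, b):
            for j in range(b, b + window):
                total += matrix[i][j]
        scores.append(total / (window * window))
    return scores


def boundary_candidates(scores):
    """Return bins whose score is a strict local minimum among defined neighbours."""
    candidates = []
    for b in range(1, len(scores) - 1):
        left, mid, right = scores[b - 1], scores[b], scores[b + 1]
        if left is None or mid is None or right is None:
            continue
        if mid < left and mid < right:
            candidates.append(b)
    return candidates


# Toy 8-bin matrix with two made-up domains (bins 0-3 and 4-7):
# high contact frequency inside a domain, low frequency between domains.
toy = [[10 if (i < 4) == (j < 4) else 1 for j in range(8)] for i in range(8)]
scores = insulation_scores(toy, window=2)
print(scores)
print(boundary_candidates(scores))
```

On the toy matrix the only candidate is the junction before bin 4, which is where the two simulated domains meet. Real data is noisy, so actual callers add smoothing, thresholds or statistical models on top of this kind of signal.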

3. Data formats and visualization

3.1 Data formats

3.2 Visualization

(1) Juicebox is available both as a desktop application and as a cloud-based web application named Juicebox.js. It loads matrices in '.hic' format, and its strengths are its intuitive interface and ease of use.
(2) gcMapExplorer is a Python software featuring a GUI that loads data in the '.gcmap' format; it also performs different types of normalizations on raw matrices.
(3) HiGlass is available as a docker container and loads matrices in '.cool' format. It allows sophisticated customization of the layout by juxtaposing panels with multiple maps at the desired zoom levels, along with other genomic data.

Juicebox and HiGlass allow sharing a session via a URL or a JSON representation, respectively, which can also be easily hosted at web sites.
