HiNT & hic_breakfinder: detecting chromosomal translocations in Hi-C data
2020-09-07
caokai001
References:
1. HiNT: supports only the restriction enzymes DpnII, MboI, and HindIII; published in Genome Biology
HiNT: a computational method for detecting copy number variations and translocations from Hi-C data
GitHub for HiNT: https://github.com/parklab/HiNT

HiNT includes three subcommands:
HiNT-PRE
HiNT-CNV
HiNT-TL
$ hint -h
usage: hint [-h] [--version] {pre,cnv,tl} ...
HiNT --- Hic for copy Number variations and Translocations detection
positional arguments:
{pre,cnv,tl} sub-command help
pre HiNT for Hi-C data preprocessing: raw Hi-C --> HiC contact matrix and chimeric
read pairs
cnv Copy Number Vairations detection from Hi-C
tl Identify translocated chromosomal pairs, and detect the breakpoints in 100kb as
well as 1bp resolution
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
For command line options of each command, type: hint COMMAND -h
- Output files
In the HiNT-TL output directory, you will find
jobname_Translocation_IntegratedBP.txt: the final integrated translocation breakpoint
jobname_chrompairs_rankProduct.txt: rank product predicted potential translocated chromosome pairs
otherFolders: intermediate files used to identify the translocation breakpoints
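To check quickly whether a HiNT-TL run has produced its two main result files, the naming pattern above can be turned into a small helper. This is only a sketch: the function names are made up for illustration, and it assumes both files sit directly in the output directory, as the description above suggests.

```python
from pathlib import Path


def hint_tl_outputs(outdir, jobname):
    """Return the expected paths of the two main HiNT-TL result files."""
    outdir = Path(outdir)
    return {
        "breakpoints": outdir / f"{jobname}_Translocation_IntegratedBP.txt",
        "chrom_pairs": outdir / f"{jobname}_chrompairs_rankProduct.txt",
    }


def missing_outputs(outdir, jobname):
    """List which of the expected result files do not exist yet."""
    expected = hint_tl_outputs(outdir, jobname)
    return [key for key, path in expected.items() if not path.is_file()]


if __name__ == "__main__":
    # "HiNT_output" and "B1" are placeholder values for illustration.
    for key, path in hint_tl_outputs("HiNT_output", "B1").items():
        print(key, path)
```

An empty list from `missing_outputs` means both files are present.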
2. hic_breakfinder: https://github.com/dixonlab/hic_breakfinder; published in Nature Genetics (NG)
Installing jsoncpp with cmake
Installing bamtools
Ubuntu: installing Eigen3
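Three main input files are required:
--bam-file: the BAM file with the alignment results
--exp-file-inter: download from the Box storage location https://salkinstitute.app.box.com/s/m8oyv2ypf8o3kcdsybzcmrpg032xnrgx
--exp-file-intra: download from the Box storage location https://salkinstitute.app.box.com/s/m8oyv2ypf8o3kcdsybzcmrpg032xnrgx
The connection is slow, so the download takes a while.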

### Help message:
/public/home/kcao/tools/HiCTranslocations/hic_breakfinder/hic_breakfinder_done/bin/hic_breakfinder
Required options:
--bam-file [input bam file]
--exp-file-inter [Inter-chromosomal 1Mb expectation file]
--exp-file-intra [Intra-chromosomal 100kb expectation file]
--name [output file name prefix, will append with *.super_matrix.txt and *.SR.txt]
HiNT in practice
Goal: detect translocations from bedpe or bam files
Step1: Install HiNT:
Reference: https://github.com/parklab/HiNT#dependencies
## R package dependencies
mgcv, strucchange, doParallel, Cairo, foreach
## Other tools
pip install HiNT-Package
conda install -c bioconda pairix
Step2: Download the required input files
Download reference files used in HiNT HERE
- Download HiNT references HERE. Only hg19, hg38 and mm10 are available currently. Unzip it
$ unzip hg19.zip
- Download HiNT background matrices HERE. Only hg19, hg38 and mm10 are available currently. Unzip it
$ unzip hg19.zip

Step3: Run
### Help message
$ hint tl -h
usage: hint tl [-h] -m MATRIXFILE --refdir REFERENCEDIR [-e {DpnII,MboI,HindIII}]
[-f {cooler,juicer}] --ppath PAIRIXPATH [-g {hg38,hg19,mm10}] [--chimeric CHIMERIC]
--backdir BACKGROUNDINTERCHROMMATRIXDIR [-c CUTOFF] [-o OUTDIR] [-n NAME]
[-p THREADS]
### Run
$ nohup hint tl -m /public/home/kcao/Work/Pyroptosis_3D/20200794_Hi-C/03_juicer_format/B1_L7_A001.inter.clean.bedpe.hic -f juicer --refdir ./hg19_ref/ --backdir ./hg19_bg/ -g hg19 -n B1 -c 0.05 --ppath ~/miniconda3/envs/R_env/bin/pairix -p 12 -o HiNTtransl_juicerOUTPUT &
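When the same `hint tl` call has to be repeated for several samples, it can help to assemble the command in a script. The sketch below uses only the options listed in the help message above; the function name `build_hint_tl_cmd` and the file name `sample.hic` are illustrative and not part of HiNT.

```python
import shlex

# Choices copied from the `hint tl -h` usage line above.
GENOMES = {"hg38", "hg19", "mm10"}
FORMATS = {"cooler", "juicer"}


def build_hint_tl_cmd(matrix, refdir, backdir, pairix, genome="hg19",
                      fmt="juicer", name="B1", cutoff=0.05, threads=12,
                      outdir="HiNTtransl_juicerOUTPUT"):
    """Assemble the argument list for a `hint tl` call."""
    if genome not in GENOMES:
        raise ValueError(f"unsupported genome: {genome}")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported matrix format: {fmt}")
    return [
        "hint", "tl",
        "-m", str(matrix),
        "-f", fmt,
        "--refdir", str(refdir),
        "--backdir", str(backdir),
        "-g", genome,
        "-n", name,
        "-c", str(cutoff),
        "--ppath", str(pairix),
        "-p", str(threads),
        "-o", str(outdir),
    ]


if __name__ == "__main__":
    # "sample.hic" stands in for the real .hic matrix path.
    cmd = build_hint_tl_cmd("sample.hic", "./hg19_ref/", "./hg19_bg/", "pairix")
    print(shlex.join(cmd))  # shell-ready string; requires Python 3.8+
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with long paths.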
Step4: Result files
The output reports 11 translocation breakpoints.


hic_breakfinder in practice
Goal: detect translocation breakpoints from a bedpe file
- Step1: Prepare the input files
## step1.1: Convert to BAM format (bedtools bedpeToBam: converts bedpe to bam)
$ echo "bedpeToBam -i /public/home/kcao/Work/Pyroptosis_3D/20200794_Hi-C/02_dlo_hic_result/B1_L7_A001_result/03.NoiseReduce/B1_L7_A001.clean.bedpe -g /public/home/kcao/Work/Pyroptosis_3D/20200794_Hi-C/02_dlo_hic_result/B1_L7_A001_result/0a.EnzymeFragment/B1_L7_A001.TTAA.ChrSize.bed > B1_L7_A001.bam"| qsub -d . -N bedpetobam
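Before converting, it can be worth checking that the bedpe file looks sane, for example by counting intra- and inter-chromosomal pairs. The sketch below assumes the standard BEDPE layout (tab-separated, chrom1 in column 1 and chrom2 in column 4). The `.clean.bedpe` file used here may deviate from that, so treat this as an assumption to verify.

```python
def count_pair_types(lines):
    """Count intra- and inter-chromosomal pairs in BEDPE-formatted lines.

    Assumes the standard BEDPE layout: chrom1, start1, end1, chrom2, start2, end2, ...
    """
    intra = 0
    inter = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 columns, got {len(fields)}")
        if fields[0] == fields[3]:
            intra += 1
        else:
            inter += 1
    return intra, inter


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        "chr1\t100\t150\tchr1\t5000\t5050\tpair1\t60\t+\t-\n",
        "chr1\t200\t250\tchr7\t900\t950\tpair2\t60\t+\t+\n",
    ]
    print(count_pair_types(example))
```

For a real file, pass the open file handle: `count_pair_types(open(path))`.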
- Step2: Download the Inter-chromosomal 1Mb expectation file and the Intra-chromosomal 100kb expectation file
## Tip: the download is fairly slow
https://salkinstitute.app.box.com/s/m8oyv2ypf8o3kcdsybzcmrpg032xnrgx
- Step3: Run the program
$ echo "/public/home/kcao/tools/HiCTranslocations/hic_breakfinder/hic_breakfinder_done/bin/hic_breakfinder --bam-file B1_L7_A001.bam --exp-file-inter inter_expect_1Mb.hg19.txt --exp-file-intra intra_expect_100kb.hg19.txt --name hic_breakfinder_B1_output" | qsub -d . -N hic_breakfinder
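Since the expectation files take a long time to download, it is easy to submit a job before they are complete. A small pre-flight check like the one below can catch that. The helper names are hypothetical, and only the four required options from the help message are used.

```python
from pathlib import Path


def build_breakfinder_cmd(binary, bam, exp_inter, exp_intra, name):
    """Assemble the argument list for hic_breakfinder from its four required options."""
    return [
        str(binary),
        "--bam-file", str(bam),
        "--exp-file-inter", str(exp_inter),
        "--exp-file-intra", str(exp_intra),
        "--name", name,
    ]


def missing_inputs(bam, exp_inter, exp_intra):
    """Return the input paths that do not exist as regular files."""
    return [str(p) for p in (bam, exp_inter, exp_intra) if not Path(p).is_file()]


if __name__ == "__main__":
    inputs = ("B1_L7_A001.bam", "inter_expect_1Mb.hg19.txt", "intra_expect_100kb.hg19.txt")
    missing = missing_inputs(*inputs)
    if missing:
        print("missing input files:", ", ".join(missing))
    else:
        print(" ".join(build_breakfinder_cmd("hic_breakfinder", *inputs,
                                             name="hic_breakfinder_B1_output")))
```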
- Step4: Results
The columns of each output line have the following meaning:
1 - Log-odds score of the rearrangement call. This can be thought of as the "strength" of the call.
2 - column chromosome
3 - column start
4 - column end
5 - column strand
6 - row chromosome
7 - row start
8 - row end
9 - row strand
10 - resolution of the call (minimal bin size for which this call is made).
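The ten columns above map naturally onto a small record type. The following is a minimal parsing sketch, assuming the fields are whitespace-separated and that the resolution is a text label. The example line is made up for illustration and is not real output.

```python
from dataclasses import dataclass


@dataclass
class BreakCall:
    score: float       # column 1: log-odds score
    col_chrom: str     # column 2
    col_start: int     # column 3
    col_end: int       # column 4
    col_strand: str    # column 5
    row_chrom: str     # column 6
    row_start: int     # column 7
    row_end: int       # column 8
    row_strand: str    # column 9
    resolution: str    # column 10: kept as text

    @property
    def is_interchromosomal(self):
        return self.col_chrom != self.row_chrom


def parse_break_line(line):
    """Parse one 10-column result line into a BreakCall."""
    f = line.split()
    if len(f) != 10:
        raise ValueError(f"expected 10 columns, got {len(f)}")
    return BreakCall(float(f[0]), f[1], int(f[2]), int(f[3]), f[4],
                     f[5], int(f[6]), int(f[7]), f[8], f[9])


if __name__ == "__main__":
    # Hypothetical line, for illustration only.
    call = parse_break_line("123.4\tchr9\t133500000\t133700000\t+\tchr22\t23200000\t23400000\t-\t100kb")
    print(call.score, call.col_chrom, call.row_chrom, call.is_interchromosomal)
```

Sorting the parsed calls by `score` in descending order gives the strongest calls first.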

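Thoughts:
- Different tools report different numbers of translocations; hic_breakfinder and HiNT perform somewhat better than HiCTrans.
- Installing the software is fairly troublesome, especially hic_breakfinder, whose compilation kept failing with errors.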
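One simple way to compare tools is to reduce each tool's calls to unordered chromosome pairs and look at the overlap. The sketch below assumes the chromosome pairs have already been extracted from each tool's output; the pairs in the example are invented.

```python
def normalize_pair(chrom_a, chrom_b):
    """Return the pair in a fixed order so (chr9, chr22) equals (chr22, chr9)."""
    return tuple(sorted((chrom_a, chrom_b)))


def compare_calls(calls_a, calls_b):
    """Compare two collections of chromosome pairs from different tools."""
    a = {normalize_pair(*pair) for pair in calls_a}
    b = {normalize_pair(*pair) for pair in calls_b}
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


if __name__ == "__main__":
    # Invented pairs, for illustration only.
    tool_a = [("chr9", "chr22"), ("chr1", "chr7")]
    tool_b = [("chr22", "chr9"), ("chr3", "chr5")]
    result = compare_calls(tool_a, tool_b)
    for key in ("shared", "only_a", "only_b"):
        print(key, sorted(result[key]))
```

This compares at the chromosome-pair level only; matching individual breakpoints would additionally need a position tolerance.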
Comments and discussion are welcome~😀