Hi-C三维基因组学

【三维基因组】在寻找差异盛行的时代—你以为DEseq仅仅用于寻找

2020-09-01  本文已影响0人  XuningFan

https://www.bioinfo-scrounger.com/archives/111/

DESEQ往往作为寻找差异基因的工具而被我们所熟知,但是在寻找差异盛行的年代,也可以用Deseq寻找差异contact。

不信?来看一下:

文章一:Alterations of specific chromatin conformation affect ATRA-induced leukemia cell differentiation


image.png
企业微信截图_1593404978784.png

文章二:Highly rearranged chromosomes reveal uncoupling between genome topology and gene expression


image.png image.png

那么如果我们想用bin pairs 做差异contact分析,该如何做呢?
首先以bin pair 为基本单位准备多个文库(2个左右)的raw_count_table,为每个文库准备好组别信息的condition文件。最终按照以下操作就大功告成啦~~

library(DESeq)
library(gplots)
## read raw count table
    #name  arep1  arep2  brep1  brep2
    #bin_pair1  500  500  400  700
     #............................................  

data=read.table(raw_count_table,header=T,row.names=1)


##read condition file content like that
    #sample  condition
    #arep1  group_A
    #arep2  group_A
    #brep1  group_B
    #brep2 group_B

design=read.table(conditions,header=T,row.names=1)

#condition1=group_A
#condition2=group_B
cmp = c(condition1,condition2)

dd=which(design$condition==cmp[1] | design$condition==cmp[2])

new_data = data[,dd]
#---------'generate DEseq object'--------
cds=newCountDataSet(new_data,design[dd,1])
print('estimate Size Factors')
cds=estimateSizeFactors(cds)
print('estimate Dispersions')
cds=estimateDispersions(cds,method='per-condition',fitType="local")
res=nbinomTest(cds,cmp[1],cmp[2])


result = array('no',dim=c(length(res$padj),1))
result[which(res$padj < 0.05)] = 'yes'
total =cbind(res,counts(cds,normalized =T),result)

write.table(total,file=out_result,sep="\t", quote = FALSE, row.names = FALSE)

通过上面呢,我觉得只要模型对,方法都是通用的。

上一篇 下一篇

猜你喜欢

热点阅读