找蛋白HIF1a结合在基因nos1附近，与super enhan

2019-04-09 本文已影响7人 Ray钱

一、先说一下rose（ RANK ORDERING OF SUPER-ENHANCERS）

定义一下super enhancer（SE)，https://en.wikipedia.org/wiki/Super-enhancer，通用H3K27ac的chip-seq macs2出来的bed，通过rose筛选出来的enhancer_withsuper.bed。

安装方法如下：
点击查看安装方法介绍—1。
点击查看安装方法介绍—2。

安装前须知：
Python2.7环境
需要samtools和R作为依赖（注意，以前装过的，需要在py2.7环境下，export环境变量）

python $ SOFT_PATH /ROSE_main.py -g mm10 \
-i $ WORK_PATH /gtf/KYSE510_peaks.bed \
-r $ WORK_PATH /samtools_sort/sort_treat1.bam \
-c $ WORK_PATH /samtools_sort/sort_control1.bam \
-o $ WORK_PATH / ROSE / KYSE510 \
-s 12500 -t 2500 2> $ LOG_PATH /KYSE510_enhancer.log

-i 是指input,即H3K27ac chip下来的两个.bam文件，macs2下来的的bed文件
-r是指rank by，即H3K27ac chip下的bam文件
-c是指control，即H3K27ac chip的input的bam文件
-o是指outputdir

具体用法，网上很多。
可以根据这个程序，得到一个bed，name：H3k27ac-enhancer_withsuper.bed
这个，就是这个样本所有的SE.

二、再确认目标bed位置

打开IGV，将这个SE.bed,和自己的目的蛋白（HIF1a）chip下来的bed导入IGV，查询自己感兴趣的位置（这里我找的是nos1的上游）。

igv截图

将这个位置的大小记录下来。然后对bed做处理。

##文件与路径
~/Public/backupII/seqdate/4.7/
1NIE-Hypo-HIF1a-13.FCC76L8ACXX_L6_R1_ICGTAGA.PE_macs2_peaks.bed*
1NIE-Hypo-HIF1b-14.FCC76L8ACXX_L6_R1_ITCAGAG.PE_macs2_peaks.bed*
Hy__K27Ac_Enhancers_withSuper.bed*
##bedtools处理得到HIF1a的peak与k27AC（SE）的交集
bedtools intersect -a ~/Public/backupII/seqdate/4.7/Hy__K27Ac_Enhancers_withSuper.bed -b ~/Public/backupII/seqdate/4.7/1NIE-Hypo-HIF1a-13.FCC76L8ACXX_L6_R1_ICGTAGA.PE_macs2_peaks.bed -wb >HIF1a-result.bed
#取nos1上游的HIF1a的SE
awk '{print $6"\t"$7"\t"$8"\t"$9"\t"$10"\t"}' HIF1a-result.bed >HIF1a-clean.bed
awk '$2>117504310&&$3<117842831' HIF1a-clean.bed|grep chr5 >HIF1a-nos1.bed

bedtools 的用法就不多说了，google吧。
过程中要用note或者igv检查bed，以确保自己的bed正确。

三、用homer找motif

介绍一下motif的定义：https://en.wikipedia.org/wiki/Sequence_motif
简单的说1、某个蛋白识别的核酸位点。2、是一个概率矩阵。

再说一下homer，其中一个主要的程序是findMotifsGenome.pl。是一个用perl写的查找motif的程序。稍微补充一下安装方法：

conda install wget samtools r-essentials bioconductor-deseq2 bioconductor-edger
conda install homer
cd /conda/share/homer
perl  configureHomer.pl -install mm10

用homer找motif的代码如下：

#文件与路径
~/motif/HIF1
HIF1a-nos1.bed* 
HIF1b-nos1.bed
#批量查找motif，同类的bed放一起。
for id in ~/motif/HIF1/*.bed;
do
echo $id
awk '{print $4"\t"$1"\t"$2"\t"$3"\t+"}' $id >homer_peaks.tmp
findMotifsGenome.pl homer_peaks.tmp mm10 ${id%%.*}_motifDir -size given -len 8,10,12
annotatePeaks.pl homer_peaks.tmp mm10 1>${id%%.*}.peakAnn.xls 2>${id%%.*}.annLog.txt 
done

output是这样的：

motif

annotation

如果有你的目标蛋白，那么就是直接结合的，如果没有你的目标蛋白，就有可能是间接结合的。

转录调控模型

end

找蛋白HIF1a结合在基因nos1附近，与super enhan

一、先说一下rose（ RANK ORDERING OF SUPER-ENHANCERS）

二、再确认目标bed位置

三、用homer找motif

猜你喜欢

热点阅读