ChIP-seq Data Analysis Hands-on Training (Part 4)

2021-02-17 小西瓜f

Finding motifs with HOMER

Downloading the database

cd /public/workspace/fangwen/learn/chip-seq/biosoft/
mkdir homer &&  cd homer
wget http://homer.salk.edu/homer/configureHomer.pl 
perl configureHomer.pl -install
perl configureHomer.pl -install mm10

Running HOMER

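HOMER's motif finding combines two approaches: a lookup against a database of known motifs and de novo discovery. Both read the BED-format peaks file produced by the upstream ChIP-seq analysis.
Even so, it is simple to use; see http://homer.ucsd.edu/homer/ngs/peakMotifs.html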

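cd /public/workspace/fangwen/learn/chip-seq/motif/
for id in /public/workspace/fangwen/learn/chip-seq/peaks/*.bed
do
    echo "$id"
    file=$(basename "$id")
    sample=${file%%.*}    # strip everything from the first dot onward
    echo "$sample"
    # Reorder BED columns into HOMER peak format: ID, chr, start, end, strand
    awk '{print $4"\t"$1"\t"$2"\t"$3"\t+"}' "$id" > homer_peaks.tmp
    # Search for motifs of length 8, 10 and 12
    findMotifsGenome.pl homer_peaks.tmp mm10 "${sample}_motifDir" -len 8,10,12
    # Annotate the peaks: table to stdout, log to stderr
    annotatePeaks.pl homer_peaks.tmp mm10 1>"${sample}.peakAnn.xls" 2>"${sample}.annLog.txt"
done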
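
To make the two string-handling steps in the loop concrete, here is a small sketch that runs them on a made-up peak file. The file name demo.narrowPeak.bed and the coordinates are invented for illustration; only the ${file%%.*} expansion and the awk column reordering come from the script above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# A hypothetical two-peak BED file (chr, start, end, name), tab-separated.
printf 'chr1\t100\t200\tpeak_1\nchr2\t300\t450\tpeak_2\n' > demo.narrowPeak.bed

# ${file%%.*} removes the longest suffix that starts with a dot,
# i.e. everything from the first dot onward.
file=$(basename "$workdir/demo.narrowPeak.bed")
sample=${file%%.*}
echo "$sample"

# Reorder to the peak-file columns used above: ID, chr, start, end, strand.
awk '{print $4"\t"$1"\t"$2"\t"$3"\t+"}' demo.narrowPeak.bed > homer_peaks.tmp
cat homer_peaks.tmp
```

If a sample name itself contains dots, %%.* would cut it short; ${file%.bed} removes only the trailing .bed extension and may be the safer choice in that case.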

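Save the code above as a script named runMotif.sh, then run it: nohup bash runMotif.sh 1>motif.log &
Besides finding motifs, the script also annotates the peaks along the way. The resulting files ending in peakAnn.xls show that the annotation is roughly the same as what you get from annotating with an R package.
You can also use MEME to find motifs; this requires fetching FASTA sequences using the coordinates of the BED-format peaks. MEME: http://meme-suite.org/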
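
One possible way to get those FASTA sequences is bedtools getfasta, which is not part of the original tutorial and is shown here only as a sketch. The genome path in GENOME_FA and the demo peak file are placeholders; the extraction step is skipped when bedtools or the genome FASTA is not available.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Placeholder genome path (an assumption): point GENOME_FA at your own mm10 FASTA.
genome_fa=${GENOME_FA:-/path/to/mm10.fa}

# A hypothetical peaks file (chr, start, end, name), tab-separated.
printf 'chr1\t100\t200\tpeak_1\nchr2\t300\t450\tpeak_2\n' > demo_peaks.bed

# Keep only chr, start, end: the three columns getfasta needs.
cut -f 1-3 demo_peaks.bed > peaks.3col.bed

# Extract sequences only when bedtools and the genome FASTA are both present.
if command -v bedtools >/dev/null 2>&1 && [ -f "$genome_fa" ]; then
    bedtools getfasta -fi "$genome_fa" -bed peaks.3col.bed -fo peaks.fa
    # peaks.fa could then be given to MEME, e.g.: meme peaks.fa -dna -oc meme_out
else
    echo "bedtools or $genome_fa not found; skipping sequence extraction" >&2
fi
```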

Other advanced analyses

For example, you can compare different peak files; for the code, see: https://github.com/jmzeng1314/NGS-pipeline/blob/master/CHIPseq/step6-ChIPpeakAnno-Venn.R
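
The linked script does this comparison in R. As a rough command-line illustration of the same idea, the sketch below counts how many peaks in one file overlap any peak in another, using plain awk on two invented peak sets. It is a naive pairwise scan meant only to show the overlap test, not a replacement for the R script or for a dedicated tool such as bedtools intersect.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Two hypothetical peak sets (chr, start, end), tab-separated.
printf 'chr1\t100\t200\nchr1\t500\t600\nchr2\t50\t80\n' > a.bed
printf 'chr1\t150\t250\nchr2\t100\t120\n' > b.bed

# Count peaks in a.bed that overlap at least one peak in b.bed.
# BED intervals are half-open, so two peaks on the same chromosome
# overlap when startA < endB and startB < endA.
shared=$(awk '
    NR == FNR { chr[FNR] = $1; s[FNR] = $2 + 0; e[FNR] = $3 + 0; n = FNR; next }
    {
        for (i = 1; i <= n; i++)
            if ($1 == chr[i] && ($2 + 0) < e[i] && s[i] < ($3 + 0)) { hits++; break }
    }
    END { print hits + 0 }
' b.bed a.bed)

total=$(wc -l < a.bed | tr -d ' ')
echo "a.bed peaks overlapping b.bed: $shared of $total"
```

With these made-up inputs only the first chr1 peak overlaps, so the script reports 1 of 3.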
This tutorial covers single-end sequencing data; for paired-end data, many of the parameters need to be changed.

# Same loop, passing the genome by explicit path instead of by name
cd /public/workspace/fangwen/learn/chip-seq/motif/
for id in /public/workspace/fangwen/learn/chip-seq/peaks/*.bed
do
    echo "$id"
    file=$(basename "$id")
    sample=${file%%.*}
    echo "$sample"
    awk '{print $4"\t"$1"\t"$2"\t"$3"\t+"}' "$id" > homer_peaks.tmp
    findMotifsGenome.pl homer_peaks.tmp /public/workspace/fangwen/learn/chip-seq/biosoft/homer/data/genomes/mm10/mm10 "${sample}_motifDir" -len 8,10,12
    annotatePeaks.pl homer_peaks.tmp /public/workspace/fangwen/learn/chip-seq/biosoft/homer/data/genomes/mm10 1>"${sample}.peakAnn.xls" 2>"${sample}.annLog.txt"
done
