重点

RIP-Seq(4)-- Peak calling

2022-06-17  本文已影响0人  Li_bioinfo
cat >macs3.sh
macs3 callpeak -f BAMPE -t ../hisat2/CBFB_FLAG_antibody.sorted.bam -c ../hisat2/CBFB_FLAG_antibody_Input.sorted.bam -g hs  -n CBFB_FLAG_antibody --nomodel -B --trackline

macs3 callpeak -f BAMPE -t ../hisat2/hnRNPK_antibody_rep1.sorted.bam -c ../hisat2/hnRNPK_antibody_Input.sorted.bam -g hs  -n hnRNPK_antibody_rep1 --nomodel -B --trackline 

macs3 callpeak -f BAMPE -t ../hisat2/hnRNPK_antibody_rep2.sorted.bam -c ../hisat2/hnRNPK_antibody_Input.sorted.bam -g hs  -n hnRNPK_antibody_rep2 --nomodel -B --trackline  
nohup sh macs3.sh > macs3.log 2>&1 &

查看两次重复中的peak情况

1.使用bedtools粗略估计两次重复的peak交集,重叠1bp就算有交集

sed -i '1d' *.narrowPeak
bedtools intersect -a hnRNPK_antibody_rep1_peaks.narrowPeak -b hnRNPK_antibody_rep2_peaks.narrowPeak -wo > hnRNPK_antibody.bed

intervene venn -i hnRNPK_antibody_rep1_peaks.narrowPeak hnRNPK_antibody_rep2_peaks.narrowPeak --output venn --save-overlaps
image.png
对MACS3的结果文件narrowPeak根据-log10(p-value)进行排序
sort -k8,8nr hnRNPK_antibody_rep1_peaks.narrowPeak -o hnRNPK_antibody_rep1_peaks.narrowPeak
sort -k8,8nr hnRNPK_antibody_rep2_peaks.narrowPeak -o  hnRNPK_antibody_rep2_peaks.narrowPeak

对生物学重复样本间的peak进行鉴定,查看两次重复的peak的IDR(不可重复率)

idr --samples hnRNPK_antibody_rep1_peaks.narrowPeak hnRNPK_antibody_rep2_peaks.narrowPeak  --output-file hnRNPK_antibody_peak.narrowPeak --rank p.value --plot  --idr-threshold 0.05 --log-output-file hnRNPK_antibody.idr.log
#如果想看IDR<0.05的,可以通过第5列信息过滤:
awk '{if($5 >= 540) print $0}' hnRNPK_antibody_peak.narrowPeak | wc -l 
1851
log文件会给出peaks通过IDR < 0.05的比率
cat hnRNPK_antibody.idr.log
Initial parameter values: [0.10 1.00 0.20 0.50]
Final parameter values: [1.16 0.80 0.90 0.64]
Number of reported peaks - 1851/3732 (49.6%)

Number of peaks passing IDR cutoff of 0.05 - 1851/3732 (49.6%)
image.png
查看富集倍数大于4的交集情况
awk '{if($7 > 4 ) print $0}' hnRNPK_antibody_rep1_peaks.narrowPeak >hnRNPK_antibody_rep1_peaks_fc4.narrowPeak
awk '{if($7 > 4 ) print $0}' hnRNPK_antibody_rep2_peaks.narrowPeak >hnRNPK_antibody_rep2_peaks_fc4.narrowPeak
awk '{if($7 > 4 ) print $0}' CBFB_FLAG_antibody_peaks.narrowPeak >CBFB_FLAG_antibody_peaks_fc4.narrowPeak

合并两次重复peak
idr --samples hnRNPK_antibody_rep1_peaks_fc4.narrowPeak hnRNPK_antibody_rep2_peaks_fc4.narrowPeak  --output-file hnRNPK_antibody_peak_fc4.narrowPeak --rank p.value --plot  --idr-threshold 0.05 --log-output-file hnRNPK_antibody_fc4.idr.log

求hnRNPK与CBFB公有peak
awk '{print "chr"$1"\t"$2"\t"$3}' hnRNPK_antibody_peak_fc4.narrowPeak >hnRNPK_antibody_peak_fc4.bed
wc -l hnRNPK_antibody_peak_fc4.bed
1182 hnRNPK_antibody_peak_fc4.bed

awk '{print "chr"$1"\t"$2"\t"$3}' CBFB_FLAG_antibody_peaks_fc4.narrowPeak >CBFB_FLAG_antibody_peaks_fc4.bed
wc -l CBFB_FLAG_antibody_peaks_fc4.bed
1117 CBFB_FLAG_antibody_peaks_fc4.bed

bedtools intersect -a hnRNPK_antibody_peak_fc4.bed -b CBFB_FLAG_antibody_peaks_fc4.bed >>  hnRNPK_CBFB_FLAG_antibody.bed
wc -l hnRNPK_CBFB_FLAG_antibody.bed
594 hnRNPK_CBFB_FLAG_antibody.bed

原文公有peak情况


image.png
上一篇 下一篇

猜你喜欢

热点阅读