RIP-Seq(4)-- Peak calling
2022-06-17 本文已影响0人
Li_bioinfo
cat >macs3.sh
macs3 callpeak -f BAMPE -t ../hisat2/CBFB_FLAG_antibody.sorted.bam -c ../hisat2/CBFB_FLAG_antibody_Input.sorted.bam -g hs -n CBFB_FLAG_antibody --nomodel -B --trackline
macs3 callpeak -f BAMPE -t ../hisat2/hnRNPK_antibody_rep1.sorted.bam -c ../hisat2/hnRNPK_antibody_Input.sorted.bam -g hs -n hnRNPK_antibody_rep1 --nomodel -B --trackline
macs3 callpeak -f BAMPE -t ../hisat2/hnRNPK_antibody_rep2.sorted.bam -c ../hisat2/hnRNPK_antibody_Input.sorted.bam -g hs -n hnRNPK_antibody_rep2 --nomodel -B --trackline
nohup sh macs3.sh > macs3.log 2>&1 &
查看两次重复中的peak情况
1.使用bedtools粗略估计两次重复的peak交集,重叠1bp就算有交集
sed -i '1d' *.narrowPeak
bedtools intersect -a hnRNPK_antibody_rep1_peaks.narrowPeak -b hnRNPK_antibody_rep2_peaks.narrowPeak -wo > hnRNPK_antibody.bed
intervene venn -i hnRNPK_antibody_rep1_peaks.narrowPeak hnRNPK_antibody_rep2_peaks.narrowPeak --output venn --save-overlaps
image.png
对MACS3的结果文件narrowPeak根据-log10(p-value)进行排序
sort -k8,8nr hnRNPK_antibody_rep1_peaks.narrowPeak -o hnRNPK_antibody_rep1_peaks.narrowPeak
sort -k8,8nr hnRNPK_antibody_rep2_peaks.narrowPeak -o hnRNPK_antibody_rep2_peaks.narrowPeak
对生物学重复样本间的peak进行鉴定,查看两次重复的peak的IDR(不可重复率)
idr --samples hnRNPK_antibody_rep1_peaks.narrowPeak hnRNPK_antibody_rep2_peaks.narrowPeak --output-file hnRNPK_antibody_peak.narrowPeak --rank p.value --plot --idr-threshold 0.05 --log-output-file hnRNPK_antibody.idr.log
#如果想看IDR<0.05的,可以通过第5列信息过滤:
awk '{if($5 >= 540) print $0}' hnRNPK_antibody_peak.narrowPeak | wc -l
1851
log文件会给出peaks通过IDR < 0.05的比率
cat hnRNPK_antibody.idr.log
Initial parameter values: [0.10 1.00 0.20 0.50]
Final parameter values: [1.16 0.80 0.90 0.64]
Number of reported peaks - 1851/3732 (49.6%)
Number of peaks passing IDR cutoff of 0.05 - 1851/3732 (49.6%)
image.png
查看富集倍数大于4的交集情况
awk '{if($7 > 4 ) print $0}' hnRNPK_antibody_rep1_peaks.narrowPeak >hnRNPK_antibody_rep1_peaks_fc4.narrowPeak
awk '{if($7 > 4 ) print $0}' hnRNPK_antibody_rep2_peaks.narrowPeak >hnRNPK_antibody_rep2_peaks_fc4.narrowPeak
awk '{if($7 > 4 ) print $0}' CBFB_FLAG_antibody_peaks.narrowPeak >CBFB_FLAG_antibody_peaks_fc4.narrowPeak
合并两次重复peak
idr --samples hnRNPK_antibody_rep1_peaks_fc4.narrowPeak hnRNPK_antibody_rep2_peaks_fc4.narrowPeak --output-file hnRNPK_antibody_peak_fc4.narrowPeak --rank p.value --plot --idr-threshold 0.05 --log-output-file hnRNPK_antibody_fc4.idr.log
求hnRNPK与CBFB公有peak
awk '{print "chr"$1"\t"$2"\t"$3}' hnRNPK_antibody_peak_fc4.narrowPeak >hnRNPK_antibody_peak_fc4.bed
wc -l hnRNPK_antibody_peak_fc4.bed
1182 hnRNPK_antibody_peak_fc4.bed
awk '{print "chr"$1"\t"$2"\t"$3}' CBFB_FLAG_antibody_peaks_fc4.narrowPeak >CBFB_FLAG_antibody_peaks_fc4.bed
wc -l CBFB_FLAG_antibody_peaks_fc4.bed
1117 CBFB_FLAG_antibody_peaks_fc4.bed
bedtools intersect -a hnRNPK_antibody_peak_fc4.bed -b CBFB_FLAG_antibody_peaks_fc4.bed >> hnRNPK_CBFB_FLAG_antibody.bed
wc -l hnRNPK_CBFB_FLAG_antibody.bed
594 hnRNPK_CBFB_FLAG_antibody.bed
原文公有peak情况
image.png