Motif counts

2022-04-15  余绕
  1. Extract the target sequences from the genomic sequence based on a GFF3 file.
    Here, we use promoter sequences (the 2 kb upstream of each gene) as an example.
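The extraction step itself is not shown in this post, so the following is only a minimal sketch of one way it could be done, not the method actually used here. The subroutine names `revcomp` and `promoter_seq`, the toy genome, and the toy GFF3 lines are all hypothetical. It assumes 1-based inclusive GFF3 coordinates and takes the promoter of a minus-strand gene from downstream of its end coordinate, then reverse-complements it. The flank is 5 bp only to keep the toy example readable; use 2000 for 2 kb promoters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse-complement a DNA string (handles A/C/G/T/N in either case).
sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;
    $rc =~ tr/ACGTNacgtn/TGCANtgcan/;
    return $rc;
}

# Return the promoter sequence of one gene (hypothetical helper).
# $chrom_seq: whole chromosome sequence; $start/$end: 1-based inclusive
# GFF3 coordinates; $strand: '+' or '-'; $flank: promoter length in bp.
sub promoter_seq {
    my ($chrom_seq, $start, $end, $strand, $flank) = @_;
    my $len = length $chrom_seq;
    if ($strand eq '+') {
        my $p_start = $start - $flank;
        $p_start = 1 if $p_start < 1;      # clip at the chromosome start
        return substr($chrom_seq, $p_start - 1, $start - $p_start);
    }
    my $p_end = $end + $flank;
    $p_end = $len if $p_end > $len;        # clip at the chromosome end
    return revcomp(substr($chrom_seq, $end, $p_end - $end));
}

# Toy data (assumed for illustration only).
my %genome = (chr1 => 'ACGTACGTAAGGCCTTAGCA');
my @gff3 = (
    join("\t", 'chr1', 'toy', 'gene', 11, 15, '.', '+', '.', 'ID=geneA'),
    join("\t", 'chr1', 'toy', 'gene', 6, 10, '.', '-', '.', 'ID=geneB'),
);

my $flank = 5;                             # use 2000 for 2 kb promoters
for my $line (@gff3) {
    next if $line =~ /^#/;                 # skip GFF3 comment lines
    my ($chr, undef, $type, $start, $end, undef, $strand, undef, $attr)
        = split /\t/, $line;
    next unless $type eq 'gene';
    my ($id) = $attr =~ /ID=([^;]+)/;      # gene ID from the attributes column
    print ">$id\n", promoter_seq($genome{$chr}, $start, $end, $strand, $flank), "\n";
}
```

Writing the promoters out as FASTA records headed by the gene ID gives the input format that the counting script in the next step expects.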
  2. Use the following Perl script to count the motif occurrences found in the promoter regions.
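#!/usr/bin/perl
use strict;
use warnings;

# Usage: pass the promoter FASTA file as the only argument.
open my $fa, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";

$/ = ">";   # read one FASTA record at a time
<$fa>;      # discard the empty chunk before the first ">"
while (<$fa>) {
    chomp;  # strip the trailing ">"
    my ($id, $seq) = split /\n/, $_, 2;
    next unless defined $seq;
    $seq =~ s/\s//g;   # remove newlines and any other whitespace

    # Count non-overlapping matches without modifying the sequence,
    # so that a strand with no match is reported as 0.
    my $fwd = () = $seq =~ /CTTCT[TA]C/ig;   # matches on the forward strand
    my $rev = () = $seq =~ /G[TA]AGAAG/ig;   # matches on the reverse-complement strand
    next unless $fwd or $rev;                # skip sequences without the motif

    print "Gene ID:$id\tPositive strand\t$fwd\ttimes\tNegative strand\t$rev\ttimes\n";
}
close $fa;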
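The script counts non-overlapping matches, and this motif can overlap itself (for example in CTTCTTCTTC). The sketch below shows how the two counting modes differ, in case overlapping occurrences should also be counted. The subroutine name `count_motif` is hypothetical and not part of the script above. The overlapping mode relies on a zero-width lookahead, so every start position is tested.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count case-insensitive matches of $pattern in $seq (hypothetical helper).
# With $overlapping true, the pattern is wrapped in a lookahead so that
# matches sharing bases are each counted.
sub count_motif {
    my ($seq, $pattern, $overlapping) = @_;
    my $re = $overlapping ? qr/(?=$pattern)/i : qr/$pattern/i;
    my $n = () = $seq =~ /$re/g;   # list-context match, then count the list
    return $n;
}

my $seq = 'CTTCTTCTTC';
print "non-overlapping: ", count_motif($seq, 'CTTCT[TA]C', 0), "\n";
print "overlapping: ",     count_motif($seq, 'CTTCT[TA]C', 1), "\n";
```

The same helper works for the reverse-complement pattern G[TA]AGAAG, which is CTTCT[TA]C read on the opposite strand.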

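  3. Run the script with the promoter FASTA file as its only argument. The results are printed to standard output, so redirect them to a file to save them.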