Parsing a KEGG file

2018-06-05  白云梦_7

https://www.genome.jp/kegg-bin/get_htext?hsa00001+3101

Lines beginning with C hold a KEGG pathway ID, and the D lines that follow each C line list all the genes that belong to that pathway.

perl -alne 'if(/^C/){$kegg = /PATH:hsa(\d+)/ ? $1 : ""}elsif(/^D/ and $kegg){print "$kegg\t$F[1]"}' hsa00001.keg > kegg2gene.txt
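To make the logic of the one-liner easier to follow, here is a minimal sketch of the same idea as a small script. It is an illustration under stated assumptions, not a replacement for the command above: the sample lines are abridged (real D lines carry additional KO annotation after the gene name), and `parse_keg` is a hypothetical helper name that does not appear in the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative lines in the layout of hsa00001.keg (abridged, not a full file).
my @lines = (
    'A09100 Metabolism',
    'B  09101 Carbohydrate metabolism',
    'C    00010 Glycolysis / Gluconeogenesis [PATH:hsa00010]',
    'D      3101 HK3; hexokinase 3',
    'D      3098 HK1; hexokinase 1',
);

# parse_keg is a hypothetical helper: it returns "pathway<TAB>gene" strings.
sub parse_keg {
    my @pairs;
    my $kegg = '';
    for my $line (@_) {
        if ($line =~ /^C/) {
            # Remember the pathway ID; reset it when the C line has no PATH tag.
            $kegg = $line =~ /PATH:hsa(\d+)/ ? $1 : '';
        }
        elsif ($line =~ /^D\s+(\S+)/ and $kegg ne '') {
            # The first field after "D" is the gene ID.
            push @pairs, "$kegg\t$1";
        }
    }
    return @pairs;
}

print "$_\n" for parse_keg(@lines);
```

With the sample lines above, this prints two tab-separated rows, `00010	3101` and `00010	3098`. The A and B lines are skipped because they match neither branch.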

++++++++++++++++++++++++++++++++

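#!/usr/bin/perl
use warnings;
use strict;

my ($path, $num);
open IN, '<', 'hsa00001.keg' or die "Cannot open hsa00001.keg: $!";
open OUT, '>', 'kegg_sorting' or die "Cannot open kegg_sorting: $!";
while (<IN>) {
    chomp;
    if (/^C/) {
        # Pathway line: capture the numeric pathway ID that follows "C".
        ($num) = $_ =~ /C\s*(\d+).*/;
        #print OUT "$num\n";
    }
    elsif (/^D/) {
        # Gene line: capture the numeric gene ID that follows "D".
        ($path) = $_ =~ /D\s+(\d+).*/;
        print OUT "$num\t$path\n";
    }
}
close IN;
close OUT;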
