Filtering FASTA sequences by gene ID with Python

2020-06-28  蜡笔小生信
Beginner's notes:

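1. Pay attention to how variables are formatted into strings.
2. Note the difference between readline() and readlines(): readline() returns a single line, while readlines() returns a list of all lines. In both cases each line keeps its trailing newline, so strip("\n") is needed.
3. When two v.write calls appeared one after the other, nothing was written to the file (cause unknown).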
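Notes 2 and 3 can be checked in isolation. The sketch below is only an illustration, not part of the original script: it uses `io.StringIO` as an in-memory stand-in for real files, and the FASTA content is made up. It shows what `readline()` and `readlines()` return, and that two consecutive `write()` calls normally just append one after the other. It does not explain why the original two-line write failed.

```python
import io

# In-memory stand-in for a small FASTA file (hypothetical data).
handle = io.StringIO(">gene1\nACGT\n>gene2\nTTGA\n")

first = handle.readline()    # a single line, trailing newline still attached
rest = handle.readlines()    # list of all remaining lines, newlines attached

# strip("\n") removes the trailing newline in either case.
first_clean = first.strip("\n")
rest_clean = [line.strip("\n") for line in rest]

# Two consecutive write() calls append their text in order.
out = io.StringIO()
out.write(">gene1\n")
out.write("ACGT\n")
written = out.getvalue()
```

With a real file opened through `with open(...)`, the written data reaches the disk at the latest when the `with` block ends and the file is closed.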

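import sys

# Usage: python script.py <input.fasta> <id_list.txt> <output.fasta>
# The output file is opened in append mode, so rerunning adds to existing content.
with open(sys.argv[1]) as f, open(sys.argv[2]) as g, open(sys.argv[3], "a") as v:
    # Map each header line (including the leading ">") to its full sequence.
    end = {}
    a = None
    for x in f:
        x = x.strip("\n")
        if x.startswith(">"):
            a = x
            end[a] = ""
        elif a is not None:
            # Append instead of overwriting, so a sequence wrapped over
            # several lines is kept whole. No eval() is needed for this.
            end[a] += x
    # One ID per line; ">" + ID must equal the complete header line.
    for i in g:
        d = ">" + i.strip("\n")
        if d in end:
            v.write(d + "\n" + end[d] + "\n")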
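FASTA headers often carry a description after the ID (for example `>gene1 kinase`), in which case matching on the whole header line finds nothing. The following is a minimal sketch of one possible variant, assuming the ID is the first whitespace-separated word after `>`. The function names `read_fasta` and `filter_fasta` and the sample data are hypothetical and do not come from the original script.

```python
import io


def read_fasta(handle):
    """Return {id: sequence}, taking the first word after '>' as the ID."""
    records = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)  # collect wrapped sequence lines
    return {name: "".join(parts) for name, parts in records.items()}


def filter_fasta(fasta_handle, id_handle, out_handle):
    """Write the records whose ID is listed; return the IDs not found."""
    records = read_fasta(fasta_handle)
    wanted = [line.strip() for line in id_handle if line.strip()]
    missing = []
    for name in wanted:
        if name in records:
            out_handle.write(">" + name + "\n" + records[name] + "\n")
        else:
            missing.append(name)
    return missing


# Small in-memory demonstration with made-up data.
fasta = io.StringIO(">gene1 kinase\nACGT\nGGCC\n>gene2\nTTGA\n")
ids = io.StringIO("gene1\ngene3\n")
out = io.StringIO()
missing = filter_fasta(fasta, ids, out)
result = out.getvalue()
```

In this sketch the output header contains only the ID, so the description is dropped, and the returned list of missing IDs makes it easy to spot typos in the ID file. The same functions accept real file objects from `open(...)` in place of the `io.StringIO` objects.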