micro

小 RNA 长度分布统计

2018-11-15  本文已影响0人  Weiyx

统计 长度 分布,代码,参考别人针对自己的数据做了修改。

命令行,输入:python py.jiaoben  *.fa

import sys

miRNA_len={}

for i in range(0,52):

      miRNA_len[i] = 0

for i in open(sys.argv[1]):

      if i.startswith('>') :

                      continue

      length = len(i) - 1

      miRNA_len[length] += 1

for i in miRNA_len:

      print str(i)+"\t"+str(miRNA_len[i])

数据出来之后,作图:

代码如下:参考别人,做了更改。

setwd("G:/smallRNA/length.txt")

S1 = read.table("AACC1.txt")

S2 = read.table("AACC2.txt")$V2

S3 = read.table("AACC3.txt")$V2

S4 = read.table("AACC4.txt")$V2

S5 = read.table("AACC5.txt")$V2

S6 = read.table("AACC6.txt")$V2

S7 = read.table("AACC7.txt")$V2

S8 = read.table("AACC8.txt")$V2

S9 = read.table("P1.txt")$V2

S10 = read.table("P2.txt")$V2

data = cbind(S1,S2,S3,S4,S5,S6,S7,S8,S9,S10)

colnames(data)=c("Length","AACC1","AACC2","AACC3","AACC4","AACC5","AACC6","AACC7","AACC8","P1","P2")

data$AACC1=100 * data$AACC1/sum(data$AACC1)

data$AACC2=100 * data$AACC2/sum(data$AACC2)

data$AACC3=100 * data$AACC3/sum(data$AACC3)

data$AACC4=100 * data$AACC4/sum(data$AACC4)

data$AACC5=100 * data$AACC5/sum(data$AACC5)

data$AACC6=100 * data$AACC6/sum(data$AACC6)

data$AACC7=100 * data$AACC7/sum(data$AACC7)

data$AACC8=100 * data$AACC8/sum(data$AACC8)

data$P1=100 * data$P1/sum(data$P1)

data$P2=100 * data$P2/sum(data$P2)

library(reshape2)

data.melt=melt(data, id="Length")

library(ggplot2)

p<-ggplot(data.melt, aes(x=Length, y=value, col=variable))

p+geom_line() +

  theme( text =element_text(size=18),

        panel.background=element_blank(),

        axis.line = element_line(size = 1,colour="black"),

        axis.text =element_text(colour="black")) +

  labs(title="All reads length distribution",x="Read Length", y="Fraction (%)")

下一步 : 比对:

上一篇 下一篇

猜你喜欢

热点阅读