Getting Started with Bioinformatics

Normalizing samples from multiple treatments with DESeq2

2020-06-09  煮梦斋_bioinfo

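I needed normalized data for an analysis. A student told me the usual approach was to run DESeq2 on each pair of conditions and then do the downstream analysis, which seemed too cumbersome to me. After some experimenting I found that DESeq2 can normalize all the samples in one run.
The code is below. I have 24 samples across 8 conditions; adapt it to your own data: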

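1. Load the required packages
  rm(list=ls())
  # install DESeq2 if it is missing (this needs the BiocManager package), then load it
  if(!requireNamespace("DESeq2", quietly = TRUE)) BiocManager::install("DESeq2")
  library(DESeq2)
  library(rio)
  library(dplyr)

2. Import the data
  data_all=import("exprpac.csv",header = TRUE)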

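3. Select the sample columns to normalize and build the count table (database)
  database=data_all[,c(5:28)]   # the 24 count columns
  head(database)
  database=database[complete.cases(database), ]   # drop genes with missing values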
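
4. Build the condition factor. Every label must match an entry in levels exactly; a label that is missing from levels becomes NA, and DESeq2 then stops with: Error in DESeqDataSet(se, design = design, ignoreRank) : variables in design formula cannot contain NA: condition. I ran into this after adding characters such as "_" to the names. DESeq2 itself treats letters, numbers, "_" and "." as safe in level names, so the likelier cause was that the labels and the levels no longer matched.
  groups = c("56AA","56CK","col0AA","col0CK","30AA","30CK","30mycAA","30mycCK")
  condition = factor(rep(groups, each = 3), levels = groups)   # 3 replicates per condition, in column order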
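
A quick sanity check before handing the factor to DESeq2 can save some head-scratching. The sketch below uses only base R and a shortened list of conditions; `check_condition` is a hypothetical helper, not part of DESeq2. It shows how a label that is not listed in `levels` silently turns into NA.

```r
# Shortened example: 4 conditions with 3 replicates each
groups <- c("56AA", "56CK", "col0AA", "col0CK")
good <- factor(rep(groups, each = 3), levels = groups)
anyNA(good)   # FALSE

# "56_AA" is not one of the levels, so those three entries become NA
bad <- factor(c(rep("56_AA", 3), rep("56CK", 3)), levels = c("56AA", "56CK"))
anyNA(bad)    # TRUE

# Hypothetical helper: stop early if the factor has NA entries or
# does not line up with the sample columns
check_condition <- function(condition, sample_names) {
  stopifnot(!anyNA(condition), length(condition) == length(sample_names))
  invisible(TRUE)
}

check_condition(good, paste0("sample", 1:12))
```

Calling `check_condition(condition, colnames(database))` right after building the factor would catch the problem before DESeq2 reports it.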
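
5. Normalize with DESeq2
  countdata = round(as.matrix(database))   # DESeq2 requires integer counts
  coldata = data.frame(row.names = colnames(countdata), condition)
  dds = DESeqDataSetFromMatrix(countdata, colData = coldata, design = ~ condition)
  dds = DESeq(dds)
  sizeFactors(dds)   # one size factor per sample
  head(dds)
  # with no contrast given, results() compares the last level of condition with the first
  res <- results(dds)
  # res and the normalized counts come from the same dds, so their rows are already in the same order
  resdata <- data.frame(Row.names = rownames(database), as.data.frame(res),
                        as.data.frame(counts(dds, normalized=TRUE)))
  head(resdata)
  # attach the annotation rows that survived the complete.cases() filter
  resdata <- data.frame(data_all[rownames(database), ], resdata)
  head(resdata)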
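
For intuition about what sizeFactors(dds) returns, here is a base-R sketch of the median-of-ratios calculation, assuming DESeq2's default size factor estimator. The count matrix is made up for illustration, and in real use DESeq2 does this for you.

```r
# Toy count matrix: 4 genes x 3 samples (made-up numbers)
counts <- matrix(c(10,  20,  30,
                   100, 200, 300,
                   50,  100, 150,
                   0,   5,   8),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), c("s1", "s2", "s3")))

# Log of each gene's geometric mean across samples.
# A gene with a zero in any sample gets -Inf and is skipped.
log_geo_mean <- rowMeans(log(counts))
usable <- is.finite(log_geo_mean)

# Size factor of a sample = median over usable genes of count / geometric mean
size_factors <- apply(counts, 2, function(col) {
  exp(median(log(col[usable]) - log_geo_mean[usable]))
})

# Normalized counts = raw counts divided by the sample's size factor
normalized <- sweep(counts, 2, size_factors, "/")
round(size_factors, 3)
round(normalized, 1)
```

In this toy matrix s2 and s3 are simply s1 sequenced two and three times as deeply, so after normalization the first three genes have the same value in every sample.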

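6. Save the data. Because I replaced the raw counts with the normalized ones, I rearranged the table.
  # columns 1:4 and 29:51 come from data_all; 59:82 are the 24 normalized count columns
  # (these positions assume data_all has 51 columns)
  nor <- select(resdata,c(1:4,59:82),c(29:51))
  write.csv(nor,file = "DEseq2_tot_normalized.csv",row.names = FALSE)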
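
If the normalized counts are all you need, a shorter route is probably enough: estimateSizeFactors() computes the size factors without fitting the full model that DESeq() fits. The sketch below runs on simulated counts; the sample names and the two-level condition are placeholders, not the data from this post.

```r
library(DESeq2)

# Simulated stand-in data: 100 genes x 6 samples
set.seed(1)
countdata <- matrix(rnbinom(100 * 6, mu = 50, size = 5), nrow = 100,
                    dimnames = list(paste0("gene", 1:100), paste0("s", 1:6)))
coldata <- data.frame(row.names = colnames(countdata),
                      condition = factor(rep(c("A", "B"), each = 3)))

dds <- DESeqDataSetFromMatrix(countdata, colData = coldata, design = ~ condition)
dds <- estimateSizeFactors(dds)           # size factors only, no model fitting
norm <- counts(dds, normalized = TRUE)

# Normalized counts are the raw counts divided by each sample's size factor
manual <- sweep(countdata, 2, sizeFactors(dds), "/")
all.equal(norm, manual, check.attributes = FALSE)
```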

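*Note: as far as I can tell, the downstream differential analysis still has to be done one pair at a time and cannot be generated in one pass. If any expert knows a way, advice is welcome!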
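
One possible answer to the note above: the model only needs to be fitted once on all samples, and each pairwise comparison can then be pulled out with the contrast argument of results(). The sketch below uses simulated counts and only three of the condition names, so treat it as an illustration rather than a drop-in replacement.

```r
library(DESeq2)

# Simulated stand-in data: 200 genes, 3 conditions x 3 replicates.
# The condition names are borrowed from the post; the counts are random.
set.seed(1)
groups <- c("col0CK", "col0AA", "56AA")
condition <- factor(rep(groups, each = 3), levels = groups)
countdata <- matrix(rnbinom(200 * 9, mu = 100, size = 10), nrow = 200,
                    dimnames = list(paste0("gene", 1:200), paste0(condition, 1:3)))
coldata <- data.frame(row.names = colnames(countdata), condition)

dds <- DESeqDataSetFromMatrix(countdata, colData = coldata, design = ~ condition)
dds <- DESeq(dds)   # fit once on all samples

# Every unordered pair of conditions
pairs <- combn(levels(condition), 2, simplify = FALSE)

# One results table per pair, all extracted from the same fitted object.
# contrast = c(variable, numerator, denominator)
res_list <- lapply(pairs, function(p) {
  results(dds, contrast = c("condition", p[2], p[1]))
})
names(res_list) <- sapply(pairs, function(p) paste0(p[2], "_vs_", p[1]))
names(res_list)
```

With the 8 conditions in this post the same loop would produce 28 tables. Each one could be written out with write.csv(as.data.frame(res_list[[i]]), ...).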
