Kawasaki disease and WGCNA

2021-07-29  泥人吴

This post covers processing raw Illumina BeadChip array data, using the GEO dataset GSE73461.

Reading the raw data file directly with read.ilmn fails, because the requested column headings are not found in the file:

> x <- read.ilmn(files="GSE73461_Raw_data.txt",
+                expr="Sample",
+                probeid="ID_REF",
+                other.columns="Detection Pval")
Reading file GSE73461_Raw_data.txt ... ...
Error in readGenericHeader(fname, columns = expr, sep = sep) : 
  Specified column headings not found in file
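
The workaround is to read the series raw matrix, rename its columns, and write it back out in a form that read.ilmn can parse:

library(limma)
rm(list = ls())  # clear the workspace
options(stringsAsFactors = FALSE)
# Read the raw data
raw_data <- read.table('GSE73461_series_raw_matrix.txt.gz', sep = '\t', quote = "", fill = TRUE,
                       comment.char = "!", header = TRUE)  # extract the expression matrix
raw_data <- raw_data[, -2]  # drop the second column
a <- raw_data
# Rename the columns to a uniform pattern so that read.ilmn can match
# expr="Sample" and other.columns="Detection Pval"
colnames(a)[seq(2, 919, by = 2)] <- paste0("Sample.", 1:459)
colnames(a)[seq(3, 919, by = 2)] <- paste0("Detection Pval.", 1:459)
# Write the table back out as tab-delimited text for read.ilmn.
# quote = FALSE: no quotes around column names.
# row.names = FALSE: keeps the header line aligned with the data columns.
write.table(a, "limma.txt", sep = '\t', quote = FALSE, row.names = FALSE)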
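
The interleaved renaming above can be checked on a small made-up table before running it on the full 919-column matrix. This is only a minimal sketch in base R; the data frame `toy` and its values are invented for illustration and are not part of GSE73461.

```r
# Toy table: one ID column followed by n (signal, p-value) column pairs.
# All names and values here are made up for illustration.
n <- 3
toy <- data.frame(ID_REF = c("p1", "p2"),
                  matrix(0, nrow = 2, ncol = 2 * n))

last <- ncol(toy)  # 1 + 2 * n columns in total
# Even positions hold the signal columns, odd positions (from 3 on) the p-values.
colnames(toy)[seq(2, last, by = 2)] <- paste0("Sample.", 1:n)
colnames(toy)[seq(3, last, by = 2)] <- paste0("Detection Pval.", 1:n)

print(colnames(toy))
```

Deriving the upper bound from `ncol()` instead of hard-coding it avoids an off-by-one mistake if the number of samples changes.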

Raw data processing

#1.Read, background-correct, and normalize
x <- read.ilmn(files="limma.txt",
               expr="Sample",
               probeid="ID_REF",
               other.columns="Detection Pval")
## Reading file limma.txt ... ...
y <- neqc(x, detection.p="Detection Pval")

#2.Probe filtering
x$other$`Detection Pval`[1:4,1:4]
dim(y)
# Keep probes detected (p < 0.05) in at least 115 samples
expressed <- rowSums(y$other$`Detection Pval` < 0.05) >= 115
table(expressed)
y <- y[expressed,]
dim(y)

#3.Extract the expression matrix
exp <- as.data.frame(y$E)
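
The probe filter in step 2 counts, for each probe, the samples whose detection p-value is below 0.05 and keeps the probe if that count reaches a threshold. A minimal base R sketch on a made-up matrix; `pval` and the cutoff `min_samples` are hypothetical (the post itself uses 115 samples):

```r
# Toy detection p-value matrix: 4 probes x 5 samples (values are made up).
pval <- matrix(c(0.01, 0.02, 0.30, 0.01, 0.04,   # probe A
                 0.50, 0.60, 0.70, 0.80, 0.90,   # probe B
                 0.01, 0.20, 0.30, 0.40, 0.50,   # probe C
                 0.00, 0.00, 0.00, 0.00, 0.00),  # probe D
               nrow = 4, byrow = TRUE,
               dimnames = list(c("A", "B", "C", "D"), paste0("S", 1:5)))

min_samples <- 2  # hypothetical cutoff for this toy example

# Number of samples in which each probe is detected at p < 0.05.
n_detected <- rowSums(pval < 0.05)
expressed <- n_detected >= min_samples

print(n_detected)
print(names(which(expressed)))
```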

Some concepts

1.Key gene screening

FilterGenes= abs(GS1)> .2 & abs(datKME$MM.brown)>.8
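
A minimal sketch of how this filter behaves, assuming (as in the WGCNA tutorials) that `GS1` holds each gene's significance for the trait and `datKME$MM.brown` its membership in the brown module. The gene names and values below are made up.

```r
# Made-up gene significance (GS1) and brown-module membership (MM.brown).
genes <- c("g1", "g2", "g3", "g4")
GS1 <- c(g1 = 0.35, g2 = -0.50, g3 = 0.10, g4 = 0.90)
datKME <- data.frame(MM.brown = c(0.92, -0.85, 0.95, 0.40), row.names = genes)

# A gene passes only if it is both trait-associated and central to the module.
FilterGenes <- abs(GS1) > 0.2 & abs(datKME$MM.brown) > 0.8
hub_genes <- names(GS1)[FilterGenes]

print(hub_genes)
```

Taking absolute values means that negatively correlated genes (here `g2`) pass as well.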

2.Sample clustering

sampleTree = hclust(dist(datExpr0), method = "average")
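
Sample clustering of this kind is often used to spot outlier samples. A minimal base R sketch on a made-up matrix (rows are samples, columns are genes); cutting the tree into two groups with `cutree` is an illustrative choice, not something taken from the original analysis.

```r
# Toy expression matrix: rows are samples, columns are genes (made-up values).
datExpr0 <- rbind(s1 = c(1.0, 2.0, 3.0),
                  s2 = c(1.1, 2.1, 2.9),
                  s3 = c(0.9, 1.9, 3.1),
                  s4 = c(9.0, 9.5, 8.0))  # deliberately far from the rest

# Average-linkage hierarchical clustering on Euclidean distances between samples.
sampleTree <- hclust(dist(datExpr0), method = "average")

# Cut the tree into two groups; a sample alone in its group is a candidate outlier.
groups <- cutree(sampleTree, k = 2)
group_size <- table(groups)
outliers <- names(groups)[as.vector(group_size[as.character(groups)]) == 1]

print(groups)
print(outliers)
```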

3.GO/KEGG


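4.Sort by standard deviation and keep the top 25% most variable genes

exp <- as.data.frame(dat)
exp$sd <- apply(exp, 1, sd)  # per-gene standard deviation across samples
exp <- exp[order(exp$sd, decreasing = TRUE), ]
dim(exp)
tj <- quantile(exp$sd, 0.75)  # 75th percentile of the per-gene SDs
tj
exp <- exp[exp$sd >= tj, ]
exp$sd <- NULL  # drop the helper column so it is not mistaken for a sample
dim(exp)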
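
The same filter can be written without adding a helper column to the expression table, by keeping the per-gene standard deviations in a separate vector. A minimal sketch on a made-up 8-gene matrix; `dat` here is toy data, not the real expression matrix.

```r
# Toy expression matrix: 8 genes x 4 samples (values are made up).
dat <- matrix(c(5,  5, 5,  5,
                1,  2, 3,  4,
                0, 10, 0, 10,
                2,  2, 3,  3,
                1,  9, 1,  9,
                4,  4, 4,  5,
                0, 20, 0, 20,
                3,  3, 3,  3),
              nrow = 8, byrow = TRUE,
              dimnames = list(paste0("g", 1:8), paste0("S", 1:4)))

gene_sd <- apply(dat, 1, sd)        # per-gene SD across samples
cutoff <- quantile(gene_sd, 0.75)   # 75th percentile of the SDs

# Keep genes at or above the cutoff, ordered from most to least variable.
top <- dat[gene_sd >= cutoff, , drop = FALSE]
top <- top[order(gene_sd[rownames(top)], decreasing = TRUE), , drop = FALSE]

print(rownames(top))
```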

ROC

Discussion:
