[R>>rbsurv] Screening hub genes

2021-05-23, by 高大石头

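Gene expression data are closely related to many kinds of clinical data, and they are especially important for judging a patient's prognosis and survival. For prognosis from microarray data, HyungJun Cho et al. developed the R package rbsurv, an algorithm based on the partial likelihood of the Cox model, which makes it easy to select key genes. Its distinguishing feature is that, depending on user-specified settings, it can generate several prognostic models built from survival-associated genes. Commendably, although the package was published in 2009, the authors were still updating it at the time of writing.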
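To make "partial likelihood of the Cox model" concrete, here is a minimal sketch that ranks genes by the negative log partial likelihood of a one-gene Cox fit. It is a simplified illustration only, not rbsurv's actual procedure. The data are simulated, and the names `n_samples`, `n_genes`, `nll_one_gene` and the gene labels `g1` to `g20` are made up for this example. It assumes the survival package is installed.

```r
library(survival)

set.seed(1)
n_samples <- 85
n_genes   <- 20

# Simulated expression matrix (genes in rows, samples in columns); not real data
x <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
            dimnames = list(paste0("g", seq_len(n_genes)), NULL))

# Only gene g1 drives the hazard in this simulation
t_event <- rexp(n_samples, rate = exp(2 * x[1, ]))
t_cens  <- rexp(n_samples, rate = 0.1)        # independent censoring times
time    <- pmin(t_event, t_cens)
status  <- as.integer(t_event <= t_cens)      # 1 = event observed, 0 = censored

# Negative log partial likelihood of a Cox model with a single gene
nll_one_gene <- function(g) {
  fit <- coxph(Surv(time, status) ~ g)        # Efron tie handling is the default
  -fit$loglik[2]                              # loglik[2] is the fitted model
}

nll <- apply(x, 1, nll_one_gene)

# loglik[1] of any fit is the null model (no covariates)
null_nll <- -coxph(Surv(time, status) ~ x[1, ])$loglik[1]

# Smaller values mean the gene explains survival better
print(head(sort(nll), 3))
print(null_nll)
```

A gene that carries no survival information leaves the value close to the null model, while an informative gene lowers it noticeably.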

Example data

# BiocManager::install("rbsurv",ask = F,update = F)
library(rbsurv)
data("gliomaSet")
gliomaSet

## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 100 features, 85 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: Chip1 Chip2 ... Chip85 (85 total)
##   varLabels: Time Status Age Gender
##   varMetadata: labelDescription
## featureData: none
## experimentData: use 'experimentData(object)'
##   pubMedIds: 15374961 
## Annotation:

Data preparation

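x <- exprs(gliomaSet)        # expression matrix (genes x samples)
x <- log2(x)                 # log2 transform
time <- gliomaSet$Time       # survival time
status <- gliomaSet$Status   # censoring status
z <- cbind(gliomaSet$Age, gliomaSet$Gender)  # risk factors used later to adjust the model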

Model 1

fit <- rbsurv(time=time, status=status, x=x, method="efron", max.n.genes=20)

## Please wait... Done.

fit$model

##     Seq Order Gene nloglik    AIC Selected
## 0     1     0    0  228.74 457.47         
## 110   1     1   46  218.53 439.05 *       
## 2     1     2   57  202.21 408.42 *       
## 3     1     3   43  195.50 396.99 *       
## 4     1     4   34  194.01 396.01 *       
## 5     1     5   99  192.14 394.29 *       
## 6     1     6   36  189.81 391.63 *       
## 7     1     7    8  188.80 391.59 *       
## 8     1     8   86  187.90 391.80         
## 9     1     9   68  187.52 393.04         
## 10    1    10   56  187.42 394.84         
## 11    1    11   15  186.68 395.37         
## 12    1    12   29  185.54 395.09         
## 13    1    13   75  185.54 397.09         
## 14    1    14   67  185.41 398.83         
## 15    1    15   40  183.76 397.52         
## 16    1    16   98  183.04 398.09         
## 17    1    17   19  182.25 398.49         
## 18    1    18   39  181.99 399.98         
## 19    1    19   96  181.88 401.76
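The AIC column in the table above appears to follow AIC = 2 * nloglik + 2 * Order, and the starred rows are the genes up to the step with the smallest AIC. The sketch below checks this on the first ten rows, copied by hand from the output into a data frame. The names `aic_check`, `max_diff`, `best` and `selected_genes` are illustrative, and the formula is inferred from the printed numbers rather than taken from the package documentation.

```r
# First ten rows of fit$model, copied by hand from the output above
model <- data.frame(
  Order   = 0:9,
  Gene    = c(0, 46, 57, 43, 34, 99, 36, 8, 86, 68),
  nloglik = c(228.74, 218.53, 202.21, 195.50, 194.01,
              192.14, 189.81, 188.80, 187.90, 187.52),
  AIC     = c(457.47, 439.05, 408.42, 396.99, 396.01,
              394.29, 391.63, 391.59, 391.80, 393.04)
)

# Recompute AIC, treating Order as the number of genes in the model
aic_check <- 2 * model$nloglik + 2 * model$Order
max_diff  <- max(abs(aic_check - model$AIC))   # only rounding error should remain

# The step with the smallest AIC defines the selected model
best <- model$Order[which.min(model$AIC)]
selected_genes <- model$Gene[model$Order >= 1 & model$Order <= best]

print(max_diff)
print(best)
print(selected_genes)
```

Since gene.ID was not supplied, the Gene column presumably holds row indices of x, so `rownames(x)[selected_genes]` would map them back to feature names.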

Model 2

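If there are important risk factors (here age and gender, passed as z), the prognostic model can also be adjusted for them.

Note: this run takes quite a long time, so decide for yourself whether it is worth it.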

fit <- rbsurv(time=time, status=status, x=x, z=z, alpha=0.05, gene.ID=NULL,
              method="efron", max.n.genes=100, n.iter=100, n.fold=3,
              n.seq=3, seed = 1234)
fit$model
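As a rough illustration of what adjusting for risk factors means, the sketch below fits a Cox model for one gene with and without age and gender as covariates. It uses simulated data and plain `coxph` from the survival package, so it shows the general idea rather than what rbsurv does internally. All variable names and effect sizes are invented for this example.

```r
library(survival)

set.seed(2)
n <- 85

# Simulated covariates and one simulated gene; not real data
age    <- round(runif(n, 20, 80))
gender <- rbinom(n, 1, 0.5)
gene   <- rnorm(n)

# In this simulation the hazard depends on the gene and on age, not on gender
t_event <- rexp(n, rate = exp(0.8 * gene + 0.03 * (age - 50)) / 10)
t_cens  <- rexp(n, rate = 0.02)               # independent censoring times
time    <- pmin(t_event, t_cens)
status  <- as.integer(t_event <= t_cens)      # 1 = event observed, 0 = censored

# Gene effect alone versus gene effect adjusted for age and gender
unadjusted <- coxph(Surv(time, status) ~ gene)
adjusted   <- coxph(Surv(time, status) ~ gene + age + gender)

# Negative log partial likelihood of each fit
nll <- c(unadjusted = -unadjusted$loglik[2],
         adjusted   = -adjusted$loglik[2])

print(nll)
print(coef(adjusted))
```

In the adjusted fit, the gene coefficient describes the gene's association with survival among patients of the same age and gender.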

Literature example 1

[Identification of an apoptosis-related prognostic gene signature and molecular subtypes of clear cell renal cell carcinoma (ccRCC)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8100811/#SM0)

Literature example 2

Literature example 3

A seven-gene signature predicts overall survival of patients with colorectal cancer

Reference:
Cho H, et al. "Robust Likelihood-Based Survival Modeling with Microarray Data." Journal of Statistical Software, 2009.