文献阅读1: 关于age的文献

2020-04-10 本文已影响0人 MJades

题目：Undulating changes in human plasma proteome profiles across the lifespan
主要是学习他的数据分析方式，所以不对introduction进行解读。

Linear modeling links the plasma proteome to functional aging and identifies a conserved aging signature.
Result中第一部分主要包括三部分分析：1. 回归分析；2. Sliding enrichment pathway analysis （SEPA）。

线性模型，以下这几张图：

Fig 1. 线性回归ANOVA分析

文中对这一部分的描述如下：

Fig 2. method描述
我利用自己的数据模仿进行了以上的分析，但没有绘图：

library(emmeans) # 计算 effect size
library(car) # 用于type2型ANOVA
attach(lmdataclin)
lm.function<-function(x){
  # calculate
  lm.pro<-lm(x~AGE+SEX,data=lmdataclin)
  x.age<-lm.pro$coefficients[2]
  y.age<-summary(lm.pro)$"coefficients"[2,4]
  x.sex<-lm.pro$coefficients[3]
  y.sex<-summary(lm.pro)$"coefficients"[3,4]
  mm<-Anova(lm.pro,type=2)
  eta_sq(mm,partial=TRUE)
  partial.age<-eta_sq(mm,partial=TRUE)[1,2]
  partial.sex<-eta_sq(mm,partial=TRUE)[2,2]
  lm.list<-data.frame(ß.age=x.age, pvalue.age=y.age, ß.sex=x.sex,pvalue.sex=y.sex,effect.size.age=partial.age,effect.size.sex=partial.sex)
  return(lm.list)
}
aov.pro<-apply(lmdataclin[,100:651],2,lm.function) # 批量处理
sum.aov<-do.call(rbind,lm.pro) # 进行合并

结果如下：

Fig 3. 结果展示

p.value的校正可参考：
https://blog.csdn.net/zhu_si_tao/article/details/71077703?depth_1-utm_source=distribute.pc_relevant.none-task&utm_source=distribute.pc_relevant.none-task
具体计算：

 p.adjust(sum.aov$pvalue.age,method="BH",n=length(sum.aov$pvalue.age)) # BH就是FDR校正

Sliding Enrichment pathway analysis （SEPA）
方法中对SEPA的介绍如下： SEPA介绍
涉及到了GSEA的内容，可参考以下内容：
https://www.jianshu.com/p/e28783bdd092
https://www.jianshu.com/p/b409a5576ce1
我下载了补充材料, 里面有2025个蛋白的信息，进行尝试。
Method 1. clusterProfiler运行之后，和报道的结果不太一致；

Gene.back<-str_match(data$EntrezGeneSymbol,"[A-Z0-9]*")    %>% as.vector
Gene.back<-bitr(Gene.back, fromType = "SYMBOL",toType =    "ENTREZID",OrgDb = org.Hs.eg.db)
Gene.back<-Gene.back$ENTREZID
Gene.back[duplicated(Gene.back)]

data.rank.up<-data[data$Coefficient.Age>0,] %>% .[1:100,]
Gene.rank.up<-str_match(data.rank.up$EntrezGeneSymbol,"     [A-Z0-9]*") %>% as.vector
Gene.rank.up<-bitr(Gene.rank.up, fromType = "SYMBOL",toType = "ENTREZID",OrgDb = org.Hs.eg.db)
Gene.rank.up<-Gene.rank.up$ENTREZID
Gene.rank.up[duplicated(Gene.rank.up)]

go.up<- enrichGO(gene = Gene.rank.up,
               OrgDb=org.Hs.eg.db,
               ont = "ALL", # MF,BP,CC
               pAdjustMethod = "BH",
               minGSSize = 1,
               maxGSSize = 500,
               pvalueCutoff = 1,
               qvalueCutoff = 1,
               readable = TRUE,
               universe = Gene.back,
               keyType = "ENTREZID")

clusterProfiler

Method 2. gprofiler2，没有结果。

library(ggplot2)
library(gprofiler2)
Gene.rank.up.gp<-    str_match(data.rank.up$EntrezGeneSymbol,"[A-Z0-9]*")  %>% na.omit%>%unique %>%  as.vector
Gene.back.gp<-str_match(data$EntrezGeneSymbol,"[A-Z0-9]*") %>% na.omit%>%unique %>%  as.vector

# query type: Gene symbol
gostres <- gost(query=Gene.rank.up.gp,organism =  "hsapiens", ordered_query = TRUE, 
            multi_query = FALSE, significant = TRUE, exclude_iea = FALSE, 
            measure_underrepresentation = FALSE, evcodes = FALSE, 
            user_threshold = 0.05, correction_method = "fdr", 
            domain_scope = "annotated", #custom_bg = Gene.back.gp, 
            numeric_ns = "", sources = c("GO:BP") , as_short_link = FALSE)
p <- gostplot(gostres)

gprofiler2

暂时不知道问题出在哪，找到方法后再补充～

Prediction of human biological age using the plasma proteome.

Prediction method

作者是用glmnet包做的，目前这个包已经更新，需要3.6.0版本以上的R才可以安装～

glmnet

未完，待续.....

文献阅读1: 关于age的文献

猜你喜欢

热点阅读