Gene functional enrichment and regulatory network analysis: FGNet

2019-07-04  尧小飞

Introduction to FGNet

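  FGNet (Functional Gene Networks derived from biological enrichment analyses) runs a functional enrichment analysis (FEA) on a list of genes or an expression set and converts the result into a network. The resulting functional network gives an overview of the biological functions of the genes and terms, and makes it easy to see links between genes, overlap between clusters, key genes, and so on.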

Examples of functional network for different analyses
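  In practice the workflow is: supply a gene list, run annotation enrichment with an external tool (for example DAVID, GeneTerm Linker, topGO, or GAGE), narrow the enrichment results down to one or a few functions of interest, and finally visualize the network and look for core genes.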

Installing FGNet

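  FGNet is installed through Bioconductor. Because the annotation enrichment step relies on third-party packages, it is advisable to install those packages as well.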

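if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("FGNet")
# Back ends and annotation data; org.Hs.eg.db is needed by the GAGE example.
# Availability depends on the Bioconductor release (KEGG.db, for one, may be gone).
BiocManager::install(c("RGtk2", "RCurl", "RDAVIDWebService", "gage", "topGO",
                       "GO.db", "KEGG.db", "reactome.db", "org.Sc.sgd.db",
                       "org.Hs.eg.db"))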
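
Before running the examples, it can help to confirm which of these packages actually ended up installed. The following is a minimal sketch in base R; the package vector simply repeats the names used in this post, and `system.file()` is used so that nothing gets loaded during the check.

```r
# Package names taken from this post; adjust the vector to your own setup.
pkgs <- c("FGNet", "RGtk2", "RCurl", "RDAVIDWebService", "gage", "topGO",
          "GO.db", "KEGG.db", "reactome.db", "org.Sc.sgd.db", "org.Hs.eg.db")

# system.file() returns "" when a package is not installed,
# so nzchar() turns that into TRUE/FALSE without loading anything.
installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))

missingPkgs <- names(installed)[!installed]
if (length(missingPkgs) > 0) {
  message("Not installed: ", paste(missingPkgs, collapse = ", "))
} else {
  message("All packages are available.")
}
```

Packages reported as missing may simply not exist for the Bioconductor release in use, in which case the corresponding enrichment tool has to be skipped.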

Testing

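  Four tools are available for annotation enrichment, but because of a problem on our cluster only three of them could be used: DAVID, topGO, and GAGE. Preliminary results from these three tools are shown here.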

Preparing the gene set

#Here is an example analyzing a gene list with the different tools:
genesYeast <- c("ADA2", "APC1", "APC11", "APC2", "APC4", "APC5", "APC9", "CDC16", 
                "CDC23", "CDC26", "CDC27", "CFT1", "CFT2", "DCP1", "DOC1", "FIP1", 
                "GCN5", "GLC7", "HFI1", "KEM1", "LSM1", "LSM2", "LSM3", "LSM4", 
                "LSM5", "LSM6", "LSM7", "LSM8", "MPE1", "NGG1", "PAP1", "PAT1", 
                "PFS2", "PTA1", "PTI1", "REF2", "RNA14", "RPN1", "RPN10", "RPN11", 
                "RPN13", "RPN2", "RPN3", "RPN5", "RPN6", "RPN8", "RPT1", "RPT3", 
                "RPT6", "SGF11", "SGF29", "SGF73", "SPT20", "SPT3", "SPT7", "SPT8", 
                "TRA1", "YSH1", "YTH1")
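library(org.Sc.sgd.db)
# Named vector: names are systematic ORF IDs, values are gene names
geneLabels <- unlist(as.list(org.Sc.sgdGENENAME))
# Keep only the genes of interest; symbols without a match are dropped
genesYeast <- sort(geneLabels[which(geneLabels %in% genesYeast)])

# Optional: Gene expression (1=UP, -1=DW)
# The first 28 genes are marked UP and the rest DW, so the vector length
# always matches genesYeast (58 matched genes gives 28 UP and 30 DW)
nUp <- 28
genesYeastExpr <- setNames(c(rep(1, nUp), rep(-1, length(genesYeast) - nUp)), genesYeast)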
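
The lookup step above filters a named vector by its values, which silently drops any symbol that is missing from the annotation. Here is a small self-contained sketch of that behaviour. The ID-to-symbol map is a toy stand-in for `org.Sc.sgdGENENAME`, and the `ORF00x` identifiers are made up for illustration.

```r
# Toy map: names are (made-up) ORF IDs, values are gene names, NA when unnamed.
geneLabels <- c(ORF001 = "GCN5", ORF002 = "ADA2", ORF003 = "TFC3", ORF004 = NA)

# Symbols we want to analyze; KEM1 has no entry in the toy map.
wanted <- c("ADA2", "GCN5", "KEM1")

# Same pattern as in the post: keep matching entries, then sort by gene name.
matched <- sort(geneLabels[which(geneLabels %in% wanted)])

matched                   # gene names, still carrying their IDs as names
names(matched)            # the IDs, which is what gets passed to fea_david()
setdiff(wanted, matched)  # symbols that were dropped without a warning
```

Checking `setdiff()` against the input list is a cheap way to see whether the matched gene count differs from what later steps assume.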

Annotation enrichment analysis

library(FGNet)
#######DAVID
feaResults_David <- fea_david(names(genesYeast), geneLabels=genesYeast)#email="example@email.com"
#######TopGO
feaResults_topGO <- fea_topGO(genesYeast, geneIdType="GENENAME", organism="Sc") 
#######Gene-Term Linker (raised an error here, so it is commented out)
# jobID <- fea_gtLinker(geneList=genesYeast, organism="Sc")
# ?fea_gtLinker
# jobID <- 3907019
# feaResults_gtLinker <- fea_gtLinker_getResults(jobID=jobID, organism="Sc")
######GAGE
library(gage)
data(gse16873)
# Set gene labels? (they need to have unique identifiers)
library(org.Hs.eg.db)
# Map the Entrez IDs in gse16873 to gene symbols (this overwrites the yeast geneLabels)
geneSymbols <- AnnotationDbi::select(org.Hs.eg.db, columns="SYMBOL", keytype="ENTREZID",
                                     keys=rownames(gse16873))
geneLabels <- geneSymbols$SYMBOL
names(geneLabels) <- geneSymbols$ENTREZID
head(geneLabels)
#GAGE: compare the DCIS samples against the HN reference samples
feaResults_gage <- fea_gage(eset=gse16873,
                            refSamples=grep('HN',colnames(gse16873)),
                            compSamples=grep('DCIS',colnames(gse16873)),
                            geneLabels=geneLabels, annotations="REACTOME",
                            geneIdType="ENTREZID", organism="Hs")
### Generate the reports
FGNet_report(feaResults_David, geneExpr=genesYeastExpr)
FGNet_report(feaResults_topGO, geneExpr=genesYeastExpr)  
#FGNet_report(feaResults_gtLinker, geneExpr=genesYeastExpr)
FGNet_report(feaResults_gage)

Test results

  Functional enrichment with DAVID:

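Overview of the DAVID results
Distances between Clusters
  Functional enrichment with topGO:

Overview of the topGO results
  Functional enrichment with gage:

Overview of the gage results
  The three reports are laid out almost identically, but the enrichment clusters differ. DAVID and topGO received the same yeast gene list (GAGE was run on the gse16873 expression set), and their clusters still do not match, mainly because each tool uses a different enrichment method.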

Filtering the results further

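  The sections above covered the standard functional enrichment network results. These preliminary results are often not directly usable for downstream analysis, so they need to be filtered. A few filtering approaches are shown here; for the others, see the official documentation.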

  The commands for filtering a Genes - Terms network are as follows:

##Genes - Terms networks
gtSets <- feaResults_David$geneTermSets
# Keep cluster 9 only, and drop very broad terms (500 or more population hits)
gtSets <- gtSets[gtSets$Cluster %in% c(9),] 
gtSets <- gtSets[gtSets$Pop.Hits<500,]
# Incidence matrix with terms in rows and genes in columns
termsGenes <- t(fea2incidMat(gtSets, clusterColumn="Terms")$clustersMatrix)
library(R.utils)
# Shorten each row name to the capitalized text after the last ":"
rownames(termsGenes) <- sapply(strsplit(rownames(termsGenes), ":"),function(x) capitalize(x[length(x)]))
termsGenes[1:5,1:5]
pdf('Genes_Terms_networks.pdf', width=12, height=8)
functionalNetwork(t(termsGenes), plotType="bipartite", keepAllNodes=TRUE,legendPrefix="", plotTitle="Genes - Terms network", plotTitleSub="",geneExpr=genesYeastExpr, plotExpression="Fill")
functionalNetwork(termsGenes, plotType="bipartite", keepAllNodes=TRUE,legendPrefix="", plotTitle="Genes - Terms network", plotTitleSub="")
dev.off()
Genes - Terms networks
  In the figure above, squares represent the key genes and circles represent GO terms.
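
To make the idea of a genes-terms incidence matrix concrete, here is a minimal base R sketch that builds one from a long table of annotations and transposes it, as the commands above do with `t()`. The gene and term names are invented placeholders, not FGNet output.

```r
# Hypothetical gene-term annotations in long format (one row per link).
annot <- data.frame(
  gene = c("geneA", "geneB", "geneB", "geneC", "geneD"),
  term = c("term1", "term1", "term2", "term2", "term2"),
  stringsAsFactors = FALSE
)

# Rows = genes, columns = terms; 1 means the gene is annotated to the term.
incid <- unclass(table(annot$gene, annot$term))
incid[incid > 0] <- 1

# Genes attached to more than one term connect otherwise separate groups.
degree <- rowSums(incid)
bridgeGenes <- names(degree)[degree > 1]
bridgeGenes

# Transposing swaps the two node types: rows = terms, columns = genes.
t(incid)
```

In a bipartite plot, every 1 in this matrix becomes an edge between a gene node and a term node, which is why genes with several 1s in their row stand out as connectors.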

  The commands for selecting specific clusters are as follows:

##Selecting specific clusters
incidMat <- fea2incidMat(feaResults_David)
# Distances between clusters, useful when deciding which ones to keep
distMat <- clustersDistance(incidMat)
# Flag the clusters to keep
selectedClusters <- rep(FALSE, nrow(feaResults_David$clusters))
selectedClusters[c(8,9,11)] <- TRUE
tmpFea <- feaResults_David
tmpFea$clusters <- cbind(tmpFea$clusters, select=selectedClusters)
# The filter removes clusters that match the condition, so select != TRUE keeps the flagged ones
incidMatSelection <- fea2incidMat(tmpFea, filterAttribute="select", filterOperator="!=",filterThreshold="TRUE")
pdf('Selecting_specific_clusters.pdf', width=12, height=8)
functionalNetwork(incidMatSelection, eColor=NA)
dev.off()
Selecting specific clusters
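  In the figure above, each color marks a different cluster. Only clusters 8, 9, and 11 were selected, so only those three are drawn. The same approach can be used to display any clusters of interest.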
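
The `!=` operator in the filter can look backwards at first. The sketch below mimics the logic in base R under the assumption, based on my reading of the FGNet documentation, that clusters meeting the attribute-operator-threshold condition are the ones filtered out. The cluster table is a toy with 12 clusters, a count chosen only for illustration.

```r
# Toy cluster table; the real one comes from feaResults_David$clusters.
clusters <- data.frame(Cluster = 1:12)
clusters$select <- clusters$Cluster %in% c(8, 9, 11)

# Assumed filter semantics: a cluster is dropped when the condition holds.
# Condition here: select != TRUE, i.e. every cluster that was not flagged.
dropped <- clusters$select != TRUE

kept <- clusters$Cluster[!dropped]
kept
```

Under that assumption, flagging the wanted clusters with `TRUE` and filtering on `!=` leaves exactly the flagged clusters in the network.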

This post only records basic usage of FGNet. If you are interested in the underlying computation and analysis, see the official documentation.

July 7, 2019
