Gene ID conversion (GEO related)

Using AnnotationDbi (with org.Hs.eg.db as the example)

2019-11-26  BeeBee生信

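Anyone who has analyzed microarray data has probably come across the annotation packages on Bioconductor, such as org.Hs.eg.db for human, org.Mm.eg.db for mouse, and org.Rn.eg.db for rat. AnnotationDbi provides the methods for accessing the annotation data in these packages. This post uses the most common one, org.Hs.eg.db for human, to briefly show how they work.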

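First, load the packages. There is no need to load AnnotationDbi explicitly: loading org.Hs.eg.db is enough, because it attaches AnnotationDbi as a dependency. tidyverse is loaded only to get the %>% operator.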

library(org.Hs.eg.db)
library(tidyverse)

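Use the keytypes and columns functions to show which annotation fields the package contains. keytypes lists the fields that can be used as lookup keys, and columns lists the fields that can be retrieved. For org.Hs.eg.db the two lists are the same.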

> keytypes(org.Hs.eg.db)
 [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS"
 [6] "ENTREZID"     "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"    
[11] "GO"           "GOALL"        "IPI"          "MAP"          "OMIM"        
[16] "ONTOLOGY"     "ONTOLOGYALL"  "PATH"         "PFAM"         "PMID"        
[21] "PROSITE"      "REFSEQ"       "SYMBOL"       "UCSCKG"       "UNIGENE"     
[26] "UNIPROT"     
> columns(org.Hs.eg.db)
 [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS"
 [6] "ENTREZID"     "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"    
[11] "GO"           "GOALL"        "IPI"          "MAP"          "OMIM"        
[16] "ONTOLOGY"     "ONTOLOGYALL"  "PATH"         "PFAM"         "PMID"        
[21] "PROSITE"      "REFSEQ"       "SYMBOL"       "UCSCKG"       "UNIGENE"     
[26] "UNIPROT" 
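
Since field names have to be spelled exactly as listed above, it can help to check a name before querying. The following is only a minimal sketch: the helper name is_valid_field is hypothetical and is not part of AnnotationDbi.

```r
library(org.Hs.eg.db)

# Compare the two listings as sets (the order of elements is ignored).
same_fields <- setequal(keytypes(org.Hs.eg.db), columns(org.Hs.eg.db))

# Hypothetical helper: TRUE if `field` can be used both as a key type
# and as a retrievable column in the annotation package `db`.
is_valid_field <- function(db, field) {
  field %in% keytypes(db) && field %in% columns(db)
}

is_valid_field(org.Hs.eg.db, "SYMBOL")
is_valid_field(org.Hs.eg.db, "symbol")  # field names are case-sensitive
```

With the lists shown above, the first call should return TRUE and the second FALSE, since only the upper-case name appears in them.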

Use the keys function to list the keys available for a given annotation field.

> keys(org.Hs.eg.db, keytype="PATH") %>% head()
[1] "04610" "00232" "00983" "01100" "00380" "00970"
> keys(org.Hs.eg.db, keytype="SYMBOL") %>% head()
[1] "A1BG"  "A2M"   "A2MP1" "NAT1"  "NAT2"  "NATP" 
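
The full key list for a field like SYMBOL is long, so it is often handier to filter it. As far as I know, keys also accepts a pattern argument that is matched against the keys as a regular expression. The sketch below assumes that behavior, and the variable name nat_symbols is made up for the example.

```r
library(org.Hs.eg.db)

# Keep only the gene symbols that start with "NAT".
# `pattern` is assumed to be applied as a regular expression to the keys.
nat_symbols <- keys(org.Hs.eg.db, keytype = "SYMBOL", pattern = "^NAT")
head(nat_symbols)
```

NAT1 and NAT2 from the listing above should be among the results.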

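The select function returns the requested annotation data. The example below looks up the ENTREZID and UNIPROT ID for a set of gene symbols. The call is written as AnnotationDbi::select because dplyr, which tidyverse attaches, also exports a function named select that masks the AnnotationDbi one.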

> symbols <- keys(org.Hs.eg.db, keytype="SYMBOL")[1:10] 
> symbols
 [1] "A1BG"     "A2M"      "A2MP1"    "NAT1"     "NAT2"     "NATP"    
 [7] "SERPINA3" "AADAC"    "AAMP"     "AANAT" 

> AnnotationDbi::select(org.Hs.eg.db, keys=symbols, columns=c("ENTREZID", "UNIPROT"), keytype="SYMBOL")
'select()' returned 1:many mapping between keys and columns
     SYMBOL ENTREZID    UNIPROT
1      A1BG        1     P04217
2      A1BG        1     V9HWD8
3       A2M        2     P01023
4     A2MP1        3       <NA>
5      NAT1        9     P18440
6      NAT1        9     Q400J6
7      NAT1        9     F5H5R8
8      NAT2       10     A4Z6T7
9      NAT2       10     P11245
10     NATP       11       <NA>
11 SERPINA3       12 A0A024R6P0
12 SERPINA3       12     P01011
13    AADAC       13     P22760
14     AAMP       14 A0A024R410
15     AAMP       14     Q13685
16     AAMP       14     C9JEH3
17    AANAT       15     F1T0I5
18    AANAT       15     Q16613
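
As the output shows, one symbol can map to several UNIPROT IDs, so select returns more rows than there are keys. When exactly one value per key is wanted, AnnotationDbi's mapIds function is an alternative. The sketch below assumes that keeping the first match is acceptable, which is an arbitrary choice and not something this post's workflow requires.

```r
library(org.Hs.eg.db)

symbols <- c("A1BG", "A2M", "NAT1")

# multiVals = "first" keeps only the first match for each key, so the
# result is a named character vector with one element per input symbol.
# Which ID comes first depends on the order in the annotation database.
uniprot <- mapIds(org.Hs.eg.db, keys = symbols, column = "UNIPROT",
                  keytype = "SYMBOL", multiVals = "first")
uniprot
```

Passing multiVals = "list" instead keeps every match and returns a list with one character vector per symbol.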
