[R] Gene ID conversion, and converting mouse genes to human genes
2021-04-03
小贝学生信
- Converting between three gene identifier formats: Ensembl, Entrez ID, and Symbol
# Example: 10 human genes (Symbol format)
gene_symbol=c("RHO","CALM1","MEG3","GNGT1","SAG","RPGRIP1","TRPM1","PCP2","PCP4","AP1B1")
Method 1: the org.Hs.eg.db package
library(org.Hs.eg.db)
keytypes(org.Hs.eg.db)
gene_ids <- AnnotationDbi::select(org.Hs.eg.db, keys = as.character(gene_symbol),
                                  columns = c("ENSEMBL", "ENTREZID"), # target formats
                                  keytype = "SYMBOL")                 # current format
gene_ids
# For mouse genes, load org.Mm.eg.db and use it in the same way
library(org.Mm.eg.db)
keytypes(org.Mm.eg.db)
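A symbol can map to more than one Ensembl ID, in which case select() returns several rows for that symbol, and symbols without a match usually come back with NA in the target columns. The following is a minimal base R sketch of how such a result might be inspected. The data frame toy_ids is a hypothetical stand-in with the same column names as gene_ids, and its IDs are placeholders rather than real annotations.

```r
# Toy stand-in for the data frame returned by AnnotationDbi::select();
# symbols and IDs are placeholders, not real annotations.
toy_ids <- data.frame(
  SYMBOL   = c("GENE_A", "GENE_A", "GENE_B", "GENE_C"),
  ENSEMBL  = c("ENS_A1", "ENS_A2", "ENS_B1", NA),
  ENTREZID = c("1", "1", "2", NA),
  stringsAsFactors = FALSE
)

# Symbols that occupy more than one row (1:many mappings)
multi_mapped <- unique(toy_ids$SYMBOL[duplicated(toy_ids$SYMBOL)])

# Symbols for which no Ensembl ID was found
unmapped <- toy_ids$SYMBOL[is.na(toy_ids$ENSEMBL)]

# One simple policy (an assumption, not the only option):
# keep the first row returned for each symbol
first_hit <- toy_ids[!duplicated(toy_ids$SYMBOL), ]
```

Keeping the first hit is convenient but arbitrary; for downstream analysis it may be safer to review the multi-mapped symbols by hand.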
Method 2: the biomaRt package
library("biomaRt")
ensembl = useMart("ensembl",dataset="hsapiens_gene_ensembl")
attributes = listAttributes(ensembl)
attributes[1:5,]
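# Optional workaround if the connection fails with an SSL certificate error
# (this disables certificate verification, so use it with care):
# library(httr)
# httr::set_config(httr::config(ssl_verifypeer = 0L))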
gene_ids2 <- getBM(filters= "hgnc_symbol",
attributes= c("hgnc_symbol","ensembl_gene_id","entrezgene_id"),
values = gene_symbol, mart= ensembl)
gene_ids2
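The rows returned by getBM() are not guaranteed to follow the order of the input vector, and symbols with no match are simply absent from the result. The sketch below shows one way to realign such a result with the query using base R match(). The names query, toy_bm and aligned are hypothetical, and the symbols and IDs are placeholders rather than real Ensembl data.

```r
# Hypothetical query vector (placeholder symbols)
query <- c("GENE_A", "GENE_B", "GENE_C")

# Toy stand-in for a getBM() result: different row order, "GENE_C" absent.
# The IDs are placeholders, not real Ensembl IDs.
toy_bm <- data.frame(
  hgnc_symbol     = c("GENE_B", "GENE_A"),
  ensembl_gene_id = c("placeholder_2", "placeholder_1"),
  stringsAsFactors = FALSE
)

# For each queried symbol, find its row in the result (NA if absent)
idx <- match(query, toy_bm$hgnc_symbol)

# Rebuild the table in query order; unmatched symbols get NA
aligned <- data.frame(
  hgnc_symbol     = query,
  ensembl_gene_id = toy_bm$ensembl_gene_id[idx],
  stringsAsFactors = FALSE
)
```

Note that match() only picks the first matching row, so symbols with several Ensembl IDs would need separate handling.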
- Converting mouse gene names to human gene names
musGenes <- c("Hmmr", "Tlx3", "Cpeb4")
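Method 1: direct conversion
For SYMBOL-format names, a mouse gene and its human counterpart usually differ only in letter case.
Human gene symbols are written entirely in uppercase, whereas mouse gene symbols capitalize only the first letter and keep the rest lowercase.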
toupper(musGenes)
# [1] "HMMR" "TLX3" "CPEB4"
There are exceptions, however, so for a more rigorous conversion use the method below.
Method 2: the biomaRt package
library("biomaRt")
human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
mouse = useMart("ensembl", dataset = "mmusculus_gene_ensembl")
# Query the mouse mart by MGI symbol and fetch the linked human HGNC symbol
genes = getLDS(attributes = c("mgi_symbol"), filters = "mgi_symbol",
               values = musGenes,
               mart = mouse,
               attributesL = c("hgnc_symbol"),
               martL = human, uniqueRows = TRUE)
# MGI.symbol HGNC.symbol
# 1 Cpeb4 CPEB4
# 2 Hmmr HMMR
# 3 Tlx3 TLX3
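To see whether the simple upper-casing shortcut would have been enough for a given gene list, the getLDS() result can be compared with toupper(). The sketch below re-enters the output printed above by hand so that it runs without a network connection. The variable names no_ortholog and case_mismatch are introduced here for illustration only.

```r
musGenes <- c("Hmmr", "Tlx3", "Cpeb4")

# The getLDS() result shown above, typed in by hand so this runs offline
genes <- data.frame(
  MGI.symbol  = c("Cpeb4", "Hmmr", "Tlx3"),
  HGNC.symbol = c("CPEB4", "HMMR", "TLX3"),
  stringsAsFactors = FALSE
)

# Mouse genes for which no human symbol was returned
no_ortholog <- setdiff(musGenes, genes$MGI.symbol)

# Rows where upper-casing the mouse symbol would NOT give the human symbol
case_mismatch <- genes[toupper(genes$MGI.symbol) != genes$HGNC.symbol, ]
```

For these three genes both checks come back empty, so the two methods agree; on a longer list, any rows in case_mismatch would be the exceptions mentioned earlier.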