Single-cell

Getting Started with Single-Cell Analysis: Using SingleR to Help Annotate Cell Types by Cluster

2021-05-19  7f0a92cda77c

Download the reference datasets

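SingleR comes with seven reference datasets (loaded here through the celldex package). Downloading them requires an internet connection. Five are human datasets and two are mouse datasets:
BlueprintEncodeData (human)
HumanPrimaryCellAtlasData (human)
DatabaseImmuneCellExpressionData (human)
NovershternHematopoieticData (human)
MonacoImmuneData (human)
ImmGenData (mouse)
MouseRNAseqData (mouse)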

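rm(list = ls())
# Install once; BiocManager itself comes from CRAN
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("SingleR", "celldex"))
library(SingleR)  # provides SingleR() and plotScoreHeatmap(), both used below
library(celldex)
hpca.se <- HumanPrimaryCellAtlasData()  # downloads the reference, so it needs internet access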

Load the dataset

library(dplyr)
library(Seurat)
library(patchwork)
# Read the 10x Genomics PBMC 3k matrix and build a Seurat object
data_dir <- "data/pbmc3k_filtered_gene_bc_matrices/filtered_gene_bc_matrices/hg19"
pbmc.data <- Read10X(data.dir = data_dir)
pbmc <- CreateSeuratObject(counts = pbmc.data, 
                           project = "pbmc3k", 
                           min.cells = 3,
                           min.features = 200)
# QC: keep cells with 200-2500 detected genes and under 5% mitochondrial reads
pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)
# The normalized values are stored in pbmc[["RNA"]]@data
pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize", scale.factor = 10000)
pbmc <- FindVariableFeatures(pbmc, selection.method = "vst", nfeatures = 2000)
top10 <- head(VariableFeatures(pbmc), 10)
# Scale all genes, then run PCA on the variable features
all.genes <- rownames(pbmc)
pbmc <- ScaleData(pbmc, features = all.genes)
pbmc <- RunPCA(pbmc, features = VariableFeatures(object = pbmc))
ElbowPlot(pbmc)
# Cluster and embed using the first 10 principal components
pbmc <- FindNeighbors(pbmc, dims = 1:10)
pbmc <- FindClusters(pbmc, resolution = 0.5)
pbmc <- RunUMAP(pbmc, dims = 1:10)
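
SingleR is later given log-normalized values, so it helps to see what the "LogNormalize" step computes. The following is a minimal base-R sketch on a made-up count matrix (the gene names, cell names, and counts are all hypothetical). It illustrates the arithmetic only and is not Seurat's implementation: each cell's counts are divided by that cell's total, multiplied by the scale factor, and passed through log(1 + x).

```r
# Toy count matrix: rows are genes, columns are cells (values are made up).
# matrix() fills column by column, so cell1 = (0, 2, 8), cell2 = (5, 0, 5), cell3 = (1, 1, 2).
counts <- matrix(c(0, 2, 8,
                   5, 0, 5,
                   1, 1, 2),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("cell1", "cell2", "cell3")))

scale_factor <- 10000
lib_size <- colSums(counts)  # total counts per cell

# Divide every column by its library size, rescale, then apply log(1 + x).
lognorm <- log1p(sweep(counts, 2, lib_size, "/") * scale_factor)

print(round(lognorm, 3))
```

Undoing the log with expm1() gives columns that each sum to the scale factor, which is a quick sanity check that a matrix really holds values normalized this way.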

The data is now ready.


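# Extract the normalized data (Seurat v5 names this argument `layer` instead of `slot`)
testdata <- GetAssayData(pbmc, slot = "data")
dim(testdata)
testdata[1:30, 1:4]
clusters <- pbmc@meta.data$seurat_clusters
table(clusters)
# Annotate per cluster rather than per cell. Newer SingleR releases may warn
# that `method` is deprecated, because passing `clusters` is enough there.
cellpred <- SingleR(test = testdata,  
                    ref = hpca.se,
                    labels = hpca.se$label.main,
                    method = "cluster", 
                    clusters = clusters,
                    assay.type.test = "logcounts", 
                    assay.type.ref = "logcounts")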

str(cellpred,max.level = 3)
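
To give a feel for what cluster-level annotation does, here is a deliberately simplified base-R sketch. It is not SingleR's actual algorithm, which also selects marker genes and fine-tunes close calls. The sketch assumes a tiny hypothetical reference with one profile per label, averages the test cells within each cluster, and assigns each cluster the label with the highest Spearman correlation. All gene names and values are invented.

```r
genes <- paste0("gene", 1:6)

# Hypothetical reference: one log-expression profile per label.
ref <- cbind(T_cells  = c(5, 4, 0, 0, 1, 1),
             Monocyte = c(0, 1, 5, 4, 1, 0),
             B_cell   = c(1, 0, 0, 1, 5, 4))
rownames(ref) <- genes

# Hypothetical test data: four cells that fall into two clusters.
test <- cbind(c1 = c(5.2, 3.9, 0.1, 0.0, 1.1, 0.8),
              c2 = c(4.8, 4.1, 0.0, 0.2, 0.9, 1.2),
              c3 = c(0.1, 1.2, 4.9, 4.2, 0.8, 0.0),
              c4 = c(0.0, 0.9, 5.1, 3.8, 1.1, 0.2))
rownames(test) <- genes
clusters <- factor(c("0", "0", "1", "1"))

# Step 1: collapse cells into one averaged profile per cluster.
profiles <- sapply(levels(clusters), function(k) {
  rowMeans(test[, clusters == k, drop = FALSE])
})

# Step 2: Spearman correlation of every cluster profile with every reference label.
scores <- cor(profiles, ref, method = "spearman")

# Step 3: the best-scoring label wins.
labels <- colnames(scores)[max.col(scores, ties.method = "first")]
names(labels) <- rownames(scores)
print(labels)
```

Because the labels are assigned to averaged profiles, every cell in a cluster inherits the same label. This is why the result has one row per cluster rather than one row per cell.
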
The resulting data:
celltype = data.frame(ClusterID = rownames(cellpred), 
                      celltype = cellpred$labels, 
                      stringsAsFactors = FALSE)
celltype
#  ClusterID  celltype
#1         0   T_cells
#2         1  Monocyte
#3         2   T_cells
#4         3    B_cell
#5         4   T_cells
#6         5  Monocyte
#7         6   NK_cell
#8         7  Monocyte
#9         8 Platelets
write.csv(celltype, "data/celltype_anno_SingleR.csv")

This gives a cell type for each cluster. Treat the result as a reference only.
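
Since the table has one row per cluster, a common next step is to copy each cluster's label back onto its cells. The sketch below shows one way to do that with match() in base R. It uses small hypothetical stand-ins for `celltype` and for the per-cell cluster vector so that it runs without the Seurat object.

```r
# Hypothetical cluster-level table, same shape as `celltype` above.
celltype <- data.frame(ClusterID = c("0", "1", "2"),
                       celltype  = c("T_cells", "Monocyte", "T_cells"),
                       stringsAsFactors = FALSE)

# Hypothetical per-cell cluster assignments (stand-in for pbmc$seurat_clusters).
clusters <- factor(c("1", "0", "2", "0", "1"))

# For every cell, match() returns the table row of that cell's cluster.
# Converting the factor to character avoids matching on integer codes.
cell_labels <- celltype$celltype[match(as.character(clusters), celltype$ClusterID)]

print(table(cell_labels))
```

With the real object, the resulting vector could be stored as a metadata column, for example `pbmc$celltype <- cell_labels`, and then used to group or color cells in plots.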

# The annotations shown on the score heatmap still need manual correction
p = plotScoreHeatmap(cellpred, clusters = rownames(cellpred), order.by = "clusters")
p

Inspect the score heatmap p and check the scores for each cluster. Yellow marks the highest score.
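
The same check can be done numerically. A cluster whose best score barely beats the runner-up is a candidate for manual review. The sketch below runs on a hypothetical score matrix with one row per cluster and one column per label. With real results, `cellpred$scores` should hold a matrix of this shape, but the values here are invented and no review threshold is implied.

```r
# Hypothetical score matrix: one row per cluster, one column per reference label.
scores <- rbind("0" = c(T_cells = 0.62, NK_cell = 0.58, Monocyte = 0.31),
                "1" = c(T_cells = 0.30, NK_cell = 0.28, Monocyte = 0.71))

# Best-scoring label per cluster.
top <- colnames(scores)[max.col(scores, ties.method = "first")]

# Gap between the best and second-best score; a small gap means an ambiguous call.
margin <- apply(scores, 1, function(s) {
  s <- sort(s, decreasing = TRUE)
  unname(s[1] - s[2])
})

print(data.frame(cluster = rownames(scores), top = top, margin = round(margin, 2)))
```

In this toy example cluster 0 separates T_cells from NK_cell by only a narrow gap, which is the kind of call worth checking against known marker genes.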

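This post draws on material shared by 生信技能树, not all of which is listed individually:
单细胞2021公开课 (Single-Cell 2021 Open Course), instructor 张娟 (Zhang Juan)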
https://cloud.tencent.com/developer/article/1692253
https://cloud.tencent.com/developer/user/7131101
