10x Genomics PBMC (7): Cluster Analysis After Data Integration

2020-06-11  程凉皮儿

Cluster Analysis in Integrated Data

clp

11 June, 2020

Preparation

Load the R objects saved in the previous tutorial, along with the required R packages.

library(data.table)
library(ggplot2)
library(Seurat)

load('out/01_immune_combined.rd')  # restores the object immune.combined
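Before running the rest of the tutorial, it can help to confirm that the packages are actually installed. The following is a minimal base-R sketch; the names `required_pkgs`, `is_installed`, and `missing_pkgs` are our own, and listing `metap` is an assumption (in the Seurat versions we are aware of, FindConservedMarkers needs it to combine p-values).

```r
# Packages used in this tutorial; "metap" is an assumption, not from the original post.
required_pkgs <- c("data.table", "ggplot2", "Seurat", "metap")

# requireNamespace() returns TRUE/FALSE and does not attach the package.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs to be installed.
missing_pkgs <- required_pkgs[!is_installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```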

Identifying conserved cell type markers

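To identify canonical cell type marker genes that are conserved across conditions, Seurat provides the FindConservedMarkers function. It runs a differential expression test within each dataset/group and then combines the resulting p-values using meta-analysis methods from the MetaDE package. For example, we can compute the genes that are conserved markers of the cluster annotated as 'NK' cells, irrespective of stimulation condition.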

DefaultAssay(immune.combined) <- "RNA"
Idents(immune.combined) <- "seurat_annotations"

message("This run will take 5+ min ...")
nk.markers <- FindConservedMarkers(immune.combined, ident.1 = "NK", grouping.var = "stim", verbose = FALSE) #default slot: 'data'
head(nk.markers)
#>        CTRL_p_val CTRL_avg_logFC CTRL_pct.1 CTRL_pct.2 CTRL_p_val_adj
#> GNLY            0       4.186117      0.943      0.046              0
#> NKG7            0       3.164712      0.953      0.085              0
#> GZMB            0       2.915692      0.839      0.044              0
#> CLIC3           0       2.407695      0.601      0.024              0
#> FGFBP2          0       2.241968      0.500      0.021              0
#> CTSW            0       2.088278      0.537      0.030              0
#>           STIM_p_val STIM_avg_logFC STIM_pct.1 STIM_pct.2 STIM_p_val_adj
#> GNLY    0.000000e+00       4.066429      0.956      0.059   0.000000e+00
#> NKG7    0.000000e+00       2.904602      0.950      0.081   0.000000e+00
#> GZMB    0.000000e+00       3.128167      0.897      0.060   0.000000e+00
#> CLIC3   0.000000e+00       2.460388      0.623      0.031   0.000000e+00
#> FGFBP2 1.674159e-159       1.485116      0.259      0.016  2.352696e-155
#> CTSW    0.000000e+00       2.175186      0.592      0.035   0.000000e+00
#>             max_pval minimump_p_val
#> GNLY    0.000000e+00              0
#> NKG7    0.000000e+00              0
#> GZMB    0.000000e+00              0
#> CLIC3   0.000000e+00              0
#> FGFBP2 1.674159e-159              0
#> CTSW    0.000000e+00              0
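The last two columns of the output summarise the per-condition p-values. As a rough sketch of what they represent, assuming that the default `meta.method` is `metap::minimump` (Tippett's minimum-p method) and that `max_pval` is simply the largest per-group p-value, the calculation for a single gene looks like this. The p-values below are made up for illustration and do not come from the dataset.

```r
# Hypothetical per-condition p-values for one gene (illustrative only).
p_vals <- c(CTRL = 1e-4, STIM = 0.03)
k <- length(p_vals)

# Most conservative summary: the gene is only as significant as its weakest group.
max_pval <- max(p_vals)

# Tippett's minimum-p method: probability that the smallest of k uniform
# p-values is at most the observed minimum.
minimump_p_val <- 1 - (1 - min(p_vals))^k

print(c(max_pval = max_pval, minimump_p_val = minimump_p_val))
```

This reading is consistent with the FGFBP2 row above: one condition has a p-value of 0, so minimump_p_val is 0, while max_pval equals the larger STIM p-value. Ranking by max_pval is therefore the stricter way to ask for markers that hold up in every condition.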

In addition, we can examine the following marker genes for each cell type to check that the clusters correspond to specific cell types.

marker_genes <- c("CD3D", "SELL", "CREM", "CD8A", "GNLY", "CD79A", "FCGR3A", "CCL2", "PPBP")

FeaturePlot(immune.combined, features = marker_genes, min.cutoff = "q9")
[Figure: FeaturePlot of the marker genes in marker_genes]
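The `min.cutoff = "q9"` argument asks for a quantile-based floor. As we understand the FeaturePlot documentation, "q9" refers to the 9th percentile of a feature's expression, and values below it are raised to that floor so that low-level signal does not dominate the colour scale. Here is a base-R sketch of the idea on made-up values; the vector `expr` is hypothetical and Seurat's internal implementation may differ in detail.

```r
# Hypothetical normalised expression values for one gene across ten cells.
expr <- c(0, 0.2, 0.4, 0.9, 1.3, 1.8, 2.2, 2.9, 3.5, 4.0)

# "q9" is read as the 9th percentile of the values.
floor_q9 <- quantile(expr, probs = 9 / 100, names = FALSE)

# Values below the floor are raised to it; everything else is unchanged.
expr_clipped <- pmax(expr, floor_q9)

print(floor_q9)
print(expr_clipped)
```

In real single-cell data most genes are zero in many cells, so a low quantile such as q9 will often evaluate to 0 and leave the values untouched.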

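The DotPlot function with the split.by parameter can be used to view conserved cell type markers across conditions, showing both the expression level and the percentage of cells in a cluster that express a given gene. Here we plot 2-3 strong marker genes for each of the 13 clusters obtained earlier.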


markers.to.plot <- c("CD3D", "CREM", "HSPH1", "SELL", "GIMAP5", "CACYBP", "GNLY", "NKG7", "CCL5", "CD8A", "MS4A1", "CD79A", "MIR155HG", "NME1", "FCGR3A", "VMO1", "CCL2", "S100A9", "HLA-DQA1", "GPR183", "PPBP", "GNG11", "HBA2", "HBB", "TSPAN13", "IL3RA", "IGJ")

DotPlot(immune.combined,
        features = rev(markers.to.plot), 
        cols = c("blue", "red"), 
        dot.scale = 8, 
        split.by = "stim") + RotatedAxis()
[Figure: DotPlot of the markers in markers.to.plot, split by stim]
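To make the two quantities encoded in each dot concrete, here is a small base-R sketch that computes them by hand for one gene. The data frame `toy` and its values are invented, and the plain mean used here is a simplification: Seurat's own averaging and scaling inside DotPlot may differ. Each cluster and condition pair is treated as a separate group, which is our understanding of what split.by does.

```r
# Hypothetical expression of one gene in eight cells (illustrative only).
toy <- data.frame(
  cluster = c("NK", "NK", "NK", "NK", "B", "B", "B", "B"),
  stim    = c("CTRL", "CTRL", "STIM", "STIM", "CTRL", "CTRL", "STIM", "STIM"),
  GNLY    = c(2.1, 3.0, 0, 2.5, 0, 0, 0.4, 0)
)

# One group per cluster/condition pair, e.g. "NK_CTRL".
group <- paste(toy$cluster, toy$stim, sep = "_")

# Dot size: percentage of cells in the group with non-zero expression.
pct_exp <- tapply(toy$GNLY, group, function(x) 100 * mean(x > 0))

# Dot colour intensity: average expression in the group (simplified).
avg_exp <- tapply(toy$GNLY, group, mean)

print(data.frame(pct_exp = pct_exp, avg_exp = avg_exp))
```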

Save the R object for use in the next session:

save(immune.combined, file = 'out/02_immune_cons.rd',compress = TRUE)
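One detail worth keeping in mind for the next session: load() restores objects under the names they had when they were saved, and returns those names. The round trip below is a minimal sketch using a temporary file and a stand-in object; `toy_obj` and `tmp_file` are hypothetical names, not part of the tutorial.

```r
# Stand-in for a large object such as immune.combined.
toy_obj <- list(note = "stand-in for immune.combined")

# Save to a temporary file, then remove the object from the workspace.
tmp_file <- tempfile(fileext = ".rd")
save(toy_obj, file = tmp_file, compress = TRUE)
rm(toy_obj)

# load() recreates the object under its original name and
# returns a character vector of the names it restored.
restored_names <- load(tmp_file)
print(restored_names)

# Clean up the temporary file.
unlink(tmp_file)
```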

Key points to understand at this stage
