
10x Genomics PBMC (Part 6): Integrating Stimulated and Control PBMCs

2020-06-10  程凉皮儿

Integrating stimulated vs. control PBMC datasets to learn cell-type-specific responses

clp

10 June, 2020

Note: switch the working directory first (folder 5).

Reference

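This tutorial walks through the alignment of two groups of PBMCs from Kang et al, 2017. In that experiment, PBMCs were split into a stimulated group and a control group, and the stimulated group was treated with interferon beta. The response to interferon caused cell-type-specific changes in gene expression, which makes a joint analysis of all the data difficult: cells cluster both by stimulation condition and by cell type. Here we demonstrate the integration strategy described in Stuart and Butler et al, 2018, which facilitates the identification of common cell types and enables comparative analyses. Although this example integrates two datasets (conditions), the methods extend to multiple datasets. For details, see the example workflow that integrates four pancreatic islet datasets.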

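Goals of the integration analysis

This tutorial gives an overview of the kinds of comparative analyses on complex cell types that the Seurat integration procedure makes possible. Here we address three main goals:

Workflow summary

We will harmonize the Pearson residuals output by SCTransform. The workflow consists of the following steps:

Download the Kang et al. 2017 raw data (raw read counts) as a Seurat object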

library(data.table)
library(ggplot2)
library(Seurat)

# Allow up to 4000 MiB of global variables to be exported to future workers
options(future.globals.maxSize = 4000 * 1024^2)

pkg <- "ifnb.SeuratData"
if( !is.element(pkg, .packages(all.available = TRUE)) ) {
    install.packages("https://seurat.nygenome.org/src/contrib/ifnb.SeuratData_3.0.0.tar.gz", repos = NULL, type = "source")
}
library(pkg,character.only = TRUE)

#load Kang data
data("ifnb")

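Preprocessing and normalization

# Load the cell-cycle gene lists; the file is expected to provide g2m_genes and s_genes (used below)
load('data/cycle.rda')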

#split into the original samples
ifnb.list <- SplitObject(ifnb, split.by = "stim")
ifnb.list <- lapply(X = ifnb.list, function(seu) {
    message("This run will take 5+ min ...")
    seu <- NormalizeData(seu, verbose = TRUE) #the normalized values are stored in the data slot of the RNA assay
    seu <- CellCycleScoring(seu, g2m.features=g2m_genes, s.features=s_genes)
    seu <- SCTransform(seu,verbose = FALSE)
    return(seu)
})
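
SCTransform models the counts of each gene with a regularized negative binomial regression and returns Pearson residuals. The following is only a rough sketch of what a Pearson residual is, not Seurat's implementation: the helper name `nb_pearson_residual`, the toy counts, and the fixed mean and dispersion are all illustrative assumptions.

```r
# Pearson residual under a negative binomial model:
#   r = (x - mu) / sqrt(mu + mu^2 / theta)
# Hypothetical helper. SCTransform additionally estimates mu per cell from
# sequencing depth and regularizes the model parameters across genes.
nb_pearson_residual <- function(x, mu, theta) {
  (x - mu) / sqrt(mu + mu^2 / theta)
}

counts <- c(0, 2, 5, 12)   # toy UMI counts for one gene in four cells
mu     <- rep(4, 4)        # assumed expected counts
theta  <- 10               # assumed dispersion parameter

res <- nb_pearson_residual(counts, mu, theta)
print(round(res, 3))
```

Cells whose count is far above the expected value get a large positive residual, which is why these residuals can stand in for normalized expression in the later steps.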

Feature Selection

Next, select the features to use for integration, then run PrepSCTIntegration, which ensures that all required Pearson residuals have been computed.

sc.features <- SelectIntegrationFeatures(object.list = ifnb.list)

ifnb.list <- PrepSCTIntegration(object.list = ifnb.list,
                                anchor.features = sc.features,
                                verbose=FALSE)
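
According to the Seurat documentation, SelectIntegrationFeatures ranks features by the number of datasets in which they are variable and breaks ties by the median rank across datasets. The base R sketch below illustrates that ranking idea on toy gene lists. The function `rank_shared_features` and the gene lists are hypothetical, and this is not the Seurat code itself.

```r
# Toy per-dataset variable-feature lists, ordered from most to least variable
var_features <- list(
  ctrl = c("ISG15", "CD3D", "LYZ", "NKG7"),
  stim = c("ISG15", "LYZ", "CCL8", "CD3D")
)

# Hypothetical helper illustrating the ranking rule
rank_shared_features <- function(feature_lists, nfeatures) {
  all_features <- unique(unlist(feature_lists, use.names = FALSE))
  # Number of datasets in which each feature is variable
  n_datasets <- sapply(all_features, function(f) {
    sum(sapply(feature_lists, function(fl) f %in% fl))
  })
  # Median rank of each feature across the datasets that contain it
  median_rank <- sapply(all_features, function(f) {
    ranks <- unlist(lapply(feature_lists, function(fl) match(f, fl)))
    median(ranks, na.rm = TRUE)
  })
  # More datasets first, then lower (better) median rank
  ordered <- all_features[order(-n_datasets, median_rank)]
  head(ordered, nfeatures)
}

top <- rank_shared_features(var_features, nfeatures = 3)
print(top)
```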

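Perform integration (canonical correlation analysis)

Integration is a powerful method that uses these shared sources of greatest variation to identify shared subpopulations across conditions or datasets [Stuart and Butler et al. (2018)]. The goal of integration is to ensure that the cell types of one condition/dataset align with the same cell types of the other conditions/datasets (e.g., control macrophages align with stimulated macrophages).

Specifically, this integration method expects a "correspondence", that is, a shared biological state, among at least a subset of the single cells across the groups. The steps of the integration analysis are outlined in the figure below:

Fig. 1. Stuart T, Butler A, et al. Comprehensive integration of single cell data. bioRxiv, 2018.

Perform canonical correlation analysis (CCA):

Note: the shared highly variable genes are used because they are the most likely to represent the genes that distinguish different cell types.

The CCA-based integration is slow (it will take 5+ min).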

immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list,
                                         normalization.method = "SCT",
                                         anchor.features = sc.features,
                                         verbose=FALSE)

immune.combined <- IntegrateData(anchorset = immune.anchors,
                                 normalization.method = "SCT",
                                 verbose=FALSE)
#> Warning: Adding a command log without an assay associated with it
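
To make the anchor idea more concrete, here is a minimal base R sketch under stated assumptions. It embeds two toy datasets with an SVD of their cross-product (a CCA-like step) and then keeps mutual nearest neighbors as candidate anchors. The toy data, the choice of 2 components, and k = 5 are all assumptions for illustration. FindIntegrationAnchors does considerably more (anchor filtering and scoring), so this is not its implementation.

```r
set.seed(1)
n_genes <- 50

# Toy expression matrices (genes x cells) that share one gene-level signal
signal <- rnorm(n_genes)
X <- outer(signal, runif(30, 0.5, 1.5)) + matrix(rnorm(n_genes * 30, sd = 0.5), n_genes)
Y <- outer(signal, runif(40, 0.5, 1.5)) + matrix(rnorm(n_genes * 40, sd = 0.5), n_genes)

# Standardize each cell (column) across genes
Xs <- scale(X)
Ys <- scale(Y)

# CCA-like step: SVD of the cell-by-cell cross-product (30 x 40)
sv <- svd(crossprod(Xs, Ys), nu = 2, nv = 2)

# L2-normalize the coordinates of each cell
l2 <- function(m) m / sqrt(rowSums(m^2))
cc_x <- l2(sv$u)   # 30 cells x 2 components
cc_y <- l2(sv$v)   # 40 cells x 2 components

# Distances between every X cell (rows) and every Y cell (columns)
d <- as.matrix(dist(rbind(cc_x, cc_y)))[1:30, 31:70]

k <- 5
nn_xy <- t(apply(d, 1, function(r) order(r)[1:k]))  # k nearest Y cells per X cell
nn_yx <- t(apply(d, 2, function(r) order(r)[1:k]))  # k nearest X cells per Y cell

# Keep pairs that are in each other's neighbor lists
anchors <- do.call(rbind, lapply(seq_len(nrow(nn_xy)), function(i) {
  js <- nn_xy[i, ]
  mutual <- js[sapply(js, function(j) i %in% nn_yx[j, ])]
  if (length(mutual) > 0) cbind(cell_x = i, cell_y = mutual) else NULL
}))
print(head(anchors))
```

Each row of `anchors` pairs one cell from the first toy dataset with one cell from the second. Pairs like these are what the integration step uses to bring the two datasets into a common space.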

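Visualizing the integrated data

Now run the downstream analysis (i.e., visualization and clustering) on the integrated dataset. You can see that, after integration, cells from the two conditions (control and stimulated) group together. The cluster annotations shown come with the data we downloaded.

#Delete objects that are no longer needed to free up memory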
rm(ifnb)
rm(ifnb.list)
rm(immune.anchors)

#Make sure that your default assay is 'integrated'
DefaultAssay(immune.combined) <- "integrated"

immune.combined <- RunPCA(immune.combined, verbose = FALSE)
immune.combined <- RunUMAP(immune.combined, dims = 1:20)
#> Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
#> To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
#> This message will be shown once per session

# immune.combined <- FindNeighbors(immune.combined, reduction = "pca", dims = 1:20)
# immune.combined <- FindClusters(immune.combined, resolution = 0.5)

plots <- DimPlot(immune.combined, group.by = c("stim","seurat_annotations"), combine = FALSE)

plots <- lapply(X = plots, FUN = function(x) {
  p <- x + theme(legend.position = "top")
  # Return the plot with a compact, four-row legend
  p + guides(color = guide_legend(nrow = 4, byrow = TRUE, override.aes = list(size = 2.5)))
})

CombinePlots(plots)
#> Warning: CombinePlots is being deprecated. Plots should now be combined
#> using the patchwork system.
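
As the warning suggests, the plots can also be combined with patchwork. The sketch below uses two toy ggplot2 plots in place of the DimPlot() outputs. The data frame and the name `toy_plots` are made up for illustration. In the tutorial, the equivalent call would presumably be `wrap_plots(plots)`.

```r
library(ggplot2)
library(patchwork)

set.seed(1)
# Two toy plots standing in for the DimPlot() outputs
df <- data.frame(x = rnorm(20), y = rnorm(20), g = rep(c("a", "b"), 10))
toy_plots <- list(
  ggplot(df, aes(x, y, color = g)) + geom_point(),
  ggplot(df, aes(x, y)) + geom_point()
)

# wrap_plots() accepts a list of plots and arranges them in a grid
combined <- wrap_plots(toy_plots, ncol = 2)
print(class(combined))
```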

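To view the two conditions side by side, use the split.by argument to show each condition in its own panel, with cells colored by their cell-type annotation.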


DimPlot(immune.combined, reduction = "umap", split.by = "stim", group.by = "seurat_annotations", label = TRUE) + NoLegend()
#> Warning: Using `as.character()` on a quosure is deprecated as of rlang 0.3.0.
#> Please use `as_label()` or `as_name()` instead.
#> This warning is displayed once per session.

Save the R object for later use

wkd <- "out"
if (!file.exists(wkd)){dir.create(wkd)}
save(immune.combined, file = file.path(wkd,'01_immune_combined.rd'), compress = TRUE)
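
A saved object can be restored in a later session with load(), which recreates it under its original variable name. The sketch below demonstrates the round trip with a small stand-in object in a temporary directory. `toy_object` and the file name are placeholders, not part of the tutorial.

```r
# Stand-in object; in the tutorial this would be immune.combined
toy_object <- list(name = "toy", values = 1:3)

out_dir <- file.path(tempdir(), "out")
if (!dir.exists(out_dir)) dir.create(out_dir)
out_file <- file.path(out_dir, "toy_object.rd")
save(toy_object, file = out_file, compress = TRUE)

# Later: load() restores the object under its original name
# and returns the names of the objects it created
rm(toy_object)
restored_names <- load(out_file)
print(restored_names)
```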

Key points of this section
