Seurat: IntegrateData()

2023-10-30  LET149

https://satijalab.org/seurat/articles/integration_introduction
https://satijalab.org/seurat/reference/selectintegrationfeatures
https://satijalab.org/seurat/reference/findintegrationanchors
https://satijalab.org/seurat/reference/integratedata

This method is used to remove batch effects between different samples.

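The basic principle is to remove the variation introduced by batch: anchors (pairs of cells from different datasets that appear to be in a matched biological state) are identified across the datasets and then used to correct the expression values.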

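The batch-corrected expression data is stored in @assays$integrated, and the downstream analysis shown here (scaling, PCA, UMAP, clustering) is run on this assay (for differential expression, the Seurat authors recommend going back to the RNA assay).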
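
To make the idea of an anchor concrete: the FindIntegrationAnchors() reference linked above describes anchors as pairs of cells, one from each dataset, that fall within each other's neighborhoods (mutual nearest neighbors). The base-R sketch below is a toy illustration of that idea only, not Seurat's implementation: it uses a single nearest neighbor and plain Euclidean distance, whereas Seurat searches several neighbors in a reduced-dimensional space and then filters and scores the anchors. The matrices batch_a and batch_b and the helper cross_dist() are made up for the example.

```r
# Toy illustration (hypothetical data): rows are cells, columns are two
# made-up embedding dimensions. No Seurat functions are used here.
batch_a <- rbind(a1 = c(0, 0), a2 = c(5, 5), a3 = c(10, 0))
batch_b <- rbind(b1 = c(0.5, 0.2), b2 = c(5.3, 4.8), b3 = c(20, 20))

# Hypothetical helper: Euclidean distance from every row of x to every row of y
cross_dist <- function(x, y) {
  d <- as.matrix(dist(rbind(x, y)))
  d[seq_len(nrow(x)), nrow(x) + seq_len(nrow(y)), drop = FALSE]
}

d <- cross_dist(batch_a, batch_b)

# Nearest neighbor in the other batch, in both directions
nn_a_to_b <- unname(apply(d, 1, which.min))  # for each cell in A: index of nearest B cell
nn_b_to_a <- unname(apply(d, 2, which.min))  # for each cell in B: index of nearest A cell

# A pair is "mutual" when each cell is the other's nearest neighbor
is_mutual <- nn_b_to_a[nn_a_to_b] == seq_len(nrow(batch_a))

anchors <- data.frame(
  cell_a = rownames(batch_a)[is_mutual],
  cell_b = rownames(batch_b)[nn_a_to_b[is_mutual]],
  stringsAsFactors = FALSE
)
print(anchors)
```

In this toy data a1 pairs with b1 and a2 with b2, while a3 and b3 have no mutual partner and so contribute no anchor.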

library(Seurat)
library(dplyr)
library(ggplot2)
library(cowplot)

#---------------------------------------------------------------------------#
load("/home/zhiyong/Desktop/BBBBBBBB-RPE1-LXY/Figure_1/C/New/25uM/Feature_over_2500/Human_REP1_RPL35_KD_25uM_3_200_filtered_feature_over_2500.RData")
load("/home/zhiyong/Desktop/BBBBBBBB-RPE1-LXY/Figure_1/C/New/Mix/Feature_over_2500/Human_REP1_RPL35_KD_Mix_3_200_filtered_feature_over_2500.RData")

object_used_1 <- Human_REP1_RPL35_KD_25uM_3_200_filtered_feature_over_2500
object_used_2 <- Human_REP1_RPL35_KD_Mix_3_200_filtered_feature_over_2500

#---------------------------------------------------------------------------#
# Normalize each object and find its variable features independently
object_used_1 <- NormalizeData(object_used_1)
object_used_1 <- FindVariableFeatures(object_used_1, selection.method = "vst", nfeatures = 3000)
object_used_2 <- NormalizeData(object_used_2)
object_used_2 <- FindVariableFeatures(object_used_2, selection.method = "vst", nfeatures = 3000)

#---------------------------------------------------------------------------#
intergrated_list <- list(object_used_1, object_used_2)

#---------------------------------------------------------------------------#
#Select features that are repeatedly variable across datasets for integration
feature_used <- SelectIntegrationFeatures(object.list = intergrated_list, nfeatures = 2000)

# Identify the anchors for data integration. This function creates an anchor object for IntegrateData(); the anchor object also holds the list of the two Seurat objects
anchor_used <- FindIntegrationAnchors(object.list = intergrated_list, anchor.features = feature_used)

#---------------------------------------------------------------------------#
# Integrate the Seurat objects based on the anchor object created by FindIntegrationAnchors(); this command creates an "integrated" assay
Intergrated_object <- IntegrateData(anchorset = anchor_used)

#---------------------------------------------------------------------------#
# Set the active (default) assay to "integrated" (Intergrated_object@assays$integrated); this assay is used for downstream analysis
DefaultAssay(Intergrated_object) <- "integrated"

#---------------------------------------------------------------------------#
# Run the standard workflow for visualization and clustering; all of these steps use the "integrated" assay (Intergrated_object@assays$integrated)
Intergrated_object <- ScaleData(Intergrated_object, verbose = FALSE)
Intergrated_object <- RunPCA(Intergrated_object, npcs = 30, verbose = FALSE)
# RunUMAP() needs dims when run on a reduction; 1:10 is chosen here to match FindNeighbors() below
Intergrated_object <- RunUMAP(Intergrated_object, reduction = "pca", dims = 1:10)
Intergrated_object <- FindNeighbors(Intergrated_object, reduction = "pca", dims = 1:10)
Intergrated_object <- FindClusters(Intergrated_object, resolution = 0.5)
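
1. Note 1
    1. The original data's @assays$RNA@counts and @assays$RNA@data are not changed; they are carried over into the RNA assay of the integrated object
    1. The original data's @assays$RNA@scale.data is cleared
2. Note 2
    1. After integration, the batch-corrected data is stored in @assays$integrated@data; until ScaleData() is run on it, @assays$integrated@scale.data is empty
    1. Running ScaleData() on the integrated object actually scales @assays$integrated@data; the result is stored in @assays$integrated@scale.data and is used by the later steps such as PCA and clustering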
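
As a rough illustration of what the scaling step in Note 2 does to each gene, the base-R sketch below mimics the default behaviour of ScaleData(): center and scale every gene across cells, then cap values at 10 (the scale.max default). It is a simplified stand-in on a made-up matrix, not Seurat's code; the names data_mat and scale_data are hypothetical and merely play the roles of the @data and @scale.data slots.

```r
# Hypothetical stand-in for @assays$integrated@data: genes in rows, cells in columns
data_mat <- rbind(
  geneA = c(1, 2, 3, 4),
  geneB = c(10, 10, 20, 20),
  geneC = c(0, 0, 0, 8)
)
colnames(data_mat) <- paste0("cell", 1:4)

# scale() works column-wise, so transpose to z-score each gene across cells
scale_data <- t(scale(t(data_mat), center = TRUE, scale = TRUE))

# Cap large values, analogous to ScaleData()'s scale.max = 10 default
scale_data[scale_data > 10] <- 10

# Each gene now has mean 0 and standard deviation 1 across cells
print(round(rowMeans(scale_data), 10))
print(apply(scale_data, 1, sd))
```

On a real object in Seurat v3/v4, the corresponding slots can be inspected with GetAssayData(), for example with assay = "integrated" and slot = "scale.data" (Seurat v5 uses the layer argument instead).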