Seurat: IntegrateData()
2023-10-30
LET149
https://satijalab.org/seurat/articles/integration_introduction
https://satijalab.org/seurat/reference/selectintegrationfeatures
https://satijalab.org/seurat/reference/findintegrationanchors
https://satijalab.org/seurat/reference/integratedata
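This method is used to remove batch effects between different samples.
The basic idea is to remove the variation that batch introduces into the data (Seurat does this by finding anchors, i.e. matched cells across datasets, rather than by a literal regression).
The batch-corrected expression data is stored in @assays$integrated, and the downstream steps shown below (scaling, PCA, UMAP, clustering) are all run on this data.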
library(Seurat)
library(dplyr)
library(ggplot2)
library(cowplot)
#---------------------------------------------------------------------------#
load("/home/zhiyong/Desktop/BBBBBBBB-RPE1-LXY/Figure_1/C/New/25uM/Feature_over_2500/Human_REP1_RPL35_KD_25uM_3_200_filtered_feature_over_2500.RData")
load("/home/zhiyong/Desktop/BBBBBBBB-RPE1-LXY/Figure_1/C/New/Mix/Feature_over_2500/Human_REP1_RPL35_KD_Mix_3_200_filtered_feature_over_2500.RData")
object_used_1 <- Human_REP1_RPL35_KD_25uM_3_200_filtered_feature_over_2500
object_used_2 <- Human_REP1_RPL35_KD_Mix_3_200_filtered_feature_over_2500
#---------------------------------------------------------------------------#
#Normalize each dataset and find its variable features independently
object_used_1 <- NormalizeData(object_used_1)
object_used_1 <- FindVariableFeatures(object_used_1, selection.method = "vst", nfeatures = 3000)
object_used_2 <- NormalizeData(object_used_2)
object_used_2 <- FindVariableFeatures(object_used_2, selection.method = "vst", nfeatures = 3000)
#---------------------------------------------------------------------------#
integrated_list <- list(object_used_1, object_used_2)
#---------------------------------------------------------------------------#
#Select features that are repeatedly variable across datasets for integration
feature_used <- SelectIntegrationFeatures(object.list = integrated_list, nfeatures = 2000)
#Identify the anchors for data integration. This creates an anchor object for IntegrateData();
#the anchor object also holds the list of the two Seurat objects
anchor_used <- FindIntegrationAnchors(object.list = integrated_list, anchor.features = feature_used)
#---------------------------------------------------------------------------#
#Integrate the Seurat objects using the anchor object created by FindIntegrationAnchors(); this creates an "integrated" assay
integrated_object <- IntegrateData(anchorset = anchor_used)
#---------------------------------------------------------------------------#
#Set the active assay to "integrated" (integrated_object@assays$integrated); this assay is used for downstream analysis
DefaultAssay(integrated_object) <- "integrated"
#---------------------------------------------------------------------------#
#Run the standard workflow for visualization and clustering; all of these steps operate on integrated_object@assays$integrated
integrated_object <- ScaleData(integrated_object, verbose = FALSE)
integrated_object <- RunPCA(integrated_object, npcs = 30, verbose = FALSE)
#RunUMAP() needs dims (or features/graph); use the same PCs as FindNeighbors()
integrated_object <- RunUMAP(integrated_object, reduction = "pca", dims = 1:10)
integrated_object <- FindNeighbors(integrated_object, reduction = "pca", dims = 1:10)
integrated_object <- FindClusters(integrated_object, resolution = 0.5)
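The SelectIntegrationFeatures() step above picks features that are variable in several datasets at once. As I understand the Seurat documentation, features are ranked by the number of datasets in which they are variable, with ties broken by their median rank across datasets. The base R sketch below illustrates that idea on toy data. The helper rank_shared_features() and the gene names are hypothetical and are not part of Seurat; this is a simplified illustration, not Seurat's actual implementation.

```r
# Hypothetical helper (NOT a Seurat function): rank candidate integration
# features from per-dataset lists of variable features. Each list is assumed
# to be ordered from most to least variable.
rank_shared_features <- function(var_feature_lists, nfeatures = 2000) {
  all_features <- unique(unlist(var_feature_lists, use.names = FALSE))

  # Number of datasets in which each feature is variable
  n_datasets <- vapply(
    all_features,
    function(f) sum(vapply(var_feature_lists, function(v) f %in% v, logical(1))),
    numeric(1)
  )

  # Median rank of each feature across the datasets where it appears
  median_rank <- vapply(
    all_features,
    function(f) {
      ranks <- unlist(lapply(var_feature_lists, function(v) match(f, v)))
      median(ranks, na.rm = TRUE)
    },
    numeric(1)
  )

  # More datasets first; within ties, the smaller (better) median rank first
  ordered <- all_features[order(-n_datasets, median_rank)]
  head(ordered, nfeatures)
}

# Toy input: two samples with made-up gene names
toy_lists <- list(
  sample_1 = c("GeneA", "GeneB", "GeneC"),
  sample_2 = c("GeneB", "GeneD", "GeneA")
)
selected <- rank_shared_features(toy_lists, nfeatures = 2)
print(selected)
```

In the toy input, GeneA and GeneB are variable in both samples, so they outrank GeneC and GeneD, which each appear in only one.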
1. Note 1
- The original data's @assays$RNA@counts and @assays$RNA@data are not changed; they are carried over into the RNA assay of the integrated object
- The original data's @assays$RNA@scale.data is cleared
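2. Note 2
- After integration, the batch-corrected data is stored in @assays$integrated@data; until ScaleData() is run on it, @assays$integrated@scale.data is empty
- Running ScaleData() on the integrated object actually scales @assays$integrated@data; the result is stored in @assays$integrated@scale.data and is used for the subsequent PCA, clustering, and other processing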
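To make the scaling step concrete, here is a minimal base R sketch of what ScaleData() does with its default settings: each gene is centered and scaled across cells, and values are capped at 10 (the default scale.max). The matrix expr and the helper scale_rows() are hypothetical stand-ins for @assays$integrated@data and for ScaleData(); the real function has further options (for example regressing out variables) that are not shown here.

```r
# Toy stand-in for the integrated data: rows are genes, columns are cells
set.seed(1)
expr <- matrix(
  rnorm(5 * 20, mean = 2),
  nrow = 5,
  dimnames = list(paste0("Gene", 1:5), paste0("Cell", 1:20))
)

# Hypothetical helper (NOT a Seurat function): z-score each gene (row)
# across cells, then cap large values at scale_max
scale_rows <- function(mat, scale_max = 10) {
  gene_mean <- rowMeans(mat)
  gene_sd <- apply(mat, 1, sd)
  # A vector of length nrow(mat) is recycled down the columns,
  # so every row is shifted and scaled by its own mean and sd
  scaled <- (mat - gene_mean) / gene_sd
  scaled[scaled > scale_max] <- scale_max
  scaled
}

scaled_expr <- scale_rows(expr)

# Each gene now has mean ~0 and standard deviation ~1 across cells
print(round(rowMeans(scaled_expr), 10))
print(round(apply(scaled_expr, 1, sd), 10))
```

Because PCA is sensitive to the variance of each feature, running it on the scaled matrix keeps highly expressed genes from dominating the components, which is why the PCA and clustering steps read from scale.data rather than data.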