10X Single-Cell (10X Spatial Transcriptomics) Data Analysis: Temporal Dynamics (sctou
2022-05-25
单细胞空间交响乐
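Today I'm sharing a tool for analyzing the temporal dynamics of single cells, in other words trajectory inference. It looks like a useful angle for analysis, so here is a quick walkthrough.
The paper is "scTour: a deep learning architecture for robust inference and accurate prediction of cellular dynamics". Let's look at what is new in it.
scTour is an innovative and comprehensive method for dissecting cellular dynamics from single-cell genomics datasets. It provides a unified framework that depicts the full picture of a developmental process from multiple angles, including developmental pseudotime, vector field, and latent space. It further generalizes these functionalities to a multi-task architecture for within-dataset inference and cross-dataset prediction of cellular dynamics in a batch-insensitive manner.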

scTour features
- unsupervised estimates of cell pseudotime along the trajectory with no need for specifying starting cells
- efficient inference of vector field with no dependence on the discrimination between spliced and unspliced mRNAs
- cell trajectory reconstruction using latent space that incorporates both intrinsic transcriptome and extrinsic time information
- model-based prediction of pseudotime, vector field, and latent space for query cells/datasets
- reconstruction of transcriptomic space given an unobserved time interval
scTour performance
✅ insensitive to batch effects
✅ robust to cell subsampling
✅ scalable to large datasets
Let's look at the example code.
import sctour as sct
import scanpy as sc
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
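# Load the raw count matrix (transposed so that rows are cells) and the cell metadata
adata = sc.read('../../../mouse_endo_brain/raw_count_matrix_brain.txt').T
info = pd.read_csv('../../../mouse_endo_brain/Metadata.csv', sep=';', index_col=0)
# Keep only cells present in both the matrix and the metadata
cells = adata.obs_names.intersection(info.index)
# Copy the subset so that the next assignment does not write to a view
adata = adata[cells, :].copy()
adata.obs['Cluster'] = info.loc[cells, 'Cluster'].copy()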
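The alignment step above keeps only the cells found in both the count matrix and the metadata, then copies the cluster annotation over. Here is a minimal pure-Python sketch of the same idea. The barcodes and cluster labels are made up for illustration and do not come from the dataset.

```python
# Hypothetical cell barcodes from a count matrix and a metadata table.
matrix_cells = ["cell_a", "cell_b", "cell_c", "cell_d"]
metadata = {"cell_b": "EC1", "cell_d": "EC2", "cell_x": "EC3"}

# Keep cells found in both sources, preserving the matrix order.
shared_cells = [c for c in matrix_cells if c in metadata]

# Look up the cluster label for each retained cell.
clusters = {c: metadata[c] for c in shared_cells}

print(shared_cells)
print(clusters)
```

Cells that appear in only one of the two sources (here `cell_a`, `cell_c` and `cell_x`) are dropped, which is why the annotation can be assigned without missing values afterwards.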
Standard preprocessing
sc.pp.calculate_qc_metrics(adata, percent_top=None, log1p=False, inplace=True)
sc.pp.filter_genes(adata, min_cells=20)
sc.pp.highly_variable_genes(adata, flavor='seurat_v3', n_top_genes=2000, subset=True)
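# Log-transform the counts before running scTour when using the "nb" mode
# (newer scTour releases document raw UMI counts in .X for this mode; check the docs for the installed version)
adata.X = np.log1p(adata.X)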
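To show what the gene filter and the log transform in the preprocessing above do to the numbers, here is a small NumPy sketch on a toy count matrix. It assumes that a `min_cells` filter keeps genes detected (count > 0) in at least that many cells. The matrix values and the threshold are made up.

```python
import numpy as np

# Toy count matrix: rows are cells, columns are genes (made-up values).
counts = np.array([
    [0, 3, 1, 0],
    [0, 5, 2, 0],
    [2, 1, 0, 0],
])

# Assumed equivalent of a `min_cells` filter: keep genes detected
# (count > 0) in at least `min_cells` cells.
min_cells = 2
cells_per_gene = (counts > 0).sum(axis=0)
keep = cells_per_gene >= min_cells
filtered = counts[:, keep]

# log1p computes log(1 + x), so zero counts stay at zero.
logged = np.log1p(filtered)

# expm1 inverts log1p, recovering the original counts.
recovered = np.expm1(logged)
```

Because `log1p` maps zero to zero, the sparsity pattern of the matrix is unchanged by the transform.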
Train the scTour model
tnode = sct.train.Trainer(adata, loss_mode='nb')
tnode.train()
Infer the developmental pseudotime
adata.obs['ptime'] = tnode.get_time()
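Infer the latent representations. Two parameters, alpha_z and alpha_predz, set the weights of the latent z from variational inference and from the ODE solver, respectively. A larger alpha_z skews the latent space toward the intrinsic transcriptomic structure, while a larger alpha_predz makes it more representative of the extrinsic pseudotime ordering. Adjust the two parameters to suit your purpose.
# zs is the latent z from variational inference, and pred_zs is the latent z from the ODE solver
# mix_zs is the weighted combination of the two, which is used for downstream analysis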
mix_zs, zs, pred_zs = tnode.get_latentsp(alpha_z=0.2, alpha_predz=0.8)
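To make the role of alpha_z and alpha_predz concrete, here is a NumPy sketch that assumes the mixed representation is a plain weighted sum of the two latent matrices. That formula is an assumption made for illustration, `mix_latent` is a hypothetical helper, and the arrays are random stand-ins rather than scTour outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_latent = 5, 3

# Random stand-ins for the two latent representations (not real scTour output).
zs = rng.normal(size=(n_cells, n_latent))       # from variational inference
pred_zs = rng.normal(size=(n_cells, n_latent))  # from the ODE solver

def mix_latent(zs, pred_zs, alpha_z, alpha_predz):
    """Assumed mixing rule: weighted sum of the two latent matrices."""
    return alpha_z * zs + alpha_predz * pred_zs

mix_zs = mix_latent(zs, pred_zs, alpha_z=0.2, alpha_predz=0.8)

# With all weight on one side, the mix reduces to that representation.
only_z = mix_latent(zs, pred_zs, alpha_z=1.0, alpha_predz=0.0)
```

Under this assumption, shifting weight toward alpha_predz pulls each cell toward the position implied by its pseudotime, while shifting weight toward alpha_z keeps it closer to its transcriptome-derived position.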
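Generate a UMAP embedding from the inferred latent space. Optionally, order the cells by pseudotime before this step, which has been shown to yield a better trajectory.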
adata.obsm['X_TNODE'] = mix_zs
adata = adata[np.argsort(adata.obs['ptime'].values), :]
sc.pp.neighbors(adata, use_rep='X_TNODE', n_neighbors=15)
sc.tl.umap(adata, min_dist=0.1)
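The reordering by pseudotime relies on `np.argsort`. The sketch below shows the mechanics on made-up values: the same index array is applied to every per-cell array so that they stay aligned.

```python
import numpy as np

# Made-up pseudotime values and cell labels for four cells.
ptime = np.array([0.7, 0.1, 0.9, 0.4])
cell_ids = np.array(["c0", "c1", "c2", "c3"])

# argsort returns the indices that would sort ptime in ascending order.
order = np.argsort(ptime)

# Applying the same index to every per-cell array keeps them aligned.
sorted_ids = cell_ids[order]
sorted_ptime = ptime[order]
```

In the code above, the index is applied to the whole AnnData object, so the latent matrix stored in `obsm` is reordered together with `obs`.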
Infer the vector field
adata.obsm['X_VF'] = tnode.get_vector_field(adata.obs['ptime'].values, adata.obsm['X_TNODE'])
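Conceptually, a vector field assigns each cell a velocity in latent space, that is, the time derivative of its latent state. The sketch below swaps in a hypothetical linear dynamics function dz/dt = A z for scTour's learned network, purely to show the shape of the computation: one velocity vector per cell, with the same dimensionality as the latent space. The matrix `A`, the latent states, and the helper `vector_field` are all made up.

```python
import numpy as np

# Hypothetical 2-D latent states for three cells (not scTour output).
z = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
])

# Hypothetical dynamics matrix: a rotation-like field, dz/dt = A @ z.
A = np.array([
    [0.0, -1.0],
    [1.0,  0.0],
])

def vector_field(z, A):
    """Evaluate dz/dt = A @ z for every cell (one row per cell)."""
    return z @ A.T

vf = vector_field(z, A)
```

The result has one row per cell, which matches how the vector field is stored alongside the latent representation in `obsm`.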
Visualize the clusters, pseudotime and vector field on the UMAP generated from scTour’s latent space.
%matplotlib inline
fig, axs = plt.subplots(ncols=3, nrows=1, figsize=(18, 5))
sc.pl.umap(adata, color='Cluster', size=20, ax=axs[0], legend_loc='on data', show=False)
sc.pl.umap(adata, color='ptime', size=20, ax=axs[1], show=False)
sct.vf.plot_vector_field(adata, zs_key='TNODE', vf_key='VF', use_rep_neigh='TNODE', color='Cluster', ax=axs[2], legend_loc='none', frameon=False, size=100, alpha=0.2)
plt.show()

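This again confirms that trajectory analysis needs human judgment: the start and end points of the time axis have to be decided manually.
Reversing the pseudotime for visualization
sc.pl.umap(adata, color=['Cluster', 'ptime'], legend_loc='on data')

# Flip the pseudotime when the inferred direction runs opposite to the known biology
adata.obs['ptime'] = sct.train.reverse_time(adata.obs['ptime'].values)
sc.pl.umap(adata, color=['Cluster', 'ptime'], legend_loc='on data')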
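Reversing pseudotime flips the ordering so that the earliest cells become the latest. Here is a minimal sketch, assuming pseudotime values lie in [0, 1] and that reversal maps t to 1 - t. The helper name is hypothetical and is not the scTour function.

```python
import numpy as np

def reverse_unit_time(t):
    """Hypothetical helper: flip pseudotime on [0, 1] via t -> 1 - t."""
    t = np.asarray(t, dtype=float)
    return 1.0 - t

# Made-up pseudotime values for three cells.
ptime = np.array([0.0, 0.25, 0.9])
reversed_ptime = reverse_unit_time(ptime)

# The cell ordering is exactly inverted.
original_order = np.argsort(ptime)
reversed_order = np.argsort(reversed_ptime)
```

Applying the reversal twice returns the original values, so the choice of direction can be revisited at any time without retraining.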

Finally, infer the vector field
adata.obsm['X_VF'] = tnode.get_vector_field(adata.obs['ptime'].values, adata.obsm['X_TNODE'])
fig, ax = plt.subplots(ncols=1, nrows=1, figsize=(5, 5))
# reverse=True so that the plotted arrows follow the reversed pseudotime
sct.vf.plot_vector_field(adata, reverse=True, zs_key='TNODE', vf_key='VF', use_rep_neigh='TNODE', ax=ax, color='Cluster', frameon=False, size=200, alpha=0.05)
plt.show()

A simple one today. Life is good, and better with you.