scRNA-seq

Python单细胞复现2022||05-绘制Figure 3

2022-10-05  本文已影响0人  信你个鬼

本次学习资料来源,结合视频观看效果更佳,视频相关代码如下:

文献中的Fig3如下:

image-20221004215250070.png

本次绘制图C:

Differential gene expression (log-normalized, scaled; see Methods) of AT1 and AT2 cells from COVID-19 and control lungs. Columns, single cells; rows, expression of top-regulated genes. Left bar, lineage markers for AT1 (purple) and AT2 (pink) cells. Colour-coded top lanes indicate expression strength of signatures (log-normalized; see Methods) and group assignment as indicated on the right. exp., expression

数据读取

import scanpy as sc
import scvi
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# 注意dir改成自己的路径
#dir = '/dir/data/GSE171524/'
adata = sc.read_h5ad(dir+'integrated_anno.h5ad')
adata

差异分析

首先提取注释到的AT1与AT2

subset = adata[adata.obs['cell type_sub'].isin(['AT1', 'AT2'])].copy()
subset

总共20741个细胞

image-20221004220933957.png

差异分析可使用SCVI or diffxpy来做

diffxpy方法差异分析

这里需要安装diffxpy这个包,还比较麻烦。需要安装在scanpy的conda环境中,安装如下:

# 先安装依赖 batchglm 需要安装以前的版本:https://pypi.org/project/batchglm/#history 找到0.7.4版本,下载到本地
tar -zxvf batchglm-0.7.4.tar.gz
cd batchglm-0.7.4
pip install -e .

# 再安装 diffxpy
# 下载包到本地:https://github.com/theislab/diffxpy
unzip diffxpy-master.zip
cd diffxpy-master
pip install -e .

这样就安装成功了:

# two options: SCVI or diffxpy
import diffxpy.api as de

subset.X = subset.X.toarray()
len(subset.var)
#20858

# 地表达过滤
sc.pp.filter_genes(subset, min_cells=100)
len(subset.var)
#13412

# 修改一下细胞注释的列名
subset.obs = subset.obs.rename(columns = {'cell type_sub':'cell_type_sub'})
subset.obs

差异分析:

#if want to test between covid/non covid
# res = de.test.wald(data=subset,
#              formula_loc= '~ 1 + condition',
#              factor_loc_totest='condition'
#                   )

res = de.test.wald(data=subset,
             formula_loc= '~ 1 + cell_type_sub',
             factor_loc_totest='cell_type_sub'
                  )

dedf = res.summary().sort_values('log2fc', ascending = False).reset_index(drop = True)
dedf

结果如下:

image-20221005001446692.png

查看谁是实验组,谁是对照组,并调整为 AT1 vs AT2

subset.obs.cell_type_sub.unique()

# ['AT2', 'AT1']
# Categories (2, object): ['AT1', 'AT2']

most_up = dedf.iloc[0].gene
i = np.where(subset.var_names == most_up)[0][0]

a = subset[subset.obs.cell_type_sub == 'AT1'].X[:, i]
b = subset[subset.obs.cell_type_sub == 'AT2'].X[:, i]
print(f"{most_up} expression:")
print(f"AT1: {a.mean()}")
print(f"AT2: {b.mean()}")

# THBS2 expression:
# AT1: 0.0
# AT2: 0.04352555051445961


dedf['log2fc'] = dedf['log2fc']*-1
dedf = dedf.sort_values('log2fc', ascending = False).reset_index(drop = True)
dedf

结果图:

image-20221005001900728.png

挑选显著差异结果:

dedf = dedf[(dedf.qval < 0.05) & (abs(dedf.log2fc) > .5)]
dedf

结果图如下,有4836个显著差异表达基因:

image-20221005002146043.png

卡表达值:

dedf = dedf[dedf['mean'] > 0.15]
dedf

还有1095个基因:

image-20221005002309756.png

选择top50个基因绘制热图:

#top 25 and bottom 25 from sorted df
genes_to_show = dedf[-25:].gene.tolist() + dedf[:25].gene.tolist() 
sc.pl.heatmap(subset, genes_to_show, groupby='cell_type_sub', swap_axes=True)
plt.savefig(dir+"06-intergration_heatmap.png")

结果图:

image-20221005002522720.png

scvi方法差异分析

# 加载上一次保存的模型
model  = scvi.model.SCVI.load(dir+ 'model.model', adata)
model

# 差异分析
scvi_de = model.differential_expression(
    idx1 = [adata.obs['cell type_sub'] == 'AT1'],
    idx2 = [adata.obs['cell type_sub'] == 'AT2']
    )

scvi_de

# 显著性结果筛选
scvi_de = scvi_de[(scvi_de['is_de_fdr_0.05']) & (abs(scvi_de.lfc_mean) > .5)]
scvi_de = scvi_de.sort_values('lfc_mean')
scvi_de


# 表达筛选
scvi_de = scvi_de[(scvi_de.raw_normalized_mean1 > .5) | (scvi_de.raw_normalized_mean2 > .5)]
scvi_de

最终筛选到952个基因:

image-20221005003115485.png
#top 25 and bottom 25 from sorted df
genes_to_show = scvi_de[-25:].index.tolist() + scvi_de[:25].index.tolist() 
sc.pl.heatmap(subset, genes_to_show, groupby='cell_type_sub', swap_axes=True, layer = 'scvi_normalized',log = True)
plt.savefig(dir+"06-intergration_heatmap_scvi.png")

结果图如下:

image-20221005003207962.png

结果保存

dedf.to_csv(dir+'dedf.csv')
scvi_de.to_csv(dir+'scvi_de.csv')

下次见~

上一篇 下一篇

猜你喜欢

热点阅读