10X single-cell and spatial communication analysis: a guide to the latest CellphoneDB (v4)

2023-05-04  单细胞空间交响乐

Author: Evil Genius

I recently taught a class on cell-cell communication and noticed that many of the tools have been updated, so here is a summary of what is new in CellphoneDB v4.0.

Related post: Communication analysis that accounts for spatial position: CellphoneDB (v3.0)

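Installation is different: CellphoneDB is now fully packaged and installs directly with pip inside a conda environment on Linux. The analyses are then called from Python, as in the snippets further down.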

conda create -n cpdb python=3.8

source activate cpdb

pip install cellphonedb

Updates to the analysis methods (three methods to choose from)

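Method 1 runs no statistical test: it directly reports every cell type pair that expresses a ligand-receptor pair.
from cellphonedb.src.core.methods import cpdb_analysis_method

# File paths must be passed as strings; out_path is an output directory you define yourself.
means, deconvoluted = cpdb_analysis_method.call(
    cpdb_file_path = 'cellphonedb.zip',       # CellphoneDB database file
    meta_file_path = 'test_meta.txt',         # cell barcode to cell type mapping
    counts_file_path = 'test_counts.h5ad',    # expression matrix
    counts_data = 'hgnc_symbol',              # gene identifier type used in the counts
    output_path = out_path)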

The output contains only means.csv and deconvoluted.csv for the receptor-ligand pairs.

Method 2 performs a hypothesis test on each ligand-receptor pair.

  • Only receptors and ligands expressed in more than a user-specified threshold percentage of the cells in the specific cluster (threshold default is 0.1) are tested and will get a mean value in the significant.txt output.
  • For the multi-subunit heteromeric complexes, we require that:
    1. all subunits of the complex are expressed by a proportion of cells (threshold), and then
    2. we use the member of the complex with the minimum expression to compute the interaction means and perform the random shuffling.
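The two rules above can be sketched in a few lines of NumPy. This is only an illustration of the logic as described in the text, not CellphoneDB's own code: the function names, the toy counts and the placeholder gene names are all my assumptions.

```python
import numpy as np

def fraction_expressing(counts):
    """Fraction of cells in one cluster with non-zero expression of a gene."""
    counts = np.asarray(counts, dtype=float)
    return float((counts > 0).mean())

def complex_expression(subunit_counts, threshold=0.1):
    """Cluster-level expression used for a complex, or None if it is filtered out.

    subunit_counts: dict mapping subunit gene name -> per-cell expression in one cluster.
    """
    # Rule 1: every subunit must be expressed in more than `threshold` of the cells.
    if any(fraction_expressing(c) <= threshold for c in subunit_counts.values()):
        return None
    # Rule 2: the complex is represented by its least-expressed subunit.
    return min(float(np.mean(c)) for c in subunit_counts.values())

# Toy cluster of 10 cells; the gene names are placeholders, not real annotations.
cluster = {
    "SUBUNIT_A": [2, 0, 1, 3, 0, 0, 4, 0, 0, 0],  # expressed in 4 of 10 cells
    "SUBUNIT_B": [0, 0, 0, 0, 0, 2, 3, 0, 0, 0],  # expressed in 2 of 10 cells
}

print(complex_expression(cluster))                 # default threshold of 0.1
print(complex_expression(cluster, threshold=0.3))  # stricter threshold
```

With the default threshold both subunits pass and the complex takes the value of the weaker subunit. With a stricter threshold the second subunit fails, so the whole complex is dropped.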
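All cell types are then compared pairwise. First, the cluster labels of all cells are randomly permuted (1000 times by default), and for each permutation the mean of the average receptor expression in one cluster and the average ligand expression in the interacting cluster is computed. For every receptor-ligand pair in every pairwise comparison between two cell types, this yields a null distribution. The p-value for the cell-type specificity of a given receptor-ligand complex is the proportion of permuted means that are equal to or higher than the observed mean. Finally, interactions that are highly enriched between cell types are prioritised by their number of significant pairs, so that biologically relevant interactions can be selected manually.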
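from cellphonedb.src.core.methods import cpdb_statistical_analysis_method

# File paths must be passed as strings; out_path is an output directory you define yourself.
deconvoluted, means, pvalues, significant_means = cpdb_statistical_analysis_method.call(
    cpdb_file_path = 'cellphonedb.zip',       # CellphoneDB database file
    meta_file_path = 'test_meta.txt',         # cell barcode to cell type mapping
    counts_file_path = 'test_counts.h5ad',    # expression matrix
    counts_data = 'hgnc_symbol',              # gene identifier type used in the counts
    output_path = out_path)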
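To make the label-shuffling test of method 2 concrete, here is a minimal NumPy sketch for a single ligand-receptor pair and a single ordered cluster pair. It is not CellphoneDB's implementation: the function name, its arguments and the toy data are assumptions of mine, and complexes and the expression threshold are left out.

```python
import numpy as np

def permutation_pvalue(ligand, receptor, labels, sender, receiver,
                       n_iter=1000, seed=0):
    """Observed interaction mean and empirical p-value for one ordered cluster pair."""
    ligand = np.asarray(ligand, dtype=float)
    receptor = np.asarray(receptor, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)

    def interaction_mean(lab):
        # Mean of the ligand average in `sender` and the receptor average in `receiver`.
        return (ligand[lab == sender].mean() + receptor[lab == receiver].mean()) / 2

    observed = interaction_mean(labels)
    # Null distribution: the same statistic after shuffling the cluster labels.
    null = np.array([interaction_mean(rng.permutation(labels)) for _ in range(n_iter)])
    # p-value: proportion of shuffled means equal to or higher than the observed mean.
    return float(observed), float((null >= observed).mean())

# Toy data: 3 clusters of 20 cells each.
labels = np.array(["A"] * 20 + ["B"] * 20 + ["C"] * 20)

# Cluster-specific case: ligand only in A, receptor only in B.
ligand = np.where(labels == "A", 5.0, 0.0)
receptor = np.where(labels == "B", 5.0, 0.0)
observed, p = permutation_pvalue(ligand, receptor, labels, "A", "B")

# Non-specific case: both genes expressed uniformly in every cell.
flat_observed, flat_p = permutation_pvalue(np.ones(60), np.ones(60), labels, "A", "B")

print(observed, p)
print(flat_observed, flat_p)
```

The uniform case shows why a high mean alone is not enough: every shuffle reproduces the observed value, so the interaction is not specific to any cluster pair.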
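Method 3 takes a user-supplied file of differentially expressed genes (DEGs) instead of running the permutation test.
from cellphonedb.src.core.methods import cpdb_degs_analysis_method

# File paths must be passed as strings; out_path is an output directory you define yourself.
deconvoluted, means, relevant_interactions, significant_means = cpdb_degs_analysis_method.call(
    cpdb_file_path = 'cellphonedb.zip',       # CellphoneDB database file
    meta_file_path = 'test_meta.txt',         # cell barcode to cell type mapping
    counts_file_path = 'test_counts.h5ad',    # expression matrix
    degs_file_path = 'degs_file.txt',         # DEGs computed by the user
    counts_data = 'hgnc_symbol',              # gene identifier type used in the counts
    threshold = 0.1,                          # minimum fraction of expressing cells
    output_path = out_path)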
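This method lets you design the gene expression comparison freely so that it better matches the research question. In method 2, the null hypothesis (and the background distribution) takes every cell type in the dataset into account and compares "one" cell type against "the rest". An analysis may, however, call for a different comparison that better reflects the study design. Some examples:

  • The analysis needs to account for technical batches or biological covariates. Here it is better to rely on a differential expression method that models these confounders and to pass the results directly to CellphoneDB.

  • You are interested in specificity within a particular lineage and want a hierarchical differential expression analysis (for example, you focus on epithelial cells and want to identify genes that change expression within that lineage; research question: which interactions are upregulated in epithelial cell A compared with epithelial cell B?).

  • You want to compare a specific population between disease and control (for example, identifying genes upregulated in disease T cells by comparing them with control T cells; research question: which interactions are upregulated in disease T cells?).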
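As a rough sketch of how DEGs from your own differential expression analysis could be prepared for method 3, the pandas snippet below filters a results table down to two columns (cell type, gene). The two-column layout is my reading of the CellphoneDB documentation, and the input column names, the cutoffs and the gene names are purely hypothetical, so check them against your own DE tool and the CellphoneDB docs.

```python
import pandas as pd

# Hypothetical DE results; in practice these come from your own DE tool.
de_results = pd.DataFrame({
    "cluster": ["T_cell_disease", "T_cell_disease", "Epithelial_a", "Epithelial_a"],
    "gene": ["GENE1", "GENE2", "GENE3", "GENE4"],
    "logFC": [1.2, 0.05, 2.0, -1.5],
    "adj_pval": [0.001, 0.5, 0.01, 0.001],
})

def select_upregulated(de, min_logfc=0.25, max_adj_pval=0.05):
    """Keep significantly upregulated genes as (cluster, gene) rows.

    The cutoffs are illustrative defaults, not values recommended by CellphoneDB.
    """
    keep = (de["logFC"] > min_logfc) & (de["adj_pval"] < max_adj_pval)
    return de.loc[keep, ["cluster", "gene"]].reset_index(drop=True)

degs = select_upregulated(de_results)
print(degs)

# Write a tab-separated file to pass as degs_file_path (uncomment to save):
# degs.to_csv("degs_file.txt", sep="\t", index=False)
```

The cluster names in this file need to match the cell type names used in the meta file, otherwise the DEGs cannot be linked to any cluster.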

To incorporate spatial information, see the related post "Communication analysis that accounts for spatial position: CellphoneDB (v3.0)".

Interpreting the results

Output files

All files (except “deconvoluted.txt”) follow the same structure: rows depict interacting proteins while columns represent interacting cell type pairs.

See below the meaning of each column in the outputs:

P-value (pvalues.txt), Mean (means.txt), Significant mean (significant_means.txt) and Relevant interactions (relevant_interactions.txt)
  • id_cp_interaction: Unique CellphoneDB identifier for each interaction stored in the database.
  • interacting_pair: Name of the interacting pairs separated by “|”.
  • partner A or B: Identifier for the first interacting partner (A) or the second (B). It could be: UniProt (prefix simple:) or complex (prefix complex:)
  • gene A or B: Gene identifier for the first interacting partner (A) or the second (B). The identifier will depend on the input user list.
  • secreted: True if one of the partners is secreted.
  • Receptor A or B: True if the first interacting partner (A) or the second (B) is annotated as a receptor in our database.
  • annotation_strategy: Curated if the interaction was annotated by the CellphoneDB developers. Otherwise, the name of the database where the interaction has been downloaded from.
  • is_integrin: True if one of the partners is integrin.
  • rank: Total number of significant p-values for each interaction divided by the number of cell type-cell type comparisons. (Only in significant_means.txt)
  • means: Mean values for all the interacting partners: mean value refers to the total mean of the individual partner average expression values in the corresponding interacting pairs of cell types. If one of the mean values is 0, then the total mean is set to 0. (Only in means.txt)
  • p.values: p-values for all the interacting partners: p.value refers to the enrichment of the interacting ligand-receptor pair in each of the interacting pairs of cell types. (Only in pvalues.txt)
  • significant_mean: Significant mean calculation for all the interacting partners. If p.value < 0.05, the value will be the mean. Alternatively, the value is set to 0. (Only in significant_means.txt)
  • relevant_interactions: Indicates if the interaction is relevant (1) or not (0). If a gene in the interaction is a DEG (i.e. a gene in the DEG.tsv file), and all the participant genes are expressed, the interaction will be classified as relevant. Alternatively, the value is set to 0. (Only in relevant_interactions.txt)
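The rules for means, significant_mean and rank can be reproduced on a toy table. This is a sketch of the definitions as worded in the column list, not of CellphoneDB's code; the interaction names, cluster names and values are invented.

```python
import pandas as pd

def interaction_mean(mean_a, mean_b):
    """Mean of the two partners' average expression; 0 if either partner is 0."""
    if mean_a == 0 or mean_b == 0:
        return 0.0
    return (mean_a + mean_b) / 2

# Toy outputs: rows are interactions, columns are ordered cell type pairs.
interactions = ["LIG1_REC1", "LIG2_REC2"]
means = pd.DataFrame({"A|B": [1.5, 0.8], "B|A": [0.0, 0.6]}, index=interactions)
pvalues = pd.DataFrame({"A|B": [0.01, 0.2], "B|A": [1.0, 0.03]}, index=interactions)

# significant_mean: keep the mean where p < 0.05, otherwise set it to 0.
significant = means.where(pvalues < 0.05, 0.0)

# rank: number of significant p-values divided by the number of comparisons.
rank = (pvalues < 0.05).sum(axis=1) / pvalues.shape[1]

print(interaction_mean(2.0, 1.0), interaction_mean(0.0, 3.0))
print(significant)
print(rank)
```

Under this definition a low rank means the interaction is significant in few cell type pairs, which is one way to spot the more specific ones.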

Again, remember that the interactions are not symmetric. IL12-IL12 receptor for clusterA clusterB (i.e. receptor is in clusterB) is not the same as IL12-IL12 receptor for clusterB clusterA (i.e. receptor is in clusterA).

Deconvoluted (deconvoluted.txt)

  • gene_name: Gene identifier for one of the subunits that are participating in the interaction defined in the “means.csv” file. The identifier will depend on the input of the user list.

  • uniprot: UniProt identifier for one of the subunits that are participating in the interaction defined in the “means.csv” file.

  • is_complex: True if the subunit is part of a complex. Single if it is not, complex if it is.

  • protein_name: Protein name for one of the subunits that are participating in the interaction defined in the “means.csv” file.

  • complex_name: Complex name if the subunit is part of a complex. Empty if not.

  • id_cp_interaction: Unique CellphoneDB identifier for each of the interactions stored in the database.

  • mean: Mean expression of the corresponding gene in each cluster.

Interpreting the outputs

How to read and interpret the results?

The key files are significant_means.txt (for statistical_analysis) or relevant_interactions.txt (for degs_analysis), see below. When interpreting the results, we recommend you first define your questions of interest. Next, focus on specific cell type pairs and manually review the interactions prioritising those with lower p-value and/or higher mean expression. For graphical representation we recommend @zktuong repository: ktplots in R and ktplotspy in python.
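One way to follow this advice is to pull out a single cell type pair and sort its interactions by mean. The sketch below works on a toy table shaped like significant_means.txt. The interaction names, the cell type names and the "cellA|cellB" column naming are assumptions for illustration, so adapt them to the columns in your own output.

```python
import pandas as pd

# Toy stand-in for significant_means.txt; all names and values are invented.
significant_means = pd.DataFrame({
    "interacting_pair": ["LIG1_REC1", "LIG2_REC2", "LIG3_REC3"],
    "rank": [0.5, 0.5, 1.0],
    "Tcell|Epithelial": [0.0, 1.8, 0.9],
    "Epithelial|Tcell": [2.1, 0.0, 0.4],
})

def top_interactions(df, cell_a, cell_b, n=10):
    """Significant interactions for one ordered cell type pair, highest mean first."""
    col = f"{cell_a}|{cell_b}"
    # Treat both 0 and missing values as "not significant".
    hits = df.loc[df[col].fillna(0) > 0, ["interacting_pair", col]]
    return hits.sort_values(col, ascending=False).head(n).reset_index(drop=True)

top = top_interactions(significant_means, "Tcell", "Epithelial")
top_reverse = top_interactions(significant_means, "Epithelial", "Tcell")
print(top)
print(top_reverse)
```

The two orderings of the same cell type pair give different lists, because the direction of the pair matters.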

CellphoneDB output is high-throughput. CellphoneDB provides all cell-cell interactions that may potentially occur in your dataset, given the expression of the cells. The size of the output may be overwhelming, but if you apply some rationale (which will depend on the design of your experiment and your biological question), you will be able to narrow it down to a few candidate interactions. The new method degs_analysis will allow you to perform a more tailored analysis towards specific cell-types or conditions, while the option microenvs will allow you to restrict the combinations of cell-type pairs to test.

It may be that not all of the cell-types of your input dataset co-appear in time and space. Cell types that do not co-appear in time and space will not interact. For example, you might have cells coming from different in vitro systems, different developmental stages or disease and control conditions. Use this prior information to restrict and ignore infeasible cell-type combinations from the outputs (i.e., columns) as well as their associated interactions (i.e. rows). You can restrict the analysis to feasible cell-type combinations using the option microenvs. Here you can input a two-column file indicating which cell type is in which spatiotemporal microenvironment. CellphoneDB will use this information to define possible pairs of interacting cells (i.e. pairs of clusters co-existing in a microenvironment), ignoring the rest of the combinations.
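The idea behind microenvs can be illustrated with a small pandas sketch that derives the feasible cell type pairs from a two-column table. This mimics the logic described above rather than CellphoneDB's internals. The column names, the cell types and the microenvironment labels are hypothetical.

```python
from itertools import product

import pandas as pd

# Hypothetical two-column table: which cell type sits in which microenvironment.
microenvs = pd.DataFrame({
    "cell_type": ["Tcell", "Epithelial", "Fibroblast", "Trophoblast"],
    "microenvironment": ["tissue_1", "tissue_1", "tissue_1", "tissue_2"],
})

def feasible_pairs(table):
    """Ordered cell type pairs that co-exist in at least one microenvironment."""
    pairs = set()
    for _, group in table.groupby("microenvironment"):
        cells = group["cell_type"].tolist()
        # Ordered pairs, including a cell type paired with itself.
        pairs.update(product(cells, repeat=2))
    return pairs

allowed = feasible_pairs(microenvs)
print(sorted(allowed))

# Write a tab-separated file to use as the microenvs input (uncomment to save):
# microenvs.to_csv("microenvs.txt", sep="\t", index=False)
```

In this toy table a cell type that is alone in its microenvironment can only pair with itself, so it never shows up in a column together with the other three.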

Most importantly, the result files present interactions as receptor-ligand pairs rather than the usual ligand-receptor pairs.

Just a quick note. Life is good, and it is better with you.