
2023-05-04  单细胞空间交响乐

Author: Evil Genius

Not long ago I taught a class on cell-cell communication and noticed that many of the tools have been updated. Here I share what is new in CellphoneDB v4.0.



conda create -n cpdb python=3.8

conda activate cpdb

pip install cellphonedb


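from cellphonedb.src.core.methods import cpdb_analysis_method

# File paths are passed as strings; out_path must already hold the path of the output directory.
means, deconvoluted = cpdb_analysis_method.call(
         cpdb_file_path = 'cellphonedb.zip',
         meta_file_path = 'test_meta.txt',
         counts_file_path = 'test_counts.h5ad',
         counts_data = 'hgnc_symbol',
         output_path = out_path)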

The results of this call contain only the receptor-ligand pair outputs: means.txt and deconvoluted.txt.


  • Only receptors and ligands expressed in more than a user-specified fraction of the cells in a given cluster (threshold, default 0.1) are tested and receive a mean value in the significant_means.txt output.
  • For multi-subunit heteromeric complexes, we require that:
    1. all subunits of the complex are expressed in more than that fraction of cells (threshold), and then
    2. the subunit with the minimum expression is used to compute the interaction means and to perform the random shuffling.
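The filtering rules above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not CellphoneDB's own implementation: the gene names and expression values are made up, "expressed" is taken to mean a non-zero count, and a complex that fails the filter is reported as None (not tested).

```python
from statistics import mean

# Toy data, all values hypothetical: expression of each gene
# in the four cells of a single cluster.
cluster_expression = {
    "GENE_A": [1.0, 0.0, 2.0, 0.0],
    "GENE_B": [0.0, 0.0, 0.0, 0.0],
    "GENE_C": [4.0, 5.0, 0.0, 6.0],
}
THRESHOLD = 0.1  # default value stated in the text


def fraction_expressing(values):
    """Fraction of cells in the cluster with non-zero expression (assumed definition)."""
    return sum(v > 0 for v in values) / len(values)


def passes_threshold(gene):
    """True if the gene is expressed in more than THRESHOLD of the cells."""
    return fraction_expressing(cluster_expression[gene]) > THRESHOLD


def complex_expression(subunits):
    """Expression used for a complex: the minimum subunit mean,
    or None (not tested) when any subunit fails the threshold."""
    if not all(passes_threshold(g) for g in subunits):
        return None
    return min(mean(cluster_expression[g]) for g in subunits)


print(complex_expression(["GENE_A", "GENE_C"]))  # both subunits pass: 0.75
print(complex_expression(["GENE_A", "GENE_B"]))  # GENE_B is not expressed: None
```

Taking the minimum means that a complex is never scored higher than its least expressed subunit.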

Statistical analysis method:

from cellphonedb.src.core.methods import cpdb_statistical_analysis_method

# File paths are passed as strings; out_path must already hold the path of the output directory.
deconvoluted, means, pvalues, significant_means = cpdb_statistical_analysis_method.call(
        cpdb_file_path = 'cellphonedb.zip',
        meta_file_path = 'test_meta.txt',
        counts_file_path = 'test_counts.h5ad',
        counts_data = 'hgnc_symbol',
        output_path = out_path)

Differentially expressed genes (DEGs) analysis method:

from cellphonedb.src.core.methods import cpdb_degs_analysis_method

# File paths are passed as strings; out_path must already hold the path of the output directory.
deconvoluted, means, relevant_interactions, significant_means = cpdb_degs_analysis_method.call(
         cpdb_file_path = 'cellphonedb.zip',
         meta_file_path = 'test_meta.txt',
         counts_file_path = 'test_counts.h5ad',
         degs_file_path = 'degs_file.txt',
         counts_data = 'hgnc_symbol',
         threshold = 0.1,
         output_path = out_path)






Output files

All files (except “deconvoluted.txt”) follow the same structure: rows depict interacting proteins while columns represent interacting cell type pairs.

See below the meaning of each column in the outputs:

P-value (pvalues.txt), Mean (means.txt), Significant mean (significant_means.txt) and Relevant interactions (relevant_interactions.txt)
  • id_cp_interaction: Unique CellphoneDB identifier for each interaction stored in the database.
  • interacting_pair: Name of the interacting pairs separated by “|”.
  • partner A or B: Identifier for the first interacting partner (A) or the second (B). It can be a UniProt identifier (prefix simple:) or a complex (prefix complex:).
  • gene A or B: Gene identifier for the first interacting partner (A) or the second (B). The identifier type depends on the identifiers used in the user's input.
  • secreted: True if one of the partners is secreted.
  • receptor A or B: True if the first interacting partner (A) or the second (B) is annotated as a receptor in our database.
  • annotation_strategy: Curated if the interaction was annotated by the CellphoneDB developers. Otherwise, the name of the database from which the interaction was downloaded.
  • is_integrin: True if one of the partners is an integrin.
  • rank: Total number of significant p-values for each interaction divided by the number of cell type-cell type comparisons. (Only in significant_means.txt)
  • means: Mean values for all the interacting partners: mean value refers to the total mean of the individual partner average expression values in the corresponding interacting pairs of cell types. If one of the mean values is 0, then the total mean is set to 0. (Only in means.txt)
  • p.values: p-values for all the interacting partners: p.value refers to the enrichment of the interacting ligand-receptor pair in each of the interacting pairs of cell types. (Only in pvalues.txt)
  • significant_mean: Significant mean calculation for all the interacting partners. If p.value < 0.05, the value is the mean. Otherwise, the value is set to 0. (Only in significant_means.txt)
  • relevant_interactions: Indicates whether the interaction is relevant (1) or not (0). If a gene in the interaction is a DEG (i.e. a gene in the DEG.tsv file) and all the participating genes are expressed, the interaction is classified as relevant. Otherwise, the value is set to 0. (Only in relevant_interactions.txt)
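The column definitions for means, significant_mean, rank and relevant_interactions can be written out as small functions. This is a minimal sketch of the rules as worded above, with made-up numbers; the function names are hypothetical and are not part of the CellphoneDB API.

```python
def interaction_mean(mean_a, mean_b):
    """Mean of the two partners' average expression; 0 if either partner is 0."""
    if mean_a == 0 or mean_b == 0:
        return 0.0
    return (mean_a + mean_b) / 2


def significant_mean(mean_value, p_value, alpha=0.05):
    """The mean if the p-value is below alpha, otherwise 0."""
    return mean_value if p_value < alpha else 0.0


def rank(p_values, alpha=0.05):
    """Number of significant p-values divided by the number of cell type pairs."""
    return sum(p < alpha for p in p_values) / len(p_values)


def is_relevant(genes, degs, expressed):
    """1 if at least one gene is a DEG and all genes are expressed, else 0."""
    return int(any(g in degs for g in genes) and all(g in expressed for g in genes))


# One interaction tested in three cell type pairs (hypothetical values).
means = [interaction_mean(2.0, 1.0), interaction_mean(0.0, 3.0), interaction_mean(0.5, 0.5)]
p_values = [0.01, 1.0, 0.2]

print(means)                                                    # [1.5, 0.0, 0.5]
print([significant_mean(m, p) for m, p in zip(means, p_values)])  # [1.5, 0.0, 0.0]
print(rank(p_values))                                           # one significant pair out of three
print(is_relevant(["G1", "G2"], degs={"G1"}, expressed={"G1", "G2"}))  # 1
```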

Remember that the interactions are not symmetric. IL12-IL12 receptor for clusterA clusterB (i.e. the receptor is in clusterB) is not the same as IL12-IL12 receptor for clusterB clusterA (i.e. the receptor is in clusterA).
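The asymmetry comes down to a positional rule: partner A is read from the first cluster of the pair and partner B from the second. A tiny sketch, using a hypothetical helper name, makes the two orientations explicit:

```python
def locate_partners(partner_a, partner_b, cluster_pair):
    """Map each partner to the cluster it is read from:
    partner A to the first cluster, partner B to the second."""
    first_cluster, second_cluster = cluster_pair
    return {partner_a: first_cluster, partner_b: second_cluster}


forward = locate_partners("IL12", "IL12 receptor", ("clusterA", "clusterB"))
reverse = locate_partners("IL12", "IL12 receptor", ("clusterB", "clusterA"))

print(forward["IL12 receptor"])  # clusterB
print(reverse["IL12 receptor"])  # clusterA
```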

Deconvoluted (deconvoluted.txt)

  • gene_name: Gene identifier for one of the subunits participating in the interaction defined in the “means.txt” file. The identifier type depends on the identifiers used in the user's input.

  • uniprot: UniProt identifier for one of the subunits participating in the interaction defined in the “means.txt” file.

  • is_complex: True if the subunit is part of a complex. Single if it is not, complex if it is.

  • protein_name: Protein name for one of the subunits participating in the interaction defined in the “means.txt” file.

  • complex_name: Complex name if the subunit is part of a complex. Empty if not.

  • id_cp_interaction: Unique CellphoneDB identifier for each of the interactions stored in the database.

  • mean: Mean expression of the corresponding gene in each cluster.

Interpreting the outputs

How to read and interpret the results?

The key files are significant_means.txt (for statistical_analysis) or relevant_interactions.txt (for degs_analysis). When interpreting the results, we recommend that you first define your questions of interest. Next, focus on specific cell type pairs and manually review the interactions, prioritising those with lower p-values and/or higher mean expression. For graphical representation we recommend @zktuong's repositories: ktplots in R and ktplotspy in Python.
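One way to carry out this review is to pick the column for a single cell type pair and sort the interactions by their significant mean. The sketch below uses only the standard library and a small inline table in place of a real significant_means.txt. It assumes the file is tab-separated and that each cell type pair column is named with the two cluster names joined by "|"; the interaction names and values are hypothetical.

```python
import csv
import io

# Toy stand-in for significant_means.txt (hypothetical values).
toy_table = (
    "interacting_pair\tT|B\tB|T\n"
    "pairX\t1.5\t0\n"
    "pairY\t0\t0.8\n"
    "pairZ\t2.1\t0.4\n"
)


def top_interactions(handle, cell_pair, n=10):
    """Return up to n (interaction, mean) tuples with a non-zero value
    in the given cell type pair column, highest mean first."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        value = float(row[cell_pair] or 0)  # treat empty cells as 0
        if value > 0:
            hits.append((row["interacting_pair"], value))
    return sorted(hits, key=lambda item: item[1], reverse=True)[:n]


print(top_interactions(io.StringIO(toy_table), "T|B"))
# [('pairZ', 2.1), ('pairX', 1.5)]
```

For a real run, the io.StringIO object would be replaced by an open file handle on the output file.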

CellphoneDB output is high-throughput. CellphoneDB provides all cell-cell interactions that may potentially occur in your dataset, given the expression of the cells. The size of the output may be overwhelming, but if you apply some rationale (which will depend on the design of your experiment and your biological question), you will be able to narrow it down to a few candidate interactions. The new method degs_analysis will allow you to perform a more tailored analysis towards specific cell-types or conditions, while the option microenvs will allow you to restrict the combinations of cell-type pairs to test.

Not all of the cell types in your input dataset necessarily co-appear in time and space, and cell types that do not co-appear will not interact. For example, you might have cells coming from different in vitro systems, different developmental stages, or disease and control conditions. Use this prior information to exclude infeasible cell type combinations from the outputs (i.e. columns) as well as their associated interactions (i.e. rows). You can restrict the analysis to feasible cell type combinations using the option microenvs. Here you provide a two-column file indicating which cell type is in which spatiotemporal microenvironment. CellphoneDB uses this information to define the possible pairs of interacting cells (i.e. pairs of clusters co-existing in a microenvironment) and ignores the remaining combinations.
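The logic of the microenvironment restriction can be sketched as follows: group cell types by microenvironment, then keep only the ordered pairs whose members share one. This is an illustration of the idea, not CellphoneDB's code. The cell type and microenvironment names are made up, and the column order (cell type first, microenvironment second) and the "|" separator in the output column names are assumptions.

```python
from itertools import product

# Rows of a hypothetical two-column microenvironment file:
# (cell type, microenvironment).
microenv_rows = [
    ("T_cell", "tumour"),
    ("B_cell", "tumour"),
    ("Hepatocyte", "liver"),
    ("T_cell", "liver"),
]


def feasible_pairs(rows):
    """Ordered cell type pairs that co-exist in at least one microenvironment."""
    by_env = {}
    for cell_type, env in rows:
        by_env.setdefault(env, set()).add(cell_type)
    pairs = set()
    for members in by_env.values():
        # Ordered pairs, because interactions are not symmetric.
        pairs.update(product(members, repeat=2))
    return pairs


pairs = feasible_pairs(microenv_rows)

# Keep only the output columns that correspond to feasible pairs.
columns = ["T_cell|B_cell", "B_cell|Hepatocyte", "Hepatocyte|T_cell"]
kept = [c for c in columns if tuple(c.split("|")) in pairs]
print(kept)  # ['T_cell|B_cell', 'Hepatocyte|T_cell']
```

The same filter can be applied after the fact to the columns of an existing output if the analysis was run without the microenvs option.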



