基于小鼠anndata数据的CellphoneDB分析-1

2022-11-08  本文已影响0人  Yayamia

Step1 获得表达矩阵及注释文件

adata.to_df().T不过得到的是gene_symbol作为行名
关于是否使用raw_counts:

Counts file. scRNA-seq count data containing gene expression values in which rows are genes presented with gene names identifiers (Ensembl IDs, gene names or hgnc_symbol annotation) and columns are cells. We recommend using normalized count data. Importantly, the user needs to specify whether the data was log-transformed when using the subsampling option.

sc.get.obs_df即可

Step2 小鼠同源基因转换

biomaRt包实现不同物种之间同源基因转换

rm(list = ls())#一键清空
options(stringsAsFactors = F) #字符串不转化为因子

library(biomaRt)

exp <- read.table("exp_db.txt", header = TRUE, row.names=NULL)
exp[1:5,1:5]
names(exp)[1] <-"MGI.symbol"#以便于后期inner join

mouse.gene <- rownames(exp)

human.mart <- biomaRt::useMart(host="https://dec2021.archive.ensembl.org", "ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")
mouse.mart <- biomaRt::useMart(host="https://dec2021.archive.ensembl.org", "ENSEMBL_MART_ENSEMBL", dataset="mmusculus_gene_ensembl")

m2h.g <- getLDS(attributes = c("mgi_symbol"),filters = "mgi_symbol",
                values = mouse.gene,mart = mouse.mart,
                attributesL = c("hgnc_symbol","chromosome_name","start_position"),
                martL = human.mart,uniqueRows = T,)
head(m2h.g)
m2h.g2 <- m2h.g[,1:2]
dim(m2h.g2)
dim(exp)
m2h.g3 <- m2h.g2[!m2h.g2$HGNC.symbol=="",]
dim(m2h.g3)

hexp <- m2h.g3 %>%  inner_join(exp, by = "MGI.symbol")
hexp[1:5,1:5]
dim(hexp)
exp[exp$MGI.symbol=="mt-Nd6","AAACCTGAGGAGCGAG"]#验证一下
hexp2 <- hexp %>% distinct(HGNC.symbol, .keep_all = TRUE)
dim(hexp2)
row.names(hexp2) <- hexp2[,2]
hexp2[1:5,1:5]
hexp3<-hexp2[,-c(1,2)]
hexp3[1:5,1:5]
dim(hexp3)

注意最好不要直接保存为table,因为这样行和列都会加上双引号,可以先保存为csv,然后再改成txt(tab分割)
注意完全模仿参考文件的格式

Step3 在终端进行cellphonedb分析

cellphonedb method statistical_analysis \
CD8_annotation_db_hs.txt \
CD8_exp_db_hs.txt \
--counts-data=gene_name \
--threads=4 

output:


Step4 利用Cellphonedb自带的绘图

cellphonedb plot dot_plot \
--means-path=means.txt \
--pvalues-path=pvalues.txt \
--output-path=/out

To plot only desired rows/columns (samples for rows and columns based in example data files):

cellphonedb plot dot_plot --rows in/rows.txt --columns in/columns.txt
上一篇 下一篇

猜你喜欢

热点阅读