Tumor Immune Microenvironment

05--CellMix: a comprehensive too

2018-06-21  六六_ryx

Overview of CellMix

CellMix is an R package released in 2013 that integrates 7 gene expression deconvolution algorithms, 8 marker gene lists, and 11 public datasets. It facilitates the estimation of cell type proportions and/or cell-specific differential expression in gene expression experiments.

Background and objectives

Gene expression deconvolution is naturally expressed as a matrix decomposition problem. It works on global gene expression data, and the methods are either supervised or unsupervised. Supervised deconvolution needs to be combined with known signatures or marker genes.
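The matrix decomposition view can be written out explicitly. The symbols X, W, H and the dimensions n, p, r are notation assumed here for illustration, not taken from the CellMix documentation. X is the observed expression matrix (n genes by p samples), W holds the cell-type signatures (n genes by r cell types), and H holds the contribution of each cell type to each sample. When H is read as proportions, each of its columns sums to 1.

```latex
X \approx W H, \qquad
X \in \mathbb{R}_{\ge 0}^{n \times p},\quad
W \in \mathbb{R}_{\ge 0}^{n \times r},\quad
H \in \mathbb{R}_{\ge 0}^{r \times p}, \qquad
\sum_{k=1}^{r} H_{kj} = 1 \quad \text{for each sample } j
```

In this notation, supervised methods fix W (known signatures) or constrain it (marker genes) and estimate the rest, while unsupervised methods estimate both W and H from X alone.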
Objectives:

Install CellMix package

source('http://bioconductor.org/biocLite.R')
biocLite("CellMix", siteRepos ="http://web.cbio.uct.ac.za/~renaud/CRAN")
biocLite("GEOquery")
library("CellMix")

Estimating cell proportions from known signatures

Blood samples

# load data (normally requires an internet connection to GEO) 
acr <- ExpressionMix("GSE20300", verbose = 2)
# estimate proportions using signatures from Abbas et al. (2009) 
res <- gedBlood(acr, verbose = TRUE)
# proportions are stored in the coefficient matrix 
dim(coef(res))
coef(res)[1:3, 1:4]
# cell type names 
basisnames(res)
# basis signatures (with converted IDs) 
basis(res)[1:5, 1:3]
# aggregate into CBC 
cbc <- asCBC(res)
dim(cbc)
# plot against actual CBC 
profplot(acr, cbc) 
# plot cell proportion differences between groups 
boxplotBy(res, acr$Status, main = "Cell proportions vs Transplant status")
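To make the idea of estimating proportions from known signatures more concrete, here is a minimal base-R sketch on simulated data. It is not the CellMix implementation and not necessarily what gedBlood does internally. The matrices W, H_true and X and the cell type names are hypothetical, and plain least squares followed by clipping and rescaling is an assumption chosen for simplicity.

```r
# Minimal base-R sketch (not the CellMix implementation):
# estimate cell proportions from a known signature matrix by least squares.
set.seed(1)

n_genes <- 50
n_types <- 3
n_samples <- 4

# Hypothetical signature matrix W: genes x cell types
W <- matrix(rexp(n_genes * n_types, rate = 0.1), nrow = n_genes,
            dimnames = list(NULL, c("TypeA", "TypeB", "TypeC")))

# Hypothetical true proportions H_true: cell types x samples, columns sum to 1
H_true <- matrix(runif(n_types * n_samples), nrow = n_types)
H_true <- sweep(H_true, 2, colSums(H_true), "/")

# Simulated noise-free mixture: genes x samples
X <- W %*% H_true

# Least-squares fit of every sample on the signatures (no intercept)
H_hat <- qr.solve(W, X)

# Clip negative coefficients and rescale each column to sum to 1
H_hat <- pmax(H_hat, 0)
H_hat <- sweep(H_hat, 2, colSums(H_hat), "/")

round(H_hat, 3)
```

Because the simulated mixture is noise-free, the estimate matches H_true up to numerical precision. With real data the fit is only approximate, and the quality of the signatures matters a great deal.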

Building/filtering basis signatures

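Select genes based on their cell type specificity, and build a basis
signature matrix that provides the “maximum” deconvolution power.
The code below assumes that mix (the mixed-sample expression data) and sel (the selected marker genes) have already been created.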

# check if data is in log scale 
is_logscale(mix)
# compute mean expression profiles within each cell type (linear scale)
p <- ged(expb(mix, 2), sel, "meanProfile")
# plot against known proportions (p is by default not scaled) 
profplot(mix, p, scale = TRUE, main = "meanProfile - Linear scale")
# compute mean expression profiles within each cell type
lp <- ged(mix, sel, "meanProfile") 
# plot against known proportions (lp is by default not scaled)
profplot(mix, lp, scale = TRUE, main = "meanProfile - Log scale")
# compute proportions using DSA methods 
pdsa <- ged(mix[sel], sel, "DSA", verbose = TRUE)
profplot(mix, pdsa, main = "DSA - Linear scale") 
pdsa <- ged(mix[sel], sel, "DSA", log = FALSE) 
profplot(mix, pdsa, main = "DSA - Log scale")
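The mean-profile idea used above can be illustrated with a small base-R sketch on simulated data. This is not CellMix code: the signature matrix W, the marker list and the proportions are all hypothetical, and the markers are assumed to be perfectly specific (expressed only in their own cell type).

```r
# Minimal base-R sketch of the "mean marker profile" idea (not CellMix code).
set.seed(2)
n_samples <- 6

# Hypothetical true proportions for two cell types (cell types x samples)
prop <- rbind(TypeA = seq(0.1, 0.9, length.out = n_samples))
prop <- rbind(prop, TypeB = 1 - prop["TypeA", ])

# Hypothetical markers: 5 genes per type, expressed only in their own cell type
W <- rbind(cbind(runif(5, 50, 100), 0),
           cbind(0, runif(5, 50, 100)))
dimnames(W) <- list(paste0("g", 1:10), c("TypeA", "TypeB"))
markers <- list(TypeA = paste0("g", 1:5), TypeB = paste0("g", 6:10))

# Linear-scale mixture: genes x samples
X <- W %*% prop

# Mean expression of each marker set in every sample (cell types x samples)
mean_profile <- t(sapply(markers, function(g) colMeans(X[g, , drop = FALSE])))

# Correlation between the TypeA mean profile and the true TypeA proportions
cor(mean_profile["TypeA", ], prop["TypeA", ])
```

With perfectly specific markers on the linear scale, each mean profile is proportional to the true proportion of its cell type. On log-transformed data this proportionality no longer holds exactly, which is why the scale of the input matters.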

Estimating differential cell-specific expression

Complete deconvolution using marker genes

A priori: enforce marker expression patterns

# generate random data with 5 markers per cell type 
x <- rmix(3, 200, 20, markers = 5) 
m <- getMarkers(x)
# deconvolve using KL-divergence metric 
kl <- ged(x, m, "ssKL", log = FALSE, rng = 1234, nrun = 10)
# plot against known proportions 
profplot(x, kl) 
# check consistency of most expressing cell types in known basis signatures 
basismarkermap(basis(x), kl) 
# correlation with known signatures 
basiscor(x, kl)
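One way to read "enforce marker expression patterns" is that each marker gene is only allowed to be expressed in its own cell type. The base-R sketch below shows that constraint applied to a signature matrix. It is an illustration under that assumption, not the ssKL algorithm itself; W, markers and the gene names are hypothetical.

```r
# Minimal base-R sketch (not the ssKL implementation):
# force each marker gene to be expressed only in its own cell type.
set.seed(4)

# Hypothetical signature matrix: genes x cell types
W <- matrix(runif(8 * 2, min = 1, max = 10), nrow = 8,
            dimnames = list(paste0("g", 1:8), c("TypeA", "TypeB")))

# Hypothetical marker genes for each cell type
markers <- list(TypeA = c("g1", "g2"), TypeB = c("g3", "g4"))

# Zero out every marker in all cell types other than its own
for (type in names(markers)) {
  others <- setdiff(colnames(W), type)
  W[markers[[type]], others] <- 0
}

W
```

In an iterative factorization, a constraint of this kind would be reapplied as the signatures are updated, so that the marker pattern is known in advance rather than recovered afterwards.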

A posteriori: assign signatures to cell types

# deconvolve using the deconf algorithm
dec <- ged(x, m, "deconf", rng = 1234, nrun = 10)
# plot against known proportions 
profplot(x, dec) 
# check consistency of most expressing cell types in known signatures
basismarkermap(basis(x), dec)
# correlation with known signatures 
basiscor(x, dec)
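Assigning signatures to cell types after the fact can be sketched as a matching problem. The base-R example below is not CellMix code: it simulates hypothetical reference signatures (ref), shuffles them and adds noise to mimic estimated components (est), then labels each component with the best-correlated reference cell type. Matching by maximum correlation is an assumption made for illustration.

```r
# Minimal base-R sketch (not CellMix code):
# label estimated components by their correlation with reference signatures.
set.seed(3)

# Hypothetical reference signatures: genes x cell types
ref <- matrix(rexp(100 * 3), nrow = 100,
              dimnames = list(NULL, c("TypeA", "TypeB", "TypeC")))

# Hypothetical estimated signatures: same columns in shuffled order, plus noise
perm <- c(3, 1, 2)
est <- ref[, perm] + matrix(rnorm(100 * 3, sd = 0.05), nrow = 100)
colnames(est) <- paste0("comp", 1:3)

# Correlation of every estimated component with every reference cell type
cc <- cor(est, ref)

# Assign each component to the cell type it correlates with most
assigned <- colnames(ref)[apply(cc, 1, which.max)]
names(assigned) <- colnames(est)
assigned
```

This greedy rule can give two components the same label when signatures are similar. A one-to-one assignment would need an extra step, which is left out here to keep the sketch short.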