[Single-cell transcriptomics] Identifying and visualizing single-cell clades with TooManyCells

2020-07-02  Geekero

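Introduction:

In short: if the groupings produced by your current clustering software are not satisfactory, TooManyCells offers an alternative. It partitions cells into groups and displays them as a tree of branching clades.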

Detailed tutorial

Installation

To avoid installing a long list of dependencies, everything below runs through Docker.

docker pull gregoryschwartz/too-many-cells:0.2.2.0

Start the Docker container (here with -h to print the help text)

docker run -it --rm -v "/home/luohb:/share/nas1/Data/Users/luohb/Personalization/20191206/TooManyCells" gregoryschwartz/too-many-cells:0.2.2.0 -h
too-many-cells, Gregory W. Schwartz. Clusters and analyzes single cell data.

Usage: too-many-cells (make-tree | interactive | differential | diversity |
                      paths)

Available options:
  -h,--help                Show this help text

Available commands:
  make-tree                
  interactive              
  differential             
  diversity                
  paths     

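Preparing the input files

The input can be either a directory (containing the three files produced by 10X cellranger) or an ordinary expression matrix in csv format.

1. Matrix:

Note: in a count matrix file, the first field of the header row is empty, so the line starts with a comma (or with "" as in the example below). The row and column labels do not need to be wrapped in double quotes.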

"","A22.D042044.3_9_M.1.1","C5.D042044.3_9_M.1.1","D10.D042044.3_9_M.1.1","E13.D042044.3_9_M.1.1","F19.D042044.3_9_M.1.1","H2.D042044.3_9_M.1.1","I9.D042044.3_9_M.1.1",...
"0610005C13Rik",0,0,0,0,0,0,0,...
"0610007C21Rik",0,112,185,54,0,96,42,...
"0610007L01Rik",0,0,0,0,0,153,170,...
"0610007N19Rik",0,0,0,0,0,0,0,...
"0610007P08Rik",0,0,0,0,0,19,0,...
"0610007P14Rik",0,58,0,0,255,60,0,...
"0610007P22Rik",0,0,0,0,0,65,0,...
"0610008F07Rik",0,0,0,0,0,0,0,...
"0610009B14Rik",0,0,0,0,0,0,0,...
...
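
Before handing the matrix to too-many-cells, it can help to confirm that every row has the same number of fields as the header. The following is a minimal sketch, not part of the original workflow: the tiny count.csv, its cell names, and its counts are made up for illustration, and the check assumes that no label contains an embedded comma.

```shell
#!/bin/sh
# Build a tiny stand-in for count.csv (made-up cell names and counts).
workdir=$(mktemp -d)
cat > "$workdir/count.csv" <<'EOF'
"","cellA","cellB","cellC"
"gene1",0,112,185
"gene2",0,0,0
"gene3",5,58,0
EOF

# Collect the line number of any row whose field count differs from the header's.
# Assumes labels contain no embedded commas.
bad_rows=$(awk -F',' 'NR == 1 { n = NF; next } NF != n { print NR }' "$workdir/count.csv")

# Number of cells = header fields minus the empty first field.
n_cells=$(awk -F',' 'NR == 1 { print NF - 1; exit }' "$workdir/count.csv")

echo "cells: $n_cells"
echo "malformed rows: ${bad_rows:-none}"
```

For a real run, point the two awk commands at your own count.csv instead of the generated file.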

2. Labels file

item,label
AAACCTGCAGTAACGG-1,Marrow
AAACGGGAGACCGGAT-1,Marrow
AAACGGGAGCGCTCCA-1,Marrow
AAACGGGAGGACGAAA-1,Marrow
AAACGGGAGGTACTCT-1,Marrow
...

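The labels in this file can be each cell's sample of origin or manually assigned cluster labels. They are used only to color the final plot and do not affect the branching structure of the tree.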
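
When all cells in a matrix come from one sample, a labels file can be derived from the matrix header. The sketch below is only an illustration under stated assumptions: the sample count.csv and the single label "Marrow" applied to every cell are hypothetical, and it assumes the item column is meant to match the cell names used in the matrix header.

```shell
#!/bin/sh
# Tiny stand-in for count.csv (made-up cell names and counts).
workdir=$(mktemp -d)
cat > "$workdir/count.csv" <<'EOF'
"","cellA","cellB","cellC"
"gene1",0,112,185
EOF

label="Marrow"   # hypothetical: one label for every cell in this matrix

{
  echo "item,label"
  # Split the header into one name per line, drop quotes and carriage
  # returns, skip the empty first field, and append the label.
  head -n 1 "$workdir/count.csv" \
    | tr ',' '\n' \
    | tr -d '"\r' \
    | awk -v label="$label" 'NR > 1 && $0 != "" { print $0 "," label }'
} > "$workdir/labels.csv"

cat "$workdir/labels.csv"
```

With several samples, the same pipeline can be run once per matrix and the outputs concatenated, keeping a single item,label header line.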

Running

docker run -it --rm -v /share/nas1/Data/Users/luohb/Personalization/TooManyCells/test:/test \
      gregoryschwartz/too-many-cells:0.2.2.0 make-tree \
      --matrix-path /test/count.csv \
      --labels-file /test/OrigIdent.labels.csv \
      --draw-collection "PieRing" \
      --output /test/LabelsBySamples > log

The result looks something like this:


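"Pruning" the branches

With the default parameters the tree is split too finely. This can be adjusted in two ways; the command below combines --smart-cutoff with --min-size.

We also do not need to recompute the whole tree! The --prior option lets us supply the earlier result. When using --prior we can additionally drop --matrix-path to speed things up, although some features may then be unavailable.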

# --draw-collection "PieChart" draws the leaves as pie charts.
# (A comment placed before a trailing backslash would cut the command short.)
docker run -it --rm -v /share/nas1/Data/Users/luohb/Personalization/TooManyCells/test:/test \
      gregoryschwartz/too-many-cells:0.2.2.0 make-tree \
      --prior /test/LabelsBySamples --labels-file /test/OrigIdent.labels.csv \
      --smart-cutoff 1 --min-size 1 \
      --draw-collection "PieChart" \
      --output /test/pruned_LabelsBySamples > log1_2

The final result looks something like this:


Extracting a subset

cp log3_2 clusters_pruned.csv
vi clusters_pruned.csv
# In vim, strip the trailing carriage return (shown as ^M) from every line:
:%s/\r$//

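The per-node results captured from the Docker run do not display cleanly (each line ends with a stray ^M), so the file has to be cleaned up by hand into the following form: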

$ head clusters_pruned.csv
cell,cluster,path
AAACGGGAGGTGTTAA.1,9,9/8/7/6/5/4/3/2/1/0
AACACGTTCGGCGGTT.1,9,9/8/7/6/5/4/3/2/1/0
AACCGCGGTATATGAG.1,9,9/8/7/6/5/4/3/2/1/0
ACACCCTTCTGGTTCC.1,9,9/8/7/6/5/4/3/2/1/0
ACCTTTAAGGTGTTAA.1,9,9/8/7/6/5/4/3/2/1/0
ACGAGGACACGTTGGC.1,9,9/8/7/6/5/4/3/2/1/0
AGGGAGTCAGGCTCAC.1,9,9/8/7/6/5/4/3/2/1/0
AGGGATGAGCGATAGC.1,9,9/8/7/6/5/4/3/2/1/0
AGTGGGAAGATGTAAC.1,9,9/8/7/6/5/4/3/2/1/0
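
The same cleanup can be done without opening an editor. This is a minimal sketch on a made-up log file; the file names and rows are placeholders. The stray carriage returns most likely come from the pseudo-TTY that docker run -t allocates, which is an assumption here, not something the original run confirms.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Stand-in for a captured log with CRLF line endings (made-up rows).
printf 'cell,cluster,path\r\nAAAC.1,9,9/8/0\r\nAACA.1,7,7/2/0\r\n' > "$workdir/log"

# Delete every carriage return; for line-ending CRs this has the same
# effect as the vim substitution.
tr -d '\r' < "$workdir/log" > "$workdir/clusters_pruned.csv"

# Count lines that still contain a carriage return.
cr=$(printf '\r')
remaining=$(grep -c "$cr" "$workdir/clusters_pruned.csv" || true)
echo "lines with CR left: $remaining"
```

If the assumption about the pseudo-TTY holds, running the container with -i only (no -t) when redirecting output should avoid the problem at the source.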

Annotating the tree with node numbers

# --draw-node-number prints the node number on each node.
docker run -it --rm -v /share/nas1/Data/Users/luohb/Personalization/TooManyCells/test:/test \
    gregoryschwartz/too-many-cells:0.2.2.0 make-tree \
    --prior /test/LabelsBySplitGroup \
    --labels-file /test/SplitGroup.labels.csv --smart-cutoff 1 --min-size 1 --draw-collection "PieChart" \
    --draw-node-number \
    --output /test/number_pruned_LabelsBySplitGroup > log7

The cell subset for a node can then be extracted using the barcodes that belong to that node.
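
One way to pull out those barcodes is to filter the cleaned cluster table on its path column. The sketch below rests on an assumption: that path lists the nodes from a cell's leaf cluster back to the root (node 0), so a cell belongs to a node's subtree when that node number appears in its path. The sample rows and the node number 7 are made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Made-up rows in the cell,cluster,path layout.
cat > "$workdir/clusters_pruned.csv" <<'EOF'
cell,cluster,path
AAAC.1,9,9/8/7/0
AACA.1,9,9/8/7/0
ACAC.1,12,12/11/7/0
AGGG.1,5,5/4/0
EOF

node=7   # hypothetical node of interest

# Keep the barcode of every cell whose path passes through $node.
awk -F',' -v node="$node" '
  NR == 1 { next }                       # skip the header line
  {
    n = split($3, p, "/")                # p[1..n] = nodes on the path
    for (i = 1; i <= n; i++)
      if (p[i] == node) { print $1; break }
  }
' "$workdir/clusters_pruned.csv" > "$workdir/node_${node}_cells.txt"

cat "$workdir/node_${node}_cells.txt"
```

The resulting barcode list can be used to subset the original expression matrix in whatever downstream tool is being used.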

Gene expression

docker run -it --rm -v /share/nas1/Data/Users/luohb/Personalization/TooManyCells/test:/test \
    gregoryschwartz/too-many-cells:0.2.2.0 make-tree \
    --prior /test/LabelsBySplitGroup \
    --matrix-path /test/count.csv \
    --labels-file /test/SplitGroup.labels.csv  \
    --smart-cutoff 1 \
    --min-size 1 \
    --draw-leaf "DrawItem (DrawThresholdContinuous [(\"gene1\", 0), (\"gene2\", 0)])" \
    --draw-colors "[\"#e41a1c\", \"#377eb8\", \"#4daf4a\", \"#eaeaea\"]"\
    --draw-scale-saturation 10 \
    --output /test/out_gene_expression \
    > clusters_pruned_gene_expression.csv

The result looks something like this:


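Differential expression analysis

Differential expression analysis based on the supplied labels

Differential analysis between two groups of nodes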

$ docker run -it --rm -v /share/nas1/Data/Users/luohb/TooManyCells/test:/test \
    gregoryschwartz/too-many-cells:0.2.2.0 differential \
    --prior /test/LabelsBySplitGroup \
    --matrix-path /test/count.csv \
    --labels-file /test/SplitGroup.labels.csv  \
    -n "([70, 3, 105, 166], [45])" \
    > clusters_pruned_gene_expression.csv

Finding marker genes for all nodes

$ cat run12.sh
# -t 5 limits the node level. The +RTS options go last so that the
# Haskell runtime does not treat the flags after them as its own.
docker run -it --rm -v /share/nas1/Data/Users/luohb/TooManyCells/test:/test \
    gregoryschwartz/too-many-cells:0.2.2.0 differential \
    --prior /test/LabelsBySplitGroup \
    --matrix-path /test/count.csv \
    -n "([], [])" \
    --normalization "UQNorm" \
    --plot-output /test/plot.pdf \
    -t 5 \
    +RTS -N26

$ sh run12.sh > FindAllMarker.txt