Getting started with the single-cell literature: background notes

2021-12-14, 结城明日奈_7e51

They say that pairing a course with notes makes studying stress-free.

Library construction

Review: Comparative Analysis of Single-Cell RNA Sequencing Methods, 2017 (doi: 10.1016/j.molcel.2017.01.023)

It covers six library construction methods (CEL-seq2, Drop-seq, MARS-seq, SCRB-seq, Smart-seq, and Smart-seq2). A useful follow-up is to find one paper for each method, six in total.

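Findings: Smart-seq2 detects the most genes per cell, but it is also relatively expensive; when only a small number of cells is profiled, MARS-seq, SCRB-seq, and Smart-seq2 are more efficient.

Normalization

Paper 1: Assessment of Single Cell RNA-Seq Normalization Methods, 2017 (doi: 10.1534/g3.117.040683)

It evaluates several normalization methods: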

fragments per kilobase of transcript per million mapped reads (FPKM) (Mortazavi et al., 2008)

upper quartile (UQ) (Bullard et al., 2010)

trimmed mean of M-values (TMM) (Robinson and Oshlack, 2010)

DESeq (Love et al., 2014)

remove unwanted variation (RUV) (Risso et al., 2014)

gamma regression model (GRM) (Ding et al., 2015)
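
To make the first two entries in the list concrete, here is a minimal sketch of FPKM and upper-quartile scaling for a single sample. The toy counts and gene lengths are made up for illustration, and the percentile rule (linear interpolation over the nonzero counts of one sample) is an assumption; published implementations differ in such details.

```python
def fpkm(counts, lengths_bp):
    """FPKM for one sample; counts and gene lengths are parallel lists."""
    total = sum(counts)  # total mapped fragments in this sample
    if total == 0:
        return [0.0] * len(counts)
    # fragments per kilobase of transcript per million mapped fragments
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]


def upper_quartile_scale(counts):
    """Divide counts by the 75th percentile of the nonzero counts."""
    nonzero = sorted(c for c in counts if c > 0)
    if not nonzero:
        return [0.0] * len(counts)
    # linear-interpolation percentile (one of several common conventions)
    pos = 0.75 * (len(nonzero) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(nonzero) - 1)
    uq = nonzero[lo] + (pos - lo) * (nonzero[hi] - nonzero[lo])
    return [c / uq for c in counts]


# Hypothetical toy data: 4 genes in one sample.
toy_counts = [10, 0, 30, 60]
toy_lengths = [1000, 2000, 1500, 3000]  # gene lengths in bp
fpkm_values = fpkm(toy_counts, toy_lengths)
uq_values = upper_quartile_scale(toy_counts)
```

FPKM corrects for both gene length and sequencing depth, while upper-quartile scaling only rescales each sample by a depth-like factor.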

Paper 2: Performance Assessment and Selection of Normalization Procedures for Single-Cell RNA-Seq, 2019 (doi: 10.1016/j.cels.2019.03.010)

It mainly presents the scone method: a flexible framework for assessing performance based on a comprehensive panel of data-driven metrics (http://bioconductor.org/packages/scone/)

There are many other methods as well, for example: LSF (Lun Sum Factors), BigNorm, SCnorm, BASiCS, RLE (size factor relative log expression)

Dimensionality reduction
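
As an illustration of the RLE idea (median-of-ratios size factors, the approach used by DESeq), here is a minimal sketch in plain Python. The function name and the toy matrix are hypothetical, and dropping every gene that has a zero in any sample is a simplifying assumption; with the many zeros typical of single-cell data, few genes may survive that filter.

```python
import math
from statistics import median


def rle_size_factors(counts):
    """counts[g][s] is the count of gene g in sample (cell) s."""
    n_samples = len(counts[0])
    # keep only genes with a positive count in every sample, so logs are defined
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene is expressed in every sample")
    # per-gene log geometric mean across samples (the pseudo-reference)
    log_ref = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for s in range(n_samples):
        # size factor = median ratio of this sample to the pseudo-reference
        log_ratios = [math.log(row[s]) - ref for row, ref in zip(usable, log_ref)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy matrix: sample 2 is sequenced twice as deeply as sample 1.
toy = [
    [10, 20],
    [100, 200],
    [5, 10],
    [0, 7],  # dropped: zero in the first sample
]
size_factors = rle_size_factors(toy)
```

Dividing each sample's counts by its size factor puts the samples on a common scale; here the second factor comes out twice as large as the first.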

PDF: https://lib.ugent.be/fulltxt/RUG01/002/349/740/RUG01-002349740_2017_0001_AC.pdf

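Well worth a careful read; it covers a lot about the principles and applications of dimensionality reduction.

Section 1.5.1 of the text (Clustering high-dimension to identify subtypes) states: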

Importantly, the reduced dimensionality data are less noisy than the high-dimensional data but lose some of the biological variance.

Paper 1: PCA, MDS, k-means, Hierarchical clustering and heatmap.

Paper 2: Outlier Preservation by Dimensionality Reduction Techniques

"MDS best choice for preserving outliers, PCA for variance, & T-SNE for clusters"

Identifying cell populations

Each term below corresponds to one paper.

Dimensionality reduction: PCA, tSNE, DM (diffusion maps)

Feature selection: M3Drop (Michaelis-Menten Modelling of Dropouts), HVG (highly variable genes), spike-in based methods, correlated expression
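
To show what HVG selection does in principle, here is a minimal sketch that ranks genes by their variance-to-mean ratio (Fano factor) and keeps the top few. This is a deliberately simplified stand-in, not the procedure of any particular package; the function name and toy values are hypothetical.

```python
from statistics import mean, pvariance


def top_variable_genes(expr, n_top):
    """expr maps gene name -> list of normalized expression values across cells."""
    scored = []
    for gene, values in expr.items():
        mu = mean(values)
        if mu == 0:
            continue  # never detected, so the ratio is undefined
        scored.append((pvariance(values) / mu, gene))
    # highest variance-to-mean ratio first
    scored.sort(reverse=True)
    return [gene for _, gene in scored[:n_top]]


# Hypothetical toy data: 4 genes measured in 4 cells.
toy_expr = {
    "geneA": [0, 0, 10, 10],  # on in half of the cells
    "geneB": [5, 5, 5, 5],    # flat
    "geneC": [4, 5, 6, 5],    # mild variation
    "geneD": [0, 0, 0, 0],    # never detected
}
hvgs = top_variable_genes(toy_expr, 2)
```

Genes that separate groups of cells, like the hypothetical geneA, rank highest, which is why HVGs are a common input to the dimensionality reduction and clustering steps.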

Seurat: an R package designed for the analysis and visualization of single cell RNA-seq data. It contains easy-to-use implementations of commonly used analytical techniques, including the identification of highly variable genes, dimensionality reduction (PCA, ICA, t-SNE), standard unsupervised clustering algorithms (density clustering, hierarchical clustering, k-means), and the discovery of differentially expressed genes and markers.

SC3: SC3 achieves high accuracy and robustness by consistently integrating different clustering solutions through a consensus approach. Tests on twelve published datasets show that SC3 outperforms five existing methods while remaining scalable, as shown by the analysis of a large dataset containing 44,808 cells. Moreover, an interactive graphical implementation makes SC3 accessible to a wide audience of users, and SC3 aids biological interpretation by identifying marker genes, differentially expressed genes and outlier cells.

tSNE+kmeans
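
The k-means half of this combination can be sketched in a few lines. The sketch below assumes a 2-D embedding has already been computed elsewhere (for example by some t-SNE implementation) and runs plain Lloyd's algorithm on it; the coordinates, function names, and seed are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm on a list of (x, y) tuples; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from the data
    labels = None
    for _ in range(n_iter):
        # assignment step: each point goes to its nearest center
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # update step: move each center to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels


# Hypothetical embedding coordinates: two well-separated groups of cells.
toy_embedding = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
labels = kmeans(toy_embedding, k=2)
```

Which group gets label 0 depends on the random initialisation, so only the grouping itself is meaningful, not the label values.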

SNN-Cliq: doi: 10.1093/bioinformatics/btv088

SINCERA: SINCERA: A Pipeline for Single-Cell RNA-Seq Profiling Analysis.

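Review: A systematic performance evaluation of clustering methods for single-cell RNA-seq data (SC3 and Seurat show the most favorable results)

On the various single-cell tools: https://www.scrna-tools.org/

The accompanying paper: Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database

Available on the 单细胞天地 WeChat official account

#第一期单细胞视频笔记汇总 (Roundup of single-cell video notes, series 1)

Judging from its table of contents, most of it teaches how to write code to get the desired results, so I have chosen to spend two days on background knowledge first [12.15-12.16] and then work through its content hands-on.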
