Differential analysis strategies for multiple groups
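What most of us learn is differential analysis between two groups of samples, followed by the standard workflow: volcano plot, heatmap, GO/KEGG annotation, and so on. In real projects, though, there are usually more than two groups, and then several strategies are available.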
One option is to compare the samples of one group against all samples from the remaining groups, and this strategy is quite popular!
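As a minimal sketch of the "one group vs. all the rest" idea, the snippet below collapses a multi-group sample annotation into a two-level label for a chosen target group. The function name `one_vs_rest_labels` and the sample labels are hypothetical and used only for illustration. Real analyses would feed such labels into a proper differential expression tool.

```python
from collections import Counter


def one_vs_rest_labels(groups, target):
    """Collapse multi-group labels into `target` vs "rest".

    groups: list of group names, one per sample (hypothetical input format).
    target: the group to compare against everything else.
    """
    if target not in groups:
        raise ValueError(f"target group {target!r} not found in sample labels")
    return [target if g == target else "rest" for g in groups]


# Illustrative labels only; "other_1"/"other_2" are placeholders, not real subtypes.
sample_groups = ["IM", "BL2", "IM", "other_1", "other_2", "BL2"]
labels = one_vs_rest_labels(sample_groups, "IM")
counts = Counter(labels)
print(labels)
print(counts)
```

Repeating this for every group yields one two-group comparison per group, each of which can go through the usual two-group workflow.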
Multiple molecular subtypes of TNBC
Take the paper published in January 2019 (https://doi.org/10.1002/1878-0261.12446): "Expression of long non-coding RNA ENSG00000226738 (LncKLHDC7B) is enriched in the immunomodulatory triple-negative breast cancer subtype and its alteration promotes cell migration, invasion, and resistance to cell death". The authors also experimentally validated the function of LncKLHDC7B (ENSG00000226738) and its neighboring gene KLHDC7B.
It uses exactly this strategy: "comparison of the samples belonging to a specific subtype against to the rest of the samples (e.g. IM vs other subtypes). A ≥ 1.5-fold change, ANOVA P-value less than or equal to 0.05, and false discovery rate (FDR) < 0.05 were considered as significant to detect expression changes between the TNBC subtypes, except for BL2 where FDR was < 0.5."
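The thresholds quoted above can be sketched as a simple filter. This is an illustration under assumptions, not the authors' code: it assumes the fold change applies in either direction and is supplied as log2 fold change, and it assumes the FDR is computed with the Benjamini-Hochberg procedure (the quote does not say which method was used). The function names and the toy values are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_significant(genes, fold_change=1.5, p_cutoff=0.05, fdr_cutoff=0.05):
    """genes: list of (name, log2_fold_change, p_value) tuples (assumed format)."""
    fdr = benjamini_hochberg([p for _, _, p in genes])
    min_abs_log2fc = math.log2(fold_change)
    return [
        name
        for (name, log2fc, p), q in zip(genes, fdr)
        if abs(log2fc) >= min_abs_log2fc and p <= p_cutoff and q < fdr_cutoff
    ]


# Toy values for illustration only, not taken from the paper.
toy = [("A", 1.0, 0.01), ("B", 0.2, 0.04), ("C", -0.8, 0.03), ("D", 2.0, 0.5)]
default_hits = select_significant(toy)
relaxed_hits = select_significant(toy, fdr_cutoff=0.5)  # the looser cutoff quoted for BL2
print(default_hits, relaxed_hits)
```

Keeping the FDR cutoff as a parameter makes it easy to apply the looser value that the quote reports for BL2 while leaving the other thresholds unchanged.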
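With this one-vs-rest strategy, every group ends up with its own differential analysis result, and each result can be annotated against GO/KEGG independently.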
It is worth mentioning that the authors first mined the Italian and Mexican TNBC datasets:
- GSE86945 Transcriptome characterization of triple negative breast cancer [Italy]
- GSE86946 Transcriptome characterization of triple negative breast cancer [Mexico]
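They then validated the findings in a Chinese cohort (GEO: GSE76250) and finally knocked down their gene of interest in the HCC1187 cell line, uploading that data as well (GEO: GSE114468).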
Sarcomas from multiple organs of origin
The paper I shared last month (DOI: https://doi.org/10.1016/j.cell.2017.10.014) uses the same strategy.
First, the cancer subtypes:
- synovial sarcoma (SS)
- malignant peripheral nerve sheath tumor (MPNST)
- uterine leiomyosarcoma (ULMS)
- dedifferentiated liposarcomas (DDLPS)
- undifferentiated pleomorphic sarcomas (UPS)
- myxofibrosarcomas (MFS)
- leiomyosarcoma (LMS)
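Each subtype is then compared against all the remaining samples, and the paper simply presents the resulting differential genes and their GO annotations.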
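The per-subtype loop described here can be sketched as follows. This is only an effect-size illustration under stated assumptions: expression values are assumed to be already on a log2 scale, the function name `one_vs_rest_log2fc` and the toy data are hypothetical, and no statistical test is performed (a real analysis would use a dedicated differential expression method rather than a plain difference of means).

```python
from statistics import mean


def one_vs_rest_log2fc(expr, subtypes):
    """Compute, for each subtype, the per-gene log2 fold change vs. all other samples.

    expr: dict mapping gene -> list of log2 expression values, one per sample.
    subtypes: list of subtype labels in the same sample order.
    """
    results = {}
    for target in sorted(set(subtypes)):
        in_group = [i for i, s in enumerate(subtypes) if s == target]
        rest = [i for i, s in enumerate(subtypes) if s != target]
        if not rest:
            continue  # only one subtype present, nothing to compare against
        results[target] = {
            gene: mean(values[i] for i in in_group) - mean(values[i] for i in rest)
            for gene, values in expr.items()
        }
    return results


# Toy data for illustration only; gene names and values are made up.
toy_subtypes = ["SS", "SS", "LMS", "LMS", "UPS", "UPS"]
toy_expr = {
    "GENE_A": [8.0, 8.0, 2.0, 2.0, 2.0, 2.0],
    "GENE_B": [1.0, 1.0, 1.0, 1.0, 5.0, 5.0],
}
per_subtype = one_vs_rest_log2fc(toy_expr, toy_subtypes)
for subtype, table in per_subtype.items():
    print(subtype, table)
```

Each entry of `per_subtype` is an independent gene table, so the top genes of every subtype can be passed to GO annotation separately.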
The full mind map of this literature report is at: https://mubu.com/doc/22yIgKWcTg