
[Paper 05] A primer on applying deep learning to genomics

2019-01-19  六六_ryx

Date: 2 February 2019 (2019, Week 5)
Category: Review + Resource
Title: A primer on deep learning in genomics
DOI: https://doi.org/10.1038/s41588-018-0295-5
Journal: Nature Genetics, 21 December 2018
Keywords: deep learning, genomics

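Deep learning is a branch of machine learning that uses neural networks to automatically extract new features from datasets. It has been applied successfully to image recognition and robotics (for example, self-driving cars), and it also plays an important role in big-data research. As sequencing technology advances, omics data are growing explosively, which makes deep learning a natural tool for genomics. Although the field is still at an early stage, deep learning shows great potential in cancer diagnosis and treatment, clinical genetics, crop improvement, epidemiology and public health, population genetics, evolutionary or phylogenetic analysis, and functional genomics.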

The paper is a primer on applying deep learning to genomics and covers the following topics:

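1. Basic concepts and methods of deep learning

The deep learning workflow

Related terminology:

2. How to use deep learning effectively

Main elements of deep learning and guidelines: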

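3. Interpreting deep learning models

In genomics applications, researchers care more about the biological mechanisms that a predictive model reveals.
For a CNN, for example, each convolutional filter can also be visualized as a heatmap or as a position weight matrix image; these visualizations help show which features the network is learning.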
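
To make the filter visualization idea concrete, here is a minimal sketch in plain Python. It assumes a first-layer filter stored as one row of four weights (A, C, G, T) per position, and it uses a column-wise softmax to turn those weights into a position weight matrix. That normalization is only one possible choice and is an assumption here, not the procedure from the paper; the weights, `filter_to_pwm`, `consensus`, and `text_heatmap` are all hypothetical names and values made up for illustration.

```python
import math

BASES = "ACGT"

def filter_to_pwm(weights):
    """Column-wise softmax: turn raw filter weights into per-position base probabilities.

    weights: list of positions, each a list of 4 weights ordered A, C, G, T.
    """
    pwm = []
    for column in weights:
        peak = max(column)  # subtract the max for numerical stability
        exps = [math.exp(w - peak) for w in column]
        total = sum(exps)
        pwm.append([e / total for e in exps])
    return pwm

def consensus(pwm):
    """Most probable base at each position."""
    return "".join(BASES[col.index(max(col))] for col in pwm)

def text_heatmap(pwm):
    """Render the PWM as one row per base, with denser characters for higher probability."""
    shades = " .:*#"
    lines = []
    for b, base in enumerate(BASES):
        cells = "".join(
            shades[min(int(col[b] * len(shades)), len(shades) - 1)] for col in pwm
        )
        lines.append(f"{base} {cells}")
    return "\n".join(lines)

# Hypothetical weights for a width-4 filter (made-up numbers, not from a trained model).
example_filter = [
    [2.0, 0.1, -1.0, 0.0],   # favours A
    [0.0, 0.2, 0.1, 3.0],    # favours T
    [-0.5, 0.0, 2.5, 0.3],   # favours G
    [0.1, 2.2, 0.0, -0.3],   # favours C
]

pwm = filter_to_pwm(example_filter)
print(consensus(pwm))
print(text_heatmap(pwm))
```

In practice the same matrix would usually be drawn as a colour heatmap or a sequence logo with a plotting library; the text rendering above only keeps the sketch dependency-free.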

4. Applications of deep learning in genomics

  • Khodabandelou, G., Mozziconacci, J. & Routhier, E. Genome functional annotation using deep convolutional neural network. Preprint at https://www.biorxiv.org/content/early/2018/05/25/330308 (2018).
  • Kelley, D. R., Snoek, J. & Rinn, J. L. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. 26, 990–999 (2016).
  • Quang, D. & Xie, X. DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. Nucleic Acids Res. 44, e107 (2016).
  • Li, Y., Shi, W. & Wasserman, W. W. Genome-wide prediction of cis-regulatory regions using supervised deep learning methods. BMC Bioinformatics 19, 202 (2018).

  • Xie, R., Wen, J., Quitadamo, A., Cheng, J. & Shi, X. A deep auto-encoder model for gene expression prediction. BMC Genomics 18 (Suppl. 9), 845 (2017).

  • Jha, A., Gazzara, M. R. & Barash, Y. Integrative deep models for alternative splicing. Bioinformatics 33, i274–i282 (2017).

  • Tripathi, R., Patel, S., Kumari, V., Chakraborty, P. & Varadwaj, P. K.
    DeepLNC, a long non-coding RNA prediction tool using deep neural
    network. Netw. Model. Anal. Health Inform. Bioinform. 5, 21 (2016).
  • Yu, N., Yu, Z. & Pan, Y. A deep learning method for lincRNA detection using auto-encoder algorithm. BMC Bioinformatics 18 (Suppl. 15), 511 (2017).
  • Hill, S. T. et al. A deep recurrent neural network discovers complex biological rules to decipher RNA protein-coding potential. Nucleic Acids Res. 46, 8105–8113 (2018).
  • Wang, Y. et al. Predicting DNA methylation state of CpG dinucleotide using
    genome topological features and deep networks. Sci. Rep. 6, 19598 (2016).
  • Angermueller, C., Lee, H. J., Reik, W. & Stegle, O. DeepCpG: accurate
    prediction of single-cell DNA methylation states using deep learning. Genome
    Biol. 18, 67 (2017).
  • Shaham, U. et al. Removal of batch effects using distribution-matching
    residual networks. Bioinformatics 33, 2539–2546 (2017).
  • Lin, C., Jain, S., Kim, H. & Bar-Joseph, Z. Using neural networks for reducing the dimensions of single-cell RNA-Seq data. Nucleic Acids Res. 45, e156 (2017).
  • Schreiber, J., Libbrecht, M., Bilmes, J. & Noble, W. Nucleotide sequence and DNaseI sensitivity are predictive of 3D chromatin architecture. Preprint at
    https://www.biorxiv.org/content/early/2017/01/30/103614 (2017).

  • Poplin, R. et al. Creating a universal SNP and small indel variant caller with deep neural networks. Preprint at https://www.biorxiv.org/content/early/2018/03/20/092890 (2017).

There are also methods that use deep learning for base calling on long-read data, for example:

  • Boža, V., Brejová, B. & Vinař, T. DeepNano: deep recurrent neural networks for base calling in MinION nanopore reads. PLoS One 12, e0178751 (2017).
  • Teng, H., Hall, M. B., Duarte, T., Cao, M. D. & Coin, L. Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning. Preprint at https://www.biorxiv.org/content/early/2017/08/23/179531 (2017).

  • Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).
  • Zhou, J. et al. Whole-genome deep learning analysis reveals causal role of noncoding mutations in autism. Preprint at https://www.biorxiv.org/content/early/2018/05/11/319681 (2018).
  • Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant
    effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).

5. Tools and resources for deep learning

6. An interactive tutorial on predicting DNA-binding motifs with a convolutional neural network

https://colab.research.google.com/drive/17E4h5aAOioh5DiTo7MZg4hpL6Z_0FyWr
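
As a rough illustration of the kind of model such a tutorial builds, the sketch below shows the forward pass of a very small motif detector in plain Python: one-hot encoding of a DNA sequence, a single convolutional filter with ReLU, global max pooling, and one sigmoid output unit. It is not the code from the linked notebook. The motif `TGCA`, the hand-set kernel, the bias and output weights, and the function names are all hypothetical; a real model would learn its weights from labelled sequences.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); other symbols such as N become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_scan(encoded, kernel, bias=0.0):
    """Slide the kernel along the sequence (valid padding, stride 1) and apply ReLU."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            for channel in range(4):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(max(0.0, total))
    return scores

def predict(seq, kernel, bias, out_weight, out_bias):
    """Conv filter -> global max pooling -> one sigmoid unit giving a binding probability."""
    activations = conv1d_scan(one_hot(seq), kernel, bias)
    pooled = max(activations) if activations else 0.0
    return 1.0 / (1.0 + math.exp(-(out_weight * pooled + out_bias)))

# Hand-set kernel that rewards the hypothetical motif "TGCA": +1 for the matching base, -1 otherwise.
motif = "TGCA"
kernel = [[1.0 if b == m else -1.0 for b in BASES] for m in motif]

with_motif = "AACCTGCAGGTT"      # contains TGCA
without_motif = "AAAAAAAAAAAA"   # does not contain TGCA
p_hit = predict(with_motif, kernel, bias=-3.0, out_weight=4.0, out_bias=-2.0)
p_miss = predict(without_motif, kernel, bias=-3.0, out_weight=4.0, out_bias=-2.0)
print(round(p_hit, 3), round(p_miss, 3))
```

With the bias set to -3, only a perfect four-base match produces a positive activation, so max pooling reports whether the motif occurs anywhere in the sequence regardless of its position. That position invariance is the main reason convolution plus pooling suits motif detection.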
