Machine-learning Approaches to Functional Genomics:

From Big Data to Understanding of Human Disease

Olga Troyanskaya

Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University

Thursday, December 29th, 2016 at 14:00

Room 570, Education Building, 5th floor

Abstract

An immense molecular complexity forms the foundation of human disease. This complexity must be interpreted and distilled through algorithms that enable accurate data-driven modeling at the genomic, cellular, and organ levels. At the genomic level, understanding the functional effects of mutations, especially in a specific cellular context, is a major challenge. DeepSEA, our deep learning-based algorithmic framework for predicting the chromatin effects of sequence alterations with single-nucleotide sensitivity, addresses this challenge, and we use it to predict the chromatin effects of sequence variants and to prioritize disease-associated variants. At the cellular level, although cell-lineage-specific gene expression and function underlie the development, function, and maintenance of diverse cell types within an organism and are critical to understanding the molecular basis of disease, high-throughput data are rarely resolved with respect to specific cell lineages. I will present our recent work on integrative Bayesian approaches that leverage functional genomics data collections to study how cellular pathways function in diverse cell types, enabling molecular-level understanding of human disease. I will describe how integrated analysis of functional genomics data can be used to study tissue-lineage-specific protein function and interactions, and to identify genes involved in disease through a novel approach for re-prioritizing the results of quantitative genetics studies. This will include applications to cardiovascular disease and autism.