T11 – Learning with structured (semantic) biological data

Short Description

Ontologies have long provided a core foundation for organizing biomedical entities, their attributes, and their relationships. With over 500 biomedical ontologies currently available, new and exciting opportunities are emerging for using ontologies in large-scale data sharing and data analysis. Additionally, the life sciences have adopted Semantic Web technologies for organizing large parts of their infrastructure, with most major databases available in Semantic Web formats and using ontologies for annotation. The challenge now is to use this structured information in data analysis and to build predictive models from it. Recently, several methods have become available that apply representation learning to knowledge graphs and that can be used to predict biological relations, determine similarity between biological entities, and provide features for machine learning applications.

This tutorial will introduce recent methods for learning with structured biological data, focusing mainly on knowledge graph embeddings. We will explore several applications of these embeddings in protein function prediction, drug repurposing, prediction of gene-disease associations, and information retrieval from databases. The tutorial will include a significant hands-on component and focus on skills that can be applied to a large number of biological problems.
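
To give a flavor of how knowledge graph embeddings can be used, the sketch below scores candidate relations and compares entities using a translation-based (TransE-style) scoring function. It is only an illustration, not the tutorial's actual material: the entity and relation names, the two-dimensional vectors, and the helper functions are all hypothetical and hand-picked, whereas real embeddings would be learned from a knowledge graph.

```python
import numpy as np

# Toy, hand-picked 2-d embeddings. All names and values are hypothetical;
# in practice these vectors would be learned from a knowledge graph.
entity_vec = {
    "gene_A": np.array([1.0, 0.0]),
    "gene_B": np.array([0.9, 0.1]),
    "disease_X": np.array([1.0, 1.0]),
    "disease_Y": np.array([-1.0, 0.0]),
}
relation_vec = {"associated_with": np.array([0.0, 1.0])}


def transe_score(head, relation, tail):
    """TransE-style plausibility: -||h + r - t||. Values closer to 0 are more plausible."""
    diff = entity_vec[head] + relation_vec[relation] - entity_vec[tail]
    return -float(np.linalg.norm(diff))


def cosine_similarity(a, b):
    """Cosine similarity between two entity embeddings."""
    u, v = entity_vec[a], entity_vec[b]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def rank_tails(head, relation, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(candidates, key=lambda t: transe_score(head, relation, t), reverse=True)


if __name__ == "__main__":
    print(rank_tails("gene_A", "associated_with", ["disease_Y", "disease_X"]))
    print(cosine_similarity("gene_A", "gene_B"))
```

In this toy setting, ranking candidate diseases for a gene corresponds to predicting gene-disease associations, and the cosine similarity between two genes corresponds to an embedding-based similarity measure. The same vectors could also be passed as features to a downstream classifier.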