Protein interface prediction using graph convolutional networks

Fout, Alex M., author; Ben-Hur, Asa, advisor; Anderson, Chuck, committee member; Chitsaz, Hamidreza, committee member; Zhou, Wen, committee member

Files

Fout_colostate_0053N_14473.pdf (4.59 MB)

Date

2017

Authors

Fout, Alex M., author

Ben-Hur, Asa, advisor

Anderson, Chuck, committee member

Chitsaz, Hamidreza, committee member

Zhou, Wen, committee member

Abstract

Proteins play a critical role in processes both within and between cells through their interactions with each other and with other molecules. Proteins interact via an interface to form a protein complex. Determining that interface experimentally is difficult, expensive, and time-consuming, which has given rise to computational approaches. These approaches use known electrochemical properties of a protein's amino acid residues to predict whether each residue is part of an interface. Prediction can occur in a partner-independent fashion, where amino acid residues are considered independently of their neighbor, or in a partner-specific fashion, where pairs of potentially interacting residues are considered together. Ultimately, prediction of protein interfaces can help illuminate cellular biology, improve our understanding of diseases, and aid pharmaceutical research. Interface prediction has historically been performed with a variety of methods, including docking, template matching, and, more recently, machine learning. The field of machine learning has undergone a revolution of sorts with the emergence of convolutional neural networks as the leading method for a wide range of tasks. Enabled by large quantities of data and the increasing power and availability of computing resources, convolutional neural networks efficiently detect patterns in grid-structured data and generate hierarchical representations that prove useful for many types of problems. This success motivated the work presented in this thesis, which seeks to improve upon state-of-the-art interface prediction methods by incorporating concepts from convolutional neural networks. Proteins are inherently irregular and do not easily conform to a grid structure; a graph representation is much more natural. Various convolution operations have been proposed for graph data, each geared toward a particular application.
We adapted these convolutions for use in interface prediction and proposed two new variants. Neural networks were trained on the complexes in Docking Benchmark Dataset version 4.0 and tested on the new complexes added in version 5.0. Results were compared against the state-of-the-art partner-specific method, PAIRpred [1]. Multiple variants of graph convolution outperform PAIRpred, with no single variant emerging as the clear winner. In the future, additional training data may be incorporated from other sources, unsupervised pretraining such as autoencoding may be employed, and a generalization of convolution to simplicial complexes may be explored. In addition, the various graph convolution approaches may be applied to other problems with graph-structured data, such as Quantitative Structure-Activity Relationship (QSAR) learning and knowledge base inference.

Subject

graph convolution

neural networks

deep learning

structural bioinformatics

machine learning

URI

https://hdl.handle.net/10217/185661
https://doi.org/10.25675/3.021826

Collections

2000-2019
Theses and Dissertations
