Thursday, June 21, 2007

Structures in Systems Biology (a double bill)

Once in a while I get to write about what I have been working on. The last time it was about the evolution of protein interaction networks. This time it is about two papers that I contributed to: a review about the use of structures in systems biology and an article about structure-based prediction of Ras/RBD interactions. I am sorry to say that both require a subscription (pedrobeltrao *at* gmail).

Main conclusions
Structural data can be used to predict Ras/RBD interactions with approximately 80% accuracy.
We can and should use structural information to understand the main molecular properties before abstracting away the atomic details. Structural genomics can serve as a bridge between the abstract network view and the atomic detail.

The Making of
Although I am not the first author of the article, I think it is safe to say that the main inspiration for the line of work done by Kiel (see also previous publication) is the work by Aloy and Russell, where they first showed that it was possible to use the structure of a protein complex to predict whether similar proteins would be able to interact in a similar way. What Kiel showed is that more accurate predictions can be made by modeling the protein domains under test onto the complex and evaluating the binding energy using FoldX, a protein design program under development in the lab. She used pull-down experiments and available information on Ras/RBD interactions to benchmark the predictions.
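To make the benchmarking step concrete, here is a minimal sketch of how predicted binding energies could be scored against pull-down results. It is not the procedure from the paper: the pair names, the energies, the experimental outcomes and the -5.0 kcal/mol cutoff are all made-up values for illustration, and the only assumption carried over is that a more negative predicted energy means more favorable binding.

```python
def predicts_binding(energy, threshold):
    """Call a pair 'binding' when its predicted energy is at or below the cutoff.

    More negative energies are taken to mean more favorable binding.
    """
    return energy <= threshold


def benchmark_accuracy(predicted_energies, observed_binding, threshold):
    """Fraction of pairs where the prediction agrees with the experiment."""
    correct = sum(
        predicts_binding(predicted_energies[pair], threshold) == observed
        for pair, observed in observed_binding.items()
    )
    return correct / len(observed_binding)


# Hypothetical predicted binding energies (kcal/mol) for four domain pairs.
predicted_energies = {
    "pair_a": -8.0,
    "pair_b": -2.0,
    "pair_c": -6.5,
    "pair_d": -5.5,
}

# Hypothetical pull-down outcomes: True means binding was observed.
observed_binding = {
    "pair_a": True,
    "pair_b": False,
    "pair_c": True,
    "pair_d": False,
}

# Illustrative cutoff only; a real cutoff would have to be calibrated on data.
accuracy = benchmark_accuracy(predicted_energies, observed_binding, threshold=-5.0)
print(f"accuracy: {accuracy:.2f}")
```

In this toy set three of the four calls agree with the experiment. With real data the cutoff would be one of the things being evaluated, not a fixed number.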

The predicted binding energies inform us about the probability that the two protein domains would bind in vitro. Inside the cell there are many other factors contributing to the likelihood of binding (gene expression, localization, complex formation, post-translational modifications, etc.). To try to add some of this knowledge to the predictions, I contributed a Naive Bayes predictor that combines information on gene expression, GO functions, conserved physical/genetic interactions in other species and shared binding partners. The likelihood score obtained can be used to further rank the predicted interactions according to the likelihood that they are occurring inside the cell. The supplementary information contains the methods and tables with individual likelihood scores that can be used to reproduce the Naive Bayes predictor.
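For readers who have not used this kind of predictor, the sketch below shows the general idea of combining per-evidence likelihood ratios under the Naive Bayes assumption that the evidence types are independent. It is only an illustration: the protein pair names and all likelihood ratio values are hypothetical and are not taken from the supplementary tables.

```python
import math


def log_likelihood_score(likelihood_ratios):
    """Sum of log likelihood ratios.

    Under the Naive Bayes independence assumption the individual ratios
    multiply, so their logs add up.
    """
    return sum(math.log(lr) for lr in likelihood_ratios)


def rank_pairs(candidates):
    """Return (pair, score) tuples sorted from most to least likely."""
    scored = {
        pair: log_likelihood_score(evidence.values())
        for pair, evidence in candidates.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical likelihood ratios per evidence type. A ratio above 1 supports
# an interaction inside the cell and a ratio below 1 argues against it.
candidates = {
    ("RasA", "RBD1"): {
        "coexpression": 2.0,
        "shared_go": 3.0,
        "conserved_interaction": 5.0,
        "shared_partners": 1.5,
    },
    ("RasA", "RBD2"): {
        "coexpression": 0.5,
        "shared_go": 1.0,
        "conserved_interaction": 1.0,
        "shared_partners": 0.8,
    },
}

ranking = rank_pairs(candidates)
for pair, score in ranking:
    print(pair, round(score, 2))
```

The first pair gets a combined ratio of 2.0 x 3.0 x 5.0 x 1.5 = 45 and ranks above the second, whose combined ratio is 0.4. Working in log space keeps the score additive and avoids numerical trouble when many evidence types are multiplied together.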

From atoms to nodes and edges
I think one of the main goals of the review was to show the progress that has been made in using structural information to obtain the fundamental properties (binding sites, catalytic sites, protein dynamics, etc.) of cellular components that may allow us to create models of cellular functions. There has been some work on bringing the very abstract "nodes and edges" view of cellular interactions closer to a more traditional pathway model. This has typically been done by searching for modules and for particular node roles that depend on the patterns of intra- or inter-module interactions (see Guimera et al). We should be able to automatically decorate interaction networks (and the pathway modules) with structural data that can further help to computationally generate meaningful models of cellular functions.
The picture was obtained from Beltrao et al. It is Copyright © 2007 Elsevier Ltd and is used here, hopefully, under fair use.

In the pipeline
There are several important details to iron out before we can just apply this structure-based prediction of protein interactions to any protein that we can model onto complexes. We are in the process of testing the approach with other domain types. I have been more directly involved in some of this work, and we have now started the submission process. I tried to get everyone to agree to submit it to a preprint server but not everyone was comfortable with the idea.