Research at the Informatics Lab

My lab studies algorithms that explain protein binding preferences. Our efforts over the last few years have focused on detecting steric and electrostatic influences on specificity from an analysis of protein structures. We have also developed several techniques that compensate for certain kinds of conformational flexibility in protein structures. Our work includes fundamental algorithms research with a focus on geometric representations, statistical models, and biophysical phenomena. We also develop applications of these algorithms to families of proteins that have very complex binding preferences. We are aiming to create an integrated and general system for specificity annotation: a system that finds all the components of a protein structure that influence specificity and explains why these components are influential.

Background, briefly.

Proteins are chains of amino acids that perform biochemical functions, such as transporting oxygen, by physically attaching to or binding other molecules. Similar-functioning proteins specialize further by preferentially binding different partners. These preferences are called binding specificities, and they play a central role in coordinating biological systems. By curbing unproductive interactions, they organize randomly colliding molecules into teams that perform chemical necessities for the cell.

Specificity is a mechanical consequence of protein structure. Just as the tip of a screwdriver determines which screws it can turn, complementarity between molecule shapes can limit or enable binding through a biophysical effect called steric hindrance (Fig. 1). Other biophysical effects also have a critical influence on specificity, such as the attraction and repulsion of electric fields or the formation of hydrogen bonds. Proteins achieve these effects by strategically placing chemically appropriate amino acids near binding sites, the regions where proteins bind other molecules. To understand how specificity is achieved, we must examine many combinations of interacting amino acids to find the subset that actually selects binding partners. It is a search in a combinatorial space, but once found and understood, mechanisms of specificity can inform us about how a protein participates in a larger system.

The goal of understanding or manipulating biological systems drives many investigators to discover why proteins bind some partners and how they can be re-engineered to prefer others. These efforts are ubiquitous in the basic sciences, such as molecular or evolutionary biology, where understanding specificity is part of studying living systems. Similar efforts are common in applied fields as well: People study why mutations change the specificity of cancer proteins, causing tumors to become drug resistant. Others try to produce antibodies that selectively bind disease proteins to assist a patient’s immune system. My lab seeks to accelerate these efforts by creating algorithms to find the mechanisms that control specificity and to explain how they work. We call the challenge of designing such algorithms the specificity annotation problem.

Algorithms Research

The bulk of our recent publications have focused on creating software that identifies influences on specificity through the geometric analysis of biophysical properties. In particular, we are exploring the space of techniques that use constructive solid geometry (CSG) to analyze geometric variation. CSG was developed originally as a modeling technique for computer-aided design and computer graphics, and it can still be found in classic tools like Maya and AutoCAD. Rather than using CSG to draft mechanical objects and to model animated characters, we are using CSG to find similarities and differences in shape. Because CSG can isolate individual similarities and differences, each isolated region points to a potential influence on specificity, which is the information that specificity annotation requires.
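As a rough illustration of how CSG operations can isolate shape similarities and differences, the sketch below compares two solids with Boolean intersection and difference. It is a minimal example under loud assumptions: the solids are approximated as sets of grid cells, spheres stand in for binding-site shapes, and the names `voxelize_sphere` and `csg_compare` are hypothetical. It is not the lab's software, which may represent solids quite differently.

```python
from itertools import product


def voxelize_sphere(center, radius, spacing=1.0, extent=6):
    """Return the set of grid cells (i, j, k) whose centers lie inside a sphere.

    Assumption: a sphere is only a stand-in for a binding-site solid.
    """
    cx, cy, cz = center
    cells = set()
    for i, j, k in product(range(-extent, extent + 1), repeat=3):
        x, y, z = i * spacing, j * spacing, k * spacing
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2:
            cells.add((i, j, k))
    return cells


def csg_compare(site_a, site_b, spacing=1.0):
    """Measure the shared and unique volumes of two voxelized solids.

    Intersection captures what the shapes have in common; the two
    differences capture regions present in one shape but not the other.
    """
    cell_volume = spacing ** 3
    return {
        "shared": len(site_a & site_b) * cell_volume,  # CSG intersection
        "only_a": len(site_a - site_b) * cell_volume,  # CSG difference A - B
        "only_b": len(site_b - site_a) * cell_volume,  # CSG difference B - A
    }


if __name__ == "__main__":
    a = voxelize_sphere((0, 0, 0), 3)
    b = voxelize_sphere((2, 0, 0), 3)
    print(csg_compare(a, b))
```

In this toy setting, a large `only_a` or `only_b` region would be a candidate difference worth examining, while the `shared` region reflects common shape.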

We are also developing techniques that enable the comparison of flexible protein structures. Whereas earlier methods compensated for dramatic motions in tertiary structure, we are concerned with subtle effects at the sidechain level that make binding sites appear different. By combining methods from structure prediction and molecular simulation, we have found ways to normalize binding site geometry as it often exists, thereby permitting fairer comparisons.


We are collaborating with Katya Scheinberg and her group on the application of mathematical optimization to the superposition of protein binding sites. We use derivative-free optimization to systematically search for a superposition of two solid objects that maximizes their overlapping volume. This technology combines our own CSG-based algorithms, which measure volumetric overlap, with optimization techniques that search efficiently for effective superpositions worth testing. The result is a powerful technique for binding site superposition and also a fascinating case study in the challenge of optimization on noisy objective functions.
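The idea of searching for a superposition that maximizes overlapping volume, without using derivatives, can be sketched as follows. This is an illustration under stated assumptions rather than the method used in the collaboration: the solids are spheres sampled on a grid, only translations are searched (a real superposition would also include rotations), and compass search is swapped in as one simple derivative-free technique. The function names `sample_sphere`, `overlap_volume`, and `compass_search` are hypothetical.

```python
import math
from itertools import product


def sample_sphere(center, radius, spacing=0.5):
    """Grid points inside a sphere; a crude sampled stand-in for a solid."""
    n = int(math.ceil(radius / spacing))
    points = []
    for i, j, k in product(range(-n, n + 1), repeat=3):
        p = (center[0] + i * spacing,
             center[1] + j * spacing,
             center[2] + k * spacing)
        if math.dist(p, center) <= radius:
            points.append(p)
    return points


def overlap_volume(points_a, center_b, radius_b, shift, spacing=0.5):
    """Estimate the volume of A intersected with sphere B translated by shift."""
    moved = tuple(c + s for c, s in zip(center_b, shift))
    count = sum(1 for p in points_a if math.dist(p, moved) <= radius_b)
    return count * spacing ** 3


def compass_search(objective, start, step=1.0, min_step=0.05):
    """Maximize objective by probing +/- step along each axis.

    No derivatives are needed: a trial is accepted only if it strictly
    improves the objective, and the step is halved when no trial helps.
    """
    best = tuple(start)
    best_value = objective(best)
    while step >= min_step:
        improved = False
        for axis in range(len(best)):
            for sign in (1.0, -1.0):
                trial = list(best)
                trial[axis] += sign * step
                value = objective(tuple(trial))
                if value > best_value:
                    best, best_value = tuple(trial), value
                    improved = True
        if not improved:
            step /= 2.0
    return best, best_value


if __name__ == "__main__":
    solid_a = sample_sphere((0.0, 0.0, 0.0), 2.0)
    objective = lambda s: overlap_volume(solid_a, (1.5, -1.0, 0.5), 2.0, s)
    shift, volume = compass_search(objective, (0.0, 0.0, 0.0))
    print("shift:", shift, "overlap volume:", volume)
```

Because the sampled overlap changes in discrete jumps as the shift varies, the objective is piecewise constant, which gives a small taste of why gradient-based methods are a poor fit for this kind of problem.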

We have also recently started a collaboration with Ilker Hacihaliloglu on the superposition of voxel-based bone images. Superpositions of this sort could reduce the need to implant surgical markers that help indicate a bone's specific orientation in medical imaging. As a result, some surgical procedures could be eliminated, reducing surgical risk, recovery time, and cost.