Protein-Protein Docking
Protein-protein docking is the determination of the molecular structure of complexes formed by two or more proteins without the need for experimental
measurement. The study of protein-protein docking was boosted by the
rapid increase in available protein structures in the 1990s, and it has
now been under intensive research for over a decade. Many proteins
which remain relatively rigid upon complexation can now be successfully
docked. Methods are under development to handle cases where the
internal conformation of one or more of the partners changes
substantially.
Protein-protein docking generally does not refer to describing the
path taken by the components during complexation; the only object of
docking is the final complexed state. Since the natural use of
"docking" suggests guidance along a path, "protein-protein docking" may
be regarded as a misnomer.
Introduction
The biological role of most proteins known to science, as
characterized by which other proteins they interact with, is
incompletely understood. Even proteins which participate in a
well-understood biological process (e.g. the Krebs cycle)
may have interaction partners or functions which are unrelated to that
process. Moreover, the genomic revolution of the late 1990s uncovered
vast numbers of "hypothetical" proteins about which nothing is known apart from their amino acid sequence.
In cases of known protein-protein interactions, other questions arise. Genetic diseases are known to be caused by misfolded or mutated proteins (e.g. cystic fibrosis),
and there is a desire to understand what, if any, anomalous
protein-protein interactions a given mutation can cause. In the distant
future, proteins may be designed to perform biological functions, and a
determination of the potential interactions of such proteins will be
essential.
For any given set of proteins, the following questions may arise:
- Do the proteins bind in vivo?
- If they bind:
  - What is the spatial configuration which they adopt in their bound state?
  - How strong or weak is their interaction?
- If they do not bind, can they be made to bind by inducing a mutation?
Protein-protein docking is proposed to have the ultimate potential
to address all these issues comprehensively. Furthermore, since docking
methods can be based on purely physical
principles, even proteins of unknown function (or which have been
studied relatively little) may be docked. The only prerequisite is that
their molecular structure has been either determined experimentally, or can be estimated by some theoretical technique (see protein structure prediction).
Rigid-body docking vs. flexible docking
If the bond angles, bond lengths and torsion angles of the components are not modified at any stage of complex generation, the procedure is known as rigid-body docking.
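The defining property of a rigid-body move can be checked numerically: a rotation plus a translation leaves every interatomic distance inside a component unchanged. The following is a minimal sketch, assuming a rotation about the z axis only and a handful of made-up coordinates; the function names `rigid_transform` and `pairwise_distances` are illustrative and not taken from any docking package.

```python
import math

def rigid_transform(coords, angle_z, translation):
    """Rotate points about the z axis by angle_z (radians), then translate."""
    c, s = math.cos(angle_z), math.sin(angle_z)
    tx, ty, tz = translation
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz)
            for x, y, z in coords]

def pairwise_distances(coords):
    """All distances between distinct pairs of points, in a fixed order."""
    return [math.dist(a, b)
            for i, a in enumerate(coords)
            for b in coords[i + 1:]]

# Hypothetical atom positions (arbitrary units), for illustration only.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3), (0.2, 2.0, -1.0)]
moved = rigid_transform(atoms, angle_z=math.radians(40),
                        translation=(10.0, -3.0, 2.5))

# Internal geometry is preserved up to floating-point error.
unchanged = all(abs(d0 - d1) < 1e-9
                for d0, d1 in zip(pairwise_distances(atoms),
                                  pairwise_distances(moved)))
```

A flexible docking procedure would, by contrast, apply moves for which `unchanged` is false, for example rotations about individual torsion angles.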
Whether rigid-body docking is adequate for most docking problems
remains an open question. When substantial conformational
change occurs within the components at the time of complex formation,
rigid-body docking is inadequate. However, scoring all possible
conformational changes is prohibitively expensive in computer time.
Docking procedures which permit conformational change, or flexible docking procedures, must intelligently select a small subset of possible conformational changes for consideration.
Methods
Successful docking must meet two criteria:
- Generating a set of configurations which reliably includes at least one nearly correct one.
- Reliably distinguishing nearly correct configurations from the others.
For many interactions, the binding site is known on one or more of the proteins to be docked. This is the case for antibodies and for competitive inhibitors. In other cases, a binding site may be strongly suggested by mutagenesis or phylogenetic evidence. Configurations where the proteins interpenetrate severely may also be ruled out a priori.
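Ruling out severely interpenetrating configurations can be sketched as a simple distance filter. The example below is only an illustration: the 3.0 distance cutoff, the zero-clash tolerance, and the function names `count_clashes` and `is_plausible` are assumptions made for this sketch, and a practical implementation would use a spatial grid rather than an all-pairs loop.

```python
import math

def count_clashes(atoms_a, atoms_b, min_distance=3.0):
    """Count atom pairs (one from each protein) closer than min_distance."""
    return sum(1 for a in atoms_a for b in atoms_b
               if math.dist(a, b) < min_distance)

def is_plausible(atoms_a, atoms_b, max_clashes=0, min_distance=3.0):
    """Keep a configuration only if it has at most max_clashes clashes."""
    return count_clashes(atoms_a, atoms_b, min_distance) <= max_clashes

# Made-up coordinates: one fixed partner and two placements of the other.
protein_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
placement_far = [(0.0, 6.0, 0.0), (3.8, 6.0, 0.0)]
placement_overlapping = [(0.5, 0.5, 0.0), (3.8, 6.0, 0.0)]

keep_far = is_plausible(protein_a, placement_far)
keep_overlapping = is_plausible(protein_a, placement_overlapping)
```

Here the first placement is kept and the second is discarded before any scoring is attempted.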
After making exclusions based on prior knowledge or stereochemical
clash, the remaining space of possible complexed structures must be
sampled exhaustively, evenly and with sufficient coverage to
guarantee a near hit. Each configuration must be scored with a measure
that is capable of ranking a nearly correct structure above at least
100,000 alternatives. This is a computationally intensive task, and a
variety of strategies have been developed.
Reciprocal space methods
Each of the proteins may be represented as a simple cubic lattice. Then, for the class of scores which are discrete convolutions,
configurations related to each other by translation of one protein by
an exact lattice vector can all be scored almost simultaneously by
applying the convolution theorem.[1]
It is possible to construct reasonable, if approximate,
convolution-like scoring functions representing both stereochemical and
electrostatic fitness.
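The use of the convolution theorem can be illustrated with a small sketch, assuming NumPy is available. Two toy lattices stand in for the proteins, and a single pair of forward transforms plus one inverse transform yields the score of every cyclic lattice translation. The grid size, the block shapes, and the core penalty of -15 are arbitrary values chosen for illustration, and `translation_scores` and `direct_score` are hypothetical names. Rotations are not covered by the transform and would need an outer loop.

```python
import numpy as np

def translation_scores(receptor, ligand):
    """Score all cyclic lattice translations of `ligand` at once.

    scores[t] = sum over x of receptor[x] * ligand[x - t],
    with indices taken modulo the grid size.
    """
    spectrum = np.fft.fftn(receptor) * np.conj(np.fft.fftn(ligand))
    return np.fft.ifftn(spectrum).real

def direct_score(receptor, ligand, shift):
    """Same score for one translation, computed without transforms."""
    moved = np.roll(ligand, shift, axis=(0, 1, 2))
    return float(np.sum(receptor * moved))

n = 16
receptor = np.zeros((n, n, n))
receptor[4:9, 4:9, 4:9] = 1.0    # outer layer: contact is rewarded
receptor[5:8, 5:8, 5:8] = -15.0  # core: overlap is penalized
ligand = np.zeros((n, n, n))
ligand[0:3, 0:3, 0:3] = 1.0      # small cubic "ligand"

scores = translation_scores(receptor, ligand)
best = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
```

Evaluating `direct_score` for every one of the `n**3` translations would cost on the order of `n**6` operations, whereas the transform route costs on the order of `n**3 log n`, which is the source of the speed advantage.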
Reciprocal space methods have been used extensively for their
ability to evaluate enormous numbers of configurations. They lose their
speed advantage if torsional changes are introduced. Another drawback
is that it is impossible to make efficient use of prior knowledge. The
question also remains whether convolutions are too limited a class of
scoring function to identify the best complex reliably.
Monte Carlo methods
In Monte Carlo methods,
an initial configuration is refined by taking random steps which are
accepted or rejected based on their induced improvement in score (see
the Metropolis criterion),
until a certain number of steps have been tried. The assumption is that
convergence to the best structure should occur from a large class of
initial configurations, only one of which needs to be considered.
Initial configurations may be sampled coarsely, and much computation
time can be saved. Because of the difficulty of finding a scoring
function which is both highly discriminating for the correct
configuration and also converges to the correct configuration from a
distance, the use of two levels of refinement, with different scoring
functions, has been proposed.[2] Torsional changes can be introduced naturally into Monte Carlo methods as an additional component of each random move.
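The refinement loop can be sketched as follows. This is a minimal illustration, not the procedure of any particular docking program: it assumes that lower scores are better, represents a configuration as a short list of numbers, and uses a quadratic toy score in place of a real scoring function. The names `metropolis_refine` and `toy_score`, and all parameter values, are hypothetical.

```python
import math
import random

def metropolis_refine(score, start, n_steps=2000, step_size=0.5,
                      temperature=1.0, seed=0):
    """Refine a configuration by random steps with Metropolis acceptance."""
    rng = random.Random(seed)
    current = list(start)
    current_score = score(current)
    best, best_score = list(current), current_score
    for _ in range(n_steps):
        # Propose a random perturbation of every coordinate.
        trial = [x + rng.uniform(-step_size, step_size) for x in current]
        trial_score = score(trial)
        delta = trial_score - current_score
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score

def toy_score(pose):
    """Stand-in score: squared distance from an arbitrary target pose."""
    target = (1.0, -2.0, 0.5)
    return sum((p - t) ** 2 for p, t in zip(pose, target))

pose, value = metropolis_refine(toy_score, start=(5.0, 5.0, 5.0))
```

In a docking setting the list would hold the rigid-body parameters of one partner, and torsion angles could be appended to the same list so that each random move also perturbs them.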
Monte Carlo methods are not guaranteed to search exhaustively, so
that the best configuration may be missed even using a scoring function
which would in theory identify it. How severe a problem this is for
docking has not been firmly established.
Selecting the docked complex structure
To find a score which forms a consistent basis for selecting the
best configuration, studies are carried out on a standard benchmark (see below)
of protein-protein interaction cases. Scoring functions are assessed on
the rank they assign to the best structure (ideally the best structure
should be ranked 1), and on their coverage (the proportion of the
benchmark cases for which they achieve an acceptable result). Types of
scores studied include:
- Heuristic scores based on residue contacts.
- Shape complementarity of molecular surfaces ("stereochemistry").
- Free energies, estimated using parameters from molecular mechanics force fields such as CHARMM or AMBER.
- Phylogenetic desirability of the interacting regions.
- Clustering coefficients.
It is usual to create hybrid scores by combining two or more of the
categories above in a weighted sum whose weights are optimized on cases
from the benchmark. To avoid bias, the benchmark cases used to optimize
the weights must not overlap with the cases used to make the final test
of the score.
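A hybrid score and the two assessment measures (rank and coverage) can be sketched in a few lines. Everything in this example is invented for illustration: the term names, the weights, the candidate values, and the rank cutoff of 10 are assumptions, and higher scores are taken to be better.

```python
def hybrid_score(terms, weights):
    """Weighted sum of individual score terms for one configuration."""
    return sum(weights[name] * value for name, value in terms.items())

def rank_of(candidates, weights, target_id):
    """1-based rank of the configuration `target_id` under the hybrid score."""
    ordered = sorted(candidates,
                     key=lambda c: hybrid_score(c["terms"], weights),
                     reverse=True)
    return 1 + [c["id"] for c in ordered].index(target_id)

def coverage(ranks, cutoff=10):
    """Proportion of benchmark cases whose best structure ranks within cutoff."""
    return sum(1 for r in ranks if r <= cutoff) / len(ranks)

# Made-up score terms for three candidate configurations of one case.
candidates = [
    {"id": "near_native", "terms": {"shape": 0.90, "contacts": 0.7, "energy": 0.8}},
    {"id": "decoy_a", "terms": {"shape": 0.95, "contacts": 0.2, "energy": 0.3}},
    {"id": "decoy_b", "terms": {"shape": 0.40, "contacts": 0.8, "energy": 0.5}},
]
shape_only = {"shape": 1.0, "contacts": 0.0, "energy": 0.0}
hybrid = {"shape": 0.5, "contacts": 0.3, "energy": 0.2}

rank_shape = rank_of(candidates, shape_only, "near_native")
rank_hybrid = rank_of(candidates, hybrid, "near_native")
example_coverage = coverage([1, 3, 25, 7], cutoff=10)
```

In practice the weights would be fitted on one subset of benchmark cases and `rank_of` and `coverage` evaluated on a disjoint subset.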
Benchmark
A benchmark of 84 protein-protein interactions with known complexed structures has been developed for testing docking methods.[3]
The set is chosen to cover a wide range of interaction types, and to
avoid repeated features, such as the profile of interactors' structural
families according to the SCOP
database. Benchmark elements are classified into three levels of
difficulty (the most difficult containing the largest change in
backbone conformation). The protein-protein docking benchmark contains
examples of enzyme-inhibitor, antigen-antibody and homomultimeric
complexes.
The CAPRI assessment
The Critical Assessment of PRedicted Interactions (CAPRI)[4]
is an ongoing series of events in which researchers throughout the
community try to dock the same proteins, as provided by the assessors.
Rounds take place approximately every 6 months. Each round contains
between one and six target protein-protein complexes whose structures
have been recently determined experimentally. The coordinates are
held privately by the assessors, with the cooperation of the structural biologists who determined them. The assessment of submissions is double blind.
CAPRI attracts a high level of participation (37 groups participated
worldwide in round seven) and a high level of interest from the
biological community in general. Although CAPRI results are of little
statistical significance owing to the small number of targets in each
round, the role of CAPRI in stimulating discourse is significant. (The CASP assessment is a similar exercise in the field of protein structure prediction.)
Deciding whether a complex actually occurs in nature and measuring its affinity
A reliable method for affinity prediction has the potential to
transform biochemistry and cell biology. Though a distant prospect,
affinity prediction may be considered as the ultimate achievement in
protein-protein docking.
Protein-protein docking and molecular docking
The field of protein-protein docking is highly computationally oriented, and it shares approaches with molecular docking. Molecular docking is sometimes referred to as small-molecule docking, to distinguish it from protein-protein docking. Proteins complexed with polynucleotide molecules are widely studied using similar or identical approaches to protein-protein docking, although if the nucleotide molecule is small enough, the case may be framed as a molecular docking problem.
References
- [1] Katchalski-Katzir E, Shariv I, Eisenstein M, Friesem AA, Aflalo C, Vakser IA (1992). "Molecular surface recognition: determination of geometric fit between proteins and their ligands by correlation techniques". Proc. Natl. Acad. Sci. U.S.A. 89 (6): 2195–9. doi:10.1073/pnas.89.6.2195. PMID 1549581.
- [2] Gray JJ, Moughon S, Wang C, Schueler-Furman O, Kuhlman B, Rohl CA, Baker D (2003). "Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations". J. Mol. Biol. 331 (1): 281–99. doi:10.1016/S0022-2836(03)00670-3. PMID 12875852.
- [3] Mintseris J, Wiehe K, Pierce B, Anderson R, Chen R, Janin J, Weng Z (2005). "Protein-Protein Docking Benchmark 2.0: an update". Proteins 60 (2): 214–6. doi:10.1002/prot.20560. PMID 15981264.
- [4] Janin J, Henrick K, Moult J, Eyck LT, Sternberg MJ, Vajda S, Vakser I, Wodak SJ (2003). "CAPRI: a Critical Assessment of PRedicted Interactions". Proteins 52 (1): 2–9. doi:10.1002/prot.10381. PMID 12784359.
This article is licensed under the GNU Free Documentation License. It uses material from Wikipedia Encyclopedia article "Protein-Protein Docking"