Open access peer-reviewed chapter - ONLINE FIRST

Protein-Protein Docking Using Map Objects

By Xiongwu Wu and Bernard R. Brooks

Submitted: September 19th 2018Reviewed: December 13th 2018Published: January 23rd 2019

DOI: 10.5772/intechopen.83543

Downloaded: 95

Abstract

Protein-protein docking is a molecular modeling strategy to predict biomolecular complexes and assemblies. Traditional protein-protein docking is performed at atomic resolution, which relies on X-ray and NMR experiments to provide structural information. When dealing with biomolecular assemblies of millions of atoms, atomic description of molecular objects becomes very computational inefficient. This article describes a development work that introduces map objects to molecular modeling studies to efficiently derive complex structures through map-map conformational search. This method has been implemented into CHARMM as the EMAP command and into AMBER in its SANDER program. This development enables molecular modeling and simulation to manipulate map objects, including map input, output, comparison, docking, etc. Through map objects, users can efficiently construct complex structures through protein-protein docking as well as from electron microscopy maps according to low map energies. Using a T-cell receptor (TCR) variable domain and acetylcholine binding protein (AChBP) as example systems, we showed the application to model an energetic optimized complex structure according to a complex map. The map objects serve as a bridge between high-resolution atomic structures and low-resolution image data.

Keywords

  • protein-protein docking
  • molecular modeling
  • electron microscopy
  • molecular image
  • computational tool
  • protein complexes
  • biomolecular assembly

1. Introduction

Protein-protein docking has been a powerful approach to provide structural insights into biological procedures at atomic level [1, 2, 3, 4]. Based on the structural information provided by X-ray and NMR, as well as constraints derived from biological data such as mutagenesis observations, protein-protein docking can produce structures and interactions of protein complexes, which helps to illustrate structural mechanism of many biological processes [5, 6, 7].

New development in experimental technologies, such as electron microscopy, provides an approach to obtain low-resolution structure information of large molecules and their assemblies [8]. Extracting structure information from these low-resolution maps and obtaining atomic interpretation of the large biomolecular assemblies become a central piece of modern structural biology [9]. This requires molecular modeling to be conducted on these low-resolution maps, as well as high-resolution atomic structures, to maximize the capability in structural biology studies.

On the other hand, as the development of structural biology, molecular modeling is applied to larger and larger biomolecular machineries. As the biology systems become larger, atomic description of molecular system becomes very inefficient and time-consuming. Millions of atoms and their chemical structural become redundant in many of modeling studies. Therefore, it would be very efficient if large biomolecules are simplified to shape objects while ignoring their internal structures. Although molecular flexibility plays important roles in biological activities, in many cases, molecular geometric shapes plus surface properties are sufficient to describe many cellular processes such as molecule assembling and protein-protein binding. In these cases, it is satisfactory to describe large molecules as rigid domains. In some cases certain internal flexibility can be simplified to the motion of several rigid fragments. Therefore, molecular modeling of large biomolecular machinery can be achieved efficiently by simplifying biomolecules with simplified shape objects.

In this work, we introduce map objects to represent molecules with fixed structures to achieve efficient molecular modeling of large molecular systems and to efficiently derive structural information from low-resolution experimental maps. Map objects are designed to work with high-resolution atomic structures so that low-resolution maps are interchangeable with high-resolution atomic structures. A map object represents a property distribution over certain space, while a molecular structure is generally described by the coordinates of a set of atoms. This work describes an efficient approach to handle and manipulate map objects so that efficient molecular modeling of large systems can be performed.

2. Method and design

We introduce map objects to represent space occupation of molecular structures. Unlike chemical description of molecules that contain atoms that are linked by chemical bonds, a map does not have internal chemical structures. Instead, a map represents a spatial distribution of certain properties, typically electron density. This distribution generally is described as scalar values at discrete grid points due to irregularity of the distributions and limit in storage. Figure 1 shows a cartoon of a map object. As can be seen, a map objects contains three components.

Figure 1.

A map object and its properties.

2.1 Grid definition

The grid of a map object is defined by its starting position, x0, y0, and z0; grid intervals, dx, dy, and dz; and grid point numbers, nx, ny, and nz.

2.2 Molecular reference

Because we use map to represent a molecular structure, we use molecular reference to record which molecule this map is representing. Through reference molecule, map object and molecule coordinates become interchangeable.

2.3 Distribution properties

The distribution property describes the distribution of given property over the space covered by the grid points. This can be the electron density measured in experiment or properties generated from reference molecules.

Here are several typical types of map objects used in molecular modeling:

1. Electron density maps

This is the most widely used map type, which describes the electron density over the space. This type of map is often determined from electron microscopy. It can also be derived from molecular structure based on atomic coordinates and type.

2. Electric charge maps

This type of map is solely derived from molecular structures based on a force field. The partial charges of atoms are distributed to their nearest grid points.

3. Electric field maps

Because electrostatic interactions are long ranged, it is difficult to have a map to cover a very large space. Instead, we propose to use transformed coordinates:

X=xx+b,E1
x=bX1X,E2

where x is the real space coordinate, X is the reduced coordinate, and b is a constant controlling the reduction. X will take a range of (−1, 1) to represent a real space of x over .

4. VDW core maps

The VDW cores provide boundary to avoid overlapping between molecules. The core map is constructed based on the accessibility of a molecular structure. The surface has low core index, while the center has high index (the core indices are shown as the number in each grid box in Figure 1).

2.4 Rigid domains

Because a map object contains a large amount of data, it is inconvenient to perform movement on a map itself. For example, a rotation of a map object will result in the rectangular space not parallel to the coordinate axis anymore, and new boundaries and distributions need be updated accordingly. In addition, a real system often contains more than one copy of some molecular species, and it would be very memory costing to have a map object for each copy of these species. Instead, we define a rigid domain to represent a copy of the molecular species. A rigid domain contains only the identity of the map object it represents and the position and orientation vectors related to the map object, and can be manipulated easily. A rigid domain can be understood as a mobile representation of a map object. Each rigid domain has a unique identity, and many rigid domains can represent the same map object. Figure 2 shows the map objects of the α-chain and β-chain of a TCR variable domain and their manipulation through rigid domains.

Figure 2.

Rigid domains as a convenient way to manipulate map objects.

Each rigid domain is defined by its map ID and its translation vector, T, and rotational matrix, U:

T=txtytz,U=u11u12u13u21u22u23u31u32u33E3

The operation, translation, and rotation are done by applying these vectors:

Ti+1=Ti+ΔTi+1=txityitzi+Δtxi+1Δtyi+1Δtzi+1E4
Ui+1=Ωi+1×Ui=ω11i+1ω11i+1ω13i+1ω21i+1ω22i+1ω23i+1ω31i+1ω32i+1ω33i+1×u11iu11iu13iu21iu22iu23iu31iu32iu33iE5

and many operations can be accumulated without losing accuracy:

Tn=Tn1+ΔTn=Tn2+ΔTn+ΔTn1=T0)+i=1nΔTiE6
Un=Ωn×Un1=Ωn×Ωn1×Un2=i=1nΩi×U0E7

When a map object is created, a rigid domain at origin is created for it, with

T=000,U=100010001E8

A rigid domain can also be created for a molecular structure by linking the structure to a given map object according to the translation vector and rotation matrix from the reference coordinates to the linked structure.

X=T+U×XrefE9

This equation also provides a way to update the structure coordinates according to the position and orientation of a rigid domain.

2.5 Map comparison

Map comparison provides a target function for fitting one map into another map. Four types of cross-correlation functions [10] are provided for comparison between map objects, which are listed below.

1. Density correlation (DC)

DCmn=ρmρn¯ρ¯mρ¯nδρmδρnE10

where

ρ¯=1nxnynzinxjnyknzρijkE11

and

δρ=ρ2¯ρ¯2E12

represent the average and fluctuation of the density distribution. DCmnis the density correlation of map m to map n. Figure 3 shows two comparison maps in two dimensions. DCmnis calculated according to map m’s dimension and grid properties. The calculation runs over all grid points of map m, which are transformed and interpolated into grid points of map n to get corresponding density properties.

Figure 3.

A cartoon to show the grid-threading Monte Carlo searching method.

2. Laplacian correlation (LC)

LCmn=2ρm2ρn¯2ρ¯m2ρ¯nδ2ρmδ2ρnE13

where 2ρis the Laplacian filtered density derived from density distribution by the following finite difference approximation:

2ρijk=ρi+1jk+ρi1jk+ρij+1k+ρij1k+ρijk+1+ρijk16ρijkE14

LCmnis the Laplacian correlation of map m to map n. Similar to DCmn, LCmnis calculated according to map m’s dimension and grid properties.

3. Core-weighted density correlation (CWDC)

CWDCmn=ρmρnw¯ρmw¯ρnw¯δwρmδwρnE15

where Xw¯represents a core-weighted average of distribution property X:

Xw¯=i,j,kwmnijkXijki,j,kwmnijkE16

and

δwX=X2w¯Xw¯2,E17
wmn=fmafma+kcfna+bE18

where wmm is core-weighting function of core m to core n. Three parameters, a, b, and kc, control the dependence of the function to the core indices. We chose a = 2 and kc = 1 in this work calculations, and b is set to a very small value, say 10–6, to ensure wmn=0when fm=0and fn=0. Therefore, only the core region of map m has contribution to the core-weighted density correlation, CWDCmn.

4. Core-weighted Laplacian correlation (CWLC)

CWLCmn=2ρm2ρnw¯2ρmw¯2ρnw¯δw2ρmδw2ρnE19

CWLCmnuses Laplacian filtered density, instead of the density in the calculation. Again, only the core region of map m has contribution to the core-weighted Laplacian correlation, CWLCmn.

2.6 Molecular interactions between map objects

Energetics of molecular systems is the basis of molecular modeling. Calculation of molecular interaction using map objects is the crucial step for a successful modeling or simulation study. For atomic objects interaction calculation is pairwise and is very time-consuming for large molecular assemblies. For map objects, we propose to use field interactions that can be calculated much more efficiently. We define four types of interactions to describe interaction between map objects: electric field interaction, surface charge-charge interaction, VDW interaction, and desolvation interaction as described below.

1. Electric field interaction

The electric field around a molecule is described by the field map with scaled coordinates. The interaction with the field is

E12ele=m1e1φ1E20

where e1is the charge at the charge map 1 and φ2is the electric field from object 2, which depends on the dielectric constant, ε, and distances from each grid points of object 2. The dielectric constant, ε = 80, is used for most cases.

2. VDW interaction

Surface interaction brings the surface together while avoiding core overlapping. The surface can be identified by low core index. We propose to use the following equation to make the surface contact favorable while overlapping unfavorable:

E12vdw=4υδ12δ22m1C1C232C1C23E21

where C1 and C2 are the core indices of molecular 1 and 2 at each grid point and δ1 and δ2 are the grid intervals of map. 1 and 2, respectively. υ is the VDW interaction parameter.

3. Surface charge: charge interaction

Upon binding, the surface charge groups will contact with each other. The surface charge–charge interaction is different from the charge-field interaction which is screened by the solvent environment:

E12binding=bδ12δ13δ23m1e1e2E22

where b is the surface interaction parameter.

4. Desolvation interaction

Before and after binding, the surface charge groups change from the solvation state to the buried state and will create an energy gain we termed as desolvation energy:

E12desolv=sδ12m1C1e12δ131+C1/261+C26C2e22δ231+C161+C2/26E23

where s is the desolvation parameter.

These interaction parameters used to define the interactions, Eqs. (20)(23), can be derived from atomic force field or from experimental data. By fitting into energies calculated with the CHARMM force field [11], we obtained the parameters υ = 0.14 kcalÅ, b = 330 kcal/(C2Å), and s = 70 kcal/(C2Å2).

2.7 Conformational search

We implemented the grid-threading Monte Carlo searching algorithm [10] for robustly fitting rigid domains to a target map. The grid-threading Monte Carlo (GTMC) search is a combination of the grid search and Monte Carlo sampling. As shown in Figure 3, the conformational space is split into grid points, and short Monte Carlo searches are performed to identify local maximums around the grid points. The global maximum is identified among the local minimums.

3. Results and discussions

3.1 Complex structures from EM maps

Deriving high-resolution molecular assembly structures from microscopy maps are a major application of the map approach. This method has been successfully applied into several experimental studies [12, 13]. Figure 4 illustrates the steps to perform a fitting of high-resolution molecular structure into electron microscopy maps. We chose a T-cell receptor (TCR) variable domain (PDB code: 1a7n) as an example complex to illustrate the modeling process with map objects. The TCR variable domain is a complex of two chains, α-chain and β-chain. The two chains are first blurred into maps of the same resolution (here 15 Å) as the EM map. Then each map is fitted into the EM map to get a complex map. The complex map is projected back to atomic structures, which is the complex structure we are looking for. The root mean square (rms) deviation of the fitting result from X-ray complex is 3 Å.

Figure 4.

Steps to derive molecular assembly structures by fitting molecular structures into electron microscopy maps.

The structure obtained from map fitting generally is not optimized in atomic details. There are often atom overlaps or improper spacing between components. This structural mismatch can be removed by many modeling methods available in CHARMM [14, 15], such as energy minimization and simulated annealing, if the fitting result is very close to the right structure. After the minimization, the rms deviation is 0.97 Å.

3.2 Complex structures from energy optimization

The energy function is designed to have the minimum at the binding conformation. Therefore, it is possible to determine complex structures through minimizing the map interaction energy in cases where the EM complex map is not available. It should be noted that the map object assumes certain rigidity of a molecular object. Certain flexibility of loop region can be accommodated by the low-resolution characters, while large flexibilities like domain movement should be dealt with multiple map objects. Recently, this method was successfully applied in modeling of the peroxiredoxin (Prx) complex [16].

Figure 5 shows the steps to perform an energy-based conformational search to determine complex structures. In this case, no EM map is used. The TCR chains are transferred into property maps that allow interaction between map objects to be calculated. By searching the minimum interaction energy conformation, the complex structure is determined. The final result is only 0.57 Å away from the X-ray structure. It should be noted that this is an ideal case that the structure of the two chains is taken from the complex and there is no conformational change in this fitting process.

Figure 5.

Derive complex structure base on map interactions.

It is interesting to see that energy-based approach to derive complex structure takes account of molecular structure and energetic information of molecules. Figure 6a–c shows the electric field, shape, and charge maps of the two chains and their complex. Obviously, the two chains are binding together to have the low potential region matching the high potential one, to have shape complementary to each other, and to have surface charge overlapped oppositely.

Figure 6.

(a) Electrostatic field maps of TCR two chains and complexes. (b) Core-index maps of TCR two chains and complex. (c) Partial charge maps of TCR two chains and complexes.

Figure 6a shows the electric field of TCR α-chain and β-chain and their complex. Please notice the reduced coordinates are used for the map. The range of (−1, 1) for the reduced coordinates covers the range of (−∞, ∞) for the regular coordinates. The α-chain has negative field near its top-left and bottom-right areas and positive field near its lower-right and upper-left areas. Correspondingly, the β-chain has positive field at its top-right area and negative field at its bottom area, which are complementary to the α-chain. As a result, the complex map has negative field at its top and bottom areas and positive field at its left and right areas. The symmetric distribution of the field of the complex indicates its stability.

Figure 6b shows the core indices of TCR α-chain and β-chain and their complex. The high values in the core indices indicate the region further away from surface and are difficult to access. The α-chain and β-chain that show complementary shape are the binding surface. Their complexes are the two maps matching together.

Figure 6c shows the electric charge distribution of TCR α-chain and β-chain and their complex. The α-chain map shows more negative charges at the right side, while the β-chain shows more positive charges at its left side. The complex map shows the two chains come together with negative patches contacting positive patches. Overall, these map interactions provide energetic basis for protein-protein docking as shown in Figure 5.

As a further example of protein-protein docking, we show the procedure to build the pentamer of acetylcholine binding protein (AChBP). The monomers are docked one by one to form dimer, trimer, tetramer, and pentamer (Figure 7). The resulting pentamer is only 0.73 Å away from the X-ray structure, 1I9B. The electric field maps during the building process are shown in Figure 8. From the field map of the monomer, we can see the most positive field is at the top-right area and the most negative field is at the bottom-left area. A dimer is formed by matching the positive area of the second monomer with the negative area of the first one. The third monomer’s positive area fits into the most negative area of the dimer to form the trimer. Similarly, the fourth and fifth monomers are docked to form the tetramer and pentamer. The map interaction limits the way of docking monomers and allows correct assemblies to be built.

Figure 7.

Protein-protein docking of acetylcholine binding protein (AChBP) to build its pentamer.

Figure 8.

Electric field map at each docking stage to build the pentamer of acetylcholine binding protein (AChBP).

Map objects cannot only be used to model rigid proteins [17], they can also be used for targeted conformational search such as flexible fitting and restrained molecular dynamics [18, 19]. Map objects provide an efficient bridge from molecular systems to large-scale bodies such as cells and organelles.

4. Conclusions

This work designed and developed a computational tool to manipulate map information for molecular modeling studies. Protein–protein docking can be efficiently performed with map objects. This tool is implemented into CHARMM, as a module, EMAP, and into AMBER in its SANDER program. Our design and implementation make it very flexible and efficient to perform various manipulations of map objects and to perform some routine task related to map data. This module enables user to construct macromolecular assemblies by docking high-resolution X-ray or NMR structures to low-resolution cryo-electron microscopy maps. And when there is no EM map available, this module allows user to search for low-energy complex structures, for example, in protein-protein docking. By replacing high-resolution atomic structure with low-resolution map objects, this work creates a convenient approach to extend the molecular modeling studies to large biomolecular machinery. This map-based approach can extend modeling and simulation objects from molecular systems to macroscopic systems like cells and bacteria.

Acknowledgments

This research was supported by the Intramural Research Programs of National Heart, Lung, and Blood Institute (Z01 HL001027-34).

Conflict of interest

The authors declare there is no conflict of interest in publishing this work.

Download

chapter PDF

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Xiongwu Wu and Bernard R. Brooks (January 23rd 2019). Protein-Protein Docking Using Map Objects [Online First], IntechOpen, DOI: 10.5772/intechopen.83543. Available from:

chapter statistics

95total chapter downloads

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More About Us