The design and assembly of soft supramolecular structures based on small building blocks are governed by non-covalent interactions, selective host-guest interactions, or a combination of different interaction types. There is a surprising number of studies supporting the use of computational models for mimicking supramolecular nanosystems and studying the underlying patterns of molecular recognition and binding, in multi-dimensional approaches. Based on physical properties and mathematical concepts, these models are able to provide rationales for the conformation, solvation and thermodynamic characterization of this type of systems. Molecular dynamics (MD), including free-energy calculations, yield a direct coupling between experimental and computational investigation. This chapter provides an overview of the available MD-based methods, including path-based and alchemical free-energy calculations. The theoretical background is briefly reviewed and practical instructions are introduced on the selection of methods and post-treatment procedures. Relevant examples in which non-covalent interactions dominate are presented.
- molecular dynamics
- non-covalent interactions
- host-guest systems
- supramolecular structures
In the past 3 decades, atomistic simulations and free-energy calculations have emerged as indispensable tools to tackle deep chemical and biological questions that experiment has left unresolved. To fully understand the vast majority of chemical and biological processes, it is often necessary to examine their underlying free-energy behavior . This is the case, for instance, of phase diagrams, protein-ligand and drug binding affinities, drug partitioning, reaction rates, equilibrium constants, acid-base equilibria, solvation contributions, or confinement energies. The prediction of these quantities, with relevance in the field of computer-aided drug design requires knowledge of the related free-energy changes . Most of the research topics involving molecular modeling and simulation, with free-energy calculations, stem from pharmaceutical applications [1, 3, 4]. Those methods have provided insight into the time evolution of host-guest systems and of protein-ligand interactions [2, 4] and have revealed the respective dynamic behavior in water, thus establishing relevant recognition and affinity patterns. However, the formation of inclusion complexes is frequently used as a drug solubilizing approach, aiming at better therapeutic outcomes. This makes results sparse and focused on individual systems, lacking a comprehensive characterization for assessing the factors that govern, for example, the inclusion process. Despite the continuous progress and development of improved algorithms, the estimation of binding free energies by molecular dynamics (MD) simulations still requires significant computational efforts due to the mathematical complexity imposed by the solvated systems, often composed by myriads of atoms . Recently, Pais and co-workers  have proposed an automated procedure based on umbrella sampling and the “flexible molecule” approximation for the calculation of binding constants in complexes formed between β-cyclodextrin and several naphthalene derivatives. In this type of complexes, the guest molecule may alter the structure of the host, leading to relevant cooperative effects, when compared to the free molecules. Inclusion of guests into host molecules is essentially governed by (i) relatively weak non-covalent interactions (NCI), including hydrophobic, van der Waals and electrostatic interactions, π-π stacking, hydrogen bonding, or a combination of these interactions, and also by (ii) stronger ionic and dipolar interactions. In addition to several reviews (see e.g. [6, 7]), including many dealing with specific host-guest systems [4, 7, 8, 9, 10, 11, 12], a large number of research papers are available on this subject, including some focused on mechanistic aspects of the inclusion phenomena (see e.g. Refs. [5, 13, 14, 15].)
In fact, both the efficiency and accuracy of free-energy calculations have been greatly improved by a wide variety of methods . However, it is increasingly difficult for researchers to find their way through the maze of available computational techniques. Why are there so many methods? Are they conceptually related? Do they differ in efficiency and accuracy? Why do methods that appear to be very similar carry different names? Which method is the best for a specific problem? How to choose the most relevant method to tackle the system at hand? These questions may leave researchers in the field confused and looking for clear guidelines. However, answers to these questions are not straightforward. A distinction has been made between two classes of free-energy transformations, namely those of alchemical and geometrical nature . While the former exploits the malleability of the potential energy function and the virtually infinite possibilities of computer simulations to transform between chemically distinct states, the latter includes positional, orientational, and conformational changes in macromolecules and complexes thereof. At the practical level, such transformations are achieved using a variety of approaches, which can be separated into four main groups, (i) probability distributions and histograms, (ii) perturbation theory, (iii) non-equilibrium work and (iv) gradient-based methods . One recurrent question concerns the selection of the most relevant approach to tackle the problem at hand, which can be reworded in terms of the best-suited, cost-effective method to obtain a reliable answer. To a large extent, this question can be addressed by considering the actual nature of the transformation, alchemical, or geometrical, and subsequently by deciding which one, among the aforementioned methods, is more likely to yield the desired free-energy change at a minimal computational effort.
A detailed overview of the concepts and fundamentals of the available methods for estimating free energies is given in Ref. . The guiding principle is that most of these are based on a few fundamental assumptions, devised by some pioneers in the field [19, 20, 21, 22, 23]. With some exceptions, more recent contributions are not based on innovative fundamental principles, but are “astute and ingenious ways of applying those already known” .
2. Context and relevant aspects
In the context of non-covalent binding, host-guest systems have emerged as useful models for assessing the accuracy of simulation methods. This can be explained by the fact that these systems display interesting features comparable to protein-ligand binding, including hydrogen bonds, conformational restriction and desolvation, and also possess an adequate size for conducting more precise thermodynamic calculations . This is the case of cyclodextrin host-guest systems, which mimic several characteristics of protein-ligand binding with the advantage of being much more accessible due to their small size. These characteristics also facilitate the selection of the force field (FF) based on the attribution of the level of accuracy.
Specifically, the combination of MD and free-energy calculations with experimental results have revealed the relevant interaction patterns in the formation of inclusion complexes between several host (e.g. cyclodextrins [4, 5], cucurbiturils  and bambusurils [8, 26]) and guest molecules (e.g. antitumor and antiviral drugs , steroids , flavonoids  and small ionic species ). These small aggregates, comprising typically two (in 1:1 host-guest complexes) or three molecules (in 1:2 complexes) have been the basis of fundamental developments for establishing what governs associations in soft-matter and stability in larger supramolecular nanostructures, such as nanogels and targeted nanoparticles.
Despite significant advances in molecular modeling techniques and the comparative simplicity of host-guest systems, there is still a need for a tractable and theoretically sound computational method to interpret experimental data and help with the design of new hosts for targeted molecular guests. Similarly to other macromolecular systems, host-guest systems exhibit marked entropy-enthalpy compensation. Such property has been observed for NCI in both water and organic solvents. Among the wide applications of alchemical transformations, host-guest chemistry occupies the most prominent position .
A fundamental aspect to be understood is the precise manner in which the guest molecule binds to the host. MD simulations have greatly contributed to understanding molecular binding phenomena as a fully dynamical process. However, these simulations are limited by the time scales that can be routinely sampled. The introduction of special-purpose machines and the evolution of parallel codes, has increased enormously the time scales accessible by fully atomistic MD. However, guest molecules (e.g. drugs) with long residence times are common, and the respective association/dissociation from a host or receptor cannot be observed by conventional MD calculations even when specialized hardware is employed. This well-recognized limitation of MD has led to the development of various algorithms to enhance the sampling of the high free-energy states and rare events, associated to high free-energy barriers.
Enhanced sampling methods can speed up conformational sampling by various means and represent an effective alternative to access with high accuracy the thermodynamics and possibly the kinetics that underlie these processes. Steered MD and umbrella sampling , with potential of mean force estimation (PMF)  allows inspecting the free-energy profile and the mechanistic aspects involved in formation of host-guest complexes .
In recent years, other strategies aimed at the same goal have been proposed. Replica exchange  metadynamics [34, 35], accelerated MD , milestoning , transition path sampling , and combinations of these are among the most widely used methods to enhance conformational sampling. For instance, the replica exchange methodology has been used in the atomistic simulations, as well as in a number of coarse-grained simulations (see e.g. ). This method has been applied to study free-energy landscape and folding mechanisms of several peptides and proteins , through several variants of the traditional temperature dependent scheme, available in some of the most popular MD packages, such as AMBER , GROMACS  and NAMD .
Among the enhanced sampling methods that fully explore the binding mechanism, metadynamics, especially in the well-tempered formulation, has emerged as a powerful approach for accelerating rare events . This has been applied in drug docking to protein and enzymes, systems involving big conformational changes and relevant solvation effects. Directed dynamics such as the adaptive biasing force (ABF) and hyperdynamics were also derived from the same principles as adopted by metadynamics . The implementation of metadynamics in MD codes, such as NAMD and GROMACS, have promoted a broad range of applications of the method, ranging from solid state physics to biological systems.
The adaptive biasing force algorithm (ABF) has also emerged as a promising strategy for mapping complex free-energy landscapes [42, 43], as it combines both constrained and unconstrained simulations into a highly efficient scheme, providing an uniform sampling of the order parameter. Briefly, as a simulation progresses, a continuously updated biasing force is added to the equations of motion, such that in the long-time limit it produces a Hamiltonian devoid of an average force acting along the transition coordinate of interest. In contrast to umbrella sampling schemes, based on probability distribution functions, ABF uses forces, which can be readily estimated without the need to sample broad ranges of the order parameter.
Another relevant problem in the early estimations of free-energy is related to the strong dependence on system size, in the presence of significant electrostatic interactions . Once long-range corrections using Ewald lattice summation or the reaction field are included in molecular simulations, size effects in neutral systems decrease markedly. The problem, however, persists in charged systems, for example in determining the free energy of charging a neutral species in solution. In this context, it has been demonstrated that system-size dependence can be largely eliminated in these cases by careful treatment of the self-interaction term, which is associated with interactions of charged particles with their periodic images and a uniform neutralizing charge background .
The ability to decouple relevant energy contributions from individual components is also far from a simple solution, being the basis of alchemical free-energy methods [19, 20], such as the thermodynamic integration (TI) [17, 44, 45], free-energy perturbation (FEP)  and molecular mechanics-Poisson-Boltzmann surface area (MM-PBSA) .
The feasibility of the estimated binding free-energy depends on the correct estimation of thermodynamic quantities and the concerted interaction components (enthalpy, entropy and solvent contributions) [15, 47]. For instance, the quantification of these interaction components have been directed at inclusion complexes between cyclodextrins and its derivatives and different model drugs (see Refs. [13, 15, 48].). However, only a few studies (e.g. [5, 13]) have introduced technical and practical directions to investigate the energy components underlying the stability of these complexes. Spoel and co-workers  have emphasized the role of solvent contribution resorting to steered molecular dynamics and PMF calculations. The latter techniques have also been used by Pais and co-workers  for assessing the relevance of non-included guest moieties in the binding constants of cyclodextrin inclusion complexes, estimating the weight of enthalpic and entropic contributions to the free-energy. In this context, MD and PMF calculations allow the modulation of the formation processes in inclusion complexes, along which the free-energy profiles are derived from an umbrella sampling procedure . This approach has been employed in the characterization of the inclusion of a wide range of drugs on cyclodextrins, and also in the aggregation of host units either in dimerization [15, 48] or in rotaxanes formation [49, 50]. Other hybrid approaches , including quantum mechanics/molecular mechanics (QM/MM)  methods have also provided insight on the formation of inclusion complexes at a molecular level.
3. Using potentials of mean force to address host-guest recognition and association
Several research groups have opened the way for future progress through innovative applications of free-energy methods to physical and organic chemistry, as well as structural biology. An exhaustive account of the wide range of works published in the early years of free-energy calculations falls beyond the scope of this section. The reader is referred to Refs. [16, 52] for a full description of these efforts.
A complete thermodynamic characterization of the binding process implies the knowledge of the enthalpy and entropy of association. This is one of the key elements in identifying the stabilizing factors and in understanding how is the guest and host assemble. Standard molecular simulation methods have reproduced the absolute thermodynamic properties of binding (standard free-energy, enthalpy and entropy) between host-guest systems . The absolute binding free-energy can be expressed as the sum of separate free-energy contributions corresponding to a step-by-step process describing the association process between the host and guest . This can be accomplished by calculating the PMF profile along a specific reaction coordinate characterizing the association process (this reaction coordinate can be the distance between the center of mass of the host and that of the guest).
Different methodologies have been successfully applied for the calculation of the PMF profile. Among them, free-energy perturbation (FEP) [47, 50, 54, 55], thermodynamic integration (TI) [45, 47], umbrella sampling, with the weighted histogram analysis method (WHAM) [5, 56] and force constraint methods  are the most commonly used in the literature for chemically and biologically relevant systems. PMF calculations play a considerable role in understanding how chemical species recognize and associate. One might argue that they actually provide a more satisfactory picture of these phenomena than alchemical free-energy calculations, because they follow the molecules of interest from their free, unbound state, to the bound state of the complex. This point of view is obviously misleading, as the model reaction coordinate followed to describe association/dissociation bears certain arbitrariness. Furthermore, PMF have helped to decode complex recognition and association phenomena in thermodynamic signatures with a detailed atomic description of the underlying processes. A detailed description of the theoretical basis of PMF calculation is given in ref. 
The PMF, W(ξ) may be interpreted as the potential that is produced by the average forces over all the configurations of a given system, in which a set of molecules is kept fixed (at a certain value of a reaction coordinate, ξ). This quantity was devised by Kirkwood [19, 32] and can be defined based on the average distribution function, ρ(ξ), along the coordinate ξ, obtained from a Boltzmann weighted average,
where and are arbitrary reference values, represents the Dirac delta function for the coordinate ξ, U(r) corresponds the total energy of the system as a function of the coordinates r, and ξ(r) is a function depending on the number of degrees of freedom, such as an angle, a distance or a more complex function of the cartesian coordinates of the system.
The distribution function ρ(ξ) cannot be computed by standard MC or MD simulations, due to low sampling frequency in higher-energy configurations. Specific sampling techniques (non-Boltzmann sampling) have been designed for calculating the PMF along a coordinate ξ, from a MD trajectory. The umbrella sampling technique, proposed in the 1970s by Torrie and Valleau , is one of these approaches, and is commonly used to overcome the difficulty of sampling rare events, by modifying the potential function so that the unfavorable states are sampled appropriately.
Umbrella sampling is used to describe simulations in which an order parameter connecting the initial and final ensembles is divided into mutually overlapping regions, or windows, which are sampled using non-Boltzmann weights. To explain briefly, several independent simulations are performed in each of the imposed regions of the reaction coordinate (Figure 1, bottom), using a bias potential to constrain the simulations to adjacent windows. Such initial configurations are generated, each corresponding to a location wherein the guest molecule is restrained and centered at subsequent values of ξi, corresponding to decreasing or increasing center of mass (COM) distances from the center of the host cavity, using an umbrella biasing potential, w(ξ), often of quadratic form, wi(ξ) = 0.5 K(ξ − ξi)2. This restraint allows the guest molecule to sample the configurational space in a defined region along the association/dissociation coordinate. Then the biased system (Figure 1, middle) will sample configurations close to a defined position, z0, even when these would not be sampled in the unbiased system. In general, the potential energy (U(r) + w(ξ)) is used to generate the biased simulations. The mean force, as a function of position, is calculated in each window, and the respective potentials, W(ξ), are derived using an unbiasing procedure.
Statistical techniques, such as WHAM  are used to remove the umbrella bias and combine the local distributions, allowing free-energy to be computed (Figure 1, top). WHAM is the basic tool for constructing free-energy profiles from distributions derived through stratification, for which the path connecting the reference and the target states is separated into intermediate states. The respective equations have also been used as the core of adaptive umbrella sampling approaches [42, 43], in which the efficiency of free-energy calculations are improved through refinement of the biasing potentials as the simulation progressed.
3.1. Thermodynamics of host-guest binding
The determination of binding affinities for host-guest complexes arising from the non-covalent association of two molecules is of paramount importance in different fields, including drug discovery. The host-guest binding affinity is related to the difference in Gibbs free energies of binding (∆Gbind) corresponding to the associated and dissociated states of the system under study (see Figure 2).
∆Gbind is composed of enthalpic and entropic terms. While the enthalpic contribution reflects changes in the inner energy of the systems, i.e. energies related to atomic motions and interactions, entropic contributions are also related with conformational changes upon binding .
The common methods for binding affinity estimation include very fast but less accurate similarity-based regression models and scoring functions and accurate but computationally demanding physical methods, which require a large number of MD (or Monte Carlo) simulations . This suggests that the most convenient methods and algorithms are those that offer a good balance between computational costs and accuracy. Most accurate strategies provide estimates for thermodynamic quantities, with intrinsic difficulties associated to fundamental aspects. The quantification of atomic or ionic interactions, associated to repulsive and attractive forces, requires physical models assigned to different force fields (FF). The latter can be detailed using quantum mechanical (QM) ab-initio methods. However, a quantum mechanical representation of solvated biological systems, such as membranes and proteins consisting of large amounts of atoms is useless, even with approximations, such as the density functional theory (DFT) [58, 59]. Relevant considerations have been made  on the accuracy of FF for the calculation of binding thermodynamics. Some FF combinations may provide the most accurate binding enthalpies but the least accurate binding free energies, with implications in the development of new FF. These have been evaluated for a wide range of host-guest complexes and water models, using different partial charge assignment methods and host FF parameters, and resorting to the attach-pull-release (APR) method , which allows computing the absolute binding free energies, from a series of umbrella sampling simulations. In the latter, restraints controlling the host-guest complex are activated cumulatively for separating host and guest molecules, and then released to leave the guest at standard concentration.
In fact, the selection of FF and setup parameters (e.g. protonation states and slow motions) can be a difficult task, as new variants primarily concerned to the treatment of proteins, are constantly being produced. There are standard options such as OPLS , AMBER , CHARMM , and GROMOS  and other more specific FF, such as the Kirkwood-Buff  FF, directed at aromatic amino acids, the CHARMM polarizable FF, based on the classical Drude oscillator [65, 66], the induced dipole , for modeling polarization in proteins and protein-ligand complexes, the Jorgensen’s approach  for parameterization by atoms-in-molecule electron density partitioning, AMOEBA  and GEM* , among many others [71, 72, 73]. Additionally, several water models such as TIP3P , TIP4Pew , and SPC/E , and the recently introduced OPC , TIP3P-FB and TIP4P-FB , TIP4P-D , and iAMOEBA  are also available. It is worth mentioning that the assessement of the ability of these FF to reproduce experimental data are very sparse, and based on the free-energy of hydration of small molecules , and on the structure of nucleic acids  and proteins . With regard to explicit solvent methods, only a few studies have focused on thermodynamic data from non-covalent binding for validating the FF. Recently [84, 85], the binding free energies of cyclodextrin host-guest complexes were also used, but in the context of implicit solvent models.
It is worth to note that there are other issues in this type of systems that transcend the accuracy of the selected FF, and affect the simulation results. These are related to restrictions for the sampling and estimation of reliable thermodynamic quantities, including the respective binding constants. The latter can be determined from experimental techniques such as 1H NMR, isothermal titration calorimetry and solubility experiments. The calculations of binding constants that provide a convenient counterpart to the experimental observations in inclusion complexes is sometimes addressed using a “flexible molecule” approximation , instead of the conventional rigid rotor harmonic oscillator [86, 87, 88]. The available volume for a guest molecule is described in terms of a cylindrical, PMF weighted region, in which the respective motion is measured by the positioning of the center of mass. The “flexible molecule” approximation is not as widespread as might be expected, and relatively recent publications [86, 89] both divulge the approximation and highlight controversies on the topic. In simple terms, the binding constant, Kbind can be estimated by integrating the PMF values (∆GPMF(ξ) along the association/dissociation coordinate, ξ, from
where NA and R corresponds to the Avogadro and the ideal gas constants, respectively and r is the average radius of the cross section of the cylinder in each sampling window, which is also a function of ξ.
The key factors affecting the thermodynamic signatures in host-guest systems can also be assessed along with an accurate description of the NCI including their spatial features. This can be carried out resorting to a recently developed Independent Gradient Method (IGM) of Lefebvre and co-workers  (see Section 4 for details), which allows the visualization of regions of low charge density corresponding to stabilizing/destabilizing NCI, based on the analysis of the electronic charge density of the interacting molecules and the respective gradients. Although the original NCI method of Johnson and co-workers  provide a very similar qualitative analysis of NCI, the reduced density gradient, s, used to identify the interaction types is a dimensionless quantity and therefore difficults the evaluation of the respective strengths. The new IGM allows for a quantitative comparison of the strength of NCI through the calculation of the IGM descriptor, δg, which corresponds directly to the charge density gradient(s) in real-space.
4. Visualizing NCI using a charge density topology
The topological features of the electronic charge density have played an important part in several schemes aiming to provide a route to understanding molecular structure from first principles, that is, without resorting to any form of empirical or approximate models (other than the physically justified approximations underlying the electronic structure methods employed). Methods such as the theory of Atoms in Molecules (AIM)  and the Electron Localisation Function (ELF) [93, 94, 95] can provide great insight into features of the density. Both AIM and the ELF are useful for studying strong interactions such as covalent or ionic bonding but are not as useful in the analysis of very weak interactions of the type that are important for subjects as diverse as protein structure, drug design, catalysis, materials self-assembly, and many more. To address this, Johnson et al. [91, 96] introduced the NCI analysis method. This approach is based on the electronic charge density, ρ, and its derivatives. The first derivative 𝛻ρ enters into the expression for the reduced density gradient, s,
In regions of low ρ such as the density tails far from a molecule the gradient remains large and so s displays high values. However, in regions of low ρ between atoms where weak NCI occur the gradient (and therefore s) drop to zero.
In order to differentiate between stabilizing attractive interactions and those that are unfavorable and destabilize the system it is necessary to analyze the second derivative (Laplacian) of the charge density 𝛻 2ρ. Decomposition of 𝛻 2ρ into the three eigenvalues representing the axes of maximal variation
provides information on the nature of the interactions at a given point in space. λ2 displays negative values in stabilizing/bonding regions (charge is flowing into this region indicating a local build-up of ρ) while in destabilizing/repulsive regions it is positive (charge flowing out, indicating a local depletion of ρ). Plotting s against sign(λ2)ρ will therefore permit the identification of NCI in regions where s and sign(λ2)ρ → 0. Although these quantities can be readily obtained with first principles electronic structure methods, such calculations remain too computationally expensive for the large systems of interest in biological or materials science applications. In such cases it is possible to approximate the charge density with a sum of atomic densities providing a pro-molecular density, ρpro. This approximate density does not include relaxation of the atomic densities as would happen in self-consistent calculations for bonded systems. Fortunately, the largest deviation of ρpro from the fully relaxed density occurs in covalent bonding regions and has a minimal effect on the (low-density) NCI regions.
Substituting ρpro into the equations above has been found to have have very little impact on the resulting NCI analysis. For purposes of interfacing with the results of molecular dynamics simulations, pro-molecular densities have the additional advantages that they are readily computed for both finite and extended/periodic systems and the fact that this approach requires orders of magnitude less computational time than the method based on first principles electronic structure calculations.
A recent development of the NCI method that makes use of the pro-molecular route to building the charge density of the system under study is the Independent Gradient Method (IGM) of Lefebvre et al.  As with the NCI method, IGM employs in addition to ρ quantities related to the first and second derivatives of the density. However, in IGM the reduced density gradient, s, is replaced with the descriptor δginter, defined as the difference between the first derivatives of the charge densities for the total system and the fragments
δginter > 0 indicates the presence of NCI and the magnitude of the descriptor at a point in space gives an indication of the strength of the interaction. 𝛻ρIGM, inter is obtained from sums over all the atomic densities in the different fragments (here for two fragments A and B in the x-direction)
The IGM approach has the attractive feature that δginter allows for a more facile comparison of the strength of the weak interactions than the quantity s in the original NCI . This is due to the fact that the presence of significant non-covalent interactions is indicated when s → 0 making interpretation difficult whereas the magnitude of the IGM quantity δginter increases in proportion to the strength of the interaction.
An illustrative example of the use of the NCI and IGM methods is given in Figures 3 and 4. The host-guest complex shown is between curcurbituril and the dodecyl serine-based monomeric cationic surfactant 12SerTFAc  (to make the example more accessible to the reader only a single configuration is shown). Calculation of the NCI and IGM descriptors was performed using IGMPlot version 1.0 (
As can be seen from the molecular structure images, both the s and δginter surfaces provide qualitatively similar information and display the general distribution of the most significant host-guest interactions. Similarly, the color-coding provided by the value of sign(λ2)ρ can be seen to predict that the interactions are largely of the weak van der Waals type with the exception of a single blue region denoting the hydrogen bond between the serine hydroxyl group and the host. A significant difference between the s and δginter surfaces is that the latter also contains information on the strength of the interactions through the volumes of the isosurface regions at a given point. The IGM approach therefore provides an important extra source of information on the nature of the interactions. The intrinsic difference between the two methods is even more clearly seen in the adjacent scatter plots. In the IGM plot (Figure 4, right) peaks of different magnitude corresponding to different strengths of interaction can be seen which give rise to the differing isosurface volumes. The NCI plot (Figure 3, right), however, displays all spikes in the non-covalent interaction region −0.025 ≤ sign(λ2)ρ ≤ 0.025 in a very similar manner and makes interpretation much less easy. For this reason, despite very similar computational requirements for both methods making them equally suitable for analysis of large (ensembles of) systems such as those arising from MD simulations it is likely that the future will see the newer IGM method being used much more extensively than the original NCI one.
This section presents some examples in which MD or MC simulations are used for establishing the contributions of NCI (e.g. electrostatic, hydrophobic and hydrogen-bond interactions) to the free energy, in systems in which these are the main source of recognition, assembly, and stability. These include, for instance, host-guest complexes, polyelectrolytes and proteins.
MD and free-energy calculations have provided insight on relevant aspects including the effect of substituents, the role of solvation, and rationales for the conformation and thermodynamic characterization of the systems under investigation. Among several studies on the topic, available in the literature, only a few are briefly reviewed. For instance, MD simulations in water, with PMF estimation based on both TI and constraint force methods, have been used for inspecting the thermodynamic properties of binding of some inorganic ions, such as ClO4− and SO42− (Figure 5), and ferrocenyl alkanethiols with both free and gold-surface confined β-cyclodextrins [14, 98]. In general, the association process between the host and guest molecules is analyzed along the reaction coordinate defined by the distance between the centers of mass of both host and guest, along a reference coordinate (e.g. z-axis). In these particular systems, electrostatic interactions and hydrogen bonding have played a major role on the binding process. For instance, simulations have elucidated the association mode of sulfate anion with free and grafted β-cyclodextrin (see Figure 5). For the latter system, a small minimum in the PMF profile (Figure 5, top), positioned at a larger distance relative to the cavity of surface-grafted cyclodextrin, suggested the formation of a noninclusion complex. The ion binds to the host by forming hydrogen bonds with the free-cyclodextrin portal. Also relevant is the individual energy contributions to the cyclodextrin-ion interaction, which is governed by electrostatic interactions. However, this favorable electrostaction contribution is not sufficient to compensate desolvation of the anion, considering the respective hydration energy .
The release and transport of drugs mediated by cyclodextrin-based carriers, have also been investigated  systematically using MD and free-energy calculations, showing the preferred inclusion modes of such drugs for cyclodextrins. One example (see Figure 6), refers to the binding of amphotericin B (AmB), which possesses two sites, within the respective prolonged macrolide, with higher binding affinity to γ-cyclodextrin. The decomposition of the PMFs into free-energy contributions have suggested that van der Waals and electrostatic interactions are the main driving forces responsible for the formation of this type of complexes .
Recently, some of the authors  have demonstrated the relevance of non-included moieties in the stability constants of several cyclodextrin-based complexes. The binding constants for naphtalene derivatives forming complexes with β-cyclodextrin were calculated (ranging from 128 to 2.1×104 kJ.mol−1), pointing out the important effects of the substituents. Substitution of naphthalene promoted an increase in the binding constant (up to 100-fold), irrespective of the nature of the substituent, the latter comprising small hydrophobic and hydrophilic, including charged, groups. Enthalpic and entropic contributions were separated in order to estimate their weight in the free-energy. Also, the preferred orientation of the guest molecules within the cyclodextrin was established.
In a completely different system , the same strategy was used for exploring the ion caging ability of bambusurils in aqueous media (Figure 7). Specifically, new insights on the conformation, hydration and energy changes involved in the complexation process between the water-soluble benzoyl-substituted bambusuril and chloride ion were provided. The structural features of three single bambusuril derivatives, and the relative energy contributions to the formation of the benzoyl-substituted bambusuril-chloride complex were computed. The estimated standard free-energy of binding and the respective binding constant were found to be −11.7 kJ.mol− 1 and 112, respectively. Binding occurred with complete desolvation of both guest and host cavity. One of the most interesting observation was that chloride was hermetically sealed inside the host cavity, as a result of a concerted action of both conformation change and desolvation (see Figure 7) .
Solvent contributions to the thermodynamics of binding have been scrutinized , emphasizing the importance of using explicit models for the accurate calculation of binding thermodynamics. A detailed analysis on the energetics of the host-guest complexation between β-cyclodextrin and several model drugs (e.g. puerarin, daidzin, daidzein, and nabumetone) have demonstrated that the flexibility of the binding partners and solvation-related enthalpy and entropy changes must be included explicitly for the precise estimation of thermodynamic parameters involved in molecular association. In this study, it was also demonstrated that implicit models are not suitable to provide detailed information on how free-energy is decomposed into enthalpy and entropy .
The dimerization of cyclodextrins has also been recognized as a relevant step in the construction of nanostructured materials. Only a few studies (see e.g. refs [15, 48].) involving umbrella sampling simulations and PMF calculations have been carried out for investigating the binding affinity of dimers in the presence/absence of a guest molecule, and the preferred orientation of interglucopyranose hydrogen bonds, at the cyclodextrin portals. These include β-cyclodextrin monomers and dimers in aqueous and nonaqueous media . Polar solvents with stronger hydroden bond accepting abilities can easily disrupt intermolecular hydrogen bonds, decreasing the stability of the dimers. In the same environment, higher binding affinities are achieved if guest molecules are included in the channel-type cavity of such dimers. These results allowed concluding that the formation of dimers is solvent-dependent and guest-modulated . In another related study , it was shown that the cooperative binding of cyclodextrin cavities to guest molecules can facilitates the dimerization process, favoring the overall stability and assembly of nanostructures. Different dimerization modes yielded different binding strengths. This has been demonstrated in systems consisting of isoflavone drug analogs used as drug templates, and cyclodextrins (Figure 8). A detailed quantification of host, guest and solvent contributions to the thermodynamics of binding has revealed that head-to-head dimerization promotes the most stable complexes, which can be used as building blocks for template-stabilized nanostructures. Desolvation of cyclodextrin dimers and entropy changes upon complexation also affect significantly the cooperative binding .
Other strategies have been adopted resorting to both MD and MC simulations. As an example of application , free-energy calculations and lattice chain MC simulations have been used for understanding the formation of polyrotaxanes (Figure 9), including the identification of the most favorable conformations of cyclodextrin molecules in a polyrotaxane and the quantification of dimerization free energies, related to different spatial arrangements of two consecutive cyclodextrin molecules. It has been suggested, that the binary association is controlled by the formation of hydrogen bonds between two adjacent molecules, promoting an overall structural stabilization .
The thermodynamic cost of confining polyelectrolytes spherical cells of different radii can also be quantified by free-energy calculations, resorting to TI . Simulation studies of polyelectrolytes, based on MC and MD, have been precluded by the respective chain lengths and the need of large concentrations of compacting agents. The conformational and energetic changes in DNA delivery systems have been studied computationally by some of the authors [101, 102]. For instance, a coarse-grained model was proposed  for exploring structural and thermodynamics aspects of confining polyions in spherical cavities, with different linear charge densities and with counterions of different valences. The free-energy of the confined polyion and counterions were estimated as a function of the sphere radius, from a dilute solution corresponding to the largest sphere. A positive free-energy difference was found for all systems and was associated to the compression. This means that the system containing the polyion with the largest linear charge density and monovalent counterions displays the largest resistance to being compressed. The penalty is thus largest for the polyion with the largest linear charge density and monovalent counterions. The replacement of the monovalent counterions with trivalent ones induced a compactation of the polyion, as a consequence of the stronger electrostatic polyion-counterion attraction. This leads to a nearly full compensation of the ideal and electrostatic contributions to the free-energy of their confinement.
Other interesting example of application relates to protein-ligand interactions [103, 104, 105, 106, 107], which are typically pursued by significant conformational and energetic changes. These changes often occur on time scales that make direct atomic-level simulations useless. One possible alternative relies on the assumption that the change in the binding free-energy resulting, for e.g. from a mutation of a ligand bounded to a protein, complies a linear response scheme , with parameters estimated from training sets of protein-ligand complexes, and used for predicting binding affinities of new ligands. Another strategy is the computational design of the ligand in free and bound states. In this type of systems, the emergence of some reliable implicit solvation models and classical statistical-mechanical approaches have been part of the solution. These include, for instance, the use of the molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) model . A theoretical study, recently published , encompasses MD and free-energy calculations for evaluating the structural and thermodynamic signatures involved in the interaction of siRNA molecules, bearing chemical modifications at 3’-overhang, with the PAZ domain of human Argonaute (Figure 10). In these systems, a reduction of the complex binding affinity has been observed upon siRNA modification, being explained by the hampering of hydrogen bond formation in the active site of PAZ. Analyses of free-energy, achieved from simulations based on the Generalized Born and surface area continuum solvation (MM-GBSA) method, and of hydrogen bonds, have provided a complete identification of the most relevant residues for PAZ-siRNA interaction (see Figure 10). This data will contribute to the improved design of synthetic nucleotide analogues, circumventing some of the intrinsic siRNA drawbacks.
6. Concluding remarks
Molecular simulations, including free-energy calculations is still a fertile ground for research, from both theoretical and applied perspectives. Among the various challenges in the field of free-energy calculations, the integration of thermodynamics and kinetics through the concomitant determination of potentials of mean force and diffusivities in biased simulations, will benefit from substantial efforts. Although there is a general consensus that free-energy calculations are an important tool of computational chemistry and structural biology, thriving beyond academic walls, namely in industrial environments, some concerns still remain, dealing with the question to whether or not those methods provide convincing and reliable answers.
The authors acknowledge the Fundação para a Ciência a Tecnologia (FCT), Portuguese Agency for Scientific Research, through the Projects n. 016648 POCI-01-0145-FEDER-016648 and COMPETE POCI-01-0145-FEDER-007440. The Coimbra Chemistry Centre is also supported by FCT through the Projects PEst-OE/QUI/UI0313/2014 and POCI-01-0145-FEDER-007630. Tânia F.G.G. Cova, Sandra C. C. Nunes and Andreia. F. Jorge acknowledge, respectively, the PhD and post-doctoral research Grants SFRH/BD/95459/2013, SFRH/BPD/71683/2010 and SFRH/BPD/104544/2014, assigned by FCT.