Open access peer-reviewed chapter

Fundamentals of Molecular Docking and Comparative Analysis of Protein–Small-Molecule Docking Approaches

Written By

Sefika Feyza Maden, Selin Sezer and Saliha Ece Acuner

Submitted: 31 May 2022 Reviewed: 13 June 2022 Published: 08 July 2022

DOI: 10.5772/intechopen.105815

From the Edited Volume

Molecular Docking - Recent Advances

Edited by Erman Salih Istifli

Abstract

Proteins (e.g., enzymes, receptors, hormones, antibodies, transporter proteins, etc.) seldom act alone in the cell, and their functions rely on their interactions with various partners such as small molecules, other proteins, and/or nucleic acids. Molecular docking is a computational method developed to model these interactions at the molecular level by predicting the 3D structures of complexes. Predicting the binding site and pose of a protein with its partner through docking can help us to unveil protein structure-function relationship and aid drug design in numerous ways. In this chapter, we focus on the fundamentals of protein docking by describing docking methods including search algorithm, scoring, and assessment steps as well as illustrating recent successful applications in drug discovery. We especially address protein–small-molecule (drug) docking by comparatively analyzing available tools implementing different approaches such as ab initio, structure-based, ligand-based (pharmacophore-/shape-based), information-driven, and machine learning approaches.

Keywords

  • molecular docking
  • drug design
  • drug discovery
  • protein interactions
  • machine learning

1. Introduction

The molecular machines of the cell, i.e., proteins, are essential to many cellular processes such as signal transduction and cell regulation. Proteins seldom act alone in the cell; they function by interacting with small molecules or other macromolecules. Therefore, understanding protein interactions at the atomic level is critical to understanding biological processes [1]. The primary structure, i.e., the amino acid sequence, of the interacting proteins is a necessary but insufficient source of information at the atomic level. After being synthesized, proteins fold and acquire a stable native structure, i.e., a tertiary structure that can be defined in three-dimensional (3D) space, in order to be functional. It is known that proteins with different sequences can have similar functional structures, that is, different amino acid sequences can show similar folding trends in 3D space, and structure is more conserved than sequence [2]. Therefore, it is crucial to understand the interaction details at the structural level. Proteins physically interact with their partners via non-covalent associations, namely hydrogen bonds and hydrophobic and electrostatic interactions, with the exception of covalent disulfide bridges. These intermolecular physical forces also dominate the protein folding process.

The 3D structure of the macromolecules can be determined using the experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR), and cryo-EM and then deposited in the Protein Data Bank (PDB) (https://www.rcsb.org). However, there is a huge gap between the number of known protein sequences and structures [3, 4]. Computational modeling approaches that can predict 3D structures of macromolecules can help to bridge this gap. A recent machine learning algorithm developed by DeepMind, called AlphaFold [5], can predict 3D structures of proteins using the sequence information with high accuracy and has been accepted as a breakthrough in the structural biology field. In 1 year, approximately 1 million new structures have been predicted and deposited at AlphaFold Protein Structure Database (https://alphafold.ebi.ac.uk/). In order to have a complete understanding of the proteome, computational techniques are not only needed for modeling single protein structures, but also the interactions between them.

Molecular docking is a method used to predict the structures of proteins in complex with other proteins, nucleic acids, or small molecules. It can be defined as predicting the appropriate low-energy binding pose of the ligand in complex with the target structure, by randomly colliding proteins and their potential partners in space, first creating a rigid complex structure model, and then focusing on the binding sites of that model with flexible interface refinement [6]. Energy minimization of randomly docked conformations in space requires a multidimensional calculation. The earliest molecular docking methods treated ligands and receptors as rigid bodies without considering any conformational changes [7]. However, interactions between proteins can become quite complex even with small changes in the conformation of the structures [7], and docking algorithms may not solve this complex physical problem correctly [8]. The main source of computational difficulty in docking algorithms is a significant change in the conformation of the protein backbone upon binding [9, 10]. To address this problem, different techniques that consider backbone flexibility have been successfully implemented in docking algorithms [10].

Many diseases, such as cancer, are likely to be linked to problems in protein-protein interactions, and targeting these interactions can therefore enable the development of next-generation therapeutic methods [11]. Modeling the complex structures formed by proteins with other proteins or small molecules holds the key to understanding many biological processes; for example, modeling enzyme-substrate or protein-drug interactions can reveal insights into binding sites/interface regions, function, and mechanism of action. The main protein–small-molecule docking applications in drug discovery include drug repositioning and structure- and ligand-based (pharmacophore-/shape-based) drug design approaches using virtual and reverse screening [11, 12, 13, 14]. Today, with continuously developing technology, targeted drug design, drug target search, evaluation of the side effects of existing drugs, and finding new targets for these drugs can be achieved with the help of molecular modeling and machine learning methods [12]. Deep learning neural network models have strong computational ability on big data and attract attention in the structural biology field [15]. There are antibiotic discovery studies using deep neural networks [16] and deep learning studies adapted to drug design [17].

In this chapter, we focus on the protein–small-molecule docking fundamentals and the steps of the docking algorithm and procedure in detail. We then give recent successful applications in drug design and discovery that use different docking approaches, namely virtual screening, reverse screening, and machine learning. Lastly, we comparatively analyze some of the available protein–small-molecule docking tools using the structure of SARS-CoV-2 main protease in complex with a non-covalent inhibitor Jun8-76-3A as a case study.

2. Fundamentals of protein–small-molecule docking

Protein–small-molecule interactions are essential for the sustainability of biological processes such as enzymatic catalysis and overall homeostasis in the body [18]. The engineering of protein–small-molecule interactions is one of the computational approaches used to solve critical problems in biology [18]. Protein–small-molecule docking, i.e., modeling the interaction between chemical compounds and their target protein receptors at the atomic level, is an effective tool in drug design. In the structure-based design of small-molecule drugs, a good estimation of the binding pose is required to clearly demonstrate important interactions and design drugs with increased selectivity and efficacy [19]. The procedures that can be followed and the tools that can be used before, during, and after molecular docking are explained in the following subsections and summarized in Figure 1.

Figure 1.

The procedures that can be followed and the tools that can be used before, during, and after protein-ligand molecular docking in drug design.

2.1 Before docking: molecule preparation

Before starting the docking studies, first of all, the most suitable protein and ligand structures should be selected [20]. There are databases to access the experimentally determined structures of target proteins such as PDB, Uniprot, and Therapeutic Target Database (TTD). If the experimental structure is not available, modeled structures can be obtained from AlphaFold Database or can be modeled using relevant structure modeling software. The most frequently used databases for getting the small-molecule ligand/chemical structures are: DrugBank [21], PubChem [22], ZINC [23], ChEMBL [24], and Chemspider [25] (Figure 1). DrugBank, Chemspider, and ZINC databases include more than 500,000, 100 million, and 230 million compounds/drug molecules, respectively.

The molecular docking algorithms may require preliminary preparation of the structures that are obtained in PDB format (lacking H atoms). There are tools available for such preliminary preparations such as Open Babel [26] and AutoDockTools (Figure 1) [27].

It is also crucial to guide docking with preliminary information on the binding site. Without binding site constraints, blind docking takes place, and it is more difficult to detect the correct binding poses because the ligand search space is large. There are various guiding algorithms for active site prediction that can be used when binding sites are not known. Some of them are GRID [28], SurfNet [29], COACH [30], SCFbio [31], CASTp [32], DeepSite [33], and PUResNet (Figure 1) [34].

The capabilities of docking algorithms can differ from each other, and in this respect, it is important to carefully choose the algorithm to use in accordance with the purpose of the study before starting the docking.

2.2 Docking algorithm steps

There are many approaches and algorithms for molecular docking, based on different parameters, and they aim to perform the protein-ligand docking with the best performance [12]. The steps of molecular docking algorithms can be summarized as follows: molecule flexibility, conformational search algorithms (ligand sampling), and scoring functions (Figure 2) [12, 35].

Figure 2.

Methods for protein-ligand molecular docking.

2.2.1 Molecule flexibility

During molecular docking, structures can be considered rigid or flexible. Rigid docking takes into account only the translation and rotation degrees of freedom. Providing flexibility means also considering the rotation about single bonds so that they have the same bond lengths and angles but different torsion angles. Although flexible docking approach is more realistic than rigid docking, when there are many rotatable bonds, the ligand conformational search space becomes so large that it is difficult to find the correct binding pose with the lowest binding free energy (global minimum solution). Some algorithms, such as HADDOCK [36], first treat the structures as rigid to increase time efficiency and then perform flexibility improvements on the poses of molecules with the best energy scores. Molecular docking software can be grouped according to the flexibility treatments of molecules such as Rigid Docking, Semi-Flexible Docking, and Soft Docking [35, 37].

In rigid docking, protein and ligand molecules are treated as rigid entities [37, 38]. During docking, the positions of the molecules change without losing their shape [37], i.e., only translation and rotation but no conformational degrees of freedom are considered.
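The rigid-body idea can be illustrated with a minimal sketch: a pose is generated by one rotation and one translation applied to every atom, so all interatomic distances stay the same. The three-atom ligand and the function names below are hypothetical and chosen only for illustration; rotation is restricted to the z axis to keep the example short.

```python
import math

def rotate_z(point, angle_rad):
    """Rotate a 3D point about the z axis by angle_rad."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x - s * y, s * x + c * y, z)

def rigid_transform(coords, angle_rad, shift):
    """Apply the same rotation (about z) and translation to every atom."""
    moved = []
    for p in coords:
        x, y, z = rotate_z(p, angle_rad)
        moved.append((x + shift[0], y + shift[1], z + shift[2]))
    return moved

def pair_distances(coords):
    """All interatomic distances, used to check that the shape is unchanged."""
    return [math.dist(a, b) for i, a in enumerate(coords) for b in coords[i + 1:]]

# Hypothetical three-atom ligand (coordinates in angstroms, made up for illustration).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.5)]
posed = rigid_transform(ligand, math.radians(40.0), (3.0, -2.0, 1.0))

# The pose moved, but every interatomic distance is preserved.
shape_preserved = all(
    math.isclose(d0, d1, abs_tol=1e-9)
    for d0, d1 in zip(pair_distances(ligand), pair_distances(posed))
)
print(shape_preserved)
```

A full rigid-body search would sample three rotational and three translational degrees of freedom in the same way.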

Semi-flexible docking is based on the principle of keeping the protein structure rigid and letting the ligand structure be flexible by allowing rotatable bonds. Thus, various conformational poses of the ligand on the protein are sampled [35, 37, 38]. It gives more accurate results than rigid docking [37].
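Ligand flexibility through rotatable bonds can be sketched as a rotation of the atoms on one side of a bond about the bond axis. The example below is a sketch under stated assumptions: the four-atom chain a-b-c-d and its coordinates are hypothetical, and Rodrigues' rotation formula is used as one convenient way to rotate about an arbitrary axis. Bond lengths are kept while the torsion angle, and therefore the overall conformation, changes.

```python
import math

def rotate_about_axis(point, origin, axis_end, angle_rad):
    """Rotate point about the axis origin -> axis_end (Rodrigues' formula)."""
    axis = [e - o for e, o in zip(axis_end, origin)]
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                      # unit vector along the bond
    v = [p - o for p, o in zip(point, origin)]        # point relative to the axis
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    rotated = [v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1.0 - c) for i in range(3)]
    return tuple(r + o for r, o in zip(rotated, origin))

# Hypothetical chain a-b-c-d; the b-c bond is the rotatable bond.
a = (-0.5, 1.2, 0.0)
b = (0.0, 0.0, 0.0)
c = (1.5, 0.0, 0.0)
d = (2.0, 1.2, 0.0)

# Change the torsion by 120 degrees: only atom d (beyond the b-c bond) moves.
d_new = rotate_about_axis(d, b, c, math.radians(120.0))

bond_kept = math.isclose(math.dist(c, d), math.dist(c, d_new))
shape_changed = not math.isclose(math.dist(a, d), math.dist(a, d_new))
print(bond_kept, shape_changed)
```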

In soft docking, van der Waals interactions between atoms are softened, making the structures of both receptor and ligand molecules implicitly flexible as overlap is allowed to a small extent [39, 40]. Soft docking process is carried out realistically by ensuring that both the protein and the ligand are rotatable as in their natural states [37, 38]. It is an advantageous method due to its computational efficiency and ease of application [35, 37].

2.2.2 Conformational search algorithms

Conformational search algorithms can identify different conformational orientations (poses) of the ligand sampled around the experimentally determined active site or other binding sites on the protein [35, 41, 42]. These algorithms are generally classified as: shape matching, systematic, stochastic, and simulation methods [35, 38, 43].

Shape matching algorithms have the advantage of speed over other algorithms [35, 44] and adopt a sampling principle in which the conformation of the ligand should be structurally complementary to the protein binding site [38]. It ensures that the ligand is positioned in such a way that best complements the molecular surface of the binding site on the protein [35]. Some example software using shape matching are: DOCK [45], FLOG [46], EUDOC [47], Surflex [48], LibDOCK [49], SANDOCK [50], and MDock [51].

Using systematic search algorithms, a large number of possible binding poses can be obtained by gradually changing the degrees of freedom of the ligands [35, 52] toward the direction of minimum energy. Systematic search algorithms can be divided into two as exhaustive search and fragmentation (incremental structure) [35, 41, 53]. Exhaustive search algorithm is based on systematically generating flexible ligand conformations by rotating the rotatable bonds in the ligand [35]. If the number of rotatable bonds is large, there is a combinatorial explosion in the number of poses, i.e., the search space, so that some filtering and optimization procedures are applied for practical purposes [35]. Glide [54] and FRED [55] are example docking software using exhaustive conformational search algorithms. In the fragmentation method, the ligand is divided into smaller fragments, each fragment is placed and augmented at the binding site gradually through covalent bonding to the previous one [35]. DOCK [56], LUDI [57], FlexX [58], and eHiTs [59] are example software using fragmentation.
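The combinatorial explosion in exhaustive search can be made concrete with a small sketch. It assumes, for illustration only, that every rotatable bond is sampled on a regular grid of torsion angles (30° steps here); real programs use their own sampling and pruning schemes.

```python
from itertools import product

def torsion_grid(n_rotatable_bonds, step_deg):
    """Iterate over every combination of torsion angles on a regular grid."""
    angles = range(0, 360, step_deg)
    return product(angles, repeat=n_rotatable_bonds)

def n_conformers(n_rotatable_bonds, step_deg):
    """Number of grid points, computed without enumerating them."""
    return (360 // step_deg) ** n_rotatable_bonds

# With 30-degree steps there are 12 values per bond, so the count grows as 12**n.
for n in (2, 5, 10):
    print(n, n_conformers(n, 30))
```

For two bonds the grid is still small enough to enumerate with `torsion_grid(2, 30)`, whereas for ten bonds the count exceeds sixty billion, which is why filtering and optimization steps are needed in practice.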

The algorithms used in stochastic search methods are more efficient but do not guarantee an accurate result as they are based on generating random ligand conformations, and therefore, the docking process is iterative in these algorithms [41, 44]. Monte Carlo, swarm optimization, evolutionary algorithms, and Tabu search methods are among the most used stochastic algorithms [35, 38, 52]. Example software using stochastic conformational search method include AutoDock [60], GOLD [61], DockThor [62], and MolDock [63].
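As an illustration of a stochastic search, the sketch below runs a Metropolis Monte Carlo walk on a made-up one-dimensional energy function that stands in for a docking score. The energy function, step size, temperature factor `kT`, and function names are all assumptions chosen for the example and do not correspond to any particular docking program.

```python
import math
import random

def toy_energy(x):
    """Made-up 1D landscape with one local and one deeper minimum."""
    return 0.05 * x ** 4 - x ** 2 + 0.5 * x

def metropolis_search(energy, x0, n_steps=5000, step=0.5, kT=1.0, seed=0):
    """Random-walk search that sometimes accepts uphill moves."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)          # random perturbation
        e_new = energy(x_new)
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kT):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

# Start near the shallower minimum and keep the best state visited.
best_x, best_e = metropolis_search(toy_energy, x0=3.0)
print(best_x, best_e)
```

Whether the walk escapes the starting basin depends on the random seed and on `kT`, which is one reason why stochastic docking runs are usually repeated several times.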

Simulation methods, i.e., simulations of the obtained ligand poses, represent protein and ligand flexibility better than the other algorithms but are slow and can sample insufficiently [38, 44]. For this reason, they are used as a complement to other conformational search methods [38].

2.2.3 Scoring functions

In the previously described conformational search step, many structures are created and most of them should be eliminated by selecting the biologically appropriate structures. Therefore, the possible poses created by conformational search algorithms are evaluated and ranked by using a scoring function [35]. The scoring function is a measure to evaluate the docking poses obtained [35, 38, 52] in terms of their binding free energies [11, 44, 64].

With the scoring functions that estimate the binding energies of the created complex structures, various physicochemical properties should be evaluated in order to distinguish good results from the bad ones. These physicochemical properties can be intermolecular interactions, desolvation from solvent, electrostatic and entropic effects, etc. [65]. As the number of evaluated parameters increases, the accuracy of the scoring function will increase; but the computational load will also increase. Therefore, scoring functions with ideal efficiency, especially when working with large ligand sets, are those that are balanced in terms of accuracy and speed [11]. The scoring functions can be classified as: force-field-based, empirical, knowledge-based, and consensus scoring.

The Force Field Scoring Function (FFSF) is designed to work with force fields such as AMBER [66], CHARMM [67], GROMOS [68], and OPLS [69], individually or in combination. FFSFs estimate the free energy of ligand binding by considering energy terms such as van der Waals interactions, electrostatic interactions, and hydrogen bonds [35, 38].
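A force-field-style score can be sketched as a sum of pairwise van der Waals and electrostatic terms between protein and ligand atoms. The example below is a simplified illustration, not the functional form of any specific force field: it assumes a 12-6 Lennard-Jones term with a single `epsilon` and `sigma` for all atom pairs, a Coulomb term with a constant dielectric, and hypothetical atoms given as (coordinates, partial charge) pairs.

```python
import math

# Approximate conversion factor so that charges in e and distances in
# angstroms give energies in kcal/mol.
COULOMB_CONSTANT = 332.06

def lennard_jones(r, epsilon, sigma):
    """12-6 van der Waals term; minimum of -epsilon at r = 2**(1/6) * sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q1, q2, dielectric=4.0):
    """Electrostatic term with a constant dielectric (an assumption)."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)

def interaction_energy(protein_atoms, ligand_atoms, epsilon=0.15, sigma=3.4):
    """Sum both terms over all protein-ligand atom pairs."""
    total = 0.0
    for p_xyz, p_q in protein_atoms:
        for l_xyz, l_q in ligand_atoms:
            r = math.dist(p_xyz, l_xyz)
            total += lennard_jones(r, epsilon, sigma) + coulomb(r, p_q, l_q)
    return total

# Hypothetical atoms: ((x, y, z) in angstroms, partial charge in e).
protein = [((0.0, 0.0, 0.0), -0.5), ((1.5, 0.0, 0.0), 0.3)]
ligand = [((0.5, 3.8, 0.0), 0.4)]
score = interaction_energy(protein, ligand)
print(score)
```

Real force fields assign `epsilon`, `sigma`, and charges per atom type, and docking programs typically precompute such terms on a grid for speed.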

Empirical scoring functions use simpler energy terms to estimate the free energy of ligand binding such as hydrogen bonds and ionic interaction, and they can be calculated more easily and faster than FFSFs [35, 38, 52]. Some examples of empirical scoring functions are GlideScore [54], PLP [70], LigScore [71], LUDI [72], SCORE [73], and X-Score [74].

Knowledge-based scoring functions use statistical analysis of known protein-ligand complex structures to derive distance-dependent protein-ligand interaction potentials [44]. These functions can show high performance in a short time [52]. They can also model some uncommon interactions, such as sulfur-aromatic interactions, that other functions do not address [44].

Consensus scoring function, not a specific scoring system, aims at an effective scoring with a combination of multiple scoring functions with the idea of minimizing the possible error margins of existing scoring systems [35, 38, 44].
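One simple way to combine several scoring functions is to average the rank that each function assigns to a pose. The sketch below assumes this rank-averaging strategy, which is only one of several possible consensus schemes; the pose names and scores are hypothetical, and lower scores are taken to be better.

```python
def ranks(scores):
    """Map each pose to its rank (1 = best, i.e., lowest score)."""
    ordered = sorted(scores, key=scores.get)
    return {pose: i + 1 for i, pose in enumerate(ordered)}

def consensus_rank(score_tables):
    """Average the rank of each pose over several scoring functions."""
    per_function = [ranks(table) for table in score_tables]
    poses = per_function[0].keys()
    return {p: sum(r[p] for r in per_function) / len(per_function) for p in poses}

# Hypothetical scores for three poses from three kinds of scoring function.
force_field = {"pose_a": -9.1, "pose_b": -8.7, "pose_c": -6.0}
empirical = {"pose_a": -7.9, "pose_b": -8.4, "pose_c": -5.5}
knowledge = {"pose_a": -120.0, "pose_b": -95.0, "pose_c": -101.0}

consensus = consensus_rank([force_field, empirical, knowledge])
best_pose = min(consensus, key=consensus.get)
print(best_pose, consensus)
```

Working with ranks rather than raw values avoids mixing scores that are on different scales.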

2.3 After docking: evaluation of the results

After performing protein-ligand docking studies, the accuracy of the pose predictions needs to be evaluated [41, 52]. The best way to evaluate a docking algorithm is to compare the predicted binding pose of the ligand with the position of the reference ligand in the experimentally determined structure, if available. The structural comparison is quantified using the root mean squared deviation (RMSD) (Eq. 1), in units of Å [41, 75]. For a good docking result, this value is preferably 2 Å or less, or at most between 2 and 4 Å. RMSD calculations are simple, but this metric is not normalized to the number of atoms and therefore should not be considered as an absolute measure [76]. As a more systematic approach, the consistency of the docking algorithm should be checked by repeating the docking process [52] at least 50 times and clustering the poses of the side chains and references according to a certain threshold value [77]. In this way, it can be determined whether the docking algorithm correctly and consistently places the pose in the right position [41, 44, 78].

$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[(x_{ai}-x_{bi})^{2} + (y_{ai}-y_{bi})^{2} + (z_{ai}-z_{bi})^{2}\right]} \tag{1}$$

Eq. (1) Root mean squared deviation for the coordinates of two molecules, a and b, with N atoms.
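Eq. (1) translates directly into a few lines of code. The sketch below assumes that both poses are given in the same coordinate frame (for example, docked against the same receptor coordinates) with atoms listed in the same order, and it performs no superposition; the coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("both poses need the same, non-zero number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical reference pose and a predicted pose shifted by 1 angstrom along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.5)]
predicted = [(x + 1.0, y, z) for x, y, z in reference]
print(rmsd(reference, predicted))
```

Because every atom is displaced by the same 1 Å, the RMSD of this example is 1 Å.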

The modeling successes and capabilities of docking algorithms have been evaluated since 2001 in a competition called CAPRI (Critical Assessment of PRedicted Interactions) (https://www.capri-docking.org/) [79, 80]. Experimentally determined complex structures that have not yet been published in PDB are submitted to CAPRI, and without knowing the experimental structure of the complex, the participants try to predict the structure most similar to it using docking algorithms [79]. A solution set of 10 models is presented to the CAPRI committee for evaluation based on the geometric similarity and biological relevance of the predicted complex structures. The results of CAPRI show very good predictions for easy targets with simple conformational changes, but considerably worse ones for difficult targets with substantial conformational changes upon binding [9].

3. Molecular docking approaches and applications in drug design

Computational methods have become an important part of the drug discovery process with increasing accuracy of algorithms. Various docking methods based on different algorithms are constantly being developed to determine the structural relationships of potential drug molecules and their targets [44]. In addition, studies in this area shed light on the candidate drugs in terms of the pharmacodynamic properties, affinity, and selectivity [11]. The main molecular docking applications in drug discovery include drug repositioning (repurposing), structure- and ligand-based drug design approaches using virtual and reverse screening [11, 12, 13, 14].

Drug repositioning seeks out new targets for natural compounds, drugs currently in use, or candidate ligands to reveal their unknown therapeutic potentials [81]. Many successful repositioning studies are available in the literature [81, 82, 83]. Virtual screening (VS) and reverse screening (RS) techniques are frequently used in drug discovery and repositioning. VS offers a more effective and rational approach compared with traditional methods [36]. The atomic-level analyzable results presented to us by virtual screening studies guide us in understanding the function of the target and in new drug discoveries [5, 36, 55]. In the RS approach, interest is on a single ligand molecule, and there is a search for a biological target for this molecule [12]. Unlike virtual screening (VS), the search library consists of potential target receptors. RS approach has the potential to lead studies such as testing toxicity or side effects of the existing drugs [38]. The potential side effects of a drug need to be evaluated in the drug discovery process. Molecular docking studies can offer an important perspective in this regard, and there are inverse (reverse) docking studies that provide bioactivity data by detecting off-target bindings [25]. Lastly, the subclasses of Artificial Intelligence (AI): Machine Learning (ML) and Deep Learning (DL) methods have significant contributions in pharmaceutical industry [84]. AI can be applied to different steps such as drug design with VS, de novo generation of drug molecules, and computational planning of drug synthesis [85]. Recent developments are promising that molecular docking methods may benefit from the machine learning methods more in the future [84].

3.1 Virtual screening

The virtual screening (VS) approach uses a target receptor and a library of small molecules. Libraries can be created manually, or already existing libraries can be used. The library consists of a large number of chemically diverse bioactive small molecules with a high probability of binding to the receptor. This virtual computing technique is considered the in silico equivalent of in vitro methods such as high-throughput screening (HTS) [11]. VS is preferred as a guide in scientific studies because its reported success rate is 400 times higher [86] and it is less costly, faster, and less labor-intensive than high-throughput screening methods [87]. VS studies aim to reduce a large number of potential drug candidates to manageable numbers by applying various filters. The biggest challenge in VS is the detection of false negatives [19].

Ligand-based VS methods work by identifying common properties of compound series, such as molecular volume and protonation state [11]. In addition to the chemical similarity [88] and rule-based [89] software included in filtration strategies, there are also various freely available add-ons such as pharmacophore and quantitative structure-activity relationship (QSAR) models [87, 90]. The most commonly used ligand-based virtual screening method is the QSAR method. Ligand-based VS does not use structural information about the receptor; it only scans using receptor sites known to be active and tries to detect active ligand molecules [85].

Structure-based VS methods are often used when the receptor has different conformations. The aim is to predict receptor binding affinity by processing structural information using a variety of techniques, such as binding site similarity and pharmacophore mapping. By estimating the different binding modes, the molecules are sorted for evaluation [11]. Analysis of the predicted poses can be done manually using visualization programs. It has been reported that nAPOLI, a web server developed in recent years, analyzes results automatically [91].

Structure-based pharmacophore generation is one of the most frequently used methods for small molecules in virtual screening. Here, 3D pharmacophore model interfaces of the scaffolds of the ligands are created, and ligands that will adapt to the binding site and provide the desired bioactivity are selected. Some of the programs that use pharmacophore modeling are HipHop [92], PHASE [93], and MOE, which are commercial, and SCAMPI [94], PharmaGist [95], and ALADDIN [96], which are suitable for academic use.

A recent example of VS application on the non-structural protein of SARS-CoV-2, nsp1, one of the virulence factors causing viral infection, is by G. O. Timo et al. [74]. They estimated the exact pattern of nsp1 interaction through molecular simulation studies and analyzed 8694 potential inhibitors from the DrugBank database using the virtual screening method and proposed 16 inhibitor molecules with the best binding energy scores [74]. There is another recent study on the transcription factor BRF2, which is among the therapeutic targets as its upregulation is observed in the formation of various types of cancer, but there is no available specific drug targeting BRF2. By performing drug repositioning through virtual screening of drug molecules that are potential candidates for BRF2 inhibition, Rashidieh et al. found that the bexarotene molecule led to a serious decrease in the proliferation of this type of cancer cells [97].

3.2 Reverse screening

Reverse screening (RS) is also called inverse docking, reverse docking, inverse virtual screening, or target screening. Libraries are more limited for target hunting and profiling [12] and can be created manually using the most common accessible databases such as PDB [98] and TTD [12, 99]. But this process requires a long preparation time and effort. There are various algorithms used to detect interactions by reverse screening. Some web platforms (INVDOCK [100], idTarget [101], ACTP [102], etc.) have been developed for reverse docking, which use libraries prepared for specific diseases and docked using programs such as standard AutoDock and AutoDock Vina [12].

A recently developed Consensus Reverse Docking System (CRDS) detects potential binding sites by screening approximately 5200 candidate proteins for the ligand molecule using three different scoring methods [103]. In another example, Stepanova et al. tested the antimicrobial activity against Mycobacterium tuberculosis strain by reverse screening for chemicals that had been successful in experimental studies and determined the most appropriate target as aspartate 1-decarboxylase by performing docking studies using 35 different target protein structures [104]. Reverse screening was also used for Bazedoxifene, an FDA-approved drug for the prevention of postmenopausal osteoporosis, and Xiao et al. defined the inhibitory power of Bazedoxifene on IL-6/GP130 signaling pathway (critical for cancer survival) by using computational techniques and confirmed the result with in vivo studies [83].

3.3 Machine-learning-based approaches

Machine learning techniques take information from biological data and make predictions about them, thus contributing to building a structural model [9]. Once a model is built, it must be improved so that the state with the lowest potential energy (global minimum) can be reached. Global minimum means a stable and sterically acceptable structure, and reaching it without being stuck at the local minima is very important in the field of bioinformatics and computational structural biology. A recent machine learning algorithm developed by DeepMind, called AlphaFold [5], implements deep learning and can predict 3D structures of proteins using the sequence information with high accuracy and has been accepted as a breakthrough in the structural biology field.

Machine learning makes classifications by learning on datasets and needs human intervention to evaluate possible outcomes. Deep learning is a more advanced model having the neural network with ability to decide the right result without human intervention (Figure 3). Machine learning can use supervised or unsupervised learning. Supervised learning performs machine learning on datasets that we know about, whereas unsupervised learning detects and labels similarities and orientations in a created cluster [38, 90].

Figure 3.

Schematic illustration of artificial intelligence subfields: Machine learning and deep learning.

The training set used in machine learning determines the performance of the algorithm. Machine learning studies in the field of virtual screening are generally focused on improving the performance of the scoring function [85]. Studies have shown that working with small subsets of the same family, which consist of similar structures, gives better scoring results than working with large data from different complexes [105]. Working with subsets of interest is also a better approach in terms of computational requirements [38].

Machine learning and deep learning can describe more diverse data than other computational systems and can be representative of structural biology. Nonparametric machine learning has great potential to be the next step in computer-based programming to improve the accuracy of molecular docking studies [41]. Machine learning can be used to refine predetermined function data as well as provide high-quality data to complement pharmaceutical discovery research and development.

4. Case study: comparison of docking tools

As a case study for comparing different protein-ligand docking tools, the crystal structure of the SARS-CoV-2 (COVID-19) main protease in complex with its non-covalent inhibitor Jun8-76-3A (PDB ID: 7KX5) is used as the experimental reference structure to evaluate the accuracies of the complex structures predicted using AutoDock Vina, HADDOCK, and SwissDock programs and changing some of the parameters to test their effects on prediction capabilities. The inhibitor in the experimental protein structure is removed and then molecular docking is performed using the initial coordinates of the main protease structure of SARS-CoV-2 and its inhibitor Jun8-76-3A, separately.

4.1 Docking with AutoDock Vina

AutoDock is a free software that predicts the binding compatibility of small ligands to macromolecule targets with a flexible-rigid (semi-flexible) docking approach [27]. It uses a grid-based method to place the ligand in the active region determined on the macromolecule [106]. AutoDockTools (http://mgltools.scripps.edu/downloads) is the user interface to produce and examine grid information required for the preparation of the protein and ligand structures in the relevant format and the configuration file [27].

As a docking input, AutoDock Vina requires a configuration file that contains the protein and ligand structure files and the coordinates of the ligand-binding region on the receptor. For docking the case study ligand to the receptor using AutoDock Vina, the structure file was downloaded from the RCSB PDB database (https://www.rcsb.org) in .pdb format (PDB ID: 7KX5). The AutoDockTools (v1.5.6) interface was used to prepare the input files: water molecules in the protein structure were deleted, polar hydrogen atoms were added to the structure, and both the receptor and ligand structures were saved in .pdbqt file format. After preparing the ligand and protein structures, the most important input information for AutoDock is the docking parameter, which involves determining the coordinates of the ligand-binding region on the target protein. If the binding region on the protein is not known, blind docking can be performed by enclosing the whole protein in the grid box (Figure 4A); otherwise, a small grid box can be placed on the specific known/predicted ligand-binding region on the protein (Figure 4B). Lastly, after determining the region on the protein where the ligand is to be bound by using the “grid box” in AutoDockTools, the grid box coordinates were specified in the input configuration file. After preparing all the required inputs, docking was performed using AutoDock Vina, and each docking process was repeated three times in order to observe the consistency of the algorithm (Table 1).

Figure 4.

Grid box usage in docking: (A) blind docking with a grid box of size: 44×72×68 and center coordinates: 10.711, 0.0, 3.782, (B) specific docking with a grid box of size: 14×14×16 and center coordinates: 10.735, −2.409, 21.173.
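The grid-box settings in Figure 4 map onto a Vina configuration file. The following is a minimal sketch, assuming the standard AutoDock Vina configuration keys (`receptor`, `ligand`, `center_x/y/z`, `size_x/y/z`, `num_modes`, `out`) and assuming the box sizes in Figure 4 are given in Å. The file names and the helper `build_vina_config` are hypothetical and are not taken from the case study.

```python
def build_vina_config(receptor, ligand, center, size, out, num_modes=9):
    """Return the text of an AutoDock Vina configuration file.

    center and size are (x, y, z) tuples describing the search box.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"num_modes = {num_modes}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


# Box definitions copied from the Figure 4 caption.
SPECIFIC_BOX = {"center": (10.735, -2.409, 21.173), "size": (14, 14, 16)}
BLIND_BOX = {"center": (10.711, 0.0, 3.782), "size": (44, 72, 68)}

# Hypothetical file names; replace with the prepared .pdbqt files.
config_text = build_vina_config(
    "receptor.pdbqt", "ligand.pdbqt", out="specific_out.pdbqt", **SPECIFIC_BOX
)
print(config_text)
```

Saving the printed text to a file (for example `specific.conf`) and running `vina --config specific.conf` would then start one docking run; swapping in `BLIND_BOX` gives the blind docking setup.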

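Affinity (kcal/mol) for specific and blind docking; Rep = replicate run, AVG = average of the three replicates.

| Mode | Specific Rep 1 | Specific Rep 2 | Specific Rep 3 | Specific AVG | Blind Rep 1 | Blind Rep 2 | Blind Rep 3 | Blind AVG |
|---|---|---|---|---|---|---|---|---|
| 1 | −8.9 | −8.8 | −8.9 | −8.9 | −8.9 | −8.9 | −9.0 | −8.9 |
| 2 | −7.3 | −8.7 | −7.3 | −7.8 | −8.2 | −8.2 | −8.3 | −8.2 |
| 3 | −7.2 | −7.2 | −7.3 | −7.2 | −8.1 | −8.0 | −8.1 | −8.1 |
| 4 | −6.8 | −7.0 | −7.0 | −6.9 | −7.9 | −7.8 | −8.0 | −7.9 |
| 5 | −6.8 | −6.9 | −7.0 | −6.9 | −7.9 | −7.5 | −8.0 | −7.8 |
| 6 | −6.8 | −6.8 | −6.9 | −6.8 | −7.7 | −7.4 | −7.8 | −7.6 |
| 7 | −6.7 | −6.5 | −6.8 | −6.7 | −7.7 | −7.4 | −7.8 | −7.6 |
| 8 | −6.4 | −6.4 | −6.8 | −6.5 | −7.6 | −7.2 | −7.6 | −7.5 |
| 9 | −6.3 | −6.4 | −6.7 | −6.4 | −7.5 | −6.9 | −7.4 | −7.3 |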

Table 1.

Specific and blind docking studies with AutoDock Vina, each repeated three times.
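The AVG columns of Table 1 and the run-to-run consistency can be checked with a few lines of code. The sketch below copies the first three modes from Table 1 and reports, for each mode, the mean affinity and the spread (maximum minus minimum) across the three replicates; the function name `summarize` is ours and is only illustrative.

```python
from statistics import mean

# Affinities (kcal/mol) for modes 1-3, copied from Table 1.
specific = {1: (-8.9, -8.8, -8.9), 2: (-7.3, -8.7, -7.3), 3: (-7.2, -7.2, -7.3)}
blind = {1: (-8.9, -8.9, -9.0), 2: (-8.2, -8.2, -8.3), 3: (-8.1, -8.0, -8.1)}


def summarize(replicates):
    """Return (mean, spread) of replicate scores, rounded to one decimal."""
    spread = max(replicates) - min(replicates)
    return round(mean(replicates), 1), round(spread, 1)


for mode in specific:
    print(mode, "specific:", summarize(specific[mode]), "blind:", summarize(blind[mode]))
```

For mode 2 of the specific docking, the spread of 1.4 kcal/mol comes from a single replicate (−8.7), which illustrates why repeating a stochastic docking run is a useful consistency check.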

To examine the accuracy of the docking results, the poses obtained from AutoDock Vina were aligned with the original PDB structure using the PyMol program [107]. When the energies of the poses predicted with specific docking (i.e., using a grid restricted to the binding site) and blind docking are compared, the blind docking scores are equal for the top pose and more favorable for the remaining poses (Table 1); nevertheless, comparison of the poses with the reference ligand shows that the most accurate binding is achieved with specific docking (Figure 5). Alignment of the first poses (those with the lowest energy score) predicted with specific docking (green) and blind docking (blue) with the reference ligand (red) shows that the specific docking pose lies closer to the reference ligand (green vs. red) than the blind docking pose does (blue vs. red).

Figure 5.

Crystal SARS-CoV-2 main protease structure (white, PDB ID: 7KX5_chain A) in complex with the blind docking (blue) and specific docking (green) poses predicted with AutoDock Vina and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5_chain B). This figure was drawn with PyMol 2.5.2.
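The visual comparison of docked and reference poses can be complemented with a root-mean-square deviation (RMSD) over matched ligand atoms. The sketch below is a generic illustration, not the procedure used in the case study: it assumes both poses are already in the same receptor coordinate frame and that atoms are listed in the same order. The coordinates are toy values chosen only to make the arithmetic easy to follow.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (not taken from 7KX5): a three-atom "ligand" and a copy
# translated by 1 unit along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]

print(rmsd(reference, reference))  # 0.0
print(rmsd(reference, shifted))    # 1.0
```

A lower RMSD to the crystallographic ligand indicates a more accurate pose, independently of the docking score.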

4.2 Docking with HADDOCK

An integrative platform called High Ambiguity-Driven biomolecular DOCKing (HADDOCK) is used for molecular docking of two or more molecules [108] and is a popular algorithm [36]. Although it is mainly suitable for protein-protein interactions, it can also be applied to model the protein–small-molecule complexes [109]. HADDOCK automatically decides the most suitable configuration of the ligand according to the given restrictions [108]. Protein-protein docking is more complex than protein–small-molecule docking, as the proteins are flexible and the conformational space is larger [110].

When run through its web server, HADDOCK does not require local computing resources and lets the user follow all docking steps from start to finish. It should be noted that the success of a HADDOCK study is directly related to the amount of data supplied to the system [36]. HADDOCK can process different types of molecules with the help of external resources such as WHATIF, ProDRG, and the PDB. There is no need to generate separate sets of conformers, as the system selects the most compatible conformers based on the shape restraints. With restraint files, the user can define explicit target sites and binding distances, or select active and passive residues (regions that are likely to interact). Semi-flexible regions can also be defined.

The HADDOCK algorithm consists of three stages: randomization of orientations and rigid-body minimization (it0), semi-flexible simulated annealing in torsion angle space (it1), and refinement in 3D space with explicit solvent (https://www.bonvinlab.org/education/HADDOCK-protein-protein-basic/). The it0 stage treats the structures as rigid bodies, and the 1000 poses with the best scores are selected. The it1 stage optimizes the orientations while allowing flexible regions to be defined for the docking poses coming from it0. The 200 models with the best energies pass to the final stage, in which an explicit solvent (water or DMSO) is considered to improve the interaction energy, and the final models are automatically clustered.

To dock the case study inhibitor-protein complex (PDB ID: 7KX5), the tutorial guideline (HADDOCK small-molecule binding site screening protocol) [111] was followed, and two approaches were tested: (i) using an unambiguous (distance) restraint file indicating the target site that should bind the ligand, and (ii) defining the active and passive residues. The case study consists of a first docking run to detect the binding region and a second docking run to detect the binding pose.
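To make the idea of an unambiguous distance restraint concrete, the sketch below formats one restraint line in the CNS-style `assign (selection) (selection) distance lower-margin upper-margin` form that we assume HADDOCK restraint (.tbl) files use. The chain IDs, residue numbers, atom names, and distances are placeholders for illustration and are not the restraints used in this case study; the helper `distance_restraint` is hypothetical.

```python
def distance_restraint(sel_a, sel_b, distance, minus, plus):
    """Format one CNS-style distance restraint line.

    sel_a and sel_b are (segid, resid, atom_name) tuples. The allowed
    distance range is [distance - minus, distance + plus].
    """
    def fmt(selection):
        segid, resid, atom = selection
        return f"(segid {segid} and resid {resid} and name {atom})"

    return f"assign {fmt(sel_a)} {fmt(sel_b)} {distance:.1f} {minus:.1f} {plus:.1f}"


# Placeholder selections: one protein atom (chain A) and one ligand atom (chain B).
line = distance_restraint(("A", 41, "NE2"), ("B", 1, "C1"), 4.0, 4.0, 0.0)
print(line)
# assign (segid A and resid 41 and name NE2) (segid B and resid 1 and name C1) 4.0 4.0 0.0
```

With these placeholder values the two atoms would be restrained to lie within 0 to 4 Å of each other. The exact syntax expected by a given HADDOCK version should be checked against its documentation before use.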

First, we tested HADDOCK’s accuracy in binding site detection. Two different binding sites were detected among the top 10 clusters with the best energy scores, and 70% (7 out of 10) of the clusters were in the correct binding site (Figure 6A). Second, ambiguous and unambiguous restraint files were created by identifying the region with the highest number of interactions between the ligand and the receptor. The restraint files can be created manually or with the link given in the protocol; however, the distance restraints may need to be corrected. The structure with the best energy is shown in Figure 6B. Third, active and passive residues were defined, and the pose with the best energy is shown in Figure 6C. The HADDOCK results are summarized in Table 2.

Figure 6.

Crystal SARS-CoV-2 main protease structure (gray, PDB ID: 7KX5_chain A) in complex with the docking poses (blue) predicted with HADDOCK and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5_chain B). A. Top 10 clusters for binding site determination. B. Pose with the best energy using ambiguous/unambiguous restraints. C. Pose with the best energy using active/passive restraints. This figure was drawn with PyMol 2.5.2.

| | Binding site detection | Ambiguous/Unambiguous restraints | Active/passive restraints |
|---|---|---|---|
| HADDOCK score | −53.4 ± 1.5 | −52.1 ± 0.5 | −21.9 ± 2.7 |
| Cluster size | 69513 (values fused in source) | | |
| RMSD from the overall lowest-energy structure | 0.3 ± 0.2 | 0.1 ± 0.1 | 0.2 ± 0.0 |
| Van der Waals energy | −40.3 ± 1.2 | −41.6 ± 0.2 | −32.4 ± 4.5 |
| Electrostatic energy | −22.1 ± 1.9 | −15.2 ± 6.0 | −25.8 ± 7.3 |
| Desolvation energy | −10.9 ± 2.5 | −9.0 ± 0.2 | −6.7 ± 0.3 |
| Restraints violation energy | 0.0 ± 0.00 | 0.7 ± 0.2 | 198.5 ± 78.0 |
| Buried Surface Area | 795.4 ± 21.9 | 781.6 ± 5.2 | 783.0 ± 9.4 |
| Z-Score | −1.7 | −2.4 | −1.3 |

Table 2.

HADDOCK results.
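The HADDOCK scores in Table 2 can be related to the individual energy terms. The sketch below assumes the score is a weighted sum with weights 1.0 (van der Waals), 0.1 (electrostatic), 1.0 (desolvation), and 0.1 (restraint violation); these weights are our assumption and are not stated in this chapter, although they reproduce the reported scores to within rounding. All energy values are copied from Table 2.

```python
# Assumed weights (not given in the chapter).
WEIGHTS = {"vdw": 1.0, "elec": 0.1, "desolv": 1.0, "air": 0.1}

# Mean energy terms and reported scores from Table 2.
TABLE2 = {
    "binding site detection": {
        "vdw": -40.3, "elec": -22.1, "desolv": -10.9, "air": 0.0, "score": -53.4,
    },
    "ambiguous/unambiguous restraints": {
        "vdw": -41.6, "elec": -15.2, "desolv": -9.0, "air": 0.7, "score": -52.1,
    },
    "active/passive restraints": {
        "vdw": -32.4, "elec": -25.8, "desolv": -6.7, "air": 198.5, "score": -21.9,
    },
}


def weighted_score(terms):
    """Weighted sum of the energy terms under the assumed weights."""
    return sum(WEIGHTS[name] * terms[name] for name in WEIGHTS)


for label, terms in TABLE2.items():
    print(f"{label}: recomputed {weighted_score(terms):.1f}, reported {terms['score']:.1f}")
```

Under this assumption, the much less favorable score of the active/passive run is largely explained by its restraint violation energy, which alone contributes about +19.9 to the score. Small differences between recomputed and reported values are expected because the table lists cluster averages rounded to one decimal.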

Comparison of the results shows that HADDOCK is successful in detecting the binding site. However, according to the results obtained in the second stage, the algorithm was not successful enough in finding the correct conformation of the ligand in the binding site. Defining ambiguous/unambiguous restraint files or selecting active and passive residues did not contribute significantly to detecting the correct binding pose (Figure 6B and C). Docking with both approaches was repeated several times, and no significant similarity was detected.

4.3 Docking with SwissDock

SwissDock is a web service for protein–small-molecule docking based on the EADock DSS docking engine, accessible through a web browser and a programmatic interface [112]. Related tools such as SWISS-MODEL and Swiss-Pdb Viewer (DeepView) support comparative modeling of three-dimensional protein structures from amino acid sequences and the simultaneous analysis of several proteins [113]. Using the SwissDock web server, the starting crystal structures of the target proteins can be searched and fetched from protein and ligand structure databases. If no crystal structure is available, a homology model of the studied protein can be used instead. During the docking process, the user does not have to perform any calculations because all calculations are handled on the server side [112]. As a docking constraint, the ligand-binding region can be defined, or blind docking can be applied without any prior information.

Using the case study, both specific and blind docking were performed on the SwissDock server, and the results were compared. The server returned 256 poses. The best scores obtained by specific docking (green) and blind docking (blue) were −9.88 and −9.35 kcal/mol, respectively (Figure 7). Although neither predicted pose adopted the same conformation as the reference ligand, the pose obtained from specific docking (green) was more similar to the reference ligand (red) (Figure 7).

Figure 7.

Crystal SARS-CoV-2 main protease structure (white, PDB ID: 7KX5_chain A) in complex with the blind docking (blue) and specific docking (green) poses predicted by SwissDock and the reference ligand Jun8-76-3A inhibitor (red, PDB ID: 7KX5_chain B). This figure was drawn with PyMol 2.5.2.


5. Conclusions

Molecular docking is a computational method that predicts the 3D structures of receptor-ligand complexes. Modeling the atomic details of the ligand pose with the receptor protein by molecular docking can assist in understanding protein structure-function relationship and in drug design studies in several ways. Computational modeling approaches complement and/or lead experiments by eliminating irrelevant drug candidates and selecting the ones with the best binding properties. With the continuously developing technology, there are many different approaches and algorithms for molecular docking studies, and they are successfully used in therapeutic applications such as targeted drug design, drug target search, evaluation of the side effects of existing drugs, or finding new targets for these drugs.

The crystal structure of the SARS-CoV-2 (COVID-19) main protease in complex with its non-covalent inhibitor Jun8-76-3A (PDB ID: 7KX5) was used as an experimental reference case study to compare and evaluate the prediction accuracies of AutoDock Vina, HADDOCK, and SwissDock programs as well as to test the effects of some parameters on their prediction capabilities. One of the main observations is that the ligand poses with the lowest binding energy scores are not necessarily the best solution. Therefore, docking results should always be evaluated in terms of biological relevance. Moreover, when a priori information about the ligand-binding site is included as grid box placement and size in AutoDock Vina and as ligand binding residues in SwissDock, the binding accuracy is improved significantly.

In summary, before starting the molecular docking, it is of crucial importance to obtain detailed information on the target protein and ligand from various sources and servers and to decide which docking algorithm to use. Moreover, the top predicted poses with the best scores should not be unquestioningly accepted as the best solutions but further structural analyses and evaluations should be incorporated in the decision process.


Acknowledgments

We would like to thank dear Merve DEMİR AKYÜZ and Merve YÜCETÜRK (a.k.a Merves), who are fourth-year undergraduate students at Molecular Biology and Genetics Department of Istanbul Medeniyet University, for their contribution to the writing of the introduction section.

References

1. Russell RB, Alber F, Aloy P, Davis FP, Korkin D, Pichaud M, et al. A structural perspective on protein–protein interactions. Current Opinion in Structural Biology. 2004;14(3):313-324
2. Sadowski MI, Jones DT. The sequence–structure relationship and protein function prediction. Current Opinion in Structural Biology. 2009;19(3):357-362
3. Petrey D, Honig B. Structural bioinformatics of the interactome. Annual Review of Biophysics. 2014;43(1):193-210
4. Stein A, Mosca R, Aloy P. Three-dimensional modeling of protein interactions and complexes is going ‘omics. Current Opinion in Structural Biology. 2011;21(2):200-208
5. Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577(7792):706-710
6. Andrusier N, Mashiach E, Nussinov R, Wolfson HJ. Principles of flexible protein-protein docking. Proteins. 2008;73(2):271-289
7. Bonvin AM. Flexible protein–protein docking. Current Opinion in Structural Biology. 2006;16(2):194-200
8. Vakser IA. Protein-protein docking: From interaction to interactome. Biophysical Journal. 2014;107(8):1785-1793
9. Harmalkar A, Gray JJ. Advances to tackle backbone flexibility in protein docking. Current Opinion in Structural Biology. 2021;67:178-186
10. Wang C, Bradley P, Baker D. Protein–protein docking with backbone flexibility. Journal of Molecular Biology. 2007;373(2):503-519
11. Ferreira L, dos Santos R, Oliva G, Andricopulo A. Molecular docking and structure-based drug design strategies. Molecules. 2015;20(7):13384-13421
12. Pinzi L, Rastelli G. Molecular docking: Shifting paradigms in drug discovery. IJMS. 2019;20(18):4331
13. March-Vila E, Pinzi L, Sturm N, Tinivella A, Engkvist O, Chen H, et al. On the integration of in silico drug design methods for drug repurposing. Frontiers in Pharmacology. 2017;23(8):298
14. Wilson GL, Lill MA. Integrating structure-based and ligand-based approaches for computational drug design. Future Medicinal Chemistry. 2011;3(6):735-750
15. Anighoro A. Deep learning in structure-based drug design. Methods in Molecular Biology. 2022;2390:261-271
16. Stokes JM, Yang K, Swanson K, Jin W, Cubillos-Ruiz A, Donghia NM, et al. A deep learning approach to antibiotic discovery. Cell. 2020;180(4):688-702.e13
17. Elton DC, Boukouvalas Z, Fuge MD, Chung PW. Deep learning for molecular design—a review of the state of the art. Molecular Systems Design & Engineering. 2019;4(4):828-849
18. Allison B, Combs S, DeLuca S, Lemmon G, Mizoue L, Meiler J. Computational design of protein-small molecule interfaces. Journal of Structural Biology. 2014;185(2):193-202
19. Śledź P, Caflisch A. Protein structure-based drug design: From docking to molecular dynamics. Current Opinion in Structural Biology. 2018;48:93-102
20. Guterres H, Im W. Improving protein-ligand docking results with high-throughput molecular dynamics simulations. Journal of Chemical Information and Modeling. 2020;60(4):2189-2198
21. Wishart DS. DrugBank: A comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Research. 2006;34(90001):D668-D672
22. Li Q, Cheng T, Wang Y, Bryant SH. PubChem as a public resource for drug discovery. Drug Discovery Today. 2010;15(23-24):1052-1057
23. Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman RG. ZINC: A free tool to discover chemistry for biology. Journal of Chemical Information and Modeling. 2012;52(7):1757-1768
24. Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, et al. ChEMBL: A large-scale bioactivity database for drug discovery. Nucleic Acids Research. 2012;40(D1):D1100-D1107
25. Pence HE, Williams A. ChemSpider: An online chemical information resource. Journal of Chemical Education. 2010;87(11):1123-1124
26. O’Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR. Open Babel: An open chemical toolbox. Journal of Cheminformatics. 2011;3(1):33
27. Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, et al. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. Journal of Computational Chemistry. 2009;30(16):2785-2791
28. Goodford PJ. A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. Journal of Medicinal Chemistry. 1985;28(7):849-857
29. Laskowski RA. SURFNET: A program for visualizing molecular surfaces, cavities, and intermolecular interactions. Journal of Molecular Graphics. 1995;13(5):323-330
30. Yang J, Roy A, Zhang Y. Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics. 2013;29(20):2588-2595
31. Narang P, Bhushan K, Bose S, Jayaram B. Protein structure evaluation using an all-atom energy based empirical scoring function. Journal of Biomolecular Structure & Dynamics. 2006;23(4):385-406
32. Binkowski TA, Naghibzadeh S, Liang J. CASTp: Computed atlas of surface topography of proteins. Nucleic Acids Research. 2003;31(13):3352-3355
33. Jiménez J, Doerr S, Martínez-Rosell G, Rose AS, De Fabritiis G. DeepSite: Protein-binding site predictor using 3D-convolutional neural networks. Bioinformatics. 2017;33(19):3036-3042
34. Kandel J, Tayara H, Chong KT. PUResNet: Prediction of protein-ligand binding sites using deep residual neural network. Journal of Cheminformatics. 2021;13(1):65
35. Huang SY, Zou X. Advances and challenges in protein-ligand docking. IJMS. 2010;11(8):3016-3034
36. de Vries SJ, van Dijk M, Bonvin AMJJ. The HADDOCK web server for data-driven biomolecular docking. Nature Protocols. 2010;5(5):883-897
37. Fan J, Fu A, Zhang L. Progress in molecular docking. Quantitative Biology. 2019;7(2):83-89
38. Crampon K, Giorkallos A, Deldossi M, Baud S, Steffenel LA. Machine-learning methods for ligand–protein molecular docking. Drug Discovery Today. 2022;27(1):151-164
39. Jiang F, Kim SH. “Soft docking”: Matching of molecular surface cubes. Journal of Molecular Biology. 1991;219(1):79-102
40. Ferrari AM, Wei BQ, Costantino L, Shoichet BK. Soft docking and multiple receptor conformations in virtual screening. Journal of Medicinal Chemistry. 2004;47(21):5076-5084
41. Torres PHM, Sodero ACR, Jofily P, Silva-Jr FP. Key topics in molecular docking for drug design. IJMS. 2019;20(18):4574
42. Gioia D, Bertazzo M, Recanatini M, Masetti M, Cavalli A. Dynamic docking: A paradigm shift in computational drug discovery. Molecules. 2017;22(11):2029
43. Sousa SF, Fernandes PA, Ramos MJ. Protein-ligand docking: Current status and future challenges. Proteins. 2006;65(1):15-26
44. Meng XY, Zhang HX, Mezei M, Cui M. Molecular docking: A powerful approach for structure-based drug discovery. Current Computer-Aided Drug Design. 2011;7(2):146-157
45. Kuntz ID, Blaney JM, Oatley SJ, Langridge R, Ferrin TE. A geometric approach to macromolecule-ligand interactions. Journal of Molecular Biology. 1982;161(2):269-288
46. Miller MD, Kearsley SK, Underwood DJ, Sheridan RP. FLOG: A system to select “quasi-flexible” ligands complementary to a receptor of known three-dimensional structure. Journal of Computer-Aided Molecular Design. 1994;8(2):153-174
47. Pang YP, Perola E, Xu K, Prendergast FG. EUDOC: A computer program for identification of drug interaction sites in macromolecules and drug leads from chemical databases. Journal of Computational Chemistry. 2001;22(15):1750-1771
48. Jain AN. Surflex: Fully automatic flexible molecular docking using a molecular similarity-based search engine. Journal of Medicinal Chemistry. 2003;46(4):499-511
49. Diller DJ, Merz KM. High throughput docking for library design and library prioritization. Proteins. 2001;43(2):113-124
50. Burkhard P, Taylor P, Walkinshaw MD. An example of a protein ligand found by database mining: Description of the docking method and its verification by a 2.3 Å X-ray structure of a thrombin-ligand complex. Journal of Molecular Biology. 1998;277(2):449-466
51. Huang SY, Zou X. Ensemble docking of multiple protein structures: Considering protein structural variations in molecular docking. Proteins. 2006;66(2):399-421
52. Prieto-Martínez FD, Arciniega M, Medina-Franco JL. Acoplamiento Molecular: Avances Recientes y Retos. TIP RECQB. 2018. [cited 2022 May 15];21. Available from: http://tip.zaragoza.unam.mx/index.php/tip/article/view/143
53. Guedes IA, de Magalhães CS, Dardenne LE. Receptor–ligand molecular docking. Biophysical Reviews. 2014;6(1):75-87
54. Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, et al. Glide: A new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. Journal of Medicinal Chemistry. 2004;47(7):1739-1749
55. McGann MR, Almond HR, Nicholls A, Grant JA, Brown FK. Gaussian docking functions. Biopolymers. 2003;68(1):76-90
56. Ewing TJA, Kuntz ID. Critical evaluation of search algorithms for automated molecular docking and database screening. Journal of Computational Chemistry. 1997;18(9):1175-1189
57. Böhm HJ. The computer program LUDI: A new method for the de novo design of enzyme inhibitors. Journal of Computer-Aided Molecular Design. 1992;6(1):61-78
58. Rarey M, Kramer B, Lengauer T, Klebe G. A fast flexible docking method using an incremental construction algorithm. Journal of Molecular Biology. 1996;261(3):470-489
59. Bentham Science Publisher BSP. eHiTS: An innovative approach to the docking and scoring function problems. CPPS. 2006;7(5):421-435
60. Trott O, Olson AJ. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry. 2009
61. Verdonk ML, Cole JC, Hartshorn MJ, Murray CW, Taylor RD. Improved protein-ligand docking using GOLD. Proteins. 2003;52(4):609-623
62. de Magalhães CS, Almeida DM, Barbosa HJC, Dardenne LE. A dynamic niching genetic algorithm strategy for docking highly flexible ligands. Information Sciences. 2014;289:206-224
63. Thomsen R, Christensen MH. MolDock: A new technique for high-accuracy molecular docking. Journal of Medicinal Chemistry. 2006;49(11):3315-3321
64. Forli S, Huey R, Pique ME, Sanner MF, Goodsell DS, Olson AJ. Computational protein–ligand docking and virtual drug screening with the AutoDock suite. Nature Protocols. 2016;11(5):905-919
65. Bentham Science Publisher BSP. Scoring functions for protein-ligand docking. CPPS. 2006;7(5):407-420
66. Weiner PK, Kollman PA. AMBER: Assisted model building with energy refinement. A general program for modeling molecules and their interactions. Journal of Computational Chemistry. 1981;2(3):287-303
67. Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry. 1983;4(2):187-217
68. van Gunsteren WF, Berendsen HJC. Computer simulation of molecular dynamics: Methodology, applications, and perspectives in chemistry. Angewandte Chemie (International Ed. in English). 1990;29(9):992-1023
69. Jorgensen WL, Tirado-Rives J. The OPLS Potential Functions for Proteins. Energy Minimizations for Crystals of Cyclic Peptides and Crambin. p. 10
70. Parrill AL, Reddy MR. Rational Drug Design: Novel Methodology and Practical Applications. American Chemical Society; 1999 [cited 2022 May 23]. (ACS Symposium Series; vol. 719). Available from: https://pubs.acs.org/doi/book/10.1021/bk-1999-0719
71. Krammer A, Kirchhoff PD, Jiang X, Venkatachalam CM, Waldman M. LigScore: A novel scoring function for predicting binding affinities. Journal of Molecular Graphics & Modelling. 2005;23(5):395-407
72. Böhm HJ. The development of a simple empirical scoring function to estimate the binding constant for a protein-ligand complex of known three-dimensional structure. Journal of Computer-Aided Molecular Design. 1994;8(3):243-256
73. Wang R, Liu L, Lai L, Tang Y. SCORE: A new empirical method for estimating the binding affinity of a protein-ligand complex. Journal of Molecular Modeling. 1998;4(12):379-394
74. Wang R, Lai L, Wang S. Further development and validation of empirical scoring functions for structure-based binding affinity prediction. Journal of Computer-Aided Molecular Design. 2002;16(1):11-26
75. Dias R, de Azevedo W. Molecular docking algorithms. CDT. 2008;9(12):1040-1047
76. Waszkowycz B, Clark DE, Gancia E. Outstanding challenges in protein–ligand docking and structure-based virtual screening. WIREs Computational Molecular Science. 2011;1(2):229-259
77. Morris GM, Lim-Wilby M. Molecular docking. In: Kukol A, editor. Molecular Modeling of Proteins. Totowa, NJ: Humana Press; 2008. pp. 365-382
78. Verdonk ML, Taylor RD, Chessari G, Murray CW. Illustration of current challenges in molecular docking. In: Structure-Based Drug Discovery. Dordrecht: Springer Netherlands; 2007. pp. 201-221
79. Janin J, Henrick K, Moult J, Eyck LT, Sternberg MJE, Vajda S, et al. CAPRI: A critical assessment of PRedicted interactions. Proteins. 2003;52(1):2-9
80. Janin J. Protein–protein docking tested in blind predictions: The CAPRI experiment. Molecular BioSystems. 2010;6(12):2351
81. Hurle MR, Yang L, Xie Q, Rajpal DK, Sanseau P, Agarwal P. Computational drug repositioning: From data to therapeutics. Clinical Pharmacology and Therapeutics. 2013;93(4):335-341
82. Scherman D, Fetro C. Drug repositioning for rare diseases: Knowledge-based success stories. Thérapie. 2020;75(2):161-167
83. Xiao H, Bid HK, Chen X, Wu X, Wei J, Bian Y, et al. Repositioning Bazedoxifene as a novel IL-6/GP130 signaling antagonist for human rhabdomyosarcoma therapy. PLoS ONE. 2017;12(7):e0180297
84. Gupta RR. Application of artificial intelligence and machine learning in drug discovery. Methods in Molecular Biology. 2022;2390:113-124
85. Thomas M, Boardman A, Garcia-Ortegon M, Yang H, de Graaf C, Bender A. Applications of artificial intelligence in drug design: Opportunities and challenges. Methods in Molecular Biology. 2022;2390:1-59
86. Zhu T, Cao S, Su PC, Patel R, Shah D, Chokshi HB, et al. Hit identification and optimization in virtual screening: Practical recommendations based on a critical literature analysis: Miniperspective. Journal of Medicinal Chemistry. 2013;56(17):6560-6572
87. Neves BJ, Mottin M, Moreira-Filho JT, Sousa BK de P, Mendonca SS, Andrade CH. Best practices for docking-based virtual screening. In: Molecular Docking for Computer-Aided Drug Design. Academic Press (Elsevier); 2021. pp. 75-98
88. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Advanced Drug Delivery Reviews. 2001;46(1-3):3-26
89. Veber DF, Johnson SR, Cheng HY, Smith BR, Ward KW, Kopple KD. Molecular properties that influence the oral bioavailability of drug candidates. Journal of Medicinal Chemistry. 2002;45(12):2615-2623
90. Neves BJ, Braga RC, Melo-Filho CC, Moreira-Filho JT, Muratov EN, Andrade CH. QSAR-based virtual screening: Advances and applications in drug discovery. Frontiers in Pharmacology. 2018;9:1275
91. Fassio AV, Santos LH, Silveira SA, Ferreira RS, de Melo-Minardi RC. nAPOLI: A graph-based strategy to detect and visualize conserved protein-ligand interactions in large-scale. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2019:1-1
92. Kurogi Y, Güner OF. Pharmacophore modeling and three-dimensional database searching for drug design using catalyst. Current Medicinal Chemistry. 2001;8(9):1035-1055
93. Dixon SL, Smondyrev AM, Knoll EH, Rao SN, Shaw DE, Friesner RA. PHASE: A new engine for pharmacophore perception, 3D QSAR model development, and 3D database screening: 1. Methodology and preliminary results. Journal of Computer-Aided Molecular Design. 2006;20(10-11):647-671
94. Chen X, Rusinko A, Tropsha A, Young SS. Automated pharmacophore identification for large chemical data sets. Journal of Chemical Information and Computer Sciences. 1999;39(5):887-896
95. Schneidman-Duhovny D, Dror O, Inbar Y, Nussinov R, Wolfson HJ. PharmaGist: A webserver for ligand-based pharmacophore detection. Nucleic Acids Research. 2008;36:W223-W228
96. Fan N, Bauer CA, Stork C, de Bruyn KC, Kirchmair J. ALADDIN: Docking approach augmented by machine learning for protein structure selection yields superior virtual screening performance. Molecular Informatics. 2020;39(4):e1900103
97. Rashidieh B, Molakarimi M, Mohseni A, Tria SM, Truong H, Srihari S, et al. Targeting BRF2 in cancer using repurposed drugs. Cancers. 2021;13(15):3778
98. Berman HM. The protein data bank. Nucleic Acids Research. 2000;28(1):235-242
99. Chen X. TTD: Therapeutic target database. Nucleic Acids Research. 2002;30(1):412-415
100. Chen YZ, Zhi DG. Ligand-protein inverse docking and its potential use in the computer search of protein targets of a small molecule. Proteins. 2001;43(2):217-226
101. Wang JC, Chu PY, Chen CM, Lin JH. idTarget: A web server for identifying protein targets of small chemical molecules with robust scoring functions and a divide-and-conquer docking approach. Nucleic Acids Research. 2012;40:W393-W399
102. Xie T, Zhang L, Zhang S, Ouyang L, Cai H, Liu B. ACTP: A webserver for predicting potential targets and relevant pathways of autophagy-modulating compounds. Oncotarget. 2016;7(9):10015-10022
103. Lee A, Kim D. CRDS: Consensus reverse docking system for target fishing. Bioinformatics. 2019
104. Stepanova EE, Balandina SY, Drobkova VA, Dmitriev MV, Mashevskaya IV, Maslivets AN. Synthesis, in vitro antibacterial activity against Mycobacterium tuberculosis, and reverse docking-based target fishing of 1,4-benzoxazin-2-one derivatives. Archiv der Pharmazie. 2021;354(2):2000199
105. Imrie F, Bradley AR, van der Schaar M, Deane CM. Protein family-specific models using deep neural networks and transfer learning improve virtual screening and highlight the need for more data. Journal of Chemical Information and Modeling. 2018;58(11):2319-2330
106. Kitchen DB, Decornez H, Furr JR, Bajorath J. Docking and scoring in virtual screening for drug discovery: Methods and applications. Nature Reviews. Drug Discovery. 2004;3(11):935-949
107. Yuan S, Chan HCS, Hu Z. Using PyMOL as a platform for computational drug design. WIREs Computational Molecular Science. 2017;30(2):70
108. Koukos PI, Réau M, Bonvin AMJJ. Shape-restrained modeling of protein–small-molecule complexes with high ambiguity driven DOCKing. Journal of Chemical Information and Modeling. 2021;61(9):4807-4818
109. Koukos PI, Xue LC, Bonvin AMJJ. Protein–ligand pose and affinity prediction: Lessons from D3R Grand Challenge 3. Journal of Computer-Aided Molecular Design. 2019;33(1):83-91
110. Stanzione F, Giangreco I, Cole JC. Use of molecular docking computational tools in drug discovery. In: Progress in Medicinal Chemistry. Elsevier; 2021. pp. 273-343
111. Sennhauser G, Amstutz P, Briand C, Storchenegger O, Grütter MG. Drug export pathway of multidrug exporter AcrB revealed by DARPin inhibitors. PLoS Biology. 2007;5(1):e7
112. Grosdidier A, Zoete V, Michielin O. SwissDock, a protein-small molecule docking web service based on EADock DSS. Nucleic Acids Research. 2011;39(suppl):W270-W277
113. Guex N, Peitsch MC. SWISS-MODEL and the Swiss-Pdb Viewer: An environment for comparative protein modeling. Electrophoresis. 1997;18(15):2714-2723
