Transformation of Drug Discovery towards Artificial Intelligence: An in Silico Approach

Computational methods play a key role in the design of therapeutically important molecules for modern drug development. With these "in silico" approaches, machines are learning to offer solutions to some of the most complex drug-related problems, which positions them as the next frontier for potential breakthroughs in drug discovery. Machine learning (ML) methods are used to predict compounds with pharmacological activity and with specific pharmacodynamic and ADMET (absorption, distribution, metabolism, excretion and toxicity) properties, and thereby to evaluate drugs and their various applications. Modern artificial intelligence (AI) has the capacity to significantly enhance the role of computational methodology in drug discovery. The use of AI in drug discovery and development, drug repurposing, improving pharmaceutical productivity, and clinical trials will certainly reduce the human workload and help achieve targets in a shorter period of time. This chapter elaborates on the crosstalk between machine learning techniques and computational tools, and on the future of AI in the pharmaceutical industry.


Introduction
Computer-aided drug design [1,2] has the potential to lower costs, decrease failure rates and speed up the discovery process. Computational tools play various roles in medicinal chemistry, ranging from the optimization of protein-ligand interactions to the design of new drugs. These methods are broadly classified as structure-based and ligand-based. In structure-based methods, computational studies cover molecular dynamics, protein-ligand docking and the calculation of binding free energies. In ligand-based methods, computational calculations help to predict the biological response from known active and inactive ligands; they include quantitative structure-activity relationships, activity cliff analysis and similarity searching. The discovery of new molecules that are more effective and have fewer unwanted side effects is a constant concern of the pharmaceutical industry. Newly developed research methods are therefore used to predict the properties and activities of molecules even before they are synthesized. The significant development of computational tools, together with theoretical studies in quantum chemistry, allows researchers to obtain more precise physicochemical and quantum parameters of compounds in a shorter time. These techniques are moving towards the simultaneous synthesis of a very large number of molecules and the testing of their actions on therapeutic targets. Density functional theory (DFT) has become a powerful tool for studying the electronic and geometric characteristics of drugs. Conceptual density functional theory (CDFT), originally developed by Parr and collaborators [3][4][5][6][7][8], provides several global and local reactivity descriptors that help us to understand various physicochemical processes. Because global reactivity descriptors are connected with several electronic structure principles, they play a very important role in providing physicochemical information about the complexes.
Other computational tools are used to evaluate the relationship between the structure and activity of a drug, the pharmacokinetic parameters responsible for bioavailability, and toxicity. In this way, drugs with higher efficiency are obtained. These studies give a complete picture for the design of new drug molecules, covering physicochemical parameters, drug-likeness and cytotoxicity evaluation, in a shorter time. Recent advances in these methods have increased the quantity and complexity of the generated data. This massive amount of raw data needs to be stored and interpreted in order to advance medicine. Machine learning algorithms are needed to find correlations and patterns in large amounts of complex drug data and to extract knowledge and insights from the accumulated data. Databases are used to design new molecular descriptors, and the models are validated with external test sets [9].
Modern artificial intelligence (AI) has the potential to significantly enhance the role of computational methods and machine learning in the pharmaceutical industry [10]. According to the World Economic Forum, the combination of big data and AI is considered the fourth paradigm of science and the fourth industrial revolution. With machine learning and AI offering solutions to some of the most complex drug-related problems, drug discovery has opened the way to a potential breakthrough in medicine.

Method
CDFT is used to predict, analyze and interpret the reactivity properties of small drug molecules. Global reactivity descriptors, Fukui indices [11] and the dual descriptors proposed by Morell et al. [12] are used for the analysis. These theories have been validated by a large number of studies [13][14][15][16][17][18]. Global indices and the other derived reactivity indices give better microscopic insight into the whole interaction process for the interacting complexes. In our work, we selected 22 small drug molecules, identified through design and structure-based, informatics-based, fragment-based, small-molecule microarray screening, dynamic combinatorial screening and phenotypic assay approaches, in order to identify RNA-binding molecules. These drugs show several desirable properties, such as good absorption, distribution and oral bioavailability, and are able to target bulges, loops, junctions, pseudoknots or higher-order structures. The optimized structures of these 22 small drug molecules are given in Figure 1. We computed the relevant electronic properties of the studied drugs, including global parameters such as E_HOMO, E_LUMO, energy gap, ionization potential (IP), electron affinity (EA), electronegativity (χ), global hardness (η), global softness (S), chemical potential (μ), electrophilicity index (ω), fraction of electrons transferred (ΔN) and nucleophilicity index, as well as local ones (Fukui functions and dual descriptors). The CDFT results predicted structural and thermodynamic stability and low reactivity for the complexes. A few complexes were also identified as fluorescent biomarkers, as their emission lies in the visible region [19]. In another study, global and local descriptors were calculated for ten anti-inflammatory steroids (AIS) to understand their structure-activity relationship. The toxicity of the drugs and the pharmacokinetic parameters responsible for bioavailability and bioactivity were evaluated with the Osiris and Molinspiration bioinformatics tools [20,21].
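The global descriptors listed above are commonly estimated from the frontier orbital energies alone. The following is a minimal sketch of that calculation, assuming Koopmans-type estimates (IP = -E_HOMO, EA = -E_LUMO) and the convention η = (IP - EA)/2, S = 1/(2η), ω = μ²/(2η). The chapter does not state which convention was used, so the factors of 2 are an assumption, and the function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GlobalDescriptors:
    ionization_potential: float  # IP, eV
    electron_affinity: float     # EA, eV
    energy_gap: float            # E_LUMO - E_HOMO, eV
    electronegativity: float     # chi, eV
    chemical_potential: float    # mu = -chi, eV
    hardness: float              # eta, eV
    softness: float              # S, 1/eV
    electrophilicity: float      # omega, eV


def global_descriptors(e_homo: float, e_lumo: float) -> GlobalDescriptors:
    """Estimate CDFT global descriptors from orbital energies given in eV.

    Assumed convention (not taken from the chapter):
    eta = (IP - EA) / 2, S = 1 / (2 * eta), omega = mu**2 / (2 * eta).
    """
    ip = -e_homo              # Koopmans-type estimate of the ionization potential
    ea = -e_lumo              # Koopmans-type estimate of the electron affinity
    chi = (ip + ea) / 2.0     # electronegativity
    mu = -chi                 # chemical potential
    eta = (ip - ea) / 2.0     # global hardness
    if eta <= 0.0:
        raise ValueError("E_LUMO must lie above E_HOMO")
    return GlobalDescriptors(
        ionization_potential=ip,
        electron_affinity=ea,
        energy_gap=e_lumo - e_homo,
        electronegativity=chi,
        chemical_potential=mu,
        hardness=eta,
        softness=1.0 / (2.0 * eta),
        electrophilicity=mu ** 2 / (2.0 * eta),
    )
```

The orbital energies would come from the output of the quantum chemistry calculation. A larger hardness (wider gap) corresponds to the low reactivity discussed in the text.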
The physicochemical properties were studied with the Gaussian 09 (G09) software. As the structures of small-molecule drugs increase in complexity, synthetic and in silico approaches both play an important role in understanding ligand-receptor interactions within target classes. Predictive drug discovery tools offer a high degree of specificity in molecular design. The careful arrangement of structural features around a molecule necessitates efficient and practical approaches to modeling these privileged structures. We also used similar computational tools to study the 21 molecular new chemical entities (NCEs) approved for the first time by a governing body anywhere in the world during 2019 [22]. Out of 11 therapeutic areas, 10 were selected: anti-infective/antibiotic, cardiovascular and hematologic, neurological (central nervous system, CNS), dermatologic, inflammation and immunologic, metabolic, musculoskeletal, oncologic, reproductive, and respiratory drugs (see Figures 2 and 3). Osiris calculations were carried out to predict the toxicity risk of the drug molecules. The results showed drug-conform behavior for all studied drugs except triclabendazole, trifarotene, alpelisib and erdafitinib, which show high risks of undesired effects such as mutagenicity, tumorigenicity, irritating effects and reproductive effects [23].
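Bioavailability screening of the kind described here is often summarized as a count of rule violations. The chapter does not state which criteria were applied, so the sketch below uses Lipinski's rule of five purely as an illustrative assumption: molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors and at most 10 hydrogen-bond acceptors. The function name is hypothetical, and the property values are assumed to have been computed beforehand by an external tool.

```python
def rule_of_five_violations(mol_weight: float, log_p: float,
                            h_donors: int, h_acceptors: int) -> int:
    """Count Lipinski rule-of-five violations for one molecule.

    Inputs are assumed to be precomputed descriptors; this function only
    applies the thresholds and does not calculate any property itself.
    """
    checks = [
        mol_weight > 500.0,   # molecular weight above 500 Da
        log_p > 5.0,          # octanol-water partition coefficient above 5
        h_donors > 5,         # more than 5 hydrogen-bond donors
        h_acceptors > 10,     # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)
```

A compound with at most one violation is usually regarded as a reasonable candidate for oral dosing. The toxicity risk flags reported by Osiris are a separate assessment and are not reproduced by this check.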
Understanding the various types of interaction between a drug-like molecule and its target is also crucial. The possible interactions between a drug and its target consist of covalent bonds, dipole-dipole interactions, ion-dipole interactions, ionic interactions, hydrogen bonding, hydrophobic interactions and charge transfer. The mechanism of drug action can be explained in part by ionic interactions, since several functional groups are ionized at physiological pH. The weak ion-dipole and dipole-dipole interactions both play a significant role in drug-receptor binding. Other weak interactions, namely hydrogen bonds, hydrophobic interactions and charge transfer, exist between drug and receptor and stabilize the drug-receptor binding. DFT is used to understand the reaction mechanisms of the drug molecule. Various computational tools are used to calculate precisely the transition state for drug-target complexes. The dipole moment (DM) is also an important parameter, used to explain observable chemical and physical properties of drug molecules. DM is used to assess the cell permeability and oral bioavailability of drugs, as complexes with a large dipole moment are more soluble in water and less likely to be absorbed through lipophilic membranes [24,25]. DM is included as a highly relevant descriptor in explaining the catalytic activity of enzymes in quantitative structure-activity relationship (QSAR) or quantitative structure-property relationship (QSPR) studies. Examples include QSAR modeling of aromatase inhibition [26], antifungal activity [27], and HIV-1 protease/cyclin-dependent kinase inhibition [28], and the estimation of micellar properties such as drug loading capacity (LC) [29] in QSPR models.
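As an illustration of how a dipole moment descriptor can be obtained cheaply, the sketch below evaluates the point-charge expression, in which the dipole vector is the sum of each partial charge multiplied by its position. It assumes that atomic partial charges (in units of e) and coordinates (in Å) are already available; this is a classical approximation, not the DFT dipole computed from the electron density, and the function name is hypothetical.

```python
import math

# Conversion factor: 1 e*Angstrom expressed in debye.
E_ANGSTROM_TO_DEBYE = 4.80320


def dipole_moment_debye(charges, coords):
    """Magnitude of the point-charge dipole moment in debye.

    charges: partial charges in units of the elementary charge e
    coords:  (x, y, z) positions in Angstrom, one tuple per charge
    """
    if len(charges) != len(coords):
        raise ValueError("need one coordinate triple per charge")
    mx = sum(q * x for q, (x, _, _) in zip(charges, coords))
    my = sum(q * y for q, (_, y, _) in zip(charges, coords))
    mz = sum(q * z for q, (_, _, z) in zip(charges, coords))
    return E_ANGSTROM_TO_DEBYE * math.sqrt(mx * mx + my * my + mz * mz)
```

For a neutral molecule the result does not depend on the choice of origin. For charged species the origin must be fixed by convention, for example at the center of mass.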
Since DFT calculations are computationally too demanding for most large-scale virtual screening explorations, or for incorporation in fast QSAR or QSPR models, the relevant parameters are calculated by empirical or machine learning (ML) methods.
ML trained on data precalculated by DFT has emerged as a successful method for drugs, as the results are highly accurate and are obtained faster than with previous approaches [30,31]. A new class of atomistic simulation techniques combining machine learning (ML) with simulation methods based on quantum mechanical (QM) calculations has emerged in recent decades. These methods can dramatically increase the computational efficiency of QM-based simulations and make it possible to reach the large system sizes and long timescales required to access properties relevant to the drug industry. Machine learning tools provide early-stage filtering to identify promising drug molecules for further screening by computationally more intensive methods. The core quantity of atomistic simulation is the potential energy surface (PES), a high-dimensional function that is the basic ingredient of Monte Carlo (MC) simulations. Simulation methods are more or less computationally efficient depending on the degree of physical approximation. Force fields such as AMBER [32], CHARMM [33], GROMOS [34] and OPLS [35] are computationally very efficient, since they employ simple pairwise interaction terms and fixed atomic charges, and they are used for molecular mechanics (MM) based methods in the drug discovery pipeline [36]. QM simulations are fully reactive and can describe the complex bonding patterns, polarization effects and charge transfer processes that govern the behavior of biological systems [37]. Various machine learning tools, such as artificial neural networks (ANN), support vector machines (SVM) and genetic programming, have been explored to predict inhibitors, blockers, agonists, antagonists, activators and substrates of proteins related to specific therapeutic targets. These methods are used to screen compound libraries of diverse chemical structures and to handle "noisy" and high-dimensional data; they complement QSAR methods and, where the 3D structure of the receptor is unavailable, structure-based methods.
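The idea of training an ML model on precalculated DFT data can be sketched with a small surrogate model. The example below uses kernel ridge regression with a Gaussian kernel, written in plain Python so that it runs without external libraries. The choice of model, the function names and the training values are all illustrative assumptions: the numbers are synthetic placeholders rather than DFT results, and the chapter does not prescribe a particular algorithm.

```python
import math


def gaussian_kernel(a, b, width):
    """Gaussian similarity between two descriptor vectors."""
    dist2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-dist2 / (2.0 * width ** 2))


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_krr(features, targets, width=1.0, ridge=1e-8):
    """Return regression weights alpha solving (K + ridge * I) alpha = targets."""
    n = len(features)
    kernel = [
        [gaussian_kernel(features[i], features[j], width)
         + (ridge if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return solve_linear(kernel, targets)


def predict_krr(features, weights, query, width=1.0):
    """Predict the target property for a new descriptor vector."""
    return sum(w * gaussian_kernel(f, query, width)
               for f, w in zip(features, weights))


# Synthetic placeholder data: one descriptor per molecule and a made-up property.
train_x = [[0.0], [1.0], [2.0], [3.0]]
train_y = [0.0, 1.0, 4.0, 9.0]
weights = fit_krr(train_x, train_y)
estimate = predict_krr(train_x, weights, [1.5])
```

Once the weights are fitted, each prediction costs only one kernel evaluation per training molecule, which is what makes such surrogates attractive for early-stage filtering. In practice the descriptors would be higher-dimensional and the kernel width and ridge term would be chosen by cross-validation.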
Several open-access chemical databases, such as PubChem, ChemBank, DrugBank and ChemDB, are used in the virtual screening of drug molecules. DeepVS, applied to the docking of 40 receptors and 2950 ligands, showed exceptional performance when 95000 decoys were tested against these receptors [38]. In another study, a multiobjective automated replacement algorithm was used to optimize the potency profile of a cyclin-dependent kinase-2 inhibitor by assessing its shape similarity, biochemical activity and physicochemical properties [39]. GLORY, an innovative tool, was used to predict the metabolism of molecules by identifying the chemical structures of metabolites formed by the cytochrome P450 (CYP) enzyme family [40]. In another approach, drug combination synergy was studied using the largest available dataset reporting synergism of anticancer drugs (NCI-ALMANAC, with over 290,000 synergy determinations) [41].
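Ligand-based virtual screening of such databases often starts with a similarity search. A minimal sketch follows, assuming each molecule has already been encoded as a set of "on" fingerprint bits by some external tool. It ranks a library by the Tanimoto coefficient, a common fingerprint similarity measure that is not the shape similarity used in [39]. The function names and library layout are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_library(query_fp, library):
    """Rank library molecules by similarity to the query, most similar first.

    library: dict mapping a molecule name to its set of on-bits
    """
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

The top-ranked entries would then be passed on to more expensive steps such as docking. The similarity cutoff is a project-specific choice and is not fixed here.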
The vast chemical space, comprising >10^60 molecules, fosters the development of a large number of drug molecules [42,43], but its size can also limit the drug development process, making it time-consuming and highly expensive. AI is therefore used, as it can recognize hit and lead compounds and provide quicker validation of the drug target and optimization of the drug structure design [42][43][44] (see Figure 4). AI can aid rational drug design [45]; assist in decision making; determine the right therapy for a patient, including personalized medicines; and manage the clinical data generated and use it for future drug development [46]. AI has made major contributions to the incorporation of the developed drug into its correct dosage form and to its optimization, in addition to aiding quick decision-making, leading to faster manufacturing of better-quality products. Robotic synthesis could eventually provide a fully automated drug discovery pipeline driven by AI [47,48]. AI-based approaches can also contribute to the safety and efficacy of the product in clinical trials, and help ensure proper positioning and costing in the market through comprehensive market analysis and prediction.

Conclusions
The world of science has changed, and there is no question about it. In the new model, data are captured by instruments or generated by simulations, then processed by software, and the resulting information or knowledge is stored in computers. The continued improvement of ML methods in chemistry, which compete with standard computational approaches and expertise, is continuously advancing modern computational medicinal chemistry. Machine learning potentials are capable of carrying out high-throughput calculations on millisecond time scales with DFT accuracy, and help to avoid false positives and false negatives. De novo molecular design gives accurate predictions of lead compounds to target for simulation, effectively narrowing the search space for high-throughput screening applications. The advancement of AI and its tools continuously aims to reduce the challenges faced in the drug development process and over the overall lifecycle of the product, as the healthcare sector faces several complex challenges, such as the increased cost of drugs and therapies, and society needs significant changes in this area. Although specific challenges remain with regard to the implementation of this technology, it is likely that AI will become an invaluable tool in the pharmaceutical industry in the near future. The vast knowledge of physics needs to be utilized to improve these advanced techniques and tools without sacrificing speed and accuracy.