Tools for ADME evaluation.
Drug discovery and development pipelines increasingly rely on in vitro testing and in silico predictions to reduce investment and optimize lead compounds. A comprehensive set of in vitro assays is available to determine key parameters of absorption, distribution, metabolism, and excretion (ADME), for example, lipophilicity, solubility, and plasma stability. Such test systems aid the evaluation of the pharmacological properties of a compound and serve as surrogates before in vivo testing and clinical trials. Computer-aided techniques are now employed not only in the discovery of new lead compounds but throughout the entire drug development process, where ADME profiling and big data analyses add a new layer of complexity. Herein, we give a short overview of the history of the drug development pipeline and present state-of-the-art in vitro ADME assays as established in academia and industry. We further introduce the underlying good practices and give an example of a compound development pipeline. Finally, we highlight recent advances in in silico techniques, with special emphasis on how pharmacogenomics and in silico PK profiling can enhance drug monitoring and the individualization of drug therapy.
- drug discovery
- in silico prediction
- pharmacokinetics prediction
Drug discovery and development has grown into a broad interdisciplinary field over the last decades, and many factors contribute to the successful evolution of a bioactive compound, the so-called new molecular entity (NME), into a potential drug. Herein, we discuss the drug discovery and development (DDD) process as far as pharmacokinetic profiling in terms of ADME assessment is concerned. We provide a short overview of the state-of-the-art in vitro, ex vivo, and in vivo techniques used in academia and industry, with special emphasis on how recent advances in computer science have paved the way for in silico prediction in the DDD process for small molecules. A discussion of the whole topic is beyond the scope of this review, which aims only to give insights into the principal process of (computer-aided) drug discovery and development.
Current estimates for pharmaceutical DDD suggest that at most ten out of a thousand screened hits yield optimized leads that enter preclinical testing, and that a candidate has only a 9.6% chance of passing the clinical testing phase [1, 2]. Additionally, the drug approval process is estimated to last 15 years on average, with the major expenses incurred in phases II and III of clinical trials, which highlights how costly a failure in (pre)clinical testing is [3, 4, 5, 6]. The overall DDD cost per drug can reach ~$2.56 billion before approval, rising to $2.87 billion when postapproval investments are included [6, 7, 8]. From the initial small molecule screened as a hit to the optimized lead, a variety of in vitro tests are performed to assess efficacy and safety, but also to find structure-activity relationships (SAR), which can then be connected to specific physicochemical properties of the compound and further aid the lead optimization phase [8, 9, 10].
The drug development phase starts with preclinical testing, followed by the clinical stage comprising phase I–III human trials. Each phase aims to answer a specific question. Initially, preclinical trials are conducted in animals and provide information about whether a drug is toxic. Compounds that show no toxicity in animals then advance to phase I trials, which study whether the drug is also safe in healthy humans and provide an initial idea of the appropriate dosage. In phase II, the efficacy of the drug is examined in parallel with potential side effects to answer the question of whether it meets the expected performance in principle. Phase II presents the biggest hurdle, with a transition success rate as low as 30%. Ultimately, drug candidates enter clinical phase III, in which the preliminary results found so far need to be confirmed and any adverse reactions monitored to make sure that the drug really helps treat the disease [2, 11].
From the generation of a lead compound onward, the assessment and optimization of pharmacokinetic properties and their correlation with pharmacodynamic effects grow in importance, since pharmacokinetics is one of the three major causes of attrition alongside toxicity and efficacy [8, 12]. It is therefore not surprising that the period between the lead and the clinical candidate is sometimes referred to as the “valley of death” because of the frequent failures and dead ends during this part of the DDD process, which result in high costs and missed deadlines.
2. Role of computer-aided techniques in drug discovery
In a long ongoing effort, more and more in silico techniques are being integrated at several points of DDD for different purposes. In silico techniques can ease SAR assessment as well as the generation of compound series by guiding combinatorial chemistry, since they allow fast and easy evaluation of compounds from large libraries prior to synthesis. For instance, combinatorial chemistry offers a way to readily produce a broad range of potentially pharmaceutically active small molecules in a short time, while SAR data in combination with mathematical algorithms, such as regression-based or machine-learning-based approaches, allow the potential effects of analogues and derivatives to be estimated a priori from their structures.
The latter approach can save time and resources by eliminating, at an early stage, molecules with predicted low efficacy against the target, or by suggesting the next round of chemical modifications [14, 15]. Still, lead generation and/or optimization will eventually also include in vivo testing once no toxic side effects have been observed in vitro. In vivo efficacy testing is carried out as a proof of concept, followed by PK assessment and ultimately animal models of human disease to find correlations between preliminary data and potential performance later on in humans.
In silico ADME prediction aims to generate tools and models based on experimental data to calculate in vivo behavior of compounds by finding quantitative structure-property relationships (QSPRs), which connect structural information to physical and chemical characteristics or even biological behavior (quantitative structure-activity relationship; QSAR). Gained empirical data are then related to descriptors/properties thereby supporting the process of hit-to-lead optimization [10, 16].
When using in silico methods for prediction, it is important to keep in mind that the algorithms and tools applied are only models and are thus only as good as the data and ideas they are based on. This implies continuous experimental validation and improvement as a basic principle, supported by an interdisciplinary team. Frequently used models include QSPR predictors, matched molecular pair (MMP) analysis, and data trend analysis, since they are comparatively easy to apply and are based on a large amount of (end)point data. Not all data are equally useful, however: some experiments offer highly convenient data but do not contribute much to model design, whereas others show high variability but lead to impactful models. Considering the nature of the data, it is important to know which types can be pooled from different sources (low-variability biological and activity data or homogeneously calculated chemical descriptors) and which should be taken from a single source only (Caco-2, MDCK). A sound approach to generating reliable data, or to determining differences between individual experiments, is to use assays with control compounds. The target property must be obtained under the same experimental conditions and, in the best scenario, in the same laboratory to avoid interlaboratory and interpersonal data noise.
Furthermore, the number and type of molecular descriptors chosen have a high impact, since they influence the accuracy and interpretability of the model. One might expect that using the maximum number of descriptors would be beneficial, but in reality this increases the risk of overfitting the data and of losing interpretability. A “good” model therefore needs a compromise between the quality and the quantity of descriptors. It is also crucial to train and test a model and to evaluate its predictive ability by different means, such as statistical measures and internal and external validation as recommended by organizations such as the OECD, including outlier analysis to reduce the noise in the model. Adequate validation methods have been reviewed extensively elsewhere.
As a result of recent advances in computational capability, more complex models and algorithms can now be applied. Despite this, it remains a challenge to model the pharmacokinetic and pharmacodynamic phenomena and interactions within an organism as complex as a mammal, let alone a human. Finally, notwithstanding its apparent linearity, the development of a new chemical entity into a drug is an iterative process, all the more so where modeling is concerned, with data from failed attempts being integrated into new predictions.
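To make the idea of internal validation concrete, the following minimal sketch compares the fitted R2 of a one-descriptor linear model with its leave-one-out cross-validated Q2. It is an illustration only: the function names, the single-descriptor model, and the definition Q2 = 1 − PRESS/SS_tot (with SS_tot taken around the mean of the full dataset) are assumptions made here, not a procedure taken from the cited guidance.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of ys."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def fitted_r2(xs, ys):
    """R2 of the model fitted on, and evaluated against, all data."""
    slope, intercept = fit_line(xs, ys)
    return r_squared(ys, [slope * x + intercept for x in xs])


def loo_q2(xs, ys):
    """Leave-one-out Q2: each point is predicted by a model
    that was fitted without that point."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return r_squared(ys, preds)


# Hypothetical descriptor values and measured property values.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
measured = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]
print("R2:", fitted_r2(descriptor, measured))
print("Q2:", loo_q2(descriptor, measured))
```

A Q2 that falls well below R2 is one common warning sign of overfitting; external validation on compounds never used for model building remains a separate step.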
3. How specific parameters shape the pharmacology studies
Pharmacology is a major part of the DDD process and describes the interaction between an organism and a drug. It can be divided into two main branches: pharmacodynamics (PD) describes what the drug does to the body, whereas pharmacokinetics (PK) describes what the body does to the drug. The main processes of PK are absorption, distribution, metabolism, and excretion (ADME), complemented by toxicity (ADMET). While ADME optimization tries to maximize the pharmacological performance of a small molecule, toxicology aims to ensure that it causes no harm through side effects of any kind.
The big hurdle to overcome is to combine appropriate physicochemical properties of the drug, which would drive its interaction with the organism and show biological activity . Or according to Hodgson: “A chemical cannot be a drug, no matter how active nor how specific its action, unless it is also taken appropriately into the body (absorption), distributed to the right parts of the body, metabolized in a way that does not instantly remove its activity, and eliminated in a suitable manner—a drug must get in, move about, hang around, and then get out” .
As already reviewed and as suggested by the FDA, PK/PD assessment is one of the main targets for optimization in the drug development process. This view is widely shared: whereas ADME evaluation was previously addressed in the late stages of preclinical development, it has now become a major concern throughout the whole DDD process, from the very beginning of drug discovery to the very last steps of lead optimization [21, 24].
For each step in the drug’s path through the body, several parameters determine the fate of the drug. Ideally, each of those parameters would be addressed directly and individually. Unfortunately, addressing every potential parameter experimentally is not feasible in a reasonable time, owing to the complexity of the human body, where all those parameters influence each other. This is not restricted to mechanisms between different compartments within the body but also extends to interpersonal variations introduced through gender, age, genetic state, disease, etc. As an approximation, most of the important variables are evaluated indirectly by either models or surrogates (Table 1). To characterize the properties of compounds, facilitate calculations, and allow standardization between experiments, descriptors are introduced as numerical representations encoding aspects of the chemical information of a molecule. Examples of descriptors and properties include molecular weight and the number of H-bond donors/acceptors; they can be obtained directly from experiments or generated by computational techniques.
Table 1 (columns: In vitro, Ex vivo/cells, In vivo, In silico). Entries in source order:
- Dissolution and solubility
- Binding and expression of transporters
- Inhibition of efflux pumps
- HSA-coupled (RP-)HPLC
- Plasma protein binding
- Plasma protein binding
- Activity and expression of transporters
- Activity and expression of transporters
- Urine analysis
- Half-life prediction
For instance, although an ideal approach to PK profiling would also reflect the kinetics of drug administration and the concentration at the site of action, most in vivo systems rely on plasma sampling as a medium of drug equilibrium since plasma is easily accessible. As a consequence, results are highly influenced by intrinsic and extrinsic factors such as the interpersonal variances stated above.
Each compound possesses individual physicochemical properties, such as solubility or lipophilicity, which are influenced by biochemical properties of the body such as the different pH values of tissues. Even similar compounds will behave differently, and it is futile to address in vivo behavior without preliminary knowledge of the basic PK parameters in vitro.
Furthermore, every PK assessment varies depending on the route of administration and requires different models and assays. While some routes, like oral and transdermal administration, depend on absorption mechanisms, others (e.g., intravenous) directly target the bloodstream, so that bioavailability is essentially 100%. Hereafter, we discuss the parameters relevant to oral administration of small molecules, the most common route owing to advantages such as reliability, safety, and price, together with the corresponding experimental approaches and the most common prediction methods [27, 28].
Passive transport across membranes is characterized by permeability, which depends on lipophilicity since biological membranes are essentially lipid bilayers, and is by far the most important transport route for small molecules, especially in oral absorption [8, 24, 29]. Nonlipophilic compounds normally do not traverse membranes passively, while highly lipophilic molecules run the risk of getting stuck within the membranes.
Lipophilicity is measured by the logarithm of the partition coefficient (logP) and of the distribution coefficient (logD). LogP refers to the partitioning of the un-ionized species only, whereas logD accounts for all species, ionized and un-ionized, at a given pH. Both are normally determined for n-octanol/water, representing an organic and an aqueous phase, respectively [21, 26].
Ionizability and lipophilicity provide a strong indication of whether a compound is likely to be orally absorbed.
Ultimately, the molecular size of the compound is also involved in successful absorption because of the aforementioned effects on permeability and solubility. Usually, increasing the molecular weight by adding new chemical moieties decreases solubility in aqueous solutions, and while large lipophilic compounds partition passively across membranes (transcellular), small charged molecules can also cross the epithelium via tight junctions (paracellular). For oral absorption in terms of permeability, Lipinski and collaborators proposed as early as 1997 [33, 34] that orally active compounds should meet at least three of four criteria: molecular weight ≤ 500 g mol−1, logP ≤ 5, no more than 10 hydrogen bond acceptors, and no more than 5 hydrogen bond donors. This is the well-known Lipinski’s rule of 5 (Ro5). In other words, the Ro5 defines a physicochemical space outside of which molecules have a low probability of being orally active. Other rules, for example those of Veber, Daina and Zoete, Egan and collaborators, Lovering et al., and Ritchie and colleagues, also include other properties such as the sum of hydrogen bond acceptors and donors, rotatable bond count, polar surface area, number of aromatic rings, and fraction of sp3 carbon atoms.
Despite the criticism and overinterpretation of the Lipinski and derived rules, the influence of physicochemical parameters on oral bioavailability and related parameters (such as logP and aqueous solubility) is notable. Moreover, these rules are still employed in virtual screening campaigns to reduce the number of compounds from massively large libraries (e.g., ZINC, which contains more than 750 million compounds) [40, 41, 42]. Furthermore, those initial steps instigated the generation of more complex models to predict not just oral bioavailability but also other PK-related parameters, both indirectly related properties such as Caco-2 permeability, aqueous solubility, and logP and direct parameters such as intestinal absorption, metabolism, and clearance.
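As an illustration of how such a rule-based filter can be applied in a virtual screening step, the sketch below counts Ro5 violations using the commonly cited thresholds (molecular weight ≤ 500 g mol−1, logP ≤ 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The `Descriptors` container, the function names, and the example values are hypothetical; in practice the descriptors would come from a cheminformatics toolkit or from experiment.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float      # g/mol
    logp: float            # octanol/water partition coefficient
    hbond_donors: int
    hbond_acceptors: int


def ro5_violations(d: Descriptors) -> int:
    """Count how many of the four Ro5 criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.hbond_donors > 5,
        d.hbond_acceptors > 10,
    ])


def passes_ro5(d: Descriptors, max_violations: int = 1) -> bool:
    """A compound passes if it meets at least three of the four criteria,
    i.e., has at most one violation."""
    return ro5_violations(d) <= max_violations


# Illustrative values only, not measured data.
small_molecule = Descriptors(mol_weight=180.2, logp=1.2,
                             hbond_donors=1, hbond_acceptors=4)
print(ro5_violations(small_molecule), passes_ro5(small_molecule))
```

Keeping the violation count separate from the pass/fail decision makes it easy to tighten or relax the filter (for example, `max_violations=0`) without touching the thresholds.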
3.1 Aqueous solubility and lipophilicity
As already mentioned, ionizability is one of the most important properties in PK, thus making pKa the physicochemical property with the highest impact.
Early attempts to increase the efficiency of pKa evaluation were reported by Morgan and colleagues, who scaled down the classical titration and spectrophotometric methods by introducing microscale versions.
These alterations, however, could not overcome the principal limitations of each technique, namely moderate precision and frequent calibration (potentiometric) and the need for a chromophore within the analyte (spectrophotometric). Starting in 1998, capillary electrophoresis (CE) was used effectively to determine the pKa of many compounds and was further developed by Pfizer, which implemented pressure-assisted capillary electrophoresis (PACE) as a standard method that is nowadays readily applied in industry settings and shows superior features compared to the aforementioned methods [44, 45]. Other variants, such as vacuum-assisted multiplexed capillary electrophoresis (VAMCE), also exist. A different approach better suited for high-throughput screening (HTS) is pH gradient titration, offered by Sirius Analytical Instruments, but it is still limited by its reliance on UV spectroscopy.
It is well established that solubility in aqueous media is one of the most important physicochemical properties to be evaluated for oral administration. It is not only necessary for absorption in the GI tract but also a requirement for almost all in vitro and in vivo assays, which depend on a dissolved compound. Poor solubility can affect the reproducibility of assay results by introducing high variability and can further increase the development costs of leads [26, 47]. Traditionally, solubility measurements were conducted via labor-intensive potentiometric techniques or as equilibrium solubility (thermodynamic; e.g., shake flask). HTS alternatives comprise laser nephelometric scans (kinetic) and LC-MS/HPLC techniques, which can also be performed with DMSO solutions of the compound, the standard for HTS applications [47, 49]. It should be noted, though, that aqueous solubility as described above is not an optimal model for GI solubility since it does not consider the composition of the GI fluids.
Lipophilicity, generally speaking, is the ability of a compound to dissolve in lipids and/or organic solvents and thus to pass biological membranes. Descriptors for lipophilicity are the logarithm of the partition coefficient (logP) or of the distribution coefficient (logD). Classically, logP was determined using the shake flask method with n-octanol/water phases. Later, UV spectroscopy became the standard, which unfortunately is not applicable to compounds without absorption in the UV range. Today, RP-HPLC methods are frequently used owing to their superior properties [25, 51]. As with many methods, results obtained by RP-HPLC under different conditions and in different laboratories are difficult to compare. One solution is to use a standardized lipophilicity value, for example, the chromatographic hydrophobicity index (CHI).
In recent years, a great effort has been made to improve the ability of in silico models to accurately predict aqueous solubility. One of the most developed models is Yalkowsky and Jain’s general solubility equation (GSE), which is based on the melting point (m.p. in °C, entering as m.p. − 25) and logP (the octanol-water partition coefficient of the un-ionized molecule) of a chemical substance (Eq. (1)). It showed considerable predictive power, with a coefficient of determination (R2) of 0.96 and a root-mean-square error (RMSE) of 0.53 on a dataset of 1026 organic compounds.
General solubility equation as proposed by Yalkowsky and Jain, where S_w is the molar aqueous solubility and m.p. is the melting point in °C:
\[ \log S_w = 0.5 - 0.01\,(\mathrm{m.p.} - 25) - \log P \qquad (1) \]
Modifications of the GSE have been proposed, for instance the SCRATCH model, which replaces the melting point by the molar aqueous activity coefficient with comparable accuracy (R2 = 0.956, RMSE = 0.859 on a dataset of 883 compounds). Ali and collaborators suggested replacing the melting point descriptor of the GSE with the topological polar surface area (TPSA), aiming to overcome the issues with compounds with high melting points and to explicitly take into account the effect of polar and polarizable atoms on aqueous solubility.
The argument that real drugs are actually more soluble than drug-like molecules filtered by Lipinski’s rule of five steered studies toward more complex models. Indeed, quantitative structure-property relationship (QSPR) models correlating aqueous solubility with various molecular descriptors are now often employed. As an example, Chevillard et al. reported the use of a random forest protocol to select, for each compound, the most accurate model among several available in commercial or free software packages. They report that this multimodel approach can enlarge the applicability domain, given that more accurate solubility predictions were obtained than with individual models. This approach agrees with other reports that a consensus of local QSAR models can generate predictive workflows, especially for datasets with large structural diversity [58, 59]. It is worth noting that Lipinski himself recently revisited his own rules in view of new potential classes of drugs, such as natural products, peptide-like molecules, and fragments, which, despite their validated effects, defy the original Ro5 limits.
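The GSE is commonly written as log S = 0.5 − 0.01 (m.p. − 25) − logP, with S in mol/L and the melting point in °C. The following sketch evaluates it for a single compound. The function names are hypothetical, and setting the melting-point term to zero for compounds that are liquid at 25 °C is a convention assumed here rather than something stated in the text.

```python
def gse_log_solubility(melting_point_c: float, logp: float) -> float:
    """Estimate log10 of the molar aqueous solubility with the GSE.

    melting_point_c: melting point in degrees Celsius
    logp: octanol/water partition coefficient of the un-ionized molecule
    """
    # Assumption: for liquids (m.p. below 25 C) the crystal term is dropped.
    crystal_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * crystal_term - logp


def gse_solubility_mol_per_l(melting_point_c: float, logp: float) -> float:
    """Convert the GSE estimate from log units to mol/L."""
    return 10.0 ** gse_log_solubility(melting_point_c, logp)


# Hypothetical compound: m.p. 125 C, logP 2.0
print(gse_log_solubility(125.0, 2.0))
print(gse_solubility_mol_per_l(125.0, 2.0))
```

For the hypothetical compound above, the arithmetic is 0.5 − 0.01 × 100 − 2.0 = −2.5 log units, i.e., roughly 3 mmol/L. Because only two inputs are needed, the estimate is cheap, but it inherits any error in the measured or predicted melting point and logP.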
3.2 Ionization state and pKa prediction
Early pKa measurement proves beneficial in lipophilicity assessment since logD values at any pH can be calculated from the pKa and logP values [25, 50]. Although octanol/water partitioning resembles partitioning into most components of the body, not all biological partition processes (e.g., crossing the blood-brain barrier or gastrointestinal absorption) can be easily modeled by it.
The prediction of the ionization state of compounds, which is indicated by the pKa value, is relevant for deriving several other physicochemical and ADME properties of drugs, including solubility, lipophilicity, and the pharmacokinetic profile. pKa prediction can be used at two different stages of DDD: at the beginning, with fast models for larger libraries, intended to generate all possible state populations of particular compounds, and/or later on, with more refined but computationally expensive semiempirical and density functional theory (DFT) methods, with which more accurate ionization states can be obtained. Examples of fast prediction methods for ionization states that are available as computer programs are SPARC, MoKa, and Epik, which use the Hammett and Taft approaches for the pKa prediction. Once smaller subsets of molecules are being addressed, semiempirical or DFT methods were reported to accurately incorporate structural features and diversity into the pKa prediction [64, 65].
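The calculation of logD from logP and pKa can be sketched for the simplest case of a compound with a single ionizable group. The sketch assumes that only the un-ionized species partitions into the octanol phase, which gives logD = logP − log10(1 + 10^(pH − pKa)) for a monoprotic acid and logD = logP − log10(1 + 10^(pKa − pH)) for a monoprotic base. The function name and the example values are hypothetical.

```python
import math


def logd_monoprotic(logp: float, pka: float, ph: float, kind: str) -> float:
    """logD of a compound with one ionizable group at a given pH.

    Assumes that only the neutral species partitions into octanol.
    kind: "acid" or "base"
    """
    if kind == "acid":
        exponent = ph - pka      # ionized fraction grows with rising pH
    elif kind == "base":
        exponent = pka - ph      # ionized fraction grows with falling pH
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return logp - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical weak acid (logP 3.0, pKa 4.0) along a pH range
for ph in (1.0, 4.0, 7.4):
    print(ph, round(logd_monoprotic(3.0, 4.0, ph, "acid"), 2))
```

Well below the pKa of an acid, logD approaches logP; at pH = pKa it is lower by log10(2), about 0.3 units; and above the pKa it falls by roughly one unit per pH unit. Compounds with several ionizable groups, or with ion-pair partitioning, need a more detailed treatment.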
3.3 Permeability and the use of cellular and noncellular models
As already seen, lipophilicity (logP, logD) is highly involved in membrane permeability. Apart from the already described in vitro methods for logP and logD determination, systems for ex vivo/in situ but also in vivo assessment exist as “direct empirical” determination of permeability. When talking about permeability, the difference between passive diffusion and transporter-mediated active transport needs to be considered.
Cell culture methods have been applied to study intestinal absorption for several decades. Finding the correct model or cell line is crucial to assess the desired parameters such as passive or active transport. In general, cell culture approaches cannot distinguish between the different transport mechanisms, but several models exist that shift the focus to one of them.
Two main cell lines are in use as models for intestinal absorption: Caco-2 and MDCK cells. Caco-2 cells are derived from a human colorectal carcinoma and possess many of the typical properties of the small intestine, therefore representing a well-established and validated assay system for absorption, permeability, and secretion studies [21, 67]. This assay is mainly used for rank ordering of compounds in terms of oral absorption and permeability in early phases of drug design. Unfortunately, results obtained in different batches and laboratories vary heavily for several reasons, which makes the use of control compounds necessary and represents a drawback of the technique. Additional disadvantages include long preparation times (about 3 weeks) and the inability to evaluate specific permeation mechanisms. Caco-2 assays are usually used as a primary assay, followed and complemented by other in vitro and ex vivo methods. Recently, a 3D version of the Caco-2 assay, the “Caco-2 3D spheroid permeability assay,” was reported to increase the overall performance and the correlation with human in vivo data.
As stated above, transcellular permeation either occurs passively via diffusion of lipophilic molecules or is driven by membrane transporters. Important transporters include the ATP-dependent efflux transporters MRP2, BCRP, and P-gp on the luminal membrane and the organic solute transporter and the multidrug resistance protein 3 (MRP3) on the basolateral membrane.
Madin-Darby canine kidney (MDCK) cells are an alternative to Caco-2 cell-based assays and the next most common cell line for assessing passive permeability as well as drug-receptor interactions. MDCK cells are also well suited for transfection and overexpression experiments with human transporters and receptors owing to the lack of P-glycoprotein [68, 71]. For instance, the MDCK-MDR1 cell line overexpresses the multidrug resistance protein 1 (MDR1, P-glycoprotein) and can be used in concert with other cell-based assays to specifically address the influence of MDR1 in drug efflux.
Immobilized artificial membranes (IAMs) were used very early on for lipophilicity determination and have regained interest in recent years for direct permeability measurements. IAMs are also used intensively in the measurement of the volume of distribution to mimic in vivo binding to phospholipids and phospholipid bilayers (membranes). Therefore, IAMs are discussed in more detail in the following section.
The parallel artificial membrane permeability assay (PAMPA) is a cheap and fast in vitro alternative to cell-based assay systems. A very comprehensive review of recent PAMPA methodologies and applications is available. In principle, PAMPA was developed to replace cell-based systems (Caco-2, MDCK, etc.) for passive permeability evaluation, which are error-prone, more difficult, and labor and time intensive and tend to report false negatives. Another advantage of PAMPA over conventional cell-based assays is the ability to selectively measure passive permeability, whereas in cell-based systems the influence of membrane transporters cannot be excluded. PAMPA assays can be readily applied in high-throughput processes, and different variants exist to address ionic and H-bonding interactions with membranes that influence permeability and to complement the use of Caco-2 and other cell assays. Bermejo and colleagues also showed a significant correlation between Caco-2, in situ rat perfusion, and PAMPA assay data, underlining the applicability of the method for ADME assessment.
Although high-throughput applications of newly developed and standardized techniques allow an enormous amount of data to be gathered, it is crucial to also (cor)relate the physicochemical and biomimetic properties to structural features of the compound. This facilitates the development of QSPRs and allows the construction of in silico models that ultimately guide medicinal chemistry efforts.
When dealing with oral administration, it is important to note that the drug not only faces the hurdles of solubility and permeability in the absorption process but also metabolizing mechanisms (i.e., enzymes) in the gastrointestinal tract, which are referred to as first-pass metabolism [24, 76]. These include but are not limited to P-glycoproteins, uridine diphosphate glucuronosyltransferase, and mainly cytochrome P450s (CYP450). This is discussed in more depth in the Metabolism section.
Permeability has a direct influence on the drug absorption rate and, as discussed, although several in vitro models are available (e.g., Caco-2, PAMPA, and MDCK), their high costs justify the use of in silico prediction. For example, a QSPR study developed on a large dataset of Caco-2 permeability data (1272 compounds) achieved good accuracy in predicting apparent permeability (R2 = 0.81 for the test set) using the polar volume, the number of hydrogen bond donors, and the surface area as main descriptors.
However, we are far from a model that can predict overall permeability; current work instead focuses on individual compartments and tissues, such as the gastrointestinal (GI) tract, skin, buccal membrane, and the blood-brain barrier (BBB). Since the first correlations of BBB permeability with logP in 1977, models to predict BBB permeability, particularly logBB (Eq. (2)), have greatly advanced. Current models using an array of machine-learning methods, such as multilinear regression, support vector machine (SVM), and artificial neural network (ANN), on a dataset of 320 unique compounds had good predictive power (R2 = 0.89). Shen et al. developed SVM models using 1593 compounds (1283 BBB+ and 310 BBB−) with different pattern selection methods and obtained an overall accuracy of 98.2%. Both methods are limited by unbalanced datasets (the number of BBB+ compounds exceeds the number of BBB− compounds in the training set), which was addressed in the work of Wang et al. by coupling resampling methods with machine-learning techniques to achieve an accuracy of 0.919 on external test data. Wang and collaborators compiled a dataset of 439 unique molecules, which was employed to generate a diverse set of QSAR models and a consensus model (R2 = 0.504 for external dataset prediction). They also reported the use of transporter profiles as additional biological descriptors to develop hybrid QSAR BBB models, with an improved correlation coefficient of R2 = 0.526 for external validation.
LogBB is calculated as the logarithm of the ratio between the concentration of a given chemical entity in the brain (Cbrain) and its concentration in the bloodstream (Cblood):
\[ \log BB = \log\left(\frac{C_{\mathrm{brain}}}{C_{\mathrm{blood}}}\right) \qquad (2) \]
Finally, beyond the usual ADME parameters of interest in DDD, several less common ones can also be predicted. One example is skin permeability, for which models evolved from simple diffusion models based on molecular weight and the n-octanol/water partition coefficient [83, 84] to more sophisticated ones, such as (non)linear QSPR models and even molecular dynamics simulations (as extensively reviewed in [85, 86]).
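The definition of logBB as the base-10 logarithm of the brain-to-blood concentration ratio translates directly into code. The sketch below is a minimal illustration; the function name and the input check are choices made here, and both concentrations are assumed to be given in the same units.

```python
import math


def log_bb(c_brain: float, c_blood: float) -> float:
    """Base-10 logarithm of the brain/blood concentration ratio.

    Both concentrations must be positive and in the same units.
    """
    if c_brain <= 0 or c_blood <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(c_brain / c_blood)


# Hypothetical measurements: a tenfold lower concentration in the brain
print(log_bb(0.5, 5.0))
```

Positive values indicate accumulation in the brain relative to blood and negative values the opposite. Any cutoff used to label compounds as BBB+ or BBB− is a modeling choice that differs between studies and is therefore left out of the sketch.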
4. ADME properties: experimental approaches and in silico models
Oral bioavailability is defined as the amount of drug that reaches the site of action after oral administration and is influenced by factors such as drug solubility and dissolution, chemical and enzymatic stability in the gastric and intestinal lumen, interacting luminal contents (food), gastrointestinal transit time, enterocyte permeability, and intestinal and hepatic metabolism. More recently, bioavailability has also been described as the rate and extent to which the drug reaches the systemic bloodstream, considering the initial formulation as the starting point.
Oral administration includes a pharmaceutical phase—prior to PK and PD phases—that comprises disintegration and dissolution of the dosage form. When using oral dosage forms, the shape and chemical composition (e.g., tablets) play an important role since they contribute to the time needed for disintegration and dissolution.
Following the pharmaceutical phase, absorption is the first step in the pharmacokinetic phase and is defined as the movement of the drug from the site of administration to the bloodstream. The main properties determining the rate of oral absorption for small molecules are permeability and solubility .
As such, the rate of dissolution and the degree of ionization, which are described by the Noyes-Whitney and Henderson-Hasselbalch equations, respectively, are key factors in lead optimization for oral administration and are complemented by lipophilicity as an additional factor influencing membrane permeation and solubility of the compound.
Dissolution can be expressed as a function of the aqueous solubility of a compound, the surface area of the administered tablet (or of the particles in other solid formulations), and a specific dissolution rate constant. Altering any of these parameters directly affects the dissolution profile. While solubility is an endpoint value indicating the amount of a compound that can be dissolved in a solvent, dissolution describes the kinetic process by which the compound dissolves.
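The dependence of dissolution on solubility, surface area, and a rate constant can be sketched with a simplified Noyes-Whitney form, dC/dt = k·A·(Cs − C). The sketch below is an assumption-laden illustration: the rate constant k lumps together the diffusion coefficient, diffusion layer thickness, and medium volume, the surface area is held constant, and all names and values are hypothetical.

```python
import math

def dissolution_rate(k: float, area: float, c_s: float, c: float) -> float:
    """Simplified Noyes-Whitney rate: dC/dt = k * A * (Cs - C)."""
    return k * area * (c_s - c)

def dissolved_concentration(k: float, area: float, c_s: float,
                            t: float, c0: float = 0.0) -> float:
    """Closed-form solution for constant k and A:
    C(t) = Cs - (Cs - C0) * exp(-k * A * t)."""
    return c_s - (c_s - c0) * math.exp(-k * area * t)

# Hypothetical parameters: doubling the surface area doubles the initial rate.
print(dissolution_rate(k=0.05, area=2.0, c_s=1.0, c=0.0))
print(dissolution_rate(k=0.05, area=4.0, c_s=1.0, c=0.0))
print(dissolved_concentration(k=0.05, area=2.0, c_s=1.0, t=10.0))
```

The rate vanishes as the dissolved concentration approaches the solubility Cs, which reflects the distinction between solubility as an endpoint and dissolution as a kinetic process.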
Ionization, on the other hand, reflects whether a compound is present in the charged or uncharged state and is influenced by at least two major parameters. The first is the pKa, also referred to as the aqueous ionization constant, a physicochemical property of the compound that determines its ionization state at a given pH. The second is the pH of the environment, which changes drastically along the GI tract, from about pH 1 in the stomach to about pH 8 in the ileum.
Determining the ionization state of a compound in the gastrointestinal system (stomach, jejunum, ileum, and colon) is crucial for absorption, since it influences not only the solubility of a compound but also its lipophilicity and permeability [26, 30, 89]. About 60–70% of all drugs (as of 1999) are ionizable, which underlines the role that ionization plays in ADME assessment [30, 90]. While charged molecules dissolve easily in aqueous systems (GI tract), they do not permeate membranes via passive diffusion and rely on active transport. The opposite is true for uncharged molecules, which pass biological membranes passively but show low solubility in aqueous solutions. Mechanisms of drug absorption include passive diffusion, active transport, and receptor-mediated endocytosis, which are influenced by different factors and can themselves influence bioavailability.
The absorption of a drug is a complex process to model and predict, since it is influenced not only by the physicochemical properties of the drug itself but also by the physiological state of the tissue in question. Accordingly, a large number of prediction models are available, built on the physicochemical properties involved in the absorption process, such as membrane permeability and drug solubility. These models can help formulation scientists to optimize drugs with poor absorption due to low aqueous solubility.
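The interplay of pKa and pH can be illustrated with the Henderson-Hasselbalch relationship. The following sketch computes the ionized fraction of a monoprotic acid or base; the function name, the `kind` argument, and the example pKa of 4.5 are hypothetical choices for illustration.

```python
def fraction_ionized(pka: float, ph: float, kind: str) -> float:
    """Ionized fraction of a monoprotic compound from Henderson-Hasselbalch.

    kind="acid": HA <-> A- + H+   (ionized form is A-)
    kind="base": BH+ <-> B + H+   (ionized form is BH+)
    """
    if kind == "acid":
        return 1.0 / (1.0 + 10 ** (pka - ph))
    if kind == "base":
        return 1.0 / (1.0 + 10 ** (ph - pka))
    raise ValueError("kind must be 'acid' or 'base'")

# Hypothetical weak acid (pKa 4.5) at the two ends of the GI pH range.
for ph in (1.0, 8.0):
    print(ph, round(fraction_ionized(4.5, ph, "acid"), 4))
```

Under these assumptions the weak acid is almost entirely uncharged at stomach pH and almost entirely charged at ileal pH, which is the trade-off between permeability and solubility that shapes oral absorption.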
Initial absorption models can be separated into dispersion and compartmental models. While dispersion models treat the GI tract as a continuous system with variable pH and surface area, compartmental models take into account physiological factors such as transporters. The compartmental absorption and transit (CAT) model was one of the first to consider distinct physiological properties, such as the minimal absorption in the stomach and colon, while assuming some mathematical simplifications, such as instant dissolution of the drug and linear kinetics. CAT was further modified into the advanced CAT (ACAT) model by including nonlinear absorption kinetics and the effects of first-pass metabolism. ACAT also divides the gastrointestinal tract into nine subsections, each with its own physicochemical properties, such as pH, solubility, particle size, and permeability. More recent developments cover absorption routes other than the GI tract, which have been included in commercially available software and applied, for example, in the development of sublingual zolpidem tablets. The absorption constant (Ka, expressed in h−1 or min−1), also called the first-order absorption rate constant (not to be confused with pKa), is employed in most of the aforementioned models and describes the change in mass of absorbable drug over time at the site of administration. Ka can be derived from the decrease over time in the amount of absorbable drug present at the site of administration; however, it is often determined indirectly from the drug amounts measured in the blood and/or urine.
Alongside physicochemical models, which are globally applicable, machine-learning techniques have been extensively employed to model absorption (as comprehensively reviewed by Kumar et al.). These tend to be local models, since they are mostly based on small, homogeneous datasets, which restricts their applicability domain.
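The role of the first-order absorption rate constant can be illustrated with a deliberately simple one-compartment model with first-order absorption and elimination. This is not the CAT or ACAT model; it is a textbook-style sketch in which the bioavailable fraction `f`, the elimination rate constant `ke`, the volume `v`, and all numerical values are hypothetical.

```python
import math

def amount_at_site(dose: float, f: float, ka: float, t: float) -> float:
    """Absorbable drug remaining at the administration site: F*D*exp(-ka*t)."""
    return f * dose * math.exp(-ka * t)

def plasma_concentration(dose: float, f: float, ka: float, ke: float,
                         v: float, t: float) -> float:
    """One-compartment plasma concentration after an oral dose (requires ka != ke)."""
    if math.isclose(ka, ke):
        raise ValueError("ka and ke must differ for this closed form")
    return f * dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))

def t_max(ka: float, ke: float) -> float:
    """Time of the peak plasma concentration."""
    return math.log(ka / ke) / (ka - ke)

# Hypothetical parameters: 100 mg dose, F = 0.8, ka = 1.0 1/h, ke = 0.2 1/h, V = 40 L.
peak_time = t_max(1.0, 0.2)
print(peak_time, plasma_concentration(100.0, 0.8, 1.0, 0.2, 40.0, peak_time))
```

In this simplified picture, Ka could be estimated either from the decline of `amount_at_site` or, indirectly, by fitting `plasma_concentration` to measured blood levels.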
4.2 Distribution and the role of plasma-binding proteins
After being absorbed and entering the circulatory system, the drug moves reversibly between different compartments within the body, a process described as distribution and influenced by several physicochemical properties of the drug and biological factors of the body. One of the most important properties is lipophilicity, expressed as logP/logD, since it reflects the ability of the compound to pass biological membranes to reach other sites, tissues, and organs within the body. Additional factors include phospholipid and (plasma) protein binding, which reduces the free drug concentration within the body, can prevent migration to the receptor site/site of action, and can cause drug-drug interactions [25, 96]. Interestingly, binding to plasma proteins can also prolong drug action by releasing the drug over a longer period of time. It is also important to note that the influence of lipophilicity on plasma protein binding is hypothesized to be higher for acidic compounds than for bases, meaning that negative charges contribute strongly to plasma protein binding and prevent tissue binding, which leads to diminished volumes of distribution (Vd, Eq. (3)). The Vd relates the amount of drug in the body to the concentration measured in plasma and therefore depends on the fraction of drug that is freely available in the blood, that is, not bound to plasma proteins or other components [25, 97, 98]. Vd is an apparent volume that increases proportionally to the extravascular drug binding, not an anatomically defined volume. Consequently, extensive drug binding outside the bloodstream leads to increasing values of the volume of distribution.
The volume of distribution (Vd) is defined as the ratio between the amount of drug in the body (A) and the drug concentration in plasma (C, comprising both free and protein-bound drug):
$$V_d = \frac{A}{C} \qquad (3)$$
The parameter describing protein binding is the plasma protein affinity constant Ki. Many efforts to determine distribution led to chromatography-based methods, such as (RP-)HPLC, that mimic n-octanol/water logP or lipophilicity. In general, chromatographic methods are believed to resemble biological partition processes more closely than octanol/water partition. Initially, stationary phases in (RP-)HPLC were either silica-based or polymer-based, but both had difficulty reproducing logP and logD values despite several additives in the mobile phases. The introduction of biomimetic (stationary) phases coated with human serum albumin (HSA), α1 acid glycoprotein (AGP), or immobilized artificial membranes (IAM) revolutionized the methodology, since they allowed a better approximation of the biological system [25, 100].
A method to address plasma protein binding is the use of HSA and other plasma proteins (e.g., α1 acid glycoprotein) coupled with RP-HPLC [25, 101]. On the other hand, HPLC combined with IAMs is a widely accepted technique for studying phospholipid interaction and partition, and several IAM columns are commercially available for DDD projects. Both techniques represent good assay systems to model in vivo Vd at high-throughput scale. Problems with HPLC techniques, which also apply to biomimetic phases, include the lack of a gold standard needed to calibrate and later standardize results so that they can be compared.
Standard in vitro methods for determining the unbound plasma fraction include, among several others, equilibrium dialysis and ultrafiltration, which are the two most commonly used methods and are considered the gold standard for binding assessment.
To calculate the Vd “a priori” (i.e., nonexperimentally), plasma protein binding, experimental logD, and pKa are necessary. In turn, the half-life (t1/2) of a compound can be calculated from the Vd. Apart from protein binding, tissue binding is also involved in the distribution of the compound. Generally, “tissue” here comprises several components of the human body such as lipids, DNA, or RNA, and this type of binding is also referred to as nonspecific binding.
In silico models to predict the Vd are often based on lipophilicity and solubility descriptors, which correlate with the fractions of the drugs that are either bound to plasma proteins or freely available. The work of del Amo et al. not just accurately predicts Vd and unbound drug fraction but also compares the model’s performance against the commercially available software VolSurf+ with comparable accuracy (R2 = 0.70 and 0.71, respectively) .
Expanding these studies, the work of Lombardo and Jing generated a set of models to predict the Vd in the steady state (Vss), using a dataset of 1096 diverse compounds . They compared models generated by linear (PLS) with nonlinear (Random Forest) models, recommending the latter, with 33 descriptors, as the optimal method for Vss prediction.
The Vd of drugs is greatly influenced by binding to plasma proteins, and several machine-learning-based models have been generated to predict this interaction. Protein-protein interaction (PPI) information derived from molecular docking was employed to derive a PPI-QSAR model for a small dataset of antibiotics (65 unique compounds), which resulted in an accurate model (R2 = 0.86 for the test set). Additionally, global quantitative models, built with an array of classification and regression methods on physicochemical and molecular descriptors derived from a dataset of 794 compounds, were shown to correctly classify the binding status of the test set compounds and could be used for prescreening. Another recent QSAR study using an extensively curated training set of 967 diverse pharmaceuticals aimed to predict plasma protein-bound fractions (fb) using models generated by six machine-learning algorithms with 26 molecular descriptors. This study is particularly interesting with regard to the applicability domain, since it makes it possible to tell whether a prediction falls in a favorable or unfavorable region.
del Amo et al. recently reported one of the first QSPR models to predict the intravitreal volume of distribution and clearance of small molecules; the model relies on logD and hydrogen-bonding capacity to understand phenomena such as intraocular pressure and to guide drug discovery. Complementarily, predicting drug passage through the blood-ocular barrier was described as an important factor for evaluating the volume of distribution in this organ.
Recently, as a novel approach bridging animal experiments and human results, it was shown that plasma concentration-time profiles in PXB mice, a chimeric mouse lineage with a humanized liver, could be used to infer the half-life of a compound in humans.
Volume of distribution is also closely related to the half-life and clearance parameters. As the Vd is a relative measure of the free concentration of drug in the blood, this same amount can be excreted by the kidneys through glomerular filtration (clearance). Consequently, the rate of clearance (discussed below in the Excretion section) directly influences the amount of available drug. Naturally, the concentration of free drug that can bind its molecular target is related to the therapeutic dosage and the half-life of the administered drug (as seen in Eq. (4)).
Half-life definition. Half-life is calculated as the natural logarithm of 2 multiplied by the volume of distribution (Vd) and divided by the renal clearance (CL):
$$t_{1/2} = \frac{\ln(2)\, V_d}{CL} \qquad (4)$$
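The relationship between volume of distribution, clearance, and half-life can be worked through numerically. The sketch below assumes the definitions Vd = A/C and t1/2 = ln(2)·Vd/CL; function names and the example values (100 mg in the body, 2 mg/L in plasma, clearance of 5 L/h) are hypothetical.

```python
import math

def volume_of_distribution(amount_in_body: float, plasma_conc: float) -> float:
    """Apparent volume of distribution, Vd = A / C."""
    return amount_in_body / plasma_conc

def half_life(vd: float, cl: float) -> float:
    """Elimination half-life, t1/2 = ln(2) * Vd / CL."""
    return math.log(2) * vd / cl

vd = volume_of_distribution(100.0, 2.0)  # mg / (mg/L) = 50 L
print(vd)
print(half_life(vd, 5.0))                # about 6.93 h
```

Because Vd appears in the numerator, extensive tissue binding (a large apparent volume) prolongs the half-life at a given clearance.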
Drug metabolism normally involves enzymatic modification or degradation of the compound to facilitate excretion via one of the major clearance organs: liver, kidney, spleen, or bile. Phase I enzymatic reactions include modifications such as oxidation, hydrolysis, and reduction that either introduce a functional group into the molecule or make one accessible, whereas phase II reactions are conjugation mechanisms (e.g., methylation, acetylation, glutathione conjugation, amino acid conjugation, and others) that result in polar products that can be actively effluxed. Thus, isozymes of the CYP450 family and efflux transporters such as P-glycoprotein and members of the multidrug resistance transporter MRP family are heavily involved in the metabolism of drugs as well as in drug-drug interactions, which are a major cause of attrition. For instance, CYP3A4, CYP2C9, and CYP2D6 together catalyze the hepatic metabolism of about 50% of drugs, which underlines the importance of the superfamily. Interestingly, when CYP3A4 is expressed, P-glycoprotein usually is as well [8, 10, 14, 24, 111]. Metabolic behavior can be approximated using either liver microsomes or S9 fractions, although recombinantly expressed proteins are also used to some extent [24, 26].
When available, the 3D structures of those proteins can be employed in molecular docking and molecular dynamics simulations to predict the binding affinity of drugs or drug candidates and thereby estimate a PK profile. Metabolism prediction combines mathematical models that predict whether the target compound could be a substrate of a specific enzyme with predictions of the site of metabolism. Usually, those initial predictions are followed by molecular docking and quantum mechanics simulations, because the catalyzed reaction depends on the electronic structure of both substrate and enzyme [113, 114].
Several attempts have been made to develop in silico models for predicting drug metabolism, specifically the site of metabolism (SOM), and these are quite often converted into online prediction servers for general use. One example is the FAst MEtabolizer (FAME) model, which was generated from a diverse chemical dataset of more than 20,000 molecules and their respective experimentally determined metabolism sites. FAME prediction rates were comparable to those of other metabolism site predictors focused on specific enzyme families, despite using only seven chemical descriptors. Similarly, the SMARTCyp server, which initially relied on the 2D structure of the molecule to predict CYP2D6 metabolism, without considering electronic properties or generating 3D structures, was later expanded to other CYP isoforms. A more refined version later included the atomic solvent-accessible surface area, which is independent of 3D coordinates, slightly improving the overall prediction accuracy for different CYP isoforms. The newest SMARTCyp version (3.0) uses activation energies calculated by density functional theory (DFT), that is, the difference between the energy of the transition state and that of the reactant complex, to predict SOMs of molecular fragments of the query in an unsupervised fashion. SMARTCyp 3.0 also calculates the similarity between the query molecule and the model fragment.
The IDSite approach aims to overcome the ligand-based bias of SOM prediction by using it as part of a larger framework, more precisely by combining it with molecular docking, where an atom is considered a significant SOM for a P450 enzyme when it is accessible to the reactive heme iron center, and/or with quantum calculations, where the candidate atom must have some degree of reactivity in the absence of the enzyme. Similarly, Kingsley et al. combined different approaches into a framework to predict CYP2C9 substrates. They validated the predictions from SMARTCyp by ensemble docking, followed by a QSAR model to account for the influences of both the inherent reactivity of each atom and the physical structure of the CYP2C9 binding site. This combined approach resulted in 88% of true SOMs being accurately predicted among the top-ranked sites.
Excretion is guided by one of the major clearance organs, and the assessment of clearance behavior sometimes involves isolated organs or tissues . Humans rely on the kidney clearance as a major route for xenobiotic excretion, despite other available routes such as feces, bile, sweat, and breath. The excretion pathways directly impact the concentration of available drugs and are often measured in terms of half-life and the initial administered dose.
The renal clearance of a drug is another important parameter, which is usually employed to predict drug excretion. Experimentally, clearance is defined from the change in drug concentration over a defined time of renal excretion by a linear equation (Eq. (5)).
Equation for renal clearance. ṁ is the substance’s mass generation rate, K is the clearance, C is the concentration at time t, and V is the volume in which the drug is distributed or, for systemic approaches, the total body water:
$$V\,\frac{dC}{dt} = -K\,C + \dot{m} \qquad (5)$$
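The behavior described by these symbols can be explored with a small calculation. The sketch assumes that the mass balance takes the common first-order form V·dC/dt = −K·C + ṁ, which is an assumption consistent with the symbols defined above, and uses its closed-form solution for constant K, V, and ṁ. Function and parameter names, as well as the numbers, are hypothetical.

```python
import math

def concentration(t: float, c0: float, clearance: float, volume: float,
                  m_dot: float = 0.0) -> float:
    """Solve V*dC/dt = -K*C + m_dot for constant parameters.

    C(t) = C_ss + (C0 - C_ss) * exp(-K*t/V), with C_ss = m_dot / K.
    """
    c_ss = m_dot / clearance
    return c_ss + (c0 - c_ss) * math.exp(-clearance * t / volume)

# Hypothetical values: C0 = 2 mg/L, K = 5 L/h, V = 50 L, no generation term.
t_half = math.log(2) * 50.0 / 5.0
print(concentration(t_half, 2.0, 5.0, 50.0))  # half of the initial concentration
```

With ṁ = 0 the concentration halves every ln(2)·V/K, which links this mass balance to the half-life expression in Eq. (4); with a constant input the concentration instead approaches the steady state ṁ/K.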
Gombar et al. developed SVM- and MLR-based QSAR models to predict both systemic clearance and apparent volume of distribution from intravenous data, using structural fingerprints and electro-topological states (so-called E-states), respectively, as input. The models performed with high accuracy, despite the highly diverse initial dataset employed for their generation, which points to the importance of such models in the early steps of the drug discovery pipeline.
Also, the work of Kusama et al. established a chemoinformatic-based classification model to predict the major clearance pathways of 141 approved drugs based on four physicochemical parameters: charge, molecular weight, lipophilicity, and protein unbound fraction in plasma, resulting in a final model with an accuracy of 88% . This model approach was further refined by using support vector machine and increasing the number of relevant descriptors . In order to better model the biotransformation processes, often the major triggers of excretion, the work of Berellini et al. used ELASTICO (Enhanced Leave Analog-Structural, Therapeutic, Ionization Class Out) to provide an appropriate sampling during the validation process. Their partial least-square models resulted in a highly accurate model derived from 754 compounds .
ABCB1, also known as P-glycoprotein (P-gp or MDR1), is a membrane protein belonging to the ATP-binding cassette (ABC) transporter superfamily. Together with the hERG channel and CYP3A4, P-gp is one of the most widely studied antitargets, since its inhibition can affect several processes, such as the absorption, distribution, and excretion of drugs. Classical studies used chemometric methods to describe bioavailability in terms of P-gp and CYP enzyme activities, generating QSAR models based on 805 unique drug molecules with high accuracy (R2 = 0.80 for the test set). Alternatively, an approach to predict P-glycoprotein inhibition using molecular interaction fields, derived from a literature collection of more than 1200 structures, generated a pharmacophore model for competitive P-gp inhibition.
The most recently reported studies on the prediction of drug clearance from human and rat hepatic in vitro systems were based on microsomes, with a recent emphasis on the use of hepatocytes. Wood et al. discuss the inherent limitation of predictions based on human hepatocytes, namely underprediction when compared to in vivo clearance data, and comment on the potential causes of those divergences.
As the pinnacle of in silico ADME approaches, holistic physiologically based pharmacokinetic (PBPK) modeling was initially conceptualized by Teorell, aiming to enable the prediction of a drug’s pharmacokinetic behavior in humans using preclinical data. Recent PBPK models benefit from the large amount of available ADME data not only to aid the drug discovery process and dose regimen selection but also to guide risk assessment for regulatory reviews. PBPK models are compartmentalized representations of the different organs; each compartment is described by a specific tissue volume and blood flow rate and communicates with the blood (venous and arterial). Each organ/tissue has a unique volume, permeability, and elimination-related anatomical constants and terms, which are determined independently of the studied drug, while drug-specific parameters are incorporated later, such as affinity toward plasma proteins, tissue-to-plasma distribution rate, and even on-target activity (Km, Vmax, or binding kinetics).
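The compartmental idea behind PBPK models can be sketched with the smallest possible example: a blood compartment connected by a blood flow Q to a single perfusion-limited tissue that also eliminates the drug. This is only a toy illustration and not the structure of any particular PBPK platform; the tissue-to-plasma partition coefficient `kp`, the intrinsic clearance `cl_int`, the explicit Euler integration, and every numerical value are assumptions made for the example.

```python
def simulate_two_compartment(dose, v_blood, v_tissue, q, kp, cl_int,
                             t_end, dt=1e-3):
    """Euler integration of a blood + one perfusion-limited tissue model.

    V_blood  * dC_b/dt = -Q * (C_b - C_t / Kp)
    V_tissue * dC_t/dt =  Q * (C_b - C_t / Kp) - CL_int * C_t / Kp

    Returns the final blood concentration, the final tissue concentration,
    and the cumulative amount eliminated.
    """
    c_b = dose / v_blood  # intravenous bolus placed directly in blood
    c_t = 0.0
    eliminated = 0.0
    for _ in range(int(round(t_end / dt))):
        c_out = c_t / kp              # concentration leaving the tissue
        flux = q * (c_b - c_out)      # net blood-to-tissue transfer (amount/time)
        elim = cl_int * c_out         # elimination from the tissue (amount/time)
        c_b -= dt * flux / v_blood
        c_t += dt * (flux - elim) / v_tissue
        eliminated += dt * elim
    return c_b, c_t, eliminated

# Hypothetical parameters: 100 mg dose, 5 L blood, 1.5 L tissue, Q = 90 L/h.
dose, v_blood, v_tissue = 100.0, 5.0, 1.5
c_b, c_t, eliminated = simulate_two_compartment(
    dose, v_blood, v_tissue, q=90.0, kp=2.0, cl_int=20.0, t_end=2.0)
print(c_b, c_t, eliminated)
```

The volumes and blood flow play the role of drug-independent physiological parameters, whereas `kp` and `cl_int` stand in for the drug-specific ones; a full PBPK model repeats this pattern for many organs and uses a more robust ODE solver.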
One important feature of PBPK models is the prospect of mechanistic and prospective prediction of a drug’s pharmacokinetic profile. The drug-dependent parameters include, but are not limited to, physicochemical properties, solubility and permeability values, and the role of individual enzymes and transporters in metabolism. These parameters can be determined in vitro or calculated from the compound structure with other in silico approaches, which allows the early use of PBPK in DDD (the bottom-up approach). It is also noteworthy that model construction and parameter fine-tuning are themselves a source of knowledge for hit development: predictions from the evolving model help to assess the model’s own accuracy along the way (the so-called middle-out approach) and can then be applied prospectively to simulate unstudied scenarios. Currently, several free-to-use and commercially available PBPK and ADME prediction options exist (Table 2), which are also extensively reviewed and discussed by Madden et al.
|Software|Description|
|---|---|
|vNN-ADMET|Public web server for ADMET property prediction based on 15 nearest neighbor models.|
|Swiss-ADME|Public web server for ADME property prediction. It has a unique LogP calculation (i.LogP) based on free energy.|
|pkCSM|ADME web server based on chemical fragment similarity (the so-called graph-based signatures).|
|ADMETlab|Web server using similarity-based ADME calculator models and drug-likeness analyses.|
|Schrödinger QikProp|Calculates pKa, LogP, and water solubility; Schrödinger also offers other tools for property calculation (QikProp, Schrödinger, LLC, NY, 2019).|
|DDI-Predictor|Makes quantitative predictions of drug exposure even in cases where the interaction has not been studied yet.|
|PBPK models and platforms||
|GastroPlus|Comprises 10 different modules, including PBPK modeling and in vitro vs. in vivo correlation; can be parameterized for different disease states and age groups.|
|PKSIM|PBK modeling tool with an integrated database of anatomical and physiological parameters for human, mouse, rat, dog, and monkey. Can model different scenarios depending on the chosen building blocks.|
|Simcyp|Incorporates databases of genetic, physiological, and epidemiological information to enable simulation of different populations and species and, ultimately, to predict ADME parameters.|
|ADMEWORKS DDI Simulator|Distinctively, it can predict drug-drug interactions using nonlinear models.|
Early PBPK models, such as the work of Varma et al., added another layer of complexity by including drug-drug interactions (DDI). Their dosing-time-dependent model of the interaction between repaglinide and rifampicin was able to predict repaglinide plasma concentrations over the course of a day. The model also predicted the interaction with other CYP3A4 and OATP1B1 inhibitors, which could result in further DDIs. Reports of DDIs leading to complications in patients with particular genotypes stimulated studies such as the one performed by Fermier et al., in which the effects of polymorphic cytochromes provided the basis for a more accurate DDI prediction.
5. Biological (large) molecules
In recent years, larger molecules (LMs) have gained significance and popularity as new molecular entities, owing to recent achievements and approvals. These “biologics” are normally biotechnologically synthesized or recombinantly produced compounds of biological origin, such as peptides, antibodies, and nucleic acids. From a historical perspective, the drug discovery and development of LMs lags far behind that of small molecules (SMs), with the first LM approved in the 1980s. At about the same time, two major advances allowed huge progress in the pharmacokinetic assessment of small molecules, contributing to lower drop-out rates in later DDD stages. One was the improved understanding of CYP450 mechanisms; the other was the invention of (HP)LC-MS technology, which fueled the assessment of ADME parameters. The discovery and development of LMs face many challenges, which demand great effort to overcome but also offer unique opportunities in comparison to small molecules [138, 139].
The main differences between small and large molecules, apart from the molecular weight and the number of heavy atoms and torsions, are found in physicochemical properties such as permeability, oral bioavailability, stability, specificity, and immunogenicity [138, 139]. New parameters unique to large molecules are also of interest, such as the physical particle size and the hydrodynamic radius, which have a dramatic effect on absorption. Both parameters are related to the overall shape and correlate well with MW for globular proteins, but not necessarily for unstructured or highly modified entities. As a result, biologics are normally administered parenterally and only target extracellular structures; they are also more likely to trigger an immune response; and their production costs are considerably higher. Interestingly, with the exception of the costs, these disadvantages can potentially be circumvented by appropriate delivery systems, for example, nanoparticle-based delivery to facilitate membrane permeation.
Other parameters, such as charge, which is modeled by the pKa in the case of small molecules, are highly heterogeneous in LMs. The charge can be represented by the isoelectric point (pI), which is calculated from the available amino-acid sequence, and by the surface charge, which can be inferred from individual pKa values and structural information. The overall protein charge often influences the excretion of biologics, since negatively charged molecules undergo less renal filtration regardless of size effects.
While posing difficulties for the development of new molecular entities, the aforementioned properties also offer special advantages that small molecules cannot provide. LMs normally have a longer t1/2, slower clearance, and higher selectivity; are multifunctional; and rarely exhibit drug interactions. In addition, it was suggested that only 2–5% of human genes can be targeted by small molecules, which leaves a niche for the application of LMs against several diseases.
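How an isoelectric point can be estimated from a sequence alone can be sketched as follows: the net charge is summed over all ionizable groups with the Henderson-Hasselbalch relationship, and the pH at which it vanishes is located by bisection. The pKa values below are illustrative assumptions (published pKa scales differ), the function names are hypothetical, and structural effects on individual pKa values are ignored.

```python
# Illustrative pKa values; published scales differ and give slightly different pI.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM, PKA_C_TERM = 8.6, 3.6

def net_charge(sequence: str, ph: float) -> float:
    """Net charge of an unmodified peptide at a given pH."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))   # N-terminal amine
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))   # C-terminal carboxylate
    for residue in sequence.upper():
        if residue in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return positive - negative

def isoelectric_point(sequence: str, tol: float = 1e-4) -> float:
    """Bisection on pH in [0, 14]; net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# Hypothetical peptides: a lysine-rich and an aspartate-rich sequence.
print(isoelectric_point("KKKK"), isoelectric_point("DDDD"))
```

The surface charge mentioned above would require structural information and residue-specific pKa shifts, which this sequence-only sketch does not attempt to capture.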
The increasing effort and development of new technologies, driven by the belief in higher success rates, enabled the latest advances in the field . For instance, currently, peptide drugs only account for ~2% of the drug market but are in use in a wide range of diseases such as acromegaly and multiple sclerosis, together with different cancer types such as prostate and breast cancer.
Several other biologics are currently in use, namely monoclonal antibodies (mAbs) and bispecific antibodies (bsAbs), as examples of agents that activate or enhance the immune response. Of special interest in cancer therapy is a subclass of bsAbs, the so-called bispecific T-cell engagers (BiTEs), which can recruit CD3-positive T cells to the tumor site by binding to both cell types, thereby directing the immune response.
Other interesting examples for biologics comprise hormones (e.g., insulin), cytokines (such as erythropoietin, EPO; IL-1; IL-2; IL-6) , nucleic acids such as siRNA (ONPATTRO) , and aptamers (Pegaptanib) . While such a broad spectrum of molecule classes offers also a wide range of treatments, at the same time, it exacerbates the need for new developments since every molecule type exhibits different properties. In the field of predicting the biologics activity against specific targets, classical modeling tools, such as Monte Carlo sampling, genetic algorithms, docking, and molecular dynamics simulation, were adapted or even developed anew to accommodate the specifics (as extensively reviewed by [146, 147]).
On the other hand, the absence of standard techniques to assess ADME properties hampers PK profiling and thus further development. In fact, the current knowledge of LM pharmacokinetics is even more limited than the basic knowledge of ADME principles for small molecules was in the 1980s. Although the basic PK principles are similar for SMs and LMs, the specific mechanisms influencing each step of ADME are different. To begin with, the route of administration can differ, which leads to different mechanisms of absorption and first-pass metabolism. Furthermore, LMs are not metabolized by CYPs but can still trigger the release of pro-inflammatory cytokines, leading to severe side effects known as cytokine storm [136, 139]. Other modifications and interactions also play a role in the ADME of biologics, namely glycosylation, PEGylation, and neonatal Fc receptor (FcRn) interactions [139, 148]. Unfortunately, up until now, most of these factors are evaluated only in in vivo systems, which are not suited for HTS, are expensive and labor intensive, and require longer bioethics evaluation.
In this regard, the development of in vitro and in silico methods to evaluate ADME should be a high-profile goal. One of the main challenges will be to find a way to integrate as many of the biologics into the process in order to facilitate ADME assessment and guide large molecules’ DDD as already implemented for their smaller counterparts.
The main difficulties in PK profiling lie in the high costs and comparably low throughput of in vivo models. The extensive use of animals in DDD also raises ethical issues and is further limited because animal models do not always translate readily to humans, especially in terms of metabolism [149, 150]. Furthermore, the advent of combinatorial chemistry coupled with HTS for efficacy evaluation led to an explosion in the number of compounds, to an extent that classical PK assays could not keep up with [29, 47]. In vitro PK screens are supposed to offer a solution to the problem by complementing in vivo assessment to reduce costs while increasing efficiency, but they also suffer from shortcomings. In general, one must distinguish between two main forms of in vitro systems: static and dynamic models. Only dynamic models are suited for PK evaluation because they allow variation of compound concentrations, a key factor in pharmacokinetics. In this sense, diffusion-based dynamic in vitro models offer a solution but are still quite limited in terms of throughput and costs. An alternative was presented by Lockwood and colleagues in the form of a 3D-printed fluidic device that uses a trans-well technique to generate dynamic in vitro PK profiles and is also compatible with HTS infrastructure.
Two main changes principally distinguish the DDD of “then” from that of “now.” First, in the past, pharmaceutical companies as well as academic laboratories were not much concerned with ADMET assessment in the early stages of drug discovery (hit and lead generation) and only addressed PK from the preclinical stages onward. Instead, HTS/HCS, genomics, and computational chemistry were high-profile areas. Today, almost all major pharmaceutical players have shifted pharmacokinetic profiling to the discovery phases. However, only the future will tell whether those changes will bear fruit.
Second, CADD has increasingly become part of the DDD pipeline at different stages, facilitating fast in silico screening of compounds and supporting QSAR. Although bioinformatics techniques have already replaced many in vitro tests, basically all of them require in vitro and/or in vivo validation and standardization to guarantee trustworthy predictions. Another important aspect, recently addressed by Ferreira and Andricopulo, is the importance of translating those models into well-structured and user-friendly (online) platforms that can be accessed and used by the drug discovery community. Still, the efficacy and reliability of computer simulations are increasing steadily and drastically, and many see a future of solely virtual drug discovery. Thankfully, past failures have led to safety and efficacy concerns being addressed earlier in the drug discovery process, for instance, via in vitro screens that assess metabolic stability and absorption properties and thereby diminish failure rates later on.
The authors would like to acknowledge the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP grants 2018/08820-0, 2017/03966-4, and 2015/26722-8). The authors would like to thank Prof Dr. José Eduardo Gonçalves for his valuable comments on the manuscript.
Conflict of interest
The authors declare no conflict of interest.
ORCID numbers of all authors:
Thales Kronenberger: 0000-0001-6933-7590
Carsten Wrenger: 0000-0001-5987-1749
Vinicius Goncalves Maltarollo: 0000-0001-9675-5907
Arne Krüger: 0000-0002-5531-9508