Chemical Biology Toolsets for Drug Discovery and Target Identification

Chemical biology is the scientific discipline that applies chemical techniques, often using small molecules produced through synthetic chemistry, to the manipulation and study of biological systems. Its working framework ranges from simple chemical entities to complex drugs and draws on principles of biological origin. This chapter focuses on the principles and working models that chemical biology uses to discover new drug leads. Drug discovery is an extensive, multifaceted process. Chemical biology selects natural and synthetic compounds with the best therapeutic potential and verifies them with appropriate chemical toolsets. Compounds are screened by phenotypic as well as target-based screening to identify and characterize potent hits. Once a target is identified, it is characterized and validated by extensive testing. The hits obtained are then validated, and lead compounds are tested in clinical trials before they are introduced for commercial application.


Introduction to chemical biology and history
Chemical biology flourished as a discipline of science which makes use of several aspects of chemistry to understand biology [1]. Chemical biology includes a wide range of fundamental problems related to the understanding of complex biological processes by the development of synthetic frameworks to generate selective and active lead compounds [2].
The roots of chemical biology lie in the emergence of chemistry and biology as separate disciplines. Chemical biology flourished as a discipline in its own right because of new challenges and questions raised by the application of chemical methods to living organisms. This branch of study is concerned with advanced molecular concepts of biology harnessed to the use of chemical entities. Although the concept is new, the history of chemical biology extends back about two centuries when the foundations of chemistry and biology are considered. Only a brief account of that history is given here. Joseph Priestley discovered nitrous oxide gas in 1772 and exposed mice to "airs" (the gases discovered up to that time). He used 10 gases, including nitrous oxide, on experimental mice. His experiments on mice faced strong public discontent from Americans who were sympathetic towards animal rights. Thus, the first chemical biologist fell prey to an angry mob because of his experiments on mice [3].
Afterwards, another chemist, Humphry Davy (1778-1829), worked on the newly isolated and unfamiliar gases of that time. Frightened by the outcome of the previous experiments, Davy completely omitted the use of mice and decided to carry out the research on himself. Unsurprisingly, one of the gases, carbon monoxide, proved nearly fatal for him, but the pleasant effect of nitrous oxide made him name this gas "the laughing gas." He also investigated the use of this gas in medical surgeries. Samuel Taylor Coleridge also documented this gas as a pleasure-making gas [4], but the practical use of this gas in medicine was described in 1844 by an American

Chemical biology tools

Chemical probes
Chemical probes are small molecules that bind to specific target sites and modulate their cellular activities. These archetypal tools are highly valued reagents for molecular- and genetic-level biological research. Chemical probes help in the accurate investigation of biological pathways and their associated targets [10].

Antisense and RNAi technologies
Many tools have been used in target validation since the 1980s. Target identification and validation are lengthy procedures that were mainly based on structure-activity relationships. Drug discovery has become the most important approach for acting on the targeted cells [11]. Traditional antisense and RNA interference (RNAi) technologies are robust tools used at multiple phases to discover and validate potential drug targets. This approach exploits the potentially selective cleavage of a targeted messenger RNA. This targeting technique enables researchers to explore the effect of protein expression on phenotype [12].

Induced protein degradation
Induced protein degradation is an event-driven approach which depends on drug binding and eliminating the target protein after tagging it. This approach is gaining attention in recent times because of the selective degradation of the target proteins.
Drug discovery based on small molecules focuses on the loss of function of proteins whose binding sites are occupied by the drug, which leaves the proteins unable to act. This approach requires high drug exposure in vivo to maintain target inhibition, which may lead to potentially harmful side effects of the drug. Proteolysis-targeting chimeras (PROTACs) use the cellular quality control machinery to degrade selected proteins as their targets. This protein degradation system reduces the quantity of drug to which living systems must be exposed in order to halt protein function. The target proteins may be regulatory proteins, transcription factors, or scaffolding proteins [13,14].

Chemoproteomics
Chemoproteomics is employed as a chemical tool for target identification. It can be used to investigate signal transduction. The field has flourished as a key technology for characterizing the mechanism of action of chemical probes and drugs that act as pharmacological modulators, and hence for validating the cellular targets of several therapeutic drug candidates. Chemoproteomics can be further divided into affinity- and activity-based chemical proteomics [15]. In cases where probe development is difficult, multiple kinase inhibitors are used to target the kinome effectively [16].

Drug discovery
Drug discovery is a demanding multistep procedure comprising highly systematic approaches to identify and characterize different compounds, develop them into hits, and validate them extensively with chemical toolsets until they attain the status of a commercial therapeutic drug. The important steps of drug discovery are shown in Figure 2.

Screening
There are two fundamental approaches to drug discovery, namely, phenotypic screening and target-based screening. The first looks at the phenotypic effects that a compound induces in a cell, tissue, or whole organism, and the second evaluates the effects of a compound on a purified target protein.

Phenotypic screening
In the early twentieth century, drug development started with advances in pharmacology and in synthetic and therapeutic chemistry. In the 1950s and 1960s, enzyme kinetics provided methods for the accurate computation of a compound's effectiveness and enzyme competence [17].
Between 1999 and 2008, the US Food and Drug Administration (FDA) approved new drugs that had been discovered by different approaches. During this period, 75 small molecules were discovered and analyzed. Of these, 28 drugs were discovered through phenotypic selection, and 17 drugs were identified by target-dependent selection [18].
"Alemtuzumab" was the first antibody obtained by using hybridoma technology in combination with phenotypic identification. It has been reported for use against relapsing multiple sclerosis and chronic lymphocytic leukemia (CLL). The CD44 antigen (cell surface glycoprotein) antagonist, RG7356, was isolated with the help of the function-based F.I.R.S.T™ platform. Functional assays were then used to check the effects of antibodies on cell signaling, proliferation, and programmed cell death [19].
Large combinatorial antibody libraries are sources of human monoclonal antibodies that have been used successfully in medical and phenotypic screening. For example, BI-505 was isolated by using the F.I.R.S.T™ platform. Improved versions of the antibodies were ultimately used in simulation studies of tumor cell death assays and for selective B-lymphoma cell surface binding. Soon after the isolation of BI-505, its molecular target was identified as ICAM-1, which was found to be involved in apoptosis of B-lymphoma cells. BI-505 has a broad antimyeloma activity [20].
Phenotypic screening can also exploit the effective antibody response of patients, such as their B-cell repertoire. For example, the anti-respiratory syncytial virus (RSV) antibody D25 was isolated from a healthcare worker. D25 neutralizes RSV on the virus coat by binding the prefusion structure of the F protein, which had not been identified by target-based screening [21]. The use of phenotypic screening in various experiments is outlined in Table 1.

Target-based screening
Target-based screening of natural compounds and synthetic chemicals is considered a significant innovation for anticancer drug development [28]. In 2007, lysine demethylase 5B (KDM5B), a histone demethylase, was recognized as responsible for the removal of the H3K4me2/3 activation marker. Because elevated levels of KDM5B are found in many human cancers, KDM5B is regarded as a promising drug target for cancer therapy [29].
The respiratory chain of Streptococcus agalactiae consists of two enzymes: type 2 NADH dehydrogenase (NDH-2) and cytochrome bd oxygen reductase. S. agalactiae is considered the primary cause of sepsis and meningitis in neonates as well as a considerable cause of pneumonia and urinary tract infection [30]. The difference between phenotypic and target-based screening is shown in Figure 3.
Some of the target-based screening methods are mentioned as follows.

Mass spectrometry-based method
Mass spectrometry is known to be a highly efficient technique for the identification and structural characterization of natural products derived from herbal medicine [31].
The target-based method relies on mass spectrometry to search for active compounds. This technology can be used for identification, structural characterization, quantitative elemental analysis, tracking of key intermediate compounds in a chemical reaction, analysis of pharmaceuticals and metabolites, and elucidation of unknown structures in drug development. These capabilities serve various applications such as pharmaceutics (drug development, pharmacokinetics, metabolic pathways) and clinical screening. On the basis of the MS data of compounds, the UniFi™ platform has been built for more detailed analysis of structures [32].

Liquid chromatography-mass spectrometry (LC-MS)
LC-MS is an analytical technique that separates complex mixtures into their components by liquid chromatography and analyzes them by mass spectrometry. These assays check the correct synthesis, purity, and various physical and chemical properties, such as volatility and active functionalities, of newly synthesized chemical entities [33]. During drug discovery, the hyphenated LC-MS technique is used for separation and structural characterization of compounds [34].
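As a small illustration of how a mass spectrometer readout relates to the mass of a compound, the following minimal sketch converts between a neutral monoisotopic mass and the m/z of a protonated ion, as is commonly observed with electrospray ionization. The function names and the example mass of 1000.0 Da are hypothetical and chosen only for illustration; they are not taken from the cited studies.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons (Da)


def mz_protonated(neutral_mass: float, charge: int = 1) -> float:
    """Return the m/z of the [M + zH]z+ ion for a neutral monoisotopic mass M."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz: float, charge: int = 1) -> float:
    """Invert mz_protonated: recover the neutral mass from an observed m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    mass = 1000.0  # hypothetical neutral mass in Da, for illustration only
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mz_protonated(mass, z):.4f}")
```

Larger molecules often appear as multiply charged ions, so the same compound can give several peaks at different m/z values; the inverse function recovers one neutral mass from any of them once the charge state is known.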

Ultra-performance liquid chromatography-mass spectrometry (UPLC-MS)
Currently, UPLC-MS is one of the most adaptable hyphenated techniques. Proteomics and metabolomics have proved to be useful concepts for understanding the causes of different diseases. UPLC-MS aims to separate and identify the proteins and metabolites of cellular signaling pathways and to discover biomarkers for screening and diagnosis as well as for determining the response to a specific treatment [37]. For example, vancomycin (VCM) is clinically used for the treatment of human intracranial infections. The therapeutic concentration of vancomycin varies greatly among patients. A UPLC-MS technique was developed and used for the analysis of VCM in human cerebrospinal fluid [38].

Nuclear magnetic resonance spectroscopy (NMR)
Among the common techniques of metabolomics, NMR has evolved the most. Like mass spectrometry, NMR is used for quantitative analysis, but unlike it, NMR does not require extra steps for sample preparation [39]. It is commonly used to analyze the 3D structures of biomacromolecules and their interactions. It has proved a valuable tool for the reliable identification of small molecules that bind to proteins and for hit-to-lead optimization. NMR spectroscopy is mainly suitable for the analysis of bulk metabolites [40]. NMR has been used to analyze the structures of proteins, nucleic acids, and small molecules [41]. It has also proven useful in target-based drug discovery at the steps of hit identification and lead optimization [42]. For example, NMR spectroscopy is used to understand the structure of G-quadruplexes, which are noncanonical, four-stranded nucleic acid structures formed by consecutive sequences of guanines [43].

Thermal shift or calorimetry-based method
Isothermal titration calorimetry (ITC) is the only technique currently available for the direct determination of the enthalpy, ΔH, of a ligand binding to a protein [44]. Thermodynamic evaluation might be useful to provide information about specificity, agonist versus antagonist effects of ligands, and other important properties [45]. Fragment-based drug discovery (FBDD) is an approach of particular interest and relevance here. Fragments are molecules smaller than typical drugs, and they generally bind with lower affinity than conventional drug screening hits [46]. Measuring the contributions of enthalpy and entropy to the free energy of binding provides information that can be useful in fragment elaboration and subsequent medicinal chemistry work [47]. ITC is a uniquely powerful tool for characterizing the thermodynamics of test compounds binding to target proteins. Interaction between the compound and protein leads to the release or uptake of small amounts of heat, while the mixture is held at a close approximation to constant temperature [48]. Thermal shift screening methods have allowed the identification of compounds that interact with and inhibit Trypanosoma brucei choline kinase (TBCK), a validated drug target against African sleeping sickness [49].
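The split of the binding free energy into enthalpic and entropic parts can be made concrete with the standard relations ΔG = RT ln(Kd) and ΔG = ΔH − TΔS. The following is a minimal sketch under stated assumptions: the dissociation constant is taken relative to a 1 M standard state, and the function names and example values are hypothetical rather than data from the cited work.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Return the binding free energy (delta G) in kJ/mol.

    Uses delta G = R * T * ln(Kd), with Kd in mol/L relative to a 1 M
    standard state. Tighter binding (smaller Kd) gives a more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0


def entropic_contribution(delta_g: float, delta_h: float) -> float:
    """Return -T * delta S in kJ/mol, rearranged from delta G = delta H - T * delta S."""
    return delta_g - delta_h


if __name__ == "__main__":
    kd = 1e-6        # hypothetical dissociation constant (1 micromolar)
    delta_h = -50.0  # hypothetical measured enthalpy in kJ/mol
    delta_g = binding_free_energy(kd)
    print(f"delta G  = {delta_g:.2f} kJ/mol")
    print(f"-T dS    = {entropic_contribution(delta_g, delta_h):.2f} kJ/mol")
```

In this hypothetical case the enthalpy is more favorable than the free energy, so the entropic term is positive, that is, it opposes binding; this is the kind of signature that can guide fragment elaboration.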

Affinity-based methods
Affinity-based methods that use immobilized proteins play a vital role in understanding the connections between small molecules and their biological targets [50]. Affinity-based technologies are divided into two groups: (1) direct detection of the noncovalent macromolecule-ligand complex and (2) indirect detection of the noncovalent macromolecule-ligand complex. The drawback of this approach is that it recognizes chemical entities purely on the basis of their binding affinities for a target, irrespective of whether the biological function of the target is affected. In the late 1980s, matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI) techniques were used to analyze proteins and nucleic acids. Both phenotypic screening and target-based screening are comparable to each other in terms of benefits and drawbacks. This fact has been illustrated in

Target identification and characterization
Target identification and elucidation of its action mechanism have played vital roles in probing small molecules and drug discovery. Target identification has been based on biological and technologically advanced cell-based assays [51].

Disease association and target validation
Identification of the molecules and their underlying pathophysiological mechanisms contributes towards the discovery of targets that can be modulated therapeutically [52]. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. Target validation reveals the evidence that associates a target with a disease [53].

Bioactive small molecules
Bioactive small molecules are preferred as lead structures for the target validation. These small molecules isolated from phenotypic screen play a crucial role in chemical biology [54,55]. Many genomic, proteomic, and bioinformatic technologies have been developed for validation of the drugs.

Protein interactions
To identify selective and potent drugs, the first step is to find the relevant protein interactions. In signal transduction, protein-protein interactions form the complex cellular networks that govern different processes [56]. Deregulated transcription factors play significant roles in human pathological abnormalities, but the complicated nature of protein-protein networks has made transcription-targeted therapeutics impractical. Recent technological advances offer hope for the modulation of protein interaction networks [57].

Cell-based models and target validation
Exosomes are well suited as drug carriers in cell-based models. Because multiple proteins are associated with their membranes, exosomes are well known for their role in cell-to-cell communication, and they are a novel approach for the delivery of potent drugs. Exosome-based drug delivery is applied to a variety of disorders such as cancer and various neurodegenerative disorders [58].

Target validation
Drug target discovery and validation demand complicated and expensive frameworks, which may place a heavy financial load on the pharmaceutical industry. Target validation refers to establishing the direct involvement of a certain molecular target in a pathological condition, such that its reversal or modulation may have a therapeutic effect [12].

Approaches to target validation
The following approaches are used in target validation during the discovery and development of drug.

Antibodies
First, the fitness of the antibody for a specific target is assessed. Standardized procedures are then obligatory to ensure the quality of the sample in test procedures, because a single approach will not work in all situations [59]. Mass spectrometry is used to validate the antibody. This technique confirms the validity of antibodies or their fragments against their targets. The antibody must be able to bind to its natural antigen in cell lysates among thousands of other proteins, DNA, RNA, and other cellular components [60].

Cellular thermal shift assay (CETSA)
CETSA is used to assess the capability of a ligand to bind to its targets in cells or tissue samples. The method is based on the ligand-induced thermodynamic stabilization of target proteins. Compound-treated cell lysates and intact cells are heated to different temperatures; in the soluble fractions, the target protein is separated from destabilized protein and detected by Western blotting. SPROX is another method of target validation based on the identification of ligand-induced stabilization of target proteins. It evaluates the levels of methionine oxidation of target proteins [61].
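One common way to summarize such a heating series is an apparent melting temperature: the temperature at which half of the target protein remains soluble. A ligand that stabilizes the protein shifts this value upwards. The sketch below estimates it by linear interpolation between measured points; this is a simplifying assumption (a sigmoidal curve fit is often used in practice), and the function name and band intensities are hypothetical.

```python
from typing import Sequence


def melting_temperature(temps: Sequence[float], soluble: Sequence[float]) -> float:
    """Estimate the temperature at which the soluble fraction falls to 0.5.

    temps must be in increasing order; soluble holds the matching soluble
    fractions normalized to the lowest temperature. The crossing point is
    found by linear interpolation between the two bracketing measurements.
    """
    if len(temps) != len(soluble):
        raise ValueError("temps and soluble must have the same length")
    points = list(zip(temps, soluble))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 > f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("soluble fraction never crosses 0.5")


if __name__ == "__main__":
    # Hypothetical normalized Western blot band intensities.
    temps = [40.0, 44.0, 48.0, 52.0, 56.0, 60.0]
    vehicle = [1.0, 0.9, 0.6, 0.3, 0.1, 0.0]
    treated = [1.0, 1.0, 0.9, 0.6, 0.3, 0.1]
    shift = melting_temperature(temps, treated) - melting_temperature(temps, vehicle)
    print(f"Apparent thermal shift: {shift:.1f} degrees C")
```

A positive shift in the compound-treated sample relative to the vehicle control is taken as evidence of target engagement.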

Drug affinity responsive target stability (DARTS)
DARTS has been used for the identification of target proteins. It is based on the binding of a ligand to a protein to form a complex, which changes the structural stability of the target protein. This alteration is measured by SDS-PAGE or liquid chromatography. DARTS is also used in the analysis of low-affinity interactions [61].

Hit generation
Hit identification is considered a significant bottleneck for successful lead generation and for new medicines. An example of random hit identification is physical and biochemical testing [62]. The journey of a compound from hit status to lead status follows a series of steps, which are briefly illustrated in Figure 5. The figure notes possible techniques that can be used to select lead compounds and take them through lead optimization and preclinical and clinical phase trials.

Development of lead drug
Pharmaceutical companies face constant economic pressure to bring efficiency to the drug discovery and development process. The compounds obtained after hit optimization are subjected to a further refining process to find the lead compounds that can be analyzed for production at commercial scale. During this "hit-to-lead" refining process, many compounds drop out because of inadequate absorption, distribution, metabolism, excretion, and toxicity (ADMET) characteristics [63].
Refining of hit compounds to lead compound is done through the process of secondary screening. Almost 50% of all drug candidates thin out during optimization and preclinical and clinical trials [64].
There are many approaches available for the discovery and development of drugs, and they may follow different pathways to optimize compounds into bioavailable drugs. All these pathways have a common origin: they begin with a lead compound. It is necessary to study the origin of every compound, because properties such as solubility, target affinity, toxicity, ease of synthesis, and bioavailability are highly dependent on the initial lead selection and the method of identification [65].

Techniques of lead selection
A rational approach is used to select lead drug candidate after optimization of hit compounds. There are many methods which can be used for screening of compounds. Selection of techniques depends upon the source of hit compounds and types of their solvents as well. The following techniques are useful in selection.

QSAR model development
A quantitative structure-activity relationship (QSAR) model is used to compare chemical structures against a database of previously selected active compounds. Software such as ChemBioOffice Ultra 1.11 is used to generate two-dimensional and three-dimensional structures. The results of QSAR can be validated with statistical measures such as the correlation coefficient and regression coefficient [66].
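The statistical validation mentioned above can be illustrated with the simplest possible case: a one-descriptor linear model fitted by least squares and judged by its coefficient of determination. This is only a sketch; real QSAR models typically use many descriptors and external test sets, and the descriptor and activity values below are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx  # the regression coefficient
    return slope, mean_y - slope * mean_x


def r_squared(x: Sequence[float], y: Sequence[float],
              slope: float, intercept: float) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical data: one descriptor (e.g., logP) against activity (pIC50).
    descriptor = [1.2, 1.9, 2.5, 3.1, 3.8, 4.4]
    activity = [5.1, 5.6, 6.2, 6.4, 7.1, 7.3]
    slope, intercept = fit_line(descriptor, activity)
    print(f"activity = {slope:.3f} * descriptor + {intercept:.3f}")
    print(f"R^2 = {r_squared(descriptor, activity, slope, intercept):.3f}")
```

For a single descriptor, the coefficient of determination equals the square of the correlation coefficient, so the two validation measures named in the text carry the same information in this simple case.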

Visualization of SAR activity
This is also called the Bayesian approach. It provides a proficient understanding of the shape features, hydrophobic nature, and electrostatic properties of the compounds. All of these features fall under the structure-activity relationship (SAR) of the compounds selected from the hits. Structural data analysis of SAR is obtained in 3D form. Other results are obtained as diverse types of interrelated biochemical data, i.e., average-of-activities and region-explored analyses. The average-activity results show the part that active compounds have in common, and the region-explored data show the areas of fully explored compounds [67].

Fragment-based drug discovery
Fragment-based drug discovery (FBDD) is a powerful method for finding ligands with high affinity for target proteins. Compounds found to have low ligand-binding ability are eliminated, and compounds with high binding ability move forward to refinement. FBDD draws on techniques such as NMR, SAR, X-ray crystallography, and surface plasmon resonance (SPR).

X-ray crystallography
It can ascertain the binding sites and modes of ligand binding to protein [68].

Surface plasmon resonance (SPR)
Surface plasmon resonance is a label-free technology that can identify, screen, and quantify intermolecular interactions in real time. It is applied to quantify binding affinities. SPR-based biosensors work by detecting the interactions between ligands and immobilized target molecules and supply information on the kinetics of biomolecular interactions. The output can be used to provide comprehensive functional data on binding, such as specificity, kinetics, concentration, and affinity [69]. A survey of the scientific literature shows that Biacore instruments are the main SPR technology used at the commercial level [70].
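The link between the kinetics and the affinity reported by an SPR experiment can be sketched with the simple 1:1 binding model, in which the equilibrium dissociation constant is the ratio of the dissociation and association rate constants. The model choice, function names, and rate constants below are assumptions for illustration and do not describe any particular instrument or its software.

```python
import math


def dissociation_constant(k_on: float, k_off: float) -> float:
    """KD (mol/L) from association rate k_on (1/(M*s)) and dissociation rate k_off (1/s)."""
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def equilibrium_response(conc: float, kd: float, r_max: float) -> float:
    """Steady-state response for 1:1 binding: R_eq = R_max * C / (KD + C)."""
    return r_max * conc / (kd + conc)


def association_response(conc: float, k_on: float, k_off: float,
                         r_max: float, time_s: float) -> float:
    """Response during the association phase of a 1:1 interaction.

    R(t) = R_eq * (1 - exp(-(k_on * C + k_off) * t)), starting from R(0) = 0.
    """
    kd = dissociation_constant(k_on, k_off)
    r_eq = equilibrium_response(conc, kd, r_max)
    return r_eq * (1.0 - math.exp(-(k_on * conc + k_off) * time_s))


if __name__ == "__main__":
    k_on, k_off = 1e5, 1e-3  # hypothetical rate constants
    kd = dissociation_constant(k_on, k_off)
    print(f"KD = {kd:.2e} M")
    for t in (0, 60, 300, 1800):
        r = association_response(kd, k_on, k_off, r_max=100.0, time_s=t)
        print(f"t = {t:>4} s: R = {r:.1f} RU")
```

At an analyte concentration equal to KD the response plateaus at half of the maximum, which is one quick consistency check between the kinetic and steady-state estimates of affinity.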

Preclinical trials
In the last 2 years, different methodologies based on high-throughput screening and their combinations with chemistry have been developed to manufacture versatile compounds with limited resources. Alongside these methodologies, several supplementary in vitro and in silico approaches have come forward for the identification and evaluation of these compounds as lead candidates. Compounds selected as "hits" during this screening procedure are further analyzed and subjected to in vivo toxicity and efficacy profiling. During the preclinical stage of drug development, simple formulation approaches are favored. Combinatorial chemistry and high-throughput approaches have been appraised in several publications [71].
Preclinical lead optimization technologies (PLOTs) should be rapid enough to keep pace with high-throughput discovery screening without causing further delay and should be predictive and cost-effective. A PLOT platform usually comprises in vitro systems that are small and amenable to automation, which is why the required throughput is easy to achieve with minimal use of compound [72].

Tools of preclinical drug development
Selecting the methodology and tools for choosing preclinical drug candidates is a rigorous process. A sequential preclinical-to-clinical approach is practiced to sort through the long list of target-selected compounds. This streamlined strategy provides a deeper understanding of the action of the drug before it progresses to the next steps [73].

Pharmacokinetics and pharmacodynamics (PK/PD) during preclinical drug evaluation
Pharmacodynamics involves the study of the effect of a drug in a dose- and time-dependent manner. Pharmacokinetics is the study of the absorption, metabolism, distribution, and excretion of a drug over time. PK/PD is a program at the early phase of lead drug development that acts as a bridge between drug discovery and preclinical drug development. This stage sets the aims for further development activities, and the information obtained at this stage acts as a key to subsequent steps.
DOI: http://dx.doi.org/10.5772/intechopen.91732
It is necessary for the following reasons:
a. It provides the potency-based intrinsic activity of the compound rather than the dose.
b. It characterizes the compounds on the basis of the relationship between dose concentration and effect.
c. It allows the investigation of the tolerance phenomenon of compounds on the basis of physiological parameters [74].
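The concentration-effect relationship described in these points can be sketched by coupling two textbook models: a one-compartment intravenous bolus model for the pharmacokinetics and an Emax model for the pharmacodynamics. Both model choices are assumptions made here for illustration, as are the function names and parameter values; the chapter does not prescribe a specific model.

```python
import math


def concentration_iv_bolus(dose: float, volume: float,
                           k_elim: float, time_h: float) -> float:
    """Plasma concentration after an IV bolus in a one-compartment model.

    C(t) = (dose / volume) * exp(-k_elim * t)
    """
    return (dose / volume) * math.exp(-k_elim * time_h)


def emax_effect(conc: float, e_max: float, ec50: float) -> float:
    """Emax model: effect = e_max * C / (ec50 + C)."""
    return e_max * conc / (ec50 + conc)


if __name__ == "__main__":
    # Hypothetical parameters: 100 mg dose, 10 L volume, k_elim = 0.1 per hour.
    dose, volume, k_elim = 100.0, 10.0, 0.1
    e_max, ec50 = 100.0, 2.0  # percent of maximal effect, mg/L
    for t in (0, 2, 6, 12, 24):
        c = concentration_iv_bolus(dose, volume, k_elim, t)
        print(f"t = {t:>2} h: C = {c:5.2f} mg/L, effect = {emax_effect(c, e_max, ec50):5.1f} %")
```

Because the effect is expressed as a function of concentration rather than dose, two compounds given at the same dose can be compared on intrinsic potency (EC50), which is the point made in item a.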

Lead optimization
Optimization of a drug is a multifaceted process. It usually involves various types of screening methods which tend to find out the metabolism and pharmacokinetic properties of selected compounds or drugs [75].

ADME
This is the final stage of preclinical trials; after this the optimized drug is further processed towards the clinical trial. Absorption, distribution, metabolism, and excretion screening is performed at this stage. The primary goal of ADME is to develop a competitive drug with adequate safety avoiding PK failure in clinical phase.

ADME properties
Ideal properties of a drug in ADME testing include good oral bioavailability, blood clearance and volume of distribution that allow convenient dosing, and a low potential for drug-drug interactions. All of these properties are assessed at an early stage of drug discovery [76].
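Two of these properties can be estimated from concentration-time data with standard noncompartmental formulas: oral bioavailability from dose-normalized areas under the curve (AUC), and elimination half-life from clearance and volume (taken here to mean volume of distribution). The sketch below is illustrative only; the function names and all numbers are hypothetical.

```python
import math
from typing import Sequence


def auc_trapezoid(times: Sequence[float], concs: Sequence[float]) -> float:
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration points")
    return sum((t1 - t0) * (c0 + c1) / 2.0
               for t0, t1, c0, c1 in zip(times, times[1:], concs, concs[1:]))


def oral_bioavailability(auc_oral: float, dose_oral: float,
                         auc_iv: float, dose_iv: float) -> float:
    """Absolute bioavailability F = (AUC_oral / dose_oral) / (AUC_iv / dose_iv)."""
    return (auc_oral / dose_oral) / (auc_iv / dose_iv)


def half_life(clearance: float, volume: float) -> float:
    """Elimination half-life t1/2 = ln(2) * V / CL."""
    return math.log(2.0) * volume / clearance


if __name__ == "__main__":
    # Hypothetical sampling times (h) and plasma concentrations (mg/L).
    times = [0.0, 1.0, 2.0, 4.0, 8.0]
    oral = [0.0, 3.0, 4.0, 2.5, 0.5]
    iv = [10.0, 7.0, 5.0, 2.5, 0.6]
    f = oral_bioavailability(auc_trapezoid(times, oral), 100.0,
                             auc_trapezoid(times, iv), 50.0)
    print(f"F = {f:.2f}")
    print(f"t1/2 = {half_life(clearance=5.0, volume=40.0):.1f} h")
```

A low bioavailability or a very short half-life at this stage flags a compound that would need inconvenient dosing, which is one reason such checks are run early.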

DRUGeff
Drug effect (DRUGeff) is a parameter that determines the concentration of a drug that does not cause any harm at the site of action. In other words, at this stage the toxicity of a drug is tested to find the minimum safe dosage potency. In in vitro DRUGeff testing, all compounds are tested for interaction with the target, until a small portion of the dose is selected according to biophase levels. Maximizing the concentration of the treatment dose per unit of biophase is a key objective of lead optimization. Drugs that qualify in this test enter the clinical phase [77].

Clinical phase of drug discovery
The final step of drug discovery and development is referred to as the clinical trial. At this stage, the data regarding safety and efficacy of the new drug must be proven by application to humans directly in different phases. After the successful trials, research data is sent to the FDA for approval for commercial manufacturing and marketing (Figure 6) [78].

Clinical phase I
The first phase of a clinical trial normally takes several weeks to some months. At this stage, the optimized drug is tested on a small group of volunteers, who may or may not be paid for their participation in drug trial studies. This small trial is useful in determining the absorption and side effects of the drug in relation to its dose concentration [17].

Clinical phase II
The second phase of a clinical trial may last up to 2 years. It is a fully randomized study in which the drug is given to a relatively large group of patients. The patients are divided into two groups, one receiving the experimental drug and the other receiving a placebo. It is sometimes called a blind trial. This random application of the drug allows investigators and pharmaceutical companies to demonstrate the success and safety of the drug to the FDA with comparative information [79].
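The random division of patients into a drug arm and a placebo arm can be sketched as a simple 1:1 allocation. This is a deliberately minimal illustration: the function name and participant identifiers are hypothetical, and real trials generally rely on validated randomization systems and often on blocked or stratified schemes.

```python
import random
from typing import Dict, List, Optional, Sequence


def randomize_two_arms(participants: Sequence[str],
                       seed: Optional[int] = None) -> Dict[str, List[str]]:
    """Randomly split participants into 'drug' and 'placebo' arms.

    The list is shuffled with a seeded generator (so the allocation can be
    reproduced and audited) and cut in half; with an odd number of
    participants the placebo arm receives the extra person.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"drug": shuffled[:half], "placebo": shuffled[half:]}


if __name__ == "__main__":
    # Hypothetical participant identifiers.
    ids = [f"P{n:03d}" for n in range(1, 11)]
    arms = randomize_two_arms(ids, seed=2020)
    print("Drug arm:   ", arms["drug"])
    print("Placebo arm:", arms["placebo"])
```

In a blinded design, only the allocation key produced by such a procedure links each participant to an arm, and it is withheld from participants (and, in double-blind trials, from investigators) until the analysis.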

Clinical phase III
This is large-scale testing of drugs on hundreds of patients. This third stage of testing provides the FDA and pharmaceutical companies with a more thorough understanding of the effectiveness of useful drugs. The pharmaceutical company can request approval for the commercial synthesis of the drug after phase III is completed [80].

Clinical phase IV
After a drug is approved for commercial consumption, clinical phase IV trials serve as postmarketing surveillance trials. These trials have various objectives at the commercial level, i.e., to compare the newly approved drug with drugs already available on the market, to evaluate chronic effects on patients' quality of life, and to compare the economics of newly approved drugs with those of existing drugs and of traditional systems of medication [81].

Conclusion and future perspectives
Chemical biology is an emerging field of science that focuses on research in biological systems by employing chemicals and related chemoinformatic tools. The field works well in combination with medicinal and combinatorial chemistry to seek cures for incurable and life-threatening human pathologies. This chapter illustrated the significant techniques and chemical setups that can be employed to test the chemical as well as biological aspects of natural and synthetic compounds before they are introduced as therapeutic drugs in the field of medicine. There is a pressing need for newer and better drugs that are safer, cheaper, and more effective than existing therapeutics. The field is flourishing at a very fast pace, and it is anticipated that it will provide better treatment options and strategies in the future, allowing medical practitioners to use the best of the drugs discovered.

Conflict of interest
Authors have no conflict of interest.
© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.