Mixing Pharmacophore Modeling and Classical QSAR Analysis as Powerful Tool for Lead Discovery

Discovery of new bioactive leads for subsequent optimization into drugs is both time consuming and expensive process. Two main approaches are currently available for lead discovery, namely, high throughput (in vitro) screening and computer-aided virtual (in silico) screening. Normally, in silico techniques are implemented as pre-filters to enrich the success rates of high throughput screening campaigns.

challenging problem given the degree of conformational flexibility at the ligandmacromolecular level. [10][11][12] Although docking programs employ diverse methodologies to evaluate different ligand conformations within binding pockets, [13][14][15][16][17][18][19][20][21][22][23] conformational sampling must be guided by scoring function(s) to evaluate the fitness between the protein and the ligand. 4,[24][25][26][27][28][29] The final docked conformations are selected according to their scores. Unfortunately, the sheer complexity of the underlying ligand-receptor molecular interactions extremely complicate free energy calculations and undermine the ability of scoring functions to evaluate binding free energies correctly in order to rank different potential ligand-receptor complexes. 1,3,[5][6][7][8]10,30,31,39 In addition to deciding the optimal docking/scoring combination for a particular docking problem, the molecular modeler must decide whether to leave crystallographically explicit water molecules in the binding site or not prior to ligand docking. [32][33][34][35][36][37] Furthermore, the fact that crystallographic structures lack information on hydrogen atoms means that it should be appropriately assumed whether ionizable moieties embedded within the binding site exist in their ionized form or not. 36,38 Additional to the previous problems, use of single protein conformation for designing new ligands ignores important dynamic aspects of protein-ligand binding. In particular, the "induced fit'' effects are ignored. 40,41 Unfortunately, all current computational models directed towards assessing the flexibility of macromolecular binding sites (e.g., soft receptors 42,43 ; few critical rotable degrees of freedom in the receptor binding site [43][44][45][46][47][48] ; systematic conformer searches of amino acids' side chains at the binding site 49 ; molecular dynamics and free energy calculations conducted on flexible enzyme 50,51 ; use of multiple crystallographic receptor structures 43,52 ) suffer from two major drawbacks. Firstly, their computational cost, which reduces their effectiveness in virtual screening and fast docking, and secondly, their complete reliance on crystallographic structures.
The drawbacks of structure-based methods prompted us to introduce an interesting and novel ligand-based approach as a tool for characterizing binding sites' flexibilities. This approach ignores the protein template and focuses completely on the ligand side. It is carried out over two subsequent stages. Firstly, the pharmacophoric space of the targeted enzyme is extensively explored utilizing the three-dimensional Quantitative Structure Activity Relationship (3D-QSAR) software program CATALYST. The resulting binding models (hundreds) are then in allowed to compete within the context of classical quantitative structure-activity relationship analyses (QSAR) employing genetic algorithm (GA) and multiple linear regression (MLR) analyses. This process selects optimal combination of orthogonal pharmacophores that best explain the observed bioactivities, i.e., best possible QSAR model. Such combination of binding pharmacophores should correspond to accessible binding modes available for ligands within a particular binding pocket.
We previously reported the successful use of this combination to probe the induced fit flexibilities of activated factor X 53 and towards the discovery of new inhibitory leads against glycogen synthase kinase-3β, 54 bacterial MurF, 55 protein tyrosine phosphatase, 56 DPP IV, 57 hormone sensitive lipase, 58 β-secretase, 59 influenza neuraminidase, 60 cholesteryl ester transfer protein, 61 cycline dependent kinase, 62 Heat shock protein, 63 estrogen receptor β, 64 β-D-Glucosidase, 65 and β-D-Galactosidase. 66 The author intends in this chapter to discuss the basic theoretical principles of this successful ligand-based approach and to provide interested audiences with experimental details related to this approach.
The modeling process of this approach can be divided in the following steps:

Data mining and conformational coverage
Firstly, the literature is extensively surveyed to identify as many reported structurally diverse ligands against the selected target as possible. The collected compounds must satisfy two important prerequisites: (i) they should be all bioassayed by a single procedure. Consistency in bioassay is a major requirement for QSAR modeling as it is not possible to model bioactivity data generated via more than one bioassay procedure. (ii) They must exhibit wide bioactivity range, i.e., over > 4 logarthmic cycles.
Initially, the 2D structures of the inhibitors are imported into the modeling package (CATALYST) and converted automatically into plausible 3D single conformer representations and energy minimized to the closest local minimum. The resulting single conformer 3D structures are normally used as starting points for conformational analysis for pharmacophore modeling and in the determination of various molecular descriptors for QSAR modeling.
The conformational space of each ligand is extensively sampled usually utilizing the poling algorithm employed within the CONFIRM module of CATALYST. 67 Efficient conformational coverage guarantees minimum conformation-related noise during pharmacophore generation and validation stages because pharmacophore generation and pharmacophore-based search procedures are known for their sensitivity to inadequate conformational sampling within the training compounds. 60 The logarithm of measured IC 50 , EC 50 or Ki values are used in pharmacophore modeling and QSAR analysis, thus correlating the data linear to the free energy change.

The algorithm
Normally we implement the HYPOGEN module within CATALYST package to explore the pharmacophoric space of different ligands. CATALYST-HYPOGEN enables automatic pharmacophore construction by using a collection of at least 16 molecules with bioactivities spanning over 3.5 orders of magnitude. 67 CATALYST-HYPOGEN models drug-receptor interaction using information derived from the ligand structure. It identifies a 3D array of a maximum of five chemical features common to active training molecules, which provides a relative alignment for each input molecule consistent with their binding to a proposed common receptor site. The chemical features considered can be hydrogen bond donors and acceptors (HBDs and HBAs), aliphatic and aromatic hydrophobes (Hbic), positive and negative ionizable (PosIon and NegIon) groups and aromatic planes (RingArom). CATALYST pharmacophores have been used as 3D queries for database searching and in 3D-QSAR studies. [54][55][56][57][58][59][60][61][62][63][64][65][66] Although pharmacophore modeling employing HYPOGEN has been heavily reviewed in the literature, [68][69][70][71][72][73][74][75][76] a brief discussion of this algorithm is provided herein to allow better readability of the chapter.
HYPOGEN pharmacophore exploration proceeds through three successive phases: the constructive phase, subtractive phase and optimization phase. [67][68][69][70][71][72][73][74][75][76] During the constructive phase, CATALYST generates common conformational alignments among the most-active training compounds. Only molecular alignments based on a maximum of five chemical features are considered. The program identifies a particular compound as being within the most active category if it satisfies equation (1). 73 (MAct x UncMAct)  (Act / UncAct) > 0.0 (1) Where "MAct" is the activity of the most active compound in the training set, "Unc" is the uncertainty of the compounds and "Act" is the activity of the training compounds under question. However, if there are more than eight most-active inhibitors, only the top eight are used.
In the subsequent subtractive phase, CATALYST eliminates some hypotheses that fit inactive training compounds. A particular training compound is defined as being inactive if it satisfies equation (2): 67-76 However, in the optimization phase, CATALYST applies fine perturbations in the form of vectored feature rotation, adding new feature and/or removing a feature, to selected hypotheses that survived the subtractive phase, in an attempt to find new models of enhanced bioactivity/mapping correlation, i.e., improved 3D-QSAR properties. [67][68][69][70][71][72][73][74][75][76] Eventually, CATALYST selects the highest-ranking models (10 by default) and presents them as the optimal pharmacophore hypotheses resulting from the particular automatic modeling run.

Selection of training subsets
The fact that pharmacophore modeling requires limited number of carefully selected training compounds (from 16-45 compounds only) 67-76 that exhibit bioactivity variations attributable solely to the presence or absence of pharmacophoric features, i.e., not due to steric or electronic factors, makes it impossible to explore the pharmacophore space of large training sets in one shot (e.g., we normally collect more that 100 compounds), partly because CATALYST-HYPOGEN is not suited to handle large number of compounds and partly because pharmacophore modeling is generally confused by electronic and steric bioactivity modifying factors commonly encountered in SAR data. This dilemma prompted us to break compound lists into smaller training subsets compatible with pharmacophore modeling, i.e., of bioactivity variations attributable solely to the presence or absence of pharmacophoric features. Nevertheless, the basic problem in this approach is to identify a particular training set capable of representing the whole list of collected compounds. This problem can be very significant in cases of large SAR lists. We found that the best way to solve this problem is by exploring the pharmacophoric space of several carefully selected training www.intechopen.com subsets, i.e., from the whole list of collected compounds, followed by allowing the resulting pharmacophores to compete within the context of genetic function approximation-based quantitative structure-activity relationship (GFA-QSAR) analysis such that the best pharmacophore(s) capable of explaining bioactivity variations across the whole list of collected compounds is(are) selected. However, since pharmacophore models fail in explaining electronic and steric bioactivity-modulating effects, the GFA-QSAR process should be allowed to select other 2D physicochemical descriptors to complement the selected pharmacophore(s) (see below).
The training compounds in these subsets are selected in such away to guarantee maximal 3D diversity and continuous bioactivity spread over more than 3.5 logarithmic cycles. Moreover, training subsets are selected in such a way that their member compounds share certain apparent 3D SAR rules (by visual evaluation).
We usually give special emphasis to the 3D diversity of the most active compounds in each training subset because of their significant influence on the extent of the evaluated pharmacophoric space during the constructive phase of HYPOGEN algorithm. However, it must be mentioned that not all collected compounds are incorporated in the pharmacophore training subsets, in fact, compounds that exhibit limited diversity or significant bioactivitymodifying steric or electronic influences are excluded from the training subsets.

Modeling boundaries
HYPOGEN implements an optimization algorithm that evaluates large number of potential binding models for a particular target through fine perturbations to hypotheses that survived the constructive and subtractive phases of the modeling algorithm. 67-76 The extent of the evaluated pharmacophoric space is reflected by the configuration (Config.) cost calculated for each modeling run. It is generally recommended that the Config. cost of any HYPOGEN run not to exceed 17 (corresponding to 2 17 hypotheses to be assessed by HYPOGEN) to guarantee thorough analysis of all models. [71][72][73] The size of the investigated pharmacophoric space is a function of training compounds, selected input chemical features and other CATALYST control parameters. [67][68][69][70][71][72][73][74][75][76] We envisaged that restricting the size of explored pharmacophoric space should improve the efficiency of optimization via allowing efficient assessment of limited number of pharmacophoric models. On the other hand, extreme restrictions imposed on the evaluated pharmacophoric space might reduce the possibility of discovering optimal binding hypotheses, as they might occur outside the "boundaries" of the evaluated space.
Therefore, we normally explore the pharmacophoric space of targeted ligands under reasonably imposed "boundaries" through numerous HYPOGEN runs and employing several carefully selected training subsets.
Guided by our rationally restricted pharmacophoric exploration concept, we usually restrict HYPOGEN to explore pharmacophoric models incorporating limited number of features, e.g., from zero to one negative NegIon, or PosIon features or from zero to three HBA, Hbic, and RingArom features instead of the default range of zero to five. Furthermore, we normally instructed HYPOGEN to explore only 4-and 5-featured pharmacophores, i.e., ignore models of lesser number of features in order to further narrow the investigated pharmacophoric space and to better represent the diverse interactions between known ligands and binding pockets. In fact, three-and two-featured pharmacophores are rather promiscuous as 3D search queries and not adequate descriptions of ligand-receptor binding.

Assessment of generated pharmacophore models
When generating hypotheses, CATALYST attempts to minimize a cost function consisting of three terms: Weight cost, Error cost and Configuration cost. 67-76 Weight cost is a value that increases as the feature weight in a model deviates from an ideal value of 2. The deviation between the estimated activities of the training set and their experimentally determined values adds to the error cost. The activity of any compound can be estimated from a particular hypothesis through equation (3). 73 Log (Estimated Activity) = I + Fit Where, Σ mapped hypothesis features represents the number of pharmacophore features that successfully superimpose (i.e., map or overlap with) corresponding chemical moieties within the fitted compound, W is the weight of the corresponding hypothesis feature spheres. This value is fixed to 1.0 in CATALYST-generated models. disp is the distance between the center of a particular pharmacophoric sphere (feature centroid) and the center of the corresponding superimposed chemical moiety of the fitted compound; tol is the radius of the pharmacophoric feature sphere (known as Tolerance, equals to 1.6 Å by default). Σ (disp/tol) 2 is the summation of (disp/tol) 2 values for all pharmacophoric features that successfully superimpose corresponding chemical functionalities in the fitted compound. 67-76 The third cost term, i.e., the configuration cost, penalizes the complexity of the hypothesis. This is a fixed cost, which is equal to the entropy of the hypothesis space. The more the numbers of features (a maximum of five) in a generated hypothesis, the higher is the entropy with subsequent increase in this cost. The overall cost (total cost) of a hypothesis is calculated by summing over the three cost factors. However, error cost is the main contributor to the total cost.
CATALYST also calculates the cost of the null hypothesis, which presumes that there is no relationship in the data and that experimental activities are normally distributed about their mean. Accordingly, the greater the difference from the null hypothesis cost, the more likely that the hypothesis does not reflect a chance correlation. In a successful automatic modeling run, CATALYST ranks the generated models according to their total costs. 67-76 An additional approach to assess the quality of CATALYST-HYPOGEN pharmacophores is to cross-validate them using the Cat-Scramble module implemented in CATALYST. This validation procedure is based on Fisher's randomization test. 43 In this validation test, a 95% confidence level was selected, which instruct CATALYST to generate 19 random www.intechopen.com spreadsheets by the Cat-Scramble command. Subsequently, CATALYST-HYPOGEN is challenged to use these random spreadsheets to generate hypotheses using exactly the same features and parameters used in generating the initial unscrambled hypotheses. 67 Success in generating pharmacophores of comparable cost criteria to those produced by the original unscrambled data reduces the confidence in the training compounds and the unscrambled original pharmacophore models.
Eventually, the top 10 binding hypotheses (i.e., pharmacophores) from each automatic HYPOGEN run are automatically ranked according to their corresponding "total cost" values and presented as output of the HYPOGEN run.

Clustering of successful pharmacophore hypotheses
Because the number of generated pharmacophores during our pharmacophore exploration step is usually large (> 60 model) and they usually share several 3D features and properties (cost criteria, Cat.scramble confidence, etc …), we normally cluster the resulting models into limited number of groups  utilizing the hierarchical average linkage method available in CATALYST. The highest-ranking hypothesis within each cluster (i.e., of lowest cost or highest correlation with bioactivity of the whole collected list) is selected to represent the corresponding cluster in subsequent QSAR modeling.
Clustering aims at avoiding overloading genetic function approximation-multiple linear regression (GFA-MLR), implemented during QSAR modeling, with numerous independent variables, which may allow the emergence of less-than-optimal regression models.

QSAR modeling
Pharmacophoric hypotheses are important tools in drug design and discovery as they provide excellent insights into ligand-macromolecule recognition and they can be used to mine for new biologically interesting scaffolds. However, their predictive value as 3D-QSAR models is usually limited by steric shielding and bioactivity-enhancing or -reducing auxiliary groups. 76 This point combined with the fact that pharmacophore exploration usually furnish several binding hypotheses of comparable success criteria and 3D features prompt us to use classical QSAR analysis to search for optimal combination of pharmacophore(s) and other 2D descriptors capable of explaining bioactivity variation across the whole list of collected inhibitors. We normally employ genetic function approximation and multiple linear regression QSAR (GFA-MLR-QSAR) analysis to search for an optimal QSAR equation(s) using the logarithm of measured 1/IC 50 or 1/Ki values are as dependent variables (thus correlating the data linear to the free energy change).
GFA-MLR-QSAR selects optimal descriptor combinations based on the Darwinian concept of genetic evolution whereby the statistical criteria of regression models from different descriptor combinations (chromosomes) are employed as fitness criteria. 77 GFA-MLR-QSAR analysis is employed to explore various combinations of pharmacophores and other structural descriptors and to evaluate their statistical properties as predictive QSAR models.
Representative pharmacophore hypotheses (selected during the clustering stage) are fitted against all collected ligands and their fit values (determined by equation 4) are enrolled, together with other 2D and 1D structural descriptors, as independent variables (genes) in a cycle of GFA-MLR-QSAR analysis over thousands of iterations. 77

www.intechopen.com
Other structural descriptors include various simple and valence connectivity indices, electro-topological state indices and other molecular descriptors (e.g., logarithm of partition coefficient, polarizability, dipole moment, molecular volume, molecular weight, molecular surface area, etc.). 77 However, to assess the predictive power of the optimal QSAR models on external set of inhibitors, we usually randomly select around 20% of the collected ligands and employ them as external testing molecules for validating optimal QSAR model(s) (r 2 PRESS ). Moreover, all QSAR models are cross-validated automatically using the leave-one-out crossvalidation. 77 Emergence of two or more orthogonal pharmacophoric models in the optimal QSAR model suggests the existence of complementary two or more corresponding binding modes accessible to ligands within the binding pocket of target protein, i.e., one of the pharmacophores can optimally explain the bioactivities of some training inhibitors, while the others explain the remaining inhibitors. Such conclusions were reached about the binding pockets of several targets, e.g., factor Xa, GSK-3and Mur F. 57-63

Final validation of optimal QSAR model and associated pharmacophores
To establish the validity of optimal GFA-selected QSAR model and associated pharmacophore(s), we normally implement two validation methods: (1) Receiver-Operating Characteristic (ROC) curve analysis, and (2) Comparing QSAR-selected pharmacophore(s) with the corresponding binding site, however, this is only done upon having available crystallographic structure of the targeted receptor.

Receiver Operating Characteristic (ROC) curve analysis
In ROC analysis, the ability of a particular pharmacophore model to correctly classify a list of compounds as actives or inactives is indicated by the area under the curve (AUC) of the corresponding ROC as well as other parameters, namely, overall accuracy, overall specificity, overall true positive rate and overall false negative rate. [78][79] The testing list for ROC analyses are usually prepared as described by Verdonk and coworkers. 78 Briefly, decoy compounds are selected based on three basic one-dimensional (1D) properties that allow the assessment of distance (D) between two molecules (e.g., i and j): (1) the number of hydrogen-bond donors (NumHBD); (2) number of hydrogenbond acceptors (NumHBA) and (3) The minimum distances are then averaged over all active compounds (Dmin). Subsequently, for each active compound in the test set, around 40 decoys are randomly chosen from the ZINC database. 80 The decoys are selected in such a way that they did not exceed Dmin distance from their corresponding active compound.
To diversify active members in the list, we exclude any active compound having zero distance ( (, ) Di j ) from other active compound(s) in the test set.
The test set is then screened by each particular pharmacophore employing the "Best flexible search" option implemented in CATALYST, while the conformational spaces of the compounds are usually generated employing the "Fast conformation generation option" implemented in CATALYST. Compounds missing one or more features were discarded from the hit list. In-silico hits were scored employing their fit values as calculated by equation (4).
The ROC curve analysis describes the sensitivity (Se or true positive rate, equation 6) for any possible change in the number of selected compounds (n) as a function of (1-Sp). Sp is defined as specificity or true negative rate (equation 7). 79 where, TP is the number of active compounds captured by the virtual screening method (true positives), FN is the number of active compounds discarded by the virtual screening method, TN is the number of discarded decoys (presumably inactives), while FP is the number of captured decoys (presumably inactive). 79,81 If all molecules scored by a virtual screening (VS) protocol with sufficient discriminatory power are ranked according to their score (i.e., fit values), starting with the best-scored molecule and ending with the molecule that got the lowest score, most of the actives will have a higher score than the decoys. Since some of the actives will be scored lower than decoys, an overlap between the distribution of active molecules and decoys will occur, which will lead to the prediction of false positives and false negatives. 79,81 The selection of one score value as a threshold strongly influences the ratio of actives to decoys and therefore the validation of a VS method. The ROC curve method avoids the selection of a threshold by considering all Se and Sp pairs for each score threshold. 79,81 A ROC curve is plotted by setting the score of the active molecule as the first threshold. Afterwards, the number of decoys within this cutoff is counted and the corresponding Se and Sp pair is calculated. This calculation is repeated for the active molecule with the second highest score and so forth, until the scores of all actives are considered as selection thresholds.
The ROC curve representing ideal distributions, where no overlap between the scores of active molecules and decoys exists, proceeds from the origin to the upper-left corner until all the actives are retrieved and Se reaches the value of 1. In contrast to that, the ROC curve for a set of actives and decoys with randomly distributed scores tends towards the Se = 1-Sp line asymptotically with increasing number of actives and decoys. 79,81 The success of a particular virtual screening workflow can be judged from the following criteria: 1. Area under the ROC curve (AUC). 79,81 In an optimal ROC curve an AUC value of 1 is obtained; however, random distributions cause an AUC value of 0.5. Virtual screening that performs better than a random discrimination of actives and decoys retrieve an www.intechopen.com AUC value between 0.5 and 1, whereas an AUC value lower than 0.5 represents the unfavorable case of a virtual screening method that has a higher probability to assign the best scores to decoys than to actives. 79

In Silico screening
Eventually, optimal QSAR-selected pharmacophores are employed as 3D search queries against several electronic multiconformer structural databases (e.g. NCI 238,819 structures) using the "Best Flexible Database Search" option implemented within CATALYST. Compounds that have their chemical groups spatially overlap (map) with corresponding features of the particular pharmacophoric model are captured as hits. Hits are normally filtered based on Lipinski's and Veber's rules. 82,83 Surviving hits are then fitted against QSAR-selected pharmacophores and their fit values, together with other relevant molecular descriptors, are substituted in optimal QSAR equation to predict their bioactivities. The highest-ranking available hits are evaluated in vitro.
Usually, the acquired hits are screened at 10 M concentrations, subsequently; compounds of significant bioactivities at 10 M are further assessed to determine their IC 50 values.
It remains to be mentioned that although QSAR predictions are rather accurate with some hit compounds, experimental IC 50 values of other hits differ significantly from QSAR predictions. These errors appear are usually related to structural differences between training compounds used in QSAR and pharmacophore modeling compared to hit molecules. This discrepancy seems to limit the extrapolatory potential of the QSAR equation.

Conclusions
This chapter summarizes an interesting novel approach for the discovery of new bioactive leads by implementing a sequential process of pharmacophore modeling and QSAR analysis. This approach has been used for the discovery of potent inhibitors against at least a dozen enzymes and receptors.