Since the discovery of Electron Paramagnetic Resonance (EPR) spectroscopy in 1944 by Zavoisky, and the realization of the Nuclear Magnetic Resonance (NMR) spectroscopic signal in the mid 1940’s by Bloch and Purcell[2,3], the capabilities and applications of the technology have continued to advance at an enormous rate, particularly after the implementation of Fourier transform NMR in the mid 1960’s by Ernst. Magnetic Resonance (MR) spectroscopy was initially utilized to characterize the structure of matter.[1,5] Through the early to mid 1970’s, the development of multidimensional (nD) methods and more powerful instruments opened the door for the detailed atomistic characterization of small molecules, culminating in the structural elucidation of proteins by the mid 1980’s. At about the same time, it was proposed that a magnetic field gradient could be applied to obtain a 3-dimensional (3D) image, leading to the invention of nuclear Magnetic Resonance Imaging (MRI) with, among other capabilities, the potential to monitor the bio-distribution and bio-accumulation of molecules in vivo. Since the 1980’s, MR technologies have been combined with other technologies and have evolved to play an integral role in advancing pharmaceuticals, becoming indispensable tools for drug discovery, design and diagnostics.
Early on it was recognized (see Ref. 8 and references therein) that MR techniques offer a variety of unique advantages over other spectroscopic techniques; for example, MR is completely non-destructive and non-invasive. Thus, MR technologies can be utilized with inanimate samples or living organisms with no obvious detrimental or destructive effects. In addition, MR techniques can be applied to a variety of states of matter including solutions, semi-solids, solids and mixtures, yielding comprehensive information on chemical and physical properties. Beyond the typical static structural information, one can also detail dynamic processes. NMR measurements provide information about dynamic processes with rates in the range from 10⁻² to 10⁻¹⁰ sec⁻¹. Furthermore, many nuclei possess magnetic moments, and with the availability of more sensitive spectrometers, chemists are beginning to take greater advantage of the technique for structure/bonding information on organometallic compounds (for example see Ref. 9).
An important application, although commonly overlooked, is the accurate quantitative information that can be obtained without the need for laborious calibrations. Under quantitative conditions and for all practical purposes with semi-solid or solution state samples, NMR spectroscopy has the unique distinction of having a uniform molar response for all nuclei of the same type i.e. all 1H nuclei have the same integrated intensity and thus, a single calibrated (internal or more significantly external) standard can be used for accurate quantitation. For the aforementioned reasons NMR is a valuable tool for providing atomistic structural, dynamic and quantitative information on natural products such as small molecules, metabolites, peptides, proteins, complex mixtures, and molecular assemblies such as lipid bilayers or tissues.
Nuclear MRI, on the other hand, can provide 3D images of macroscopic matter and monitor the bio-accumulation and bio-distribution of MRI-tagged natural products in vivo. Ultimately, MR technologies can be used at almost every stage along the natural product discovery pipeline: from discovery to implementation, from molecules to medicine.
1.2. Scope and limitations
MR technologies encompass a range of techniques including electron or nuclear MR spectroscopy, MR time domain, and nuclear or electron MRI. This chapter focuses on the nuclear MR technologies of spectroscopy and imaging for solution and semi-solid states. We provide a general overview of techniques and methodologies applicable throughout the development pipeline for natural products, as well as some potential impacts the information has for product development. It is well beyond the scope of a chapter (or in fact an entire book) to provide a comprehensive description of all applicable MR methodologies. Thus, within each section the reader is directed to relevant review articles and books.
2. NMR spectroscopy
NMR is a technique that detects electrical currents induced by precessing nuclear magnetic moments within a uniform static magnetic field. Nuclei with non-zero spin moments are MR active and in principle are detectable. Each individual type of spin-active nucleus has a unique precessional frequency dependent upon the strength of the static magnetic field, the magnetic properties of the isotope and the local electronic environment of the nucleus. The general precessional frequency is dependent upon the type of nucleus, and thus NMR can readily distinguish among, for example, 13C, 1H, 2H or 3H. The applications and significance of NMR have exploded because the exact precessional frequency (i.e. the chemical shift) within a group of the same nuclei is influenced by the local electronic environment of the nuclei. Thus, NMR can readily distinguish (for example) 1H nuclei that are chemically bound to different nuclei (e.g. carbon vs. nitrogen, etc.), that are chemically bound to different oxidation states of the same nucleus (e.g. methyl vs. methylene carbon), and/or that are in identical bonding environments (e.g. methyl 1H nuclei) but in different electronic environments induced by surrounding functional groups (e.g. aromatic vs. carbonyl groups). In addition, if a nucleus is influenced by another spin-active nucleus, either through a bond connection or through spatial proximity, a correlation exists that may be NMR detectable. In this way atomistic properties (such as the 3D spatial arrangement of nuclei and dynamics) can be determined.
In addition to the distinct nuclear chemical shift, data from MR can be further separated based upon relaxation and/or diffusion properties of a nucleus or molecule.[12,13] Thus, MR technologies can discriminate among large molecules like peptides, proteins and macromolecular assemblies, and small molecules like metabolites or synthetic organic molecules. The relaxation time (influenced by the rotational correlation time and molecular fluctuations) of a molecule plays an important role in distinguishing among small drug molecules and large proteins, or between a single lipid molecule that behaves as a small molecule and an assembly of lipid molecules that, as a collective, behave as large molecules.
One drawback is that under typical conditions MR techniques are sample intensive, requiring µM to mM concentrations that translate to µg to g quantities of material. For natural product discovery, this requirement can be an issue as extracts may contain only nanogram to microgram quantities of material. A number of methods have been proposed to overcome the mass demand, the most significant for general applications being the invention of cryogenic helium-cooled detection systems that substantially reduce thermal noise, ultimately improving the signal-to-noise ratio by up to 10-fold and reducing data acquisition times by up to 100-fold.[15,16] A next-generation improvement is cryogenically cooled probes that require smaller sample volumes. Currently, the combination of the 700 MHz NMR spectrometer and new detection technologies requiring only 35 µl of sample affords the Biomolecular Magnetic Resonance Facility at NRC-Halifax one of the world’s most sensitive instruments for mass-limited samples, reducing the typical quantities by up to 50 times. The limits of detection for this instrument can be as low as 10 nanograms for small molecules (IWB, NM, TK and RTS unpublished).
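The link between a 10-fold sensitivity gain and a 100-fold time saving can be made explicit with a short calculation. The sketch below assumes the usual signal-averaging behaviour, in which the signal-to-noise ratio grows with the square root of the number of co-added scans; the function names and the per-scan values are hypothetical and serve only as an illustration.

```python
import math


def time_reduction_factor(snr_gain: float) -> float:
    """Acquisition-time reduction at equal final S/N for a given sensitivity gain.

    Assumes S/N accumulates as sqrt(number of scans), so the number of scans
    (and hence the experiment time) needed scales as 1 / gain**2.
    """
    return snr_gain ** 2


def scans_needed(target_snr: float, snr_per_scan: float) -> int:
    """Number of co-added scans needed to reach target_snr."""
    return math.ceil((target_snr / snr_per_scan) ** 2)


if __name__ == "__main__":
    # Hypothetical numbers: a conventional probe giving S/N = 0.5 per scan
    # versus a cooled probe with a 10-fold gain (S/N = 5 per scan).
    print(time_reduction_factor(10.0))
    print(scans_needed(50.0, 0.5), scans_needed(50.0, 5.0))
```

Under this assumption the same target S/N needs 100 times fewer scans on the more sensitive probe, which is where the quoted time saving comes from.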
Although the sample intensive nature of NMR can be addressed to some extent, a second drawback for larger proteins and macromolecular assemblies (>40 kDa) is the loss of peak resolution due to spectral overlap, broad line-widths, reduced signal-to-noise ratios and increased spectral complexity. There have been efforts to address this issue; however, these efforts are limited in scope and application.[18-20]
2.1. Structure elucidation
After a natural product or extract has been verified to be biologically active, an essential component within the discovery pipeline is to identify compound(s) and determine structure(s). Structural elucidation is essential if chemical modifications are to be made, if the product is for human consumption and/or if a patent application is to be filed as it will distinguish the uniqueness of the compound as well as help identify relationships with pre-existing compounds. Structural characterization is somewhat different between small and large molecules; the distinction between the two regimes is defined by the Nuclear Overhauser Effect (nOe) cross-relaxation rate which is positive or negative depending upon the spectrometer frequency and the overall molecular tumbling time. Generally, “small molecules” are regarded as molecules that do not aggregate and have a molecular mass of <1 000 atomic mass units.
2.1.1. Small molecules
The advent of nD experiments propelled NMR to be a leading tool for natural product characterization. Previously natural products were degraded into fragments, chemically derivatized and/or completely synthesised to confirm the structure. It is still valuable for structure elucidation using NMR to obtain information from the aforementioned techniques as well as other techniques such as mass spectrometry (MS; for exact mass, functional groups and connectivities), infra-red spectroscopy (for functional groups), and separation techniques (for classes of compound e.g. phenolic, steroid, protein, etc.).
The initial NMR spectroscopic assessment, as outlined in Scheme 1, typically begins with 1-dimensional (1D) 1H spectra to determine purity, confirm the compound class, and examine the general appearance of the peaks. A spectrum with sharp, well-resolved peaks and the anticipated ratio of integrated intensities is indicative of a pure sample dissolved in an appropriate solvent. Broad peaks or peaks with fractional integral ratios could indicate an impure sample; however, they could also be an indication of chemical exchange or limited solubility. The preliminary information gained from other techniques is important when ascertaining if the spectral appearance is appropriate. From 1D data, the splitting patterns arising from J-couplings (i.e. scalar couplings) provide information on the pattern of covalent bonding as well as the torsional angle distributions between spin-active nuclei three bonds apart.
1H detected 2-dimensional (2D) experiments in which magnetic coherence is propagated through J-couplings or magnetization is transferred through dipole-dipole cross-relaxation interactions, reduce the overlap complexity of 1D spectra and provide correlations to other 1H nuclei or heteronuclei most commonly 13C or 15N. Common homonuclear 1H-1H 2D experiments based on J-couplings are TOtal Correlation SpectroscopY (TOCSY), and COrrelation SpectroscopY (COSY). Both of these experiments provide information on individual spin systems and chemical bonding. COSY experiments are used to connect 1H nuclei that are within 3-bonds of each other whereas TOCSY experiments can connect all spins belonging to a J-coupled network e.g. the entire spin system connected through 3-bond correlations. Analysis of COSY data can provide J-coupling constants which can be related via the Karplus curve to torsional angle restraints.
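The relation between a three-bond J-coupling and a torsion angle is usually written in the Karplus form J(θ) = A cos²θ + B cos θ + C. The sketch below evaluates that form and scans for angles compatible with a measured coupling. The coefficients A, B and C depend on the type of coupling and on the parameterization chosen; the default values used here are placeholders for illustration and are not taken from this chapter.

```python
import math


def karplus_j(theta_deg: float, a: float = 7.0, b: float = -1.0, c: float = 1.5) -> float:
    """Three-bond coupling constant (Hz) from a Karplus-type relation.

    J(theta) = a*cos^2(theta) + b*cos(theta) + c
    The default a, b, c are placeholders for illustration only.
    """
    cos_t = math.cos(math.radians(theta_deg))
    return a * cos_t ** 2 + b * cos_t + c


def candidate_angles(j_obs: float, tol: float = 0.5, **coeffs: float) -> list[int]:
    """Whole-degree torsion angles in [-180, 180) whose predicted J lies within tol of j_obs."""
    return [t for t in range(-180, 180) if abs(karplus_j(t, **coeffs) - j_obs) <= tol]


if __name__ == "__main__":
    # A single measured coupling is generally compatible with several angles.
    print(candidate_angles(4.0, tol=0.2))
```

Because the curve is not single-valued, a measured coupling typically yields a set of allowed angle ranges rather than one angle, which is why such values are used as restraints alongside other data.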
Homonuclear 2D 1H-1H nOe SpectroscopY (NOESY) and Rotating frame Overhauser Effect SpectroscopY (ROESY) experiments are based on dipolar cross-relaxation interactions, providing distance information between nuclei that are physically close (up to ≈5 Å) in space. It is noteworthy that for NOESY and ROESY spectra to show correlations, nuclei do not have to be on the same molecule. This aspect of the nOe provides the basis for determining ligand/receptor interaction characteristics (see Section 2.2). The sign and intensity of NOESY cross-peaks are dependent upon the main static magnetic field (through the Larmor frequency, ω0 = γB0) and the rotational correlation time of the molecule (τc); for small molecules (<1 kDa) the nOe cross-relaxation rate is positive whereas for larger molecules (>2 kDa) the nOe is negative. ROESY experiments are best suited for medium-sized molecules of ~1 kDa, for which the NOESY nOe approaches zero (i.e. ω0τc ≈ 1.12). Analysis of the NOESY and/or ROESY data is important for determining the configuration/conformation of the compound and for connecting individual spin systems determined from the TOCSY and/or COSY data.
Another aspect of nD NMR techniques is the addition of 13C editing to the spectra. These heteronuclear experiments are 1H detected, which increases the sensitivity and indirectly provides 13C shifts, a feature that is especially important for mass-limited samples. Standard heteronuclear experiments are 1H-13C-HSQC[27-29], 1H-13C-HMBC, and 1H-13C-H2BC. Strategies for selecting the proper pulse sequences, acquisition and processing parameters for natural product elucidation have been previously reviewed. Implementing higher-dimensional experiments, for example HSQC-TOCSY or HSQC-NOESY, provides valuable information for complex natural products on the through-bond or through-space connections by exploiting the heteronuclear chemical shift for further separation.
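The NOESY zero crossing at ω0τc ≈ 1.12 quoted in the discussion of NOESY and ROESY can be reproduced with a few lines of arithmetic. The sketch below assumes a homonuclear two-spin system tumbling isotropically, for which the cross-relaxation rate is proportional to 6J(2ω0) − J(0) with a Lorentzian spectral density; under that assumption the rate vanishes at ω0τc = √5/2. The function names are hypothetical.

```python
import math

# omega0 * tau_c at which the NOESY nOe vanishes (about 1.12)
ZERO_CROSSING = math.sqrt(5.0) / 2.0


def relative_cross_relaxation(omega0_tau_c: float) -> float:
    """Homonuclear cross-relaxation rate divided by a positive constant and by tau_c.

    Uses 6*J(2*omega0) - J(0) with J(w) proportional to tau_c / (1 + (w*tau_c)**2).
    Positive values correspond to the small-molecule regime described in the text.
    """
    return 6.0 / (1.0 + 4.0 * omega0_tau_c ** 2) - 1.0


def tau_c_at_zero_crossing(proton_frequency_hz: float) -> float:
    """Rotational correlation time (s) at which the NOESY nOe is zero."""
    omega0 = 2.0 * math.pi * proton_frequency_hz  # Larmor frequency in rad/s
    return ZERO_CROSSING / omega0


if __name__ == "__main__":
    # At a 1H frequency of 700 MHz the zero crossing falls near 0.25 ns.
    print(tau_c_at_zero_crossing(700e6))
```

Because the crossing point scales inversely with the spectrometer frequency, a molecule that gives near-zero NOESY cross-peaks at one field may give usable ones at another, while ROESY avoids the problem.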
When assessing the structure of a chemically modified molecule or a molecule for which minor structural changes are suspected, acquiring a complete structural suite of experiments may not be necessary. In such circumstances, a series of edited 1D and 2D experiments have been developed that can isolate the chemical modification of interest and express only correlations to the modification. Isolating a particular peak of interest reduces the time required for data acquisition, simplifies analysis and can help to quickly confirm modifications; valuable tools for isolating information from complex molecules are reviewed in Ref. 33.
A standard approach for small molecule structure elucidation involves identification of the individual fragments or spin systems followed by their assembly. This approach, outlined in Scheme 1, uses 1D data to assess purity, classify the compound type and compare with NMR chemical shift databases. Analysis of the 2D homonuclear data (COSY and TOCSY) identifies the individual short spin systems in the compound. Heteronuclear HSQC data provide 13C chemical shift information and direct H-C links. HMBC data connect distant H-C spin systems, which helps link molecular fragments. Data from the NOESY and ROESY spectra also aid in linking spin systems and in determining relative configuration and conformation, for example relative stereochemistry, ring junctions, and double bond regiochemistry. The final step is confirming that the proposed shift assignments and structural characteristics agree with the coupling constants and splitting patterns among spectra, along with other data collected. There are numerous books detailing the specifics for analyzing NMR data of small molecules; see for example Refs. 35-38.
Scheme 1. The flowchart can be utilized as a general scheme for small natural product structural elucidation. Typically, a series of 1H and 13C NMR experiments is required in order to fully confirm the structure.
2.1.2. Proteins & peptides
MR allows for structural characterization of moderately sized proteins or peptides.[12,39] Since the line width of the NMR signal depends upon the rotational correlation time τc, the resultant signal intensity reduction and decreased spectral resolution typically preclude detailed analysis of large proteins. To date the largest protein to be structurally characterized by NMR is the 82 kDa (723 amino acid) malate synthase G. In contrast, X-ray techniques in principle have no size limit; however, not all proteins are amenable to crystallization (especially membrane-associated proteins), and crystallization can alter the protein structure, making NMR attractive for elucidating structures in “native” solution environments. “Non-native” conditions can also be applicable for studying protein folding/unfolding or temperature stability. Initial investigation into biomolecular structure elucidation typically requires information from other techniques such as MS, circular dichroism or micro-arrays for initial determination of the amino acid sequence. A general scheme for the elucidation of a protein 3D structure (Scheme 2) involves initial production of the protein, either by synthesis and refolding or by recombinant expression, followed by data acquisition and analysis. From the analysis, through-bond backbone and side-chain connections, through-space nOe connections and additional constraints are utilized as restraints within simulated annealing protocols that calculate superimposable ensembles of lowest-energy structures.
NMR is a sample intensive technique requiring milligrams of the biomolecule. For small peptides, synthesis (mg quantities) is typical and offers the possibility of selective isotopic labelling with the preferred spin-active nuclei 13C (over the 98% naturally abundant, spin-inactive 12C) and/or 15N (over the 99% naturally abundant, quadrupolar and difficult to observe 14N). For large proteins, however, synthesis is too onerous and costly. Therefore, development of a recombinant expression protocol capable of producing bacterial, mammalian or other proteins with the correct folding and linkages (in the case of lipoproteins or polysaccharide proteins) is required. In addition, expression systems allow for point mutations. For proteins >50 amino acids it is advantageous to label the protein with the isotopes 13C and 15N. In contrast, structure elucidation of small monomeric peptides of <50 amino acids does not necessarily require labelling, and the experiments and strategies outlined in Section 2.1.1 can be utilized. Isotopically labelled proteins can be obtained by growing E. coli carrying the particular expression gene on minimal media supplemented with 13C6-glucose and/or 15N ammonium chloride. Although E. coli is widely used for its low cost, high yields depend upon the plasmid, host and tags. In addition, with prokaryotic systems protein re-folding can be a potential bottleneck, along with failure to express toxic proteins, degradation and the absence of post-translational modification. Other non-E. coli prokaryotic and eukaryotic cell-lines that have been used for labelling, such as insect cells, are reviewed in Ref. 45. The eukaryote Pichia pastoris has been widely and successfully used for labelling using minimal media, and cell-free expression allows for high yields with relatively small reaction volumes. With cell-free expression, any combination of labelled and unlabelled amino acids can be incorporated into the protein without isotopic scrambling.
An advantage of the cell-free system is that reagents that stabilize expression can be added, for example protease inhibitors, detergents or membrane mimetics for insoluble or membrane-associated proteins. However, each amino acid must be added to the medium, which can become costly for uniform 15N and/or 13C labelling. Overall, the particular expression system used depends upon such factors as post-translational modifications, the labelling scheme, membrane association, total cost and expression efficiency.
Over the past decade specialized isotope labelling strategies have been developed for biomolecular NMR, including full or partial deuteration, specific amino acid labelling and regiospecific labelling. Labelling proteins with deuterium is practical for simplifying complex NMR spectra or for studying protein-substrate complexes. Furthermore, as the molecular weight of the biomolecule increases, the spin-spin relaxation time (T2) decreases considerably, inhibiting coherence transfer along amino acid side-chains. Substitution of 2H for 1H nuclei increases T2, enhancing the coherence transfer. Perdeuteration of the backbone prevents connection to the side-chain 1H nuclei, whereas a random, uniform deuteration level between 50% and 90% is most desirable.[50,51] Nevertheless, perdeuteration is beneficial for studying protein-substrate or protein-protein complexes in which one portion of the complex is “invisible”, reducing the overlap and spectral complexity and allowing information on conformational changes to be more readily identified. In the pursuit of NMR information on larger proteins, labelling strategies have been developed for selective methyl labelling of alanine, leucine, valine and isoleucine (Hγ1) residues with perdeuteration of the backbone. This labelling scheme allowed for the analysis of relaxation dynamics of a 1 MDa protein complex.[52,53] Site specific information on protein conformational changes upon substrate binding can be obtained from selective amino acid labelling of the backbone. Within this protocol, media is supplemented with isotopically labelled amino acids; isotopic scrambling may result in instances where the supplemented amino acids are precursors to other amino acids. Scrambling of the isotope labels can be overcome by using E. coli strains with lesions in their biosynthetic pathways, and this has recently also been demonstrated with a prototrophic strain.
Contrary to selective amino acid labelling it has been proposed that unlabelling specific amino acids against a uniformly 13C/15N labelled background is beneficial. Selective unlabelling of the protein still allows for sequential assignment of regions of the protein. Segmental labelling of individual domains or portions of large proteins has been established in which labelled segments of the protein are ligated together.[57,58] Segmental labelling of a large protein is useful for studying domain-domain interactions, conformational changes and substrate binding studies.
Preparation of protein samples is straightforward for soluble monomeric proteins at concentrations >1 mM. A wide range of deuterated buffers are available for controlling the pH of the sample. Deuterated or inorganic buffers are desirable to minimize interference within 1H spectra. The pH requires consideration, as at pH > 8.0 labile amide 1H nuclei can rapidly exchange with water, becoming invisible. This can be utilized to reduce spectral complexity; however, it may also affect residues of interest. Approximately one-third of human genes code for membrane-associated proteins, and utilizing NMR to study the structure/function relationship of these proteins requires the protein to be folded in a membrane environment. A range of membrane mimetics are available for solution and solid state NMR.
Scheme 2. The flowchart can be utilized as a general scheme for protein structural elucidation. Successful structural elucidation for proteins relies on elaborate but well established 1H, 15N and 13C NMR experiments.
With the advent of 2D NMR experiments, a strategy for the 3D structure determination of small proteins utilizing homonuclear NMR spectra was established. Extension of the 2D NMR experiments to nD experiments, along with isotopic labelling, allowed for the 3D structure determination of much larger proteins including membrane proteins.[12,62] 3D and 4D NMR techniques allow NMR data to be filtered by the 13C or 15N nuclei, thus reducing spectral overlap, which is especially important for large proteins. In the past decade, the acquisition of data on larger proteins has been facilitated by the development of Transverse Relaxation Optimized SpectroscopY (TROSY) experiments.[63,64] NMR structural and dynamic analysis of a supramolecular system of 1 MDa has been achieved in combination with selective methyl labelling.[65,66]
Typical NMR structure determination involves manual assignment of the backbone and side-chain chemical shifts. Extensive assignment of the 1H-1H nOe from NOESY experiments yields distance restraints, as the volumes of the NOESY peaks are proportional to the average of 1/r⁶, where r is the distance between the 1H nuclei. More recently routines for automatic resonance and nOe assignments have been developed.[67,68] However, these methods require labelled proteins and high-quality data to be effective. Regardless, protein structures are calculated using a molecular dynamics computer simulation program with predefined a priori bond connectivities, lengths and angles, and NMR derived restraints.
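The 1/r⁶ dependence of the NOESY peak volume is what allows volumes to be converted into distance restraints. A minimal sketch of one common way to do this is given below: a cross-peak between two 1H nuclei at a known, fixed separation is used as a reference, and other distances are scaled from it. The sketch assumes an isolated spin pair, a short mixing time and similar motional behaviour for all pairs; the function name and the reference distance in the example are hypothetical.

```python
def noe_distance(volume: float, ref_volume: float, ref_distance: float) -> float:
    """Distance (same unit as ref_distance) from a NOESY cross-peak volume.

    Assumes the volume is proportional to r**-6 (isolated spin pair, short
    mixing time) and that the reference pair shares the same motional behaviour.
    """
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("peak volumes must be positive")
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


if __name__ == "__main__":
    # Illustrative values only: a reference pair at 1.8 Angstrom and a
    # cross-peak that is 64 times weaker than the reference.
    print(noe_distance(volume=1.0, ref_volume=64.0, ref_distance=1.8))
```

The sixth-root dependence means that even large errors in the measured volume translate into modest errors in distance, which is one reason restraints are often binned into broad distance ranges rather than used as exact values.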
NMR restraints are most commonly derived from nOe experiments but may also include dihedral angles, hydrogen bonding information and residual dipolar couplings. Dihedral angles can be calculated from measured/fitted J-couplings or predicted from backbone chemical shifts of the 1Hα, 13Cα, 13Cβ, 13CO and 15N resonances.[42,71] 1H exchange rates are reduced in structured protein domains and can also be used to identify binding domains. Residual dipolar couplings are valuable for identifying angular constraints for large domains and structural changes upon substrate binding.[72,73]
An ensemble of lowest energy structures that satisfy the NMR derived restraints is calculated. The quality of the calculated structures is judged by how consistently they agree with the input restraints derived from the experimental data. Agreement among the structures is evaluated by the root-mean-square deviation (RMSD) from the lowest energy structure. In addition, NMR quality assessment scores based on recall, precision and F-measures (RPF scores) have been developed to directly measure the quality of structures against the NOESY peak list. Structural calculations are an iterative process, as not all restraints will be satisfied during the first simulated annealing calculation. It is typical for misassignments to occur due to spectral overlap or poor volume calculations. In principle the number of correctly assigned and integrated restraints should outweigh the number of incorrectly assigned restraints. Thus, during the calculations a set of restraints may be identified as being regularly violated. Once these are identified, the NMR data are re-examined and the restraints corrected. Calculations are re-run and violations checked iteratively until well-defined structures with minimal restraint violations are obtained.
Tertiary/quaternary structural aspects can be confirmed with NMR through diffusion experiments. NMR is one of the most accurate and precise methods for determining diffusion constants.[13,75] Diffusion constants are related to the hydrodynamic radius through the Stokes-Einstein equation and thus can be used indirectly to determine the mass of the diffusing species.[13,76] Of particular importance for proteins is determining the aggregation number.
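The Stokes-Einstein relation mentioned above, D = kBT / (6πηr), can be applied directly to a measured diffusion constant. The sketch below is a rough illustration under stated assumptions: a spherical particle, a dilute aqueous sample at 25 °C (viscosity taken as 0.89 mPa·s), and, for the aggregation estimate, compact species of equal density so that mass scales with the cube of the radius. The function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def diffusion_coefficient(radius_m: float, temperature_k: float = 298.15,
                          viscosity_pa_s: float = 8.9e-4) -> float:
    """Stokes-Einstein diffusion coefficient (m^2/s) of a sphere of given radius."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


def hydrodynamic_radius(d_m2_s: float, temperature_k: float = 298.15,
                        viscosity_pa_s: float = 8.9e-4) -> float:
    """Hydrodynamic radius (m) from a measured diffusion coefficient."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * d_m2_s)


def apparent_aggregation_number(d_monomer: float, d_aggregate: float) -> float:
    """Rough aggregation number, assuming compact spheres of equal density.

    With mass proportional to r**3 and D proportional to 1/r, the mass ratio
    is (D_monomer / D_aggregate)**3.
    """
    return (d_monomer / d_aggregate) ** 3


if __name__ == "__main__":
    d = diffusion_coefficient(1.0e-9)  # a sphere of 1 nm radius
    print(d, hydrodynamic_radius(d))
```

Because of the cube-root dependence of radius on mass, diffusion is relatively insensitive to small mass changes, so the estimate is most useful for distinguishing monomers from clearly larger aggregates.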
2.2. Pharmacophore identification & binding characterization
Natural products have a diverse range of mechanisms for eliciting a biological response. Some natural products act as free-radical scavengers never directly interacting with the organism, whereas other compounds bind to molecular targets triggering a signaling cascade and altering the physiological state. Determining the mode of action of a small natural product molecule and where necessary the biological target requires extensive micro-biological investigations. NMR can play a role within these investigations in particular by identifying the pharmacophore of the natural product. The pharmacophore is the constituent of the molecule that binds to a biological receptor to modify its biological response. Identifying the pharmacophore is an important aspect for drug discovery and understanding the mechanism of action as it assists with “intelligent” design of drugs through modifications that change binding characteristics (e.g. modifying the pharmacophore region) or solubility/permeability properties (e.g. modifying sites distant from the pharmacophore).[79,80]
The difference in the NMR nOe response between a small molecule rapidly tumbling in solution and a small molecule that is bound to a slowly tumbling large protein (see Section 2.1.1) is exploited to isolate and identify the pharmacophore (see chapter 14 of Ref. 81 and Refs. 82,83). In order to clearly define the pharmacophore, complete structural analysis of the molecule is required (see Section 2.1.1); in order to identify the active site within the receptor at the sequence level, complete structural analysis of the protein (preferably including 15N and 13C chemical shifts and connectivities) is required (see Section 2.1.2). A number of these techniques require mg quantities of purified ligand, receptor or both. The need for purified compounds can be a serious drawback, especially if the receptor is a membrane-bound protein that is difficult to express and purify. Nevertheless, if the receptor is highly over-expressed within a cell (e.g. a cancer cell that over-expresses a particular protein), the possibility exists for the experiment to be performed in vivo. Essentially, six fundamental methods are available for pharmacophore/binding characterization:[81,82] chemical-shift perturbations, saturation transfer difference (STD), WaterLOGSY (wLogsy)[87,88], transfer-NOESY (tr-NOESY), selective relaxation and diffusion editing. Selective relaxation, diffusion editing and tr-NOESY experiments in principle can be used for nM to mM binding constants (KD), whereas chemical shift perturbations, STD and wLogsy are valuable for pM to mM KD ranges with the concentration of receptor in the nM range. It is well beyond the scope of this chapter to describe these experiments in detail, especially since they can be combined to provide further characterization, such as combining diffusion editing with STD to simultaneously determine the pharmacophore and binding constant. There are numerous reviews that provide explicit details (see Refs. 82,92-94).
With these tools both the ligand and receptor can be characterized. Typically, chemical shift perturbation or mapping methods help to characterize the active site of the receptor. A series of 1H-15N or 1H-13C HSQC spectra of the labelled receptor are collected as the ligand is titrated. Changes in chemical shifts of the receptor are indicative of 1H nuclei that are perturbed during binding, although care must be taken not to over-interpret the data, as such changes could also be indicative of structural alterations distant from the binding site. In cases where the receptor is large (i.e. > 30 kDa), extensive resonance overlap may preclude unambiguous interpretation of the HSQC data. Expression techniques to isolate particular amino acids or regions of the receptor are valuable for these experiments (see Section 2.1.2). One disadvantage of this technique is the necessity of a complete resonance assignment of the target or at least of the active site.
Diffusion editing is a technique that can be used to determine the KD by examining the change in diffusion properties of the ligand (typically < 2 kDa) upon titration with the receptor (typically > 100 kDa). Although of limited use for pharmacophore and binding pocket identification, it is nevertheless a valuable tool for identifying binding events from a mixture of possible small ligands or for combining with other techniques.
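One simple way to see how a change in ligand diffusion leads to a KD is sketched below. It is an illustration under explicit assumptions that are not stated in this chapter: the ligand exchanges rapidly between free and bound states, so the observed diffusion coefficient is a population-weighted average, the bound ligand diffuses with the receptor, and binding is 1:1. The function names and all numerical values are hypothetical.

```python
def bound_fraction(d_obs: float, d_free: float, d_bound: float) -> float:
    """Fraction of ligand bound, from a population-weighted diffusion coefficient.

    Assumes fast exchange: d_obs = (1 - p) * d_free + p * d_bound.
    """
    return (d_free - d_obs) / (d_free - d_bound)


def dissociation_constant(p_bound: float, ligand_total: float, receptor_total: float) -> float:
    """KD (same concentration unit as the inputs) for 1:1 binding."""
    complex_conc = p_bound * ligand_total
    free_ligand = ligand_total - complex_conc
    free_receptor = receptor_total - complex_conc
    if complex_conc <= 0 or free_receptor <= 0:
        raise ValueError("bound fraction is inconsistent with the total concentrations")
    return free_ligand * free_receptor / complex_conc


if __name__ == "__main__":
    # Illustrative values: diffusion coefficients in m^2/s, concentrations in uM.
    p = bound_fraction(d_obs=3.3e-10, d_free=5.0e-10, d_bound=5.0e-11)
    print(p, dissociation_constant(p, ligand_total=100.0, receptor_total=100.0))
```

In practice a series of titration points would be fitted together rather than relying on a single measurement, and the result is most reliable when the bound fraction is neither very small nor close to one.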
The most often utilized techniques are the STD, wLogsy, tr-NOESY and selective relaxation experiments. These techniques are used when the target is too large for chemical shift perturbations, is not available with the desired isotopic labelling scheme, or aggregates/precipitates at high concentrations. For these techniques, non-specific binding events in the nM-µM range may be difficult to rule out unless compared with a known binder or used with a competitive binder. Regardless, these techniques are invaluable for specifically observing resonances of low-affinity ligands that bind to a receptor. With selective relaxation experiments, differences in relaxation properties of the ligand between a free and bound state help identify interacting 1H nuclei that are in close proximity to the receptor. The relaxation properties of the free ligand (i.e. no receptor) are compared to the relaxation properties of the ligand at various receptor titers and mixing times. Changes within the relaxation values can distinguish 1H nuclei that are in direct contact with the receptor from 1H nuclei that show magnetization relay or 1H nuclei distant from the receptor. For wLogsy experiments, 1H resonances arising from 1H nuclei in close proximity to the receptor (i.e. those that are part of the pharmacophore) are of opposite sign to 1H resonances arising from 1H nuclei that are distant from the receptor or are on a ligand that does not bind to the receptor (Fig. 1A). The wLogsy also has the advantage of identifying 1H nuclei that are part of salt bridges between the ligand and receptor. For STD experiments, small molecules that do not bind to the receptor show a zero response (Fig. 1B) whereas the 1H nuclei of the pharmacophore show a response. Because the wLogsy is a direct observation technique whereas the STD is generated from a difference of spectra, the wLogsy tends to have fewer artifacts and can be more sensitive.
The tr-NOESY experiments can provide structural information about the ligand in the bound state and, potentially, information on the types of amino acids on the receptor involved in binding; typically no information regarding the sequence specificity of the amino acids is gleaned. The tr-NOESY is advantageous if the ligand changes conformation upon binding, as it can be utilized to determine the bound state structure of the ligand, which is a valuable asset for understanding the mode of action. The tr-NOESY is a 2D technique and as such requires mg quantities of ligand and substantially more time to acquire the data. For rapid screening of natural products, the STD and wLogsy are the experiments of choice.
Beyond the individual experiments, combinations of these various experiments are possible. For example, to investigate the folding/unfolding properties of a peptide, the combination of a 1H-15N HSQC and wLogsy can be used to monitor amide 1H exchange rates, which are valuable for identifying H-bonding and buried residues.[77,97] This experimental combination can also be utilized with/without ligand to identify amide 1H nuclei that are involved in ligand binding. Using the wLogsy to saturate the water signal avoids the challenges of the typical methods of monitoring exchange by addition of D2O, such as protein precipitation, conformational changes induced by concentrating/diluting, or complete loss of signal due to rapid deuterium exchange.[77,98]
2.3. Quantitative analysis & QA/QC
Nuclear MR technologies have traditionally been associated with molecular characterization. The quantitative nature has been acknowledged from integration of signals to distinguish between, for example, methyl and methylene 1H nuclei; however, it has rarely been exploited for absolute quantitative analysis (qNMR) or quality assurance/quality control (QA/QC). MR techniques are capable of accurately and precisely determining the concentration of molecules within a purified sample or complex mixture without the need for elaborate calibrations. In addition, samples can be in solution or semi-solid states. Relatively simple protocols have been developed that use a single certified external standard to calibrate the instrument. From the calibrated system, other samples can be rapidly quantitated.[10,99]
MR technologies have the unique distinction of having a uniform molar response for all nuclei of the same type, i.e., the NMR signals are proportional to the molar concentration of the nuclei, allowing for a direct comparison of the concentration of all compounds within a mixture. Thus, for example, for all organic molecules regardless of the concentration, the intensity of each signal within the NMR spectrum is a direct measure of the number of 1H nuclei that contribute to that signal. Furthermore, spectra can be recorded in such a manner as to allow for accurate comparison between different samples within different sample tubes and different solvents; the implications are far-reaching for qNMR and QA/QC. For example, concentrations of natural products or impurities can be determined for samples within sealed tubes, reducing the handling requirements of toxic or precious samples, or crude or refined products can be rapidly profiled to ensure purity, integrity and consistency, with applications to fractionation or end-product QA/QC. Fully automated protocols have been developed and coupled with metabolomics investigations, providing absolute scaling for temporal data.
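The uniform molar response described above reduces quantitation against an external standard to a ratio of integrals normalized by the number of contributing nuclei. The following is a minimal sketch under the assumption that the analyte and standard spectra were acquired and processed under equivalent, fully relaxed conditions, so that the integrals are directly comparable; the function name and the example values are hypothetical.

```python
def qnmr_concentration(integral_analyte, nuclei_analyte,
                       integral_standard, nuclei_standard,
                       conc_standard):
    """Analyte concentration from an external standard of known concentration.

    Each integral is divided by the number of equivalent nuclei behind the
    signal, giving a per-nucleus response that is proportional to molar
    concentration. The result has the same unit as conc_standard.
    """
    if nuclei_analyte <= 0 or nuclei_standard <= 0:
        raise ValueError("number of nuclei must be positive")
    if integral_standard <= 0:
        raise ValueError("standard integral must be positive")
    per_nucleus_analyte = integral_analyte / nuclei_analyte
    per_nucleus_standard = integral_standard / nuclei_standard
    return conc_standard * per_nucleus_analyte / per_nucleus_standard


if __name__ == "__main__":
    # Illustrative values only: a methyl signal (3 1H) quantified against a
    # methylene signal (2 1H) of a 10 mM standard.
    conc = qnmr_concentration(6.0, 3, 2.0, 2, conc_standard=10.0)
    print(f"Analyte concentration: {conc:.1f} mM")
```

The same ratio applies to any nucleus type, provided the analyte and standard signals arise from the same nucleus.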
2.4. Metabolomics analysis
Analysis of metabolites and metabolic flux can help ascertain the effects of a particular natural product or extract on an organism. Metabolic analysis can be utilized while identifying biological activity in vitro, or during in vivo investigations. Perhaps one of the best practical definitions of metabolomics was offered by Oliver, who described it as an approach for simultaneously measuring the complete set of metabolites (low molecular weight intermediates) that are context dependent and which vary according to the physiological, developmental or pathological state of an organism. From the perspective of natural products, such a definition fits in the framework of extracts from plants, fungi and secretions from microorganisms such as bacteria. When metabolomics is viewed in terms of context dependence, only those metabolites that vary according to environmental, biochemical, and/or physiological fluctuations are important. In this regard, there are a vast number of metabolites present in botanical extracts and secretions from organisms that are useful for inducing biological responses in organisms including humans. For instance, ginseng has been argued to induce biochemical changes in humans that lead to anti-tumor, antioxidant, anti-fatigue and anti-stress activities, while Streptomyces coelicolor secretes therapeutic natural products during its quiescent growth phase.[102,103] The discussion within this section follows the aforementioned definition and is restricted to metabolomics applied to botanical natural products since it is one of the fastest growing subsections, established as “plant metabolomics.”[104-106] NMR is a suitable method for such analyses since it allows simultaneous detection of a diverse group of both primary metabolites (sugars, organic acids, amino acids, etc.) and secondary ones (flavonoids, alkaloids, triterpenes, etc.).
Nevertheless, the applications are extensible to the “animal kingdom” metabolomics since the underlying sample composition is similar and characterized by large heterogeneity exhibiting vast dynamic range in concentration.
The numerous advantages of NMR have been a major driver for developing NMR-based metabolomics technologies. NMR is ideal for resolving the complexity of metabolomics samples given that methods exist for compounds with nuclei such as 1H, 13C, 15N and 31P to provide spectral fingerprints with compound specificity and quantitative accuracy even within complex matrices. For example, from the un-annotated 1D 1H NMR spectra of seaweed extracts, visible differences in chemical composition are observed (Fig. 2). NMR signals have a uniform molar response (see Section 2.3), a property that has been particularly important in propelling NMR metabolomics as a technology. NMR is also non-destructive, which makes it ideally suited to these samples because many are difficult to obtain and may be precious. The stability of NMR instrumentation allows for repeated measurements (often years apart) of a sample with accurate reproducibility. This advantage also lends the technology to inter-laboratory studies that are important for establishing the robustness of a given measurement and technique. Advances in technology now allow for high throughput analysis with automated, temperature controlled sample changers such as Bruker’s SampleJet®.
The most commonly cited disadvantage of NMR for metabolomics is the lack of sensitivity. This is a major hindrance given that in typical sample matrices the constituent metabolites often exhibit a large dynamic range in concentration. Low abundance metabolites will invariably be overlapped by highly abundant ones. Furthermore, in some applications such low level metabolites may be of high value. For instance, a phytochemical preparation may exhibit activity in a biological readout, but that activity is induced by low level metabolites which are undetectable via NMR. Approaches for simplifying sample matrices have been developed in order to separate, for instance, lipophilic from hydrophilic metabolites. In addition, recent advances in NMR hardware have significantly improved sensitivity, especially with the advent of cryogenic probes and microprobes.
NMR-based metabolomics applications have been in the literature for over two decades,[108,109] but the technology only realized widespread acceptance and application in the later part of the last decade.[105,110-112] This shift is attributed to the utility of pattern recognition methods for analysis of multiple spectra, allowing the visualization of patterns corresponding to differences among samples and identification of the chemical shifts responsible for such differences. Specifically, principal component analysis (PCA) has been a significant driver as it allows patterns associated with the variability in the relative concentrations of metabolites to be assessed by the human eye, often in two dimensions. PCA assisted with the classification of different extracts of the seaweed Ascophyllum nodosum (Fig. 3). In recent years, many other pattern recognition methods (classified either as supervised or unsupervised) have been developed and applied to data. Unsupervised methods do not include a priori knowledge of the class membership of a given sample. Such methods include PCA, SIMCA, independent component analysis (ICA) and the so-called machine learning methods such as neural networks and self-organizing maps (SOM). On the other hand, supervised methods use information about the samples in order to build models that can later be used to predict the class to which an unknown sample belongs. Such methods include partial least squares discriminant analysis (PLS-DA).
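How PCA reduces a set of spectra to a pattern that can be inspected by eye can be sketched with a small, dependency-free example. It assumes each spectrum has already been reduced to a row of bin integrals (one row per sample, one column per chemical shift bin) and extracts only the first principal component by power iteration; this is an illustrative simplification, and in practice an established numerical library would normally be used. The function names and the data are hypothetical.

```python
import math


def mean_centre(rows):
    """Subtract the column mean (per spectral bin) from every sample."""
    n_samples = len(rows)
    means = [sum(column) / n_samples for column in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 via power iteration on X^T X."""
    centred = mean_centre(rows)
    n_bins = len(centred[0])
    # Non-uniform start vector; assumed not to be orthogonal to PC1.
    vec = [float(j + 1) for j in range(n_bins)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        scores = [sum(x * v for x, v in zip(row, vec)) for row in centred]
        new = [sum(s * row[j] for s, row in zip(scores, centred))
               for j in range(n_bins)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break  # no variance along this direction
        vec = [v / norm for v in new]
    scores = [sum(x * v for x, v in zip(row, vec)) for row in centred]
    return vec, scores


if __name__ == "__main__":
    # Illustrative bin integrals only: two samples each from two extract types.
    spectra = [
        [10.0, 1.0, 5.0],
        [11.0, 1.0, 5.0],
        [2.0, 8.0, 5.0],
        [1.0, 9.0, 5.0],
    ]
    loadings, scores = first_principal_component(spectra)
    print("loadings:", [round(v, 3) for v in loadings])
    print("scores:", [round(s, 3) for s in scores])
```

The scores give the position of each sample along the component, so samples of the same type cluster together, while the loadings indicate which bins (chemical shifts) drive the separation. The overall sign of a principal component is arbitrary.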
Ultimately, the impetus for natural product metabolomics analyses is the need for high throughput screening to determine biological activity and the requirements from the nutraceutical industry to purport therapeutic health benefits from unprocessed foods. Traditional approaches to the development of novel drugs are difficult, expensive and time consuming. It is estimated that over $800 million and an average of 14.2 years are spent before a novel drug application is approved. Because natural products are a rich source of lead compounds in the drug discovery pipeline, the metabolomics approach provides an avenue for a systematic characterization of complex mixtures such as phytochemical extracts, linking observations made via biological assays without the need for isolation. This promises to complement high throughput screening of compounds in order to shorten the drug discovery process. Indirectly, NMR has been used in metabolomics to measure the fate of consumed natural products and their effects on human physiology. One study has shown that the consumption of dark chocolate, for instance, affects energy homeostasis in humans. In another study, differences in metabolic profiles were observed in human urine following consumption of black compared to green tea, specifically increases in urinary hippuric acid and 1,3-dihydroxyphenyl-2-O-sulfate, which are end products of tea flavonoid degradation. Several other studies exist in the literature and have been reviewed, for instance, in Ref. 117.
Perhaps one of the biggest gaps in metabolomics developments for natural products is the lack of certified reference materials for quantification of analytes. Such reference materials would enhance product characterization and validation of biological observations, especially in studies of bioactivity assessment. Those studies are often inconsistent due to inadequate chemical characterization of complex botanical mixtures, making comparison of results across studies difficult. Fortunately, it is possible to determine the concentration of ‘active’ compounds via NMR using external standards, as long as the signals of the external standard and the compound of interest arise from the same type of nucleus (see Section 2.3).
2.5. Semi-solid state & macromolecular assemblies
In addition to studying soluble molecules or extracts dissolved in solution, it can be advantageous and/or necessary to study semi-solids such as intact tissues, cells, raw materials or product formulations. The metabolites or components remain in their native environment, i.e. potentially time-consuming and disruptive/degrading extractions or chemical modifications are not required in order to obtain valuable information. Semi-solid materials require specialized NMR probes to overcome severe spectral line broadening that results from restricted molecular mobility. To acquire a high-resolution spectrum of a semi-solid, High Resolution Magic Angle Spinning (HR-MAS) was developed in the late 1990’s as a hybrid between solid and solution state NMR. Similar to solid state NMR, spinning the sample at the “magic angle” (54.7°) to the applied magnetic field reduces line broadening effects. Spinning speeds are typically between 3 and 5 kHz, and cells remain intact and viable. HR-MAS requires <100 µL of the semi-solid. HR-MAS does not require the high powered pulses that solid state NMR requires, and many of the nD experiments that are utilized for structural biology can be applied, although stable isotope labelling is preferred for the less abundant 13C and 15N nuclei. HR-MAS can be applied to many natural product studies[118,120] and is commonly used to study metabolic changes in diseased and treated tissues,[121,122] metabolism, combinatorial chemistry and whole cells. Intensities in HR-MAS NMR spectra are dependent on the environment of the analyte, and as such molecules that are in rigid environments with completely restricted mobility are not detected. Macromolecular assemblies, on the other hand, can be examined for profiling and quantitation of algae lipid content and polysaccharides or metabolites.
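The value of 54.7° quoted for the magic angle follows from the orientation dependence of the anisotropic interactions that broaden the lines, which scale with the factor (3cos²θ − 1)/2; the magic angle is the angle at which this factor vanishes. The short sketch below computes that angle and the scaling factor. The function names are hypothetical, and the scaling is the standard second Legendre polynomial rather than anything specific to a given probe.

```python
import math


def magic_angle_degrees():
    """Angle between the rotor axis and the field where 3*cos^2(theta) - 1 = 0."""
    return math.degrees(math.acos(1.0 / math.sqrt(3.0)))


def anisotropy_scaling(theta_degrees):
    """Second Legendre polynomial P2(cos theta) = (3*cos^2(theta) - 1) / 2."""
    cosine = math.cos(math.radians(theta_degrees))
    return 0.5 * (3.0 * cosine * cosine - 1.0)


if __name__ == "__main__":
    angle = magic_angle_degrees()
    print(f"magic angle: {angle:.2f} degrees")
    print(f"scaling at the magic angle: {anisotropy_scaling(angle):.1e}")
    print(f"scaling along the field: {anisotropy_scaling(0.0):.1f}")
```

The scaling factor also shows why the rotor angle must be set accurately: any deviation from the magic angle reintroduces a residual fraction of the anisotropic broadening.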
MRI is a nuclear MR technique applicable to natural product development, in particular during the in vivo testing and diagnostic stages. MRI is capable of producing 3D images that can be used to monitor changes in brain activity in response to application of a natural product via fMRI techniques,[126-128] indirectly monitor the effects of a natural product on tumours, or directly monitor the bio-distribution and bio-accumulation of a natural product by tagging the compound with an MRI contrast agent.[130-133]
Direct monitoring of the natural product is one of the most typical methods for drug development. For direct monitoring, however, the molecules should be tagged with an MRI contrast agent, which contains a paramagnetic centre that causes 1H nuclei on water in close proximity to “relax” much faster than 1H nuclei distant from the paramagnetic centre. This rapid relaxation is exploited in MRI and is depicted as “dark spots” within the image, whereas unaffected water molecules remain as bright areas. In order to “tag” a natural product with an MRI contrast agent such as super-paramagnetic iron oxide (SPIO), complete structure and pharmacophore identification is valuable (see Section 2.2) as it allows one to pick a functional group distant from the active site that can be chemically coupled to the contrast agent. Many SPIO contrast agents are commercially available with a variety of functional groups amenable to chemical coupling under aqueous conditions.
Once the natural product is conjugated to the paramagnetic particle, it can be administered to an organism and imaged. The bio-distribution and bio-accumulation of the tagged molecule are monitored and compared against a control particle that does not contain the active natural product.[133,135-137] The difference in clearance time is considered confirmation that the molecule is associated with the tissue being examined. As an example, the peptide SOR-C27, a 27-amino-acid fragment of the paralytic natural product peptide SOR-54 from the northern short-tailed shrew (Blarina brevicauda), was found to bind the calcium ion channel TRPV6, which is highly over-expressed by breast, prostate and ovarian cancers.[138,139] The SOR-C27 peptide was chemically bonded to a maleimide functionalized SPIO particle through the sulfur centre of the Cys-14 residue. From MRI investigations on an ovarian cancer xenograft mouse model, the SPIO-peptide particle persisted at the tumour site 24 hours post injection whereas the control SPIO particle rapidly cleared. The persistence of the SPIO-peptide particle at the tumour site is confirmation that the peptide is associated with the tumour.
A comprehensive overview of MR techniques for natural product development is well beyond the scope of a single chapter or book. The fundamental experiments have been briefly outlined and the typical information that is gleaned from the experiments presented. Many other opportunities were not covered, such as determining equilibrium dissociation constants or the use of spin labels for lead drug optimization. MRI has tremendous potential for later stage in vivo applications within the drug development pipeline. Together, the MR technologies of NMR and MRI can cover the full range of natural product drug development from discovery through to clinical testing (Fig. 4).