Common iron-binding peptide monomers in known siderophores. Corresponding neutral-loss and fragment-ion searches can be used to mine LC-MS/MS datasets for siderophores. The mass for fragment ion searches can be obtained by adding a proton to the given neutral loss masses (+1.00783). Database searches in NORINE and METLIN were performed using a mass tolerance of ±0.005 Da. Corresponding compound structures are shown in Figure 7.
Rapid developments in tandem liquid chromatography-mass spectrometry (LC-MS/MS) have created wide interest in applications for the analysis of small molecule mixtures. MS/MS spectra can contain rich structural information, but because of the structural diversity of small molecules and different data acquisition methods, analysis algorithms and workflows frequently need to be tailored to individual research questions. This chapter shows how MATLAB can be used for LC-MS/MS-based structural characterization of small molecules. Starting with the import of raw data, approaches to visualization and the creation of graphical user interfaces (GUIs) for individual applications are demonstrated. A selection of frequently used algorithms for pre-processing and data analysis is reviewed in the context of their MATLAB implementation. The approaches are then tailored and applied to the analysis of iron-binding peptides (peptidic siderophores) by high-resolution LC-MS/MS. The method uses a database with siderophore structures to exploit prior knowledge about siderophore structural diversity for the interpretation of MS/MS spectra from known and new siderophores.
- small molecules
- fragmentation spectra
- liquid chromatography tandem mass spectrometry
- auto-convolution spectra
- molecular networks
- secondary metabolites
- nonribosomal peptides
Liquid chromatography-mass spectrometry (LC-MS) enables the analysis of complex mixtures of small molecules and is applied widely in diverse research areas, such as metabolomics analysis in biology and medicine [2, 3] and molecular characterization of samples in environmental chemistry and combinatorial chemistry, among many other applications. Recent advances in LC-MS instrumentation with respect to speed and sensitivity, coupled with improved computational methods to extract information from complex datasets, have translated into falling costs of analyses and have created wide interest in LC-MS applications. As instruments have become able to explore samples at very high sensitivity and resolution (e.g., nano-flow LC coupled to high-resolution MS detectors, such as Orbitraps or Q-TOFs), the computational analysis of the generated raw data has become more challenging. In untargeted, ‘discovery type’ LC-MS experiments, the analysis of the raw data may include the following steps [6–8]:
- Extraction of features from a raw LC-MS dataset. LC-MS features are defined by unique, characteristic combinations of retention time and mass-to-charge ratio. Associated with the chromatographic peak of a feature is a peak intensity or area, which serves as a relative measure of the abundance of the compound producing the feature. These parameters represent a fingerprint of compounds in a given sample.
- Comparison of corresponding features in different sample sets and evaluation of significant differences.
- Compound identification and structural characterization of unknown compounds of interest.
Soft ionization methods in LC-MS, most commonly electrospray ionization (ESI), generate mass spectra with minimal compound fragmentation and facilitate the extraction of LC-MS features associated with the intact molecular ion. Nevertheless, extraction of features for subsequent statistical analysis is non-trivial when complex mixtures of small molecules are analyzed. Fortunately, open-source (e.g., MZmine 2, XCMS2) and commercial (e.g., Agilent MassHunter, Waters ProGenesis QI, ABSciex XCMSPlus, ThermoFisher Scientific SIEVE) software tools for this task have been developed and have become increasingly powerful and user-friendly.
With the extraction of features from LC-MS data becoming more readily achievable, the identification or structural characterization of small molecules has become a major bottleneck [11, 12]. Insight into chemical sum formulas and chemical interaction with the LC stationary phase (e.g., hydrophobicity in reversed phase chromatography) may be gained from LC-MS data because it contains information about molecular masses, isotope patterns, and retention times. However, even at ultra-high mass accuracies (<1 ppm), it is usually not possible to assign unique sum formulas to chromatographic features from MS1 data. In addition, for any given sum formula, there are typically many theoretically possible isobaric compounds with different structures. Therefore, structural assignment of a compound is generally not possible based on MS1 data alone. Direct structural information can be obtained by measurement of fragmentation spectra (MS/MS, tandem MS, MS2, or, for multiple rounds of fragmentation, MSn). LC-MS/MS datasets are composed of time series of individual full scan MS1 spectra, interspersed with one or more MS/MS spectra derived from the fragmentation of one or more species present in the MS1 (Figure 1).
Because of the vast diversity of small molecule structures, MS/MS-based structural characterization of compounds represents a great challenge [11, 12]. Identification by direct comparison of an experimental MS/MS spectrum to a library of MS/MS spectra is often limited by unavailable authentic standards. Even if MS/MS spectra for a compound (or a structurally closely related analogue) exist in a library, algorithms may not return the best match in the database, particularly if spectra are noisy or incomplete (e.g., contain contaminant ions, few fragments, minor fragments below detection limit, etc.). Complementary to these MS/MS spectral library-based approaches, de novo methods use known structures to calculate in-silico fragmentation spectra (random or rule based). Observed fragments are matched to possible substructures to reconstruct the observed MS/MS spectrum. The success of these methods depends on the chosen fragmentation rules and database search space. Thus, the final success of attempts at structural characterization usually depends on a combination of prior knowledge about the compounds in the sample, adequate computational tools, and manual inspection of the raw data.
In addition to the identification of individual compounds, MS/MS structural information may be used at different stages in the interrogation of the samples. For example, at an early stage, it is possible to obtain an overview of structural diversity in a sample by clustering the MS/MS spectra into similarity networks, while at a later stage, once specific unknown features of interest have emerged, a more detailed interpretation of individual MS/MS spectra for compound characterization can be undertaken. MS/MS spectra can also be used to direct further targeted LC-MS/MS approaches aimed at deeper exploration of species possessing common fragments or fragmentation patterns of interest. This approach is commonly employed to specifically characterize molecules with certain functional groups or chemical modifications that produce characteristic patterns in their MS/MS spectra.
To make the best use of the rich structural information contained in LC-MS/MS datasets, there is a strong need for MS/MS analysis algorithms that are tailored to many different individual research applications [11, 16]. As we demonstrate in this chapter, MATLAB provides an accessible and convenient platform for the interactive analysis and visualization of LC-MS/MS data and the implementation of customized algorithms and workflows. Tools are introduced for basic tasks, such as neutral-loss searches, as well as for more complex workflows, such as for the generation of MS/MS similarity networks or for the application of auto-convolution spectra to the structural characterization of peptides. We then apply these tools in the context of a specific basic research application: the discovery and structural characterization of peptidic siderophores. Siderophores are a class of secondary metabolites that are released by many bacteria and fungi to bind and take up iron (Fe), an essential and often growth-limiting micro-nutrient. Using a siderophore structural database to exploit considerable prior knowledge about siderophore structural diversity, an effective workflow is shown for the LC-MS/MS-based analysis of known and new siderophores.
2. Importing raw data into MATLAB
The LC-MS raw data generated by the MS instrumentation is stored in vendor-specific binary file formats. For access by non-vendor software, the raw data file first needs to be converted to an open file format, such as the common mzXML format. The program msconvert, which is part of the open-source proteomics package ProteoWizard (http://proteowizard.sourceforge.net/tools.shtml), can be used to convert from most raw data formats to mzXML. At this point, LC-MS data recorded in profile mode can also be centroided by vendor-supported algorithms built into msconvert. During centroiding, the centroid is determined for each mass spectral peak, which consists of multiple
MATLAB’s Bioinformatics Toolbox provides several functions applicable to the processing of LC-MS data. For this study, we will use the functions mzxmlread and mzxml2peaks. The function mzxmlread can be used to import LC-MS data from mzXML files into a MATLAB structure array: mzXMLstruct = mzxmlread(mzXMLFilename). The returned structure contains the LC-MS/MS data and relevant metadata from the mzXML file, such as scan number, MS level, ionization mode, MS/MS collision energy, and MS/MS precursor information (Figure 2).
The function mzxml2peaks can be employed to extract the mass spectra (
If mass spectra are collected in positive and negative ionization modes during the same run, it may be useful to process positive and negative modes separately. To extract one ionization mode only, a filter can be applied to the data in mzXMLstruct:
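A minimal sketch of such a filter is given below. It runs on a small synthetic structure instead of a real file, and it assumes that each element of mzXMLstruct.scan carries a polarity field holding '+' or '-'. The variable names wanted, keep, and posStruct are illustrative and not taken from the original code.

```matlab
% Synthetic stand-in for the structure returned by mzxmlread.
% Assumed layout: mzXMLstruct.scan(k).polarity is '+' or '-'.
mzXMLstruct.scan = struct( ...
    'num',      {1, 2, 3, 4}, ...
    'msLevel',  {1, 2, 1, 2}, ...
    'polarity', {'+', '+', '-', '-'});

wanted = '+';                          % ionization mode to keep
nScans = numel(mzXMLstruct.scan);
keep   = false(1, nScans);

% Loop through the scan metadata and flag scans of the wanted polarity.
for k = 1:nScans
    keep(k) = strcmp(mzXMLstruct.scan(k).polarity, wanted);
end

% Copy the structure and retain only the flagged scans.
posStruct      = mzXMLstruct;
posStruct.scan = mzXMLstruct.scan(keep);
```

Running the same loop with wanted set to '-' gives the negative-mode subset, so the two ionization modes can be processed separately.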
These loops through scan metadata in mzXMLstruct can be added to the mzxml2peaks function together with optional output and input arguments. In the following, we will use an optional output matrix called Precursors which contains precursor information for each MS/MS scan in three columns:
3. Visualization and graphical-user-interface implementation
When analyzing LC-MS/MS data, it is helpful to be able to visualize spectra and chromatograms in order to display significant features, evaluate spectral noise, consider potential interferences, or evaluate fragmentation patterns, and so on. The two common projections of the three-dimensional LC-MS data (retention time,
To plot an MS/MS spectrum for a given precursor
Graphical user interfaces (GUIs) can aid in efficiently browsing and evaluating the LC-MS/MS data and can serve as a platform for data manipulation and analysis. GUIs are also particularly helpful for users who are not familiar with MATLAB and can be shared as stand-alone applications. MATLAB facilitates the creation and programming of GUIs with the tool GUIDE (graphical user interface design environment). GUIDE provides an interface for designing the layout, while creating the code for the GUI, including the implementation of callbacks for a number of standard User Interface (UI) controls, such as ‘pushbuttons’, ‘popup menus’, and ‘listboxes’. The callback functions can then be filled with user-defined code to provide the intended functionality. A GUI for LC-MS/MS data analysis created with MATLAB is shown in Figure 3.
4. Pre-processing of MS/MS spectra
Before analysis of LC-MS/MS data, pre-processing can be employed to increase signal-to-noise ratios, remove contaminant peaks, and reduce the data for the following analysis steps. Pre-processing and analysis algorithms are often tailored to individual data acquisition methods, data quality, and analytical goals.
4.1. Noise removal
The ratio of maximum-to-median peak intensities in MS/MS spectra has been used as an estimate of signal-to-noise ratios. In order to remove noise and reduce the amount of data in MS/MS spectra, filters have been applied to retain only the most intense peaks within a given mass spectral bracket (e.g., five most intense peaks in a 50 Da window) [19, 20]. If several fragment-ion spectra are available for the same precursor, noise can be removed by retaining only those fragments that are present in a majority of the MS/MS spectra, or the spectra can be summed or averaged, which may minimize the noise contribution if it is random (see Section 4.3). For high-resolution MS/MS spectra, exact masses can also be used to determine possible fragment-ion sum formulas in order to remove noise or satellite peaks that possess
Interfering signals are not only due to random noise but frequently a consequence of relatively wide precursor ion isolation windows (usually > 1 Da) utilized by the instruments’ ion optics during ion selection prior to MS/MS. For this reason, the selection of minor features that may be well resolved in MS1 can result in the co-isolation of significant quantities of unrelated species, which, in turn, produce significant contaminant peaks in the corresponding MS/MS spectra. Spectral deconvolution algorithms to remove possible contaminant peaks or to determine multiple precursors from unintentional or intentional wide-window ion isolation (such as that achieved in recently developed data-independent acquisition methods) have been published [22, 23].
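The two simple noise measures mentioned at the start of this section, the maximum-to-median intensity ratio and the window-based intensity filter, can be sketched as follows. The spectrum is synthetic, and the settings topN and windowDa are hypothetical values chosen for illustration. Windows are laid out on a fixed grid starting at m/z 0, which is one possible choice among several.

```matlab
% Synthetic centroided MS/MS spectrum: column 1 = m/z, column 2 = intensity.
spectrum = [101.1   5; 110.2  80; 120.3   3; 130.4  40; 140.5   2; ...
            149.9  60; 160.0   4; 175.5 100; 190.1   1];

% Crude signal-to-noise estimate: maximum-to-median intensity ratio.
snEstimate = max(spectrum(:,2)) / median(spectrum(:,2));

% Keep only the topN most intense peaks in each window of width windowDa.
topN     = 2;     % hypothetical setting
windowDa = 50;    % hypothetical setting
bins     = floor(spectrum(:,1) / windowDa);   % window index of each peak
keep     = false(size(spectrum, 1), 1);
for b = unique(bins)'
    rows = find(bins == b);
    [~, order] = sort(spectrum(rows, 2), 'descend');
    nKeep = min(topN, numel(rows));
    keep(rows(order(1:nKeep))) = true;
end
filtered = spectrum(keep, :);
```

A spectrum could then be discarded when snEstimate falls below a chosen threshold, and filtered passed on to the following pre-processing steps.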
4.2. Removal of 13C-isotopologue peaks
Another consequence of wide precursor isolation windows is that fragment ions may be accompanied to some degree by their 13C-isotopologues. To de-isotope a high-resolution MS/MS spectrum, an algorithm can proceed from the lowest to the highest intensity fragment ion. For each ion peak, it is evaluated whether it may represent a 13C isotopologue of a more abundant peak, by searching for another peak with the exact mass difference of a 13C isotope (∆
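The de-isotoping pass described in this section can be sketched as follows, under stated assumptions: all fragments are singly charged, the 13C-12C mass difference is taken as 1.003355 Da, and tolDa is a hypothetical matching tolerance. A peak is flagged when a more intense peak lies one isotope spacing below it. Checks on plausible isotope intensity ratios are not included.

```matlab
% Synthetic high-resolution MS/MS spectrum: [m/z, intensity].
spectrum = [200.1000 100; 201.1034 11; 250.2000 30; ...
            251.2100   5; 300.0500 60];

deltaC13 = 1.003355;   % mass difference 13C - 12C in Da
tolDa    = 0.005;      % hypothetical matching tolerance in Da

% Visit peaks from lowest to highest intensity, as in the text. With this
% simple test the result does not depend on the visiting order.
[~, order] = sort(spectrum(:,2), 'ascend');
isIsotope  = false(size(spectrum, 1), 1);
for k = order'
    % Peaks lying one 13C spacing below peak k (singly charged assumed).
    lighter = abs(spectrum(:,1) - (spectrum(k,1) - deltaC13)) <= tolDa;
    % Flag peak k if such a peak exists and is more abundant.
    if any(lighter & spectrum(:,2) > spectrum(k,2))
        isIsotope(k) = true;
    end
end
deisotoped = spectrum(~isIsotope, :);
```

In this example the peak at m/z 201.1034 is removed, while the peak at m/z 251.2100 is kept because its spacing to m/z 250.2000 lies outside the tolerance.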
4.3. Consensus spectra
If several MS/MS spectra of the same precursor have been acquired, the construction of consensus spectra can increase signal-to-noise ratios and significantly speed up the downstream analysis. To select which MS/MS spectra are to contribute to one consensus spectrum, the Spectra cell array can be filtered by identifying the elements with the same precursor
To calculate a consensus spectrum for each cluster of MS/MS spectra, all
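One way to build a consensus spectrum is sketched below. It combines two ideas named in Section 4.1: fragments are kept only if they occur in a majority of the spectra, and intensities are averaged. The grouping rule (a new group starts when the m/z gap exceeds tolDa) and the tolerance value are assumptions made for this illustration, not the original implementation.

```matlab
% Cluster of synthetic MS/MS spectra of one precursor: [m/z, intensity].
cluster = {[100.050 10; 150.100 50; 200.200 100], ...
           [100.051 12; 150.101 40; 200.199  90; 175.000 3], ...
           [150.099 60; 200.201 110]};
tolDa = 0.01;   % hypothetical m/z grouping tolerance in Da

% Pool all peaks and remember which spectrum each one came from.
allPeaks = zeros(0, 2);
source   = zeros(0, 1);
for s = 1:numel(cluster)
    allPeaks = [allPeaks; cluster{s}];
    source   = [source; s * ones(size(cluster{s}, 1), 1)];
end
[~, order] = sort(allPeaks(:,1));
allPeaks   = allPeaks(order, :);
source     = source(order);

% Start a new group whenever the gap to the previous peak exceeds tolDa.
groupId = cumsum([1; diff(allPeaks(:,1)) > tolDa]);

consensus = zeros(0, 2);
for g = 1:max(groupId)
    inGroup  = (groupId == g);
    nSpectra = numel(unique(source(inGroup)));
    if nSpectra > numel(cluster) / 2          % majority rule
        % Mean m/z; intensity averaged over all spectra in the cluster,
        % counting a missing peak as zero intensity.
        consensus(end+1, :) = [mean(allPeaks(inGroup, 1)), ...
            sum(allPeaks(inGroup, 2)) / numel(cluster)];
    end
end
```

Here the fragment at m/z 175.000 appears in only one of three spectra and is dropped, leaving three consensus peaks.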
5. Analysis of fragmentation spectra
In this section, three relevant LC-MS/MS analysis approaches are reviewed that can be applied for a wide range of analytical questions. In Section 6, we illustrate an example application for each approach (Sections 6.2–6.4) that is specifically tailored to the discovery and structural characterization of siderophores.
5.1. Fragment-ion and neutral-loss search
Fragment-ion or neutral-loss searches can be utilized to mine LC-MS/MS data for molecules with characteristic substructures or fragmentation behavior, such as certain lipid headgroups or metabolite conjugates (e.g., GSH or phosphate). In targeted analysis, a defined fragment-ion or neutral-loss can be exploited to increase specificity for detection of the target compound.
Simple algorithms can loop through the MS/MS cell array (Spectra) to search for fragment-ion peaks within the defined
5.2. Pairwise similarity of MS/MS spectra
To find the best match for an experimental MS/MS spectrum among members of a spectral library, the experimental spectrum needs to be compared to each spectrum in the database and a pairwise similarity score needs to be computed. A common approach is to calculate a normalized dot product (cosine similarity) between pairs of MS/MS spectra SA and SB:
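As a hedged illustration, the normalized dot product can be computed by placing both spectra on a shared m/z grid and dividing the dot product of the two intensity vectors by the product of their Euclidean norms. The bin width below is a hypothetical setting, and the variable names are illustrative. Intensity or mass weighting, which some scoring schemes apply, is not included.

```matlab
% Two synthetic MS/MS spectra: [m/z, intensity].
SA = [100.05 3; 150.10 4];
SB = [150.10 4; 200.20 3];

binWidth = 0.01;   % hypothetical m/z bin width in Da

% Map each peak to an integer bin and build the shared grid.
binsA   = round(SA(:,1) / binWidth);
binsB   = round(SB(:,1) / binWidth);
allBins = unique([binsA; binsB]);

% Intensity vectors on the shared grid (peaks in the same bin are summed).
[~, posA] = ismember(binsA, allBins);
[~, posB] = ismember(binsB, allBins);
vecA = accumarray(posA, SA(:,2), [numel(allBins) 1]);
vecB = accumarray(posB, SB(:,2), [numel(allBins) 1]);

% Normalized dot product (cosine similarity), between 0 and 1.
similarity = (vecA' * vecB) / (norm(vecA) * norm(vecB));
```

Fixed bins can split two nearly identical masses that fall on either side of a bin edge. Matching peaks within a mass tolerance avoids this at the cost of slightly longer code.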
If a matching structure is not present in the database, the information may nevertheless be used for the identification of potential common substructures of a structurally related compound. For such applications, the matching of fragments in
Another useful application of similarity calculations is found in the generation of MS/MS similarity networks as described previously. Calculating all pairwise similarities between consensus LC-MS/MS spectra yields a table that can be visualized in a molecular network using freeware tools such as Cytoscape (www.cytoscape.org). In the MS/MS network, each node represents one consensus spectrum (precursor information) and each edge between two nodes illustrates the relatedness. Cytoscape provides functionality to create edge-weighted force-directed layouts to cluster closely related nodes in order to obtain an overview of structural diversity in a sample.
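The step from a pairwise similarity matrix to an edge table can be sketched as follows. The precursor values, the similarity matrix, the threshold minSim, and the output file name are all hypothetical. The tab-delimited layout with source, target, and similarity columns is one simple format that a network tool can import as a table.

```matlab
% Hypothetical precursor m/z values, one per consensus spectrum (node).
precursorMz = [614.27; 628.29; 670.30; 432.15];

% Hypothetical symmetric matrix of pairwise cosine similarities.
simMatrix = [1.00 0.85 0.70 0.05; ...
             0.85 1.00 0.40 0.10; ...
             0.70 0.40 1.00 0.02; ...
             0.05 0.10 0.02 1.00];

minSim = 0.5;   % hypothetical similarity threshold for drawing an edge

% Use the upper triangle only, so each pair is listed once.
[rowIdx, colIdx] = find(triu(simMatrix, 1) >= minSim);
weights = simMatrix(sub2ind(size(simMatrix), rowIdx, colIdx));
edges   = [rowIdx, colIdx, weights];

% Write a tab-delimited edge table: source, target, similarity.
fid = fopen('msms_network_edges.txt', 'w');
fprintf(fid, 'source\ttarget\tsimilarity\n');
for e = 1:size(edges, 1)
    fprintf(fid, '%.4f\t%.4f\t%.3f\n', ...
        precursorMz(edges(e, 1)), precursorMz(edges(e, 2)), edges(e, 3));
end
fclose(fid);
```

The similarity column can then serve as the edge weight for a force-directed layout.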
5.3. MS/MS convolution and auto-convolution
De-novo sequencing of peptides is most widely performed by the analysis of fragmentation spectra that are acquired by positive-mode collision-induced dissociation (CID). With positive-mode CID, major MS/MS peaks result from dissociation of the molecule at the peptide bonds, yielding b- and y-type ions (Figure 5) [29, 30]. In addition, the spectra regularly include other related fragments. For example, a-ions have a mass difference corresponding to a CO group relative to b-ions (∆
As illustrated in Figure 5, the mass differences between fragments include the mass of individual peptide monomers (amino acid residues). Spectral convolution between two spectra SA and SB calculates the
Auto-convolution spectra are generated by calculating the
6. Application to siderophore analysis
In this section, the LC-MS/MS analysis methods discussed above are applied to the discovery and structural characterization of peptidic siderophores. Siderophores are secondary metabolites that are released by many bacteria and fungi to bind and take up iron (Fe), an essential and often growth-limiting micro-nutrient. Using a siderophore structural database to exploit considerable prior knowledge about siderophore structural diversity, an effective workflow is presented for the LC-MS/MS-based analysis of known and new siderophores.
Previously, we described an algorithm for the discovery of siderophores in high-resolution LC-MS1 data by screening for the natural stable isotope pattern of iron (54Fe and 56Fe) bound to siderophores and by searching for related iron-free siderophores. Here, we complement this method by analysis of high-resolution LC-MS/MS data for discovery and structural characterization of siderophores (Figure 6).
To obtain a list of siderophore candidates, Fe can be added to the sample extract before injection onto the LC-MS system, which facilitates the generation of Fe-ligand complexes and the recognition of the Fe isotope patterns associated with Fe-bound siderophores (Figure 6-1a). Independent of isotope patterns, fragmentation spectra can be mined for siderophore-characteristic substructures by fragment-ion and neutral-loss searches (Figure 6-1b, Section 6.2). Both approaches yield a table with
For the replicate run, no Fe is added to the sample extract, maximizing the signal for the free siderophore species, which are preferentially selected for structural characterization in subsequent analytical steps (at the same time, differences in peak abundances between the extracts with and without added Fe can give further confidence to the assignment of siderophores). MS/MS spectra of unbound siderophore candidates are selected to generate an MS/MS molecular network which provides an overview of structurally distinct groups of siderophores (Figure 6-2a, Section 6.3). Representative species in the network are selected for structural characterization by calculation of auto-convolution spectra (Figure 6-2b, Section 6.4). By matching peaks in the auto-convolution spectra to masses in a database of siderophore peptide monomers, possible siderophore substructures are assigned. Combinations of peptide-monomers are then used as a signature to find possible related known structures in the database, aiding in the reconstruction of the original MS/MS spectrum (Figure 6-2c). Iterations with structurally related compounds in the MS/MS network can refine the structure suggestions and be used to efficiently evaluate structures of derivatives. Depending on the analysis outcomes, the putative structures can be confirmed with authentic standards or by isolation of the compound for characterization by orthogonal means (e.g., by nuclear magnetic resonance spectroscopy, etc.).
6.1.2. Experimental methods and data pre-processing
The samples for this study include the siderophore standards desferrioxamine B (DFOB, Aldrich), enterobactin (EMC Biochemicals), amphibactins (kindly provided by A. Butler, UC Santa Barbara), and extracts of iron-limited
LC-MS analyses were performed on a high-performance liquid chromatography (HPLC)-MS platform, using a C18 column coupled to an LTQ-Orbitrap XL hybrid mass spectrometer (ThermoFisher). Samples were separated under a gradient of solutions A and B (solution A consisted of water, 0.1% formic acid (FA), and 0.1% acetic acid; solution B consisted of acetonitrile, 0.1% FA, and 0.1% acetic acid; gradient, 0 to 100% B; flow rate, 50 µl/min). Full-scan mass spectra were acquired in positive-ion mode with a resolving power (R) of 60,000 (
LC-MS/MS raw data are converted to mzXML, centroided, and imported into MATLAB as described in Section 2. Spectra in which the maximum-to-median intensity ratio is below 3 are removed. Spectra are de-isotoped with an
6.2. Fragment-ion and neutral-loss searches
Utilizing a database with >300 known siderophore structures, the most frequently occurring iron binding substructures in siderophores are identified (Table 1, Figure 7). The specificity of these structures for siderophore discovery by fragment-ion or neutral-loss searches is evaluated by searching the NORINE database of nonribosomal peptides (NRPs) with >1,100 NRP structures (http://bioinfo.lifl.fr/norine/). With the exception of N-hydroxyornithine and N-cyclo-hydroxyornithine, the peptide monomers have unique masses in NORINE (±0.005 Da, Table 1). In addition, the iron binding substructures are not observed in non-siderophore structures in the database, with the exception of two putative NRPS products that are predicted to include compounds 10 and 11. Corresponding neutral-loss searches in the METLIN library with >70,000 small molecule CID MS/MS spectra reveal 39–232 false positive neutral-loss hits (i.e., matching neutral-loss mass within ±0.005 Da despite siderophore-unrelated structures), representing less than 1% of the MS/MS spectra in the database.
in NORINE (**)
in METLIN (***)
|148.0848||130.0742, 148.0848||2||0||93, 159|
|162.1004||144.0898, 162.1004||0||Not in NORINE||114, 232|
|118.1106||118.1106||0||Not in NORINE||93|
|104.095||104.095||0||Not in NORINE||48|
|90.0793||90.0793||0||Not in NORINE||69|
|9||Citric acid (Cit)||192.0270||174.0164||0||Not in NORINE||44|
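A neutral-loss and fragment-ion search of the kind described in Section 5.1 can be sketched as follows. The spectra and the precursorMz vector are synthetic, and singly charged ions are assumed. The value 174.0164 is taken from the citric acid row of Table 1 and is assumed here to be its neutral-loss mass. The fragment-ion mass is obtained by adding 1.00783, as stated in the caption of Table 1, and the tolerance matches the ±0.005 Da used for the database searches.

```matlab
% Synthetic MS/MS spectra ([m/z, intensity]) and their precursor m/z values.
Spectra = {[130.0863 40; 300.1500 100; 426.2000 20], ...
           [175.0242 80; 350.2000 100]};
precursorMz = [600.2164; 500.3000];   % hypothetical, singly charged

neutralLoss = 174.0164;               % from Table 1 (citric acid row)
fragmentIon = neutralLoss + 1.00783;  % offset given in the Table 1 caption
tolDa       = 0.005;                  % mass tolerance in Da

nlHit   = false(numel(Spectra), 1);
fragHit = false(numel(Spectra), 1);
for s = 1:numel(Spectra)
    mz = Spectra{s}(:, 1);
    % Neutral loss: precursor minus fragment equals the target mass.
    nlHit(s)   = any(abs((precursorMz(s) - mz) - neutralLoss) <= tolDa);
    % Fragment ion: a peak at the target fragment m/z.
    fragHit(s) = any(abs(mz - fragmentIon) <= tolDa);
end
```

In this example the first spectrum gives a neutral-loss hit (600.2164 - 426.2000) and the second gives a fragment-ion hit. Looping over several target masses would return one hit list per monomer.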
Application of neutral-loss and fragment-ion searches with the standards DFOB (containing 5AHA, compound 6), enterobactin (containing di-OH-Bz, compound 11), and amphibactin (containing Ac-OH-Orn, compound 1) readily reveals the parent ion
6.3. Siderophore MS/MS similarity networks
Bacteria are known to often produce suites of structurally closely related siderophores. MS/MS similarity networks can give an overview of which precursors in a list of candidate siderophores are structurally independent, potentially originating from separate siderophore gene clusters, and which are likely to be structural analogues. An MS/MS similarity network for
6.4. Application of MS/MS auto-convolution to siderophore analysis
The success of de novo structural analysis of nonribosomal peptides (NRPs) using spectral auto-convolution depends on the identification of most or all amino acids involved in the structure, thus requiring a database that contains all involved peptide monomers as well as good coverage of fragments in the MS/MS [19, 34]. To use auto-convolution for siderophore analysis, a database with >300 siderophore structures was compiled along with a database of the peptide-monomers occurring in these structures (160 different structures with 51 different Fe binding monomers), most of which are not present in the NORINE database of nonribosomal peptides (http://bioinfo.lifl.fr/norine/). Auto-convolution spectra were previously applied to cyclic NRPs (see Section 5.3), in which an MS/MS experiment leads to ring opening, and MS3 creates additional fragmentation. Because ring opening may occur at any peptide bond, the theoretical MS3 spectra are a superposition of spectra derived from all possible linear peptides (circular permutations).
A modified auto-convolution approach is used here for analysis of peptidic siderophores and applied to siderophore structures that can also be linear, branched, or partly cyclic. Before auto-convolution, the algorithm adds the parent ion as a peak to the MS/MS spectrum and its intensity is set equal to the most intense peak in the spectrum. This ensures that all precursor neutral loss
The relevance of database hits is judged by the multiplicity in the auto-convolution spectrum and the corresponding relative intensities of the MS/MS peaks involved. Illustrative results from the analysis of the siderophore amphibactin B are shown in Figure 9. The auto-convolution peaks with the highest multiplicity and highest relative intensities reveal all structural features of the molecule: the iron binding N-acetyl-hydroxyornithines, a serine, and the fatty acid tail. Combinations of monomers are then used as a fingerprint to search for possible related structures in the siderophore database. Four families of siderophores in the database contain the three possible substructures: amphibactins, aquachelins, marinobactins, and loihichelins. With this information, the amphibactin can be readily identified by reconstruction of the original MS/MS spectrum. A number of
The approach was also successfully applied to the other siderophore standards used in this study: DFOB showed the iron-binding monomer 5AHA (Table 1) together with the succinic acid linker among the three substructures with the highest intensity and multiplicity. One potential peptide monomer (N1,N1-dimethyl-N5-acetyl-N5-hydroxy-ornithine,
When analyzing an unknown siderophore, the described auto-convolution approach can give confidence in the siderophore assignment: siderophores often contain three characteristic iron chelating peptide monomers for hexadentate iron coordination, which cause high multiplicities and relative intensities. A combination of peptide monomers in the structure can be used as a fingerprint to search the siderophore database for possible related structures, giving insight into the possible ‘siderophore class’ and aiding in the reconstruction of the original MS/MS spectrum to make a structure suggestion.
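The core of the modified auto-convolution can be sketched as follows, under stated assumptions. The fragment ladder and parent ion are synthetic, and the monomer list contains only two entries with residue masses computed from their sum formulas (serine, C3H5NO2, and N-acetyl-N-hydroxyornithine, C7H12N2O3). The sketch counts only the multiplicity of each matching mass difference; the intensity weighting described above is left out.

```matlab
% Synthetic MS/MS spectrum ([m/z, intensity]) and parent ion m/z.
spectrum = [200.0000 30; 287.0320 100; 459.1168 60; 631.2016 45];
parentMz = 803.2864;

% Modification: add the parent ion as a peak with the intensity of the
% most intense peak, so that precursor neutral losses enter the result.
spectrum(end+1, :) = [parentMz, max(spectrum(:, 2))];

% Auto-convolution: all positive pairwise m/z differences in the spectrum.
mz    = spectrum(:, 1);
diffs = bsxfun(@minus, mz, mz');
diffs = diffs(diffs > 0);

% Illustrative monomer database (residue masses in Da).
monomerNames = {'Ser', 'Ac-OH-Orn'};
monomerMass  = [87.0320; 172.0848];
tolDa        = 0.005;   % hypothetical matching tolerance in Da

% Multiplicity: how often each monomer mass occurs among the differences.
multiplicity = zeros(numel(monomerMass), 1);
for m = 1:numel(monomerMass)
    multiplicity(m) = sum(abs(diffs - monomerMass(m)) <= tolDa);
end
```

Without the added parent ion, the Ac-OH-Orn difference would be found twice rather than three times in this example, which illustrates why the parent peak is appended before the differences are calculated.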
The basic tools and considerations introduced in this chapter provide insight into the systematic analysis of LC-MS/MS fragmentation data for the structural characterization of small molecules and demonstrate how this can be performed within MATLAB. Since many researchers are familiar with MATLAB, this environment provides a low-barrier entry point and facilitates the creation of new strategies and tools to exploit the full power of modern high-resolution LC-MS/MS for structural interrogation. MATLAB facilitates data handling, manipulation, and the implementation of graphical user interfaces to serve as a platform for the visualization, pre-processing, and analysis of LC-MS/MS data.
The discussed methods were applied in a new workflow for the discovery and structural characterization of siderophores by high-resolution LC-MS/MS. Using a database with siderophore structures, characteristic neutral-loss and fragment-ion masses were identified to mine LC-MS/MS data for potential siderophores. MS/MS siderophore networks in combination with a modified MS/MS auto-convolution approach revealed siderophore peptide monomers and corresponding siderophore families. This information was key to structure assignments by reconstruction of the original MS/MS spectrum. The tools and approaches outlined here may also be adapted to explorations of other classes of complex small molecules.
We thank Yubo Li for helping with the assembly of the siderophore database and Alison Butler (UC Santa Barbara) for providing the amphibactin samples. Financial support for O.B. was provided by the Grand Challenge Program of the Princeton Environmental Institute.
The MATLAB siderophore analysis software (‘MS2Browser’) and the siderophore database are available for download on SourceForge (https://sourceforge.net/projects/ms2browser/).
[1] Patti GJ, Yanes O, Siuzdak G. Metabolomics: the apogee of the omics trilogy. Nature Reviews Molecular Cell Biology. 2012;13(4):263–9.
[2] Mastrangelo A, Armitage EG, Garcia A, Barbas C. Metabolomics as a tool for drug discovery and personalised medicine. A review. Current Topics in Medicinal Chemistry. 2014;14(23):2627–36.
[3] Beger R. A review of applications of metabolomics in cancer. Metabolites. 2013;3(3):552.
[4] Zwiener C, Frimmel FH. LC-MS analysis in the aquatic environment and in water treatment technology - a critical review. Part II: Applications for emerging contaminants and related pollutants, microorganisms and humic acids. Analytical and Bioanalytical Chemistry. 2004;378(4):862–74.
[5] Cheng X, Hochlowski J. Current application of mass spectrometry to combinatorial chemistry. Analytical Chemistry. 2002;74(12):2679–90.
[6] Castillo S, Gopalacharyulu P, Yetukuri L, Orešič M. Algorithms and tools for the preprocessing of LC–MS metabolomics data. Chemometrics and Intelligent Laboratory Systems. 2011;108(1):23–32.
[7] Sugimoto M, Kawakami M, Robert M, Soga T, Tomita M. Bioinformatics tools for mass spectroscopy-based metabolomic data processing and analysis. Current Bioinformatics. 2012;7(1):96–108.
[8] Wolfender JL, Marti G, Thomas A, Bertrand S. Current approaches and challenges for the metabolite profiling of complex natural extracts. Journal of Chromatography A. 2015;1382:136–64.
[9] Pluskal T, Castillo S, Villar-Briones A, Oresic M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics. 2010;11:395.
[10] Benton HP, Wong DM, Trauger SA, Siuzdak G. XCMS2: processing tandem mass spectrometry data for metabolite identification and structural characterization. Analytical Chemistry. 2008;80(16):6382–9.
[11] Scheubert K, Hufsky F, Böcker S. Computational mass spectrometry for small molecules. Journal of Cheminformatics. 2013;5:12.
[12] Xiao JF, Zhou B, Ressom HW. Metabolite identification and quantitation in LC-MS/MS-based metabolomics. Trends in Analytical Chemistry. 2012;32:1–14.
[13] Kind T, Fiehn O. Metabolomic database annotations via query of elemental compositions: mass accuracy is insufficient even at less than 1 ppm. BMC Bioinformatics. 2006;7(1):1–10.
[14] Stein SE. Mass spectral reference libraries: an ever-expanding resource for chemical identification. Analytical Chemistry. 2012;84(17):7274–82.
[15] Watrous J, Roach P, Alexandrov T, Heath BS, Yang JY, Kersten RD, et al. Mass spectral molecular networking of living microbial colonies. Proceedings of the National Academy of Sciences. 2012;109(26):E1743–E52.
[16] Kind T, Fiehn O. Advances in structure elucidation of small molecules using mass spectrometry. Bioanalytical Reviews. 2010;2(1–4):23–60.
[17] Hider RC, Kong XL. Chemistry and biology of siderophores. Natural Product Reports. 2010;27(5):637–57.
[18] Holman JD, Tabb DL, Mallick P. Employing ProteoWizard to convert raw mass spectrometry data. Current Protocols in Bioinformatics. 2014;46:13.24.1–9.
[19] Ng J, Bandeira N, Liu W-T, Ghassemian M, Simmons TL, Gerwick WH, et al. Dereplication and de novo sequencing of nonribosomal peptides. Nature Methods. 2009;6(8):596–9.
[20] Frank AM, Bandeira N, Shen Z, Tanner S, Briggs SP, Smith RD, et al. Clustering millions of tandem mass spectra. Journal of Proteome Research. 2008;7(1):113–22.
[21] Yang X, Neta P, Stein SE. Quality control for building libraries from electrospray ionization tandem mass spectra. Analytical Chemistry. 2014;86(13):6393–400.
[22] Tsugawa H, Cajka T, Kind T, Ma Y, Higgins B, Ikeda K, et al. MS-DIAL: data-independent MS/MS deconvolution for comprehensive metabolome analysis. Nature Methods. 2015;12(6):523–6.
[23] Stein SE. An integrated method for spectrum extraction and compound identification from gas chromatography/mass spectrometry data. Journal of the American Society for Mass Spectrometry. 1999;10(8):770–81.
[24] Brown NL, Stoyanov JV, Kidd SP, Hobman JL. The MerR family of transcriptional regulators. FEMS Microbiology Reviews. 2003;27(2–3):145–63.
[25] Stein SE. Chemical substructure identification by mass spectral library searching. Journal of the American Society for Mass Spectrometry. 1995;6(8):644–55.
[26] Pevzner PA, Dančík V, Tang CL. Mutation-tolerant protein identification by mass spectrometry. Journal of Computational Biology. 2000;7(6):777–87.
[27] Baars O, Zhang X, Morel FM, Seyedsayamdost MR. The siderophore metabolome of Azotobacter vinelandii. Applied and Environmental Microbiology. 2015;82(1):27–39.
[28] Brodbelt JS. Ion activation methods for peptides and proteins. Analytical Chemistry. 2016;88(1):30–51.
[29] Chaturvedi KS, Henderson JP. Pathogenic adaptations to host-derived antibacterial copper. Frontiers in Cellular and Infection Microbiology. 2014;4:3.
[30] Roepstorff P, Fohlman J. Proposal for a common nomenclature for sequence ions in mass spectra of peptides. Biomedical Mass Spectrometry. 1984;11(11):601.
[31] Caboche S, Pupin M, Leclère V, Fontaine A, Jacques P, Kucherov G. NORINE: a database of nonribosomal peptides. Nucleic Acids Research. 2008;36:D326–D31.
[32] Baars O, Morel FMM, Perlman DH. ChelomEx: isotope-assisted discovery of metal chelates in complex media using high-resolution LC-MS. Analytical Chemistry. 2014;86(22):11298–305.
[33] Martinez JS, Carter-Franklin JN, Mann EL, Martin JD, Haygood MG, Butler A. Structure and membrane affinity of a suite of amphiphilic siderophores produced by a marine bacterium. Proceedings of the National Academy of Sciences. 2003;100(7):3754–9.
[34] Mohimani H, Liu W-T, Yang Y-L, Gaudêncio SP, Fenical W, Dorrestein PC, et al. Multiplex de novo sequencing of peptide antibiotics. Journal of Computational Biology. 2011;18(11):1371–81.