Proposed concept for hazard characterization: HazChar Score.
Recent developments in the field of pathogen genomics herald a new paradigm for analytical food microbiology in which pathogenic bacteria will be characterized on the basis of their genetic profile rather than traditional approaches relying on phenotypic properties. The ability to identify gene markers associated with virulence, antimicrobial resistance, and other properties relevant to the identification, risk profiling, and typing of foodborne bacterial isolates will play a critical role in informing regulatory decisions and tracing sources of food contamination. Here we present several scenarios illustrating current and prospective roles for pathogen genomics in food inspection.
- pathogen genomics
- foodborne pathogens
- whole-genome sequencing
- food inspection
- Escherichia coli
The food microbiology testing laboratory has a key role in supporting regulatory food safety investigations, whether stemming from a contamination incident identified through routine monitoring under food inspection programs or a foodborne illness outbreak event where human lives and well-being are at risk. While such investigations typically involve the concerted actions of food inspection and public health authorities at different levels of government, the main role of the regulatory testing laboratory is to confirm the presence of a specified hazard in a food vehicle and provide data indicating the scope and source of a contamination event. The extent to which the laboratory can contribute critical information to an investigation will to a large degree depend on the application of leading-edge technologies for detection and characterization of foodborne pathogens. Approaches capable of maximizing the amount of information obtained in the course of conducting laboratory testing of inspection samples will foster the most appropriate regulatory responses, for example, by informing the health risk assessment process undertaken to categorize the degree of risk attending a contamination incident.
The impact of analytical service delivery on public health outcomes depends on the ability to process large numbers of investigative samples and produce accurate test results in the shortest timeframe possible. While cultural enrichment of food samples to amplify pathogens to detectable levels is generally necessary for their recovery, current approaches often use protracted identification processes relying on phenotypic characteristics elucidated by time-consuming cultivation, biochemical and serological techniques. While effective under certain circumstances, such a limited approach has shortcomings, particularly when dealing with novel pathogens for which analytical parameters may not have been comprehensively worked out or when trying to attribute contamination sources.
The exploitation of the genetic blueprint, or known parts thereof, associated with a targeted bacterial pathogen is now widely accepted as an effective means for the identification of food pathogens. Polymerase chain reaction (PCR) technology is now well established as an analytical tool in the regulatory food laboratory [1, 2, 3, 4]. Its implementation in regulatory testing programs underscores the growing acceptance of redefining the terms for the identification and characterization of bacteria from a phenotypic to a genotypic basis. Indeed, the advent of leading-edge genomics technologies opens new possibilities for comprehensive analyses of microbial isolates recovered from food inspection samples; for example, next-generation sequencing technologies (whole-genome sequencing, WGS) can now render a bacterial genome much faster (i.e., within 1–3 days) and at a significantly lower cost (about 100 dollars) than previously possible, making it feasible to sequence foodborne isolates within the timeframes of food safety investigations [5, 6, 7].
Currently available bioinformatics tools are sufficiently advanced to enable the rapid processing of raw sequence data into a usable form for many purposes. Sequencing pathogenic bacteria, whether in the context of outbreak investigations or information gathering in the course of research, can yield an unprecedented level of information regarding the presence of virulence and other marker genes relevant to the identification and risk characterization of food isolates [6, 7, 8, 9, 10]. WGS data can provide an exquisite degree of resolution capable of ascertaining differences between strains and determining phylogenetic relationships among different bacterial isolates for pinpoint precision in the attribution of contamination sources [6, 11]. Finally, the identification of strain-specific features such as unique DNA sequences, metabolic properties, and antimicrobial resistance will enable testing labs to implement customized tests addressing specific strains of interest in determining the scope of contamination events.
While genomics, including WGS technology, already plays a significant role in the clinical sciences, its role in regulatory food microbiology inspection programs remains to be fully delineated. Currently, methods used to characterize foodborne pathogens recovered in regulatory food testing programs aim to answer three questions: (1) what is it? (2) have we seen it before? and (3) is it dangerous? This work describes some of the ways in which characterization of bacterial pathogens using genomics technologies has provided or may contribute to faster, more reliable and cost-effective results addressing these questions. Our purpose is to present different scenarios to illustrate impacts of leading-edge genomics technologies, some imagined, and others already achieved, on food inspection programs.
2. Impacts of the implementation of genomics in regulatory food microbiology
2.1 What is it? Definitive identification of pathogenic bacteria based on genomics techniques
2.1.1 In the beginning: detection of genomic markers by PCR
As an alternative to classic phenotypic techniques, the identification of foodborne colony isolates can be achieved on the basis of detection of defining gene markers. Detection platforms incorporating PCR techniques are particularly well suited for same-day analysis of a primary colony isolate. The Canadian Food Inspection Agency (CFIA) microbiology laboratory network has undertaken a program of method development aimed at the rapid identification of colonies isolated on plating media at an early stage during the enrichment process. A key technology platform adopted by CFIA for this purpose is the cloth-based hybridization array system (CHAS), which identifies pathogens through amplification of key target genes by multiplex PCR, followed by rapid hybridization of the amplicons with an array of immobilized capture probes on a polyester cloth support [2, 3, 4, 17]. This approach enables facile detection of many gene markers in a single reaction, with specificity assured through the hybridization process.
A CHAS method for the identification of
2.1.2 Enter the next generation: whole-genome sequencing
Non-O157 STEC, particularly strains bearing certain O antigenic determinants (O26, O45, O103, O111, O121, and O145), are emerging as a serious foodborne public health concern. Unlike
The potential of PCR technology to provide informative test results is limited by its rather fragmentary nature, that is, the fact that only a relatively small number of different DNA markers can be assessed in a single analytical procedure. A considerable effort is required to optimize and validate the performance of PCR systems, particularly those in which multiple primer pairs are combined (e.g., identification of priority Shiga toxigenic
The use of WGS technologies for STEC characterization can provide a more complete picture; however, completion of WGS analyses typically requires 3–5 days. As this timeline may be too long in an outbreak situation, we have developed practical processes in which genomic DNA isolated from a single STEC colony is sequenced using the Illumina MiSeq platform, followed by analysis of the sequence data during the course of the sequencing run for the determination of genomic markers for phylogenetic identity, virulence profile, serotyping, as well as biological metrics serving as quality control features supporting the validity of the process. Identification and characterization of isolates can be completed within 9 h, comparable to the current method used for characterization of STEC. This real-time WGS approach produces high-resolution characterization of bacterial pathogens at a cost and within a timeframe that are similar to standard microbiological techniques and has the potential to replace lengthy biochemical characterization and molecular and serological typing procedures widely used in food testing laboratories. Our laboratory is currently studying strategies for broad implementation of WGS technology in support of regulatory food inspection objectives through the detection, identification, and characterization of priority bacterial pathogens such as STEC,
2.2 Have we seen it before? Impact of high-resolution molecular typing by WGS for distinguishing new isolates from control and historical laboratory isolates
Following detection of a pathogen in foods, (sub)typing methods are often used to generate a profile of the isolated organism to determine similarity to previously characterized isolates from clinical or food sources (reviewed in [27, 28, 29]). Typing of isolates recovered from food samples can provide important information regarding the complexity and source(s) of a given contamination incident, enables tracking of foodborne bacterial strains, and is frequently used to support regulatory decisions.
Early methods of typing were based on phenotypic properties, for example, serotyping based on proteins expressed by the organism on surface structures [30, 31, 32] and phage typing based on susceptibility to reference panels of bacteriophage for distinguishing closely related isolates. The development of methods based on DNA sequences followed. Methods such as multilocus sequence typing (MLST), involving sequencing of PCR-amplified fragments of a small number (i.e., 6–9) of housekeeping genes, can be used to infer evolutionary relationships among organisms [34, 35]. This portable and highly reproducible method of typing has been widely deployed, and MLST schemes have been developed for all of the priority foodborne pathogens. For public health surveillance in Canada, as in other jurisdictions, the standard approaches for typing of foodborne pathogens for cluster identification have been pulsed-field gel electrophoresis (PFGE) and multilocus variable-number tandem repeat analysis (MLVA) [27, 28, 36, 37]. In North America, data are shared among public health agencies through PulseNet [27, 38]. The selection of a typing method depends on a number of factors, including proven utility of the method for the pathogen being investigated. Each method requires costly training of lab personnel and in many cases the purchase of specialized equipment. Furthermore, comparisons of typing data among different strains can only be done in cases where the same method has been applied. In some cases, variability in the execution of methods by different analysts or different labs significantly impacts the comparability of molecular typing data.
These typing methods are based on a limited subset of genomic sequences and often lack the discriminatory power to differentiate among closely related organisms. DNA typing profiles from two isolates appearing indistinguishable might be interpreted as evidence that the bacteria have a common source. However, the strength of this type of evidence rests on the extent to which the DNA profile consists of a combination of rare traits. When the traits defining a DNA profile are not rare, there is the possibility that two isolates are in fact unrelated and that matches are mere chance occurrences. In highly clonal strains (e.g.,
WGS provides a high-resolution molecular typing platform that can be universally applied to bacterial pathogens [6, 38, 41]. In principle, strains differing by even a single nucleotide can be distinguished [42, 43]. Furthermore, WGS can now be done more cheaply than lower resolution methods such as MLST and is backward-compatible with previous methods since, in some cases, typing data can be generated from minimally processed genomic data in silico [8, 9, 44]. Strains characterized by WGS can be compared to strains characterized by any other DNA-based subtyping method, enabling optimal use of historical data. Molecular typing data have generally been developed as a surrogate measure of the genetic similarity between bacterial strains. Using databases of WGS information, the utility of existing subtyping methods can be rigorously assessed, and improved subtyping schemes that reflect true strain relationships can be developed.
One advantage of the availability of WGS for typing bacterial isolates is the ability to evaluate datasets at different levels of resolution as needed to resolve biological questions. In this regard, MLST continues to be a valuable approach for tracking foodborne pathogens in the genomic era [41, 44]. WGS data can be matched to current and historical databases at different levels of resolution including pathogen-specific MLST (described above), core genome MLST (cgMLST), which uses hundreds of genes that are conserved within a species, and whole-genome MLST (wgMLST), which considers all genes within a species. Similarly, ribosomal MLST (rMLST) is a 53-gene scheme with the advantage of being universally applicable to bacteria typically encountered in a food testing laboratory.
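The idea of comparing the same WGS-derived data at different levels of resolution can be illustrated with a minimal sketch. It assumes that each isolate has already been reduced to an allele profile (locus name mapped to an allele number); the locus names, allele numbers, and the helper names `allele_distance` and `restrict` are illustrative assumptions and are not taken from any published scheme or tool.

```python
from typing import Dict, Optional, Set

# Hypothetical allele profile: locus name -> allele number (None = locus not called).
Profile = Dict[str, Optional[int]]


def allele_distance(a: Profile, b: Profile) -> int:
    """Count loci that are called in both profiles and carry different alleles."""
    shared = [
        locus for locus in a
        if locus in b and a[locus] is not None and b[locus] is not None
    ]
    return sum(1 for locus in shared if a[locus] != b[locus])


def restrict(profile: Profile, scheme: Set[str]) -> Profile:
    """Keep only the loci that belong to a given typing scheme."""
    return {locus: allele for locus, allele in profile.items() if locus in scheme}


# Illustrative data only: three loci stand in for a small MLST-like scheme,
# two further loci stand in for the extra loci of a larger (cgMLST-like) scheme.
isolate_1: Profile = {"adk": 12, "fumC": 12, "gyrB": 8, "locus_0417": 3, "locus_1120": 7}
isolate_2: Profile = {"adk": 12, "fumC": 12, "gyrB": 8, "locus_0417": 5, "locus_1120": None}
small_scheme = {"adk", "fumC", "gyrB"}

low_res = allele_distance(restrict(isolate_1, small_scheme), restrict(isolate_2, small_scheme))
high_res = allele_distance(isolate_1, isolate_2)
print(f"small scheme: {low_res} differences; all loci: {high_res} differences")
```

In this toy case the two isolates are indistinguishable under the small scheme but differ at one locus when all loci are considered, which is the practical meaning of "different levels of resolution."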
The highest-resolution WGS-based analyses such as wgMLST and single-nucleotide variant (SNV) analyses provide unrivaled DNA fingerprinting capability and offer tremendous potential for food safety applications [41, 42, 47]. Still, the use of these analyses in a food safety context is in its infancy, and the interpretation of genomic data from foodborne pathogens in support of regulatory interventions remains challenging. For example, how many SNVs are required to exclude a sample from a food safety investigation? This question remains difficult to answer, in part because of differences in rates at which DNA accumulates changes within a species or among strains within a species. Bacterial strains with a mutator phenotype have an elevated mutation rate, typically due to mutations in genes encoding components of DNA replication and repair pathways. Mutator phenotypes are commonly found in clinical bacterial populations and may contribute significantly to the acquisition of antimicrobial resistance [50, 51, 52]. For example, the mutator phenotype has been attributed to the development of multidrug-resistant
2.2.1 Identification of persistent contamination
Persistent contamination of food manufacturing environments with bacteria such as
The ability to distinguish the two modes of contamination in the analysis of environmental isolates recovered during routine monitoring activities would constitute an important element to inform the best approach for the management of microbial hazards in food manufacturing plants [54, 55]. For example, while regular sanitation procedures may be effective in dealing with removal of sporadic surface contaminants, a more comprehensive approach requiring equipment teardown and aggressive sanitation would be required to deal with persistent contaminants, which are by nature highly resistant to sanitizers and cleaning procedures. The traditional approach to identify the occurrence of persistent contamination in food manufacturing environments involves the characterization of successive isolates using typing procedures such as PFGE to determine their relatedness. However, PFGE is of limited value for this purpose because it is not sufficiently discriminatory to unequivocally establish whether two strains are clonally related (i.e., one being descendant from the other). Depending on the scope of the contamination, there may be multiple related populations within the food production environment.
Whole-genome sequencing approaches offer the prospect of determining the degree of relatedness among isolates on the basis of very fine base sequence differences, because more closely or clonally related isolates have fewer SNV differences. Therefore, it should be possible to compare two isolates (e.g., recovered on successive sampling incursions in the same plant) using high-resolution WGS typing methods to ascertain whether they are clonally related or different. In the former case, this would be a strong indication that there is either an unresolved source of contamination in the plant, or more likely, a case of persistent contamination, whereas the latter case would suggest two independent contamination incidents. Each scenario would warrant a different approach to decontamination, and the ability to differentiate persistent and sporadic strains on the basis of the relatedness of successive isolates would constitute a powerful risk assessment and risk management tool for use in the most highly proactive food safety programs. There is one caveat in the use of SNV-based typing, and that is the temporal drift which naturally occurs in bacteria, resulting in the accumulation of SNVs among the progeny derived from a single parent. The question remains under which conditions this occurs for bacterial strains in a food manufacturing environment and how many SNV differences constitute a real difference in terms of the provenance of isolates under comparison.
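The comparison of two successive isolates by SNV count can be sketched as follows. This is a simplified illustration that assumes both genomes have already been aligned to the same reference so that positions correspond; the function names are hypothetical, and the threshold is deliberately a required argument because, as noted above, no agreed cut-off exists.

```python
def snv_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two pre-aligned sequences carry different bases.

    Positions where either sequence has an ambiguous or missing base
    (anything other than A, C, G, T) are ignored rather than counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in valid and y in valid and x != y
    )


def compare_isolates(seq_a: str, seq_b: str, max_snvs: int) -> str:
    """Label a pair of isolates using a caller-supplied SNV threshold.

    The threshold is an assumption that must be justified for each organism
    and setting; it is not a value taken from the source text.
    """
    distance = snv_distance(seq_a, seq_b)
    if distance <= max_snvs:
        return f"{distance} SNVs: consistent with clonal relatedness"
    return f"{distance} SNVs: consistent with independent contamination"


# Toy aligned fragments standing in for isolates from two sampling visits.
visit_1 = "ACGTACGTACGT"
visit_2 = "ACGTACGAACGT"
print(compare_isolates(visit_1, visit_2, max_snvs=2))
```

Production SNV pipelines additionally filter on read depth, mapping quality, and recombinant regions; none of that is modeled here.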
One cause of
2.2.2 Contribution to surveillance programs and outbreak investigations
Although numerous methods are used by food safety and public health agencies to support regulatory decisions during outbreak investigations, demonstrating that food and clinical isolates originated from the same source can be challenging. As the results generated by WGS make their way into situation rooms to guide decision-makers, concise metrics for the interpretation and contextualization of genomics-derived data will be required to achieve more precise assessments. The value of WGS-based typing for cluster identification has already been demonstrated through global surveillance networks such as GenomeTrakr and PulseNet [6, 38, 41]. Smaller clusters of cases can be linked through WGS analyses and investigated, and conversely, unrelated cases can be excluded from an epidemiological investigation, leading to improved outcomes for rapid identification of foods implicated in outbreaks. Nonetheless, WGS results should not be interpreted in the absence of epidemiological context, as mutation rates in some lineages are low and strains from different sources may appear to be linked.
The concepts of “match probability” and “likelihood ratios” are well known in human forensic sciences where they facilitate the interpretation of DNA profiles in matching individuals to a crime scene. For example, when the DNA profile found at a crime scene matches that of a suspect and there is only a one-in-a-million probability that this DNA profile might be found in another individual, there is a strong case linking the suspect to the crime scene. Food inspectors face a similar situation during outbreak investigations when trying to establish causal links between isolates from different sources. Bacteria may undergo subtle changes in their genomes during the course of a foodborne illness outbreak event, with possible impacts on the typing profiles of clonally related isolates recovered over time. The question arises as to how much change in a genome constitutes a significant difference between individual isolates (i.e., different origins or strains). Through statistical analyses of comprehensive pathogen genome databases, it should be possible to develop a likelihood ratio approach to determine the probability of finding a given profile in a defined population and, hence, develop criteria to measure sequence diversity between isolates with different degrees of relatedness and even among clonally related isolates recovered over the course of an outbreak event. This in turn would provide a greater degree of confidence in attributing the origins of isolates, identifying clusters of foodborne illness and their food vehicles, and the scope of contamination. This information can also be used to revise and adjust detection tools (e.g., PCR primers) to ensure their effectiveness in identifying “moving” genomic targets. The development of a forensic likelihood ratio approach would provide a valuable tool to assess the reliability of genomic information underlying regulatory decision-making.
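The forensic example above can be written as a short worked equation. The notation is an assumption introduced here for illustration: E is the observed profile match, H_1 is the hypothesis that the two samples share a source, and H_2 is the hypothesis that they are unrelated. The simplification P(E | H_1) = 1 (a true common source always produces a match) is also an assumption; for bacterial isolates that accumulate changes over time it would generally be somewhat less than 1.

```latex
\[
\mathrm{LR} \;=\; \frac{P(E \mid H_{1})}{P(E \mid H_{2})}
\]
% With P(E | H_1) assumed to be 1 and a match probability
% P(E | H_2) of one in a million:
\[
\mathrm{LR} \;\approx\; \frac{1}{10^{-6}} \;=\; 10^{6}
\]
```

Under these assumptions the match is a million times more probable if the samples share a source than if they do not. The less rare the profile in the reference population, the larger P(E | H_2) becomes and the weaker the evidence.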
2.2.3 Attribution of food vehicles through genomic surveillance
The advent of genomic typing augurs well for the creation of highly refined databases of bacterial isolates from various sources (foods, production facilities, farms, environmental and clinical strains) providing high-resolution characterization of individual strains with established linkages to their geographic and temporal origins. Historically, the use of low-resolution typing approaches such as MLST profiles or serotypes has been valuable for the association of specific lineages with a given food type, production environment, or country [70, 71]. Initiatives such as the GenomeTrakr and the PulseNet WGS networks represent rich resources from which to draw valuable information linking isolates to their origin in the food production continuum [6, 38, 72]. For example, an analysis of
With the aid of bioinformatics tools, databases can be queried to identify genomic signatures that are overrepresented in particular host sources for bacterial isolates. For example, a study by Thépault et al. identified 15 host-associated
2.3 Is it dangerous? Rapid identification of virulence, antimicrobial resistance, and epidemiological markers through WGS
Genomic information is highly complex, and there are many knowledge gaps with respect to the significance of various marker genes to public health. Nonetheless, there is a growing body of evidence linking certain well-defined gene markers to virulence characteristics of bacteria, for example, the role of intimin (coded by the
In the case of STEC, regulatory food testing programs currently define priority target strains as bearing markers for Shiga toxin genes and intimin, in addition to markers associated with a narrow family of O serogroups. However, the question arises whether, in the course of conducting routine monitoring of food inspection samples, the occurrence of an isolate bearing markers for Shiga toxin and intimin, but none of the so-called priority serogroups, would be actionable. There are varying subjective opinions on the matter, ranging from a narrow interpretation of test results in which only isolates bearing all the designated factors are considered hazardous to the more precautionary approach whereby any isolate bearing both Shiga toxin and intimin factors, regardless of O serogroup, constitutes a public health risk. There is also evidence suggesting that severity or likelihood of foodborne illness varies with Shiga toxin type and subtype (e.g., STEC strains possessing stx2a tend to be more frequently implicated in cases of severe foodborne illness) and that this should be a factor in determining the appropriate response to the presence of a food contaminant. These properties are readily discoverable through the analysis of WGS data. For instance, the Shiga toxin subtype can be reliably determined using the V-typer tool, which is an automated assembly-independent subtyping module that can be integrated in a bioinformatics pipeline for the analysis of foodborne STEC isolates. Yet another possibility would be to define priority STEC on the basis of contemporary public health data (reviewed periodically) identifying STEC serogroups most frequently associated with illness in a given jurisdiction. The serogroup of an
Such considerations raise problems for health risk assessment specialists who must interpret laboratory results (among other factors) to determine the degree of risk informing the course of regulatory interventions. It is tempting to speculate that it may be possible to devise an objective scheme for rating the degree of hazard associated with a given isolate on the basis of genomic analyses. For instance, the public health and food inspection communities could agree on a list of key factors relevant to the characterization of a given pathogen (Table 1). For organisms such as
|Primary virulence|Toxin|Presence or absence|
| |Attachment and colonization| |
| |Pathogenicity|Pathogenesis mechanisms (e.g., hemolysin)|
|Accessory functions|Antibiotic resistance|Therapeutic impact|
| |Antimicrobial resistance|Sanitizer efficacy|
| |Pathogenicity islands|Signatures for novel pathogens|
|Epidemiological markers|Serotype|Outbreak vs. sporadic vs. nil association|
| |Phage type|Reservoirs, illness outbreaks|
| |Molecular type|PFGE/SNV cluster|
|Phylogenetic markers|Genus/species|Identity (e.g., |
| |Family or group|STEC|
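The two interpretations of STEC marker profiles discussed earlier in this section (narrow versus precautionary) lend themselves to a short sketch showing how the same genomic result leads to different outcomes under each policy. This is an illustration only, not a regulatory rule: the class name, field names, and policy labels are hypothetical, and the serogroup set simply combines O157 with the six non-O157 serogroups named in the text.

```python
from dataclasses import dataclass

# Serogroups named in the text: O157 plus six non-O157 serogroups.
PRIORITY_SEROGROUPS = frozenset({"O26", "O45", "O103", "O111", "O121", "O145", "O157"})


@dataclass(frozen=True)
class StecIsolate:
    """Marker calls for one isolate, as might be derived from WGS data."""
    has_shiga_toxin: bool
    has_intimin: bool
    serogroup: str


def is_actionable(isolate: StecIsolate, policy: str) -> bool:
    """Apply one of the two interpretations described in the text.

    "narrow": toxin + intimin + priority serogroup are all required.
    "precautionary": toxin + intimin suffice, regardless of serogroup.
    """
    core_markers = isolate.has_shiga_toxin and isolate.has_intimin
    if policy == "precautionary":
        return core_markers
    if policy == "narrow":
        return core_markers and isolate.serogroup in PRIORITY_SEROGROUPS
    raise ValueError(f"unknown policy: {policy!r}")


# Hypothetical isolate: both virulence markers present, non-priority serogroup.
isolate = StecIsolate(has_shiga_toxin=True, has_intimin=True, serogroup="O55")
for policy in ("narrow", "precautionary"):
    print(policy, is_actionable(isolate, policy))
```

Making the policy an explicit argument keeps the laboratory result (the marker calls) separate from the risk management decision, which is where the disagreement described above actually lies.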
The antimicrobial resistance profile of pathogenic bacteria, while not a virulence attribute per se, remains an important factor in the ultimate public health outcomes of foodborne illness events, since a significant fraction of the affected population (e.g., the elderly and the immunocompromised) may critically require antibiotic therapy to recover. Furthermore, antimicrobial use at sub-therapeutic levels for growth promotion in food animal production has been implicated in the development of antimicrobial resistance (AMR) in animals and humans [81, 82], though there is a paucity of data to support this claim. Food testing laboratories can play an important role in contributing data on the occurrence of AMR bacteria to national and international surveillance initiatives seeking to understand the role of production practices in the emergence of these microorganisms. As an alternative to labor- and time-consuming phenotypic testing, AMR profiles can be predicted from WGS data by querying the subject genome against well-cataloged AMR gene markers deposited in curated AMR gene databases. A number of tools are currently available to predict AMR from bacterial WGS data (e.g., ResFinder [10, 83], SEAR, Resistance Gene Identifier, and ARMI). These AMR marker prediction tools rely on curated international AMR gene databases such as CARD, ARDB, and ARG-ANNOT. WGS-based methods for prediction of AMR phenotype have been shown to be highly accurate [86, 89, 90, 91].
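The principle of querying a genome against a catalog of marker sequences can be shown with a deliberately simplified sketch. It uses exact substring matching on both strands, which is an assumption made for brevity: the tools named above rely on sequence alignment with identity and coverage thresholds, and this code does not reproduce any of them. Marker names and sequences are invented placeholders, not real resistance genes.

```python
from typing import Dict, List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def screen_markers(genome: str, markers: Dict[str, str]) -> List[str]:
    """Return the names of markers found exactly on either strand of the genome."""
    genome = genome.upper()
    hits = []
    for name, marker in markers.items():
        marker = marker.upper()
        if marker in genome or reverse_complement(marker) in genome:
            hits.append(name)
    return sorted(hits)


# Placeholder catalog standing in for a curated AMR marker database.
toy_markers = {
    "marker_A": "GACCGGTA",
    "marker_B": "ATGCCGTT",
    "marker_C": "GGGGGCCCCC",
}

# Toy genome: marker_A on the forward strand, marker_B on the reverse strand.
toy_genome = "TT" + "GACCGGTA" + "CC" + reverse_complement("ATGCCGTT") + "AA"

print(screen_markers(toy_genome, toy_markers))
```

A marker hit predicts a genotype only; whether it translates into phenotypic resistance depends on gene integrity and expression, which is why the accuracy of such predictions has to be validated against phenotypic testing.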
3. Future applications
3.1 Deployment of ad hoc methods in support of outbreak investigations
Despite recent efforts of regulatory food safety agencies to implement test methods targeting defined serogroups of so-called priority STEC, the history of foodborne disease outbreaks is rife with examples of causative strains with unexpected characteristics (e.g., the 2011 German outbreak in which the etiologic agent belonged to serogroup O104, not a designated priority serogroup, and lacked the definitive virulence marker
The state of the art in WGS technology has reached the point where clinical isolates implicated in foodborne disease outbreaks are routinely sequenced in public health laboratories at an early stage during these events. With the application of appropriate bioinformatics tools to analyze the ensuing data, it should be possible to develop customized strain-specific test methods that can be rapidly deployed to food testing labs conducting analyses in support of outbreak investigations. The availability of WGS information for these strains should make it possible to ascertain the presence of traits conferring resistance to antimicrobial agents such as antibiotics, quaternary ammonium compounds, and tellurite, suggesting an avenue for the formulation of customized selective enrichment media enabling recovery of specific outbreak strains. This would be a particular advantage in instances where a food matrix (e.g., meats, sprouts, etc.) contains high levels of background microbiota, which might otherwise interfere with recovery of the target organism. Genomic AMR prediction tools can be used to discern the AMR marker profile of a strain of interest (e.g., outbreak strain) to identify an antibiotic resistance trait which can be exploited for customization of selective enrichment media favoring its recovery from samples with high background bacterial loads [86, 95]. In addition, WGS data can be analyzed using a pipeline such as SigSeekr, designed to identify DNA sequences associated with a particular strain for its rapid identification by PCR. By combining strain-specific selective enrichment and PCR detection tools, it should be possible to deploy custom recovery and identification tools for the efficient detection of STEC outbreak strains within the timeframe of an active investigation.
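One simple way to think about strain-specific signature sequences is as subsequences present in the target genome and absent from a panel of background genomes. The sketch below illustrates that idea with fixed-length k-mers and set subtraction. It is not a description of how SigSeekr works; the function names and toy sequences are assumptions, and only one strand is considered to keep the example short.

```python
from typing import Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(target: str, background: Iterable[str], k: int) -> Set[str]:
    """Return k-mers found in the target genome but in none of the background genomes.

    Simplification: only the given strand of each genome is examined.
    """
    candidates = kmers(target, k)
    for genome in background:
        candidates -= kmers(genome, k)
    return candidates


# Toy sequences standing in for an outbreak strain and two background strains.
outbreak_strain = "AAACCCGGG"
background_strains = ["AAACCC", "CCGGGT"]

print(signature_kmers(outbreak_strain, background_strains, k=4))
```

In practice, candidate signatures would also need to be long enough to serve as PCR targets and would have to be checked against a much larger background collection before primers are designed.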
The feasibility of such an approach has been demonstrated using laboratory STEC strains as models, with resistance to a variety of antibiotic classes used as the basis for their selective recovery against high backgrounds of commensal
3.2 Characterization of food microbiomes in support of improved method development
Metagenomic analysis of enrichment dynamics can be used to inform the development of improved methods for cultural enrichment of pathogens [97, 98, 99]. A practical approach for this is to selectively amplify hypervariable regions within the 16S rDNA and to sequence the amplicons using next-generation sequencing technologies. Sequences are then mapped to databases to determine the composition of microbial communities using bioinformatics tools such as QIIME or mothur. Samples from enrichment cultures can be used to evaluate growth of target pathogens relative to background food microbiota over time [97, 99]. Such studies can provide valuable insight into species that could potentially interfere with target pathogens, and this insight can be applied to the development of improved methodology.
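Once reads have been assigned to taxa, tracking a target pathogen relative to background microbiota reduces to computing relative abundances at each sampling time. The following minimal sketch covers only that downstream step and assumes the per-taxon read counts have already been produced by a tool such as those named above; the taxon labels, time points, and counts are invented for illustration.

```python
from typing import Dict


def relative_abundance(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-taxon read counts into fractions of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any taxon")
    return {taxon: n / total for taxon, n in counts.items()}


# Invented read counts at three enrichment time points (hours).
enrichment_counts = {
    0: {"target": 10, "background_1": 700, "background_2": 290},
    8: {"target": 200, "background_1": 600, "background_2": 200},
    24: {"target": 750, "background_1": 200, "background_2": 50},
}

target_fraction = {
    hours: relative_abundance(counts)["target"]
    for hours, counts in enrichment_counts.items()
}
for hours, fraction in sorted(target_fraction.items()):
    print(f"{hours:>2} h: target = {fraction:.1%} of assigned reads")
```

Relative abundance is compositional: a rising target fraction can reflect growth of the target, decline of the background, or both, so it is best read alongside plate counts or another absolute measure.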
A modern concept in the study of pathogenic bacteria is the emergence of novel pathogens among commensals in a given environment through the acquisition of virulence factors by horizontal gene transfers from other bacteria [102, 103, 104]. The evolutionary trail of the STEC family suggests a priori transformations of benign
Modern food microbiology research has generated a critical understanding of the epidemiology, pathogenic mechanisms, virulence factors, and other salient characteristics of the major food pathogens. The convergence of expanded scientific knowledge and sophisticated technological capability creates exciting new opportunities for the refinement of food microbiology testing programs to meet the needs of a comprehensive risk-based inspection approach. Advances in next-generation sequencing technologies have made it possible for investigators to carry out sequencing and processing of bacterial genomes within the time course of a typical foodborne illness outbreak investigation. It may reasonably be expected that in the near future, analysts will be moving from traditional DNA hybridization approaches (e.g., PCR and microarrays) toward rapid whole-genome sequencing, allowing a much more comprehensive examination of the isolate at hand. This new approach will require broad access to leading-edge bioinformatics capability for analysis of complex genomics data in silico to ascertain the presence of key genetic markers (e.g., presence of virulence genes in bacterial pathogens, completeness and functionality of gene products, markers for molecular typing, etc.). The generation and analysis of WGS information require the migration of large packets of information between laboratory sites involved in the exploitation of this information, remote computing sites, and internet databases for data manipulation and comparative analyses. There are many ways in which the high-tech needs of the future can be met, even for relatively small laboratories with low operating budgets. Opportunities abound for food inspection, public health, and academic laboratories to pool their resources and serve one another in the common purpose of safeguarding citizens from preventable food-acquired illness.
The authors thank Paul Manninger, Mylène Deschênes, and Martine Dixon for providing laboratory support throughout many years in developing the food pathogen genomics program at the Canadian Food Inspection Agency and Andrew Low for providing bioinformatics support and for critical review of this manuscript.