Steps and Tools for PCR-Based Technique Design

The identity and clonal differences within bacterial populations have been broadly explored through PCR-based techniques. Bacterial identification and DNA fingerprinting have thus provided insights into phenotypic and genotypic variation. Indeed, some of the observed diversity may reflect changes among subpopulations of coexisting genotypes, each with its own ecological dynamics and individual traits. The identification of polymorphic regions in nucleic acid sequences therefore relies on recognizing both conserved and variable regions. Advantages of PCR-based methods are high sensitivity, specificity, speed, cost-effectiveness, and the opportunity for simultaneous detection of many microbial agents or variants. Fingerprint information may allow certain outbreaks to be tracked globally through reference databases containing valuable genotyping information. In this chapter, we review Web resources and online computational tools for designing PCR-based methods to identify bacterial species. We also focus on lab applications and key conditions for technique standardization.


Introduction
Bacterial culture, which is based on the isolation and growth of live specimens, is the conventional test to identify a microorganism [1]. Culture-based methods are currently considered the gold standard for assessing the validity of new diagnostic methods. However, phenotypic methods are time-consuming, difficult to interpret, poorly reproducible between laboratories, expensive, and laborious. Several commercial polymerase chain reaction (PCR)-based methods are available nowadays and, like in-house PCR assays, they target different specific genes to identify a pathogen in a clinical sample [2].
In many cases, bacterial identification is performed by comparing a fingerprint against reference genotyping databases. Many organisms can thus be taxonomically classified and specifically differentiated according to several conserved genes that are recognized as "molecular clocks" [3]. One well-known example is the ribosomal RNA (rRNA) gene, which is a good candidate because of its universal distribution and reasonably good sequence conservation across evolution [4]. A good method is feasible, rapid, and cheap and can be implemented in local settings in areas that are highly endemic for a given infectious disease [5]. In this chapter, the design of a proper PCR-based method is addressed as a series of steps.
Table 1. Techniques commonly used for bacterial genotyping and identification [6].

Step two: defining a gene target
To analyze genetic variability, it is necessary to select an appropriate target that supports the required hierarchical level. The choice of a "molecular clock" remains important because such genes carry traces of the evolutionary record of microbial diversity [16]. Ribosomal genes are ancient molecules that harbor information with high phylogenetic value and can differentiate organisms at the genus and species level [17]. However, alternative core genes, also known as housekeeping genes, have been proposed for bacterial species definition, such as RNA polymerase beta subunit (rpoB), DNA gyrase alpha subunit (gyrA), glyceraldehyde 3-phosphate dehydrogenase A (gapA), GroEL genes (groE, groL), outer membrane protein A (ompA), and glucose-6-phosphate isomerase (pgi) [18][19][20]. Some of these genes are included in multi-locus sequence typing (MLST) schemes (see further information in step ten).
Typing methods based on the 16S rRNA genes represent an accurate strategy for strain characterization because these genes harbor both conserved and variable regions, so that changes at specific positions can lead to strain differentiation [4,17]. However, it is important to consider that multiple copies of the 16S rRNA gene may be present in a bacterial genome. The 16S rRNA gene has been the subject of many phylogenetic studies, including those related to bacterial species definition [17].
Sequencing of rRNA genes is the preferred method for phylogenetic reconstruction, nucleic acid-based detection, and quantification of microbial diversity [21]. Many genotyping approaches are therefore still based on 16S rRNA gene analysis or ribosomal gene sequencing, which remains a gold standard for bacterial taxonomy [22]. A sequence can be explored by searching against the Ribosomal Database Project (RDP) release 11.5, which contained 3,356,809 16S rRNA and 125,525 fungal 28S rRNA gene sequences as of November 2018 (https://rdp.cme.msu.edu/) [23,24]. A similar resource for ribosomal genes is the SILVA database (https://www.arb-silva.de/), which includes over 6,800,000 small (16S/18S, SSU) and large (23S/28S, LSU) subunit rRNA sequences [25]. Another useful resource is the Ribosomal Protein Gene database (RPG), which contains information from some eukaryotic, archaeal, and bacterial organisms, including sequences (genomic, cDNA, and amino acid sequences), intron/exon structures, genome locations, small nucleolar RNAs (snoRNAs), and ortholog data (http://ribosome.med.miyazaki-u.ac.jp) [26].

Step three: primer design for PCR-based methods
Once a gene target has been chosen, the next step is to design specific primers to detect it in the DNA sample. In recent decades, PCR-based techniques have been successfully employed for the genetic characterization of many taxa of pathogens [2,27]. Primers for DNA amplification are short synthetic oligonucleotides that are complementary to target sites on the template DNA. PCR is performed at different temperatures (denaturation, annealing, and extension), and its efficiency is largely determined by primer annealing [28]. Some essential features have to be taken into consideration for accurate primer design:
1. Primers should contain guanine or cytosine, or both, at the 3′-end to increase the efficiency of oligonucleotide binding. Primers must form a stable duplex with the target DNA at the annealing temperature.
2. The oligonucleotide should not be self-complementary, to avoid the formation of secondary structures such as hairpin loops.
3. Forward and reverse primers should not be complementary to each other, to avoid primer-dimer formation.
4. Melting temperature (Tm) is the temperature at which 50% of the primer is bound to the target DNA and 50% remains as unbound primer and free template. A Tm between 42 and 65°C is recommended, with an ideal value of 62°C. This parameter is key because a Tm that is too high can affect specificity and decrease the amount of PCR product (amplicon), whereas a Tm that is too low can yield unspecific products due to mismatched base pairing. Usually, the annealing temperature is set within 5°C (below or above) of the primer Tm.
5. Primers should be long enough to ensure specificity (usually between 18 and 30 bases).
6. It is recommended to keep a balanced G/C and A/T content in the primers (40-60% G/C).
7. For standard PCR, length of the amplified product should be between 200 and 1000 bp and for quantitative PCR in the range from 75 to 150 bp. Nevertheless, PCR products larger than 1000 bp may require additional time during the extension step (1 min/kb of PCR product).
8. Primers must be designed to be specific for the target gene to avoid amplification of unwanted DNA sequences in the PCR. Typically, primers are designed and their sequences are analyzed in silico, using BLAST or similar tools, to check their specificity.
Many resources are available for primer and probe design to optimize the PCR method; the researcher should consider parameters including molecular weight, millimolar extinction coefficient, Tm, predicted secondary structure formation, and magnesium chloride (MgCl2) concentration (Table 2). For some molecular biology procedures, it is recommended to design the forward primer less than 35 bp downstream of the start site of the coding gene, and the same applies to the reverse primer with respect to the stop site. For example, in a sequencing protocol, the fragment should not be too large because of artifacts introduced into the sequence read. Likewise, unclear results become more frequent with larger fragments [29].
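Several of the rules listed above can be checked automatically before primers are ordered. The following sketch is a minimal illustration and not a replacement for the dedicated tools in Table 2. It applies the length (18-30 bases), G/C content (40-60%), and 3′ G/C rules from the list, and it estimates Tm with the simple Wallace rule (2°C per A/T and 4°C per G/C). The Wallace rule is an assumption introduced here and is only a rough guide for short oligonucleotides. The function names and the example sequence are illustrative.

```python
# Minimal primer sanity checks (illustrative sketch, not a validated design tool).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> float:
    """Rough Tm estimate (Wallace rule): 2 C per A/T plus 4 C per G/C.

    Assumption: used here only as a quick approximation for short oligos.
    """
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2.0 * at + 4.0 * gc


def check_primer(primer: str) -> dict:
    """Apply the length, G/C content, and 3'-end rules from the list above."""
    p = primer.upper()
    return {
        "length_ok": 18 <= len(p) <= 30,
        "gc_ok": 40.0 <= gc_percent(p) <= 60.0,
        "gc_clamp": p[-1] in "GC",      # G or C at the 3'-end
        "tm_estimate": wallace_tm(p),
    }


def three_prime_complementary(fwd: str, rev: str, n: int = 4) -> bool:
    """True if the last n bases of fwd can pair with a stretch of rev.

    A simple screen for primer-dimer risk; n = 4 is an arbitrary choice.
    """
    tail = fwd.upper()[-n:]
    return reverse_complement(tail) in rev.upper()


if __name__ == "__main__":
    example = "AGAGTTTGATCCTGGCTCAG"  # illustrative 20-mer
    print(check_primer(example))
```

A primer that fails one of these checks is not necessarily unusable, but it is worth re-examining with a full thermodynamic tool before synthesis.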

Step four: in silico simulation of molecular biology experiments
Nowadays, many resources (Web servers and programs) are available to simulate PCR results, predicting expected bands and successful primer annealing [30,31]. In silico simulation of several PCR-based methods is also possible with tools that return theoretical PCR results for the many bacterial species sequenced to date [32]. The list of target genomes is updated according to their availability at NCBI. Many experiments against prokaryotic genomes can be performed, such as PCR amplification, restriction digest, PFGE, PCR-RFLP, double digestion fingerprinting, AFLP-PCR, and other DNA fingerprinting techniques (http://insilico.ehu.es/) [33]. PCR simulation is also possible if the researcher already knows the target sequence and can test it with some of the resources listed in Table 2.

[...] dNTPs, primers (forward and reverse), sample DNA, and DNA polymerase [29]. The 10× reaction buffer includes magnesium, so adding it separately as MgCl2 is optional. If it is added separately, the typical MgCl2 concentration in a standard PCR should be between 1.5 and 2.0 millimolar (mM). When magnesium is too low, no amplicon may appear; when it is too high, undesired amplicons may be observed as extra bands in the agarose gel. The dNTP concentration should be 200 μM of each nucleotide. For Taq DNA polymerase (obtained from Thermus aquaticus; further information in step six), adding between 0.5 and 2.0 units per 50 μL mix (preferably 1.25 units) is recommended. Primers work well at the default concentration (50 nM), but concentrations between 0.1 μM and 1 μM of each primer are recommended. Last, between 1 ng and 1 μg of genomic template DNA should be used, because higher amounts might reduce PCR product specificity [28].
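The final concentrations given above translate into pipetting volumes through the dilution relation C1·V1 = C2·V2. The sketch below is one way to compute a single 50 μL reaction under stated assumptions. The final concentrations (1.5 mM MgCl2, 200 μM of each dNTP, 1.25 units of Taq per 50 μL) come from the text. The stock concentrations, the choice of 0.2 μM for each primer, the 1 μL template volume, and the use of a magnesium-free buffer are all assumptions made for illustration.

```python
# Reaction mix volumes from C1*V1 = C2*V2 (illustrative sketch).

def stock_volume(final_conc: float, stock_conc: float, reaction_ul: float) -> float:
    """Volume of stock (in uL) needed to reach final_conc in reaction_ul.

    final_conc and stock_conc must be expressed in the same unit.
    """
    return final_conc * reaction_ul / stock_conc


def reaction_mix(reaction_ul: float = 50.0, template_ul: float = 1.0) -> dict:
    """Per-reaction volumes in uL. Stock concentrations are assumptions."""
    volumes = {
        "10x buffer (Mg-free, assumed)": stock_volume(1.0, 10.0, reaction_ul),
        "MgCl2, 25 mM stock": stock_volume(1.5, 25.0, reaction_ul),        # 1.5 mM final
        "dNTP mix, 10 mM each": stock_volume(0.2, 10.0, reaction_ul),      # 200 uM each
        "forward primer, 10 uM": stock_volume(0.2, 10.0, reaction_ul),     # 0.2 uM (assumed)
        "reverse primer, 10 uM": stock_volume(0.2, 10.0, reaction_ul),     # 0.2 uM (assumed)
        "Taq polymerase, 5 U/uL": (1.25 / 5.0) * (reaction_ul / 50.0),     # 1.25 U per 50 uL
        "template DNA": template_ul,
    }
    # Water fills the remaining volume.
    volumes["water"] = reaction_ul - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for reagent, ul in reaction_mix().items():
        print(f"{reagent:32s} {ul:6.2f} uL")
```

For several samples, the same volumes are usually multiplied by the number of reactions plus a small excess to prepare a shared master mix without the template.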

Step six: choosing the DNA polymerase enzyme
DNA polymerase is the enzyme that synthesizes the new DNA strands. A thermostable DNA polymerase was first isolated from T. aquaticus in 1976 [34]. This enzyme has an optimum temperature between 75 and 80°C, has a half-life of 9 min at 97.5°C, and can polymerize 150 nucleotides per second [35]. When choosing a DNA polymerase, the researcher must consider key aspects such as specificity, thermostability, fidelity, and processivity. First, if specificity is low, a low-quality amplicon will affect product yield and sensitivity and may cause problems in downstream applications (e.g., cloning or protein expression). Second, regarding thermostability, consider using enzymes that remain stable above 90°C because of the denaturing step. Third, if the researcher needs amplicons with 100% similarity to the DNA target, consider using high-fidelity DNA polymerases with significant proofreading activity. Last, processivity reflects the rate and speed of the enzyme's reaction. Processivity should therefore be considered in the case of long templates, self-complementary targets, high G/C content, and samples containing PCR inhibitors, and also when the amplicon accumulates during later PCR cycles [29,36].

Step seven: setting the PCR conditions
The PCR runs in cycles composed of three steps: denaturation, annealing, and extension. By default, a PCR includes between 25 and 35 cycles per reaction. The denaturation step produces single-stranded DNA and is usually performed initially at 95°C for 2 min [37]. The following step is primer annealing, in which the primers base-pair with the complementary DNA template; its temperature is generally chosen according to the Tm of both PCR oligos. Third, for the extension step, it is recommended to set 1 min per 1000 bp of amplicon (or 30 s per 500 bp). Larger PCR products (>3 kb) may require longer extension times. The extension is usually performed at 72°C, which is considered the optimum temperature for thermostable DNA polymerases. A standard PCR protocol for a 500 bp amplicon includes an initial denaturation at 95°C for 2 min, followed by 25 cycles of denaturation at 95°C for 15 s, annealing at 55°C for 15 s, and extension at 68°C for 45 s, with one additional cycle of final extension at 68°C for 5 min [28,29].
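The cycling rules in this step can be written down as a small program builder. The sketch below is an illustration under stated assumptions. It takes the 1 min per 1000 bp extension rule, the 95°C denaturation, the 72°C extension, and the 25-cycle default from the text, and it reports only the programmed hold time, ignoring the ramping time of the thermocycler. Because it applies the per-kilobase rule strictly, it yields 30 s of extension for a 500 bp amplicon, which is shorter than the 45 s used in the example protocol above. The class and function names are hypothetical.

```python
# Build a basic three-step PCR program from the amplicon length (illustrative sketch).
import math
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    temp_c: float
    seconds: int


def extension_seconds(amplicon_bp: int, seconds_per_kb: int = 60) -> int:
    """Extension time from the 1 min per 1000 bp rule, rounded up to whole seconds."""
    return math.ceil(amplicon_bp * seconds_per_kb / 1000)


def build_program(amplicon_bp: int, annealing_c: float = 55.0,
                  cycles: int = 25, extension_c: float = 72.0) -> dict:
    """Return the initial step, the repeated cycle, and the final extension."""
    cycle = [
        Step("denaturation", 95.0, 15),
        Step("annealing", annealing_c, 15),
        Step("extension", extension_c, extension_seconds(amplicon_bp)),
    ]
    return {
        "initial": Step("initial denaturation", 95.0, 120),
        "cycles": cycles,
        "cycle": cycle,
        "final": Step("final extension", extension_c, 300),
    }


def hold_time_seconds(program: dict) -> int:
    """Total programmed hold time; ramping between temperatures is not included."""
    per_cycle = sum(step.seconds for step in program["cycle"])
    return program["initial"].seconds + program["cycles"] * per_cycle + program["final"].seconds


if __name__ == "__main__":
    program = build_program(500)
    print(program["cycle"])
    print(hold_time_seconds(program) / 60, "min of hold time")
```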

Step eight: setting specificity and sensitivity of the PCR method
Assay specificity should be assessed against many related microorganisms. Potential cross-reactivity with DNA contaminants in the sample should also be investigated, especially when the method is applied to natural populations [38]. This issue is essential, particularly when the new method is compared with traditional techniques. Specificity is first tested in silico using the BLAST tool [39]. Then, specificity can be assessed in vitro by PCR amplification of genomic DNA purified from taxonomically related species. Sensitivity defines the detection limit, that is, the minimum amount of target DNA that can be detected in a sample. This issue is relevant when cultures are difficult to obtain or when a low number of bacteria cannot be detected by other diagnostic techniques [40].
DOI: http://dx.doi.org/10.5772/intechopen.83671
Identifying the bacterial species related to clinical phenotypes requires a method that clusters fingerprints into groups likely to share most genotypic and phenotypic traits [8]. Alternatively, species have been genotyped by measuring genetic variation in the number of repetitive genetic elements in the direct repeat region or by detecting polymorphic sequences [41]. These techniques have identified clonal groups of isolates that each appear to be related through a common ancestor [42].
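The detection limit discussed in this step is often expressed as genome copies per reaction rather than as a DNA mass. The sketch below shows the usual conversion, assuming an average mass of 650 g/mol per base pair of double-stranded DNA, together with a tenfold dilution series of the kind used to find the lowest detectable input. The 650 g/mol figure is a common approximation, and the 5 Mb genome size in the example is a hypothetical value, not a property of any organism named in this chapter.

```python
# Convert a DNA mass to genome copies and plan a dilution series (illustrative sketch).
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # average mass of one base pair of dsDNA (approximation)


def genome_copies(mass_ng: float, genome_bp: int) -> float:
    """Approximate number of genome copies in mass_ng of double-stranded DNA."""
    grams = mass_ng * 1e-9
    return grams * AVOGADRO / (genome_bp * BP_MASS_G_PER_MOL)


def dilution_series(start_copies: float, steps: int, factor: float = 10.0) -> list:
    """Copies per reaction for each step of a serial dilution, starting undiluted."""
    return [start_copies / factor ** i for i in range(steps)]


if __name__ == "__main__":
    copies = genome_copies(1.0, 5_000_000)   # 1 ng of a hypothetical 5 Mb genome
    for level in dilution_series(copies, 6):
        print(f"{level:12.1f} copies per reaction")
```

With these assumptions, 1 ng of a 5 Mb genome corresponds to roughly 10^5 genome copies, so six tenfold dilutions reach the range of a few copies per reaction.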

Step nine: evaluation of the amplicons
Experimental validation of PCR results entails two options. The first is to load and run the PCR product on an agarose gel and check the expected sizes against a suitable molecular weight marker. The second is to sequence the amplicon and evaluate its sequence identity to the DNA target [28,29].
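The band size expected on the gel can be predicted beforehand when the template sequence is known. The sketch below is a deliberately simple in silico PCR that assumes exact primer matches on a linear template and searches only the strand given as input. Real simulation tools also handle mismatches, both orientations, and multiple binding sites. The template and primers in the example are made-up sequences.

```python
# Predict the amplicon and its size from a template and two primers (illustrative sketch).
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the predicted amplicon, or None if either primer has no exact site.

    Assumptions: exact matching only, linear template, forward primer on the
    given strand and reverse primer on the opposite strand, first sites used.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)   # how the reverse primer site reads on this strand
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    # Made-up template: flank + forward site + spacer + reverse site + flank
    template = "TTTT" + "ATGCATGCAT" + "G" * 20 + "CCGGAATTCC" + "AAAA"
    amplicon = in_silico_pcr(template, "ATGCATGCAT", "GGAATTCCGG")
    print(len(amplicon), "bp expected on the gel")
```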

Step ten: comparing a query band pattern or DNA sequence against databases
Genotyping programs rely on the collection and analysis of large quantities of data. Infection control programs are implementing genotyping programs that compare results against a database. Central databases for isolate tracking, laboratory results, and epidemiologic data are essential. Because cluster investigations are an epidemiologic activity, infectious disease programs should maintain the principal databases for spread analysis and control measures [43]. The information in these databases can enable infectious disease programs to easily identify patients with matching genotypes and epidemiologic links [44]. Today, bacterial genotyping information from both traditional MLST and whole-genome sequencing (WGS) is available (Table 3) [10,15][45][46][47]. For further applications, public resources also store primer information for quantitative gene expression analysis or for comparison with previous reports [48][49][50].

Resource | Description | Reference
PubMLST | This database contains more than 140 MLST allelic profiles and sequences. BIGSdb software runs in PubMLST to store and analyze sequence data for bacterial isolates. |
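When the query is a band pattern rather than a sequence, comparison against reference patterns is commonly scored with a band-matching similarity coefficient. The sketch below uses the Dice coefficient, which is a choice made here for illustration and is not a method prescribed in this chapter. Two bands are treated as the same when their sizes differ by at most 5%, bands are paired greedily, and the reference patterns are invented values standing in for a local database.

```python
# Compare a query band pattern with reference patterns using the Dice coefficient
# (illustrative sketch; tolerance and reference data are assumptions).

def shared_bands(query, reference, tolerance=0.05):
    """Count bands (sizes in bp) present in both patterns.

    Two bands match when their sizes differ by at most `tolerance` of the
    larger band. Each reference band is used once (greedy pairing).
    """
    remaining = sorted(reference)
    shared = 0
    for band in sorted(query):
        for i, ref in enumerate(remaining):
            if abs(band - ref) <= tolerance * max(band, ref):
                shared += 1
                del remaining[i]
                break
    return shared


def dice(query, reference, tolerance=0.05):
    """Dice similarity: 2 * shared bands / total number of bands in both patterns."""
    total = len(query) + len(reference)
    if total == 0:
        return 1.0
    return 2 * shared_bands(query, reference, tolerance) / total


def best_match(query, database, tolerance=0.05):
    """Return (name, score) of the most similar reference pattern."""
    scores = {name: dice(query, bands, tolerance) for name, bands in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


if __name__ == "__main__":
    # Hypothetical reference patterns (band sizes in bp)
    database = {"type A": [1490, 810, 400], "type B": [1500, 600, 200, 100]}
    print(best_match([1500, 800, 400], database))
```

Greedy pairing is a simplification. Dedicated fingerprint software normally adds gel normalization and optimized band matching before any coefficient is calculated.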

Variants of the PCR methods
Beyond routine PCR use in the lab, the availability of improved thermocyclers, dyes, primers, probes, and DNA polymerases has extended its applications. Many PCR variants allow the quantification of gene expression, improved diagnostic sensitivity, and genotyping without further procedures such as restriction analysis (Table 4) [51][52][53][54][55][56][57][58][59][60][61][62][63][64]. Advanced techniques such as digital PCR (dPCR) have been shown to improve sensitivity and reliability, down to single-cell applications, as well as tolerance to PCR inhibitors such as chelating agents [64]. Finally, another common application of several PCR-based techniques (conventional PCR, RT-PCR, qPCR, and LAMP assays) is the detection of antimicrobial resistance against first-line drugs or even last-resort antibiotics [65].

Technique | Principle | Use | Reference
In-house PCR | Conventional PCR targeting a single gene | Detection of a pathogen in a clinical sample | [62]
Nested PCR | Two pairs of primers are used to amplify a fragment. The first pair amplifies as in a conventional PCR. The second pair is nested within the first fragment | Increased specificity of an amplification reaction when targeting a gene | [61]
Multiplex PCR | Enables the simultaneous amplification of many targets in a single reaction by using more than one pair of primers. Amplicon sizes should be different, or the amplicons should be labeled | Screening for a set of genes at once in a DNA sample. Analysis of microsatellites and SNPs | [60]
Real-time PCR or quantitative PCR | An assay that monitors the fluorescence emitted during the reaction as an indicator of amplicon production at each PCR cycle (real time), as opposed to endpoint detection | DNA quantification in a sample. Level of gene expression. Copy number variation. Genotyping. Multi-species analysis | [59]
Inverse PCR | Primers are oriented in the reverse of the usual direction. The template for the reverse primers is a restriction fragment that has been self-ligated | |
Droplet PCR | Includes a step that separates the sample into multiple compartments so that only a few molecules are present in each partition. Each droplet is thus an independent PCR | Detection of rare species and of mutations with low frequency | [53]

Educational tools for beginners
Biotechnology students face challenges in understanding molecular biology principles and techniques. E-learning resources simulate a real environment in which students gain realistic experience of running protocols in the lab [66,67]. The Genetic Science Learning Center provides a Flash simulation of PCR principles that is useful for beginners (https://learn.genetics.utah.edu/content/labs/pcr/). This virtual lab explains, step by step, the concepts and background behind the technique. Another educational tool, which trains students not only in the method but also in the interpretation of results, is the virtual lab for bacterial identification (available at http://www.hhmi.org/biointeractive/vlabs/bacterial_id/index.html). The purpose of the experiment is to use molecular tools to identify different bacterial species based mainly on their 16S rRNA genes. This virtual lab requires students to prepare a patient's sample, isolate the whole bacterial DNA, perform a DNA sequencing analysis, run the query sequence in the BLAST tool, and identify the pathogen. Finally, students can practice PCR-RFLP simulations through online exercises based on available bacterial genomes [32].

Case study: identification at species and subspecies level
PCR-based detection that targets the conserved regions of the 16S rRNA sequence of bacterial pathogens is currently performed by several groups [4,18,21]. The SSU rRNA contains segments that are conserved at the species, genus, and kingdom levels. In this case, Klebsiella pneumoniae is divided into three subspecies: K. pneumoniae pneumoniae, K. pneumoniae rhinoscleromatis, and K. pneumoniae ozaenae. All three are phenotypically closely related and difficult to differentiate with conventional tests [68].
The 16S rRNA gene sequences from the K. pneumoniae subspecies were retrieved in FASTA format and aligned using the MACAW program (download link http://en.biosoft.net/format/MACAW.html). Subsequently, the sequences were virtually cleaved according to the restriction enzyme database obtained from REBASE (GCG format) as implemented in the GeneDoc program (available at https://www.softpedia.com/get/Science-CAD/GeneDoc.shtml). Restriction patterns were predicted by testing each enzyme and looking for subspecies-specific patterns. Primers targeting the 16S rRNA gene were custom-designed using the Gene Runner program (Table 2). Experimental validation was performed by growing the reference strains and extracting their genomic DNA. After PCR, amplification products and restriction enzyme digests were resolved by agarose gel electrophoresis. Restriction enzyme combinations generated fragments that allowed easy identification of the subspecies after separation of the digested DNA. Primer design was challenging because 16S rRNA genes are highly conserved in all bacterial genomes and may be present in multiple copies [69,70]. The PCR-RFLP protocol was designed specifically to detect single nucleotide polymorphisms in the 16S rRNA genes of the K. pneumoniae subspecies. In conclusion, our method combined the detection of specific nucleotide polymorphisms with the easy identification and categorization of the three subspecies of K. pneumoniae [68].
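The virtual cleavage step of a PCR-RFLP design can be illustrated with a few lines of code. The sketch below is not the GeneDoc/REBASE workflow used in the case study. It is a minimal stand-in that cuts a linear amplicon at every exact occurrence of one recognition site and returns the fragment lengths. EcoRI (G^AATTC, cutting after the first base) serves as the example enzyme, and the two amplicons are made-up sequences that differ by a single nucleotide inside the site.

```python
# Virtual restriction digest of a linear amplicon (illustrative sketch).

def digest(sequence: str, site: str, cut_offset: int) -> list:
    """Fragment lengths (bp) after cutting at every exact occurrence of `site`.

    cut_offset is the number of bases of the site that stay on the left
    fragment of the given strand (1 for EcoRI, G^AATTC). Fragments are
    returned in their order along the sequence.
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Skip zero-length pieces that would arise from a cut at the very end
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    # Two made-up 100 bp amplicons differing by one base inside the EcoRI site
    with_site = "A" * 30 + "GAATTC" + "T" * 64
    without_site = "A" * 30 + "GAGTTC" + "T" * 64
    print(digest(with_site, "GAATTC", 1))      # two fragments
    print(digest(without_site, "GAATTC", 1))   # uncut amplicon
```

A single nucleotide polymorphism that creates or destroys a recognition site changes the number of fragments, which is the principle that allows the subspecies to be told apart on a gel.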
Another case was the implementation of an in-house PCR method for tuberculosis diagnosis [62]. This molecular method might be an important tool in high-incidence areas because its speed, sensitivity, and discriminatory power surpass those of conventional methods (acid-fast stain and culture). The method was designed around the IS6110 insertion sequence, which is specific for the Mycobacterium tuberculosis complex, and was successfully tested in sputum, bronchoalveolar lavage fluid, blood, gastric fluid aspirate, urine, cerebrospinal fluid, ascitic fluid, and abscess secretions. The method improved diagnostic accuracy, proved to be fast, low cost, and feasible, and can be implemented in a middle-income setting [62].

Conclusions
Some bacterial pathogens may be undetectable by traditional culture methods because of their nutrient requirements, growth conditions, or the bacterial inoculum per sample. PCR has therefore emerged as an effective method that overcomes the detection limit for certain pathogens in clinical samples. Success in a PCR experiment requires planning and the prediction of results with the available tools and resources. Web-based tools and programs are useful for primer design, for calculating accurate thermodynamic and physicochemical parameters, for adjusting the thermal cycling protocol, and for producing a sound experimental design. Success in a PCR-based protocol also depends on an accurate in silico simulation, which allows the optimal selection of reagents and test conditions and reduces the need to troubleshoot inefficient reactions. The recommendations in this chapter may enable the researcher to customize and troubleshoot a wide variety of PCR-based methods. PCR thus remains a versatile technique in molecular biology: its standard protocol can be adjusted to any gene target, so that the most suitable option for pathogen identification can be chosen.