Ranking of stable reference genes for qPCR experiments using developing Jatropha curcas seeds according to the NormFinder software.
This chapter provides guidelines for studies using quantitative PCR (qPCR) with either dyes or probes, and describes several components essential to a good PCR assay. The efficiency and specificity of qPCR depend on several parameters related to mRNA quantification that must be controlled to avoid mistakes in data interpretation. Avoiding contamination with proteins, carbohydrates and phenolic compounds during RNA extraction and purification improves RNA quality and provides reliable results. Specific primers and sensitive probes are also crucial to maximize efficiency, specificity and fluorescence. Other steps, such as optimizing primer concentrations and constructing primer efficiency curves, must also be performed. During gene-expression profiling, qPCR assays require reference genes to normalize the target gene expression data. Candidate reference genes are checked for stability with programs such as geNorm, BestKeeper and NormFinder, and the most stable ones are used to normalize the qPCR data. Additionally, choosing appropriate reference genes for each specific experimental condition is fundamental. The main aim of this chapter is to provide guidelines and highlight precautions for obtaining successful qPCR assays.
- gene expression
- real-time qPCR
- reference genes
The polymerase chain reaction (PCR) technique was first introduced by Kary Mullis. Thereafter, advances in PCR gave rise to a more sensitive technique, quantitative PCR (qPCR: quantitative real-time polymerase chain reaction), which employs cDNA as template. cDNA is DNA complementary to RNA molecules, synthesized by reverse transcription. During qPCR, a dye or probe binds to or is incorporated into the amplified double-stranded DNA (dsDNA), acting as a fluorescent reporter during amplification. Thus, the increase in fluorescent signal is directly proportional to the number of PCR products synthesized in the reaction.
qPCR is widely regarded as the most effective method to analyze modulations in gene expression because it can detect and precisely quantify target genes, even at low expression levels. qPCR reactions make it possible to measure mRNA expression levels in many kinds of samples. Nonetheless, a successful qPCR assay requires an appropriate normalization approach to correct for nonspecific variation among cDNA samples. Thus, normalizing target genes against reference genes is decisive to compensate for errors introduced during RNA extraction or by contamination during sample handling.
In summary, several precautions must be taken in order to obtain consistent results and avoid mistakes in data interpretation in qPCR assays, including: (i) accurate primer design with respect to specificity and efficiency; (ii) purified mRNA free of contaminants, such as carbohydrates, proteins and phenols; and (iii) a rigorous choice of reference genes, which must be stable under the experimental condition analyzed.
1.1. Quality of RNA
High-quality RNA is an essential requirement for qPCR. Several contaminants may interfere with PCR reactions, mainly by inhibiting the reverse transcriptase and DNA polymerase enzymes; these include genomic DNA, excess proteins and carbohydrates, as well as phenolic compounds. RNA can be quantified at 260 nm in a spectrophotometer, and absorbance readings at 280 and 230 nm are used to detect proteins and carbohydrates, respectively. According to Sambrook et al., when verifying RNA purity, an A260/280 ratio between 1.8 and 2.0 denotes low contamination with proteins, whereas an A260/230 ratio > 2.0 indicates very low contamination with carbohydrates. These arguments are corroborated by the reports of Ref. .
The integrity of the RNA must also be analyzed through gel electrophoresis. RNA integrity is assessed by examining the 28S and 18S ribosomal bands; the absence of one or both bands suggests RNA degradation. In general, electrophoresis in a 0.8–1.0% agarose gel is suitable to assess the integrity of the ribosomal RNA subunits.
An additional purification step, digestion of genomic DNA, must be performed before starting qPCR; otherwise, the genomic DNA can act as template during qPCR and produce unreliable results. This can be avoided either by treating the RNA samples directly with RNase-free DNase or, without such treatment, by using specific primers designed across an exon-exon boundary of the gene coding region.
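The purity thresholds quoted above can be applied mechanically to spectrophotometer readings. The following is a minimal sketch; the function name and the example absorbance values are hypothetical and serve only as illustration.

```python
def assess_rna_purity(a260, a280, a230):
    """Compute purity ratios and compare them with the thresholds quoted in the text."""
    ratio_280 = a260 / a280  # protein contamination indicator
    ratio_230 = a260 / a230  # carbohydrate contamination indicator
    return {
        "A260/280": ratio_280,
        "A260/230": ratio_230,
        # A260/280 between 1.8 and 2.0: low contamination with proteins
        "low_protein": 1.8 <= ratio_280 <= 2.0,
        # A260/230 above 2.0: very low contamination with carbohydrates
        "low_carbohydrate": ratio_230 > 2.0,
    }


# Hypothetical absorbance readings for two RNA preparations
clean_sample = assess_rna_purity(a260=0.50, a280=0.26, a230=0.23)
dirty_sample = assess_rna_purity(a260=0.50, a280=0.35, a230=0.40)
print(clean_sample)
print(dirty_sample)
```

A sample that fails either check would normally be purified again before cDNA synthesis.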
1.2. Primer design and probe considerations
Designing specific primers and adopting appropriate probes are crucial requirements for amplification efficiency, specificity and fluorescence in qPCR assays. Primers should be designed across an exon-exon junction of the gene sequence so that they specifically amplify the target cDNA sequence and avoid amplification of contaminating genomic DNA. Primer efficiency can be analyzed with serial dilutions or standard curves, which define the ideal primer concentration and/or assess the reaction efficiency. In this case, the log of each concentration used in the standard curve is plotted against the Cq value for that concentration, revealing the reaction performance and other reaction parameters (including the y-intercept, slope and coefficient of correlation). In the literature, researchers have typically used the formula:

$$E = 10^{-1/\text{slope}} - 1$$

Here E is the reaction efficiency, and the slope is estimated as the difference between the Cq value of the first dilution and the Cq value of the last dilution, divided by the number of dilutions (Figure 1). Considering that at 100% qPCR efficiency the total amount of PCR product doubles after each cycle, the standard curve slope must be −3.33, since $E = 10^{-1/(-3.33)} - 1 \approx 1$, i.e., 100%. In the large majority of cases an acceptable slope is around −3.33, although a slope between −3.9 and −3.0 (80–110% efficiency) is commonly suitable [7, 8]. An example of an inefficient primer analyzed by serial dilutions is provided in Figure 1A, whereas a perfect efficiency curve is shown in Figure 1B.
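As an illustration of the efficiency calculation, the sketch below fits a standard curve to a ten-fold dilution series and applies the relation E = 10^(−1/slope) − 1. It assumes an ordinary least-squares fit of Cq against log10(concentration), which is a common way to obtain the slope but is not necessarily the exact procedure used for Figure 1; the dilution data are invented for the example.

```python
import math


def standard_curve(log10_conc, cq):
    """Least-squares fit of Cq against log10(concentration).

    Returns (slope, intercept, r, efficiency), with efficiency as a
    fraction (1.0 corresponds to 100%).
    """
    n = len(cq)
    mean_x = sum(log10_conc) / n
    mean_y = sum(cq) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_conc)
    syy = sum((y - mean_y) ** 2 for y in cq)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_conc, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r, efficiency


# Hypothetical ten-fold dilution series (undiluted, 1:10, 1:100, ...)
log10_conc = [0, -1, -2, -3, -4]
cq_values = [18.00, 21.33, 24.66, 27.99, 31.32]

slope, intercept, r, efficiency = standard_curve(log10_conc, cq_values)
print(f"slope = {slope:.2f}, r = {r:.3f}, efficiency = {efficiency * 100:.1f}%")
```

A slope outside the range the text considers suitable would point to re-optimizing the primer concentration or redesigning the primers.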
Inefficient reactions provide inaccurate estimates of the target input. Thus, the researcher may either (i) optimize primer concentrations or (ii) design alternative primers to improve reaction efficiency. Several programs are available for primer design, including PerlPrimer, Primer-BLAST (http://www.ncbi.nlm.nih.gov/tools/primer-blast/) and Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi). Most of them require as input a previously annotated gene sequence in which the introns and exons are identified.
Highly effective primers for qPCR assays must form neither primer dimers nor nonspecific amplification products. Online tools such as https://www.idtdna.com/calc/analyser are available to analyze primers with respect to the formation of homodimers, heterodimers and self-dimers, as well as hairpin formation between the forward and reverse primers [10, 11]. At the end of the qPCR cycles, generating a melting curve is fundamental to assess primer specificity; a peak below 78°C most likely corresponds to primer dimers and/or nonspecific amplification (Figures 2 and 3).
A probe, or dye, is a fluorescent marker that binds to or is incorporated into the qPCR amplification product, the double-stranded DNA (dsDNA). It is widely used to measure the amount of amplified DNA during qPCR reactions, given that the fluorescent signal is directly proportional to the amount of PCR products (amplicons) produced in the exponential phase of the reaction. During the reaction, the accumulation of amplicons raises the fluorescence level in direct proportion to the amount of DNA amplified in the sample [12, 13]. SYBR Green is perhaps the best-known fluorescent dye that binds to dsDNA and fluoresces upon excitation (Figure 4A), whereas TaqMan®, Molecular Beacon and Scorpions are probes designed to bind specific DNA sequences.
The TaqMan® probe is generally an oligonucleotide complementary to a specific region of the target DNA, with a quencher attached to its 3′ end and a reporter fluorophore attached to its 5′ end. TaqMan® hybridizes to the complementary target DNA during amplification and is then cleaved by the 5′-3′ exonuclease activity of Taq DNA polymerase (Figure 4B). Upon cleavage, the reporter dye is released and a fluorescent signal is generated, increasing cycle by cycle. In contrast, the Molecular Beacon probe remains in a hairpin structure (carrying a quencher and a reporter dye) when free in solution; thus, no fluorescence is emitted because the fluorescent reporter and the quencher are in close proximity (Figure 4C).
Scorpions are single-stranded oligonucleotide probes of about 20–25 nt, carrying a reporter fluorophore at the 5′ end and a quencher at the 3′ end, which form a stem-and-loop structure to which the primer is attached (Figure 4D). The stem-and-loop structure acts as a blocker to prevent DNA polymerase activity during the interaction of the probe with the target DNA [12, 15]. In the absence of the target, the reporter and quencher remain in close proximity, so that the reporter fluorescence is continuously suppressed. In general, dyes are less specific than probes: whereas dyes may bind to any region of double-stranded DNA during a PCR amplification reaction, probes bind only to particular regions, and only this binding allows fluorescence to be emitted. SYBR Green has been widely used because of its low cost and high efficiency.
1.3. Importance of the reference genes to normalize qPCR data
Reference or constitutive genes are required to normalize the quantification of target gene expression in qPCR assays. Normalization of expression levels is pivotal because it avoids misinterpretation of the data obtained in qPCR reactions. Thus, a group of constitutive genes should be analyzed for stability, and the most stable ones chosen as references for data normalization. Normalization assays should be conducted using at least eight reference genes, because a single reference gene, as proposed by Livak et al., is not always constitutively expressed in all cell types [16, 17].
The accuracy of relative gene expression can be severely affected by a wrong choice of reference genes; employing inappropriate genes as references for data normalization may lead to erroneous results and data misinterpretation. Thus, the expression stability of a reference gene must be confirmed for each experimental condition before the qPCR assays, taking into account that a single gene is generally not suitable for normalization [16, 17].
In the last decade, several tools have been developed to identify genes for normalization purposes and to ensure reliable normalized gene expression, including BestKeeper, geNorm and NormFinder [8, 16, 19, 20]. These programs are available online for free download and are widely used to calculate a normalization factor over multiple reference genes, further improving the robustness of the normalization.
The geNorm program has been cited as the best statistical method to choose stable reference genes for qPCR reactions. In summary, its principle is that the expression level of reference genes must be equal in all samples, regardless of experimental condition or cell type. Genes with M values below the cut-off (<1.5) are considered stable among the candidate reference genes (Figures 5 and 6). Thus, the most stable genes are recognized by the lowest M values, and genes presenting the highest M values should be disregarded and not used as references [10, 16, 20].
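To make the M value concrete, the sketch below follows the published description of the geNorm stability measure: for each gene, M is the average, over all other candidate genes, of the standard deviation across samples of the log2-transformed expression ratio between the two genes. This is a minimal sketch, not the geNorm program itself; the gene names and relative quantities are invented, and the use of the sample standard deviation is an assumption.

```python
import math
from statistics import stdev


def genorm_m_values(quantities):
    """Return the geNorm-style stability measure M for each candidate gene.

    `quantities` maps gene name -> list of relative quantities (linear
    scale), one value per sample, in the same sample order for every gene.
    """
    genes = list(quantities)
    m_values = {}
    for gene_j in genes:
        pairwise_sd = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # log2 expression ratio of the two genes in every sample
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(quantities[gene_j], quantities[gene_k])
            ]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Hypothetical relative quantities for three candidate genes in four samples
example_quantities = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],  # constant ratio to geneA
    "geneC": [1.0, 4.0, 2.0, 16.0],  # ratio to the other genes fluctuates
}

m_values = genorm_m_values(example_quantities)
for gene, m in sorted(m_values.items(), key=lambda item: item[1]):
    print(f"{gene}: M = {m:.3f}")
```

In this toy data set geneA and geneB share the lowest M, while geneC has the highest M and would be the first candidate to discard.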
All genes showed M values below the cut-off of <1.0, as suggested by geNorm. The most stable reference genes for samples of the inner integument of developing Jatropha curcas seeds were GAPDH, UCP, actin, PP2A2 and cyclophilin, as shown in Figure 6, whereas the least stable genes were EF1-α and tubulin alpha-2. Figure 6 also shows that five genes are necessary to normalize the qPCR assay in developing seeds, as cited above. In leaves exposed to stress conditions, different gene combinations were also necessary for accurate normalization. For total (a mix of all conditions) and SA stress treatments, three and four genes, respectively, were required to normalize gene expression in leaves (Figure 6). For PEG and NaCl stress treatments, four and five genes, respectively, were necessary (Figure 7a and 7b). The best combinations of stable genes under each stress condition, as suggested by geNorm with a cut-off for M of <1.0, were as follows (Figure 7a and 7b): for total stress, three genes were used for normalization: E. factor, PP2A and GAPDH; for SA stress, four genes were required: PP2A, E. factor, GAPDH and PUB; for PEG stress, two genes were identified as the best for normalization: PP2A and E. factor; and for NaCl stress, five genes were necessary: PP2A, GAPDH, E. factor, PUB and B. tubulin.
geNorm also indicates the optimal number of reference genes for normalization by evaluating the pairwise variation (V values), which measures the disparity of expression between gene combinations. The ideal number of genes depends on the experimental condition and is selected by calculating the pairwise variation (Vn/Vn+1) between two consecutively ranked normalization factors (NFn and NFn+1), where NFn+1 is obtained by adding the next most stable reference gene to the genes used for NFn.
Currently, geNorm is integrated into the qBASEPlus (Biogazelle) software, a pivotal tool that reports the most stable reference genes (M value) together with the number of genes appropriate for normalization (V value). qBASEPlus is widely employed to determine relative expression in qPCR assays based on the normalization factor (NF), requiring at least eight reference genes and two samples (control and treatment).
The V values used to select the number of reference genes for qPCR assays in J. curcas plants exposed to abiotic stresses are shown in Figure 7. The evaluations were performed in leaves of plants subjected to isolated and combined [salicylic acid (SA) + polyethylene glycol (PEG) + NaCl] stresses. Considering a cut-off V value of <0.15, the genes appropriate for normalizing the data under each stress condition were (Figure 7):
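The pairwise variation can be sketched in the same way. Following the published description of geNorm, the normalization factor NFn is taken here as the geometric mean of the n most stable genes in each sample, and Vn/n+1 as the standard deviation across samples of log2(NFn/NFn+1). The gene names, their ranking and the quantities below are hypothetical, and the code is a simplified illustration rather than the geNorm implementation.

```python
import math
from statistics import stdev


def normalization_factors(quantities, ranked_genes, n):
    """Geometric mean, per sample, of the n most stable genes."""
    n_samples = len(quantities[ranked_genes[0]])
    factors = []
    for i in range(n_samples):
        log_sum = sum(math.log(quantities[g][i]) for g in ranked_genes[:n])
        factors.append(math.exp(log_sum / n))
    return factors


def pairwise_variation(quantities, ranked_genes):
    """Return V(n/n+1) for n = 2 .. number of genes - 1.

    `ranked_genes` must be ordered from most to least stable.
    """
    v_values = {}
    for n in range(2, len(ranked_genes)):
        nf_n = normalization_factors(quantities, ranked_genes, n)
        nf_next = normalization_factors(quantities, ranked_genes, n + 1)
        log_ratios = [math.log2(a / b) for a, b in zip(nf_n, nf_next)]
        v_values[f"V{n}/{n + 1}"] = stdev(log_ratios)
    return v_values


# Hypothetical relative quantities; genes listed from most to least stable
ranked = ["geneA", "geneB", "geneC", "geneD"]
example_quantities = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],
    "geneC": [3.0, 6.0, 12.0, 24.0],
    "geneD": [1.0, 4.0, 2.0, 16.0],
}

v_values = pairwise_variation(example_quantities, ranked)
for label, value in v_values.items():
    print(f"{label} = {value:.3f}")
```

In this example adding geneC leaves the normalization factor essentially unchanged (V2/3 close to zero), whereas adding the unstable geneD changes it markedly (large V3/4).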
Combined stress: E. factor, PP2A and GAPDH.
SA stress: PP2A and E. factor.
PEG stress: PP2A, E. factor, GAPDH, PUB, actin and B. tubulin.
NaCl stress: PP2A, GAPDH, E. factor, PUB, B. tubulin and UCP.
NormFinder is an algorithm that identifies the best normalization genes among a series of candidate genes. The candidates are ranked according to their expression stability in a specific sample set and experimental design. NormFinder employs a mathematical model coupled to a solid statistical framework to estimate the overall expression variation of the candidate normalization genes, as well as the variation among subgroups of the sample set. Notably, NormFinder also provides a stability value for each gene, which is an estimate of its expression variation and enables the user to evaluate the systematic error introduced by the normalization gene [21, 22].
|Gene name||Stability value|
|Rank||Combined stress||SA stress||PEG stress||NaCl stress|
The ranking of candidate reference genes for normalizing qPCR data of J. curcas using the NormFinder software is presented in Tables 1 and 2. In developing seeds, the most stable genes were GAPDH, UCP, PP2A and cyclophilin, with stability values of 0.035, 0.052, 0.148 and 0.262, respectively (Table 1). Under abiotic stress conditions, PP2A, GAPDH and EF1-α were considered the most stable genes with respect to stability values, regardless of stress condition (combined, PEG, SA and NaCl stress) (Table 2).
Unlike the NormFinder software, the BestKeeper algorithm assesses the variability of each reference gene by analyzing its quantification cycle (“Cq”) values, taking into account the coefficient of correlation (“r”) and the standard deviation (“SD”). Values of SD [±Cq] < 2 are considered acceptable. According to this software, the most stable genes commonly present the highest r and lowest SD values, whereas the least stable candidate genes present the highest SD values (Table 3) [8, 12, 23, 24].
According to the BestKeeper software, the PP2A (r = 0.958; SD = 1.38), GAPDH (r = 0.887; SD = 1.29), beta tubulin (r = 0.843; SD = 1.02) and alpha tubulin (r = 0.889; SD = 1.04) genes showed the best correlations (Table 3) and were considered ideal for data normalization. Although actin presented an SD value of 1.32, it showed an r value of only 0.65 and was therefore considered inappropriate for data normalization.
geNorm, NormFinder and BestKeeper are decisive tools for reference gene evaluation and data normalization in qPCR assays. geNorm is considered the best software because it not only identifies the best reference genes by M value but also supplies the V value, which indicates the ideal number of genes necessary for normalization; the NormFinder and BestKeeper algorithms, in contrast, only identify the most stable candidate genes. Nonetheless, all three algorithms are employed together in order to provide reliable normalization results.
In general, two kinds of qPCR are widely employed in studies of gene expression, absolute and relative qPCR, in both of which reference genes (i.e., constitutive genes expressed in all cells) must be used to quantify the results. Relative mRNA quantification by real-time PCR has been the most frequently reported, as initially described by Livak et al.
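The two statistics discussed above can be illustrated with a simplified, BestKeeper-style calculation. The sketch assumes that each gene is compared with an index computed as the geometric mean of the Cq values of all candidate genes in each sample, and it uses the ordinary sample standard deviation and Pearson correlation. These are stated assumptions for illustration and may differ from the calculations in the BestKeeper tool; the gene names and Cq values are invented.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson coefficient of correlation between two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bestkeeper_like(cq):
    """Return SD of Cq and correlation with the candidate-gene index per gene.

    `cq` maps gene name -> list of Cq values, one per sample.
    """
    genes = list(cq)
    n_samples = len(cq[genes[0]])
    # Index: geometric mean of the Cq values of all candidates in each sample
    index = [
        math.exp(mean([math.log(cq[g][i]) for g in genes]))
        for i in range(n_samples)
    ]
    report = {}
    for gene in genes:
        sd = stdev(cq[gene])
        report[gene] = {
            "SD": sd,
            "r": pearson_r(cq[gene], index),
            "acceptable_SD": sd < 2.0,  # cut-off quoted in the text
        }
    return report


# Hypothetical Cq values for three candidate genes in four samples
example_cq = {
    "geneA": [20.0, 21.0, 22.0, 23.0],
    "geneB": [18.0, 19.0, 20.0, 21.0],
    "geneC": [25.0, 24.5, 25.5, 24.8],
}

report = bestkeeper_like(example_cq)
for gene, stats in report.items():
    print(f"{gene}: SD = {stats['SD']:.2f}, r = {stats['r']:.3f}")
```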
Relative qPCR has several advantages, including that it eliminates the need for standard curves. It uses mathematical equations to calculate the expression level of the target gene relative to a reference control and/or calibrator. Using both a calibrator and reference genes, the amount of target gene transcript in a sample is first normalized to the reference genes, and this normalized expression is then expressed relative to the normalized calibrator, according to the following formula:

$$\text{Relative expression} = 2^{-\Delta\Delta Cq}$$

where ∆∆Cq = ∆Cq (sample) − ∆Cq (calibrator), and ∆Cq = target gene Cq − reference gene Cq. Note that Cq (quantification cycle) is also commonly known as Ct (threshold cycle). It is very important to mention that biological and technical replicates must be included so that a statistical analysis can be conducted to test the significance of gene expression differences and validate the results.
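The calculation defined above can be written out as a short worked example. The sketch assumes 100% amplification efficiency (base 2) and, when several reference genes are supplied, averages their Cq values, which corresponds to taking the geometric mean of their relative quantities. The function name and the Cq values are hypothetical.

```python
from statistics import mean


def relative_expression(cq_target_sample, cq_refs_sample,
                        cq_target_calibrator, cq_refs_calibrator):
    """Relative expression by the 2^(-ddCq) method.

    Reference gene Cq values are passed as lists so that one or several
    reference genes can be used; their Cq values are averaged.
    """
    delta_cq_sample = cq_target_sample - mean(cq_refs_sample)
    delta_cq_calibrator = cq_target_calibrator - mean(cq_refs_calibrator)
    delta_delta_cq = delta_cq_sample - delta_cq_calibrator
    return 2 ** (-delta_delta_cq)


# Hypothetical Cq values: treated sample versus control (calibrator),
# normalized with two reference genes
fold_change = relative_expression(
    cq_target_sample=22.0,
    cq_refs_sample=[18.0, 20.0],
    cq_target_calibrator=25.0,
    cq_refs_calibrator=[18.0, 20.0],
)
print(fold_change)
```

Here ∆Cq is 3 for the sample and 6 for the calibrator, so ∆∆Cq = −3 and the target gene is expressed about eightfold higher in the sample than in the calibrator.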
In recent years, the use of a single reference gene, as proposed by Livak et al., has not been advised for qPCR data normalization, since its expression may vary depending on the tissue. Thus, Vandesompele et al. suggest employing at least eight candidate reference genes in geNorm, which is built into the qBASEPlus (Biogazelle) software, to obtain reliable results in qPCR assays.
2. Concluding remarks
qPCR is an efficient and powerful tool to measure mRNA expression in several kinds of samples. To obtain reliable results, numerous parameters should be considered, including: (a) good-quality RNA; (b) specific and efficient primers; (c) dyes or probes appropriate to the analysis; (d) reference genes that are stable under the condition analyzed; (e) normalization of expression; and (f) a combined approach using the available software. Finally, we highlight that adopting all the guidelines described here will reduce possible errors and incorrect procedures, thus leading to successful results in real-time qPCR assays.