Hot Start 7-Deaza-dGTP Improves Sanger Dideoxy Sequencing Data of GC-Rich Targets

DNA sequencing has developed substantially into a more cost-effective and accurate technique for scientific advancement in medical diagnostics, forensics, systematics, and genomics. Sanger dideoxy sequencing is currently one of the most established and popular sequencing methods (Sanger et al., 1977). Sequencing methods have become automated, faster, and more specific, and now allow sequencing of difficult and unknown regions of DNA. Several protocols have been developed that include new dye chemistries, modified nucleotide analogs, additives in the sequencing reaction, and variations to the sequence cycling parameters (Prober et al., 1987; Kieleczawa, 2006). These modified protocols allow for sequencing through difficult regions of DNA and may be applied in the pre-sequencing PCR step or in the sequencing reaction itself. Despite these improvements, there continue to be DNA regions that are problematic to sequence, such as AT-, GT-, and GC-rich regions, regions high in secondary structure, hairpins, homopolymer regions, and regions with repetitive DNA sequence (Kieleczawa, 2006; Frey et al., 2008). These challenging DNA templates often result in ambiguous sequencing data, which include false stops, compressions, weak signals, and premature termination of signal. In particular, sequences high in GC content still suffer from several of these problems.

Several advancements have been developed to address the challenge of GC-rich DNA in the pre-sequencing PCR step as well as in the sequencing reaction itself.
To overcome the higher melting temperatures of GC-rich DNA sequences, sequence cycling parameters have been adjusted to use higher temperatures. Additives such as dimethyl sulfoxide (DMSO), betaine, formamide, and glycerol are often added into sequencing reactions to reduce secondary structure and to promote strand separation (Jung et al., 2002). It is well known that modified analogs such as 7-deaza-dGTP (7-deaza-2′-deoxyguanosine-5′-triphosphate) and dITP (2′-deoxyinosine-5′-triphosphate) can be used in the sequencing reaction in place of dGTP, or in some combination with it, to reduce secondary structure (Motz, Svante et al., 2000). 7-deaza-dGTP is a modified dGTP analog which lacks a nitrogen at the seven position of the purine ring. The absence of this nitrogen destabilizes G-quadruplex formation by preventing Hoogsteen base pairing without affecting Watson-Crick base pairing. Alternatively, dITP alters the Watson-Crick base pairing by reducing the number of hydrogen bonds from three to two in a base pair between inosine and cytosine. This reduces the strength of GC-rich duplexes by lowering the melting temperature.
Efforts have also focused on improving the PCR amplification of GC-rich targets to provide a good quality DNA template prior to the sequencing reaction. The addition of 7-deaza-dGTP in the PCR step permanently disrupts secondary structure by incorporation of this modified analog into the amplicon. The pre-sequencing PCR amplification effectively linearizes the DNA template and prepares it for sequencing. Though the optimal ratio of 7-deaza-dGTP to dGTP in a PCR reaction has been thoroughly investigated and determined, this method often requires additional components such as a Hot Start polymerase or additives to be included into the reaction (McConlogue et al., 1988).
A recently developed Hot Start technology named CleanAmp™ employs a thermolabile group on the 3′ hydroxyl of a deoxynucleoside triphosphate. The presence of the protecting group blocks low temperature primer extension, which can often be a significant problem in PCR. At higher temperatures, the protecting group is released to allow for incorporation by the DNA polymerase and for more specific amplification of the intended target (Koukhareva & Lebedev, 2009). CleanAmp™ dNTPs are Hot Start versions of standard deoxynucleotide triphosphates (dATP, dCTP, dGTP, and dTTP), while CleanAmp™ 7-deaza-dGTP applies the same concept to a modified nucleoside triphosphate, 7-deaza-dGTP. Previous results have shown that a dNTP mixture containing CleanAmp™ 7-deaza-dGTP provides a significant improvement over standard 7-deaza-dGTP in the amplification of GC-rich targets in PCR assays (Shore & Paul, 2010). CleanAmp™ 7-deaza-dGTP Mix is a formulation which contains the CleanAmp™ versions of dATP, dCTP, dGTP, dTTP, and 7-deaza-dGTP, where the 7-deaza-dGTP:dGTP ratio is 3:1. CleanAmp™ 7-deaza-dGTP applies a Hot Start technology to a secondary structure reducing analog that is permanently incorporated into the PCR amplicon. Preliminary sequencing results on PCR amplicons generated using the CleanAmp™ 7-deaza-dGTP Mix have shown an improvement in sequencing reads over a standard (unmodified) 7-deaza-dGTP mix. The significant improvement in data quality when CleanAmp™ 7-deaza-dGTP Mix was compared to an analogous mix containing unmodified versions of the dNTPs prompted further experiments to identify the optimal mix composition for use prior to sequencing of PCR products.
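The relationship between a 7-deaza-dGTP:dGTP ratio and the percent substitution values used later in this chapter can be illustrated with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and the total G-nucleotide concentration used in the usage note is an assumed placeholder, not a value reported in this study.

```python
def substitution_percent(deaza_parts: float, dgtp_parts: float) -> float:
    """Percent of the total G-nucleotide pool supplied as 7-deaza-dGTP."""
    total = deaza_parts + dgtp_parts
    if total <= 0:
        raise ValueError("ratio parts must sum to a positive value")
    return 100.0 * deaza_parts / total


def g_pool_concentrations(percent_deaza: float, total_g_mM: float):
    """Split a total G-nucleotide concentration into (7-deaza-dGTP, dGTP).

    total_g_mM is an assumed input; the chapter does not state it here.
    """
    if not 0.0 <= percent_deaza <= 100.0:
        raise ValueError("percent_deaza must be between 0 and 100")
    deaza = total_g_mM * percent_deaza / 100.0
    return deaza, total_g_mM - deaza


# A 3:1 ratio of 7-deaza-dGTP to dGTP is a 75% substitution.
print(substitution_percent(3, 1))
```

For example, under an assumed total G pool of 0.2 mM, `g_pool_concentrations(75, 0.2)` would split the pool into roughly 0.15 mM 7-deaza-dGTP and 0.05 mM dGTP.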
Herein we will compare the application of a CleanAmp™ 7-deaza-dGTP Mix to a standard 7-deaza-dGTP mix for the PCR amplification of GC-rich targets in preparation for Sanger dideoxy sequencing. We show that CleanAmp™ 7-deaza-dGTP Mix provides an improvement compared to the standard version of a 7-deaza-dGTP mix and provide guidance as to the best ratio of 7-deaza-dGTP to dGTP to use for optimal PCR and downstream sequencing performance. Performance categories that weigh into this decision include measures of PCR performance, such as preliminary amplicon yield and amplicon quality, and measures of sequencing performance including percent of high quality base calls, read length, and pairwise identity. The most crucial metric for determining sequence performance is the percent high quality base calls, which provides a numerical readout of fragment resolution and sequence determination. The effects of amplicon yield and amplicon quality were also assessed to determine if these parameters directly correlate with sequencing quality.

Methods
The use of the CleanAmp™ technology in a pre-sequencing PCR amplification step was explored by comparing analogous reactions with standard dNTPs to investigate the effect of Hot Start activation on the PCR step. The ratio of 7-deaza-dGTP to dGTP was thoroughly investigated to determine an optimal mixture that will provide robust PCR yield and accurate Sanger dideoxy sequencing results. The effect of magnesium chloride concentration on amplicon yield and subsequent sequencing results was also investigated for mixtures with high 7-deaza-dGTP ratios, which showed low amplicon yield under normal 1.5 mM magnesium chloride concentrations. The experimental data cover five GC-rich targets of varying GC content and amplicon length.
Each target was amplified in five individual PCR experiments and analyzed by sequencing. Samples were submitted to Eton Bioscience, Inc. (San Diego, CA), where they were quantified using a NanoDrop and submitted for Sanger dideoxy sequencing, with the BigDye Terminator v3.1 (Applied Biosystems) kit used for cycle sequencing reactions. A specified amount of each PCR product was used in the sequencing reaction, depending on the number of base pairs in the amplicon: ACE (156 bp, 10 ng DNA), BRAF (185 bp, 12 ng DNA), B4GN4 (720 bp, 28 ng DNA), GNAQ (642 bp, 27 ng DNA), GNAS (242 bp, 15 ng DNA). All sequencing results were analyzed with the Geneious Pro v5.5.2 sequencing software (Drummond AJ, 2011).

Data analysis
Several categories of results were examined, including preliminary amplicon yield, amplicon quality, pairwise identity, read length of target, and percent of high quality base calls. All values were averaged for the five independent PCR or sequencing runs, and statistical analysis was done using GraphPad Prism Software (San Diego, CA). A two tailed t-test was used to test the null hypothesis that CleanAmp™ and standard dNTPs yield the same results in each 7-deaza-dGTP mixture. A one-way ANOVA test was used to test the null hypothesis that all 7-deaza-dGTP compositions perform equally within a group (CleanAmp™ or standard dNTPs). The Tukey-Kramer post test was done in addition to the one-way ANOVA to compare all levels of substitution with one another (Prism). Statistical probability values are indicated on graphs and tables.
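To make the two comparisons concrete, the following sketch computes the test statistics behind a two-sample t-test and a one-way ANOVA using only the Python standard library. It is not the Prism analysis used in this study: it assumes unpaired samples with equal variances (the text does not specify the t-test variant), it stops at the statistic and degrees of freedom rather than a p-value, it omits the Tukey-Kramer post test, and the replicate values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t for two independent samples (equal variances assumed).

    Returns (t, degrees_of_freedom). A two tailed p-value would be read
    from a t distribution with that many degrees of freedom.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA.

    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical relative yields from five replicates (not data from this study).
cleanamp = [1.00, 0.95, 1.05, 0.98, 1.02]
standard = [0.70, 0.82, 0.65, 0.77, 0.74]
print(pooled_t_statistic(cleanamp, standard))
```

With exactly two groups, the ANOVA F statistic equals the square of the pooled t statistic, which is a convenient sanity check on both functions.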

Results
Several experimental factors were considered to rank the performance of the PCR amplification and downstream sequencing for five different GC-rich targets with 60 to 84% GC content. These targets were amplified using five nucleotide mixtures containing different percentages of 7-deaza-dGTP, where the nucleotide mix used in the PCR step was either unmodified (standard) or Hot Start (CleanAmp™). Experimental performance criteria sought to identify the influence of the PCR conditions on amplicon formation (yield and quality) and on Sanger sequencing data quality (percent of high quality base calls, read length, and pairwise identity) (Kieleczawa et al., 2009).

PCR specificity and yield
The first factor was assessment of the PCR yield of reactions with CleanAmp™ (Hot Start) dNTPs and standard dNTPs to determine if one would provide a superior result. The graphs in Figure 1 show this comparison, where each of the five targets was PCR amplified with five different 7-deaza-dGTP blends, ranging from 70 to 100% substitution of dGTP with 7-deaza-dGTP. The preliminary amplicon yield is presented as the average relative amplicon yield for each target, where the raw data values for target band gel densitometry in each experiment were normalized to the 75% CleanAmp™ 7-deaza-dGTP mixture. Amplicon yield was quantified prior to PCR cleanup and reflects approximately how much DNA was generated from an independent PCR reaction. While many reactions generated sufficient amplicon for downstream sequencing from a single set-up, all reactions were run as five replicates to ensure that the amplicon yield after PCR cleanup was sufficient for sequencing. It was also noted that reactions using a higher percentage of 7-deaza-dGTP often required an increase in magnesium chloride concentration for sufficient amplicon yield. Therefore, the 90 and 100% 7-deaza-dGTP blends were prepared with 1.5, 2, and 2.5 mM MgCl2 to ensure that enough product was formed. From the samples with varying MgCl2, the PCR product with the highest yield and least off-target per run was submitted for sequencing. Preliminary experiments indicated that altered magnesium chloride concentration in the PCR step did not significantly affect sequencing reads, so this variable was eliminated when analyzing results.
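The normalization described above can be sketched as dividing each band intensity by the intensity of the reference mixture from the same experiment. This is a minimal illustration only: the dictionary layout, the function name, and the densitometry values are hypothetical and are not taken from the study's data.

```python
def relative_yields(band_intensity, reference_key):
    """Normalize densitometry values to one reference mixture.

    band_intensity maps (dNTP type, percent 7-deaza-dGTP) to the raw
    band intensity of the target amplicon in one experiment.
    """
    reference = band_intensity[reference_key]
    if reference <= 0:
        raise ValueError("reference intensity must be positive")
    return {key: value / reference for key, value in band_intensity.items()}


# Hypothetical raw intensities for one target in one experiment.
raw = {
    ("CleanAmp", 70): 1800.0,
    ("CleanAmp", 75): 2000.0,
    ("standard", 75): 1200.0,
}
normalized = relative_yields(raw, ("CleanAmp", 75))
print(normalized[("standard", 75)])  # 0.6
```

The reference mixture always has a relative yield of 1.0, so values above or below 1.0 read directly as more or less amplicon than the 75% CleanAmp™ reaction.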
When the amplicon yield of a reaction containing standard dNTPs was compared to an analogous reaction using CleanAmp™ dNTPs for a given percentage of 7-deaza-dGTP, twelve out of twenty five reactions showed a significant improvement in amplicon yield with CleanAmp™ dNTPs (Figure 1). While there were not many statistically significant differences for the lower GC-rich targets ACE (60%) and BRAF (74%), the three highest GC-rich targets, B4GN4 (78%), GNAQ (79%), and GNAS (84%), showed an improved amplicon yield using CleanAmp™ dNTPs over standard dNTPs for several 7-deaza-dGTP compositions. In the case of B4GN4, CleanAmp™ dominates in all the 7-deaza-dGTP mixtures except 100%, where very little amplicon was formed with either type of dNTPs. Furthermore, the addition of magnesium chloride had little effect on amplicon yield for this target, which produced very low yields prior to PCR cleanup.
The second factor was the evaluation of several 7-deaza-dGTP nucleotide blends (from 70 to 100%) across each group (CleanAmp™ or standard dNTPs) for a given target to determine an ideal amount of 7-deaza-dGTP for optimal PCR yield. In addition to the graphs shown in Figure 1, Table 1 includes the raw data averages and numerical standard deviations for this figure. Also featured in this table are the results for statistical comparison of each of the 7-deaza-dGTP compositions with one another across each of the two groups: standard or CleanAmp™ dNTPs. The colored boxes in the tables represent which of the five 7-deaza-dGTP mixtures fared the best with each type of dNTP. The results of this analysis indicate that there was not just one composition that dominated over the others for all five targets, but there were some noteworthy trends. First, it was common that the 90 or 100% compositions produced lower or comparable amplicon yield relative to the lower percentage compositions, despite the additional MgCl2 used. Second, although several different 7-deaza-dGTP blends gave comparable amplicon yields for a given target, the blend that was optimal was not consistent from one target to the next. Results were more well-defined for the CleanAmp™ group than for standard dNTPs. For CleanAmp™ dNTPs, a 75% 7-deaza-dGTP mix produced the most amplicon for two targets, B4GN4 and GNAQ. A 70% 7-deaza-dGTP mix provided the highest yield for BRAF and gave higher yields than at least three other mixtures for ACE and GNAS. On the other hand, standard dNTP mixtures showed few obvious trends and varied considerably from one target to the next. For example, the B4GN4 target had comparable amplicon yields for the 75 and 80% mixtures, which provided higher yields than the remaining mixtures, while for the GNAS target, the 75, 90, and 100% yields were comparable and had greater yields than the 70 and 80% mixtures. Overall, one single 7-deaza-dGTP composition could not be identified for highest amplicon yield.

Fig. 1. Average Relative Amplicon Yield for five GC-rich targets (A-E) amplified using a dNTP mixture with 70 to 100% 7-deaza-dGTP. Amplicon yield for each target was normalized to the 75% CleanAmp™ 7-deaza-dGTP mixture, and all values were analyzed with a two tailed t-test where (* p < 0.05; ** p < 0.01).

Fig. 2. Average Relative Off-Target Yield for five GC-rich targets (A-E) amplified using a dNTP mixture with 70 to 100% 7-deaza-dGTP. Off-target (primer dimer and mis-priming) yield for each target was quantified relative to the desired amplicon and analyzed with a two tailed t-test where probability values (* p < 0.05; ** p < 0.01) specify statistically significant values.

Table 1. Amplicon yields were integrated and normalized relative to the 75% CleanAmp™ 7-deaza-dGTP mixture. Off-target amplification yields are the fraction of off-target product formed relative to the desired amplicon as determined by gel densitometry. The five sets of 7-deaza-dGTP compositions in each group (standard or CleanAmp™) were analyzed by a one-way ANOVA and Tukey-Kramer post test where (p < 0.05; p < 0.01; p < 0.001) for a given percentage of 7-deaza-dGTP. Boxes outlined in color represent means that give statistically significant values.

www.intechopen.com

In addition to amplicon yield, another important factor in PCR product preparation for sequencing is amplicon quality. Amplicon quality indicates the purity of the sample that is being sent for sequencing. A high quality sample should contain only the DNA target to be sequenced and be free of any contaminants, excess primers, excess dNTPs, or off-target products which might interfere with the sequencing reaction. In this study, the PCR products went through a commonly used PCR cleanup process that rids the samples of excess dNTPs and primers but does not remove off-target products that were generated during the PCR. Generation of a high quality PCR product with no off-target would eliminate the more laborious step of gel purification prior to sequencing.
Therefore, in this chapter the amplicon quality was assessed by integrating the amount of average relative off-target products in the sample after the five PCR replicates were pooled, cleaned, and concentrated. Amplicon quality is represented graphically as the fraction of off-target (mis-priming and primer dimer) generated relative to the amplicon (Figure 2, Table 1), so the lower this value, the higher the sample quality will be.
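The amplicon quality metric can be written as a simple ratio of band intensities. The sketch below assumes that the off-target value is the summed intensity of all off-target bands (primer dimer plus mis-priming products) divided by the intensity of the desired amplicon band; the function name and the numbers are hypothetical.

```python
def off_target_fraction(amplicon_intensity, off_target_intensities):
    """Off-target product relative to the desired amplicon.

    A value of 0.0 means no detectable off-target product; lower values
    indicate higher amplicon quality.
    """
    if amplicon_intensity <= 0:
        raise ValueError("amplicon intensity must be positive")
    return sum(off_target_intensities) / amplicon_intensity


# Hypothetical lane: one primer dimer band and one mis-priming band.
print(off_target_fraction(2000.0, [300.0, 200.0]))  # 0.25
```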
Thirteen out of twenty five reactions showed significant reduction in off-target when CleanAmp™ dNTPs were used (Figure 2). The ACE off-target products consisted entirely of primer dimer, while the other four targets were prone to a combination of primer dimer and mis-priming side products (Figure 3). The two lowest percent GC-rich targets, ACE and BRAF, produced the lowest amount of off-target and highest amplicon quality, with comparable performance between CleanAmp™ and standard dNTPs. The other three target reactions formed substantially more off-target, especially when amplified with standard dNTPs. Figure 2 results show that amplicon quality is highest in most cases when CleanAmp™ mixtures are used. For the B4GN4 and GNAQ targets, reactions with standard dNTPs produced significantly more off-target products than CleanAmp™ dNTPs regardless of the percent 7-deaza-dGTP. For all five amplicons, the reactions using 75 and 80% 7-deaza-dGTP produced no primer dimer or mis-priming when CleanAmp™ was used. As was the case for amplicon yield, there was not just one composition of 7-deaza-dGTP that produced the highest quality DNA product. However, it was evident that the use of CleanAmp™ dNTPs reduced the amount of off-target compared to standard dNTPs, providing an amplicon with a higher chance of successfully being sequenced.
Although it is important to generate a high quality PCR product with adequate yield, other measures of the sequencing results, such as read length, pairwise identity, and percent high quality base scores, should also be considered. Read length is the number of bases that were called for a given target. Optimally, this value should match the length of the reference sequence, provided it does not exceed the ~1000 base pair limit of current Sanger dideoxy sequencing technology (Slater & Drouin, 1992; Kieleczawa, Adam et al., 2009). For some challenging targets shorter than this upper limit, the read lengths can often become truncated due to complex secondary structures or regions of DNA that the polymerase cannot read through, such as GC-rich regions. Results in Table 2 indicate that the average read lengths for the five GC-rich targets were comparable between CleanAmp™ and standard dNTPs in nearly every case. These results were not surprising since these five targets, which were chosen mainly for GC content, are not long enough to accurately assess the impact on read length. Therefore, the effect of GC content on read length remains to be determined in assays with longer amplicons. B4GN4, the longest target with 720 base pairs, varied the most in read length based on standard deviations, suggesting that this was one of the most difficult targets out of the group and just beginning to approach the length threshold where sequencing becomes more of a challenge. For these targets, no one 7-deaza-dGTP mixture fared better than the rest, as all of the average read length values were statistically comparable to one another. In addition to the read length of the sequence, it is critical to identify the correct bases within a sequence.
Gel images show the variability among the 3 different GC-rich targets in amplicon yield and off-target products (primer dimer and mis-priming).

Sanger dideoxy sequencing data quality
The pairwise identity in an alignment of two sequences is the percentage of shared identical bases (Drummond AJ, 2011). In this chapter, all sequences were known, so all experimental sequencing read-outs were aligned to the appropriate GenBank reference sequences (Dennis A. Benson, 2005). If the read-outs matched exactly, they would have 100% pairwise identity to the reference sequence. For ACE and BRAF, the sequencing data matched with the reference sequences nearly 100% of the time for both standard and CleanAmp™ dNTPs in all 7-deaza-dGTP compositions. For the three targets with higher GC content, results showed that standard and CleanAmp™ dNTPs pairwise identity values approached 100% with no statistically significant differences for most 7-deaza-dGTP mix compositions. However, at 70% 7-deaza-dGTP, three targets amplified with CleanAmp™ dNTPs had higher pairwise identity values than standard dNTPs. Another noteworthy outlier was B4GN4, which was prone to the highest level of off-target formation. Although no one 7-deaza-dGTP composition improved results over any of the others in the group, reactions employing CleanAmp™ dNTPs provided a higher pairwise identity to the B4GN4 reference sequences at lower 7-deaza-dGTP substitutions. While read length and correct base calls are important parameters in determining the quality of sequencing data, the most critical parameter is the percentage of high quality base calls, or HQ percentage.
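As a rough illustration of the pairwise identity definition, the sketch below counts identical positions across two sequences that have already been aligned to the same length. It is not the Geneious implementation: the handling of gaps (a gap opposite a base counts as a mismatch, and columns where both sequences have a gap are ignored) is an assumption, and the example sequences are invented.

```python
def pairwise_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences share a base.

    Both inputs must be pre-aligned strings of equal length. Columns
    where both sequences carry a gap are skipped (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# Hypothetical read aligned to a reference; one mismatch in eight columns.
print(pairwise_identity("GCGCGGCC", "GCGCGGCA"))  # 87.5
```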
HQ percentage differs from pairwise identity, as it is a measure of the confidence with which the sequencing software can determine a sequence (Drummond AJ, 2011). Often, the pairwise identity may be a 100% match to the reference sequence but have very low confidence values for each base call within the sequence. The confidence in base calling becomes more important when sequencing unknown regions of DNA where there is no reference sequence. In these cases, the resultant sequence can only be trusted if the confidence of the sequencing software is high enough. Phred scores, or quality scores, are a widely accepted measure of the quality of DNA sequences. Phred scores are numerical estimates of error probability for a given base (Ewing & Green, 1998). Sequencing software packages each have their own scale for base scoring, and in these studies, the percent of high quality base calls in a sequence read-out (HQ%) is defined to be the percent of bases that have a quality score (phred score) greater than 40. The highest score corresponds to a one in a million (10⁻⁶) probability of a calling error, whereas a middle score (20-40) represents a probability of one in a thousand (10⁻³) (Drummond AJ, 2011). The data presented herein include the HQ percentages for each sample (Table 2 and Figure 4).
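The link between a phred score and its error probability follows the standard definition Q = -10 log10(P), so P = 10^(-Q/10). The sketch below applies that relationship and computes an HQ percentage from a list of per-base scores. The function names and the example scores are hypothetical, and because the text describes the cutoff both as "greater than 40" and as "40 or higher", the comparison is left as a parameter rather than fixed.

```python
def phred_to_error_probability(q):
    """Error probability for a phred quality score: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def hq_percent(quality_scores, threshold=40, inclusive=False):
    """Percent of base calls whose quality score passes the threshold.

    inclusive=False counts scores strictly greater than the threshold;
    inclusive=True also counts scores equal to it.
    """
    if not quality_scores:
        raise ValueError("at least one quality score is required")
    if inclusive:
        passing = sum(1 for q in quality_scores if q >= threshold)
    else:
        passing = sum(1 for q in quality_scores if q > threshold)
    return 100.0 * passing / len(quality_scores)


# Hypothetical per-base scores for a short read.
scores = [45, 40, 20, 50]
print(hq_percent(scores))                  # 50.0
print(hq_percent(scores, inclusive=True))  # 75.0
```

Under this definition a score of 30 corresponds to an error probability of 10⁻³ and a score of 60 to 10⁻⁶.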
The HQ percent scores for the ACE and BRAF sequencing data showed minimal differences between CleanAmp™ and standard dNTPs for a given 7-deaza-dGTP mix composition. For B4GN4, GNAQ, and GNAS, the sequencing of amplicons generated with CleanAmp™ dNTPs showed significantly improved HQ scores at 70% 7-deaza-dGTP relative to analogous reactions with standard dNTPs, which did not reach a value of 50%. CleanAmp™ also yielded higher HQ scores (p < 0.05) in four out of the five mixtures for the B4GN4 target. When looking at the compositions of 7-deaza-dGTP across each group of CleanAmp™ or standard dNTPs, all HQ percent values for each 7-deaza blend are comparable for low GC content targets. For targets with greater than 75% GC content, there are some noticeable differences in the lowest (70%) and the highest (100%) substitutions of 7-deaza-dGTP. However, there were no statistically significant differences between HQ values in the middle mix compositions of 75, 80, and 90% (Table 2). Although a specific optimal percentage of 7-deaza-dGTP could not be identified, results indicate that 75, 80, and 90% blends provided the best results across a wide range of targets. Furthermore, CleanAmp™ was found to improve high quality base calls for seven out of the twenty five reactions, which included certain challenging targets and lower 7-deaza-dGTP compositions for higher GC-rich species.

Table 2. The five sets of 7-deaza-dGTP compositions in each group (standard or CleanAmp™) were analyzed by a one-way ANOVA and Tukey-Kramer post test where probability values shown with highlighted boxes (p < 0.05; p < 0.01; p < 0.001) find the means statistically significant for HQ values, pairwise identity, and read length. In addition, Pairwise Identity and Read Length comparisons of CleanAmp™ to standard dNTPs in individual 7-deaza-dGTP compositions, which are not shown by bar graph, are indicated by stars where (* p < 0.05; ** p < 0.01; *** p < 0.001) represent values that are statistically significant.

Fig. 4. The percentage of bases in a sequencing read-out which had a high quality score of 40 or higher determined the HQ percent value. These values were averaged after alignments to each reference sequence and analyzed with a two tailed t-test where probability values (* p < 0.05; ** p < 0.01) find the means statistically significant.

In summary, these studies have investigated both the percent of 7-deaza-dGTP substitution and the influence of standard and CleanAmp™ versions of the nucleotide mix on PCR and Sanger sequencing performance. When different metrics such as amplicon yield, amplicon quality, HQ percentage, and sequencing chromatogram quality are considered, there are notable instances where reactions employing CleanAmp™ dNTPs have either comparable performance or statistically significant improvements in performance. To better understand how these parameters interplay with one another, a more detailed analysis is presented in the Conclusion section.

Conclusion
Both standard and CleanAmp™ dNTPs can effectively generate PCR amplicons with GC-rich sequence when amplified in combination with 7-deaza-dGTP. The use of this nucleotide analog effectively linearizes the DNA sequences in preparation for Sanger dideoxy sequencing by destabilizing secondary structures such as G-quadruplexes. This allows the DNA fragments to migrate more predictably through the polyacrylamide gel and reduces the possibility of ambiguous base calls in sequencing results (Motz, Svante et al., 2000). CleanAmp™ dNTPs also offer the added benefit of reduced off-target amplification due to Hot Start activation in the PCR assay. Therefore, the effect of Hot Start PCR activation in conjunction with the extent of 7-deaza-dGTP substitution was investigated to determine its potential benefit.
For the targets with lower GC content (less than 75%), CleanAmp™ dNTPs helped to reduce off-target product formation at the PCR step but gave comparable results to standard dNTPs in all other categories. For the three highest GC-rich targets, CleanAmp™ improved amplicon yield and amplicon quality with several different 7-deaza-dGTP compositions, indicating that the Hot Start activation is a much-needed benefit. In one case, B4GN4, CleanAmp™ significantly improved amplicon yield, amplicon quality, and percent HQ over standard dNTPs for at least four out of the five 7-deaza-dGTP mixtures (from 70-100% 7-deaza-dGTP). Quality sequencing results for this target were not achieved with standard dNTPs alone. The use of CleanAmp™ dNTPs at the PCR stage also improved amplicon yield, pairwise identity, and percent HQ with the 70% 7-deaza-dGTP composition. However, reactions with an analogous mixture of standard dNTPs were not as successful, indicating that standard dNTP mixtures may require a higher percentage of 7-deaza-dGTP. Although the categories of pairwise identity and read length showed minimal differences between standard and CleanAmp™ dNTPs, CleanAmp™ dNTPs improved the PCR assay and downstream sequencing results over standard dNTPs in DNA targets with GC content higher than 75%.
After analyzing the results in five categories individually, it was determined that three of the categories, amplicon yield, amplicon quality, and percent HQ, were most affected by the variables being investigated. Therefore, the influence of these categories on one another was more thoroughly studied to discern the optimal percent 7-deaza-dGTP mixture. Figure 5A (I to V) shows scatter plots of the percent HQ and amplicon yield for all variables being tested. The shaded portion of the plot highlights dNTP mixtures that reached a threshold of at least 50% relative amplicon yield and 50% high quality bases called. The dNTP compositions that were found in this region of the scatter plot were identified and re-plotted in a scatter plot of HQ percent versus amplicon quality (Figure 5B (I to V)). Optimal compositions for Figure 5B lie highest on the plots for HQ scores and furthest to the left for the least amount of off-target formed, or highest amplicon quality.
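The two-step selection described above can be sketched as a threshold filter followed by a ranking. The 50% thresholds come from the text; everything else is an assumption made for illustration, including the record layout, the mixture labels, the data values, and in particular the choice to rank by HQ percent first and off-target fraction second, since the chapter identifies optimal mixtures visually from the scatter plots rather than by a stated formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MixResult:
    label: str                  # hypothetical mixture label
    relative_yield_pct: float   # amplicon yield relative to the reference mix
    hq_pct: float               # percent high quality base calls
    off_target_fraction: float  # off-target product relative to amplicon


def shortlist(results, min_yield_pct=50.0, min_hq_pct=50.0):
    """Keep mixtures meeting both thresholds, then order them.

    Ordering (assumed): highest HQ percent first, ties broken by the
    lowest off-target fraction.
    """
    passing = [
        r for r in results
        if r.relative_yield_pct >= min_yield_pct and r.hq_pct >= min_hq_pct
    ]
    return sorted(passing, key=lambda r: (-r.hq_pct, r.off_target_fraction))


# Hypothetical values for one target (not data from this study).
results = [
    MixResult("CleanAmp 75%", 100.0, 82.0, 0.00),
    MixResult("standard 80%", 70.0, 80.0, 0.30),
    MixResult("standard 70%", 90.0, 40.0, 0.50),
    MixResult("CleanAmp 100%", 20.0, 85.0, 0.05),
]
for r in shortlist(results):
    print(r.label)
```

In this invented example the 70% standard mixture fails the HQ threshold and the 100% CleanAmp™ mixture fails the yield threshold, so only two mixtures are carried forward to the second comparison.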
Several of the different CleanAmp™ dNTP mixtures met the threshold requirements for amplicon yield, were high in amplicon quality, and yielded high HQ scores. While many of the standard dNTP mixtures also had adequate amplicon yield, the amplicon quality suffered for several targets. From the scatter plot analysis in Figure 5, the 75% CleanAmp™ mixture provided adequate amplicon yield, best amplicon quality, and highest HQ scores for 4 out of the 5 targets. For the 75% mixture of CleanAmp™ dNTPs in the GNAS target, adequate yield and high amplicon quality were evident, but the HQ scores were lower for this composition. The lesser correlation of GNAS to the other samples may be due to its higher GC content (84%) and the higher concentrations of magnesium chloride, which were needed for adequate amplicon yield. For targets with GC content higher than 80%, the data indicated that complete substitution with 7-deaza-dGTP may be necessary. If more replicates are pooled to produce enough amplicon at a lower magnesium concentration, then it is likely that the off-target products will decrease and base call quality will still remain high. If standard dNTPs are used in the PCR step, an 80% 7-deaza-dGTP mixture was the optimal composition. However, amplicon quality can sometimes still be affected at this composition with standard dNTPs, which may require a more laborious gel purification step to remove off-target products prior to sequencing.
In addition to the numerical metrics from the sequencing data of the five targets, the sequencing chromatograms were studied. In Figure 6, representative chromatograms for the regions with high GC content for the BRAF, GNAS, and B4GN4 targets are presented. Comparisons show the optimal compositions of CleanAmp™ with 75% 7-deaza-dGTP and standard with 80% 7-deaza-dGTP. For BRAF (74% GC), a region from 50-100 bp is shown. Since the analogous HQ percentages are similar for templates with standard and CleanAmp™ dNTPs, it was not surprising that the chromatograms are similar in this region. Similarly for GNAS, the HQ percentages were comparable for both CleanAmp™ and standard dNTP samples, with a modest improvement in chromatogram shape and base call confidence for the CleanAmp™ dNTP target. For B4GN4, there were significant differences in the HQ percentages between standard (HQ: 43.7%) and CleanAmp™ (HQ: 82.2%). The sequencing trace for standard dNTPs died out at 600 bp, while the trace for CleanAmp™ dNTPs persists to the end of the target (~700 bp). Two representative regions of sequence are shown (10-60 bp and 160-210 bp), where reactions with CleanAmp™ dNTPs had strong performance for both regions, and reactions with standard dNTPs had poor performance in the early part of the read, culminating in stronger performance mid-sequence.
Overall, these studies represent a thorough investigation of both the effect of 7-deaza-dGTP substitution and the use of a Hot Start PCR technology on PCR amplification and downstream sequencing performance. Though the differences were subtle for the extent of 7-deaza-dGTP substitution when individual parameters were analyzed, the advantages of using CleanAmp™ over standard versions of the nucleotide mix were more pronounced. Upon a more detailed analysis of amplicon yield, specificity, and downstream sequencing quality, optimal nucleotide compositions were revealed. Future studies may include exploration of more targets greater than 80% in GC composition, longer templates, and the incorporation of CleanAmp™ 7-deaza-dGTP into the sequencing step.
In column A, HQ and amplicon yield values that lie in the blue shaded boxes (top right) met HQ and amplicon yield threshold values and were then re-plotted with scatter plots in column B (HQ versus relative off-target yield). Values that lie furthest to the left and highest on the plots in column B are the most optimal mixtures for PCR and downstream sequencing.

Fig. 6. Sequencing chromatograms of high GC-rich regions of A) BRAF, B) GNAS, and C) B4GN4 comparing standard 7-deaza-dGTP at 80% and CleanAmp™ 7-deaza-dGTP at 75% 7-deaza-dGTP substitution. Base call quality is represented by the height of the blue shading behind the peaks, which correlates with the base color (shade of blue) in the sequence. Light blue shaded bases (A, C, G, or T) are of high quality and have scores higher than 40. CleanAmp™ dNTPs show improved read length, high quality base scores, and reduced base compressions.