Node sequences used in the calculation.

## Abstract

An asymmetric, fully-connected 8-city traveling salesman problem (TSP) was solved by DNA computing using the ordered node pair abundance (ONPA) approach through the use of pair ligation probe quantitative real time polymerase chain reaction (PLP-qPCR). The validity of using ONPA to derive the optimal answer was confirmed by in silico computing using a reverse-engineering method to reconstruct the complete tours in the feasible answer set from the measured ONPA. The high specificity of the sequence-tagged hybridization, and ligation that results from the use of PLPs significantly increased the accuracy of answer determination in DNA computing. When combined with the high throughput efficiency of qPCR, the time required to identify the optimal answer to the TSP was reduced from days to 25 min.

### Keywords

- DNA computing
- traveling salesman problem
- ordered node pair abundance
- pair ligation probe-qPCR
- PLP-qPCR

## 1. Introduction

The traveling salesman problem (TSP) computes the shortest route on the arcs of a network that visits a given set of nodes (cities) before returning to the starting point [1, 2, 3, 4, 5, 6]. In many cases, * in silico*computers are incapable of quickly determining an exact solution of these nondeterministic polynomial (NP) problems because a linear increase in the number of variables leads to a factorial increase in the number of potential solutions. Although advanced heuristic methods have increased the ability of

*computers to provide approximate answers to NP problems [7], they lack the massive parallelism and data storage required to find exact solutions. DNA computing, which uses the hybridization of DNA molecules as a means to make computations [8, 9, 10, 11, 12, 13, 14, 15], is particularly well-suited to solve computationally intense NP problems such as the TSP because multiple sequences of DNA in a solution can hybridize simultaneously, thereby performing massively parallel computing.*in silico

Several technical limitations have prevented DNA computing from reaching its full potential. Although the computation can occur within seconds, current applications of DNA computing are largely limited by time-consuming and labor-intensive answer sorting and determination processes. For example, a gradient PCR procedure that involved a series of PCR reactions with a variety of primer pair combinations was used to determine the solution to a 7-city directed Hamiltonian circuit problem [8]. Other answer determination methods include DNA sequencing [16], and denaturation temperature gradient-polymerase chain reaction (DTG-PCR) associated with denaturing gradient gel electrophoresis (DGGE) and/or temperature gradient gel electrophoresis (TGGE) [10, 17, 18]. Neither DNA sequencing nor DGDC/TGDC can be used to identify the optimal answer to a TSP because the feasible answers represent both optimal and suboptimal answers that differ only in the order in which the components were ligated when the answers were formed, and are flanked by the sequences that encode the start and end nodes. The dependence of the discriminatory powers of DTG-PCR, DGGE, and TGGE on the diversity of oligonucleotide components, especially the G and C content [19, 20] also limits their applicability for answer determination of NP-complete problems.

Xiong et al. [12] developed the ordered node pair abundance (ONPA) approach to identify the optimal answer of an asymmetric, fully-connected 15-city TSP, the largest problem solved to date, using DNA computing. In that study, 20-mer DNA sequences that specifically encoded the nodes and arcs in the network were added to the reaction mixture in the presence of ligase. The sequence of each arc was capable of linking two node sequences in a specific order via hybridization of the last 10 bases of the prior node to the first 10 bases of the subsequent node. The efficiency of each arc in the network was encoded as the concentration of DNA for that arc using saturating concentrations of DNA for each node. Ligation of these sequences then formed covalently linked ordered node pairs (ONPs) that assembled into answer sequences representing tours through the network, such that the optimal answer would be formed in greatest abundance.

The optimal answer to the 15-city TSP was identified by ONPA after infeasible answer sequences were removed by electrophoresis and magnetic bead separation [11, 12]. For each ONP examined by ONPA, probes were designed to hybridize to the 3′-end of the prior node and the 5′-end of the subsequent node in the ordered pair as well as a complementary pair for use in ligation chain reaction (LCR). Each pair of probes was then subjected to the same number of LCR cycles in the presence of an aliquot of answer sequences, and the relative abundance of LCR product was quantified by PAGE band intensity. This was repeated for every possible combination of ONPs to identify the ones present in greatest abundance. Although this provided a clear determination of the optimal answer, the accuracy and precision of the computation were limited by the error inherent in determining DNA abundance via bands in a PAGE gel and by the fact that LCR is an end-point assay. In addition, the procedure required days to complete.

Quantitative real-time PCR (qPCR), which uses an increase in fluorescent signal to monitor increases of PCR amplicons, has proven to be a fast, precise and reproducible method to rapidly quantify the relative abundance of many nucleic acid sequences. This results in part because qPCR is not an end-point assay. Instead, the cycle at which a positive signal is first consistently detectable, termed the cycle threshold (C_{t}), is proportional to the initial content of a given template in the sample. Higher initial target contents give rise to earlier detectable increases in signal, resulting in lower C_{t} values. Simultaneous amplifications of multiple DNA targets in a single reaction tube can be achieved by multiplex qPCR, which maximizes throughput and decreases the time required to collect information. Ibrahim et al. [21] reported a qPCR approach to readout answers to a Hamiltonian path problem. However, the protocol required multiple combinations of qPCR amplifications and was ultimately dependent upon the use of * in silico*information processing to determine the answer.

Answer determination via a direct qPCR amplification of the entire length of TSP answer sequences is impossible. This is due to the intra-molecular heterogeneity of answers sharing the same starting and the ending nodes, which are the same limitations that eliminate the use of DTG-PCR. The application of qPCR for the quantification of the short 20-mer sequences using ONPA for answer determination of a TSP has also not been possible because standard PCR methodologies require two DNA primers that flank the region of the DNA sequence to be amplified. However, we developed a pair ligation probe (PLP)-dependent qPCR system (PLP-qPCR) that is capable of rapidly quantifying the abundance of short DNA sequences under the multiplex conditions [22]. Upon hybridization to its 20-mer target DNA, the PLP becomes circularized by ligation. This allows the abundance of adjacent short sequences in a DNA strand to be transformed into the copy number of circularized PLP that can be amplified by qPCR.

We have now solved an asymmetric, fully-connected 8-city TSP by the ONPA approach using PLP-qPCR without the need for * in silico*information processing. The validity of using ONPA to derive the optimal answer was confirmed using a reverse-engineering method to reconstruct the complete tours in the feasible answer set from the measured ONPA. The high specificity of the sequence-tagged hybridization and ligation that results from the use of PLPs significantly increased the accuracy of answer determination in DNA computing. When combined with the high throughput efficiency of qPCR, the time required to identify the optimal answer to the TSP was reduced from days to 25 min.

## 2. Materials and methods

### 2.1 Oligonucleotide design and construction

Oligonucleotide sequences were designed using Primer Express Version 2 for Windows (Applied Biosystems) and were synthesized by Invitrogen Inc. Nodes labeled B through H were represented by synthetic 20-mer sequences of DNA except for the starting and ending nodes (A_{start} and A_{end}) that were comprised of 30-mers (Table 1). Node sequences were chosen to minimize cross hybridization, and nodes B through H had a GC content from 30% to 35% such that melting points varied from 60.6 to 62.0 °C, while the A_{start} and A_{end} sequences contained GC contents of 66% and 72%, respectively. None of the arc sequences were complementary to the first 15-mer sequence for the A_{start} and the last 15-mer sequence for the A_{end}. Thus, incorporation of A_{start} or A_{end} into an answer sequence prevented further extension of that end of the answer sequence. The start and end sequences also served as primer sequences for downstream amplification by PCR, which provided the capability to increase the amount of answer sequences. The longer A_{start} and A_{end} sequences with higher GC content were important for downstream PCR amplification with enhanced fidelity and improved efficiency for answer purification. Any two node sequences could be linked together by a 20-mer arc sequence composed of two 10-mers that were complementary to the last and first 10 nucleotides in the sequences of the former and latter nodes, respectively (Figure 1). Arcs were made to complement every combination of nodes with the exception of the respective 5′ and 3′ ends of A_{start} and A_{end}. Oligo sequences (Table 2) designed for the PLP-mediated qPCR assay followed the protocol of Xiong and Frasch [23]. All oligonucleotides were purified by PAGE under denaturing conditions [12].

Node | Length | Sequence a | GC (%) | T_{M} (°C) |
---|---|---|---|---|

A_{S} e | 30 | ^{,} c | 66 | 70.0 |

B | 20 | 30 | 61.2 | |

C | 20 | 35 | 61.6 | |

D | 20 | 30 | 61.6 | |

E | 20 | 35 | 61.6 | |

F | 20 | 30 | 61.6 | |

G | 20 | 30 | 61.7 | |

H | 20 | 25 | 61.7 | |

A_{E} e | 30 | 72 | 72.0 |

DNA sequence | |
---|---|

NED-TAqMan^{®} spacer | |

^{®} reporter | |

NED | |

5′-B′ | |

5′-D′ | |

5′-F′ | |

3′-C″ | |

3′-E″ | |

3′-G″ | |

P_{r} | |

P_{f} |

### 2.2 Answer formation and purification

Hybridization was performed by heating the answer formation reaction medium (AFRM) to 92°C for 4 min, followed by a programmed annealing process at a cooling rate of 1°C per min to 8°C. The AFRM was composed of all relevant oligonucleotides in T4 Ligase Buffer (Fermentas). Ligation was performed by incubating at 8°C for ∼16 h after addition of 5 Weiss U of T4 ligase, 20 mM DTT, and 10 mM ATP to the AFRM.

Ligation products containing all answer sequences were separated by 6% denaturing PAGE containing 8 M urea and 15% formamide at 95 V for about 3 h. The gels were visualized with UV light after staining with ethidium bromide (1 mg/ml) for 10 min. Fragment sizes were determined by comparison with mobility of a 20-bp DNA ladder (Bayou BioLabs). The 190 bp band was excised from the gel that contained sequences consistent in length to those of feasible answers (i.e. containing one copy of each node sequence starting and ending with A_{start} and A_{end}). These sequences were amplified by PCR using PCR primers for A_{start} and A_{end}, and subject to sequential magnetic affinity purification steps as reported by Spetzler et al. [11, 12] to purify feasible answers.

### 2.3 Preparation of target-specific pair ligation probes

Each PLP consisted of one 55-mer core and two 10-mer target-specific sequences that comprised the 5′ and 3′ arms located at ends of the core (Table 2). The core sequence contained the forward (P_{f} 19-mer) and reverse (P_{r} 21-mer) PCR primer-binding sequences, and a qPCR reporter-identifying sequence known as a TaqMan spacer for use with the TaqMan-MGB^{®} (Applied Biosystems) NED reporter dye (λ_{max} = 580 nm). The PLPs specific for each ONP in the answer sequences were made from core and arm components by ligation as per Xiong and Frasch [23].

### 2.4 Circularization of the PLP and the qPCR assay

Aliquots of the purified answer sequences containing ∼2 pmol DNA were denatured and annealed with 20 pM of linear PLP. The hybridized PLPs were circularized by ligation at 10°C overnight. The ligation reaction mixture contained 2 μl of 10× ligation buffer (Fermentas), 50 mM DTT, and 5 Weiss U of T_{4} DNA ligase, in a final volume of 20 μl. After ligation, 2 μl of ligation product was added to 18 μl of the exonuclease mixture that contained 10 mM Tris-HCl, pH 9.0, 5 mM MgCl_{2}, 0.1 mg per ml of BSA, 10 U Exonuclease I and 10 U Exonuclease III to remove any remaining linear PLPs. The samples were incubated at 37°C for 2 h followed by inactivation at 65°C for 20 min.

Quantitative real-time PCR (qPCR) assays were performed in a 96-well, closed plate using the AB 7500 Fast RT-PCR System (Applied Biosystems). In a typical qPCR assay, 20 μl of qPCR reaction mixture that contained 10 μl of 2× TaqMan^{®} Fast universal PCR Master Mix including the ROX fluorescent reporter as a passive reference, 900 nM of each of the P_{f} and P_{r} primers, 0.25 μM of the NED TaqMan-MGB^{®} reporter, and ∼2 ng of the circularized PLP were used to determine the abundance of each ordered node pair. Thermal cycling profiles for qPCR included heating at 95°C for 20s followed by 45 cycles with 95°C for 3s and 60°C for 30s.

During the PLP-qPCR assay, DNA amplification was monitored by quantitatively analyzing fluorescence emissions using an ABI prism sequence detector. The intensity of the NED reporter dye was measured against the ROX internal reference dye signal to normalize for non-PCR-related fluorescence fluctuations that occurred from well to well. The threshold cycle (C_{t}) represented the refraction cycle number at which a positive amplification reaction was measured and was set at 10 times the standard deviation of the mean baseline emission calculated from PCR cycle 5–15.

### 2.5 Linear programming (LP) model of tour frequency from ONPA data

The model is based on LP approaches to data fitting [24, 25, 26, 27], and solves for the tour frequencies * X*that best fit the observed ONPA

_{t}

*.*F

_{ij}

where * ij*;

*;*ij

*= estimated frequency of tour*X

_{t}

*;*t

*= 1, if tour*h

_{tij}

*contains ONP*t

*0 otherwise;*ij;

*= observed frequency of arc*F

_{ij}

*.*ij

The number of possible tours * t*that begin and end with node A, and visit each node only once is

*that were made by the DNA computer were not explicitly measured, one*t

*variable is generated for each tour*X

_{t}

*representing how many DNA molecules of tour*t,

*were likely to have been made, given the ONPA. The goal was to solve for the best-fitting values of*t

*. Constraints*X

_{t}

**are written once for each ONP**2

*. The first expression on the left-hand side of the constraint,*ij

*within all tours*ij

*Constraints*t.

**then add a positive deviation or error (**2

*. Thus, if there is no set of*F

_{ij}

*values that can perfectly reproduce the observed ONPA, the deviation variables are forced to take up any slack or surplus. The objective function*X

_{t}

**minimizes the sum of the deviations so as to find the best-fitting tour frequencies**1

*for the observed ONPA*X

_{t}

*. The deviations are separated into positive and negative components because linear programming cannot handle absolute values explicitly. The deviation variables, like the tour frequency variables, are constrained by*F

_{ij}

**to be non-negative. If the deviation variables were permitted to be either positive or negative, minimizing their sum would not provide a good fit because the objective would reward large negative deviations pushed toward −∞.**3

## 3. Results

The computation to find the optimal solution to an asymmetric, fully-connected 8-city TSP was defined by the distance matrix in Table 3. The problem was designed such that the alphabetical order of the nodes was the optimal tour (Figures 1 and 2d). To accomplish this, the distances were translated into concentrations (1:1 pmol) of the arc sequences that connect the nodes in inverse proportion to the distances (Table 3). These sequences were then hybridized with an excess of all node sequences and ligated to form answer strands as summarized in Figure 2. Node and arc sequences were asymmetric in design so that two node sequences could become ligated in an order-specific manner to form an ordered node pair (ONP). For example, to compute the formation of the BC-ONP, the last (3′) and first (5′) halves of the sequences for nodes B and C hybridize to the first (3′) and last (5′) halves of the BC arc oligonucleotide, respectively, which then permits ligase to link them covalently.

Prior | Subsequent node | ||||||||
---|---|---|---|---|---|---|---|---|---|

Node | A_{S} | B | C | D | E | F | G | H | A_{E} |

A_{S} | * a (0) b | 1 c (100) b | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) |

B | * (0) | * (0) | 1 (100) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) |

C | * (0) | 100 (1) | * (0) | 1 (100) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) |

D | * (0) | 100 (1) | 100 (1) | * (0) | 1 (100) | 100 (1) | 100 (1) | 100 (1) | 100 (1) |

E | * (0) | 100 (1) | 100 (1) | 100 (1) | * (0) | 1 (100) | 100 (1) | 100 (1) | 100 (1) |

F | * (0) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | * (0) | 1 (100) | 100 (1) | 100 (1) |

G | * (0) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | _{*} (0) | 1 (100) | 100 (1) |

H | * (0) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | 100 (1) | * (0) | 1 (100) |

A_{E} | * (0) | * (0) | * (0) | * (0) | * (0) | * (0) | * (0) | * (0) | * (0) |

Given that the concentration of arcs along the upper diagonal of the efficiency matrix was 100-fold that of the other arcs, strands containing node sequences in alphabetical order were anticipated to be produced in highest abundance. A comparable result was obtained previously using a similar approach to solve an asymmetric, fully-connected 15-city TSP [10]. Consequently, this computational method could be reliably used to examine the efficacy of PLP-qPCR in the answer determination step of a DNA computation.

A tour of the nodes was feasible, whether it was optimal or suboptimal, only if it contained one copy of each node sequence flanked by the A_{start} and A_{end} sequences (Figure 2d–f). Figure 3 shows a stepwise summary of the procedure used to make the calculation. The PAGE profile of answer sequences at various stages of purification is shown in Figure 4. Selective amplification of answer sequences flanked by A_{start} and A_{end} (Lane 1) was achieved by PCR using primers specific for those sequences (Figure 2a–f). This PCR product (Lane 2) included the 190-mer band that was the size required for feasible answers (Figure 2c–f), which was then subjected to sequential magnetic bead affinity purification for every node sequence. This insured that all sequences used for answer analysis contained one and only one copy of each node sequence, and thus represented Hamiltonian circuits through the network. The purified sample composed only of answer DNA sequences for feasible tours (i.e. Figure 2d–f) appeared as a single 190-mer band (Lane 3) that was used to determine the optimal answer to the computation.

To identify the optimal answer, the feasible answers (Figure 4, Lane 3) were analyzed by the ONPA approach. This was accomplished using PLP-qPCR with the 56 PLPs that were specific to each of the possible ONPs. Each PLP was added to an aliquot of the solution containing the answer set to prepare it for PLP-qPCR. Preparation included the steps summarized in Figure 5 for the BC-PLP that was used to detect the ONP in which the sequence encoding node B immediately precedes that of node C in a tour. For the BC-ONP, the 5′-arm of the PLP complementary to the last 10 bases of sequence B (designated B′), and the 3′-arm (designated C″) that were complementary to the first 10 bases of sequence C were created [23]. The hybridized PLP was circularized by ligase, after which the PLP was denatured from the target strand and exonuclease was added to eliminate any DNA that had not been circularized. Because the amount of circulated PLP was quantitatively related to the initial content of a target, the target content used in the determination was transformed into the copy number of the circularized PLPs. The presence of a qPCR reporter-identification sequence, and PCR primers in the PLP enabled the use of qPCR to measure the copy number.

Figure 6 shows the PLP-qPCR fluorescence amplification plots as a function of cycle number that determines the amount of each possible ONP in the answer sequences of the 8-city TSP. The ONPA was determined in groups according to the prior node in the ONP such that the relative abundance of any of the ONPs in that group could be compared directly. In each case, it is evident that the first sample in which fluorescence amplification consistently increased above the threshold corresponded to that ONP where the nodes appeared in alphabetical order (i.e. AB, BC, etc.). Thus, it was possible to determine the optimal answer to the 8-city TSP using PLP-qPCR by simple inspection of the raw data. Due to the quantitative nature of qPCR, the C_{t} values measured from the data in Figure 3 were used to calculate the copy number of each ONP (Table 4). The results are organized so that each row and column define the prior and subsequent nodes of each ONP. Since these measurements were made in groups corresponding to each row, comparisons of the ONPA between rows were achieved by determining the fractional ONPA after normalizing the ONPA present in each row in highest abundance (data in parentheses). The average copy number for suboptimal ONPs was 0.35% of the optimal ONP in each row. However, the suboptimal ONPs were not present in equal abundance. Seven of the suboptimal ONPs were present in amounts that were below the limit of detection. However, the ability to detect these low abundance ONPs was not needed to obtain the optimal answer to the problem solved here. Some of the suboptimal ONPs were formed in much higher abundance than the average. Figure 7A shows the number of suboptimal ONPs from Table 4 as a percentage of the preceding node in the pair. The fraction of suboptimal ONPs was significantly larger in rows C, D, E and F in Table 4. This was due to a relatively small number of specific alternate ONPs that included CE (0.2%), DA (0.2%), EC (0.47%), FD (0.5%), and FE (0.47%).

Prior node | Subsequent node | |||||||
---|---|---|---|---|---|---|---|---|

B | C | D | E | F | G | H | A_{E} | |

A_{S} | 1.4 × 10^{6} a | 10 | 3.5 | 2.9 | 0.019 | 0.19 | 1.4 | * |

B | * | 4.8 × 10^{5} | 0.49 | 6 | 10 | 9 | 4.9 | 21 |

C | 0.04 | * | 2.2 × 10^{5} | 420 | 0.18 | 6.4 | 22 | 0 |

D | 28.5 | 190 | * | 3.5 × 10^{5} | 395 | 320 | 48.5 | 750 |

E | 640 | 1600 | 380 | * | 3.4 × 10^{5} | 87 | 1.5 | 0 |

F | 1.5 | 0 | 1800 | 1600 | * | 3.4 × 10^{5} | 380 | 380 |

G | 0.87 | 0.057 | 0 | 47 | 0 | * | 1.2 × 10^{5} | 31 |

H | 0.4 | 22 | 1.6 | 0 | 47 | 0.34 | * | 7.9 × 10^{6} |

To test the validity of using ONPA to identify the optimal answer of a TSP, a linear programming (LP) model containing 5168 variables and 64 constraints was developed to “reverse engineer” the frequency of the feasible tours formed by DNA computing that would most closely match the individual ONP frequencies found in Table 4 when broken down into their constituent ONPs. Since the data for each row in Table 4 were obtained independently from the others, the model ** 1–3**was solved by inputting either the normalized and absolute ONP frequencies in Table 4 as the

*parameters. In the normalized model, the*F

_{ij}

*were converted into percentages of each row total, whereas in the absolute model, the*F

_{ij}

*values were the total number of molecules detected (see Table 5 footnotes a and b).*F

_{ij}

Normalized ONPA a_{ij} | Absolute ONPA b_{ij} | |||||
---|---|---|---|---|---|---|

Top five tours c | # suboptimal ONPs | Probability | Top five tours d | # suboptimal ONPs | Molecules (in 1000s) | Probability |

A_{S}BCDEFGHA_{E} | 0 | 0.990131 | A_{s}BCDEFGHA_{E} | 0 | 339,742 | 0.991252 |

A_{S}BFECDGHA_{E} | 4 | 0.003831 | A_{s}BFDECGHA_{E} | 4 | 1,259 | 0.003672 |

A_{S}FDEBCGHA_{E} | 4 | 0.001860 | A_{s}FDEBCGHA_{E} | 4 | 512 | 0.001494 |

A_{S}BCEDHGFA_{E} | 6 | 0.001099 | A_{s}BCEDFGHA_{E} | 3 | 258 | 0.000754 |

A_{S}BCDFEGHA_{E} | 3 | 0.000733 | A_{s}BFECDGHA_{E} | 4 | 217 | 0.000634 |

For the normalized model, the largest individual deviation is for the FG-ONP. The observed abundance * F*was 0.98791. In the optimal solution, summed over all reverse-engineered tours (

_{FG}

*values such that*X

_{t}

*= 0.98791 exactly, but none of them were optimal because it would have increased the deviations for other ONPs. The same set of*F

_{FG}

*values appeared in all 64 constraints and, while there is*X

_{t}

*set of*no

*values that can satisfy all 64 instances of constraint*X

_{t}

**perfectly with no deviations, the model was designed to minimize the sum of the deviations for all ONPs simultaneously.**2

When the normalized ONPA data from Table 4 were input, 99.01% of feasible tours formed by the DNA computer are estimated to have been the optimal tour ABCDEFGHA. Given the relative proportions of the ONPs formed, no other tour is likely to have comprised more than 0.38% of the tours formed (Table 5), which was the percentage for the second-most frequently formed tour in the reverse-engineering model (A_{S}BFECDGHA_{E}). Of the 5040 possible tours in the LP model, the optimal reverse-engineered set of tours had only 30 tours with a non-zero frequency. The objective function value in the model of the normalized data (the sum of the positive and negative deviations over all 64 ordered node pairs) was 0.02997.

Using the * absolute*molecule frequencies as the

*values, the LP model estimated that 339,742 molecules, or 99.13% of all feasible tour molecules, were the optimal answer. The next most frequently produced tour by the DNA computer was estimated to have formed only 1259 molecules (Table 5). As expected, the error levels increased dramatically when the model was used to reverse-engineer the tours formed from the absolute molecule frequency. This occurred because the row sums of molecules in Table 4 were determined in separate qPCR measurements without controlling for the absolute amount of answer sequences used in the samples between rows. Had this been held constant during the measurement, the total number of molecules in each row of Table 4 should be the same because each feasible answer contains one copy of each node sequence. The total deviations summed to 9,057,230, which represents 81.57% of the total Σ*F

_{ij}

*Σ*

_{i}

*, compared with only 0.37% for the normalized model. It is noteworthy that, in the model based on the absolute data, the five ONPs with the greatest deviations from the observed frequencies (HA*

_{j}F

_{ij}

_{E}, A

_{S}B, GH, BC, CD) were all part of the optimal tour, whereas only two such ONPs were among the top five deviations (FG, BF, DG, BC, and FD) when matching to the normalized ONPA. Despite the fact that the deviations derived from the absolute ONPs were much higher, the DNA computation still returned the optimal tour over 99% of the time.

A fully connected 8-city TSP was solved by the ordered node pair abundance (ONPA) approach through the use of a pair ligation probe quantitative real time polymerase chain reaction (PLP-qPCR) system. The high specificity of the sequence-tagged hybridization and ligation that results from the use of PLPs significantly increased the accuracy of answer determination in DNA computing. When combined with the high throughput efficiency of qPCR, the time required to identify the optimal answer to the TSP was reduced from days to 25 min.

The reverse-engineering LP model provides additional evidence that the DNA computer can distinguish the quality of solutions to the TSP. Although the problem was set up artificially to have a single solution that was far better than any other possible solution, the reverse-engineering LP model confirms that the optimal solution was produced in far greater abundance than any other answer, and among the suboptimal solutions, those with greater optimality were generally produced in greater abundance than those that were less optimal.

Examination of the less frequently produced tours provides a measure of the sensitivity of DNA computer to the abundance of the arc sequence molecules initially input to carry out the computation. To make any change to the optimal tour A_{S}BCDEFGHA_{E}, at least three ONPs in the tour must be different. For example, the tour with penultimate optimality (Figure 2e) must have 3 ONPs that are different from the optimal (Figure 2d), while tours containing 4 different predominant ONPs (Figure 2f) would be even less optimal, and so forth. As shown in Table 5, among the 2nd-5th most frequently produced tours for models predicted from both the absolute and normalized data, all but one exhibit either three or four ONP changes from the optimal tour. When the probabilities of the tours were summed over all tours with a given number of suboptimal ONPs, the frequency of tours formed with more suboptimal ONPs exponentially declined as a function of an increase in the number of suboptimal ONPs incorporated (Figure 7B). Thus, not only did the DNA computer generate the optimal answer far more frequently, but other answers were generated in amounts that were inversely proportional to their objective function value.

Even though the data of Table 4 were obtained in separate sets grouped according to each row without controlling for the abundance of molecules between rows (absolute data), the computation returned the correct answer, demonstrating the robustness of this approach. Several ONPs were not detected as products of the computation (e.g. FB). However, this does not imply that no molecules of these ONPs were formed. Although the amounts of these ONPs were below the level of detection, this did not alter the determination of the optimal answer to this problem because the efficiency of the tour that gave rise to the optimal answer was so much greater than any other tour. Identification of the optimal answers to some less obvious problems may require a more sensitive discrimination of ONPA. Under these circumstances, to quantify smaller amounts of ONPs or to discriminate between smaller differences in abundance of two specific ONPs, the PLP-qPCR can be repeated using larger aliquots of feasible answer solutions.

## Acknowledgments

This work was supported by grants to WDF from DARPA-DSO and AFOSR (FA95500710219). The Fair Isaac Corp. (www.fico.com) provided the * Xpress*Optimization Suite to the university under their Academic Partnership Program.