Quantum Algorithms for Nonlinear Equations in Fluid Mechanics

In recent years, significant progress has been made in the development of quantum algorithms for linear ordinary differential equations as well as linear partial differential equations. There has not been similar progress in the development of quantum algorithms for nonlinear differential equations. In the present work, the focus is on nonlinear partial differential equations arising as governing equations in fluid mechanics. First, the key challenges related to nonlinear equations in the context of quantum computing are discussed. Then, as the main contribution of this work, quantum circuits are presented that represent the nonlinear convection terms in the Navier-Stokes equations. The quantum algorithms introduced use encoding in the computational basis, and employ arithmetic based on the Quantum Fourier Transform. Furthermore, a floating-point type data representation is used instead of the fixed-point representation typically employed in quantum algorithms. A complexity analysis shows that even with the limited number of qubits available on current and near-term quantum computers (< 100), nonlinear product terms can be computed with good accuracy. The importance of including sub-normal numbers in the floating-point quantum arithmetic is demonstrated for a representative example problem. Further development steps required to embed the introduced algorithms into larger-scale algorithms are discussed.


Introduction
Quantum computing [1] and quantum communication are research areas that have seen significant developments and progress in recent years, as is apparent from the work covered in this book. In this chapter, the focus is on the development of quantum algorithms for solving nonlinear differential equations, highlighting key challenges that arise from the nonlinearity of the equations to be solved. For this application of quantum computing, progress has so far been relatively limited, and in this work a promising approach to deriving efficient quantum algorithms is proposed. Although the focus is on nonlinear equations related to fluid mechanics, the approach put forward here is applicable to a much wider range of problems.
Furthermore, in developing the proposed method, efficient quantum circuits involving floating-point arithmetic were created, in contrast to the more commonly used fixed-point arithmetic employed in a range of quantum algorithms. This aspect of the work described here should also be useful for a wider audience.
In this work, the development of quantum algorithms for the nonlinear governing equations for fluid mechanics is described, with a particular focus on representing the nonlinear product terms in the equations. A key aspect of the derived quantum circuits in the present work is the (temporary) representation of the solution in the computational basis, along with the use of a floating-point data representation in the arithmetic operations. The quantum circuits for obtaining the nonlinear product terms are new developments and form the main contribution of this work. In recent years, a small number of works have considered quantum computing applications to fluid mechanics [2][3][4][5][6][7][8]. A brief review of this previous work will be presented in Section 2 and will provide context to the proposed approach. Related work on algorithms with representation in the computational basis is reviewed in this chapter.
This chapter is structured as follows. Section 2 describes the background to the current work. Section 3 reviews the key challenges related to treating nonlinear differential equations in a quantum computing context, followed by a discussion of the nonlinear governing equations in fluid dynamics in Section 4. Section 5 then describes how nonlinear terms in governing equations can be evaluated in quantum algorithms using the computational basis. Section 6 and Section 7 discuss the quantum circuits used for computing the square of a floating-point number and the multiplication of two floating-point numbers, respectively. The simulation and verification of the derived quantum circuits is presented in Section 8. The complexity of the circuits is analyzed in Section 9. Finally, conclusions from this work and suggestions for further work are presented in Section 10.

Background of present work
For a small number of applications, quantum algorithms have been developed that display a significant speed-up relative to classical methods. Computational quantum chemistry is proving to be one of the key areas of application. Important developments for a wider range of applications include quantum algorithms for linear systems [9,10] and the Poisson equation [11]. Applications to computational science and engineering problems beyond quantum chemistry have only recently begun to appear [4][5][6][12][13][14]. Despite this research effort, progress in defining suitable engineering applications for quantum computers has been limited.
Significant progress has been made in recent years in the development of quantum algorithms for linear ordinary differential equations (ODEs) as well as linear partial differential equations (PDEs) [15][16][17][18][19]. However, in contrast to this progress for linear equations, there has not been similar progress in the development of quantum algorithms for nonlinear ODEs and nonlinear PDEs. An early work by Leyton and Osborne [20] presented an innovative and highly ambitious algorithm. However, the computational complexity of this algorithm depends exponentially on the time interval used in the time integration. A small number of more recent works have addressed nonlinear differential equations, typically obtaining algorithms for very specific problems [8]. Therefore, much more research is needed into quantum algorithms for a wider range of nonlinear problems.
Early work in quantum computing relevant to the field of Computational Fluid Dynamics (CFD) mainly involved the work on quantum lattice-gas models by Yepez and co-workers [2,3]. This work typically used type-II quantum computers, consisting of a large lattice of small quantum computers interconnected in nearest-neighbor fashion by classical communication channels. In contrast to these quantum lattice-gas based approaches, the present study focuses on quantum algorithms designed for near-future 'universal' quantum computers. The potential of quantum computing in the context of direct numerical simulation of flows was reviewed recently by Griffin et al. [7], showing that a number of further developments are needed to make this approach viable.
Typically, there are two methods of encoding the result of a quantum algorithm: encoding within the computational basis of the quantum state and encoding within the amplitudes of the quantum state. The widely-used Quantum Fourier Transform (QFT) uses the second approach. The QFT, with complexity $O(\log^2 N)$ for problem size $N$, has an exponential speed-up compared to the classical fast Fourier transform (complexity $O(N \log N)$) and plays an important role in quantum computation as an essential part of many quantum algorithms. The exponential speed-up realized is due to superposition and quantum parallelism. However, in some quantum algorithms, the Fourier coefficients may be needed in the computational basis [21].
Here, the two different encoding methods are illustrated using the discrete Fourier Transform (DFT). The QFT performs the DFT in terms of amplitudes as,

$$\sum_{j=0}^{N-1} x_j |j\rangle \;\rightarrow\; \sum_{k=0}^{N-1} y_k |k\rangle, \qquad y_k = \frac{1}{\sqrt{N}} \sum_{j=0}^{N-1} e^{2\pi i jk/N}\, x_j. \tag{1}$$

That is, the QFT performs a DFT on a list of complex numbers, and the result is stored as amplitudes of a quantum state vector. In order to extract the individual Fourier components, measurements need to be performed on the quantum state vector. Therefore, the QFT is not directly useful for determining the Fourier-transformed coefficients of the input state. However, the QFT is widely used as a subroutine in larger algorithms.
In contrast to the amplitude encoding in Eq. (1), Zhou et al. [21] presented a quantum algorithm computing the Fourier transform in the computational basis (termed QFTC). This quantum algorithm encodes Fourier coefficients with fidelity $1 - \delta$ and digit accuracy $\varepsilon$ for each Fourier coefficient. Its time complexity depends polynomially on $\log(N)$, and linearly on $1/\delta$ and $1/\varepsilon$. The QFTC enables the Fourier-transformed coefficient to be encoded in the computational basis as follows,

$$|k\rangle |0\rangle \;\rightarrow\; |k\rangle |y_k\rangle, \tag{2}$$

where $|y_k\rangle$ corresponds to the fixed-point binary representation of $y_k \in (-1, 1)$ using two's complement format. In the algorithm proposed by Zhou et al. [21], the input vector $\vec{x}$ is provided by an oracle $O_x$ such that,

$$O_x |j\rangle |0\rangle = |j\rangle |x_j\rangle, \tag{3}$$

which can be efficiently implemented if $\vec{x}$ is efficiently computable, or by using qRAM, which has complexity $\log(N)$ under certain conditions [21]. Comparing Eq. (1) and Eq. (2), it is clear that encoding in the computational basis requires a number of additional qubits depending on the required fixed-point representation.
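As a purely classical illustration of the difference between the two encodings, the sketch below computes the DFT of a small normalized vector and then lists (a) the amplitudes that an amplitude encoding would hold and (b) the bit strings that a computational-basis encoding would hold. Several assumptions are made for this example only and are not taken from [21]: the function names, the use of 4 fractional bits, rounding to the nearest fixed-point value, and the restriction to the real parts of the coefficients.

```python
import cmath
import math

def dft(x):
    """Unitary DFT: y_k = (1/sqrt(N)) * sum_j x_j * exp(2*pi*i*j*k/N)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n)) / math.sqrt(n)
            for k in range(n)]

def twos_complement_bits(value, n_frac):
    """Two's complement fixed-point string (1 sign bit + n_frac fraction bits)."""
    n_bits = n_frac + 1
    scaled = round(value * 2 ** n_frac)          # nearest fixed-point value (assumed mode)
    if not -2 ** n_frac <= scaled < 2 ** n_frac:
        raise ValueError("value outside the representable range [-1, 1)")
    return format(scaled % 2 ** n_bits, "0{}b".format(n_bits))

x = [0.5, 0.5, 0.5, -0.5]                        # normalized input: sum |x_j|^2 = 1
y = dft(x)

# (a) amplitude encoding: the coefficients are the amplitudes of |k>
amplitudes = y
# (b) computational-basis encoding: each |k> is paired with a register holding y_k
basis_strings = [twos_complement_bits(v.real, 4) for v in y]
```

In (a) the coefficients are only accessible through repeated measurements, whereas in (b) each coefficient occupies `n_frac + 1` additional (qu)bits, which is the overhead referred to in the text.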

Nonlinear problems on quantum computers
An early work by Leyton and Osborne [20] introduced a quantum algorithm to solve nonlinear differential equations, albeit with an unfavorable complexity. Since then, very few works have considered quantum algorithms for nonlinear equations. In contrast, algorithms for linear differential equations have continued to receive significant attention. As an example, advanced quantum spectral methods for differential equations were published recently by Childs and Liu [19].
A key contributing factor to the limited progress in algorithms for nonlinear problems is the inherent linearity of quantum mechanics. For quantum algorithms encoding information as amplitudes of a quantum state vector, nonlinear (product) terms cannot be obtained by multiplying these amplitudes by themselves, as a result of the no-cloning theorem that prohibits the copying of an arbitrary quantum state. Furthermore, all quantum-gate operations (with the exception of measurements) in the quantum-circuit model used here need to be unitary and reversible. These requirements add further challenges to representing nonlinear terms when using the amplitude-based encoding approach. Specifically, in a normalized quantum state vector all amplitudes have magnitude ≤ 1, with equality only if a single amplitude is non-zero. An operator performing products of the amplitudes therefore cannot be unitary, since the resulting quantum state vector will in general no longer have unit norm.
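The norm argument can be checked with a two-amplitude example. The following minimal sketch (plain Python, variable names chosen for this illustration) squares each amplitude of a normalized state and compares the norm before and after.

```python
import math

def norm(state):
    """Euclidean norm of a list of amplitudes."""
    return math.sqrt(sum(abs(a) ** 2 for a in state))

state = [1 / math.sqrt(2), 1 / math.sqrt(2)]     # normalized equal superposition
squared = [a * a for a in state]                 # element-wise map a -> a^2

norm_before = norm(state)                        # 1 up to round-off
norm_after = norm(squared)                       # 1/sqrt(2): the norm has shrunk

basis_state = [0.0, 1.0]                         # single non-zero amplitude
norm_basis_after = norm([a * a for a in basis_state])   # remains exactly 1
```

Only for a basis state does the element-wise product leave the norm unchanged, so the map cannot be realized by a unitary operator acting on arbitrary states.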
One possible way around these problems associated with nonlinear terms would be a hybrid quantum-classical approach where the nonlinear products are computed on a classical computer. However, due to the complexity introduced by measuring the quantum state (needed before each transfer of information to the classical computer) and the cost of (re-)initialization of the quantum computer with the result of these products, this is not a promising line of development. It is highly unlikely to lead to a quantum speed-up.
Recently, Variational Quantum Computing (VQC) was introduced as an effective hybrid classical-quantum approach [22,23], firstly for applications in quantum chemistry and more recently for a wider range of linear and nonlinear problems [24]. The VQC approach constructs the required solution from a layered network, as illustrated in Figure 1. As shown in Figure 1(a), multiple layers are used (4 in the illustration), each taking as input multiple qubits (6 in the example shown). Using depth 5 in the example, the quantum circuits defined by $U(\lambda)$ involve 13 two-qubit gates, as shown in Figure 1(b). Each of these gates has a parameter $\lambda_i$ ($i \in [1, 13]$) associated with it. A classical computer is used to create optimized parameters $\lambda$, employing an iterative approach that takes the measured state of the ancilla qubit as input. A further key part of the approach is the problem-specific Quantum Nonlinear Processing Unit (QNPU). Recently, Lubasch et al. [24] published an example of the QNPU for the nonlinear Burgers equation. In applications of the VQC approach, the efficiency strongly depends on the choice of the number of parameters $\lambda$ used in $U(\lambda)$. The work by Lubasch et al. [24] showed that exponential speed-up is only possible if the depth of $U(\lambda)$ scales with the number of qubits and not with the overall problem size. It is clear that the proposed VQC approach is an important development toward QC applications to nonlinear problems. It therefore constitutes a leading candidate for applications to fluid dynamics. However, it is also clear that further investigation is needed to assess its suitability for a range of applications.

Figure 1. Illustration of the Variational Quantum Computing (VQC) approach (adapted from Lubasch et al. [24]).

Nonlinear governing equations in fluid mechanics
The Navier-Stokes equations for an incompressible, Newtonian fluid can be written as,

$$\frac{\partial U_i}{\partial x_i} = 0, \qquad \frac{\partial U_i}{\partial t} = -\frac{1}{\rho}\frac{\partial p}{\partial x_i} - U_j \frac{\partial U_i}{\partial x_j} + \nu \frac{\partial^2 U_i}{\partial x_j \partial x_j}, \tag{4}$$

where $U$, $p$, $\rho$ and $\nu$ are the velocity, pressure, density and kinematic viscosity, respectively, and $x$ denotes the coordinate in space. The second term on the right-hand side of the momentum equation in Eq. (4) is the nonlinear convection term that poses a key challenge to developing efficient quantum algorithms for the Navier-Stokes equations. Efficient quantum algorithms for linear convection equations discretized on regular Cartesian meshes with periodic boundary conditions have been devised in recent years [6]. When studying numerical methods for the Navier-Stokes equations, it is often useful to switch to Burgers' model equation, to obtain a single nonlinear partial differential equation that retains a nonlinear convection term similar to the Navier-Stokes equations. Using the VQC approach, Lubasch and co-workers recently published example quantum circuits to model the Burgers equation [24]. Griffin et al. [7] discuss two approaches for treating the nonlinear term in the Navier-Stokes equations: the VQC approach of Lubasch et al. [24] and a linearized approach. These authors conclude that, at present, VQC represents the most promising approach for the Navier-Stokes equations. Their study also highlights that much further research work is needed to create efficient algorithms for fluid dynamics applications.
It is relatively easy to show that the linearization approach to solving nonlinear governing equations on a quantum computer is generally unfeasible. In applying linearization to nonlinear governing equations, the idea is to use a linearization about the present state of the solution, and then advance this linearized problem in time. This creates a linearization error, which is small if the time step is small. However, even if this linearization error can be tolerated, the linearization approach is problematic in a quantum computing context. This is due to the need for repeated measuring of the quantum state (so that the gates that implement the linear operator may be updated with the current solution) and repeated re-initialization of the quantum state. The complexity associated with repeated measuring and re-initialization is so large that any benefit of a quantum algorithm over a classical algorithm is very likely to vanish.
The development of quantum algorithms for fluid dynamics is clearly at a very early stage and therefore it is essential that different approaches are considered.

Representing nonlinear terms in computational basis
In the present work, an alternative approach to introducing the nonlinear terms of nonlinear differential equations into a quantum algorithm is investigated. Specifically, the assumption is made that in a large-scale quantum algorithm for the solution of the nonlinear (partial) differential equations, the solution is encoded in terms of amplitudes in the quantum state vector, i.e. the approach used in a wide range of algorithms including the QFT. Then, for the nonlinear terms of the equations, the following steps are suggested. First, within the larger quantum algorithm, a quantum algorithm is embedded that converts the solution from the quantum-amplitude representation to a representation in the computational basis. Recently, quantum algorithms for this 'analog-to-digital conversion' were published by Mitarai et al. [25]. Using the representation of the solution in the computational basis, the required nonlinear terms are then efficiently evaluated using quantum circuits presented later in this chapter. Once computed, a conversion back to quantum-amplitude representation is to be used, enabling the rest of the quantum algorithm to proceed. For this 'digital-to-analog' conversion, quantum algorithms were recently studied and published by SaiToh [26].
For the representation in the computational basis, a fixed-point approach is typically employed to represent real or complex numbers in quantum algorithms. The number of additional qubits required when using computational-basis encoding depends directly on the number of qubits required to represent the real and complex numbers needed in the algorithm. In the present work, a different approach is put forward: instead of using fixed-point arithmetic, a floating-point representation is used.
In the literature, quantum arithmetic using floating-point numbers has received very little attention so far. Haener et al. [27] described an investigation into quantum circuits for floating-point addition and multiplication and compared automatically generated circuits from Verilog implementations with hand-crafted optimized circuits. Their study provides evidence that floating-point arithmetic is a viable candidate for use in quantum computing, at least for typical scientific applications, where addition operations usually do not dominate the computation. Following on from these conclusions, the present work investigates the use of floating-point arithmetic as part of evaluating nonlinear terms in the computational basis.

Previous works on algorithms in computational basis
Quantum arithmetic in the computational basis constitutes an important component of many quantum algorithms, and as a result reversible implementations of algebraic functions (addition, multiplication, inverse, square root, etc.) have been widely studied. In contrast, there is relatively little work on quantum algorithm implementation of higher-level transcendental functions, such as logarithmic, exponential, trigonometric and inverse trigonometric functions. Examples of applications of trigonometric and inverse trigonometric functions in the computational basis can be found in the famous HHL algorithm [9] and in the state preparation algorithm introduced by Grover and Rudolph [28]. More recently, a quantum algorithm for approximating the QR decomposition of an $N \times N$ matrix in the computational basis was published by Ma et al. [29], with polynomial speed-up over the best classical algorithm.

Fixed-point and floating-point arithmetic
A fixed-point number held in an $n_q$-qubit register can be defined as the following quantum state,

$$|w\rangle = |w^{(n_{int}-1)}\rangle \otimes \cdots \otimes |w^{(0)}\rangle \otimes |w^{(-1)}\rangle \otimes \cdots \otimes |w^{(-n_{frac})}\rangle, \tag{5}$$

where $w^{(j)} \in \{0, 1\}$. This state represents the number $w = \sum_j w^{(j)} 2^j$. The $n_{int}$ leftmost qubits are used to represent the integer part of the number and the remaining $n_{frac} = n_q - n_{int}$ qubits represent its fractional part. In this example, no sign qubit is used, so that only positive numbers can be represented (for most applications an additional sign qubit would be required). Since fewer than $n_q$ bits may suffice for the representation of the input, a number of the leftmost qubits in the register may be set to |0⟩. Clearly, the fixed-point system is very limited in terms of the size of the numbers it can store. Therefore, soon after computers were introduced for numerical computing, the switch to floating-point arithmetic was made.
In a computer implementation of a floating-point number with base 2, a non-zero signed number $x$, defined through a normalized representation, is expressed as,

$$x = \pm S \times 2^{E}, \qquad 1 \le S < 2, \tag{6}$$

where the numbers $S$ and $E$ are the mantissa and the exponent, respectively. The binary expansion of the mantissa is

$$S = (b_0.b_1 b_2 b_3 \ldots)_2. \tag{7}$$

Here, it is important to note that always $b_0 = 1$ for non-zero numbers in a normalized representation. This will be used in the present work to achieve savings in the number of required qubits, as detailed later. In the binary representation, the bits following the binary point are the fractional part of the mantissa. Once floating-point numerical computation on classical computers became commonplace, the industry standard IEEE 754 was introduced [31]. A similar standard for floating-point representations on a quantum computer does not yet exist, but is desirable [30]. A key feature of the IEEE standard is that it requires correctly rounded operations: correctly rounded arithmetic operations, correctly rounded remainder and square root operations and correctly rounded format conversions. Typically, rounding to the nearest floating-point number available in the destination (output register) is used. In the quantum circuits in the present work, rounding down to nearest is used, for reasons of simplicity. Detailed analysis of the quantum circuits developed here for squaring and multiplication operations shows that 'correctly' rounding to nearest involves a significant increase in circuit complexity (i.e. using quantum equivalents of guard and sticky bits, which are well established in arithmetic on classical computers [31]).
A key aspect of IEEE 754 that has been incorporated in the present work is the definition of sub-normal numbers. To illustrate the concept of sub-normal numbers, the IEEE 754 standard representation of single format numbers using a 32-bit word is considered. The first bit is the sign bit, followed by 8 bits representing the exponent. Then, 23 bits are used to store a 24-bit representation of the mantissa, i.e. $b_0$ is not stored. Numbers with exponent bits 00000000 . Sub-normal numbers are used to represent smaller numbers, i.e. in this case the exponent field has a zero bit string but the fraction field has a nonzero bit string. Zero is represented with a zero bit string for the fractional field. For all sub-normal numbers, the $(00000000)_2$ used for the exponent represents $2^{-126}$ and by using the 23 fractional field bits, equally-spaced numbers in the range $(0.00\ldots01)_2 \times 2^{-126}$ (with 22 zero bits after the binary point) to $(0.11\ldots11)_2 \times 2^{-126}$ (with 23 one bits after the binary point) are encoded.
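The single-format sub-normal range described above can be inspected directly on a classical machine. The short sketch below uses Python's standard `struct` module to reinterpret 32-bit patterns as IEEE 754 single-format numbers; the helper name `bits_to_float32` is chosen for this illustration.

```python
import struct

def bits_to_float32(bits):
    """Interpret a 32-bit unsigned integer as an IEEE 754 single-format number."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# exponent field 00000000, fraction field 00...01 (22 zeros, then a one)
smallest_subnormal = bits_to_float32(0x00000001)
# exponent field 00000000, fraction field 11...11 (23 ones)
largest_subnormal = bits_to_float32(0x007FFFFF)
# exponent field 00000001, fraction field all zero: smallest normalized number
smallest_normal = bits_to_float32(0x00800000)
```

The three values are $2^{-149}$, $(1 - 2^{-23}) \times 2^{-126}$ and $2^{-126}$, so the sub-normal numbers fill the gap between zero and the smallest normalized number with a uniform spacing of $2^{-149}$.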

Quantum floating-point format used in present work
Based on the floating-point representation defined in the IEEE standard, the present work introduces a floating-point system with fewer bits (i.e. qubits in this case) than the 32 used for single format numbers. This is the direct result of the limited number of qubits available on current and near-term quantum computers. To optimize the range of floating-point numbers that can be represented with the approach used here, the following key aspects of the IEEE standard were adopted:
• For the mantissa, only the fractional part is stored,
• Exponent bit strings $(00\ldots00)_2$ and $(11\ldots11)_2$ are used for special cases, i.e. dealing with 0, sub-normal numbers as well as cases with overflow,
• The remaining range of exponent bit strings is used for a range of exponents centred around $2^0 = (01\ldots1)_2$,
• Sub-normal numbers are used to extend the range of small numbers,
• Rounding down to nearest is used as rounding mode,
• Only unsigned numbers are considered for simplicity. Signed numbers can easily be obtained by adding a further 'sign' qubit.
In this work, a floating-point number is represented as an $n_q = N_M + N_E$ quantum register. In the quantum-circuit implementation, the most significant (leftmost) mantissa qubit is not stored, using the hidden-bit approach used in the IEEE 754 standard. Therefore, $N_M - 1$ qubits define the fractional part of the mantissa in the developed quantum circuits. $N_E$ defines the number of qubits used to define the exponent. In the following, examples with $N_E = 3$ and $N_E = 4$ and $N_M \in [3, 5]$ are considered. For $N_E = 3$, the number 1.00 is defined by |00|011⟩ when $N_M = 3$. Similarly, |000|0111⟩ defines the number 1.000 for $N_M = 4$ and $N_E = 4$. For $N_E = 3$, the smallest normalized number that can be represented is 1/4, independent of the number of mantissa qubits. Then, exponent state |000⟩ defines zero and sub-normal numbers, as shown in Table 1. Similarly, using 4 qubits for the exponent ($N_E = 4$) means that the smallest normalized number is 1/64. For $N_M = 4$ and $N_M = 5$, Table 2 shows the corresponding sub-normal numbers.
In line with the IEEE 754 standard, exponent state |1…1⟩ denotes numbers for which an overflow has occurred. For $N_E = 3$, the largest normalized number available is |11…1|110⟩, which equates to 14 and 15 for $N_M = 3$ and $N_M = 4$, respectively. Similarly, for $N_E = 4$, the largest normalized number available is |11…1|1110⟩, which equates to 240 and 248 for $N_M = 4$ and $N_M = 5$, respectively.
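The limits quoted above can be reproduced with a small classical decoder for this reduced format. The sketch below is an illustration, not part of the quantum circuits: the function name `decode` and the argument layout are hypothetical, the exponent bias $2^{N_E-1}-1$ is inferred from the encodings of 1.00 given in the text, and sub-normal numbers are assumed to follow the IEEE 754 convention (hidden bit 0, same scale as the smallest normalized number).

```python
from fractions import Fraction

def decode(exp_field, frac_field, n_e, n_m):
    """Value of an unsigned number with n_e exponent bits and n_m - 1 stored fraction bits.

    Returns None for the all-ones exponent, which is reserved for overflow.
    """
    bias = 2 ** (n_e - 1) - 1                     # 3 for n_e = 3, 7 for n_e = 4 (inferred)
    frac = Fraction(frac_field, 2 ** (n_m - 1))   # fractional part of the mantissa
    if exp_field == 2 ** n_e - 1:
        return None                               # overflow marker
    if exp_field == 0:
        # zero and sub-normal numbers: hidden bit is 0 (assumed IEEE-style convention)
        return frac * Fraction(2) ** (1 - bias)
    # normalized numbers: hidden bit is 1
    return (1 + frac) * Fraction(2) ** (exp_field - bias)

one = decode(0b011, 0b00, n_e=3, n_m=3)               # |00|011>  -> 1
smallest_normal_3 = decode(0b001, 0b00, 3, 3)         # -> 1/4
largest_3_3 = decode(0b110, 0b11, 3, 3)               # -> 14
largest_3_4 = decode(0b110, 0b111, 3, 4)              # -> 15
smallest_normal_4 = decode(0b0001, 0b000, 4, 4)       # -> 1/64
largest_4_4 = decode(0b1110, 0b111, 4, 4)             # -> 240
largest_4_5 = decode(0b1110, 0b1111, 4, 5)            # -> 248
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the decoded values can be compared with the quoted numbers without round-off.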

Quantum circuits for squaring floating-point numbers
Table 1. Sub-normal numbers for floating-point numbers with 3 exponent qubits.

Table 2. Sub-normal numbers for floating-point numbers with 4 exponent qubits.

For a floating-point number defined by $N_M$ mantissa and $N_E$ exponent bits, a total of $N_M - 1 + N_E$ qubits is needed to define the state in the quantum circuits introduced here. An example with $N_M = 3$ and $N_E = 3$ will now be considered, using registers |imb1|imb0⟩ and |ieb2|ieb1|ieb0⟩ to define the fractional part of the mantissa and the exponent of the input number, respectively. For the multiplication operation described later, a second input floating-point number is defined using |ima1|ima0⟩ and |iea2|iea1|iea0⟩. The output of the squaring and multiplication operations is a floating-point number $r$ defined by |imr1|imr0⟩ and |ier2|ier1|ier0⟩ (initialized at |0⟩). In addition to the input and output registers, the quantum circuits need additional qubits to hold intermediate results, e.g. for $N_M = 3$ a 6-qubit sub-register |imp5|…|imp0⟩ is used. To facilitate the quantum-multiplication operations, a further ancilla qubit |a0⟩ is used. For quantum circuits without measures to deal with sub-normal numbers and overflow, the quantum state for $N_M = 3$ and $N_E = 3$ comprises these registers (Eq. (8)). For $N_M = 4$ and $N_E = 4$, the required number of qubits increases.
The quantum circuit performing the squaring operation for $N_M = 3$ and $N_E = 3$ is detailed here as an example (in realistic applications $N_M > 3$ will typically be needed). Figure 2 shows the quantum circuit used in the first step of computing the square of a quantum floating-point number with $N_M = 3$ and $N_E = 3$. This step involves computing the square of the mantissa, with this result temporarily stored in |imp5|…|imp0⟩. In this circuit, QFT6 prepares this temporary register for the three subsequent product steps denoted by $P_1$, $P_2$ and $P_3$, involving doubly-controlled phase operations. Specifically, three-qubit gates are used, applying a phase rotation conditional on the state of |a0⟩ and either |imb1⟩ or |imb0⟩. The $P_i$ steps are controlled-summation operations in the shift-and-add approach to computing the products, i.e. the circuits in $P_i$ are derived from quantum adders controlled by an additional qubit. Once the controlled phase changes in the circuits $P_1$, $P_2$ and $P_3$ have been applied, the inverse QFT6 on |imp5|…|imp0⟩ creates the desired output state. In case the square of the mantissa is ≥ 2, i.e. |imp5⟩ = |1⟩, the result exponent needs to be incremented by 1. This is achieved by applying a controlled-NOT to |ier0⟩ (which was initialized at |0⟩) with |imp5⟩ as control. In the next step, result mantissa qubits |imr1|imr0⟩ are set using temporary results in |imp5|…|imp0⟩, where the required gate operations are conditional on the state of |imp5⟩. Then the steps shown in Figure 2 are 'uncomputed' so that the sub-register |imp5|…|imp0⟩ is set to |0⟩ again.
The next step is illustrated in Figure 3, where the output exponent is obtained. This step involves the initialization of the temporary register |imp3|…|imp0⟩ with $2 \times E_b$ (i.e. twice the input exponent). Then, the bias of $(011)_2 = 3$ is removed (denoted by −011). This bias removal uses two's complement to create a modified modulo-5 adder that removes a value $(011)_2 = 3$ from |imp4|…|imp0⟩. Then, the result exponent sub-register |ier2|ier1|ier0⟩ is prepared for the subsequent modulo-3 addition (denoted by MADD3) by applying QFT3. Next, the modulo-3 adder is used to add the qubits |imp2|imp1|imp0⟩ into |ier2|ier1|ier0⟩. By applying the inverse QFT3 on |ier2|ier1|ier0⟩ the required state is obtained. The remaining steps shown in the quantum circuit in Figure 3 are used to 'uncompute' and clean up the temporary register, e.g. using inverse QFT3 and a modified modulo-5 adder to re-apply the bias $(011)_2 = 3$.
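The shift-and-add idea behind the product steps can be illustrated with a classical analogue. The sketch below is only that: in the quantum circuits the additions are carried out as controlled phase rotations between a QFT and its inverse, and the correspondence between loop iterations and the steps $P_1$, $P_2$, $P_3$ is an assumption made for this illustration. The function names are hypothetical.

```python
def shift_and_add(a, b, n_bits):
    """Multiply two n_bits-wide unsigned integers by conditional shifted additions."""
    product = 0
    for i in range(n_bits):
        if (a >> i) & 1:            # bit i of a acts as the control of one addition
            product += b << i       # add b shifted left by i positions
    return product

def square_mantissa(frac_field, n_m):
    """Square the n_m-bit mantissa 1.f, returning (2*n_m-bit integer product, carry)."""
    mant = (1 << (n_m - 1)) | frac_field        # restore the hidden bit b_0 = 1
    prod = shift_and_add(mant, mant, n_m)       # n_m conditional additions
    carry = (prod >> (2 * n_m - 1)) & 1         # top bit set when mantissa^2 >= 2
    return prod, carry

# N_M = 3: mantissa 1.11 = 7/4, square 49/16 = (11.0001)_2, top bit set
prod_max, carry_max = square_mantissa(0b11, 3)
# N_M = 3: mantissa 1.01 = 5/4, square 25/16 = (01.1001)_2, top bit clear
prod_small, carry_small = square_mantissa(0b01, 3)
```

For $N_M = 3$ the product occupies 6 bits, matching the 6-qubit temporary sub-register, and the top bit plays the role of |imp5⟩ in deciding whether the result exponent is incremented.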
The circuits described so far do not take into account the special situation arising from creating sub-normal numbers as output, as well as cases with 'overflow' results. This is discussed next. For certain normalized input numbers the squaring operation leads to outputs truncated to 0 or to the non-zero sub-normal numbers discussed in Section 5.3. The quantum circuits discussed so far need to be modified in a number of ways to deal with this possible sub-normal output. […] shown on the right-hand side of Figure 3. First, for |imp5⟩ = |1⟩, setting |ier0⟩ = |1⟩ becomes conditional on both |isub⟩ = |1⟩ and |icut⟩ = |1⟩. The next two 4-qubit gates are used to guarantee that correct output with exponent |111⟩ is created for inputs with exponents |101⟩ and |110⟩. The remaining gate operations perform the 'copying' of the mantissa squared into |imr1|imr0⟩, taking into account the possible sub-normal output (cases with |isub⟩ = |0⟩). The steps for |isub⟩ = |1⟩ are the same as in the corresponding circuit for squaring without the sub-normal number modifications.
A further set of circuit modifications to deal with sub-normal numbers is required in the quantum circuit used to obtain the output exponent. Figure 5 shows the additional operations required relative to the original quantum circuit shown in Figure 3. Three additional CNOT operations are introduced just before performing the QFT3. For |isub⟩ = |0⟩ and |icut⟩ = |0⟩ the initialization of |imp1|imp0⟩ is modified so that the subsequent steps will produce the correct result for the exponent. The three CNOT operations also appear in the 'uncompute' stage at the right-hand side of the circuit. Further changes comprise two 4-qubit controlled-NOT operations on |ier2⟩ and |ier1⟩ required to create |111⟩ exponents for inputs with exponent |110⟩.
For a fixed value of $N_E$ it is important to note that the additional complexity introduced by increasing $N_M$ is limited. In fact, the quantum circuit shown on the left-hand side of Figure 4 does not depend on $N_M$. Similarly, the quantum circuits used to obtain the result exponent are independent of $N_M$. The circuit shown on the right-hand side of Figure 4, representing the definition of |imr1|imr0⟩ for cases with normalized or sub-normal output, requires modification. Figure 6 shows how |imr2|imr1|imr0⟩ are set for $N_M = 4$ using a set of gate operations that has grown linearly with $N_M$. The circuit shown accounts for sub-normal numbers and includes underflow/overflow protection.

Quantum circuits for multiplication of floating-point numbers
In the interest of brevity, only the main features of the quantum circuits used for multiplication of two quantum floating-point numbers are summarized here. Figure 7 illustrates the quantum circuit used to compute the product of the mantissas of two inputs. Compared to the circuit shown in Figure 2, the main difference is that ancilla qubit |a0⟩ is now set using the mantissa of a second input. A further difference relative to the squaring operation occurs in the circuit used to obtain the result exponent. Here, instead of setting twice the exponent using a bit shift, the sum of the two input exponents needs to be computed employing a quantum full adder.
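The exponent bookkeeping behind both operations can be written out explicitly. The following is a worked restatement rather than a formula quoted from the circuits: $E_a$, $E_b$ and $E_r$ denote the stored (biased) exponent fields, the bias $B = 2^{N_E - 1} - 1$ is inferred from the encoding of 1.00 as exponent state |011⟩ for $N_E = 3$, and the carry symbol $c$ is introduced here for illustration. Sub-normal and overflow cases are not covered.

```latex
\begin{aligned}
x_a &= S_a \, 2^{E_a - B}, \qquad x_b = S_b \, 2^{E_b - B}, \qquad 1 \le S_a, S_b < 2, \\
x_a x_b &= (S_a S_b) \, 2^{(E_a + E_b - B) - B}, \qquad 1 \le S_a S_b < 4, \\
E_r &= E_a + E_b - B + c, \qquad
c = \begin{cases} 1 & \text{if } S_a S_b \ge 2, \\ 0 & \text{otherwise,} \end{cases} \\
E_r &= 2 E_b - B + c \qquad \text{for squaring } (a = b).
\end{aligned}
```

For $N_E = 3$ ($B = 3$), squaring 1.5 (stored exponent $E_b = 3$, $S_b = 1.5$) gives $S_b^2 = 2.25 \ge 2$, so $c = 1$ and $E_r = 6 - 3 + 1 = 4$, i.e. a result in the binade starting at $2^{4-3} = 2$, as expected for $2.25$.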

Results of simulation and verification of quantum circuits
The proposed quantum circuits for squaring and multiplying floating-point numbers as part of the computational-basis representation were systematically verified by gate-level simulation of the circuits for a wide range of cases with and without sub-normal numbers, as well as cases with overflow results. The C++ quantum computer simulator detailed in previous work [4] was used for this purpose. To illustrate the process, the quantum algorithm used to square numbers with $N_M = 3$ and $N_E = 3$ is considered. The algorithm demonstrated accounts for sub-normal numbers as well as underflow/overflow protection (see Eq. (8) for reference) and uses a 19-qubit register in which $|ieb2|ieb1|ieb0\rangle$ and $|imb1|imb0\rangle$ define the exponent and the fractional part of the mantissa of the input, respectively. Qubits $|icut\rangle$ and $|isub\rangle$ are initialized as $|1\rangle$, while all other qubits are initialized as $|0\rangle$. The quantum state in the simulation is then initialized with a single non-zero (unit) amplitude, with the index in the quantum state vector defined by the binary representation of the input exponent and the fractional part of the mantissa. With the rounding mode fixed at rounding down to the nearest representable number, the intended output can be easily computed before the quantum circuit is simulated. In effect, this defines the index of the single non-zero (unit) amplitude of the output quantum state that should be returned if the circuit is correct. Upon finalizing the quantum computer simulation, the actual quantum state vector obtained is compared against the previously computed required output.

For this verification to be meaningful, the following range of possible inputs and outputs was considered: (i) input and output are both normalized numbers, (ii) input is a normalized number and output is a sub-normal number, (iii) input is a sub-normal number and the result is truncated to 0, (iv) input is a normalized number, with output overflow. For $N_M = 3$ and $N_E = 3$, Table 3 summarizes the input and output states for examples of each of the 4 categories considered. For the initial and output states, the single non-zero amplitudes are shown. Since the simulator employed here stores the full $2^{n_q}$ state vector for $n_q$ qubits, only circuits with $\le 28$ qubits were considered as a result of limited computational resources and the large number of cases considered ($> 100$). For the squaring operation, $N_M \in [3, 6]$ and $N_E \in [3, 4]$ were considered, while for multiplication the range of $N_M$ needed to be reduced, i.e. $N_M \in [3, 4]$.

Table 3. Results from quantum circuit simulation for a representative range of inputs (squaring). Columns: Input, Initial state, Output state.
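The verification procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the C++ simulator of [4]: the bit ordering (most-significant bit first), the storage of one double-precision complex number (16 bytes) per amplitude, and all function names are assumptions.

```python
import numpy as np


def basis_index(bits):
    """Index of a computational-basis state; bits are listed MSB first (assumed)."""
    idx = 0
    for b in bits:
        idx = (idx << 1) | b
    return idx


def state_vector_bytes(n_q, bytes_per_amplitude=16):
    """Memory needed to store the full 2**n_q state vector."""
    return (1 << n_q) * bytes_per_amplitude


def has_single_unit_amplitude(state, expected_index, tol=1e-12):
    """True if all probability sits on expected_index."""
    probs = np.abs(state) ** 2
    return abs(probs[expected_index] - 1.0) < tol and abs(probs.sum() - 1.0) < tol


# Toy 5-qubit register initialized to a single basis state
state = np.zeros(2 ** 5, dtype=complex)
state[basis_index([0, 1, 1, 0, 1])] = 1.0
print(has_single_unit_amplitude(state, 13))      # True
print(state_vector_bytes(28) / 1024 ** 3)        # 4.0 (GiB)
```

Under the 16-byte assumption, a 28-qubit state vector already occupies 4 GiB, which illustrates why full state-vector simulation restricts the circuit sizes that can be checked over many cases.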

Complexity analysis
Before analyzing the quantum circuits introduced here in terms of complexity, first the choice of N M and N E for representing realistic flow fields is considered.

Representing Taylor-Green vortex flow
In a two-dimensional flow field, the non-linear terms appearing in the Navier-Stokes equations, shown in Eq. (4), involve the squares of the velocity components in the $x$- and $y$-directions, i.e. $u^2$ and $v^2$, as well as the product $uv$. Here, the example flow field defined by the two-dimensional Taylor-Green vortex is considered, where velocity and pressure are defined in a square domain $[0, 2\pi]^2$ with periodic boundary conditions, as given in Eq. (10). Considering a $100 \times 100$ uniform mesh, the effect of representing the flow field variables with a reduced-precision floating-point format is analyzed first.
Flow variables defined in Eq. (10) are in the range $[-1, 1]$, so that by increasing $N_E$ from 3 to 4, far fewer sub-normal numbers are used to represent the flow field. As a result, removing the sub-normal number capability (as shown in the bottom half of Table 4) results in smaller errors for $N_E = 4$. For realistic applications of the proposed quantum floating-point format, the relatively small overhead incurred by introducing sub-normal numbers in the quantum circuits clearly suggests that sub-normal numbers should be included.
For $N_E = 4$, the representation of $u^2$ and $|uv|$ is considered. Specifically, the error shown is that introduced by the multiplication: the difference between the 'exact' product of the reduced-precision representations of $|u|$ and $|v|$ and the corresponding reduced-precision representation of the products is shown in Table 5. The results highlight that although sub-normal numbers played a relatively smaller role in representing the velocity components, in the computation of the nonlinear terms the inclusion of sub-normal numbers is more important for the minimization of approximation errors.
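The kind of comparison discussed above can be explored with the following classical Python sketch. It rests on several assumptions that are not taken from the chapter: an IEEE-like exponent bias of $2^{N_E - 1} - 1$, the textbook Taylor-Green velocity field $u = \cos x \sin y$, $v = -\sin x \cos y$ without time dependence, a mesh that excludes the periodic end point, and a root-mean-square definition of the $L_2$ norm. The printed values are therefore not expected to reproduce Tables 4 and 5; the sketch only shows how such errors can be measured.

```python
import numpy as np


def quantize(x, n_m, n_e, subnormal=True):
    """Round |x| down to n_m mantissa bits (leading bit included) and n_e
    exponent bits. Assumes bias = 2**(n_e - 1) - 1; overflow is not handled,
    since the inputs used here lie in [0, 1]."""
    x = np.abs(np.asarray(x, dtype=float))
    e_min = 2 - 2 ** (n_e - 1)            # exponent of the smallest normal number
    _, k = np.frexp(x)                    # x = f * 2**k with 0.5 <= f < 1
    e = np.maximum(k - 1, e_min)          # sub-normal values share exponent e_min
    step = 2.0 ** (e - (n_m - 1))         # spacing of representable numbers
    q = np.floor(x / step) * step         # rounding down
    if not subnormal:
        q = np.where(x < 2.0 ** e_min, 0.0, q)   # flush sub-normal values to 0
    return q


def error_norms(exact, approx):
    """Maximum and root-mean-square error."""
    err = np.abs(exact - approx)
    return err.max(), np.sqrt(np.mean(err ** 2))


n = 100
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.abs(np.cos(X) * np.sin(Y))         # absolute values: no sign bit
v = np.abs(np.sin(X) * np.cos(Y))

n_m, n_e = 3, 4                           # hypothetical format for this example
for sub in (True, False):
    qu, qv = quantize(u, n_m, n_e, sub), quantize(v, n_m, n_e, sub)
    rep_err = error_norms(u, qu)          # error of representing |u|
    prod = qu * qv                        # 'exact' product of reduced inputs
    mul_err = error_norms(prod, quantize(prod, n_m, n_e, sub))
    print(f"sub-normal={sub}: |u| {rep_err}, |uv| {mul_err}")
```

Because flushing to zero can only move a rounded-down value further from the exact one, the errors without sub-normal numbers are never smaller than those with sub-normal numbers in this model.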

Mantissa multiplication step
The QFT and inverse QFT used here involve $2N_M$ qubits, so that the complexity in terms of two-qubit (controlled-phase) gates scales as $N_M^2$, where the well-known complexity of the standard QFT implementation is used. The complexity of the phase-addition steps involved in the multiplication is detailed in Table 6. The number of two-qubit gates can be seen to scale as $N_M^2$, while the number of three-qubit gates shows an $N_M^3$ scaling.
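The quadratic scaling quoted for the QFT follows from the textbook circuit, which uses $n(n-1)/2$ controlled-phase gates on $n$ qubits, with smallest rotation angle $2\pi/2^n$. The short Python sketch below evaluates these two quantities for a register of $2N_M$ qubits. The function names are illustrative, and the counts refer to the textbook QFT only, not to the phase-addition gate counts of Table 6.

```python
import math


def qft_cphase_count(n_qubits):
    """Controlled-phase gates in the textbook QFT (final swaps not counted)."""
    return n_qubits * (n_qubits - 1) // 2


def qft_smallest_angle(n_qubits):
    """Smallest controlled rotation angle in the textbook QFT, in radians."""
    return 2.0 * math.pi / 2 ** n_qubits


for n_m in range(3, 7):
    n = 2 * n_m                            # QFT acts on 2*N_M qubits
    print(n_m, n, qft_cphase_count(n), qft_smallest_angle(n))
```

For $N_M = 3$ this gives 15 controlled-phase gates per QFT, and for $N_M = 6$ it gives 66, consistent with $N_M(2N_M - 1)$.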

Computation of exponent
The QFT and inverse QFT used here involve $N_E$, $N_E + 1$ and $N_E + 2$ qubits, representing a smaller complexity than the QFT used in the mantissa multiplication. The main contributions to the complexity of the exponent computation stem from the modulo and full adders, which involve a number of qubits scaling linearly with $N_E$. The polynomial complexity in terms of qubits for the adders implemented here is shown in Table 7.
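The principle behind QFT-based addition can be illustrated with a small state-vector calculation in Python. The sketch below adds a classical constant `a` to a basis state $|b\rangle$ by applying the QFT, a diagonal phase operator and the inverse QFT. It is a simplified illustration under stated assumptions: in the circuits of this chapter the addend is held in a quantum register and the phase rotations are controlled by its qubits, and the dense-matrix QFT and function names used here are not taken from the chapter.

```python
import numpy as np


def qft_matrix(n):
    """Dense unitary matrix of the QFT on n qubits."""
    N = 2 ** n
    k = np.arange(N)
    return np.exp(2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)


def phase_add(b, a, n):
    """Return (b + a) mod 2**n, computed by addition in Fourier space."""
    N = 2 ** n
    state = np.zeros(N, dtype=complex)
    state[b] = 1.0                                        # basis state |b>
    F = qft_matrix(n)
    state = F @ state                                     # QFT
    state *= np.exp(2j * np.pi * a * np.arange(N) / N)    # phase addition of a
    state = F.conj().T @ state                            # inverse QFT
    return int(np.argmax(np.abs(state)))


print(phase_add(5, 6, 3))   # 3  : modulo addition on 3 qubits, (5 + 6) mod 8
print(phase_add(5, 6, 4))   # 11 : one extra qubit holds the carry
```

The two calls show the difference between a modulo adder and a full adder in this picture: with one additional qubit in the register, the sum no longer wraps around.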

Discussion
The quantum circuits presented here for squaring and multiplying floating-point numbers in the proposed format show that accounting for sub-normal numbers and under/overflow requires an additional number of multi-qubit controlled-NOT gates. However, for the examples analyzed, a polynomial dependence on $N_M$ and $N_E$ was observed. This means that in terms of quantum-algorithm complexity this implementation has the desired efficiency. This overhead is relatively small compared to the circuits used for mantissa multiplication, which highlights that for most applications it is desirable to include the capability of using sub-normal numbers and to provide under/overflow protection in the quantum circuits. The analysis in this section also shows that for a realistic application, a well-considered scaling of the governing equations to $O(1)$ variables is even more important here than in classical implementations using IEEE single- or double-precision arithmetic. Using the limited number of qubits available on current and near-term quantum computers ($< 100$), the proposed approach to introducing non-linearity is a good candidate in cases where $N_M$ and $N_E$ can be chosen significantly smaller than in equivalent classical floating-point representations.

Conclusions
The challenges associated with representing non-linear differential equations in terms of quantum circuits were discussed in this chapter. In this work, a new approach for representing product terms in nonlinear equations, suitable for near-term (e.g. NISQ generation) quantum computers, was proposed. A key aspect discussed is the (temporary) representation of the variables in the computational basis. Furthermore, the use of a suitably chosen floating-point format was detailed. The importance of including sub-normal numbers, such as defined in the IEEE 754 standard for floating-point arithmetic on classical computers, was demonstrated. Based on the current findings, a number of suggestions for further work can be put forward. The presented circuits performed arithmetic for a single set of input data, i.e. equivalent to data for a single point in a computational domain. Extending the approach to a multi-dimensional computational mesh is a first step to consider. A complexity analysis will be needed to assess the potential speed-up relative to classical discretization approaches for the considered equations. A further step involves investigating how the proposed approach can be made part of a larger quantum algorithm, where a mix of amplitude-based encoding and computational-basis encoding occurs. A key aspect is therefore the development of efficient quantum circuits to perform the required conversions between the two different encoding approaches. Finally, further work is needed to establish how the approach presented here can be used in a wider range of quantum computing applications.

Figure 2. Quantum circuit used to compute square of mantissa (for $N_M = 3$ and $N_E = 3$).

Figure 3. Quantum circuit used to obtain exponent for squaring operation ($N_M = 3$ and $N_E = 3$).

Figure 4. Quantum circuits used in obtaining output mantissa for squaring operation, including sub-normal numbers and underflow/overflow protection ($N_M = 3$ and $N_E = 3$).

Figure 5. Quantum circuit used to obtain exponent for squaring operation, including sub-normal numbers and under/overflow protection ($N_M = 3$ and $N_E = 3$).

Figure 4 illustrates the required changes for $N_M = 3$ and $N_E = 3$. Two additional qubits are needed. Qubit $|isub\rangle = |0\rangle$ is used as an indication that the result is a sub-normal number. Qubit $|icut\rangle = |0\rangle$ is similarly used to define cases with output truncated to 0. Both qubits are initialized to $|1\rangle$. Then, before the mantissa multiplication step takes place, a first modification is introduced, shown on the left-hand side of Figure 4. For $N_E = 3$, only inputs with exponent $|000\rangle$ will need truncating to 0, as shown by the first 4-qubit controlled-NOT gate flipping $|icut\rangle$ to $|0\rangle$. For $N_E = 3$, inputs with exponent $|001\rangle$ are guaranteed to lead to a sub-normal output (or 0), and for these cases $|isub\rangle$ is set to $|0\rangle$, using the second 4-qubit controlled-NOT gate with $|isub\rangle$ as target. The mantissa-multiplication step shown in Figure 2 remains unchanged (i.e. qubits $|isub\rangle$ and $|icut\rangle$ are not used). The next required modification relates to the 'copying' of the result of the mantissa multiplication to output register $|imr1|imr0\rangle$ and the application of increments to the output exponent. The additional logic needed is

Figure 6. Quantum circuit used to set output mantissa for squaring operation, including sub-normal numbers and underflow/overflow protection ($N_M = 4$ and $N_E = 3$).

Figure 7. Quantum circuit used in multiplying the mantissa of two input numbers ($N_M = 3$ and $N_E = 3$).

Table 4. Approximation errors in Taylor-Green vortex flow field due to reduced-precision floating-point representation. $L_\infty$ and $L_2$ norms of errors relative to IEEE double-precision representation for velocity ($u$) and pressure ($p$) for different $N_M$ and $N_E$. $100 \times 100$ uniform mesh.

Table 5. Approximation errors of velocity products in Taylor-Green vortex flow field due to reduced-precision floating-point representation. $L_\infty$ and $L_2$ norms of errors relative to IEEE double-precision representation for the velocity products ($u^2$ and $uv$) for different $N_M$ and $N_E$. $100 \times 100$ uniform mesh.

Table 6. Number of controlled-phase gates (CPHASE) and doubly-controlled-phase gates (C$^2$PHASE) for the phase-addition operator in the quantum multiplier. Also, the smallest rotation angle is shown.

Table 4 summarizes the results, highlighting the importance of including sub-normal numbers in the floating-point representation. Since a sign bit is not used here, the absolute values of $u$, $v$, $p$ were actually used.

Table 7. Number of controlled-phase gates (CPHASE) in the phase-addition step for the modulo adder (MADD) and full adder (FADD). Also, the smallest rotation angle is shown.
Quantum Computing and Communications