Analogy between Hamilton-Jacobi’s classical mechanics and geometrical optics.
It is well known that, by taking a limit of Schrödinger’s equation, we may recover Hamilton-Jacobi’s equation, which governs one of the possible formulations of classical mechanics. Conversely, we may start from Hamilton-Jacobi’s equation and, by using a lifting principle, reach a set of nonlinear generalized Schrödinger’s equations. The classical Schrödinger’s equation then occurs as the simplest equation in the set.
- Schrödinger’s equation
- Hamilton-Jacobi’s equation
- correspondence principle
- lifting principle
Schrödinger’s equation is the fundamental equation of quantum mechanics. Using a correspondence principle, we may recover the classical limit of mechanics in the form of Hamilton-Jacobi’s equation. This is a top-down process, from a general theory to a restricted limiting theory, i.e. from quantum mechanics to classical mechanics. We may use another principle, which I call a lifting principle, which, starting from Hamilton-Jacobi’s equation, allows one, through a bottom-up process, to reach a set of generalized Schrödinger’s equations encompassing nonlinear terms. From this generalized set, we may turn back to a top-down process. In a first step, we recover the classical Schrödinger’s equation as, in some sense, the simplest equation in the set and, in a second step, we recover classical mechanics from quantum mechanics, again using a correspondence principle.
The chapter is organized as follows. Section 2 recalls Hamilton-Jacobi’s equation of classical mechanics which, in the present chapter, may be viewed as a pivotal equation, both the end of a top-down process and the beginning of a bottom-up process. Section 3 exemplifies a way to obtain Schrödinger’s equation by using an analogy relying on Hamilton-Jacobi’s equation. Section 4 expounds the bottom-up process from Hamilton-Jacobi’s equation to a set of generalized Schrödinger’s equations. Section 5 provides a complementary discussion while Section 6 is a conclusion.
2. Hamilton-Jacobi’s formulation of classical mechanics
We know that classical mechanics can be expressed in four different formulations, which are mathematically and empirically equivalent. These are Newton’s, Lagrange’s, Hamilton’s and Hamilton-Jacobi’s formulations. In the present chapter, we rely on Hamilton-Jacobi’s formulation, see for instance Louis de Broglie , Blotkhintsev , Landau and Lifchitz , and Holland . This formulation of the nonrelativistic classical mechanics of a material point relies on an equation, which I shall call Hamilton-Jacobi’s equation, reading as:
This equation allows one to study the motions of a particle of mass in a potential . The ’s denote Cartesian coordinates and is the time. The field is a real field that I shall call the Jacobi’s field. Eq. (1) has to be complemented by two other equations reading as:
in which is the energy and is the momentum. From Eq. (2), we see that is an action (energy multiplied by time) and, from now on, we may call it the action. Also, inserting Eqs. (2) and (3) in Eq. (1), we see that we obtain , which should be enough to convince us of the equivalence between Newton’s and Hamilton-Jacobi’s formulations. For a conservative motion, the energy (that we denote in that case) is constant along each particular motion, and Eq. (2) implies:
We now consider the locus of the points for which possesses a given value :
Eq. (6) shows that the locus is a time-independent surface. There is one surface, and only one, containing a point of space, according to . The whole space is therefore filled by a set of motionless surfaces forming what I call the Jacobi’s static field. From Eqs. (3) and (4), we have:
Therefore, is the gradient of (and of ). This means that trajectories are orthogonal to the surfaces (and to the surfaces ). Next, we consider the locus of the points for which the action possesses a given value :
Eq. (8) shows that the locus is still a surface, but one which now depends on time. As time goes on, the surface moves and, in general, is deformed. For a given time , the moving surface coincides with a motionless surface , according to, from Eq. (4): . Therefore, as time goes on, the moving surface sweeps over all motionless surfaces .
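As a compact reference, the relations described in words in this section may be written as follows in standard textbook notation. The symbols $S$ (action), $S_0$ (static part of the action), $m$, $V$, $x_j$, $E$, $p_j$, $C$ and $C'$ are assumptions made here for illustration, and the block is only a sketch of the structure of the argument.

```latex
\begin{align*}
-\frac{\partial S}{\partial t} &= \frac{1}{2m}\sum_{j}\left(\frac{\partial S}{\partial x_j}\right)^{2} + V
  && \text{(Hamilton-Jacobi)} \\
E &= -\frac{\partial S}{\partial t}, \qquad p_j = \frac{\partial S}{\partial x_j}
  && \text{(energy and momentum)} \\
E &= \frac{p^{2}}{2m} + V
  && \text{(after insertion)} \\
S(x_j,t) &= S_0(x_j) - E\,t, \qquad \mathbf{p} = \nabla S = \nabla S_0
  && \text{(conservative motion)} \\
S_0(x_j) &= C, \qquad S(x_j,t) = C' \;\Longleftrightarrow\; S_0(x_j) = C' + E\,t
  && \text{(static and moving surfaces)}
\end{align*}
```

In this notation, the last line makes explicit why a moving surface of constant action coincides, at each time, with one of the motionless surfaces, and why it sweeps over all of them as time goes on.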
We now consider a fictitious point P, pertaining to the surface , and therefore moving with it, with the constraint that its displacement remains orthogonal to the swept surfaces . The velocity of the moving surface may then be defined as:
in which is an infinitesimal displacement of the point P. But we have:
that is to say:
But (modulus: ) is collinear with (modulus: ). Hence, with positive, we obtain:
We then remark that Newton’s formulation relies on the existence of trajectories while Hamilton-Jacobi’s formulation relies both on trajectories and on a field filling the space. Hamilton-Jacobi’s formulation is the first one in which the motion of a localized object has been associated with a space-filling field. In other words, Hamilton-Jacobi’s formulation is nonlocal. This nonlocality actually anticipates the nonlocality of quantum mechanics, and the space-filling field likewise anticipates a space-filling field of quantum mechanics. It has furthermore been argued that Newton’s and Hamilton-Jacobi’s formulations, although empirically equivalent, are ontologically contradictory, representing an example of the Duhem-Quine ontological underdetermination of theory by experience [5, 6].
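The velocity of the moving surfaces can be illustrated numerically. The sketch below assumes the standard result that, for a conservative motion, a surface of constant action advances with speed u = E/|p|, where |p| = sqrt(2m(E - V)), which follows from dS = -E dt + p ds = 0 along a displacement orthogonal to the surfaces. The function names and the unit system are hypothetical choices made for illustration.

```python
import math


def momentum_modulus(energy, potential, mass):
    """|p| = sqrt(2m(E - V)), from E = p^2/(2m) + V (classically allowed region only)."""
    kinetic = energy - potential
    if kinetic <= 0.0:
        raise ValueError("E - V must be positive away from turning points")
    return math.sqrt(2.0 * mass * kinetic)


def surface_speed(energy, potential, mass):
    """Speed u = E/|p| of a surface S = const (assumed standard result)."""
    return energy / momentum_modulus(energy, potential, mass)


def particle_speed(energy, potential, mass):
    """Speed v = |p|/m of the particle itself."""
    return momentum_modulus(energy, potential, mass) / mass


if __name__ == "__main__":
    mass, energy = 2.0, 9.0  # arbitrary illustrative values
    for potential in (0.0, 5.0):
        u = surface_speed(energy, potential, mass)
        v = particle_speed(energy, potential, mass)
        # The product u * v equals E/m wherever the motion is allowed.
        print(potential, u, v, u * v)
```

With these assumptions the product of the two speeds is E/m everywhere, so the action surfaces move faster where the particle moves more slowly: the field and the trajectory are distinct objects, as stressed above.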
3. Guessing Schrödinger’s derivation
Strictly speaking, there is no derivation of Schrödinger’s equations but a variety of guessing approaches, with different flavors depending on the preferences of the authors. Basically, however, Schrödinger’s equation has been introduced in [7, 8] under its stationary form and in  under its time-dependent form. An English translation is available from  and a French translation from . The derivation relies on an analogy between Hamilton-Jacobi’s formulation of classical mechanics and geometrical optics. As is rather usual when something new is presented for the first time, Schrödinger’s argument is more complicated than necessary. For instance, it relies on the use of non-Cartesian coordinates and on a non-Euclidean interpretation of the configuration space, requiring the use of covariant and contravariant components of vectors (more generally, of tensors), which may be unfamiliar to some readers. Feynman even commented that some arguments invoked by Schrödinger are erroneous . Without showing any disrespect to Schrödinger’s work, I prefer to present a more recent exposition extracted from Winogradski  who defended her thesis under the supervision of Louis de Broglie.
We begin with scalar wave optics and with the corresponding wave equation reading as:
in which is the velocity of the wave . We may also introduce the refractive index of the medium according to in which is the speed of light. We now consider a steady medium () which may support monochromatic waves of angular frequency , reading as:
Because and are, in general, complex fields, we set:
In these expressions, is a complex amplitude, a real amplitude, and are phases. We may then introduce the wave-number vector reading as:
The wave-number is defined as and the wave-length is defined by . Also, we have:
If the medium, besides being steady, is homogeneous (), the wave equation admits plane wave solutions reading as:
in which are constant quantities, and becomes the spatial period of the wave along the direction of propagation.
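The plane-wave statement can be checked numerically. The sketch below assumes a one-dimensional scalar wave equation with wave speed c/n and a plane wave exp(i(kx - ωt)); it estimates the residual of the wave equation by central differences and shows that it vanishes (up to discretization error) only when k = nω/c. The function names, the unit choice c = 1 and the step size are hypothetical.

```python
import cmath


def plane_wave(x, t, k, omega):
    """Plane wave exp(i(k x - omega t)) with unit amplitude."""
    return cmath.exp(1j * (k * x - omega * t))


def medium_wave_number(omega, refractive_index, c=1.0):
    """Wave-number k = n * omega / c in a steady homogeneous medium."""
    return refractive_index * omega / c


def wave_equation_residual(x, t, k, omega, speed, h=1e-3):
    """Central-difference estimate of d2psi/dx2 - (1/speed^2) d2psi/dt2."""
    centre = plane_wave(x, t, k, omega)
    d2x = (plane_wave(x + h, t, k, omega) - 2 * centre
           + plane_wave(x - h, t, k, omega)) / h**2
    d2t = (plane_wave(x, t + h, k, omega) - 2 * centre
           + plane_wave(x, t - h, k, omega)) / h**2
    return d2x - d2t / speed**2


if __name__ == "__main__":
    omega, n, c = 2.0, 1.5, 1.0      # illustrative values, c = 1 units
    k_good = medium_wave_number(omega, n, c)
    k_bad = omega / c                # ignores the refractive index
    for k in (k_good, k_bad):
        print(k, abs(wave_equation_residual(0.3, 0.7, k, omega, c / n)))
```

The spatial period of the wave along the direction of propagation is then 2π/k, which is the wave-length used in the discussion of geometrical optics.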
We are now sufficiently equipped to turn to a discussion of geometrical optics, which is an approximation to wave optics. This approximation is valid whenever the optical wave approximately behaves as a plane wave over a distance of the order of the wave-length , that is to say when and are approximately constant over . Equivalently, we may take the limit . There is a rigorous but tedious way to take this limit by examining the relative variations of and over , in the direction , relying on Taylor expansions. I shall instead use heuristic but sufficiently convincing arguments, which furthermore lead to the correct results. Because is approximately a constant, Eq. (23) reduces to:
Furthermore, because is approximately a constant too, Eq. (24) reduces to an identity. Therefore, Eq. (26) is the geometrical optics version of wave optics. Eqs. (23) and (24), i.e. two equations, have collapsed into a single one. We observe that Eq. (26) contains the phase , but no longer contains the amplitude . This means that the concept of amplitude has no meaning, in the strict sense defined by the above derivation, in geometrical optics (this does not prevent one from building geometrical optics models using the concept of amplitude).
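The collapse of two equations into one can be followed in a short worked computation. The sketch below assumes a monochromatic wave with complex amplitude $A = a\,e^{i\varphi}$ ($a$ real) in a steady medium of refractive index $n$; the symbols $A$, $a$, $\varphi$, $n$, $\omega$ and $c$ are notational assumptions made here for illustration.

```latex
\begin{align*}
\Delta A + \frac{n^{2}\omega^{2}}{c^{2}}\,A &= 0, \qquad A = a\,e^{i\varphi}
  && \text{(stationary wave equation)} \\
\Delta a - a\,(\nabla\varphi)^{2} + \frac{n^{2}\omega^{2}}{c^{2}}\,a &= 0
  && \text{(real part)} \\
2\,\nabla a\cdot\nabla\varphi + a\,\Delta\varphi &= 0
  && \text{(imaginary part)} \\
(\nabla\varphi)^{2} &= \frac{n^{2}\omega^{2}}{c^{2}}
  && \text{(real part with } a \text{ approximately constant)}
\end{align*}
```

When both $a$ and $\nabla\varphi$ are approximately constant, each term of the imaginary part vanishes separately, so that it carries no information, while the real part retains the phase only.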
Now, similarly as for and , and are equiphase surfaces satisfying the following obvious analogous results. The locus of the points for which possesses a given value , i.e. , is a time-independent equiphase surface. There is one surface, and only one, containing a point of space, given by . The whole space is therefore filled by a set of motionless surfaces forming the static phase field. The trajectories orthogonal to these surfaces are called rays. The locus of the points for which possesses a given value , i.e. , is a time-dependent equiphase surface. For a given time , the moving equiphase surface coincides with a motionless equiphase surface . When time goes on, the moving surface sweeps over all motionless surfaces .
Assembling the results obtained for the conservative Hamilton-Jacobi’s classical mechanics and for geometrical optics, we obtain a remarkable analogy exhibited in Table 1.
|Classical mechanics||Geometrical optics|
This analogy was discovered by Hamilton about one century (!) before its use in the discovery of Schrödinger’s equations, see Refs. [14, 15], references therein and prior references from Hamilton. Formally, we may express the same structure by using a mechanical language or an optical language. Each language may be translated into the other by using a dictionary D exhibited in Table 2, where the newly introduced constant has the dimension of an action.
An analogy is not necessarily significant but any analogy should be, at least tentatively, taken seriously. If the analogy is fully meaningless, then the value of the constant does not matter, and any value for would do.
which we call de Broglie, or Einstein-de Broglie relations. Eq. (28) expresses an equivalence between momentum (mechanical language) and wave-number (optical language), while Eq. (29) expresses an equivalence between energy (mechanical language) and angular frequency (optical language).
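The dictionary between the mechanical and the optical languages can be turned into numbers. The sketch below assumes that the constant of the dictionary is the reduced Planck constant, so that momentum equals ħ times wave-number and energy equals ħ times angular frequency; the numerical values of the constants are standard SI and CODATA values, and the function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34            # J s (exact in the SI)
HBAR = PLANCK_H / (2.0 * math.pi)    # reduced Planck constant, J s
ELECTRON_MASS = 9.1093837015e-31     # kg (CODATA 2018)
ELECTRON_VOLT = 1.602176634e-19      # J (exact in the SI)


def wave_number(momentum):
    """Mechanical to optical: k = p / hbar."""
    return momentum / HBAR


def angular_frequency(energy):
    """Mechanical to optical: omega = E / hbar."""
    return energy / HBAR


def wavelength(momentum):
    """Wave-length 2*pi/k associated with a momentum p."""
    return 2.0 * math.pi / wave_number(momentum)


if __name__ == "__main__":
    # Nonrelativistic electron with 100 eV of kinetic energy (illustrative choice).
    kinetic = 100.0 * ELECTRON_VOLT
    momentum = math.sqrt(2.0 * ELECTRON_MASS * kinetic)
    print(wavelength(momentum))          # metres
    print(angular_frequency(kinetic))    # rad/s
```

For this electron the wave-length is of the order of one tenth of a nanometre, which illustrates how small the constant must be for the analogy to remain invisible at the classical scale.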
The situation we are facing is now sketched in Figure 1 below. First, we possess an analogy between Hamilton-Jacobi’s classical mechanics and geometrical optics, expressed by a dictionary D. Second, geometrical optics is an approximation to scalar wave optics. Figure 1 then exhibits three filled rectangles, and we may feel intuitively but clearly that something is lacking, corresponding to the fourth empty rectangle. To fill this rectangle, we apply the dictionary D to wave optics. From the dictionary of Table 2, with , we have:
We may then translate Eq. (22) to:
which is exactly the time-independent (stationary) Schrödinger’s equation. Therefore, Eq. (16) is translated to:
and we readily establish that also satisfies Eq. (31) that we better rewrite as:
which is the general time-dependent Schrödinger’s equation. Invoking the “simplest” way to obtain Eq. (34) rules out awkward expressions such as the one obtained by deriving Eq. (32) twice with respect to time, i.e.:
4. Deriving a set of generalized Schrödinger’s equations
There are good reasons to believe that classical mechanics is suspect. One of them is the existence of singularities in classical mechanics such as exhibited in the mechanical rainbow [16, 17]. If we trust a non-singularity principle stating that “local infinity in physics is not admissible” , we arrive at the conclusion that we must build a wave mechanics (nowadays better known as “quantum mechanics”). For this, we decide to start from what we know (actually what we are supposed to know), namely classical mechanics. We are looking for a wave mechanics based on a wave which should have the virtue of washing out the singularities exhibited by classical mechanics. The most general form for a wave reads as:
in which is a complex dimensionless phase. At this stage, our knowledge is supposed to be very limited. We only possess one field for classical mechanics and two fields and for wave mechanics. These fields are the only quantities involved in the problem. Therefore, we have to search for a relationship between and (first option), or between and (second option). Because and possess the same nature (they are fields without being waves), I prefer the second option. Of course, the first option is likely to be valid as well, but it would certainly lead to more complicated derivations and equations.
For the relationship between and , we could search for or for . Because wave mechanics () is assumed to be more general than classical mechanics (), it is apparent that we had better try to determine rather than the inverse version . We therefore have to explicitly consider . However, this is to be slightly corrected. Indeed, is dimensionless while is an action (the action). This will require us to introduce a new constant, that will be denoted .
Now, I invoke a principle that I call the lifting principle (to be commented on further once the demonstration is completed). This principle tells us something very simple, which may even look somewhat tautological: classical mechanics is an approximation to wave mechanics. Rather than simply using the argument in , we then have to look for a function in which the functional argument reads as:
in which is a constant having the dimension of an action, is a correcting function, and is a small parameter. To recover classical mechanics from wave mechanics, we shall have to take the limit so that, the constant being dismissed, we are left with the field (and with its equation). Also, we can take . Indeed, if were complex, it would exhibit a phase factor which could be absorbed in . Similarly, the prefactor “” which is introduced for convenience could be absorbed in
The function may be explicitly written as:
in which we used a subscript to insist on the fact that depends on . Eq. (39) may give the feeling that we are dealing with a restricted first-order perturbation approach. However, instead of Eq. (38), let us assume:
This can be rewritten as:
which, relabelling, identifies with Eq. (38).
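The relabelling step can be made explicit. The sketch below assumes that the alternative form is a power series in the small parameter, and writes the leading term generically as $S/\kappa$, omitting any convenience prefactor; the symbols $\kappa$, $\varepsilon$, $f_n$ and $F_\varepsilon$ are placeholders introduced here for illustration.

```latex
\begin{align*}
\frac{S}{\kappa} + \varepsilon f_1 + \varepsilon^{2} f_2 + \varepsilon^{3} f_3 + \cdots
 &= \frac{S}{\kappa} + \varepsilon\left(f_1 + \varepsilon f_2 + \varepsilon^{2} f_3 + \cdots\right) \\
 &= \frac{S}{\kappa} + \varepsilon F_{\varepsilon},
 \qquad F_{\varepsilon} = \sum_{n\ge 1} \varepsilon^{\,n-1} f_n .
\end{align*}
```

Under this assumption, a correcting function that is itself allowed to depend on the small parameter already contains all higher orders, so the first-order appearance of the functional argument is not a restriction.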
We are now looking for a differential equation satisfied by the wave , involving partial derivatives with respect to and . This equation must be fundamental, that is to say it must contain lowest-order derivatives compatible with the constraints imposed by the problem under study. Once the fundamental equation is obtained, we can of course generate other equations by further differentiating with respect to and , but such extra-equations are said to be non-fundamental.
We begin with the assumption that, besides derivatives with respect to , the wave equation only contains the first derivative with respect to time. We shall later comment on the use of higher-order derivatives with respect to time.
The derivative may always be written as:
in which we again use a subscript to insist on the dependence on . Also, is an extra-field (i.e. a function of time and space, but not a dynamical field possessing its own differential equation), possibly a constant, and represents a set of arguments formed from various derivatives of with respect to :
The set is infinite and there is a systematic way to generate all arguments of the set. For instance, the subset generated by contains , , , and other arguments obtained by using complex conjugations.
We rewrite Eq. (44) as:
or, invoking Eq. (42):
But, Hamilton-Jacobi’s equation (and the lifting principle) implies that the r.h.s. of Eq. (46) must contain a term with no derivative associated with in Eq. (1), and a term involving , associated with the first term in the r.h.s. of Eq. (1). These terms have to be involved in the function . Upon investigation, we find that the term involving can only be generated by which indeed is found to be:
We therefore set, without any loss of generality:
in which is a complementary function, possibly including non-linear terms, and which also could possibly annihilate the terms and if, eventually, we would find that they should be zero.
The evolution Eq. (42) then takes the form:
and our next task is to evaluate and .
In the classical limit (), Eq. (51) simplifies to:
which must identify with Hamilton-Jacobi’s equation. Under the proviso to be checked later that the r.h.s. of Eq. (52) must be vanishingly small, we then obtain, from the l.h.s.:
in which and therefore reduces to . Eq. (53) implies:
We must now recall that the coefficient has been actually set as a function , and Eq. (50) shows that it must pertain to the wave mechanical level. In other words, it does not pertain to the classical mechanical level, that is to say, as a rational demand, we would not like it to depend on . Therefore, must be a constant that we denote as .
From Eq. (55), we then have:
With (since the first derivative is a constant), Eq. (54) then implies:
Concerning the constant , I have (at least at the present time) no theoretical reason to assign a value to it.
which is indeed in the limit . This implies that is a small action, actually so small that it could not be detected in a classical framework.
Eq. (58) is the main result of this subsection. It provides a set of generalized Schrödinger’s equations, granted that they are evolution equations (first derivative with respect to time), obtained by a deformation of Hamilton-Jacobi’s equation, according to the lifting principle. The classical Schrödinger’s equation is, in a certain sense, the simplest equation in the set. It is obtained by setting the nonlinear term to and to , while the constant identifies with Planck’s constant . This is equivalent to saying that in Eqs. (49) and (50), only the - and -terms in the r.h.s. of the equations, required to match Hamilton-Jacobi’s equation in the classical limit, are retained.
Let us note that the function in Eq. (58) may be significant because it allows one to introduce non-linear wave equations. Non-linear Schrödinger’s equations in quantum theory are considered in many papers in the literature. For example, they are comprehensively discussed by Doebner and Goldin in , and in many references therein. We may also meet such equations in the Bohm-Bub hidden-variables theory , or with the Ghirardi-Rimini-Weber equation for spontaneous collapse of the wave function . More generally, non-linear equations may provide a solution to the measurement problem insofar as linear equations, in utmost rigor, do not allow one to get rid of quantum superpositions. This fact has recently been heavily emphasized by R. Penrose in one of his books . A word of caution is however required, namely that, according to Gisin , “the Schrödinger evolution is the only quantum evolution that is deterministic and compatible with relativity”. Hence, “the fact that a deterministic evolution compatible with relativity must be linear puts heavy doubts on the possibility to solve the measurement problem […] by adding non linear terms to the Schrödinger equation”.
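For the simplest member of the set, the link with Hamilton-Jacobi’s equation can be checked directly. The sketch below assumes the standard form of the linear Schrödinger equation and the substitution $\psi = \exp(iS/\hbar)$, with $S$ allowed to be complex at finite $\hbar$; the symbols and the substitution are assumptions made here for illustration.

```latex
\begin{align*}
i\hbar\,\frac{\partial\psi}{\partial t} &= -\frac{\hbar^{2}}{2m}\,\Delta\psi + V\psi,
 \qquad \psi = \exp\!\left(\frac{iS}{\hbar}\right) \\
i\hbar\,\frac{\partial\psi}{\partial t} &= -\frac{\partial S}{\partial t}\,\psi,
 \qquad
 \frac{\Delta\psi}{\psi} = \frac{i}{\hbar}\,\Delta S - \frac{(\nabla S)^{2}}{\hbar^{2}} \\
-\frac{\partial S}{\partial t} &= \frac{(\nabla S)^{2}}{2m} + V - \frac{i\hbar}{2m}\,\Delta S
\end{align*}
```

In this form, the last term is the only one carrying the constant of action, and it vanishes when that constant tends to zero, leaving Hamilton-Jacobi’s equation.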
5. Complementary discussion
From the generalized Schrödinger’s Eq. (58) we may recover the classical Schrödinger’s equation, as we have commented, by setting , and , leading to:
This is a first application of the correspondence principle. A second application of this correspondence principle afterward allows one to recover the classical Hamilton-Jacobi’s equation from Schrödinger’s equation, as discussed for instance by Blotkhintsev . From the generalized Schrödinger’s equation, we therefore recover the classical Hamilton-Jacobi’s equation by a two-step top-down process, applying the correspondence principle twice. Another approach is to use Eq. (58) as an Ansatz under the form:
and to pursue the game with the correspondence principle to recover, using again a two-step approach, Hamilton-Jacobi’s equation. But the use of an Ansatz is less rigorous than the lifting principle because it carries the risk of making the Ansatz too simple, and therefore of omitting significant terms. Note, however, that we have implicitly made the assumption that the state of the wave is defined by the wave itself so that we have obtained what is called an evolution equation. The use of a second-order derivative with respect to time would require, for integration, that the state be defined by and by its first time derivative (and similar considerations for higher-order derivatives with respect to time) so that the result would not be an evolution equation. Therefore, in utmost rigor, what we have demonstrated is that Schrödinger’s equation is the simplest evolution equation satisfying the lifting principle.
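The notion of an evolution equation can be illustrated with a small numerical check. The sketch below assumes the free (zero potential) linear Schrödinger equation in one dimension, in units where ħ = m = 1, and a plane wave exp(i(kx - ωt)); it estimates the residual of the equation by central differences. The function names, the units and the step size are hypothetical choices.

```python
import cmath

HBAR = 1.0  # assumed units: hbar = 1
MASS = 1.0  # assumed units: m = 1


def plane_wave(x, t, k, omega):
    """Trial wave exp(i(k x - omega t)) for a free particle."""
    return cmath.exp(1j * (k * x - omega * t))


def free_dispersion(k):
    """omega = hbar k^2 / (2m), i.e. E = p^2/(2m) with p = hbar k, E = hbar omega."""
    return HBAR * k**2 / (2.0 * MASS)


def schrodinger_residual(x, t, k, omega, h=1e-3):
    """Estimate of i*hbar*dpsi/dt + (hbar^2/2m)*d2psi/dx2, which is zero for a solution."""
    dpsi_dt = (plane_wave(x, t + h, k, omega)
               - plane_wave(x, t - h, k, omega)) / (2.0 * h)
    d2psi_dx2 = (plane_wave(x + h, t, k, omega) - 2 * plane_wave(x, t, k, omega)
                 + plane_wave(x - h, t, k, omega)) / h**2
    return 1j * HBAR * dpsi_dt + HBAR**2 / (2.0 * MASS) * d2psi_dx2


if __name__ == "__main__":
    k = 3.0
    print(abs(schrodinger_residual(0.4, 0.7, k, free_dispersion(k))))  # small
    print(abs(schrodinger_residual(0.4, 0.7, k, k)))                   # not small
```

Only the first time derivative enters the residual, so the value of the wave at one instant fixes its rate of change: this is what is meant above by the state being defined by the wave itself.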
To clearly emphasize the difference between the correspondence and the lifting principles, let us consider two theories, denoted (standing for “general”) and (standing for “approximate”). By taking some kind of limit on , we must recover , a top-down process () that may be denoted as . We then say that satisfies a correspondence principle with respect to . If is unknown and under construction, any valid candidate, say , … must satisfy the correspondence principle: , …. If it does not, it is not valid and must be rejected. If several valid candidates are retained, then the discrimination among the candidates may need to rely on other considerations, or may even remain undecidable, such as when dealing with the Duhem-Quine underdetermination of theories by experiments. The lifting principle is a bottom-up process (): . It starts from a theory relying on an equation (or a set of equations) which is acknowledged to be valid within a certain domain of applicability and extends this domain of validity by extending the original equation (or set of equations) under conditions defined by physical requirements.
For example, the lifting principle tells us that classical mechanics is an approximation to quantum mechanics. Therefore, quantum mechanics must indeed satisfy a correspondence principle, meaning that the correspondence principle is contained in the lifting principle. However, as we have seen, the two principles are not identical. What we have done to use the lifting principle is to start from and find a way to reach candidates for . However, the word “lifting” may have other meanings, for instance in the theory of nonlinear dynamics when, to study a low-dimensional system, it can be easier to study its elevation in a higher-dimensional system [24, 25]. On the one hand, the higher-dimensional system must satisfy a correspondence principle. On the other hand, it is said that it is obtained as a result of the “lifting” of the low-dimensional system. My choice of the word “lifting” in the context of the present chapter is the result of my borrowing it from the context of chaos theory.
Another point of view may be taken by using a metaphor from Feynman  according to which the correspondence principle proceeds from one object to its shadow (and there is one shadow for one object) while the lifting principle proceeds from a shadow to objects (and there are several possible objects for a given shadow). Our results agree with this expectation. We did not reach Schrödinger’s equation, but rather a set of generalized Schrödinger’s equations. The derivation of Schrödinger, and all Schrödinger-like derivations, reach a single result because they used analogies, guesses and trials, with more or less implicit assumptions. Conversely, the use of the lifting principle simultaneously provides the whole set of admissible possibilities with a minimal number of assumptions (namely that we have to deal with an evolution equation). All candidates are reached in a single step.
The realm of nonlinear Schrödinger’s equations is very rich, with many applications such as to fluid mechanics, solitons, nonlinear optics and Bose-Einstein condensates. In the present chapter, we have demonstrated, using a lifting principle, that such equations occur naturally as a generalization of Hamilton-Jacobi’s formulation of classical mechanics, without however claiming that the nonlinear equations obtained by the lifting process are identical to the nonlinear Schrödinger’s equations used in other contexts (this would require another specific study outside the scope of the present chapter). The material presented in this chapter is extracted from a book, namely . It is however presented here under a single roof and might then attract the interest of other readers.