Open access

Better Unification for Physics in General Through Quantum Mechanics in Particular

Written By

Cynthia Kolb Whitney

Submitted: May 10th, 2011 Published: February 24th, 2012

DOI: 10.5772/35211

1. Introduction

Physics has always had several different domains of application in on-going development, and physicists have always striven for unification among its different domains. Unification is usually achieved through development of so-called ‘covering theories’. In the nineteenth century, the stunning example was Maxwell’s Electrodynamics (MED), which unified electricity and magnetism as one domain of theory. Another major domain of theory then present was Newton’s Mechanics (NM), which in the eighteenth century had really launched modern physics as a mathematical discipline.

At the turn of the twentieth century, NM and MED were well in place, and were fulfilling many technologically important requirements. But there seemed to be an incompatibility between them. The problem concerned their invariance with respect to choice of reference frame: NM exhibited invariance if the allowed reference frames were all connected through Galilean transformations, whereas MED exhibited invariance if the allowed reference frames were all connected through Lorentz transformations. It looked as though one of these two theories must be more nearly correct than the other, but it was not clear which one was the better one.

That problem seemed resolved with the advent of Einstein’s Special Relativity Theory (SRT). SRT was believed to capture the true meaning of MED concerning the behavior of light signals, and SRT was certainly an endorsement of Lorentz transformation, so SRT was believed to offer the one possible revision of NM that could make mechanics fully consistent with MED.

But meanwhile, new phenomena were being discovered at the micro scale of physics, and they often seemed inexplicable with any known theory, whether NM, SRT, or MED. These were phenomena suggesting quantization of light, quantized atomic states, atomic, molecular and crystal structures, radioactivity, etc.

So at almost the same time as one problem seemed to be resolved, other problems were emerging. Since the earlier situation between NM and MED had demanded that Physics allow two seemingly discordant theories to co-exist until some good argument could replace one of them, the situation then presented by the new phenomena being discovered naturally invited the development of another potentially discordant theory: Quantum Mechanics (QM).

The discovery of the photoelectric effect, and the introduction of the idea of the photon, initiated QM. Almost immediately, QM was developed to handle the Hydrogen atom, and the ground state thereof, the stability of which was thought to be impossible with MED. Accepting that apparent incompatibility with MED, and even embracing it, researchers moved on to excited states, to other atoms, then to molecules, and reactions, and to all the rest of the complexity that today makes up modern Quantum Chemistry (QC).

Also, experimenters got into sub-atomic elementary particles, especially electrons and positrons, their annihilation and creation, along with creation and annihilation of photons. All that led to Quantum Electrodynamics (QED).

So today physics still has several different bodies of theory, aimed at several different domains of application. On the one hand, we have QM for atomic and other micro-system interactions. It has at least two identifiable parts: QC for interactions at the level of atoms and molecules, and QED for interactions at the level of elementary particles. And on the other hand, we have Einstein’s relativity theory (RT) for physics at human scale and larger. It too has two parts: SRT for electromagnetic interactions, and general relativity theory (GRT) for gravitational interactions.

QM and RT are the major pillars of twentieth century physics. And they are not entirely compatible. QM features wave-like entities with seemingly instantaneous correlations between the states of even quite distant entities, whereas RT features point-like entities interacting via fields propagating at a finite speed.

So are we defeated in the quest for unification in Physics? Apparently many people hope not, as they do vigorously pursue various forms of unification. The prominent one sought today is Quantum Gravity (QG). It would be the twenty-first century capstone for the two twentieth-century pillars of QM and RT. But it is not yet fully in sight.

In the pursuit of unification, one often sees phrases like ‘Theory of Everything’. The objective of this Chapter is certainly modest by comparison! It just notes some observations about the status of available theories, and discusses the removal of some incompatibilities between the available theories that arose only because of unfortunate choices.

Because QM is relatively new, there are still lots of alternative approaches being developed in parallel. Putz (2009) gives us one very big and recent anthology about them, and this book will give another even more recent one. The QM atmosphere is clearly right for generating new illumination that can facilitate new observations about physics overall.

The first observation driving the present work is just this: QED is arguably the most successful theory that modern Physics possesses. The fact that QED now exists, and that it has the name that it has, naturally raises the question: How could there have been any real disconnect between MED and early QM?

Note that I speak of MED, not of Classical Electrodynamics (CED) in general. CED involves not only the works of Maxwell, but also those of a large number of other individuals. I am inclined to trust results from Maxwell, but question some of those from other authors, as reported in the present work.

It is this author’s belief that Nature is not so perverse. Connections between different domains of theory are still possible to find, even though the diligent search that was conducted a century ago did not find them. We have developed more tools now. Every new tool developed should invite us to revisit the old problems.

Section 2 talks about the photon from the point of view of MED. It explores the implications of the finite energy that characterizes a photon. It finds a plausible model for the photon expressed in terms of MED.

The second observation is just this: If MED can connect better with QM, then shouldn’t SRT also connect better with QM? After all, how much difference can there be between a photon in QM and a light signal in SRT?

Section 3 explores the implications of modeling the light signal in SRT in the same way as the photon in QM. The photon model suggests a slight alteration to Einstein’s second postulate, and thereby produces a slightly altered version of SRT.

The third observation is just this: If SRT is to be altered, however slightly, in response to the photon concept from QM, isn’t it then possible that the revised SRT can be used to better explain some things about QM that presently seem mysterious?

Section 4 talks about what the photon/signal model implies about atoms: the stability of atoms, the occurrence of Planck’s constant.

The fourth observation is this: Much of science works on scaling laws. It is in that spirit that we should look for scaling laws about atoms, and thereby reduce the effort of looking at each element as a particular special-case problem for detailed calculations.

Section 5 talks about what can be inferred about all isotopes of Hydrogen, all elements beyond Hydrogen, and the ions of any element; the possible nature of ‘excited’ atomic states; and the character of the light spectrum that an element produces.

The fifth observation is this: If QM can be better connected to SRT, then where does that leave its relationship with NM? Early QM was basically NM, although not for particles possessing momentum and energy in the classical way, but rather for waves, with an amplitude factor and a phase factor, in the latter of which momentum and energy appeared as variables. Is that formulation now completely outdated on account of a rift between NM and MED?

Section 6 establishes that there was no necessary disconnect even between NM and MED. It argues that, with an adequately extended notation to support an extended tensor calculus, Maxwell’s equations can be seen to be invariant in form, even under Galilean transformation. (It is useful here to distinguish two kinds of invariance: ‘form invariance’ for symbolic equations, and ‘number invariance’ for individual symbols that have numerical values.)

The last observation is the ‘meta’ observation about the present work: Physics in general can become significantly more unified throughout because of some specific developments surrounding QM.

Section 7 summarizes the several specific conclusions implied by the present work. Boiled down to one sentence, these conclusions come to this: the existence of apparent discord between theories that are addressed to different problem domains within Physics sometimes means that there exists a more productive way to pose one or more of the theories involved.


2. Maxwell’s electrodynamics and QM’s photons

It often seems that MED, a theory largely about spatially extended EM fields, has little in common with QM, a theory largely about discrete material systems and the discrete photons that they emit and absorb. Photons are imagined to be the opposite of spatially extended; i.e., localized, like the matter particles that emit and absorb them.

So our mental picture for a photon in its interactions with matter is rather bullet-like: the photon is shot out of a source, travels through space, and hits a receiver that absorbs it. But the travel part of the story is unobservable. So we imagine that the photon in flight is possibly wavelike, in accord with Maxwell theory. Certainly the evidence for that is present, in the form of interference effects, even with small numbers of photons. So the photon is assigned a quality of ‘duality’. This is a rather mysterious way of describing a photon.

What seems missing here is an adequate model for the photon throughout its life history, expressed in terms of EM fields. The purpose of this Section is to develop one.

I like to begin the development of such a history with a waveform consisting of finite energy distributed in a three-dimensional Gaussian peak located very close to a source that has emitted it. This three-dimensional Gaussian peak is limited in all three spatial directions so as to integrate to a finite total energy.

To allow subsequent propagation, the energy has to be divided between two orthogonal fields, electric and magnetic. To allow circular polarization, the energy has to be further divided between real and imaginary parts, real being alive now, and imaginary becoming alive a quarter of an oscillation cycle later.

Given such a start, the whole life history of a photon can then develop in the manner that Maxwell’s equations allow. Describing that development is the objective of the following Sub-Sections.

2.1. Waveform development

The first step in the life history of a photon is its development from a spatially localized energy bundle that is emitted from a source into a spatially extended waveform that travels through space. To help think about this problem, it is useful to recall some phenomenology familiar from physics at a more macroscopic scale.

  1. One phenomenon very well known for light modeled as EM waves is the spreading transverse to the propagation direction known as ‘diffraction’. Diffraction is the result of some sort of limitation transverse to the propagation direction. Historically, the limitation has been due to a finite aperture through which the light propagates. The light spreads out from the aperture, more so the smaller the aperture is. In the photon model discussed here, the limitation is softer than an aperture edge, but a limitation nevertheless: it is the finite spread of the Gaussian waveform in the two directions transverse to the propagation direction. The narrower the Gaussian peak is, the more spread there will be.

  2. The closest familiar analog for longitudinal spreading is known as ‘dispersion’. This word refers to the ‘blurring’ effect that any frequency dependence of the propagation speed through the medium entails. For example, a signal pulse in a medium loses its sharp edges because those sharp edges imply superposition of many different wavelengths, and hence different frequencies, which the medium may affect differently. In Earth’s atmosphere, or ocean, square waves can turn to blob waves because of dispersion.

Let us begin a scenario with a single pulse in $\mathbf{E}$. Let it have a Gaussian profile along the propagation direction, say $x$, with $E \propto \exp(-x^2)$. We can apply Maxwell’s equations, and watch what happens. The Gaussian is the so-called ‘generating function’ for the infinite set of Hermite polynomials, all of which have very regularly spaced zero crossings. What happens is that the single pulse in $\mathbf{E}$ (an even function) generates a double pulse in $\mathbf{B}$ (an odd function), which in turn generates a triple pulse in $\mathbf{E}$ (another even function), and so on; that is, all the derivatives in play generate successively higher-order Hermite polynomials multiplying the Gaussian. Meanwhile, all the $\mathbf{E} \times \mathbf{B}$ Poynting vectors in play support general spreading of the Gaussian. With each step, the emergent functions look more and more like wavelets, and the individual peaks in the wavelets stay about the same width as more of them accrue, so the wavelength for the emergent wavelet becomes more and more defined. Figure 1 illustrates this behavior at the stage where $\mathbf{E}$ has developed five peaks (four zero crossings). Series 1 is the original input Gaussian function, Series 2 is the Gaussian after the overall spreading has developed to this point, and Series 3 is the wavelet that has emerged in the process; i.e. the spread-out Gaussian times the fourth-order Hermite polynomial generated.

Figure 1.

A wavelet develops when an EM pulse is acted upon by Maxwell’s equations.
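The link between repeated differentiation of a Gaussian and the Hermite polynomials can be checked directly. The following is a minimal sketch, assuming the standard identity $d^n/dx^n\, e^{-x^2} = (-1)^n H_n(x)\, e^{-x^2}$ for the physicists' Hermite polynomials; it does not solve Maxwell's equations or model the spreading, and all function names are illustrative only.

```python
# Sketch: repeated differentiation of exp(-x^2) yields Hermite polynomials
# times the same Gaussian. Polynomials are coefficient lists, lowest power
# first. All names here are illustrative, not taken from the chapter.

def poly_derivative(p):
    """Derivative of polynomial p (coefficients, lowest power first)."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def times_x(p):
    """Return x * p(x)."""
    return [0] + list(p)

def poly_sub(p, q):
    """Return p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def gaussian_derivative_poly(n):
    """Polynomial q_n with d^n/dx^n exp(-x^2) = q_n(x) exp(-x^2).

    Uses d/dx [q(x) exp(-x^2)] = (q'(x) - 2 x q(x)) exp(-x^2).
    """
    q = [1]
    for _ in range(n):
        q = poly_sub(poly_derivative(q), [2 * c for c in times_x(q)])
    return q

def hermite(n):
    """Physicists' Hermite polynomial via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = [1], [0, 2]
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, poly_sub([2 * c for c in times_x(h)],
                                [2 * k * c for c in h_prev])
    return h

def evaluate(p, x):
    """Evaluate polynomial p at x."""
    return sum(c * x ** k for k, c in enumerate(p))

def zero_crossings(p, lo=-5.0, hi=5.0, steps=10001):
    """Count sign changes of p on a fine grid between lo and hi."""
    count, prev = 0, evaluate(p, lo)
    for i in range(1, steps):
        cur = evaluate(p, lo + (hi - lo) * i / (steps - 1))
        if prev * cur < 0:
            count += 1
        if cur != 0:
            prev = cur
    return count

if __name__ == "__main__":
    for order in range(5):
        print(order, gaussian_derivative_poly(order), hermite(order))
    print("zero crossings of H_4:", zero_crossings(hermite(4)))
```

The fourth-order case corresponds to the stage shown in Figure 1: four zero crossings of the polynomial factor separate five peaks of the wavelet.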

What we have so far is only one eighth of the story needed to fully represent a photon: development from a pulse into a waveform. We have told the story for one pulse in $\mathbf{E}$. If we would match that with another pulse in $\mathbf{B}$, we would have overall propagation along with waveform development. That would bring us to one quarter of the whole story of the photon. If we would match that with two more pulses, $\mathbf{E}$ and $\mathbf{B}$ pointing at $90^\circ$ in space from the first pair and coming ‘alive’ a quarter cycle out of phase with the first pair, we would have the circular polarization characteristic of photons, but we would still have just half the story. So let us move on, and seek the other half.

2.2. Waveform regression

The remaining half of the story of the photon is about waveform regression. How does this complex structure of four Hermite polynomials multiplied by their generating Gaussian unwind, and go back to being a set of four pulses, so that it can be absorbed into a receiver? Again, let us refer to some similar but more familiar phenomenology:

  1. A third phenomenon possible for light modeled as EM waves is ‘focusing’. This is what we have optical lenses and shaped mirrors for. It works somewhat contrary to transverse spreading, gathering incident energy into a smaller area transverse to the propagation direction. Of course we don’t have any lenses or mirrors in the photon model, but we shall find a mechanism that produces a similar effect.

  2. A fourth phenomenon possible for light modeled as EM waves is ‘pulse restoration’. This is what transmission lines have ‘repeater stations’ for. A communication signal degraded by dispersion can be reconstituted when passed through an intelligent filter. Of course we don’t have any filters in a photon model, but we shall find a mechanism that produces a similar effect.

The ‘similar effect’ comes from the imposition of boundary conditions in the longitudinal direction. The Gaussian pulse that was used to describe the waveform development part of the scenario was somewhat unrealistic in that its tails extended to infinity. There is no way that a localized source could emit an energy pulse whose tails would extend to infinity. It is somewhat more realistic to imagine the equivalent of a mirror at the source, and another mirror at the eventual receiver, to confine the waveform like a wave in a box, with zero amplitude at the surface of each mirror and everywhere beyond.

With such boundary conditions imposed, the analytic functions involved in the model are no longer the simple Gaussian and the simple Hermite polynomials that it generates. Now we have not one, but three, Gaussians, the extra two being needed to cancel the first one at the two boundaries. Correspondingly, we always have at least three (actually six) Hermite polynomials alive at any given time. That is a loss of mathematical simplicity. But there is a gain of conceptual simplicity. It is easy to envision that the propagation scenario has some symmetry about its mid point. The waveform will spread until its central peak is halfway between the source and the receiver. After that, the mirror at the receiver will be more significant than the mirror at the source, causing the waveform to start ‘piling up’ near the receiver, and eventually end up as a pulse near the receiver, similar to the pulse originally launched near the source.

This ‘regressing waveform’ is somewhat reminiscent of ‘advanced’ solutions to Maxwell’s equations going backwards in time. These were introduced many times in the early 20th century, but particularly popularized in the mid 20th century by Wheeler and Feynman (1945 and 1949).

Wheeler and Feynman were looking to time symmetry as the basis for an electromagnetic generalization of instantaneous (Newtonian) gravitational interaction. There are important differences between the regressing waveforms introduced above and the Wheeler-Feynman advanced solutions: 1) Wheeler and Feynman were looking at interactions between essentially point sources and receivers, and so had to be looking at spherically expanding retarded solutions and spherically contracting advanced solutions, not at essentially one-dimensional expanding and contracting wavelets. 2) The Wheeler-Feynman expansion or contraction is related to the spherical area of a wave front, not the waveform in the radial propagation direction. 3) A lengthy discussion of the paradox of advanced actions is necessitated in the Wheeler-Feynman work, whereas the ‘regressing’ solutions introduced here are not in fact ‘advanced’ at all; they are just regressing, in real time, in the propagation direction.

What we have here is quite different though. There are no differential equations running backwards in time; there is just ‘piling up’ of a solution to differential equations in response to a boundary condition.

2.3. The photon model in terms of EM fields

Taken together, the waveform development followed by the waveform regression suggest a photon model in terms of EM fields that exhibits continuous evolution: it goes from a state of pulse-like localization near its source, to a state of wave-like extension in space during its travel, and then back to a state of pulse-like localization near its receiver.

Observe that with this photon model, ‘light in flight’ develops its wavelength only during its flight. It doesn’t have it to start with, and it gives it up at the end. So light at emission, or reception, has a position, but no wavelength, whereas light in flight has a wavelength, but no position. Thus the model expresses a ‘wave-particle duality’ for light.

Observe too that this photon model exhibits a form of QM ‘complementarity’, or uncertainty relationship. Consider that, under Fourier transformation, Gaussians map into Gaussians, and that the product of the spreads of such Gaussians is a constant. In the process of wave train development, a Gaussian in position space x spreads out, while its corresponding Gaussian in wave number space k sharpens up.
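The constancy of the product of spreads can be illustrated numerically. The sketch below assumes a profile $\exp(-x^2/2\sigma^2)$ and measures each spread as an RMS width weighted by the profile itself; with that particular (assumed) definition the product of the $x$ and $k$ spreads should come out close to 1 for any $\sigma$. The function names and the integration parameters are illustrative choices, not part of the model.

```python
import math

def gaussian(x, sigma):
    """Gaussian amplitude profile in position space."""
    return math.exp(-x * x / (2.0 * sigma * sigma))

def fourier_amplitude(k, sigma, points=401, reach=8.0):
    """Trapezoid-rule estimate of the integral of gaussian(x) * cos(k x) dx.

    The sine part vanishes by symmetry, so the cosine transform suffices.
    """
    half = reach * sigma
    dx = 2.0 * half / (points - 1)
    total = 0.0
    for i in range(points):
        x = -half + i * dx
        weight = 0.5 if i in (0, points - 1) else 1.0
        total += weight * gaussian(x, sigma) * math.cos(k * x)
    return total * dx

def spread(profile, half, points=401):
    """RMS width of a profile on [-half, half], weighted by the profile."""
    du = 2.0 * half / (points - 1)
    norm = 0.0
    second = 0.0
    for i in range(points):
        u = -half + i * du
        f = profile(u)
        norm += f
        second += u * u * f
    return math.sqrt(second / norm)

def spread_product(sigma):
    """Return (spread in x, spread in k, their product) for a given sigma."""
    width_x = spread(lambda x: gaussian(x, sigma), 8.0 * sigma)
    width_k = spread(lambda k: fourier_amplitude(k, sigma), 8.0 / sigma)
    return width_x, width_k, width_x * width_k

if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, spread_product(sigma))
```

As the position-space Gaussian is made wider, the wave-number Gaussian narrows in proportion, which is the behavior described for the developing wave train.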

Inasmuch as the discovery of photons was the point of departure for the development of QM, having this photon model expressed in terms of Maxwell fields is a first step in reconciling MED with QM. But there is much more to do, because the bigger problem for MED was not the photon itself, but rather the atom that emitted or absorbed it. It looked as though MED could never explain an atom being stable in its ground state, much less anything about its excited states. To find any reconciliation there, we must move on.


3. EM signals as photons

Every neutral atom contains at least two particles, and generally a lot more. Prior to QM, electromagnetic forces were presumed to hold such a system together, but there was clearly a problem with that understanding.

The simplest atom is the Hydrogen atom, with just one electron circulating about a nucleus consisting of just one proton. So consider the Hydrogen atom. The electron circulates and so accelerates, and that must generate radiation. It was assumed that this radiation would rob the atomic system of energy, and thereby cause the collapse of the atom.

So it was assumed that Maxwell’s electromagnetic theory (EMT) is simply incompatible with the stability of atoms. The solution then was to postulate the existence of a different regime of physics in which that wouldn’t happen. But was that really necessary? The purpose of this Section is to argue that it was not.

The underlying belief in inevitability of atomic collapse reflects a belief that the electrodynamic forces within the atom are essentially central, and therefore cannot affect the energy budget of the atom. This latter belief traces to the turn of the 20th century, when A. Liénard (1898) and E. Wiechert (1901) developed models for the potentials and fields created by rapidly moving charges. Although Liénard and Wiechert worked independently, they made the same assumption, and they got the same results, and so confirmed each other. This Section looks at those results, and thereby develops a motivation to look back at their underlying assumption.

3.1. Standard formulae for scalar and vector potentials

Expressed in Gaussian units, the Liénard-Wiechert (LW) scalar and vector potentials at position r and time t are

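$$\Phi(\mathbf{r}, t) = e \left[ \frac{1}{\kappa R} \right]_{\text{retarded}}$$

$$\mathbf{A}(\mathbf{r}, t) = e \left[ \frac{\boldsymbol{\beta}}{\kappa R} \right]_{\text{retarded}} \tag{1}$$

where $\kappa = 1 - \mathbf{n} \cdot \boldsymbol{\beta}$, $\boldsymbol{\beta}$ is source velocity normalized by $c$, $\mathbf{n} = \mathbf{R}/R$ (a unit vector), and $\mathbf{R} = \mathbf{r}_{\text{source}}(t - R/c) - \mathbf{r}$ (an implicit definition for the terminology ‘retarded’). The LW fields obtained from those potentials are then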

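$$\mathbf{E}(\mathbf{r}, t) = e \left[ \frac{(\mathbf{n} - \boldsymbol{\beta})(1 - \beta^2)}{\kappa^3 R^2} + \frac{\mathbf{n}}{c\,\kappa^3 R} \times \left( (\mathbf{n} - \boldsymbol{\beta}) \times \frac{d\boldsymbol{\beta}}{dt} \right) \right]_{\text{retarded}}$$

$$\mathbf{B}(\mathbf{r}, t) = \mathbf{n}_{\text{retarded}} \times \mathbf{E}(\mathbf{r}, t) \tag{2}$$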

The LW fields have some interesting properties. The $1/R$ fields are radiation fields, and they make a Poynting vector (energy flow per unit area per unit time) that lies along $\mathbf{n}_{\text{retarded}}$:

$$\mathbf{P} = \frac{c}{4\pi}\, \mathbf{E}_{\text{radiative}} \times \mathbf{B}_{\text{radiative}} = \frac{c}{4\pi}\, \mathbf{E}_{\text{radiative}} \times \left( \mathbf{n}_{\text{retarded}} \times \mathbf{E}_{\text{radiative}} \right) = \frac{c}{4\pi} \left( E_{\text{radiative}} \right)^2 \mathbf{n}_{\text{retarded}} \tag{3}$$

But the $1/R^2$ fields are Coulomb-Ampère fields, and the Coulomb field does not lie along $\mathbf{n}_{\text{retarded}}$ as one might naively expect; instead, it lies along $(\mathbf{n} - \boldsymbol{\beta})_{\text{retarded}}$. Assume that $\boldsymbol{\beta}$ does not change much over the total field propagation time, in which case $(\mathbf{n} - \boldsymbol{\beta})_{\text{retarded}}$ is virtually indistinguishable from $\mathbf{n}_{\text{present}}$. So then the Coulomb field and the radiation are arriving at the observer from different directions.
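The direction claim for the standard LW result can be checked numerically for the simplest case. The sketch below assumes a source in uniform motion, units with $c = 1$, two spatial dimensions, and the common textbook convention in which $\mathbf{n}$ points from the retarded source position toward the observer (this sign convention is an assumption of the sketch and may differ from the definition of $\mathbf{R}$ used above). It covers only the standard retarded case, not any modification of it; all names are illustrative.

```python
import math

C = 1.0  # assumption: units in which the signal speed is 1

def retarded_time(obs, t, velocity, iterations=200):
    """Solve t_r = t - |obs - v t_r| / C by fixed-point iteration.

    The source is assumed to move uniformly, at position v * t.
    The iteration contracts because |v| < C.
    """
    t_r = t
    for _ in range(iterations):
        sx, sy = velocity[0] * t_r, velocity[1] * t_r
        t_r = t - math.hypot(obs[0] - sx, obs[1] - sy) / C
    return t_r

def unit(v):
    """Normalize a 2D vector."""
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def directions(obs, t, velocity):
    """Return unit vectors along n_retarded, (n - beta)_retarded, n_present."""
    t_r = retarded_time(obs, t, velocity)
    beta = (velocity[0] / C, velocity[1] / C)
    n_ret = unit((obs[0] - velocity[0] * t_r, obs[1] - velocity[1] * t_r))
    coulomb = unit((n_ret[0] - beta[0], n_ret[1] - beta[1]))
    n_present = unit((obs[0] - velocity[0] * t, obs[1] - velocity[1] * t))
    return n_ret, coulomb, n_present

if __name__ == "__main__":
    n_ret, coulomb, n_present = directions((3.0, 4.0), 0.0, (0.5, 0.0))
    print("n_retarded          :", n_ret)
    print("(n - beta)_retarded :", coulomb)
    print("n_present           :", n_present)
```

For uniform motion the second and third directions coincide, while the first differs, which is the mismatch between the radiation direction and the Coulomb direction discussed in the text.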

One can feel moved to check this surprising result. Fortunately, one can look up the original sources, obtain translations if necessary, and verify the original algebra. There is no problem with the algebra. There are also numerous re-derivations that use more modern techniques involving the Dirac delta function and the Heaviside step function. These are ‘generalized’ functions of some parameter that, when driven to infinity, produces an infinite pulse or a unit step. One can study these re-derivations too. One finds various re-orderings of the mathematical operators ‘differentiate’, ‘integrate’, and ‘go to parameter limit’. These re-orderings are dodgy because the generalized functions lack the mathematical property of uniform convergence, so these operations don’t necessarily commute; it is possible to change the result by changing operation order. But even so, such findings do not change the fact that the original LW derivations, although pedestrian, were correct.

If a problem exists with this LW result, then there is really only one place where it can arise: in the initial assumption; namely, that electromagnetic fields propagate like bullets shot at speed c . But this is the very same assumption that Einstein later formalized as his Second Postulate (1905, 1907). He just called them “signals” rather than “fields”.

The LW idea of bullets shot at speed c is the foundation for Special Relativity Theory (SRT). (Indeed, SRT offers one of the modern ways to re-derive the LW results.) But SRT is also the foundation for General Relativity Theory (GRT). SRT and GRT together make one of the two great pillars of 20th century Physics: Relativity Theory (RT). So questioning the LW assumption is not just questioning the LW results; it is questioning the founding assumption of SRT, and so threatening this whole pillar of 20th century theory.

Many people have just accepted that this is just ‘the way things are’ with classical field theory, and with SRT, and with all of relativity theory as well. But what if one wanted to describe the same scenarios in a thoroughly modern way, with photons instead of radiation fields, and virtual photons instead of Coulomb-Ampère fields? Could anyone really accept the idea that the real photons and the virtual photons created by the same space-time event would arrive at a detector from different directions?

But one needn’t accept any such thing, given the photon model in terms of Maxwell fields developed in Sect. 2. In short, since we have a model for photons in terms of fields, we should be able to reverse engineer a model for fields in terms of photons. So what does the photon model developed in Sect. 2 imply? Observe that the developing wavelet can move at speed c relative to the source, and the regressing wavelet can move at speed c relative to the receiver. Applying this idea can help to modify the LW results appropriately.

3.2. Updated formulae for scalar and vector potentials

Recall that with the photon model developed in terms of Maxwell fields in Sect. 2, the life history of the photon has a symmetry point in the middle. Before the mid point of the propagation scenario, the waveform is developing, and after the mid point of the propagation scenario the waveform is regressing. That makes the mid point very important. So far as the receiver is concerned, nothing that happened before the midpoint affects the signal he receives. The source position and velocity information he receives is determined, not by the specification ‘retarded’, but rather by the specification ‘half retarded’.

With this new specification, the scalar and vector potentials become:

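$$\Phi(\mathbf{r}, t) = e \left[ \frac{1}{\kappa R} \right]_{\text{half retarded}}$$

$$\mathbf{A}(\mathbf{r}, t) = e \left[ \frac{\boldsymbol{\beta}}{\kappa R} \right]_{\text{half retarded}} \tag{4}$$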

The fields become:

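$$\mathbf{E}(\mathbf{r}, t) = e \left[ \frac{(\mathbf{n} - \boldsymbol{\beta})(1 - \beta^2)}{\kappa^3 R^2} + \frac{\mathbf{n}}{c\,\kappa^3 R} \times \left( (\mathbf{n} - \boldsymbol{\beta}) \times \frac{d\boldsymbol{\beta}}{dt} \right) \right]_{\text{half retarded}} \tag{5}$$

$$\mathbf{B}(\mathbf{r}, t) = \mathbf{n}_{\text{half retarded}} \times \mathbf{E}(\mathbf{r}, t) \tag{6}$$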

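The Poynting vector $\mathbf{P}(\mathbf{r}, t)$ becomes:

$$\mathbf{P}(\mathbf{r}, t) = \frac{c}{4\pi}\, \mathbf{E}_{\text{radiative}} \times \mathbf{B}_{\text{radiative}} = \frac{c}{4\pi}\, \mathbf{E}_{\text{radiative}} \times \left( \mathbf{n}_{\text{half retarded}} \times \mathbf{E}_{\text{radiative}} \right) = \frac{c}{4\pi} \left( E_{\text{radiative}} \right)^2 \mathbf{n}_{\text{half retarded}} \tag{7}$$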

Observe that now the direction of the Coulomb field is $(\mathbf{n} - \boldsymbol{\beta})_{\text{half retarded}} \approx (\mathbf{n}_{\text{present}})_{\text{half retarded}} \approx \mathbf{n}_{\text{half retarded}}$, and the direction of the Poynting vector is $\mathbf{n}_{\text{half retarded}}$ too. So now, the Coulomb field and the Poynting vector are reconciled to the same direction. That is the first big gift from the photon model in terms of EM fields given in Sect. 2.

And the gifts of the photon model in terms of EM fields go well beyond this rather arcane problem about field direction. The photon model in terms of EM fields eliminates the central mystery of Einstein’s SRT: having just one light speed relative to however many different observers there may be. This is complexity at the level of ‘multiplicity’, much more daunting than the complexity at the level of the mere ‘duality’ that is found in modern QM.


4. EM fields within atoms

As noted in Sect. 2, atoms were the really big problem for Maxwell’s EMT. Now armed with some new information about EMT, it is appropriate to revisit the problem about atoms. We turn again to Hydrogen. From Sect. 3 we can infer that at least two processes go on inside the Hydrogen atom, and we shall discover shortly that there are actually three. Only one is familiar. The other two challenge familiar concepts of ‘conservation’ that originally grew out of Newtonian mechanics. But electromagnetism is not Newtonian mechanics. In electromagnetic problems, the concepts of momentum and energy ‘conservation’ have to include the momentum and energy of fields, as well as those of matter. Momentum and energy can both be exchanged between matter and fields. ‘Conservation’ applies only to the system overall, not to matter alone (nor to fields alone either).

4.1. Energy loss due to far-field radiation

The first process that occurs with the Hydrogen atom is the familiar energy loss from the atom due to far-field radiation. There will be a far-field power radiated (energy loss per unit time) of magnitude

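$$P_{\text{radiated}} = \int_{4\pi} |\mathbf{P}|\, R^2\, d\Omega = \int_{4\pi} \frac{c}{4\pi} \left( E_{\text{radiative}} \right)^2 R^2\, d\Omega = \int_{4\pi} \frac{c}{4\pi}\, \frac{e^2}{c^2 \kappa^6} \left| \mathbf{n} \times \left( (\mathbf{n} - \boldsymbol{\beta}) \times \frac{d\boldsymbol{\beta}}{dt} \right) \right|^2 d\Omega \tag{8}$$

where $\Omega$ means ‘solid angle’. Because the full $4\pi$ of solid angle captures opposing directions of $\mathbf{n}$, contributions to the integral from the vector $\boldsymbol{\beta}$ visible in the integrand cancel out. Contributions to the integral that come from the dot product $\mathbf{n} \cdot \boldsymbol{\beta}$ that is hidden in the $\kappa^6$ factor may not be zero at every moment, but they time-average to zero. So let us simplify the expression for far-field power radiated by setting $\boldsymbol{\beta}$ to zero. We have:

$$P_{\text{radiated}} = \frac{e^2}{4\pi c} \int_{4\pi} \left| \mathbf{n} \times \left( \mathbf{n} \times \frac{d\boldsymbol{\beta}}{dt} \right) \right|^2 d\Omega \tag{9}$$

It evaluates to the well-known Larmor result:

$$P_{\text{radiated}} = \frac{e^2}{4\pi c} \int_{-1}^{1} d\cos\theta \int_{0}^{2\pi} d\varphi \left| \frac{d\boldsymbol{\beta}}{dt} \right|^2 \sin^2\theta = \frac{e^2}{2c} \left| \frac{d\boldsymbol{\beta}}{dt} \right|^2 \int_{-1}^{1} d\cos\theta \left( 1 - \cos^2\theta \right) = \frac{e^2}{2c} \left| \frac{d\boldsymbol{\beta}}{dt} \right|^2 \left. \left( \cos\theta - \tfrac{1}{3} \cos^3\theta \right) \right|_{-1}^{1} = \frac{2 e^2}{3c} \left| \frac{d\boldsymbol{\beta}}{dt} \right|^2 \tag{10}$$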
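The angular integration leading to the Larmor coefficient can be confirmed by brute force. The sketch below assumes a unit vector for the direction of $d\boldsymbol{\beta}/dt$ (chosen arbitrarily, not along the polar axis) and integrates $|\mathbf{n} \times (\mathbf{n} \times \hat{\mathbf{a}})|^2$ over the sphere with a midpoint rule; the grid sizes and function names are illustrative choices.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_integral(accel_dir=(1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0),
                     theta_steps=200, phi_steps=100):
    """Midpoint-rule integral of |n x (n x a)|^2 over the full solid angle.

    accel_dir is an assumed unit vector along d(beta)/dt.
    """
    d_theta = math.pi / theta_steps
    d_phi = 2.0 * math.pi / phi_steps
    total = 0.0
    for i in range(theta_steps):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(phi_steps):
            phi = (j + 0.5) * d_phi
            n = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            v = cross(n, cross(n, accel_dir))
            total += (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) * sin_t * d_theta * d_phi
    return total

def larmor_coefficient():
    """Coefficient k in P = k * (e^2 / c) * |d(beta)/dt|^2."""
    return angular_integral() / (4.0 * math.pi)

if __name__ == "__main__":
    print("angular integral   :", angular_integral())
    print("Larmor coefficient :", larmor_coefficient())
```

The result does not depend on the direction chosen for the acceleration, which is why the analytic evaluation is free to align the polar axis with $d\boldsymbol{\beta}/dt$.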

4.2. Energy gain due to internal torquing

The second process that occurs in the Hydrogen atom is a previously unnoticed energy gain due to internal torquing. This process occurs because the Coulomb force within the atom is not central; it is along $\mathbf{n}_{\text{half retarded}}$, and not along $(\mathbf{n} - \boldsymbol{\beta})_{\text{retarded}} \approx \mathbf{n}_{\text{present}}$.

The power inflow to the electron is $P_{\text{torquing}} = T_e \Omega_e$, where $\Omega_e$ is the electron orbit frequency, and $T_e$ is the magnitude of the torque on the electron, given by $T_e = r_e \times F_e$, where $r_e$ is the electron orbit radius, and $F_e$ is the tangential force on the electron. But that is not all. The proton also orbits at frequency $\Omega_e$, and experiences its own torque, given by $T_p = r_p \times F_p$, where $r_p$ is the proton orbit radius (tiny) and $F_p$ is the tangential force on the proton (huge), with the result that the magnitude $T_p$ is the same as $T_e$. The total torque on the system is $T = T_e + T_p = 2 T_e$. It is determined by the angle between $r_e$ and $F_e$, which is given by $r_p \Omega_e / 2c = (m_e / m_p)\, r_e \Omega_e / 2c$. So torque $T = (m_e / m_p)(r_e \Omega_e / c)\, e^2 / (r_e + r_p)$ and power received is

$$P_{\text{torquing}} = \frac{m_e}{m_p}\, \frac{r_e \Omega_e^2}{c}\, \frac{e^2}{r_e + r_p} = \frac{e^4 / m_p}{c\, (r_e + r_p)^3} \tag{11}$$

The existence of such a process is why the concept of ‘balance’ emerges: there can be a balance between gain of energy due to internal torquing and the inevitable loss of energy due to radiation. But we are not done with radiation yet.
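The two forms of the torquing power can be compared numerically. The sketch below assumes the usual circular-orbit Coulomb balance $m_e r_e \Omega_e^2 = e^2/(r_e + r_p)^2$ and $m_e r_e = m_p r_p$, neither of which is stated explicitly in the text, and it uses rounded Gaussian-unit constants together with the Bohr radius as an illustrative value for $r_e + r_p$. All names and numerical inputs are assumptions of the sketch.

```python
# Rounded constants in Gaussian (cgs) units; values are approximate.
E_CHARGE = 4.803e-10   # electron charge, esu
M_E = 9.109e-28        # electron mass, g
M_P = 1.6726e-24       # proton mass, g
C = 2.998e10           # light speed, cm/s
SEPARATION = 5.29e-9   # assumption: r_e + r_p set to the Bohr radius, cm

def orbit(separation):
    """Return (r_e, r_p, Omega_e squared) for an assumed circular orbit.

    Assumes m_e r_e = m_p r_p and m_e r_e Omega_e^2 = e^2 / separation^2.
    """
    r_e = separation / (1.0 + M_E / M_P)
    r_p = separation - r_e
    omega_sq = E_CHARGE ** 2 / (M_E * r_e * separation ** 2)
    return r_e, r_p, omega_sq

def power_from_torque(separation):
    """First form: (m_e/m_p) (r_e Omega_e^2 / c) e^2 / (r_e + r_p)."""
    r_e, r_p, omega_sq = orbit(separation)
    return (M_E / M_P) * (r_e * omega_sq / C) * E_CHARGE ** 2 / (r_e + r_p)

def power_closed_form(separation):
    """Second form: (e^4 / m_p) / (c (r_e + r_p)^3)."""
    return (E_CHARGE ** 4 / M_P) / (C * separation ** 3)

if __name__ == "__main__":
    print("from torque :", power_from_torque(SEPARATION), "erg/s")
    print("closed form :", power_closed_form(SEPARATION), "erg/s")
```

Under the stated force balance the two expressions agree to rounding error, which makes explicit the step that takes the torque form to the closed form.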

4.3. Extra radiation due to Thomas rotation

The fact that the electron and the proton have such different masses, and orbit at such different radii, means that the EM forces within the atom are not only not central; they are not even balanced. This situation has another major implication: the system as a whole experiences a net force. That means the system center of mass (C of M) can move. This sort of effect does not occur in Newtonian mechanics, because Newtonian mechanics assumes infinite signal propagation speed.

Looking in more detail, the unbalanced forces in the Hydrogen atom must cause the C of M of the whole atom to traverse its own circular orbit, on top of the orbits of the electron and proton individually. This is an additional source of accelerations, and hence of radiation. It evidently makes even worse the original problem of putative energy loss by radiation that prompted the development of QM. But on the other hand, the torque on the system is a candidate mechanism to compensate for the rate of energy loss due to radiation, even if there is a lot more radiation than originally thought.

The details are worked out quantitatively as follows. First ask what the circulation can do to the radiation. Some 20 years after the advent of SRT, a relevant kinematic truth about systems traversing circular paths was uncovered by L.H. Thomas (1927), in connection with explaining the then anomalous magnetic moment of the electron: 1/2 its expected value. He showed that a coordinate frame attached to a particle driven around a circle naturally rotates at half the imposed circular revolution rate.

Applied to the scenario of the electron orbiting the proton, the gradually rotating $x, y$ coordinate frame of the electron means that the electron sees the proton moving only half as fast as an external observer would see it. That fact explained the electron’s anomalous magnetic moment, and so was received with great interest in its day. But the fact of Thomas rotation has since slipped to the status of mere curiosity, because Dirac theory has replaced it as the favored explanation for the magnetic moment problem. Now, however, there is a new problem in which to consider Thomas rotation: the case of the C of M of a whole Hydrogen atom being driven in a circle by unbalanced forces. In this scenario, the gradually rotating local $x, y$ coordinate frame of the C of M means that the atom system doing its internal orbiting at frequency $\Omega_e$ relative to the C of M will be judged by an external observer to be orbiting twice as fast, at frequency $\Omega' = 2\Omega_e$ relative to inertial space. This perhaps surprising result can be established in at least three ways:

  1. by analogy to the old electron-magnetic-moment problem;

  2. by construction from $\Omega_e$ in the C of M system as the power series $\Omega' = \Omega_e \times (1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \dots) \to \Omega_e \times 2$;

  3. by observation that in inertial space $\Omega'$ must satisfy the algebraic relation $\Omega' = \Omega_e + \tfrac{1}{2}\Omega'$, which implies $\Omega' = 2\Omega_e$.
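Items 2 and 3 of the list can be illustrated with a few lines of arithmetic. This is only a sketch: the helper names `partial_sum` and `iterate_relation` are hypothetical, and the frequency is expressed in units where the internal orbit frequency equals 1.

```python
def partial_sum(omega_e, terms=60):
    """Partial sum of the series omega_e * (1 + 1/2 + 1/4 + ...)."""
    return omega_e * sum(0.5 ** k for k in range(terms))

def iterate_relation(omega_e, steps=60):
    """Repeatedly apply Omega <- omega_e + Omega / 2, starting from omega_e."""
    omega = omega_e
    for _ in range(steps):
        omega = omega_e + 0.5 * omega
    return omega

omega_e = 1.0  # internal orbit frequency, arbitrary units (assumption)
series_value = partial_sum(omega_e)
fixed_point = iterate_relation(omega_e)
print(series_value, fixed_point)
```

Both routes approach twice the internal frequency, which is the value the algebraic relation yields directly.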

The relation $\Omega' = 2\Omega_e$ means the far field radiation power, if it really ever manifested itself in the far field, would be even stronger than classically predicted. The classical Larmor formula for radiation power from a charge $e$ ($e$ in electrostatic units) is $P = 2e^2a^2/3c^3$, where $a$ is total acceleration. For the classical electron-proton system, most of the radiation would be from the electron, orbiting with $a_e = r_e \Omega_e^2$, with $\Omega_e$ given by the Coulomb force $m_e r_e \Omega_e^2 = F_e = e^2/(r_e + r_p)^2$. But with $\Omega' = 2\Omega_e$, the effective total acceleration is $a = a_e \times 2^2$. The total radiation power is then

$$
P_{\text{total radiated}} = 2^4\, \frac{2e^2}{3c^3}\, a_e^2 = \frac{2^5 e^6 / m_e^2}{3c^3 (r_e + r_p)^4} \tag{12}
$$

Now posit a balance between the energy gain rate due to the torque and the energy loss rate due to the radiation. The balance requires $P_{\text{torquing}} = P_{\text{total radiated}}$, or

$$
\frac{e^4/m_p}{c\,(r_e + r_p)^3} = \frac{2^5 e^6 / m_e^2}{3c^3 (r_e + r_p)^4} \tag{13}
$$

This equation can be solved for

$$
r_e + r_p = 32\, m_p e^2 / 3 m_e^2 c^2 = 5.5 \times 10^{-9}\ \text{cm} \tag{14}
$$

Compare that value to the accepted value $r_e + r_p = 5.28 \times 10^{-9}$ cm. The match is fairly close, running just about 4% high. That means the concept of torque vs. radiation does a fairly good job predicting the ground state of Hydrogen.
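The balance radius can be evaluated directly. The sketch below assumes standard CGS values for the constants (they are not quoted in the text), and the function names are illustrative. It also confirms that the two powers are equal at the computed radius.

```python
# Assumed standard CGS constants (not taken from the text)
E_CHARGE = 4.80320e-10  # electron charge, esu
M_E = 9.10938e-28       # electron mass, g
M_P = 1.67262e-24       # proton mass, g
C = 2.99792e10          # speed of light, cm/s

def balance_radius(e, m_e, m_p, c):
    """r_e + r_p at which torquing power equals total radiated power."""
    return 32.0 * m_p * e**2 / (3.0 * m_e**2 * c**2)

def torquing_power(r, e, m_p, c):
    """Power gained through internal torquing at system radius r."""
    return (e**4 / m_p) / (c * r**3)

def radiated_power(r, e, m_e, c):
    """Total radiated power, including the factor 2^4 from the doubled frequency."""
    return (2**5 * e**6 / m_e**2) / (3.0 * c**3 * r**4)

radius_cm = balance_radius(E_CHARGE, M_E, M_P, C)
power_ratio = (torquing_power(radius_cm, E_CHARGE, M_P, C)
               / radiated_power(radius_cm, E_CHARGE, M_E, C))
ratio_to_accepted = radius_cm / 5.28e-9  # accepted value quoted in the text
print(radius_cm, power_ratio, ratio_to_accepted)
```

With these constants the radius comes out near $5.5 \times 10^{-9}$ cm, roughly 4 to 5% above the accepted figure quoted in the text.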

4.4. Unification of physics via Planck’s constant

In conventional QM, $r_e + r_p$ is expressed in terms of Planck’s constant $h$, which is presumed to be a fundamental constant of Nature:

$$
r_e + r_p = h^2 / 4\pi^2 \mu e^2 \tag{15}
$$

Here $\mu$ is the so-called ‘reduced mass’, defined by $\mu^{-1} = m_e^{-1} + m_p^{-1}$, which makes $\mu \approx m_e$. Using that approximation and equating the two expressions (14) and (15) for $r_e + r_p$ implies

$$
h \approx \frac{\pi e^2}{c} \sqrt{128\, m_p / 3 m_e} \tag{16}
$$

This expression comes to a value of $6.77 \times 10^{-34}$ Joule-sec, about 2% high compared to the accepted value of $6.626176 \times 10^{-34}$ Joule-sec. This reasonable degree of closeness suggests that Planck’s constant may reasonably be considered a possible function of other fundamental constants of Nature, and so not itself an independent fundamental constant of Nature. Or the situation may reasonably be considered the other way around: that some other fundamental constant of Nature is really a function of Planck’s constant. Either way, we would have one less independent fundamental constant of Nature, and that would mean one more degree of unification among the different branches of physics.
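The quoted value of $h$ can be reproduced as follows. This is a minimal sketch assuming standard CGS constants (not quoted in the text) and the approximation $\mu \approx m_e$; the function name is hypothetical.

```python
import math

# Assumed standard CGS constants (not taken from the text)
E_CHARGE = 4.80320e-10  # electron charge, esu
M_E = 9.10938e-28       # electron mass, g
M_P = 1.67262e-24       # proton mass, g
C = 2.99792e10          # speed of light, cm/s

def planck_estimate_erg_sec(e, m_e, m_p, c):
    """h from equating the balance radius with the conventional QM radius, mu ~ m_e."""
    return (math.pi * e**2 / c) * math.sqrt(128.0 * m_p / (3.0 * m_e))

h_erg_sec = planck_estimate_erg_sec(E_CHARGE, M_E, M_P, C)
h_joule_sec = h_erg_sec * 1.0e-7         # 1 erg = 1e-7 Joule
ratio_to_accepted = h_joule_sec / 6.626176e-34  # accepted value quoted in the text
print(h_joule_sec, ratio_to_accepted)
```

The estimate lands about 2% above the accepted value, in line with the comparison made in the text.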

But of course, the expression for h developed here can fulfill such aspirations only if the theory being developed can do a great deal more than just match the ground state of Hydrogen. Worthy targets for additional work include: anticipating the story for isotopes of Hydrogen, anticipating from there what happens with other elements, explaining the excited states of Hydrogen and their resulting spectral lines, anticipating from there the spectral features of some other elements, and characterizing the behavior of the full database on ionization potentials of all elements, and much more. It all constitutes a developing research area that I refer to as ‘Algebraic Chemistry’.


5. Extensions and extrapolations from hydrogen

5.1. Larger nuclear mass

The negative energy of the electron in the ground state of Hydrogen is

$$
-e^2/(r_e + r_p) = -3c^2 m_e^2 / 2^5 m_p \tag{17}
$$

The magnitude of this energy is what would have to be provided to liberate the electron, or ionize the atom: the ‘ionization potential’.

Eq. (17) provides the basis from which to build corresponding expressions for other entities. For example, the extension to Deuterium and/or Tritium requires that the proton mass $m_p$ be replaced with a more generic nuclear mass $M$, and that $r_p$ be replaced by $r_M$. Then we have for the ionization potential of this more massive system:

$$
e^2/(r_e + r_M) = 3c^2 m_e^2 / 2^5 M \tag{18}
$$

5.2. Arbitrary nuclear charge

The extension of the model to a neutral atom with nuclear charge number $Z$ involves $Z$ electrons as well. To develop the mathematical model, we must return to the expressions for $P_{\text{torquing}}$ and $P_{\text{total radiated}}$, Eqs. (11) and (12). All the factors of $e^2$ change to $Z^2 e^2$, and the factor of $m_e^2$ changes to $Z^2 m_e^2$. The equality $P_{\text{torquing}} = P_{\text{total radiated}}$ becomes $Z^4 P_{\text{torquing}} = (Z^6/Z^2)\, P_{\text{total radiated}} = Z^4 P_{\text{total radiated}}$. So nothing happens to the equality between $P_{\text{torquing}}$ and $P_{\text{total radiated}}$, Eq. (13). But for the more charged system, the energy Eq. (18) becomes

$$
Z^2 e^2/(r_e + r_M) = Z^2\, 3c^2 m_e^2 / 2^5 M \tag{19}
$$

This scaled-up expression represents the magnitude of the total ionization potential of the system involving $Z$ protons and $Z$ electrons. What is then comparable to the ionization potential for removing a single electron is:

$$
Z e^2/(r_e + r_M) = Z \times (3c^2 m_e^2 / 2^5 M) \equiv (Z/M) \times (3c^2 m_e^2 / 2^5) \tag{20}
$$

Thus in the math we find a $Z/M$ scaling law. What do we find in the actual data? Something much more complicated, and indeed so complicated that we would be unlikely ever to figure it out without the clue that $Z/M$ is part of the story. The involvement of $M$ means the involvement of isotopes, and unwanted complexity. So the clue tells us to look at ionization potentials, not in raw form, but scaled by $M/Z$, to remove the $Z/M$ factor that the math anticipates.

Figure 2 shows the pattern found. Seven orders of ionization are included. There is a fascinating, but lengthy, story about ionization orders 2 and up; see Whitney (2012). The part of it that will be most important for the present development is obvious from Fig. 2: the energy required to completely strip the atom scales with $Z^2$.

Figure 2.

Ionization potentials, scaled by $M/Z$ and modeled algebraically.

With their $M/Z$ scaling, all of the $IP$’s can be represented in terms of a baseline value equal to that of Hydrogen, $IP_{1,1}$, and an increment $\Delta IP_{1,Z}$. The increment arises from interactions just between the electrons, quite apart from the nucleus. The electron-on-electron increments are very regular in their behavior. First of all, every period exhibits a general rise, and by the same factor of $7/2$. Second, there is a general drop from one period to the next, for the first three periods, and all by the same factor of $7/8$.

Then within periods, there is a very regular pattern. There are sub-period rises keyed to the traditional ‘angular momentum’ quantum number $l$, and to a non-traditional parameter $N$ that goes $1, 2, 2, 3, 3, 4, 4$ for periods $1$ through $7$, and gives the number of elements in a period as $2N^2$. For $l > 0$, we have:

$$
\text{incremental rise} = \text{total rise} \times \text{fraction},
$$

$$
\text{fraction} = [(2l + 1)/N^2]\,[(N - l)/l]
$$

The following Table details the behavior of the fractional rises in first-order $IP$’s over all sub-periods:


The scaled ionization potentials are called $IP$’s. They are meant to be ‘population generic’; that is, the information they contain concerning one element can be applied to a calculation about another element in a different state of ionization, or excitation, by applying the $Z/M$ appropriate for the second element and its state.

5.3. Unequal counts for electrons and protons

Let us first consider ionization states. These are important for applications in Chemistry, since chemical reactions involve ions. With all this regularity displayed in Fig. 2, it should be possible to use it to help predict the energy budget for all sorts of chemical reactions. We just need a rational way to extrapolate from all the formulae representing the regularities for single electrons being removed from neutral atoms to formulae for electrons being removed from, or added to, ions of all sorts.

Generally, if an atom is in an ionized state, then in place of just $Z$ we have an electron count $Z_e$ distinct from the proton count $Z_p$. The electron-on-electron interaction does not involve the nucleus, and so always scales with $Z_e/M$. But the electron-nucleus interaction previously represented by $(Z/M)\, IP_{1,1}$ now has to involve both $Z_e$ and $Z_p$. We have for the total system

$$
Z_p Z_e e^2/(r_e + r_M) = (Z_p Z_e / M) \times (3c^2 m_e^2 / 2^5) \tag{21}
$$

What is then generally comparable to the nuclear-orbit part of the ionization potential for removing a single electron? To develop an answer to this question, we must return again to the expressions for $P_{\text{torquing}}$ and $P_{\text{total radiated}}$, Eqs. (11) and (12). Clearly, all of the factors of $e^2$ change to $Z_p Z_e e^2$. It is as if all factors of $e$ changed to $\sqrt{Z_p Z_e}\, e$. Removal of one electron is then like removal of one $\sqrt{Z_p Z_e}\, e$ charge. What is comparable to the ionization potential for removing a single electron from the ion is then

$$
\sqrt{Z_p Z_e}\, e^2/(r_e + r_M) = (\sqrt{Z_p Z_e} / M) \times (3c^2 m_e^2 / 2^5) \tag{22}
$$

Thus for ions, we see in the math a $\sqrt{Z_p Z_e}/M(Z_p)$ scaling law for that part of the ionization potential that reflects electron-nucleus interaction, $IP_{1,1}$. So for computations we use:

$$
IP_{1,1}\, Z/M \to IP_{1,1} \sqrt{Z_p Z_e}/M(Z_p) \tag{23}
$$

For the other part of the ionization potential, that reflecting just the electron-on-electron interactions, $\Delta IP_{1,Z}$, the relevant $Z$ is $Z_e$. But the relevant $M$ is still $M(Z_p)$, the only significant mass in the problem. So for computations we use:

$$
\Delta IP_{1,Z}\, Z/M \to \Delta IP_{1,Z_e}\, Z_e/M(Z_p) \tag{24}
$$

This basic information can help one to model the energy budget for any chemical reaction. To assist readers who want to try this out, the necessary data displayed by Fig. 2 is tabulated in numerical form as Appendix 1 at the end of this Chapter.

Here is one small example. Recall the comment about Fig. 2 that, for nuclear charge $Z = 2$ and up, the energy required to completely strip the atom scales with $Z^2$. The actual formula plotted on Fig. 2 goes

$$
IP_{Z,Z} = 2\, IP_{1,1} \times Z^2 = 2 \times 14.250 \times Z^2 \tag{25}
$$

The resulting $IP_{Z,Z}$ is population-generic. The corresponding element-specific quantity is $IP_{Z,Z}$ multiplied by the factor $Z/M(Z_p)$. Thus the element-specific energy requirement for total stripping is $2 \times 14.250 \times Z^3/M(Z_p)$ eV.

We can now compare the total energy required to strip an atom one electron at a time with the energy required to strip it of its electrons all at once. The two elements Helium and Lithium are good examples because they represent the extremes of very high first-order ionization potential and very low first-order ionization potential. The data for them in numerical form comes from Appendix 1. Here is how the calculations go:

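Helium: ($Z_p = 2$, $M(Z_p) = M_2 = 4.003$)

Write Formulae:

$$
{}_2\mathrm{He} \to {}_2\mathrm{He}^{+}:\ IP_{1,1} \times 2/M_2 + \Delta IP_{1,2} \times 2/M_2\,; \qquad
{}_2\mathrm{He}^{+} \to {}_2\mathrm{He}^{++}:\ IP_{1,1} \times \sqrt{2 \times 1}/M_2 \tag{26}
$$

Insert Data:

$$
{}_2\mathrm{He} \to {}_2\mathrm{He}^{+}:\ 14.250 \times 2/4.003 + 35.625 \times 2/4.003\,; \qquad
{}_2\mathrm{He}^{+} \to {}_2\mathrm{He}^{++}:\ 14.250 \times 1.4142/4.003 \tag{27}
$$

Evaluate Formulae:

$$
{}_2\mathrm{He} \to {}_2\mathrm{He}^{+}:\ 7.1197 + 17.7992 = 24.9189\,; \qquad
{}_2\mathrm{He}^{+} \to {}_2\mathrm{He}^{++}:\ 5.0343 \tag{28}
$$

Evaluate Total Stripping One-at-a-Time:

$$
{}_2\mathrm{He} \to {}_2\mathrm{He}^{++}:\ 24.9189 + 5.0343 = 29.9532 \tag{29}
$$

Compare to Total Stripping All-at-Once:

$$
2 \times 14.250 \times Z^3/M(Z_p) = 2 \times 14.250 \times 2^3/M_2 = 2 \times 14.250 \times 2^3/4.003 = 56.957 \tag{30}
$$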

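Lithium: ($Z_p = 3$, $M(Z_p) = M_3 = 6.941$)

Write Formulae:

$$
{}_3\mathrm{Li} \to {}_3\mathrm{Li}^{+}:\ IP_{1,1} \times 3/M_3 + \Delta IP_{1,3} \times 3/M_3 - \Delta IP_{1,2} \times 2/M_3 \tag{31}
$$

$$
{}_3\mathrm{Li}^{+} \to {}_3\mathrm{Li}^{++}:\ IP_{1,1} \times \sqrt{3 \times 2}/M_3 + \Delta IP_{1,2} \times 2/M_3\,; \qquad
{}_3\mathrm{Li}^{++} \to {}_3\mathrm{Li}^{3+}:\ IP_{1,1} \times \sqrt{3 \times 1}/M_3 \tag{32}
$$

Insert Data:

$$
{}_3\mathrm{Li} \to {}_3\mathrm{Li}^{+}:\ 14.250 \times 3/6.941 + (-1.781) \times 3/6.941 - 35.625 \times 2/6.941 \tag{33}
$$

$$
{}_3\mathrm{Li}^{+} \to {}_3\mathrm{Li}^{++}:\ 14.250 \times 2.449/6.941 + 35.625 \times 2/6.941\,; \qquad
{}_3\mathrm{Li}^{++} \to {}_3\mathrm{Li}^{3+}:\ 14.250 \times 1.7321/6.941 \tag{34}
$$

Evaluate Formulae:

$$
{}_3\mathrm{Li} \to {}_3\mathrm{Li}^{+}:\ 6.1591 + (-0.7698) - 10.2651 = -4.8758 \tag{35}
$$

$$
{}_3\mathrm{Li}^{+} \to {}_3\mathrm{Li}^{++}:\ 5.0278 + 10.2651 = 15.2929\,; \qquad
{}_3\mathrm{Li}^{++} \to {}_3\mathrm{Li}^{3+}:\ 3.5560 \tag{36}
$$

Evaluate Total Stripping One-at-a-Time:

$$
{}_3\mathrm{Li} \to {}_3\mathrm{Li}^{3+}:\ -4.8758 + 15.2929 + 3.5560 = 13.9731 \tag{37}
$$

Compare to Total Stripping All-at-Once:

$$
2 \times 14.250 \times 3^3/M(Z_p) = 2 \times 14.250 \times 3^3/M_3 = 2 \times 14.250 \times 3^3/6.941 = 110.8630 \tag{38}
$$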

In these two examples, we see that removal of all the electrons, all at once, takes much more energy than removing the electrons one electron at a time. It is plain to see that total stripping all-at-once is a vigorous, even violent, event. It is the stuff of special-purpose laboratory or field investigation. By contrast, total stripping one-at-a-time is a gentle process. The one-at-a-time process is an example of the stuff of ordinary production Chemistry.
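The Helium and Lithium bookkeeping above can be replayed in a short script. This is a sketch only: the function names are hypothetical, the numerical inputs are the ones quoted in the worked examples, and the negative sign on the Lithium increment is an assumption inferred from the arithmetic of those examples.

```python
import math

IP_11 = 14.250                      # baseline scaled IP of Hydrogen, as quoted in the text
DELTA_IP = {2: 35.625, 3: -1.781}   # increments Delta IP_{1,Z}; sign of the Z=3 entry is inferred
MASS = {2: 4.003, 3: 6.941}         # M(Z_p) for Helium and Lithium, from the text

def nuclear_term(z_p, z_e, mass):
    """Electron-nucleus part: IP_{1,1} * sqrt(Z_p * Z_e) / M(Z_p)."""
    return IP_11 * math.sqrt(z_p * z_e) / mass

def cluster_term(z_e, mass):
    """Electron-on-electron part: Delta IP_{1,Z_e} * Z_e / M(Z_p)."""
    return DELTA_IP[z_e] * z_e / mass

def strip_all_at_once(z_p, mass):
    """Element-specific total stripping energy: 2 * IP_{1,1} * Z^3 / M(Z_p)."""
    return 2.0 * IP_11 * z_p**3 / mass

# Helium, two steps
he_steps = [
    nuclear_term(2, 2, MASS[2]) + cluster_term(2, MASS[2]),
    nuclear_term(2, 1, MASS[2]),
]
# Lithium, three steps
li_steps = [
    nuclear_term(3, 3, MASS[3]) + cluster_term(3, MASS[3]) - cluster_term(2, MASS[3]),
    nuclear_term(3, 2, MASS[3]) + cluster_term(2, MASS[3]),
    nuclear_term(3, 1, MASS[3]),
]
he_total, li_total = sum(he_steps), sum(li_steps)
he_all, li_all = strip_all_at_once(2, MASS[2]), strip_all_at_once(3, MASS[3])
print(he_steps, he_total, he_all)
print(li_steps, li_total, li_all)
```

Because the script uses unrounded square roots, its Lithium values can differ from the tabulated ones in the third decimal place.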

5.4. Excited states - hydrogen

Now let us begin to consider excitation states. These are key for understanding emission or absorption spectra, a fabulously rich source of data about atoms. But atomic spectra are complicated. The standard way to begin to understand them is mathematically, from the family of solutions provided by the differential equation that Schrödinger postulated for the abstract wave function characterizing the electron in the Hydrogen atom.

The standard QM view is that the Hydrogen atom has multiple ‘stable states’, each with negative energy, $E$, determined largely by a principal quantum number $n = 1, 2, 3, \dots$ according to $E_n \approx E_1/n^2$. The idea is that the electron can reside in an upper state ($n > 1$), but only rather precariously, and when it teeters and falls back to the ground state ($n = 1$), a photon is emitted.

But the Hydrogen atom has only two constituent particles, the electron and the proton, and thus very few classical degrees of freedom. That fact makes it difficult to imagine an infinite multiplicity of different ‘states’ that a Hydrogen atom could exhibit. We are left to ponder a mystery of mathematical QM. So it is tempting to try to develop an additional, more immediately physical, way of understanding the spectral complexity that we see. Consider the possibility that individual Hydrogen atoms may not, by themselves, actually have excited states. Instead, the term ‘excited state’ may be better applied to a system that involves several Hydrogen atoms.

Key to this idea is that charges can form entities called ‘charge clusters’. [Concerning charge clusters at the macro scale of laboratory experiments and field observations: see, for example, Beckmann (1990), Aspden (1990), Piestrup and Puthoff (1998).]

Evidence concerning the probable existence of charge clusters at the micro scale of atoms is plainly visible in the data on $IP$’s (Fig. 2): some electron counts are very stable and hard to break apart (e.g. noble gases), while some electron counts are very unstable and hard to keep together (e.g. alkali metals). Why would electron counts matter so much if the electrons were not in deep relationships with each other?

But how can electrons outwit electrostatic repulsion? Once given the clue that they evidently can do this, it becomes possible to imagine how they might do it. The key is that electrostatic repulsion dominates in a static situation. In a dynamic situation, electrons may move at speeds exceeding light speed. (Remember, Sect. 3 cast doubt on the founding postulate of SRT, and SRT is all there is to forbid superluminal speeds.) If so, a repulsion signal from one electron may reach another electron only by the time the first electron has moved so much that the repulsion from its ‘then’ position has become the attraction to its ‘now’ position. In fact, multiple electrons can form circulating ring structures that are quite stable (for details, see Whitney 2012).

So consider the possibility that an excited state of Hydrogen is actually $n_H$ neutral Hydrogen atoms, with the $n_H$ electrons in a negative charge cluster, and the $n_H$ protons in a positive charge cluster, making a kind of ‘super’ Hydrogen atom; i.e., Hydrogen with every factor of electron mass $m_e$, proton mass $m_p$ and charge $e$ scaled by $n_H$. The torquing power $P_T = (e^4/m_p c)/(r_e + r_p)^3$ then scales by $(n_H)^4/n_H = (n_H)^3$, and the radiation power $P_R = 2^5 e^6/3c^3 (m_e)^2 (r_e + r_p)^4$ scales by $(n_H)^6/(n_H)^2 = (n_H)^4$. The solution for system radius $r_e + r_p = 32\, m_p e^2/3 (m_e)^2 c^2$ then becomes:

$$
r_{n_H e} + r_{n_H p} = 32\, (n_H m_p)(n_H e)^2 / 3 (n_H m_e)^2 c^2 = n_H (r_e + r_p) \tag{39}
$$

i.e., the system radius scales with $n_H$. The system orbital energy $E_1 = -\frac{1}{2} \frac{e^2}{(r_e + r_p)} = -\frac{1}{2}\, e^2 \times 3 (m_e)^2 c^2 / 32\, m_p e^2$ becomes

$$
E_{n_H} = -\frac{1}{2} \frac{(n_H)^2}{n_H} \frac{e^2}{(r_e + r_p)} = -\frac{1}{2}\, n_H \left[ e^2 \times 3 (m_e)^2 c^2 / 32\, m_p e^2 \right] \tag{40}
$$

i.e., the system orbital energy also scales with $n_H$. This result is the same as if the atoms were isolated, instead of being organized into a big system with two charge clusters. This suggests that the energy available for generating photons by de-excitation isn’t ‘orbital’ at all, but is instead the energy tied up in forming the charge clusters out of the multiple electrons and the multiple protons from the multiple Hydrogen atoms.
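The linear scaling of both radius and orbital energy can be confirmed numerically. The sketch below assumes standard CGS constants (not quoted in the text), works with the magnitude of the orbital energy so that no sign convention is needed, and uses hypothetical function names.

```python
# Assumed standard CGS constants (not taken from the text)
E_CHARGE = 4.80320e-10  # esu
M_E = 9.10938e-28       # g
M_P = 1.67262e-24       # g
C = 2.99792e10          # cm/s

def system_radius(e, m_e, m_p, c):
    """Balance radius 32 m_p e^2 / (3 m_e^2 c^2)."""
    return 32.0 * m_p * e**2 / (3.0 * m_e**2 * c**2)

def orbital_energy_magnitude(e, m_e, m_p, c):
    """Magnitude of the orbital energy, e^2 / (2 * radius)."""
    return 0.5 * e**2 / system_radius(e, m_e, m_p, c)

def scaled_ratios(n_h):
    """Ratios (radius, energy magnitude) of the n_H cluster system to single Hydrogen."""
    r1 = system_radius(E_CHARGE, M_E, M_P, C)
    e1 = orbital_energy_magnitude(E_CHARGE, M_E, M_P, C)
    rn = system_radius(n_h * E_CHARGE, n_h * M_E, n_h * M_P, C)
    en = orbital_energy_magnitude(n_h * E_CHARGE, n_h * M_E, n_h * M_P, C)
    return rn / r1, en / e1

for n_h in (2, 8, 18, 32):
    print(n_h, scaled_ratios(n_h))
```

Each printed pair should equal the cluster count itself, up to floating-point rounding.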

What can we infer about such charge clusters? As in the modeling of $IP$’s for ions, we can again consider Fig. 2 as a source of information about electron clusters of sizes up to 118, quite apart from the particular element that the information is located with. From Fig. 2, it is clear that most of the $\Delta IP$’s are positive, meaning their electron clusters are hard to break. So despite being made of same-sign charges, most of them exist in negative energy states. The ones that are particularly hard to break are the ones associated with the noble gases: $Z = 2, 10, 18, 36, 54, 86, (118)$. These elements are at the ends of periods on the periodic table, and the lengths of the periods themselves are: $2, 8, 8, 18, 18, 32, (32)$. (Parentheses mean we haven’t discovered, or created, that element yet.) The implication is that excited states of Hydrogen existing in the form of ‘super Hydrogen’ would most frequently exist with $n_H = 2, 8, 18, 32, \dots$

Can we anticipate what would happen when any such excited state de-excites? Suppose we started with $n_H = 32$. It could, for example, decompose into 18, 8, 2, 2, and 1, 1; i.e. some less excited states and a couple of ground-state atoms; 6 daughter systems in all. Suppose that for every such daughter produced, there is a photon released. Exactly how might that work? Observe that four daughters are in states that are even more negative than the starting state, so those are no problem. But two daughters are in the ground state, which is not more negative than the starting state. So energy from the other daughters has to be enlisted to create any photons there.

For any n H , there may be a de-excitation path, or paths, for which the energy budget is insufficient, in which case those paths won’t be taken. There may also be de-excitation paths for which the energy budget is more than sufficient, in which case there will be, not only spectral radiation, but also a bit of heat radiation. Very rarely, there might be a de-excitation path for which the energy budget is just exactly right.

The spectral lines that occur with Hydrogen (or any element) are typically characterized in part by differences in inverse square integers. The integers involved are traditionally understood in terms of the familiar radial quantum number $n$. Is it possible to understand them also in terms of the $n_H$ used here?

Recall that if one chooses to model the behavior of Hydrogen ‘excitation’ in terms of a single Hydrogen atom with discrete radial states identified with the radial quantum number $n$, then the orbit-radius scaling has to be the quadratic scaling $r_1 \to r_n = n^2 r_1$ of standard QM, not the linear orbit-radius scaling $r_e + r_p \to r_{n_H} = n_H (r_e + r_p)$ of the present model. So why does one way of looking at the problem involve a quadratic $n^2$, while the other way of looking at it involves a linear $n_H$?

Recall that there was good reason to suggest highest probability for the values $n_H = 2, 8, 18, 32, \dots$ corresponding to the lengths of the rows in the Periodic Table. These row lengths can be characterized as $2N^2$ for $N = 1, 2, 2, 3, 3, 4, (4)$. So $n_H$ actually does encode something that is quadratic, namely the $N^2$, and is therefore similar to the quadratic $n^2$.

5.5. Beyond both hydrogen and ground states: Spectroscopy

In spectroscopy, we observe light created when an atomic system relaxes in some way. For elements beyond Hydrogen, the spectral lines that occur are often characterized in part by the so-called Rydberg factor:

$$
R = \frac{2\pi^2 m_e e^4}{c h^3} \frac{Z^2}{1 + m_e/M} \tag{41}
$$

The factor $R$ is traditionally interpreted as the energy needed for total removal of one electron from the ground state to infinity, leaving an ion. The energy needed for an electron to get from a state labeled $n_1$ to a higher state labeled $n_2 > n_1$, and conversely the energy released when it goes back to $n_1$, is then modeled as

$$
\Delta E = R\, [1/(n_1)^2 - 1/(n_2)^2] \tag{42}
$$

Observe that $R$ contains a factor of $Z^2$, just like the $IP$’s for total ionization, $IP_{Z,Z}$ of Eq. (25), do. That means $R$ is referring to the absolutely largest photon energy that the system could ever possibly be imagined to deliver: starting from a state of total ionization, i.e. a naked nucleus, and having the entire electron population return in one fell swoop, with the emission of just one photon for the whole job. That scenario could never actually happen. One-at-a-time electron return is the only plausible return scenario. The inverse square integers in the square bracket bring $\Delta E$ down to values appropriate for one-at-a-time scenarios.

Observe that the Rydberg model for spectral lines already conflicts with an older model for the atom developed from the PT; i.e., electron ‘shells’ enclosing the nucleus, inner shells filled, and at most one outer shell unfilled; partially filled for most elements, and completely filled for noble gases. The older PT-based model suggests shielding of the nucleus by the filled inner shells of electrons. But the occurrence of a $Z^2$ in $R$, even for large $n_1$ and $n_2$ in $\Delta E$, means there is no shielding of the nucleus. So electrons must be in tight clusters, rather than nucleus-enclosing shells.

Observe that $R$ does not contain any $Z/M$ factor like the $IP$ model contains. Instead, it has a factor

$$
1/(1 + m_e/M) = M/(M + m_e) \tag{43}
$$

which is essentially unity. At the time when $R$ was formulated, most of the known trans-Hydrogenic elements had $M \approx 2Z$, and the factor of $Z/M \approx 1/2$ could be absorbed into an external constant factor, the $2\pi^2 m_e e^4 / c h^3$ in $R$. That is no longer the case today. We know about heavy elements for which $M \approx 2.5\,Z$, or $Z/M \approx 2/5$. So now it would be better to use the function $Z/M$ instead of the number $1/2$. An extra bonus would be that Hydrogen, with $Z/M = 1$, would be included.
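To see how different the two mass-dependent factors are in practice, one can tabulate them for a light, a typical, and a heavy element. This is an illustrative sketch: the electron mass in atomic mass units and the Hydrogen and Uranium masses are assumed standard values not given in the text, and the function names are hypothetical.

```python
M_E_U = 5.48580e-4  # electron mass in atomic mass units (assumed standard value)

# name: (Z, M in atomic mass units); the H and U masses are assumed values
ELEMENTS = {
    "H": (1, 1.008),
    "He": (2, 4.003),
    "U": (92, 238.03),
}

def rydberg_mass_factor(mass_u):
    """The factor M / (M + m_e) that appears in the Rydberg factor R."""
    return mass_u / (mass_u + M_E_U)

def charge_to_mass(z, mass_u):
    """The Z / M factor that appears in the IP model."""
    return z / mass_u

table = {
    name: (rydberg_mass_factor(m), charge_to_mass(z, m))
    for name, (z, m) in ELEMENTS.items()
}
for name, (mass_factor, z_over_m) in table.items():
    print(f"{name:>2}  M/(M+m_e) = {mass_factor:.6f}   Z/M = {z_over_m:.4f}")
```

The first column stays within a fraction of a percent of unity for every entry, while the second ranges from nearly 1 for Hydrogen down to about 0.4 for the heavy element.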

Now consider that spectral lines might not arise from de-ionizing one ion of one atom, but rather from de-exciting a system involving multiple neutral atoms. In this description, the $n_1$ and $n_2$ are not identifiers of different states of one atom, but rather numbers of atoms organized into super atoms. Otherwise, nothing really changes. However we interpret their meaning, the predicted spectral lines remain the same.


6. Unification between Newton and Maxwell

This last technical Section of this Chapter returns to the first physics disunity mentioned in the Introduction: the seemingly different coordinate-transformation properties of Newton’s Laws for mechanics and Maxwell’s equations for electrodynamics. Newton’s laws are form invariant under Galilean transformations. But Maxwell’s equations are generally thought to be form invariant only under Lorentz transformations. Especially, they are thought to be not form invariant under Galilean transformations.

So a curious situation exists within physics today. It is generally expected that the equations of physics should be tensor equations. By definition of the word ‘tensor’, a tensor equation is form invariant under arbitrary changes of reference frame, assuming no singularities or other cruel and unusual circumstances in the transformation or its inverse transformation. That means a tensor equation should be form invariant under arbitrary, though reasonably well-behaved, space-time transformations.

So, are Maxwell’s equations really tensor equations? Or not? Mathematicians have good reason to challenge the believed tensor status of Maxwell’s equations, while physicists have good reason to challenge the believed requirement for invariance under anything other than Lorentz transformation. But the situation is not generally acknowledged. It is the proverbial ‘elephant in the living room’.

Clarifying this situation can assist physics in becoming more unified from its beginning to its present. And mathematics has lots of applicable tools; see Kiein (2009). The present work offers an approach that is also mathematical, but a lot more elementary. Maybe it will communicate to different readers.

The problem, I believe, is of a type with which QM has some history. QM appears to be the first branch of physics that well and truly needed complex numbers. They may have been used in physics before QM, but they were only one of the tools available for the problems then at hand. Sines and cosines could generally handle any problem just as well as complex exponentials could handle it. But with QM, complex exponentials became truly essential for doing physics.

The history of mathematics has been a tale of increasing range of objects included in the discussion. It began with real, positive integers; it grew with the inclusion of zero and negative integers, and grew again with the inclusion of all rational numbers, and again with the inclusion of all irrational numbers. Then it grew with the inclusion of imaginary numbers, thus creating complex numbers. This was the first of a number of ‘doublings’ of the number of dimensions attributed to mathematical objects. [See Rowlands (2007).] After complex numbers, we got quaternions, and bi-quaternions, or octonions, and there is no reason to suppose that further doublings will not continue to prove useful.

Complex numbers make possible operations that are not possible without them. Consider, for example, the square root of $-3$. It cannot be evaluated within the real number system, but in the complex number system, it is just $\pm i\sqrt{3}$.

I believe ‘doublings’ are generally like this: they make possible operations that were not possible without them. There appears to be today an opportunity for a doubling in the realm of tensor calculus. There are presently exactly two tensor-transformation behaviors identified, called ‘contravariant’ and ‘covariant’. It appears that tensor calculus can be usefully extended through a doubling of the number of transformation behaviors that can be described, from two to four. It appears that such a doubling can resolve the apparent conflict between Newtonian and Maxwellian physics: it can make possible a display showing how Maxwell’s equations can actually be form invariant under arbitrary coordinate transformations.

6.1. The opportunity offered by tensor notation

The display of four transformation behaviors requires the use of four tensor index positions. So in addition to the usual contravariant (index up-right) and covariant (index down-right) positions on the right side of a tensor symbol, we need to use the positions available on the left side: index up-left and index down-left. Since left-side index positions have not been used in this new way before, they need new names designed for the purpose. To recall the move from right to left, let us use the prefix ‘trans’. So let the up-left index position be called ‘transcontravariant’, and let the down-left index position be called ‘transcovariant’.

All the transformations describe what happens to tensor merates when the frame of reference changes; i.e. when the basis unit vectors defining the frame of reference are replaced with other basis unit vectors. The transformations discussed here are arbitrary within the specifications that make the connections between reference frames reasonably well behaved: the individual relationships are differentiable and reversible, and the matrix representations of them are invertible and unimodular.

I mention both tensors and matrices because they are equivalent notation schemes that can be used interchangeably for describing systems of linear equations. Tensor notation is useful for making a compact statement of a whole mathematical situation. Matrix notation is useful for separating a whole mathematical situation into constituent parts for calculations. Individual linear equations are useful for focusing on individual parts of the mathematical problem. Human beings do have strong personal preferences about which approach to use, but all of these approaches should agree on the basic facts of a given situation, so any of these approaches should be acceptable. In the present work, all approaches will be used. That way, everyone can find something to like, and everyone can find something to dislike!

In the case of the matrix displays and the linear equations, the presentation does save a little space by ignoring two spatial dimensions and focusing on one spatial dimension (call it 1) and the temporal dimension (call it 0).

6.2. Transformation of a contravariant object

The most familiar transformation is the contravariant one. The prefix ‘contra’ means these tensor merates change opposite to the way the basis unit vectors of the reference frame change. For an arbitrary input vector $X^\alpha$, the transformation reads $\bar{X}^\beta = [\partial\bar{x}^\beta/\partial x^\alpha]\,X^\alpha$, where we see the transformation as partial derivatives of coordinates, new with respect to old. Equivalently $\bar{X}^\beta = T_\alpha^\beta X^\alpha$, where we see the transformation written as the tensor $T_\alpha^\beta$. Also equivalently, we have $\begin{bmatrix}\bar{X}^0\\ \bar{X}^1\end{bmatrix} = \begin{bmatrix}T_0^0 & T_1^0\\ T_0^1 & T_1^1\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix}$, where we see everything, the input and output vectors and the transformation, in matrix format. Or equivalently, we have $\bar{X}^0 = T_0^0 X^0 + T_1^0 X^1$ and $\bar{X}^1 = T_0^1 X^0 + T_1^1 X^1$ as two separate linear equations.

For the contravariant transformation matrix $\begin{bmatrix}T_0^0 & T_1^0\\ T_0^1 & T_1^1\end{bmatrix}$ one can define a reverse transformation matrix $\begin{bmatrix}R_0^0 & R_1^0\\ R_0^1 & R_1^1\end{bmatrix}$, wherein $R_0^0 = T_0^0$ and $R_1^1 = T_1^1$, but $R_1^0 = -T_1^0$ and $R_0^1 = -T_0^1$. Applied to $\bar{X}^\beta$, the reverse transformation $R_\beta^\alpha$ takes $\bar{X}^\beta$ back to $X^\alpha$: $X^\alpha = R_\beta^\alpha \bar{X}^\beta$. That is to say: $X^0 = R_0^0 \bar{X}^0 + R_1^0 \bar{X}^1$ and $X^1 = R_0^1 \bar{X}^0 + R_1^1 \bar{X}^1$. Expressed in matrix form, the reverse transformation is the inverse transformation: $\begin{bmatrix}X^0\\ X^1\end{bmatrix} = \begin{bmatrix}R_0^0 & R_1^0\\ R_0^1 & R_1^1\end{bmatrix}\begin{bmatrix}\bar{X}^0\\ \bar{X}^1\end{bmatrix}$, or $\begin{bmatrix}R_0^0 & R_1^0\\ R_0^1 & R_1^1\end{bmatrix}\begin{bmatrix}T_0^0 & T_1^0\\ T_0^1 & T_1^1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}$. The distinction between the words reverse and inverse is nil in the contravariant context. But it becomes important in the next context.

6.3. Transformation of a covariant object

The prefix ‘contra’ means reverse to the prefix ‘co’. The covariant transformation goes the same way the basis unit vectors change. So the covariant transformation $\bar{X}_\beta = C_\beta^\alpha X_\alpha$ in matrix format $\begin{bmatrix}\bar{X}_0\\ \bar{X}_1\end{bmatrix} = \begin{bmatrix}C_0^0 & C_0^1\\ C_1^0 & C_1^1\end{bmatrix}\begin{bmatrix}X_0\\ X_1\end{bmatrix}$ uses transformation matrix $C$ equal to the reverse contravariant transformation matrix $R$: $\begin{bmatrix}\bar{X}_0\\ \bar{X}_1\end{bmatrix} = \begin{bmatrix}R_0^0 & R_1^0\\ R_0^1 & R_1^1\end{bmatrix}\begin{bmatrix}X_0\\ X_1\end{bmatrix}$, or equivalently $\bar{X}_0 = R_0^0 X_0 + R_1^0 X_1$ and $\bar{X}_1 = R_0^1 X_0 + R_1^1 X_1$. It is generally assumed that this is the same as saying the covariant transformation is the inverse to the contravariant transformation. Notice however that the off-diagonal merates $C_0^1 = R_1^0$ and $C_1^0 = R_0^1$ have indices switched around. This is because $C$ operates on a covariant object, whereas, in its original definition, $R$ operated on a contravariant object.

The index switching makes no difference if we limit attention to transformations that are space-time symmetric, i.e. Lorentz transformations. But if we wish to investigate any other type of transformation, we have to investigate whether the switch makes a difference. Consider the inner product $\bar{X}_\beta \bar{X}^\beta$. Under Lorentz transformation, it is preserved, equal to $\bar{X}_\beta \bar{X}^\beta = X_\alpha X^\alpha$. But if we do not have space-time symmetry, is it still preserved? This question has to be answered by testing.

Laying out the problem in matrix format, we have to make one of the vectors, say the covariant one, a row vector, and then we have to test:

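$$\begin{bmatrix}\bar{X}_0 & \bar{X}_1\end{bmatrix}\begin{bmatrix}\bar{X}^0\\ \bar{X}^1\end{bmatrix} = \begin{bmatrix}X_0 & X_1\end{bmatrix}\begin{bmatrix}R_0^0 & R_0^1\\ R_1^0 & R_1^1\end{bmatrix}\begin{bmatrix}T_0^0 & T_1^0\\ T_0^1 & T_1^1\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix} \overset{?}{=} \begin{bmatrix}X_0 & X_1\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix} \tag{44}$$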

Observe that the $R$ matrix is transposed from what it would need to be to make the $RT$ matrix product collapse to the identity. So the inner product $X_\alpha X^\alpha$ is generally not preserved if we do not have space-time symmetry.
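The test posed in Eq. (44) can also be tried numerically. The following Python sketch is only an illustration: it assumes the two-parameter family of unimodular matrices used later in this Section (off-diagonal merates built from arbitrary numbers A and B, with a common factor 1/sqrt(1 - AB)), and all function names are hypothetical helpers, not part of the original work.

```python
import math

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(P):
    """Transpose of a 2x2 matrix."""
    return [[P[j][i] for j in range(2)] for i in range(2)]

def contravariant(A, B):
    """Assumed form of T: [[1, -B], [-A, 1]] / sqrt(1 - A*B), so det T = 1."""
    s = 1.0 / math.sqrt(1.0 - A * B)
    return [[s, -s * B], [-s * A, s]]

def reverse(T):
    """Reverse matrix R: diagonal kept, off-diagonal merates sign-flipped."""
    return [[T[0][0], -T[0][1]], [-T[1][0], T[1][1]]]

def is_identity(M, tol=1e-12):
    """True if M equals the 2x2 identity to within tol."""
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

def eq44_kernel(A, B):
    """Matrix sitting between the row [X_0 X_1] and the column [X^0; X^1] in Eq. (44)."""
    T = contravariant(A, B)
    return matmul(transpose(reverse(T)), T)

for label, A, B in [("B = A (space-time symmetric)", 0.3, 0.3),
                    ("B = 0 (universal time)", 0.3, 0.0),
                    ("general A, B", 0.3, 0.1)]:
    T = contravariant(A, B)
    print(label,
          "| R T is identity:", is_identity(matmul(reverse(T), T)),
          "| Eq. (44) kernel is identity:", is_identity(eq44_kernel(A, B)))
```

In this sketch the untransposed product of R and T is the identity for every choice of A and B, while the transposed arrangement that actually appears in Eq. (44) reduces to the identity only when B = A.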

6.4. Transformations for objects of four types

In order to recover the general availability of preserved inner products, the two additional transformation behaviors are defined. The transcovariant transformation is defined as the transposed inverse of the contravariant one. The transcontravariant transformation is defined as the transposed inverse of the covariant one.

Recall that this discussion began with the contravariant transformation written in the tensor notation $\bar{X}^\beta = [\partial\bar{x}^\beta/\partial x^\alpha]\,X^\alpha$. The discussion soon became complicated enough to merit introduction of more detailed notation that can clearly distinguish the four cases. The following Table illustrates the expanded tensor notation:


The Table is organized for user convenience, with the position of information corresponding to the index position: upper right for contravariant, lower right for covariant, lower left for transcovariant, and upper left for transcontravariant. The index position assigned to an object determines the transformation law that it follows.

Now let two arbitrary numbers with magnitude less than unity be represented by the letters A and B (chosen from the word ‘arbitrary’!). Let the arbitrary numbers represent in turn the off-diagonal elements of transformation matrices. The following table shows the corresponding matrix notation:


Observe that this Table uses negative signs on the arbitrary A and B in the contravariant and transcontravariant cases, positive signs in the covariant and transcovariant cases. This sign choice is used to help recall the prefixes ‘contra’ and ‘co’. Observe too that if B = A , we have space-time symmetry, which is the case of Lorentz transformations. And observe finally that if B = 0 , we have universal time, which is the case of Galilean transformations. But A and B are arbitrary, and so can also represent other transformations as yet unnamed.
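As an aid to reading the sign conventions just described, here is a small Python sketch of the four matrices. It is a reconstruction under stated assumptions, not a copy of the Table: the matrices are written in the form that acts on column vectors, the common factor 1/sqrt(1 - AB) is taken from the unimodularity requirement, and the function names are hypothetical.

```python
import math

def four_matrices(A, B):
    """Assumed column-vector forms of the four transformation matrices."""
    s = 1.0 / math.sqrt(1.0 - A * B)  # common factor making each determinant 1
    return {
        "contravariant":      [[s, -s * B], [-s * A, s]],
        "covariant":          [[s,  s * B], [ s * A, s]],  # inverse of contravariant
        "transcovariant":     [[s,  s * A], [ s * B, s]],  # transposed inverse of contravariant
        "transcontravariant": [[s, -s * A], [-s * B, s]],  # transposed inverse of covariant
    }

def det(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def is_symmetric(M, tol=1e-12):
    """True if the two off-diagonal merates agree to within tol."""
    return abs(M[0][1] - M[1][0]) < tol

lorentz = four_matrices(0.6, 0.6)   # B = A: space-time symmetry
galilean = four_matrices(0.6, 0.0)  # B = 0: universal time
general = four_matrices(0.6, 0.2)   # neither special case

for name, M in general.items():
    print(name, M, "det =", round(det(M), 12))
print("Lorentz case symmetric:", is_symmetric(lorentz["contravariant"]))
print("Galilean case, time row:", galilean["contravariant"][0])
```

With B = 0 the top (temporal) row of the contravariant matrix has a unit diagonal merate and a vanishing off-diagonal merate, so the time merate is left unchanged, which is the sense of ‘universal time’ used above.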

6.5. Transformations for invariant objects

The underlying purpose of tensor calculus is to focus on mathematical objects that are ‘coordinate free’, or ‘frame independent’, or ‘invariant’ (whether in form or in numerical value). All these expressions mean that coordinate transformation does not change anything fundamental about an object so described: values of scalars, or relationships expressed as equations involving tensors.

The user of tensor calculus expects certain behaviors. There should be number invariant inner products of vectors and of higher-order tensors. The ‘unity’, or ‘Kronecker delta’ is not presently regarded as a real tensor, but can be accepted as one if it can be demonstrated number invariant. Finally, the user will certainly expect a number invariant ‘metric tensor’, the essential tool for manipulating index positions to develop tensor equations. Displaying that all these expectations can be met in the case of arbitrary transformations, not just Lorentz transformations, is the objective of this Sub-Section.

The matrix notation is useful in checking out the transformation of all these entities. For example, the preserved inner product of a vector X with itself looks like (note the transpositions for operating on row vectors):

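$${}_\beta\bar{X}\,\bar{X}^\beta = \begin{bmatrix}{}_0\bar{X} & {}_1\bar{X}\end{bmatrix}\begin{bmatrix}\bar{X}^0\\ \bar{X}^1\end{bmatrix} = \begin{bmatrix}{}_0X & {}_1X\end{bmatrix}\frac{1}{1-AB}\begin{bmatrix}1 & +B\\ +A & 1\end{bmatrix}\begin{bmatrix}1 & -B\\ -A & 1\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix} = \begin{bmatrix}{}_0X & {}_1X\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix} = {}_\alpha X\,X^\alpha \tag{45}$$

$${}^\beta\bar{X}\,\bar{X}_\beta = \begin{bmatrix}{}^0\bar{X} & {}^1\bar{X}\end{bmatrix}\begin{bmatrix}\bar{X}_0\\ \bar{X}_1\end{bmatrix} = \begin{bmatrix}{}^0X & {}^1X\end{bmatrix}\frac{1}{1-AB}\begin{bmatrix}1 & -B\\ -A & 1\end{bmatrix}\begin{bmatrix}1 & +B\\ +A & 1\end{bmatrix}\begin{bmatrix}X_0\\ X_1\end{bmatrix} = \begin{bmatrix}{}^0X & {}^1X\end{bmatrix}\begin{bmatrix}X_0\\ X_1\end{bmatrix} = {}^\alpha X\,X_\alpha \tag{46}$$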

The more familiar inner product $X_\alpha X^\alpha$ is preserved with Lorentz transformations, but not with arbitrary transformations. So it shouldn’t be considered any kind of ‘invariant’. The same is true of the unfamiliar $({}_\alpha X)\,({}^\alpha X)$.
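These statements about inner products can be checked with a few lines of Python. The sketch below is illustrative only; it assumes the column-vector forms of the four matrices implied by the sign conventions of Sub-Section 6.4, treats a row-vector transformation as the transpose of the corresponding column-vector one, and uses hypothetical helper names and arbitrary test merates.

```python
import math

def matrices(A, B):
    """Assumed column-vector forms of the four transformation matrices."""
    s = 1.0 / math.sqrt(1.0 - A * B)
    return {
        "contravariant":      [[s, -s * B], [-s * A, s]],
        "covariant":          [[s,  s * B], [ s * A, s]],
        "transcovariant":     [[s,  s * A], [ s * B, s]],
        "transcontravariant": [[s, -s * A], [-s * B, s]],
    }

def apply(M, v):
    """Apply a 2x2 matrix to a 2-component vector of merates."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    """Sum over the repeated index of two 2-component objects."""
    return u[0] * v[0] + u[1] * v[1]

def product_change(kind_u, kind_v, u, v, A, B):
    """Transformed inner product minus original inner product."""
    M = matrices(A, B)
    return dot(apply(M[kind_u], u), apply(M[kind_v], v)) - dot(u, v)

u = [1.0, 2.0]   # arbitrary test merates
v = [3.0, -1.0]
pairs = [("transcovariant", "contravariant"),       # pairing of Eq. (45)
         ("transcontravariant", "covariant"),       # pairing of Eq. (46)
         ("covariant", "contravariant"),            # the familiar pairing
         ("transcovariant", "transcontravariant")]  # the unfamiliar pairing
for kind_u, kind_v in pairs:
    print(kind_u, "with", kind_v,
          "| change for A=0.3, B=0.1:", round(product_change(kind_u, kind_v, u, v, 0.3, 0.1), 12),
          "| change for A=B=0.3:", round(product_change(kind_u, kind_v, u, v, 0.3, 0.3), 12))
```

In this sketch the regular-with-trans pairings leave the product unchanged for any A and B, while the other two pairings leave it unchanged only when B = A.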

With the extended tensor notation, we can identify the index positions that definitely make a number invariant Kronecker delta. It looks like (note the transpositions for operating on row vectors):

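$${}_\delta\bar{\delta}^\gamma = \frac{1}{1-AB}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = {}_\beta\delta^\alpha \tag{47}$$

$${}^\delta\bar{\delta}_\gamma = \frac{1}{1-AB}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = {}^\beta\delta_\alpha \tag{48}$$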

The more familiar $\delta_\beta^\alpha$ is preserved with Lorentz transformations, but not with arbitrary transformations. That is why it does not qualify as a tensor. The same is true of the unfamiliar ${}_\beta^\alpha\delta$.

Some readers will be surprised to see the present argument using the Lorentz metric, $\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}$, without accepting a limitation to Lorentz transformations. It is widely supposed that the Lorentz metric requires Lorentz transformations, and/or Lorentz transformations require the Lorentz metric. But such a connection is not in fact mandatory.

The generally preserved forms of the Lorentz metric tensor look like (note the transpositions for operating on row vectors):

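$${}^\delta\bar{g}^\gamma = \frac{1}{1-AB}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix} = \frac{1}{1-AB}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix}\begin{bmatrix}1 & -A\\ +B & -1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix} = {}^\beta g^\alpha \tag{49}$$

$${}_\delta\bar{g}_\gamma = \frac{1}{1-AB}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix} = \frac{1}{1-AB}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix}\begin{bmatrix}1 & +A\\ -B & -1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix} = {}_\beta g_\alpha \tag{50}$$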

The more familiar $g^{\beta\alpha}$ and $g_{\beta\alpha}$ are preserved with Lorentz transformations, but not with arbitrary transformations. They shouldn’t be considered any kind of ‘invariant’. The same is true of the unfamiliar ${}^{\beta\alpha}g$ and ${}_{\beta\alpha}g$.
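The number invariance claimed for the Kronecker delta and the Lorentz metric can be explored in the same numerical spirit. The Python sketch below is illustrative only. It assumes that a two-index object transforms as L times the object times the transpose of R, where L and R are the column-vector matrices belonging to the first and second index, and it assumes the four matrix forms implied by the sign conventions of Sub-Section 6.4. All names are hypothetical.

```python
import math

def matrices(A, B):
    """Assumed column-vector forms of the four transformation matrices."""
    s = 1.0 / math.sqrt(1.0 - A * B)
    return {
        "contravariant":      [[s, -s * B], [-s * A, s]],
        "covariant":          [[s,  s * B], [ s * A, s]],
        "transcovariant":     [[s,  s * A], [ s * B, s]],
        "transcontravariant": [[s, -s * A], [-s * B, s]],
    }

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(P):
    """Transpose of a 2x2 matrix."""
    return [[P[j][i] for j in range(2)] for i in range(2)]

def transform(kind_first, kind_second, obj, A, B):
    """Transform a two-index object: L * obj * R^T, one matrix per index."""
    M = matrices(A, B)
    return matmul(matmul(M[kind_first], obj), transpose(M[kind_second]))

def close(P, Q, tol=1e-12):
    """True if two 2x2 matrices agree merate by merate to within tol."""
    return all(abs(P[i][j] - Q[i][j]) < tol for i in range(2) for j in range(2))

DELTA = [[1.0, 0.0], [0.0, 1.0]]    # Kronecker delta
METRIC = [[1.0, 0.0], [0.0, -1.0]]  # Lorentz metric, one space and one time dimension

checks = [("delta, transco/contra, Eq. (47)", "transcovariant", "contravariant", DELTA),
          ("delta, transcontra/co, Eq. (48)", "transcontravariant", "covariant", DELTA),
          ("delta, contra/co (familiar)", "contravariant", "covariant", DELTA),
          ("g, transcontra/contra, Eq. (49)", "transcontravariant", "contravariant", METRIC),
          ("g, transco/co, Eq. (50)", "transcovariant", "covariant", METRIC),
          ("g, contra/contra (familiar)", "contravariant", "contravariant", METRIC)]
for label, k1, k2, obj in checks:
    print(label,
          "| preserved for A=0.3, B=0.1:", close(transform(k1, k2, obj, 0.3, 0.1), obj),
          "| preserved for A=B=0.3:", close(transform(k1, k2, obj, 0.3, 0.3), obj))
```

Under these assumptions the index assignments of Eqs. (47) to (50) are preserved for arbitrary A and B, whereas the familiar assignments are preserved only in the space-time symmetric case B = A.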

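The number invariant ${}^\alpha g^\beta$ and ${}_\alpha g_\beta$ can function to raise and lower indices on objects. For example, $X^\beta = ({}^\alpha g^\beta)\,X_\alpha$ and $X_\beta = ({}_\alpha g_\beta)\,X^\alpha$, or ${}^\alpha X = ({}^\alpha g^\beta)\,({}_\beta X)$ and ${}_\alpha X = ({}_\alpha g_\beta)\,({}^\beta X)$.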

One can also write additional index assignments for g . Altogether, there are 10 possible assignments, as there are 4 × 3 / 2 = 6 with indices in different corners, and 4 with indices in the same corner.

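Two of the additional index assignments look like ${}_\beta g^\alpha$ and ${}^\beta g_\alpha$. These two entities cannot do anything to an index except change its name. For example, $X^\alpha = X^\beta\,({}_\beta g^\alpha)$ and $X_\alpha = X_\beta\,({}^\beta g_\alpha)$, or ${}_\beta X = ({}_\beta g^\alpha)\,({}_\alpha X)$ and ${}^\beta X = ({}^\beta g_\alpha)\,({}^\alpha X)$. So ${}_\beta g^\alpha$ and ${}^\beta g_\alpha$ are just equivalent to the number invariant ${}_\beta\delta^\alpha$ and ${}^\beta\delta_\alpha$ already noted above.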

Further additional index assignments on g create entities that can serve to convert a regular index into a trans one, or a trans one into a regular one. None of these entities are number invariant, but in practice, that does not matter. The user does not convert just a single object; the user converts a whole tensor equation. The index-converting g entities typically occur in pairs, and the pairs contract to number invariant objects. When they don’t occur in pairs, they do occur on both sides of an equation, and cannot affect the issue of equation form invariance.

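Another two of these $g$’s are $g_\beta^\alpha$ and ${}_\alpha^\beta g$. They function to do ${}^\beta X = ({}_\alpha^\beta g)\,X^\alpha$ and ${}_\alpha X = ({}_\alpha^\beta g)\,X_\beta$, or $X_\beta = g_\beta^\alpha\,({}_\alpha X)$ and $X^\alpha = g_\beta^\alpha\,({}^\beta X)$. As a pair, they contract to $({}_\beta^\gamma g)\,(g_\gamma^\alpha) = {}_\beta g^\alpha$, or to $({}_\gamma^\beta g)\,(g_\alpha^\gamma) = {}^\beta g_\alpha$, both of which are number invariant.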

The final four indexed $g$’s are $g^{\alpha\beta}$, $g_{\alpha\beta}$, ${}^{\alpha\beta}g$, and ${}_{\alpha\beta}g$. They can all function to change a regular index into a trans one, or a trans one into a regular one, but with a twist: ‘co’ goes to ‘contra’, or ‘contra’ goes to ‘co’. That is, ${}^\alpha X = ({}^{\alpha\beta}g)\,X_\beta$, ${}_\alpha X = ({}_{\alpha\beta}g)\,X^\beta$, $g^{\alpha\beta}\,({}_\alpha X) = X^\beta$ and $g_{\alpha\beta}\,({}^\alpha X) = X_\beta$. As noted above, the contractions $({}_{\beta\gamma}g)\,(g^{\gamma\alpha}) = {}_\beta g^\alpha$ and $({}^{\beta\gamma}g)\,(g_{\gamma\alpha}) = {}^\beta g_\alpha$ are number invariant.

The bottom line is this: to be sure of invariance under arbitrary transformation, not just Lorentz transformation, always contract a regular index with a trans index.

6.6. General invariance for Maxwell’s equations

Maxwell’s equations in current tensor notation read:

$$\partial_\alpha F^{\alpha\beta} = \frac{4\pi}{c}\,J^\beta \tag{51}$$

$$\partial_\alpha D^{\alpha\beta} = 0^\beta \tag{52}$$

The two-index $F^{\alpha\beta}$ and $D^{\alpha\beta}$ tensors refer to the electromagnetic field and the ‘dual’ thereof. The electromagnetic field tensor $F^{\alpha\beta}$ has merates that are components of the three-dimensional electric and magnetic field vectors, $\mathbf{E}$ and $\mathbf{B}$. The $D^{\alpha\beta}$ is the dual to $F^{\alpha\beta}$, whose merates are components of $\mathbf{B}$ and $\mathbf{E}$. The one-index tensors $J^\beta$ and $\partial_\alpha$ refer to the source charge-current density vector and the differential operator vector. The indices $\alpha$ and $\beta$ take the four values $0, 1, 2, 3$.

The seeming limitation of Maxwell’s equations to invariance only under Lorentz transformation arises entirely from the differential operator being written as a covariant vector. In the extended tensor algebra, this operator is identified as transcovariant, and then Maxwell’s equations look like:

$$({}_\alpha\partial)\,F^{\alpha\beta} = \frac{4\pi}{c}\,J^\beta \tag{53}$$

$$({}_\alpha\partial)\,D^{\alpha\beta} = 0^\beta \tag{54}$$

Written this way, Maxwell’s equations are manifestly form invariant, not only under Lorentz transformation, but also under any arbitrary (just well-behaved) transformation, including Galilean transformation.


7. Conclusions

About Maxwell’s equations and photons: Photons have a life history that begins with emission as an electromagnetic pulse, proceeds with development into a waveform, then changes into regression back to a pulse, and ends with absorption by a receiver. This life history of the photon can be modeled by imagining some mirrors that apply boundary conditions corresponding to the desired scenario, feeding a Gaussian pulse at the source to Maxwell’s equations, and watching Hermite polynomials emerge and then finally pile up at the receiver.

About EM signals and photons: The life history of the photon suggests that the assumption upon which Einstein’s SRT is founded is over-simplified. If we make the founding assumption more realistic, then we get more believable results. The more believable results can help us reconcile SRT with the QM of atoms. We can understand why Planck’s constant occurs. It represents the balance between competing phenomena: on the one hand, energy loss due to radiation from accelerating charges; on the other hand, energy gain due to internal torquing within the atomic system due to finite speed of signal propagation.

About Atoms: Viewed in the right way, chemical and spectroscopic data reveal a tremendous amount of regularity. So we are well enabled to interpolate and extrapolate for situations where actual data is not available. We can analyze scenarios where electrons are subtracted from or added to an atom, all at once, or one at a time; whatever we need. But take care: in the existing literature, the distinction between ‘all-at-once’ and ‘one-at-a-time’ is often obscure.

About Maxwell and Newton: There should have been no conflict between Maxwell’s equations and Newton’s equations over the issue of transformation invariance. Maxwell’s equations are form invariant under Galilean transformations, just as they are form invariant under Lorentz transformations. Physics does not have conflicts. Only people have conflicts. And people can resolve their conflicts. The conflict perceived in the case of Newton vs. Maxwell is resolved with an extension of mathematical formalism.

About Physics in General: This work has shown that SRT deserves a moment of caution, and the reader may reasonably worry that GRT deserves some caution too. So it may be premature to develop a theory of quantum gravity. Placing the QG capstone onto the RT and QM pillars of 20th century physics may produce something that resembles the ancient constructions at Stonehenge, but not the Gothic cathedrals of Europe, much less anything modern.


8. Appendix 1. Numerical data on ionization potentials for all elements

Periods 1, 2 and 3.

Period 4.

Period 5.

Period 6.

Period 7.



The author thanks colleagues for deep and intense conversations on the topics discussed here, especially Dr. Peter Enders, Dr. Yuri Keilman, Dr. Robert Kiehn, Prof. Zbigniew Oziewicz, and Dr. Tom Phipps.


  1. Aspden H. 1990 ‘Electron Clusters’ (Correspondence), Galilean Electrodynamics 1, 81-82.
  2. Beckmann P. 1990 ‘Electron Clusters’, Galilean Electrodynamics 1, 55-58; see also vol. 1, 82.
  3. Kiehn R. M. 2009 Non-Equilibrium Systems and Irreversible Processes, 4 Adventures in Applied Topology, especially Sect. 2.2.3,
  4. Piestrup M. A., Puthoff H. E., Ebert P. J. 1998 Correlated Emissions of Electrons, Galilean Electrodynamics 9, 43-49.
  5. Putz M. V. 2009 Quantum Frontiers of Atoms and Molecules, Nova Science Publishers, New York.
  6. Rowlands P. 2007 Zero to Infinity: The Foundations of Physics, Chapter 16, The Factor 2 and Duality, World Scientific, New Jersey, London, etc.
  7. Thomas L. H. 1927 The Kinematics of an Electron with an Axis, Phil. Mag. S. 7, 3 (13), 1-22.
  8. Wheeler J. A., Feynman R. P. 1945 Interaction with the Absorber as the Mechanism of Radiation, Revs. Mod. Phys. 17 (2 & 3), 157-181.
  9. Wheeler J. A., Feynman R. P. 1949 Classical Electrodynamics in Terms of Direct Interparticle Action, Revs. Mod. Phys. 21 (3), 425-433.
  10. Whitney C. K. 2012 Algebraic Chemistry: Applications and Origins, Nova Science Publishers.


  • Note that I speak of MED, not of Classical Electrodynamics (CED) in general. CED involves, not only the works of Maxwell, but also those of a large number of other individuals. I am inclined to trust results from Maxwell, but question some of those from other authors, as reported in the present work.
  • Wheeler and Feynman were looking to time symmetry as the basis for an electromagnetic generalization of instantaneous (Newtonian) gravitational interaction. There are important differences between the regressing waveforms introduced above and the Wheeler-Feynman advanced solutions: 1) Wheeler and Feynman were looking at interactions between essentially point sources and receivers, and so had to be looking at spherically expanding retarded solutions and spherically contracting advanced solutions, not at essentially one-dimensional expanding and contracting wavelets. 2) The Wheeler-Feynman expansion or contraction is related to the spherical area of a wave front, not the waveform in the radial propagation direction. 3) A lengthy discussion of the paradox of advanced actions is necessitated in the Wheeler-Feynman work, whereas the ‘regressing’ solutions introduced here are not in fact ‘advanced’ at all; they are just regressing, in real time, in the propagation direction.
