Open access peer-reviewed chapter

Better Unification for Physics in General Through Quantum Mechanics in Particular

By Cynthia Kolb Whitney

Submitted: May 10th 2011; Reviewed: August 30th 2011; Published: February 24th 2012

DOI: 10.5772/35211


1. Introduction

Physics has always had several different domains of application in on-going development, and physicists have always striven for unification among its different domains. Unification is usually achieved through development of so-called ‘covering theories’. In the nineteenth century, the stunning example was Maxwell’s Electrodynamics (MED), which unified electricity and magnetism as one domain of theory. Another major domain of theory then present was Newton’s Mechanics (NM), which in the eighteenth century had really launched modern physics as a mathematical discipline.

At the turn of the twentieth century, NM and MED were well in place, and were fulfilling many technologically important requirements. But there seemed to be an incompatibility between them. The problem concerned their invariance with respect to choice of reference frame: NM exhibited invariance if the allowed reference frames were all connected through Galilean transformations, whereas MED exhibited invariance if the allowed reference frames were all connected through Lorentz transformations. It looked as though one of these two theories must be more nearly correct than the other, but it was not clear which one was the better one.

That problem seemed resolved with the advent of Einstein’s Special Relativity Theory (SRT). SRT was believed to capture the true meaning of MED concerning the behavior of light signals, and SRT was certainly an endorsement of Lorentz transformation, so SRT was believed to offer the one possible revision of NM that could make mechanics fully consistent with MED.

But meanwhile, new phenomena were being discovered at the micro scale of physics, and they often seemed inexplicable with any known theory, whether NM, SRT, or MED. These were phenomena suggesting quantization of light, quantized atomic states, atomic, molecular and crystal structures, radioactivity, etc.

So at almost the same time as one problem seemed to be resolved, other problems were emerging. Since the earlier situation between NM and MED had demanded that Physics allow two seemingly discordant theories to co-exist until some good argument could replace one of them, the situation then presented by the new phenomena being discovered naturally invited the development of another potentially discordant theory: Quantum Mechanics (QM).

The discovery of the photoelectric effect, and the introduction of the idea of the photon, initiated QM. Almost immediately, QM was developed to handle the Hydrogen atom, and the ground state thereof, the stability of which was thought to be impossible with MED. Accepting that apparent incompatibility with MED, and even embracing it, researchers moved on to excited states, to other atoms, then to molecules, and reactions, and to all the rest of the complexity that today makes up modern Quantum Chemistry (QC).

Also, experimenters got into sub-atomic elementary particles, especially electrons and positrons, their annihilation and creation, along with creation and annihilation of photons. All that led to Quantum Electrodynamics (QED).

So today physics still has several different bodies of theory, aimed at several different domains of application. On the one hand, we have QM for atomic and other micro-system interactions. It has at least two identifiable parts: QC for interactions at the level of atoms and molecules, and QED for interactions at the level of elementary particles. And on the other hand, we have Einstein’s relativity theory (RT) for physics at human scale and larger. It too has two parts: SRT for electromagnetic interactions, and general relativity theory (GRT) for gravitational interactions.

QM and RT are the major pillars of twentieth century physics. And they are not entirely compatible. QM features wave-like entities with seemingly instantaneous correlations between the states of even quite distant entities, whereas RT features point-like entities interacting via fields propagating at a finite speed.

So are we defeated in the quest for unification in Physics? Apparently many people hope not, as they do vigorously pursue various forms of unification. The prominent one sought today is Quantum Gravity (QG). It would be the twenty-first century capstone for the two twentieth-century pillars of QM and RT. But it is not yet fully in sight.

In the pursuit of unification, one often sees phrases like ‘Theory of Everything’. The objective of this Chapter is certainly modest by comparison! It just notes some observations about the status of available theories, and discusses the removal of some incompatibilities between the available theories that arose only because of unfortunate choices.

Because QM is relatively new, there are still lots of alternative approaches being developed in parallel. Putz (2009) gives us one very big and recent anthology about them, and this book will give another even more recent one. The QM atmosphere is clearly right for generating new illumination that can facilitate new observations about physics overall.

The first observation driving the present work is just this: QED is arguably the most successful theory that modern Physics possesses. The fact that QED now exists, and that it has the name that it has, naturally raises the question: How could there have been any real disconnect between MED and early QM?

(Note that I speak of MED, not of Classical Electrodynamics (CED) in general. CED involves not only the works of Maxwell, but also those of a large number of other individuals. I am inclined to trust results from Maxwell, but question some of those from other authors, as reported in the present work.)

It is this author’s belief that Nature is not so perverse. Connections between different domains of theory are still possible to find, even though the diligent search that was conducted a century ago did not find them. We have developed more tools now. Every new tool developed should invite us to revisit the old problems.

Section 2 talks about the photon from the point of view of MED. It explores the implications of the finite energy, which characterizes a photon. It finds a plausible model for the photon expressed in terms of MED.

The second observation is just this: If MED can connect better with QM, then shouldn’t SRT also connect better with QM? After all, how much difference can there be between a photon in QM and a light signal in SRT?

Section 3 explores the implications of modeling the light signal in SRT in the same way as the photon in QM. The photon model suggests a slight alteration to Einstein’s second postulate, and thereby produces a slightly altered version of SRT.

The third observation is just this: If SRT is to be altered, however slightly, in response to the photon concept from QM, isn’t it then possible that the revised SRT can be used to better explain some things about QM that presently seem mysterious?

Section 4 talks about what the photon/signal model implies about atoms: the stability of atoms, the occurrence of Planck’s constant.

The fourth observation is this: Much of science works on scaling laws. It is in that spirit that we should look for scaling laws about atoms, and thereby reduce the effort of looking at each element as a particular special-case problem for detailed calculations.

Section 5 talks about the inferences that can be drawn from the story about Hydrogen: inferences about all isotopes of Hydrogen, all elements beyond Hydrogen, and the ions of any element; the possible nature of ‘excited’ atomic states, and the character of the light spectrum that an element produces.

The fifth observation is this: If QM can be better connected to SRT, then where does that leave its relationship with NM? Early QM was basically NM, although not for particles possessing momentum and energy in the classical way, but rather for waves, with an amplitude factor and a phase factor, in the latter of which momentum and energy appeared as variables. Is that formulation now completely outdated on account of a rift between NM and MED?

Section 6 establishes that there was no necessary disconnect even between NM and MED. It argues that, with an adequately extended notation to support an extended tensor calculus, Maxwell’s equations can be seen to be invariant in form, even under Galilean transformation. (It is useful here to distinguish two kinds of invariance: ‘form invariance’ for symbolic equations, and ‘number invariance’ for individual symbols that have numerical values.)

The last observation is the ‘meta’ observation about the present work: Physics in general can become significantly more unified throughout because of some specific developments surrounding QM.

Section 7 summarizes the several specific conclusions implied by the present work. Boiled down to one sentence, these conclusions come to this: the existence of apparent discord between theories that are addressed to different problem domains within Physics sometimes means that there exists a more productive way to pose one or more of the theories involved.


2. Maxwell’s electrodynamics and QM’s photons

It often seems that MED, a theory largely about spatially extended EM fields, has little in common with QM, a theory largely about discrete material systems and the discrete photons that they emit and absorb. Photons are imagined to be the opposite of spatially extended; i.e., localized, like the matter particles that emit and absorb them.

So our mental picture for a photon in its interactions with matter is rather bullet-like: the photon is shot out of a source, travels through space, and hits a receiver that absorbs it. But the travel part of the story is unobservable. So we imagine that the photon in flight is possibly wavelike, in accord with Maxwell theory. Certainly the evidence for that is present, in the form of interference effects, even with small numbers of photons. So the photon is assigned a quality of ‘duality’. This is a rather mysterious way of describing a photon.

What seems missing here is an adequate model for the photon throughout its life history, expressed in terms of EM fields. The purpose of this Section is to develop one.

I like to begin the development of such a history with a waveform consisting of finite energy distributed in a three-dimensional Gaussian peak located very close to a source that has emitted it. This three-dimensional Gaussian peak is limited in all three spatial directions so as to integrate to a finite total energy.

To allow subsequent propagation, the energy has to be divided between two orthogonal fields, electric and magnetic. To allow circular polarization, the energy has to be further divided between real and imaginary parts, real being alive now, and imaginary becoming alive a quarter of an oscillation cycle later.

Given such a start, the whole life history of a photon can then develop in the manner that Maxwell’s equations allow. Describing that development is the objective of the following Sub-Sections.

2.1. Waveform development

The first step in the life history of a photon is its development from a spatially localized energy bundle that is emitted from a source into a spatially extended waveform that travels through space. To help think about this problem, it is useful to recall some phenomenology familiar from physics at a more macroscopic scale.

  1. One phenomenon very well known for light modeled as EM waves is the spreading transverse to the propagation direction known as ‘diffraction’. Diffraction is the result of some sort of limitation transverse to the propagation direction. Historically, the limitation has been due to a finite aperture through which the light propagates. The light spreads out from the aperture, more so the smaller the aperture is. In the photon model discussed here, the limitation is softer than an aperture edge, but a limitation nevertheless: it is the finite spread of the Gaussian waveform in the two directions transverse to the propagation direction. The narrower the Gaussian peak is, the more spread there will be.

  2. The closest familiar analog for longitudinal spreading is known as ‘dispersion’. This word refers to the ‘blurring’ effect that any frequency dependence of the propagation speed through the medium entails. For example, a signal pulse in a medium loses its sharp edges because those sharp edges imply superposition of many different wavelengths, and hence different frequencies, which the medium may affect differently. In Earth’s atmosphere, or ocean, square waves can turn to blob waves because of dispersion.

Let us begin a scenario with a single pulse in $E$. Let it have a Gaussian profile along the propagation direction, say $x$, with $E \propto \exp(-x^2)$. We can apply Maxwell’s equations, and watch what happens. The Gaussian is the so-called ‘generating function’ for the infinite set of Hermite polynomials, all of which have very regularly spaced zero crossings. What happens is that the single pulse in $E$ (an even function) generates a double pulse in $B$ (an odd function), which in turn generates a triple pulse in $E$ (another even function), and so on; that is, all the derivatives in play generate successively higher-order Hermite polynomials multiplying the Gaussian. Meanwhile, all the $\mathbf{E}\times\mathbf{B}$ Poynting vectors in play support general spreading of the Gaussian. With each step, the emergent functions look more and more like wavelets, and the individual peaks in the wavelets stay about the same width as more of them accrue, so the wavelength for the emergent wavelet becomes more and more defined. Figure 1 illustrates this behavior at the stage where $E$ has developed five peaks (four zero crossings). Series 1 is the original input Gaussian function, Series 2 is the Gaussian after the overall spreading has developed to this point, and Series 3 is the wavelet that has emerged in the process; i.e. the spread-out Gaussian times the fourth-order Hermite polynomial generated.

Figure 1.

A wavelet develops when an EM pulse is acted upon by Maxwell’s equations.
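The pulse-to-wavelet pattern described above can be checked with a few lines of Python. This is only a minimal sketch, not the calculation behind Figure 1: it assumes the standard identity that the $n$-th derivative of $\exp(-x^2)$ equals $(-1)^n H_n(x)\exp(-x^2)$, with $H_n$ the physicists' Hermite polynomial, and it ignores the overall spreading of the Gaussian. The function names and the sampling grid are illustrative choices.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x), built from the recurrence
    H_{k+1}(x) = 2x H_k(x) - 2k H_{k-1}(x)."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def gaussian_derivative(n, x):
    """n-th derivative of exp(-x^2), i.e. (-1)^n H_n(x) exp(-x^2)."""
    return (-1) ** n * hermite(n, x) * math.exp(-x * x)

def zero_crossings(n, x_max=4.0, samples=2001):
    """Count sign changes of the n-th derivative on a uniform grid.
    Grid points where the value is exactly zero are skipped."""
    count, last_sign = 0, 0
    for i in range(samples):
        x = -x_max + 2.0 * x_max * i / (samples - 1)
        value = gaussian_derivative(n, x)
        sign = (value > 0) - (value < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                count += 1
            last_sign = sign
    return count

# One entry per derivative order 0..5
crossings = [zero_crossings(n) for n in range(6)]
print(crossings)
```

Each additional derivative adds one zero crossing, so the fourth-order case has four zero crossings and five peaks, as stated for Figure 1.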

What we have so far is only one eighth of the story needed to fully represent a photon: development from a pulse into a waveform. We have told the story for one pulse in $E$. If we would match that with another pulse in $B$, we would have overall propagation along with waveform development. That would bring us to one quarter of the whole story of the photon. If we would match that with two more pulses, $E$ and $B$ pointing at 90° in space from the first pair and coming ‘alive’ a quarter cycle out of phase with the first pair, we would have the circular polarization characteristic of photons, but we would still have just half the story. So let us move on, and seek the other half.

2.2. Waveform regression

The remaining half of the story of the photon is about waveform regression. How does this complex structure of four Hermite polynomials multiplied by their generating Gaussian unwind, and go back to being a set of four pulses, so that it can be absorbed into a receiver? Again, let us refer to some similar but more familiar phenomenology:

  1. A third phenomenon possible for light modeled as EM waves is ‘focusing’. This is what we have optical lenses and shaped mirrors for. It works somewhat contrary to transverse spreading, gathering incident energy into a smaller area transverse to the propagation direction. Of course we don’t have any lenses or mirrors in the photon model, but we shall find a mechanism that produces a similar effect.

  2. A fourth phenomenon possible for light modeled as EM waves is ‘pulse restoration’. This is what transmission lines have ‘repeater stations’ for. A communication signal degraded by dispersion can be reconstituted when passed through an intelligent filter. Of course we don’t have any filters in a photon model, but we shall find a mechanism that produces a similar effect.

The ‘similar effect’ comes from the imposition of boundary conditions in the longitudinal direction. The Gaussian pulse that was used to describe the waveform development part of the scenario was somewhat unrealistic in that its tails extended to infinity. There is no way that a localized source could emit an energy pulse whose tails would extend to infinity. It is somewhat more realistic to imagine the equivalent of a mirror at the source, and another mirror at the eventual receiver, to confine the waveform like a wave in a box, with zero amplitude at the surface of each mirror and everywhere beyond.

With such boundary conditions imposed, the analytic functions involved in the model are no longer the simple Gaussian and the simple Hermite polynomials that it generates. Now we have not one, but three, Gaussians, the extra two being needed to cancel the first one at the two boundaries. Correspondingly, we always have at least three (actually six) Hermite polynomials alive at any given time. That is a loss of mathematical simplicity. But there is a gain of conceptual simplicity. It is easy to envision that the propagation scenario has some symmetry about its mid point. The waveform will spread until its central peak is halfway between the source and the receiver. After that, the mirror at the receiver will be more significant than the mirror at the source, causing the waveform to start ‘piling up’ near the receiver, and eventually end up as a pulse near the receiver, similar to the pulse originally launched near the source.
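The role of the two extra Gaussians can be illustrated with a static sketch. It assumes the familiar method of images: the primary Gaussian is accompanied by two sign-reversed mirror copies, one reflected in each wall. The positions, the width, and the function names are hypothetical, and the sketch shows only the cancellation at the boundaries, not the time evolution of the waveform.

```python
import math

def gaussian(x, center, sigma):
    """Unnormalized Gaussian of width sigma centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def boxed_pulse(x, x0, length, sigma):
    """Primary Gaussian at x0 minus its mirror images in walls at 0 and `length`."""
    primary = gaussian(x, x0, sigma)
    image_source = gaussian(x, -x0, sigma)                    # mirror in the wall at x = 0
    image_receiver = gaussian(x, 2.0 * length - x0, sigma)    # mirror in the wall at x = length
    return primary - image_source - image_receiver

# Illustrative numbers: pulse launched near the source wall of a box of length 10
x0, length, sigma = 2.0, 10.0, 0.5
at_source = boxed_pulse(0.0, x0, length, sigma)
at_receiver = boxed_pulse(length, x0, length, sigma)
at_peak = boxed_pulse(x0, x0, length, sigma)
print(at_source, at_receiver, at_peak)
```

With only two images the cancellation at each wall is approximate rather than exact, because each image leaves a small residue at the opposite wall; for a pulse much narrower than the box that residue is negligible.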

This ‘regressing waveform’ is somewhat reminiscent of ‘advanced’ solutions to Maxwell’s equations going backwards in time. These were introduced many times in the early 20th century, but particularly popularized in the mid 20th century by Wheeler and Feynman (1945 and 1949).

Wheeler and Feynman were looking to time symmetry as the basis for an electromagnetic generalization of instantaneous (Newtonian) gravitational interaction. There are important differences between the regressing waveforms introduced above and the Wheeler-Feynman advanced solutions: 1) Wheeler and Feynman were looking at interactions between essentially point sources and receivers, and so had to be looking at spherically expanding retarded solutions and spherically contracting advanced solutions, not at essentially one-dimensional expanding and contracting wavelets. 2) The Wheeler-Feynman expansion or contraction is related to the spherical area of a wave front, not the waveform in the radial propagation direction. 3) A lengthy discussion of the paradox of advanced actions is necessitated in the Wheeler-Feynman work, whereas the ‘regressing’ solutions introduced here are not in fact ‘advanced’ at all; they are just regressing, in real time, in the propagation direction.

What we have here is quite different though. There are no differential equations running backwards in time; there is just ‘piling up’ of a solution to differential equations in response to a boundary condition.

2.3. The photon model in terms of EM fields

Taken together, the waveform development followed by the waveform regression suggest a photon model in terms of EM fields that exhibits continuous evolution: it goes from a state of pulse-like localization near its source, to a state of wave-like extension in space during its travel, and then back to a state of pulse-like localization near its receiver.

Observe that with this photon model, ‘light in flight’ develops its wavelength only during its flight. It doesn’t have it to start with, and it gives it up at the end. So light at emission, or reception, has a position, but no wavelength, whereas light in flight has a wavelength, but no position. Thus the model expresses a ‘wave-particle duality’ for light.

Observe too that this photon model exhibits a form of QM ‘complementarity’, or uncertainty relationship. Consider that, under Fourier transformation, Gaussians map into Gaussians, and that the product of the spreads of such Gaussians is a constant. In the process of wave train development, a Gaussian in position space $x$ spreads out, while its corresponding Gaussian in wave number space $k$ sharpens up.
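The reciprocal behavior of the two spreads can be checked numerically. The sketch below assumes a Gaussian of the form $\exp(-x^2/2\sigma^2)$, whose Fourier transform is proportional to $\exp(-k^2\sigma^2/2)$, so that the spread in $k$ is $1/\sigma$ and the product of the two spreads is 1 under this convention. The function name, the integration range, and the step count are illustrative choices.

```python
import math

def cosine_transform(k, sigma, steps=4000):
    """Trapezoid-rule estimate of the integral of
    exp(-x^2 / (2 sigma^2)) * cos(k x) over [-10 sigma, 10 sigma]."""
    half_width = 10.0 * sigma
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * x / (2.0 * sigma ** 2)) * math.cos(k * x)
    return total * dx

# If the spread in k is 1/sigma, the transform evaluated at k = 1/sigma
# must have fallen to exp(-1/2) of its peak value, whatever sigma is.
sigmas = [0.5, 1.0, 2.0, 4.0]
ratios = [cosine_transform(1.0 / s, s) / cosine_transform(0.0, s) for s in sigmas]
print(ratios, math.exp(-0.5))
```

A wider Gaussian in $x$ therefore always pairs with a proportionally narrower Gaussian in $k$.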

Inasmuch as the discovery of photons was the point of departure for the development of QM, having this photon model expressed in terms of Maxwell fields is a first step in reconciling MED with QM. But there is much more to do, because the bigger problem for MED was not the photon itself, but rather the atom that emitted or absorbed it. It looked as though MED could never explain an atom being stable in its ground state, much less anything about its excited states. To find any reconciliation there, we must move on.

3. EM signals as photons

Every neutral atom contains at least two particles, and generally a lot more. Prior to QM, electromagnetic forces were presumed to hold such a system together, but there was clearly a problem with that understanding.

The simplest atom is the Hydrogen atom, with just one electron circulating about a nucleus consisting of just one proton. So consider the Hydrogen atom. The electron circulates and so accelerates, and that must generate radiation. It was assumed that this radiation would rob the atomic system of energy, and thereby cause the collapse of the atom.

So it was assumed that Maxwell’s electromagnetic theory (EMT) is simply incompatible with the stability of atoms. The solution then was to postulate the existence of a different regime of physics in which that wouldn’t happen. But was that really necessary? The purpose of this Section is to argue that it was not.

The underlying belief in inevitability of atomic collapse reflects a belief that the electrodynamic forces within the atom are essentially central, and therefore cannot affect the energy budget of the atom. This latter belief traces to the turn of the 20th century, when A. Liénard (1898) and E. Wiechert (1901) developed models for the potentials and fields created by rapidly moving charges. Although Liénard and Wiechert worked independently, they made the same assumption, and they got the same results, and so confirmed each other. This Section looks at those results, and thereby develops a motivation to look back at their underlying assumption.

3.1. Standard formulae for scalar and vector potentials

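Expressed in Gaussian units, the Liénard-Wiechert (LW) scalar and vector potentials at position $\mathbf{r}$ and time $t$ are

$$\Phi(\mathbf{r},t) = e\left[\frac{1}{\kappa R}\right]_{\text{retarded}}$$

$$\mathbf{A}(\mathbf{r},t) = e\left[\frac{\boldsymbol{\beta}}{\kappa R}\right]_{\text{retarded}}$$

where $\kappa = 1 - \mathbf{n}\cdot\boldsymbol{\beta}$, $\boldsymbol{\beta}$ is source velocity normalized by $c$, $\mathbf{n} = \mathbf{R}/R$ (a unit vector), and $\mathbf{R} = \mathbf{r}_{\text{source}}(t - R/c) - \mathbf{r}$ (an implicit definition for the terminology ‘retarded’). The LW fields obtained from those potentials are then

$$\mathbf{E}(\mathbf{r},t) = e\left[\frac{(\mathbf{n}-\boldsymbol{\beta})(1-\beta^2)}{\kappa^3 R^2} + \frac{\mathbf{n}}{c\kappa^3 R}\times\left((\mathbf{n}-\boldsymbol{\beta})\times\frac{d\boldsymbol{\beta}}{dt}\right)\right]_{\text{retarded}}$$

$$\mathbf{B}(\mathbf{r},t) = \mathbf{n}_{\text{retarded}}\times\mathbf{E}(\mathbf{r},t)$$

The LW fields have some interesting properties. The $1/R$ fields are radiation fields, and they make a Poynting vector (energy flow per unit area per unit time) that lies along $\mathbf{n}_{\text{retarded}}$:

$$\frac{c}{4\pi}\mathbf{E}_{\text{radiative}}\times\mathbf{B}_{\text{radiative}} = \frac{c}{4\pi}\mathbf{E}_{\text{radiative}}\times\left(\mathbf{n}_{\text{retarded}}\times\mathbf{E}_{\text{radiative}}\right) = \frac{c}{4\pi}\left(E_{\text{radiative}}\right)^2\mathbf{n}_{\text{retarded}}$$

But the $1/R^2$ fields are Coulomb-Ampère fields, and the Coulomb field does not lie along $\mathbf{n}_{\text{retarded}}$ as one might naively expect; instead, it lies along $(\mathbf{n}-\boldsymbol{\beta})_{\text{retarded}}$. Assume that $\boldsymbol{\beta}$ does not change much over the total field propagation time, in which case $(\mathbf{n}-\boldsymbol{\beta})_{\text{retarded}}$ is virtually indistinguishable from $\mathbf{n}_{\text{present}}$. So then the Coulomb field and the radiation are arriving to the observer from different directions.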
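The claim that the retarded Coulomb direction coincides with the present-position direction can be checked numerically for a source in uniform motion. The sketch below is illustrative only. It assumes the usual textbook orientation in which $\mathbf{n}$ points from the retarded source position toward the observer, so that $\mathbf{n}-\boldsymbol{\beta}$ has its standard meaning, and it uses units with $c = 1$. The trajectory, the observer position, and all function names are hypothetical.

```python
import math

C = 1.0                    # light speed in arbitrary units (assumption)
BETA = (0.5, 0.0, 0.0)     # constant source velocity divided by c (illustrative)

def source_position(t):
    """Uniformly moving source that passes through the origin at t = 0."""
    return tuple(C * b * t for b in BETA)

def subtract(a, b):
    return tuple(x - y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def unit(a):
    length = norm(a)
    return tuple(x / length for x in a)

def retarded_time(observer, t, iterations=200):
    """Solve t_ret = t - |observer - source(t_ret)| / c by fixed-point iteration.
    The iteration contracts because the source speed is below c."""
    t_ret = t
    for _ in range(iterations):
        t_ret = t - norm(subtract(observer, source_position(t_ret))) / C
    return t_ret

observer, t_now = (3.0, 4.0, 0.0), 0.0
t_ret = retarded_time(observer, t_now)

n_retarded = unit(subtract(observer, source_position(t_ret)))
coulomb_direction = unit(subtract(n_retarded, BETA))            # (n - beta), retarded
n_present = unit(subtract(observer, source_position(t_now)))    # from present position

mismatch = max(abs(a - b) for a, b in zip(coulomb_direction, n_present))
print(n_retarded, coulomb_direction, n_present, mismatch)
```

For strictly constant velocity the agreement is exact up to rounding, while $\mathbf{n}_{\text{retarded}}$ itself points in a visibly different direction.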

One can feel moved to check this surprising result. Fortunately, one can look up the original sources, obtain translations if necessary, and verify the original algebra. There is no problem with the algebra. There are also numerous re-derivations that use more modern techniques involving the Dirac delta function and the Heaviside step function. These are ‘generalized’ functions of some parameter that, when driven to infinity, produces an infinite pulse or a unit step. One can study these re-derivations too. One finds various re-orderings of the mathematical operators ‘differentiate’, ‘integrate’, and ‘go to parameter limit’. These re-orderings are dodgy because the generalized functions lack the mathematical property of uniform convergence, so these operations don’t necessarily commute; it is possible to change the result by changing operation order. But even so, such findings do not change the fact that the original LW derivations, although pedestrian, were correct.

If a problem exists with this LW result, then there is really only one place where it can arise: in the initial assumption; namely, that electromagnetic fields propagate like bullets shot at speed $c$. But this is the very same assumption that Einstein later formalized as his Second Postulate (1905, 1907). He just called them “signals” rather than “fields”.

The LW idea of bullets shot at speed $c$ is the foundation for Special Relativity Theory (SRT). (Indeed, SRT offers one of the modern ways to re-derive the LW results.) But SRT is also the foundation for General Relativity Theory (GRT). SRT and GRT together make one of the two great pillars of 20th century Physics: Relativity Theory (RT). So questioning the LW assumption is not just questioning the LW results; it is questioning the founding assumption of SRT, and so threatening this whole pillar of 20th century theory.

Many people have just accepted that this is just ‘the way things are’ with classical field theory, and with SRT, and with all of relativity theory as well. But what if one wanted to describe the same scenarios in a thoroughly modern way, with photons instead of radiation fields, and virtual photons instead of Coulomb-Ampère fields? Could anyone really accept the idea that the real photons and the virtual photons created by the same space-time event would arrive at a detector from different directions?

But one needn’t accept any such thing, given the photon model in terms of Maxwell fields developed in Sect. 2. In short, since we have a model for photons in terms of fields, we should be able to reverse engineer a model for fields in terms of photons. So what does the photon model developed in Sect. 2 imply? Observe that the developing wavelet can move at speed $c$ relative to the source, and the regressing wavelet can move at speed $c$ relative to the receiver. Applying this idea can help to modify the LW results appropriately.

3.2. Updated formulae for scalar and vector potentials

Recall that with the photon model developed in terms of Maxwell fields in Sect. 2, the life history of the photon has a symmetry point in the middle. Before the mid point of the propagation scenario, the waveform is developing, and after the mid point of the propagation scenario the waveform is regressing. That makes the mid point very important. So far as the receiver is concerned, nothing that happened before the mid point affects the signal he receives. The source position and velocity information he receives is determined, not by the specification ‘retarded’, but rather by the specification ‘half retarded’.

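With this new specification, the scalar and vector potentials become:

$$\Phi(\mathbf{r},t) = e\left[\frac{1}{\kappa R}\right]_{\text{half retarded}}$$

$$\mathbf{A}(\mathbf{r},t) = e\left[\frac{\boldsymbol{\beta}}{\kappa R}\right]_{\text{half retarded}} \tag{4}$$

The fields become:

$$\mathbf{E}(\mathbf{r},t) = e\left[\frac{(\mathbf{n}-\boldsymbol{\beta})(1-\beta^2)}{\kappa^3 R^2} + \frac{\mathbf{n}}{c\kappa^3 R}\times\left((\mathbf{n}-\boldsymbol{\beta})\times\frac{d\boldsymbol{\beta}}{dt}\right)\right]_{\text{half retarded}} \tag{5}$$

$$\mathbf{B}(\mathbf{r},t) = \mathbf{n}_{\text{half retarded}}\times\mathbf{E}(\mathbf{r},t) \tag{6}$$

The Poynting vector $\mathbf{P}(\mathbf{r},t)$ becomes:

$$\frac{c}{4\pi}\mathbf{E}_{\text{radiative}}\times\mathbf{B}_{\text{radiative}} = \frac{c}{4\pi}\mathbf{E}_{\text{radiative}}\times\left(\mathbf{n}_{\text{half retarded}}\times\mathbf{E}_{\text{radiative}}\right) = \frac{c}{4\pi}\left(E_{\text{radiative}}\right)^2\mathbf{n}_{\text{half retarded}} \tag{7}$$

Observe that now the direction of the Coulomb field is $(\mathbf{n}-\boldsymbol{\beta})_{\text{half retarded}} \approx (\mathbf{n}_{\text{present}})_{\text{half retarded}} \equiv \mathbf{n}_{\text{half retarded}}$, and the direction of the Poynting vector is $\mathbf{n}_{\text{half retarded}}$ too. So now, the Coulomb field and the Poynting vector are reconciled to the same direction. That is the first big gift from the photon model in terms of EM fields given in Sect. 2.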

And the gifts of the photon model in terms of EM fields go well beyond this rather arcane problem about field direction. The photon model in terms of EM fields eliminates the central mystery of Einstein’s SRT: having just one light speed relative to however many different observers there may be. This is complexity at the level of ‘multiplicity’, much more daunting than the complexity at the level of the mere ‘duality’ that is found in modern QM.

4. EM fields within atoms

As noted in Sect. 2, atoms were the really big problem for Maxwell’s EMT. Now armed with some new information about EMT, it is appropriate to revisit the problem about atoms. We turn again to Hydrogen. From Sect. 3 we can infer that at least two processes go on inside the Hydrogen atom, and we shall discover shortly that there are actually three. Only one is familiar. The other two challenge familiar concepts of ‘conservation’ that originally grew out of Newtonian mechanics. But electromagnetism is not Newtonian mechanics. In electromagnetic problems, the concepts of momentum and energy ‘conservation’ have to include the momentum and energy of fields, as well as those of matter. Momentum and energy can both be exchanged between matter and fields. ‘Conservation’ applies only to the system overall, not to matter alone (nor to fields alone either).

4.1. Energy loss due to far-field radiation

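The first process that occurs with the Hydrogen atom is the familiar energy loss from the atom due to far-field radiation. There will be a far-field power radiated (energy loss per unit time) of magnitude

$$P_{\text{radiated}} = \int_{4\pi}|\mathbf{P}|\,R^2\,d\Omega = \int_{4\pi}\frac{c}{4\pi}\left(E_{\text{radiative}}\right)^2 R^2\,d\Omega = \int_{4\pi}\frac{c}{4\pi}\frac{e^2}{c^2\kappa^6}\left|\mathbf{n}\times\left((\mathbf{n}-\boldsymbol{\beta})\times\frac{d\boldsymbol{\beta}}{dt}\right)\right|^2 d\Omega \tag{8}$$

where $\Omega$ means ‘solid angle’. Because the full $4\pi$ of solid angle captures opposing directions of $\mathbf{n}$, contributions to the integral from the vector $\boldsymbol{\beta}$ visible in the integrand cancel out. Contributions to the integral that come from the dot product $\mathbf{n}\cdot\boldsymbol{\beta}$ that is hidden in the $\kappa^6$ factor may not be zero at every moment, but they time-average to zero. So let us simplify the expression for far-field power radiated by setting $\boldsymbol{\beta}$ to zero. We have:

$$P_{\text{radiated}} = \int_{4\pi}\frac{c}{4\pi}\frac{e^2}{c^2}\left|\mathbf{n}\times\left(\mathbf{n}\times\frac{d\boldsymbol{\beta}}{dt}\right)\right|^2 d\Omega$$

It evaluates to the well-known Larmor result:

$$P_{\text{radiated}} = \frac{2e^2}{3c}\left|\frac{d\boldsymbol{\beta}}{dt}\right|^2 = \frac{2e^2a^2}{3c^3}$$

where $a = c\,|d\boldsymbol{\beta}/dt|$ is the magnitude of the acceleration.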
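The step from the angular integral to the Larmor formula, $P = 2e^2a^2/3c^3$ in the form quoted in Sect. 4.3, can be checked numerically. The sketch below assumes $\boldsymbol{\beta}$ has been set to zero in the integrand, as the text prescribes, so that $|\mathbf{n}\times(\mathbf{n}\times\hat{\mathbf{a}})|^2 = \sin^2\theta$, with $\theta$ the angle between $\mathbf{n}$ and the acceleration. The function names and the numerical values of $e$, $a$, and $c$ are illustrative.

```python
import math

def angular_integral(steps=2000):
    """Midpoint-rule integral of sin^2(theta) over the full 4*pi of solid angle.
    The solid-angle element is sin(theta) d(theta) d(phi)."""
    d_theta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += math.sin(theta) ** 2 * math.sin(theta) * d_theta
    return 2.0 * math.pi * total   # the integrand does not depend on azimuth

def integrated_power(e, a, c):
    """(c / 4 pi) (e^2 / c^2) |d(beta)/dt|^2 times the angular integral, with a = c |d(beta)/dt|."""
    return (e ** 2 * a ** 2 / (4.0 * math.pi * c ** 3)) * angular_integral()

def larmor_power(e, a, c):
    """Closed-form Larmor result in Gaussian units."""
    return 2.0 * e ** 2 * a ** 2 / (3.0 * c ** 3)

# Gaussian-unit values; the acceleration is an arbitrary illustrative number
e, a, c = 4.803e-10, 1.0e24, 2.998e10
relative_error = abs(integrated_power(e, a, c) / larmor_power(e, a, c) - 1.0)
print(angular_integral(), 8.0 * math.pi / 3.0, relative_error)
```

The angular integral comes out as $8\pi/3$, which is where the factor $2/3$ in the Larmor formula originates.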


4.2. Energy gain due to internal torquing

The second process that occurs in the Hydrogen atom is a not previously noticed energy gain due to internal torquing. This process occurs because the Coulomb force within the atom is not central; it is along $\mathbf{n}_{\text{half retarded}}$, and not along $(\mathbf{n}-\boldsymbol{\beta})_{\text{retarded}} \approx \mathbf{n}_{\text{present}}$.

The power inflow to the electron is $P_{\text{torquing}} = T_e\Omega_e$, where $\Omega_e$ is the electron orbit frequency, and $T_e$ is the magnitude of the torque on the electron, given by $T_e = r_e \times F_e$, where $r_e$ is the electron orbit radius, and $F_e$ is the tangential force on the electron. But that is not all. The proton also orbits at frequency $\Omega_e$, and experiences its own torque, given by $T_p = r_p \times F_p$, where $r_p$ is the proton orbit radius (tiny) and $F_p$ is the tangential force on the proton (huge), with the result that the magnitude $T_p$ is the same as $T_e$. The total torque on the system is $T = T_e + T_p = 2T_e$. It is determined by the angle between $\mathbf{r}_e$ and $\mathbf{F}_e$, which is given by $r_p\Omega_e/2c = (m_e/m_p)\,r_e\Omega_e/2c$. So torque $T = (m_e/m_p)(r_e\Omega_e/c)\,e^2/(r_e+r_p)$ and power received is

$$P_{\text{torquing}} = T\,\Omega_e = \frac{m_e}{m_p}\,\frac{r_e\Omega_e^2}{c}\,\frac{e^2}{r_e+r_p}$$

The existence of such a process is why the concept of ‘balance’ emerges: there can be a balance between gain of energy due to internal torquing and the inevitable loss of energy due to radiation. But we are not done with radiation yet.
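To get a feel for the magnitudes involved, the torque expression above can be evaluated numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it takes $r_p = (m_e/m_p)\,r_e$, uses the force relation $m_e r_e\Omega_e^2 = e^2/(r_e+r_p)^2$ that the chapter states in Sect. 4.3, multiplies the total torque by $\Omega_e$ to obtain power, and picks an orbit radius close to the Bohr radius purely as an example. Constants are rounded Gaussian-unit values, and the function name is hypothetical.

```python
import math

E_CHARGE = 4.803e-10   # electron charge, esu
M_E = 9.109e-28        # electron mass, g
M_P = 1.673e-24        # proton mass, g
C = 2.998e10           # light speed, cm/s

def torquing_quantities(r_e):
    """Return (angle, torque, power) for an electron orbit radius r_e in cm."""
    r_p = (M_E / M_P) * r_e                       # proton orbit radius
    separation = r_e + r_p
    # Orbit frequency from m_e r_e Omega_e^2 = e^2 / (r_e + r_p)^2
    omega_e = math.sqrt(E_CHARGE ** 2 / (M_E * r_e * separation ** 2))
    angle = r_p * omega_e / (2.0 * C)             # small angle quoted in the text
    torque = (M_E / M_P) * (r_e * omega_e / C) * E_CHARGE ** 2 / separation
    return angle, torque, torque * omega_e

r_e = 5.29e-9   # cm, roughly the Bohr radius (illustrative assumption)
angle, torque, power = torquing_quantities(r_e)

# Substituting the force relation for Omega_e^2 gives an equivalent closed form
separation = r_e + (M_E / M_P) * r_e
closed_form = E_CHARGE ** 4 / (M_P * C * separation ** 3)
print(angle, torque, power, closed_form)
```

The angle is tiny, of order $10^{-6}$ radian at this radius, yet the resulting power scales as the inverse cube of the separation, so it grows rapidly as the orbit shrinks.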

4.3. Extra radiation due to Thomas rotation

The fact that the electron and the proton have such different masses, and orbit at such different radii, means that the EM forces within the atom are not only not central; they are not even balanced. This situation has another major implication: The system as a whole experiences a net force. That means the system center of mass (C of M) can move. This sort of effect does not occur in Newtonian mechanics due to the fact that Newtonian mechanics assumes infinite signal propagation speed.

Looking in more detail, the unbalanced forces in the Hydrogen atom must cause the C of M of the whole atom to traverse its own circular orbit, on top of the orbits of the electron and proton individually. This is an additional source of accelerations, and hence of radiation. It evidently makes even worse the original problem of putative energy loss by radiation that prompted the development of QM. But on the other hand, the torque on the system is a candidate mechanism to compensate the rate of energy loss due to radiation, even if there is a lot more radiation than originally thought.

The details are worked out quantitatively as follows. First ask what the circulation can do to the radiation. Some 20 years after the advent of SRT, a relevant kinematic truth about systems traversing circular paths was uncovered by L.H. Thomas (1927), in connection with explaining the then anomalous magnetic moment of the electron: 1/2 its expected value. He showed that a coordinate frame attached to a particle driven around a circle naturally rotates at half the imposed circular revolution rate.

Applied to the scenario of the electron orbiting the proton, the gradually rotating $x,y$ coordinate frame of the electron means that the electron sees the proton moving only half as fast as an external observer would see it. That fact explained the electron’s anomalous magnetic moment, and so was received with great interest in its day. But the fact of Thomas rotation has since slipped to the status of mere curiosity, because Dirac theory has replaced it as the favored explanation for the magnetic moment problem. Now, however, there is a new problem in which to consider Thomas rotation: the case of the C of M of a whole Hydrogen atom being driven in a circle by unbalanced forces. In this scenario, the gradually rotating local $x,y$ coordinate frame of the C of M means that the atom system doing its internal orbiting at frequency $\Omega_e$ relative to the C of M will be judged by an external observer to be orbiting twice as fast, at frequency $\Omega = 2\Omega_e$ relative to inertial space. This perhaps surprising result can be established in at least three ways:

  1. by analogy to the old electron-magnetic-moment problem;

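  2. by construction from $\Omega_e$ in the C of M system as the power series $\Omega = \Omega_e \times (1 + \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \ldots) \to \Omega_e \times 2$;

  3. by observation that in inertial space $\Omega$ must satisfy the algebraic relation $\Omega = \Omega_e + \tfrac{1}{2}\Omega$, which implies $\Omega = 2\Omega_e$.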
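The second and third arguments in the list above can be checked numerically. The following is only a minimal sketch: the function names and the iteration counts are illustrative choices, not part of the original development.

```python
def thomas_series(omega_e, terms=60):
    """Partial sum of omega_e * (1 + 1/2 + 1/4 + 1/8 + ...)."""
    return omega_e * sum(0.5 ** k for k in range(terms))


def thomas_fixed_point(omega_e, iterations=60):
    """Iterate the relation omega = omega_e + omega / 2, starting from omega_e."""
    omega = omega_e
    for _ in range(iterations):
        omega = omega_e + 0.5 * omega
    return omega


omega_e = 1.0
# Both routes should approach 2 * omega_e.
print(thomas_series(omega_e), thomas_fixed_point(omega_e))
```

Both routes converge on the same limit because each iteration of the algebraic relation adds one more term of the power series.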

The relation $\Omega = 2\Omega_e$ means the far field radiation power, if it really ever manifested itself in the far field, would be even stronger than classically predicted. The classical Larmor formula for radiation power from a charge $e$ ($e$ in electrostatic units) is $P = 2e^2a^2/3c^3$, where $a$ is total acceleration. For the classical electron-proton system, most of the radiation would be from the electron, orbiting with $a_e = r_e\Omega_e^2$, with $\Omega_e$ given by the Coulomb force $m_er_e\Omega_e^2 = F_e = e^2/(r_e+r_p)^2$. But with $\Omega = 2\Omega_e$, the effective total acceleration is $a = a_e \times 2^2$. The total radiation power is then

$$P_{\text{total radiated}} = 2^4\,\frac{2e^2}{3c^3}\,a_e^2 = \frac{2^5e^6/m_e^2}{3c^3(r_e+r_p)^4} \tag{11}$$

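Now posit a balance between the energy gain rate due to the torque and the energy loss rate due to the radiation. The balance requires $P_{\text{torquing}} = P_{\text{total radiated}}$, or

$$\frac{e^4/m_pc}{(r_e+r_p)^3} = \frac{2^5e^6/m_e^2}{3c^3(r_e+r_p)^4} \tag{12}$$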

This equation can be solved for

$$r_e+r_p = \frac{32\,m_pe^2}{3\,m_e^2c^2} \tag{13}$$

Compare that value to the accepted value $r_e+r_p = 5.28\times10^{-9}$ cm. The match is fairly close, running just about 4% high. That means the concept of torque vs. radiation does a fairly good job predicting the ground state of Hydrogen.
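The comparison can be reproduced numerically from the closed forms that the chapter restates in Sect. 5.4: torquing power $(e^4/m_pc)/(r_e+r_p)^3$, radiated power $2^5e^6/3c^3m_e^2(r_e+r_p)^4$, and balance radius $32m_pe^2/3m_e^2c^2$. The sketch below assumes Gaussian (CGS) units; the rounded constants are supplied here as assumptions and are not taken from the chapter.

```python
# Rounded CGS (Gaussian) constants, assumed here for illustration.
E_CHARGE = 4.80320e-10   # electron charge, statcoulomb
M_E = 9.10938e-28        # electron mass, g
M_P = 1.67262e-24        # proton mass, g
C = 2.99792e10           # light speed, cm/s
ACCEPTED_RADIUS = 5.28e-9  # cm, the accepted value quoted in the text


def torquing_power(r):
    """Energy gain rate from internal torquing at separation r (erg/s)."""
    return E_CHARGE**4 / (M_P * C * r**3)


def radiated_power(r):
    """Energy loss rate from radiation, Thomas rotation included (erg/s)."""
    return 2**5 * E_CHARGE**6 / (3 * C**3 * M_E**2 * r**4)


def balance_radius():
    """Separation at which the two rates are equal (cm)."""
    return 32 * M_P * E_CHARGE**2 / (3 * M_E**2 * C**2)


r = balance_radius()
print("balance radius (cm):", r)
print("ratio to accepted value:", r / ACCEPTED_RADIUS)
print("power ratio at balance:", radiated_power(r) / torquing_power(r))
```

With these constants the balance radius comes out near $5.5\times10^{-9}$ cm, a few percent above the accepted value, in line with the statement above.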

4.4. Unification of physics via Planck’s constant

In conventional QM, $r_e+r_p$ is expressed in terms of Planck’s constant $h$, which is presumed to be a fundamental constant of Nature:

$$r_e+r_p = \frac{h^2}{4\pi^2\mu e^2} \tag{14}$$

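Here $\mu$ is the so-called ‘reduced mass’, defined by $\mu^{-1} = m_e^{-1} + m_p^{-1}$, which makes $\mu \approx m_e$. Using that approximation and equating the two expressions (13) and (14) for $r_e+r_p$ implies

$$h = \frac{\pi e^2}{c}\sqrt{\frac{128\,m_p}{3\,m_e}} \tag{15}$$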

This expression comes to a value of $6.77\times10^{-34}$ Joule-sec, about 2% high compared to the accepted value of $6.626176\times10^{-34}$ Joule-sec. This reasonable degree of closeness suggests that Planck’s constant may reasonably be considered a possible function of other fundamental constants of Nature, and so not itself an independent fundamental constant of Nature. Or the situation may reasonably be considered the other way around: that some other fundamental constant of Nature is really a function of Planck’s constant. Either way, we would have one less independent fundamental constant of Nature, and that would mean one more degree of unification among the different branches of physics.
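The quoted value can be checked by equating the standard Bohr-radius expression $h^2/4\pi^2m_ee^2$ (with $\mu \approx m_e$) to the balance radius $32m_pe^2/3m_e^2c^2$, which gives $h^2 = 128\pi^2m_pe^4/3m_ec^2$. The sketch below assumes that reading of the two expressions; the rounded CGS constants are assumptions supplied for illustration.

```python
import math

# Rounded CGS (Gaussian) constants, assumed here for illustration.
E_CHARGE = 4.80320e-10   # statcoulomb
M_E = 9.10938e-28        # g
M_P = 1.67262e-24        # g
C = 2.99792e10           # cm/s
H_ACCEPTED_SI = 6.626176e-34  # Joule-sec, the accepted value quoted in the text


def planck_from_balance():
    """h in erg-sec implied by h^2 = 128 pi^2 m_p e^4 / (3 m_e c^2)."""
    return (math.pi * E_CHARGE**2 / C) * math.sqrt(128 * M_P / (3 * M_E))


h_si = planck_from_balance() * 1e-7   # 1 erg = 1e-7 Joule
print("h from balance (Joule-sec):", h_si)
print("ratio to accepted value:", h_si / H_ACCEPTED_SI)
```

The result lands close to $6.77\times10^{-34}$ Joule-sec, roughly 2% above the accepted value.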

But of course, the expression for $h$ developed here can fulfill such aspirations only if the theory being developed can do a great deal more than just match the ground state of Hydrogen. Worthy targets for additional work include: anticipating the story for isotopes of Hydrogen, anticipating from there what happens with other elements, explaining the excited states of Hydrogen and their resulting spectral lines, anticipating from there the spectral features of some other elements, characterizing the behavior of the full database on ionization potentials of all elements, and much more. It all constitutes a developing research area that I refer to as ‘Algebraic Chemistry’.

5. Extensions and extrapolations from hydrogen

5.1. Larger nuclear mass

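The negative energy of the electron in the ground state of Hydrogen is

$$E_1 = -\frac{1}{2}\,\frac{e^2}{r_e+r_p} = -\frac{1}{2}\,e^2 \times \frac{3\,m_e^2c^2}{32\,m_pe^2} \tag{16}$$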

This is the energy that would have to be provided to liberate the electron, or ionize the atom: the ‘ionization potential’.

Eq. (16) provides the basis from which to build corresponding expressions for other entities. For example, the extension to Deuterium and/or Tritium requires that the proton mass $m_p$ be replaced with a more generic nuclear mass $M$, and that $r_p$ be replaced by $r_M$. Then we have for the ionization potential of this more massive system:


5.2. Arbitrary nuclear charge

The extension of the model to a neutral atom with nuclear charge number $Z$ involves $Z$ electrons as well. To develop the mathematical model, we must return to the expressions for $P_{\text{torquing}}$ and $P_{\text{total radiated}}$, Eqs. (10) and (11). All the factors of $e^2$ change to $Z^2e^2$, and the factor of $m_e^2$ changes to $Z^2m_e^2$. The equality $P_{\text{torquing}} = P_{\text{total radiated}}$ becomes $Z^4P_{\text{torquing}} = (Z^6/Z^2)P_{\text{total radiated}} = Z^4P_{\text{total radiated}}$. So nothing happens to the equality between $P_{\text{torquing}}$ and $P_{\text{total radiated}}$, Eq. (12). But for the more charged system, the energy Eq. (17) becomes


This scaled-up expression represents the magnitude of the total ionization potential of the system involving $Z$ protons and $Z$ electrons. What is then comparable to the ionization potential for removing a single electron is:


Thus in the math we find a $Z/M$ scaling law. What do we find in the actual data? Something much more complicated, and indeed so complicated that we would be unlikely ever to figure it out without the clue that $Z/M$ is part of the story. The involvement of $M$ means the involvement of isotopes, and unwanted complexity. So the clue tells us to look at ionization potentials, not in raw form, but scaled by $M/Z$, to remove the $Z/M$ factor that the math anticipates.

Figure 2 shows the pattern found. Seven orders of ionization are included. There is a fascinating, but lengthy, story about ionization orders 2 and up; see Whitney (2012). The part of it that will be most important for the present development is obvious from Fig. 2: the energy required to completely strip the atom scales with $Z^2$.

Figure 2.

Ionization potentials, scaled by $M/Z$ and modeled algebraically.

With their $M/Z$ scaling, all of the IP’s can be represented in terms of a baseline value equal to that of Hydrogen, $IP_{1,1}$, and an increment $\Delta IP_{1,Z}$. The increment arises from interactions just between the electrons, quite apart from the nucleus. The electron-on-electron increments are very regular in their behavior. First of all, every period exhibits a general rise, and by the same factor of $7/2$. Second, there is a general drop from one period to the next, for the first three periods, and all by the same factor of $7/8$.

Then within periods, there is a very regular pattern. There are sub-period rises keyed to the traditional ‘angular momentum’ quantum number $l$, and to a non-traditional parameter $N$ that goes $1, 2, 2, 3, 3, 4, 4$ for periods $1$ through $7$, and gives the number of elements in a period as $2N^2$. For $l > 0$, we have:

incremental rise=total rise×fraction,



The following Table details the behavior of the fractional rises in first-order IP’s over all sub-periods:


The scaled ionization potentials are called IP’s. They are meant to be ‘population generic’; that is, the information they contain concerning one element can be applied to a calculation about another element in a different state of ionization, or excitation, by applying the $Z/M$ appropriate for the second element and its state.

5.3. Unequal counts for electrons and protons

Let us first consider ionization states. These are important for applications in Chemistry, since chemical reactions involve ions. With all the regularity displayed in Fig. 2, it should be possible to use it to help predict the energy budget for all sorts of chemical reactions. We just need a rational way to extrapolate from all the formulae representing the regularities for single electrons being removed from neutral atoms to formulae for electrons being removed from, or added to, ions of all sorts.

Generally, if an atom is in an ionized state, then in place of just $Z$ we have an electron count $Z_e$ distinct from the proton count $Z_p$. The electron-on-electron interaction does not involve the nucleus, and so always scales with $Z_e/M$. But the electron-nucleus interaction previously represented by $(Z/M)IP_{1,1}$ now has to involve both $Z_e$ and $Z_p$. We have for the total system


What is then generally comparable to the nuclear-orbit part of the ionization potential for removing a single electron? To develop an answer to this question, we must return again to the expressions for $P_{\text{torquing}}$ and $P_{\text{total radiated}}$, Eqs. (10) and (11). Clearly, all of the factors of $e^2$ change to $Z_pZ_ee^2$. It is as if all factors of $e$ changed to $\sqrt{Z_pZ_e}\,e$. Removal of one electron is then like removal of one $\sqrt{Z_pZ_e}\,e$ charge. What is comparable to the ionization potential for removing a single electron from the ion is then


Thus for ions, we see in the math a $\sqrt{Z_pZ_e}/M(Z_p)$ scaling law for that part of the ionization potential that reflects electron-nucleus interaction, $IP_{1,1}$. So for computations we use:


For the other part of the ionization potential, that reflecting just the electron-on-electron interactions, $\Delta IP_{1,Z}$, the relevant $Z$ is $Z_e$. But the relevant $M$ is still $M(Z_p)$, the only significant mass in the problem. So for computations we use:


This basic information can help one to model the energy budget for any chemical reaction. To assist readers who want to try this out, the necessary data displayed in Fig. 2 are tabulated in numerical form as Appendix 1 at the end of this Chapter.

Here is one small example. Recall the comment about Fig. 2 that, for nuclear charge $Z = 2$ and up, the energy required to completely strip the atom scales with $Z^2$. The actual formula plotted on Fig. 2 goes


The resulting $IP_{Z,Z}$ is population-generic. The corresponding element-specific quantity is $IP_{Z,Z}$ multiplied by the factor $Z/M(Z_p)$. Thus the element-specific energy requirement for total stripping is $2 \times 14.250 \times Z^3/M(Z_p)$ eV.

We can now compare the total energy required to strip an atom one electron at a time with the energy required to strip it of its electrons all at once. The two elements Helium and Lithium are good examples because they represent the extremes of very high first-order ionization potential and very low first-order ionization potential. The data for them in numerical form comes from Appendix 1. Here is how the calculations go:

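Helium: ($Z_p = 2$, $M(Z_p) = M_2 = 4.003$)

Write Formulae:

${}_2\mathrm{He} \to {}_2\mathrm{He}^{+}$: $IP_{1,1} \times 2/M_2 + \Delta IP_{1,2} \times 2/M_2$; ${}_2\mathrm{He}^{+} \to {}_2\mathrm{He}^{++}$:

Insert Data:

${}_2\mathrm{He} \to {}_2\mathrm{He}^{+}$: $14.250 \times 2/4.003 + 35.625 \times 2/4.003$; ${}_2\mathrm{He}^{+} \to {}_2\mathrm{He}^{++}$:

Evaluate Formulae:

${}_2\mathrm{He} \to {}_2\mathrm{He}^{+}$: $7.1197 + 17.7992 = 24.9189$; ${}_2\mathrm{He}^{+} \to {}_2\mathrm{He}^{++}$: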

Evaluate Total Stripping One-at-a-Time:



Compare to Total Stripping All-at-Once:


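Lithium: ($Z_p = 3$, $M(Z_p) = M_3 = 6.941$)

Write Formulae:

; ${}_3\mathrm{Li}^{+} \to {}_3\mathrm{Li}^{++}$: $IP_{1,1} \times \sqrt{3 \times 2}/M_3 + \Delta IP_{1,2} \times 2/M_3$; ${}_3\mathrm{Li}^{++} \to {}_3\mathrm{Li}^{3+}$:

Insert Data:

; ${}_3\mathrm{Li}^{+} \to {}_3\mathrm{Li}^{++}$: $14.250 \times 2.449/6.941 + 35.625 \times 2/6.941$; ${}_3\mathrm{Li}^{++} \to {}_3\mathrm{Li}^{3+}$:

Evaluate Formulae:

; ${}_3\mathrm{Li}^{+} \to {}_3\mathrm{Li}^{++}$: $5.0278 + 10.2651 = 15.2929$; ${}_3\mathrm{Li}^{++} \to {}_3\mathrm{Li}^{3+}$: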

Evaluate Total Stripping One-at-a-Time:



Compare to Total Stripping All-at-Once:


In these two examples, we see that removal of all the electrons, all at once, takes much more energy than removing the electrons one electron at a time. It is plain to see that total stripping all-at-once is a vigorous, even violent, event. It is the stuff of special-purpose laboratory or field investigation. By contrast, total stripping one-at-a-time is a gentle process. The one-at-a-time process is an example of the stuff of ordinary production Chemistry.
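The arithmetic in the two worked examples can be checked with a short script. This is a sketch under stated assumptions: it uses only the two data values that appear in the examples ($IP_{1,1} = 14.250$ eV and $\Delta IP_{1,2} = 35.625$ eV), the $\sqrt{Z_pZ_e}/M$ and $Z_e/M$ scalings described in this Sub-Section, and the total-stripping expression $2 \times 14.250 \times Z^3/M(Z_p)$. The function names are hypothetical.

```python
import math

IP_11 = 14.250        # eV, Hydrogen baseline as quoted in the examples
DELTA_IP_12 = 35.625  # eV, electron-on-electron increment for two electrons


def single_removal(z_p, z_e, mass, delta_ip):
    """Return the (electron-nucleus, electron-electron) parts in eV.

    z_e is the electron count before the removal; mass is M(Z_p).
    """
    nuclear = IP_11 * math.sqrt(z_p * z_e) / mass
    electronic = delta_ip * z_e / mass
    return nuclear, electronic


def all_at_once(z, mass):
    """Element-specific energy for total stripping, 2 * IP_11 * Z^3 / M."""
    return 2 * IP_11 * z**3 / mass


helium_first = single_removal(2, 2, 4.003, DELTA_IP_12)
lithium_second = single_removal(3, 2, 6.941, DELTA_IP_12)
print("He -> He+  :", helium_first, sum(helium_first))
print("Li+ -> Li++:", lithium_second, sum(lithium_second))
print("all at once, He and Li:", all_at_once(2, 4.003), all_at_once(3, 6.941))
```

The Helium figures reproduce the values in the text; the Lithium figures differ in the third decimal place because the text rounds $\sqrt{6}$ to 2.449.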

5.4. Excited states - hydrogen

Now let us begin to consider excitation states. These are key for understanding emission or absorption spectra, a fabulously rich source of data about atoms. But atomic spectra are complicated. The standard way to begin to understand them is mathematically, from the family of solutions provided by the differential equation that Schrödinger postulated for the abstract wave function characterizing the electron in the Hydrogen atom.

The standard QM view is that the Hydrogen atom has multiple ‘stable states’, each with negative energy, $E$, determined largely by a principal quantum number $n = 1, 2, 3, \ldots$ according to $E_n \approx E_1/n^2$. The idea is that the electron can reside in an upper state ($n > 1$), but only rather precariously, and when it teeters and falls back to the ground state ($n = 1$), a photon is emitted.

But the Hydrogen atom has only two constituent particles, the electron and the proton, and thus very few classical degrees of freedom. That fact makes it difficult to imagine an infinite multiplicity of different ‘states’ that a Hydrogen atom could exhibit. We are left to ponder a mystery of mathematical QM. So it is tempting to try to develop an additional, more immediately physical, way of understanding the spectral complexity that we see. Consider the possibility that individual Hydrogen atoms may not, by themselves, actually have excited states. Instead, the term ‘excited state’ may be better applied to a system that involves several Hydrogen atoms.

Key to this idea is that charges can form entities called ‘charge clusters’. [Concerning charge clusters at the macro scale of laboratory experiments and field observations: see, for example, Beckmann (1990), Aspden (1990), Piestrup and Puthoff (1998).]

Evidence concerning the probable existence of charge clusters at the micro scale of atoms is plainly visible in the data on IP’s (Fig. 2): some electron counts are very stable and hard to break apart (e.g. noble gases), while some electron counts are very unstable and hard to keep together (e.g. alkali metals). Why would electron counts matter so much if the electrons were not in deep relationships with each other?

But how can electrons outwit electrostatic repulsion? Once given the clue that they evidently can do this, it becomes possible to imagine how they might do it. The key is that electrostatic repulsion dominates in a static situation. In a dynamic situation, electrons may move at speeds exceeding light speed. (Remember, Sect. 3 cast doubt on the founding postulate of SRT, and SRT is all there is to forbid superluminal speeds.) If so, a repulsion signal from one electron may reach another electron only by the time the first electron has moved so much that the repulsion from its ‘then’ position has become the attraction to its ‘now’ position. In fact, multiple electrons can form circulating ring structures that are quite stable (for details, see Whitney 2012).

So consider the possibility that an excited state of Hydrogen is actually $n_H$ neutral Hydrogen atoms, with the $n_H$ electrons in a negative charge cluster, and the $n_H$ protons in a positive charge cluster, making a kind of ‘super’ Hydrogen atom; i.e., Hydrogen with every factor of electron mass $m_e$, proton mass $m_p$ and charge $e$ scaled by $n_H$. The torquing power $P_T = (e^4/m_pc)/(r_e+r_p)^3$ then scales by $(n_H)^4/n_H = (n_H)^3$, and the radiation power $P_R = 2^5e^6/3c^3(m_e)^2(r_e+r_p)^4$ scales by $(n_H)^6/(n_H)^2 = (n_H)^4$. The solution for system radius $r_e+r_p = 32m_pe^2/3(m_e)^2c^2$ then becomes:


i.e., the system radius scales with $n_H$. The system orbital energy $E_1 = -\tfrac{1}{2}e^2/(r_e+r_p) = -\tfrac{1}{2}e^2 \times 3(m_e)^2c^2/32m_pe^2$ becomes


i.e., the system orbital energy also scales with $n_H$. This result is the same as if the atoms were isolated, instead of being organized into a big system with two charge clusters. This suggests that the energy available for generating photons by de-excitation isn’t ‘orbital’ at all, but is instead the energy tied up in forming the charge clusters out of the multiple electrons and the multiple protons from the multiple Hydrogen atoms.
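The linear scaling of both radius and orbital energy can be confirmed by substituting $e \to n_He$, $m_e \to n_Hm_e$ and $m_p \to n_Hm_p$ into the balance radius $32m_pe^2/3m_e^2c^2$ and the orbital energy $-\tfrac{1}{2}e^2/(r_e+r_p)$. The sketch below works in arbitrary units; the default parameter values are placeholders chosen for illustration, since only the ratios matter.

```python
def balance_radius(e2, m_e, m_p, c=1.0):
    """Balance radius 32 m_p e^2 / (3 m_e^2 c^2); e2 is the squared charge."""
    return 32 * m_p * e2 / (3 * m_e**2 * c**2)


def orbital_energy(e2, radius):
    """Orbital energy -(1/2) e^2 / radius."""
    return -0.5 * e2 / radius


def super_hydrogen(n_h, e2=1.0, m_e=1.0, m_p=1836.0):
    """Radius and energy when charge and both masses are scaled by n_h."""
    scaled_e2 = n_h**2 * e2          # charge e -> n_h * e
    radius = balance_radius(scaled_e2, n_h * m_e, n_h * m_p)
    return radius, orbital_energy(scaled_e2, radius)


r1, e1 = super_hydrogen(1)
for n_h in (2, 8, 18, 32):
    r, e = super_hydrogen(n_h)
    print(n_h, r / r1, e / e1)   # both ratios should equal n_h
```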

What can we infer about such charge clusters? As in the modeling of IP’s for ions, we can again consider Fig. 2 as a source of information about electron clusters of sizes up to 118, quite apart from the particular element that the information is located with. From Fig. 2, it is clear that most of the $\Delta IP$’s are positive, meaning their electron clusters are hard to break. So despite being made of same-sign charges, most of them exist in negative energy states. The ones that are particularly hard to break are the ones associated with the noble gases: $Z = 2, 10, 18, 36, 54, 86, (118)$. These elements are at the ends of periods on the periodic table, and the lengths of the periods themselves are: $2, 8, 8, 18, 18, 32, (32)$. (Parentheses mean we haven’t discovered, or created, that element yet.) The implication is that excited states of Hydrogen existing in the form of ‘super Hydrogen’ would most frequently exist with $n_H = 2, 8, 18, 32, \ldots$

Can we anticipate what would happen when any such excited state de-excites? Suppose we started with $n_H = 32$. It could, for example, decompose into 18, 8, 2, 2, and 1, 1; i.e. some less excited states and a couple of ground-state atoms; 6 daughter systems in all. Suppose that for every such daughter produced, there is a photon released. Exactly how might that work? Observe that four daughters are in states that are even more negative than the starting state, so those are no problem. But two daughters are in the ground state, which is not more negative than the starting state. So energy from the other daughters has to be enlisted to create any photons there.
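The example decomposition is one of many. As a rough illustration, the sketch below enumerates every way to split a starting cluster into daughters drawn from an assumed set of allowed sizes (the preferred values 2, 8, 18, 32 plus the ground-state size 1). The allowed set and the function name are assumptions made for this sketch; no energy budget is modeled.

```python
def decompositions(total, sizes=(32, 18, 8, 2, 1), largest=None):
    """Yield non-increasing tuples of allowed sizes that sum to total."""
    if total == 0:
        yield ()
        return
    for size in sizes:
        # Skip sizes that overshoot or would break the non-increasing order.
        if size > total or (largest is not None and size > largest):
            continue
        for rest in decompositions(total - size, sizes, size):
            yield (size,) + rest


# Exclude the trivial "path" in which nothing decomposes.
paths = [p for p in decompositions(32) if len(p) > 1]
print(len(paths), "candidate de-excitation paths")
print((18, 8, 2, 2, 1, 1) in paths)
```

Which of these candidate paths would actually be taken depends on the energy budget of each, which this enumeration does not address.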

For any $n_H$, there may be a de-excitation path, or paths, for which the energy budget is insufficient, in which case those paths won’t be taken. There may also be de-excitation paths for which the energy budget is more than sufficient, in which case there will be, not only spectral radiation, but also a bit of heat radiation. Very rarely, there might be a de-excitation path for which the energy budget is just exactly right.

The spectral lines that occur with Hydrogen (or any element) are typically characterized in part by differences in inverse square integers. The integers involved are traditionally understood in terms of the familiar radial quantum number $n$. Is it possible to understand them also in terms of the $n_H$ used here?

Recall that if one chooses to model the behavior of Hydrogen ‘excitation’ in terms of a single Hydrogen atom with discrete radial states identified with the radial quantum number $n$, then the orbit-radius scaling has to be the quadratic scaling $r_1 \to r_n = n^2r_1$ of standard QM, not the linear orbit-radius scaling $r_e+r_p \to r_{n_H} = n_H(r_e+r_p)$ of the present model. So why does one way of looking at the problem involve a quadratic $n^2$, while the other way of looking at it involves a linear $n_H$?

Recall that there was good reason to suggest highest probability for the values $n_H = 2, 8, 18, 32, \ldots$ corresponding to the lengths of the rows in the Periodic Table. These row lengths can be characterized as $2N^2$ for $N = 1, 2, 2, 3, 3, 4, (4)$. So $n_H$ actually does encode something that is quadratic, namely the $N^2$, and is therefore similar to the quadratic $n^2$.
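The row-length rule is easy to verify. The short sketch below generates the period lengths $2N^2$ from the stated sequence of $N$ values and accumulates them to obtain the atomic numbers at the ends of the periods; the variable names are illustrative.

```python
from itertools import accumulate

N_VALUES = [1, 2, 2, 3, 3, 4, 4]   # parameter N for periods 1 through 7

period_lengths = [2 * n * n for n in N_VALUES]
period_ends = list(accumulate(period_lengths))   # Z of the closing element

print(period_lengths)   # [2, 8, 8, 18, 18, 32, 32]
print(period_ends)      # [2, 10, 18, 36, 54, 86, 118]
```

The accumulated values coincide with the noble-gas atomic numbers listed in Sect. 5.4.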

5.5. Beyond both hydrogen and ground states: Spectroscopy

In spectroscopy, we observe light created when an atomic system relaxes in some way. For elements beyond Hydrogen, the spectral lines that occur are often characterized in part by the so-called Rydberg factor:


The $R$ is traditionally interpreted as the energy needed for total removal of one electron from the ground state to infinity, leaving an ion. The energy needed for an electron to get from a state labeled $n_1$ to a higher state labeled $n_2 > n_1$, and conversely the energy released when it goes back to $n_1$, is then modeled as


Observe that $R$ contains a factor of $Z^2$, just like the IP’s for total ionization, $IP_{Z,Z}$ of Eq. (25), do. That means $R$ is referring to the absolutely largest photon energy that the system could ever possibly be imagined to deliver: starting from a state of total ionization, i.e. a naked nucleus, and having the entire electron population return in one fell swoop, with the emission of just one photon for the whole job. That scenario could never actually happen. One-at-a-time electron return is the only plausible return scenario. The inverse square integers in the square bracket bring $\Delta E$ down to values appropriate for one-at-a-time scenarios.
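The role of the square bracket can be illustrated with the usual Rydberg form $\Delta E = R\,[1/n_1^2 - 1/n_2^2]$. That form is assumed here as the standard one; the function name and the sample value of $R$ are placeholders for illustration only.

```python
def transition_energy(rydberg, n1, n2):
    """Energy for a transition between labels n1 < n2, in the units of rydberg."""
    if not 0 < n1 < n2:
        raise ValueError("expected 0 < n1 < n2")
    return rydberg * (1.0 / n1**2 - 1.0 / n2**2)


R = 1.0   # placeholder value; any positive energy unit works
for n1, n2 in [(1, 2), (1, 3), (2, 3), (1, 1000)]:
    print(n1, n2, transition_energy(R, n1, n2))
```

For every finite pair of labels the bracket is smaller than unity, so $\Delta E$ stays below $R$ and approaches it only as $n_1 = 1$ and $n_2$ grows without bound.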

Observe that the Rydberg model for spectral lines already conflicts with an older model for the atom developed from the Periodic Table (PT); i.e., electron ‘shells’ enclosing the nucleus, inner shells filled, and at most one outer shell unfilled; partially filled for most elements, and completely filled for noble gases. The older PT-based model suggests shielding of the nucleus by the filled inner shells of electrons. But the occurrence of a $Z^2$ in $R$, even for large $n_1$ and $n_2$ in $\Delta E$, means there is no shielding of the nucleus. So electrons must be in tight clusters, rather than nucleus-enclosing shells.

Observe that $R$ does not contain any $Z/M$ factor like the IP model contains. Instead, it has a factor


which is essentially unity. At the time when $R$ was formulated, most of the known trans-Hydrogenic elements had $M \approx 2Z$, and the factor of $Z/M \approx 1/2$ could be absorbed into an external constant factor, the $2\pi^2m_ee^4/ch^3$ in $R$. That is no longer the case today. We know about heavy elements for which $M \approx 2.5Z$, or $Z/M \approx 2/5$. So now it would be better to use the function $Z/M$ instead of the number $1/2$. An extra bonus would be that Hydrogen, with $Z/M = 1$, would be included.

Now consider that spectral lines might not arise from de-ionizing one ion of one atom, but rather from de-exciting a system involving multiple neutral atoms. In this description, the $n_1$ and $n_2$ are not identifiers of different states of one atom, but rather numbers of atoms organized into super atoms. Otherwise, nothing really changes. However we interpret their meaning, the predicted spectral lines remain the same.


6. Unification between Newton and Maxwell

This last technical Section of this Chapter returns to the first physics disunity mentioned in the Introduction: the seemingly different coordinate-transformation properties of Newton’s Laws for mechanics and Maxwell’s equations for electrodynamics. Newton’s laws are form invariant under Galilean transformations. But Maxwell’s equations are generally thought to be form invariant only under Lorentz transformations. Especially, they are thought to be not form invariant under Galilean transformations.

So a curious situation exists within physics today. It is generally expected that the equations of physics should be tensor equations. By definition of the word ‘tensor’, a tensor equation is form invariant under arbitrary changes of reference frame, assuming no singularities or other cruel and unusual circumstances in the transformation or its inverse transformation. That means a tensor equation should be form invariant under arbitrary, though reasonably well-behaved, space-time transformations.

So, are Maxwell’s equations really tensor equations? Or not? Mathematicians have good reason to challenge the believed tensor status of Maxwell’s equations, while physicists have good reason to challenge the believed requirement for invariance under anything other than Lorentz transformation. But the situation is not generally acknowledged. It is the proverbial ‘elephant in the living room’.

Clarifying this situation can assist physics in becoming more unified from its beginning to its present. And mathematics has lots of applicable tools; see Kiein (2009). The present work offers an approach that is also mathematical, but a lot more elementary. Maybe it will communicate to different readers.

The problem, I believe, is of a type with which QM has some history. QM appears to be the first branch of physics that well and truly needed complex numbers. They may have been used in physics before QM, but they were only one of the tools available for the problems then at hand. Sines and cosines could generally handle any problem just as well as complex exponentials could handle it. But with QM, complex exponentials became truly essential for doing physics.

The history of mathematics has been a tale of an increasing range of objects included in the discussion. It began with real, positive integers; it grew with the inclusion of zero and negative integers, and grew again with the inclusion of all rational numbers, and again with the inclusion of all irrational numbers. Then it grew with the inclusion of imaginary numbers, thus creating complex numbers. This was the first of a number of ‘doublings’ of the number of dimensions attributed to mathematical objects. [See Rowlands (2007).] After complex numbers, we got quaternions, and bi-quaternions, or octonions, and there is no reason to suppose that further doublings will not continue to prove useful.

Complex numbers make possible operations that are not possible without them. Consider, for example, the square root of $-3$. It cannot be evaluated within the real number system, but in the complex number system, it is just $\pm i\sqrt{3}$.
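The same point can be seen in a few lines of code, using Python’s standard `math` and `cmath` modules as stand-ins for the real and complex number systems. This is only an illustration of the remark above.

```python
import cmath
import math

try:
    math.sqrt(-3)             # real-number square root
    real_root_exists = True
except ValueError:
    real_root_exists = False  # no real result is available

root = cmath.sqrt(-3)         # principal complex root, i * sqrt(3)
print(real_root_exists)
print(root, -root)            # the two roots, +/- i * sqrt(3)
print(root * root)            # recovers -3 up to rounding
```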

I believe ‘doublings’ are generally like this: they make possible operations that were not possible without them. There appears to be today an opportunity for a doubling in the realm of tensor calculus. There are presently exactly two tensor-transformation behaviors identified, called ‘contravariant’ and ‘covariant’. It appears that tensor calculus can be usefully extended through a doubling of the number of transformation behaviors that can be described, from two to four. It appears that such a doubling can resolve the apparent conflict between Newtonian and Maxwellian physics: it can make possible a display showing how Maxwell’s equations can actually be form invariant under arbitrary coordinate transformations.

6.1. The opportunity offered by tensor notation

The display of four transformation behaviors requires the use of four tensor index positions. So in addition to the usual contravariant (index up-right) and covariant (index down-right) positions on the right side of a tensor symbol, we need to use the positions available on the left side: index up-left and index down-left. Since left-side index positions have not been used in this new way before, they need new names designed for the purpose. To recall the move from right to left, let us use the prefix ‘trans’. So let the up-left index position be called ‘transcontravariant’, and let the down-left index position be called ‘transcovariant’.

All the transformations describe what happens to tensor merates when the frame of reference changes; i.e. when the basis unit vectors defining the frame of reference are replaced with other basis unit vectors. The transformations discussed here are arbitrary within the specifications that make the connections between reference frames reasonably well behaved: the individual relationships are differentiable and reversible, and the matrix representations of them are invertible and unimodular.

I mention both tensors and matrices because they are equivalent notation schemes that can be used interchangeably for describing systems of linear equations. Tensor notation is useful for making a compact statement of a whole mathematical situation. Matrix notation is useful for separating a whole mathematical situation into constituent parts for calculations. Individual linear equations are useful for focusing on individual parts of the mathematical problem. Human beings do have strong personal preferences about which approach to use, but all of these approaches should agree on the basic facts of a given situation, so any of these approaches should be acceptable. In the present work, all approaches will be used. That way, everyone can find something to like, and everyone can find something to dislike!

In the case of the matrix displays and the linear equations, the presentation does save a little space by ignoring two spatial dimensions and focusing on one spatial dimension (call it 1) and the temporal dimension (call it 0).

6.2. Transformation of a contravariant object

The most familiar transformation is the contravariant one. The prefix ‘contra’ means these tensor merates change opposite to the way the basis unit vectors of the reference frame change. For an arbitrary input vector $X^\alpha$, the transformation reads $\bar{X}^\beta = [\partial\bar{x}^\beta/\partial x^\alpha]X^\alpha$, where we see the transformation as partial derivatives of coordinates, new with respect to old. Equivalently, $\bar{X}^\beta = T^\beta_\alpha X^\alpha$, where we see the transformation written as the tensor $T^\beta_\alpha$. Also equivalently, we have $\begin{bmatrix}\bar{X}^0\\ \bar{X}^1\end{bmatrix} = \begin{bmatrix}T^0_0 & T^0_1\\ T^1_0 & T^1_1\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix}$, where we see everything, the input and output vectors and the transformation, in matrix format. Or equivalently, we have $\bar{X}^0 = T^0_0X^0 + T^0_1X^1$ and $\bar{X}^1 = T^1_0X^0 + T^1_1X^1$ as two separate linear equations.

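For the contravariant transformation matrix $\begin{bmatrix}T^0_0 & T^0_1\\ T^1_0 & T^1_1\end{bmatrix}$ one can define a reverse transformation matrix $\begin{bmatrix}R^0_0 & R^0_1\\ R^1_0 & R^1_1\end{bmatrix}$, wherein $R^0_0 = T^0_0$ and $R^1_1 = T^1_1$, but $R^0_1 = -T^0_1$ and $R^1_0 = -T^1_0$. Applied to $\bar{X}^\beta$, the reverse transformation $R^\alpha_\beta$ takes $\bar{X}^\beta$ back to $X^\alpha$: $X^\alpha = R^\alpha_\beta\bar{X}^\beta$. That is to say: $X^0 = R^0_0\bar{X}^0 + R^0_1\bar{X}^1$ and $X^1 = R^1_0\bar{X}^0 + R^1_1\bar{X}^1$. Expressed in matrix form, the reverse transformation is the inverse transformation: $\begin{bmatrix}X^0\\ X^1\end{bmatrix} = \begin{bmatrix}R^0_0 & R^0_1\\ R^1_0 & R^1_1\end{bmatrix}\begin{bmatrix}\bar{X}^0\\ \bar{X}^1\end{bmatrix}$, or $\begin{bmatrix}R^0_0 & R^0_1\\ R^1_0 & R^1_1\end{bmatrix}\begin{bmatrix}T^0_0 & T^0_1\\ T^1_0 & T^1_1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}$. The distinction between the words reverse and inverse is nil in the contravariant context. But it becomes important in the next context.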
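A small numerical check shows the reverse matrix acting as the inverse. The sketch assumes a particular parametrization that is not spelled out in the text: off-diagonal merates $-B$ (time row) and $-A$ (space row), and equal diagonal merates $\sqrt{1+AB}$ chosen so that the matrix is unimodular. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def contravariant(a, b):
    """Assumed form of T: equal diagonals chosen so that det T = 1."""
    d = math.sqrt(1.0 + a * b)
    return [[d, -b],
            [-a, d]]


def reverse(t):
    """Keep the diagonal merates and flip the sign of the off-diagonal ones."""
    return [[t[0][0], -t[0][1]],
            [-t[1][0], t[1][1]]]


T = contravariant(0.6, 0.2)
R = reverse(T)
print(matmul(R, T))   # close to the identity matrix
```

Under this assumption the product collapses to the identity for any $A$ and $B$ with $1 + AB > 0$, whether or not $A = B$.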

6.3. Transformation of a covariant object

The prefix ‘contra’ means reverse to the prefix ‘co’. The covariant transformation goes the same way the basis unit vectors change. So the covariant transformation $\bar{X}_\beta = C_\beta^\alpha X_\alpha$, in matrix format $\begin{bmatrix}\bar{X}_0\\ \bar{X}_1\end{bmatrix} = \begin{bmatrix}C_0^0 & C_0^1\\ C_1^0 & C_1^1\end{bmatrix}\begin{bmatrix}X_0\\ X_1\end{bmatrix}$, uses a transformation matrix $C$ equal to the reverse contravariant transformation matrix $R$: $\begin{bmatrix}\bar{X}_0\\ \bar{X}_1\end{bmatrix} = \begin{bmatrix}R^0_0 & R^0_1\\ R^1_0 & R^1_1\end{bmatrix}\begin{bmatrix}X_0\\ X_1\end{bmatrix}$, or equivalently $\bar{X}_0 = R^0_0X_0 + R^0_1X_1$ and $\bar{X}_1 = R^1_0X_0 + R^1_1X_1$. It is generally assumed that this is the same as saying the covariant transformation is the inverse to the contravariant transformation. Notice however that the off-diagonal merates $C_0^1 = R^0_1$ and $C_1^0 = R^1_0$ have indices switched around. This is because $C$ operates on a covariant object, whereas, in its original definition, $R$ operated on a contravariant object.

The index switching makes no difference if we limit attention to transformations that are space-time symmetric, i.e. Lorentz transformations. But if we wish to investigate any other type of transformation, we have to investigate whether the switch makes a difference. Consider the inner product $\bar{X}_\beta\bar{X}^\beta$. Under Lorentz transformation, it is preserved: $\bar{X}_\beta\bar{X}^\beta = X_\alpha X^\alpha$. But if we do not have space-time symmetry, is it still preserved? This question has to be answered by testing.

Laying out the problem in matrix format, we have to make one of the vectors, say the covariant one, a row vector, and then we have to test:

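$$\begin{bmatrix}\bar{X}_0 & \bar{X}_1\end{bmatrix}\begin{bmatrix}\bar{X}^0\\ \bar{X}^1\end{bmatrix} = \begin{bmatrix}X_0 & X_1\end{bmatrix}\begin{bmatrix}R^0_0 & R^1_0\\ R^0_1 & R^1_1\end{bmatrix}\begin{bmatrix}T^0_0 & T^0_1\\ T^1_0 & T^1_1\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix} \overset{?}{=} \begin{bmatrix}X_0 & X_1\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix}$$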

Observe that the $R$ matrix is transposed from what it would need to be to make the $RT$ matrix product collapse to the identity. So the inner product $X_\alpha X^\alpha$ is generally not preserved if we do not have space-time symmetry.
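The test can also be run numerically. The sketch below forms the matrix that sits between the row vector and the column vector in the test equation, namely the transposed $R$ (that is, transposed $C$) times $T$, and checks whether it is the identity. It assumes the same illustrative parametrization as before (off-diagonals $-B$ and $-A$, equal diagonals $\sqrt{1+AB}$); the names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]


def contravariant(a, b):
    """Assumed unimodular form of T with off-diagonals -b and -a."""
    d = math.sqrt(1.0 + a * b)
    return [[d, -b], [-a, d]]


def covariant(a, b):
    """C equals R: the contravariant matrix with off-diagonal signs flipped."""
    d = math.sqrt(1.0 + a * b)
    return [[d, b], [a, d]]


def inner_product_kernel(a, b):
    """Matrix K such that the transformed inner product is X_co^T K X_contra."""
    return matmul(transpose(covariant(a, b)), contravariant(a, b))


def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))


print(is_identity(inner_product_kernel(0.6, 0.6)))  # space-time symmetric case
print(is_identity(inner_product_kernel(0.6, 0.0)))  # universal-time case
```

In this sketch the kernel is the identity only when $A = B$, which matches the conclusion that the inner product is preserved under space-time symmetry but not in general.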

6.4. Transformations for objects of four types

In order to recover the general availability of preserved inner products, the two additional transformation behaviors are defined. The transcovariant transformation is defined as the transposed inverse of the contravariant one. The transcontravariant transformation is defined as the transposed inverse of the covariant one.
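These two definitions can be exercised numerically. The sketch below builds all four matrices from two arbitrary numbers $A$ and $B$ and checks that pairing a contravariant object with a transcovariant one, or a covariant object with a transcontravariant one, leaves the inner product unchanged. The parametrization (off-diagonals $\mp B$ and $\mp A$, equal diagonals $\sqrt{1+AB}$) and all names are assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]


def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def four_transformations(a, b):
    """Return the four matrices for the assumed unimodular parametrization."""
    d = math.sqrt(1.0 + a * b)
    contra = [[d, -b], [-a, d]]
    co = [[d, b], [a, d]]
    return {
        "contravariant": contra,
        "covariant": co,
        "transcovariant": transpose(inverse(contra)),
        "transcontravariant": transpose(inverse(co)),
    }


def pairing_kernel(left, right):
    """Matrix K with (left Y)^T (right X) = Y^T K X."""
    return matmul(transpose(left), right)


m = four_transformations(0.6, 0.0)   # universal-time (Galilean-like) case
print(pairing_kernel(m["transcovariant"], m["contravariant"]))
print(pairing_kernel(m["transcontravariant"], m["covariant"]))
```

Both kernels reduce to the identity by construction, since a transposed inverse transposed again is just the inverse, so the preservation holds for any admissible $A$ and $B$.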

Recall that this discussion began with the contravariant transformation written in the tensor notation $\bar{X}^\beta = [\partial\bar{x}^\beta/\partial x^\alpha]X^\alpha$. The discussion soon became complicated enough to merit introduction of more detailed notation that can clearly distinguish the four cases. The following Table illustrates the expanded tensor notation:


The Table is organized for user convenience, with the position of information corresponding to the index position: upper right for contravariant, lower right for covariant, lower left for transcovariant, and upper left for transcontravariant. The index position assigned to an object determines the transformation law that it follows.

Now let two arbitrary numbers with magnitude less than unity be represented by the letters $A$ and $B$ (chosen from the word ‘arbitrary’!). Let the arbitrary numbers represent in turn the off-diagonal elements of transformation matrices. The following Table shows the corresponding matrix notation:


Observe that this Table uses negative signs on the arbitrary $A$ and $B$ in the contravariant and transcontravariant cases, and positive signs in the covariant and transcovariant cases. This sign choice is used to help recall the prefixes ‘contra’ and ‘co’. Observe too that if $B=A$, we have space-time symmetry, which is the case of Lorentz transformations. And observe finally that if $B=0$, we have universal time, which is the case of Galilean transformations. But $A$ and $B$ are arbitrary, and so can also represent other transformations as yet unnamed.
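The sign conventions and special cases just described can be checked with a small sketch. The matrices below are an assumption: the placement of $A$ and $B$ and the common factor $1/\sqrt{1-AB}$ are inferred from the sign description above and from the prefactor $1/(1-AB)$ in Eqs. (45) to (50), not copied from the Table. The function name `four_matrices` and the sample values are hypothetical.

```python
import math

def transpose(P):
    """Transpose a 2x2 matrix stored as nested lists."""
    return [[P[0][0], P[1][0]], [P[0][1], P[1][1]]]

def inverse(P):
    """Invert a 2x2 matrix stored as nested lists."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det],
            [-P[1][0] / det, P[0][0] / det]]

def same(P, Q):
    """Compare two 2x2 matrices up to rounding error."""
    return all(math.isclose(P[i][j], Q[i][j], abs_tol=1e-12)
               for i in range(2) for j in range(2))

def four_matrices(A, B):
    """Column-vector forms of the four transformation matrices.

    ASSUMPTION: placement of A and B and the factor 1/sqrt(1 - A*B)
    are inferred, not taken from the original Table.
    """
    n = 1.0 / math.sqrt(1.0 - A * B)
    contra = [[n, -n * B], [-n * A, n]]   # negative signs: 'contra'
    co = [[n, n * B], [n * A, n]]         # positive signs: 'co'
    transco = transpose(inverse(contra))  # transposed inverse of contravariant
    transcontra = transpose(inverse(co))  # transposed inverse of covariant
    return contra, co, transco, transcontra

# Arbitrary case: A and B differ; the trans matrices keep the stated signs.
contra, co, transco, transcontra = four_matrices(0.6, 0.2)
n = 1.0 / math.sqrt(1.0 - 0.6 * 0.2)
arbitrary_ok = (same(transco, [[n, n * 0.6], [n * 0.2, n]]) and
                same(transcontra, [[n, -n * 0.6], [-n * 0.2, n]]))

# Lorentz case, B = A: trans and regular matrices coincide.
l_contra, l_co, l_transco, l_transcontra = four_matrices(0.6, 0.6)
lorentz_ok = same(l_transco, l_co) and same(l_transcontra, l_contra)

# Galilean case, B = 0: the time component is left untouched.
g_contra, _, _, _ = four_matrices(0.6, 0.0)
galilean_ok = same(g_contra, [[1.0, 0.0], [-0.6, 1.0]])

print(arbitrary_ok, lorentz_ok, galilean_ok)
```

With $B=A$ the matrices are symmetric, so the distinction between regular and trans indices disappears; this is one way to see why the distinction goes unnoticed when only Lorentz transformations are considered.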

6.5. Transformations for invariant objects

The underlying purpose of tensor calculus is to focus on mathematical objects that are ‘coordinate free’, or ‘frame independent’, or ‘invariant’ (whether in form or in numerical value). All of these expressions mean that coordinate transformation does not change anything fundamental about an object so described: neither the values of scalars, nor relationships expressed as equations involving tensors.

The user of tensor calculus expects certain behaviors. There should be number invariant inner products of vectors and of higher-order tensors. The ‘unity’, or ‘Kronecker delta’, is not presently regarded as a real tensor, but can be accepted as one if it can be demonstrated to be number invariant. Finally, the user will certainly expect a number invariant ‘metric tensor’, the essential tool for manipulating index positions to develop tensor equations. Showing that all these expectations can be met in the case of arbitrary transformations, not just Lorentz transformations, is the objective of this Sub-Section.

The matrix notation is useful in checking out the transformation of all these entities. For example, the preserved inner product of a vector $X$ with itself looks like (note the transpositions for operating on row vectors):

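$$({}_\beta\bar X)(\bar X^\beta) = \begin{bmatrix}{}_0\bar X & {}_1\bar X\end{bmatrix}\begin{bmatrix}\bar X^0\\ \bar X^1\end{bmatrix} = \begin{bmatrix}{}_0X & {}_1X\end{bmatrix}\frac{1}{1-AB}\begin{bmatrix}1 & +B\\ +A & 1\end{bmatrix}\begin{bmatrix}1 & -B\\ -A & 1\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix} = \begin{bmatrix}{}_0X & {}_1X\end{bmatrix}\begin{bmatrix}X^0\\ X^1\end{bmatrix} = ({}_\alpha X)(X^\alpha) \tag{45}$$

$$({}^\beta\bar X)(\bar X_\beta) = \begin{bmatrix}{}^0\bar X & {}^1\bar X\end{bmatrix}\begin{bmatrix}\bar X_0\\ \bar X_1\end{bmatrix} = \begin{bmatrix}{}^0X & {}^1X\end{bmatrix}\frac{1}{1-AB}\begin{bmatrix}1 & -B\\ -A & 1\end{bmatrix}\begin{bmatrix}1 & +B\\ +A & 1\end{bmatrix}\begin{bmatrix}X_0\\ X_1\end{bmatrix} = \begin{bmatrix}{}^0X & {}^1X\end{bmatrix}\begin{bmatrix}X_0\\ X_1\end{bmatrix} = ({}^\alpha X)(X_\alpha) \tag{46}$$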

The more familiar inner product $X_\alpha X^\alpha$ is preserved with Lorentz transformations, but not with arbitrary transformations. So it shouldn’t be considered any kind of ‘invariant’. The same is true of the unfamiliar $({}_\alpha X)({}^\alpha X)$.
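The contrast between the preserved and the merely familiar inner products can be illustrated numerically. This is a sketch under stated assumptions: the four matrices are written in column-vector form with the placement of $A$ and $B$ inferred from Eqs. (45) and (46), each vector is transformed by its own matrix before the dot product is taken, and the sample components and the helper names `apply`, `dot`, `four_matrices` and `contractions` are hypothetical.

```python
import math

def apply(M, x):
    """Multiply a 2x2 matrix by a 2-component column vector."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def dot(u, w):
    """Plain component-wise inner product of two 2-component vectors."""
    return u[0] * w[0] + u[1] * w[1]

def four_matrices(A, B):
    """ASSUMED column-vector forms; signs follow the text, placement is inferred."""
    n = 1.0 / math.sqrt(1.0 - A * B)
    contra = [[n, -n * B], [-n * A, n]]
    co = [[n, n * B], [n * A, n]]
    transco = [[n, n * A], [n * B, n]]
    transcontra = [[n, -n * A], [-n * B, n]]
    return contra, co, transco, transcontra

def contractions(A, B, u, w):
    """Return three transformed inner products of components u and w."""
    contra, co, transco, transcontra = four_matrices(A, B)
    eq45 = dot(apply(transco, u), apply(contra, w))   # transcovariant with contravariant
    eq46 = dot(apply(transcontra, u), apply(co, w))   # transcontravariant with covariant
    familiar = dot(apply(co, u), apply(contra, w))    # covariant with contravariant
    return eq45, eq46, familiar

u, w = [2.0, 3.0], [5.0, 7.0]  # arbitrary sample components
original = dot(u, w)           # 31.0

arbitrary = contractions(0.6, 0.2, u, w)  # A != B
lorentz = contractions(0.6, 0.6, u, w)    # A == B
print(original, arbitrary, lorentz)
```

For $A \neq B$ only the first two contractions return the original value; for $A = B$ all three do.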

With the extended tensor notation, we can identify the index positions that definitely make a number invariant Kronecker delta. It looks like (note the transpositions for operating on row vectors):

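$${}_\delta\bar\delta^\gamma = \frac{1}{1-AB}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = {}_\beta\delta^\alpha \tag{47}$$

$${}^\delta\bar\delta_\gamma = \frac{1}{1-AB}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix} = {}^\beta\delta_\alpha \tag{48}$$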

The more familiar $\delta_\beta^\alpha$ is preserved with Lorentz transformations, but not with arbitrary transformations. That is why it does not qualify as a tensor. The same is true of the unfamiliar ${}_\beta^\alpha\delta$.

Some readers will be surprised to see the present argument using the Lorentz metric, $\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}$, without accepting a limitation to Lorentz transformations. It is widely supposed that the Lorentz metric requires Lorentz transformations, and/or Lorentz transformations require the Lorentz metric. But such a connection is not in fact mandatory.

The generally preserved forms of the Lorentz metric tensor look like (note the transpositions for operating on row vectors):

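$${}^\delta\bar g^\gamma = \frac{1}{1-AB}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix} = \frac{1}{1-AB}\begin{bmatrix}1 & -A\\ -B & 1\end{bmatrix}\begin{bmatrix}1 & -A\\ +B & -1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix} = {}^\beta g^\alpha \tag{49}$$

$${}_\delta\bar g_\gamma = \frac{1}{1-AB}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix} = \frac{1}{1-AB}\begin{bmatrix}1 & +A\\ +B & 1\end{bmatrix}\begin{bmatrix}1 & +A\\ -B & -1\end{bmatrix} = \begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix} = {}_\beta g_\alpha \tag{50}$$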

The more familiar $g^{\beta\alpha}$ and $g_{\beta\alpha}$ are preserved with Lorentz transformations, but not with arbitrary transformations. They shouldn’t be considered any kind of ‘invariant’. The same is true of the unfamiliar ${}^{\beta\alpha}g$ and ${}_{\beta\alpha}g$.
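The two-index statements in Eqs. (47) to (50) can be checked the same way. The sketch below assumes that a two-index object transforms by multiplying on the left with the column-form matrix of its first index and on the right with the transpose of the column-form matrix of its second index; the matrices themselves, with their placement of $A$ and $B$, are inferred from those equations. The helper name `transform` and the sample values are hypothetical.

```python
import math

def matmul(P, Q):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(P):
    """Transpose a 2x2 matrix stored as nested lists."""
    return [[P[0][0], P[1][0]], [P[0][1], P[1][1]]]

def same(P, Q):
    """Compare two 2x2 matrices up to rounding error."""
    return all(math.isclose(P[i][j], Q[i][j], abs_tol=1e-12)
               for i in range(2) for j in range(2))

def transform(first, G, second):
    """Transform a two-index object G (assumed rule: first * G * second^T)."""
    return matmul(matmul(first, G), transpose(second))

A, B = 0.6, 0.2  # arbitrary sample values with A != B
n = 1.0 / math.sqrt(1.0 - A * B)
contra = [[n, -n * B], [-n * A, n]]       # ASSUMED placement of A and B
co = [[n, n * B], [n * A, n]]
transco = [[n, n * A], [n * B, n]]
transcontra = [[n, -n * A], [-n * B, n]]

delta = [[1.0, 0.0], [0.0, 1.0]]
metric = [[1.0, 0.0], [0.0, -1.0]]

# One trans index and one regular index: number invariant.
delta_47 = same(transform(transco, delta, contra), delta)
delta_48 = same(transform(transcontra, delta, co), delta)
metric_49 = same(transform(transcontra, metric, contra), metric)
metric_50 = same(transform(transco, metric, co), metric)

# Two regular indices: not preserved unless A == B.
delta_familiar = same(transform(co, delta, contra), delta)
metric_familiar = same(transform(contra, metric, contra), metric)

print(delta_47, delta_48, metric_49, metric_50, delta_familiar, metric_familiar)
```

Setting `B = A` in the sketch makes the last two checks succeed as well, matching the statement that the familiar forms are preserved under Lorentz transformations only.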

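The number invariant ${}^\alpha g^\beta$ and ${}_\alpha g_\beta$ can function to raise and lower indices on objects. For example, $X^\beta = ({}^\alpha g^\beta)\,X_\alpha$ and $X_\beta = ({}_\alpha g_\beta)\,X^\alpha$, or ${}^\alpha X = ({}^\alpha g^\beta)\,({}_\beta X)$ and ${}_\alpha X = ({}_\alpha g_\beta)\,({}^\beta X)$.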

One can also write additional index assignments for $g$. Altogether, there are 10 possible assignments: $4\times 3/2 = 6$ with indices in different corners, and 4 with both indices in the same corner.

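Two of the additional index assignments look like ${}_\beta g^\alpha$ and ${}^\beta g_\alpha$. These two entities cannot do anything to an index except change its name. For example, $X^\alpha = X^\beta({}_\beta g^\alpha)$ and $X_\alpha = X_\beta({}^\beta g_\alpha)$, or ${}_\beta X = ({}_\beta g^\alpha)\,({}_\alpha X)$ and ${}^\beta X = ({}^\beta g_\alpha)\,({}^\alpha X)$. So ${}_\beta g^\alpha$ and ${}^\beta g_\alpha$ are just equivalent to the number invariant ${}_\beta\delta^\alpha$ and ${}^\beta\delta_\alpha$ already noted above.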

Further additional index assignments on $g$ create entities that can serve to convert a regular index into a trans one, or a trans one into a regular one. None of these entities are number invariant, but in practice, that does not matter. The user does not convert just a single object; the user converts a whole tensor equation. The index-converting $g$ entities typically occur in pairs, and the pairs contract to number invariant objects. When they don’t occur in pairs, they do occur on both sides of an equation, and cannot affect the issue of equation form invariance.

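Another two of these $g$’s are $g_\beta^\alpha$ and ${}_\alpha^\beta g$. They function to do ${}^\beta X = ({}_\alpha^\beta g)\,X^\alpha$ and ${}_\alpha X = ({}_\alpha^\beta g)\,X_\beta$, or $X_\beta = g_\beta^\alpha\,({}_\alpha X)$ and $X^\alpha = g_\beta^\alpha\,({}^\beta X)$. As a pair, they contract to $({}_\beta^\gamma g)(g_\gamma^\alpha) = {}_\beta g^\alpha$, or to $({}_\gamma^\beta g)(g_\alpha^\gamma) = {}^\beta g_\alpha$, both of which are number invariant.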

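The final four indexed $g$’s are $g^{\alpha\beta}$, $g_{\alpha\beta}$, ${}^{\alpha\beta}g$, and ${}_{\alpha\beta}g$. They can all function to change a regular index into a trans one, or a trans one into a regular one, but with a twist: ‘co’ goes to ‘contra’, or ‘contra’ goes to ‘co’. That is, ${}^\alpha X = ({}^{\alpha\beta}g)\,X_\beta$, ${}_\alpha X = ({}_{\alpha\beta}g)\,X^\beta$, $g^{\alpha\beta}\,({}_\alpha X) = X^\beta$ and $g_{\alpha\beta}\,({}^\alpha X) = X_\beta$. As noted above, the contractions $({}_{\beta\gamma}g)(g^{\gamma\alpha}) = {}_\beta g^\alpha$ and $({}^{\beta\gamma}g)(g_{\gamma\alpha}) = {}^\beta g_\alpha$ are number invariant.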

The bottom line is this: to be sure of invariance under arbitrary transformation, not just Lorentz transformation, always contract a regular index with a trans index.

6.6. General invariance for Maxwell’s equations

Maxwell’s equations in current tensor notation read:




The two-index $F^{\alpha\beta}$ and $D^{\alpha\beta}$ tensors refer to the electromagnetic field and the ‘dual’ thereof. The electromagnetic field tensor $F^{\alpha\beta}$ has merates that are components of the three-dimensional electric and magnetic field vectors, $\mathbf{E}$ and $\mathbf{B}$. The $D^{\alpha\beta}$ is the dual to $F^{\alpha\beta}$, whose merates are components of $\mathbf{B}$ and $\mathbf{E}$. The one-index tensors $J^\beta$ and $\partial_\alpha$ refer to the source charge-current density vector and the differential operator vector. The indices $\alpha$ and $\beta$ take the four values $0, 1, 2, 3$.

The seeming limitation of Maxwell’s equations to invariance only under Lorentz transformation arises entirely from the differential operator being written as a covariant vector. In the extended tensor algebra, this operator is identified as transcovariant, and then Maxwell’s equations look like:




Written this way, Maxwell’s equations are manifestly form invariant, not only under Lorentz transformation, but also under any arbitrary (just well-behaved) transformation, including Galilean transformation.

7. Conclusions

About Maxwell’s equations and photons: Photons have a life history that begins with emission as an electromagnetic pulse, proceeds with development into a waveform, then changes into regression back to a pulse, and ends with absorption by a receiver. This life history of the photon can be modeled by imagining some mirrors that apply boundary conditions corresponding to the desired scenario, feeding a Gaussian pulse at the source to Maxwell’s equations, watching Hermite polynomials emerge, and then watching them finally pile up at the receiver.

About EM signals and photons: The life history of the photon suggests that the assumption upon which Einstein’s SRT is founded is over-simplified. If we make the founding assumption more realistic, then we get more believable results. The more believable results can help us reconcile SRT with the QM of atoms. We can understand why Planck’s constant occurs. It represents the balance between competing phenomena: on the one hand, energy loss due to radiation from accelerating charges; on the other hand, energy gain due to internal torquing within the atomic system due to finite speed of signal propagation.

About Atoms: Viewed in the right way, chemical and spectroscopic data reveal a tremendous amount of regularity. So we are well equipped to interpolate and extrapolate for situations where actual data are not available. We can analyze scenarios where electrons are subtracted from or added to an atom, all at once, or one at a time; whatever we need. But take care: in the existing literature, the distinction between ‘all-at-once’ and ‘one-at-a-time’ is often obscure.

About Maxwell and Newton: There should have been no conflict between Maxwell’s equations and Newton’s equations over the issue of transformation invariance. Maxwell’s equations are form invariant under Galilean transformations, just as they are form invariant under Lorentz transformations. Physics does not have conflicts. Only people have conflicts. And people can resolve their conflicts. The conflict perceived in the case of Newton vs. Maxwell is resolved with an extension of mathematical formalism.

About Physics in General: This work has shown that SRT deserves a moment of caution, and the reader may reasonably worry that GRT deserves some caution too. So it may be premature to develop a theory of quantum gravity. Placing the QG capstone onto the RT and QM pillars of 20th century physics may produce something that resembles the ancient constructions at Stonehenge, but not the Gothic cathedrals of Europe, much less anything modern.

8. Appendix 1. Numerical data on ionization potentials for all elements

Periods 1, 2 and 3.

Period 4.

Period 5.

Period 6.

Period 7.


The author thanks colleagues for deep and intense conversations on the topics discussed here, especially Dr. Peter Enders, Dr. Yuri Keilman, Dr. Robert Kiehn, Prof. Zbigniew Oziewicz, and Dr. Tom Phipps.


  • Note that I speak of MED, not of Classical Electrodynamics (CED) in general. CED involves, not only the works of Maxwell, but also those of a large number of other individuals. I am inclined to trust results from Maxwell, but question some of those from other authors, as reported in the present work.
  • Wheeler and Feynman were looking to time symmetry as the basis for an electromagnetic generalization of instantaneous (Newtonian) gravitational interaction. There are important differences between the regressing waveforms introduced above and the Wheeler-Feynman advanced solutions: 1) Wheeler and Feynman were looking at interactions between essentially point sources and receivers, and so had to be looking at spherically expanding retarded solutions and spherically contracting advanced solutions, not at essentially one-dimensional expanding and contracting wavelets. 2) The Wheeler-Feynman expansion or contraction is related to the spherical area of a wave front, not the waveform in the radial propagation direction. 3) A lengthy discussion of the paradox of advanced actions is necessitated in the Wheeler-Feynman work, whereas the ‘regressing’ solutions introduced here are not in fact ‘advanced’ at all; they are just regressing, in real time, in the propagation direction.

© 2012 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Cynthia Kolb Whitney (February 24th 2012). Better Unification for Physics in General Through Quantum Mechanics in Particular, Theoretical Concepts of Quantum Mechanics, Mohammad Reza Pahlavani, IntechOpen, DOI: 10.5772/35211. Available from:
