Einstein’s Special Relativity
In Part I of this three-part series on Einstein we focused on his biography and on some of the life experiences that shaped his personality and intellect. It is time now to make the effort to understand the meaning and significance of his scientific contribution.
The best place to start is Einstein’s 1905 explanation of the photoelectric effect, which earned him the 1921 Nobel Prize in Physics. The photoelectric effect is the observation that many metals emit electrons when light shines upon them. It was first observed by Heinrich Hertz in 1887. This photoelectric effect is not to be taken lightly. It means that you can start an electric current in a circuit simply by shining light on a metal plate. The closely related photovoltaic effect is the basis of solar power today.
This phenomenon is actually rather simple. We all know that light warms up the surfaces of objects when it shines on them for some time. That means that the object receives energy from the beam of light. It makes sense that, at some point, the energy supplied by the light will cause an electron in the object to increase its kinetic energy and escape. This escape is much more likely to occur in a metal because, as we know, metals have a plentiful supply of free electrons in their structure. These are electrons that are not strongly bound to specific atoms and are free to move about inside the metal.
An explanation of photoelectric emission based on classical physics is very much like the common sense explanation just described. Energy is transferred from light to the electrons in the metal. Low light intensities would not provide sufficient energy to increase the kinetic energy of electrons and cause them to escape the metal. As light intensity exceeds a certain threshold level, the energy carried by the light wave is greater and some electrons will begin to escape. Even more electrons will escape when the intensity of light is increased further.
Amazingly, this is not exactly what the investigations showed. Experiments performed in 1902 by Philipp Lenard, a German physicist who had worked as assistant to Hertz, gave results that could not be explained by classical theory. For example, the number of electrons that escaped increased with light intensity but their kinetic energies did not. The kinetic energies increased only with increasing light frequency. Blue light caused the emitted electrons to move faster than red light did. The dependence on frequency did not make any sense in classical theory.
In his groundbreaking 1905 paper Einstein developed a theory that explained this unexpected result. Einstein solved the paradox by building on Planck’s quantum hypothesis and proposing that light energy does not flow continuously but comes in little lumps, now called photons. He showed mathematically that the maximum kinetic energy of the emitted electrons increases linearly with the frequency of the incident light and is independent of its intensity. Einstein’s interpretation elevated the scientific value of Planck’s theory, making it a cornerstone of modern physics. We do not know if Planck’s long friendship with Einstein was in any way the result of Planck’s gratitude.
Einstein’s 1905 paper on the photoelectric effect is a masterpiece of mathematical logic and simplicity. Readers who are mathematically inclined will gain tremendous value from reading the original paper. We will however resist the temptation to fill this page with mathematical symbols and equations and will opt instead for an understanding of the physical process.
We know that in a metal there are electrons that are not strongly attached to specific atoms and are free to move around. These electrons feel attractive forces from atoms, but these forces largely cancel out as they come from all directions around the electron. If the electron is near the surface of the metal, it may receive just enough energy from a striking photon to acquire the kinetic energy required for an escape. When a photon strikes an electron, the photon disappears. It never had any mass, after all; it was all energy. The photon’s energy is now absorbed by the electron as kinetic energy and the electron moves faster. The energy received from the photon is proportional to the photon’s frequency, as we know from Planck’s law. If the frequency (and hence the energy) of the incident photons is high enough, as in ultraviolet light, more and more electrons will gain the necessary kinetic energy for an escape, including some of those electrons that are on the outer orbits of specific atoms.
Increasing the intensity of incident light only increases the number of photons that collide with electrons. If the light has low frequency, none of these photons carries enough energy to free an electron. The electron gains kinetic energy equal to the photon’s energy, but if this is not sufficient the electron will move faster than before and stay confined in the metal. Only increasing the frequency (colour) of the incident light past the threshold point will allow electrons to escape and hence start the emission. This was confirmed in later experiments. It was also confirmed that light of low intensity but high frequency was able to start the emission.
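The threshold behaviour described above can be put into numbers with a short Python sketch. It uses Einstein’s photoelectric relation in its usual textbook form: the maximum kinetic energy of an emitted electron is the photon energy minus the energy needed to leave the metal. That escape energy, usually called the work function, is not named in this article, and the value of 2.3 electronvolts used here is an assumed, illustrative figure of the order quoted for an alkali metal. The function names and the three wavelengths are likewise chosen only for illustration.

```python
# Physical constants (exact SI values)
PLANCK_H = 6.62607015e-34         # Planck's constant, joule-seconds
ELECTRON_VOLT = 1.602176634e-19   # joules per electronvolt
LIGHT_SPEED = 299_792_458.0       # metres per second

# Assumed, illustrative work function: the energy an electron needs
# to leave the metal. This number is NOT taken from the article.
WORK_FUNCTION_EV = 2.3


def photon_energy_ev(wavelength_nm):
    """Energy of one photon, E = h * f, where f = c / wavelength."""
    frequency_hz = LIGHT_SPEED / (wavelength_nm * 1e-9)
    return PLANCK_H * frequency_hz / ELECTRON_VOLT


def max_kinetic_energy_ev(wavelength_nm, work_function_ev=WORK_FUNCTION_EV):
    """Maximum kinetic energy of an emitted electron, or None if the
    photon energy is below the threshold and no electron escapes."""
    surplus = photon_energy_ev(wavelength_nm) - work_function_ev
    return surplus if surplus > 0 else None


# Intensity never appears: it changes how many photons arrive,
# not how much energy each photon carries.
for colour, wavelength_nm in [("red", 700.0), ("blue", 450.0), ("ultraviolet", 250.0)]:
    energy = photon_energy_ev(wavelength_nm)
    kinetic = max_kinetic_energy_ev(wavelength_nm)
    print(colour, round(energy, 2), "eV per photon; emitted electron:", kinetic)
```

With these assumed numbers, red light at 700 nanometres falls below the threshold and frees no electrons however intense it is, while blue and ultraviolet light do, and the ultraviolet electrons leave with more energy than the blue ones.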
We can now summarize the photon theory of light as follows: Light consists of small indivisible chunks of energy called photons, which move in a wave-like fashion at the speed of light. When a photon collides with an electron, the photon disappears. If the electron is a free electron, its kinetic energy increases. If it is a bound electron in an atom, the electron jumps to a higher orbit and its potential energy increases. When the electron falls back to a lower orbit, it releases a photon. Each photon has energy equal to Planck’s constant multiplied by the frequency of its wave-like motion.
A good way to visualize the photon’s wave-like motion through space is to think of a snake trying to move quickly. If you drive on rural roads and have seen a snake trying to cross the road, you may have noticed that the snake transports its body in a wave-like fashion and moves forward along a path at a certain speed. The photon moves in a similar way but moves at the speed of light. Photons of blue light move in waves of higher frequency than photons of, say, red light but both types move at the same speed, the speed of light. As a result of its higher frequency, blue light transports more energy than red light.
The photon theory of light, also called the quantum theory of light, does not replace classical electromagnetic theory. It adds to it in a way that makes possible the explanation of experimental results. This is again a demonstration of how preconceived ideas must change when empirical evidence proves them wrong or inadequate.
The annus mirabilis papers of 1905, Einstein’s extraordinary year, included a paper titled Does the inertia of a body depend upon its energy content? This is the paper where Einstein developed the equivalence of mass and energy, E = mc², the most famous equation in history. The paper is only about three pages long and has no more than eight equations. The mathematical derivation is remarkably simple and uses no higher mathematics than algebra.
We realize, of course, that c² is a very large number. It is the square of the speed of light; with the speed expressed in metres per second, it is roughly the number 9 followed by 16 zeros. (Einstein comes up with 20 zeros, as he uses an older system of units based on the centimetre.) This means that a tiny piece of mass has an enormous energy content.
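The arithmetic is easy to check. The Python sketch below squares the speed of light in both unit systems and then works out the energy content of one gram of matter. The choice of one gram and the function name are only for illustration.

```python
LIGHT_SPEED = 299_792_458.0  # metres per second (exact SI value)


def rest_energy_joules(mass_kg):
    """Energy content E = m * c**2 of a mass given in kilograms."""
    return mass_kg * LIGHT_SPEED ** 2


c_squared_si = LIGHT_SPEED ** 2             # speed in metres per second
c_squared_cgs = (LIGHT_SPEED * 100.0) ** 2  # speed in centimetres per second

print(f"c squared, metre-based units:      {c_squared_si:.2e}")
print(f"c squared, centimetre-based units: {c_squared_cgs:.2e}")
print(f"energy content of one gram:        {rest_energy_joules(0.001):.2e} joules")
```

The two values of c² differ by a factor of ten thousand, which accounts for the four extra zeros in Einstein’s figure. One gram of matter corresponds to roughly 9 followed by 13 zeros in joules.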
The simplicity of Einstein’s language and mathematics is evident in this paper and refutes the old adage that says it is easier to understand the work of a great thinker from the writings of others. With Einstein’s work our understanding is greatly enriched by reading the original papers.
One of the 1905 papers deals with the special relativity theory and is titled On the Electrodynamics of Moving Bodies. This is the paper that outlines Einstein’s ideas about the relativity of space and time. As with all of Einstein’s work, we will focus on the development of the theory and the understanding of the physical concepts that underlie the mathematical formalism. Mathematics cannot be an end; it is a means to an end. As we are not mathematicians, we will not rid ourselves of the anxiety of assigning physical meaning to mathematical symbols. We will need to extract the scientific truth from the dryness of mathematical symbolism and make it part of our understanding, intuition and common sense.
We may remember from high school how we used Cartesian coordinates to describe an object’s position in three-dimensional space with the three coordinates x, y and z. The Cartesian system is an example of a reference system and it is as useful in relativity as it was in classical physics. If our object of interest is moving, we simply track the changes of its Cartesian coordinates within the fixed Cartesian system. Nothing prevents us, of course, from thinking of a moving Cartesian system with a moving body inside. This is the case when we have a body that moves inside a moving train. In this case we have a fixed Cartesian system C, which is the stationary earth, as well as a Cartesian system C’, our train, which is moving relative to C.
The earth, of course, is not stationary but sometimes we will need to assume that our basic reference frame is stationary in order to explore certain phenomena. In relativity we will have to modify the Cartesian system somewhat by adding the coordinate “time”. Since it is not so easy to visualize a four-dimensional coordinate system (the three spatial coordinates x, y, z, plus time) we will think of our familiar three-dimensional Cartesian system that moves in space as time passes or, better yet, moves in spacetime. In much of our discussion we will use the word frame to mean reference frame or coordinate system.
Everything in the universe is in motion and objects acquire their time dimension through their motion. Suppose for a moment that the universe consists of bodies that are fixed in space and nothing moves. There is a distribution of matter throughout our imaginary universe and matter consists of objects of different densities, such as rocks, planets, meteorites, dust, air, but nothing ever moves. We can describe this universe completely by writing the space coordinates of each molecule of matter. Time does not exist and it is superfluous in our description of the universe.
Let us now suppose that, for some reason, a molecule moves to a new position in space. It does not matter what the reason is. It could be a beam of light coming from somewhere that provides the necessary energy for the molecule’s motion. Now that we have motion, we need the concept of time for a full description of our universe. The dimension of time is born with motion, which is nothing but a change of position in space. We still cannot measure time because we do not have a periodic event that will help us define a unit of measuring time. But time does exist, because we now have several different conditions of the universe defined by the path of the molecule from its first to its final position. If we had a means of measuring time, we could even define velocity as the change in the molecule’s position divided by the time taken to get there. So velocity is change in position per unit of time, while time is also defined only when there is a change in position. It seems that there is a great deal of similarity between time and velocity and we will see a relationship between the two in Einstein’s idea of time dilation.
Two of the most important results of special relativity are length contraction and time dilation. Length contraction means that when an object is moving at a constant speed with respect to an observer, its length in the direction of motion, as measured by the observer, is shortened: it equals the object’s length at rest divided by the Lorentz factor. The Lorentz factor is a correction factor that depends on the ratio of the object’s speed to the speed of light; it is equal to 1 at rest and grows without limit as the speed approaches that of light. Time dilation means that when an observer is moving at a constant speed with respect to another observer, the clock of the moving observer appears to the stationary observer to tick more slowly than the stationary observer’s own clock. The interval registered by the stationary observer equals the interval shown on the moving clock multiplied by the Lorentz factor.
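For readers who like to see numbers, here is a minimal Python sketch of the Lorentz factor in its standard form, one divided by the square root of one minus the squared ratio of the speed to the speed of light. In the usual formulation a moving length is divided by this factor and a moving clock’s interval is multiplied by it. The function names and the sample speeds are chosen only for illustration.

```python
import math

LIGHT_SPEED = 299_792_458.0  # metres per second


def lorentz_factor(speed):
    """Lorentz factor 1 / sqrt(1 - v^2 / c^2), defined for 0 <= v < c."""
    if not 0.0 <= speed < LIGHT_SPEED:
        raise ValueError("speed must be non-negative and below the speed of light")
    return 1.0 / math.sqrt(1.0 - (speed / LIGHT_SPEED) ** 2)


def contracted_length(rest_length, speed):
    """Length along the direction of motion as measured by the observer."""
    return rest_length / lorentz_factor(speed)


def dilated_interval(moving_clock_interval, speed):
    """Interval registered by the stationary observer for one interval
    shown on the moving clock."""
    return moving_clock_interval * lorentz_factor(speed)


# Sample speeds as fractions of the speed of light (illustrative choices).
for fraction in (0.001, 0.5, 0.9, 0.99):
    speed = fraction * LIGHT_SPEED
    print(
        f"v = {fraction} c: factor = {lorentz_factor(speed):.6f}, "
        f"1 m rod measures {contracted_length(1.0, speed):.6f} m, "
        f"1 s tick lasts {dilated_interval(1.0, speed):.6f} s"
    )
```

At six tenths of the speed of light the factor is 1.25, so a ten-metre rod measures eight metres. At a thousandth of the speed of light, which is still about 300 kilometres per second, the factor differs from 1 by less than one part in a million.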
The Lorentz factor is a relativistic relation that appears in much of Einstein’s work and precedes Einstein’s 1905 papers by about ten years. In fact, special relativity was initially called the Lorentz-Einstein theory for its dependence on Lorentz’s fundamentals. The factor was developed in the 1890s by Hendrik Lorentz, a Dutch physicist who devised a mathematical transformation to explain how the speed of light was independent of the reference frame and to understand the symmetries of Maxwell’s equations.
We need to understand why time slows down at fast speeds, as this is an important outcome in special relativity. Let us think about how we measure time. We do so by observing and registering repeating events. For example, a pendulum swinging back and forth is endlessly repeating the same swing and records one second of time per swing. If the clock is in the moving train together with an observer, the observer will see the pendulum swinging back and forth exactly the same as when the train was stationary. But a stationary observer standing on the station platform as the train passes by will see the train’s clock pendulum traversing a greater distance from one end of the swing to the other. The distance is greater because the pendulum must travel across to complete its swing and at the same time travel forward in the direction of the train’s motion. The pendulum will still register one second of time but the distance traversed will be greater. One second on the train’s clock therefore lasts longer for the outside observer than one second did when the train was stationary. The slowing of time is not perceptible at earthly speeds but becomes significant at very high speeds that are comparable to the speed of light. This slowing of time is called time dilation.
If we use a clock that works with light pulses instead of a mechanical pendulum clock, the result will be the same. This type of clock has a light source and light detector at its base and a mirror at a height h from its base. We assume that the clock is moving at a high speed along with an observer on its platform. The light source sends a light pulse that is reflected by the mirror and comes back to the light detector. The moving observer sees that the total distance travelled by the light pulse is 2h. The time it takes for the round trip is 2h/c, which is the distance travelled divided by the speed of light. The clock registers each round trip as one unit of time.
Let us now suppose that there is a stationary observer who stands by as the clock moves past. The stationary observer sees the source send a light pulse, but by the time the pulse hits the mirror, the mirror has moved somewhat in the direction of motion. The pulse is reflected back, and by the time it hits the detector, the detector has moved some more, again in the direction of motion. The stationary observer sees that the pulse travels a longer distance than before. Since the speed of light is the same for both observers, the longer path takes more time, and he therefore registers a longer unit of time on his own apparatus than the moving clock does.
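For the mathematically inclined, the light-clock argument can be condensed into three lines of algebra. The sketch below introduces symbols that are not used in the text above: v for the speed of the clock relative to the stationary observer, Δt₀ for one round trip as timed on the moving clock, and Δt for the same round trip as timed by the stationary observer. It also assumes, as special relativity does, that both observers measure the same speed of light c.

```latex
% Moving observer: the pulse travels straight up and down, a distance 2h.
\Delta t_0 = \frac{2h}{c}

% Stationary observer: during each half of the round trip the clock moves
% sideways by v\,\Delta t / 2, so each leg of the light path is the
% hypotenuse of a right triangle with sides h and v\,\Delta t / 2.
c\,\Delta t = 2\sqrt{h^{2} + \left(\frac{v\,\Delta t}{2}\right)^{2}}

% Squaring both sides and solving for \Delta t gives the time dilation formula.
\Delta t = \frac{2h}{c}\,\frac{1}{\sqrt{1 - v^{2}/c^{2}}} = \gamma\,\Delta t_0,
\qquad
\gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}
```

The quantity γ is the Lorentz factor. It is always at least 1, so the stationary observer always registers a longer interval than the moving clock shows.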
It is important to realize that time dilation is not an optical illusion and is not caused by the subjectivity of the human observer. It is caused by the relative speed of the two reference frames. The same time dilation would be recorded if instead of a human observer we had a clock or a photographic plate.
A logical consequence of time dilation is the loss of simultaneity. This is an important idea in special relativity. It means that events that are simultaneous to the observer on the station platform are not simultaneous to the observer on the moving train. The greater the speed of the train, the more pronounced the loss of simultaneity will be. Therefore, every reference frame has its own particular time. The statement of the time of an event has no meaning unless we are told the reference frame to which the statement refers.
Similarly, the idea of length contraction means that the length of a moving object, such as a rod, is subject to the Lorentz factor. The physical size and structural compactness of the rod certainly do not change, but the moving rod does appear shorter to the stationary observer: its measured length is its rest length divided by the Lorentz factor. Length contraction is difficult to prove by direct experiment because objects of measurable length cannot be accelerated to relativistic speeds with current technology. The only objects travelling at near-relativistic speeds are atomic particles, but they are too small to allow a measurement of contraction.
Time dilation and length contraction are not optical illusions and are not a result of human subjectivity. We might call them measurement illusions or, better yet, relativistic effects. They must be taken into account when we make observations of objects moving at relativistic speeds. Lengths and times no longer have the absolute character attributed to them in Euclidean geometry and classical physics.
Relativity tells us that the relative speed between object and observer must be accounted for, as it changes the measurement of lengths and times. Time can no longer be regarded as independent of position and motion and this is what makes it necessary for us to think in terms of spacetime. A point in the Cartesian frame represents the position coordinates x,y,z of an object in space. A point in the four dimensional spacetime frame represents the existence of a physical event in spacetime. Our focus in relativity has shifted from objects to events.
Has relativity replaced Newtonian mechanics? The answer must be positive for the microcosm, where we can have relativistic speeds at the atomic level. But the answer must be negative for the macrocosm, where the speeds of earthly and planetary objects are a tiny fraction of the speed of light.
This series on Einstein concludes with Part III, to be published soon. We will talk about General Relativity which, according to some scholars, is the most beautiful physical theory ever invented.
Our goal is to gain a basic understanding of modern science in our natural language, using our common and uncommon sense, away from complex mathematical formalism. The interested reader may find more advanced but still understandable descriptions in Bertrand Russell’s The ABC of Relativity and also in Albert Einstein’s The Meaning of Relativity.