Grokking Special Relativity

© 2013 by  Fernando Caracena

The Special Theory of Relativity in a nutshell

Einstein sacrificed everything measured about time and space in an inertial frame of reference in order to preserve the speed of light in a vacuum as being the same for every inertial observer. In his thinking about events, he expanded the idea of a position vector, which relates only to space, to a four-vector, which also incorporates the time when an event happens. A four-vector describes everything about the when and where of an event. Just as position vectors can be reconciled between inertial frames of reference by a Galilean Transformation, the four-vectors described by two different inertial observers can be reconciled by a Lorentz Transformation that mixes their time and space components.

The Special Theory of Relativity occupies a transitional position between classical and modern physics in several ways. It deals with bulk (macroscopic) matter and at the same time with atomic and subatomic (microscopic) matter. Corrections for relativistic effects are important for the Global Positioning System (GPS), which is macroscopic. Further, through the equivalence of mass and energy, Special Relativity accounts for how particles of matter, when colliding at speeds near that of light, form additional particles of matter. This happens in many experiments of modern physics, such as those performed at the Large Hadron Collider.

In what follows, the Theory of Special Relativity is developed in a different way from how it is usually developed in textbooks, and in the simplest possible notation, which uses four-vectors. For an alternate derivation using the Lorentz transformations between various frames of reference, see the video by a professor at Yale (see here). A lecture on Special Relativity by a professor of physics at Stanford University, Dr. Leonard Susskind, is available online.

The Equivalence of Mass and Energy

We begin the discussion of Special Relativity with a formula of Einstein's that almost every modern person recognizes,

$E = m c^2$,                                                                                                  (1a)

where E is the entire energy content of an object and m is its mass. This equation (1a) is called the most famous equation in physics.

Equation (1a) expresses the idea that the substance of matter is energy, which manifests itself as mass. The rest mass of an object represents an enormous amount of bound energy that roars under the hood of matter. Think about it: there are a lot of contained processes that are somehow locked up in the forms that we recognize as material objects. As an example, consider how much energy is invested in a kilogram (about 2.20462 lbs) of matter at rest. The speed of light in a vacuum is very close to $3.0 \times 10^8$ m/s, so that the energy content of a 1.0 kg object is

$E = 1.0 \times 9.0 \times 10^{16}$ J,

or, to the nearest order of magnitude,

$E \approx 10^{17}$ J.

Roughly this amount of energy was released at the Earth's surface by the 2004 Indian Ocean earthquake (magnitude 9.1–9.3), and about 1.7 times this much energy from the Sun strikes the face of the Earth each second. Also, note that the total world energy consumption, about $5.0 \times 10^{20}$ J per year, is equivalent to the energy content of about 5,000 kg of matter (just about 5 tons).

For comparison with the motion of ordinary objects, consider the kinetic energy of a 1 kg mass travelling at 10 m/s,

$KE = \tfrac{1}{2} m v^2$,

$KE = 50$ J,

which is a very small fraction of its rest energy (a ratio of about $5 \times 10^{-16}$).
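The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper names `rest_energy` and `kinetic_energy` are illustrative, and the speed of light is rounded to 3.0 x 10^8 m/s as in the estimate above.

```python
C = 3.0e8  # speed of light in m/s, rounded as in the text


def rest_energy(mass_kg):
    """Rest energy E = m c^2 in joules."""
    return mass_kg * C**2


def kinetic_energy(mass_kg, speed_m_s):
    """Newtonian kinetic energy (1/2) m v^2 in joules, valid for v << c."""
    return 0.5 * mass_kg * speed_m_s**2


e_rest = rest_energy(1.0)        # 9.0e16 J for a 1 kg mass
ke = kinetic_energy(1.0, 10.0)   # 50 J for 1 kg at 10 m/s
ratio = ke / e_rest              # about 5.6e-16

print(f"rest energy:    {e_rest:.1e} J")
print(f"kinetic energy: {ke:.1f} J")
print(f"ratio KE/E:     {ratio:.1e}")
```

With the rest energy rounded up to 10^17 J, the ratio becomes the 5 x 10^-16 quoted above.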

Another example:

The mass of a standard baseball is about 0.145 kg. A fastball travels at around 100 miles per hour, which is about 45 m/s. The kinetic energy of a fastball is therefore about 150 joules. However, the rest energy of a baseball is about $1.3 \times 10^{16}$ joules. The ratio of the kinetic energy of a fastball to its rest energy is about $1.1 \times 10^{-14}$, a pure number.


Exercise: Show that, for an object travelling at a speed that is low compared to that of light, the ratio of its kinetic energy to its rest energy is $\tfrac{1}{2}(v/c)^2$. Consider that the muzzle velocity of a rifle bullet is about 400 m/s and that the escape velocity for an object fired from the surface of the Earth is about 11,200 m/s. Even these very fast speeds are small compared to the speed of light, and so they add only minuscule amounts to the total energy of a mass moving at such speeds.

Relativistic increase in mass with increased speed

Consider that the kinetic energy of an object also adds to its mass, even though the amount may be minuscule. First, consider an object at rest, then accelerate it to a speed which is small compared to the speed of light. The rest energy of the object, $E_0$, corresponds to a certain amount of rest mass, $m_0$, as follows:

$E_0 = m_0 c^2$.                                                                                                  (1b)

When it is moving at a speed $v$, the total energy of the object is

$E = m_0 c^2 + \tfrac{1}{2} m_0 v^2$.                                                                                  (1c)

Now, we take (1c) to be the low-speed approximation, valid for $v \ll c$ (that is, $v/c \ll 1$), of an exact relativistic formula. The reciprocal of the square root of a quantity very close to 1 can be approximated, for $\varepsilon \ll 1$, as

$1/\sqrt{1-\varepsilon} \approx 1 + \tfrac{1}{2}\varepsilon$.                                                                                      (1d)

Factoring (1c) as $E = m_0 c^2 (1 + \tfrac{1}{2} v^2/c^2)$ and reading (1d) from right to left with $\varepsilon = v^2/c^2$, we rewrite (1c) as

$E = m_0 c^2/\sqrt{1 - v^2/c^2}$.                                                                                   (1e)

By equating (1a) and (1e), we get Einstein's famous equation for how the mass of an object varies with its speed,

$m = m_0/\sqrt{1 - v^2/c^2}$.                                                                                          (2)

For ordinary objects, the increase in mass with their speed is essentially unmeasurable; but this is not so for subatomic particles that move very close to the speed of light.
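To get a feeling for the sizes involved, the following Python sketch compares the factor 1/√(1 − v²/c²) from (2) with its low-speed approximation from (1d). The function names and the choice of sample speeds (the bullet and escape velocity mentioned earlier, plus two illustrative fractions of c) are assumptions made for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v):
    """Exact factor 1/sqrt(1 - v^2/c^2), so that m = factor * m0 as in (2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def low_speed_factor(v):
    """First-order approximation 1 + (1/2)(v/c)^2 from (1d)."""
    return 1.0 + 0.5 * (v / C) ** 2


speeds = {
    "rifle bullet": 400.0,
    "escape velocity": 11_200.0,
    "0.5 c": 0.5 * C,
    "0.99 c": 0.99 * C,
}

for label, v in speeds.items():
    exact = lorentz_factor(v) - 1.0     # fractional mass increase, exact
    approx = low_speed_factor(v) - 1.0  # fractional mass increase, approximate
    print(f"{label:16s} exact: {exact:.3e}   approximate: {approx:.3e}")
```

For the two everyday speeds the exact and approximate values agree closely and are far below anything measurable; near the speed of light the approximation fails and the exact factor grows without bound.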

Postulates of the Special Theory of Relativity

First postulate

Einstein reaffirmed Galileo's and Newton's idea that the laws of physics are the same in all inertial frames of reference. Before Einstein's formulation of Special Relativity, the transformation between inertial frames of reference that preserved the invariance of the laws of physics was known as the Galilean transformation. It runs as follows for two frames of reference, one 'at rest' and another moving relative to it to the right, parallel to the x-axis, beginning from coincidence at t=0:

x' = x - v t ,                                                                             (G.1)

y'=y  ,                                                                                      (G.2)

z'=z ,                                                                                       (G.3)

t'=t .                                                                                        (G.4)

Note that the transformation assumes that at t=0 the two inertial frames of reference coincide, with the corresponding axes overlapping. Thereafter, the origins separate with a velocity v, which is parallel to the positive x-axis.

Fig. 1. Two inertial frames of reference moving past each other parallel to their x-axes at a velocity, v, which were coincident at t=0 and t'=0.

The Galilean transformations (Fig. 1) seem almost too simple to write down; time is the same in the two frames of reference and so are the coordinates at right angles to the direction of motion. Only the separation of the origin of coordinates along the x-axis appears in (G.1). To apply to Special Relativity, it is necessary to rewrite these equations.


Second postulate

Einstein assumed that the speed of light in a vacuum is a universal constant, which will be the same for all observers in inertial frames of reference. This means that there is no universal background substance in the universe relative to which we can compare our motion. The stronger statement is that there is no luminiferous aether, which is perhaps too strong a statement given our current understanding of physics. If such a substance were to exist, it would not have space and time properties that would alter our notion of the invariance of the laws of physics for all inertial observers, nor the invariance of the speed of light in a vacuum.

Let us do some trivial algebra using the Galilean Transformation (G.1-G.4). At some time $t_1$, which is the same in both frames of reference, let us find the equivalent distance between two points along the x-axis, $x_2$ and $x_1$, in the moving frame:

$x'_2 = x_2 - v t_1$,                                                                                              (3a)

$x'_1 = x_1 - v t_1$,                                                                                               (3b)

$x'_2 - x'_1 = x_2 - x_1$.                                                                                         (3c)

The results say that the distances calculated in both frames are equal. The algebra was trivial. Einstein, why did you have trouble with this type of analysis? Answer: "The reason that distances come out equal in the two different frames of reference is that the measurements of positions were made simultaneously in both frames of reference." That is only common sense, right? As Einstein found, simultaneity is not universal; it depends on the frame of reference. Einstein left everything not fixed by the two postulates of Special Relativity on the table, to be determined later in whatever way keeps the speed of light invariant.
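The role that the common time plays in this algebra can be seen in a small Python sketch. The function name `galilean` and all numerical values are illustrative assumptions.

```python
def galilean(x, t, v):
    """Galilean transformation (G.1) and (G.4): returns (x', t')."""
    return x - v * t, t


v = 30.0            # relative speed of the two frames in m/s
t1 = 5.0            # common measurement time in s
x1, x2 = 2.0, 12.0  # two points on the x-axis in m

x1p, _ = galilean(x1, t1, v)
x2p, _ = galilean(x2, t1, v)
print(x2 - x1, x2p - x1p)  # both separations are 10.0

# If the second position is read one second later, the primed
# separation no longer matches the unprimed one.
x2p_late, _ = galilean(x2, t1 + 1.0, v)
print(x2p_late - x1p)      # -20.0
```

The equality of the two separations depends entirely on both positions being read at the same instant, which is exactly the assumption that Special Relativity calls into question.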

Note that the transformations (G.1-G.4) hold to a very good approximation for inertial frames of reference moving past each other at velocities much less than that of light. That means that the relativistic transformations have to reduce to the Galilean ones at low velocities. Further, Hermann Minkowski, Einstein's teacher, recognized that his former pupil's algebraic relations in Special Relativity described a four-dimensional space consisting of the ordinary spatial dimensions plus another along the direction of time, for which he developed the mathematics. The basic idea was that one could define four-vectors whose scalar products are invariant under transformations between inertial frames of reference. The scalar product, however, had a new twist: it involved a negative metric.



The position four-vector for any event now has four components, all of which have the dimensions of distance,

$\underline{\mathbf{R}} = (c t, x, y, z)$.                                                                                                  (3a)

Henceforth, an underlined bold character represents a four-vector.

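The scalar product of this vector with itself is invariant for all inertial coordinate representations,

$\underline{\mathbf{R}} \cdot \underline{\mathbf{R}} = \underline{\mathbf{R}}' \cdot \underline{\mathbf{R}}'$,                                                                                                   (3b)

where

$\underline{\mathbf{R}}' = (c t', x', y', z')$.                                                                                               (3c)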

The scalar products of four-vectors, however, are now defined with a negative metric in the spatial components (in some formulations, time is given the negative metric and space the positive metric),

$\underline{\mathbf{R}} \cdot \underline{\mathbf{R}} = (c t)^2 - x^2 - y^2 - z^2$,                                                                                     (3d)

so that (3b) can be rewritten as

$(c t)^2 - x^2 - y^2 - z^2 = (c t')^2 - x'^2 - y'^2 - z'^2$.                                                                 (3f)

Note that a negative metric in the spatial components seems at first to be a wacky idea. It is possible to define the mathematical machinery that produces such a metric, but doing so brings in more mathematics than is helpful at this point in the discussion of Special Relativity. Here, we simply identify four-vectors by bold, underscored characters and take their scalar products to have a negative metric,

$\underline{\mathbf{R}} \cdot \underline{\mathbf{R}} = (c t)^2 - \mathbf{r} \cdot \mathbf{r}$,                                                                                           (3g)

where $\mathbf{r}$ is the ordinary position vector in three-dimensional space,

$\mathbf{r} = \mathbf{e}_1 x + \mathbf{e}_2 y + \mathbf{e}_3 z$.                                                                                             (3h)
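As a concrete illustration of this scalar product, here is a minimal Python sketch. The function name `minkowski_dot` and the sample components (given in arbitrary length units) are assumptions made for the example.

```python
def minkowski_dot(a, b):
    """Scalar product of four-vectors (ct, x, y, z) with a negative
    metric in the spatial components, as in (3d) and (3g)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


# An event that a light flash from the origin just reaches:
# c t = 5 and |r| = sqrt(3^2 + 4^2) = 5.
R_light = (5.0, 3.0, 4.0, 0.0)

# An event closer to the origin in space than light travels in the same time.
R_inside = (5.0, 3.0, 0.0, 0.0)

print(minkowski_dot(R_light, R_light))    # 25 - 9 - 16 = 0.0
print(minkowski_dot(R_inside, R_inside))  # 25 - 9 = 16.0
```

Unlike the ordinary dot product, this scalar product of a vector with itself can be zero for a nonzero vector, which is what happens for events connected by a light signal.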

What we are looking for in the transformations of Special Relativity are linear ones of the form corresponding to (G.1-G.4):

$R'_0 = A R_0 - B x$,                                                                                                      (L.1a)

$x' = D x - E R_0$,                                                                                                    (L.2a)

$y' = y$,                                                                                                      (L.3)

$z' = z$,                                                                                                        (L.4)

where

$R_0 = c t$,

$R'_0 = c t'$.

The terms A, B, D, E are all functions of the relative velocity between the two inertial frames of reference, and the transformations must reduce to (G.1-G.4) at low velocities. Inserting these expressions into (3f), we have

$R_0^2 - x^2 - y^2 - z^2 = (A R_0 - B x)^2 - (D x - E R_0)^2 - y^2 - z^2$,                                          (4a)

$R_0^2 - x^2 - y^2 - z^2 = A^2 R_0^2 + B^2 x^2 - 2 A B R_0 x - D^2 x^2 - E^2 R_0^2 + 2 D E R_0 x - y^2 - z^2$,    (4b)

$R_0^2 - x^2 = (A^2 - E^2) R_0^2 + (B^2 - D^2) x^2 + 2 (D E - A B) R_0 x$.                              (4c)

To preserve the equality, the parameters involved in the transformation must satisfy the following equations:

$A^2 - E^2 = 1$,                                                                                                           (5a)

$D^2 - B^2 = 1$,                                                                                                           (5b)

$D E = A B$.                                                                                                        (5c)

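To further evaluate the various terms, we use a momentum four-vector for a mass at rest at the origin of the unprimed frame of reference,

$\underline{\mathbf{P}} = (m_0 c, 0, 0, 0)$,                                                                                                    (6a)

which transforms into the following in the primed frame, where the mass is seen to move with velocity $-v$ along the x-axis:

$\underline{\mathbf{P}}' = (m c, -m v, 0, 0)$,                                                                                                  (6b)

or, using (2),

$\underline{\mathbf{P}}' = (\gamma m_0 c, -\gamma \beta m_0 c, 0, 0)$,                                                                                    (6c)

where

$\beta = v/c$,                                                                                                                                               (6d)

$\gamma = 1/\sqrt{1 - \beta^2}$.                                                                                                                                  (6e)

Applying (L.1a) and (L.2a) to the components of $\underline{\mathbf{P}}$ gives

$\gamma m_0 c = A m_0 c - B \cdot 0$,                                                                                                 (6f)

$-\gamma \beta m_0 c = D \cdot 0 - E m_0 c$.                                                                                           (6g)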

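Solving for the various constants, we have

$A = \gamma$,                                                                                                                          (7a)

$E = \gamma \beta$,                                                                                                                       (7b)

which satisfy (5a), since

$A^2 - E^2 = \gamma^2 (1 - \beta^2)$,

$A^2 - E^2 = 1$.

Solving for the other terms using (5c) and (5b),

$D \gamma \beta = \gamma B$,

$D^2 - B^2 = 1$,

$D^2 (1 - \beta^2) = 1$,

$D = \gamma$,                                                                                                                     (7c)

$B = \gamma \beta$.                                                                                                                  (7d)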
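As a numerical sanity check (not a proof), the Python sketch below confirms that the values (7a)-(7d) satisfy the three conditions (5a)-(5c) for a few sample speeds. It assumes γ = 1/√(1 − β²); the function name `boost_coefficients` and the sample values of β are illustrative.

```python
import math


def boost_coefficients(beta):
    """Return (A, B, D, E) from (7a)-(7d) for beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g, g * beta, g, g * beta


for beta in (0.001, 0.5, 0.9):
    A, B, D, E = boost_coefficients(beta)
    cond_5a = A**2 - E**2    # should equal 1
    cond_5b = D**2 - B**2    # should equal 1
    cond_5c = D * E - A * B  # should equal 0
    print(f"beta = {beta}: {cond_5a:.12f} {cond_5b:.12f} {cond_5c:.1e}")
```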

Substituting these values into the Lorentz transformation of four-momentum vectors gives

$P'_0 = \gamma P_0 - \gamma \beta P_x$,                                                                                                 (7e)

$P'_x = \gamma P_x - \gamma \beta P_0$.                                                                                                (7f)
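The following Python sketch applies (7e) and (7f) to the momentum of a mass at rest in the unprimed frame. The function name `boost_momentum`, the units (rest mass times c set to 1), and the value β = 0.6 are assumptions chosen for illustration.

```python
import math


def boost_momentum(p0, px, beta):
    """Apply (7e) and (7f) to the components (P0, Px) of a four-momentum."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * p0 - g * beta * px, g * px - g * beta * p0


m0c = 1.0   # rest mass times c, in arbitrary units
beta = 0.6  # relative speed as a fraction of c, for which gamma = 1.25

p0_prime, px_prime = boost_momentum(m0c, 0.0, beta)
print(p0_prime, px_prime)          # approximately 1.25 and -0.75
print(p0_prime**2 - px_prime**2)   # approximately 1.0, i.e. (m0 c)^2
```

The x-component comes out negative because, seen from the primed frame, the mass moves in the negative x direction. The scalar product of the four-momentum with itself is the same in both frames, as required of a four-vector.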


Using the above results, we can rewrite the Lorentz transformations in the following form:

$c t' = \gamma (c t - v x/c)$,                                                                                              (L.1)

$x' = \gamma (x - v t)$,                                                                                                    (L.2)

$y' = y$,                                                                                                    (L.3)

$z' = z$.                                                                                                      (L.4)
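A short Python sketch of (L.1) and (L.2) makes two properties easy to check numerically: the quantity (c t)² − x² is unchanged by the transformation, and at everyday speeds the result is practically Galilean. The function names and the sample event are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz(t, x, v):
    """Lorentz transformation (L.1), (L.2) for motion along x: returns (t', x')."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return g * (t - v * x / C**2), g * (x - v * t)


def interval(t, x):
    """(c t)^2 - x^2; the y and z terms are unchanged by (L.3), (L.4)."""
    return (C * t) ** 2 - x**2


t, x = 2.0, 1.0e8  # sample event: 2 s after and 10^8 m from the origin event

# High relative speed: the coordinates change, the interval does not.
t_fast, x_fast = lorentz(t, x, 0.8 * C)
print(interval(t, x), interval(t_fast, x_fast))

# Low relative speed (30 m/s): differences from (G.1) and (G.4) are tiny.
t_slow, x_slow = lorentz(t, x, 30.0)
print(t_slow - t, x_slow - (x - 30.0 * t))
```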

Note that the inverse transformations are the following ones:

$c t = \gamma (c t' + v x'/c)$,                                                                                            (LI.1)

$x = \gamma (x' + v t')$,                                                                                                   (LI.2)

$y = y'$,                                                                                                    (LI.3)

$z = z'$.                                                                                                      (LI.4)

Exercise: Show that the inverse Lorentz transformations (LI) are consistent with the Lorentz transformations. Hint: substitute the primed values from one transform into the inverse transform and check that the result is an identity.
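Before doing the algebra, a numerical spot check can build confidence. The Python sketch below transforms a sample event with (L.1), (L.2) and back with (LI.1), (LI.2); it is not a substitute for the exercise, and the function names and sample values are assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    """Lorentz factor 1/sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def lorentz(t, x, v):
    """Forward transformation (L.1), (L.2): returns (t', x')."""
    g = gamma(v)
    return g * (t - v * x / C**2), g * (x - v * t)


def lorentz_inverse(t_prime, x_prime, v):
    """Inverse transformation (LI.1), (LI.2): returns (t, x)."""
    g = gamma(v)
    return g * (t_prime + v * x_prime / C**2), g * (x_prime + v * t_prime)


t, x, v = 3.0, 4.0e8, 0.6 * C
t_prime, x_prime = lorentz(t, x, v)
t_back, x_back = lorentz_inverse(t_prime, x_prime, v)

# Both differences should vanish up to floating-point rounding.
print(t_back - t, x_back - x)
```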

Wrap up—Implications of Special Relativity

We live in a strange universe; our low speed perceptions of space and time are only approximations of the true behavior of objects in space and time. In translating results of measurements by other observers into our own frame of reference, we must not only account for our separation by relative motion, but also we have to correct the size of our measurements by functions involving the ratio of our relative velocity to the speed of light.

An important effect on our perception of time is that simultaneity is not a universal property in physics. To show this, use (L.1) to find the time difference $t'_2 - t'_1$ in the moving frame between two events that are simultaneous in our 'rest frame',

$t_2 = t_1$,

which results in the following:

$t'_2 - t'_1 = -\gamma (x_2 - x_1) v/c^2$.                                                                               (8)

Note that the departures from simultaneity in the universe become noticeable at high relative velocities or at great separations.

The conclusion from Special Relativity is that there is no universal NOW that is the same for all observers. The now of one inertial observer becomes a smear of events for another inertial observer.
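Equation (8) can be evaluated for a couple of cases to see when the loss of simultaneity matters. In this Python sketch the function name `simultaneity_offset` and the two scenarios are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def simultaneity_offset(dx, v):
    """t'_2 - t'_1 from (8) for two events that are simultaneous in the
    unprimed frame and separated by dx = x_2 - x_1 (in metres)."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return -g * dx * v / C**2


# Everyday case: events 1 km apart, frames moving at 30 m/s.
everyday = simultaneity_offset(1.0e3, 30.0)

# Extreme case: events one light-second apart, frames moving at 0.6 c.
extreme = simultaneity_offset(C, 0.6 * C)

print(f"everyday: {everyday:.2e} s")  # of order 1e-13 s
print(f"extreme:  {extreme:.2f} s")   # -0.75 s
```

In the everyday case the offset is far too small to notice, while in the extreme case the two "simultaneous" events are three quarters of a second apart for the moving observer.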

Time dilation

First, rewrite (L.1), after dividing by $c$, as

$t' = \gamma (t - v x/c^2)$.                                                                                           (9a)

Both times are measured at their respective origins of coordinates.

The moving clock, at rest in the primed coordinate system, sits at that system's origin, which in our stationary system is located at

$x = v t$,

which is the location of the primed origin as seen by us.

Substituting this value of $x$ in (9a) results in the following equations:

$t' = \gamma t (1 - v^2/c^2)$,

$t' = t \sqrt{1 - v^2/c^2}$,

$t' = t/\gamma$,

$t = \gamma t'$.                                                                      (9b)

We see the moving clock run more slowly than the primed observer, who is at rest with respect to it, reports.

Relative to a clock at rest in our frame, the time interval of a moving clock (at rest in the primed frame) is stretched, so that it appears to run slow, especially when it moves at a very high speed.
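A few sample values of (9b) show how quickly the effect grows with speed. The function name `time_in_our_frame` and the chosen values of v/c are illustrative assumptions.

```python
import math


def time_in_our_frame(moving_clock_time, beta):
    """t = gamma * t' from (9b): the time we measure while the moving
    clock advances by moving_clock_time; beta = v/c."""
    return moving_clock_time / math.sqrt(1.0 - beta**2)


for beta in (0.1, 0.6, 0.99):
    t = time_in_our_frame(1.0, beta)
    print(f"v = {beta} c: 1 s on the moving clock takes {t:.3f} s for us")
```

At v = 0.6 c the factor is 1.25, so each second ticked off by the moving clock takes a second and a quarter in our frame.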


Length contraction

In the moving frame (the primed coordinates) there is a rod that the moving observer measures to have a length $L_0 = x'_2 - x'_1$, and which we measure as $L = x_2 - x_1$ as it whizzes by. We must read off both ends of the rod at the same time in our frame, $t_2 = t_1$, so that (L.2) gives

$x'_2 - x'_1 = \gamma (x_2 - x_1)$,

$L_0 = \gamma L$,

$L = L_0 \sqrt{1 - v^2/c^2}$.

The conclusion is that a rod moving with respect to the observer is shortened along the direction of travel.
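The same kind of numerical illustration works for the contraction formula. The function name `contracted_length` and the sample speeds are assumptions made for this sketch.

```python
import math


def contracted_length(rest_length, beta):
    """L = L0 * sqrt(1 - v^2/c^2): the length we measure for a rod of
    rest length L0 moving past us at speed v = beta * c."""
    return rest_length * math.sqrt(1.0 - beta**2)


for beta in (0.1, 0.6, 0.99):
    length = contracted_length(1.0, beta)
    print(f"v = {beta} c: a 1 m rod measures {length:.3f} m")
```

At v = 0.6 c a one-metre rod measures about 0.8 m along its direction of travel.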

In the next blog post, Special Relativity II, we shall develop the standard notation associated with Minkowski space.



