# Notes on string theory #2: The relativistic point particle

1. Introduction

In Chapter 1 of Polchinski’s textbook, we start with a discussion on the relativistic point particle (pp. 9-11).

String theory proposes that elementary particles are not pointlike, but rather 1-dimensional extended objects (i.e., strings). In fact, string theory (both the bosonic string in Volume 1 of Polchinski and the superstring that comprises much of Volume 2) can be seen as a special generalisation of point particle theory. But the deeper and more modern view is not one that necessarily begins with point particles and then strings; instead, the story begins with branes. A number of features of string theory are shared by the point particle (as we’ll see in a later note, the point particle can be obtained in the limit where the string collapses to a point), and the bigger picture is that both of these objects can be considered as special cases of a p-brane.

We refer to p-branes as p-dimensional dynamical objects that have mass and can have other familiar attributes such as charge. As a p-brane moves through spacetime, it sweeps out a (p+1)-dimensional volume called its worldvolume. In this notation, a 0-brane corresponds to the case where p = 0. It simply describes a point particle that, as we’ll discuss in this note, traces out a worldline as it propagates through spacetime. A string (whether fundamental or solitonic) corresponds to the p = 1 case, and this turns out to be a very special case of p-branes (for many reasons we’ll learn in following notes). Without getting too bogged down in technical details that extend well beyond the current level of discussion, it is also possible to consider higher-dimensional branes. An important case is p = 2, which gives 2-dimensional branes called membranes. In fact, the word ‘brane’ derives from ‘membrane’. As a physical object, a p-brane is a generalisation of a membrane to an arbitrary number of spatial dimensions. For ${p \geq 2}$, p-branes appear in string theory as solitons of the corresponding low energy effective actions of various string theories (in addition to 0-branes and 1-branes).

In Type IIA and Type IIB string theories, which again are a subject of Volume 2, we see that there is an entire family of p-brane solutions. From the viewpoint of perturbative string theory, which is the primary focus of Volume 1, solitons as p-branes are strictly non-perturbative objects. (There are also other classes of branes, such as Dp-branes, that we’ll come across soon when studying the open string. The more complete picture of D/M-brane physics, including brane dynamics, is anticipated to be captured by M-theory. This is a higher dimensional theory that governs branes and, with good reason, is suspected to represent the non-perturbative completion of string theory.)

In some sense, one can think of there being two equivalent ways to approach the idea of p-branes: a top-down higher dimensional view, or from the bottom-up as physical objects that generalise the notion of a point particle to higher dimensions. But given an introductory view of p-branes, perhaps it becomes slightly more intuitive why in approaching the concept of a string in string theory we may start (as Polchinski does) with a review of point particle theory. Indeed, it may at first seem odd to model the fundamental constituents of matter as strings. It could seem completely arbitrary, and it is therefore natural to ask, why not something else? But what is often missed, especially in popular and non-technical physics literature, is the natural generalising logic that leads us to study strings in particular. These are remarkable objects with remarkable properties, and what Polchinski does so well in Volume 1 is allow this generalising logic to come out naturally in the study of the simplest string theory: bosonic string theory.

In this note, we will construct the relativistic point particle action as given on p. 10 (eqn. 1.2.2) and then work through the subsequent discussion on pages 10-11. The quantisation of the point particle is mentioned several pages later in the textbook, so we’ll address that topic then. In what follows, I originally also wanted to include notes on the superparticle and its superspace formulation (i.e., the inclusion of fermions to the point particle theory of bosons), as well as introduce other advanced topics; but I reasoned it is best to try to keep as close to the textbook as possible. The only exception to this rule is that, at the end of this note, we’ll finish by quickly looking at the p-brane action.

2. Relativistic point particle

Explanation of the action for a relativistic point particle as given in Polchinski (eqn. 1.2.2) is best achieved through its first-principle construction. So let us consider the basics of constructing the theory for a relativistic free point particle.

2.1. Minkowski space

As one may recall from studying Einstein’s theory of relativity, spacetime may be modelled by D-dimensional Minkowski space ${\mathbb{M}^D}$. In the abstract, the basic idea is to consider two (distinct) sets E and ${\vec{E}}$, where E is a set of points (with no given structure) and ${\vec{E}}$ is a vector space (of free vectors) acting on the set E. We view the elements of ${\vec{E}}$ as forces acting on points in E, which we in turn think of as physical particles. Applying a force (free vector) ${X \in \vec{E}}$ to a point ${P \in E}$ results in a translation. In other words, the action of a force X is to move every point P to the point ${P + X \in E}$ by translation that corresponds to X viewed as a vector.

In physics, the set E is viewed as the D-dimensional affine space ${\mathbb{M}^D}$, and then ${\vec{E}}$ is the associated D-dimensional vector space ${\mathbb{R}^{1,D-1}}$ defined over the field of real numbers. The choice to model spacetime as an affine space is quite natural, given that an affine space has no preferred or distinguished origin and, of course, the spacetime of special relativity possesses no preferred origin.

As the vectors ${X \in \mathbb{R}^{1,D-1}}$ do not naturally correspond to points ${P \in \mathbb{M}^D}$, but rather to displacements relating a point P to another point Q, we write ${X = \vec{PQ}}$. The points can be put in one-to-one correspondence with position vectors such that ${\vec{X}_P = \vec{OP}}$, with displacements then defined by the difference ${\vec{PQ} = \vec{OQ} - \vec{OP}}$. The associated vector space possesses a zero vector ${\vec{0} \in \mathbb{R}^{1,D-1}}$, which represents the neutral element of vector addition. We can also use the vector space ${\mathbb{R}^{1,D-1}}$ to introduce linear coordinates on ${\mathbb{M}^{D}}$ by making an arbitrary choice of origin as the point ${O \in \mathbb{M}^D}$.

The elements or points ${P,Q,\ldots \in \mathbb{M}^D}$ are events, and they combine a moment of time with a specified position. With the arbitrary choice of origin made, we can refer to these points in Minkowski space in terms of their position vectors, such that the components ${X^{\mu} = (X^0, X^i) = (t, \vec{X})}$, with ${\mu = 0,\ldots, D-1}$ and ${i = 1,\ldots,D-1}$, of vectors ${X \in \mathbb{R}^{1,D-1}}$ correspond to linear coordinates on ${\mathbb{M}^D}$. The coordinate ${X^{0}}$ is related to the time t, which is measured by an inertial or freely falling observer, by ${X^0 =ct}$, with c the fundamental velocity (we set ${c = 1}$ unless stated otherwise). The ${X^i}$ coordinates, which are combined into a (D-1)-component vector, parameterise space (from the perspective of the inertial observer).

It is notable that a vector ${X}$ has contravariant coordinates ${X^{\mu}}$ and covariant coordinates ${X_{\mu}}$, which are related by raising and lowering indices such that ${X_{\mu} = \eta_{\mu \nu}X^{\nu}}$ and ${X^{\mu} = \eta^{\mu \nu}X_{\nu}}$.

We still need to equip the vector space with a Lorentzian scalar product. In the spacetime of special relativity, the vector space ${\mathbb{R}^{1,D-1}}$ is furnished with the scalar product (relativistic distance between events)

$\displaystyle X \cdot X = \eta_{\mu \nu} X^{\mu} X^{\nu} = X^{\mu}X_{\mu} = -t^2 + \vec{X}^2 \begin{cases} <0 \ \text{for timelike distance} \\ =0 \ \text{for lightlike distance} \\ >0 \ \text{for spacelike distance} \end{cases} \ \ (1)$

with matrix

$\displaystyle \eta = (\eta_{\mu \nu}) = \begin{pmatrix} - 1 & 0 \\ 0 & 1_{D-1} \end{pmatrix}, \ \ (2)$

where we have chosen the mostly plus convention. To make sense of (1), since the Minkowski metric (2) defines an indefinite scalar product, the distance-squared between events can be positive, zero or negative. This carries information about the causal structure of spacetime. If ${X = \vec{PQ}}$ is the displacement between two events, then these events are called time-like, light-like or space-like relative to each other, depending on whether X is time-like, light-like or space-like. The zeroth component of X then carries information about the time ordering of P and Q relative to a given Lorentz frame: since ${X = \vec{OQ} - \vec{OP}}$, P is earlier than Q (${X^0 > 0}$), or simultaneous with Q (${X^0 = 0}$), or later than Q (${X^0 < 0}$).
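The classification in (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `minkowski_norm_squared` and `classify_interval`, and the tolerance `tol`, are my own choices and not anything from Polchinski, and we assume units with c = 1 and the mostly plus convention of (2).

```python
def minkowski_norm_squared(x):
    """Return eta_{mu nu} x^mu x^nu for x = (x^0, x^1, ..., x^{D-1}), with c = 1."""
    t, *space = x
    return -t * t + sum(xi * xi for xi in space)


def classify_interval(x, tol=1e-12):
    """Label a displacement as timelike, lightlike or spacelike, following (1)."""
    s2 = minkowski_norm_squared(x)
    if abs(s2) <= tol:
        return "lightlike"
    return "timelike" if s2 < 0 else "spacelike"


# Three sample displacements between pairs of events in D = 4.
for x in [(2.0, 1.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0), (1.0, 2.0, 0.0, 0.0)]:
    print(x, minkowski_norm_squared(x), classify_interval(x))
```

The tolerance is there only because floating point arithmetic rarely returns an exact zero for a lightlike displacement.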

2.2. Lorentz invariance and the Poincaré group

Let’s talk more about Lorentz invariance and the Poincaré group. As inertial observers are required to use linear coordinates which are orthonormal with respect to the scalar product (1), these orthonormal coordinates are distinguished by the above standard form of the metric. It is of course possible to use other curvilinear coordinate systems, such as spherical or cylindrical coordinates. Given the standard form of the metric (2), the most general class of transformations which preserve its form make up the Poincaré group, which is the group of Minkowski spacetime isometries.

The Poincaré group is a Lie group of dimension ${D(D+1)/2}$. In the familiar case ${D = 4}$ it is 10-dimensional, consisting of 4 translations along with the Lorentz group of 3 rotations and 3 boosts. As a general review, let’s start with the Lorentz group. This is the set of linear transformations of spacetime that leave the Lorentz interval unchanged.

From the definitions in the previous section, the line element takes the form

$\displaystyle ds^2 = \eta_{\mu \nu}dX^{\mu}dX^{\nu} = - dt^2 + d\vec{X}^2. \ \ (3)$

For spacetime coordinates defined in the previous section, the Lorentz group is then defined to be the group of transformations ${X^{\mu} \rightarrow X^{\prime \mu}}$ leaving the relativistic interval invariant. Assuming linearity (we will not prove linearity here, with many proofs easily accessible), define a Lorentz transformation as any real linear transformation ${\Lambda}$ such that

$\displaystyle X^{\mu} \rightarrow X^{\prime \mu} = \Lambda^{\mu}_{\nu}X^{\nu} \ \ (4)$

with

$\displaystyle \eta_{\mu \nu} dX^{\prime \mu} dX^{\prime \nu} = \eta_{\mu \nu} dX^{ \mu} dX^{\nu}, \ \ (5)$

ensuring from (1) that

$\displaystyle X^{\prime 2} = X^{2}, \ \ (6)$

which, for arbitrary X, requires

$\displaystyle \eta_{\mu \nu} = \eta_{\alpha \beta} \Lambda^{\alpha}_{\mu} \Lambda^{\beta}_{\nu}. \ \ (7)$

Note that ${\Lambda = (\Lambda^{\mu}_{\nu})}$ is an invertible ${D \times D}$ matrix. In matrix notation (7) can be expressed as

$\displaystyle \Lambda^T \eta \Lambda = \eta. \ \ (8)$

Matrices satisfying (8) contain rotations together with Lorentz boosts, which relate inertial frames travelling at a constant velocity relative to each other. The Lorentz transformations form a Lie group of dimension ${D(D-1)/2}$ (six-dimensional for ${D = 4}$), which is the Lorentz group O(1,D-1).

For elements ${\Lambda \in O(1, D-1)}$ taking the determinant of (8) gives

$\displaystyle (\det \Lambda)^2 = 1 \implies \det \Lambda = \pm 1. \ \ (9)$

By considering the ${\Lambda^0_0}$ component we also find

$\displaystyle (\Lambda^0_0)^2 = 1 + \Sigma_i (\Lambda^0_i)^2 \geq 1 \Rightarrow \Lambda^0_0 \geq 1 \ \text{or} \ \Lambda^0_0 \leq -1. \ \ (10)$

So, the Lorentz group has four components according to the signs of ${\det \Lambda}$ and ${\Lambda^0_0}$. The matrices with ${\det \Lambda = 1}$ form a subgroup SO(1,D-1) with two connected components, distinguished by the two cases on the right-hand side of (10). The component containing the unit matrix ${1 \in O(1,D-1)}$ is connected and is denoted ${SO_0(1,D-1)}$.
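As a quick numerical sanity check of (8), (9) and (10), here is a minimal sketch in plain Python. The helper names (`eta`, `matmul`, `transpose`, `det`, `boost_x`, `is_lorentz`) are hypothetical and chosen only for this illustration; we assume c = 1, D = 4 and a boost along ${X^1}$.

```python
import math


def eta(D):
    """Minkowski metric diag(-1, 1, ..., 1) as a nested list."""
    return [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(D)]
            for i in range(D)]


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def det(A):
    """Determinant by Laplace expansion along the first row (fine for small D)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))


def boost_x(v, D=4):
    """Lorentz boost with velocity v (in units c = 1) along the X^1 direction."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    L = [[1.0 if i == j else 0.0 for j in range(D)] for i in range(D)]
    L[0][0] = L[1][1] = g
    L[0][1] = L[1][0] = -g * v
    return L


def is_lorentz(L, tol=1e-12):
    """Check the defining condition Lambda^T eta Lambda = eta of (8)."""
    D = len(L)
    lhs = matmul(matmul(transpose(L), eta(D)), L)
    return all(abs(lhs[i][j] - eta(D)[i][j]) < tol
               for i in range(D) for j in range(D))


boost = boost_x(0.6)
time_reversal = eta(4)  # diag(-1, 1, 1, 1) also acts as a time reversal matrix
for name, L in [("boost", boost), ("time reversal", time_reversal)]:
    print(name, is_lorentz(L), det(L), L[0][0])
```

The boost has ${\det \Lambda = 1}$ and ${\Lambda^0_0 \geq 1}$, so it sits in the component connected to the identity, while time reversal satisfies (8) with ${\det \Lambda = -1}$ and ${\Lambda^0_0 \leq -1}$.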

We may also briefly consider translations of the form

$\displaystyle X^{\mu} \rightarrow X^{\prime \mu} = X^{\mu} + a^{\mu}, \ \ (11)$

where ${a = (a^{\mu}) \in \mathbb{R}^{1, D-1}}$. Translations form a group that can be parametrised by the components of the translation vector ${a^{\mu}}$.

As mentioned, the Poincaré group is then the complete spacetime symmetry group that combines translations with Lorentz transformations. For a Lorentz transformation ${\Lambda}$ and a translation ${a}$ the combined transformation ${(\Lambda, a)}$ gives

$\displaystyle X^{\mu} \rightarrow X^{\prime \mu} = \Lambda^{\mu}_{\nu} X^{\nu} + a^{\mu}. \ \ (12)$

These combined transformations form a group since

$\displaystyle (\Lambda_2, a_2)(\Lambda_1, a_1) = (\Lambda_2 \Lambda_1, \Lambda_2 a_1 + a_2), \ (\Lambda, a)^{-1} = (\Lambda^{-1}, -\Lambda^{-1}a). \ \ (13)$

Since Lorentz transformations and translations do not commute, the Poincaré group is not a direct product. More precisely, the Poincaré group is the semi-direct product of the Lorentz group and the translation group, ${IO(1,D-1) = O(1,D-1) \ltimes \mathbb{R}^D}$.
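The composition and inverse rules in (13) can be checked numerically. The following is a small sketch in 1+1 dimensions with c = 1; the function names `boost`, `apply`, `compose` and `inverse` are assumptions made for this example, and the inverse uses the fact that a boost is a 2x2 matrix of unit determinant.

```python
import math


def boost(v):
    """1+1 dimensional Lorentz boost with velocity v (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v], [-g * v, g]]


def apply(L, a, x):
    """Poincare transformation X -> Lambda X + a, as in (12)."""
    return [sum(L[i][j] * x[j] for j in range(2)) + a[i] for i in range(2)]


def compose(p2, p1):
    """(L2, a2)(L1, a1) = (L2 L1, L2 a1 + a2): act with p1 first, then p2."""
    (L2, a2), (L1, a1) = p2, p1
    L = [[sum(L2[i][k] * L1[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    a = [sum(L2[i][k] * a1[k] for k in range(2)) + a2[i] for i in range(2)]
    return L, a


def inverse(p):
    """(L, a)^{-1} = (L^{-1}, -L^{-1} a), valid here because det L = 1."""
    L, a = p
    Linv = [[L[1][1], -L[0][1]], [-L[1][0], L[0][0]]]
    return Linv, [-sum(Linv[i][k] * a[k] for k in range(2)) for i in range(2)]


p1 = (boost(0.3), [1.0, 2.0])
p2 = (boost(-0.5), [0.5, -1.0])
x = [0.7, 0.2]

print(apply(*compose(p2, p1), x), apply(*p2, apply(*p1, x)))  # should agree
print(apply(*inverse(p1), apply(*p1, x)), x)                  # should agree
print(compose(p2, p1)[1], compose(p1, p2)[1])                 # differ in general
```

The last line shows the translation part depends on the order of composition, which is the non-commutativity behind the semi-direct product structure.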

2.3. Action principle

We now look to construct an action for the relativistic point particle (initially following the discussion in [Zwie09] as motivation).

The classical motion of a point particle as it propagates through spacetime is described by a geodesic on the spacetime. As Polchinski first notes, we can of course describe the motion of this particle by giving its position in terms of functions of time ${X(t) = (X^{\mu}(t)) = (t, \vec{X}(t))}$. For now, we may also consider some arbitrary starting point, which we take to be the origin, and endpoint ${(ct_f, \vec{X}_{f})}$ for the particle’s path, or what is also called its worldline. There are many possible paths between these points, and the principle of least action selects the classical one as the path that extremises the action.

It should be true that for any worldline all Lorentz observers compute the same value for the action. Let ${\mathcal{P}}$ denote one such worldline. Then we may use the proper time as a Lorentz invariant quantity to describe this path. Although different Lorentz observers will record different values for the time interval between two events along ${\mathcal{P}}$, we can instead imagine that a clock is attached to the particle. The proper time is the time elapsed between the two events on that clock, and all Lorentz observers must agree on this amount of elapsed time. This is the basic idea, and it means we want an action for the worldline ${\mathcal{P}}$ that is proportional to the proper time.

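To achieve this, we first recall the invariant interval for the motion of a particle, now written for ${D = 4}$ with c reinstated and, in contrast to (3), with the sign chosen so that ${ds^2}$ is positive for timelike intervals,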

$\displaystyle - ds^2 = -c^2 dt^2 + (dX^1)^2 + (dX^2)^2 + (dX^3)^2, \ \ (14)$

in which, from special relativity, the proper time

$\displaystyle -ds^2 = -c^2 dt_p^2 \rightarrow ds = c \, dt_p \ \ (15)$

tells us that for timelike intervals ds/c is the proper time interval. It follows that the integral of (ds/c) over the worldline ${\mathcal{P}}$ gives the proper time elapsed on ${\mathcal{P}}$. But, if the proper time has units of time, we still need units of energy (mass times velocity-squared) to ensure we have the full units of action (recall that for any dynamical system the action has units of energy times time, with the Lagrangian possessing units of energy). We also need to ensure that we preserve Lorentz invariance in the process of building our theory. One obvious choice is m for the rest mass of the particle, with c for the fundamental velocity in relativity. Then we have an overall multiplicative factor ${mc^2}$ that represents the rest energy of the particle. As a result, the integrand of the action takes the tentative form ${mc^2 (ds/c) = mc \, ds}$. This should make some sense in that ${ds}$ is just a Lorentz scalar, m and c are Lorentz invariant constants, and we have the units we expect. We also include an overall minus sign which, as we will see in (20), is needed for the kinetic term to have the correct sign in the non-relativistic limit. The action is then

$\displaystyle S = -mc \int_{\mathcal{P}} ds. \ \ (16)$

A good strategy now is to express the action as an integral of a Lagrangian over time, between initial and final times ${t_i}$ and ${t_f}$ that we’ll take to define our interval, because it will enable us to establish a more satisfactory expression that includes the values of time at the initial and final points of our particle’s path. If we fix a frame, which is to say if we choose the frame of a particular Lorentz observer, we may express the action (16) as the integral of the Lagrangian over time. To achieve this end, we must first return to our interval (14) and relate ${ds}$ to ${dt}$,

$\displaystyle -ds^2 = -c^2 dt^2 + (dX^1)^2 + (dX^2)^2 + (dX^3)^2$

$\displaystyle ds^2 = c^2 dt^2 - (dX^1)^2 - (dX^2)^2 - (dX^3)^2$

$\displaystyle ds^2 = [c^2 - \frac{(dX^1)^2}{dt^2} - \frac{(dX^2)^2}{dt^2} - \frac{(dX^3)^2}{dt^2}] dt^2$

$\displaystyle \implies ds^2 = (c^2 - v^2) dt^2$

$\displaystyle \therefore ds = \sqrt{c^2 - v^2} dt. \ \ (17)$

With this relation between ${ds}$ and ${dt}$, in the fixed frame the point particle action becomes

$\displaystyle S = -mc^{2} \int_{t_{i}}^{t_{f}} dt \sqrt{1 - \frac{v^{2}}{c^{2}}}, \ \ (18)$

with the Lagrangian taking the form

$\displaystyle L = -mc^{2} \sqrt{1 - \frac{v^{2}}{c^{2}}}. \ \ (19)$

This Lagrangian gives us a hint that it is correct, as it ceases to be real when the velocity exceeds the speed of light, ${v > c}$. This is consistent with the definition of the proper time from special relativity (i.e., the velocity should not exceed the speed of light for the proper time to be a valid concept). In the small velocity limit ${v \ll c}$, on the other hand, when we expand the square root (just use the binomial theorem to approximate) we see that it gives

$\displaystyle L \simeq -mc^2 (1 - \frac{1}{2}\frac{v^2}{c^2}) = - mc^2 + \frac{1}{2}m v^2. \ \ (20)$

This returns the familiar kinetic term of the free non-relativistic particle, with (${-mc^2}$) just a constant that does not affect the equations of motion.
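To see how good the approximation (20) is, one can compare it numerically with the full Lagrangian (19). The sketch below is only illustrative: the function names are my own, and we work in units with c = 1 so that v is a fraction of the speed of light.

```python
import math


def lagrangian(m, v, c=1.0):
    """Relativistic Lagrangian (19); math.sqrt raises ValueError when v > c."""
    return -m * c**2 * math.sqrt(1.0 - (v / c) ** 2)


def lagrangian_nonrel(m, v, c=1.0):
    """Leading terms (20): minus the rest energy plus the Newtonian kinetic term."""
    return -m * c**2 + 0.5 * m * v**2


m = 1.0
for v in (0.001, 0.01, 0.1):
    exact = lagrangian(m, v)
    approx = lagrangian_nonrel(m, v)
    # The first term neglected by the binomial expansion is m v^4 / (8 c^2).
    print(v, exact - approx, m * v**4 / 8.0)
```

The difference between the two expressions tracks the next term in the binomial series, which is why the non-relativistic Lagrangian works so well when ${v \ll c}$.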

2.4. Canonical momentum and Hamiltonian

We will discuss the canonical momentum of the point particle again in a future note on quantisation; but for the present form of the action it is worth highlighting that we can also see the Lagrangian (19) is correct by computing the momentum ${\vec{p}}$ and the Hamiltonian.

For the canonical momentum, we take the derivative of the Lagrangian with respect to the velocity

$\displaystyle \vec{p} = \frac{\partial L}{\partial \vec{v}} = -mc^{2}(-\frac{\vec{v}}{c^{2}})\frac{1}{\sqrt{1 - \frac{v^{2}}{c^{2}}}} = \frac{m\vec{v}}{\sqrt{1 - \frac{v^{2}}{c^{2}}}}. \ \ (21)$

Now that we have an expression for the relativistic momentum of the particle, let us consider the Hamiltonian. The Hamiltonian may be written schematically as ${H = \vec{p} \cdot \vec{v} - L}$. All we need to do is make the appropriate substitutions,

$\displaystyle H = \frac{m\vec{v}^{2}}{\sqrt{1 - \frac{v^{2}}{c^{2}}}} + mc^{2}\sqrt{1 - \frac{v^{2}}{c^{2}}} = \frac{mc^{2}}{\sqrt{1 - \frac{v^{2}}{c^{2}}}}. \ \ (22)$

The Hamiltonian should make sense: it is the relativistic energy ${E = mc^{2}/\sqrt{1 - v^{2}/c^{2}}}$. Notice, if we instead write the result in terms of the particle’s momentum (rather than velocity) by inverting (21), we find the relativistic energy-momentum relation ${\frac{E^{2}}{c^{2}} - \vec{p} \cdot \vec{p} = m^{2}c^{2}}$. This is a deep hint that we’re on the right track, as it suggests quite clearly that we’ve recovered basic relativistic physics for a point-like object.
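The relations (21) and (22), together with the energy-momentum relation, can be verified numerically. This is a minimal sketch for motion in one spatial dimension with c = 1; the function names and the finite difference step `h` are assumptions of the example.

```python
import math


def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lagrangian(m, v, c=1.0):
    """Lagrangian (19), written as -m c^2 / gamma."""
    return -m * c**2 / gamma(v, c)


def momentum(m, v, c=1.0):
    """Canonical momentum (21)."""
    return gamma(v, c) * m * v


def hamiltonian(m, v, c=1.0):
    """H = p v - L, which should reproduce (22)."""
    return momentum(m, v, c) * v - lagrangian(m, v, c)


m, v, c = 2.0, 0.6, 1.0
h = 1e-6  # step for a central finite difference approximation to dL/dv

p_numeric = (lagrangian(m, v + h, c) - lagrangian(m, v - h, c)) / (2 * h)
p = momentum(m, v, c)
E = hamiltonian(m, v, c)

print(p_numeric, p)                      # numerical dL/dv against (21)
print(E, gamma(v, c) * m * c**2)         # H against m c^2 gamma
print(E**2 / c**2 - p**2, (m * c) ** 2)  # energy-momentum relation
```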

3. Reparameterisation invariance

An important property of the action (16) is that it is invariant under any choice of parameterisation. This makes sense because the invariant length ds between two points on the particle’s worldline does not depend on any parameterisation. We’ve only insisted on integrating the line element, which, if you think about it, is really just a matter of adding up all of the infinitesimal segments along the worldline. But, typically, a particle moving in spacetime is described by a parameterised curve. As Polchinski notes, it is generally best to introduce some parameter and then describe the motion in spacetime by functions of that parameter.

Furthermore, the sign convention we adopt for the invariant length governs whether, for the classical motion, the path is one that extremises the invariant distance ds as a minimum or a maximum. With the worldline parameterised by ${\tau}$, we take the invariant length ds to be given by

$\displaystyle ds^2 = -\eta_{\mu \nu}(X) dX^{\mu} dX^{\nu}, \ \ (23)$

Here the worldline parameter ${\tau}$ is taken to increase between some initial point ${X^{\mu} (\tau_i)}$ and some final point ${X^{\mu}(\tau_f)}$. With this convention, the classical paths are those which maximise the proper time. It also means that the trajectory of the particle worldline is now described by the coordinates ${X^{\mu} = X^{\mu}(\tau)}$. As a result, the space of the theory can now be updated such that ${X^{\mu}(\tau) \in \mathbb{R}^{1, D-1}}$ with ${\mu, \nu = 0,...,D-1}$.

In the use of ${\tau}$ parameterisation, an important idea is that time is in a sense being promoted to a dynamical degree of freedom without it actually being a dynamical degree of freedom. We are in many ways leveraging the power of gauge symmetry, with our choice of parameterisation enabling us to treat space and time coordinates on equal footing. The cost of trading a less symmetric description for a more symmetric one is that we pick up redundancies.

Given our earlier choice of Minkowski space as the background spacetime geometry, recall the metric (shown here for ${D = 4}$)

$\displaystyle \eta_{\mu \nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \ \ (24)$

such that for ${ds^2}$ we now use

$\displaystyle -\eta_{\mu \nu}(X) dX^{\mu} dX^{\nu} = -\eta_{\mu \nu}(X) \frac{dX^{\mu}(\tau)}{d\tau} \frac{dX^{\nu}(\tau)}{d\tau} d\tau^2. \ \ (25)$

Therefore, the action (16) may be updated to take the form

$\displaystyle S_{pp} = -mc \int_{\tau_i}^{\tau_f} d\tau \ \sqrt{-\eta_{\mu \nu} \dot{X}^{\mu} \dot{X}^{\nu}} \ \ (26)$

with ${\dot{X}^{\mu} \equiv dX^{\mu}(\tau) / d\tau}$.

Setting ${c = 1}$, notice (26) is precisely the action (eqn. 1.2.2) in Polchinski. This is the simplest action for a relativistic point particle with manifest Poincaré invariance that does not depend on the choice of parameterisation.

How do we interpret this form of the action? In the exercise to obtain (26) we have essentially played the role of a fixed observer, who has calculated the action using some parameter ${\tau}$. The important question is whether the value of the action depends on this choice of parameter. Polchinski comments that, in fact, it is a completely arbitrary choice of parameterisation. This should make sense because, again, the invariant length ds on the particle worldline ${\mathcal{P}}$ should not depend on how the path is parameterised.

Proposition 1 The action (26) is reparameterisation invariant such that if we replace ${\tau}$ with the parameter ${\tau^{\prime} = f(\tau)}$, where f is monotonic, we obtain the same value for the action.

Proof: Consider the following reparameterisation of the particle’s worldline ${\tau \rightarrow \tau^{\prime} = f(\tau)}$. Then we have

$\displaystyle d\tau \rightarrow d\tau^{\prime} = \frac{\partial f}{\partial \tau}d\tau, \ \ (27)$

implying

$\displaystyle \frac{dX^{\mu}(\tau^{\prime})}{d\tau} = \frac{dX^{\mu}(\tau^{\prime})}{d\tau^{\prime}}\frac{d\tau^{\prime}}{d\tau} = \frac{dX^{\mu}(\tau^{\prime})}{d\tau^{\prime}} \frac{\partial f(\tau)}{\partial \tau}. \ \ (28)$

Plugging this into the action (26), with ${\tau^{\prime}_{i,f} = f(\tau_{i,f})}$ and taking f to be increasing so that ${\partial f / \partial \tau > 0}$, we get

$\displaystyle S^{\prime} = -mc \int_{\tau^{\prime}_i}^{\tau^{\prime}_f} d\tau^{\prime} \ \sqrt{-\frac{dX^{\mu}(\tau^{\prime})}{d\tau^{\prime}} \frac{dX_{\mu}(\tau^{\prime})}{d\tau^{\prime}}}$

$\displaystyle = -mc \int_{\tau_i}^{\tau_f} \frac{\partial f}{\partial \tau} \ d\tau \ \sqrt{-\frac{dX^{\mu}}{d\tau} \frac{dX_{\mu}}{d\tau} (\frac{\partial f}{\partial \tau})^{-2}}$

$\displaystyle = -mc \int_{\tau_i}^{\tau_f} (\frac{\partial f}{\partial \tau})(\frac{\partial f}{\partial \tau})^{-1} \ d\tau \ \sqrt{-\frac{dX^{\mu}}{d\tau} \frac{dX_{\mu}}{d\tau}}$

$\displaystyle = -mc \int_{\tau_i}^{\tau_f} d\tau \ \sqrt{-\frac{dX^{\mu}(\tau)}{d\tau} \frac{dX_{\mu}(\tau)}{d\tau}}. \ \ (29)$

$\Box$

This ends the proof. So we see the value of the action does not depend on the choice of parameter; indeed, the choice is arbitrary.
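Proposition 1 can also be checked numerically. The sketch below discretises the action (26) for a sample timelike worldline in 1+1 dimensions (c = 1) and evaluates it with two different parameterisations of the same curve. The specific worldline, the map ${\tau^{\prime} = f(\tau) = \ln(1 + \tau)}$ and the number of steps `n` are all assumptions made purely for illustration.

```python
import math


def action(worldline, tau_i, tau_f, m=1.0, n=20000):
    """Approximate S = -m * integral dtau sqrt(-Xdot.Xdot) using finite differences."""
    dtau = (tau_f - tau_i) / n
    total = 0.0
    t0, x0 = worldline(tau_i)
    for k in range(1, n + 1):
        t1, x1 = worldline(tau_i + k * dtau)
        tdot, xdot = (t1 - t0) / dtau, (x1 - x0) / dtau
        # -Xdot.Xdot = tdot^2 - xdot^2 in the mostly plus convention.
        total += math.sqrt(tdot * tdot - xdot * xdot) * dtau
        t0, x0 = t1, x1
    return -m * total


def worldline(tau):
    """A timelike path X(tau) = (t, x) with tau = t and |dx/dt| <= 0.5."""
    return tau, 0.5 * math.sin(tau)


def reparameterised(tau_prime):
    """The same path in terms of tau' = ln(1 + tau), i.e. tau = exp(tau') - 1."""
    return worldline(math.exp(tau_prime) - 1.0)


S = action(worldline, 0.0, 1.0)
S_prime = action(reparameterised, 0.0, math.log(2.0))
print(S, S_prime)  # the two values should agree up to discretisation error
```

In the discretised sum the factors of ${d\tau}$ cancel segment by segment, which is the numerical counterpart of the cancellation of ${\partial f / \partial \tau}$ in (29).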

As alluded to earlier in this section, reparameterisation invariance is a gauge symmetry. In some sense, this is not even an honest symmetry, because it means that we’ve introduced a redundancy in our description: not all degrees of freedom ${X^{\mu}}$ are physically meaningful. We’ll discuss this more in the context of the string (an example of such a redundancy appears in the study of the momenta).

4. Equation of motion for ${S_{pp}}$

To obtain (eqn. 1.2.3), Polchinski varies the action (26) and then integrates by parts. For simplicity, let us temporarily maintain ${c = 1}$. Varying (26)

$\displaystyle \delta S_{pp} = -m \int d\tau \delta (\sqrt{-\dot{X}^{\mu}\dot{X}_{\mu}}) \ \ (30)$

$\displaystyle = -m \int d\tau \frac{1}{2}(-\dot{X}^{\nu}\dot{X}_{\nu})^{-1/2} \ \delta(-\dot{X}^{\mu}\dot{X}_{\mu}), \ \ (31)$

then from the variation in the last factor we pick up a factor of 2, which cancels the 1/2, leaving

$\displaystyle = -m \int d\tau (-\dot{X}^{\nu}\dot{X}_{\nu})^{-1/2} (-\dot{X}_{\mu}\delta \dot{X}^{\mu}). \ \ (32)$

Next, we make the substitution ${u^{\mu} = \dot{X}^{\mu}(-\dot{X}^{\nu}\dot{X}_{\nu})^{-1/2}}$ such that

$\displaystyle \delta S_{pp} = -m \int d\tau (-u_{\mu})\delta \dot{X}^{\mu}. \ \ (33)$

And now we integrate by parts, which shifts a derivative onto u using the fact we can commute the variation and the derivative, ${\delta \dot{X}^{\mu} = \delta \frac{d}{d\tau} X^{\mu} = \frac{d}{d\tau} \delta X^{\mu}}$. This produces a total derivative term, which we will drop since the variation ${\delta X^{\mu}}$ vanishes at the endpoints,

$\displaystyle \delta S_{pp} = -m \int d\tau \frac{d}{d\tau} (-u_{\mu}\delta X^{\mu}) - m \int d\tau \dot{u}_{\mu} \delta X^{\mu},$

which gives the correct result

$\displaystyle \delta S_{pp} = -m \int d\tau \dot{u}_{\mu}\delta X^{\mu}. \ \ (34)$

As Polchinski notes, the equation of motion ${\dot{u}^{\mu} = 0}$ describes the free motion of the particle.
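The statement that the free (straight) worldline extremises the action can be illustrated numerically. In the sketch below we deform a particle at rest by ${x(t) = \epsilon \sin(\pi t)}$, which keeps the endpoints fixed, and evaluate the action in static gauge with c = 1. The deformation, the function name `action` and the midpoint rule with `n` steps are all assumptions made for the sake of the example.

```python
import math


def action(eps, m=1.0, n=20000):
    """S = -m * (proper time) for x(t) = eps * sin(pi t), t in [0, 1], with c = 1."""
    dt = 1.0 / n
    proper_time = 0.0
    for k in range(n):
        t = (k + 0.5) * dt  # midpoint rule
        v = eps * math.pi * math.cos(math.pi * t)
        proper_time += math.sqrt(1.0 - v * v) * dt
    return -m * proper_time


S0 = action(0.0)
for eps in (0.1, 0.01, 0.001):
    dS = action(eps) - S0
    # dS is positive and scales like eps^2, so the first order variation vanishes.
    print(eps, dS, dS / eps**2)
```

The change in the action is second order in ${\epsilon}$, as expected at a solution of ${\dot{u}^{\mu} = 0}$, and it is positive: since ${S_{pp}}$ is minus m times the proper time, the straight worldline maximises the proper time.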

With the particle mass m fixing the normalisation, we can also take the non-relativistic limit (exercise 1.1). Returning to (26), and reinstating c for the purpose of example, we write the velocity along the worldline in components as

$\displaystyle \dot{X}^{\mu}(\tau) = ( c \frac{dt}{d\tau}, \frac{d\vec{X}(\tau)}{d\tau} ) = \frac{dt}{d\tau} (c, \vec{v}), \ \ (35)$

where ${\vec{v} = d\vec{X}/dt}$, and we define the quantity ${\gamma = (1 - v^2/c^2)^{-1/2}}$. If ${\tau}$ is the proper time then ${dt/d\tau = \gamma}$, and in the non-relativistic limit where ${v \ll c}$ we have ${dt/d\tau = \gamma = 1 + \mathcal{O}(v^2/c^2)}$, so the proper time and the coordinate time agree to leading order. Using reparameterisation invariance to set ${\tau = t}$ (static gauge), it follows that

$\displaystyle \dot{X}^{\mu}\dot{X}_{\mu} = -c^2 + \mid \vec{v} \mid^2, \ \ (36)$

where ${\vec{v}}$ is a spatial vector with norm ${\mid \vec{v} \mid \equiv v}$. In static gauge, the action therefore takes the form

$\displaystyle S_{pp} = -mc \int dt \sqrt{c^2 -\mid \vec{v} \mid^2}, \ \ (37)$

where we now Taylor expand the square root for ${v \ll c}$ to give

$\displaystyle S_{pp} \approx -mc^2 \int dt \ (1 - \frac{1}{2}\frac{\mid \vec{v} \mid^2}{c^2}). \ \ (38)$

Observe that we now have a time integral of a term with classical kinetic structure minus a constant potential-like term (actually a total time derivative, so it does not affect the equations of motion) given by the relativistic rest energy

$\displaystyle S_{pp} \approx \int dt \ (\frac{1}{2}m\mid \vec{v} \mid^2 - mc^2). \ \ (39)$

5. Deriving ${S_{pp}^{\prime}}$(eqn. 1.2.5)

The main problem with the action (18), and equivalently (26), is that the square root in the integrand is non-linear, which makes the theory awkward to quantise. We will find a similar issue upon constructing the first-principle string action, namely the Nambu-Goto action. Additionally, in our study of the bosonic string, we will be interested firstly in studying massless particles. But notice that for a massless particle the action (26) vanishes identically.

What we want to do is rewrite ${S_{pp}}$ in yet another equivalent form. To do this, we add an auxiliary field so that our new action takes the form

$\displaystyle S_{pp}^{\prime} = \frac{1}{2} \int d \tau (\eta^{-1} \dot{X}^{\mu} \dot{X}_{\mu} - \eta m^2), \ \ (40)$

where we define the tetrad ${\eta (\tau) = (- \gamma_{\tau \tau} (\tau))^{\frac{1}{2}}}$ (not to be confused with the Minkowski metric ${\eta_{\mu \nu}}$). The independent worldline metric ${\gamma_{\tau \tau}(\tau)}$ that we’ve introduced as an additional field is, in a sense, a generalised Lagrange multiplier. To avoid confusion we can denote this additional field ${e(\tau)}$ so that we get the action

$\displaystyle S_{pp}^{\prime} = \frac{1}{2} \int d\tau (e^{-1} \dot{X}^{2} - em^{2}), \ \ (41)$

where we have simplified the notation by setting ${\dot{X}^{2} = \eta_{\mu \nu}\dot{X}^{\mu}\dot{X}^{\nu}}$ and completely eliminated the square root. This is equivalent to what Polchinski writes in (eqn. 1.2.5). Note also that the massless limit ${m = 0}$ is now well defined. The structure of (41) may look familiar, as it reads like a worldline theory coupled to 1-dimensional gravity (worth checking and playing with).

To see that ${S_{pp}^{\prime}}$ is classically equivalent (on-shell) to ${S_{pp}}$, we first consider its variation with respect to ${e(\tau)}$

$\displaystyle \delta S_{pp}^{\prime} = \frac{1}{2}\delta \int d\tau (e^{-1} \dot{X}^{2} - m^2 e)$

$\displaystyle = \frac{1}{2} \int d\tau (\delta (\frac{1}{e})\dot{X}^{2} - m^{2} \delta e)$

$\displaystyle = \frac{1}{2} \int d\tau (- \frac{1}{e^{2}}\dot{X}^{2} - m^{2}) \delta e, \ \ (42)$

which, upon requiring ${\delta S_{pp}^{\prime} = 0}$ for arbitrary ${\delta e}$, results in the following field equation

$\displaystyle e^{2} = -\frac{\dot{X}^{2}}{m^{2}}$

$\displaystyle \implies e = \sqrt{\frac{-\dot{X}^{2}}{m^{2}}} \ \ (43).$

This again aligns with Polchinski’s result (eqn. 1.2.7).

Proposition 2 If we substitute (43) back into (41), we recover the original ${S_{pp}}$ action (26).

Proof:

$\displaystyle S_{pp}^{\prime} = \frac{1}{2} \int d\tau [(-\frac{\dot{X}^2}{m^{2}})^{-1/2} \dot{X}^{2} - m^{2}(-\frac{\dot{X}^{2}}{m^{2}})^{1/2}]$

$\displaystyle = \frac{1}{2} \int d\tau [(-\frac{m^{2}}{\dot{X}^{2}})^{1/2} \dot{X}^{2} - m (- \dot{X}^{2})^{1/2}]. \ \ (44)$

Recalling ${\dot{X}^{2} = \eta_{\mu \nu} \dot{X}^{\mu}\dot{X}^{\nu}}$, substitute for ${\dot{X}^{2}}$ in the square root of the second term

$\displaystyle = \frac{1}{2} \int d\tau [(-\frac{m^{2}}{\dot{X}^{2}})^{1/2} \dot{X}^{2} - m (- \eta_{\mu \nu} \dot{X}^{\mu}\dot{X}^{\nu})^{1/2}]. \ \ (45)$

For the first term we clean up with a bit of algebra. For a timelike worldline ${\dot{X}^{2} < 0}$, so we may write ${\dot{X}^{2} = -(-\dot{X}^{2})}$ with ${-\dot{X}^{2} > 0}$. Then

$\displaystyle (-\frac{m^{2}}{\dot{X}^{2}})^{1/2} \dot{X}^{2} = m (-\dot{X}^{2})^{-1/2} \dot{X}^{2}$

$\displaystyle = -m (-\dot{X}^{2})^{-1/2} (-\dot{X}^{2})$

$\displaystyle = -m (-\dot{X}^{2})^{1/2}. \ \ (46)$

Now, substitute for ${\dot{X}^{2}}$ and we find ${-m (-\eta_{\mu \nu}\dot{X}^{\mu}\dot{X}^{\nu})^{1/2}}$ giving

$\displaystyle S_{pp}^{\prime} = \frac{1}{2} \int d\tau [-m(- \eta_{\mu \nu}\dot{X}^{\mu}\dot{X}^{\nu})^{1/2} - m (- \eta_{\mu \nu} \dot{X}^{\mu}\dot{X}^{\nu})^{1/2}]$

$\displaystyle = -m \int d\tau (- \eta_{\mu \nu}\dot{X}^{\mu}\dot{X}^{\nu})^{1/2} = S_{pp} \ \ (47).$

$\Box$

This ends the proof, demonstrating that ${S_{pp}}$ and ${S_{pp}^{\prime}}$ are classically equivalent.
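The on-shell equivalence can also be checked pointwise on the integrands. The sketch below compares the square root Lagrangian of (26) with the einbein Lagrangian of (41) at the solution (43), for a fixed value of ${\dot{X}^{2} < 0}$ and with c = 1. The function names, the sample numbers and the finite difference step `h` are assumptions made for illustration only.

```python
import math


def lagrangian_sqrt(xdot2, m):
    """Integrand of (26) with c = 1: -m * sqrt(-Xdot^2)."""
    return -m * math.sqrt(-xdot2)


def lagrangian_einbein(xdot2, e, m):
    """Integrand of (41): (Xdot^2 / e - e m^2) / 2."""
    return 0.5 * (xdot2 / e - e * m * m)


def einbein_on_shell(xdot2, m):
    """Solution (43) of the e field equation, valid for m > 0 and Xdot^2 < 0."""
    return math.sqrt(-xdot2) / m


xdot2, m = -0.75, 2.0  # e.g. static gauge with v = 0.5, so Xdot^2 = -(1 - v^2)
e_star = einbein_on_shell(xdot2, m)
h = 1e-6

# Central difference estimate of dL'/de at e = e_star, which should vanish.
slope = (lagrangian_einbein(xdot2, e_star + h, m)
         - lagrangian_einbein(xdot2, e_star - h, m)) / (2 * h)

print(lagrangian_einbein(xdot2, e_star, m), lagrangian_sqrt(xdot2, m))
print(slope)
print(lagrangian_einbein(xdot2, 1.0, 0.0))  # the massless case stays well defined
```

The last line reflects one of the motivations for (41): at m = 0 the einbein form still gives a non-trivial Lagrangian, whereas the square root form vanishes.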

It is also possible to show that, like with ${S_{pp}}$, the action ${S_{pp}^{\prime}}$ is both Poincaré invariant and reparameterisation invariant.

6. Generalising to p-branes

As an aside, and to conclude this note, we can generalise the action for a point particle (0-brane) to an action for a p-brane. A p-brane in a ${D \geq p + 1}$ dimensional background spacetime can be described by the action

$\displaystyle S_{pb}= -T_p \int d\mu_p \ \ (48).$

The term ${T_p}$ is one that will become more familiar moving forward, especially when we begin to discuss the concept of string tension. In the above action it denotes the p-brane tension, which has units of mass/volume. The ${d\mu_p}$ term is the ${(p + 1)}$-dimensional volume measure,

$\displaystyle d\mu_p = \sqrt{- \det G_{ab}} \ d^{p+1} \sigma, \ \ (49)$

where ${G_{ab}}$ is the induced metric, which, in the ${p = 1}$ case, we will understand as the worldsheet metric. The induced metric is given in terms of the background spacetime metric ${h_{\mu \nu}(X)}$ by,

$\displaystyle G_{ab} (X) = \frac{\partial X^{\mu}}{\partial \sigma^{a}} \frac{\partial X^{\nu}}{\partial \sigma^{b}} h_{\mu \nu}(X) \ \ \ a, b \equiv 0, 1, ..., p \ \ (50)$

A few additional comments may follow. As ${\sigma^{0} \equiv \tau}$, spacelike coordinates in this theory run as ${\sigma^{1}, \sigma^{2}, ... \sigma^{p}}$ for the surface traced out by the p-brane. Under ${\tau}$ reparameterisation, the above action may also be shown to be invariant.

7. Summary

To summarise, one may recall how in classical (non-relativistic) theory the evolution of a system is described by its equations of motion. One can generalise many of the concepts of the classical non-relativistic theory of a point particle to the case of the relativistic point particle. Indeed, one will likely be familiar with how in the non-relativistic case the path of the particle may be characterised as a path through space. This path is then parameterised by time. On the other hand, in the case of the relativistic point particle, we have briefly reviewed how the path may instead be characterised by a worldline through spacetime. This worldline is parameterised not by time, but by an arbitrary parameter such as the proper time. And, in relativity, we learn in very succinct terms how freely falling relativistic particles move along geodesics.

It should be understood that the equations of motion for the relativistic point particle are solved by the geodesics of the spacetime. One must also remain cognisant that, as noted in an earlier section, there are many possible worldlines between some beginning point and end point, with the classical path being the one that extremises the action. This useful fact will be explicated more thoroughly later on, where, in the case of the string, we will discuss the requirement to sum over all possible worldsheets. Other lessons related to the point particle will also be extended to the string, and will help guide how we construct the elementary string action.

References

[Moh08] T. Mohaupt, Liverpool lectures on string theory [lecture notes].

[Pol07] J. Polchinski, An introduction to the bosonic string. Cambridge, Cambridge University Press. (2007).

[Wray11] K. Wray, An introduction to string theory [lecture notes].

[Zwie09] B. Zwiebach, A first course in string theory. Cambridge, Cambridge University Press. (2009).