Einstein's perihelion formula and its generalization
Maurizio M. D'Eliseo
Citation: American Journal of Physics 83, 324 (2015); doi: 10.1119/1.4903166
View online: http://dx.doi.org/10.1119/1.4903166
Published by the American Association of Physics Teachers
This article is copyrighted as indicated in the article. Reuse of AAPT content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 131.156.157.31 On: Wed, 25 Mar 2015 16:20:35

Einstein’s perihelion formula and its generalization
Maurizio M. D’Eliseo^{a)}
Osservatorio S. Elmo, Via A. Caccavello 22, 80129 Napoli, Italia
(Received 7 July 2014; accepted 19 November 2014)
Einstein’s perihelion advance formula can be given a geometric interpretation in terms of the curvature of the ellipse. The formula can be obtained by splitting the constant term of an auxiliary polar equation for an elliptical orbit into two parts that, when combined, lead to the expression of this relativistic effect. Using this idea, we develop a general method for dealing with orbital precession in the presence of central perturbing forces, and apply the method to the determination of the total (relativistic plus Newtonian) secular perihelion advance of the planet Mercury. © 2015 American Association of Physics Teachers.
[http://dx.doi.org/10.1119/1.4903166]

I. INTRODUCTION
A classic calculation in the scientific literature is the derivation of the formula by which Einstein explained an apparent anomaly in the observed motion of the planet Mercury.^{1} A planet’s perihelion remains fixed under a pure inverse-square gravitational force, so any shift indicates, as first realized by Newton, either a different force law or the presence of perturbing forces. The perihelion of Mercury is observed to precess, after correction for known planetary perturbations, at the rate of about 43 seconds of arc per century, and this residual is the amount predicted by the theory of general relativity.
To approximately derive the relativistic contribution to the precession (there are further corrections of negligible relevance^{2}), it is not necessary to completely solve the relativistic orbit equation. In his original derivation, Einstein came upon an elliptic integral, which he managed to compute approximately. Since then, a host of authors in this journal^{3–14} have taken alternative approaches to illuminate various aspects of this problem.
Our approach to the subject arises from the interplay of two quite different methodological strategies, which we can define as the local and the global. The first shows that the perihelion precession produced by a perturbing force can be traced back to a steady action along the entire orbit. The second deals directly with this secular effect by splitting the constant term of an auxiliary polar equation of an elliptical orbit into two parts according to a specific criterion, without needing to know what takes place during the motion. Our method is applicable to a broad class of perturbing central forces and provides the leading term of the secular perihelion shift.

II. RELATIVISTIC EQUATION

The polar equation for the orbit of a planet, considered as a test particle subject to a central force $f(u)$, is given by the well-known expression^{15}

$$u'' + u = \frac{f(u)}{\ell^2 u^2}. \tag{1}$$

Here, $u(\theta) = 1/r$, with $r$ the distance from the origin, while $\ell$ is the (constant) magnitude of the orbital angular momentum, and a prime denotes differentiation with respect to the independent variable, which in this case is the angle $\theta$.^{16}

In general relativity, the spherically symmetric Schwarzschild solution to Einstein’s field equation corresponds, for a weak field, to a function $f(u)$ consisting of two attractive parts, the classical inverse-square force and a small corrective term^{17}

$$f(u) = \mu u^2 + 3\alpha \ell^2 u^4. \tag{2}$$

Here, $\mu = GM$, with $G$ the universal gravitational constant and $M$ the mass of the star, and $\alpha = \mu/c^2$, with $c$ the speed of light. The parameter $\alpha$ has the dimension of a length and is called the gravitational radius; for the Sun we have $\alpha \approx 1.477$ km, a very small value compared to typical orbital radii in our solar system. From Eqs. (1) and (2), we get the relativistic orbit equation

$$u'' + u = \frac{\mu}{\ell^2} + 3\alpha u^2, \tag{3}$$

which represents an oscillator with a weak quadratic nonlinearity. This equation cannot be solved in terms of elementary functions. Using methods of perturbation theory, a bounded periodic (planetary case) approximate solution can be painstakingly assembled to arbitrarily high order in the coupling constant $\alpha$, allowing a determination of the precession.^{2}
Our plan here is to bypass the solution of Eq. (3) and, more generally, to extract directly from a perturbed orbital equation the leading precession term through a simple linearization process that consists of replacing the nonlinear term by a constant. This procedure will be discussed in detail after we have dealt with some basic aspects of the elliptical orbit.

III. ELLIPTICAL ORBIT

The unperturbed orbit equation is

$$u'' + u = \frac{\mu}{\ell^2}. \tag{4}$$

The constant term $\mu/\ell^2$ in this equation can be given a geometric meaning by exploiting a result of Newton’s that dates back to 1671.^{18} Newton found that a generic plane curve $u(\theta)$ satisfies the equation

$$u'' + u = \frac{1}{\rho \sin^3\beta}, \tag{5}$$

Am. J. Phys. 83 (4), April 2015

http://aapt.org/ajp

where $\rho$ is the radius of curvature and $\beta$ is the angle between the radial and tangential directions at any point on the curve. Setting $\beta = \pi/2$, it follows that the orbit equation can be written in the form

$$u'' + u = \kappa, \tag{6}$$

where $\kappa \equiv 1/\rho = \mu/\ell^2$ is the curvature at those particular points. For a planetary orbit this identification occurs in two circumstances: when the orbit is circular (in which case $\kappa = 1/r$ is always true) and at the extrema of an elliptical orbit (perihelion and aphelion). In the latter case, the curvature is maximal and can be simply expressed in terms of the ellipse parameters (eccentricity $e$, semi-major axis $a$, and semi-minor axis $b$).^{19}
The solution to Eq. (6), written so as to highlight this geometrical aspect, is obtained by starting with the particular solution $u = \kappa$ and adding to it the periodic solution to the homogeneous equation. The orbit can thus be written in the form

$$u(\theta) = \kappa + \kappa e \cos(\theta - \omega), \tag{7}$$

where $e$ and $\omega$ are the two arbitrary constants that parametrize the family of solutions. This is the polar equation of an ellipse when the eccentricity $e$ lies in the open interval (0, 1); the polar axis has been chosen so that $u$ attains its maximum value at $\theta = \omega$, the phase that identifies the position of the perihelion. The period of the solution is $2\pi$, so that the perihelia are located at $\theta = \omega + 2n\pi$, with $n = 0, 1, \ldots$. Now, the particular solution $u = \kappa$ of Eq. (4) represents a circular orbit of radius $r = \rho$, and from Eq. (7) we find that $\kappa = u(\omega + \pi/2)$. Then, by comparison with the standard polar equation for an ellipse, we deduce that the curvature is given by $\kappa = 1/\rho = 1/[a(1 - e^2)] = a/b^2$ at perihelion.
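As a numerical cross-check of the identification $\kappa = a/b^2 = 1/[a(1 - e^2)]$, the following minimal sketch evaluates Newton's relation, Eq. (5), at perihelion for the orbit of Eq. (7). It is an illustration under stated assumptions, not part of the original derivation: the function names, the finite-difference step `h`, and the Mercury-like orbital elements are choices made here.

```python
import math

def u_ellipse(theta, kappa, ecc, omega=0.0):
    """Polar equation (7): u = kappa + kappa*e*cos(theta - omega)."""
    return kappa + kappa * ecc * math.cos(theta - omega)

def curvature_at(theta, kappa, ecc, omega=0.0, h=1e-3):
    """Curvature 1/rho of the orbit from Eq. (5): (u'' + u) * sin^3(beta).

    u' and u'' are approximated by central finite differences (step h is an
    assumption of this sketch); sin(beta) = u / sqrt(u^2 + u'^2) follows from
    tan(beta) = r / (dr/dtheta) with r = 1/u.
    """
    u_minus = u_ellipse(theta - h, kappa, ecc, omega)
    u_zero = u_ellipse(theta, kappa, ecc, omega)
    u_plus = u_ellipse(theta + h, kappa, ecc, omega)
    u_prime = (u_plus - u_minus) / (2.0 * h)
    u_second = (u_plus - 2.0 * u_zero + u_minus) / h**2
    sin_beta = u_zero / math.hypot(u_zero, u_prime)
    return (u_second + u_zero) * sin_beta**3

# Illustrative Mercury-like elements: semi-major axis in AU and eccentricity.
a, ecc = 0.3871, 0.2056
b_squared = a**2 * (1.0 - ecc**2)      # semi-minor axis squared
kappa = a / b_squared                  # expected curvature at perihelion
print(kappa, curvature_at(0.0, kappa, ecc))
```

Both printed numbers should agree to within the finite-difference error, since at perihelion $\beta = \pi/2$ and Eq. (5) reduces to Eq. (6).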

IV. ANGULAR PERIOD OPERATOR

A definite integral provides global information about the behavior of a function over the interval of integration. As a limit of Riemann sums, it is the result of a pointwise accumulation process, and this proves useful for the detection of secular effects on the orbit. Unlike Einstein’s elliptic integral, which is quite complicated to handle, the integral we shall use is elementary, because the integrand is the function $u(\theta)$, the sum of a constant and a cosine.
When plotted in rectangular coordinates $(\theta, u)$, with $\theta$ in the range $[\omega, \omega + 2\pi]$, the function $u = \kappa + \kappa e \cos(\theta - \omega)$ traces out a sinusoid of amplitude $\kappa e$ around the segment $u = \kappa$ of length $2\pi$. The oscillation begins and terminates at two successive perihelia, the points $(\omega, \kappa + \kappa e)$ and $(\omega + 2\pi, \kappa + \kappa e)$, while in between we have one aphelion at the point $(\omega + \pi, \kappa - \kappa e)$. It follows that $\kappa$ is the average value of the elliptical solution over the interval $[\omega, \omega + 2\pi]$,

$$\frac{1}{2\pi}\int_{\omega}^{\omega+2\pi} u\, d\theta = \kappa, \tag{8}$$

and so $u = \kappa$ is the circular orbit of an imaginary planet associated with the planet that follows the elliptical orbit (7), with which it shares the same period. The integral

$$\int_{\omega}^{\omega+2\pi} u\, d\theta = 2\pi\kappa \tag{9}$$

measures the area under the graph of $u$ in the interval $[\omega, \omega + 2\pi]$, and is equal to the area of the rectangle of width $2\pi$ and height $\kappa$. Although representing the area of a plane figure, this result can also be interpreted as the circumference of a circle of radius $\kappa$. Therefore, a division by $\kappa$ will provide the period of the orbit, i.e., the distance (from $\omega$) along the $\theta$-axis after which the function $u$ repeats itself.
In general, given a real number $s$, we can consider the integration

$$\hat{P}u(s\theta) \equiv \frac{1}{\kappa}\int_{\omega}^{\omega+2\pi/s} u(s\theta)\, d\theta = 2\pi\,\frac{1}{s} \tag{10}$$

as the action of an operator $\hat{P}$ that, when acting on the function $u(s\theta)$, results in the angular period (defined as the angle separating two successive passages of the planet through the perihelion). The value of the factor $s$ for a perfectly elliptical orbit is unity, but we are anticipating the possibility that the function $u$ undergoes the dilation (stretching or shrinking) $u(\theta) \to u(s\theta)$ as a result of a perturbation.

With Eq. (10) we have carried out a measurement. The operator $\hat{P}$ is tuned to the circular orbit $u = \kappa$ of the imaginary mean planet associated with the elliptical solution (7), and any variation of the common angular period of these planets (in the case of perturbed motion) will be detected. The relative increment $\Delta\omega$ of the angular position of the perihelion over a complete orbit is obtained when we subtract a full turn $2\pi$ from $\hat{P}u(s\theta)$,

$$\Delta\omega \equiv \hat{P}u(s\theta) - 2\pi = 2\pi\left(\frac{1}{s} - 1\right). \tag{11}$$

Thus, the perihelion shift will be positive or negative according to whether $1/s$ is greater than or less than one.
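The action of the operator in Eqs. (10) and (11) can be sketched numerically. The code below is only an illustration: the function names, the midpoint-rule quadrature, and the sample values of $\kappa$, $e$, and $s$ are assumptions made here, not quantities fixed by the text.

```python
import math

def period_operator(kappa, ecc, s, omega=0.0, steps=2000):
    """Approximate Eq. (10): (1/kappa) * integral of u(s*theta) d(theta)
    over [omega, omega + 2*pi/s], using the composite midpoint rule."""
    width = 2.0 * math.pi / s
    d_theta = width / steps
    total = 0.0
    for i in range(steps):
        theta = omega + (i + 0.5) * d_theta
        # Dilated orbit u(s*theta) = kappa + kappa*e*cos(s*theta - omega).
        total += kappa + kappa * ecc * math.cos(s * theta - omega)
    return total * d_theta / kappa

def perihelion_shift(kappa, ecc, s):
    """Eq. (11): angular period minus a full turn."""
    return period_operator(kappa, ecc, s) - 2.0 * math.pi

# Hypothetical dilation factor slightly below one: the perihelion advances.
print(perihelion_shift(2.6973, 0.2056, 0.999), 2.0 * math.pi * (1.0 / 0.999 - 1.0))
```

Because the cosine integrates to zero over one full period $2\pi/s$, only the constant $\kappa$ contributes, which is why the result does not depend on the eccentricity.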
Now let us multiply Eq. (10) by a positive integer $n$. The effect of $n\hat{P}$ is equivalent to carrying out the operation $\hat{P}$ $n$ times in succession, assuming, for $1 \le i \le n$, that the terminal condition of the $(i-1)$st operation becomes the initial condition for the $i$th operation,

$$n\hat{P}u = \frac{n}{\kappa}\int_{\omega}^{\omega+2\pi/s} u\, d\theta = \frac{1}{\kappa}\sum_{i=1}^{n}\int_{\omega+2(i-1)\pi/s}^{\omega+2i\pi/s} u\, d\theta = \frac{1}{\kappa}\int_{\omega}^{\omega+2n\pi/s} u\, d\theta \equiv \hat{P}^n u = 2\pi\,\frac{n}{s}. \tag{12}$$

Therefore, $n$ can be embedded in the upper limit of the integral, which we denote by the symbol $\hat{P}^n$, representing the $n$-fold composition of $\hat{P}$ with itself. Thus, the equality $\hat{P}^n u = 2\pi(n/s)$ is a consequence of the fact that the successive perihelia are evenly spaced with separation $2\pi(1/s)$.

V. THE ROTATING ELLIPSE
An ellipse in polar form is specified by the elements $a$, $e$, and $\omega$, which fix, respectively, the size, shape, and orientation of the ellipse in the plane. When a small perturbing force acts on the system, a viable approach is to treat these “constants” as variable. In our context, we should allow for variations of $\kappa$ and $\omega$, the elements $a$ and $e$ being contained in $\kappa$ (where they can vary independently).
In a specific application of this technique, the solution in the form $u(\theta) = \kappa + \kappa e \cos(\theta - \omega)$ is retained with $\omega$ no longer constant, but treated as a slowly varying function compared to $\cos\theta$. The function is a solution to a first-order differential equation solvable by successive approximations.^{20,21} For each value of $\omega(\theta)$ we can define an orbit, the osculating ellipse, with an orientation that varies according to the specific form of the function. In general, this is such that the perihelion is both oscillating and circulating, the latter being due to a part linear in $\theta$, which controls the secular perihelion shift.
The linear part produces a variation of the orbital period by a dilation of $u(\theta)$. To see this, assume $\omega(\theta) = (1 - s)\theta + \omega$, where $s$ is a real number (in actual applications $|s|$ will be very nearly equal to unity), and $\omega = \omega(0)$ is the initial position of the perihelion. Then,

$$\kappa + \kappa e \cos[\theta - \omega(\theta)] = \kappa + \kappa e \cos\{\theta - [(1 - s)\theta + \omega]\} = \kappa + \kappa e \cos(s\theta - \omega) = u(s\theta), \tag{13}$$

where $u(s\theta)$ is a solution to the equation

$$u'' + s^2 u = \kappa \tag{14}$$

(strictly, the constant term is $s^2\kappa$, which coincides with $\kappa$ to lowest order because $s \approx 1$).

Equation (13) shows that the phase of the sinusoid $u$ shifts in $\theta$ at a constant rate, so that in the plane $(\theta, u)$ the graph of $u$ can be imagined as a cosine wave traveling smoothly in the positive or negative $\theta$-direction at the uniform rate $1 - s$.
With the available tools we can analyze some interesting qualitative aspects of a rotating ellipse. For what follows, it is better to visualize the orbit as represented in the polar plane $(r, \theta)$, where the perihelia lie on the circle of radius $r = a(1 - e)$. If we assume $1/s = 1 + \eta$ for $0 < \eta \ll 1$, then after $n$ orbital turns the relative shift of the perihelion from its initial position will be

$$\Delta_n\omega \equiv \hat{P}^n u(s\theta) - 2n\pi = 2n\pi\eta. \tag{15}$$

Thus, the operation $\Delta_n\omega$ maps the orbit $u(s\theta)$ to its $n$th tangency point with the circle of the perihelia. This point is identified by the angle $2n\pi\eta$, reckoned from an initial position $\omega$. We wish to consider $\Delta_n\omega$ as a dynamical system and study the behavior of its “orbits,” i.e., the sequences $\Delta_n\omega$ for $n = 1, \ldots, \infty$. Think of $n = 1, 2, \ldots$ (the rotation number) as a sequence of times. Then one can liken $\Delta_n\omega$ to a stroboscope that flashes briefly at these times, showing the planet at its successive perihelia. Two different situations can occur. If $\eta$ is a rational number, say $p/q$, then $\Delta_q\omega = 2\pi p$, so after $q$ flashes the perihelion has completed $p$ full turns and is back at the start. This means that the orbital path closes; the orbit is periodic and takes the form of a rosette. If $\eta$ is an irrational number, the orbital path never closes and in the long run fills more and more densely the annulus $a(1 - e) \le r \le a(1 + e)$, in the sense that the trajectory of the planet intersects every neighborhood in the annulus, no matter how small. More specifically, the sequence of points $\Delta_n\omega$ will densely cover the circle of the perihelia, but each particular perihelion will never again be attained. It follows that the map $\Delta_n\omega$ is quasi-periodic, as made explicit by the following recurrence theorem: in the long run, a planet comes, on the circle of the perihelia, an infinite number of times arbitrarily close to any position already occupied. To demonstrate this, we can use the continued fraction approximations^{22}

$$\eta = \cfrac{1}{s_1 + \cfrac{1}{s_2 + \cdots}}. \tag{16}$$

Truncating at each successive stage gives an infinite sequence of rational approximants (the convergents)

$$\eta \approx \frac{1}{s_1},\ \frac{s_2}{1 + s_1 s_2},\ \ldots \equiv \frac{p_1}{q_1},\ \frac{p_2}{q_2},\ \ldots,\ \frac{p_n}{q_n},\ \ldots, \tag{17}$$

where the integers $p_n$ and $q_n$ are coprime (their only common factor is 1) and $q_n > q_{n-1}$. The convergents $p_n/q_n$ play a role analogous to that of the partial sums of an infinite series. It can be shown^{22} that for each $n$, the difference from $\eta$ is less than $1/q_n^2$,

$$\left|\eta - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2}, \tag{18}$$

and that these are the best rational approximations there are, in the sense that no rational fraction with a denominator not exceeding the denominator of the convergent does better. Then, multiplying Eq. (18) by $2q_n\pi$, we get

$$|2q_n\pi\eta - 2p_n\pi| = |\Delta_{q_n}\omega - 2p_n\pi| < \frac{2\pi}{q_n}. \tag{19}$$

Now, for a given tolerance, however small, we can find a positive integer $n_0$ such that the middle term of Eq. (19) is smaller for all values of $n$ greater than or equal to $n_0$. So we get closer and closer to $2p_n\pi$. By imagining suitable rotations of the circle of the perihelia (changes of origin), this reasoning extends to the generality of the perihelia.
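The recurrence argument of Eqs. (16)–(19) can be illustrated with a short computation of convergents. This is a sketch under stated assumptions: the function name is hypothetical, and the test value $\eta = \sqrt{2} - 1$ is chosen only because its continued fraction is simple; it is far larger than any realistic planetary value.

```python
import math
from fractions import Fraction

def convergents(eta, count):
    """Return up to `count` convergents p_n/q_n of eta = 1/(s1 + 1/(s2 + ...)),
    assuming 0 < eta < 1. Exact rational arithmetic avoids round-off in the
    expansion itself (the float eta is converted to its exact binary value)."""
    remainder = Fraction(eta)
    quotients = []
    result = []
    for _ in range(count):
        if remainder == 0:
            break                      # eta was rational: expansion terminates
        inverse = 1 / remainder
        quotient = inverse.numerator // inverse.denominator   # next s_k
        quotients.append(quotient)
        remainder = inverse - quotient
        # Rebuild the truncated fraction 1/(s1 + 1/(s2 + ... + 1/s_k)).
        value = Fraction(0)
        for s_k in reversed(quotients):
            value = 1 / (s_k + value)
        result.append(value)
    return result

eta = math.sqrt(2.0) - 1.0             # illustrative value only
for c in convergents(eta, 8):
    p_n, q_n = c.numerator, c.denominator
    # Left-hand side of Eq. (19) against its bound 2*pi/q_n.
    print(p_n, q_n, abs(2 * q_n * math.pi * eta - 2 * p_n * math.pi), 2 * math.pi / q_n)
```

Each printed row shows the perihelion returning, after $q_n$ turns, to within $2\pi/q_n$ of a whole number $p_n$ of revolutions, and the bound shrinks as $q_n$ grows.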

VI. A SMALL CHANGE OF CURVATURE
Suppose now that the constant $\kappa$, this structural component of the polar equation, is altered somewhat by adding (in an algebraic sense) a small piece $\delta\kappa$, thus affecting the maximal curvature of the orbital ellipse. This means that, for a given eccentricity, we have changed the semi-major axis of the elliptical orbit and the orbital radius of the mean planet. Then the solution $u$ will become

$$u = (\kappa + \delta\kappa) + (\kappa + \delta\kappa)\, e \cos(\theta - \omega), \tag{20}$$

and, acting on it with $\hat{P}$, we obtain

$$\hat{P}u = 2\pi\left(1 + \frac{\delta\kappa}{\kappa}\right). \tag{21}$$

The shift $2\pi\,\delta\kappa/\kappa$ will be an increase or a decrease, depending on the algebraic sign of $\delta\kappa$.
We wonder whether it is possible to build a suitable increment $\delta\kappa$ that encapsulates the presence of a central perturbing force. In this way we would obtain a phenomenological derivation of the perihelion precession.
Notice that, while Eq. (13) is a two-way relationship between $\omega(\theta)$ and $u(s\theta)$, an analogous, direct link between $\omega(\theta)$ and $\delta\kappa/\kappa$ does not exist. Because Eqs. (10) and (21) both express a perihelion shift, we can establish indirectly a relationship between $\omega(\theta)$ and $\delta\kappa/\kappa$ via the dilation factor $s$, by identifying the right-hand sides of Eqs. (10) and (21); in this way we get


$$2\pi\,\frac{1}{s} = 2\pi\left(1 + \frac{\delta\kappa}{\kappa}\right). \tag{22}$$

Now multiply both sides by $\kappa$. Then the area under the graph of $u(s\theta)$ in the interval $[0, 2\pi/s]$ is made equal to that of the rectangle of width $2\pi$ and of height $\kappa + \delta\kappa$. We assume that this area is an invariant of the perturbed system, in the sense that if we imagine bringing the height of the rectangle to the value $\kappa$, which reflects the geometry of the actual system, its width will vary by the fraction $1/s - 1$. This indicates that, for the dynamic system, the addition of a $\delta\kappa$ should be understood as a virtual producer of a dilation of the function $u$.
As long as $|\delta\kappa| \ll \kappa$, from Eq. (22) we get, to first order in $\delta\kappa/\kappa$,

$$s = 1 - \frac{\delta\kappa}{\kappa}, \tag{23}$$

and, by Eq. (15) with $n = 1$, we have the identification $\delta\kappa/\kappa = \eta$. So either $s$ or $\delta\kappa$ can be used for the determination of the perihelion shift, and if we can connect one of them to the physics of the problem, we will have also determined the other. Further, we have

$$\omega(\theta) = (1 - s)\theta + \omega = \frac{\delta\kappa}{\kappa}\,\theta + \omega, \tag{24}$$

which expresses the secular shift of the perihelion in terms of the virtual relative increment of curvature of the elliptical orbit at its extrema in the presence of the perturbation.
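The passage from Eq. (22) to Eq. (23) is a one-line binomial expansion. Written out under the stated assumption $|\delta\kappa| \ll \kappa$, and using the definition $1/s = 1 + \eta$ of Sec. V, it reads:

```latex
\frac{1}{s} = 1 + \frac{\delta\kappa}{\kappa}
\quad\Longrightarrow\quad
s = \left(1 + \frac{\delta\kappa}{\kappa}\right)^{-1}
  = 1 - \frac{\delta\kappa}{\kappa}
    + O\!\left(\frac{\delta\kappa^{2}}{\kappa^{2}}\right),
\qquad
\eta = \frac{1}{s} - 1 = \frac{\delta\kappa}{\kappa}.
```

The identification $\eta = \delta\kappa/\kappa$ is therefore exact, while Eq. (23) for $s$ itself holds only to first order.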

VII. RELATIVISTIC PRECESSION
In dealing with the nonlinear Eq. (3), we can imagine that the orbital equation that contains all information on the perihelion precession of a planet, say Mercury, takes the linear form

$$u_m'' + u_m = \kappa_m, \qquad \kappa_m = \text{const}. \tag{25}$$

In light of the results of Sec. VI, we need an interpretation of the symbol $\kappa_m$. We must assume that $\kappa_m$ is not exactly equal to the constant $\mu/\ell^2$, which applies only to the inverse-square force. So we split up $\kappa_m$ into two unequal parts: a dominant part $\kappa = \mu/\ell^2$, and a much smaller secondary part $\delta\kappa \equiv \kappa_m - \kappa$, which encodes information about the function $3\alpha u^2$. With this interpretation, Eq. (25) should be considered as an auxiliary linear equation whose particular integral is what interests us. Then Mercury will have the incremental orbital shift

$$\Delta\omega = 2\pi\,\frac{\delta\kappa}{\kappa}. \tag{26}$$

To implement this result, we rewrite the relativistic orbit equation as

$$u'' + u = \kappa + \underbrace{3\alpha u^2}_{\delta\kappa}, \tag{27}$$

where we still do not know where $\delta\kappa$ comes from. We only know that the dimension of $\delta\kappa$ must be the inverse of a length, so that $\delta\kappa/\kappa$ is dimensionless. We tentatively guess $\delta\kappa \equiv 3\alpha\kappa^2$, obtained by substituting $\kappa$ for $u$ in the last term of Eq. (27), and then we attempt a perturbative approach in which we take as the first approximate solution just the circular orbit that we have associated with the elliptical orbit. This procedure works, because from

$$u'' + u = \kappa + 3\alpha\kappa^2, \tag{28}$$

and from Eq. (26), we get the Einstein formula

$$\Delta\omega = 6\pi\alpha\kappa = \frac{6\pi\alpha}{a(1 - e^2)}, \tag{29}$$

which shows that the relativistic precession is proportional to the maximum curvature $\kappa = a/b^2$ of the elliptical orbit.
Consider, for example, the actual figures for the orbit of Mercury. From $a = 0.3871$ AU (one astronomical unit is exactly 149,597,870.7 km) and $e = 0.2056$, we obtain $\kappa = 1/[a(1 - e^2)] \approx 2.6973$ AU$^{-1}$; in addition, from $M \approx 1.989 \times 10^{30}$ kg we get $3\alpha = 3GM/c^2 \approx 4.4309$ km, which corresponds approximately to $2.96187 \times 10^{-8}$ AU. Then $\delta\kappa = 3\alpha\kappa^2 \approx 2.1549 \times 10^{-7}$ AU$^{-1}$, so that Eq. (28) takes the numerical form

$$u'' + u = 2.6973 + 2.1549 \times 10^{-7}, \tag{30}$$

where obviously the two addends must be kept separate. Then we get

$$\Delta\omega = 2\pi\,\frac{2.1549 \times 10^{-7}}{2.6973} \approx 5.0197 \times 10^{-7}\ \text{rad}, \tag{31}$$

corresponding to 0.1035 arc sec per revolution. Mercury revolves about the Sun 415.2 times in a century, so we have $\Delta\omega_{\mathrm{sec}} = 0.1035 \times 415.2 = 42.97$ arc sec.
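The arithmetic leading to Eq. (31) can be reproduced with a few lines of code. The sketch below uses only the numbers quoted in the text, apart from the kilometre value of the astronomical unit, which is taken here as the IAU defined value; the function and constant names are assumptions of this example.

```python
import math

AU_KM = 149_597_870.7                    # astronomical unit in km (IAU defined value)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def einstein_shift_per_orbit(three_alpha_km, a_au, ecc):
    """Relativistic perihelion shift per revolution, in radians."""
    kappa = 1.0 / (a_au * (1.0 - ecc**2))        # curvature at perihelion, AU^-1
    three_alpha_au = three_alpha_km / AU_KM      # 3*alpha converted to AU
    delta_kappa = three_alpha_au * kappa**2      # constant term of Eq. (28)
    return 2.0 * math.pi * delta_kappa / kappa   # Eq. (26)

shift_rad = einstein_shift_per_orbit(4.4309, 0.3871, 0.2056)
per_century = shift_rad * ARCSEC_PER_RAD * 415.2   # 415.2 revolutions per century
print(f"{shift_rad:.4e} rad per orbit, {per_century:.2f} arcsec per century")
```

Small differences in the last digit relative to the figures in the text are expected, because the text rounds the per-revolution value to 0.1035 arc sec before multiplying by the number of revolutions.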
Astronomical data^{23} show that the total dynamic secular perihelion shift of Mercury is about 574.09 ± 0.41 arc sec per century, of which 531.50 ± 0.85 arc sec is accounted for by the disturbances of the other planets. This corresponds, in Eq. (30), to an additional numerical constant (evidently of order $10^{-6}$) that we shall calculate below to a precision within the relative standard observational uncertainty.

VIII. CENTRAL PERTURBING FORCES

Our approach to the relativistic precession, supported by a chain of heuristic arguments, can be made systematic and general. Let us consider a polar orbital equation with a nonlinear term $\epsilon g(u)$, where $\epsilon$ is a parameter small enough to justify a perturbative approach. To this we associate a linear equation with a constant term $\delta\kappa$, having the dimension of inverse length,

$$u'' + u = \kappa + \underbrace{\epsilon g(u)}_{\delta\kappa}. \tag{32}$$

For the relativistic equation, where $\epsilon g(u) = 3\alpha u^2$, we have verified that the association $\epsilon g(\kappa) \to \delta\kappa$ works. But it turns out that such a simple recipe applies only to this case. Thus, we need a procedure that assigns to each specific function $\epsilon g(u)$ its $\delta\kappa$, while preserving the relativistic result.
To find this procedure, because of the dual role of $s$ and $\delta\kappa$, we shall determine first the dilation factor $s$ by means of

a variational technique. We will assume, in the presence of a perturbation, a small variation of the solution $u$ from the circular value $u = \kappa$ and then see what happens. The variation equation can be formally obtained by applying the operator $\delta$ such that

$$\delta(u'') = (u + \delta u)'' - u'' = \delta u'', \tag{33}$$

where the combination $\delta u$ should be viewed as a single entity. Now consider Eq. (32), in which $g(u)$ is a function that is continuously differentiable in the closed interval $[u_{\min}, u_{\max}]$. We apply $\delta$ to both sides of this equation, and this yields

$$\delta u'' + \delta u = \epsilon g'(u)\, \delta u. \tag{34}$$

Assuming as reference motion the circular orbit $u = \kappa$, we evaluate the derivative on the right-hand side at the point $\kappa$. In this way we obtain the homogeneous equation

$$\delta u'' + \underbrace{[1 - \epsilon g'(\kappa)]}_{s^2}\, \delta u = 0, \tag{35}$$

which implies $s \approx 1 - \epsilon g'(\kappa)/2$, and so the lowest-order solution $u + \delta u$ to Eq. (32) can be written in the form of the function $u(s\theta)$ of Eq. (13), with $s = 1 - \eta$ and $\eta = \epsilon g'(\kappa)/2$.
It is interesting to note that, according to Eq. (35), it is possible to replace in Eq. (1) the actual perturbing force $\epsilon g(u)\ell^2 u^2$ with the inverse-cube force $\epsilon g'(\kappa)\ell^2 u^3$, in agreement with Newton’s theorem on revolving orbits.^{13} In conclusion, from Eqs. (23) and (35) we get

$$\delta\kappa = \frac{1}{2}\,\kappa\,\epsilon g'(\kappa), \tag{36}$$

which establishes the correct relationship between $\delta\kappa$ and the function $\epsilon g(u)$ in the orbital equation. The resulting displacement of the perihelion per revolution will be given by

$$\Delta\omega = 2\pi\,\frac{\delta\kappa}{\kappa} = \pi\epsilon g'(\kappa). \tag{37}$$

This formula shows the key role played by the maximum curvature of the ellipse in the phenomenon of planetary precession. We realize also why the substitution $u \to \kappa$ in $\epsilon g(u) = \epsilon u^2$ works only for the relativistic orbital equation. The reason is that if we assemble the initial-value problem

$$\frac{1}{2}\,\kappa\, g'(\kappa) = g(\kappa), \qquad g(1) = 1, \tag{38}$$

we find that it has the unique solution $g(\kappa) = \kappa^2$, and this explains why the relativistic perturbing force is the only one for which both approaches give the same result.
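Equation (37) can be compared with a direct numerical integration of Eq. (32). The sketch below integrates the orbit with a classical fourth-order Runge-Kutta scheme and measures the angle between two successive perihelia. Everything about it is an assumption of this example rather than part of the paper's method: the integrator, the step size, the function name, and the deliberately exaggerated coupling $\epsilon = 10^{-3}$ with $\kappa = 1$ and $g(u) = u^2$.

```python
import math

def perihelion_advance(kappa, ecc, eps, g, step=1e-3):
    """Integrate u'' + u = kappa + eps*g(u) with classical RK4, starting at a
    perihelion (u' = 0, u maximal), and return the angle to the next
    perihelion minus 2*pi."""
    def accel(u):
        return kappa + eps * g(u) - u

    theta, u, w = 0.0, kappa * (1.0 + ecc), 0.0      # w stands for du/dtheta
    while theta < 4.0 * math.pi:                     # safety bound: two nominal turns
        k1u, k1w = w, accel(u)
        k2u, k2w = w + 0.5 * step * k1w, accel(u + 0.5 * step * k1u)
        k3u, k3w = w + 0.5 * step * k2w, accel(u + 0.5 * step * k2u)
        k4u, k4w = w + step * k3w, accel(u + step * k3u)
        u_next = u + step * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        w_next = w + step * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        if w > 0.0 and w_next <= 0.0:
            # u' changes sign from + to -: a maximum of u, i.e., a perihelion.
            crossing = theta + step * w / (w - w_next)   # linear interpolation
            return crossing - 2.0 * math.pi
        theta, u, w = theta + step, u_next, w_next
    raise RuntimeError("no perihelion found within two turns")

eps = 1.0e-3                                          # exaggerated, illustrative coupling
measured = perihelion_advance(1.0, 0.2, eps, lambda u: u * u)
predicted = math.pi * eps * 2.0 * 1.0                 # Eq. (37) with g'(kappa) = 2*kappa, kappa = 1
print(measured, predicted)
```

The two values should agree up to terms of higher order in $\epsilon$. For other choices of $g$ the agreement is expected only to leading order, consistent with the statement that the method provides the leading term of the secular shift.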

The presence of the normalization factor $1/\kappa$ in the structure of the operator $\hat{P}$ suggests the formal simplification

$$\hat{P}u = \frac{1}{\kappa}\oint u\, d\theta = \oint \frac{u}{\kappa}\, d\theta \equiv \oint v\, d\theta \equiv \hat{Q}v, \tag{39}$$

allowing us to write the orbital equations in dimensionless form, for which the circular mean orbit is $v = 1$. This device sometimes simplifies the mathematics, and is often used for theoretical analysis.^{2} The dimensionality can be restored at any stage by suitably reintroducing the factor $\kappa$. The dimensionless form of Eq. (32) is obtained via the substitution $u \to v$, $\kappa \to 1$, and then dropping the $\kappa$ from $\delta\kappa$,

$$v'' + v = 1 + \delta, \qquad \delta \equiv \frac{1}{2}\,\epsilon g'(1), \tag{40}$$

so that now, employing the operator $\hat{Q}$, we have $\Delta\omega = 2\pi\delta$.
To illustrate the use of this form in a simple application, we derive the perihelion shift that arises by supposing^{26} that the exponent in Newton’s law $\mu v^2$ is changed to a value slightly different from 2. This was one of the many pre-relativistic attempts to modify the law of gravitation in order to explain the motion of Mercury.^{27} Let us put $f(v) = \mu v^{2+\epsilon} = \mu v^2 v^{\epsilon}$ in the dimensionless form of Eq. (1). If $\epsilon$ is small enough, we can limit ourselves to the first-order approximation

$$v^{\epsilon} \approx 1 + \epsilon \ln(v). \tag{41}$$

To the resulting orbit equation

$$v'' + v = 1 + \epsilon \ln(v), \tag{42}$$

by Eq. (40), with $\epsilon g(v) = \epsilon \ln(v)$, we associate the equation

$$v'' + v = 1 + \frac{\epsilon}{2}, \tag{43}$$

and so we get $\Delta\omega = \pi\epsilon$. Here, we do not have to make any dimensional adjustment, because $\epsilon$ is a pure number whose choice is made to fit the motion of Mercury. To this lowest degree of approximation, the shift is the same for all planets.^{24,25}
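To give a sense of scale, the relation $\Delta\omega = \pi\epsilon$ can be inverted for the value of $\epsilon$ that would reproduce the relativistic shift per revolution quoted in Eq. (31). This is only an illustrative back-of-the-envelope sketch; the function name is hypothetical, and the input is the rounded figure from the text.

```python
import math

def exponent_excess_for_shift(shift_rad_per_orbit):
    """Invert Delta_omega = pi * epsilon for the exponent excess epsilon."""
    return shift_rad_per_orbit / math.pi

epsilon = exponent_excess_for_shift(5.0197e-7)   # shift per revolution from Eq. (31)
print(f"epsilon ~ {epsilon:.3e}, i.e., a force exponent of about 2 + {epsilon:.3e}")
```

The required change in the exponent is of order $10^{-7}$, which shows how small a modification of the inverse-square law would be needed under this hypothesis.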

IX. NEWTONIAN PRECESSION

A. The model

Now consider the perihelion precession caused by the gravitational pull of the other planets on Mercury. Strictly speaking, its exact determination involves the treatment of a three-dimensional many-body problem, while our perturbation approach is effective only for plane motions and central forces.
This difficulty can be overcome if we exploit the actual features of the planetary orbits (they are nearly coplanar and nearly circular), leading to rather realistic first approximations. Thus, on the one hand we can assume a common orbital plane. On the other hand we will show that the cumulative effect of the forces exerted by each planet along its orbit on Mercury can be equated to that of a force of central type. It follows that we can use our tools with some minor, but clever, adaptations. Thus, we shall compute the precession of Mercury using an oversimplified, but surprisingly effective, Copernican model, in which we assume the orbits of the other seven planets (from Venus to Neptune) to be circular and suitably spaced. In these circumstances the average perturbing forces, those that interest us, are of central type and directed outward, conditions we know how to handle. Under this assumption, we can write Mercury’s orbit equation in the form

$$u'' + u = \kappa + \sum_{n=0}^{7} \epsilon_n g_n(u), \tag{44}$$

where in the sum the index $n = 0$ applies to the relativistic term, while the remaining values of $n$ apply to the effects of the other seven planets. We defer to the Appendix the calculations required for the determination of the function $\epsilon_n g_n(u)$ for a generic planet $n$. The conclusion is that Eq. (44) takes the explicit nonlinear form

$$u'' + u = \kappa + 3\alpha u^2 - \sum_{n=1}^{7} m_n\, \frac{\kappa}{2 r_n u\,(r_n^2 u^2 - 1)}, \tag{45}$$

where $r_n$ is the (constant) orbital radius of the planet $n$, and $m_n$ is the ratio of the planet’s mass to the Sun’s mass. Thus, the small coupling constants of Eq. (44) are $\epsilon_0 = 3\alpha$ and $\epsilon_n = m_n$ for $n = 1, 2, \ldots, 7$. Because Eq. (45) is nonlinear, we resort to the principle of superposition of small disturbances, which is valid in the first-order mathematical treatment of the solar system. In our case, it can be stated by saying that the secular effect on the perihelion produced by the perturbing terms present in Eq. (45) is the algebraic sum of those produced by each term taken singly. The residue left out by this approximation is negligible.^{28} Therefore, our problem is to find the numerical form of the linear equation

$$u'' + u = \kappa + \sum_{n=0}^{7} \delta\kappa_n. \tag{46}$$

To do this for the planetary part, we first compute

$$m_n g_n'(u) = m_n\, \frac{\kappa\,(3 r_n^2 u^2 - 1)}{2 r_n u^2\,(r_n^2 u^2 - 1)^2}. \tag{47}$$

We must now specify the circular reference orbits to be included in the derivative.

B. Effective orbital radii
There is an appreciable advantage in taking for Mercury not the orbit $u = \kappa$, the average with respect to the polar angle (which we used in the relativistic term), but the time average, which is $u = 1/a$.^{29,30} On the other hand, the average distance, with respect to the angular variable $\theta$, of the planet $n$ from the Sun is given by $\langle r_n \rangle = a_n$, the semi-major axis of its elliptical orbit. This follows from the definition of the ellipse, $r_n + d_n = 2a_n$, where $r_n$ and $d_n$ are the distances from the Sun and from the empty focus, respectively. The expression is symmetric in the two distances, so that their average values over an orbit are both equal to $a_n$. We shall use instead the time average $r_n = a_n(1 + e_n^2/2)$.^{29,30}
This choice captures a dynamical aspect of the situation which would be otherwise excluded in a purely geometric treatment. As a consequence of the law of equal areas, the planet spends more time near the aphelion than near the perihelion. In an averaging process, the sample positions of the planet, per equal time intervals, are unevenly scattered over the elliptical orbit: they are grouped near the aphelion to a greater extent than near the perihelion, and therefore the average distance $a_n$ must be appropriately increased. It follows that if we want to approximate an elliptical orbit by a circle, we must use this effective radius.
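The two time averages quoted above, $\langle 1/r \rangle = 1/a$ and $\langle r \rangle = a(1 + e^2/2)$, can be checked by a direct quadrature over one Kepler orbit. The sketch below parametrizes the orbit by the eccentric anomaly $E$, a standard device that the text does not use explicitly; the function name and the number of quadrature points are assumptions of this example.

```python
import math

def time_averages(a, ecc, steps=2000):
    """Time averages of r and of 1/r over one Kepler orbit.

    Uses the eccentric anomaly E, for which r = a(1 - e cos E) and the mean
    anomaly M = E - e sin E is proportional to time, so dM = (1 - e cos E) dE.
    """
    d_anomaly = 2.0 * math.pi / steps
    sum_r = 0.0
    sum_inv_r = 0.0
    for i in range(steps):
        anomaly = (i + 0.5) * d_anomaly            # midpoint rule in E
        weight = 1.0 - ecc * math.cos(anomaly)     # dM/dE, the time weight
        r = a * weight
        sum_r += r * weight
        sum_inv_r += weight / r
    norm = 2.0 * math.pi                           # total mean anomaly per orbit
    return sum_r * d_anomaly / norm, sum_inv_r * d_anomaly / norm

mean_r, mean_inv_r = time_averages(0.3871, 0.2056)   # Mercury values from the text
print(mean_r, 0.3871 * (1.0 + 0.2056**2 / 2.0))
print(mean_inv_r, 1.0 / 0.3871)
```

Each printed pair should coincide to within round-off, confirming that the time-averaged distance exceeds the semi-major axis by the factor $1 + e^2/2$.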

C. Perihelion shifts

When we insert the time average $u = 1/a$ for Mercury and $r_n$ for the planet $n$, Eq. (47) becomes

$$m_n g_n'(u)\big|_{u=1/a} = m_n\, \frac{\kappa a^4}{2 r_n}\, \frac{3 r_n^2 - a^2}{(r_n^2 - a^2)^2}. \tag{48}$$

Now we can make explicit the planetary portion of the last term of Eq. (46) by using Eqs. (36) (with $\kappa = 1/a$) and (48) to get

$$\sum_{n=1}^{7} \delta\kappa_n = \sum_{n=1}^{7} m_n\, \frac{\kappa a^3}{4 r_n}\, \frac{3 r_n^2 - a^2}{(r_n^2 - a^2)^2}, \tag{49}$$

a rather tricky expression that summarizes a mess of mutual planetary positions.
We have carried out the calculation outlined in Eq. (49) for each perturbing planet, and the results are presented in Table I. We have also displayed the constants associated with each planet,31 so its contribution can be veriﬁed. From the comparison with the results of more reﬁned calculations23—presented in the column labeled as “theory”—it is seen that our results are individually rather close to the correct ones.
The discrepancies in Table I should be mainly attributed to the fact that Eq. (49) fails to take into account the noncentral components of the perturbing forces. However, the differences are such that their algebraic sum is almost negligible: less than half a second of arc per century. We therefore make virtually no error if we use our total $\delta\kappa$ in writing the numerical form of the relativistic + Newtonian auxiliary equation (46) of the planet Mercury as

$$u'' + u = 2.6973 + 2.8841 \times 10^{-6}. \tag{50}$$

Thus, from one perihelion to the next, we have

$$\Delta\omega = \frac{2\pi}{\kappa}\sum_{n=0}^{7} \delta\kappa_n = 2\pi\,\frac{2.8841 \times 10^{-6}}{2.6973} = 6.7183 \times 10^{-6}\ \text{rad}, \tag{51}$$

corresponding to 1.3857 arc sec and to a centennial perihelion shift of 575.34 arc sec. Our derivation yields an excellent fit to the observational data. Moreover, comparing the two numbers on the right-hand side of Eq. (50) tells us the relative strength of the perturbing forces in comparison to the Sun’s inverse-square force: on the order of $10^{-6}$.
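The unit conversions following Eq. (51) can be reproduced with the short sketch below. The two constants on the right-hand side of Eq. (50) are taken from the text. The orbital period of Mercury (87.969 days) and the length of the century (36525 days) are assumptions introduced here to convert the shift per orbit into a shift per century, so the last digits may differ slightly from the quoted 575.34 arc sec.

```python
import math

KAPPA = 2.6973                 # first constant in Eq. (50)
DELTA_KAPPA_TOTAL = 2.8841e-6  # second constant in Eq. (50)
PERIOD_DAYS = 87.969           # ASSUMED orbital period of Mercury, in days
DAYS_PER_CENTURY = 36525.0     # ASSUMED Julian century, in days

# Eq. (51): perihelion shift per orbit, in radians
shift_rad = 2.0 * math.pi * DELTA_KAPPA_TOTAL / KAPPA

# Radians -> degrees -> arc seconds
shift_arcsec = math.degrees(shift_rad) * 3600.0

# Accumulate over the number of orbits completed in one century
orbits_per_century = DAYS_PER_CENTURY / PERIOD_DAYS
shift_per_century = shift_arcsec * orbits_per_century

print(f"shift per orbit   : {shift_rad:.4e} rad = {shift_arcsec:.4f} arc sec")
print(f"shift per century : {shift_per_century:.2f} arc sec")
```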

Table I. Mercury’s secular perihelion shifts caused by the seven planets. The values of $m_n$ and $\delta\kappa_n$ are in units of $10^{-6}$; the time-averaged distances $\bar{r}_n$ are in AU, and the shifts $\Delta\omega_n$, the theoretical values, and their differences are in arc sec per century.

Planet     $\bar{r}_n$     $m_n$      $\delta\kappa_n$   $\Delta\omega_n$   Theory    Diff.
Venus       0.7233       2.4478     1.3484003      268.72       277.85    -9.13
Earth       1.0000       3.0404     0.4687968       93.44        90.04     3.40
Mars        1.5303       0.3227     0.0119692        2.35         2.53    -0.18
Jupiter     5.2095     954.7786     0.7998580      159.44       153.58     5.86
Saturn      9.5511     285.8370     0.0386107        7.69         7.30     0.39
Uranus     19.2126      43.6624     0.0007229        0.14         0.14     0.00
Neptune    30.0701      51.8000     0.0002240        0.04         0.04     0.00
Total                               2.6685819      531.82       531.48     0.34


D. A proper perspective

In order to put our result into the proper perspective, it is worthwhile to analyze the effectiveness of our planetary model with circular orbits and time averages. A more realistic theory (still neglecting out-of-plane effects) could be built starting from the Fourier series expansion of the force (A5), where the tip of $\mathbf{r}_n$ traces a Kepler ellipse. The coefficients of such a series, as is well known, are particular averages of the function to be expanded. If we confine ourselves to the secular effect on $\omega$ for the general case of a perturber moving in an elliptical orbit of eccentricity $e_n$, then in the first-order approximation we obtain an equation of the type (Ref. 32)

$$\omega'(\theta) = A + B\,\frac{e_n}{e}\cos(\omega - \omega_n), \tag{52}$$

with two terms on the right side: a term that represents the average contribution of a nominal circular orbit, plus a correction term arising from the non-central part of the force, which takes into account the mutual orientations of the orbits of Mercury and planet $n$. The constants $A$ and $B$ depend on the semi-major axes of the two orbits, while $\omega$ and $\omega_n$ are the positions of the perihelia at a fixed epoch. In the second term, a critical factor is the ratio of the eccentricities of the two planets. Because the eccentricity of Mercury is an order of magnitude greater than any other, one realizes that the contribution of the second term on the right-hand side of Eq. (52) is small, which accounts for the minor corrective terms we found. Further, when considering the combined action of all planets, a random distribution of the perihelia attaches positive or negative signs to each of these terms, since the cosine runs through all its values from $-1$ to $+1$ as $\omega - \omega_n$ varies from 0 to $2\pi$. Under favorable conditions these terms nearly cancel when they are summed, as in the present epoch. Because the planetary perihelia are all slowly moving, in another epoch the sum of the non-central terms of Eq. (52) could produce a significant result of positive or negative sign, and our Copernican model would be less successful.

X. CONCLUSION
The method described here is essentially based on an averaging procedure, with all the advantages and limitations of an approach of this kind. It exploits the particular integral of a specific form of the orbital equation, to which is assigned a crucial role. By suitably replacing the nonlinear term of the perturbed orbital equation with a constant, we build a virtual model of a fictitious planet on a circular orbit. The radius of this orbit differs very little, in a manner controlled by the perturbation, from the one in its absence. If we imagine traveling along the circular orbit of the unperturbed mean planet a distance equal to the circumference of the orbit of the fictitious mean planet, we will arrive slightly ahead of or behind the starting point. The operation $\Delta\omega$ extracts the angle, positive or negative, subtended by the small circular arc between the start and finish. The three worked examples have shown the validity of the method.

APPENDIX: AVERAGE FORCE EXERTED BY A PLANET ON MERCURY
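To avoid certain considerations about the center of mass, which do not affect the final result, we will reduce the problem to the essentials. In the Sun-centered reference system, if Mercury is located at $\mathbf{r}$, the direct force per unit mass $\mathbf{f}_n$ exerted on it by planet $n$, located at $\mathbf{r}_n$, is

$$\mathbf{f}_n = \mu_n\,\frac{\mathbf{r}_n - \mathbf{r}}{|\mathbf{r}_n - \mathbf{r}|^3}, \qquad r_n \equiv |\mathbf{r}_n| > |\mathbf{r}| \equiv r, \qquad r_n = \text{const.}, \tag{A1}$$

where $\mu_n$ is the gravitational parameter of planet $n$. Because we work in a plane, we can write the vectors in polar form via the complex exponentials

$$\mathbf{r}_n = r_n e^{i\theta_n}, \qquad \mathbf{r} = r e^{i\theta}, \qquad i = \sqrt{-1}, \tag{A2}$$

and treat them as complex numbers. If we put

$$c_n = \frac{r}{r_n} < 1, \qquad \phi_n = \theta_n - \theta, \tag{A3}$$

$$D_n = 1 + c_n^2 - 2c_n\cos\phi_n, \qquad d\phi_n = d\theta_n, \tag{A4}$$

then Eq. (A1) can be written in the form

$$\mathbf{f}_n = \mu_n\,\frac{r_n e^{i\theta_n} - \mathbf{r}}{|r_n e^{i\theta_n} - \mathbf{r}|^3} = \mu_n\left(\frac{e^{i\phi_n}e^{i\theta}}{r_n^2} - \frac{\mathbf{r}}{r_n^3}\right)\frac{1}{D_n^{3/2}}. \tag{A5}$$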
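The equivalence of Eqs. (A1) and (A5) can be checked numerically with complex arithmetic. The sketch below uses arbitrary, hypothetical values for the positions and for $\mu_n$; they serve only to exercise the identity and have no physical significance.

```python
import cmath
import math

# Hypothetical test values (arbitrary units), chosen so that r < r_n
mu_n = 1.0
r, theta = 0.39, 0.7
r_n, theta_n = 0.72, 2.1

# Position vectors as complex numbers, Eq. (A2)
R = r * cmath.exp(1j * theta)
R_n = r_n * cmath.exp(1j * theta_n)

# Eq. (A1): direct force per unit mass
f_direct = mu_n * (R_n - R) / abs(R_n - R) ** 3

# Eqs. (A3)-(A4): auxiliary quantities
c_n = r / r_n
phi_n = theta_n - theta
D_n = 1.0 + c_n**2 - 2.0 * c_n * math.cos(phi_n)

# Eq. (A5): the same force in factored form
f_factored = mu_n * (cmath.exp(1j * phi_n) * cmath.exp(1j * theta) / r_n**2
                     - R / r_n**3) / D_n**1.5

print("Eq. (A1):", f_direct)
print("Eq. (A5):", f_factored)
```

The check rests on the identity $|\mathbf{r}_n - \mathbf{r}|^2 = r_n^2 D_n$, which is what allows the denominator of Eq. (A1) to be written as $r_n^3 D_n^{3/2}$.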

The motions of the other planets and that of Mercury are rationally independent, in the sense that there is no simple numerical relationship between the periods. This means that the reciprocal positions on the respective orbits at any time are not related. It follows that we can consider the planet Mercury, at a generic position $\mathbf{r}$, to be affected by a secular force obtained by an averaging procedure.

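At this point it is useful to employ the factorization $D_n^{-3/2} = D_n^{-1} D_n^{-1/2}$. In fact, averaging with respect to $\phi_n$, the secular force at point $\mathbf{r}$ will be

$$\langle \mathbf{f}_n \rangle = \frac{1}{2\pi}\int_0^{2\pi} \mathbf{f}_n\, d\phi_n = \frac{\mu_n}{2\pi}\int_0^{2\pi} \frac{d\phi_n}{D_n}\left(\frac{e^{i\phi_n}}{r_n^2}\,\frac{\mathbf{r}}{r} - \frac{\mathbf{r}}{r_n^3}\right)\frac{1}{D_n^{1/2}} = \frac{\mu_n}{2\pi r_n^2}\,\frac{\mathbf{r}}{r}\int_0^{2\pi} \frac{d\phi_n}{D_n}\left(e^{i\phi_n} - c_n\right)\frac{1}{D_n^{1/2}}. \tag{A6}$$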

We now expand the function $D_n^{-1/2}$ in powers of $c_n$, and keep only the linear term

$$D_n(c_n)^{-1/2} \approx 1 + \frac{c_n}{2}\left(e^{i\phi_n} + e^{-i\phi_n}\right) + \cdots. \tag{A7}$$

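Then, after some algebra, Eq. (A6) becomes

$$\langle \mathbf{f}_n \rangle = \frac{\mu_n}{2\pi r_n^2}\,\frac{\mathbf{r}}{r}\int_0^{2\pi} \frac{e^{i\phi_n} - c_n/2}{D_n}\, d\phi_n. \tag{A8}$$

Here the product $(e^{i\phi_n} - c_n)\,D_n^{-1/2}$ has been expanded to first order in $c_n$ as $e^{i\phi_n} - c_n/2 + (c_n/2)e^{2i\phi_n}$, and the last term has been dropped because its integral is of higher order in $c_n$.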

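Using the standard trigonometric integral

$$\int_0^{2\pi} \frac{e^{i\phi_n}}{D_n}\, d\phi_n = \frac{2\pi c_n}{1 - c_n^2}, \qquad c_n < 1, \tag{A9}$$

together with the companion result $\int_0^{2\pi} d\phi_n/D_n = 2\pi/(1 - c_n^2)$, we finally get the average force exerted by the planet $n$ on Mercury,

$$\langle \mathbf{f}_n \rangle = \frac{\mu_n}{r_n^2}\,\frac{c_n}{2\left(1 - c_n^2\right)}\,\frac{\mathbf{r}}{r}, \tag{A10}$$

which is central and repulsive.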
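Both the integral (A9) and the averaged force (A10) can be compared with a direct numerical average. The sketch below works in units of $\mu_n/r_n^2$ and along the unit vector $\mathbf{r}/r$, so that the quantity averaged is the integrand of Eq. (A6), $(e^{i\phi_n} - c_n)/D_n^{3/2}$. The function names and the test values of $c_n$ are assumptions made for the illustration. Since Eq. (A10) keeps only the linear term of the expansion (A7), agreement with the unexpanded average is approximate and improves as $c_n$ decreases.

```python
import cmath
import math

def averaged_force(c_n, samples=4096):
    """Direct average of (e^{i phi} - c_n) / D_n^{3/2} over phi (no expansion)."""
    total = 0j
    for k in range(samples):
        phi = 2.0 * math.pi * k / samples
        D = 1.0 + c_n**2 - 2.0 * c_n * math.cos(phi)
        total += (cmath.exp(1j * phi) - c_n) / D**1.5
    return total / samples

def first_order_force(c_n):
    """Eq. (A10) in units of mu_n / r_n^2."""
    return c_n / (2.0 * (1.0 - c_n**2))

def integral_a9(c_n, samples=4096):
    """Numerical value of the left-hand side of Eq. (A9)."""
    total = 0j
    for k in range(samples):
        phi = 2.0 * math.pi * k / samples
        D = 1.0 + c_n**2 - 2.0 * c_n * math.cos(phi)
        total += cmath.exp(1j * phi) / D
    return 2.0 * math.pi * total / samples

c_small = 0.1  # hypothetical ratio r / r_n
direct = averaged_force(c_small)
approx = first_order_force(c_small)
print("direct average :", direct)   # imaginary (transverse) part should vanish
print("Eq. (A10)      :", approx)

c_test = 0.3   # hypothetical ratio used to test Eq. (A9)
lhs_a9 = integral_a9(c_test)
rhs_a9 = 2.0 * math.pi * c_test / (1.0 - c_test**2)
print("Eq. (A9) lhs, rhs:", lhs_a9, rhs_a9)
```

The vanishing imaginary part of the direct average confirms that the averaged force has no transverse component, and its positive real part confirms that it points away from the Sun.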


The average force (A10) must be converted to a form $f_n(u)$ suitable for insertion in the orbital equation (1). To do this, we omit the unit vector $\mathbf{r}/r$, then substitute $1/(r_n u)$ for $c_n$ and change the sign, obtaining

$$f_n(u) = -\frac{\mu_n u}{2r_n\left(r_n^2u^2 - 1\right)}. \tag{A11}$$

It follows that the function to be inserted in the perturbation portion of Eq. (1) is

$$\frac{f_n(u)}{l^2u^2} \equiv g_n(u) = -m_n\,\frac{\kappa}{2r_nu\left(r_n^2u^2 - 1\right)}, \tag{A12}$$

where we have used $l^2 = \mu/\kappa$, and so $m_n \equiv \mu_n/\mu$ is the planet/Sun mass ratio.
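The substitution leading from Eq. (A10) to Eqs. (A11) and (A12) can be written out explicitly. What follows is only a sketch of the algebra, using the symbols defined above:

$$\frac{\mu_n}{r_n^2}\,\frac{c_n}{2\left(1 - c_n^2\right)}\Bigg|_{c_n = 1/(r_nu)} = \frac{\mu_n}{r_n^2}\,\frac{1}{2r_nu}\,\frac{r_n^2u^2}{r_n^2u^2 - 1} = \frac{\mu_n u}{2r_n\left(r_n^2u^2 - 1\right)},$$

so that the change of sign gives Eq. (A11). Dividing by $l^2u^2$ and using $\mu_n/l^2 = \mu_n\kappa/\mu = m_n\kappa$,

$$\frac{f_n(u)}{l^2u^2} = -\frac{\mu_n}{l^2}\,\frac{1}{2r_nu\left(r_n^2u^2 - 1\right)} = -\frac{m_n\kappa}{2r_nu\left(r_n^2u^2 - 1\right)},$$

which is the right-hand side of Eq. (A12).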

a) Electronic mail: s.elmo@mail.com
1. A. Einstein, Erklärung der Perihelbewegung des Merkur aus der allgemeinen Relativitätstheorie, Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften (Seite, Berlin, 1915), pp. 831–839; The Collected Papers of Albert Einstein, edited by A. J. Kox, M. J. Klein, and R. Schulmann (Princeton U. P., Princeton, NJ 1996), Vol. 6, Doc. 24, pp. 112–116.
2. M. M. D’Eliseo, “Higher-order corrections to the relativistic perihelion advance and the mass of binary pulsars,” Astrophys. Space Sci. 332, 121–128 (2011).
3. B. Davies, “Elementary theory of perihelion precession,” Am. J. Phys. 51, 909–911 (1983).
4. N. Gauthier, “Periastron precession in general relativity,” Am. J. Phys. 55, 85–86 (1987).
5. T. Garavaglia, “The Runge-Lenz vector and Einstein perihelion precession,” Am. J. Phys. 55, 164–165 (1987).
6. C. Farina and M. Machado, “The Rutherford cross section and the perihelion shift of Mercury with the Runge-Lenz vector,” Am. J. Phys. 55, 921–923 (1987).
7. D. R. Stump, “Precession of the perihelion of Mercury,” Am. J. Phys. 56, 1097–1098 (1988).
8. K. T. McDonald, “Right and wrong use of the Lenz vector for non-Newtonian potentials,” Am. J. Phys. 58, 540–542 (1990).
9. K. Doggett, “Comment on ‘Precession of the perihelion of Mercury,’ by Daniel R. Stump,” Am. J. Phys. 59, 851 (1991).
10. S. Cornbleet, “Elementary derivation of the advance of the perihelion of a planetary orbit,” Am. J. Phys. 61, 650–651 (1993).
11. B. Dean, “Phase-plane analysis of perihelion precession and Schwarzschild orbital dynamics,” Am. J. Phys. 67, 78–86 (1999).

12. D. R. Brill and D. Goel, “Light bending and perihelion precession: A unified approach,” Am. J. Phys. 67, 316–319 (1999).
13. M. M. D’Eliseo, “The first-order orbital equation,” Am. J. Phys. 75, 352–355 (2007).
14. T. J. Lemmon and A. R. Mondragon, “Alternative derivation of the relativistic contribution to perihelic precession,” Am. J. Phys. 77, 890–893 (2009).
15. N. Grossmann, The Sheer Joy of Celestial Mechanics (Birkhäuser, Boston, MA, 1996), p. 32.
16. We shall consider cases in which the independent variable is $u$, $v$, and $\kappa$. Sometimes a primed expression can also mean a derivative evaluated at some point. This practice, here introduced for a graphical cleanliness of the formulas, while potentially confusing, actually causes no problems because the context always makes clear what is intended.
17. R. d’Inverno, Introducing Einstein’s Relativity (Oxford U. P., New York, NY, 2001), p. 194.
18. The Mathematical Papers of Isaac Newton, edited by D. T. Whiteside (Cambridge U. P., Cambridge, 2008), Vol. III, 1670–1673, 169–173.
19. K. Kendig, Conics (The Mathematical Association of America, Washington, DC, 2005), p. 243.
20. V. G. Szebehely and H. Mark, Adventures in Celestial Mechanics, 2nd ed. (Wiley, Hoboken, NJ, 1998), Chap. 11, pp. 221–245.
21. M. M. D’Eliseo, “The quasi-elliptic motion of the Moon,” Chin. J. Phys. 50, 720–731 (2012).
22. C. D. Olds, Continued Fractions (Random House, New York, NY, 1963).
23. G. Clemence, “The relativity effects in planetary motions,” Rev. Mod. Phys. 19, 361–364 (1947).
24. M. M. D’Eliseo, “Central forces and secular perihelion motion,” Can. J. Phys. 85, 1045–1054 (2007).
25. H. Goldstein, C. Poole, and J. Safko, Classical Mechanics, 3rd ed. (Addison Wesley, San Francisco, CA, 2002), pp. 536–538.
26. A. Hall, “A suggestion in the theory of Mercury,” Astron. J. XIV, 49–51 (1894).
27. N. T. Roseveare, Mercury’s perihelion from Le Verrier to Einstein (Clarendon Press, Oxford, 1982).
28. The residue secular effect would be of order Q n 2n. Therefore it can be neglected.
29. P. Van de Kamp, Elements of Astromechanics (Freeman, San Francisco, CA, 1964), p. 65.
30. M. M. D’Eliseo, “Orbital averages and the secular variations of the orbits,” S. Elmo Obs. Technical Report, 1999, available at <http://vixra.org/pdf/1305.0100v1.pdf>.
31. C. D. Murray and S. F. Dermott, Solar System Dynamics (Cambridge U. P., New York, NY, 1999), Appendix A, pp. 526–530.
32. In Ref. 31, p. 294, $\omega$ depends on time, and so the angular mean motion $n$ multiplies the constants $A$ and $B$. Since the relation between $\theta$ and $t$ is $\theta = nt$ + periodical terms (Ref. 30, p. 41), it is evident that changing, in the equation of the secular motion, the variable from $t$ to $\theta$ leaves the two constants $A$ and $B$ unchanged except that the factor $n$ disappears.
