Einstein's perihelion formula and its generalization
Maurizio M. D'Eliseo^{a)}
Osservatorio S.Elmo - Via A.Caccavello 22, 80129 Napoli, Italia
(Received 7 July 2014; accepted 19 November 2014)

Citation: American Journal of Physics 83, 324 (2015); doi: 10.1119/1.4903166
View online: http://dx.doi.org/10.1119/1.4903166
Published by the American Association of Physics Teachers

This article is copyrighted as indicated in the article. Reuse of AAPT content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 131.156.157.31 On: Wed, 25 Mar 2015 16:20:35

Einstein's perihelion advance formula can be given a geometric interpretation in terms of the curvature of the ellipse. The formula can be obtained by splitting the constant term of an auxiliary polar equation for an elliptical orbit into two parts that, when combined, lead to the expression of this relativistic effect. Using this idea, we develop a general method for dealing with orbital precession in the presence of central perturbing forces, and apply the method to the determination of the total (relativistic plus Newtonian) secular perihelion advance of the planet Mercury. © 2015 American Association of Physics Teachers. [http://dx.doi.org/10.1119/1.4903166]

I.
INTRODUCTION

A classic calculation in the scientific literature is the derivation of the formula by which Einstein explained an apparent anomaly of the observed motion of the planet Mercury.[1] A planet's perihelion remains fixed under a pure inverse-square gravitational force, so any shift indicates, as first realized by Newton, either a different force law or the presence of perturbing forces. The perihelion of Mercury is observed to precess, after correction for known planetary perturbations, at the rate of about 43 arc sec per century, and this residue is accounted for by the theory of general relativity. To derive approximately the relativistic contribution to the precession (there are further corrections of negligible relevance[2]), it is not necessary to solve the relativistic orbit equation completely. In his original derivation, Einstein came upon an elliptic integral, which he managed to compute approximately. Since then, a host of authors in this journal[3–14] have taken alternative approaches to illuminate various aspects of this problem.

Our approach to the subject arises from the interplay of two quite different methodological strategies, which we can define as the local and the global. The first shows that the perihelion precession produced by a perturbing force can be traced back to a steady action along the entire orbit. The second deals directly with this secular effect by splitting the constant term of an auxiliary polar equation of an elliptical orbit into two parts according to a specific criterion, without needing to know what takes place during the motion. Our method is applicable to a broad class of perturbing central forces and provides the leading term of the secular perihelion shift.

II.
RELATIVISTIC EQUATION

The polar equation for the orbit of a planet, considered as a test particle subject to a central force $f(u)$, is given by the well-known expression[15]

$$u'' + u = \frac{f(u)}{\ell^2 u^2}. \tag{1}$$

Here, $u(\theta) = 1/r$, with $r$ the distance from the origin, while $\ell$ is the (constant) magnitude of the orbital angular momentum, and a prime denotes differentiation with respect to the independent variable, which in this case is the angle $\theta$.[16]

In general relativity, the spherically symmetric Schwarzschild solution to Einstein's field equation corresponds, for a weak field, to a function $f(u)$ consisting of two attractive parts, the classical inverse-square force and a small corrective term[17]

$$f(u) = \mu u^2 + 3\alpha \ell^2 u^4. \tag{2}$$

Here, $\mu = GM$, with $G$ the universal gravitational constant and $M$ the mass of the star, and $\alpha = \mu/c^2$, with $c$ the speed of light. The parameter $\alpha$ has the dimension of a length and is called the gravitational radius; for the Sun we have $\alpha \approx 1.477$ km, a very small value compared to typical orbital radii in our solar system. From Eqs. (1) and (2), we get the relativistic orbit equation

$$u'' + u = \frac{\mu}{\ell^2} + 3\alpha u^2, \tag{3}$$

which represents an oscillator with a weak quadratic nonlinearity. This equation cannot be solved exactly. Using methods of perturbation theory, a bounded periodic (planetary case) approximate solution can be painstakingly assembled to arbitrarily high order in the coupling constant $\alpha$, allowing a determination of the precession.[2]

Our plan here is to bypass the solution of Eq. (3) and, more generally, to extract directly from a perturbed orbital equation the leading precession term through a simple linearization process that consists of replacing the nonlinear term by a constant. This procedure will be discussed in detail after we have dealt with some basic aspects of the elliptical orbit.

III.
ELLIPTICAL ORBIT

The unperturbed orbit equation is

$$u'' + u = \frac{\mu}{\ell^2}. \tag{4}$$

The constant term $\mu/\ell^2$ in this equation can be given a geometric meaning by exploiting a result of Newton's that dates back to 1671.[18] Newton found that a generic plane curve $u(\theta)$ satisfies the equation

$$u'' + u = \frac{1}{\rho \sin^3\beta}, \tag{5}$$

where $\rho$ is the radius of curvature and $\beta$ is the angle between the radial and tangential directions at any point on the curve. Setting $\beta = \pi/2$, it follows that the orbit equation can be written in the form

$$u'' + u = \kappa, \tag{6}$$

where $\kappa \equiv 1/\rho = \mu/\ell^2$ is the curvature at those particular points. For a planetary orbit this identification occurs in two circumstances: when the orbit is circular (in which case $\kappa = 1/r$ is always true) and at the extrema of an elliptical orbit (perihelion and aphelion). In the latter case, the curvature is maximal and can be simply expressed in terms of the ellipse parameters (eccentricity $e$, semi-major axis $a$, and semi-minor axis $b$).[19]

The solution to Eq. (6), written so as to highlight this geometrical aspect, is obtained by starting with the particular solution $u = \kappa$ and adding to it the periodic solution to the homogeneous equation. The orbit can thus be written in the form

$$u(\theta) = \kappa + \kappa e \cos(\theta - \omega), \tag{7}$$

where $e$ and $\omega$ are the two arbitrary constants that parametrize the family of solutions. This is the polar equation of an ellipse when the eccentricity $e$ lies in the open interval (0, 1); the polar axis has been chosen so that $u$ attains its maximum value at $\theta = \omega$, the phase that identifies the position of the perihelion. The period of the solution is $2\pi$, so that the perihelia are located at $\theta = \omega + 2n\pi$, with $n = 0, 1, \ldots$.
Now, the particular solution $u = \kappa$ of Eq. (4) represents a circular orbit of radius $r = \rho$, and from Eq. (7) we find that $\kappa = u(\omega + \pi/2)$. Then, by comparison with the standard polar equation for an ellipse, we deduce that the curvature is given by $\kappa = 1/\rho = 1/[a(1 - e^2)] = a/b^2$ at perihelion.

IV. ANGULAR PERIOD OPERATOR

A definite integral provides global information about the behavior of a function in the interval of integration. As a limit of Riemann sums, it is the result of a pointwise accumulation process, and this proves useful for the detection of secular effects on the orbit. Unlike Einstein's elliptic integral, which is quite complicated to handle, the integral we shall use is elementary, because the integrand is the function $u(\theta)$, the sum of a constant and a cosine.

When plotted in rectangular coordinates $(\theta, u)$, with $\theta$ in the range $[\omega, \omega + 2\pi]$, the function $u = \kappa + \kappa e \cos(\theta - \omega)$ traces out a sinusoid of amplitude $\kappa e$ around the segment $u = \kappa$ of length $2\pi$. The oscillation begins and terminates at two successive perihelia, the points $(\omega, \kappa + \kappa e)$ and $(\omega + 2\pi, \kappa + \kappa e)$, while in between we have one aphelion at the point $(\omega + \pi, \kappa - \kappa e)$. It follows that $\kappa$ is the average value of the elliptical solution over the interval $[\omega, \omega + 2\pi]$,

$$\frac{1}{2\pi}\int_{\omega}^{\omega + 2\pi} u \, d\theta = \kappa, \tag{8}$$

and so $u = \kappa$ is the circular orbit of an imaginary planet associated with the planet that follows the elliptical orbit (7), with which it shares the same period. The integral

$$\int_{\omega}^{\omega + 2\pi} u \, d\theta = 2\pi\kappa \tag{9}$$

measures the area under the graph of $u$ in the interval $[\omega, \omega + 2\pi]$, and is equal to the area of the rectangle of width $2\pi$ and height $\kappa$. Although representing the area of a plane figure, this result can also be interpreted as the circumference of a circle of radius $\kappa$. Therefore, a division by $\kappa$ will provide the period of the orbit, i.e., the distance (from $\omega$) along the $\theta$-axis after which the function $u$ repeats itself.
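As a quick numerical illustration of Eqs. (8) and (9), the following minimal Python sketch averages $u(\theta)$ over one angular period with a midpoint rule. The function name `mean_inverse_radius`, the step count, and the phase value 0.3 are illustrative assumptions, not part of the original derivation; the values of $\kappa$ and $e$ are those quoted for Mercury later in the text.

```python
import math

def mean_inverse_radius(kappa, e, omega, steps=10_000):
    """Midpoint-rule average of u(theta) = kappa * (1 + e*cos(theta - omega))
    over one angular period [omega, omega + 2*pi], as in Eq. (8)."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = omega + (i + 0.5) * h
        total += kappa * (1.0 + e * math.cos(theta - omega))
    return total * h / (2.0 * math.pi)

# Illustrative values: kappa in 1/AU and e as quoted for Mercury;
# omega = 0.3 rad is an arbitrary phase chosen for this sketch.
kappa, e, omega = 2.6973, 0.2056, 0.3
avg = mean_inverse_radius(kappa, e, omega)

# Eq. (9) divided by kappa gives the angular period of the unperturbed orbit.
period = avg * 2.0 * math.pi / kappa
print(avg, period)
```

The average should agree with $\kappa$ and the period with $2\pi$ to rounding error, independently of the chosen phase.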
In general, given a real number $s$, we can consider the integration

$$\hat{P}u(s\theta) \equiv \frac{1}{\kappa}\int_{\omega}^{\omega + 2\pi/s} u(s\theta)\, d\theta = 2\pi\,\frac{1}{s} \tag{10}$$

as the action of an operator $\hat{P}$ that, when acting on the function $u(s\theta)$, results in the angular period (defined as the angle separating two successive passages of the planet through the perihelion). The value of the factor $s$ for a perfectly elliptical orbit is unity, but we are anticipating the possibility that the function $u$ undergoes the dilation (stretching or shrinking) $u(\theta) \to u(s\theta)$ as a result of a perturbation.

With Eq. (10) we have carried out a measurement. The operator $\hat{P}$ is tuned on the circular orbit $u = \kappa$ of the imaginary mean planet associated with the elliptical solution (7), and any variation of the common angular period of these planets, in the case of perturbed motion, will be detected. The relative increment $\Delta\omega$ of the angular position of the perihelion over a complete orbit is obtained when we subtract from $\hat{P}u(s\theta)$ a full turn,

$$\Delta\omega \equiv \hat{P}u(s\theta) - 2\pi = 2\pi\left(\frac{1}{s} - 1\right). \tag{11}$$

Thus, the perihelion shift will be positive or negative according to whether $1/s$ is greater than or less than one.

Now let us multiply Eq. (10) by a positive integer $n$. The effect of $n\hat{P}$ is equivalent to carrying out the operation $\hat{P}$ $n$ times in succession, assuming, for $1 \le i \le n$, that the terminal condition of the $(i-1)$st operation becomes the initial condition for the $i$th operation:

$$n\hat{P}u = \frac{n}{\kappa}\int_{\omega}^{\omega + 2\pi/s} u \, d\theta = \frac{1}{\kappa}\sum_{i=1}^{n}\int_{\omega + 2(i-1)\pi/s}^{\omega + 2i\pi/s} u \, d\theta = \frac{1}{\kappa}\int_{\omega}^{\omega + 2n\pi/s} u \, d\theta \equiv \hat{P}^n u = 2\pi\,\frac{n}{s}. \tag{12}$$

Therefore, $n$ can be embedded in the upper limit of the integral, which we denote by the symbol $\hat{P}^n$, representing the $n$-fold composition of $\hat{P}$ with itself. Thus, the equality $\hat{P}^n u = 2\pi(n/s)$ is a consequence of the fact that the successive perihelia are evenly spaced with separation $2\pi(1/s)$.

V. THE ROTATING ELLIPSE

An ellipse in polar form is specified by the elements $a$, $e$, and $\omega$, which fix, respectively, the size, shape, and orientation of the ellipse in the plane.
When a small perturbing force acts on the system, a viable approach is to treat these "constants" as variable. In our context, we should allow for variations of $\kappa$ and $\omega$, the elements $a$ and $e$ being contained in $\kappa$ (where they can vary independently).

In a specific application of this technique, the solution in the form $u(\theta) = \kappa + \kappa e \cos(\theta - \omega)$ is retained with $\omega$ no longer constant, but treated as a slowly varying function compared to $\cos\theta$. The function is a solution to a first-order differential equation solvable by successive approximations.[20,21] For each value of $\omega(\theta)$ we can define an orbit, the osculating ellipse, with an orientation that varies according to the specific form of the function. In general, this is such that the perihelion is both oscillating and circulating, the latter eventuality being due to a part linear in $\theta$, which controls the secular perihelion shift.

The linear part produces a variation of the orbital period by a dilation of $u(\theta)$. To see this, assume $\omega(\theta) = (1 - s)\theta + \omega$, where $s$ is a real number (in actual applications $s$ will be very nearly equal to unity), and $\omega = \omega(0)$ is the initial position of the perihelion. Then,

$$\kappa + \kappa e \cos[\theta - \omega(\theta)] = \kappa + \kappa e \cos\{\theta - [(1 - s)\theta + \omega]\} = \kappa + \kappa e \cos(s\theta - \omega) = u(s\theta), \tag{13}$$

where $u(s\theta)$ is solution to the equation

$$u'' + s^2 u = \kappa. \tag{14}$$

Equation (13) shows that the phase of the sinusoid $u$ shifts in $\theta$ at a constant rate, so that in the plane $(\theta, u)$ the graph of $u$ can be imagined as a cosine wave traveling smoothly in the positive or negative $\theta$-direction at the uniform rate $1 - s$. With the available tools we can analyze some interesting qualitative aspects of a rotating ellipse.
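The action of the operator $\hat{P}$ on the dilated function $u(s\theta)$ of Eq. (13) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name `angular_period`, the midpoint-rule quadrature, and the sample value $1/s = 1.001$ are choices made here, not taken from the paper.

```python
import math

def angular_period(kappa, e, omega, s, steps=20_000):
    """Approximate (1/kappa) * integral of u(s*theta) over
    [omega, omega + 2*pi/s] by the midpoint rule, as in Eq. (10),
    with u(s*theta) = kappa + kappa*e*cos(s*theta - omega) from Eq. (13)."""
    width = 2.0 * math.pi / s
    h = width / steps
    total = 0.0
    for i in range(steps):
        theta = omega + (i + 0.5) * h
        total += kappa * (1.0 + e * math.cos(s * theta - omega))
    return total * h / kappa

# Hypothetical dilation factor: 1/s = 1 + eta with eta = 1e-3 (chosen for visibility).
eta = 1e-3
s = 1.0 / (1.0 + eta)
P = angular_period(2.6973, 0.2056, 0.3, s)

# Eq. (11): the shift per orbit is P - 2*pi = 2*pi*(1/s - 1).
shift = P - 2.0 * math.pi
print(P, shift)
```

With these inputs the shift should come out as $2\pi\eta$, positive because $1/s > 1$.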
For what follows, it is better to visualize the orbit as represented in the polar plane $(r, \theta)$, where the perihelia lie on the circle of radius $r = a(1 - e)$. If we assume $1/s = 1 + \eta$ for $0 < \eta \ll 1$, then after $n$ orbital turns the relative shift of the perihelion from its initial position will be

$$\Delta_n\omega \equiv \hat{P}^n u(s\theta) - 2n\pi = 2n\pi\eta. \tag{15}$$

Thus, the operation $\Delta_n\omega$ maps the orbit $u(s\theta)$ to its $n$th tangency point with the circle of the perihelia. This point is identified by the angle $2n\pi\eta$, reckoned from an initial position $\omega$.

We wish to consider $\Delta_n\omega$ as a dynamical system, and study the behavior of its "orbits," i.e., we want to understand the behavior of the sequences $\Delta_n\omega$, for $n = 1, \ldots, \infty$. Think of $n = 1, 2, \ldots$ (rotation number) as a sequence of times. Then one can liken $\Delta_n\omega$ to a stroboscope that flashes briefly at these times, showing the planet at its successive perihelia. Two different situations can occur. If $\eta$ is a rational number, say $p/q$, then $\Delta_q\omega = 2p\pi$, so that after $q$ flashes the perihelion is back at the start. This means that the orbital path closes; the orbit is periodic and takes the form of a rosette. If $\eta$ is an irrational number, the orbital path never closes and in the long run fills more and more densely the annulus $a(1 - e) \le r \le a(1 + e)$, in the sense that the trajectory of the planet intersects every neighborhood on the annulus, no matter how small. More specifically, the sequence of points $\Delta_n\omega$ will densely cover the circle of the perihelia, but each particular perihelion will never again be attained. It follows that the map $\Delta_n\omega$ is quasi-periodic, as made explicit by the following recurrence theorem: in the long run, a planet comes, on the circle of the perihelia, an infinite number of times arbitrarily close to any position already occupied.
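The rational case of the stroboscope picture can be illustrated with exact arithmetic. In the sketch below, the perihelion position after $n$ turns is expressed in units of a full turn, i.e., as the fractional part of $n\eta$; the helper name `perihelion_turns` and the sample value $\eta = 3/7$ are hypothetical choices (far larger than any realistic $\eta$) made only so that the closing of the rosette is visible after a few flashes.

```python
from fractions import Fraction

def perihelion_turns(eta, n_max):
    """Perihelion shift after n = 1..n_max orbits, Eq. (15), reduced modulo
    one full turn: returns the fractional part of n*eta for each n."""
    return [(n * eta) % 1 for n in range(1, n_max + 1)]

# Hypothetical rational shift per orbit, eta = p/q with p = 3, q = 7.
eta = Fraction(3, 7)
positions = perihelion_turns(eta, 14)

# First flash at which the perihelion is back at its starting position.
first_return = next(n for n, x in enumerate(positions, start=1) if x == 0)
print(first_return, positions[:7])
```

For this choice the perihelion visits seven distinct positions and then repeats the same cycle, which is the closed rosette described above. An irrational $\eta$ cannot be represented exactly this way, so the dense-covering case is not covered by this sketch.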
To demonstrate this, we can use the continued fraction approximations[22]

$$\eta = \cfrac{1}{s_1 + \cfrac{1}{s_2 + \cdots}}. \tag{16}$$

Truncating at each successive stage gives an infinite sequence of rational approximants (the convergents)

$$\eta \approx \frac{1}{s_1},\ \frac{s_2}{1 + s_1 s_2},\ \ldots \equiv \frac{p_1}{q_1},\ \frac{p_2}{q_2},\ \ldots,\ \frac{p_n}{q_n},\ \ldots, \tag{17}$$

where the integers $p_n$ and $q_n$ are coprime (their only common factor is 1) and $q_n > q_{n-1}$. The convergents $p_n/q_n$ play a role analogous to that of the partial sums of an infinite series. It can be shown[22] that for each $n$, the difference from $\eta$ is less than $1/q_n^2$,

$$\left|\eta - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2}, \tag{18}$$

and that these are the best rational approximations there are, in the sense that no rational fraction with a denominator not exceeding the denominator of the convergent does better. Then, multiplying Eq. (18) by $2q_n\pi$, we get

$$|2q_n\pi\eta - 2p_n\pi| = |\Delta_{q_n}\omega - 2p_n\pi| < 2\pi/q_n. \tag{19}$$

Now, for a given tolerance, however small, we can find a positive integer $n_0$ such that the middle term of Eq. (19) is smaller than this tolerance for all values of $n$ greater than or equal to $n_0$. So $\Delta_{q_n}\omega$ gets closer and closer to $2p_n\pi$, a whole number of turns, which corresponds to the initial position. By imagining suitable rotations of the circle of the perihelia (changes of origin), this reasoning extends to the generality of the perihelia.

VI. A SMALL CHANGE OF CURVATURE

Suppose now that the constant $\kappa$, this structural component of the polar equation, is altered somewhat by adding (in an algebraic sense) a small piece $\delta\kappa$, thus affecting the maximal curvature of the orbital ellipse. This means that, for a given eccentricity, we have changed the semi-major axis of the elliptical orbit and the orbital radius of the mean planet. Then the solution $u$ will become

$$u = (\kappa + \delta\kappa) + (\kappa + \delta\kappa)\, e \cos(\theta - \omega), \tag{20}$$

and, acting on it with $\hat{P}$, we obtain

$$\hat{P}u = 2\pi\left(1 + \frac{\delta\kappa}{\kappa}\right). \tag{21}$$

The shift $2\pi\,\delta\kappa/\kappa$ will be an increase or a decrease, depending on the algebraic sign of $\delta\kappa$. We wonder whether it is possible to build a suitable increment $\delta\kappa$ that encapsulates the presence of a central perturbing force.
In this way we would obtain a phenomenological derivation of the perihelion precession. Notice that, while Eq. (13) is a two-way relationship between $\omega(\theta)$ and $u(s\theta)$, an analogous, direct link between $\omega(\theta)$ and $\delta\kappa/\kappa$ does not exist. Because Eqs. (10) and (21) both express a perihelion shift, we can establish indirectly a relationship between $\omega(\theta)$ and $\delta\kappa/\kappa$ via the dilation factor $s$, by identifying the right-hand sides of Eqs. (10) and (21); in this way we get

$$2\pi\,\frac{1}{s} = 2\pi\left(1 + \frac{\delta\kappa}{\kappa}\right). \tag{22}$$

Now multiply both sides by $\kappa$. Then the area under the graph of $u(s\theta)$ in the interval $[0, 2\pi/s]$ is made equal to that of the rectangle of width $2\pi$ and of height $\kappa + \delta\kappa$. We assume that this area is an invariant of the perturbed system, in the sense that if we imagine bringing the height of the rectangle to the value $\kappa$, which reflects the geometry of the actual system, its width will vary by a factor $1/s - 1$. This indicates that, for the dynamic system, the addition of a $\delta\kappa$ should be understood as a virtual producer of a dilation of the function $u$.

As long as $|\delta\kappa| \ll \kappa$, from Eq. (22) we get, to first order in $\delta\kappa/\kappa$,

$$s = 1 - \frac{\delta\kappa}{\kappa}, \tag{23}$$

and, by Eq. (15) with $n = 1$, we have the identification $\delta\kappa/\kappa = \eta$. So either $s$ or $\delta\kappa$ can be used for the determination of the perihelion shift, and if we can connect one of them to the physics of the problem, we will have also determined the other, and vice versa. Further, we have

$$\omega(\theta) = (1 - s)\theta + \omega = \frac{\delta\kappa}{\kappa}\,\theta + \omega, \tag{24}$$

which expresses the secular shift of the perihelion in terms of the virtual relative increment of curvature of the elliptical orbit at its extrema in the presence of the perturbation.

VII. RELATIVISTIC PRECESSION

In dealing with the nonlinear Eq.
(3), we can imagine that the orbital equation that contains all information on the perihelion precession of a planet, say Mercury, takes the linear form

$$u_m'' + u_m = \kappa_m, \qquad \kappa_m = \text{const}. \tag{25}$$

In light of the results of Sec. VI, we need an interpretation of the symbol $\kappa_m$. We must assume that $\kappa_m$ is not exactly equal to the constant $\mu/\ell^2$, which applies only to the inverse-square force. So we split up $\kappa_m$ into two unequal parts: a dominant part $\kappa = \mu/\ell^2$, and a much smaller secondary part $\delta\kappa \equiv \kappa_m - \kappa$, which encodes information about the function $3\alpha u^2$. With this interpretation, Eq. (25) should be considered as an auxiliary linear equation whose particular integral is what interests us. Then Mercury will have the incremental orbital shift

$$\Delta\omega = 2\pi\,\frac{\delta\kappa}{\kappa}. \tag{26}$$

To implement this result, we rewrite the relativistic orbit equation as

$$u'' + u = \kappa + \underbrace{3\alpha u^2}_{\delta\kappa}, \tag{27}$$

where we still do not know where $\delta\kappa$ comes from. We only know that the dimensionality of $\delta\kappa$ must be the inverse of a length, so that $\delta\kappa/\kappa$ is dimensionless. We tentatively guess $\delta\kappa \equiv 3\alpha\kappa^2$, obtained by substituting $\kappa$ for $u$ in the last term of Eq. (27), and then we attempt a perturbative approach in which we take as the first approximate solution just the circular orbit that we have associated with the elliptical orbit. This procedure works, because from

$$u'' + u = \kappa + 3\alpha\kappa^2, \tag{28}$$

and from Eq. (26), we get the Einstein formula

$$\Delta\omega = 6\pi\alpha\kappa = \frac{6\pi\alpha}{a(1 - e^2)}, \tag{29}$$

which shows that the relativistic precession is proportional to the maximum curvature $\kappa = a/b^2$ of the elliptical orbit.

Consider, for example, the actual figures for the orbit of Mercury. From $a = 0.3871$ AU (one Astronomical Unit is exactly 149,597,870.7 km) and $e = 0.2056$, we obtain $\kappa = 1/[a(1 - e^2)] \approx 2.6973$ AU$^{-1}$; in addition, from $M \approx 1.989 \times 10^{30}$ kg we get $3\alpha = 3GM/c^2 \approx 4.4309$ km, which corresponds approximately to $2.96187 \times 10^{-8}$ AU. Then $\delta\kappa = 3\alpha\kappa^2 \approx 2.1549 \times 10^{-7}$ AU$^{-1}$, so that Eq.
(28) takes the numerical form

$$u'' + u = 2.6973 + 2.1549 \times 10^{-7}, \tag{30}$$

where obviously the two addends must be kept separate. Then we get

$$\Delta\omega = 2\pi\,\frac{2.1549 \times 10^{-7}}{2.6973} \approx 5.0197 \times 10^{-7}\ \text{rad}, \tag{31}$$

corresponding to 0.1035 arc sec per revolution. Mercury revolves about the Sun 415.2 times in a century, so we have $\Delta\omega_{\rm sec} = 0.1035\,(415.2) = 42.97$ arc sec.

Astronomical data[23] show that the total dynamic secular perihelion shift of Mercury is about $574.09 \pm 0.41$ arc sec per century, of which $531.50 \pm 0.85$ arc sec is accounted for by the disturbances of the other planets. This corresponds, in Eq. (30), to an additional numerical constant, evidently of order $10^{-6}$, that we shall calculate below to a precision within the relative standard observational uncertainty.

VIII. CENTRAL PERTURBING FORCES

Our approach to the relativistic precession, supported by a chain of heuristic arguments, can be made systematic and general. Let us consider a polar orbital equation with a nonlinear term $\epsilon g(u)$, where $\epsilon$ is a parameter small enough to justify a perturbative approach. To this we associate a linear equation with a constant term $\delta\kappa$, having the dimension of inverse length:

$$u'' + u = \kappa + \underbrace{\epsilon g(u)}_{\delta\kappa}. \tag{32}$$

For the relativistic equation, where $\epsilon g(u) = 3\alpha u^2$, we have verified that the association $\epsilon g(\kappa) \to \delta\kappa$ works. But it turns out that such a simple recipe applies only to this case. Thus, we need a procedure that assigns to each specific function $\epsilon g(u)$ its $\delta\kappa$, while preserving the relativistic result. To find this procedure, because of the dual role of $s$ and $\delta\kappa$, we shall determine first the dilation factor $s$ by means of a variational technique.
We will assume, in the presence of a perturbation, a small variation of the solution $u$ from the circular value $u = \kappa$ and then see what happens. The variation equation can be formally obtained by applying the operator $\delta$ such that

$$\delta(u'') = (u + \delta u)'' - u'' = \delta u'', \tag{33}$$

where the combination $\delta u$ should be viewed as a single entity. Now consider Eq. (32), in which $g(u)$ is a function that is continuously differentiable in the closed interval $[u_{\min}, u_{\max}]$. We apply $\delta$ to both sides of this equation, and this yields

$$\delta u'' + \delta u = \epsilon g'(u)\, \delta u. \tag{34}$$

Assuming as reference motion the circular orbit $u = \kappa$, we evaluate the derivative on the right-hand side at the point $\kappa$. In this way we obtain the homogeneous equation

$$\delta u'' + \underbrace{[1 - \epsilon g'(\kappa)]}_{s^2}\, \delta u = 0, \tag{35}$$

which implies $s \approx 1 - \epsilon g'(\kappa)/2$, and so the lowest-order solution $u + \delta u$ to Eq. (32) can be written in the form of the function $u(s\theta)$ of Eq. (13), with $s = 1 - \eta$ and $\eta = \epsilon g'(\kappa)/2$. It is interesting to note that, according to Eq. (35), it is possible to replace in Eq. (1) the actual perturbing force $\epsilon g(u)\ell^2 u^2$ with the inverse-cube force $\epsilon g'(\kappa)\ell^2 u^3$, in agreement with Newton's theorem on revolving orbits.[13] In conclusion, from Eqs. (23) and (35) we get

$$\delta\kappa = \frac{1}{2}\,\kappa\,\epsilon g'(\kappa), \tag{36}$$

which establishes the correct relationship between $\delta\kappa$ and the function $\epsilon g(u)$ in the orbital equation. The resulting displacement of the perihelion per revolution will be given by

$$\Delta\omega = 2\pi\,\frac{\delta\kappa}{\kappa} = \pi\epsilon g'(\kappa). \tag{37}$$

This formula shows the key role played by the maximum curvature of the ellipse in the phenomenon of planetary precession. We realize also why the substitution $u \to \kappa$ in $\epsilon g(u) = \epsilon u^2$ works only for the relativistic orbital equation. The reason is that if we assemble the initial-value problem

$$\frac{1}{2}\,\kappa\, g'(\kappa) = g(\kappa), \qquad g(1) = 1, \tag{38}$$

we find that it has the unique solution $g(\kappa) = \kappa^2$, and this explains why the relativistic perturbing force is the only one for which both approaches give the same result.
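As a consistency check, Eq. (37) can be evaluated for the relativistic term $\epsilon g(u) = 3\alpha u^2$ with the figures quoted for Mercury in Sec. VII. The following Python sketch is illustrative only: the function name `shift_per_orbit` and the constant `ARCSEC_PER_RAD` are names introduced here, while the numerical inputs ($a$, $e$, $3\alpha$, and 415.2 revolutions per century) are the ones given in the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # conversion factor, radians to arc sec

def shift_per_orbit(eps, g_prime, kappa):
    """Eq. (37): perihelion shift per revolution, pi * eps * g'(kappa), in radians."""
    return math.pi * eps * g_prime(kappa)

# Figures quoted in the text for Mercury (lengths in AU).
a, e = 0.3871, 0.2056
three_alpha = 2.96187e-8
kappa = 1.0 / (a * (1.0 - e * e))      # maximum curvature of the ellipse

# Relativistic term: eps = 3*alpha and g(u) = u**2, hence g'(u) = 2*u.
dw = shift_per_orbit(three_alpha, lambda u: 2.0 * u, kappa)
per_century = dw * ARCSEC_PER_RAD * 415.2

print(kappa, dw, per_century)
```

The per-orbit value reproduces Eq. (31) to the digits quoted there, and the centennial shift comes out close to 43 arc sec; small differences from the 42.97 arc sec in the text arise from rounding the intermediate value 0.1035.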
The presence of the normalization factor $1/\kappa$ in the structure of the operator $\hat{P}$ suggests the formal simplification

$$\hat{P}u = \frac{1}{\kappa}\oint u \, d\theta = \oint \frac{u}{\kappa}\, d\theta \equiv \oint v \, d\theta \equiv \hat{Q}v, \tag{39}$$

allowing us to write the orbital equations in dimensionless form, for which the circular mean orbit is $v = 1$. This device sometimes simplifies the mathematics, and is often used for theoretical analysis.[2] The dimensionality can be restored at any stage by suitably reintroducing the factor $\kappa$. The dimensionless form of Eq. (32) is obtained via the substitution $u \to v$, $\kappa \to 1$, and then dropping the $\kappa$ from $\delta\kappa$:

$$v'' + v = 1 + \delta, \qquad \delta \equiv \frac{1}{2}\,\epsilon g'(1), \tag{40}$$

so that now, employing the operator $\hat{Q}$, we have $\Delta\omega = 2\pi\delta$.

To illustrate the use of this form in a simple application, we derive the perihelion shift that arises by supposing[26] that the exponent in Newton's law $\mu v^2$ is changed to a value slightly different from 2. This was one of the many pre-relativistic attempts to modify the law of gravitation in order to explain the motion of Mercury.[27] Let us put $f(v) = \mu v^{2+\epsilon} = \mu v^2 v^{\epsilon}$ in the dimensionless form of Eq. (1). If $\epsilon$ is small enough, we can limit ourselves to the first-order approximation

$$v^{\epsilon} \approx 1 + \epsilon \ln(v). \tag{41}$$

To the resulting orbit equation

$$v'' + v = 1 + \epsilon \ln(v), \tag{42}$$

by Eq. (40), with $g(v) = \ln(v)$, we associate the equation

$$v'' + v = 1 + \frac{\epsilon}{2}, \tag{43}$$

and so we get $\Delta\omega = \pi\epsilon$. Here, we do not have to make any dimensional adjustment, because $\epsilon$ is a pure number whose choice is made to fit the motion of Mercury. To this lowest degree of approximation, the shift is the same for all planets.[24,25]

IX. NEWTONIAN PRECESSION

A. The model

Now consider the perihelion precession caused by the gravitational pull of the other planets on Mercury. Strictly speaking, its exact determination involves the treatment of a three-dimensional many-body problem, while our perturbation approach is effective only for plane motions and central forces.
This difficulty can be overcome if we exploit the actual features of the planetary orbits (they are nearly coplanar and nearly circular), leading to rather realistic first approximations. Thus, on one hand we can assume a common orbital plane. On the other hand we will show that the cumulative effect of the forces exerted by each planet along its orbit on Mercury can be equated to that of a force of central type. It follows that we can use our tools with some minor, but clever, adaptations.

Thus, we shall compute the precession of Mercury using an oversimplified, but surprisingly effective Copernican model, in which we assume the orbits of the other seven planets (from Venus to Neptune) to be circular and suitably spaced. In these circumstances the average perturbing forces, those that interest us, are of central type and directed outward, conditions we know how to handle. Under this assumption, we can write Mercury's orbit equation in the form

$$u'' + u = \kappa + \sum_{n=0}^{7} \epsilon_n g_n(u), \tag{44}$$

where in the sum the index $n = 0$ applies to the relativistic term, while the remaining $n$ values apply to the effects of the other seven planets. We defer to the Appendix the calculations required for the determination of the function $\epsilon_n g_n(u)$ for a generic planet $n$. The conclusion is that Eq. (44) takes the explicit nonlinear form

$$u'' + u = \kappa + 3\alpha u^2 - \sum_{n=1}^{7} m_n\, \frac{\kappa}{2 r_n u \left(r_n^2 u^2 - 1\right)}, \tag{45}$$

where $r_n$ is the (constant) orbital radius of the planet $n$, and $m_n$ is the ratio of the planet's mass to the Sun's mass. Thus, the small coupling constants of Eq. (44) are $\epsilon_0 = 3\alpha$ and $\epsilon_n = m_n$ for $n = 1, 2, \ldots, 7$. Because of the nonlinearity of Eq.
(45), we can resort to the principle of superposition of small disturbances, which is valid in the first-order mathematical treatment of the solar system. In our case, it can be stated by saying that the secular effect on the perihelion produced by the perturbing terms present in Eq. (45) is the algebraic sum of those produced by each term taken singly. The residue left out by this approximation is negligible.[28] Therefore, our problem is to find the numerical form of the linear equation

$$u'' + u = \kappa + \sum_{n=0}^{7} \delta\kappa_n. \tag{46}$$

To do this for the planetary part, we first compute

$$m_n g_n'(u) = m_n\, \frac{\kappa \left(3 r_n^2 u^2 - 1\right)}{2 r_n u^2 \left(r_n^2 u^2 - 1\right)^2}. \tag{47}$$

We must now specify the circular reference orbits to be included in the derivative.

B. Effective orbital radii

There is an appreciable advantage in taking for Mercury not the orbit $u = \kappa$, the average with respect to the polar angle (which we used in the relativistic term), but the time average, which is $u = 1/a$.[29,30] On the other hand the average distance, with respect to the angular variable $\theta$, of the planet $n$ from the Sun, is given by $\langle r_n \rangle = a_n$, the semi-major axis of its elliptical orbit. This follows from the definition of the ellipse, $r_n + d_n = 2a_n$, where $r_n$ and $d_n$ are the distances from the Sun and from the empty focus, respectively. The expression is symmetric in the two distances, so that their average values over an orbit are both equal to $a_n$.

We shall use instead the time average $r_n = a_n(1 + e_n^2/2)$.[29,30] This choice captures a dynamical aspect of the situation which would be otherwise excluded in a purely geometric treatment. As a consequence of the law of equal areas, the planet spends more time near the aphelion than near the perihelion. In an averaging process, the sample positions of the planet, per equal time intervals, are unevenly scattered over the elliptical orbit: they are grouped near the aphelion to a greater extent than near the perihelion, and therefore the average distance $a_n$ must be appropriately increased.
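The time average $a_n(1 + e_n^2/2)$ can be verified numerically. The sketch below assumes standard Kepler relations that are not spelled out in the text: in terms of the eccentric anomaly $E$, $r = a(1 - e\cos E)$ and the time element is proportional to $(1 - e\cos E)\,dE$. The function name `time_averaged_radius` is introduced here, and the Mars elements ($a_n = 1.5237$ AU, $e_n = 0.0934$) are standard values assumed for illustration; they are not quoted in the paper.

```python
import math

def time_averaged_radius(a, e, steps=10_000):
    """Time average of r over one Kepler period, using the eccentric anomaly E:
    r = a*(1 - e*cos E) and dt proportional to (1 - e*cos E) dE, so that
    <r>_t = (a / (2*pi)) * integral of (1 - e*cos E)**2 dE over [0, 2*pi]."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        E = (i + 0.5) * h
        total += a * (1.0 - e * math.cos(E)) ** 2
    return total * h / (2.0 * math.pi)

# Assumed standard elements for Mars (not taken from the paper).
a_mars, e_mars = 1.5237, 0.0934
r_eff = time_averaged_radius(a_mars, e_mars)

print(r_eff, a_mars * (1.0 + e_mars**2 / 2.0))
```

Under these assumptions the result agrees with $a_n(1 + e_n^2/2)$ and with the effective radius 1.5303 AU listed for Mars in Table I.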
It follows that if we want to approximate an elliptical orbit by a circle, we must use this effective radius.

C. Perihelion shifts

When we insert the time average $u = 1/a$ for Mercury and $r_n$ for the planet $n$, Eq. (47) becomes

$$m_n g_n'(u)\big|_{u = 1/a} = m_n\, \frac{\kappa a^4 \left(3 r_n^2 - a^2\right)}{2 r_n \left(r_n^2 - a^2\right)^2}. \tag{48}$$

Now we can make explicit the planetary portion of the last term of Eq. (46) by using Eqs. (36) (with $\kappa = 1/a$) and (48) to get

$$\sum_{n=1}^{7} \delta\kappa_n = \sum_{n=1}^{7} m_n\, \frac{\kappa a^3 \left(3 r_n^2 - a^2\right)}{4 r_n \left(r_n^2 - a^2\right)^2}, \tag{49}$$

a rather tricky expression that summarizes a mess of mutual planetary positions.

We have carried out the calculation outlined in Eq. (49) for each perturbing planet, and the results are presented in Table I. We have also displayed the constants associated with each planet,[31] so its contribution can be verified. From the comparison with the results of more refined calculations[23] (presented in the column labeled "Theory") it is seen that our results are individually rather close to the correct ones. The discrepancies in Table I should be mainly attributed to the fact that Eq. (49) fails to take into account the noncentral components of the perturbing forces. However, the differences are such that their algebraic sum is almost negligible: less than half a second of arc per century.

We therefore make virtually no error if we use our total $\delta\kappa$ in writing the numerical form of the relativistic plus Newtonian auxiliary equation (46) of the planet Mercury as

$$u'' + u = 2.6973 + 2.8841 \times 10^{-6}. \tag{50}$$

Thus, from one perihelion to the next, we have

$$\Delta\omega = \frac{2\pi}{\kappa}\sum_{n=0}^{7} \delta\kappa_n = 2\pi\,\frac{2.8841 \times 10^{-6}}{2.6973} = 6.7183 \times 10^{-6}\ \text{rad}, \tag{51}$$

corresponding to 1.3857 arc sec and to a centennial perihelion shift of 575.34 arc sec. Our derivation yields an excellent fit to the observational data. Moreover, comparing the two numbers on the right-hand side of Eq. (50) tells us the relative strength of the perturbing forces in comparison to the Sun's inverse-square force: on the order of $10^{-6}$.

Table I.
Mercury’s secular perihelion shifts caused by the seven planets. The values of mn and djn are in units of 10À6. Planet rn mn djn Dxn Theory Diff. Venus Earth Mars Jupiter Saturn Uranus Neptune 0.7233 1.0000 1.5303 5.2095 9.5511 19.2126 30.0701 2.4478 3.0404 0.3227 954.7786 285.8370 43.6624 51.8000 1.3484003 0.4687968 0.0119692 0.7998580 0.0386107 0.0007229 0.0002240 268.72 93.44 2.35 159.44 7.69 0.14 0.04 277.85 90.04 2.53 153.58 7.30 0.14 0.04 –9.13 3.40 –0.18 5.86 0.39 0.00 0.00 Total 2.6685819 531.82 531.48 0.34 329 Am. J. Phys., Vol. 83, No. 4, April 2015 Maurizio M. D’Eliseo 329 This article is copyrighted as indicated in the article. Reuse of AAPT content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 131.156.157.31 On: Wed, 25 Mar 2015 16:20:35 D. A proper perspective In order to put our result into the proper perspective, it is worthwhile to analyze the effectiveness of our planetary model with circular orbits and time averages. A more realistic theory (still neglecting out-of-plane effects) could be done starting from the expansion in a Fourier series of the force (A5) where the tip of rn traces a Kepler ellipse. The coefficients of such a series, as is well known, are particular averages of the function to be expanded. If we confine ourselves to the secular effect on x for the general case of a perturber moving in an elliptical orbit of eccentricity en, then in the first-order approximation we obtain an equation of the type32 x0ðhÞ ¼ A þ B en e cosðx À xnÞ; (52) with two constant terms on the right side: a term that represents the average contribution of a nominal circular orbit, plus a correction term arising from the non-central part of the force, which takes into account the mutual orientations of the orbits of Mercury and planet n. The constants A and B depend on the semi-major axes of the two orbits, while x and xn are the positions of the perihelia at a pre-fixed epoch. 
In the second term, a critical factor is the ratio of the eccentricities of the two planets. Because the eccentricity of Mercury is an order of magnitude greater than any other, the contribution of the second term on the right-hand side of Eq. (52) is small, which accounts for the minor corrective terms we found. Further, when considering the combined action of all planets, a random distribution of the perihelia attaches positive or negative signs to each of these terms, since the cosine runs through all its values from −1 to +1 as $\omega - \omega_n$ varies from 0 to $2\pi$. Under favorable conditions these terms nearly cancel when they are summed, as in the present epoch. Because the planetary perihelia are all slowly moving, in another epoch the sum of the non-central terms of Eq. (52) could produce a significant result of positive or negative sign, and our Copernican model would be less successful.

X. CONCLUSION

The method described here is essentially based on an averaging procedure, with all the advantages and limitations of an approach of this kind. It exploits the particular integral of a specific form of the orbital equation, to which a crucial role is assigned. By suitably replacing the nonlinear term of the perturbed orbital equation with a constant, we build a virtual model of a fictitious planet on a circular orbit. The radius of this orbit differs very little, in a manner controlled by the perturbation, from the radius in its absence. If we imagine traveling along the circular orbit of the unperturbed mean planet a distance equal to the circumference of the orbit of the fictitious mean planet, we will arrive slightly ahead of or behind the starting point. The operation $\Delta\omega$ extracts the angle, positive or negative, subtended by the small circular arc between the start and finish. The three worked examples have shown the validity of the method.
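As a rough numerical check of the planetary part of the calculation, the Python sketch below evaluates the summand of Eq. (49), $\delta\kappa_n = m_n \kappa a^3 (3r_n^2 - a^2) / [4 r_n (r_n^2 - a^2)^2]$, and converts it to a centennial shift with $\Delta\omega_n = 2\pi\,\delta\kappa_n/\kappa$ as in Eq. (51). The values of $r_n$ and $m_n$ are copied from Table I and $\kappa = 2.6973$ from Eq. (50). Mercury's semimajor axis $a = 0.3871$ AU is an assumption not stated in this section, and the figure of about 415.2 revolutions per century is inferred from the ratio 575.34/1.3857 of the two numbers quoted after Eq. (51). All function and variable names are illustrative.

```python
import math

KAPPA = 2.6973              # 1/AU, constant term of Eq. (50)
A_MERCURY = 0.3871          # AU, ASSUMED semimajor axis of Mercury
ORBITS_PER_CENTURY = 415.2  # ASSUMED: 575.34 / 1.3857, from the text after Eq. (51)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

# (planet, r_n in AU, m_n in units of 1e-6), copied from Table I
PLANETS = [
    ("Venus", 0.7233, 2.4478),
    ("Earth", 1.0000, 3.0404),
    ("Mars", 1.5303, 0.3227),
    ("Jupiter", 5.2095, 954.7786),
    ("Saturn", 9.5511, 285.8370),
    ("Uranus", 19.2126, 43.6624),
    ("Neptune", 30.0701, 51.8000),
]


def delta_kappa(r_n, m_n, a=A_MERCURY, kappa=KAPPA):
    """Summand of Eq. (49) for one perturbing planet on a circular orbit."""
    return m_n * kappa * a**3 * (3 * r_n**2 - a**2) / (4 * r_n * (r_n**2 - a**2) ** 2)


def centennial_shift(dk, kappa=KAPPA):
    """Shift 2*pi*dk/kappa per revolution, converted to arc sec per century."""
    return 2 * math.pi * dk / kappa * ARCSEC_PER_RAD * ORBITS_PER_CENTURY


def planetary_table():
    """Return a list of (planet, delta_kappa_n, centennial shift in arc sec)."""
    rows = []
    for name, r_n, m_n in PLANETS:
        dk = delta_kappa(r_n, m_n * 1e-6)
        rows.append((name, dk, centennial_shift(dk)))
    return rows


if __name__ == "__main__":
    rows = planetary_table()
    for name, dk, shift in rows:
        print(f"{name:8s} dk = {dk * 1e6:.7f}e-6   shift = {shift:7.2f} arcsec/century")
    print("total dk    =", sum(dk for _, dk, _ in rows))
    print("total shift =", sum(s for _, _, s in rows))
```

With these inputs the summed planetary shift should come out close to the Table I total, to within about one arc sec per century. The individual entries depend on the assumed constants and on whether a time-averaged or geometric radius is used for each planet, so small deviations from the tabulated values are to be expected.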
APPENDIX: AVERAGE FORCE EXERTED BY A PLANET ON MERCURY

To avoid certain considerations about the center of mass, which do not affect the final result, we will reduce the problem to the essentials. In the Sun-centered reference system, if Mercury is located at $\mathbf{r}$, the direct force per unit mass $\mathbf{f}_n$ exerted on it by planet $n$, located at $\mathbf{r}_n$, is

$$\mathbf{f}_n = \mu_n \frac{\mathbf{r}_n - \mathbf{r}}{|\mathbf{r}_n - \mathbf{r}|^3}, \qquad r_n \equiv |\mathbf{r}_n| > |\mathbf{r}| \equiv r, \qquad r_n = \text{const.}, \tag{A1}$$

where $\mu_n$ is the gravitational parameter of planet $n$. Because we work in a plane, we can write the vectors in polar form via the complex exponentials

$$\mathbf{r}_n = r_n e^{i\theta_n}, \qquad \mathbf{r} = r e^{i\theta}, \qquad i = \sqrt{-1}, \tag{A2}$$

and treat them as complex numbers. If we put

$$\gamma_n = \frac{r}{r_n} < 1, \qquad \phi_n = \theta_n - \theta, \tag{A3}$$

$$D_n = 1 + \gamma_n^2 - 2\gamma_n \cos\phi_n, \qquad d\phi_n = d\theta_n, \tag{A4}$$

then Eq. (A1) can be written in the form

$$\mathbf{f}_n = \mu_n \frac{r_n e^{i\theta_n} - \mathbf{r}}{|r_n e^{i\theta_n} - \mathbf{r}|^3} = \mu_n \left( \frac{e^{i\phi_n} e^{i\theta}}{r_n^2} - \frac{\mathbf{r}}{r_n^3} \right) \frac{1}{D_n^{3/2}}. \tag{A5}$$

The motions of the other planets and that of Mercury are rationally independent, in the sense that there is no simple numerical relationship between the periods. This means that the reciprocal positions on the respective orbits at any time are not related. It follows that we can consider the planet Mercury, at a generic position $\mathbf{r}$, to be affected by a secular force obtained by an averaging procedure. At this point it is useful to employ the factorization $D_n^{-3/2} = D_n^{-1} D_n^{-1/2}$. In fact, averaging with respect to $\phi_n$, the secular force at point $\mathbf{r}$ will be

$$\langle \mathbf{f}_n \rangle = \frac{1}{2\pi} \int_0^{2\pi} \mathbf{f}_n \, d\phi_n = \frac{\mu_n}{2\pi} \int_0^{2\pi} \frac{d\phi_n}{D_n} \left( \frac{\mathbf{r}}{r} \frac{e^{i\phi_n}}{r_n^2} - \frac{\mathbf{r}}{r_n^3} \right) \frac{1}{D_n^{1/2}} = \frac{\mu_n}{2\pi r_n^2} \frac{\mathbf{r}}{r} \int_0^{2\pi} \frac{d\phi_n}{D_n} \left( e^{i\phi_n} - \gamma_n \right) \frac{1}{D_n^{1/2}}. \tag{A6}$$

We now expand the function $D_n^{-1/2}$ in powers of $\gamma_n$, and keep only the linear term

$$D_n(\gamma_n)^{-1/2} \approx 1 + \frac{\gamma_n}{2} \left( e^{i\phi_n} + e^{-i\phi_n} \right) + \cdots. \tag{A7}$$

Then, after some algebra, Eq.
(A6) becomes

$$\langle \mathbf{f}_n \rangle = \frac{\mu_n}{2\pi r_n^2} \frac{\mathbf{r}}{r} \int_0^{2\pi} \frac{e^{i\phi_n} - \gamma_n/2}{D_n} \, d\phi_n. \tag{A8}$$

Using the standard trigonometric integral

$$\int_0^{2\pi} \frac{e^{i\phi_n}}{D_n} \, d\phi_n = \frac{2\pi\gamma_n}{1 - \gamma_n^2}, \qquad \gamma_n < 1, \tag{A9}$$

we finally get the average force exerted by the planet $n$ on Mercury,

$$\langle \mathbf{f}_n \rangle = \frac{\mu_n}{r_n^2} \frac{\gamma_n}{2\left(1 - \gamma_n^2\right)} \frac{\mathbf{r}}{r}, \tag{A10}$$

which is central and repulsive.

This function must be converted to a form $f_n(u)$ suitable for insertion in the orbital equation (1). To do this, we omit the unit vector $\mathbf{r}/r$, then substitute $1/(r_n u)$ for $\gamma_n$ and change the sign, obtaining

$$f_n(u) = -\frac{\mu_n u}{2 r_n \left(r_n^2 u^2 - 1\right)}. \tag{A11}$$

It follows that the function to be inserted in the perturbation portion of Eq. (1) is

$$\frac{f_n(u)}{l^2 u^2} \equiv g_n(u) = -m_n \frac{\kappa}{2 r_n u \left(r_n^2 u^2 - 1\right)}, \tag{A12}$$

where we have used $l^2 = \mu/\kappa$, and so $m_n \equiv \mu_n/\mu$ is the planet/Sun mass ratio.

a) Electronic mail: s.elmo@mail.com
1. A. Einstein, Erklärung der Perihelbewegung des Merkur aus der allgemeinen Relativitätstheorie, Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften (Seite, Berlin, 1915), pp. 831–839; The Collected Papers of Albert Einstein, edited by A. J. Kox, M. J. Klein, and R. Schulmann (Princeton U. P., Princeton, NJ, 1996), Vol. 6, Doc. 24, pp. 112–116.
2. M. M. D'Eliseo, "Higher-order corrections to the relativistic perihelion advance and the mass of binary pulsars," Astrophys. Space Sci. 332, 121–128 (2011).
3. B. Davies, "Elementary theory of perihelion precession," Am. J. Phys. 51, 909–911 (1983).
4. N. Gauthier, "Periastron precession in general relativity," Am. J. Phys. 55, 85–86 (1987).
5. T. Garavaglia, "The Runge-Lenz vector and Einstein perihelion precession," Am. J. Phys. 55, 164–165 (1987).
6. C. Farina and M.
Machado, "The Rutherford cross section and the perihelion shift of Mercury with the Runge-Lenz vector," Am. J. Phys. 55, 921–923 (1987).
7. D. R. Stump, "Precession of the perihelion of Mercury," Am. J. Phys. 56, 1097–1098 (1988).
8. K. T. McDonald, "Right and wrong use of the Lenz vector for non-Newtonian potentials," Am. J. Phys. 58, 540–542 (1990).
9. K. Doggett, "Comment on 'Precession of the perihelion of Mercury,' by Daniel R. Stump," Am. J. Phys. 59, 851 (1991).
10. S. Cornbleet, "Elementary derivation of the advance of the perihelion of a planetary orbit," Am. J. Phys. 61, 650–651 (1993).
11. B. Dean, "Phase-plane analysis of perihelion precession and Schwarzschild orbital dynamics," Am. J. Phys. 67, 78–86 (1999).
12. D. R. Brill and D. Goel, "Light bending and perihelion precession: A unified approach," Am. J. Phys. 67, 316–319 (1999).
13. M. M. D'Eliseo, "The first-order orbital equation," Am. J. Phys. 75, 352–355 (2007).
14. T. J. Lemmon and A. R. Mondragon, "Alternative derivation of the relativistic contribution to perihelic precession," Am. J. Phys. 77, 890–893 (2009).
15. N. Grossmann, The Sheer Joy of Celestial Mechanics (Birkhäuser, Boston, MA, 1996), p. 32.
16. We shall consider cases in which the independent variable is $u$, $v$, and $\kappa$. Sometimes a primed expression can also mean a derivative evaluated at some point. This practice, introduced here for the graphical cleanliness of the formulas, is potentially confusing but causes no problems in practice, because the context always makes clear what is intended.
17. R. d'Inverno, Introducing Einstein's Relativity (Oxford U. P., New York, NY, 2001), p. 194.
18. The Mathematical Papers of Isaac Newton, edited by D. T. Whiteside (Cambridge U. P., Cambridge, 2008), Vol. III, 1670–1673, 169–173.
19. K. Kendig, Conics (The Mathematical Association of America, Washington, DC, 2005), p. 243.
20. V. G. Szebehely and H. Mark, Adventures in Celestial Mechanics, 2nd ed. (Wiley, Hoboken, NJ, 1998), Chap. 11, pp. 221–245.
21. M. M.
D'Eliseo, "The quasi-elliptic motion of the Moon," Chin. J. Phys. 50, 720–731 (2012).
22. C. D. Olds, Continued Fractions (Random House, New York, NY, 1963).
23. G. Clemence, "The relativity effects in planetary motions," Rev. Mod. Phys. 19, 361–364 (1947).
24. M. M. D'Eliseo, "Central forces and secular perihelion motion," Can. J. Phys. 85, 1045–1054 (2007).
25. H. Goldstein, C. Poole, and J. Safko, Classical Mechanics, 3rd ed. (Addison Wesley, San Francisco, CA, 2002), pp. 536–538.
26. A. Hall, "A suggestion in the theory of Mercury," Astron. J. XIV, 49–51 (1894).
27. N. T. Roseveare, Mercury's perihelion from Le Verrier to Einstein (Clarendon Press, Oxford, 1982).
28. The residue secular effect would be of order Q n 2n. Therefore it can be neglected.
29. P. Van de Kamp, Elements of Astromechanics (Freeman, San Francisco, CA, 1964), p. 65.
30. M. M. D'Eliseo, "Orbital averages and the secular variations of the orbits," S. Elmo Obs. Technical Report, 1999, available at .
31. C. D. Murray and S. F. Dermott, Solar System Dynamics (Cambridge U. P., New York, NY, 1999), Appendix A, pp. 526–530.
32. In Ref. 31, p. 294, $\omega$ depends on time, and so the angular mean motion $n$ multiplies the constants $A$ and $B$. Since the relation between $\theta$ and $t$ is $\theta = nt + {}$periodical terms (Ref. 30, p. 41), it is evident that changing, in the equation of the secular motion, the variable from $t$ to $\theta$ leaves the two constants $A$ and $B$ unchanged except that the factor $n$ disappears.