Variational Principles in Classical Mechanics
Douglas Cline
University of Rochester
9 August 2017
© 2017 Douglas Cline
Contributors
Author: Douglas Cline
Illustrator: Meghan Sarkis
Variational Principles in Classical Mechanics by Douglas Cline is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), except where other-
wise noted.
• Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes
were made. You must do so in any reasonable manner, but not in any way that suggests the licensor
endorses you or your use.
• NonCommercial — You may not use the material for commercial purposes.
• ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions
under the same license as the original.
• No additional restrictions — You may not apply legal terms or technological measures that legally
restrict others from doing anything the license permits.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Version 1.0
Contents
Preface
Prologue
3 Linear oscillators
3.1 Introduction
3.2 Linear restoring forces
3.3 Linearity and superposition
3.4 Geometrical representations of dynamical motion
3.4.1 Configuration space $(q_i, q_j, t)$
3.4.2 State space, $(q_i, \dot{q}_i, t)$
3.4.3 Phase space, $(q_i, p_i, t)$
3.4.4 Plane pendulum
3.5 Linearly-damped free linear oscillator
3.5.1 General solution
3.5.2 Energy dissipation
3.6 Sinusoidally-driven, linearly-damped, linear oscillator
3.6.1 Transient response of a driven oscillator
3.6.2 Steady state response of a driven oscillator
3.6.3 Complete solution of the driven oscillator
3.6.4 Resonance
3.6.5 Energy absorption
3.7 Wave equation
3.8 Travelling and standing wave solutions of the wave equation
3.9 Waveform analysis
3.9.1 Harmonic decomposition
3.9.2 The free linearly-damped linear oscillator
3.9.3 Damped linear oscillator subject to an arbitrary periodic force
3.10 Signal processing
3.11 Wave propagation
3.11.1 Phase, group, and signal velocities of wave packets
3.11.2 Fourier transform of wave packets
3.11.3 Wave-packet Uncertainty Principle
3.12 Summary
Workshop exercises
Problems
18 Epilogue
Appendices
Bibliography
Index
Examples
11.9 Example: Precession rate for torque-free rotating symmetric rigid rotor
11.10 Example: Tennis racquet dynamics
11.11 Example: Rotation of asymmetrically-deformed nuclei
11.12 Example: The Spinning "Jack"
11.13 Example: The Tippe Top
11.14 Example: Tipping stability of a rolling wheel
11.15 Example: Pivoting
11.16 Example: Rolling
11.17 Example: Forces on the bearings of a rotating circular disk
12.1 Example: The Grand Piano
12.2 Example: Two coupled linear oscillators
12.3 Example: Two equal masses series-coupled by two equal springs
12.4 Example: Two parallel-coupled plane pendula
12.5 Example: The series-coupled double plane pendula
12.6 Example: Three plane pendula; mean-field linear coupling
12.7 Example: Three plane pendula; nearest-neighbor coupling
12.8 Example: System of three bodies coupled by six springs
12.9 Example: Linear triatomic molecular CO$_2$
12.10 Example: Benzene ring
12.11 Example: Two linearly-damped coupled linear oscillators
12.12 Example: Collective motion in nuclei
13.1 Example: Gauge invariance in electromagnetism
13.2 Example: The linearly-damped, linear oscillator
14.1 Example: Check that a transformation is canonical
14.2 Example: Angular momentum
14.3 Example: Lorentz force in electromagnetism
14.4 Example: Wave motion
14.5 Example: Two-dimensional, anisotropic, linear oscillator
14.6 Example: The eccentricity vector
14.7 Example: The identity canonical transformation
14.8 Example: The point canonical transformation
14.9 Example: The exchange canonical transformation
14.10 Example: Infinitesimal point canonical transformation
14.11 Example: 1-D harmonic oscillator via a canonical transformation
14.12 Example: Free particle
14.13 Example: Point particle in a uniform gravitational field
14.14 Example: One-dimensional harmonic oscillator
14.15 Example: The central force problem
14.16 Example: Linearly-damped, one-dimensional, harmonic oscillator
14.17 Example: Adiabatic invariance for the simple pendulum
14.18 Example: Harmonic oscillator perturbation
14.19 Example: Lindblad resonance in planetary and galactic motion
15.1 Example: Acoustic waves in a gas
16.1 Example: Muon lifetime
16.2 Example: Relativistic Doppler Effect
16.3 Example: Twin paradox
16.4 Example: Rocket propulsion
16.5 Example: Lagrangian for a relativistic free particle
16.6 Example: Relativistic particle in an external electromagnetic field
16.7 Example: The Bohr-Sommerfeld hydrogen atom
A.1 Example: Eigenvalues and eigenvectors of a real symmetric matrix
A.2 Example: Degenerate eigenvalues of real symmetric matrix
D.1 Example: Rotation matrix
D.2 Example: Proof that a rotation matrix is orthogonal
E.1 Example: Displacement gradient tensor
F.1 Example: Jacobian for transform from cartesian to spherical coordinates
H.1 Example: Maxwell’s Flux Equations
H.2 Example: Buoyancy forces in fluids
H.3 Example: Maxwell’s circulation equations
H.4 Example: Electromagnetic fields
I.1 Example: Fourier transform of a single isolated square pulse
I.2 Example: Fourier transform of the Dirac delta function
Preface
The goal of this book is to introduce the reader to the intellectual beauty, and philosophical implications,
of the fact that nature obeys variational principles that underlie the Lagrangian and Hamiltonian analytical
formulations of classical mechanics. These variational methods, which were developed for classical mechanics
during the 18th and 19th centuries, have become the preeminent formalisms for classical dynamics, as well as for
many other branches of modern science and engineering. The ambitious goal of this book is to lead the student
from the intuitive Newtonian vectorial formulation, to introduction of the more abstract variational principles
that underlie the Lagrangian and Hamiltonian analytical formulations. This culminates in discussion of the
contributions of variational principles to the development of relativistic and quantum mechanics. The broad
scope of this book attempts to unify the undergraduate physics curriculum by bridging the chasm that
divides the Newtonian vector-differential formulation and the integral variational formulation of classical
mechanics, and the corresponding chasm that exists between classical and quantum mechanics. Powerful
variational techniques in mathematics that underlie much of modern physics are introduced, and problem-solving
skills are developed in order to challenge students at the crucial stage when they first encounter this
sophisticated and challenging material. The underlying fundamental concepts of classical mechanics, and
their applications to modern physics, are emphasized throughout the course.
A full understanding of the power and beauty of variational principles in classical mechanics is best
acquired by first learning the concepts of the variational approach, and then applying these concepts to
many examples in classical mechanics. Classical mechanics is the ideal topic for learning the principles and
the power of using the variational approach prior to applying these techniques to other branches of science
and engineering. The underlying philosophical approach adopted by this book was espoused by Galileo
Galilei: "You cannot teach a man anything; you can only help him find it within himself."
The development of this textbook was influenced by three textbooks: "The Variational Principles of
Mechanics" by Cornelius Lanczos (1949) [La49], "Classical Mechanics" by Herbert Goldstein (1950) [Go50],
and "Classical Dynamics of Particles and Systems" by Jerry B. Marion (1965) [Ma65]. Marion’s excellent
textbook was unusual in partially bridging the chasm between the outstanding graduate texts by Goldstein
and Lanczos, and a bevy of introductory texts based on Newtonian mechanics that were available at that
time. The present textbook was developed to cover the techniques and philosophical implications of the
variational approaches to classical mechanics, with a breadth and depth close to that provided by Goldstein
and Lanczos, but in a format that better matches the needs of the undergraduate student. An additional
goal is to bridge the gap between classical and modern physics in the undergraduate curriculum.
This book was written in support of the physics junior/senior undergraduate course P235W entitled
"Variational Principles in Classical Mechanics" that the author taught at the University of Rochester between
1993 and 2015. Lecture notes were distributed to students to allow pre-lecture study, facilitate accurate
transmission of the complicated formulae, and minimize note taking during lectures. These lecture notes
evolved into the present textbook that was used for this course. The target audience of the course, upon
which this textbook is based, typically comprised ≈ 70% junior/senior undergraduates, ≈ 25% sophomores,
≤ 5% graduate students, and the occasional well-prepared freshman. The target audience was physics
and astrophysics majors, but it attracted a significant fraction of majors from other disciplines such as
mathematics, chemistry, optics, engineering, music, and the humanities. As a consequence, the book includes
appreciable introductory level physics, plus mathematical review material, to accommodate the diverse
range of prior preparation of the students. This textbook includes material that extends beyond what
reasonably can be covered during a one-term course. This supplemental material is presented to show the
importance and broad applicability of variational concepts to classical mechanics. The book includes 162
worked examples to illustrate the concepts presented. Advanced group-theoretic concepts are minimized to
better accommodate the mathematical skills of the typical undergraduate physics major. For compatibility
with modern literature in this field, this book follows the widely-adopted nomenclature used in "Classical
Mechanics" by Goldstein [Go50], with recent additions by Johns [Jo05].
The book is broken into four major sections. The first review section sets the stage by including a
brief historical introduction (chapter 1), review of the Newtonian formulation of mechanics plus gravitation
(chapter 2), linear oscillators and wave motion (chapter 3), and an introduction to non-linear dynamics
and chaos (chapter 4). Extensive reading is assigned to minimize the time spent on this
review of Newtonian vectorial mechanics. Building on the introductory section, the second section of the
book introduces the variational principles of analytical mechanics that underlie this book. It includes an
introduction to the calculus of variations (chapter 5), the Lagrangian formulation of mechanics with appli-
cations to holonomic and non-holonomic systems (chapter 6), a discussion of symmetries, invariance, plus
Noether’s theorem (chapter 7), and an introduction to the Hamiltonian and the Hamiltonian formulation
of mechanics plus the Routhian reduction technique (chapter 8). The third section of the book applies
Lagrangian and Hamiltonian formulations of classical dynamics to central force problems (chapter 9), mo-
tion in non-inertial frames (chapter 10), rigid-body rotation (chapter 11), and coupled oscillators (chapter
12). The final section of the book discusses Hamilton’s Principle plus advanced applications of Lagrangian
mechanics (chapter 13), Hamiltonian mechanics including Poisson brackets, Liouville’s theorem, canonical
transformations, Hamilton-Jacobi theory, and the action-angle technique (chapter 14), and classical mechanics
in the continua (chapter 15). This is followed by a brief review of the revolution in classical mechanics intro-
duced by Einstein’s theory of relativistic mechanics. The extended theory of Lagrangian and Hamiltonian
mechanics is used to apply variational techniques to the Special Theory of Relativity followed by a superficial
introduction to the concepts of the General Theory of Relativity (chapter 16). The book finishes with a brief
review of the role of variational principles in bridging the gap between classical mechanics and quantum
mechanics (chapter 17). These advanced topics extend beyond the typical syllabus for an undergraduate
classical mechanics course. The reason for introducing these advanced topics is to stimulate student interest
in physics by giving them a glimpse of the physics at the summit that they have struggled to climb. This
glimpse illustrates the breadth of classical mechanics, and the role that variational principles have played
in the development of classical, relativistic, quantal, and statistical mechanics. These final supplemental
lectures illustrate the beauty and unity of classical mechanics, and the foundation that classical mechanics
has provided to the development of modern physics. The appendices summarize aspects of the mathematical
methods that are exploited in classical mechanics.
The present textbook contains more material than required for a junior/senior undergraduate classical
mechanics course, and thus, it could serve as the text for a graduate course by focussing the course on the
variational principles covered by chapters 5 through 17. The partitioning and ordering of the topics in the book
are the result of many permutations tried while teaching classical mechanics for many years. Chapters 1
through 3, plus the mathematical appendices, are used as reading assignments during the first three weeks
of class to minimize the time spent reviewing Newtonian mechanics. This maximizes the class time available
to cover the variational approach, that is, chapters 5 through 14. The brief reviews of the mechanics in the
continua, and the transition to quantum mechanics, provide the student with a glimpse of the implications
of analytical mechanics to these more advanced topics.
Information regarding the associated P235 undergraduate course at the University of Rochester is available
on the web site at http://www.pas.rochester.edu/~cline/P235/index.shtml. Information about the
author is available at the Cline home web site: http://www.pas.rochester.edu/~cline/index.html.
The author thanks Meghan Sarkis who prepared many of the illustrations, Joe Easterly who designed the
book cover plus the webpage, and Moriana Garcia who organized publication. Andrew Sifain developed the
diagnostic workshop questions. The author appreciates the permission, granted by Professor Struckmeier, to
quote his published article on the extended Hamilton-Lagrangian formalism. The author acknowledges the
feedback and suggestions made by many students who have taken this course, as well as helpful suggestions
by his colleagues: Andrew Abrams, Adam Hayes, Connie Jones, Andrew Melchionna, David Munson, Alice
Quillen, Richard Sarkis, James Schneeloch, Steven Torrisi, Dan Watson, and Frank Wolfs. These lecture
notes were typed in LaTeX using Scientific WorkPlace (MacKichan Software, Inc.), while Adobe Illustrator,
Photoshop, Origin, Mathematica, and MuPAD were used to prepare the illustrations.
Douglas Cline,
University of Rochester, 2017
Prologue
Two dramatically different philosophical approaches to science were developed in the field of classical mechanics
during the 17th and 18th centuries. This time period coincided with the Age of Enlightenment in Europe
during which remarkable intellectual and philosophical developments occurred. This was a time when both
philosophical and causal arguments were equally acceptable in science, in contrast with current convention
where there appears to be tacit agreement to discourage use of philosophical arguments in science.
Newtonian mechanics: Momentum and force are vectors that underlie the Newtonian formulation of
classical mechanics. Newton’s monumental treatise, entitled "Philosophiae Naturalis Principia Mathemat-
ica", published in 1687, established his three universal laws of motion, the universal theory of gravitation,
the derivation of Kepler’s three laws of planetary motion, and the development of calculus. Newton’s three
universal laws of motion provide the most intuitive approach to classical mechanics in that they are based on
vector quantities like momentum, and the rate of change of momentum, which are related to force. Newton’s
equation of motion
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \qquad \text{(Newton's equation of motion)}$$
is a vector differential relation between the instantaneous forces and rate of change of momentum, or equivalent
instantaneous accelerations, all of which are vector quantities. Momentum and force are easy to visualize,
and both cause and effect are embedded in Newtonian mechanics. Thus, if all of the forces, including the
constraint forces, acting on the system are known, then the motion is solvable for two-body systems. The
mathematics for handling Newton’s "vectorial mechanics" approach to classical mechanics is well established.
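As a minimal sketch of how Newton's equation of motion is used once the force is known, the following Python fragment advances position and momentum step by step using dp/dt = F and dx/dt = p/m. The function name `integrate_newton`, the velocity-Verlet stepping scheme, and the numerical values (a 2 kg mass falling from rest in uniform gravity) are assumptions made for this illustration and are not taken from the book.

```python
def integrate_newton(force, mass, x0, p0, dt, n_steps):
    """Advance (x, p) using dp/dt = F(x) and dx/dt = p/m.

    Uses the velocity-Verlet scheme: half-step in momentum,
    full step in position, then the second half-step in momentum.
    """
    x, p = x0, p0
    f = force(x)
    for _ in range(n_steps):
        p_half = p + 0.5 * dt * f      # first half-kick from the force
        x = x + dt * p_half / mass     # drift using the mid-step momentum
        f = force(x)                   # force at the new position
        p = p_half + 0.5 * dt * f      # second half-kick
    return x, p


# Hypothetical test case: uniform gravity, F = -m g (SI units).
MASS, G = 2.0, 9.81
x_final, p_final = integrate_newton(
    lambda x: -MASS * G, MASS, x0=0.0, p0=0.0, dt=0.001, n_steps=1000
)
print(x_final, p_final)
```

For this constant force the result can be checked against the familiar closed forms x = -g t^2 / 2 and p = -m g t at t = 1 s.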
Analytical mechanics: Variational principles underlie the analytical formulation of mechanics. Leibniz,
who was a contemporary of Newton, introduced methods based on a quantity called "vis viva", which is
Latin for "living force" and equals twice the kinetic energy. Leibniz believed in the philosophy that God
created a perfect world where nature would be thrifty in all its manifestations. In 1707, Leibniz proposed
that the optimum path is based on minimizing the time integral of the vis viva, which is equivalent to
the action integral of Lagrangian/Hamiltonian mechanics. In 1744 Euler derived the Leibniz result using
variational concepts while Maupertuis restated the Leibniz result based on teleological arguments. The
development of Lagrangian mechanics culminated in the 1788 publication of Lagrange’s monumental treatise
entitled "Mécanique Analytique". Lagrangian mechanics derives the magnitude and direction of the optimum
trajectories and forces based on the concept of least action, which is defined to be the time integral of the
difference between the kinetic and potential energies. Hamilton’s Principle (1834), which underlies Lagrange’s
least action principle, minimizes the action integral given by
$$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \qquad \text{(Hamilton's Principle)}$$
where the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ equals the difference between the kinetic energy $T$ and the potential energy
$U$. This Lagrangian is a function of $n$ generalized coordinates $q_i$ plus their corresponding velocities $\dot{q}_i$.
The culmination of the development of analytical mechanics occurred in 1834 when Hamilton developed
the premier variational approach, called Hamiltonian mechanics, that is based on the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$,
which is a function of the $n$ fundamental conjugate position $q_i$ plus the momentum $p_i$ variables. In 1843
Jacobi provided the mathematical framework required to fully exploit the power of Hamiltonian mechanics.
Note that the Lagrangian, the Hamiltonian, and the action integral are all scalar quantities, which simplifies
derivation of the equations of motion compared with the vector calculus used by Newtonian mechanics.
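As a brief preview of how these scalar functions generate equations of motion, consider a sketch for the simplest case: a single particle of mass $m$ with one coordinate $q$ moving in a potential $U(q)$. The choice of system and the symbols $m$, $T$, and $U$ are assumptions made for this illustration; the general theory is the subject of the later chapters on Lagrangian and Hamiltonian mechanics.

```latex
% Assumed illustrative system: one particle of mass m, coordinate q, potential U(q).
\begin{align}
L(q,\dot q) &= T - U = \tfrac{1}{2} m \dot q^{2} - U(q) \\
0 &= \frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q}
   = m\ddot q + \frac{dU}{dq} \\
p &\equiv \frac{\partial L}{\partial \dot q} = m\dot q \\
H(q,p) &= p\,\dot q - L = \frac{p^{2}}{2m} + U(q) \\
\dot q &= \frac{\partial H}{\partial p} = \frac{p}{m}, \qquad
\dot p = -\frac{\partial H}{\partial q} = -\frac{dU}{dq}
\end{align}
```

The second line is the Euler-Lagrange condition for a stationary action, and it reproduces Newton's equation $m\ddot q = -dU/dq$. The last line gives Hamilton's equations, which express the same motion as two first-order equations in the conjugate variables $(q, p)$.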
Philosophical developments: Variational principles apply to all aspects of our daily life. Typical examples
include: selecting the optimum compromise in quality and cost when shopping, selecting the fastest
route to travel from home to work, or selecting the optimum compromise to satisfy the disparate desires of
the individuals comprising a family. It is astonishing that the laws of nature are consistent with variational
principles involving the principle of least action. Minimizing the action integral led to the development of the
mathematical field of variational calculus plus the analytical variational approaches to classical mechanics
by Euler, Lagrange, Hamilton, and Jacobi.
The analytical approach to classical mechanics appeared contradictory to Newton’s intuitive vector-
ial treatment of force and momentum. There is a dramatic difference in philosophy between the vector-
differential equations of motion derived by Newtonian mechanics, which relate the instantaneous force to
the corresponding instantaneous acceleration, and analytical mechanics, where minimizing the scalar action
integral involves integrals over space and time between specified initial and final states. Analytical mechanics
uses variational principles to determine the optimum trajectory from a continuum of tentative possibilities,
by requiring that the optimum trajectory minimizes the action integral between specified initial and final
conditions.
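The idea of selecting the optimum trajectory from a continuum of tentative possibilities can be illustrated numerically. The following Python sketch, under stated assumptions, compares the action of several trial paths for a particle thrown vertically in uniform gravity that must start and end at the same height. The trial family (the true parabolic path plus a sinusoidal distortion of amplitude `a`), the function names, and the numerical values are all hypothetical choices made for this illustration.

```python
import math

MASS, G, T_TOTAL = 1.0, 9.81, 1.0  # hypothetical values (SI units)


def trial_path(t, a):
    """Trial path with x(0) = x(T) = 0.

    The first term is the true path, which satisfies x'' = -g;
    the second term is a distortion that vanishes at both endpoints.
    """
    true_x = 0.5 * G * t * (T_TOTAL - t)
    return true_x + a * math.sin(math.pi * t / T_TOTAL)


def action(a, n=2000):
    """Approximate S = integral of (T - U) dt for the trial path.

    Uses a finite-difference velocity and the midpoint position
    on each of n equal time slices, with U = m g x.
    """
    dt = T_TOTAL / n
    s = 0.0
    for i in range(n):
        x0 = trial_path(i * dt, a)
        x1 = trial_path((i + 1) * dt, a)
        v = (x1 - x0) / dt
        x_mid = 0.5 * (x0 + x1)
        s += (0.5 * MASS * v ** 2 - MASS * G * x_mid) * dt
    return s


amplitudes = [-0.5, -0.25, 0.0, 0.25, 0.5]
actions = {a: action(a) for a in amplitudes}
best = min(actions, key=actions.get)
print("amplitude with the smallest action:", best)
```

Within this one-parameter family the undistorted path (a = 0) has the smallest action. For this particular family the excess action can be worked out analytically as m a^2 pi^2 / (4 T), so it grows quadratically with the size of the distortion.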
Figure 1: Chronological roadmap of the parallel development of the Newtonian and the variational approaches
to classical mechanics.
Initially there was considerable prejudice and philosophical opposition to use of the variational approach
which is based on the assumption that nature follows the principles of economy. The variational approach
is not intuitive, and thus it was considered to be speculative and "metaphysical", but it was tolerated as an
efficient tool for exploiting classical mechanics. This opposition to the variational principles that underlie
analytical mechanics delayed full appreciation of the variational approach until the start of the 20th century.
As a consequence, the intuitive Newtonian formulation reigned supreme in classical mechanics for over two
centuries, even though the remarkable problem-solving capabilities of analytical mechanics were recognized
and exploited following development of analytical mechanics.
The full significance and superiority of the analytical variational formulations of classical mechanics
became widely accepted following the development of the Special Theory of Relativity in 1905. The Theory
of Relativity requires that the laws of nature be invariant to the reference frame. This is not satisfied by
the Newtonian formulation of mechanics which assumes one absolute frame of reference and a separation of
space and time. In contrast, the Lagrangian and Hamiltonian formulations of the principle of least action
remain valid in the Theory of Relativity, if the Lagrangian is written in a relativistically-invariant form
in space-time. The complete invariance of the variational approach to coordinate frames is precisely the
formalism necessary for handling relativistic mechanics. Hamiltonian mechanics, which is expressed in terms
of the conjugate variables $(\mathbf{q}, \mathbf{p})$, relates classical mechanics directly to the underlying physics of quantum
mechanics and quantum field theory. As a consequence, the philosophical opposition to exploiting variational
principles no longer exists, and Hamiltonian mechanics has become the preeminent formulation of modern
classical mechanics. The reader is free to draw their own conclusions regarding the philosophical question
"is the principle of economy a fundamental law of classical mechanics, or is it a fortuitous consequence of
the fundamental laws of nature?"
From the late seventeenth century, until the dawn of modern physics at the start of the twentieth cen-
tury, classical mechanics remained a primary driving force in the development of physics. Classical mechanics
embraces an unusually broad range of topics spanning motion of macroscopic astronomical bodies to mi-
croscopic particles in nuclear and particle physics, at velocities ranging from zero to near the velocity of
light, from one-body to statistical many-body systems, as well as having extensions to quantum mechanics.
Introduction of the Special Theory of Relativity in 1905, and the General Theory of Relativity in 1916,
necessitated modifications to classical mechanics for relativistic velocities, and can be considered to be an
extended theory of classical mechanics. Since the 1920s, quantal physics has superseded classical mechanics
in the microscopic domain. Although quantum physics has played the leading role in the development of
physics during much of the past century, classical mechanics still is a vibrant field of physics that recently
has led to exciting developments associated with non-linear systems and chaos theory. This has spawned
new branches of physics and mathematics as well as changing our notion of causality.
Goals: The primary goal of this book is to introduce the reader to the powerful variational approaches that
play such a pivotal role in classical mechanics, plus many other branches of modern science and engineering.
Figure 1 gives a historical roadmap of the evolution of classical mechanics from Newton, to the variational
approaches of Euler, Lagrange, Hamilton and Jacobi. This book emphasizes the intellectual beauty of these
remarkable developments as well as stressing the philosophical implications that have had a tremendous
impact on modern science. A secondary goal is to apply variational principles to solve advanced applications
in classical mechanics in order to introduce many sophisticated and powerful mathematical techniques that
underlie much of modern physics.
The connections and applications of classical mechanics to modern physics are emphasized throughout
the book in an effort to span the chasm that divides the Newtonian vector-differential formulation and the
integral variational formulation of classical mechanics, and the corresponding chasm that exists between
classical and quantum mechanics. Note that these variational principles, developed in the field of classical
mechanics, now are used in a diverse and wide range of fields including economics, meteorology, engineering,
and computing.
This study of classical mechanics involves climbing a vast mountain of knowledge, and the pathway to
the top leads to elegant and beautiful theories that underlie much of modern physics. These theories are
applied to four major topics in classical mechanics. In addition, being so close to the summit provides the
opportunity for this book to take a few extra steps beyond the normal undergraduate classical mechanics
syllabus to provide a glimpse of the exciting physics found at the summit. This new physics includes topics
such as quantum, relativistic, and statistical mechanics.
Chapter 1
1.1 Introduction
This chapter briefly reviews the historical evolution of classical mechanics since considerable insight can be
gained from study of the history of science. There are two dramatically different approaches used in classical
mechanics. The first is the vectorial approach of Newton which is based on vector quantities like momentum,
force, and acceleration. The second is the analytical approach of Lagrange, Euler, Hamilton, and Jacobi,
that is based on the concept of least action and variational calculus. The more intuitive Newtonian picture
reigned supreme in classical mechanics until the start of the twentieth century. Variational principles, which
were developed during the nineteenth century, never aroused much enthusiasm in scientific circles due to
philosophical objections to the underlying concepts; this approach was merely tolerated as an efficient tool
for exploiting classical mechanics. A dramatic advance in the philosophy of scientific thinking occurred at
the start of the 20th century leading to widespread acceptance of the superiority of variational principles.
2 CHAPTER 1. A BRIEF HISTORY OF CLASSICAL MECHANICS
Moon and the Sun. The Greek philosophers were relatively advanced in logic and mathematics and developed
concepts that enabled them to calculate areas and perimeters. Unfortunately their philosophical approach
neglected collecting quantitative and systematic data that is an essential ingredient to the advancement of
science.
Archimedes (287-212 B.C.) represented the culmination of science in ancient Greece. As an engineer
he designed machines of war while as a scientist he made significant contributions to hydrostatics and the
principle of the lever. As a mathematician he applied infinitesimals in a way that is reminiscent of modern
integral calculus, which he used to derive a value for $\pi$. Unfortunately much of the work of the brilliant
Archimedes subsequently fell into oblivion. Hero of Alexandria (10 - 70 A.D.) described the principle
of reflection that light takes the shortest path. This is an early illustration of the variational principle of
least time. Ptolemy (83 - 161 A.D.) wrote several scientific treatises that greatly influenced subsequent
philosophers. Unfortunately he adopted the incorrect geocentric solar system in contrast to the heliocentric
model of Aristarchus and others.
Gottfried Leibniz (1646-1716) was a brilliant German philosopher, a contemporary of Newton, who
worked on both calculus and mechanics. Leibniz started development of calculus in 1675, ten years after
Newton, but Leibniz published his work in 1684, which was three years before Newton’s Principia. Leibniz
made significant contributions to integral calculus and was responsible for the calculus notation currently
used. He introduced the name calculus based on the Latin word for the small stone used for counting.
Newton and Leibniz were involved in a protracted argument over who originated calculus. It appears that
Leibniz saw drafts of Newton’s work on calculus during a visit to England. Throughout their argument
Newton was the ghost writer of most of the articles in support of himself and he had them published under
the noms de plume of his friends. Leibniz made the tactical error of appealing to the Royal Society to intercede on
his behalf. Newton, as president of the Royal Society, appointed his friends to an "impartial" committee to
investigate this issue, then he wrote the committee’s report that accused Leibniz of plagiarism of Newton’s
work on calculus, after which he had it published by the Royal Society. Still unsatisfied he then wrote an
anonymous review of the report in the Royal Society’s own periodical. This bitter dispute lasted until the
death of Leibniz. When Leibniz died his work was largely discredited. The fact that he falsely claimed to be
a nobleman and added the prefix von to his name, coupled with Newton’s vitriolic attacks, did not help his
credibility. Newton is reported to have declared that he took great satisfaction in "breaking Leibniz’s heart."
Studies during the 20th century have largely revived the reputation of Leibniz and he is acknowledged to
have made major contributions to the development of calculus.
Leibniz made significant contributions to classical mechanics. In contrast to Newton’s laws of motion,
which are based on the concept of momentum, Leibniz devised a new theory of dynamics based on kinetic
and potential energy that anticipates the analytical variational approach of Lagrange and Hamilton. Leibniz
argued for a quantity called the "vis viva", which is Latin for "living force" that equals twice the kinetic
energy. Leibniz argued that the change in kinetic energy is equal to the work done. In 1687 Leibniz
proposed that the optimum path is based on minimizing the time integral of the vis viva which is equivalent
to the action integral. Leibniz used both philosophical and causal arguments in his work which were equally
acceptable during the Age of Enlightenment. Unfortunately for Leibniz, his analytical approach based on
energies, which are scalars, appeared contradictory to Newton’s intuitive vectorial treatment of force and
momentum. There was considerable prejudice and philosophical opposition to the variational approach which
assumes that nature is thrifty in all of its actions. The variational approach was considered to be speculative
and "metaphysical" in contrast to the causal arguments supporting Newtonian mechanics. This opposition
delayed full appreciation of the variational approach until the start of the 20th century.
Johann Bernoulli (1667-1748) was a Swiss mathematician who was a student of Leibniz’s calculus, and
sided with Leibniz in the Newton-Leibniz dispute over the credit for developing calculus. Also Bernoulli sided
with Descartes’ vortex theory of gravitation which delayed acceptance of Newton’s theory of gravitation
in Europe. Bernoulli pioneered development of the calculus of variations by solving the problems of the
catenary, the brachistochrone, and Fermat’s principle. The Bernoulli family is famous for its contributions
to mathematics and science; Johann’s son Daniel played a significant role in the development of the well-
known Bernoulli Principle in hydrodynamics.
Pierre Louis Maupertuis (1698-1759) was a student of Johann Bernoulli and conceived the universal
hypothesis that in nature there is a certain quantity called action which is minimized. Although this bold
assumption correctly anticipates the development of the variational approach to classical mechanics, he
obtained his hypothesis by an entirely incorrect method. He was a dilettante whose mathematical prowess
was far behind the high standards of that time, and he could not establish satisfactorily the quantity to be
minimized. His teleological¹ argument was influenced by Fermat’s principle and the corpuscle theory of light
that implied a close connection between optics and mechanics.
Leonhard Euler (1707-1783) was the preeminent Swiss mathematician of the 18th century and was
a student of Johann Bernoulli. Euler developed, with full mathematical rigor, the calculus of variations
following in the footsteps of Johann Bernoulli. Euler used variational calculus to solve minimum/maximum
isoperimetric problems which had attracted and challenged the early developers of calculus, Newton, Leibniz,
and Bernoulli. Euler also was the first to solve the rigid-body rotation problem using the three components
of the angular velocity as kinematical variables. Euler became blind in both eyes by 1766 but that did not
hinder his prolific output in mathematics due to his remarkable memory and mental capabilities. Euler’s
contributions to mathematics are remarkable in quality and quantity; for example during 1775 he published
one mathematical paper per week in spite of being blind. Euler implicitly used the principle of least
action, formulated using vis viva, which is not the exact form explicitly developed by Lagrange.
¹ Teleology is any philosophical account that holds that final causes exist in nature, meaning that – analogous to purposes
Jean le Rond d’Alembert (1717-1785) was a French mathematician and physicist who had the
clever idea of extending use of the principle of virtual work from statics to dynamics. D’Alembert’s Principle
rewrites the principle of virtual work in the form
$$\sum_{i=1}^{N} (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0$$
where the inertial reaction force $\dot{\mathbf{p}}_i$ is subtracted from the corresponding force $\mathbf{F}_i$. This extension of the
principle of virtual work applies equally to both statics and dynamics leading to a single variational principle.
Joseph Louis Lagrange (1736-1813) was an Italian mathematician who was a student of Leonhard
Euler and his work paralleled that of Euler. In 1788 Lagrange published his monumental treatise on analytical
mechanics entitled "Mécanique Analytique" which describes his new, immensely powerful, analytical
technique that can solve any mechanical problem without resort to geometrical considerations. His theory
only required the analytical form of the scalar quantities kinetic and potential energy. In the preface of
his book he refers modestly to his extraordinary achievements with the statement "The reader will find no
figures in the work. The methods which I set forth do not require either constructions or geometrical or
mechanical reasonings: but only algebraic operations, subject to a regular and uniform rule of procedure."
Lagrange also introduced the concept of undetermined multipliers to handle auxiliary conditions, which plays
a vital part in theoretical mechanics. William Hamilton, an outstanding figure in the analytical formulation
of classical mechanics, called Lagrange the "Shakespeare of mathematics," on account of the extraordinary
beauty, elegance, and depth of the Lagrangian methods. Lagrange also pioneered numerous significant
contributions to mathematics. For example, Euler, Lagrange, and d’Alembert developed much of the mathematics
of partial differential equations. Lagrange survived the French Revolution and, in spite of being a
foreigner, Napoleon named Lagrange to the Legion of Honour and made him a Count of the Empire in 1808.
Lagrange was honoured by being buried in the Pantheon.
Jean Baptiste Joseph Fourier (1768-1830) was a French mathematician and physicist who was a
student of Lagrange. Fourier is most famous for the development of Fourier analysis which includes Fourier
series, and Fourier transforms. His work has many applications to classical mechanics such as all forms of
wave motion, signal processing, and solving for the eigenfunctions of linear equations.
the d’Alembert principle to give the first exact formulation of the principle of least action which underlies the
variational principles used in analytical mechanics. The form derived by Euler and Lagrange employed the
principle in a way that applies only for conservative (scleronomic) cases. A significant discovery of Hamilton
is his realization that classical mechanics and geometrical optics can be handled from one unified viewpoint.
In both cases he uses a "characteristic" function that has the property that, by mere differentiation, the
path of the body, or light ray, can be determined by the same partial differential equations. This solution is
equivalent to the solution of the equations of motion.
Carl Gustav Jacob Jacobi (1804-1851), a Prussian mathematician and contemporary of Hamilton,
significantly developed Hamiltonian mechanics. He was one of the few who immediately recognized the
extraordinary importance of the Hamiltonian formulation of mechanics. Jacobi developed canonical transformation
theory and showed that the function, used by Hamilton, is only one special case of functions that
generate suitable canonical transformations. He proved that any complete solution of the partial differential
equation, without the specific boundary conditions applied by Hamilton, is sufficient for the complete
integration of the equations of motion. This greatly extends the usefulness of Hamilton’s partial differential
equations. In 1843 Jacobi developed both the Poisson brackets and the Hamilton-Jacobi formulations of
Hamiltonian mechanics. The latter gives a single, first-order partial differential equation for the action function
in terms of the generalized coordinates, which greatly simplifies solution of the equations of motion.
He also derived a principle of least action for time-independent cases which had been studied by Euler and
Lagrange. Jacobi developed a superior approach to the variational integral that, by eliminating time from
the integral, determined the path without saying anything about how the motion occurs in time.
James Clerk Maxwell (1831-1879) was a Scottish theoretical physicist and mathematician. His
most prominent achievement was formulating a classical electromagnetic theory that united all previously
unrelated observations, experiments and equations of electricity, magnetism and optics into one consistent
theory. Maxwell’s equations demonstrated that electricity, magnetism and light are all manifestations of the
same phenomenon, namely the electromagnetic field. Consequently, all other classic laws and equations of
electromagnetism were simplified cases of Maxwell’s equations. Maxwell’s achievements concerning electromagnetism
have been called the "second great unification in physics". Maxwell demonstrated that electric
and magnetic fields travel through space in the form of waves, and at the constant speed of light. In 1864
Maxwell wrote "A Dynamical Theory of the Electromagnetic Field" which proposed that light was in fact
undulations in the same medium that is the cause of electric and magnetic phenomena. His work in producing
a unified model of electromagnetism is one of the greatest advances in physics. Maxwell, in collaboration
with Ludwig Boltzmann (1844-1906), also helped develop the Maxwell-Boltzmann distribution, which is
a statistical means of describing aspects of the kinetic theory of gases. These two discoveries helped usher in
the era of modern physics, laying the foundation for such fields as special relativity and quantum mechanics.
Boltzmann founded the field of statistical mechanics and was an early staunch advocate of the existence of
atoms and molecules.
Henri Poincaré (1854-1912) was a French theoretical physicist and mathematician. He was the first to
present the Lorentz transformations in their modern symmetric form and discovered the remaining relativistic
velocity transformations. Although there is similarity to Einstein’s Special Theory of Relativity, Poincaré and
Lorentz still believed in the concept of the ether and did not fully comprehend the revolutionary philosophical
change implied by Einstein. Poincaré worked on the solution of the three-body problem in planetary motion
and was the first to discover a chaotic deterministic system which laid the foundations of modern chaos
theory. This discovery undermined the long-held deterministic view that if the positions and velocities of all the particles are
known at one time, then it is possible to predict the future for all time.
The last two decades of the 19th century saw the culmination of classical physics and several important
discoveries that led to a revolution in science that toppled classical physics from its throne. The end of the
19th century was a time during which tremendous technological progress occurred; flight, the automobile,
and turbine-powered ships were developed, Niagara Falls was harnessed for power, etc. During this period,
Heinrich Hertz (1857-1894) produced electromagnetic waves confirming their derivation using Maxwell’s
equations as well as simultaneously discovering the photoelectric effect. Technical developments, such as
photography, the induction spark coil, and the vacuum pump played a significant role in scientific discoveries
made during the 1890’s. At the end of the 19th century, scientists thought that the basic laws were understood
and worried that future physics would be in the fifth decimal place; some scientists worried that little was
left for them to discover. However, there remained a few, presumed minor, unexplained discrepancies plus
new discoveries that led to the revolution in science that occurred at the beginning of the 20th century.
1.7. THE 20 CENTURY REVOLUTION IN PHYSICS 7
Advances in classical mechanics continue to be made. For example, during the past four decades there
have been tremendous advances in the understanding of the evolution of chaos in non-linear systems. This
is due to the availability of computers which has reopened this interesting branch of classical mechanics that
was pioneered by Henri Poincaré. Although classical mechanics is the most mature branch of physics that
has been studied for over 5000 years, there still are new research opportunities in this field of physics.
References:
Excellent sources of information regarding the history of major players in the field of classical mechanics
can be found on Wikipedia and the book "Variational Principle of Mechanics" by Lanczos.[La49]
Chapter 2
2.1 Introduction
It is assumed that the reader has been introduced to Newtonian mechanics applied to one or two point objects.
This chapter reviews Newtonian mechanics for motion of many-body systems as well as for macroscopic
sized bodies. Newton’s Law of Gravitation also is reviewed. The purpose of this review is to ensure that the
reader has a solid foundation of elementary Newtonian mechanics upon which to build the powerful analytic
Lagrangian and Hamiltonian approaches to classical dynamics.
Newtonian mechanics is based on application of Newton’s Laws of motion which assume that the concepts
of distance, time, and mass, are absolute, that is, motion is in an inertial frame. The Newtonian idea of
the complete separation of space and time, and the concept of the absoluteness of time, are violated by the
Theory of Relativity as discussed in chapter 16. However, for most practical applications, relativistic effects
are negligible and Newtonian mechanics is an adequate description at low velocities. Therefore chapters
3 to 15 will assume velocities for which Newton’s laws of motion are applicable.
10 CHAPTER 2. REVIEW OF NEWTONIAN MECHANICS
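If the forces acting on two bodies are their mutual action and reaction, then equation 2.4 simplifies to
$$\mathbf{F}_{12} + \mathbf{F}_{21} = \frac{d\mathbf{p}_1}{dt} + \frac{d\mathbf{p}_2}{dt} = \frac{d}{dt}(\mathbf{p}_1 + \mathbf{p}_2) = 0 \tag{2.5}$$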
This implies that the total linear momentum (P = p1 + p2 ) is a constant of motion.
Combining equations 2.1 and 2.2 leads to a second-order differential equation
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\frac{d^2\mathbf{r}}{dt^2} = m\ddot{\mathbf{r}} \tag{2.6}$$
Note that the force on a body $\mathbf{F}$, and the resultant acceleration $\mathbf{a} = \ddot{\mathbf{r}}$ are collinear. Appendix 2 gives
explicit expressions for the acceleration $\mathbf{a}$ in cartesian and curvilinear coordinate systems. The definition of
force depends on the definition of the mass $m$. Newton’s laws of motion are obeyed to a high precision for
velocities much less than the velocity of light. For example, recent experiments have shown they are obeyed
with an error in the acceleration of $\Delta a \le 5 \times 10^{-14}$ m/s$^2$.
If the work done on the particle is positive, then the final kinetic energy $T_2 > T_1$. Especially noteworthy is that
the kinetic energy is a scalar quantity which makes it simple to use. This first-order spatial integral is the
foundation of the analytic formulation of mechanics that underlies Lagrangian and Hamiltonian mechanics.
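The average location of the system corresponds to the location of the center of mass since $\frac{1}{M}\sum_i m_i \mathbf{r}'_i = 0$,
that is
$$\frac{1}{M}\sum_i m_i \mathbf{r}_i = \mathbf{R} + \frac{1}{M}\sum_i m_i \mathbf{r}'_i = \mathbf{R} \tag{2.23}$$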
The vector $\mathbf{R}$, which describes the location of the center of mass, depends on the origin and coordinate
system chosen. For a continuous mass distribution the location vector of the center of mass is given by
$$\mathbf{R} = \frac{1}{M}\sum_i m_i \mathbf{r}_i = \frac{1}{M}\int \mathbf{r}\, dm \tag{2.24}$$
The center of mass can be evaluated by calculating the individual components along three orthogonal axes.
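The definition above can be checked numerically. The following is a minimal Python sketch, assuming a small set of point masses whose values are invented purely for illustration; the helper name `center_of_mass` is hypothetical. It evaluates $\mathbf{R}$ component by component and confirms that the mass-weighted sum of the positions measured from $\mathbf{R}$ vanishes to rounding error.

```python
def center_of_mass(masses, positions):
    """Return R = (1/M) * sum(m_i * r_i) for point masses in 3D."""
    total_mass = sum(masses)
    return tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
        for k in range(3)
    )

# Illustrative values (assumed): three point masses in kg, positions in m.
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

R = center_of_mass(masses, positions)

# Positions relative to the center of mass, r'_i = r_i - R.
relative = [tuple(r[k] - R[k] for k in range(3)) for r in positions]

# Mass-weighted sum of the relative positions, one value per axis.
residual = tuple(
    sum(m * rp[k] for m, rp in zip(masses, relative)) for k in range(3)
)
```

Each component is computed independently, which mirrors the statement that the center of mass can be evaluated along three orthogonal axes.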
The center-of-mass frame of reference is defined as the frame for which the center of mass is stationary.
This frame of reference is especially valuable for elucidating the underlying physics which involves only the
relative motion of the many bodies. That is, the trivial translational motion of the center of mass frame,
which has no influence on the relative motion of the bodies, is factored out and can be ignored. For example,
a tennis ball ($0.06$ kg) approaching the earth ($6 \times 10^{24}$ kg) with velocity $v$ could be treated in three frames,
(a) assume the earth is stationary, (b) assume the tennis ball is stationary, or (c) the center-of-mass frame.
The latter frame ignores the center of mass motion which has no influence on the relative motion of the
tennis ball and the earth. The center of linear momentum and center of mass coordinate frames are identical
in Newtonian mechanics but not in relativistic mechanics as described in chapter 16.4.3.
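It is convenient to describe a many-body system by a position vector $\mathbf{r}'_i$ with respect to the center of mass.
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.26}$$
That is,
$$\mathbf{P} = \sum_i \mathbf{p}_i = \frac{d}{dt}\sum_i m_i \mathbf{r}_i = \frac{d}{dt}M\mathbf{R} + \frac{d}{dt}\sum_i m_i \mathbf{r}'_i = \frac{d}{dt}M\mathbf{R} + 0 = M\dot{\mathbf{R}} \tag{2.27}$$
since $\sum_i m_i \mathbf{r}'_i = 0$ as given by the definition of the center of mass. That is,
$$\mathbf{P} = M\dot{\mathbf{R}} \tag{2.28}$$
Thus the total linear momentum for a system is the same as the momentum of a single particle of mass
$M = \sum_i m_i$ located at the center of mass of the system.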
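The origin of the external force is from outside of the system while the internal force is due to the mutual
interaction between the particles in the system. Newton’s Law tells us that
$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \neq i} \mathbf{f}_{ij} \tag{2.30}$$
Substituting Newton’s third law $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ into equation 2.32 implies that
$$\sum_i \sum_{j \neq i} \mathbf{f}_{ij} = \sum_j \sum_{i \neq j} \mathbf{f}_{ji} = -\sum_i \sum_{j \neq i} \mathbf{f}_{ij} = 0 \tag{2.33}$$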
2.8. TOTAL LINEAR MOMENTUM OF A MANY-BODY SYSTEM 15
which is satisfied only for the case where the summations equal zero. That is, for every internal force, there
is an equal and opposite reaction force that cancels that internal force.
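The pairwise cancellation of internal forces can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the internal forces are taken to be Newtonian gravitational attractions between point masses, the masses and positions are invented, and `pair_force` is a hypothetical helper. It sums $\mathbf{f}_{ij}$ over all ordered pairs with $j \neq i$ and leaves the result in `net_internal`.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2

def pair_force(m_i, r_i, m_j, r_j):
    """Gravitational force on particle i due to particle j (a central force)."""
    d = [r_j[k] - r_i[k] for k in range(3)]
    dist = math.sqrt(sum(c * c for c in d))
    scale = G * m_i * m_j / dist**3
    return [scale * c for c in d]

# Illustrative (assumed) masses in kg and positions in m.
masses = [5.0, 3.0, 8.0, 1.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (-2.0, 1.0, 3.0), (0.5, -1.5, 2.0)]

n = len(masses)
net_internal = [0.0, 0.0, 0.0]
for i in range(n):
    for j in range(n):
        if j != i:
            f_ij = pair_force(masses[i], positions[i], masses[j], positions[j])
            for k in range(3):
                net_internal[k] += f_ij[k]

# Magnitude of one pair force, used to judge the size of the rounding residual.
f_01 = pair_force(masses[0], positions[0], masses[1], positions[1])
reference = math.sqrt(sum(c * c for c in f_01))
```

Any residual in `net_internal` is floating-point rounding and should be negligible compared with `reference`.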
Therefore the first-order integral for linear momentum can be written in differential and integral forms
as
$$\dot{\mathbf{P}} = \sum_i \mathbf{F}_i^E \qquad \int_1^2 \sum_i \mathbf{F}_i^E \, dt = \mathbf{P}_2 - \mathbf{P}_1 \tag{2.34}$$
The reaction of a body to an external force is equivalent to a single particle of mass $M$ located at the center
of mass assuming that the internal forces cancel due to Newton’s third law.
Note that the total linear momentum $\mathbf{P}$ is conserved if the net external force $\mathbf{F}^E$ is zero, that is
$$\mathbf{F}^E = \frac{d\mathbf{P}}{dt} = 0 \tag{2.35}$$
Therefore the momentum $\mathbf{P}$ of the center of mass is a constant. Moreover, if the component of the force along any
direction $\hat{\mathbf{e}}$ is zero, that is,
$$\mathbf{F}^E \cdot \hat{\mathbf{e}} = \frac{d\mathbf{P} \cdot \hat{\mathbf{e}}}{dt} = 0 \tag{2.36}$$
then $\mathbf{P} \cdot \hat{\mathbf{e}}$ is a constant. This fact is used frequently to solve problems involving motion in a constant force
field. For example, in the earth’s gravitational field, the momentum of an object moving in vacuum in the
vertical direction is time dependent because of the gravitational force, whereas the horizontal component of
momentum is constant if no forces act in the horizontal direction.
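As a small illustration of this component-wise conservation, the following Python sketch steps the momentum of a projectile in a uniform gravitational field. The mass, initial momentum, and step size are assumed values chosen for illustration, and `step` is a hypothetical helper implementing a simple Euler update of $\dot{\mathbf{p}} = \mathbf{F}$.

```python
def step(p, force, dt):
    """Advance the momentum by one Euler step of dp/dt = F."""
    return tuple(p_k + f_k * dt for p_k, f_k in zip(p, force))

m = 0.5                       # mass in kg (assumed)
g = 9.81                      # gravitational acceleration in m/s^2
force = (0.0, -m * g)         # uniform field: no horizontal component
p = (m * 4.0, m * 3.0)        # initial momentum (p_x, p_y) in kg m/s

p_start = p
dt, steps = 0.001, 1000       # integrate for 1 s in total
for _ in range(steps):
    p = step(p, force, dt)
```

Because the horizontal force component is exactly zero, `p[0]` never changes, while `p[1]` decreases by $mg\,\Delta t$ over the elapsed time.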
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.37}$$
The total angular momentum separates into two terms, the angular momentum about the center of mass,
plus the angular momentum of the center of mass about the origin of the axis system. This factoring of the
angular momentum only applies for the center of mass. This is called Samuel König’s first theorem.
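The separation of the total angular momentum into a center-of-mass term and a term about the center of mass can be verified for a small system. The sketch below is illustrative only: the masses, positions, and velocities are invented, and the helper names are hypothetical. It compares $\sum_i \mathbf{r}_i \times \mathbf{p}_i$ with $\mathbf{R} \times M\mathbf{V}$ plus the angular momentum computed from coordinates relative to the center of mass.

```python
def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def weighted_mean(weights, vectors):
    """Mass-weighted mean of a list of 3-vectors."""
    total = sum(weights)
    return tuple(sum(w * v[k] for w, v in zip(weights, vectors)) / total
                 for k in range(3))

def total_angular_momentum(masses, positions, velocities):
    """L = sum_i r_i x (m_i v_i)."""
    terms = [cross(r, tuple(m * c for c in v))
             for m, r, v in zip(masses, positions, velocities)]
    return tuple(sum(t[k] for t in terms) for k in range(3))

# Illustrative (assumed) values in SI units.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 2.0)]
velocities = [(0.0, 1.0, 0.0), (1.0, 0.0, -1.0), (0.5, -0.5, 1.0)]

M = sum(masses)
R = weighted_mean(masses, positions)     # center-of-mass position
V = weighted_mean(masses, velocities)    # center-of-mass velocity

L_total = total_angular_momentum(masses, positions, velocities)
L_of_cm = cross(R, tuple(M * c for c in V))
rel_pos = [tuple(r[k] - R[k] for k in range(3)) for r in positions]
rel_vel = [tuple(v[k] - V[k] for k in range(3)) for v in velocities]
L_about_cm = total_angular_momentum(masses, rel_pos, rel_vel)
```

The two cross terms that would otherwise appear vanish because the mass-weighted sums of `rel_pos` and `rel_vel` are zero, which is why the decomposition holds only for the center of mass.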
The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other $n - 1$ particles in the system. Newton’s Law tells us that
$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \neq i} \mathbf{f}_{ij} \tag{2.44}$$
2.9. ANGULAR MOMENTUM OF A MANY-BODY SYSTEM 17
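Note that $(\mathbf{r}_i - \mathbf{r}_j)$ is the vector $\mathbf{r}_{ij}$ connecting $j$ to $i$. For central forces the force vector $\mathbf{f}_{ij} = f_{ij}\hat{\mathbf{r}}_{ij}$,
thus
$$\sum_i \sum_{j<i} (\mathbf{r}_i - \mathbf{r}_j) \times \mathbf{f}_{ij} = \sum_i \sum_{j<i} \mathbf{r}_{ij} \times f_{ij}\hat{\mathbf{r}}_{ij} = 0 \tag{2.47}$$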
That is, for central internal forces the total internal torque on a system of particles is zero, and the rate of
change of total angular momentum for central internal forces becomes
$$\dot{\mathbf{L}} = \sum_i \mathbf{r}_i \times \mathbf{F}_i^E = \sum_i \mathbf{N}_i^E = \mathbf{N}^E \tag{2.48}$$
where $\mathbf{N}^E$ is the net external torque acting on the system. Equation 2.48 leads to the differential and integral
forms of the first integral relating the total angular momentum to total external torque.
$$\dot{\mathbf{L}} = \mathbf{N}^E \qquad \int_1^2 \mathbf{N}^E \, dt = \mathbf{L}_2 - \mathbf{L}_1 \tag{2.49}$$
Angular momentum conservation occurs in many problems involving zero external torques $\mathbf{N}^E = 0$ plus
two-body central forces $\mathbf{F} = f(r)\hat{\mathbf{r}}$ since the torque on the particle about the center of the force is zero.
Examples are the central gravitational force for stellar or planetary systems in astrophysics, and the central
electrostatic force manifest for motion of electrons in the atom. In addition, the component of angular
momentum about any axis $\mathbf{L} \cdot \hat{\mathbf{e}}$ is conserved if the net external torque about that axis $\mathbf{N} \cdot \hat{\mathbf{e}} = 0$.
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.51}$$
The location of the center of mass is uniquely defined as being at the location where $\int \mathbf{r}'\, dm = 0$. The
velocity of the particle can be expressed in terms of the velocity of the center of mass $\dot{\mathbf{R}}$ plus the velocity
of the particle with respect to the center of mass $\dot{\mathbf{r}}'_i$. That is,
For the special case of the center of mass, the middle term is zero since, by definition of the center of mass,
$\sum_i m_i \dot{\mathbf{r}}'_i = 0$. Therefore
$$T = \sum_i \frac{1}{2} m_i v_i'^2 + \frac{1}{2} M V^2 \tag{2.54}$$
Thus the total kinetic energy of the system is equal to the sum of the kinetic energy of a mass $M$ moving
with the center of mass velocity $V$ plus the kinetic energy of motion of the individual particles relative to the
center of mass. This is called Samuel König’s second theorem.
Note that for a fixed center-of-mass energy, the total kinetic energy $T$ has a minimum value of $\sum_i \frac{1}{2} m_i v_i'^2$
when the velocity of the center of mass $V = 0$. For a given internal excitation energy, the minimum energy
required to accelerate colliding bodies occurs when the colliding bodies have identical, but opposite, linear
momenta. That is, when the center-of-mass velocity $V = 0$.
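The kinetic-energy decomposition can be checked with a few lines of Python. This is a sketch under assumed, illustrative masses and velocities, with hypothetical helper names; it compares the total kinetic energy with the sum of the center-of-mass term $\frac{1}{2}MV^2$ and the kinetic energy of motion relative to the center of mass.

```python
def weighted_mean(weights, vectors):
    """Mass-weighted mean of a list of 3-vectors."""
    total = sum(weights)
    return tuple(sum(w * v[k] for w, v in zip(weights, vectors)) / total
                 for k in range(3))

def kinetic_energy(masses, velocities):
    """T = sum_i (1/2) m_i v_i^2."""
    return sum(0.5 * m * sum(c * c for c in v)
               for m, v in zip(masses, velocities))

# Illustrative (assumed) values: masses in kg, velocities in m/s.
masses = [1.0, 2.0, 3.0]
velocities = [(0.0, 1.0, 0.0), (1.0, 0.0, -1.0), (0.5, -0.5, 1.0)]

M = sum(masses)
V = weighted_mean(masses, velocities)            # center-of-mass velocity
rel_vel = [tuple(v[k] - V[k] for k in range(3)) for v in velocities]

T_total = kinetic_energy(masses, velocities)
T_cm = 0.5 * M * sum(c * c for c in V)           # (1/2) M V^2
T_rel = kinetic_energy(masses, rel_vel)          # sum (1/2) m_i v'_i^2
```

Since `T_cm` cannot be negative, `T_rel` is the smallest value the total kinetic energy can take for the given relative motion, reached when the center of mass is at rest.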
Applying Stokes theorem for a path-independent force leads to the alternate statement that the curl is zero.
See appendix 33.
$$\nabla \times \mathbf{F} = 0 \tag{2.56}$$
Note that the vector product of two del operators $\nabla$ acting on a scalar field $U$ equals
$$\nabla \times \nabla U = 0 \tag{2.57}$$
Thus it is possible to express a path-independent force field as the gradient of a scalar field, $U$, that is
$$\mathbf{F} = -\nabla U \tag{2.58}$$
2.10. WORK AND KINETIC ENERGY FOR A MANY-BODY SYSTEM 19
$$E = T + U \tag{2.60}$$
Note that the potential energy $U$ is defined only to within an additive constant since the force $\mathbf{F} = -\nabla U$
depends only on difference in potential energy. Similarly, the kinetic energy is not absolute since any inertial
frame of reference can be used to describe the motion and the velocity of a particle depends on the relative
velocities of inertial frames. Thus the total mechanical energy $E = T + U$ is not absolute.
If a single particle is subject to several path-independent forces, such as gravity, linear restoring forces,
etc., then a potential energy $U_i$ can be ascribed to each of the $n$ forces where for each force $\mathbf{F}_i = -\nabla U_i$. In
contrast to the forces, which add vectorially, these scalar potential energies are additive, $U = \sum_i^n U_i$. Thus
the total mechanical energy for $n$ potential energies equals
$$E = T + U(\mathbf{r}) = T + \sum_i^n U_i(\mathbf{r}) \tag{2.61}$$
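The time derivative of the total mechanical energy is given using equations 2.63 and 2.64 in equation 2.62
$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} = \mathbf{F} \cdot \frac{d\mathbf{r}}{dt} + (\nabla U) \cdot \frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} = [\mathbf{F} + (\nabla U)] \cdot \frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} \tag{2.65}$$
Note that if the field is path independent, that is $\nabla \times \mathbf{F} = 0$, then the force and potential are related by
$$\mathbf{F} = -\nabla U \tag{2.66}$$
Therefore, for path independent forces, the first term in the time derivative of the total energy in equation
2.65 is zero. That is,
$$\frac{dE}{dt} = \frac{\partial U}{\partial t} \tag{2.67}$$
In addition, when the potential energy $U$ is not an explicit function of time, then $\frac{\partial U}{\partial t} = 0$ and thus the total
energy is conserved. That is, for the combination of (a) path independence plus (b) time independence, then
the total energy of a conservative field is conserved.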
Note that there are cases where the concept of potential still is useful even when it is time dependent.
That is, if path independence applies, i.e. $\mathbf{F} = -\nabla U$ at any instant. For example, a Coulomb field problem
where charges are slowly changing due to leakage etc., or during a peripheral collision between two charged
bodies such as nuclei.
The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other $n - 1$ particles in the system. Newton’s Law tells us that
$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \neq i} \mathbf{f}_{ij} \tag{2.70}$$
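The work done on the system by a force moving from configuration $1 \to 2$ is given by
$$W_{1 \to 2} = \sum_i \int_1^2 \mathbf{F}_i^E \cdot d\mathbf{r}_i + \sum_i \sum_{j \neq i} \int_1^2 \mathbf{f}_{ij} \cdot d\mathbf{r}_i \tag{2.71}$$
Equating the two equivalent equations for $W_{1 \to 2}$, that is 2.68 and 2.75, gives that
$$W_{1 \to 2} = T_2 - T_1 = U^E(1) - U^E(2) + U^{int}(1) - U^{int}(2) \tag{2.78}$$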
This shows that, for conservative forces, the total energy is conserved and is given by
The three first-order integrals for linear momentum, angular momentum, and energy provide powerful
approaches for solving the motion of Newtonian systems due to the applicability of conservation laws for the
corresponding linear and angular momentum plus energy conservation for conservative forces. In addition,
the important concept of center-of-mass motion naturally separates out for these three first-order integrals.
Although these conservation laws were derived assuming Newton’s Laws of motion, these conservation laws
are more generally applicable, and these conservation laws surpass the range of validity of Newton’s Laws of
motion. For example, in 1930 Pauli and Fermi postulated the existence of the neutrino in order to account for
non-conservation of energy and momentum in $\beta$-decay because they did not wish to relinquish the concepts
of energy and momentum conservation. The neutrino was first detected in 1956 confirming the correctness
of this hypothesis.
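$$\frac{dG}{dt} = \sum_i \mathbf{p}_i \cdot \dot{\mathbf{r}}_i + \sum_i \dot{\mathbf{p}}_i \cdot \mathbf{r}_i \tag{2.81}$$
However,
$$\sum_i \mathbf{p}_i \cdot \dot{\mathbf{r}}_i = \sum_i m_i \dot{\mathbf{r}}_i \cdot \dot{\mathbf{r}}_i = \sum_i m_i v_i^2 = 2T \tag{2.82}$$
Thus
$$\frac{dG}{dt} = 2T + \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \tag{2.84}$$
where the $\langle\ \rangle$ brackets refer to the time average. Note that if the motion is periodic and the chosen time $\tau$
equals a multiple of the period, then $\frac{G(\tau) - G(0)}{\tau} = 0$. Even if the motion is not periodic, if the constraints and
velocities of all the particles remain finite, then there is an upper bound to $G$. This implies that choosing
$\tau \to \infty$ means that $\frac{G(\tau) - G(0)}{\tau} \to 0$. In both cases the left-hand side of the equation tends to zero giving the
virial theorem
$$\langle T \rangle = -\frac{1}{2} \left\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \right\rangle \tag{2.86}$$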
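The right-hand side of this equation is called the virial of the system. For a single particle subject to a
conservative central force $\mathbf{F} = -\nabla U$ the Virial theorem equals
$$\langle T \rangle = \frac{1}{2} \langle \nabla U \cdot \mathbf{r} \rangle = \frac{1}{2} \left\langle r \frac{\partial U}{\partial r} \right\rangle \tag{2.87}$$
If the potential is of the form $U = kr^{n+1}$, that is, $F = -k(n+1)r^n$, then $r\frac{\partial U}{\partial r} = (n+1)U$. Thus for a single
particle in a central potential $U = kr^{n+1}$ the Virial theorem reduces to
$$\langle T \rangle = \frac{n+1}{2} \langle U \rangle \tag{2.88}$$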
The following two special cases are of considerable importance in physics.
Hooke’s Law: Note that for a linear restoring force $n = 1$ then
$$\langle T \rangle = +\langle U \rangle \qquad (n = 1)$$
You may be familiar with this fact for simple harmonic motion where the average kinetic and potential
energies are the same and both equal half of the total energy.
2.11. VIRIAL THEOREM 23
Inverse-square law: The other interesting case is for the inverse square law $n = -2$ where
$$\langle T \rangle = -\frac{1}{2} \langle U \rangle \qquad (n = -2)$$
The Virial theorem is useful for solving problems in that knowing the exponent $n$ of the field makes it
possible to write down directly the average total energy in the field. For example, for $n = -2$
$$\langle E \rangle = \langle T \rangle + \langle U \rangle = -\frac{1}{2} \langle U \rangle + \langle U \rangle = \frac{1}{2} \langle U \rangle \tag{2.89}$$
2 2
This occurs for the Bohr model of the hydrogen atom where the kinetic energy of the bound electron is half
of the magnitude of the potential energy. The same result occurs for planetary motion in the solar system.
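Both special cases can be illustrated with discrete time averages. The following Python sketch is only an illustration under stated assumptions: all parameter values are arbitrary, the oscillator is sampled over exactly one period of $x = A\cos\omega_0 t$, and the bound inverse-square orbit uses two standard results that are not derived in this text, namely the vis-viva relation and Kepler’s equation, which gives $dt \propto (1 - e\cos\psi)\,d\psi$ in terms of the eccentric anomaly $\psi$.

```python
import math

def time_average(values, weights):
    """Weighted mean, used here as a discrete time average."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

N = 1000  # number of samples per period

# --- Hooke's law (n = 1): x(t) = A cos(w0 t), sampled uniformly in time ---
m, k, A = 2.0, 8.0, 0.3                  # assumed illustrative values (SI units)
w0 = math.sqrt(k / m)
times = [2 * math.pi / w0 * i / N for i in range(N)]    # one full period
T_sho = [0.5 * m * (A * w0 * math.sin(w0 * t)) ** 2 for t in times]
U_sho = [0.5 * k * (A * math.cos(w0 * t)) ** 2 for t in times]
uniform = [1.0] * N
avg_T_sho = time_average(T_sho, uniform)
avg_U_sho = time_average(U_sho, uniform)

# --- Inverse-square law (n = -2): bound Kepler orbit with U = -kappa / r ---
kappa, a, ecc = 1.0, 1.0, 0.6            # force constant, semi-major axis, eccentricity
psis = [2 * math.pi * i / N for i in range(N)]          # eccentric anomaly samples
weights = [1 - ecc * math.cos(psi) for psi in psis]     # dt is proportional to this
radii = [a * w for w in weights]                        # r = a (1 - ecc cos psi)
U_kep = [-kappa / r for r in radii]
# Kinetic energy from the vis-viva relation: T = kappa (1/r - 1/(2a)).
T_kep = [kappa * (1 / r - 1 / (2 * a)) for r in radii]
avg_T_kep = time_average(T_kep, weights)
avg_U_kep = time_average(U_kep, weights)
```

With these averages, `avg_T_sho` should agree with `avg_U_sho`, and `avg_T_kep` should agree with `-0.5 * avg_U_kep`, to within rounding error.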
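$$\langle U \rangle \approx -\frac{GM^2}{R} \tag{$\alpha$}$$
where $R$ is the radius of a cluster of total mass $M$. The average kinetic energy per galaxy is $\frac{1}{2}m\langle v \rangle^2$ where $m$ is the mass of a galaxy and $\langle v \rangle^2$ is the average
square of the galaxy velocities with respect to the center of mass of the cluster. Thus the total kinetic energy
of the cluster of $N$ galaxies is
$$\langle T \rangle \approx \frac{Nm\langle v \rangle^2}{2} = \frac{M\langle v \rangle^2}{2} \tag{$\beta$}$$
The Virial theorem tells us that a central force having a radial dependence of the form $F \propto r^n$ gives $\langle T \rangle = \frac{n+1}{2}\langle U \rangle$.
For the inverse-square gravitational force then
$$\langle T \rangle = -\frac{1}{2}\langle U \rangle \tag{$\gamma$}$$
Thus equations $\alpha$, $\beta$, and $\gamma$ give an estimate of the total mass of the cluster to be
$$M \approx \frac{R\langle v \rangle^2}{G}$$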
This estimate is larger than the value estimated from the luminosity of the cluster, implying that a large amount of "dark matter" must exist in galaxies; the nature of this dark matter remains an open question in physics.
$$\mathbf{F} = \mathbf{F}_g + \mathbf{N} + \mathbf{f}_f = m\mathbf{a} \tag{2.91}$$
$$N = mg\cos\theta \tag{2.93}$$
Similarly, taking components along the inclined plane in the $x$ direction
$$mg\sin\theta - f_f = m\frac{d^2x}{dt^2} \tag{2.94}$$
Using the concept of coefficient of friction $\mu$
$$f_f = \mu N \tag{2.95}$$
Thus the equation of motion can be written as
$$mg(\sin\theta - \mu\cos\theta) = m\frac{d^2x}{dt^2} \tag{2.96}$$
The block accelerates if $\sin\theta > \mu\cos\theta$, that is, $\tan\theta > \mu$. The acceleration is constant if $\mu$ and $\theta$ are constant, that is
$$\frac{d^2x}{dt^2} = g(\sin\theta - \mu\cos\theta) \tag{2.97}$$
Figure 2.3: Block on an inclined plane
Remember that if the block is stationary, the friction coefficient balances such that $(\sin\theta - \mu\cos\theta) = 0$, that is, $\tan\theta = \mu$. However, there is a maximum static friction coefficient $\mu_S$ beyond which the block starts sliding. The kinetic coefficient of friction $\mu_K$ is applicable for sliding friction and usually $\mu_K < \mu_S$.
Another example of constant force and acceleration is motion of objects free falling in a uniform gravitational field when air drag is neglected. Then one obtains the simple relations such as $v = u + at$, etc.
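As a small illustration of equation 2.97 and the sliding condition $\tan\theta > \mu$, the sketch below evaluates both for a given incline. The function names and the default $g = 9.81$ m/s² are assumptions made for this example only.

```python
import math

def incline_acceleration(theta_deg, mu_kinetic, g=9.81):
    """Acceleration down the incline for a sliding block, g (sin(theta) - mu cos(theta))."""
    theta = math.radians(theta_deg)
    return g * (math.sin(theta) - mu_kinetic * math.cos(theta))

def starts_sliding(theta_deg, mu_static):
    """A stationary block starts to slide once tan(theta) exceeds the static coefficient."""
    return math.tan(math.radians(theta_deg)) > mu_static

# Illustrative values: a 30 degree incline with kinetic friction coefficient 0.2.
print(incline_acceleration(30.0, 0.2))
print(starts_sliding(12.0, 0.15), starts_sliding(6.0, 0.15))
```

The acceleration formula applies only while the block is actually sliding; for a block at rest the static condition decides whether motion begins.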
$$\ddot{x} + \omega_0^2 x = 0 \tag{2.100}$$
which is the equation of the harmonic oscillator. Examples are small oscillations of a mass on a spring, vibrations of a stretched piano string, etc.
The solution of this second order equation is
$$x(t) = A\sin(\omega_0 t - \delta) \tag{2.101}$$
This is the well known sinusoidal behavior of the displacement for the simple harmonic oscillator, with the amplitude $A$ and phase $\delta$ set by the initial conditions. The angular frequency $\omega_0$ is
$$\omega_0 = \sqrt{\frac{k}{m}} \tag{2.102}$$
Note that for this linear system with no dissipative forces, the total energy is a constant of motion as discussed previously. That is, it is a conservative system with a total energy $E$ given by
$$\frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 = E \tag{2.103}$$
The first term is the kinetic energy and the second term is the potential energy. The Virial theorem gives that for the linear restoring force the average kinetic energy equals the average potential energy.
Consider a conservative force in one dimension. Since it was shown that the total energy $E = T + U$ is conserved for a conservative field, then
$$E = T + U = \frac{1}{2}mv^2 + U(x) \tag{2.105}$$
Therefore:
$$v = \frac{dx}{dt} = \pm\sqrt{\frac{2}{m}\left[E - U(x)\right]} \tag{2.106}$$
Integration of this gives
$$t - t_0 = \int_{x_0}^{x}\frac{\pm dx}{\sqrt{\frac{2}{m}\left[E - U(x)\right]}} \tag{2.107}$$
where $x = x_0$ when $t = t_0$. Knowing $U(x)$ it is possible to solve this equation as a function of time.
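The integral in equation 2.107 can be evaluated numerically for a given $U(x)$. The sketch below is one possible approach under stated assumptions: it computes the time to travel from a starting point to a turning point where $E = U$, and it uses the substitution $x = x_{start} + (x_{turn} - x_{start})\sin\varphi$ (a choice made here, not taken from the text) to tame the integrable singularity at the turning point. The function name `travel_time` and the test potential are illustrative.

```python
import math

def travel_time(potential, energy, mass, x_start, x_turn, n=2000):
    """Time to move from x_start to the turning point x_turn using eq. (2.107).

    The substitution x = x_start + span * sin(phi) is an assumption of this
    sketch; it keeps the integrand finite near a simple turning point.
    """
    span = x_turn - x_start
    dphi = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi            # midpoint rule, never evaluates at phi = pi/2
        x = x_start + span * math.sin(phi)
        speed = math.sqrt(2.0 / mass * (energy - potential(x)))
        total += span * math.cos(phi) / speed * dphi
    return total

# Check against the harmonic oscillator: a quarter period is pi / (2 * omega_0).
k, m, amplitude = 4.0, 1.0, 0.5           # illustrative values, omega_0 = 2
quarter_period = travel_time(lambda x: 0.5 * k * x**2,
                             0.5 * k * amplitude**2, m, 0.0, amplitude)
print(quarter_period, math.pi / (2.0 * math.sqrt(k / m)))
```

For potentials where the motion is not bounded by a turning point, a plain quadrature of the original integrand is sufficient and the substitution is unnecessary.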
the exponential term in the potential function can be expanded to give
$$U(x) \approx U_0\left[1 - \left(1 - \frac{(x - x_0)}{\delta}\right)\right]^2 - U_0 \approx \frac{U_0}{\delta^2}(x - x_0)^2 - U_0$$
This gives a restoring force
$$F(x) = -\frac{dU(x)}{dx} = -2\frac{U_0}{\delta^2}(x - x_0)$$
That is, for small amplitudes the restoring force is linear.
Figure: Potential energy function $U(x)/U_0$ versus $x$ for the diatomic molecule.
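where for spherical objects of diameter $D$, $c_1 \approx 1.55\times10^{-4}D$ and $c_2 \approx 0.22D^2$ in MKS units. Fortunately, the equation of motion usually can be integrated when the retarding force has a simple power law dependence. As an example, consider free fall in the Earth’s gravitational field.
$$-mg - c_1 v = m\frac{dv}{dt}$$
Separate the variables and integrate
$$t = \int_{v_0}^{v}\frac{m\,dv}{-mg - c_1 v} = -\frac{m}{c_1}\ln\left(\frac{mg + c_1 v}{mg + c_1 v_0}\right)$$
That is
$$v = -\frac{mg}{c_1} + \left(\frac{mg}{c_1} + v_0\right)e^{-\frac{c_1}{m}t}$$
Note that for $t \gg \frac{m}{c_1}$ the velocity approaches a terminal velocity of $v_\infty = -\frac{mg}{c_1}$. The characteristic time constant is $\tau = \frac{m}{c_1} = \frac{|v_\infty|}{g}$. Note that if $v_0 = 0$ then
$$v = v_\infty\left(1 - e^{-\frac{t}{\tau}}\right)$$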
For the case of small raindrops with $D = 0.5$ mm then $v_\infty = 8$ m/s ($18$ m.p.h.) and time constant $\tau = 0.8$ sec. Note that in the absence of air drag, these rain drops falling from $2000$ m would attain a velocity of over
400 m.p.h. It is fortunate that the drag reduces the speed of rain drops to non-damaging values. Note that
the above relation would predict high velocities for hail. Fortunately, the drag increases quadratically at the
higher velocities attained by large rain drops or hail, and this limits the terminal velocity to moderate values.
As known in the mid-west, these velocities still are sufficient to do considerable crop damage.
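The raindrop numbers quoted above can be reproduced with a few lines of code. This is a rough sketch under stated assumptions: the drop is taken to be a water sphere of density 1000 kg/m³ (an assumption, not stated in the text), the linear drag coefficient is $c_1 = 1.55\times10^{-4}D$ in MKS units, and the helper names are hypothetical.

```python
import math

G_ACCEL = 9.81            # m/s^2, assumed value of g
WATER_DENSITY = 1000.0    # kg/m^3, assumed density of the raindrop

def linear_drag_terminal(diameter):
    """Terminal velocity (negative, i.e. downward) and time constant for linear drag."""
    mass = WATER_DENSITY * math.pi * diameter**3 / 6.0   # mass of a sphere
    c1 = 1.55e-4 * diameter                              # linear drag coefficient
    v_inf = -mass * G_ACCEL / c1
    tau = mass / c1
    return v_inf, tau

def linear_drag_velocity(t, v_inf, tau, v0=0.0):
    """v(t) for linear drag; equals v_inf * (1 - exp(-t / tau)) when v0 = 0."""
    return v_inf + (v0 - v_inf) * math.exp(-t / tau)

v_inf, tau = linear_drag_terminal(0.5e-3)   # D = 0.5 mm raindrop
print(v_inf, tau)
print(linear_drag_velocity(3.0 * tau, v_inf, tau))   # about 95% of terminal velocity
```

With these assumptions the magnitude of the terminal velocity and the time constant come out close to the 8 m/s and 0.8 s quoted in the text.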
Quadratic regime $c_2|v| \gg c_1$
For larger objects at higher velocities, i.e. high Reynolds number, the drag depends on the square of the velocity making it necessary to differentiate between objects rising and falling. The equation of motion is
$$-mg \pm c_2 v^2 = m\frac{dv}{dt}$$
where the positive sign is for falling objects and negative sign for rising objects. Integrating the equation of motion for falling gives
$$t = \int_{v_0}^{v}\frac{m\,dv}{-mg + c_2 v^2} = \tau\left(\tanh^{-1}\frac{v_0}{v_\infty} - \tanh^{-1}\frac{v}{v_\infty}\right)$$
where $\tau = \sqrt{\frac{m}{c_2 g}}$ and $v_\infty = \sqrt{\frac{mg}{c_2}}$. That is, $\tau = \frac{v_\infty}{g}$. For the case of a falling object with $v_0 = 0$, solving for velocity gives
$$v = -v_\infty\tanh\frac{t}{\tau}$$
where the negative sign denotes downward motion. As an example, a $0.6$ kg basket ball with $D = 0.25$ m will have $v_\infty = 20$ m/s ($\approx 43$ m.p.h.) and $\tau = 2.1$ s.
Consider President George H.W. Bush skydiving. Assume his mass is $70$ kg and assume an equivalent spherical shape of the former President to have a diameter of $D = 1$ m. This gives that $v_\infty = 56$ m/s ($\approx 120$ m.p.h.) and $\tau = 5.6$ s. When Bush senior opens his $8$ m diameter parachute his terminal velocity is estimated to decrease to $7$ m/s ($\approx 15$ m.p.h.) which is close to the value for a typical ($\approx 8$ m) diameter emergency
parachute which has a measured terminal velocity of 11 in spite of air leakage through the central vent
needed to provide stability.
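The terminal speeds quoted in these examples follow from $v_\infty = \sqrt{mg/c_2}$ with $c_2 \approx 0.22D^2$ in MKS units. The short sketch below recomputes them; the function names and the value $g = 9.81$ m/s² are assumptions of this illustration.

```python
import math

G_ACCEL = 9.81   # m/s^2, assumed value of g

def quadratic_terminal(mass, diameter, g=G_ACCEL):
    """Terminal speed and time constant for quadratic drag with c2 = 0.22 * D**2."""
    c2 = 0.22 * diameter**2
    v_inf = math.sqrt(mass * g / c2)
    return v_inf, v_inf / g

def falling_speed(t, v_inf, tau):
    """Speed of an object released from rest at t = 0, |v| = v_inf * tanh(t / tau)."""
    return v_inf * math.tanh(t / tau)

v_ball, tau_ball = quadratic_terminal(0.6, 0.25)    # basketball
v_diver, tau_diver = quadratic_terminal(70.0, 1.0)  # skydiver modelled as a 1 m sphere
v_chute, tau_chute = quadratic_terminal(70.0, 8.0)  # same mass under an 8 m parachute
print(v_ball, v_diver, v_chute)
print(falling_speed(2.0 * tau_ball, v_ball, tau_ball))
```

The computed values are close to the rounded figures in the text, and after two time constants the falling speed is already within a few percent of terminal.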
Therefore the rocket is given an equal and opposite increase in momentum
In the time interval the net change in the linear momentum of the rocket plus fuel system is given by
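Since the fuel is burned at a constant rate $\alpha$,
$$\frac{dm}{dt} = -\alpha \tag{2.117}$$
then
$$-dm = \alpha\,dt \tag{2.118}$$
Inserting this in the above equation gives
$$dv = \left(\frac{g}{\alpha} - \frac{u}{m}\right)dm \tag{2.119}$$
Figure 2.5: Vertical motion of a rocket in a gravitational field
Integration gives
$$v = -\frac{g}{\alpha}(m_0 - m) + u\ln\left(\frac{m_0}{m}\right) \tag{2.120}$$
But the change in mass is given by
$$\int_{m_0}^{m}dm = -\alpha\int_0^t dt \tag{2.121}$$
That is
$$m_0 - m = \alpha t \tag{2.122}$$
Thus
$$v = -gt + u\ln\left(\frac{m_0}{m}\right) \tag{2.123}$$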
Note that once the propellant is exhausted the rocket will continue to fly upwards as it decelerates in the
gravitational field. You can easily calculate the maximum height. Note that this formula assumes that the
acceleration due to gravity is constant whereas for large heights above the Earth it is necessary to use the
true gravitational force $-\frac{GMm}{r^2}$ where $r$ is the distance from the center of the earth. In real situations it is
necessary to include air drag which requires use of a computer to numerically solve the equations of motion.
The highest rocket velocity is attained by maximizing the exhaust velocity and the ratio of initial to final
mass. Because the terminal velocity is limited by the mass ratio, engineers construct multistage rockets that
jettison the spent fuel containers and rockets.
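The closed-form rocket velocity $v = -gt + u\ln(m_0/m)$ can be compared with a direct numerical integration of $dv/dt = -g + u\alpha/m$ for a constant burn rate $\alpha$. The sketch below assumes a launch from rest, constant $g$, and no air drag; the function names and the numerical values are purely illustrative and are not taken from the text.

```python
import math

def rocket_velocity(t, m0, burn_rate, u, g=9.81):
    """Closed-form v(t) = -g t + u ln(m0 / m) with m = m0 - burn_rate * t."""
    m = m0 - burn_rate * t
    if m <= 0.0:
        raise ValueError("burn time exceeds the available mass")
    return -g * t + u * math.log(m0 / m)

def rocket_velocity_numeric(t, m0, burn_rate, u, g=9.81, n=20000):
    """Midpoint-rule integration of dv/dt = -g + u * burn_rate / m(t)."""
    dt = t / n
    v = 0.0
    for i in range(n):
        m_mid = m0 - burn_rate * (i + 0.5) * dt
        v += (-g + u * burn_rate / m_mid) * dt
    return v

# Illustrative numbers: 1000 kg initial mass, 5 kg/s burn rate, 2500 m/s exhaust velocity.
v_exact = rocket_velocity(100.0, 1000.0, 5.0, 2500.0)
v_num = rocket_velocity_numeric(100.0, 1000.0, 5.0, 2500.0)
print(v_exact, v_num)
```

The same numerical loop is a natural starting point when air drag or a height-dependent gravitational force is added, since those cases no longer have a simple closed form.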
This can be simplified using the vector identity equation 24 giving
$$\mathbf{L} = \sum_i^n m_i\left[r_i^2\boldsymbol{\omega} - (\mathbf{r}_i\cdot\boldsymbol{\omega})\mathbf{r}_i\right] \tag{2.126}$$
The simplest case for rigid-body rotation is when the body has a symmetry axis with the angular velocity $\boldsymbol{\omega}$ parallel to this body-fixed symmetry axis. For this case then $\mathbf{r}_i$ can be taken perpendicular to $\boldsymbol{\omega}$ for which the second term in equation 2.126 vanishes, i.e. $(\mathbf{r}_i\cdot\boldsymbol{\omega}) = 0$, thus
$$\mathbf{L} = \sum_i^n\left(m_i r_i^2\right)\boldsymbol{\omega} \qquad (\mathbf{r}_i \text{ perpendicular to } \boldsymbol{\omega})$$
where $r_i$ is the perpendicular distance from the axis of rotation to the body $m_i$. For a continuous body the moment of inertia can be generalized to an integral over the mass density $\rho$ of the body
$$I = \int\rho r^2\,dV \tag{2.128}$$
where $r$ is perpendicular to the rotation axis. The definition of the moment of inertia allows rewriting the angular momentum about a symmetry axis $\mathbf{L}$ in the form
$$\mathbf{L} = I\boldsymbol{\omega} \tag{2.129}$$
where the moment of inertia $I$ is taken about the symmetry axis and assuming that the angular velocity vector is parallel to the symmetry axis.
$$\boldsymbol{\omega} = \omega\hat{\mathbf{z}} \tag{2.130}$$
$$\mathbf{r} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \tag{2.131}$$
which is written as a column vector for clarity. Inserting $\mathbf{v}$ in the cross-product $\mathbf{r}\times\mathbf{v}$ gives the components of the angular momentum to be
$$\mathbf{L} = \sum_i^n m_i\,\mathbf{r}_i\times\mathbf{v}_i = \sum_i^n m_i\,\omega\begin{pmatrix} -x_i z_i \\ -y_i z_i \\ x_i^2 + y_i^2 \end{pmatrix}$$
where (2.134) gives the elementary formula for the moment of inertia $I = mr^2$ about the $z$ axis given earlier in (2.129).
Figure 2.6: A rigid rotating body comprising a single mass $m$ attached by a massless rod at a fixed angle $\theta$ shown at the instant when $m$ happens to lie in the $x$-$z$ plane. As the body rotates about the $z$-axis the mass $m$ has a velocity and momentum into the page (the negative $y$ direction). Therefore the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is in the direction shown which is not parallel to the angular velocity $\boldsymbol{\omega}$.
The surprising result is that $L_x$ and $L_y$ are non-zero implying that the total angular momentum vector $\mathbf{L}$ is in general not parallel with $\boldsymbol{\omega}$. This can be understood by considering the single body shown in figure 2.6. When the body is in the $x$-$z$ plane then $y = 0$ and $L_y = 0$. Thus the angular momentum vector $\mathbf{L}$ has a component along the $-x$ direction as shown which is not parallel with $\boldsymbol{\omega}$ and, since the vectors $\boldsymbol{\omega}$, $\mathbf{L}$, $\mathbf{r}$ are coplanar, then $\mathbf{L}$ must sweep around the rotation axis $\boldsymbol{\omega}$ to remain coplanar with the body as it rotates about the $z$ axis. Instantaneously the velocity of the body $\mathbf{v}$ is into the plane of the paper and, since
$\mathbf{L} = m\,\mathbf{r}\times\mathbf{v}$ then $\mathbf{L}$ is at an angle $(90^\circ - \theta)$ to the $z$ axis, where $\theta$ is the fixed angle between the rod and the rotation axis. This implies that a torque must be applied to rotate the angular momentum vector. This explains why your automobile shakes if the rotation axis and symmetry axis are not parallel for one wheel.
The first two moments in (2.133) are called products of inertia of the body designated by the pair of axes involved. Therefore, to avoid confusion, it is necessary to define the diagonal moment, which is called the moment of inertia, by two subscripts as $I_{zz}$. Thus in general, a body can have three moments of inertia about the three axes plus three products of inertia. This group of moments comprises the inertia tensor which will be discussed further in chapter 11. If a body has an axis of symmetry along the $z$ axis then the summations will give $I_{xz} = I_{yz} = 0$ while $I_{zz}$ will be unchanged. That is, for rotation about a symmetry axis the angular momentum and rotation axes are parallel. Any axis along which the angular momentum and angular velocity coincide is called a principal axis of the body.
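The statement that $\mathbf{L}$ need not be parallel to $\boldsymbol{\omega}$ is easy to verify for a single point mass. The sketch below computes $\mathbf{L} = m\,\mathbf{r}\times(\boldsymbol{\omega}\times\mathbf{r})$ for a mass on a rod tilted from the rotation axis. The rod angle (measured from the $z$ axis), the numerical values, and the helper names are assumptions made for this illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum(mass, r, omega):
    """L = m r x v with v = omega x r for a point mass on a rigid rotating body."""
    v = cross(omega, r)
    return tuple(mass * c for c in cross(r, v))

theta = math.radians(30.0)                        # assumed rod angle from the z axis
r = (math.sin(theta), 0.0, math.cos(theta))       # unit-length rod lying in the x-z plane
omega = (0.0, 0.0, 2.0)                           # rotation about the z axis
L = angular_momentum(1.5, r, omega)

# Angle between L and the rotation axis, expected to be 90 degrees minus the rod angle.
angle_L = math.degrees(math.atan2(abs(L[0]), L[2]))
print(L, angle_L)
```

The $x$ component of the result is the product-of-inertia term $-m\omega xz$, and it vanishes only when the rod lies along or perpendicular to the rotation axis.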
0 =
0 = +
0
= ( + 2 )
That is
= =
0 0 + 2
Note that this is true independent of the details of the acceleration of the initially stationary child.
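$$N = f_f\cdot R = \mu MgR$$
Since the moment of inertia about the center of a uniform sphere is $I = \frac{2}{5}MR^2$ then the angular acceleration of the ball is
$$\dot{\omega} = \frac{\mu MgR}{I} = \frac{\mu MgR}{\frac{2}{5}MR^2} = \frac{5}{2}\frac{\mu g}{R} \tag{$\alpha$}$$
Moreover the frictional force causes a deceleration $a$ of the linear velocity of the center of mass of
$$a = -\frac{f_f}{M} = -\mu g \tag{$\beta$}$$
Integrating $\alpha$ from time zero to $t$ gives
$$\omega = \int_0^t\dot{\omega}\,dt = \frac{5}{2}\frac{\mu g}{R}t$$
The linear velocity of the center of mass at time $t$ is given by integration of equation $\beta$
$$v = v_0 + \int_0^t a\,dt = v_0 - \mu gt$$
The billiard ball stops sliding and only rolls when $v = \omega R$, that is, when
$$\frac{5}{2}\mu gt = v_0 - \mu gt$$
That is, when
$$t = \frac{2}{7}\frac{v_0}{\mu g}$$
Thus the ball slips for a distance
$$s = \int_0^t v\,dt = v_0 t - \frac{\mu gt^2}{2} = \frac{12}{49}\frac{v_0^2}{\mu g}$$
Note that if the ball is pushed at a distance $h$ above the center of mass, besides the linear velocity there is an initial angular velocity of
$$\omega_0 = \frac{Mv_0 h}{\frac{2}{5}MR^2} = \frac{5}{2}\frac{v_0 h}{R^2}$$
For the case $h = \frac{2}{5}R$ then the ball immediately assumes a pure non-slipping roll. For $h < \frac{2}{5}R$ one has $\omega_0 < \frac{v_0}{R}$ while $h > \frac{2}{5}R$ corresponds to $\omega_0 > \frac{v_0}{R}$. In the latter case the frictional force points forward.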
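The slip time $t = \frac{2}{7}\frac{v_0}{\mu g}$ and slip distance $s = \frac{12}{49}\frac{v_0^2}{\mu g}$ for a ball struck through its center can be checked by stepping the linear and angular velocities forward in time until the rolling condition is met. This is a minimal sketch; the function name, the time step, and the numerical values are assumptions chosen for illustration.

```python
def slip_phase(v0, mu, radius, g=9.81, dt=1e-5):
    """Step the sliding phase of a uniform ball until v <= omega * R.

    Returns the elapsed time and the distance travelled while slipping.
    """
    v, omega, t, s = v0, 0.0, 0.0, 0.0
    while v > omega * radius:
        s += v * dt - 0.5 * mu * g * dt**2    # exact for constant deceleration
        v -= mu * g * dt                      # friction slows the center of mass
        omega += 2.5 * mu * g / radius * dt   # friction torque spins the ball up
        t += dt
    return t, s

v0, mu, radius, g = 2.0, 0.2, 0.03, 9.81      # illustrative values in SI units
t_slip, s_slip = slip_phase(v0, mu, radius, g)
print(t_slip, 2.0 * v0 / (7.0 * mu * g))
print(s_slip, 12.0 * v0**2 / (49.0 * mu * g))
```

Note that the ball radius drops out of both analytic results, which the numerical loop reproduces when run with different values of `radius`.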
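Since $\mathbf{F}(t) = \frac{d\mathbf{p}}{dt}$ then equation 2.135 gives that
$$\mathbf{P} = \int_0^t\frac{d\mathbf{p}}{dt'}dt' = \int_0^t d\mathbf{p} = \mathbf{p}(t) - \mathbf{p}_0 = \Delta\mathbf{p} \tag{2.136}$$
Thus the impulse $\mathbf{P}$ is an unambiguous quantity that equals the change in linear momentum of the object that has been struck which is independent of the details of the time dependence of the impulsive force. Computation of the spatial motion still requires knowledge of $\mathbf{F}(t)$ since equation 2.136 can be written as
$$\mathbf{v}(t) = \frac{1}{m}\int_0^t\mathbf{F}(t')dt' + \mathbf{v}_0 \tag{2.137}$$
Integration gives
$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \frac{1}{m}\int_0^t\left[\int_0^{t''}\mathbf{F}(t')dt'\right]dt'' \tag{2.138}$$
In general this is complicated. However, for the case of a constant force $\mathbf{F}(t) = \mathbf{F}_0$ this simplifies to the constant acceleration equation
$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \frac{1}{2}\frac{\mathbf{F}_0}{m}t^2 \tag{2.139}$$
where the constant acceleration $\mathbf{a} = \frac{\mathbf{F}_0}{m}$.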
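$$\mathbf{P} = \Delta\mathbf{p} = M\Delta\mathbf{v}$$
$$\mathbf{T} = \mathbf{s}\times\mathbf{P} = \Delta\mathbf{L} = I\Delta\boldsymbol{\omega}$$
$$\Delta\mathbf{v} = \frac{\mathbf{P}}{M}$$
$$\Delta\boldsymbol{\omega} = \frac{\mathbf{s}\times\mathbf{P}}{I}$$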
Assume that the bat was stationary prior to the strike, then after the strike the net translational velocity of a point along the body-fixed symmetry axis of the bat at a distance $y$ from the center of mass, is given by
$$\mathbf{v}(y) = \Delta\mathbf{v} + \Delta\boldsymbol{\omega}\times\mathbf{y} = \frac{\mathbf{P}}{M} + \frac{1}{I}\left((\mathbf{s}\times\mathbf{P})\times\mathbf{y}\right) = \frac{\mathbf{P}}{M} + \frac{1}{I}\left[(\mathbf{s}\cdot\mathbf{y})\mathbf{P} - (\mathbf{P}\cdot\mathbf{y})\mathbf{s}\right]$$
It is assumed that $\mathbf{s}$ and $\mathbf{P}$ are perpendicular; since $\mathbf{y}$ lies along the same symmetry axis as $\mathbf{s}$, then $(\mathbf{P}\cdot\mathbf{y}) = 0$ which simplifies the above equation to
$$\mathbf{v}(y) = \Delta\mathbf{v} + \Delta\boldsymbol{\omega}\times\mathbf{y} = \frac{\mathbf{P}}{M}\left(1 + \frac{M(\mathbf{s}\cdot\mathbf{y})}{I}\right)$$
Note that the translational velocity of the location along the bat symmetry axis at a distance $y$ from the center of mass, is zero if the bracket equals zero, that is, if
$$\mathbf{s}\cdot\mathbf{y} = -\frac{I}{M} = -k^2$$
where $k$ is called the radius of gyration of the body about the center of mass. Note that when the scalar product $\mathbf{s}\cdot\mathbf{y} = -\frac{I}{M} = -k^2$ then there will be no translational motion at the point $\mathbf{y}$. This point on the symmetry axis lies on the opposite side of the center of mass from the strike point $\mathbf{s}$, and is called the center of percussion corresponding to the impulse at the point $\mathbf{s}$. The center of percussion often is referred to as the "sweet spot" for an object corresponding to the impulse at the point $\mathbf{s}$. For a baseball bat the batter holds the bat at the center of percussion so that they do not feel an impulse in their hands when the ball is struck at the point $\mathbf{s}$. This principle is used extensively to design bats for all sports involving striking a ball with a bat, such as cricket, squash, tennis, etc., as well as weapons such as swords and axes used to decapitate opponents.
Thus
1 2
= − cos
40 0
3
Integrate from 2 2 gives that the total momentum imparted to 2 is
Z 3
1 2 2 1 2
= − cos =
40 0
2
20 0
where $\mathbf{g}$ is the gravitational field which is a position-dependent force per unit gravitational mass pointing towards the center of the Earth. The gravitational mass $m_G$ is measured when an object is weighed.
Newton’s Law of Gravitation leads to the relation for the gravitational field $\mathbf{g}(\mathbf{r})$ at the location $\mathbf{r}$ due to a gravitational mass distribution at the location $\mathbf{r}'$ as given by the integral over the gravitational mass density $\rho_G$
$$\mathbf{g}(\mathbf{r}) = -G\int_V\frac{\rho_G(\mathbf{r}')\,\widehat{(\mathbf{r} - \mathbf{r}')}}{(\mathbf{r} - \mathbf{r}')^2}dV' \tag{2.149}$$
The acceleration of matter in a gravitational field relates the gravitational and inertial masses
$$\mathbf{F}_G = m_G\mathbf{g} = m_I\mathbf{a} \tag{2.150}$$
Thus
$$\mathbf{a} = \frac{m_G}{m_I}\mathbf{g} \tag{2.151}$$
That is, the acceleration of a body depends on the gravitational strength $g$ and the ratio of the gravitational and inertial masses. It has been shown experimentally that all matter is subject to the same acceleration in vacuum at a given location in a gravitational field. That is, $\frac{m_G}{m_I}$ is a constant common to all materials. Galileo first showed this when he dropped objects from the Tower of Pisa. Modern experiments have shown that this is true to 5 parts in $10^{13}$.
The exact equivalence of gravitational mass and inertial mass is called the weak principle of equivalence which underlies the General Theory of Relativity as discussed in chapter 14. It is convenient to use the same unit for the gravitational and inertial masses and thus they both can be written in terms of the common mass symbol $m$.
$$m_I = m_G = m \tag{2.152}$$
Therefore the subscripts $G$ and $I$ can be omitted in equations 2.150 and 2.152. Also the local acceleration due to gravity $\mathbf{a}$ can be written as
$$\mathbf{a} = \mathbf{g} \tag{2.153}$$
The gravitational field $\mathbf{g} \equiv \frac{\mathbf{F}}{m}$ has units of N/kg in the MKS system while the acceleration $\mathbf{a}$ has units m/s$^2$.
since the scalar product of the unit vectors $\hat{\mathbf{r}}\cdot\hat{\mathbf{r}} = 1$. Note that the second two terms also cancel since $\hat{\mathbf{r}}\cdot\hat{\boldsymbol{\theta}} = \hat{\mathbf{r}}\cdot\hat{\boldsymbol{\phi}} = 0$ since the unit vectors are mutually orthogonal. Thus the line integral depends only on the starting and ending radii and is independent of the angular coordinates or the detailed path taken between $(r_a, \theta_a, \phi_a)$ and $(r_b, \theta_b, \phi_b)$.
Consider the Principle of Superposition for a gravitational field produced by a set of point masses. The
line integral then can be written as:
$$\Delta U^{net}_{a\to b} = -\int_{r_a}^{r_b}\mathbf{F}_{net}\cdot d\mathbf{l} = -\sum_{i=1}^{n}\int_{r_a}^{r_b}\mathbf{F}_i\cdot d\mathbf{l} = \sum_{i=1}^{n}\Delta U^{i}_{a\to b} \tag{2.157}$$
Thus the net potential energy difference is the sum of the contributions from each point mass producing the gravitational force field. Since each component is conservative, then the total potential energy difference also must be conservative. For a conservative force, this line integral is independent of the path taken; it depends only on the starting and ending positions, $\mathbf{r}_a$ and $\mathbf{r}_b$. That is, the potential energy is a local function dependent only on position. The usefulness of gravitational potential energy is that, since the gravitational force is a conservative force, it is possible to solve many problems in classical mechanics using the fact that the sum of the kinetic energy and potential energy is a constant. Note that the gravitational field is conservative, since the potential energy difference $\Delta U_{a\to b}$ is independent of the path taken. It is conservative because the force is radial and time independent; it is not due to the $\frac{1}{r^2}$ dependence.
Note that the probe mass $m_0$ factors out from the integral. It is convenient to define a new quantity called gravitational potential $\phi$ where
$$\Delta\phi_{a\to b} = \frac{\Delta U_{a\to b}}{m_0} = -\int_{r_a}^{r_b}\mathbf{g}\cdot d\mathbf{l} \tag{2.159}$$
That is, gravitational potential difference is the work that must be done, per unit mass, to move from $a$ to $b$ with no change in kinetic energy. Be careful not to confuse the gravitational potential energy difference $\Delta U_{a\to b}$ and gravitational potential difference $\Delta\phi_{a\to b}$, that is, $\Delta U$ has units of energy, joules, while $\Delta\phi$ has units of joules/kg.
The gravitational potential is a property of the gravitational force field; it is given as minus the line integral of the gravitational field from $a$ to $b$. The change in gravitational potential energy for moving a mass $m_0$ from $a$ to $b$ is given in terms of gravitational potential by:
$$\Delta U_{a\to b} = m_0\Delta\phi_{a\to b} \tag{2.160}$$
Thus gravitational potential is a simple additive scalar field because the Principle of Superposition applies. The gravitational potential difference between two points differing by $h$ in height is $gh$. Clearly, the greater $g$ or $h$, the greater the energy released by the gravitational field when dropping a body through the height $h$. The unit of gravitational potential is the joule/kg.
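$$d\phi = -\mathbf{g}\cdot d\mathbf{l} \tag{2.164}$$
Using cartesian coordinates both $\mathbf{g}$ and $d\mathbf{l}$ can be written as
$$\mathbf{g} = \hat{\mathbf{i}}g_x + \hat{\mathbf{j}}g_y + \hat{\mathbf{k}}g_z \qquad d\mathbf{l} = \hat{\mathbf{i}}dx + \hat{\mathbf{j}}dy + \hat{\mathbf{k}}dz \tag{2.165}$$
Taking the scalar product gives:
$$d\phi = -\mathbf{g}\cdot d\mathbf{l} = -g_x dx - g_y dy - g_z dz \tag{2.166}$$
Differential calculus expresses the change in potential $d\phi$ in terms of partial derivatives by:
$$d\phi = \frac{\partial\phi}{\partial x}dx + \frac{\partial\phi}{\partial y}dy + \frac{\partial\phi}{\partial z}dz \tag{2.167}$$
By association, 2.166 and 2.167 imply that
$$g_x = -\frac{\partial\phi}{\partial x} \qquad g_y = -\frac{\partial\phi}{\partial y} \qquad g_z = -\frac{\partial\phi}{\partial z} \tag{2.168}$$
Thus on each axis, the gravitational field can be written as minus the gradient of the gravitational potential. In three dimensions, the gravitational field is minus the total gradient of potential and the gradient of the scalar function $\phi$ can be written as:
$$\mathbf{g} = -\nabla\phi \tag{2.169}$$
In cartesian coordinates this equals
$$\mathbf{g} = -\left[\hat{\mathbf{i}}\frac{\partial\phi}{\partial x} + \hat{\mathbf{j}}\frac{\partial\phi}{\partial y} + \hat{\mathbf{k}}\frac{\partial\phi}{\partial z}\right] \tag{2.170}$$
Thus the gravitational field is just minus the gradient of the gravitational potential, which always is perpendicular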
to the equipotentials. Skiers are familiar with the concept of gravitational equipotentials and the fact that
the line of steepest descent, and thus maximum acceleration, is perpendicular to gravitational equipotentials
of constant height. The advantage of using potential theory for inverse-square law forces is that scalar
potentials replace the more complicated vector forces, which greatly simplifies calculation. Potential theory
plays a crucial role for handling both gravitational and electrostatic forces.
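The relation $\mathbf{g} = -\nabla\phi$ can be illustrated by differentiating the point-mass potential $\phi = -GM/r$ numerically. The sketch below uses central differences; the step size, the function names, and the rounded values used for the Earth's mass and radius are assumptions made for this example.

```python
import math

G_NEWTON = 6.674e-11     # m^3 kg^-1 s^-2, gravitational constant
M_EARTH = 5.972e24       # kg, approximate value used for illustration
R_EARTH = 6.371e6        # m, approximate value used for illustration

def potential(x, y, z, mass):
    """Gravitational potential of a point mass at the origin, phi = -G M / r."""
    return -G_NEWTON * mass / math.sqrt(x * x + y * y + z * z)

def field_from_potential(x, y, z, mass, h=1.0):
    """g = -grad(phi) estimated with central differences of step h (metres)."""
    gx = -(potential(x + h, y, z, mass) - potential(x - h, y, z, mass)) / (2.0 * h)
    gy = -(potential(x, y + h, z, mass) - potential(x, y - h, z, mass)) / (2.0 * h)
    gz = -(potential(x, y, z + h, mass) - potential(x, y, z - h, mass)) / (2.0 * h)
    return gx, gy, gz

g_vec = field_from_potential(R_EARTH, 0.0, 0.0, M_EARTH)
print(g_vec, -G_NEWTON * M_EARTH / R_EARTH**2)
```

The only non-zero component points back towards the origin, perpendicular to the spherical equipotential through the evaluation point, and its magnitude matches $GM/r^2$.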
the path taken between two points $a$ and $b$. Consider two possible paths between $a$ and $b$ as shown in figure 2.9. The line integral from $a$ to $b$ via route 1 is equal and opposite to the line integral back from $b$ to $a$ via route 2 if the gravitational field is conservative as shown earlier.
Figure 2.9: Circulation of the gravitational field.
A better way of expressing this is that the line integral of the gravitational field is zero around any closed path. Thus the line integral between $a$ and $b$, via path 1, and returning back to $a$, via path 2, are equal and opposite. That is, the net line integral for a closed loop is zero.
$$\oint\mathbf{g}\cdot d\mathbf{l} = 0 \tag{2.171}$$
which is a measure of the circulation of the gravitational field. The fact that the circulation equals zero
corresponds to the statement that the gravitational field is radial for a point mass.
Stokes Theorem, discussed in appendix 3, states that
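$$\oint_C\mathbf{F}\cdot d\mathbf{l} = \int_S(\nabla\times\mathbf{F})\cdot d\mathbf{S} \tag{2.172}$$
$$\nabla\times\mathbf{g} = 0 \tag{2.174}$$
$$\nabla\times\nabla\phi = 0 \tag{2.175}$$
$$\mathbf{g} = -\nabla\phi \tag{2.176}$$
Thus $\phi$ is consistent with the above definition of gravitational potential in that the scalar product
$$\Delta\phi_{a\to b} = -\int_a^b\mathbf{g}\cdot d\mathbf{l} = \int_a^b(\nabla\phi)\cdot d\mathbf{l} = \int_a^b\sum_i\frac{\partial\phi}{\partial x_i}dx_i = \int_a^b d\phi \tag{2.177}$$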
An identical relation between the electric field and electric potential applies for the inverse-square law
electrostatic field.
Reference potentials:
Note that only differences in potential energy, $U$, and gravitational potential, $\phi$, are meaningful; the absolute values depend on some arbitrarily chosen reference. However, often it is useful to measure gravitational potential with respect to a particular arbitrarily chosen reference point, such as sea level. Aircraft pilots are required to set their altimeters to read with respect to sea level rather than their departure airport. This ensures that aircraft leaving from say both Rochester, 559 ft, and Denver, 5000 ft, have their altimeters set to a common reference to ensure that they do not collide. The gravitational field is minus the gradient of the gravitational potential, which depends only on differences in potential, and thus is independent of any constant reference.
Consider a closed surface where the direction of the surface vector $d\mathbf{S}$ is defined as outwards. The net flux out of this closed surface is given by
$$\Phi = -Gm\oint_S\frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} = -Gm\oint_S d\Omega = -4\pi Gm \tag{2.183}$$
This is independent of where the point mass lies within the closed surface or of the shape of the closed surface. Note that the solid angle subtended is zero if the point mass lies outside the closed surface. Thus the flux is as given by equation 2.183 if the mass is enclosed by the closed surface, while it is zero if the mass is outside of the closed surface.
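Both statements about the flux can be checked by brute-force integration of $\mathbf{g}\cdot d\mathbf{S}$ over the six faces of a cube. The sketch below uses a midpoint rule on each face with $Gm = 1$; the cube size, grid resolution, mass positions, and the function name are assumptions chosen for illustration.

```python
import math

def flux_through_cube(mass_pos, half_side=1.0, n=100, gm=1.0):
    """Outward flux of g = -Gm r_hat / r^2 through a cube centred on the origin."""
    h = 2.0 * half_side / n
    flux = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):                  # the two faces normal to this axis
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half_side
                    p[(axis + 1) % 3] = -half_side + (i + 0.5) * h
                    p[(axis + 2) % 3] = -half_side + (j + 0.5) * h
                    d = [p[k] - mass_pos[k] for k in range(3)]
                    r = math.sqrt(d[0]**2 + d[1]**2 + d[2]**2)
                    # component of g along the outward normal, times the patch area
                    flux += -gm * d[axis] * sign / r**3 * h * h
    return flux

flux_inside = flux_through_cube((0.3, -0.2, 0.1))   # mass enclosed, off-centre
flux_outside = flux_through_cube((3.0, 0.0, 0.0))   # mass outside the cube
print(flux_inside, -4.0 * math.pi, flux_outside)
```

Moving the enclosed mass to another interior position should leave the first result essentially unchanged, which is the content of equation 2.183.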
Since the flux for a point mass is independent of the location of the mass within the volume enclosed by the closed surface, and using the principle of superposition for the gravitational field, then for $n$ enclosed point masses the net flux is
$$\Phi \equiv \int_S\mathbf{g}\cdot d\mathbf{S} = -4\pi G\sum_i^n m_i \tag{2.184}$$
This can be extended to continuous mass distributions, with local mass density $\rho$, giving that the net flux
$$\Phi \equiv \int_S\mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V\rho\,dV \tag{2.185}$$
or
$$\int_V\left[\nabla\cdot\mathbf{g} + 4\pi G\rho\right]dV = 0 \tag{2.187}$$
This is true independent of the shape of the surface, thus the divergence of the gravitational field
$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.188}$$
This is a statement that the gravitational field of a point mass has a $\frac{1}{r^2}$ dependence.
Using the fact that the gravitational field is conservative, this can be expressed as the gradient of the gravitational potential $\phi$
$$\mathbf{g} = -\nabla\phi \tag{2.189}$$
and Gauss’s law then becomes
$$\nabla\cdot\nabla\phi = 4\pi G\rho \tag{2.190}$$
which also can be written as Poisson’s equation
$$\nabla^2\phi = 4\pi G\rho \tag{2.191}$$
Knowing the mass distribution $\rho$ allows determination of the potential by solving Poisson’s equation.
A special case that often is encountered is when the mass distribution is zero in a given region. Then the potential for this region can be determined by solving Laplace’s equation with known boundary conditions.
$$\nabla^2\phi = 0 \tag{2.192}$$
For example, Laplace’s equation applies in the free space between the masses. It is used extensively in elec-
trostatics to compute the electric potential between charged conductors which themselves are equipotentials.
An elegant way to express Newton’s Law of Gravitation is in terms of the flux and circulation of the
gravitational field. That is,
Flux:
$$\Phi \equiv \int_S\mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V\rho\,dV \tag{2.194}$$
Circulation:
$$\oint\mathbf{g}\cdot d\mathbf{l} = 0 \tag{2.195}$$
The flux and circulation are better expressed in terms of the vector differential concepts of divergence and curl.
Divergence:
$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.196}$$
Curl:
$$\nabla\times\mathbf{g} = 0 \tag{2.197}$$
Remember that the flux and divergence of the gravitational field are statements that the field between point masses has a $\frac{1}{r^2}$ dependence. The circulation and curl are statements that the field between point masses is radial.
Because the gravitational field is conservative it is possible to use the concept of the scalar potential field $\phi(\mathbf{r})$. This concept is especially useful for solving some problems since the gravitational potential can be evaluated using the scalar integral
$$\Delta\phi_{\infty\to r} = -G\int_V\frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}dV' \tag{2.198}$$
An alternate approach is to solve Poisson’s equation if the boundary values and mass distributions are known
where Poisson’s equation is:
$$\nabla^2\phi = 4\pi G\rho \tag{2.199}$$
These alternate expressions of Newton’s law of gravitation can be exploited to solve problems. The
method of solution is identical to that used in electrostatics.
and then
$$\mathbf{g} = -\nabla\phi$$
c) The obvious spherical symmetry can be used in conjunction with Gauss’s law to easily solve this problem.
$$\int_S\mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V\rho\,dV$$
2.15 Summary
Newton’s Laws of Motion:
A cursory review of Newtonian mechanics has been presented. The concept of inertial frames of reference
was introduced since Newton’s laws of motion apply only to inertial frames of reference.
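Newton’s Law of motion
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.6}$$
leads to second-order equations of motion which can be difficult to handle for many-body systems.
Solution of Newton’s second-order equations of motion can be simplified using the three first-order integrals coupled with corresponding conservation laws. The first-order time integral for linear momentum is
$$\int_1^2\mathbf{F}\,dt = \int_1^2\frac{d\mathbf{p}}{dt}dt = (\mathbf{p}_2 - \mathbf{p}_1) \tag{2.10}$$
The first-order time integral for angular momentum is
$$\frac{d\mathbf{L}}{dt} = \mathbf{r}\times\frac{d\mathbf{p}}{dt} = \mathbf{N} \qquad \int_1^2\mathbf{N}\,dt = \int_1^2\frac{d\mathbf{L}}{dt}dt = (\mathbf{L}_2 - \mathbf{L}_1) \tag{2.16}$$
The first-order spatial integral is related to kinetic energy and the concept of work. That is
$$\mathbf{F} = \frac{dT}{d\mathbf{r}} \qquad \int_1^2\mathbf{F}\cdot d\mathbf{r} = (T_2 - T_1) \tag{2.21}$$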
The conditions that lead to conservation of linear and angular momentum and total mechanical energy
were discussed for many-body systems. The important class of conservative forces was shown to apply if the position-dependent forces do not depend on time or velocity, and if the work done by a force $\int_1^2\mathbf{F}\cdot d\mathbf{r}$
is independent of the path taken between the initial and final locations. The total mechanical energy is a
constant of motion when the forces are conservative.
It was shown that the concept of center of mass of a many-body or finite sized body separates naturally
for all three first-order integrals. The center of mass is that point about which
$$\sum_i^n m_i\mathbf{r}'_i = \int\mathbf{r}'\rho\,dV = 0 \qquad \text{(Centre of mass definition)}$$
where $\mathbf{r}'$ is the vector defining the location of mass $m$ with respect to the center of mass. The concept of center of mass greatly simplifies the description of the motion of finite-sized bodies and many-body systems by separating out the important internal interactions and corresponding underlying physics, from the trivial overall translational motion of a many-body system.
The Virial theorem states that the time-averaged properties are related by
$$\langle T \rangle = -\frac{1}{2}\left\langle\sum_i\mathbf{F}_i\cdot\mathbf{r}_i\right\rangle \tag{2.86}$$
2
It was shown that the Virial theorem is useful for relating the time-averaged kinetic and potential energies,
especially for cases involving either linear or inverse-square forces.
Typical examples were presented of application of Newton’s equations of motion to solving systems
involving constant, linear, position-dependent, velocity-dependent, and time-dependent forces, to constrained
and unconstrained systems, as well as systems with variable mass. Rigid-body rotation about a body-fixed
rotation axis also was discussed.
It is important to be cognizant of the following limitations that apply to Newton’s laws of motion:
1) Newtonian mechanics assumes that all observables are measured to unlimited precision, that is, quantities such as $\mathbf{p}$ and $\mathbf{r}$ are known exactly. Quantum physics introduces limits to measurement due to wave-particle duality.
2) The Newtonian view is that time and position are absolute concepts. The Theory of Relativity shows that this is not true. Fortunately for most problems $v \ll c$ and thus Newtonian mechanics is an excellent approximation.
3) Another limitation, to be discussed later, is that it is impractical to solve the equations of motion
for many interacting bodies such as molecules in a gas. Then it is necessary to resort to using statistical
averages; this approach is called statistical mechanics.
Newton’s work constitutes a theory of motion in the universe that introduces the concept of causality.
Causality is that there is a one-to-one correspondence between cause and effect. Each force causes a known
effect that can be calculated. Thus the causal universe is pictured by philosophers to be a giant machine
whose parts move like clockwork in a predictable and predetermined way according to the laws of nature. This
is a deterministic view of nature. There are philosophical problems in that such a deterministic viewpoint
appears to be contrary to free will. That is, taken to the extreme it implies that you were predestined to
read this book because it is a natural consequence of this mechanical universe!
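Quantity | Gravitation | Electrostatics
Force field | $\mathbf{g} \equiv \frac{\mathbf{F}_G}{m}$ | $\mathbf{E} \equiv \frac{\mathbf{F}_E}{q}$
Density | Mass density $\rho(\mathbf{r}')$ | Charge density $\rho(\mathbf{r}')$
Conservative central field | $\mathbf{g}(\mathbf{r}) = -G\int\frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}dV'$ | $\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}dV'$
Flux | $\Phi \equiv \int_S\mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V\rho\,dV$ | $\Phi \equiv \int_S\mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_V\rho\,dV$
Circulation | $\oint\mathbf{g}\cdot d\mathbf{l} = 0$ | $\oint\mathbf{E}\cdot d\mathbf{l} = 0$
Divergence | $\nabla\cdot\mathbf{g} = -4\pi G\rho$ | $\nabla\cdot\mathbf{E} = \frac{1}{\varepsilon_0}\rho$
Curl | $\nabla\times\mathbf{g} = 0$ | $\nabla\times\mathbf{E} = 0$
Potential | $\Delta\phi_{\infty\to r} = -G\int\frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}dV'$ | $\Delta\phi_{\infty\to r} = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}dV'$
Poisson’s equation | $\nabla^2\phi = 4\pi G\rho$ | $\nabla^2\phi = -\frac{1}{\varepsilon_0}\rho$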
Both the gravitational and electrostatic central fields are conservative making it possible to use the concept of the scalar potential field $\phi$. This concept is especially useful for solving some problems since the
potential can be evaluated using a scalar integral. An alternate approach is to solve Poisson’s equation if the
boundary values and mass distributions are known. The methods of solution of Newton’s law of gravitation
are identical to those used in electrostatics and are readily accessible in the literature.
Workshop exercises
1. Spend a few minutes looking over the following problems, paying particular attention to the problems that
you think you might have trouble with. All of the problems are taken from an introductory physics course on
mechanics, so this should seem like review material. After you have had some time to look over the problems,
you will take turns stepping up to the board to solve one. When it is your turn, you may pick ANY of the
problems that have not already been solved. Depending on the number of students in the recitation, you may
be asked to solve more than one problem. Good luck!
(a) Justin fires a 12-gram bullet into a block of wood. The bullet travels at 190 m/s, penetrates the 2.0-kg
block of wood, and emerges going 150 m/s. If the block is stationary on a frictionless surface when hit,
how fast does it move after the bullet emerges?
(b) A mass $m$ at the end of a spring vibrates with a frequency of 0.88 Hz; when an additional 1.25 kg mass is added to $m$, the frequency is 0.48 Hz. What is the value of $m$?
(c) Dan has a new chandelier in his living room. The chandelier is 27-kg and it hangs from the ceiling on a
vertical 4.0-m-long wire. What horizontal force would Dan need to use to displace its position 0.10 m to
one side? What will be the tension in the wire?
(d) Dianne has a new spring with a spring constant of 900 N/m that she bought at Springs-R-Us. She places
it vertically on a table and compresses it by 0.150 m. What upward speed can it give to a 0.300-kg ball
when released?
(e) A tiger leaps horizontally from a 6.5-m-high rock with a speed of 4.0 m/s. How far from the base of the
rock will she land?
(f) How much work must SuperRyan do to stop a 1300-kg car traveling at 100 km/hr?
(g) Jason catches a baseball 3.1 s after throwing it vertically upward. With what speed did he throw it and
what height did it reach?
(h) Laura is practicing her figure skating and during her finale she can increase her rotation rate from an
initial rate of 1.0 rev every 2.0 s to a final rate of 3.0 rev/s. If her initial moment of inertia was 4.6 kg·m²,
what is her final moment of inertia?
(i) On an icy day in Rochester (imagine that!), you worry about parking your car in your driveway, which
has an incline of 12°. Your neighbor Emily’s driveway has an incline of 9°, and Brian’s driveway across the street has one of 6°. The coefficient of static friction between tire rubber and ice is 0.15. Which
driveway(s) will be safe to park a car?
2. Two particles are projected from the same point with velocities $v_1$ and $v_2$, at elevations $\alpha_1$ and $\alpha_2$, respectively
($\alpha_1 > \alpha_2$). Show that if they are to collide in mid-air the interval between the firings must be
$$\frac{2 v_1 v_2 \sin(\alpha_1 - \alpha_2)}{g\,(v_1 \cos\alpha_1 + v_2 \cos\alpha_2)}$$
(If you don’t have time to solve this problem completely, then at least give an outline of how you would go
about solving the problem.)
3. Read each of the following statements and, without consulting anyone else, mark them true or false. If you are
unsure of any of them, make a guess. Once everyone has answered each of the statements individually, break
into small groups and compare your answers. Try to come to an agreement as a group. The Teaching Assistant
will then make sure everyone has the correct answer. Good luck!
(a) The conservation of linear momentum is a consequence of translational symmetry, or the homogeneity of
space.
(b) For an isolated system with no external forces acting on it, the angular momentum will remain constant
in both magnitude and direction.
(c) A reference frame is called an inertial frame if Newton’s laws are valid in that frame.
(d) Newtonian mechanics and the laws of electromagnetism are invariant under Galilean transformations.
(e) The law of conservation of angular momentum is a consequence of rotational symmetry, or the isotropy
of space.
(f) The center of mass of a system of particles moves like a single particle of mass $M$ (total mass of the
system) acted on by a single force that is equal to the sum of all the external forces acting on the
system.
(g) If Newton’s laws are valid in one reference frame, then they are also valid in any reference frame accelerated
with respect to the first system.
(h) The law of conservation of energy is a consequence of inversion symmetry, or the invertibility of space.
4. The teeter toy comprises two identical weights which hang on drooping arms attached to a peg as shown.
The arrangement is unexpectedly stable and can be spun and rocked with little danger of toppling over.
[Figure: teeter toy, with the two arms labelled $l$, the peg labelled $L$, and the two weights labelled $m$.]
(a) Find an expression for the potential energy of the teeter toy as a function of $\theta$ when the teeter toy is
cocked at an angle $\theta$ about the pivot point. For simplicity, consider only rocking motion in the vertical
plane.
(b) Determine the equilibrium value(s) of $\theta$.
(c) Determine whether the equilibrium is stable, unstable, or neutral for the value(s) of $\theta$ found in part (b).
(d) How could you determine the answers to parts (b) and (c) from a graph of the potential energy versus $\theta$?
(e) Expand the expression for the potential energy about $\theta = 0$ and determine the frequency of small
oscillations.
5. For each of the situations described below, determine which of the four functional forms of the force is most
appropriate. Consider motion only along one dimension.
Go around the room and take turns answering a question. When it is your turn, pick a functional form and
explain why you chose the one you did. If you are unsure, make a guess or ask a question to get help from the
rest of the workshop. There may be more than one answer depending on your interpretation of the situation,
so be sure to explore all of the possibilities.
(a) A mass resting on a frictionless table is attached to a spring, which in turn is attached to a wall. The
mass is pulled to the side and executes simple harmonic motion in the horizontal direction.
(b) A freely-falling body subject to a constant gravitational field with no air resistance.
(c) An electron, initially at rest (treat it classically!), encounters an incoming electromagnetic wave of electric
field intensity given by $E = E_0 \sin(\omega t + \phi)$.
(d) A large mass is affected by the gravitational field of another mass a distance away.
(e) A freely-falling body subject to a constant gravitational field with air resistance.
(f) A charged point particle is affected by the presence of another charged point particle a distance away.
6. A particle of mass $m$ is constrained to move on the frictionless inner surface of a cone of half-angle $\alpha$,
as shown in the figure.
(a) Find the restrictions on the initial conditions such that the particle moves in a circular orbit about the
vertical axis.
(b) Determine whether this kind of orbit is stable.
(a) Draw gravitational field lines and equipotential lines for the rod. What can you say about the equipotential
surfaces of the rod?
(b) Calculate the gravitational potential at a point that lies a given distance from one end of the rod, in a
direction perpendicular to the rod.
(c) Calculate the gravitational field at that point by direct integration.
(d) Could you have used Gauss’s law to find the gravitational field at that point? Why or why not?
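9. Consider a fluid with density $\rho$ and velocity $\mathbf{v}$ in some volume $V$. The mass current $\mathbf{J} = \rho\mathbf{v}$ determines the
amount of mass exiting the surface per unit time by the integral
$$\oint_S \mathbf{J} \cdot d\mathbf{A}$$
(a) Using the divergence theorem, prove the continuity equation, $\nabla \cdot \mathbf{J} + \frac{\partial \rho}{\partial t} = 0$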
10. A rocket of initial mass burns fuel at constant rate (kilograms per second), producing a constant force .
The total mass of available fuel is . Assume the rocket starts from rest and moves in a fixed direction with
no external forces acting on it.
Problems
1. Consider a solid hemisphere of radius . Compute the coordinates of the center of mass relative to the center
of the spherical surface used to define the hemisphere.
2. A 2000 kg Ford Excursion was travelling south on Mt. Hope Avenue when it collided with your 1000 kg sports car travelling
west on Elmwood Avenue. The two badly-damaged cars became entangled in the collision and left a skid mark
that is 20 meters long in a direction 14° to the west of the original direction of travel of the Excursion. The
wealthy Excursion driver hires a high-powered lawyer who accuses you of speeding through the intersection.
Use your P235 knowledge, plus the police officer’s report of the recoil direction, the skid length, and knowledge
that the coefficient of sliding friction between the tires and road is $\mu = 0.6$, to deduce the original velocities of
both cars. Was either of the cars exceeding the 30 mph speed limit?
3. A particle of mass $m$ moving in one dimension has potential energy $U(x) = U_0\left[2\left(\frac{x}{a}\right)^2 - \left(\frac{x}{a}\right)^4\right]$ where $U_0$ and
$a$ are positive constants.
a) Find the force $F(x)$ that acts on the particle.
b) Sketch $U(x)$. Find the positions of stable and unstable equilibrium.
c) What is the angular frequency of oscillations about the point of stable equilibrium?
d) What is the minimum speed the particle must have at the origin to escape to infinity?
e) At $t = 0$ the particle is at the origin and its velocity is positive and equal to the escape velocity. Find $x(t)$
and sketch the result.
4. a) Consider a single-stage rocket travelling in a straight line subject to an external force $F^{ext}$ acting along the
same line, where $v_{ex}$ is the exhaust velocity of the ejected fuel relative to the rocket. Show that the equation of
motion is
$$m\dot{v} = -\dot{m}v_{ex} + F^{ext}$$
b) Specialize to the case of a rocket taking off vertically from rest in a uniform gravitational field $g$. Assume
that the rocket ejects mass at a constant rate of $\dot{m} = -k$ where $k$ is a positive constant. Solve the equation of
motion to derive the dependence of velocity on time.
c) The first couple of minutes of the launch of the Space Shuttle can be described roughly by: initial mass
$= 2 \times 10^6$ kg, mass after 2 minutes $= 1 \times 10^6$ kg, exhaust speed $v_{ex} = 3000$ m/s and initial velocity is zero.
Estimate the velocity of the Space Shuttle after two minutes of flight.
d) Describe what would happen to a rocket where ̇
5. A time independent field $\mathbf{F}$ is conservative if $\nabla \times \mathbf{F} = 0$. Use this fact to test if the following fields are
conservative, and derive the corresponding potential $U$.
a) = + + = + = +
b) = −− = ln = − +
6. Consider a solid cylinder of mass and radius sliding without rolling down the smooth inclined face of a
wedge of mass that is free to slide without friction on a horizontal plane floor. Use the coordinates shown
in the figure.
a) How far has the wedge moved by the time the cylinder has descended from rest a vertical distance ?
b) Now suppose that the cylinder is free to roll down the wedge without slipping. How far does the wedge
move in this case if the cylinder rolls down a vertical distance ?
c) In which case does the cylinder reach the bottom faster? How does this depend on the radius of the cylinder?
7. If the gravitational field vector is independent of the radial distance within a sphere, find the function describing
the mass density $\rho(r)$ of the sphere.
Chapter 3
Linear oscillators
3.1 Introduction
Oscillations are a ubiquitous feature in nature. Examples are the periodic motion of planets, the rise and fall
of the tides, water waves, the pendulum in a clock, musical instruments, sound waves, electromagnetic waves,
and wave-particle duality in quantal physics. Oscillatory systems all have the same basic mathematical form
although the names of the variables and parameters are different. The classical linear theory of oscillations
will be assumed in this chapter since: (1) The linear approximation is well obeyed when the amplitudes of
oscillation are small, that is, the restoring force obeys Hooke’s Law. (2) The Principle of Superposition
applies. (3) The linear theory allows most problems to be solved explicitly in closed form. This is in contrast
to non-linear systems, where the motion can be complicated and even chaotic, as discussed in chapter 4.
$$\mathbf{F} = -\nabla U \tag{3.1}$$
Thus these linear combinations also satisfy the general linear equation
L() = () (3.14)
Applicability of the Principle of Superposition to a system provides a tremendous advantage for handling
and solving the equations of motion of oscillatory systems.
Configuration plots of $(x, y)$ where $x = \cos(4t)$ and $y = \cos(5t - \delta)$ at four different phase values $\delta$. The
curves are called Lissajous figures.
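A minimal sketch of how points on such a Lissajous figure can be generated, assuming the parametrization $x = \cos(4t)$, $y = \cos(5t - \delta)$ quoted in the caption. The function name `lissajous`, the number of sample points, and the four phase values are illustrative choices and are not taken from the text.

```python
import math

def lissajous(delta, n_points=400, wx=4.0, wy=5.0):
    """Sample x = cos(wx*t), y = cos(wy*t - delta) over one common period 0 <= t <= 2*pi."""
    points = []
    for i in range(n_points + 1):
        t = 2.0 * math.pi * i / n_points  # cos(4t) and cos(5t) both repeat after 2*pi
        points.append((math.cos(wx * t), math.cos(wy * t - delta)))
    return points

# Four illustrative phase values (assumed, not the ones used in the figure).
for delta in (0.0, math.pi / 4, math.pi / 2, math.pi):
    curve = lissajous(delta)
    (x0, y0), (x1, y1) = curve[0], curve[-1]
    closed = abs(x0 - x1) < 1e-9 and abs(y0 - y1) < 1e-9
    print(f"delta = {delta:.3f}: curve closes on itself: {closed}")
```

Because the two angular frequencies are in the ratio 4:5, the trajectory retraces itself after one common period, which is why the figure is a closed curve for every value of the phase.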
2 2
+ ¡ 2 ¢ = 1 (3.20)
2
the name "state space" in common with reference [Ta05]. Lanczos [La49] uses the term "state space" to refer to the extended
phase space (q p) discussed in chapter 16
$$\mathbf{F}_D(v) = -f(v)\,\hat{\mathbf{v}} \tag{3.24}$$
where the velocity dependent function $f(v)$ can be complicated. Fortunately there is a very large class of
problems in electricity and magnetism, classical mechanics, molecular, atomic, and nuclear physics, where
the damping force depends linearly on velocity, which greatly simplifies solution of the equations of motion.
Therefore this chapter will discuss only linear damping.
Consider the free simple harmonic oscillator, that is, assuming no oscillatory forcing function, with a
linear damping term $\mathbf{F}_D(v) = -b\mathbf{v}$ where the parameter $b$ is the damping factor. Then the equation of
motion is
$$-kx - b\dot{x} = m\ddot{x} \tag{3.25}$$
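The general solution to the linearly-damped free oscillator is obtained by inserting the complex trial
solution $z = z_0 e^{i\omega t}$. Then
$$(i\omega)^2 z_0 e^{i\omega t} + i\omega\Gamma z_0 e^{i\omega t} + \omega_0^2 z_0 e^{i\omega t} = 0 \tag{3.29}$$
$$\omega^2 - i\omega\Gamma - \omega_0^2 = 0 \tag{3.30}$$
The solution is
$$\omega_\pm = i\frac{\Gamma}{2} \pm \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.31}$$
The two solutions $\omega_\pm$ are complex conjugates and thus the solutions of the damped free oscillator are
$$z = z_1 e^{i\left(i\frac{\Gamma}{2} + \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}\right)t} + z_2 e^{i\left(i\frac{\Gamma}{2} - \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}\right)t} \tag{3.32}$$
This can be written as
$$z = e^{-\frac{\Gamma}{2}t}\left[z_1 e^{i\omega_1 t} + z_2 e^{-i\omega_1 t}\right] \tag{3.33}$$
where
$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.34}$$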
Underdamped motion $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$
When $\omega_1^2 > 0$ the square root is real, so the solution can be written by taking the real part of $z$, which
gives that equation 3.33 equals
$$x(t) = A e^{-\frac{\Gamma}{2}t}\cos(\omega_1 t - \beta) \tag{3.35}$$
where $A$ and $\beta$ are adjustable constants fit to the initial conditions. Therefore the velocity is given by
$$\dot{x}(t) = -A e^{-\frac{\Gamma}{2}t}\left[\omega_1 \sin(\omega_1 t - \beta) + \frac{\Gamma}{2}\cos(\omega_1 t - \beta)\right] \tag{3.36}$$
This is the damped sinusoidal oscillation illustrated in figure 3.5. The solution has the following
characteristics:
a) The oscillation amplitude decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$.
b) There is a small reduction in the frequency of the oscillation due to the damping, leading to
$\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.
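The underdamped solution can be checked numerically. The sketch below assumes the form $x(t) = A e^{-\Gamma t/2}\cos(\omega_1 t - \beta)$ with $\omega_1^2 = \omega_0^2 - (\Gamma/2)^2$, and uses finite differences to confirm that it satisfies $\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0$. The parameter values and the function names are illustrative assumptions.

```python
import math

def underdamped(t, A, beta, omega0, gamma):
    """x(t) = A exp(-gamma*t/2) cos(omega1*t - beta), with omega1^2 = omega0^2 - (gamma/2)^2."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)
    return A * math.exp(-gamma * t / 2.0) * math.cos(omega1 * t - beta)

def residual(t, A, beta, omega0, gamma, h=1e-4):
    """Finite-difference estimate of x'' + gamma*x' + omega0^2*x, which should be close to zero."""
    x = lambda s: underdamped(s, A, beta, omega0, gamma)
    xdot = (x(t + h) - x(t - h)) / (2.0 * h)
    xddot = (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2
    return xddot + gamma * xdot + omega0**2 * x(t)

omega0, gamma = 2.0, 0.4  # illustrative values with gamma/2 < omega0 (underdamped)
omega1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)
worst = max(abs(residual(0.1 * k, 1.0, 0.3, omega0, gamma)) for k in range(1, 100))

print("omega1 =", omega1, "which is slightly below omega0 =", omega0)
print("amplitude time constant 2/gamma =", 2.0 / gamma)
print("largest residual of the equation of motion:", worst)
```

The residual is limited only by the finite-difference step, which suggests that the damped cosine with the shifted frequency $\omega_1$ is indeed a solution for any choice of $A$ and $\beta$.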
The amplitude-time dependence and state-space diagrams for the free linearly-damped harmonic oscillator.
The upper row shows the underdamped system for the case with damping $\Gamma = \frac{\omega_0}{5}$. The lower row shows
the overdamped ($\frac{\Gamma}{2} > \omega_0$) [solid line] and critically damped ($\frac{\Gamma}{2} = \omega_0$) [dashed line] in both cases assuming
that initially the system is at rest.
Figure 3.4: Real and imaginary solutions $\omega_\pm$ of the damped harmonic oscillator. A phase transition occurs
at $\Gamma = 2\omega_0$. For $\Gamma < 2\omega_0$ (dashed) the two solutions are complex conjugates and imaginary. For $\Gamma > 2\omega_0$
(solid), there are two real solutions $\omega_+$ and $\omega_-$ with widely different decay constants where $\omega_+$ dominates
the decay at long times.
Overdamped case $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$
In this case the square root of $\omega_1^2$ is imaginary and can be expressed as $\omega_1' = \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$. Therefore the
solution is obtained more naturally by using a real trial solution $z = z_0 e^{-\omega t}$ in equation 3.33 which leads to
two roots
$$\omega_\pm = -\left[-\frac{\Gamma}{2} \pm \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}\,\right]$$
Thus the exponentially damped decay has two decay rates $\omega_+$ and $\omega_-$
$$x(t) = \left[A_1 e^{-\omega_+ t} + A_2 e^{-\omega_- t}\right] \tag{3.37}$$
The time constant $\frac{1}{\omega_-} < \frac{1}{\omega_+}$, thus the second term $A_2 e^{-\omega_- t}$ in the bracket decays in a shorter time than the
first term $A_1 e^{-\omega_+ t}$. As illustrated in figure 3.6 the decay rate, which is imaginary when underdamped, i.e.
$\frac{\Gamma}{2} < \omega_0$, bifurcates into two real values $\omega_\pm$ for overdamped, i.e. $\frac{\Gamma}{2} > \omega_0$. At large times the dominant term
when overdamped is for $\omega_+$ which has the smallest decay rate, that is, the longest decay constant $\tau_+ = \frac{1}{\omega_+}$.
There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero as shown in
fig 3.5. The amplitude decays away with a time constant that is longer than $\frac{2}{\Gamma}$.
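As a rough numerical illustration of the two overdamped decay rates, the sketch below assumes the roots $\omega_\pm = \frac{\Gamma}{2} \mp \sqrt{(\Gamma/2)^2 - \omega_0^2}$, so that $\omega_+$ is the slow rate that dominates at long times. The function name `overdamped_rates` and the numbers are illustrative assumptions.

```python
import math

def overdamped_rates(omega0, gamma):
    """Return (omega_plus, omega_minus): the slow and fast decay rates when gamma/2 > omega0."""
    if gamma / 2.0 <= omega0:
        raise ValueError("not overdamped: need gamma/2 > omega0")
    root = math.sqrt((gamma / 2.0) ** 2 - omega0**2)
    return gamma / 2.0 - root, gamma / 2.0 + root

omega0, gamma = 1.0, 5.0  # illustrative overdamped values
w_plus, w_minus = overdamped_rates(omega0, gamma)

print("slow rate omega_+ =", w_plus, " time constant 1/omega_+ =", 1.0 / w_plus)
print("fast rate omega_- =", w_minus, " time constant 1/omega_- =", 1.0 / w_minus)
print("sum of rates (should equal gamma):", w_plus + w_minus)
print("product of rates (should equal omega0^2):", w_plus * w_minus)
print("slow time constant exceeds 2/gamma:", 1.0 / w_plus > 2.0 / gamma)
```

The sum and product checks follow from the quadratic $\omega^2 - \Gamma\omega + \omega_0^2 = 0$ that the decay rates satisfy, and the last line illustrates why the overdamped return to equilibrium is slower than the decay of the underdamped envelope.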
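Critically damped $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$
This is the limiting case where $\frac{\Gamma}{2} = \omega_0$. For this case the solution is of the form
$$x(t) = (A + Bt)\,e^{-\frac{\Gamma}{2}t}$$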
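$\frac{1}{2}$, gives the time-averaged total energy as
$$\langle E \rangle = e^{-\Gamma t}\left(\frac{1}{4}mA^2\omega_1^2 + \frac{1}{4}mA^2\left(\frac{\Gamma}{2}\right)^2 + \frac{1}{4}mA^2\omega_0^2\right) \tag{3.43}$$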
$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \frac{F(t)}{m} \tag{3.48}$$
where $F(t)$ is the driving force. For mathematical simplicity the driving force is chosen to be a sinusoidal
harmonic force. The solution of this second-order differential equation comprises two components, the
complementary solution (transient response), and the particular solution (steady-state response).
which is identical to the solution of the free linearly-damped harmonic oscillator. As discussed in section 3.5
the solution of the linearly-damped free oscillator is given by the real part of the complex variable $z$ where
$$z = e^{-\frac{\Gamma}{2}t}\left[z_1 e^{i\omega_1 t} + z_2 e^{-i\omega_1 t}\right] \tag{3.50}$$
and
$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.51}$$
Underdamped motion $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$: When $\omega_1^2 > 0$ then the square root is real so the transient
solution can be written taking the real part of $z$ which gives
$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1 t) \tag{3.52}$$
The solution has the following characteristics:
a) The amplitude of the transient solution decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$ while
the energy decreases with a time constant of $\frac{1}{\Gamma}$.
b) There is a small downward frequency shift in that $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.
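Overdamped case $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$: In this case the square root is imaginary, which can be expressed
as $\omega_1' \equiv \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$ which is real, and the solution is just an exponentially damped one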
$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\left[e^{\omega_1' t} + e^{-\omega_1' t}\right] \tag{3.53}$$
There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero. The total
energy decays away with two time constants greater than $\frac{1}{\Gamma}$.
Critically damped $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$: For this case, as mentioned for the damped free oscillator, the
solution is of the form
$$x(t) = (A + Bt)\,e^{-\frac{\Gamma}{2}t} \tag{3.54}$$
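Thus the particular solution is the real part of the complex variable $z$ which is a solution of
$$\ddot{z} + \Gamma\dot{z} + \omega_0^2 z = \frac{F_0}{m}e^{i\omega t} \tag{3.56}$$
A trial solution is
$$z = z_0 e^{i\omega t} \tag{3.57}$$
This leads to the relation
$$-\omega^2 z_0 + i\omega\Gamma z_0 + \omega_0^2 z_0 = \frac{F_0}{m} \tag{3.58}$$
Solving for $z_0$ and multiplying the numerator and denominator by the factor $\left(\omega_0^2 - \omega^2\right) - i\Gamma\omega$ gives
$$z_0 = \frac{F_0/m}{\left(\omega_0^2 - \omega^2\right) + i\Gamma\omega} = \frac{F_0/m}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}\left[\left(\omega_0^2 - \omega^2\right) - i\Gamma\omega\right] \tag{3.59}$$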
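The steady state solution $x(t)_S$ thus is given by the real part of $z$, that is
$$x(t)_S = \frac{F_0/m}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}\left[\left(\omega_0^2 - \omega^2\right)\cos\omega t + \Gamma\omega\sin\omega t\right] \tag{3.60}$$
$$\cos\delta = \frac{\omega_0^2 - \omega^2}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}} \tag{3.62}$$
and
$$\sin\delta = \frac{\Gamma\omega}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}} \tag{3.63}$$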
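The phase $\delta$ represents the phase difference between the driving force and the resultant motion. For a fixed
$\omega_0$ the phase $\delta = 0$ when $\omega = 0$ and increases to $\delta = \frac{\pi}{2}$ when $\omega = \omega_0$. For $\omega > \omega_0$ the phase $\delta \to \pi$ as
$\omega \to \infty$.
Figure 3.5: Phase between driving force and resultant motion.
The steady state solution can be re-expressed in terms of the phase shift $\delta$ as
$$x(t)_S = \frac{F_0/m}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}}\left[\cos\delta\cos\omega t + \sin\delta\sin\omega t\right] = \frac{F_0/m}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta) \tag{3.64}$$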
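The amplitude and phase of the steady-state response can be evaluated directly. The following sketch assumes the form $x(t)_S = \frac{F_0/m}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta)$ with $\tan\delta = \frac{\Gamma\omega}{\omega_0^2 - \omega^2}$, and checks that it satisfies $\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \frac{F_0}{m}\cos\omega t$. The function names and parameter values are illustrative assumptions.

```python
import math

def steady_state(omega, omega0, gamma, f0_over_m):
    """Return (amplitude, delta) of the steady-state response x = amplitude*cos(omega*t - delta)."""
    elastic = omega0**2 - omega**2   # real part of the denominator
    absorptive = gamma * omega       # imaginary part of the denominator
    amplitude = f0_over_m / math.hypot(elastic, absorptive)
    delta = math.atan2(absorptive, elastic)  # lies between 0 and pi for omega > 0
    return amplitude, delta

def residual(t, omega, omega0, gamma, f0_over_m):
    """x'' + gamma*x' + omega0^2*x - (F0/m)cos(omega*t), using exact derivatives of the cosine."""
    a, d = steady_state(omega, omega0, gamma, f0_over_m)
    x = a * math.cos(omega * t - d)
    xdot = -a * omega * math.sin(omega * t - d)
    xddot = -a * omega**2 * math.cos(omega * t - d)
    return xddot + gamma * xdot + omega0**2 * x - f0_over_m * math.cos(omega * t)

omega0, gamma, f0_over_m = 3.0, 0.5, 1.0  # illustrative values
for omega in (0.3, 3.0, 30.0):
    a, d = steady_state(omega, omega0, gamma, f0_over_m)
    print(f"omega = {omega:5.1f}: amplitude = {a:.5f}, delta = {d:.4f} rad")
print("largest residual:", max(abs(residual(0.1 * k, 2.0, omega0, gamma, f0_over_m)) for k in range(100)))
```

The printed phases move from near zero well below $\omega_0$, through $\frac{\pi}{2}$ at $\omega = \omega_0$, towards $\pi$ well above $\omega_0$, matching the behaviour described for the phase shift.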
Figure 3.6: Amplitude versus time, and state space plots of the transient solution (dashed) and total solution
(solid) for two cases. The upper row shows the case where the driving frequency $\omega = \frac{\omega_1}{5}$ while the lower row
shows the same for the case where the driving frequency $\omega = 5\omega_1$.
Note that the frequency of the transient solution is $\omega_1$ which in general differs from the driving frequency
$\omega$. The phase shift $\beta - \delta$ for the transient component is set by the initial conditions. The transient response
leads to a more complicated motion immediately after the driving function is switched on. Figure 3.8
illustrates the amplitude time dependence and state space diagram for the transient component, and the
total response, when the driving frequency is either $\omega = \frac{\omega_1}{5}$ or $\omega = 5\omega_1$. Note that the modulation of the
steady-state response by the transient response is unimportant once the transient response has damped out,
leading to a constant elliptical state space trajectory. For cases where the initial conditions are $x = \dot{x} = 0$
then the transient solution has a relative phase difference $\beta - \delta = \pi$ radians at $t = 0$ and relative amplitudes
such that the transient and steady-state solutions cancel at $t = 0$.
The characteristic sounds of different types of musical instruments depend very much on the admixture
of transient solutions plus the number and mixture of oscillatory active modes. Percussive instruments, such
as the piano, have a large transient component. The mixture of transient and steady-state solutions for
forced oscillations occurs frequently in studies of networks in electrical circuit analysis.
3.6.4 Resonance
The discussion so far has covered the role of the transient and steady-state solutions of the driven damped
harmonic oscillator, which occurs frequently in science and engineering. Another important aspect is resonance,
which occurs when the driving frequency $\omega$ approaches the natural frequency $\omega_1$ of the damped system.
Consider the case where the time is sufficient for the transient solution to have decayed to zero.
Figure 3.9 shows the amplitude and phase for the steady-state response as the driving frequency $\omega$ is
changed through a resonance. The steady-state solution of the driven oscillator follows the driving force
when $\omega \ll \omega_0$ in that the phase difference is zero and the amplitude is just $\frac{F_0}{k}$. The response of the
system peaks at resonance, while for $\omega \gg \omega_0$ the harmonic system is unable to follow the more rapidly
oscillating driving force and thus the phase of the induced oscillation is out of phase with the driving force
and the amplitude of the oscillation tends to zero.
Note that the resonance frequency for a driven damped oscillator differs from that for the undriven damped
oscillator, and differs from that for the undamped oscillator. The natural frequency $\omega_0$ for an undamped
harmonic oscillator is given by
$$\omega_0^2 = \frac{k}{m} \tag{3.68}$$
The transient solution is the same as damped free oscillations of a damped oscillator and has a frequency of
the system $\omega_1$ given by
$$\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 \tag{3.69}$$
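The absorptive term steadily absorbs energy while the elastic term oscillates as energy is alternately absorbed
or emitted. The time average over one cycle is given by
$$\langle P \rangle = F_0\omega\left[-A_{el}\langle\cos\omega t\sin\omega t\rangle + A_{abs}\left\langle(\cos\omega t)^2\right\rangle\right] \tag{3.80}$$
where $\langle\cos\omega t\sin\omega t\rangle$ and $\left\langle\cos^2\omega t\right\rangle$ are the time averages over one cycle. The time averages over one complete
cycle for the first term in the bracket is
$$\langle P \rangle = \frac{1}{2}F_0\omega A_{abs} = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2} \tag{3.83}$$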
This shape of the power curve is a classic Lorentzian shape. Note that the maximum of the average kinetic
energy occurs at $\omega_{KE} = \omega_0$, which is different from the peak of the amplitude, which occurs at $\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$.
The potential energy is proportional to the amplitude squared, i.e. $x^2$, and so peaks at the same angular
frequency as the amplitude, that is, $\omega_{PE}^2 = \omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$. The kinetic and potential energies resonate
at different angular frequencies as a result of the fact that the driven damped oscillator is not conservative
because energy is continually exchanged between the oscillator and the driving force system in addition to
the energy dissipation due to the damping.
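When $\omega \sim \omega_0 \gg \Gamma$, then the power equation simplifies since
$$\left(\omega_0^2 - \omega^2\right) = (\omega_0 + \omega)(\omega_0 - \omega) \approx 2\omega_0(\omega_0 - \omega) \tag{3.84}$$
Therefore
$$\langle P \rangle \simeq \frac{F_0^2}{8m}\frac{\Gamma}{(\omega_0 - \omega)^2 + \left(\frac{\Gamma}{2}\right)^2} \tag{3.85}$$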
This is called the Lorentzian or Breit-Wigner shape. The half power points are at a frequency difference
from resonance of $\pm\Delta\omega$ where
$$\Delta\omega = |\omega_0 - \omega| = \frac{\Gamma}{2} \tag{3.86}$$
Thus the full width at half maximum of the Lorentzian curve equals $\Gamma$. Note that the Lorentzian has a
narrower peak but much wider tail relative to a Gaussian shape. At the peak of the absorbed power, the
absorptive amplitude can be written as
$$A_{abs}(\omega = \omega_0) = \frac{F_0}{m\omega_0^2}Q \tag{3.87}$$
That is, the peak amplitude increases with increase in $Q$. This explains the classic comedy scene where the
soprano shatters the crystal glass because the highest quality crystal glass has a high $Q$ which leads to a
large amplitude oscillation when she sings on resonance.
The mean lifetime $\tau$ of the free linearly-damped harmonic oscillator, that is, the time for the energy of
free oscillations to decay to $1/e$ of its initial value, was shown to be related to the damping coefficient $\Gamma$ by
$$\tau = \frac{1}{\Gamma} \tag{3.88}$$
Therefore we have the classical uncertainty principle for the linearly-damped harmonic oscillator
that the measured full-width at half maximum of the energy resonance curve for forced oscillation and the
mean life for decay of the energy of a free linearly-damped oscillator are related by
$$\Gamma\tau = 1 \tag{3.89}$$
This relation is correct only for a linearly-damped harmonic system. Comparable relations between the
lifetime and damping width exist for different forms of damping.
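The relation between the resonance width and the lifetime can be illustrated numerically. The sketch below assumes the time-averaged power $\langle P \rangle = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}$, locates the two half-power points by bisection, and compares the full width with $\Gamma$. The helper `bisect`, the bracketing intervals, and the parameter values are illustrative assumptions.

```python
def mean_power(omega, omega0, gamma, f0=1.0, m=1.0):
    """Time-averaged absorbed power of the driven, linearly-damped oscillator."""
    return (f0**2 / (2.0 * m)) * gamma * omega**2 / ((omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)

def bisect(f, lo, hi, steps=200):
    """Locate a sign change of f between lo and hi by repeated halving."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

omega0, gamma = 10.0, 0.5  # illustrative values
half_max = 0.5 * mean_power(omega0, omega0, gamma)  # the power peaks at omega = omega0
excess = lambda w: mean_power(w, omega0, gamma) - half_max

omega_low = bisect(excess, 0.0, omega0)            # half-power point below resonance
omega_high = bisect(excess, omega0, 10.0 * omega0)  # half-power point above resonance
fwhm = omega_high - omega_low
tau = 1.0 / gamma  # mean lifetime of the energy of the free oscillator

print("FWHM of the power curve:", fwhm, " damping coefficient gamma:", gamma)
print("FWHM * tau =", fwhm * tau)
```

For this power curve the half-power points satisfy $\omega_0^2 - \omega^2 = \pm\Gamma\omega$, so the full width equals $\Gamma$ without needing the narrow-resonance approximation, and the product of width and lifetime comes out as 1.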
One can demonstrate the above line width and decay time relationship using an acoustically driven
electric guitar string. The same relation holds between the width of the emitted electromagnetic radiation
and the lifetime of atomic or nuclear electromagnetic decay. This classical uncertainty principle is exactly the same as the
one encountered in quantum physics due to wave-particle duality. In nuclear physics it is difficult to measure
the lifetime of states when $\tau < 10^{-13}$ s. For shorter lifetimes the value of $\Gamma$ can be determined from the shape
of the resonance curve which can be measured directly when the damping is large.
= 00 where is the phase difference between the voltage and the current. For this circuit the impedance
is given by
$$Z = R + i\left(\omega L - \frac{1}{\omega C}\right)$$
Because of the phases involved in this circuit, at resonance the maximum voltage across the resistor
occurs at a frequency of $\omega_R = \omega_0$, across the capacitor the maximum voltage occurs at a frequency
$\omega_C^2 = \omega_0^2 - \frac{R^2}{2L^2}$, and across the inductor the maximum voltage occurs at a frequency
$\omega_L^2 = \frac{\omega_0^2}{1 - \frac{R^2}{2L^2\omega_0^2}}$, where $\omega_0^2 = \frac{1}{LC}$
is the resonance angular frequency when $R = 0$. Thus these resonance frequencies differ when $R > 0$.
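The three resonance frequencies of the series circuit can be compared numerically. The sketch below assumes the impedance $Z = R + i\left(\omega L - \frac{1}{\omega C}\right)$ and a driving voltage of fixed amplitude, so that the voltage amplitudes across the resistor, capacitor, and inductor are proportional to $R/|Z|$, $\frac{1}{\omega C}/|Z|$, and $\omega L/|Z|$. The component values and the helper `argmax` are illustrative assumptions.

```python
import math

def argmax(f, lo, hi, steps=200):
    """Ternary search for the maximum of a single-peaked function on [lo, hi]."""
    for _ in range(steps):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)

R, L, C = 0.5, 1.0, 1.0  # illustrative component values in consistent units

def z_mag(w):
    """Magnitude of the series impedance Z = R + i(wL - 1/(wC))."""
    return math.hypot(R, w * L - 1.0 / (w * C))

v_resistor = lambda w: R / z_mag(w)
v_capacitor = lambda w: (1.0 / (w * C)) / z_mag(w)
v_inductor = lambda w: (w * L) / z_mag(w)

omega0 = 1.0 / math.sqrt(L * C)
w_r = argmax(v_resistor, 0.1, 5.0)
w_c = argmax(v_capacitor, 0.1, 5.0)
w_l = argmax(v_inductor, 0.1, 5.0)

print("resistor peak :", w_r, " expected", omega0)
print("capacitor peak:", w_c, " expected", math.sqrt(omega0**2 - R**2 / (2 * L**2)))
print("inductor peak :", w_l, " expected", math.sqrt(omega0**2 / (1 - R**2 / (2 * L**2 * omega0**2))))
```

With nonzero resistance the capacitor voltage peaks below $\omega_0$ and the inductor voltage peaks above it, while the resistor voltage, which follows the current, peaks at $\omega_0$.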
$$\frac{\partial\Psi}{\partial t} = \mp v\frac{\partial\Psi}{\partial x} \tag{3.92}$$
The sign in this equation depends on the sign of the wave velocity, making it not a generally useful formula.
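Consider the second derivatives
$$\frac{\partial^2\Psi}{\partial x^2} = \frac{d^2\Psi}{d\phi^2}\left(\frac{\partial\phi}{\partial x}\right)^2 = \frac{d^2\Psi}{d\phi^2} \tag{3.93}$$
and
$$\frac{\partial^2\Psi}{\partial t^2} = \frac{d^2\Psi}{d\phi^2}\left(\frac{\partial\phi}{\partial t}\right)^2 = +v^2\frac{d^2\Psi}{d\phi^2} \tag{3.94}$$
where $\phi$ denotes the argument $x \mp vt$ of the travelling-wave function. Factoring out $\frac{d^2\Psi}{d\phi^2}$ gives
$$\frac{\partial^2\Psi}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2\Psi}{\partial t^2} \tag{3.95}$$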
This wave equation in one dimension for a linear system is independent of the sign of the velocity. There
are an infinite number of possible shapes of waves, both travelling and standing, in one dimension, all of which
must satisfy this one-dimensional wave equation. The converse is that any function that satisfies this
one-dimensional wave equation must be a wave in this one dimension.
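The Wave Equation in three dimensions is
$$\nabla^2\Psi \equiv \frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} + \frac{\partial^2\Psi}{\partial z^2} = \frac{1}{v^2}\frac{\partial^2\Psi}{\partial t^2} \tag{3.96}$$
There are an infinite number of possible solutions $\Psi$ to this wave equation, any one of which corresponds to
a wave motion with velocity $v$.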
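The claim that any travelling waveform satisfies the wave equation can be spot-checked numerically. The sketch below takes a Gaussian pulse $\Psi(x, t) = e^{-(x - vt)^2}$, which is an arbitrary illustrative choice of waveform, and uses finite differences to compare $\frac{\partial^2\Psi}{\partial x^2}$ with $\frac{1}{v^2}\frac{\partial^2\Psi}{\partial t^2}$ in one dimension. The function names, step size, and wave speed are assumptions.

```python
import math

def psi(x, t, v=2.0):
    """A pulse travelling in the +x direction: Psi = f(x - v*t) with a Gaussian profile f."""
    return math.exp(-((x - v * t) ** 2))

def wave_residual(x, t, v=2.0, h=1e-3):
    """Finite-difference estimate of d2Psi/dx2 - (1/v^2) d2Psi/dt2, which should be near zero."""
    d2x = (psi(x + h, t, v) - 2.0 * psi(x, t, v) + psi(x - h, t, v)) / h**2
    d2t = (psi(x, t + h, v) - 2.0 * psi(x, t, v) + psi(x, t - h, v)) / h**2
    return d2x - d2t / v**2

samples = [(0.1 * i - 2.0, 0.05 * j) for i in range(41) for j in range(10)]
worst = max(abs(wave_residual(x, t)) for x, t in samples)
print("largest wave-equation residual over the sample grid:", worst)
```

Replacing $x - vt$ by $x + vt$ in `psi` gives a pulse travelling in the opposite direction with the same small residual, consistent with the wave equation being independent of the sign of the velocity.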
The Wave Equation is applicable to all manifestations of wave motion, both transverse and longitudinal,
for linear systems. That is, it applies to waves on a string, water waves, seismic waves, sound waves,
electromagnetic waves, matter waves, etc. If it can be shown that a wave equation can be derived for any
system, discrete or continuous, then this is equivalent to proving the existence of waves of any waveform,
frequency, or wavelength travelling with the phase velocity given by the wave equation.[Cra65]
2) The spatial dependence of the waveform at a given instant = 0 which can be expressed using a
Fourier decomposition of the spatial dependence as a function of wavenumber = 0
∞
X ∞
X
Ψ( 0 ) = (0 −1 0 ) = (0 ) 0 (3.100)
=−∞ =−∞
The above is applicable both to discrete, or continuous linear oscillator systems, e.g. waves on a string.
In summary, stationary normal modes of a system are obtained by a superposition of travelling waves
travelling in opposite directions, or equivalently, travelling waves can result from a superposition of stationary
normal modes.
Note that since the average over $2\pi$ of $\cos^2 = \frac{1}{2}$ then the average over the $\cos^2(\omega_1 t - \beta)$ term gives the
intensity $I(t) = \frac{A^2}{2}e^{-\Gamma t}$ which has a mean lifetime for the decay of $\tau = \frac{1}{\Gamma}$. The $|G(\omega)|^2$ distribution has the
classic Lorentzian shape, shown in figure 3.12, which has a full width at half-maximum, FWHM, equal to $\Gamma$.
Note that $G(\omega)$ is complex and thus one also can determine the phase shift $\delta$ which is given by the ratio of
the imaginary to real parts of equation 3.105, i.e. $\tan\delta = \frac{\Gamma}{2(\omega - \omega_1)}$.
The mean lifetime $\tau$ of the exponential decay of the intensity can be determined either by measuring $\tau$
from the time dependence, or by measuring the FWHM $\Gamma = \frac{1}{\tau}$ of the Fourier transform $|G(\omega)|^2$. In nuclear
and atomic physics excited levels decay by photon emission with the wave form of the free linearly-damped,
linear oscillator. Typically the mean lifetime $\tau$ can be measured when $\tau \gtrsim 10^{-12}$ s whereas for
shorter lifetimes the radiation width $\Gamma$ becomes sufficiently large to be measured. Thus the two experimental
approaches are complementary.
For each harmonic term the response of a linearly-damped linear oscillator to the forcing function
$F(t) = F_0(\omega)\cos(\omega t)$ is given by equations (3.65-3.67) to be
This is shown schematically in figure 3.13. The Fourier transformation connects the three quantities in the
time domain with the corresponding three in the frequency domain. For example, the impulse response of
the low-pass filter has a fall time of $\tau$ which is related by a Fourier transform to the width of the transfer
function. Thus the time and frequency domain approaches are closely related and give the same result for
the output signal for the low-pass filter to the applied square-wave input signal. The result is that the
higher-frequency components are attenuated leading to slow rise and fall times in the time domain.
Analog signal processing and Fourier analysis were the primary tools to analyze and process all forms of
periodic motion during the 20th century. For example, musical instruments, mechanical systems, and electronic
circuits all employed resonant systems to enhance the desired frequencies and suppress the undesirable
frequencies, and the signals were observed using analog oscilloscopes. The remarkable development of
computing has enabled use of digital signal processing, leading to a revolution in signal processing that has had a
profound impact on both science and engineering. For example, the digital oscilloscope, which can sample at
frequencies above $10^9$ Hz, has replaced the analog oscilloscope because it allows sophisticated analysis of each
individual signal that was not possible using analog signal processing. In nuclear physics, the analog approach
involved tiny analog electric signals, produced by many individual radiation detectors,
that were transmitted hundreds of meters via carefully shielded and expensive coaxial cables to the data
room where the signals were amplified and processed using analog filters to maximize the signal to
noise ratio in order to separate the signal from the background noise. Stray electromagnetic radiation picked up
via the cables significantly degraded the signals. The performance and limitations of the analog electronics
severely restricted the pulse processing capabilities. Digital signal processing has rapidly replaced analog
signal processing. Analog to digital detector circuits are built directly into the electronics for each individual
detector so that only digital information needs to be transmitted from each detector to the analysis
computers. Computer processing provides flexible processing capabilities for the digital signals,
greatly enhancing the response and sensitivity of our detector systems. Common examples of digital signal
processing are digital CD and DVD disks.
Figure 3.11: Response of an electrical circuit to an input square wave. The upper row shows the time
and the exponential-form frequency representations of the square-wave input signal. The middle row gives
the impulse response, and corresponding transfer function for the circuit. The bottom row shows the
corresponding output properties in both the time and frequency domains.
the concept of wave-particle duality and the development of wave mechanics by Schrödinger.
The argument of the exponential is called the phase $\phi$ of the wave where
$$\phi \equiv kx - \omega t \tag{3.114}$$
If we move along the $x$ axis at a velocity such that the phase is constant then we perceive a stationary
wave. The velocity of this wave is called the phase velocity. To ensure constant phase we require that
$\phi$ is constant or, assuming real $k$ and $\omega$
$$\omega\,dt = k\,dx \tag{3.115}$$
Therefore the phase velocity is defined to be
$$v_{phase} = \frac{\omega}{k} \tag{3.116}$$
The velocity we have used so far is just the phase velocity of the individual wavelets at the carrier frequency.
If $k$ or $\omega$ are complex then one must take the real parts to ensure that the velocity is real.
If the phase velocity of a wave is dependent on the wavelength, that is, $v_{phase}(\lambda)$, then the system is
said to be dispersive in that the wave is dispersed according to the wavelength. The simplest illustration of
dispersion is the refraction of light in a glass prism, which leads to dispersion of the light into the spectrum
of wavelengths. Dispersion leads to development of wave packets that travel at group and signal velocities
that usually differ from the phase velocity. To illustrate this consider two equal amplitude travelling waves
having slightly different wave number $k$ and angular frequency $\omega$. Superposition of these waves gives
In the event that $n \to \infty$ and the frequencies are continuously distributed, then the summation is replaced
by an integral
$$\Psi(x, t) = \int_{-\infty}^{\infty} A(k)\,e^{i(kx \pm \omega t)}\,dk \tag{3.121}$$
where the factor $A(k)$ represents the distribution amplitudes of the component waves, that is the spectral
decomposition of the wave. This is the usual Fourier decomposition of the spatial distribution of the wave.
Consider an extension of the linear superposition of two waves to a well defined wave packet where the
amplitude is nonzero only for a small range of wavenumbers $k_0 \pm \Delta k$
$$\Psi(x, t) = \int_{k_0 - \Delta k}^{k_0 + \Delta k} A(k)\,e^{i(kx - \omega t)}\,dk \tag{3.122}$$
This functional shape is called a wave packet which only has meaning if $\Delta k \ll k_0$. The angular frequency
can be expressed by making a Taylor expansion around $k_0$
$$\omega(k) = \omega(k_0) + \left(\frac{\partial\omega}{\partial k}\right)_{k_0}(k - k_0) + \ldots \tag{3.123}$$
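The first-order coefficient of this Taylor expansion, $\left(\frac{\partial\omega}{\partial k}\right)_{k_0}$, is the velocity at which the envelope of the wave packet moves, and it can be compared with the phase velocity $\frac{\omega}{k}$ for any chosen dispersion relation. The sketch below assumes, purely for illustration, a relation of the form $\omega^2 = \omega_P^2 + c^2k^2$; the constants `C` and `OMEGA_P` and the function names are hypothetical.

```python
import math

C = 1.0        # wave speed scale (hypothetical units)
OMEGA_P = 2.0  # cutoff angular frequency (hypothetical value)

def omega(k):
    """Assumed dispersion relation omega(k) = sqrt(OMEGA_P^2 + (C*k)^2)."""
    return math.sqrt(OMEGA_P**2 + (C * k) ** 2)

def phase_velocity(k):
    """v_phase = omega / k."""
    return omega(k) / k

def group_velocity(k, h=1e-6):
    """Central-difference estimate of d(omega)/dk evaluated at k."""
    return (omega(k + h) - omega(k - h)) / (2.0 * h)

for k0 in (0.5, 2.0, 8.0):
    vp, vg = phase_velocity(k0), group_velocity(k0)
    print(f"k0 = {k0:4.1f}: v_phase = {vp:.4f}, d(omega)/dk = {vg:.4f}, product = {vp * vg:.4f}")
```

For this particular relation the product of the two velocities equals $c^2$ at every wavenumber, so a phase velocity larger than $c$ is always accompanied by an envelope velocity smaller than $c$. A non-dispersive relation $\omega = ck$ would instead give equal phase and envelope velocities.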
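The summation of terms in the exponent given by 3.125 leads to the amplitude 3.123 having the form of a
product where the integral becomes
$$\Psi(x, t) = e^{i(k_0 x - \omega_0 t)}\int_{k_0 - \Delta k}^{k_0 + \Delta k} A(k)\,e^{i(k - k_0)\left[x - \left(\frac{\partial\omega}{\partial k}\right)_{k_0} t\right]}\,dk \tag{3.125}$$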
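$\nabla^2\mathbf{E} - \epsilon\mu\frac{\partial^2\mathbf{E}}{\partial t^2} - \sigma\mu\frac{\partial\mathbf{E}}{\partial t} = 0$
$\nabla^2\mathbf{H} - \epsilon\mu\frac{\partial^2\mathbf{H}}{\partial t^2} - \sigma\mu\frac{\partial\mathbf{H}}{\partial t} = 0$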
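The third term in both of these wave equations is a damping term that leads to a damped solution of an
electromagnetic wave in a good conductor.
These damped wave equations can be solved by considering an incident wave
$\mathbf{E} = E_0\hat{\mathbf{x}}\, e^{i(\omega t - kz)}$
Substituting this into the wave equation for $\mathbf{E}$ gives
$-k^2 + \omega^2\epsilon\mu - i\omega\sigma\mu = 0$
That is
$k^2 = \omega^2\epsilon\mu\left[1 - i\frac{\sigma}{\omega\epsilon}\right]$
In general $k$ is complex, that is, it has real and imaginary parts, $k = k_R - ik_I$, that lead to a solution of the form
$\mathbf{E} = E_0\hat{\mathbf{x}}\, e^{-k_I z}\, e^{i(\omega t - k_R z)}$
The first exponential term is an exponential damping term while the second exponential term is the oscillating
term.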
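Consider that the plasma involves the motion of a bound damped electron, of charge $q$ and mass $m$, bound
in a one dimensional atom or lattice subject to an oscillatory electric field of frequency $\omega$. Assume that the
electromagnetic wave is travelling in the $\hat{\mathbf{z}}$ direction with the transverse electric field in the $\hat{\mathbf{x}}$ direction. The
equation of motion of an electron can be written as
$m\ddot{\mathbf{x}} + m\Gamma\dot{\mathbf{x}} + m\omega_0^2\mathbf{x} = qE_0\hat{\mathbf{x}}\, e^{i(\omega t - kz)}$
where $\Gamma$ is the damping factor. The instantaneous displacement of the oscillating charge equals
$\mathbf{x} = \frac{q}{m}\frac{1}{(\omega_0^2 - \omega^2) + i\Gamma\omega}E_0\hat{\mathbf{x}}\, e^{i(\omega t - kz)}$
and the velocity is
$\dot{\mathbf{x}} = \frac{q}{m}\frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}E_0\hat{\mathbf{x}}\, e^{i(\omega t - kz)}$
Thus the instantaneous current density is given by
$\mathbf{j} = Nq\dot{\mathbf{x}} = \frac{Nq^2}{m}\frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}E_0\hat{\mathbf{x}}\, e^{i(\omega t - kz)}$
where $N$ is the number density of the electrons.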
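Since $\mathbf{j} = \sigma\mathbf{E}$, the electrical conductivity is
$\sigma = \frac{Nq^2}{m}\frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}$
Let us consider only unbound charges in the plasma, that is let $\omega_0 = 0$. Then the conductivity is given by
$\sigma = \frac{Nq^2}{m}\frac{i\omega}{i\Gamma\omega - \omega^2}$
For a low density ionized plasma $\omega \gg \Gamma$, thus the conductivity is given approximately by
$\sigma \approx -i\frac{Nq^2}{m\omega}$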
Since $\sigma$ is pure imaginary, $\mathbf{j}$ and $\mathbf{E}$ have a phase difference of $\pi/2$, which implies that the average of
the Joule heating over a complete period is $\langle\mathbf{j}\cdot\mathbf{E}\rangle = 0$. Thus there is no energy loss due to Joule heating,
implying that the electromagnetic energy is conserved.
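Substitution of $\sigma$ into the relation for $k^2$ gives
$k^2 = \omega^2\epsilon\mu\left[1 - i\frac{\sigma}{\omega\epsilon}\right] = \omega^2\epsilon\mu\left[1 - \frac{Nq^2}{\epsilon m\omega^2}\right]$
Define the Plasma oscillation frequency $\omega_P$ to be
$\omega_P \equiv \sqrt{\frac{Nq^2}{\epsilon m}}$
then $k^2$ can be written as
$k^2 = \omega^2\epsilon\mu\left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right]$ ($\alpha$)
For a low density plasma the dielectric constant $K_E \simeq 1$ and the relative permeability $K_B \simeq 1$ and thus
$\epsilon = K_E\epsilon_0 \simeq \epsilon_0$ and $\mu = K_B\mu_0 \simeq \mu_0$. The velocity of light in vacuum is $c = \frac{1}{\sqrt{\epsilon_0\mu_0}}$. Thus for low density
equation $\alpha$ can be written as
$\omega^2 = \omega_P^2 + c^2k^2$ ($\beta$)
Differentiation of equation $\beta$ with respect to $k$ gives $2\omega\frac{d\omega}{dk} = 2c^2k$. That is, $v_{phase}v_{group} = c^2$ and the phase
velocity is
$v_{phase} = \sqrt{c^2 + \frac{\omega_P^2}{k^2}}$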
There are three cases to consider.
1) $\omega > \omega_P$: For this case $0 < \left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right] < 1$ and thus $k$ is a pure real number. Therefore the
electromagnetic wave is transmitted with a phase velocity that exceeds $c$ while the group velocity is less than
$c$.
2) $\omega < \omega_P$: For this case $\left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right] < 0$ and thus $k$ is a pure imaginary number. Therefore the
electromagnetic wave is not transmitted and in the ionosphere it is attenuated rapidly as $e^{-\left(\frac{\omega_P}{c}\right)z}$. However,
since there are no Joule heating losses the electromagnetic wave must be completely reflected. Thus the
Plasma oscillation frequency serves as a cut-off frequency. For this example the signal and group velocities
are identical.
For the ionosphere $N = 10^{11}$ electrons/m$^3$, which corresponds to a Plasma oscillation frequency of
$\nu_P = \omega_P/2\pi = 3$ MHz. Thus electromagnetic waves in the AM waveband ($< 1.6$ MHz) are totally reflected by
the ionosphere and bounce repeatedly around the Earth, whereas for VHF frequencies above 3 MHz, the waves
are transmitted and refracted passing through the atmosphere. Thus light is transmitted by the ionosphere.
By contrast, for a good conductor like silver, the Plasma oscillation frequency is around $10^{16}$ Hz which is
in the far ultraviolet part of the spectrum. Thus, all lower frequencies, such as light, are totally reflected
by such a good conductor, whereas X-rays have frequencies above the Plasma oscillation frequency and are
transmitted.
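The numbers quoted for the ionosphere can be checked with a short numerical sketch. It assumes free electrons as the charge carriers, an electron density of $10^{11}$ electrons/m$^3$, vacuum values of $\epsilon$ and $\mu$, and the low-density dispersion relation $\omega^2 = \omega_P^2 + c^2k^2$; the function names are illustrative only.

```python
import math

# SI constants (CODATA values); the unbound charges are assumed to be electrons.
EPSILON_0 = 8.8541878128e-12        # vacuum permittivity, F/m
ELECTRON_CHARGE = 1.602176634e-19   # C
ELECTRON_MASS = 9.1093837015e-31    # kg
LIGHT_SPEED = 2.99792458e8          # m/s


def plasma_angular_frequency(number_density):
    """omega_P = sqrt(N q^2 / (epsilon_0 m)) for N electrons per cubic metre."""
    return math.sqrt(number_density * ELECTRON_CHARGE**2 / (EPSILON_0 * ELECTRON_MASS))


def wavenumber_squared(frequency_hz, number_density):
    """k^2 = (omega^2 - omega_P^2) / c^2; a negative value means k is imaginary."""
    omega = 2.0 * math.pi * frequency_hz
    omega_p = plasma_angular_frequency(number_density)
    return (omega**2 - omega_p**2) / LIGHT_SPEED**2


def phase_and_group_velocity(frequency_hz, number_density):
    """Return (v_phase, v_group) for a transmitted wave, that is omega > omega_P."""
    omega = 2.0 * math.pi * frequency_hz
    k = math.sqrt(wavenumber_squared(frequency_hz, number_density))
    v_phase = omega / k
    v_group = LIGHT_SPEED**2 * k / omega  # from 2 omega d(omega)/dk = 2 c^2 k
    return v_phase, v_group


if __name__ == "__main__":
    n_ionosphere = 1.0e11  # electrons per m^3 (assumed value)
    f_p = plasma_angular_frequency(n_ionosphere) / (2.0 * math.pi)
    print(f"plasma frequency: {f_p / 1e6:.2f} MHz")
    for f in (1.0e6, 100.0e6):
        k2 = wavenumber_squared(f, n_ionosphere)
        verdict = "transmitted" if k2 > 0 else "reflected"
        print(f"{f / 1e6:6.1f} MHz: k^2 = {k2:+.3e} m^-2 ({verdict})")
```

For a transmitted frequency the two velocities returned by `phase_and_group_velocity` straddle $c$ and their product equals $c^2$, as derived above.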
3.11. WAVE PROPAGATION 79
That is, the transform of a rectangular wavepacket gives a cosine wave modulated by an unnormalized
sinc function, which is a nice example of a simple wave packet. That is, on the right hand side we have
a wavepacket $\Delta t = \pm\frac{2\pi}{\Delta\omega}$ wide. Note that the product of the two measures of the widths is $\Delta\omega\cdot\Delta t = \pm\pi$.
Example 2 considers a rectangular pulse of unity amplitude between $-\frac{\tau}{2} \leq t \leq \frac{\tau}{2}$ which resulted in a
Fourier transform $G(\omega) = \tau\left(\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right)$. That is, for a pulse of width $\Delta t = \pm\frac{\tau}{2}$ the frequency envelope has
the first zero at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Note that this is the complementary system to the one considered here which has
$\Delta\omega\cdot\Delta t = \pm\pi$, illustrating the symmetry of the Fourier transform and its inverse.
This product of the standard deviations equals unity only for the special case of Gaussian-shaped spectral
distributions, and is greater than unity for all other shaped spectral distributions.
The intensity of the wave is the square of the amplitude, leading to standard deviation widths for a
Gaussian distribution where $\sigma_I(t)^2 = \frac{1}{2}\sigma_A(t)^2$, that is, $\sigma_I(t) = \frac{\sigma_A(t)}{\sqrt{2}}$. Thus the standard deviations for the
spectral distribution and width of the intensity of the wavepacket are related by:
$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2}$ (Uncertainty principle for frequency-time intensities)
This states that the uncertainties with which you can simultaneously measure the time and frequency
for the intensity of a given wavepacket are related. If you try to measure the frequency within a short time
interval $\sigma_I(t)$ then the uncertainty in the frequency measurement is $\sigma_I(\omega) \geq \frac{1}{2\sigma_I(t)}$. Accurate measurement
of the frequency requires measurement times that encompass many cycles of oscillation, that is, a long
wavepacket.
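The frequency-time relation for intensities can be illustrated numerically. The following is a minimal sketch, not the author's method: it evaluates the Fourier integral by a direct Riemann sum for two assumed pulse shapes, a Gaussian and a two-sided exponential, and forms the product of the intensity-weighted standard deviations. The grid sizes and function names are illustrative assumptions.

```python
import math


def intensity_std(xs, amplitudes):
    """Standard deviation of x weighted by the intensity |amplitude|^2."""
    weights = [a * a for a in amplitudes]
    total = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    variance = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / total
    return math.sqrt(variance)


def spectrum_modulus(ts, f_vals, omega):
    """|G(omega)| from a direct Riemann-sum Fourier integral of a real f(t)."""
    dt = ts[1] - ts[0]
    re = sum(f * math.cos(omega * t) for t, f in zip(ts, f_vals)) * dt
    im = sum(f * math.sin(omega * t) for t, f in zip(ts, f_vals)) * dt
    return math.hypot(re, im)


def width_product(pulse, t_max=20.0, n_t=401, omega_max=10.0, n_omega=201):
    """sigma_I(t) * sigma_I(omega) for the amplitude function pulse(t)."""
    ts = [-t_max + 2.0 * t_max * i / (n_t - 1) for i in range(n_t)]
    omegas = [-omega_max + 2.0 * omega_max * j / (n_omega - 1) for j in range(n_omega)]
    f_vals = [pulse(t) for t in ts]
    g_vals = [spectrum_modulus(ts, f_vals, w) for w in omegas]
    return intensity_std(ts, f_vals) * intensity_std(omegas, g_vals)


def gaussian(t):
    return math.exp(-0.5 * t * t)


def two_sided_exponential(t):
    return math.exp(-abs(t))


if __name__ == "__main__":
    print("Gaussian product:   ", width_product(gaussian))
    print("exponential product:", width_product(two_sided_exponential))
```

The Gaussian pulse should give a product very close to 1/2, while the exponential pulse gives a larger value. The finite frequency window truncates the slowly decaying tail of the exponential pulse's spectrum, so its computed product is an underestimate.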
Exactly the same relations exist between the spectral distribution as a function of wavenumber and the
spatial dependence of a wave, which are conjugate representations. Thus the spectral distribution plotted
versus $k_x$ is directly related to the amplitude as a function of position $x$; the spectral distribution versus $k_y$ is
related to the amplitude as a function of $y$; and the $k_z$ spectral distribution is related to the spatial dependence
on $z$. Following the same arguments discussed above, the standard deviation, $\sigma_I(k_x)$, characterizing the width
of the spectral intensity distribution of $k_x$, and the standard deviation $\sigma_I(x)$, characterizing the spatial
width of the wave packet intensity as a function of $x$, are related by the Uncertainty Principle for
position-wavenumber. Thus in summary the uncertainty principle for the intensity of wave motion is,
$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2}$ (3.128)
$\sigma_I(x)\cdot\sigma_I(k_x) \geq \frac{1}{2}$, $\sigma_I(y)\cdot\sigma_I(k_y) \geq \frac{1}{2}$, $\sigma_I(z)\cdot\sigma_I(k_z) \geq \frac{1}{2}$
This applies to all forms of wave motion, be they sound waves, water waves, electromagnetic waves, or
matter waves.
As discussed in chapter 17, the transition to quantum mechanics involves relating the matter-wave
properties to the energy and momentum of the corresponding particle. That is, in the case of matter waves,
multiplying both sides of equation 3.128 by $\hbar$ and using the de Broglie relations gives that the particle
energy is related to the angular frequency by $E = \hbar\omega$ and the particle momentum is related to the wavenumber,
that is $\mathbf{p} = \hbar\mathbf{k}$. These lead to the Heisenberg Uncertainty Principle:
$\sigma_I(t)\cdot\sigma_I(E) \geq \frac{\hbar}{2}$ (3.129)
$\sigma_I(x)\cdot\sigma_I(p_x) \geq \frac{\hbar}{2}$, $\sigma_I(y)\cdot\sigma_I(p_y) \geq \frac{\hbar}{2}$, $\sigma_I(z)\cdot\sigma_I(p_z) \geq \frac{\hbar}{2}$
This uncertainty principle applies equally to the wavefunction of the electron in the
hydrogen atom, proton in a nucleus, as well as to a wavepacket describing a particle wave moving along
some trajectory. Thus, this implies that, for a particle of given momentum, the wavefunction is spread out
spatially. Planck’s constant $\hbar = 1.054 \times 10^{-34}$ J·s $= 6.582 \times 10^{-16}$ eV·s is extremely small compared with energies
and times encountered in normal life, and thus the effects due to the Uncertainty Principle are not manifest
for macroscopic dimensions.
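Confinement of a particle, of mass $m$, within $\pm\sigma(r)$ of a fixed location implies that there is a corresponding
uncertainty in the momentum
$\sigma(p_r) \geq \frac{\hbar}{2\sigma(r)}$ (3.130)
Now the variance in momentum $\mathbf{p}$ is given by the difference in the average of the square $\langle\mathbf{p}\cdot\mathbf{p}\rangle$, and the
square of the average $\langle\mathbf{p}\rangle^2$. That is
$\sigma(\mathbf{p})^2 = \langle\mathbf{p}\cdot\mathbf{p}\rangle - \langle\mathbf{p}\rangle^2$ (3.131)
For a particle confined about a fixed location $\langle\mathbf{p}\rangle = 0$, so that $\langle\mathbf{p}\cdot\mathbf{p}\rangle = \sigma(\mathbf{p})^2$ and thus
Kinetic energy $= \frac{p^2}{2m} \geq \frac{\hbar^2}{8m\sigma(r)^2}$ (Zero-point energy)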
This zero-point energy is the minimum kinetic energy that a particle of mass $m$ can have if confined within a
distance $\pm\sigma(r)$. This zero-point energy is a consequence of wave-particle duality and the uncertainty between
the size and wavenumber for any wave packet. It is a quantal effect in that the classical limit has $\hbar \to 0$ for
which the zero-point energy $\to 0$.
Inserting numbers for the zero-point energy gives that an electron confined to the radius of the atom,
that is $\sigma(r) = 10^{-10}$ m, has a zero-point kinetic energy of $\sim 1$ eV. Confining this electron to $3 \times 10^{-15}$ m, the
size of a nucleus, gives a zero-point energy of $10^9$ eV (1 GeV). Confining a proton to the size of the nucleus
gives a zero-point energy of 0.5 MeV. These values are typical of the level spacing observed in atomic and
nuclear physics. If $\hbar$ were a large number, then a billiard ball confined to a billiard table would be a blur
as it oscillated with the minimum zero-point kinetic energy. The smaller the spatial region to which the ball
was confined, the larger would be its zero-point energy and momentum, causing it to rattle back and forth
between the boundaries of the confined region. Life would be dramatically different if $\hbar$ were a large number.
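The quoted zero-point energies can be reproduced with a few lines of arithmetic. This sketch simply evaluates the non-relativistic bound $\hbar^2/(8m\sigma(r)^2)$ with CODATA constants; the function name is illustrative, and the confinement widths are the ones used in the text.

```python
HBAR = 1.054571817e-34           # reduced Planck constant, J s
ELECTRON_VOLT = 1.602176634e-19  # J per eV
ELECTRON_MASS = 9.1093837015e-31   # kg
PROTON_MASS = 1.67262192369e-27    # kg


def zero_point_energy_ev(mass_kg, sigma_r_m):
    """Minimum kinetic energy hbar^2 / (8 m sigma_r^2), returned in eV.

    Non-relativistic estimate only; it is an order-of-magnitude guide
    when the result is comparable to or larger than the rest energy.
    """
    return HBAR**2 / (8.0 * mass_kg * sigma_r_m**2) / ELECTRON_VOLT


if __name__ == "__main__":
    cases = [
        ("electron in an atom   ", ELECTRON_MASS, 1.0e-10),
        ("electron in a nucleus ", ELECTRON_MASS, 3.0e-15),
        ("proton in a nucleus   ", PROTON_MASS, 3.0e-15),
    ]
    for label, mass, sigma_r in cases:
        print(f"{label}: {zero_point_energy_ev(mass, sigma_r):.3e} eV")
```

The electron-in-a-nucleus value far exceeds the electron rest energy of 0.511 MeV, so for that case the non-relativistic formula should be read only as an order-of-magnitude estimate.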
In summary, Heisenberg’s Uncertainty Principle is a well-known and crucially important aspect of quan-
tum physics. What is less well known, is that the Uncertainty Principle exists for all forms of wave motion,
that is, it is not restricted to matter waves. The following three examples illustrate application of the
Uncertainty Principle to acoustics, the nuclear Mössbauer effect, and quantum mechanics.
3.12 Summary
Linear systems have the feature that the solutions obey the Principle of Superposition, that is, the am-
plitudes add linearly for the superposition of different oscillatory modes. Applicability of the Principle of
Superposition to a system provides a tremendous advantage for handling and solving the equations of motion
of oscillatory systems.
Geometric representations of the motion of dynamical systems provide sensitive probes of periodic
motion. Configuration space $(\mathbf{q}, \mathbf{q}, t)$, state space $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and phase space $(\mathbf{q}, \mathbf{p}, t)$ are powerful geometric
representations that are used extensively for recognizing periodic motion, where $\mathbf{q}$, $\dot{\mathbf{q}}$ and $\mathbf{p}$ are vectors in
$n$-dimensional space.
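Linearly-damped free linear oscillator The free linearly-damped linear oscillator is characterized by
the equation
$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0$ (3.26)
The solutions of the linearly-damped free linear oscillator are of the form
$z = e^{-\left(\frac{\Gamma}{2}\right)t}\left[z_1 e^{i\omega_1 t} + z_2 e^{-i\omega_1 t}\right]$ where $\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$ (3.33)
The solutions fall into three categories
underdamped: $x(t) = A e^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1 t - \beta)$ with $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} > 0$
overdamped: $x(t) = \left[A_1 e^{-\omega_+ t} + A_2 e^{-\omega_- t}\right]$ with $\omega_\pm = -\left[-\frac{\Gamma}{2} \pm \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}\right]$
critically damped: $x(t) = (A + Bt)\, e^{-\left(\frac{\Gamma}{2}\right)t}$ with $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} = 0$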
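The three categories can be told apart from the sign of $\omega_0^2 - (\Gamma/2)^2$. The sketch below is an illustrative helper, not part of the original text, that classifies the regime and returns the corresponding frequency or decay constants.

```python
import math


def classify_damping(omega0, gamma):
    """Return (regime, parameters) for x'' + gamma x' + omega0^2 x = 0."""
    discriminant = omega0**2 - (gamma / 2.0) ** 2
    if discriminant > 0.0:
        # Oscillatory solution with angular frequency omega1.
        return "underdamped", {"omega1": math.sqrt(discriminant)}
    if discriminant < 0.0:
        # Two real decay constants omega_plus < omega_minus.
        root = math.sqrt(-discriminant)
        return "overdamped", {
            "omega_plus": gamma / 2.0 - root,
            "omega_minus": gamma / 2.0 + root,
        }
    return "critically damped", {"omega1": 0.0}


if __name__ == "__main__":
    for gamma in (0.2, 2.0, 4.0):
        print(gamma, classify_damping(1.0, gamma))
```

For the overdamped case the product of the two decay constants equals $\omega_0^2$, which gives a quick consistency check on the returned values.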
The energy dissipation for the linearly-damped free linear oscillator time averaged over one period is
given by
$\langle E\rangle = E_0 e^{-\Gamma t}$ (3.44)
The quality factor $Q$ characterizing the damping of the free oscillator is defined to be
$Q = \frac{E}{\Delta E} = \frac{\omega_1}{\Gamma}$ (3.47)
where $\Delta E$ is the energy dissipated per radian.
Resonance A detailed discussion of resonance and energy absorption for the driven linearly-damped linear
oscillator was given. For resonance the maximum amplitudes occur at frequencies
undamped free linear oscillator: $\omega_0 = \sqrt{\frac{k}{m}}$
linearly-damped free linear oscillator: $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$
driven linearly-damped linear oscillator: $\omega_R = \sqrt{\omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2}$
The energy absorption for the steady-state solution for resonance is given by
The time average power input is given by only the absorptive term
$\langle P\rangle = \frac{1}{2}F_0\omega A_{abs} = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}$ (3.133)
Wave propagation The wave equation was introduced and both travelling and standing wave solutions
of the wave equation were discussed. Harmonic wave-form analysis, and the complementary time-sampled
wave form analysis techniques, were introduced in this chapter and in appendix . The relative merits of
Fourier analysis and the digital Green’s function waveform analysis were illustrated for signal processing.
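The concepts of phase velocity, group velocity, and signal velocity were introduced. The phase velocity
is given by
$v_{phase} = \frac{\omega}{k}$ (3.117)
and group velocity
$v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} = v_{phase} + k\frac{\partial v_{phase}}{\partial k}$ (3.128)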
If the group velocity is frequency dependent then the information content of a wave packet travels at the
signal velocity which can differ from the group velocity.
The Wave-packet Uncertainty Principle implies that making a precise measurement of the frequency of a
sinusoidal wave requires that the wave packet be infinitely long. The standard deviation $\sigma(t) = \sqrt{\langle t^2\rangle - \langle t\rangle^2}$
characterizing the width of the amplitude of the wavepacket spectral distribution in the angular frequency
domain, $\sigma_A(\omega)$, and the corresponding width in time $\sigma_A(t)$ are related by:
$\sigma_A(t)\cdot\sigma_A(\omega) \geq 1$
The standard deviations for the spectral distribution and width of the intensity of the wave packet are
related by:
$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2}$ (3.134)
$\sigma_I(x)\cdot\sigma_I(k_x) \geq \frac{1}{2}$, $\sigma_I(y)\cdot\sigma_I(k_y) \geq \frac{1}{2}$, $\sigma_I(z)\cdot\sigma_I(k_z) \geq \frac{1}{2}$
This applies to all forms of wave motion, including sound waves, water waves, electromagnetic waves, and
matter waves.
Workshop exercises
1. Given below are a list of statements followed by a list of reasons related to harmonic motion. For each of the
statements, determine the reason(s) that make that statement true. You may do this in small groups or as one
large group—the teaching assistant will decide what works best for your workshop.
Statements:
• We can neglect the higher order terms in the Taylor expansion of ().
• The restoring force is a linear force.
• 0 must vanish.
• ()0 is negative and is positive.
• We can write () as a Taylor series expansion.
Reasons:
2. Second-order ordinary differential equations are an important part of the physics of the harmonic oscillator.
(a) What do each of the following terms mean with respect to differential equations?
i. Ordinary
ii. Second-order
iii. Homogeneous
iv. Linear
(b) Give a mini-lesson on how to solve second-order differential equations by working through the following
examples. Don’t just provide a solution; explain the steps leading up to the solution.
i. 00 +5 0 +6 = 0
ii. 00 + 0 + = 0
iii. 00 +4 0 +4 = 0
iv. 00 −3 02
v. 00 −3 0 −4 = 2 sin
3. Harmonic oscillations occur for many different types of systems and it is important to recognize when the
equations for harmonic motion apply. Three different systems are described below. Each system can be
approximately described using the equations for harmonic motion. Break up into three groups—one group per
system. For your group’s system, answer the following questions:
(a) What approximations are necessary for this system to exhibit harmonic oscillations?
(b) What is the differential equation that governs the motion of this system? Use Newton’s second law to
arrive at this equation.
(c) What is the solution to the differential equation that you found in part (b)?
(d) What is the natural frequency of oscillations?
• A mass $m$ is tied to a massless spring having a spring constant $k$. The system oscillates in one dimension
along a horizontal frictionless surface.
• A particle of mass $m$ is attached to a weightless, extensionless rod to form a pendulum. The length of
the rod is $l$ and the system oscillates in a single plane.
• A tube is bent into the shape of a U and is partially filled with a liquid of density $\rho$. The cross-sectional
area of the tube is $A$ and the length of the tube filled with liquid is $L$. The liquid is initially displaced so
that it is higher on one side of the tube than the other.
Once each group has answered all of the questions, share the results with the entire class.
4. Consider a mass attached to a spring of spring constant . The spring is mounted horizontally so that the
mass oscillates horizontally on a frictionless surface. The spring is attached to the wall on the right and the
mass is initially moved to the right of its equilibrium position (compressing the spring) by a distance and
released. Working individually, determine how (if at all) the period of the motion would be affected by each of
the changes below. Once you have answered each part on your own, compare your answers with a classmate.
5. When you were first introduced to simple harmonic motion, you used the formula ̈ = − to find the
position of the oscillating mass as a function of time. This assumes that the origin is defined to be the
equilibrium point. What happens if this is not the case? What would the equation of motion look like? How
would the position of the oscillating mass as a function of time change?
6. For each of the situations described below, give a rough sketch of the state space diagram (̇ versus ) that
represents the motion of each object. All of the motion takes place along the -axis.
7. Consider a simple harmonic oscillator consisting of a mass attached to a spring of spring constant . For
this oscillator () = sin( 0 − ).
8. Consider a damped, driven oscillator consisting of a mass attached to a spring of spring constant .
(a) Determine the direction and the magnitude of the gravitational field for all regions of space.
(b) If the gravitational potential is zero at the origin, what is the difference between the gravitational potential
at = and = ?
11. A mass is constrained to move along one dimension. Two identical springs are attached to the mass, one on
each side, and each spring is in turn attached to a wall. Both springs have the same spring constant .
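12. Discuss the motion of a continuous string when plucked at one third of the length of the string. That is, the
initial condition is $\dot{q}(x,0) = 0$, and
$q(x,0) = \begin{cases} \frac{3h}{L}x & 0 \leq x \leq \frac{L}{3} \\ \frac{3h}{2L}(L - x) & \frac{L}{3} \leq x \leq L \end{cases}$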
13. When a particular driving force is applied to a stretched string it is observed that the string vibration is purely
of the $n^{th}$ harmonic. Find the driving force.
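14. Consider the two-mass system pivoted at its vertex where $M \neq m$. It undergoes oscillations of the angle $\theta$
with respect to the vertical in the plane of the triangle.
[Figure: triangle with three sides of length $l$, with the masses $M$ and $m$ at the two lower corners.]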
15. A cube of side and mass is immersed in water with density past the point of equilibrium and then
released. Assume there is no damping due to the water.
√ √
() = + 1 cos ( ) + 2 sin ( )
where 1 and 2 are constants. If (0) = −, determine ().
Problems
1. An unusual pendulum is made by fixing a string to a horizontal cylinder of radius $R$, wrapping the string
several times around the cylinder, and then tying a mass $m$ to the loose end. In equilibrium the mass hangs a
distance $l_0$ vertically below the edge of the cylinder. Find the potential energy if the pendulum has swung to
an angle $\phi$ from the vertical. Show that for small angles, it can be written in the Hooke’s Law form $U = \frac{1}{2}k\phi^2$.
Comment on the value of $k$.
3. A simple pendulum consists of a mass $m$ suspended from a fixed point by a weight-less, extensionless rod of
length $l$.
a) Obtain the equation of motion, and in the approximation $\sin\theta \approx \theta$ show that the natural frequency is
$\omega_0 = \sqrt{\frac{g}{l}}$, where $g$ is the gravitational field strength.
b) Discuss the motion in the event that the motion takes place in a viscous medium with retarding force
$2m\sqrt{gl}\,\dot{\theta}$.
4. Derive the expression for the State Space paths of the plane pendulum if the total energy is $E > 2mgl$. Note
that this is just the case of a particle moving in a periodic potential $U(\theta) = mgl(1 - \cos\theta)$. Sketch the State
Space diagram for both $E > 2mgl$ and $E < 2mgl$.
5. Consider the motion of a driven linearly-damped harmonic oscillator after the transient solution has died out,
and suppose that it is being driven close to resonance, $\omega = \omega_0$.
a) Show that the oscillator’s total energy is $E = \frac{1}{2}m\omega^2A^2$.
b) Show that the energy $\Delta E_{dis}$ dissipated during one cycle by the damping force $\Gamma m\dot{x}$ is $\pi\Gamma m\omega A^2$.
6. Two masses m1 and m2 slide freely on a horizontal frictionless rail and are connected by a spring whose force
constant is k. Find the frequency of oscillatory motion for this system.
7. A particle of mass $m$ moves under the influence of a resistive force proportional to velocity and a potential $U$,
that is
$F(x,\dot{x}) = -b\dot{x} - \frac{\partial U}{\partial x}$
where $b > 0$ and $U(x) = (x^2 - a^2)^2$.
a) Find the points of stable and unstable equilibrium.
b) Find the solution of the equations of motion for small oscillations around the stable equilibrium points.
c) Show that as $t \to \infty$ the particle approaches one of the stable equilibrium points for most choices of initial
conditions. What are the exceptions? (Hint: You can prove this without finding the solutions explicitly.)
Chapter 4
Nonlinear systems and chaos
4.1 Introduction
In nature only a subset of systems have equations of motion that are linear. Contrary to the impression
given by the analytic solutions presented in undergraduate physics courses, most dynamical systems in nature
exhibit non-linear behavior that leads to complicated motion. The solutions of non-linear equations usually
do not have analytic solutions, superposition does not apply, and they predict phenomena such as attractors,
discontinuous period bifurcation, extreme sensitivity to initial conditions, rolling motion, and chaos. There
have been some exciting discoveries in classical mechanics during the past four decades associated with the
recognition that nonlinear systems can exhibit chaos. Chaotic phenomena have been observed in most fields of
science and engineering such as, weather patterns, fluid flow, motion of planets in the solar system, epidemics,
changing populations of animals, birds and insects, and the motion of electrons in atoms. The complicated
dynamical behavior predicted by non-linear differential equations is not limited to classical mechanics, rather
it is a manifestation of the mathematical properties of the solutions of the differential equations involved,
and thus is generally applicable to solutions of first or second-order non-linear differential equations. It is
important to understand that the systems discussed in this chapter follow a fully deterministic evolution
predicted by the laws of classical mechanics, the evolution for which is based on the prior history. This
behavior is completely different from a random walk where each step is based on a random process. The
complicated motion of deterministic non-linear systems stems in part from sensitivity to the initial conditions.
The French mathematician Poincaré is credited with being the first to recognize the existence of chaos
during his investigation of the gravitational three-body problem in celestial mechanics. At the end of the
nineteenth century Poincaré noticed that such systems exhibit high sensitivity to initial conditions character-
istic of chaotic motion, and the existence of nonlinearity which is required to produce chaos. Poincaré’s work
received little notice, in part because it was overshadowed by the parallel development of the Theory of Relativity
and quantum mechanics at the start of the 20th century. In addition, solving nonlinear equations of motion
is difficult, which discouraged work on nonlinear mechanics and chaotic motion. The field blossomed in the
1960s when computers became sufficiently powerful to solve the nonlinear equations required to calculate
the long-time histories necessary to document chaotic behavior.
Laplace, and many other scientists, believed in the deterministic view of nature which assumes that if the
position and velocities of all particles are known, then one can unambiguously predict the future motion using
Newtonian mechanics. Researchers in many fields of science now realize that this “clockwork universe” is
invalid. That is, knowing the laws of nature can be insufficient to predict the evolution of nonlinear systems
in that the time evolution can be extremely sensitive to the initial conditions even though they follow a
completely deterministic development. There are two major classifications of nonlinear systems that lead to
chaos in nature. The first classification encompasses nondissipative Hamiltonian systems such as Poincaré’s
three-body celestial mechanics system. The other main classification involves driven, damped, non-linear
oscillatory systems.
Nonlinearity and chaos is a broad and active field and thus this chapter will focus only on a few examples
that illustrate the general features of non-linear systems. Weak non-linearity is used to illustrate bifurcation
and asymptotic attractor solutions for which the system evolves independent of the initial conditions. The
common sinusoidally-driven linearly-damped plane pendulum illustrates several features characteristic of the
evolution of a non-linear system from order to chaos. The impact of non-linearity on wavepacket propagation
velocities and the existence of soliton solutions is discussed. The example of the three-body problem is
discussed in chapter 9. The transition from laminar flow to turbulent flow is illustrated by fluid mechanics
discussed in chapter 15.8. Analytic solutions of nonlinear systems usually are not available and thus one
must resort to computer simulations. As a consequence the present discussion focusses on the main features
of the solutions for these systems and ignores how the equations of motion are solved.
Insert this first-order solution into equation 4.4, then the cubic term in the expansion gives a term
$\cos^3\omega t = \frac{1}{4}(\cos 3\omega t + 3\cos\omega t)$. Thus the perturbation expansion to third order involves a solution of the form
This perturbation solution shows that the non-linear term has distorted the signal by addition of the third
harmonic of the driving frequency with an amplitude that depends sensitively on . This illustrates that the
superposition principle is not obeyed for this non-linear system, but, if the non-linearity is weak, perturbation
theory can be used to derive the solution of a non-linear equation of motion.
Figure 4.1 illustrates that for a potential $U(x) = 2x^2 + x^4$ the $x^4$ non-linear term reduces the maximum
amplitude $x$, which makes the total energy contours in state-space more rectangular than the elliptical shape
for the harmonic oscillator shown in figure 3.3. The solution is of the form given in equation 4.6.
4.2. WEAK NONLINEARITY 91
Figure 4.1: The left side shows the potential energy for a symmetric potential $U(x) = 2x^2 + x^4$. The right
side shows the contours of constant total energy on a state-space diagram.
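$x = x_0 + \lambda x_1$
where $x_0 = A\sin(\omega_0 t)$ is the zeroth-order solution. Substituting this into the equation of motion, and neglecting terms of higher order than $\lambda$, gives
$\ddot{x}_1 + \omega_0^2 x_1 = x_0^2 = \frac{A^2}{2}\left[1 - \cos(2\omega_0 t)\right]$
To solve this try a particular integral
$x_1 = B + C\cos(2\omega_0 t)$
and substitute into the equation of motion, which gives
$-3\omega_0^2 C\cos(2\omega_0 t) + \omega_0^2 B = \frac{A^2}{2} - \frac{A^2}{2}\cos(2\omega_0 t)$
Comparison of the coefficients gives
$B = \frac{A^2}{2\omega_0^2}$
$C = \frac{A^2}{6\omega_0^2}$
Adding the homogeneous solution $D_1\sin(\omega_0 t) + D_2\cos(\omega_0 t)$ to $x_1$ and requiring that $x = 0$ at $t = 0$ gives
$D_2 = -\frac{2A^2}{3\omega_0^2}$
and
$x = (A + \lambda D_1)\sin(\omega_0 t) + \lambda\frac{A^2}{\omega_0^2}\left[\frac{1}{2} - \frac{2}{3}\cos(\omega_0 t) + \frac{1}{6}\cos(2\omega_0 t)\right]$
The constant $(A + \lambda D_1)$ is given by the initial amplitude and velocity.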
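The first-order perturbation result can be compared against a direct numerical integration. The sketch below assumes the equation of motion $\ddot{x} + \omega_0^2 x = \lambda x^2$, takes the homogeneous sine coefficient to be zero so that $x(0) = 0$ and $\dot{x}(0) = A\omega_0$, and uses illustrative parameter values; the symbols $\lambda$, $A$ and the fourth-order Runge-Kutta integrator are assumptions of this example rather than part of the original derivation.

```python
import math

OMEGA0 = 1.0  # natural angular frequency (assumed value)
LAM = 0.05    # strength of the quadratic term (assumed value)
AMP = 1.0     # amplitude A of the zeroth-order solution (assumed value)


def acceleration(x):
    """Assumed equation of motion: x'' + omega0^2 x = lambda x^2."""
    return -OMEGA0**2 * x + LAM * x**2


def rk4_step(x, v, dt):
    """One fourth-order Runge-Kutta step for the pair (x, v)."""
    k1x, k1v = v, acceleration(x)
    k2x, k2v = v + 0.5 * dt * k1v, acceleration(x + 0.5 * dt * k1x)
    k3x, k3v = v + 0.5 * dt * k2v, acceleration(x + 0.5 * dt * k2x)
    k4x, k4v = v + dt * k3v, acceleration(x + dt * k3x)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    v_new = v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x_new, v_new


def zeroth_order(t):
    return AMP * math.sin(OMEGA0 * t)


def first_order(t):
    bracket = 0.5 - (2.0 / 3.0) * math.cos(OMEGA0 * t) + math.cos(2.0 * OMEGA0 * t) / 6.0
    return zeroth_order(t) + LAM * AMP**2 / OMEGA0**2 * bracket


def max_errors(t_end=2.0 * math.pi / OMEGA0, dt=1.0e-3):
    """Largest deviation of each approximation from the numerical solution."""
    x, v = 0.0, AMP * OMEGA0
    err0 = err1 = 0.0
    for i in range(int(round(t_end / dt))):
        x, v = rk4_step(x, v, dt)
        t = (i + 1) * dt
        err0 = max(err0, abs(x - zeroth_order(t)))
        err1 = max(err1, abs(x - first_order(t)))
    return err0, err1


if __name__ == "__main__":
    err0, err1 = max_errors()
    print("max error, zeroth order:", err0)
    print("max error, first order: ", err1)
```

Over one period the first-order expression tracks the numerical solution much more closely than the pure sinusoid; the residual is of second order in $\lambda$ and grows slowly with time.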
This system is nonlinear in that the output amplitude is not proportional to the input amplitude. Secondly,
a large amplitude second harmonic component is introduced in the output waveform; that is, for a non-linear
system the gain and frequency decomposition of the output differ from the input. Note that the frequency
composition is amplitude dependent. This particular example of a nonlinear system does not exhibit chaos.
The Laboratory for Laser Energetics uses nonlinear crystals to double the frequency of laser light.
is at the minimum, which is the origin of the state-space diagram as shown in figure 4.1.
The more complicated one-dimensional potential well
$U(x) = 8 - 4x^2 + 0.5x^4$
shown in figure 4.2 has two minima that are symmetric about $x = 0$ with a saddle of height 8.
The kinetic plus potential energies of a particle with mass $m = 2$ released in this potential will be
assumed to be given by
$E(x,\dot{x}) = \dot{x}^2 + U(x)$ (4.9)
The state-space plot in figure 4.2 shows contours of constant energy with the minima at $(x,\dot{x}) = (\pm 2, 0)$.
At slightly higher total energy the contours are closed loops around either of the two minima at $x = \pm 2$.
For total energies above the saddle energy of 8, the contours are peanut-shaped and are symmetric about
the origin. Assuming that the motion is weakly damped, then a particle released with total energy $E_{total}$
which is higher than $E_{saddle}$ will follow a peanut-shaped spiral trajectory centered at $(x,\dot{x}) = (0,0)$ in the
state-space diagram for $E_{total} > E_{saddle}$. For $E_{total} < E_{saddle}$ there are two separate solutions for the two
minima centered at $x = \pm 2$ and $\dot{x} = 0$. This is an example of bifurcation where the one solution for
$E_{total} > E_{saddle}$ bifurcates into either of the two solutions for $E_{total} < E_{saddle}$.
Figure 4.2: The left side shows the potential energy for a bimodal symmetric potential $U(x) = 8 - 4x^2 + 0.5x^4$.
The right-hand figure shows contours of the sum of kinetic and potential energies on a state-space
diagram. For total energies above the saddle point the particle follows peanut-shaped trajectories in
state-space centered around $(x,\dot{x}) = (0,0)$. For total energies below the saddle point the particle will have closed
trajectories about either of the two symmetric minima located at $(x,\dot{x}) = (\pm 2, 0)$. Thus the system solution
bifurcates when the total energy is below the saddle point.
For an initial total energy $E_{total} > E_{saddle}$ damping will result in spiral trajectories of the particle that
will be trapped in one of the two minima. For $E_{total} > E_{saddle}$ the particle trajectories are centered, giving
the impression that they will terminate at $(x,\dot{x}) = (0,0)$ when the kinetic energy is dissipated. However, for
$E_{total} < E_{saddle}$ the particle will be trapped in one of the two minima and the trajectory will terminate
at the bottom of that potential energy minimum occurring at $(x,\dot{x}) = (\pm 2, 0)$. These two possible terminal
points of the trajectory are called point attractors. This example appears to have a single attractor for
$E_{total} > E_{saddle}$ which bifurcates, leading to two attractors at $(x,\dot{x}) = (\pm 2, 0)$ for $E_{total} < E_{saddle}$. The
determination as to which minimum traps a given particle depends on exactly where the particle starts in
state space and the damping etc. That is, for this case, where there is symmetry about the $x$-axis, when
the particle has an initial total energy $E_{total} > E_{saddle}$ then the initial conditions with $\pi$ radians of state
space will lead to trajectories that are trapped in the left minimum, and the other $\pi$ radians of state space
will be trapped in the right minimum. Trajectories starting near the split between these two halves of the
starting state space will be sensitive to the exact starting phase. This is an example of sensitivity to initial
conditions.
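The trapping of a weakly damped particle by one of the two point attractors can be explored numerically. The sketch below assumes the potential $U(x) = 8 - 4x^2 + 0.5x^4$ with mass 2, adds an assumed linear damping force $-b\dot{x}$ with an illustrative coefficient, and integrates with a fourth-order Runge-Kutta step. It reports where the trajectory ends without presuming which minimum is reached.

```python
MASS = 2.0  # mass used in the text, so that kinetic energy equals v^2


def potential(x):
    return 8.0 - 4.0 * x**2 + 0.5 * x**4


def force(x):
    """-dU/dx for the double-well potential."""
    return 8.0 * x - 2.0 * x**3


def total_energy(x, v):
    return v**2 + potential(x)  # (1/2) m v^2 with m = 2


def derivatives(x, v, damping):
    return v, (force(x) - damping * v) / MASS


def settle(x0, v0, damping=0.4, dt=0.01, t_end=200.0):
    """Integrate the damped motion and return the final (x, v)."""
    x, v = x0, v0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = derivatives(x, v, damping)
        k2x, k2v = derivatives(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, damping)
        k3x, k3v = derivatives(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, damping)
        k4x, k4v = derivatives(x + dt * k3x, v + dt * k3v, damping)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x, v


if __name__ == "__main__":
    # All starting points have total energy above the saddle energy of 8.
    for x0, v0 in ((0.0, 4.0), (0.0, 4.1), (3.5, 0.0)):
        x_end, v_end = settle(x0, v0)
        side = "right" if x_end > 0 else "left"
        print(f"start ({x0}, {v0}), E = {total_energy(x0, v0):.2f}: ends at x = {x_end:+.4f} ({side} minimum)")
```

Scanning the starting values in small steps is one way to see how the choice of attractor depends on the initial conditions.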
$\dot{x} = f(x, y)$ (4.10)
$\dot{y} = g(x, y)$
occur frequently in physics. The state-space paths do not cross for such two-dimensional autonomous systems,
where an autonomous system is not explicitly dependent on time.
The Poincaré-Bendixson theorem states that paths in state-space, and phase-space, can take three possible forms:
(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,
(2) paths that terminate at an equilibrium point as $t \to \infty$, like the point attractor for a damped harmonic oscillator,
(3) paths that tend to a limit cycle as $t \to \infty$. The limit cycle is unusual in that the periodic motion tends
asymptotically to the limit-cycle attractor independent of whether the initial values are inside or outside
the limit cycle. The balance of dissipative forces and driving forces often leads to limit-cycle attractors,
especially in biological applications. Identification of limit-cycle attractors, as well as the trajectories of the
motion towards these limit-cycle attractors, is more complicated than for point attractors.
Figure 4.3: The Poincaré-Bendixson theorem allows the following three scenarios for two-dimensional
autonomous systems. (1) Closed paths as illustrated by the undamped harmonic oscillator. (2) Terminate at
an equilibrium point as $t \to \infty$, as illustrated by the damped harmonic oscillator, and (3) Tend to a limit
cycle as $t \to \infty$ as illustrated by the van der Pol oscillator.
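$\frac{dx}{dt} \equiv y$ (4.12)
$\frac{dy}{dt} = -x - \mu\left(x^2 - 1\right)y$ (4.13)
It is advantageous to transform the $(x,\dot{x})$ state space to polar coordinates by setting
$x = r\cos\theta$ (4.14)
$y = r\sin\theta$
and using the fact that $r^2 = x^2 + y^2$. Therefore
$r\frac{dr}{dt} = x\frac{dx}{dt} + y\frac{dy}{dt}$ (4.15)
Similarly for the angle coordinate
$\frac{dx}{dt} = \frac{dr}{dt}\cos\theta - r\sin\theta\frac{d\theta}{dt}$ (4.16)
$\frac{dy}{dt} = \frac{dr}{dt}\sin\theta + r\cos\theta\frac{d\theta}{dt}$ (4.17)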
4.4. LIMIT CYCLES 95
Figure 4.4: Solutions of the van der Pol system for $\mu = 0.2$ top row and $\mu = 5$ bottom row, assuming that
$\omega_0^2 = 1$. The left column shows the time dependence $x(t)$. The right column shows the corresponding $(x,\dot{x})$
state space plots. Upper: Weak nonlinearity, $\mu = 0.2$; At large times the solution tends to one limit
cycle for initial values inside or outside the limit cycle attractor. The amplitude $x(t)$ for two initial
conditions approaches an approximately harmonic oscillation. Lower: Strong nonlinearity, $\mu = 5$; Solutions
approach a common limit cycle attractor for initial values inside or outside the limit cycle attractor while
the amplitude $x(t)$ approaches a common approximate square-wave oscillation.
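Multiplying equation 4.16 by $y$ and equation 4.17 by $x$, and subtracting, gives
$$r^2\frac{d\theta}{dt} = x\frac{dy}{dt} - y\frac{dx}{dt} \tag{4.18}$$
Equations 4.15 and 4.18 allow the van der Pol equations of motion to be written in polar coordinates
$$\frac{dr}{dt} = -\mu\left(r^2\cos^2\theta - 1\right)r\sin^2\theta \tag{4.19}$$
$$\frac{d\theta}{dt} = -1 - \mu\left(r^2\cos^2\theta - 1\right)\sin\theta\cos\theta \tag{4.20}$$
The non-linear terms on the right-hand side of equations 4.19 and 4.20 have a complicated form.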
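Because the polar equations are awkward to handle analytically, a numerical integration is a convenient way to see the limit cycle. The following is a minimal sketch, assuming $\omega_0 = 1$ and $\mu = 0.2$, that integrates the first-order form of the van der Pol equations with a hand-written fourth-order Runge-Kutta step. The function names, step size and integration time are illustrative choices, not taken from the text.

```python
import math

def van_der_pol_rhs(x, y, mu):
    """First-order van der Pol system with omega_0 = 1: dx/dt = y, dy/dt = -x - mu (x^2 - 1) y."""
    return y, -x - mu * (x * x - 1.0) * y

def rk4_step(x, y, mu, dt):
    """Advance (x, y) by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = van_der_pol_rhs(x, y, mu)
    k2x, k2y = van_der_pol_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, mu)
    k3x, k3y = van_der_pol_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, mu)
    k4x, k4y = van_der_pol_rhs(x + dt * k3x, y + dt * k3y, mu)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new

def late_time_amplitude(x0, y0, mu=0.2, dt=0.01, t_total=200.0, t_sample=20.0):
    """Integrate from (x0, y0) and return max |x| over the final t_sample time units."""
    x, y = x0, y0
    n_total = int(round(t_total / dt))
    n_sample = int(round(t_sample / dt))
    amplitude = 0.0
    for i in range(n_total):
        x, y = rk4_step(x, y, mu, dt)
        if i >= n_total - n_sample:
            amplitude = max(amplitude, abs(x))
    return amplitude

# One start well inside the limit cycle and one well outside it.
amp_inside = late_time_amplitude(0.1, 0.0)
amp_outside = late_time_amplitude(4.0, 0.0)
print(amp_inside, amp_outside)
```

Both starting points should settle onto essentially the same late-time amplitude, close to 2, which is the self-stabilizing behavior that distinguishes a limit-cycle attractor from the closed paths of the undamped harmonic oscillator.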
Weak non-linearity: $\mu \ll 1$
In the limit that $\mu \to 0$, equations 4.19 and 4.20 correspond to a circular state-space trajectory similar to the harmonic oscillator. That is, the solution is of the form
$$x(t) = \rho\sin(t - t_0) \tag{4.21}$$
where $\rho$ and $t_0$ are arbitrary parameters. For weak non-linearity, $\mu \ll 1$, the angular equation 4.20 has a rotational frequency that is close to unity since the $\sin\theta\cos\theta$ term alternates in sign during each period, in addition to the small value of $\mu$. For $\mu \ll 1$ and $r < 1$ the $\left(r^2\cos^2\theta - 1\right)$ term in the radial equation 4.19 is negative, so that $dr/dt$ is non-negative and the radius increases monotonically. For large $r$ the bracket is predominantly positive, resulting in a spiral decrease in the radius. Thus, for very weak non-linearity, this radial behavior results in the amplitude spiralling to a well defined limit-cycle attractor value of $\rho = 2$ as illustrated by the state-space plots in figure 4.4 for cases where the initial condition is inside or external to the circular attractor. The final amplitude for different initial conditions also approaches the same asymptotic behavior.
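The attractor value $\rho = 2$ can be made plausible by averaging the radial equation 4.19 over one rotation. This is only a sketch, under the assumption that for $\mu \ll 1$ the radius $r$ changes negligibly during one period while $\theta$ advances uniformly, so that $\langle\sin^2\theta\rangle = \frac{1}{2}$ and $\langle\cos^2\theta\sin^2\theta\rangle = \frac{1}{8}$:

```latex
\left\langle \frac{dr}{dt} \right\rangle
  = -\mu r \left( r^2 \langle \cos^2\theta \sin^2\theta \rangle
                  - \langle \sin^2\theta \rangle \right)
  = -\mu r \left( \frac{r^2}{8} - \frac{1}{2} \right)
  = \frac{\mu r}{8} \left( 4 - r^2 \right)
```

Under this assumption the averaged radial velocity is positive for $r < 2$, negative for $r > 2$, and vanishes at $r = 2$.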
Dominant non-linearity: $\mu \gg 1$
For the case where the non-linearity is dominant, that is $\mu \gg 1$, then as shown in figure 4.4, the system approaches a well defined attractor, but in this case it has a significantly skewed shape in state-space, while the amplitude approximates a square wave. The solution remains close to $x = +2$ until $y = \dot{x} \approx +7$ and then it relaxes quickly to $x = -2$ with $y = \dot{x} \approx 0$. This is followed by the mirror image. This behavior is called a relaxation oscillation in that a tension builds up slowly then dissipates by a sudden relaxation process. The seesaw is an extreme example of a relaxation oscillator where the angle switches spontaneously from one solution to the other when the difference in the moment arms changes sign.
The study of feedback in electronic circuits was the stimulus for study of this equation by van der
Pol. However, Lord Rayleigh first identified such relaxation oscillator behavior in 1880 during studies of
vibrations of a stringed instrument excited by a bow, or the squeaking of a brake drum. In his discussion of
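non-linear effects in acoustics, he derived the equation
$$\ddot{x} - \left(a - b\dot{x}^2\right)\dot{x} + \omega_0^2 x = 0 \tag{4.22}$$
Differentiation of Rayleigh’s equation 4.22 gives
$$\dddot{x} - \left(a - 3b\dot{x}^2\right)\ddot{x} + \omega_0^2\dot{x} = 0 \tag{4.23}$$
Using the substitution of
$$y = y_0\sqrt{\frac{3b}{a}}\,\dot{x} \tag{4.24}$$
leads to the relations
$$\dot{x} = \sqrt{\frac{a}{3b}}\frac{y}{y_0} \qquad \ddot{x} = \sqrt{\frac{a}{3b}}\frac{\dot{y}}{y_0} \qquad \dddot{x} = \sqrt{\frac{a}{3b}}\frac{\ddot{y}}{y_0} \tag{4.25}$$
Substituting these relations into equation 4.23 gives
$$\sqrt{\frac{a}{3b}}\frac{\ddot{y}}{y_0} - \sqrt{\frac{a}{3b}}\left[a - \frac{3ba}{3b}\frac{y^2}{y_0^2}\right]\frac{\dot{y}}{y_0} + \omega_0^2\sqrt{\frac{a}{3b}}\frac{y}{y_0} = 0 \tag{4.26}$$
Multiplying by $y_0\sqrt{\frac{3b}{a}}$ and rearranging leads to the van der Pol equation
$$\ddot{y} - \frac{a}{y_0^2}\left(y_0^2 - y^2\right)\dot{y} + \omega_0^2 y = 0 \tag{4.27}$$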
The rhythm of a heartbeat driven by a pacemaker is an important application where the self-stabilization of
the attractor is a desirable characteristic to stabilize an irregular heartbeat; the medical term is arrhythmia.
The mechanism that leads to synchronization of the many pacemaker cells in the heart and human body to
an implanted pacemaker is discussed in chapter 12.12. Another biological application of limit cycles is the
time variation of animal populations.
In summary the non-linear damping of the van der Pol oscillator leads to a self-stabilized, single limit-
cycle attractor that is insensitive to the initial conditions. The van der Pol oscillator has many important
applications such as bowed musical instruments, electrical circuits, and human anatomy as mentioned above.
4.5 Harmonically-driven, linearly-damped, plane pendulum
1 A similar approach is used in the book "Chaotic Dynamics" by Baker and Gollub [Bak96].
Figure 4.5: Motion of the driven damped pendulum for drive strengths of $\gamma = 0.2$, $\gamma = 0.9$, $\gamma = 1.05$ and $\gamma = 1.078$. The left side shows the time dependence of the deflection angle $\theta$ with the time axis expressed in dimensionless units $\tilde{t}$. The right side shows the corresponding state-space plots. These plots assume $\tilde{\omega} = \frac{\omega}{\omega_0} = \frac{2}{3}$, $Q = 2$, and the motion starts with $\theta = \omega = 0$.
Figure 4.6: The driven damped pendulum assuming that $\tilde{\omega} = \frac{2}{3}$, $Q = 2$, with initial conditions $\theta(0) = -\frac{\pi}{2}$, $\omega(0) = 0$. The system exhibits period-two motion for drive strengths of $\gamma = 1.078$ as shown by the state space diagram for cycles 10 to 20. For $\gamma = 1.081$ the system exhibits period-four motion shown for cycles 10 to 30.
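$$\theta(\tilde{t}) \approx A\cos(\tilde{\omega}\tilde{t} - \delta)$$
then the small $\theta^3$ term in equation 4.35 contributes a term proportional to $\cos^3(\tilde{\omega}\tilde{t} - \delta)$. But
$$\cos^3(\tilde{\omega}\tilde{t} - \delta) = \frac{1}{4}\left(\cos 3(\tilde{\omega}\tilde{t} - \delta) + 3\cos(\tilde{\omega}\tilde{t} - \delta)\right)$$
That is, the nonlinearity introduces a small term proportional to $\cos 3(\tilde{\omega}\tilde{t} - \delta)$. Since the right-hand side of equation 4.35 is a function of only $\cos\tilde{\omega}\tilde{t}$, then the terms in $\theta$, $\dot{\theta}$ and $\ddot{\theta}$ on the left hand side must contain the third harmonic $\cos 3(\tilde{\omega}\tilde{t} - \delta)$ term. Thus a better approximation to the solution is of the form
$$\theta(\tilde{t}) = A\left[\cos(\tilde{\omega}\tilde{t} - \delta) + \varepsilon\cos 3(\tilde{\omega}\tilde{t} - \delta)\right] \tag{4.36}$$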
where the admixture coefficient $\varepsilon \ll 1$. This successive approximation method can be repeated to add additional terms proportional to $\cos n(\tilde{\omega}\tilde{t} - \delta)$ where $n$ is an integer with $n \geq 3$. Thus the nonlinearity introduces progressively weaker $n$-fold harmonics to the solution. This successive approximation approach is viable only when the admixture coefficient $\varepsilon \ll 1$. Note that these harmonics are integer multiples of $\tilde{\omega}$, thus the steady-state response is identical for each full period even though the state space contours deviate from an elliptical shape.
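The trigonometric identity that generates the third harmonic is easy to confirm numerically. The short sketch below compares $\cos^3\phi$ with its decomposition into a fundamental and a third harmonic over one full cycle; the function names and the grid of 1000 phases are illustrative assumptions.

```python
import math

def cos_cubed(phi):
    """Direct evaluation of cos^3(phi)."""
    return math.cos(phi) ** 3

def harmonic_form(phi):
    """The same quantity written as a sum of first and third harmonics."""
    return 0.25 * (math.cos(3.0 * phi) + 3.0 * math.cos(phi))

# Compare the two forms on a uniform grid of phases covering one period.
phases = [2.0 * math.pi * k / 1000 for k in range(1000)]
max_difference = max(abs(cos_cubed(p) - harmonic_form(p)) for p in phases)
print(max_difference)
```

The two forms agree to rounding error, so one quarter of the cubic term appears at three times the drive frequency.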
Figure 4.7: Rolling motion for the driven damped plane pendulum for $\gamma = 1.4$. (a) The time dependence of angle $\theta(t)$ increases by $2\pi$ per drive period whereas (b) the angular velocity $\omega(t)$ exhibits periodicity. (c) The state space plot for rolling motion is shown with the origin shifted by $2\pi$ per revolution to keep the plot within the bounds $-\pi < \theta < +\pi$.
for rolling motion corresponds to a chain of loops with a spacing of $2\pi$ between each loop. The state space
diagram for rolling motion is more compactly presented if the origin is shifted by $2\pi$ per revolution to keep
the plot within bounds as illustrated in figure 4.7.
Figure 4.8: Left: State-space orbits for the driven damped pendulum with $\gamma = 1.105$. Note that the orbits do not repeat for cycles 25 to 200. Right: Time-state-space diagram for $\gamma = 1.168$. The plot shows 16 trajectories starting with different initial values in the range $-0.15 < \theta < 0.15$.
Figure 4.9: State-space plots for the harmonically-driven, linearly-damped, pendulum for driving amplitudes of $\gamma = 0.5$ and $\gamma = 1.2$. These calculations were performed using the Runge-Kutta method by E. Shah (Private communication).
$$\lambda = \lim_{t \to \infty}\lim_{\delta Z_0 \to 0}\frac{1}{t}\ln\frac{|\delta Z(t)|}{|\delta Z_0|} \tag{4.40}$$
Systems for which the Lyapunov exponent $\lambda < 0$ (negative) converge exponentially to the same attractor
solution at long times since $|\delta Z(t)| \to 0$ for $t \to \infty$. By contrast, systems for which $\lambda > 0$ (positive) diverge
to completely different long-time solutions, that is, $|\delta Z(t)| \to \infty$ for $t \to \infty$. Even for infinitesimally
Figure 4.10: Lyapunov plots of $\Delta\theta$ versus time for two initial starting points differing by $\Delta\theta_0 = 0.001$. The parameters are $Q = 2$ and $F(t) = \gamma\sin(\frac{2}{3}t)$ and $\Delta t = 0.04$. The Lyapunov exponent for $\gamma = 0.5$, which is drawn as a dashed line, is convergent with $\lambda = -0.251$. For $\gamma = 1.2$ the exponent is divergent as indicated by the dashed line which has a slope of $\lambda = 0.1538$. These calculations were performed using the Runge-Kutta method by E. Shah (Private communication).
small differences in the initial conditions, systems having a positive Lyapunov exponent diverge to different
attractors, whereas when the Lyapunov exponent $\lambda < 0$ they correspond to stable solutions.
Figure 4.10 illustrates Lyapunov plots for the harmonically-driven, linearly-damped, plane pendulum,
with the same conditions discussed in chapter 4.5. Note that for the small driving amplitude $\gamma = 0.5$
the Lyapunov plot converges to ordered motion with an exponent $\lambda = -0.251$ whereas for $\gamma = 1.2$ the
plot diverges characteristic of chaotic motion with an exponent $\lambda = 0.1538$. The Lyapunov exponent usually
fluctuates widely at the local oscillator frequency, and thus the time average of the Lyapunov exponent must
be taken over many periods of the oscillation to identify the general trend with time. Some systems near an
order-to-chaos transition can exhibit positive Lyapunov exponents for short times, characteristic of chaos,
and then converge to negative $\lambda$ at longer time implying ordered motion. The Lyapunov exponents are
used extensively to monitor the stability of the solutions for non-linear systems. For example the Lyapunov
exponent is used to identify whether fluid flow is laminar or turbulent as discussed in chapter 15.8.
A dynamical system in $n$-dimensional phase space will have a set of $n$ Lyapunov exponents $\{\lambda_1, \lambda_2, \ldots, \lambda_n\}$
associated with a set of attractors, the importance of which depend on the initial conditions. Typically one
Lyapunov exponent dominates at one specific location in phase space, and thus it is usual to use the maximal
Lyapunov exponent to identify chaos. The Lyapunov exponent is a very sensitive measure of the onset of chaos
and provides an important test of the chaotic nature for the complicated motion exhibited by non-linear
systems.
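A finite-time estimate of the Lyapunov exponent can be sketched by following two trajectories that start a small distance apart and measuring how their separation grows or shrinks. The code below is an illustration only. It assumes the dimensionless equation of motion $\ddot{\theta} + \dot{\theta}/Q + \sin\theta = \gamma\cos(\tilde{\omega}\tilde{t})$ with $Q = 2$ and $\tilde{\omega} = \frac{2}{3}$, uses the Euclidean distance in the $(\theta, \omega)$ state space as the separation, and replaces the limits of the Lyapunov definition by a finite time and a finite initial offset of 0.001. The function names and step size are hypothetical choices.

```python
import math

Q = 2.0            # quality factor, as quoted in the figure captions
DRIVE = 2.0 / 3.0  # dimensionless drive frequency, as quoted in the figure captions

def rhs(t, theta, omega, gamma):
    """Assumed pendulum equation: theta'' + theta'/Q + sin(theta) = gamma cos(DRIVE t)."""
    return omega, -omega / Q - math.sin(theta) + gamma * math.cos(DRIVE * t)

def rk4_step(t, theta, omega, dt, gamma):
    """One classical fourth-order Runge-Kutta step for the driven pendulum."""
    k1 = rhs(t, theta, omega, gamma)
    k2 = rhs(t + 0.5 * dt, theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1], gamma)
    k3 = rhs(t + 0.5 * dt, theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1], gamma)
    k4 = rhs(t + dt, theta + dt * k3[0], omega + dt * k3[1], gamma)
    theta_new = theta + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    omega_new = omega + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return theta_new, omega_new

def lyapunov_estimate(gamma, delta0=1e-3, dt=0.01, t_total=60.0):
    """Finite-time estimate (1/t) ln(separation(t) / delta0) for two nearby starts."""
    ref = (0.0, 0.0)
    shifted = (delta0, 0.0)
    steps = int(round(t_total / dt))
    for i in range(steps):
        t = i * dt
        ref = rk4_step(t, ref[0], ref[1], dt, gamma)
        shifted = rk4_step(t, shifted[0], shifted[1], dt, gamma)
    separation = math.hypot(shifted[0] - ref[0], shifted[1] - ref[1])
    return math.log(separation / delta0) / t_total

lambda_weak_drive = lyapunov_estimate(0.5)
print(lambda_weak_drive)
```

For the weak drive $\gamma = 0.5$ the estimate should be negative, of the same order as the value $\lambda = -0.251$ quoted above. For a chaotic case such as $\gamma = 1.2$ the integration time must be kept short enough that the separation has not yet saturated at the size of the attractor.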
Figure 4.12: Three Poincaré section plots for the harmonically-driven, linearly-damped, pendulum for various
initial conditions with $\gamma = 1.2$, $\tilde{\omega} = \frac{2}{3}$ and $\Delta t = \frac{\pi}{100}$. These calculations used the Runge-Kutta method
and were performed for 6000 by E. Shah (Private communication).
when the restoring force is non-linear. The system exhibits bifurcation where it can evolve to multiple
attractors that depend sensitively on the initial conditions. The system exhibits both oscillatory, and rolling,
solutions depending on the amplitude of the motion. The system exhibits domains of simple ordered motion
separated by domains of very complicated ordered motion as well as chaotic regions. The transitions between
these dramatically different modes of motion are extremely sensitive to the amplitude and phase of the
driver. Eventually the motion becomes completely chaotic. The Lyapunov exponent, bifurcation diagram,
and Poincaré section plots, are sensitive measures of the order of the motion. These three sensitive measures
of order and chaos are used extensively in many fields in classical mechanics. Considerable computing
capabilities are required to elucidate the complicated motion involved in non-linear systems. Examples
include laminar and turbulent flow in fluid dynamics and weather forecasting of hurricanes, where the
motion can span a wide dynamic range in dimensions from $10^{-5}$ to $10^{4}$.
where $\omega$ is used as the independent variable since it is invariant to phase transitions of the system. Note
that the factor for the first derivative term is the reciprocal of the group velocity
$$\frac{1}{v_{group}} \equiv \left(\frac{dk}{d\omega}\right)_{\omega = \omega_0} \tag{4.42}$$
$$\frac{\partial\phi}{\partial t} + \frac{\partial^3\phi}{\partial x^3} + 6\phi\frac{\partial\phi}{\partial x} = 0 \tag{4.49}$$
A solution of this equation has the characteristics of a solitary wave with fixed shape. It is given by
substituting the form $\phi(x, t) = f(x - ct)$ into the Korteweg-de Vries equation which gives
$$-c\frac{df}{dx} + \frac{d^3f}{dx^3} + 6f\frac{df}{dx} = 0 \tag{4.50}$$
Integrating with respect to $x$ gives
$$3f^2 + \frac{d^2f}{dx^2} - cf = A \tag{4.51}$$
where $A$ is a constant of integration. This non-linear equation has a solution
$$\phi(x, t) = \frac{1}{2}c\,\mathrm{sech}^2\left[\frac{\sqrt{c}}{2}(x - ct - a)\right] \tag{4.52}$$
where $a$ is a constant. Equation 4.52 is the equation of a solitary wave moving in the $+x$ direction at a
velocity $c$.
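The solitary-wave solution can be checked by direct substitution into the Korteweg-de Vries equation. The sketch below does this numerically with central finite differences, assuming the form $\phi_t + \phi_{xxx} + 6\phi\phi_x = 0$ and the illustrative values $c = 1$, $a = 0$. The function names, the difference step and the sample grid are assumptions made for this example.

```python
import math

def soliton(x, t, c=1.0, a=0.0):
    """Solitary wave (c/2) sech^2[(sqrt(c)/2)(x - c t - a)]."""
    arg = 0.5 * math.sqrt(c) * (x - c * t - a)
    return 0.5 * c / math.cosh(arg) ** 2

def kdv_residual(x, t, c=1.0, h=1e-2):
    """Evaluate phi_t + phi_xxx + 6 phi phi_x with central differences of step h."""
    phi_t = (soliton(x, t + h, c) - soliton(x, t - h, c)) / (2.0 * h)
    phi_x = (soliton(x + h, t, c) - soliton(x - h, t, c)) / (2.0 * h)
    phi_xxx = (soliton(x + 2.0 * h, t, c) - 2.0 * soliton(x + h, t, c)
               + 2.0 * soliton(x - h, t, c) - soliton(x - 2.0 * h, t, c)) / (2.0 * h ** 3)
    return phi_t + phi_xxx + 6.0 * soliton(x, t, c) * phi_x

# Sample the residual across the pulse at a fixed time.
max_residual = max(abs(kdv_residual(0.1 * i, 0.3)) for i in range(-50, 51))
# Size of one individual term, for comparison with the residual.
max_slope = max(abs((soliton(0.1 * i + 1e-2, 0.3) - soliton(0.1 * i - 1e-2, 0.3)) / 2e-2)
                for i in range(-50, 51))
print(max_residual, max_slope)
```

The residual is limited only by the finite-difference truncation error and is far smaller than the individual terms, which indicates that the three terms cancel for a pulse of this shape travelling at velocity $c$.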
Soliton behavior is observed in phenomena such as tsunamis, tidal bores that occur for some rivers,
signals in optical fibres, plasmas, atmospheric waves, vortex filaments, superconductivity, and gravitational
fields having cylindrical symmetry. Much work has been done on solitons for fibre optics applications. The
soliton’s inherent stability makes long-distance transmission possible without the use of repeaters, and could
potentially double the transmission capacity.
Before the discovery of solitons, mathematicians were under the impression that nonlinear partial differ-
ential equations could not be solved exactly. However, solitons led to the recognition that there are non-linear
systems that can be solved analytically. This discovery has prompted much investigation into these so-called
"integrable systems." Such systems are rare, as most non-linear differential equations admit chaotic behavior
with no explicit solutions. Integrable systems nevertheless lead to very interesting mathematics ranging from
differential geometry and complex analysis to quantum field theory and fluid dynamics.
Many of the fundamental equations in physics (Maxwell’s, Schrödinger’s) are linear equations. However,
physicists have begun to recognize many areas of physics in which nonlinearity can result in qualitatively
new phenomena which cannot be constructed via perturbation theory starting from linearized equations.
These include phenomena in magnetohydrodynamics, meteorology, oceanography, condensed matter physics,
nonlinear optics, and elementary particle physics. For example, the European space mission Cluster detected
a soliton-like electrical disturbance that travelled through the ionized gas surrounding the Earth starting
about 50,000 kilometers from Earth and travelling towards the planet at about 8 km/s. It is thought that
this soliton was generated by turbulence in the magnetosphere.
Efforts to understand the nonlinearity of solitons have led to much research in many areas of physics. In
the context of solitons, their particle-like behavior (in that they are localized and preserved under collisions)
leads to a number of experimental and theoretical applications. The technique known as bosonization allows
viewing particles, such as electrons and positrons, as solitons in appropriate field equations. There are
numerous macroscopic phenomena, such as internal waves on the ocean, spontaneous transparency, and the
behavior of light in fiber optic cable, that are now understood in terms of solitons. These phenomena are
being applied to modern technology.
4.8 Summary
The study of the dynamics of non-linear systems remains a vibrant and rapidly evolving field in classical
mechanics as well as many other branches of science. This chapter has discussed examples of non-linear
systems in classical mechanics. It was shown that the superposition principle is broken even for weak
nonlinearity. It was shown that increased nonlinearity leads to bifurcation, point attractors, limit-cycle
attractors, and sensitivity to initial conditions.
Limit-cycle attractors: The Poincaré-Bendixson theorem for limit cycle attractors states that the
paths, both in state-space and phase-space, can be of three possible types:
(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,
(2) paths that terminate at an equilibrium point as $t \to \infty$, like the point attractor for a damped harmonic oscillator,
(3) paths that tend to a limit cycle as $t \to \infty$.
The limit cycle is unusual in that the periodic motion tends asymptotically to the limit-cycle attractor
independent of whether the initial values are inside or outside the limit cycle. The balance of dissipative forces
and driving forces often leads to limit-cycle attractors, especially in biological applications. Identification of
limit-cycle attractors, as well as the trajectories of the motion towards these limit-cycle attractors, is more
complicated than for point attractors.
The van der Pol oscillator is a common example of a limit-cycle system that has an equation of motion
of the form
$$\frac{d^2x}{dt^2} + \mu\left(x^2 - 1\right)\frac{dx}{dt} + \omega_0^2 x = 0 \tag{4.11}$$
The van der Pol oscillator includes non-linear damping and exhibits periodic solutions that
asymptotically approach one limit-cycle attractor solution independent of the initial conditions.
There are many examples in nature that exhibit similar behavior.
Harmonically-driven, linearly-damped, plane pendulum: The non-linearity of the well-known
driven linearly-damped plane pendulum was used as an excellent example of the behavior of non-linear
systems in nature. It was shown that non-linearity leads to discontinuous period bifurcation, extreme
sensitivity to initial conditions, rolling motion and chaos.
Differentiation between ordered and chaotic motion: Lyapunov exponents, bifurcation diagrams,
and Poincaré sections were used to identify the transition from order to chaos. Chapter 15.8 discusses
the non-linear Navier-Stokes equations of viscous-fluid flow which lead to complicated transitions between
laminar and turbulent flow. Fluid flow exhibits remarkable complexity that nicely illustrates the dominant
role that non-linearity can have on the solutions of practical non-linear systems in classical mechanics.
Wave propagation for non-linear systems: Non-linear equations can lead to unexpected behavior
for wave packet propagation such as fast or slow light as well as soliton solutions. Moreover, it is notable
that some non-linear systems can lead to analytic solutions.
The complicated phenomena exhibited by the above non-linear systems are not restricted to classical
mechanics, rather they are a manifestation of the mathematical behavior of the solutions of the differential
equations involved. That is, this behavior is a general manifestation of the behavior of solutions for
second-order differential equations. Exploration of this complex motion has only become feasible with the advent
of powerful computer facilities during the past three decades. The breadth of phenomena exhibited by
these examples is manifest in myriads of other nonlinear systems, ranging from many-body motion, weather
patterns, growth of biological species, epidemics, motion of electrons in atoms, etc. Other examples of non-
linear equations of motion not discussed here, are the three-body problem, which is mentioned in chapter 9,
and turbulence in fluid flow which is discussed in chapter 15.
It is stressed that the behavior discussed in this chapter is very different from the random walk problem
which is a stochastic process where each step is purely random and not deterministic. This chapter has
assumed that the motion is fully deterministic and rigorously follows the laws of classical mechanics. Even
though the motion is fully deterministic, and follows the laws of classical mechanics, the motion is extremely
sensitive to the initial conditions and the non-linearities can lead to chaos. Computer modelling is the only
viable approach for predicting the behavior of such non-linear systems. The complexity of solving non-linear
equations is the reason that this book will continue to consider only linear systems. Fortunately, in nature,
non-linear systems can be approximately linear when the small-amplitude assumption is applicable.
Workshop exercises
1. Consider the chaotic motion of the driven damped pendulum whose equation of motion is given by
for which the Lyapunov exponent is $\lambda = 1$ with time measured in units of the drive period.
(a) Assume that you need to predict $\theta(t)$ with an accuracy of $10^{-2}$, and that you know the initial
value $\theta(0)$ to within $10^{-6}$. What is the maximum time horizon $t_{max}$ for which you can predict
$\theta(t)$ within the required accuracy?
(b) Suppose that, with unlimited time and financial resources, you manage to improve the accuracy of the
initial value to $10^{-9}$ (that is, a thousand-fold improvement). What is the time horizon now for
achieving the accuracy of $10^{-2}$?
(c) By what factor has $t_{max}$ improved with the 1000-fold improvement in initial measurement?
(d) What does this imply regarding long-term predictions of chaotic motion?
2. A non-linear oscillator satisfies the equation $\ddot{x} + \dot{x}^3 + x = 0$. Find the polar equations for the motion in the
state-space diagram. Show that any trajectory that starts within the circle $r < 1$ encircles the origin infinitely
many times in the clockwise direction. Show further that these trajectories in state space terminate at the
origin.
3. Consider the system of a mass suspended between two identical springs as shown.
If each spring is stretched a distance $d$ to attach the mass at the equilibrium position the mass is subject to
two equal and oppositely directed forces of magnitude $kd$. Ignore gravity. Show that the potential in which
the mass moves is approximately
$$U(x) = \left\{\frac{kd}{l}\right\}x^2 + \left\{\frac{k(l - d)}{4l^3}\right\}x^4$$
Construct a state-space diagram for this potential.
Problems
1. A non-linear oscillator satisfies the equation
2. A mass $m$ moves in one direction and is subject to a constant force $+F_0$ when $x < 0$ and to a constant force
$-F_0$ when $x > 0$. Describe the motion by constructing a state space diagram. Calculate the period of the
motion in terms of $m$, $F_0$ and the amplitude $A$. Disregard damping.
Chapter 5
Calculus of variations
5.1 Introduction
The prior chapters have focussed on the intuitive Newtonian perspective of classical mechanics, which is
based on vector quantities like force, momentum, and acceleration. Newtonian mechanics leads to second-
order differential equations of motion. The calculus of variations underlies a powerful alternative approach
to classical mechanics that is based on identifying the path that minimizes an integral quantity. This integral
variational approach was first championed by Gottfried Wilhelm Leibniz, contemporaneously with Newton’s
development of the differential approach to classical mechanics.
During the 18th century, Bernoulli, who was a student of Leibniz, developed the field of variational
calculus which underlies the integral variational approach to mechanics. He solved the brachistochrone
problem which involves finding the path for which the transit time between two points is the shortest. The
integral variational approach also underlies Fermat’s principle in optics, which can be used to derive that
the angle of reflection equals the angle of incidence, as well as derive Snell’s law. Other applications of the
calculus of variations include solving the catenary problem, finding the maximum and minimum distances
between two points on a surface, polygon shapes having the maximum ratio of enclosed area to perimeter,
or maximizing profit in economics. Bernoulli developed the principle of virtual work used to describe
equilibrium in static systems, and d’Alembert extended the principle of virtual work to dynamical systems.
Euler, the preeminent Swiss mathematician of the 18th century and a student of Bernoulli, developed the
calculus of variations with full mathematical rigor. Lagrange (1736-1813), a student of Euler, culminated the
development of the Lagrangian variational approach to classical mechanics.
The Euler-Lagrangian approach to classical mechanics stems from a deep philosophical belief that the
laws of nature are based on a principle of economy. That is, the physical universe follows paths through space
and time that are based on extrema principles. The standard Lagrangian $L$ is defined as the difference
between the kinetic energy $T$ and the potential energy $U$, that is
$$L = T - U \tag{5.1}$$
Chapters 6 and 13 will show that the laws of classical mechanics can be expressed in terms of Hamilton’s
variational principle which states that the motion of the system between the initial time $t_1$ and final time
$t_2$ follows a path that minimizes the scalar action integral $S$ defined as the time integral of the Lagrangian.
$$S = \int_{t_1}^{t_2} L\,dt \tag{5.2}$$
The calculus of variations provides the mathematics required to determine the path that minimizes the
action integral. This variational approach is both elegant and beautiful, and has withstood the rigors
of experimental confirmation. In fact, not only is it an exceedingly powerful alternative approach to the
intuitive Newtonian approach to mechanics, but Hamilton’s variational principle now is recognized to be
more fundamental than Newton’s Laws of Motion. The Lagrangian and Hamiltonian variational approaches
to mechanics are the only approaches that can handle the Theory of Relativity, statistical mechanics, and
the dichotomy of philosophical approaches to quantum physics.
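A small numerical experiment can make the action integral concrete. The sketch below is a hypothetical example that is not taken from the text: a particle of mass $m$ thrown vertically in uniform gravity that returns to its starting height after a time `T`. It evaluates the time integral of the kinetic energy minus the potential energy along the Newtonian path and along neighbouring paths that share the same end points. The mass, the flight time, the sinusoidal distortion and the function name are all assumptions.

```python
import math

def action(eps, m=1.0, g=9.81, T=2.0, n=20000):
    """Midpoint-rule estimate of the action along y(t) = y_true(t) + eps sin(pi t / T)."""
    dt = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        # Newtonian path with y(0) = y(T) = 0, plus a distortion that vanishes at both ends.
        y = 0.5 * g * t * (T - t) + eps * math.sin(math.pi * t / T)
        v = 0.5 * g * (T - 2.0 * t) + eps * (math.pi / T) * math.cos(math.pi * t / T)
        kinetic = 0.5 * m * v * v
        potential = m * g * y
        total += (kinetic - potential) * dt
    return total

S_true = action(0.0)
S_plus = action(0.1)
S_minus = action(-0.1)
print(S_true, S_plus, S_minus)
```

Distorting the path in either direction raises the action by the same amount, so in this example the Newtonian path is the one that minimizes the action among the paths sampled.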
is an extremum, that is, it is a maximum or minimum. Here $x$ is the independent variable, $y(x)$ the dependent
variable, plus its first derivative $y' \equiv \frac{dy}{dx}$. The quantity $f[y(x), y'(x); x]$ has some given dependence on $y$, $y'$
and $x$. The calculus of variations involves varying the function $y(x)$ until a stationary value of $F$ is found,
which is presumed to be an extremum. This means that if a function $y = y(x)$ gives a minimum value for the
scalar functional $F$, then any neighboring function, no matter how close to $y(x)$, must increase $F$. For all
paths, the integral $F$ is taken between two fixed points, $x_1, y_1$ and $x_2, y_2$. Possible paths between the initial
and final points are illustrated in figure 5.1. Relative to any neighboring path, the functional $F$ must have
a stationary value which is presumed to be the correct extremum path.
Define a neighboring function using a parametric representation $y(\epsilon, x)$ such that for $\epsilon = 0$, $y = y(0, x) = y(x)$
is the function that yields the extremum for $F$. Assume that an infinitesimally small fraction $\epsilon$ of the
neighboring function $\eta(x)$ is added to the extremum path $y(x)$. That is, assume
$$y(\epsilon, x) = y(0, x) + \epsilon\eta(x) \qquad y'(\epsilon, x) = y'(0, x) + \epsilon\frac{d\eta}{dx} \tag{5.4}$$
where $\eta(x)$ vanishes at the end points $x_1$ and $x_2$, so that the functional becomes a function of $\epsilon$
$$F(\epsilon) = \int_{x_1}^{x_2} f[y(\epsilon, x), y'(\epsilon, x); x]\,dx \tag{5.5}$$
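The condition that the integral has a stationary (extremum) value is that $F$ be independent of $\epsilon$ to first
order along the path giving the extremum value ($\epsilon = 0$). That is
$$\left(\frac{\partial F}{\partial\epsilon}\right)_{\epsilon = 0} = 0 \tag{5.6}$$
for all functions $\eta(x)$. This is illustrated on the right side of figure 5.1.
Applying condition (5.6) to equation (5.5) and since $x$ is independent of $\epsilon$ then
$$\frac{\partial F}{\partial\epsilon} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\frac{\partial y}{\partial\epsilon} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial\epsilon}\right)dx = 0 \tag{5.7}$$
Since the limits of integration are fixed, the differential operation affects only the integrand. From equations
(5.4),
$$\frac{\partial y}{\partial\epsilon} = \eta(x) \tag{5.8}$$
and
$$\frac{\partial y'}{\partial\epsilon} = \frac{d\eta}{dx} \tag{5.9}$$
Consider the second term in the integrand
$$\int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{\partial y'}{\partial\epsilon}dx = \int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{d\eta}{dx}dx \tag{5.10}$$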
Figure 5.1: The left shows the extremum $y(x)$ and neighboring paths $y(\epsilon, x) = y(x) + \epsilon\eta(x)$ between $(x_1, y_1)$
and $(x_2, y_2)$ that minimizes the function $F = \int_{x_1}^{x_2} f[y(x), y'(x); x]\,dx$. The right shows the dependence of $F$
as a function of the admixture coefficient $\epsilon$ for a maximum (upper) or a minimum (lower) at $\epsilon = 0$.
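Integrate by parts
$$\int u\,dv = uv - \int v\,du \tag{5.11}$$
gives
$$\int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{d\eta}{dx}dx = \left[\frac{\partial f}{\partial y'}\eta(x)\right]_{x_1}^{x_2} - \int_{x_1}^{x_2}\eta(x)\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)dx \tag{5.12}$$
Note that the first term on the right-hand side is zero since by definition $\frac{\partial y}{\partial\epsilon} = \eta(x) = 0$ at $x_1$ and $x_2$. Thus
$$\frac{\partial F}{\partial\epsilon} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\frac{\partial y}{\partial\epsilon} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial\epsilon}\right)dx = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\eta(x) - \eta(x)\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)\right)dx$$
This integral now appears to be independent of $\epsilon$. However, the functions $y$ and $y'$ occurring in the derivatives
are functions of $\epsilon$. Since $\left(\frac{\partial F}{\partial\epsilon}\right)_{\epsilon = 0}$ must vanish for a stationary value, and because $\eta(x)$ is an arbitrary function
subject to the conditions stated, then the above integrand must be zero. This derivation that the integrand
must be zero leads to Euler’s differential equation
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0 \tag{5.15}$$
where $y$ and $y'$ are the original functions, independent of $\epsilon$. The basis of the calculus of variations is that the
function $y(x)$ that satisfies Euler’s equation is a stationary function. Note that the stationary value could
be either a maximum or a minimum value. When Euler’s equation is applied to mechanical systems using
the Lagrangian as the functional, then Euler’s differential equation is called the Euler-Lagrange equation.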
The function $f$ is
$$f = \sqrt{1 + (y')^2}$$
Therefore
$$\frac{\partial f}{\partial y} = 0$$
and
$$\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + (y')^2}}$$
Inserting these into Euler’s equation 5.15 gives
$$\frac{d}{dx}\left(\frac{y'}{\sqrt{1 + (y')^2}}\right) = 0$$
[Figure: Shortest distance between two points in a plane.]
that is
$$\frac{y'}{\sqrt{1 + (y')^2}} = \text{constant} = C$$
This is valid if
$$y' = \frac{C}{\sqrt{1 - C^2}} = a$$
Therefore
$$y = ax + b$$
which is the equation of a straight line in the plane. Thus the shortest path between two points in a plane is
a straight line between these points, as is intuitively obvious. This stationary value obviously is a minimum.
This trivial example of the use of Euler’s equation to determine an extremum value has given the obvious
answer. It has been presented here because it provides a proof that a straight line is the shortest distance in
a plane and illustrates the power of the calculus of variations to determine extremum paths.
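The role of the admixture coefficient can be illustrated numerically for this example. The sketch below takes the straight line $y = x$ between $(0, 0)$ and $(1, 1)$ as the extremum path and adds a fraction $\epsilon$ of the neighboring function $\eta(x) = \sin\pi x$, which vanishes at both end points. The choice of end points, of $\eta(x)$, and the function name are assumptions made purely for illustration.

```python
import math

def path_length(eps, n=2000):
    """Midpoint-rule length of y(x) = x + eps sin(pi x) between (0, 0) and (1, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        slope = 1.0 + eps * math.pi * math.cos(math.pi * x)  # y'(x) on the varied path
        total += math.sqrt(1.0 + slope * slope) * h
    return total

length_straight = path_length(0.0)
length_varied = [path_length(eps) for eps in (-0.2, -0.1, 0.1, 0.2)]
# Central-difference estimate of dF/d(eps) at eps = 0.
slope_at_zero = (path_length(1e-4) - path_length(-1e-4)) / 2e-4
print(length_straight, length_varied, slope_at_zero)
```

The length of the undistorted path equals $\sqrt{2}$, every distorted path is longer, and the derivative with respect to $\epsilon$ vanishes at $\epsilon = 0$, which is the stationarity condition used to derive Euler’s equation.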
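Consider that the particle of mass $m$ starts at the origin $x_1 = 0$, $y_1 = 0$ with zero velocity. Since the
problem conserves energy and assuming that initially $E = T + U = 0$ then
$$\frac{1}{2}mv^2 - mgx = 0$$
That is
$$v = \sqrt{2gx}$$
The transit time is given by
$$t = \int_{x_1}^{x_2}\frac{ds}{v} = \int_{x_1}^{x_2}\frac{\sqrt{dx^2 + dy^2}}{\sqrt{2gx}} = \int_{x_1}^{x_2}\sqrt{\frac{(1 + y'^2)}{2gx}}\,dx$$
where $y' \equiv \frac{dy}{dx}$. Note that, in this example, the independent variable has been chosen to be $x$ and the dependent
variable is $y(x)$.
The function $f$ of the integral is
$$f = \frac{1}{\sqrt{2g}}\sqrt{\frac{(1 + y'^2)}{x}}$$
Factor out the constant $\sqrt{2g}$ term, which does not affect the final equation, and note that
$$\frac{\partial f}{\partial y} = 0$$
$$\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{x\left(1 + (y')^2\right)}}$$
Since $\frac{\partial f}{\partial y} = 0$, Euler’s equation 5.15 requires that $\frac{d}{dx}\frac{\partial f}{\partial y'} = 0$, or
$$\frac{y'}{\sqrt{x\left(1 + (y')^2\right)}} = \text{constant} = \frac{1}{\sqrt{2a}}$$
[Figure: Cycloid between $(x_1, y_1)$ and $(x_2, y_2)$ passing through the point $P(x, y)$.]
That is
$$\frac{y'^2}{x\left(1 + (y')^2\right)} = \frac{1}{2a}$$
Solving for $y'$ gives $y' = \frac{x}{\sqrt{2ax - x^2}}$. Changing the variable to $x = a(1 - \cos\theta)$, for which $dx = a\sin\theta\,d\theta$ and $\sqrt{2ax - x^2} = a\sin\theta$, leads to $y = \int a(1 - \cos\theta)\,d\theta$
or
$$y = a(\theta - \sin\theta) + \text{constant}$$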
The parametric equations for a cycloid passing through the origin are
$$y = a(\theta - \sin\theta)$$
$$x = a(1 - \cos\theta)$$
which is the form of the solution found. That is, the shortest time between two points is obtained by
constraining the motion of the mass to follow a cycloid shape. Thus the mass first accelerates rapidly by falling
down steeply and then follows the curve and coasts upward at the end. The elapsed time is obtained by
inserting the above parametric relations for $x$ and $y$ in terms of $\theta$ into the transit time integral giving
$t = \sqrt{\frac{a}{g}}\,\theta$ where $a$ and $\theta$ are fixed by the end point coordinates. Thus the time to fall from starting with zero
velocity at the cusp to the minimum of the cycloid is $\pi\sqrt{\frac{a}{g}}$. If $x_2 = x_1 = 0$ then $y_2 = 2\pi a$ which defines the
shape of the cycloid and the minimum time is $2\pi\sqrt{\frac{a}{g}} = \sqrt{\frac{2\pi y_2}{g}}$. If the mass starts with a non-zero initial
velocity, then the starting point is not at the cusp of the cycloid, but down a distance such that the kinetic
energy equals the potential energy difference from the cusp.
A modern application of the Brachistochrone problem is determination of the optimum shape of the low-
friction emergency chute that passengers slide down to evacuate a burning aircraft. Bernoulli solved the
problem of rapid evacuation of an aircraft two centuries before the first flight of a powered aircraft.
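The advantage of the cycloid over a straight chute can be estimated with a short calculation. The sketch below sums $ds/v$ along a curve, with $v = \sqrt{2gx}$ and $x$ measured downward as in the example above, and compares the cycloid from the cusp to its lowest point with a straight line joining the same two end points. The values $a = 1$ m and $g = 9.81$ m/s$^2$, the function name, and the parametrization of the straight line by $w$ with $x = 2aw^2$ (chosen so that the speed does not vanish inside any segment) are assumptions for illustration.

```python
import math

def transit_time(x_of, y_of, p_end, g=9.81, n=20000):
    """Sum ds / v over short chords, with v = sqrt(2 g x) and x measured downward.

    x_of and y_of give the curve as functions of a parameter running from 0 to p_end.
    """
    h = p_end / n
    total = 0.0
    for i in range(n):
        p0 = i * h
        p1 = (i + 1) * h
        ds = math.hypot(x_of(p1) - x_of(p0), y_of(p1) - y_of(p0))
        v = math.sqrt(2.0 * g * x_of(0.5 * (p0 + p1)))  # speed at the segment midpoint
        total += ds / v
    return total

a = 1.0
g = 9.81

# Cycloid from the cusp (theta = 0) to its lowest point (theta = pi).
t_cycloid = transit_time(lambda th: a * (1.0 - math.cos(th)),
                         lambda th: a * (th - math.sin(th)),
                         math.pi, g)

# Straight chute to the same end point (x = 2a, y = pi a).
t_line = transit_time(lambda w: 2.0 * a * w * w,
                      lambda w: math.pi * a * w * w,
                      1.0, g)
print(t_cycloid, t_line)
```

With these assumed values the cycloid time is close to $\pi\sqrt{a/g}$, roughly 1.00 s, against roughly 1.19 s for the straight chute.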
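$$f = e^{-\alpha y}\sqrt{1 + y'^2}$$
Therefore Euler’s equation equals
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = -\alpha e^{-\alpha y}\sqrt{1 + y'^2} + \frac{\left(\alpha y'^2 - y''\right)e^{-\alpha y}}{\sqrt{1 + y'^2}} + \frac{y''y'^2 e^{-\alpha y}}{\left(1 + y'^2\right)^{3/2}} = 0$$
This simplifies to $y'' = -\alpha\left(1 + y'^2\right)$, which has the first integral $y' = \tan(c_1 - \alpha x)$.
Integration gives
$$y(x) = \int y'\,dx = \int\tan(c_1 - \alpha x)\,dx = \frac{\ln(\cos(c_1 - \alpha x)) - \ln(\cos(c_1 + \alpha a))}{\alpha} + c_2 = \frac{1}{\alpha}\ln\left(\frac{\cos(c_1 - \alpha x)}{\cos(c_1 + \alpha a)}\right) + c_2$$
Using the initial condition that $y(-a) = 0$ gives $c_2 = 0$. Similarly the final condition $y(a) = 0$ implies that
$c_1 = 0$. Thus Euler’s equation has determined that the optimal trajectory that minimizes the cost integral
is
$$y(x) = \frac{1}{\alpha}\ln\left(\frac{\cos(\alpha x)}{\cos(\alpha a)}\right)$$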
This example is typical of problems encountered in economics.
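Independent variable $x$
Assuming that $x$ is the independent variable, then the surface area can be written as
$$S = 2\pi\int_{x_1}^{x_2} y\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx = 2\pi\int_{x_1}^{x_2} y\sqrt{1 + y'^2}\,dx$$
where $y' \equiv \frac{dy}{dx}$. The function of the surface integral is $f = y\sqrt{1 + y'^2}$. The derivatives are
$$\frac{\partial f}{\partial y} = \sqrt{1 + y'^2}$$
and
$$\frac{\partial f}{\partial y'} = \frac{yy'}{\sqrt{1 + (y')^2}}$$
Therefore Euler’s equation gives
$$\frac{d}{dx}\left(\frac{yy'}{\sqrt{1 + (y')^2}}\right) - \sqrt{1 + y'^2} = 0$$
Independent variable $y$
Consider the case where the independent variable is chosen to be $y$, then the surface integral can be written
as
$$S = 2\pi\int_{y_1}^{y_2} y\sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy = 2\pi\int y\sqrt{1 + x'^2}\,dy$$
where $x' \equiv \frac{dx}{dy}$. Thus the function of the surface integral is $f = y\sqrt{1 + x'^2}$. The derivatives are
$$\frac{\partial f}{\partial x} = 0$$
and
$$\frac{\partial f}{\partial x'} = \frac{yx'}{\sqrt{1 + (x')^2}}$$
Since $\frac{\partial f}{\partial x} = 0$, Euler’s equation gives
$$\frac{d}{dy}\left(\frac{yx'}{\sqrt{1 + (x')^2}}\right) = 0$$
That is
$$\frac{yx'}{\sqrt{1 + (x')^2}} = a$$
where $a$ is a constant. This can be rewritten as
$$x'^2\left(y^2 - a^2\right) = a^2$$
or
$$x' = \frac{dx}{dy} = \frac{a}{\sqrt{y^2 - a^2}}$$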
where $i = 1, 2, 3, \ldots, N$.
By analogy with the one dimensional problem, define neighboring functions $\eta_i$ for each variable. Then
If the variables $y_i(x)$ are independent, then the $\eta_i(x)$ are independent. Since the $\eta_i(x)$ are independent,
then evaluating the above equation at $\epsilon = 0$ implies that each term in the bracket must vanish independently.
That is, Euler’s differential equation becomes a set of $N$ equations for the $N$ independent variables
$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} = 0 \tag{5.19}$$
where $i = 1, 2, 3, \ldots, N$. Thus, each of the $N$ equations can be solved independently when the $N$ variables are
independent. Note that Euler’s equation involves partial derivatives for the dependent variables $y_i$, $y_i'$ and
the total derivative for the independent variable $x$.
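This is a problem that has two dependent variables $y(x)$ and $z(x)$ with $x$ chosen as the independent variable. The integral can be broken into two parts $x_1 \rightarrow 0$ and $0 \rightarrow -x_2$
$$t = \frac{1}{c}\left[\int_{x_1}^{0} n_1\sqrt{1+(y')^2+(z')^2}\,dx + \int_{0}^{-x_2} n_2\sqrt{1+(y')^2+(z')^2}\,dx\right]$$
The functionals are functions of $y'$ and $z'$ but not $y$ or $z$. Thus Euler's equation for $z$ simplifies to
$$0 + \frac{d}{dx}\left(\frac{1}{c}\left(\frac{n_1 z'}{\sqrt{1+y'^2+z'^2}} + \frac{n_2 z'}{\sqrt{1+y'^2+z'^2}}\right)\right) = 0$$
This implies that $z' = 0$, therefore $z$ is a constant. Since the initial and final values were chosen to be $z_1 = z_2 = 0$, therefore at the interface $z = 0$. Similarly Euler's equations for $y$ are
$$0 + \frac{d}{dx}\left(\frac{1}{c}\left(\frac{n_1 y'}{\sqrt{1+y'^2+z'^2}} + \frac{n_2 y'}{\sqrt{1+y'^2+z'^2}}\right)\right) = 0$$
But $y' = \tan\theta_1$ for $n_1$ and $y' = -\tan\theta_2$ for $n_2$ and it was shown that $z' = 0$. Thus
$$0 + \frac{d}{dx}\left(\frac{1}{c}\left(\frac{n_1\tan\theta_1}{\sqrt{1+(\tan\theta_1)^2}} - \frac{n_2\tan\theta_2}{\sqrt{1+(\tan\theta_2)^2}}\right)\right) = \frac{d}{dx}\left(\frac{1}{c}\left(n_1\sin\theta_1 - n_2\sin\theta_2\right)\right) = 0$$
Therefore $\frac{1}{c}(n_1\sin\theta_1 - n_2\sin\theta_2) = $ constant, which must be zero since when $n_1 = n_2$ then $\theta_1 = \theta_2$. Thus Fermat's principle leads to Snell's Law.
$$n_1\sin\theta_1 = n_2\sin\theta_2$$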
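The geometry of this problem is simple enough to directly minimize the path rather than using Euler's equations for the two parameters as performed above. The lengths of the paths $s_1$ and $s_2$ are
$$s_1 = \sqrt{y^2 + x_1^2 + z^2}$$
$$s_2 = \sqrt{(y_2 - y)^2 + x_2^2 + z^2}$$
This problem involves two dependent variables, $y(x)$ and $z(x)$. To find the minima of the travel time $t = (n_1 s_1 + n_2 s_2)/c$, set the partial derivatives $\frac{\partial t}{\partial z} = 0$ and $\frac{\partial t}{\partial y} = 0$. That is,
$$\frac{\partial t}{\partial z} = \frac{1}{c}\left(\frac{n_1 z}{\sqrt{y^2 + x_1^2 + z^2}} + \frac{n_2 z}{\sqrt{(y_2 - y)^2 + x_2^2 + z^2}}\right) = 0$$
This is zero only if $z = 0$, that is, the point at which the ray crosses the interface lies in the plane $z = 0$ that contains the initial and final points. Similarly
$$\frac{\partial t}{\partial y} = \frac{1}{c}\left(\frac{n_1 y}{\sqrt{y^2 + x_1^2 + z^2}} - \frac{n_2 (y_2 - y)}{\sqrt{(y_2 - y)^2 + x_2^2 + z^2}}\right) = \frac{1}{c}\left(n_1\sin\theta_1 - n_2\sin\theta_2\right) = 0$$
$$n_1\sin\theta_1 = n_2\sin\theta_2$$
Fermat's principle has shown that the refracted light is given by Snell's Law, and is in a plane normal to the surface. The laws of reflection also are given since then $n_1 = n_2 = n$ and the angle of reflection equals the angle of incidence.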
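The direct minimization can also be carried out numerically. The sketch below, with $z = 0$ already imposed, searches for the crossing point $y$ on the interface that minimizes the travel time and then compares $n_1\sin\theta_1$ with $n_2\sin\theta_2$. The geometry values, the refractive indices, and the helper names `travel_time` and `minimize_ternary` are assumptions chosen for illustration.

```python
import math

def travel_time(y, x1, x2, y2, n1, n2, c=1.0):
    """Travel time from the source, via the interface point at height y, to the end point."""
    s1 = math.hypot(x1, y)        # path length in medium 1
    s2 = math.hypot(x2, y2 - y)   # path length in medium 2
    return (n1 * s1 + n2 * s2) / c

def minimize_ternary(f, lo, hi, iters=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Assumed sample geometry and refractive indices
x1, x2, y2 = 1.0, 1.5, 2.0
n1, n2 = 1.0, 1.33

y_min = minimize_ternary(lambda y: travel_time(y, x1, x2, y2, n1, n2), 0.0, y2)
sin1 = y_min / math.hypot(x1, y_min)
sin2 = (y2 - y_min) / math.hypot(x2, y2 - y_min)
print(n1 * sin1, n2 * sin2)   # the two products should agree
```

The search uses only the travel time itself, so agreement of the two printed products is an independent check of Snell's Law rather than a restatement of it.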
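Note that the variables $x_1, x_2, x_3$ are independent, and thus Euler's equation for several independent variables can be used. To minimize the functional, the function
$$f = \left(\frac{\partial\phi}{\partial x_1}\right)^2 + \left(\frac{\partial\phi}{\partial x_2}\right)^2 + \left(\frac{\partial\phi}{\partial x_3}\right)^2$$
must satisfy the Euler equation
$$\frac{\partial f}{\partial\phi} - \sum_{i=1}^{3}\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial\phi_i'}\right) = 0$$
where $\phi_i' = \frac{\partial\phi}{\partial x_i}$. Substituting $f$ into Euler's equation gives
$$\sum_{i=1}^{3}\frac{\partial}{\partial x_i}\left(\frac{\partial\phi}{\partial x_i}\right) = 0$$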
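$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}y' + \frac{\partial f}{\partial y'}y'' \tag{5.20}$$
But
$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = y''\frac{\partial f}{\partial y'} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \tag{5.21}$$
Combining these two equations gives
$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = \frac{df}{dx} - \frac{\partial f}{\partial x} - y'\frac{\partial f}{\partial y} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \tag{5.22}$$
The last two terms can be rewritten as
$$y'\left(\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y}\right) \tag{5.23}$$
which vanishes when the Euler equation is satisfied. Therefore the above equation simplifies to
$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y'\frac{\partial f}{\partial y'}\right) = 0 \tag{5.24}$$
This integral form of Euler's equation is especially useful when $\frac{\partial f}{\partial x} = 0$, that is, when $f$ does not depend explicitly on the independent variable $x$. Then the first integral of equation 5.24 is a constant, i.e.
$$f - y'\frac{\partial f}{\partial y'} = \text{constant} \tag{5.25}$$
This is Euler's integral variational equation. Note that the shortest distance between two points, the minimum surface of rotation, and the brachistochrone, described earlier, all are examples where $\frac{\partial f}{\partial x} = 0$ and thus the integral form of Euler's equation is useful for solving these cases.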
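A short numerical sketch of the integral form, applied to the minimum surface of rotation with $y$ as the independent variable: the integrand $f = x\sqrt{1+x'^2}$ has no explicit $y$ dependence, so $f - x'\,\partial f/\partial x'$ should be constant along the solution. The curve used here is the standard catenary $x(y) = a\cosh((y-b)/a)$, which is assumed rather than derived in this section; the constants and the function name `first_integral` are illustrative.

```python
import math

def first_integral(x, xp):
    """f - x' * df/dx' for f = x * sqrt(1 + x'^2)."""
    f = x * math.sqrt(1.0 + xp * xp)
    df_dxp = x * xp / math.sqrt(1.0 + xp * xp)
    return f - xp * df_dxp

a, b = 0.8, 0.3  # assumed constants of the catenary x(y) = a cosh((y - b)/a)

values = []
for y in (-1.0, -0.25, 0.3, 0.9, 2.0):
    x = a * math.cosh((y - b) / a)
    xp = math.sinh((y - b) / a)   # dx/dy along the catenary
    values.append(first_integral(x, xp))

print(values)  # every entry should equal the constant a
```

The conserved quantity reduces algebraically to $x/\sqrt{1+x'^2}$, which is why a first-order equation replaces the second-order Euler equation in this case.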
the assumption made in chapter 5.5 that the variables are independent.
For example, for a disk rolling down an inclined plane without slipping, there are three coordinates: $x$ [perpendicular to the wedge], $y$ [along the surface of the wedge], and the rotation angle $\theta$ shown in figure 5.2. The constraint forces, $F_f$, $N$, lead to the correlation of the variables such that $x = R$, while $y = R\theta$. Basically there is only one independent variable, which can be either $y$ or $\theta$. The use of only one independent variable essentially buries the constraint forces under the rug, which is fine if you only need to know the equation of motion. If you need to determine the forces of constraint then it is necessary to include all coordinates explicitly in the equations of motion.
Figure 5.2: A disk rolling down an inclined plane.
where $k = 1, 2, 3, \ldots, m$. There can be $m$ such equations of constraint where $0 \le m \le n$. An example of such a geometric constraint is when the motion is confined to the surface of a sphere of radius $R$ in coordinate space which can be written in the form $g = x^2 + y^2 + z^2 - R^2 = 0$. Such algebraic constraint equations are called holonomic, which allows use of generalized coordinates as well as Lagrange multipliers to handle both the constraint forces and the correlation of the coordinates.
where $i = 1, 2, 3, \ldots, n$, $k = 1, 2, 3, \ldots, m$. If equation (5.27) represents the total differential of a function then it can be integrated to give a holonomic relation of the form of equation 5.26. However, if equation 5.27 is not the total differential, then it is non-holonomic and can be integrated only after having solved the full problem.
An example of differential constraint equations is for a wheel rolling on a plane without slipping which is non-holonomic and more complicated than might be expected. The wheel moving on a plane has five degrees of freedom since the height $z$ is fixed. That is, the motion of the center of mass requires two coordinates $(x, y)$ plus there are three angles $(\phi, \theta, \psi)$ where $\phi$ is the rotation angle for the wheel, $\theta$ is the pivot angle of the axis, and $\psi$ is the tilt angle of the wheel. If the wheel slides then all five degrees of freedom are active. If the axis of rotation of the wheel is horizontal, that is, the tilt angle $\psi = 0$ is constant, then this kinematic system leads to three differential constraint equations. The wheel can roll with angular velocity $\dot{\phi}$, as well as pivot, which corresponds to a change in $\theta$. Combining these leads to two differential equations of constraint.
These constraints are insufficient to provide finite relations between all the coordinates. That is, the constraints cannot be reduced by integration to the form of equation 5.26 because there is no functional relation between $\phi$ and the other three variables, $x, y, \theta$. Many rolling trajectories are possible between any two points of contact on the plane that are related to different pivot angles. That is, the point of contact of the disk could pivot plus roll in a circle returning to the same point where $x, y, \theta$ are unchanged whereas the value of $\phi$ depends on the circumference of the circle. As a consequence the rolling constraint is non-holonomic except for the case where the disk rolls in a straight line and remains vertical.
where $l$ is a fixed length. This integral constraint is geometric and holonomic. Another example is finding the minimum surface area of a closed surface subject to the enclosed volume being the constraint.
Such a system is called holonomic since there is a direct relation between the coupled variables. An example of such a holonomic geometric constraint is if the motion is confined to the surface of a sphere of radius $R$ which can be written in the form
$$g = x^2 + y^2 + z^2 - R^2 = 0 \tag{5.32}$$
Non-holonomic constraints: There are many classifications of non-holonomic constraints that exist if equation (5.31) is not satisfied. The algebraic approach is difficult to handle when the constraint is an inequality, such as the requirement that the location is restricted to lie inside a spherical shell of radius $R$ which can be expressed as
$$g = x^2 + y^2 + z^2 - R^2 \le 0 \tag{5.33}$$
This non-holonomic constrained system has a one-sided constraint. Systems usually are non-holonomic if the constraint is kinematic as discussed above.
Partial Holonomic constraints Partial-holonomic constraints are holonomic for a restricted range
of the constraint surface in coordinate space, and this range can be case specific. This can occur if the
constraint force is one-sided and perpendicular to the path. An example is the pendulum with the mass
attached to the fulcrum by a flexible string that provides tension but not compression. Then the pendulum
length is constant only if the tension in the string is positive. Thus the pendulum will be holonomic if
the gravitational plus centrifugal forces are such that the tension in the string is positive, but the system
becomes non-holonomic if the tension is negative as can happen when the pendulum rotates to an upright
angle where the centrifugal force outwards is insufficient to compensate for the vertical downward component
of the gravitational force. There are many other examples where the motion of an object is holonomic when
the object is pressed against the constraint surface, such as the surface of the Earth, but is unconstrained if
the object leaves the surface.
Time dependence
A constraint is called scleronomic if the constraint is not explicitly time dependent. This ignores the time
dependence contained within the solution of the equations of motion. Fortunately a major fraction of
systems are scleronomic. The constraint is called rheonomic if the constraint is explicitly time dependent.
An example of a rheonomic system is where the size or shape of the surface of constraint is explicitly time
dependent such as a deflating pneumatic tire.
Energy conservation
The solution depends on whether the constraint is conservative or dissipative, that is, if friction or drag are
acting. The system will be conservative if there are no drag forces, and the constraint forces are perpendicular
to the trajectory of the path such as the motion of a charged particle in a magnetic field. Forces of constraint
can result from sliding of two solid surfaces, rolling of solid objects, fluid flow in a liquid or gas, or result from
electromagnetic forces. Energy dissipation can result from friction, drag in a fluid or gas, or finite resistance
of electric conductors leading to dissipation of induced electric currents in a conductor, e.g. eddy currents.
A rolling constraint is unusual in that friction between the rolling bodies is necessary to maintain rolling.
A disk on a frictionless inclined plane will conserve its angular momentum since there is no torque acting
if the rolling contact is frictionless, that is, the disk will just slide. If the friction is sufficient to stop sliding,
then the bodies will roll and not slide. A perfect rolling body does not dissipate energy since no work is
done at the instantaneous point of contact where both bodies are in zero relative motion and the force is
perpendicular to the motion. In real life, a rolling wheel can involve a very small energy dissipation due to
deformation at the point of contact coupled with non-elastic properties of the material used to make the
wheel and the plane surface. For example, a pneumatic tire can heat up and expand due to flexing of the
tire.
Since equations 5.36 and 5.38 both equal zero, the $m$ equations 5.38 can be multiplied by arbitrary undetermined factors $\lambda_k$ and added to equations 5.36 to give
$$\delta F + \sum_k^m \lambda_k\,\delta g_k = 0 \tag{5.39}$$
Note that this is not trivial in that although the sum of the constraint equations for each $y_i$ is zero, the individual terms of the sum are not zero.
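Inserting equations 5.36 plus 5.38 into 5.39, and collecting all $n$ terms, gives
$$\sum_{i=1}^{n}\left(\frac{\partial F}{\partial y_i} + \sum_k^m \lambda_k\frac{\partial g_k}{\partial y_i}\right)\delta y_i = 0 \tag{5.40}$$
Note that all the $\delta y_i$ are free independent variations and thus the terms in the brackets, which are the coefficients of each $\delta y_i$, individually must equal zero. For each of the $n$ values of $i$, the corresponding bracket implies
$$\frac{\partial F}{\partial y_i} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial y_i} = 0 \tag{5.41}$$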
Equation 5.42 is equivalent to a variational problem for finding the stationary value of $F'$
$$\delta(F') = \delta\left(F + \sum_k^m \lambda_k g_k\right) = 0 \tag{5.43}$$
where $F'$ is defined to be
$$F' \equiv \left(F + \sum_{k=1}^{m}\lambda_k g_k\right) \tag{5.44}$$
The solution to equation 5.43 can be found using Euler's differential equation 5.19 of variational calculus. At the extremum $\delta(F') = 0$ corresponds to following contours of constant $F'$ which are in the surface that is perpendicular to the gradients of the terms in $F'$. The Lagrange multiplier constants are required because, although these gradients are parallel at the extremum, the magnitudes of the gradients are not equal.
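The beauty of the Lagrange multipliers approach is that the auxiliary conditions do not have to be handled explicitly, since they are handled automatically as $m$ additional free variables during solution of Euler's equations for a variational problem with $n + m$ unknowns fit to $n + m$ equations. That is, the $n$ variables $y_i$ are determined by the variational procedure using the $n$ variational equations
$$\frac{d}{dx}\left(\frac{\partial F'}{\partial y_i'}\right) - \left(\frac{\partial F'}{\partial y_i}\right) = \frac{d}{dx}\left(\frac{\partial F}{\partial y_i'}\right) - \left(\frac{\partial F}{\partial y_i}\right) - \sum_k^m \lambda_k\frac{\partial g_k}{\partial y_i} = 0 \tag{5.45}$$
simultaneously with the $m$ variables $\lambda_k$ which are determined by the $m$ variational equations
$$\frac{d}{dx}\left(\frac{\partial F'}{\partial \lambda_k'}\right) - \left(\frac{\partial F'}{\partial \lambda_k}\right) = 0 \tag{5.46}$$
Equation 5.45 usually is expressed as
$$\left(\frac{\partial F}{\partial y_i}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial y_i'}\right) + \sum_k^m \lambda_k\frac{\partial g_k}{\partial y_i} = 0 \tag{5.47}$$
The elegance of Lagrange multipliers is that a single variational approach allows simultaneous determination of all $n + m$ unknowns. Chapter 6.2 will show that the forces of constraint are given directly by the $\lambda_k\frac{\partial g_k}{\partial y_i}$ terms.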
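The multiplier condition can be illustrated with an ordinary function rather than a functional. The sketch below uses an assumed example that is not worked in the text: a rectangular box with half-edges $x, y, z$ and volume $F = 8xyz$, whose corner is constrained to a sphere, $g = x^2 + y^2 + z^2 - R^2 = 0$. It checks that $\partial F/\partial x_i + \lambda\,\partial g/\partial x_i = 0$ holds for all three coordinates with a single $\lambda$ at the cube $x = y = z = R/\sqrt{3}$, and that nearby points on the sphere give a smaller volume. All names and values are illustrative.

```python
import math

R = 2.0  # assumed sphere radius

def grad_F(x, y, z):
    """Gradient of the box volume F = 8xyz."""
    return (8.0 * y * z, 8.0 * x * z, 8.0 * x * y)

def grad_g(x, y, z):
    """Gradient of the constraint g = x^2 + y^2 + z^2 - R^2."""
    return (2.0 * x, 2.0 * y, 2.0 * z)

# Candidate stationary point: the cube
p = (R / math.sqrt(3.0),) * 3

# Fix the multiplier from the first coordinate, then test all three
lam = -grad_F(*p)[0] / grad_g(*p)[0]
residual = [dF + lam * dg for dF, dg in zip(grad_F(*p), grad_g(*p))]

def volume(theta, phi):
    """Box volume for a corner on the sphere at polar angles (theta, phi)."""
    x = R * math.sin(theta) * math.cos(phi)
    y = R * math.sin(theta) * math.sin(phi)
    z = R * math.cos(theta)
    return 8.0 * x * y * z

theta0, phi0 = math.acos(1.0 / math.sqrt(3.0)), math.pi / 4.0
v_cube = volume(theta0, phi0)
v_near = [volume(theta0 + dt, phi0 + dp)
          for dt in (-0.05, 0.05) for dp in (-0.05, 0.05)]

print(lam, residual, v_cube, v_near)
```

The gradients of $F$ and $g$ are parallel at the stationary point but differ in magnitude, and $\lambda$ is the factor that relates them.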
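The resulting equation now contains only a single arbitrary function $\eta_1(x)$ that is not restricted by the constraint. Thus the bracket in the integrand of that equation must equal zero for the extremum. That is
$$\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\left(\frac{\partial g}{\partial y}\right)^{-1} = \left(\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'}\right)\left(\frac{\partial g}{\partial z}\right)^{-1} \equiv -\lambda(x)$$
Now the left-hand side of this equation is only a function of $f$ and $g$ with respect to $y$ and $y'$ while the right-hand side is a function of $f$ and $g$ with respect to $z$ and $z'$. Because both sides are functions of $x$ then each side can be set equal to a function $-\lambda(x)$. Thus the above equations can be written as
$$\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = \lambda(x)\frac{\partial g}{\partial y} \qquad \frac{d}{dx}\frac{\partial f}{\partial z'} - \frac{\partial f}{\partial z} = \lambda(x)\frac{\partial g}{\partial z}$$
There are three unknown functions, $y(x)$, $z(x)$, and $\lambda(x)$. The complete solution for these three unknown functions is obtained by solving the two equations above plus the equation of constraint. The Lagrange multiplier $\lambda(x)$ is related to the force of constraint. This example of two variables coupled by one holonomic constraint conforms with the general relation for many variables and constraints given by equation 5.47.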
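That is, it is an extremum for both $y(x)$ and the Lagrange multiplier $\lambda$. This effectively involves finding the extremum path for the function $K(y, x, \lambda) = F(y, x) + \lambda G(y, x)$ where both $y(x)$ and $\lambda$ are the minimized variables. Therefore the curve $y(x)$ must satisfy the differential equation
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} + \lambda\left[\frac{\partial g}{\partial y} - \frac{d}{dx}\frac{\partial g}{\partial y'}\right] = 0 \tag{5.51}$$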
$$F = f + \lambda g = (y + \lambda)\sqrt{1+y'^2}$$
Note that this case is one where $\frac{\partial F}{\partial x} = 0$ and $\lambda$ is a constant; also defining $z = y + \lambda$ then $z' = y'$. Therefore the Euler's equations can be written in the integral form
$$F - z'\frac{\partial F}{\partial z'} = c = \text{constant}$$
Inserting the relation $F = z\sqrt{1+z'^2}$ gives
$$z\sqrt{1+z'^2} - z'\frac{z z'}{\sqrt{1+z'^2}} = c$$
where $c$ is an arbitrary constant. This simplifies to
$$z'^2 = \left(\frac{z}{c}\right)^2 - 1$$
The integral of this is
$$z = c\cosh\left(\frac{x + b}{c}\right)$$
where $b$ and $c$ are arbitrary constants fixed by the locations of the two fixed ends of the rope.
Figure: The catenary.
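To see how the constants are fixed in practice, the following sketch treats the symmetric case, which is an assumption made here for simplicity: the two ends are at equal height at $x = \pm d$, so that $b = 0$, and the rope has a given length. The length of $z = c\cosh(x/c)$ between the ends is $2c\sinh(d/c)$, and the constant $c$ is found by bisection. The function names and sample numbers are illustrative.

```python
import math

def rope_length(c, d):
    """Length of z = c cosh(x/c) between x = -d and x = +d."""
    return 2.0 * c * math.sinh(d / c)

def solve_c(length, d, iters=200):
    """Bisection for c; rope_length decreases monotonically as c grows."""
    if length <= 2.0 * d:
        raise ValueError("the rope must be longer than the span")
    lo = d / 20.0          # small c: far more rope than needed
    hi = d
    while rope_length(hi, d) > length:
        hi *= 2.0          # enlarge until the rope is shorter than required
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rope_length(mid, d) > length:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d, length = 1.0, 3.0       # assumed half-span and rope length
c = solve_c(length, d)

# Check the first-order equation z'^2 = (z/c)^2 - 1 at a sample point
x = 0.4
z, zp = c * math.cosh(x / c), math.sinh(x / c)
print(c, rope_length(c, d), zp**2, (z / c)**2 - 1.0)
```

For ends at unequal heights both $b$ and $c$ would have to be solved for together; that case is not attempted here.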
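and $\frac{\partial g}{\partial y'} = \frac{y'}{\sqrt{1+y'^2}}$. Inserting these into the Euler-Lagrange equation (5.51) gives
$$1 - \lambda\frac{d}{dx}\left[\frac{y'}{\sqrt{1+y'^2}}\right] = 0$$
That is
$$\frac{d}{dx}\left[\frac{y'}{\sqrt{1+y'^2}}\right] = \frac{1}{\lambda}$$
Integrating with respect to $x$ gives
$$\frac{\lambda y'}{\sqrt{1+y'^2}} = x - a$$
where $a$ is a constant of integration. This can be rearranged to give
$$y' = \frac{\pm(x - a)}{\sqrt{\lambda^2 - (x - a)^2}}$$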
5.10 Geodesic
The geodesic is defined as the shortest path between two fixed points for motion that is constrained to lie
on a surface. Variational calculus provides a powerful approach for determining the equations of motion
constrained to follow a geodesic.
The use of variational calculus is illustrated by considering the geodesic constrained to follow the surface of a sphere of radius $R$. As discussed in appendix 2.3, the element of path length on the surface of the sphere is given in spherical coordinates as $ds = R\sqrt{d\theta^2 + (\sin\theta\,d\phi)^2}$. Therefore the distance $s$ between two points 1 and 2 is
$$s = R\int_1^2\left[\sqrt{\left(\frac{d\theta}{d\phi}\right)^2 + \sin^2\theta}\,\right]d\phi \tag{5.52}$$
where $\theta' = \frac{d\theta}{d\phi}$. This is a case where $\frac{\partial f}{\partial\phi} = 0$ and thus the integral form of Euler's equation can be used leading to the result that
$$\sqrt{\theta'^2 + \sin^2\theta} - \theta'\frac{\partial}{\partial\theta'}\sqrt{\theta'^2 + \sin^2\theta} = \text{constant} = a \tag{5.54}$$
This gives that
$$\sin^2\theta = a\sqrt{\theta'^2 + \sin^2\theta} \tag{5.55}$$
This can be rewritten as
$$\frac{d\phi}{d\theta} = \frac{1}{\theta'} = \frac{a\csc^2\theta}{\sqrt{1 - a^2\csc^2\theta}} \tag{5.56}$$
The terms in the brackets are just expressions for the rectangular coordinates $x$, $y$, $z$. That is, the relation has the form
$$Ax - By = z \tag{5.62}$$
where $A$ and $B$ are constants.
This is the equation of a plane passing through the center of the sphere. Thus the geodesic on a sphere
is the path where that plane through the center, as well as the initial and final points, intersects the sphere.
This geodesic is called a great circle. Euler’s equation gives both the maximum and minimum extremum
path lengths for motion on this great circle.
Chapter 16 discusses the geodesic in the four-dimensional space-time coordinates that underlie the General
Theory of Relativity. As a consequence, the use of the calculus of variations to determine the equations of
motion for geodesics plays a pivotal role in the General Theory of Relativity.
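The great-circle result can be checked numerically against the first integral $\sin^2\theta = a\sqrt{\theta'^2 + \sin^2\theta}$. The sketch below assumes a plane through the center written as $\cot\theta = \beta\cos(\phi - \alpha)$, a parametrization chosen here for convenience with arbitrary sample values of $\beta$ and $\alpha$, and evaluates $\sin^2\theta/\sqrt{\theta'^2 + \sin^2\theta}$ at several values of $\phi$ using a finite-difference derivative.

```python
import math

beta, alpha = 0.7, 0.4  # assumed parameters of the plane through the center

def theta(phi):
    """Polar angle on the great circle cot(theta) = beta * cos(phi - alpha)."""
    return math.atan2(1.0, beta * math.cos(phi - alpha))

def first_integral(phi, h=1e-5):
    """sin^2(theta) / sqrt(theta'^2 + sin^2(theta)), with theta' = d(theta)/d(phi)."""
    th = theta(phi)
    dth = (theta(phi + h) - theta(phi - h)) / (2.0 * h)
    s2 = math.sin(th) ** 2
    return s2 / math.sqrt(dth * dth + s2)

values = [first_integral(p) for p in (0.0, 1.0, 2.5, 4.0, 5.5)]
print(values)  # constant along the great circle
```

For this parametrization the constant works out to $a = 1/\sqrt{1+\beta^2}$, so a plane tilted further from the equator corresponds to a smaller $a$.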
5.12 Summary
Euler's differential equation: The calculus of variations has been introduced and Euler's differential equation was derived. The calculus of variations reduces to varying the functions $y_i(x)$, where $i = 1, 2, 3, \ldots, n$, such that the integral
$$F = \int_{x_1}^{x_2} f[y_i(x), y_i'(x); x]\,dx \tag{5.16}$$
is an extremum, that is, it is a maximum or minimum. Here $x$ is the independent variable, and $y_i(x)$ are the $n$ dependent variables plus their first derivatives $y_i' \equiv \frac{dy_i}{dx}$. The quantity $f[y(x), y'(x); x]$ has some given dependence on $y_i$, $y_i'$ and $x$. The calculus of variations involves varying the functions $y_i(x)$ until a stationary value of $F$ is found which is presumed to be an extremum. It was shown that if the $y_i(x)$ are independent, then the extremum value of $F$ leads to $n$ independent Euler equations
$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} = 0 \tag{5.19}$$
where $i = 1, 2, 3, \ldots, n$. This can be used to determine the functional form $y_i(x)$ that ensures that the integral $F = \int_{x_1}^{x_2} f[y(x), y'(x); x]\,dx$ is a stationary value, that is, presumably a maximum or minimum value.
Note that Euler's equation involves partial derivatives for the dependent variables $y_i$, $y_i'$ and the total derivative for the independent variable $x$.
Euler's integral equation: It was shown that if the function $f[y_i(x), y_i'(x); x]$ does not depend explicitly on the independent variable $x$, then Euler's differential equation can be written in an integral form. That is, when $\frac{\partial f}{\partial x} = 0$ the first integral of the Euler equation is a constant
$$f - y'\frac{\partial f}{\partial y'} = \text{constant} \tag{5.25}$$
Constrained variational systems: Most applications involve constraints on the motion. The equations
of constraint can be classified according to whether the constraints are holonomic or non-holonomic, the time
dependence of the constraints, and whether the constraint forces are conservative.
Generalized coordinates in variational calculus: Independent generalized coordinates can be chosen
that are perpendicular to the rigid constraint forces and therefore the constraint does not contribute to the
functional being minimized. That is, the constraints are embedded into the generalized coordinates and thus
the constraints can be ignored when deriving the variational solution.
Minimal set of generalized coordinates: If the constraints are holonomic then the $m$ holonomic equations of constraint can be used to transform the $n$ coupled generalized coordinates to $s = n - m$ independent generalized variables $q_i$, $q_i'$. The generalized coordinate method then uses Euler's equations to determine these $s = n - m$ independent generalized coordinates.
$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \tag{5.35}$$
Lagrange multipliers for holonomic constraints: The Lagrange multipliers approach for $n$ variables, plus $m$ holonomic equations of constraint, determines all $n + m$ unknowns for the system. The holonomic forces of constraint acting on the $n$ variables are related to the Lagrange multiplier terms $\lambda_k(x)\frac{\partial g_k}{\partial y_i}$, where the holonomic equations of constraint are
$$g_k(y_i; x) = 0 \tag{5.38}$$
The advantage of using the Lagrange multiplier approach is that the variational procedure simultaneously determines both the equations of motion for the $n$ variables plus the $m$ constraint forces acting on the system.
Workshop exercises
1. Find the extremal of the functional
$$J(x) = \int_1^2 \frac{\dot{x}^2}{t^3}\,dt$$
that satisfies $x(1) = 3$ and $x(2) = 18$. Show that this extremal provides the global minimum of $J$.
(a) A particle is constrained to move on the surface of a sphere. What are the equations of constraint for this
system?
(b) A disk of mass $m$ and radius $R$ rolls without slipping on the outside surface of a half-cylinder of radius $5R$. What are the equations of constraint for this system?
(c) What are holonomic constraints? Which of the equations of constraint that you found above are holo-
nomic?
(d) Equations of constraint that do not explicitly contain time are said to be scleronomic. Moving constraints
are rheonomic. Are the equations of constraint that you found above scleronomic or rheonomic?
3. For each of the following systems, describe the generalized coordinates that would work best. There may be
more than one answer for each system.
(a) An inclined plane of mass $M$ is sliding on a smooth horizontal surface, while a particle of mass $m$ is sliding on the smooth inclined surface.
(b) A disk rolls without slipping across a horizontal plane. The plane of the disk remains vertical, but it is
free to rotate about a vertical axis.
(c) A double pendulum consisting of two simple pendula, with one pendulum suspended from the bob of the
other. The two pendula have equal lengths and have bobs of equal mass. Both pendula are confined to
move in the same plane.
(d) A particle of mass $m$ is constrained to move on a circle of radius $R$. The circle rotates in space about one point on the circle, which is fixed. The rotation takes place in the plane of the circle, with constant angular speed $\omega$, in the absence of a gravitational force.
(e) A particle of mass $m$ is attracted toward a given point by a force of magnitude $k/r^2$, where $k$ is a constant.
4. Looking back at the systems in problem 3, which ones could have equations of constraint? How would you
classify the equations of constraint (holonomic, scleronomic, rheonomic, etc.)?
Problems
1. Find the extremal of the functional
$$J(x) = \int_0^{\pi}\left(2x\sin t - \dot{x}^2\right)dt$$
that satisfies $x(0) = x(\pi) = 0$. Show that this extremal provides the global maximum of $J$.
2. Find and describe the path $y = y(x)$ for which the integral $\int_{x_1}^{x_2}\sqrt{x}\sqrt{1 + (y')^2}\,dx$ is stationary.
3. Find the dimensions of the parallelepiped of maximum volume circumscribed by a sphere of radius $R$.
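4. Consider a single loop of the cycloid having a fixed value of $a$ as shown in the figure. A cart is released from rest at any point $P_0$ anywhere on the track between $O$ and the lowest point $P$, that is, $P_0$ has a parameter $0 < \theta_0 < \pi$.
[Figure: a single loop of the cycloid, showing the origin $O$, the $x$ axis, the release point $P_0$ and the lowest point $P$.]
(a) Show that the time for the cart to slide from $P_0$ to $P$ is given by the integral
$$t(P_0 \rightarrow P) = \sqrt{\frac{a}{g}}\int_{\theta_0}^{\pi}\sqrt{\frac{1 - \cos\theta}{\cos\theta_0 - \cos\theta}}\,d\theta$$
(b) Prove that this time $t$ is equal to $\pi\sqrt{a/g}$, which is independent of the position $P_0$.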
(c) Explain qualitatively how this surprising result can possibly be true.
5. Consider a medium for which the refractive index $n = \frac{a}{r^2}$ where $a$ is a constant and $r$ is the distance from the origin. Use Fermat's Principle to find the path of a ray of light travelling in a plane containing the origin. Hint, use two-dimensional polar coordinates with $\phi = \phi(r)$. Show that the resulting path is a circle through the origin.
6. Find the shortest path between the $(x, y, z)$ points $(0, -1, 0)$ and $(0, 1, 0)$ on the conical surface
$$z = 1 - \sqrt{x^2 + y^2}$$
What is the length of this path? Note that this is the shortest mountain path around a volcano.
7. Show that the geodesic on the surface of a right circular cylinder is a segment of a helix.
Chapter 6
Lagrangian dynamics
6.1 Introduction
Newtonian mechanics is based on vector observables such as momentum and force, and Newton’s equations
of motion can be derived if the forces are known. However, Newtonian mechanics becomes difficult for
many-body systems when constraint forces apply. The alternative algebraic Lagrangian mechanics approach
is based on the concept of scalar energies which circumvent many of the difficulties in handling constraint
forces and many-body systems.
The Lagrangian approach to classical dynamics is based on the calculus of variations introduced in chapter 5. It was shown that the calculus of variations determines the function $y_i(x)$ such that the scalar functional
$$F = \int_{x_1}^{x_2}\sum_i^n f[y_i(x), y_i'(x); x]\,dx \tag{6.1}$$
is an extremum, that is, a maximum or minimum. Here $x$ is the independent variable, $y_i(x)$ are the $n$ dependent variables, and their derivatives $y_i' \equiv \frac{dy_i}{dx}$, where $i = 1, 2, 3, \ldots, n$. The function $f[y_i(x), y_i'(x); x]$ has an assumed dependence on $y_i$, $y_i'$ and $x$. The calculus of variations determines the functional dependence of the dependent variables $y_i(x)$ on the independent variable $x$ that is required to ensure that $F$ is an extremum. For $n$ independent variables, $F$ has a stationary point, which is presumed to be an extremum, that is determined by solution of Euler's differential equations
$$\frac{d}{dx}\frac{\partial f}{\partial y_i'} - \frac{\partial f}{\partial y_i} = 0 \tag{6.2}$$
If the $n$ coordinates $y_i(x)$ are independent, then the Euler equations, (6.2), for each coordinate $i$ are independent. However, for constrained motion, the constraints lead to auxiliary conditions that correlate the coordinates. As shown in chapter 5 a transformation to independent generalized coordinates can be made such that the correlations induced by the constraint forces are embedded into the choice of the independent generalized coordinates. The use of generalized coordinates in Lagrangian mechanics simplifies derivation of the equations of motion for constrained systems. For example, for a system of $n$ coordinates, that involves $m$ holonomic constraints, there are $s = n - m$ independent generalized coordinates. For such holonomic constrained motion, it will be shown that the Euler equations can be solved using any of the following three alternative ways.
1) The minimal set of generalized coordinates approach involves finding a set of $s = n - m$ independent generalized coordinates $q_i$ that satisfy the assumptions underlying (6.3). These generalized coordinates can be determined if the equations of constraint are holonomic, that is, related by algebraic equations of constraint
$$g_k(y_i; x) = 0 \tag{6.3}$$
where $k = 1, 2, 3, \ldots, m$. These equations uniquely determine the relationship between the $n$ correlated coordinates. This method has the advantage that it reduces the system of $n$ coordinates, subject to $m$ constraints, to $s = n - m$ independent generalized coordinates which reduces the dimension of the problem to be solved. However, it does not explicitly determine the forces of constraint which are effectively swept under the rug.
2) The Lagrange multipliers approach takes account of the correlation between the $n$ coordinates and $m$ holonomic constraints by introducing the Lagrange multipliers $\lambda_k(x)$. These $n$ generalized coordinates $y_i$ are correlated by the $m$ holonomic constraints.
$$\frac{d}{dx}\frac{\partial f}{\partial y_i'} - \frac{\partial f}{\partial y_i} = \sum_k^m \lambda_k(x)\frac{\partial g_k}{\partial y_i} \tag{6.4}$$
where $i = 1, 2, 3, \ldots, n$. The Lagrange multiplier approach has the advantage that Euler's calculus of variations automatically uses the $n$ Lagrange equations, plus the $m$ equations of constraint, to explicitly determine both the $n$ coordinates and the $m$ forces of constraint, which are related to the Lagrange multipliers $\lambda_k$ as given in equation (6.4). Chapter 6.2 shows that the $\sum_k^m \lambda_k(x)\frac{\partial g_k}{\partial y_i}$ terms are directly related to the holonomic forces of constraint.
3) The generalized force approach incorporates the forces of constraint explicitly as will be shown in chapter 6.5.3. Generalized forces include the constraint forces explicitly, and thus can accommodate holonomic, non-holonomic, and non-conservative forces.
The physics underlying the Lagrange formulation of classical mechanics will be illustrated by use of a
plausibility argument that is based on Newton’s laws of motion. This will be followed by a more rigorous
derivation of the Lagrangian formulation developed by the following two approaches that better elucidate
the physics underlying the Lagrange and Hamiltonian analytic representations of classical mechanics. In
1788 Lagrange derived his equations of motion using the differential d'Alembert Principle, which extends to dynamical systems the Bernoulli Principle of infinitesimal virtual displacements and virtual work. The other approach, developed in 1834, uses the integral Hamilton's Principle to derive the Lagrange equations. Euler's variational calculus underlies d'Alembert's Principle and Hamilton's Principle since both are based on the philosophical belief that the laws of nature prefer economy of motion. Chapters 6.2 to 6.5 show that both d'Alembert's Principle and Hamilton's Principle lead to the Euler-Lagrange equations. This will be
followed by examples to illustrate the use of Lagrangian mechanics in classical mechanics.
$$T = \frac{1}{2}mv^2 = \frac{\mathbf{p}\cdot\mathbf{p}}{2m} = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m}$$
It can be seen that
$$\frac{\partial T}{\partial\dot{x}} = p_x \tag{6.6}$$
and
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{x}} = \frac{dp_x}{dt} = F_x \tag{6.7}$$
Consider that the force, acting on a mass $m$, is arbitrarily separated into two components, one part that is conservative, and thus can be written as the gradient of a scalar potential $U$, plus the excluded part of the force, $F^{EX}$. The excluded part of the force $F^{EX}$ could include non-conservative frictional forces as well as forces of constraint which may be conservative or non-conservative. This separation allows the force to be written as
$$\mathbf{F} = -\nabla U + \mathbf{F}^{EX} \tag{6.8}$$
Along each of the $x_i$ axes,
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{x}_i} = -\frac{\partial U}{\partial x_i} + F_{x_i}^{EX} \tag{6.9}$$
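Equation (6.9) can be extended by transforming the cartesian coordinate $x_i$ to the generalized coordinates $q_j$.
Define the standard Lagrangian to be the difference between the kinetic energy and the potential energy, which can be written in terms of the generalized coordinates $q_j$ as $L \equiv T - U$. If the potential depends only on the generalized coordinates, so that $\frac{\partial U}{\partial\dot{q}_j} = 0$, then
$$\frac{\partial L}{\partial\dot{q}_j} = \frac{\partial T}{\partial\dot{q}_j} - \frac{\partial U}{\partial\dot{q}_j} = \frac{\partial T}{\partial\dot{q}_j} \tag{6.11}$$
Using the above equations allows Newton's equation of motion (6.9) to be expressed as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{EX} \tag{6.12}$$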
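A comparison of equations (6.12) and (6.4) shows that the holonomic constraint forces $F_j^{HC}$ that are contained in the excluded force $Q_j^{EX}$ can be identified with the Lagrange multiplier term in equation 6.4.
$$F_j^{HC} \equiv \sum_k^m \lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \tag{6.14}$$
That is, the Lagrange multiplier terms can be used to account for holonomic constraint forces $F_j^{HC}$. Thus equation 6.12 can be written as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = \sum_k^m \lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \tag{6.15}$$
where the Lagrange multiplier term accounts for holonomic constraint forces, and $Q_j^{EXC}$ includes all the remaining forces that are not accounted for by the scalar potential $U$, or the Lagrange multiplier terms $F_j^{HC}$.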
For holonomic, conservative forces it is possible to absorb all the forces into the potential $U$ plus the Lagrange multiplier term, that is $Q_j^{EXC} = 0$. Moreover, the use of a minimal set of generalized coordinates allows the holonomic constraint forces to be ignored by explicitly reducing the number of coordinates from $n$ dependent coordinates to $s = n - m$ independent generalized coordinates. That is, the correlations due to the constraint forces are embedded into the generalized coordinates. Then equation 6.15 reduces to the basic Euler differential equations.
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = 0 \tag{6.16}$$
Note that equation 6.16 is identical to Euler's equation 5.34, if the independent variable $x$ is replaced by time $t$. Thus Newton's equations of motion are equivalent to minimizing the action integral $S = \int_{t_1}^{t_2} L\,dt$, that is
$$\delta S = \delta\int_{t_1}^{t_2} L(q_j, \dot{q}_j; t)\,dt = 0 \tag{6.17}$$
which is Hamilton's Principle. Hamilton's Principle underlies many aspects of physics and now it is used as the starting point for developing classical mechanics. Hamilton's Principle was postulated 46 years after Lagrange introduced Lagrangian mechanics.
The above plausibility argument, which is based on Newtonian mechanics, illustrates the close connection
between the vectorial Newtonian mechanics and the algebraic Lagrangian mechanics approaches to classical
mechanics.
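The equivalence between Newton's equations and a stationary action can be illustrated numerically. The sketch below assumes the simplest case, a particle in uniform gravity with $L = \frac{1}{2}m\dot{x}^2 - mgx$, and compares a discretized action for the Newtonian path with that for paths displaced by $\pm\epsilon\sin(\pi t/T)$, which leave the end points fixed. The parameter values and the names `action`, `newton_path` and `varied_path` are illustrative assumptions.

```python
import math

def action(path, T, m, g, n=2000):
    """Discretized action: sum of (kinetic - potential) * dt over n time steps."""
    dt = T / n
    S = 0.0
    for i in range(n):
        x0, x1 = path(i * dt), path((i + 1) * dt)
        v = (x1 - x0) / dt                      # velocity on this step
        S += (0.5 * m * v * v - m * g * 0.5 * (x0 + x1)) * dt
    return S

m, g, T, v0 = 1.0, 9.81, 1.0, 3.0   # assumed sample values

def newton_path(t):
    """Solution of Newton's equation for launch speed v0 from x = 0."""
    return v0 * t - 0.5 * g * t * t

def varied_path(eps):
    """Newtonian path plus a displacement that vanishes at t = 0 and t = T."""
    return lambda t: newton_path(t) + eps * math.sin(math.pi * t / T)

S0 = action(newton_path, T, m, g)
S_plus = action(varied_path(0.1), T, m, g)
S_minus = action(varied_path(-0.1), T, m, g)
print(S0, S_plus - S0, S_minus - S0)
```

Both displaced paths give a larger action, and the increase is the same for $+\epsilon$ and $-\epsilon$: the first-order change vanishes and only the second-order term, $m\epsilon^2\pi^2/(4T)$ for this displacement, remains.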
$$\sum_i \mathbf{F}_i^A\cdot\delta\mathbf{r}_i + \sum_i \mathbf{f}_i^C\cdot\delta\mathbf{r}_i = 0 \tag{6.19}$$
The second term in equation 6.19 can be ignored if the virtual work due to the constraint forces is zero. This is rigorously true for rigid bodies and is valid for any forces of constraint where the constraint forces are perpendicular to the constraint surface and the virtual displacement is tangent to this surface. Thus if the constraint forces do no work, then (6.19) reduces to
$$\sum_i \mathbf{F}_i^A\cdot\delta\mathbf{r}_i = 0 \tag{6.20}$$
This relation is Bernoulli's Principle of Static Virtual Work and is used to solve problems in statics.
Bernoulli introduced dynamics by using Newton's Law to relate force and momentum.
$$\mathbf{F}_i = \dot{\mathbf{p}}_i \tag{6.21}$$
For the special case where the virtual work done by the forces of constraint is zero, equation 6.24 reduces to d'Alembert's Principle
$$\sum_i\left(\mathbf{F}_i^A - \dot{\mathbf{p}}_i\right)\cdot\delta\mathbf{r}_i = 0 \tag{6.25}$$
d'Alembert's Principle, by a stroke of genius, transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual work to statics primarily leads to algebraic equations between the forces, whereas d'Alembert's principle applied to dynamics leads to differential equations.
6.3. LAGRANGE EQUATIONS FROM D’ALEMBERT’S PRINCIPLE 139
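The arbitrary virtual displacement $\delta\mathbf{r}_i$ can be related to the virtual displacement of the generalized coordinate
$\delta q_j$ by
$$\delta\mathbf{r}_i = \sum_j \frac{\partial \mathbf{r}_i}{\partial q_j}\,\delta q_j \qquad (6.28)$$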
Note that by definition, a virtual displacement considers only displacements of the coordinates, and no time
variation is involved.
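The above transformations can be used to express d’Alembert’s dynamical principle of virtual work in
generalized coordinates. Thus the first term in d’Alembert’s Dynamical Principle, (6.25) becomes
$$\sum_i \mathbf{F}_i^A \cdot \delta\mathbf{r}_i = \sum_{i,j} \mathbf{F}_i^A \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}\,\delta q_j = \sum_j Q_j\,\delta q_j \qquad (6.29)$$
Note that just as the generalized coordinates $q_j$ need not have the dimensions of length, so the $Q_j$ do not
necessarily have the dimensions of force, but the product $Q_j\,\delta q_j$ must have the dimensions of work. For
example, $Q_j$ could be torque and $\delta q_j$ could be the corresponding infinitesimal rotation angle.
The second term in d’Alembert’s Principle (6.25) can be transformed using equation 6.28
$$\sum_i \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_i m_i \ddot{\mathbf{r}}_i \cdot \delta\mathbf{r}_i = \sum_j \left( \sum_i m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \delta q_j \qquad (6.31)$$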
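The second right-hand term in (6.32) can be rewritten by interchanging the order of the differentiation with
respect to $t$ and $q_j$
$$\frac{d}{dt}\left(\frac{\partial \mathbf{r}_i}{\partial q_j}\right) = \frac{\partial \mathbf{v}_i}{\partial q_j} \qquad (6.35)$$
Substituting (6.34) and (6.35) into (6.32) gives
$$\sum_i \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_j \left( \sum_i m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \delta q_j = \sum_j \sum_i \left\{ \frac{d}{dt}\left( m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial \dot{q}_j} \right) - m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial q_j} \right\} \delta q_j \qquad (6.36)$$
Inserting (6.29) and (6.36) into d’Alembert’s Principle (6.25) leads to the relation
$$\sum_i (\mathbf{F}_i^A - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = -\sum_j \left\{ \frac{d}{dt}\left( \frac{\partial}{\partial \dot{q}_j}\left( \sum_i \frac{1}{2} m_i v_i^2 \right) \right) - \frac{\partial}{\partial q_j}\left( \sum_i \frac{1}{2} m_i v_i^2 \right) - Q_j \right\} \delta q_j = 0 \qquad (6.37)$$
The $\sum_i \frac{1}{2} m_i v_i^2$ term can be identified with the system kinetic energy $T$. Thus d’Alembert’s Principle reduces
to the relation
$$\sum_j \left[ \left\{ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial T}{\partial q_j} \right\} - Q_j \right] \delta q_j = 0 \qquad (6.38)$$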
For cartesian coordinates $T$ is a function only of velocities $(\dot{x}, \dot{y}, \dot{z})$ and thus the term $\frac{\partial T}{\partial q_j} = 0$. However,
as discussed in appendix 22, for curvilinear coordinates $\frac{\partial T}{\partial q_j} \neq 0$ due to the curvature of the coordinates
as is illustrated for polar coordinates where $\mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}}$.
If all the $n$ generalized coordinates are independent, then equation 6.38 implies that the term in the
square brackets is zero for each individual value of $j$. This leads to the $n$ basic Euler-Lagrange equations of
motion for each of the independent generalized coordinates
$$\left\{ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial T}{\partial q_j} \right\} = Q_j \qquad (6.39)$$
where $n \geq j \geq 1$. That is, this leads to $n$ Euler-Lagrange equations of motion for the generalized forces $Q_j$.
As discussed in chapter 5.8, when $m$ holonomic constraint forces apply, it is possible to reduce the system
to $s = n - m$ independent generalized coordinates for which equation 6.25 applies.
In 1687 Leibniz proposed minimizing the time integral of his “vis viva”, which equals $2T$. That is,
$$\delta \int_{t_1}^{t_2} T\,dt = 0 \qquad (6.40)$$
The variational equation 6.39 accomplishes the minimization of equation 6.40. It is remarkable that Leibniz
anticipated the basic variational concept prior to the birth of the developers of Lagrangian mechanics, i.e.,
d’Alembert, Euler, Lagrange, and Hamilton.
6.3.3 Lagrangian
The handling of both conservative and non-conservative generalized forces $Q_j$ is best achieved by assuming
that the generalized force $Q_j = \sum_i \mathbf{F}_i^A \cdot \frac{\partial \bar{\mathbf{r}}_i}{\partial q_j}$ can be partitioned into a conservative velocity-independent term,
that can be expressed in terms of the gradient of a scalar potential, $-\nabla U_j$, plus an excluded generalized force
$Q_j^{EX}$ which contains the non-conservative, velocity-dependent, and all the constraint forces not explicitly
included in the potential $U_j$. That is,
$$Q_j = -\nabla U_j + Q_j^{EX} \qquad (6.41)$$
Inserting (6.41) into (6.38), and assuming that the potential $U$ is velocity independent, allows (6.38) to be
rewritten as
$$\sum_j \left[ \left\{ \frac{d}{dt}\left( \frac{\partial (T - U)}{\partial \dot{q}_j} \right) - \frac{\partial (T - U)}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0 \qquad (6.42)$$
6.4. LAGRANGE EQUATIONS FROM HAMILTON’S PRINCIPLE 141
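$$L \equiv T - U \qquad (6.43)$$
$$\sum_j \left[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0 \qquad (6.44)$$
Note that equation (6.44) contains the basic Euler-Lagrange equation (6.38) as a special case when $U = 0$.
In addition, note that if all the generalized coordinates are independent, then the square bracket terms are
zero for each value of $j$, which leads to the $n$ general Euler-Lagrange equations of motion
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX} \qquad (6.45)$$
where $n \geq j \geq 1$.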
Chapter 6.5.3 will show that the holonomic constraint forces can be factored out of the generalized force
term $Q_j^{EX}$, which simplifies derivation of the equations of motion using Lagrangian mechanics. The general
Euler-Lagrange equations of motion are used extensively in classical mechanics because conservative forces
play a ubiquitous role in classical mechanics.
has a minimum value for the correct path of motion. As discussed in chapter 13.2, the choice of Lagrangian
usually is limited to a function of the generalized coordinates $\mathbf{q}$, and their velocities $\dot{\mathbf{q}}$, plus time $t$. At this
stage the discussion is restricted to use of the standard Lagrangian $L \equiv T - U$. Hamilton’s Principle can
be written in terms of virtual infinitesimal displacement $\delta$ as
$$\delta S = \delta \int_{t_1}^{t_2} L\,dt = 0 \qquad (6.47)$$
Variational calculus therefore implies that a system of $n$ independent generalized coordinates must satisfy
the basic Lagrange-Euler equations
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0 \qquad (6.48)$$
This is precisely the conclusion given in equation 6.45 when $Q_j^{EX} = 0$, which was derived using d’Alembert’s
Principle.
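As a numerical illustration of Hamilton’s Principle, the following minimal Python sketch discretizes the action $S = \int (T - U)\,dt$ for a mass moving vertically in a uniform gravitational field, and compares the Newtonian path with neighbouring paths that share the same end points. The function names, the time slicing, and the sinusoidal variation are illustrative assumptions, not part of the text; the sketch only suggests that the action is stationary (here a minimum) on the Newtonian path.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum over time slices of (T - U) * dt for a vertical path y(t)."""
    total = 0.0
    for y0, y1 in zip(path[:-1], path[1:]):
        v = (y1 - y0) / dt            # mean velocity on the slice
        y_mid = 0.5 * (y0 + y1)       # mean height on the slice
        total += (0.5 * m * v * v - m * g * y_mid) * dt
    return total

def newtonian_path(n, t_total, g=9.81):
    """Free-fall path y = (g/2) t (t_total - t), which starts and ends at y = 0."""
    dt = t_total / n
    return [0.5 * g * (i * dt) * (t_total - i * dt) for i in range(n + 1)]

def perturbed(path, eps):
    """Add the variation eps * sin(pi * i / n), which vanishes at both end points."""
    n = len(path) - 1
    return [y + eps * math.sin(math.pi * i / n) for i, y in enumerate(path)]

if __name__ == "__main__":
    n, t_total = 200, 2.0
    dt = t_total / n
    path = newtonian_path(n, t_total)
    for eps in (-0.1, 0.0, 0.1):
        print(eps, action(perturbed(path, eps), dt))
```

Because the first-order change in $S$ vanishes on the Newtonian path, variations of $+\epsilon$ and $-\epsilon$ raise the action by the same second-order amount.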
This discussion has demonstrated that Euler’s variational differential equation underlies both the
differential variational d’Alembert Principle, and the integral Hamilton’s Principle. These approaches have
been used to derive the most general Lagrange equations that are applicable to both holonomic and
non-holonomic constraints, as well as for conservative and non-conservative systems. Chapter 6.2 presented a
plausibility argument that illustrated that the same result is justified based on Newtonian mechanics.
However, d’Alembert’s Principle and Hamilton’s Principle, expressed in terms of generalized coordinates, are
broader in scope than the equations of motion implied using Newtonian mechanics.
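For the case of $s = n - m$ unknowns, any virtual displacement $\delta q_j$ is independent of $\delta q_k$, therefore the
only way for (6.44) to hold is for the term in brackets to vanish for each value of $j$, that is
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX} \qquad (6.50)$$
where $j = 1, 2, 3, \ldots, s$. These are the Lagrange equations for the minimal set of $s$ independent generalized
coordinates.
If all the generalized forces are conservative plus velocity independent, and are included in the potential
$U$ and $Q_j^{EX} = 0$, then (6.50) simplifies to
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = 0 \qquad (6.51)$$
where $j = 1, 2, 3, \ldots, s$. Kinematic constraints can be expressed in terms of the infinitesimal displacements
of the form
$$\sum_{j=1}^{n} \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)\,dq_j + \frac{\partial g_k}{\partial t}\,dt = 0 \qquad (6.53)$$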
, described by the vector $\mathbf{q}$, that are derived from the equations of constraint. As discussed in chapter 5.7,
if (6.53) represents the total differential of a function, then it can be integrated to give a holonomic relation
of the form of equation (6.52). However, if (6.53) is not the total differential, then it can be integrated only
after having solved the full problem. If $\frac{\partial g_k}{\partial t} = 0$ then the constraint is scleronomic.
The discussion of Lagrange multipliers in chapter 5.9.1 showed that, for virtual displacements $\delta q_j$,
the correlation of the generalized coordinates, due to the constraint forces, can be taken into account by
multiplying (6.53) by unknown Lagrange multipliers $\lambda_k$ and summing over all $m$ constraints. Generalized
forces can be partitioned into a Lagrange multiplier term plus a remainder force. That is
$$Q_j^{EX} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \qquad (6.54)$$
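where $Q_j^{EXC}$ is the remaining part of the generalized force $Q_j$ after subtracting both the part of the force
absorbed in the potential energy $U$, which is buried in the Lagrangian $L$, as well as the holonomic constraint
forces which are included in the Lagrange multiplier terms $\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$. The $m$ Lagrange multipliers
$\lambda_k$ can be chosen arbitrarily in (6.56). Utilizing the free choice of the $m$ Lagrange multipliers $\lambda_k$ allows them
to be determined in such a way that the coefficients of the first $m$ infinitesimals $\delta q_j$, i.e. the square brackets,
vanish. Therefore the expression in the square bracket must vanish for each value of $1 \leq j \leq m$. Thus it
follows that
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} = 0 \qquad (6.57)$$
when $j = 1, 2, \ldots, m$. Thus (6.56) reduces to a sum over the remaining coordinates between $m + 1 \leq j \leq n$
$$\sum_{j=m+1}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} \right] \delta q_j = 0 \qquad (6.58)$$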
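In equation (6.58) the $s = n - m$ infinitesimals $\delta q_j$ can be chosen freely since the $s = n - m$ degrees
of freedom are independent. Therefore the expression in the square bracket must vanish for each value of
$m + 1 \leq j \leq n$. Thus it follows that
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} = 0 \qquad (6.59)$$
where $j = m + 1, m + 2, \ldots, n$. Combining equations (6.57) and (6.59) then gives the important general relation
that for $1 \leq j \leq n$
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \qquad (6.60)$$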
To summarize, the Lagrange multiplier approach (6.60) automatically solves the $n$ equations plus the $m$
holonomic equations of constraint, which determines the $n + m$ unknowns, that is, the $n$ coordinates
plus the $m$ forces of constraint. The beauty of the Lagrange multipliers is that all $n$ variables, plus the $m$
constraint forces, are found simultaneously by using the calculus of variations to determine the extremum
for the expanded Lagrangian $L'(\mathbf{q}, \dot{\mathbf{q}}, \boldsymbol{\lambda})$.
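is the sum of the components in the $q_j$ direction for all external forces that have not been taken into account
by the scalar potential or the Lagrange multipliers. Thus the non-conservative generalized force $Q_j^{EXC}$
contains non-holonomic constraint forces, including dissipative forces such as drag or friction, that are not
included in $U$, or used in the Lagrange multiplier terms to account for the holonomic constraint forces.
The concept of generalized forces is illustrated by the case of spherical coordinate systems. The attached
table gives the displacement elements $\delta q_j$ (taken from table 4) and the generalized force for the three
coordinates. Note that $Q_r$ has the dimensions of force and $Q_r\,\delta r$ has the units of energy. By contrast
equation 6.30 gives that $Q_\theta = F_\theta r$ and $Q_\phi = F_\phi r \sin\theta$, which have the dimensions of torque. However, $Q_\theta\,\delta\theta$ and
$Q_\phi\,\delta\phi$ both have the dimensions of energy as is required in equation 6.30. This illustrates that the units used
for generalized forces depend on the units of the corresponding generalized coordinate.
Unit vectors    $\delta q_j$       $\delta\mathbf{r}_j$                                  $Q_j$                                          $Q_j \cdot \delta q_j$
$\hat{\mathbf{r}}$               $\delta r$         $\hat{\mathbf{r}}\,\delta r$                          $\mathbf{F} \cdot \hat{\mathbf{r}}$                          $F_r\,\delta r$
$\hat{\boldsymbol{\theta}}$      $\delta\theta$     $\hat{\boldsymbol{\theta}}\,r\,\delta\theta$          $\mathbf{F} \cdot \hat{\boldsymbol{\theta}}\,r$              $F_\theta\,r\,\delta\theta$
$\hat{\boldsymbol{\phi}}$        $\delta\phi$       $\hat{\boldsymbol{\phi}}\,r \sin\theta\,\delta\phi$   $\mathbf{F} \cdot \hat{\boldsymbol{\phi}}\,r \sin\theta$     $F_\phi\,r \sin\theta\,\delta\phi$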
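The spherical-coordinate generalized forces can be checked numerically from the definition $Q_j = \mathbf{F} \cdot \partial\mathbf{r}/\partial q_j$. The sketch below is an illustrative assumption rather than part of the text: it uses the standard convention $\mathbf{r} = (r\sin\theta\cos\phi,\ r\sin\theta\sin\phi,\ r\cos\theta)$, a central finite difference for the partial derivatives, and hypothetical function names.

```python
import math

def position(q):
    """Cartesian position for spherical coordinates q = (r, theta, phi)."""
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def generalized_forces(force, q, h=1e-6):
    """Q_j = F . (d r / d q_j), with the derivative taken by central differences."""
    result = []
    for j in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[j] += h
        q_minus[j] -= h
        r_plus, r_minus = position(q_plus), position(q_minus)
        dr_dq = [(a - b) / (2 * h) for a, b in zip(r_plus, r_minus)]
        result.append(sum(f * d for f, d in zip(force, dr_dq)))
    return result

def unit_vectors(q):
    """Spherical unit vectors (r_hat, theta_hat, phi_hat) at the point q."""
    _, theta, phi = q
    r_hat = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))
    theta_hat = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    phi_hat = (-math.sin(phi), math.cos(phi), 0.0)
    return r_hat, theta_hat, phi_hat
```

For any force vector, `generalized_forces` should reproduce $Q_r = F_r$, $Q_\theta = F_\theta r$ and $Q_\phi = F_\phi r \sin\theta$, so that $Q_r$ has the dimensions of force while $Q_\theta$ and $Q_\phi$ have the dimensions of torque.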
for $n$ variables, with $m$ equations of constraint. The generalized forces $Q_j^{EXC}$ are not included in the
conservative, potential energy $U$, or the Lagrange multipliers approach for holonomic equations of constraint.2
The following is a logical procedure for applying the Euler-Lagrange equations to classical mechanics.
In summary, Lagrangian mechanics is based on energies, which are scalars, in contrast to Newtonian
mechanics, which is based on vector forces and momentum. As a consequence, Lagrangian mechanics allows
use of any set of independent generalized coordinates, which do not have to be orthogonal, and they can
have very different units for different variables. The generalized coordinates can incorporate the correlations
introduced by constraint forces.
The active forces are split into the following three categories:
1. Velocity-independent conservative forces are taken into account using scalar potentials $U$.
2. Holonomic constraint forces can be determined using Lagrange multipliers.
3. Non-holonomic constraints require use of generalized forces $Q_j^{EXC}$.
Use of the concept of scalar potentials is a trivial and powerful way to incorporate conservative forces in
Lagrangian mechanics. The Lagrange multipliers approach requires using the Euler-Lagrange equations for
$n + m$ coordinates but determines both the holonomic constraint forces and the equations of motion simultaneously.
Non-holonomic constraints and dissipative forces can be incorporated into Lagrangian mechanics via use of
generalized forces, which broadens the scope of Lagrangian mechanics.
Note that the equations of motion resulting from the Lagrange-Euler algebraic approach are the same
equations of motion as obtained using Newtonian mechanics. However, the Lagrangian is a scalar which
facilitates rotation into the most convenient frame of reference, and can greatly simplify determination of
the equations of motion when constraint forces apply. As discussed in chapter 14, the Lagrangian and the
Hamiltonian variational approaches to mechanics are the only viable way to handle relativistic, statistical,
and quantum mechanics.
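$$\frac{\partial L}{\partial \dot{x}} = m\dot{x}, \qquad \frac{\partial L}{\partial \dot{y}} = m\dot{y}, \qquad \frac{\partial L}{\partial \dot{z}} = m\dot{z}$$
$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial z} = 0$$
Inserting these in the Lagrange equation gives
$$\Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = \dot{p}_x - 0 = 0$$
Thus
$$p_x = m\dot{x} = \text{constant}$$
$$p_y = m\dot{y} = \text{constant}$$
$$p_z = m\dot{z} = \text{constant}$$
That is, this shows that the linear momentum is conserved if $U$ is a constant, that is, no forces apply. Note
that momentum conservation has been derived without any direct reference to forces.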
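$$L = \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right) - mgy$$
Using the Lagrange equation for the $x$ coordinate gives
$$\Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = \frac{d(m\dot{x})}{dt} - 0 = 0$$
Thus the horizontal momentum $m\dot{x}$ is conserved and $\ddot{x} = 0$. The $y$ coordinate gives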
The importance of selecting the most convenient generalized coordinates is nicely illustrated by trying to
solve this problem using polar coordinates $r, \theta$ where $r$ is radial distance and $\theta$ the elevation angle from the
$x$ axis as shown in the adjacent figure. Then
$$T = \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m \left( r\dot{\theta} \right)^2$$
$$U = mgr \sin\theta$$
Thus
$$L = \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m \left( r\dot{\theta} \right)^2 - mgr \sin\theta$$
$\Lambda_r L = 0$ for the $r$ coordinate
$$mr\dot{\theta}^2 - mg \sin\theta - m\ddot{r} = 0$$
$\Lambda_\theta L = 0$ for the $\theta$ coordinate
$$-mgr \cos\theta - 2mr\dot{r}\dot{\theta} - mr^2\ddot{\theta} = 0$$
These equations written in polar coordinates are more complicated than the result expressed in cartesian
coordinates. This is because the potential energy depends directly on the $y$ coordinate, whereas it is a function
of both $r$ and $\theta$. This illustrates both the freedom for using different generalized coordinates, plus the importance
of choosing a sensible set of generalized coordinates.
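The equivalence of the two coordinate choices can be checked numerically. The sketch below integrates the polar equations of motion $\ddot{r} = r\dot{\theta}^2 - g\sin\theta$ and $\ddot{\theta} = -(g\cos\theta + 2\dot{r}\dot{\theta})/r$ with a fourth-order Runge-Kutta step and compares the end point with the closed-form cartesian solution $x = x_0 + v_x t$, $y = y_0 + v_y t - \frac{1}{2}gt^2$. The integrator, the function names, and the initial conditions are illustrative assumptions; the starting point is chosen away from the origin, where polar coordinates are singular.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)

def polar_rhs(state):
    """Time derivatives of (r, theta, r_dot, theta_dot) from the polar Lagrange equations."""
    r, th, r_dot, th_dot = state
    r_ddot = r * th_dot ** 2 - G * math.sin(th)
    th_ddot = -(G * math.cos(th) + 2.0 * r_dot * th_dot) / r
    return (r_dot, th_dot, r_ddot, th_ddot)

def rk4_step(f, s, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(s)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(s, k1)))
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(s, k2)))
    k4 = f(tuple(x + dt * k for x, k in zip(s, k3)))
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

def polar_end_point(x0, y0, vx, vy, t_end, n):
    """Integrate in polar coordinates, then convert the final state back to (x, y)."""
    r, th = math.hypot(x0, y0), math.atan2(y0, x0)
    r_dot = vx * math.cos(th) + vy * math.sin(th)
    th_dot = (-vx * math.sin(th) + vy * math.cos(th)) / r
    s = (r, th, r_dot, th_dot)
    dt = t_end / n
    for _ in range(n):
        s = rk4_step(polar_rhs, s, dt)
    return s[0] * math.cos(s[1]), s[0] * math.sin(s[1])

def cartesian_end_point(x0, y0, vx, vy, t):
    """Closed-form solution of the cartesian equations x'' = 0, y'' = -g."""
    return x0 + vx * t, y0 + vy * t - 0.5 * G * t * t
```

Both routes should give the same trajectory to within the integration error, even though the polar equations are coupled and nonlinear while the cartesian ones are trivial.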
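$$U = mg(l - y) \sin\alpha$$
$$g_1 = y - R\theta = 0$$
$$g_2 = x - R = 0$$
A holonomic constraint can be used to reduce the system to a single generalized coordinate $y$ plus generalized
velocity $\dot{y}$. Expressed in terms of this single generalized coordinate, the Lagrangian becomes
$$L = \frac{1}{2}\left( m + \frac{I}{R^2} \right) \dot{y}^2 - mg(l - y) \sin\alpha$$
$$g_1 = y - R\theta = 0$$
$$g_2 = x - R = 0$$
which gives
$$I\ddot{\theta} = -\lambda_1 R$$
The constraint can be written as
$$\ddot{y} = R\ddot{\theta}$$
Let $I = \frac{1}{2} mR^2$ and solve for $\ddot{y}$ and $\lambda_1$, which gives
$$\lambda_1 = -\frac{mg \sin\alpha}{\left( 1 + \frac{mR^2}{I} \right)} = -\frac{mg}{3} \sin\alpha$$
The frictional force is given by
$$F = \lambda_1 \frac{\partial g_1}{\partial y} = \lambda_1 = -\frac{mg}{3} \sin\alpha$$
Also
$$\ddot{y} = g \sin\alpha + \frac{\lambda_1}{m} = \frac{2}{3} g \sin\alpha$$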
and the torque is
−1 = = ̈
The four methods for handling the equations of constraint all are equivalent and result in the same
equations of motion. The scalar Lagrangian mechanics is able to calculate the vector forces acting in a direct
and simple way. The Newton’s law approach is more intuitive for this simple case and the ease and power
of the Lagrangian approach is not apparent for this simple system.
The following series of examples will gradually increase in complexity, and will illustrate the power,
elegance, plus superiority of the Lagrangian approach compared with the Newtonian approach.
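For the rolling example above, the Lagrange multiplier equations form a small linear system: $m\ddot{y} - \lambda_1 = mg\sin\alpha$, $I\ddot{\theta} + \lambda_1 R = 0$, and the rolling constraint $\ddot{y} = R\ddot{\theta}$. The sketch below solves it by elimination. The function name and the `inertia_factor` parameter (with $I = \text{inertia\_factor} \cdot mR^2$, equal to $\frac{1}{2}$ for a uniform disk) are illustrative assumptions.

```python
import math

def rolling_disk(m, R, alpha, inertia_factor=0.5, g=9.81):
    """Return (y_ddot, theta_ddot, lambda_1) for a body rolling down an incline of angle alpha.

    Solves  m*y_ddot - lambda_1 = m*g*sin(alpha),
            I*theta_ddot + lambda_1*R = 0,
            y_ddot = R*theta_ddot            (rolling constraint)
    with I = inertia_factor * m * R**2.
    """
    inertia = inertia_factor * m * R * R
    # Eliminating lambda_1 and theta_ddot gives (m + I/R^2) * y_ddot = m*g*sin(alpha).
    y_ddot = m * g * math.sin(alpha) / (m + inertia / R ** 2)
    theta_ddot = y_ddot / R
    lambda_1 = -inertia * theta_ddot / R   # constraint (friction) force along the plane
    return y_ddot, theta_ddot, lambda_1

if __name__ == "__main__":
    print(rolling_disk(m=2.0, R=0.3, alpha=math.radians(30.0)))
```

With `inertia_factor=0.5` this reproduces $\ddot{y} = \frac{2}{3} g \sin\alpha$ and $\lambda_1 = -\frac{1}{3} mg \sin\alpha$; other values of `inertia_factor` (for example $\frac{2}{5}$ for a solid sphere) give the corresponding results for other rolling bodies.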
6.8. APPLICATIONS TO SYSTEMS INVOLVING HOLONOMIC CONSTRAINTS 151
1 + 2 − = 0
2 = − 1
1 = 1 sin 1
2 = ( − 1 ) sin 2
The conservative gravitational force is absorbed into the potential energy given by
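$$x = l \cos\theta$$
$$y = l \sin\theta$$
Thus
$$\dot{x} = -l (\sin\theta)\,\dot{\theta}$$
$$\dot{y} = l (\cos\theta)\,\dot{\theta}$$
[Figure: Two frictionless masses that are connected by a bar and are constrained to slide in vertical and horizontal channels.]
This constraint, that is absorbed into the generalized coordinate, is holonomic, scleronomic, and conservative.
The kinetic energy is given by
$$T = \frac{1}{2} m \left( l^2 (\sin\theta)^2 \dot{\theta}^2 + l^2 (\cos\theta)^2 \dot{\theta}^2 \right) = \frac{1}{2} m l^2 \dot{\theta}^2$$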
The gravitational potential energy is given by
= −0 sin
= ̇ + ̇ cos
= −̇ sin
( + ) ̈ + ̈ cos = 0 y
7
̈ cos + ̈ − sin = 0
5 .
Eliminating ̈ gives
µ ¶
7 cos2 sin
− ̈ =
5 +
Integrate this equation assuming the initial conditions,
results in
5 ( + ) sin x
= 2 y
2 [7 ( + ) − 5 cos2 ]
x
Thus Solid sphere rolling without slipping on an
cos 5 sin (2) inclined plane on a frictionless horizontal floor.
=− = 2
+ 4 [7 ( + ) − 5 cos2 ]
Note that these equations predict conservation of linear
momentum for the block plus sphere.
2 ̇ = constant =
$$\ddot{\theta} + \frac{g}{b} \sin\theta - \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta} = 0$$
There are many possible solutions depending on the initial conditions. The pendulum can just oscillate
in the $\theta$ direction, or rotate in the $\phi$ direction or some combination of these. Note that if $p_\phi$ is zero, then
the equation reduces to the simple harmonic pendulum, while the other extreme is when $\ddot{\theta} = 0$ for which the
motion is that of a conical pendulum that rotates at a constant angle $\theta_0$ to the vertical axis.
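The two limiting cases can be explored numerically. The sketch below assumes the polar-angle equation of motion in the form $\ddot{\theta} = -\frac{g}{b}\sin\theta + \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta}$, where $b$ is the pendulum length and $p_\phi$ the conserved angular momentum about the vertical axis. Setting $\ddot{\theta} = 0$ gives the conical-pendulum value $p_\phi^2 = m^2 g b^3 \sin^4\theta_0 / \cos\theta_0$. The function names and the Runge-Kutta integrator are illustrative assumptions.

```python
import math

def theta_ddot(theta, p_phi, m, b, g=9.81):
    """Angular acceleration of the polar angle for a spherical pendulum of length b."""
    gravity = -(g / b) * math.sin(theta)
    if p_phi == 0.0:
        return gravity                      # plane pendulum: no centrifugal term
    return gravity + p_phi ** 2 * math.cos(theta) / (m ** 2 * b ** 4 * math.sin(theta) ** 3)

def conical_p_phi(theta0, m, b, g=9.81):
    """Angular momentum for which theta stays fixed at theta0 (conical pendulum)."""
    return m * math.sqrt(g * b ** 3 * math.sin(theta0) ** 4 / math.cos(theta0))

def integrate_theta(theta0, p_phi, m, b, t_end, n, g=9.81):
    """RK4 integration of theta(t), starting from rest in theta; returns (theta, theta_dot)."""
    th, om = theta0, 0.0
    dt = t_end / n
    acc = lambda angle: theta_ddot(angle, p_phi, m, b, g)
    for _ in range(n):
        k1v, k1a = om, acc(th)
        k2v, k2a = om + 0.5 * dt * k1a, acc(th + 0.5 * dt * k1v)
        k3v, k3a = om + 0.5 * dt * k2a, acc(th + 0.5 * dt * k2v)
        k4v, k4a = om + dt * k3a, acc(th + dt * k3v)
        th += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        om += dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
    return th, om
```

Starting at $\theta_0$ with `p_phi = conical_p_phi(theta0, m, b)` should leave $\theta$ unchanged, while `p_phi = 0` with a small initial angle should reproduce simple harmonic motion with period close to $2\pi\sqrt{b/g}$.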
= cos
= sin
The angular momentum $p_\theta = mr^2\dot{\theta}$, thus the equation of motion can be written as
The last term in the right-hand side is the Coriolis force caused by the time variation of the pendulum length.
For the radial distance $r$ the Lagrange equation $\Lambda_r L = 0$ gives
$$m\ddot{r} = mr\dot{\theta}^2 + mg \cos\theta - k(r - r_0)$$
This equation just equals the tension in the spring, i.e. = ̈. The first term on the right-hand side
represents the centrifugal radial acceleration, the second term is the component of the gravitational force,
and the third term represents Hooke’s Law for the spring. For small amplitudes of the motion appears as
a superposition of harmonic oscillations in the plane.
In this example the orthogonal coordinate approach used gave the tension in the spring thus it is unnec-
essary to repeat this using the Lagrange multiplier approach.
− = 1 2 (a)
³ ̇ ´
2
̈ − ̇ = 1 2
For Λ =
³ 2 ´
̇ = ̇ = 0 (b)
Thus the angular momentum is conserved, that is, it is a constant of motion.
For Λ =
̈ = − − 1 (c)
and the time differential of the constraint equation is
2̇ − ̇ = 0 (d)
The above four equations of motion can be used to determine 1
2
The
√ radius of the circle at the intersection of the plane = with the paraboloid = is given by
0 = For a constant height = , then ̈ = 0 and equation (c) reduces to
1 = −
Therefore the constraint force is given by
( )
= 1 =− 2
Assuming that ̈ = 0 then equation (a) for ̇ = and = 0 gives
¡ ¢
0 − 0 2 = 1 20 = − 20 =
That is, the constraint force equals
= −0 2
which is the usual centripetal force. These relations also give that the initial angular velocity required for
such a stable trajectory with height is r
2
̇ = =
1 ³ 2
´ 1 ³ 2
´
= 1 ̇2 + 2 ̇ + 2 ̇2 + 2 ̇
2 2
³ ´ 1 µ ¶
1 ·2
2 2
= 1 ̇2 + ( − ) ̇ + 2 ̇2 + 2 m2
2 2
[Figure: Mass $m_2$ hanging from a rope that is connected to $m_1$ which slides on a frictionless plane.]
The potential energy in terms of the generalized coordinates relative to the horizontal plane, is
= 0 − 2 cos
6.15 Example: Two connected masses constrained to slide along a moving rod
Consider two identical masses $m$ constrained to move along the axis of a thin straight rod, of mass $M$ and length
$l$, which is free to both translate and rotate. Two identical springs link the two masses to the central point of the
rod. Consider only motions of the system for which the extended lengths of the two springs are equal and opposite
such that the two masses always are equal distances from the center of the rod keeping the center of mass at the
center of the rod. Find the equations of motion for this system.
[Figure: Two identical masses $m$ constrained to slide on a moving rod of mass $M$. The masses are attached to the center of the rod by identical springs each having a spring constant $k$.]
Use a fixed cartesian coordinate system $(x, y, z)$ and a moving frame with the origin $O$ at the center of the
rod with its cartesian coordinates $(x', y', z')$ being parallel to the fixed coordinate frame as shown in the figure. Let
$(r, \theta, \phi)$ be the spherical coordinates of a point referring to the center of the moving $(x', y', z')$ frame as shown in the
figure. Then the two masses $m$ have spherical coordinates $(r, \theta, \phi)$ and $(-r, \theta, \phi)$ in the moving-rod fixed frame. The
frictionless constraints are holonomic.
The kinetic energy of the system is equal to the kinetic energy for all the mass concentrated at the center
of mass plus the kinetic energy about the center of mass. Since $O$ is the center of mass then the kinetic
energy can be separated into three terms
$$T = T_{CM} + T_{mass} + T_{rod}$$
Note that since the kinetic energy is a scalar quantity it is rotationally invariant and thus can be evaluated in
any rotated frame. Thus the kinetic energy of the center of mass is
$$T_{CM} = \frac{1}{2} (M + 2m)(\dot{x}^2 + \dot{y}^2 + \dot{z}^2)$$
The rotational kinetic energy of the two masses in the center of mass frame is
$$T_{mass} = m(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2 \sin^2\theta)$$
The rotational kinetic energy of the rod $T_{rod}$ is a scalar and thus can be evaluated in any rotated frame of
reference fixed with respect to the principal axis system of the rod. The angular velocity of the rod about
$O$ resolved along its principal axes is given by
$$T_{rod} = \frac{1}{2} (I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2) = \frac{1}{24} Ml^2 (\dot{\theta}^2 + \dot{\phi}^2 \sin^2\theta)$$
The only potential energy is due to the two extended springs which are assumed to have the same length
$r$ where $r_0$ is the unstretched length.
$$U = 2 \cdot \frac{1}{2} k(r - r_0)^2 = k(r - r_0)^2$$
Thus the Lagrangian is
$$L = \frac{1}{2} (M + 2m)(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) + m(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2 \sin^2\theta) + \frac{1}{24} Ml^2 (\dot{\theta}^2 + \dot{\phi}^2 \sin^2\theta) - k(r - r_0)^2$$
Using Lagrange’s equations $\Lambda L = 0$ for the generalized coordinates gives
6.9. APPLICATIONS INVOLVING NON-HOLONOMIC CONSTRAINTS 161
$$g(r, \theta) = r - R = 0$$
For the restricted domain where this system is holonomic, it can be solved using generalized coordinates,
generalized forces, Lagrange multipliers, or Newtonian mechanics as illustrated below.
Minimal generalized coordinates:
The minimal number of generalized coordinates reduces the system to one coordinate $\theta$, which does not
determine the constraint force that is needed to know if the constraint applies. Thus this approach is not
useful for solving this partially-holonomic system.
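Generalized forces:
The radial constraint has a corresponding generalized force $Q_r$. The Lagrange equation $\Lambda_r L = Q_r$ gives
$$m\ddot{r} + mg \cos\theta - mr\dot{\theta}^2 = Q_r \qquad (a)$$
The Lagrange equation $\Lambda_\theta L = Q_\theta = 0$ since there is no tangential force for this frictionless system. Therefore
$$mr^2\ddot{\theta} + 2mr\dot{r}\dot{\theta} - mgr \sin\theta = 0 \qquad (b)$$
When constrained to follow the surface of the spherical shell, the system is holonomic, i.e. $r = R$ and
$\dot{r} = \ddot{r} = 0$. Thus the above two equations reduce to
$$mg \cos\theta - mR\dot{\theta}^2 = Q_r \qquad (c)$$
$$mR^2\ddot{\theta} - mgR \sin\theta = 0$$
That is
$$\ddot{\theta} = \frac{g}{R} \sin\theta$$
Integrate to get $\dot{\theta}$ using the fact that
$$\ddot{\theta} = \frac{d\dot{\theta}}{dt} = \frac{d\dot{\theta}}{d\theta}\frac{d\theta}{dt} = \dot{\theta}\frac{d\dot{\theta}}{d\theta}$$
then
$$\int \ddot{\theta}\,d\theta = \int \dot{\theta}\,d\dot{\theta} = \frac{g}{R} \int \sin\theta\,d\theta$$
Therefore
$$\dot{\theta}^2 = \frac{2g}{R} (1 - \cos\theta) \qquad (d)$$
assuming that $\dot{\theta} = 0$ at $\theta = 0$. Substituting equation (d) into equation (c) gives the constraint force, which
is normal to the surface, to be
$$F = Q_r = mg(3 \cos\theta - 2)$$
Note that $F = Q_r = 0$ when $\cos\theta = \frac{2}{3}$, that is $\theta = 48.2^\circ$.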
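Lagrange multipliers:
For the holonomic regime, which obeys the constraint, $g(r, \theta) = r - R = 0$, the Lagrange equation for
$r$ is $\Lambda_r L = \lambda \frac{\partial g}{\partial r}$.
Since $\frac{\partial g}{\partial r} = 1$ then
$$m\ddot{r} + mg \cos\theta - mr\dot{\theta}^2 = \lambda \qquad (a)$$
The Lagrange equation for $\theta$ gives $\Lambda_\theta L = \lambda \frac{\partial g}{\partial \theta} = 0$ since $\frac{\partial g}{\partial \theta} = 0$. Thus
$$mr^2\ddot{\theta} + 2mr\dot{r}\dot{\theta} - mgr \sin\theta = 0 \qquad (b)$$
As above, when constrained to follow the surface of the spherical shell, the system is holonomic, $r = R$
and $\dot{r} = \ddot{r} = 0$. Thus the above two equations reduce to
$$mg \cos\theta - mR\dot{\theta}^2 = \lambda \qquad (c)$$
$$mR^2\ddot{\theta} - mgR \sin\theta = 0 \qquad (d)$$
That is, the answers are identical to those obtained using generalized forces, namely
$$\dot{\theta}^2 = \frac{2g}{R} (1 - \cos\theta) \qquad (d)$$
assuming that $\dot{\theta} = 0$ at $\theta = 0$.
The force of constraint applied by the surface is
$$F = \lambda \frac{\partial g}{\partial r} = \lambda = mg(3 \cos\theta - 2)$$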
Energy conservation:
This problem can be solved using energy conservation
$$\frac{1}{2} mv^2 = mgR[1 - \cos\theta]$$
Thus the centripetal acceleration is
$$\frac{v^2}{R} = 2g[1 - \cos\theta]$$
The normal force to the surface will cancel when the centripetal acceleration equals the radial component of the
gravitational acceleration, that is, when
$$\frac{v^2}{R} = 2g[1 - \cos\theta] = g \cos\theta$$
This occurs when $\cos\theta = \frac{2}{3}$. This is an unusual case where the Newtonian approach is the simplest.
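The departure angle can also be found by direct numerical integration of the constrained equation of motion $\ddot{\theta} = \frac{g}{R}\sin\theta$, monitoring the normal force per unit mass $g\cos\theta - R\dot{\theta}^2$ until it changes sign. This is a minimal sketch under stated assumptions: the function name, the step size, and the small starting angle `theta_start` are illustrative (the top of the sphere is an unstable equilibrium, so the integration starts slightly displaced, with the angular velocity that a start from rest at the top would produce at that angle).

```python
import math

def departure_angle_deg(R=1.0, g=9.81, theta_start=0.01, dt=1e-4):
    """Angle (degrees) at which a mass sliding on a frictionless sphere loses contact."""
    theta = theta_start
    # Angular velocity consistent with starting from rest at the top (energy conservation).
    omega = math.sqrt(2.0 * g / R * (1.0 - math.cos(theta)))
    acc = lambda th: (g / R) * math.sin(th)
    normal = lambda th, om: g * math.cos(th) - R * om * om   # normal force per unit mass
    n_prev, th_prev = normal(theta, omega), theta
    while theta < math.pi / 2:
        # One RK4 step for theta'' = (g/R) sin(theta).
        k1v, k1a = omega, acc(theta)
        k2v, k2a = omega + 0.5 * dt * k1a, acc(theta + 0.5 * dt * k1v)
        k3v, k3a = omega + 0.5 * dt * k2a, acc(theta + 0.5 * dt * k2v)
        k4v, k4a = omega + dt * k3a, acc(theta + dt * k3v)
        theta += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        omega += dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
        n_new = normal(theta, omega)
        if n_new <= 0.0:
            # Linear interpolation for the angle at which the normal force vanishes.
            frac = n_prev / (n_prev - n_new)
            return math.degrees(th_prev + frac * (theta - th_prev))
        n_prev, th_prev = n_new, theta
    raise RuntimeError("normal force never vanished before 90 degrees")

if __name__ == "__main__":
    print(departure_angle_deg(), math.degrees(math.acos(2.0 / 3.0)))
```

The numerical result should agree with $\cos\theta = \frac{2}{3}$ independent of the radius $R$, which can be checked by changing the `R` argument.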
$$g_1 = r - R - a = 0$$
$$g_2 = a(\phi - \theta) - R\theta = 0$$
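The kinetic energy is $T = \frac{1}{2} m \left( \dot{r}^2 + r^2\dot{\theta}^2 \right) + \frac{1}{2} I\dot{\phi}^2$ and the potential energy is $U = mgr \cos\theta$. Thus the
Lagrangian is
$$L = \frac{1}{2} m \left( \dot{r}^2 + r^2\dot{\theta}^2 \right) + \frac{1}{2} I\dot{\phi}^2 - mgr \cos\theta$$
Consider the solution using Lagrange multipliers for the holonomic regime where both constraints are
satisfied and lead to the following differential constraint relations
$$\frac{\partial g_1}{\partial r} = 1 \qquad \frac{\partial g_1}{\partial \theta} = 0 \qquad \frac{\partial g_1}{\partial \phi} = 0$$
$$\frac{\partial g_2}{\partial r} = 0 \qquad \frac{\partial g_2}{\partial \phi} = a \qquad \frac{\partial g_2}{\partial \theta} = -(R + a)$$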
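The Lagrange operator equation $\Lambda_r L$ gives
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{r}} - \frac{\partial L}{\partial r} = \lambda_1 \frac{\partial g_1}{\partial r} + \lambda_2 \frac{\partial g_2}{\partial r}$$
that is
$$m\ddot{r} + mg \cos\theta - mr\dot{\theta}^2 = \lambda_1 \qquad (a)$$
$\Lambda_\theta L$ gives
$$mr^2\ddot{\theta} + 2mr\dot{r}\dot{\theta} - mgr \sin\theta = -\lambda_2 (R + a) \qquad (b)$$
$\Lambda_\phi L$ gives
$$I\ddot{\phi} = \lambda_2 a \qquad (c)$$
Since the center of the sphere rolling on the spherical shell must have
$$r = R + a$$
then
$$\dot{r} = \ddot{r} = 0$$
$$\ddot{\phi} = \frac{r}{a}\ddot{\theta}$$
Substituting this into (c) gives
$$\frac{Ir}{a^2}\ddot{\theta} = \lambda_2$$
Inserting this into equation (b) gives
$$\lambda_2 = \frac{mg \sin\theta}{\left( 1 + \frac{ma^2}{I} \right)}$$
The moment of inertia about the axis of a solid sphere is $I = \frac{2}{5} ma^2$. Then
$$\lambda_2 = \frac{2mg \sin\theta}{7}$$
But also
$$\ddot{\theta} = \dot{\theta}\frac{d\dot{\theta}}{d\theta} = \frac{a^2}{Ir}\lambda_2 = \frac{5}{2mr}\lambda_2 = \frac{5g \sin\theta}{7r}$$
Integrating gives
$$\int \dot{\theta}\,d\dot{\theta} = \frac{5g}{7r} \int \sin\theta\,d\theta$$
That is
$$\dot{\theta}^2 = \frac{10g}{7r} (1 - \cos\theta)$$
assuming that $\dot{\theta} = 0$ at $\theta = 0$. Inserting this into equation (a) gives
$$-\frac{10}{7} mg[1 - \cos\theta] + mg \cos\theta = \lambda_1$$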
That is
$$\lambda_1 = \frac{mg}{7} [17 \cos\theta - 10]$$
Note that this equals zero when
$$\cos\theta = \frac{10}{17}$$
For larger angles $\lambda_1$ is negative implying that the solid sphere will fly off the surface of the spherical shell.
The sphere will leave the surface of the spherical shell when $\cos\theta = \frac{10}{17}$, that is, $\theta = 53.97^\circ$. This is a significantly
larger angle than obtained for the similar problem where the mass is sliding on a frictionless surface because
the energy stored in rotation implies that the linear velocity of the mass is lower at a given angle for the
case of a rolling sphere.
The above discussion has omitted an important fact that, if $\mu < \infty$, the frictional force becomes
insufficient to maintain the rolling constraint before $\theta = 53.97^\circ$, that is, the required frictional force will exceed
the sliding limit $\mu\lambda_1$. To determine when the rolling constraint fails it is necessary to determine the
frictional torque
$$F_f a = -\lambda_2 a$$
Thus
$$F_f = -\lambda_2$$
It is in the negative direction because of the direction chosen for $\phi$. The required coefficient of friction $\mu$ is
given by the ratio of the frictional force to the normal force, that is
$$\mu = \frac{\lambda_2}{\lambda_1} = \frac{2 \sin\theta}{[17 \cos\theta - 10]}$$
For $\mu = 1$ the sphere starts to slip when $\theta = 47.54^\circ$. Note that the sphere starts slipping before it flies off
the spherical shell since a normal force is required to support a frictional force and the difference depends on the
coefficient of friction. The no-slipping constraint is not satisfied once the sphere starts slipping and the
frictional force should equal $\mu\lambda_1$. Thus for the angles beyond $47.54^\circ$ the problem needs to be solved with
the rolling constraint changed to a sliding non-conservative frictional force. This is best handled by including
the frictional force and normal forces as generalized forces. Fortunately this will be a small correction. The
friction will slightly change the exact angle at which the normal force becomes zero and the system transitions
to free motion of the sphere in a gravitational field.
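The two critical angles quoted above can be reproduced with a few lines of code. The sketch below assumes the results $\lambda_1 = \frac{mg}{7}(17\cos\theta - 10)$ for the normal force and $\lambda_2 = \frac{2}{7} mg \sin\theta$ for the frictional force of a solid sphere, and uses bisection to find the angle at which the required coefficient of friction $\lambda_2/\lambda_1$ reaches a given $\mu$. The function names and the bisection tolerance are illustrative assumptions.

```python
import math

def lambda1(theta, m=1.0, g=9.81):
    """Normal (radial) constraint force for the rolling solid sphere."""
    return m * g * (17.0 * math.cos(theta) - 10.0) / 7.0

def lambda2(theta, m=1.0, g=9.81):
    """Frictional constraint force needed to maintain rolling."""
    return 2.0 * m * g * math.sin(theta) / 7.0

def required_mu(theta):
    """Coefficient of friction needed to keep rolling at angle theta."""
    return 2.0 * math.sin(theta) / (17.0 * math.cos(theta) - 10.0)

def fly_off_angle_deg():
    """Angle at which the normal force vanishes, cos(theta) = 10/17."""
    return math.degrees(math.acos(10.0 / 17.0))

def slip_angle_deg(mu, tol=1e-12):
    """Angle at which rolling fails for a friction coefficient mu, found by bisection."""
    lo, hi = 0.0, math.acos(10.0 / 17.0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Compare 2 sin(theta) with mu (17 cos(theta) - 10) to avoid dividing by zero at hi.
        if 2.0 * math.sin(mid) < mu * (17.0 * math.cos(mid) - 10.0):
            lo = mid
        else:
            hi = mid
    return math.degrees(0.5 * (lo + hi))

if __name__ == "__main__":
    print(slip_angle_deg(1.0), fly_off_angle_deg())
```

For any finite $\mu$ the slip angle returned by `slip_angle_deg` lies below the fly-off angle, which is the statement that the sphere starts slipping before it leaves the surface.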
Similarly Λ = = gives
̈ =
These can be solved by substituting the relation = . The sphere flies off the spherical shell
when ≤ 0 leading to free motion discussed in example 62. The problem of a solid uniform sphere rolling
inside a hollow sphere can be solved the same way.
6.19 Example: Small body held by friction on the periphery of a rolling wheel
Assume that a small body of mass $m$ is balanced on a rolling wheel of mass $M$ and radius $R$
as shown in the figure. The wheel rolls in a vertical plane without slipping on a horizontal
surface. This example illustrates that it is possible to use simultaneously a mixture of holonomic
constraints, partially-holonomic constraints, and generalized forces.3
[Figure: Small body of mass $m$ held by friction on the periphery of a rolling wheel of mass $M$ and radius $R$.]
Assume that at $t = 0$ the wheel touches the floor at $x = y = 0$ with the mass perched at
the top of the wheel at $\theta = 0$. Let the frictional force acting on the mass $m$ be $F$ and the reaction
force of the periphery of the wheel on the mass be $N$. Let $\dot{\phi}$ be the angular velocity of the wheel,
and $\dot{x}$ the horizontal velocity of the center of the wheel. The polar coordinates $r, \theta$ of the mass $m$
are taken with $r$ measured from the center of the wheel with $\theta$ measured with respect to the vertical.
Thus the cartesian coordinates of the small mass are $(x + r \sin\theta, R + r \cos\theta)$ with respect to the
origin at $x = y = 0$.
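The kinetic energy is given by
$$T = \frac{1}{2} M\dot{x}^2 + \frac{1}{2} I\dot{\phi}^2 + \frac{1}{2} m \left[ \left( \dot{x} + r\dot{\theta} \cos\theta + \dot{r} \sin\theta \right)^2 + \left( \dot{r} \cos\theta - r\dot{\theta} \sin\theta \right)^2 \right]$$
The gravitational force can be absorbed into the scalar potential term of the Lagrangian and includes only
the potential energy of the mass $m$ since the potential energy of the rolling wheel is constant.
$$U = MgR + mg(R + r \cos\theta)$$
$$L = \frac{1}{2} (M + m)\dot{x}^2 + \frac{1}{2} I\dot{\phi}^2 + \frac{1}{2} m \left[ r^2\dot{\theta}^2 + 2\dot{x}r\dot{\theta} \cos\theta + 2\dot{x}\dot{r} \sin\theta + \dot{r}^2 \right] - mg(R + r \cos\theta)$$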
$$g_1 = x - R\phi = \dot{x} - R\dot{\phi} = 0$$
2) The mass $m$ is touching the periphery of the wheel, that is, the normal force $N > 0$. This is a one-sided
restricted holonomic constraint.
$$g_2 = r - R = 0$$
3) The mass does not slip on the wheel if the frictional force $F < \mu N$. When this restricted
holonomic constraint is satisfied, then
$$g_3 = \dot{\theta} - \dot{\phi} = 0$$
The rolling constraint is holonomic, and can be accounted for using one Lagrange multiplier $\lambda_1$ plus the
differential constraint equations
3 This problem is solved in detail in example 3.19 of "Classical Mechanics and Relativity" by Muller-Kirsten [06]
1
= 1
1
= 0
1
=
1
= 0
The other two constraints are non-holonomic, and thus these constraint forces are expressed in terms of two
generalized forces $Q_\theta$ and $Q_r$ that are related to the tangential force $F$ and radial reaction force $N$. For
simplicity, assume that the wheel is a thin-walled cylinder with a moment of inertia of
$$I = MR^2$$
= − cos + sin
= (− cos + sin ) (− cos ) − ( sin + cos ) sin = −
=
This last equation can be derived by Newtonian mechanics from consideration of the forces acting.
The above equations of motion can be used to calculate the motion for the following conditions.
a) Mass not slipping:
This occurs if = ≤ which also implies that 0 That is a situation where the system is
holonomic with = ̇ = ̇ ̇ = ̇ which can be solved using the generalized coordinate approach with
only one independent coordinate which can be taken to be .
b) Mass slipping:
Here the no-slip constraint is violated and thus one has to explicitly include the generalized forces
and assume that sliding friction is given by =
c) Reaction force is negative:
Here the mass is not subject to any constraints and it is in free fall.
The above example illustrates the flexibility provided by Lagrangian mechanics that allows simultane-
ous use of Lagrange multipliers, generalized forces, and scalar potential to handle combinations of several
holonomic and nonholonomic constraints for a complicated problem.
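$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \qquad (6.61)$$
It is interesting to use Maxwell’s equations and Lagrangian mechanics to show that the Lorentz force can be
represented by a conservative potential in Lagrangian mechanics.
Maxwell’s equations can be written as
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0} \qquad (6.62)$$
$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0$$
$$\nabla \cdot \mathbf{B} = 0$$
$$\nabla \times \mathbf{B} - \mu_0\varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} = \mu_0 \mathbf{J}$$
Since $\nabla \cdot \mathbf{B} = 0$ then it follows from Appendix that $\mathbf{B}$ can be represented by the curl of a vector
potential, $\mathbf{A}$, that is
$$\mathbf{B} = \nabla \times \mathbf{A} \qquad (6.63)$$
Substituting this into $\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0$ gives that
$$\nabla \times \mathbf{E} + \frac{\partial (\nabla \times \mathbf{A})}{\partial t} = 0 \qquad (6.64)$$
$$\nabla \times \left( \mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} \right) = 0$$
Since this curl is zero it can be represented by the gradient of a scalar potential $\Phi$
$$\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} = -\nabla\Phi \qquad (6.65)$$
The following shows that this relation corresponds to taking the gradient of a potential $U$ for the charge $q$
where the potential is given by the relation
$$U = q(\Phi - \mathbf{A} \cdot \mathbf{v}) \qquad (6.66)$$
where $\Phi$ is the scalar electrostatic potential. This scalar potential can be used in the Lagrange equations
using the Lagrangian
$$L = \frac{1}{2} m\mathbf{v} \cdot \mathbf{v} - q(\Phi - \mathbf{A} \cdot \mathbf{v}) \qquad (6.67)$$
The Lorentz force can be derived from this Lagrangian by considering the Lagrange equation for the cartesian
coordinate $x$
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0 \qquad (6.68)$$
Using the above Lagrangian (6.67) gives
$$m\ddot{x} + q \left[ \frac{dA_x}{dt} + \frac{\partial \Phi}{\partial x} - \frac{\partial \mathbf{A}}{\partial x} \cdot \mathbf{v} \right] = 0 \qquad (6.69)$$
But
$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\dot{x} + \frac{\partial A_x}{\partial y}\dot{y} + \frac{\partial A_x}{\partial z}\dot{z} \qquad (6.70)$$
and
$$\frac{\partial \mathbf{A}}{\partial x} \cdot \mathbf{v} = \frac{\partial A_x}{\partial x}\dot{x} + \frac{\partial A_y}{\partial x}\dot{y} + \frac{\partial A_z}{\partial x}\dot{z} \qquad (6.71)$$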
6.11. TIME-DEPENDENT FORCES 169
$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \qquad (6.73)$$
This has demonstrated that the electromagnetic scalar potential
$$U = q(\Phi - \mathbf{A} \cdot \mathbf{v}) \qquad (6.74)$$
satisfies Maxwell’s equations, gives the Lorentz force, and can be absorbed into the Lagrangian. Note that
the velocity-dependent Lorentz force is conservative since $\mathbf{E}$ is conservative, and because $(\mathbf{v} \times \mathbf{B}) \cdot \mathbf{v} = 0$
the magnetic force does no work since it is perpendicular to the trajectory. The velocity-dependent
conservative Lorentz force is an important and ubiquitous force that features prominently in many branches
of science. It will be discussed further for the case of relativistic motion in chapter 16.6.
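The statement that the magnetic part of the Lorentz force does no work can be illustrated numerically. The sketch below evaluates the power $\mathbf{F} \cdot \mathbf{v}$ for a purely magnetic field and integrates $m\dot{\mathbf{v}} = q\,\mathbf{v} \times \mathbf{B}$ for a uniform field, checking that the speed stays constant. The helper names, the Runge-Kutta integrator, and the numerical values are illustrative assumptions; units are arbitrary.

```python
import math

def cross(a, b):
    """Vector product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product a . b."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(q, E, v, B):
    """F = q (E + v x B)."""
    vxb = cross(v, B)
    return tuple(q * (e + c) for e, c in zip(E, vxb))

def magnetic_power(q, v, B):
    """Rate of work F . v done by the magnetic part of the Lorentz force."""
    return dot(lorentz_force(q, (0.0, 0.0, 0.0), v, B), v)

def final_speed(q, m, v0, B, t_end, n):
    """Integrate m dv/dt = q v x B with RK4 in a uniform field and return |v(t_end)|."""
    acc = lambda v: tuple(q / m * c for c in cross(v, B))
    v, dt = tuple(v0), t_end / n
    for _ in range(n):
        k1 = acc(v)
        k2 = acc(tuple(x + 0.5 * dt * k for x, k in zip(v, k1)))
        k3 = acc(tuple(x + 0.5 * dt * k for x, k in zip(v, k2)))
        k4 = acc(tuple(x + dt * k for x, k in zip(v, k3)))
        v = tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                  for x, a, b, c, d in zip(v, k1, k2, k3, k4))
    return math.sqrt(dot(v, v))
```

The velocity vector rotates about the field direction, but its magnitude, and hence the kinetic energy, should remain unchanged to within the integration error.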
= cos
The kinetic energy is
∙³ ´2 ¸
1 1 h 2
i
= ̇ cos + (̇ + ̇ sin )2 = 2 ̇ + 2̇̇ sin + ̇ 2
2 2
and the potential energy is
= [(1 − cos ) + ]
Thus the Lagrangian is
1 h 2 2 i
= ̇ + 2̇̇ sin + ̇ 2 − [(1 − cos ) + ]
2
The Euler-Lagrange equations lead to equations of motion for and
Assume the small-angle approximation where → 0 then these two equations reduce to
µ ¶
̈
̈ + + = 0
̈ + =
Substitute ̈ = − 2 cos into these equations gives
µ ¶
2
̈ + − cos = 0
¡ ¢
− 2 cos =
These correspond to stable harmonic oscillations about $\theta\approx0$ if the bracket term is positive, and to unstable motion if the bracket is negative. Thus, for small-amplitude oscillation about $\theta\approx0$, the motion of the system can be unstable whenever the bracket is negative, that is, when the acceleration $A\omega^2\cos\omega t > g$, and resonance behavior can occur coupling the pendulum period and the forcing frequency $\omega$.
This discussion also applies to the inverted pendulum with a surprising result. It is well known that the pendulum is unstable near $\theta = \pi$. However, if the support is oscillating, then for $\theta\approx\pi$ the equations of motion, with $\theta$ now denoting the small displacement from the inverted position, become
\[ \ddot{\theta} - \left(\frac{g}{l} - \frac{A\omega^2}{l}\cos\omega t\right)\theta = 0 \]
\[ m\left(g - A\omega^2\cos\omega t\right) = F_y \]
The inverted pendulum has stable oscillations about $\theta\approx\pi$ if the bracket is negative, that is, if $A\omega^2\cos\omega t > g$. This illustrates that nonautonomous dynamical systems can involve either stable or unstable motion.
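The stability claims above can be explored numerically. The following is a minimal sketch, not part of the original text, that integrates the full equation $\ddot{\theta} = -\frac{g+\ddot{y}}{l}\sin\theta$ with $y = A\cos\omega t$ using a fourth-order Runge-Kutta step. The function name `max_deviation`, the parameter values, and the 2 s integration window are illustrative assumptions.

```python
import math

def max_deviation(center, A, omega, offset=0.1, g=9.81, l=1.0,
                  t_end=2.0, dt=1e-4):
    """Largest |theta - center| reached when the pendulum is released at rest
    at theta = center + offset, with the support moving as y = A cos(omega t).
    center = 0 is the hanging position, center = pi the inverted one."""
    def acc(t, th):
        ydd = -A * omega**2 * math.cos(omega * t)  # support acceleration
        return -(g + ydd) / l * math.sin(th)

    th, w, t = center + offset, 0.0, 0.0
    worst = abs(offset)
    for _ in range(int(round(t_end / dt))):
        # classical RK4 step for the pair (theta, theta_dot)
        k1t, k1w = w, acc(t, th)
        k2t, k2w = w + 0.5 * dt * k1w, acc(t + 0.5 * dt, th + 0.5 * dt * k1t)
        k3t, k3w = w + 0.5 * dt * k2w, acc(t + 0.5 * dt, th + 0.5 * dt * k2t)
        k4t, k4w = w + dt * k3w, acc(t + dt, th + dt * k3t)
        th += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        w += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        t += dt
        worst = max(worst, abs(th - center))
    return worst

hanging = max_deviation(0.0, A=0.0, omega=0.0)               # fixed support
inverted_free = max_deviation(math.pi, A=0.0, omega=0.0)     # fixed support
inverted_driven = max_deviation(math.pi, A=0.1, omega=100.0) # fast drive
```

With these assumed values the undriven inverted pendulum falls away from $\theta = \pi$, while the rapidly driven one stays close to it over the integration window. Other amplitudes and frequencies may behave differently, so the sketch illustrates rather than proves the stability criterion.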
where the impulsive force is introduced using the generalized force $Q_j^{EXC}$. Knowing the initial conditions at time $t$, the conditions at the time $t+\tau$ are given by integration of equation 6.75 over the duration $\tau$ of the impulse, which gives
\[ \int_t^{t+\tau}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right)dt - \int_t^{t+\tau}\frac{\partial L}{\partial q_j}\,dt = \int_t^{t+\tau}Q_j^{EXC}\,dt \tag{6.76} \]
This integration determines the conditions at time $t+\tau$ which then are used as the initial conditions for the motion when the impulsive force is zero.
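The second approach is to realize that equation 6.76 can be rewritten in the form
\[ \lim_{\tau\to0}\int_t^{t+\tau}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right)dt = \lim_{\tau\to0}\left.\frac{\partial L}{\partial\dot{q}_j}\right|_t^{t+\tau} = \Delta p_j = \lim_{\tau\to0}\int_t^{t+\tau}\left(\left(\frac{\partial L}{\partial q_j}\right) + Q_j^{EXC}\right)dt \tag{6.77} \]
Note that in the limit $\tau\to0$ the integral of the generalized momentum $p_j = \frac{\partial L}{\partial\dot{q}_j}$ simplifies to give the change in generalized momentum $\Delta p_j$. In addition, assuming that the non-impulsive forces $\left(\frac{\partial L}{\partial q_j}\right)$ are finite and independent of the instantaneous impulsive force during the infinitesimal duration $\tau$, the contribution of the non-impulsive forces $\int_t^{t+\tau}\left(\frac{\partial L}{\partial q_j}\right)dt$ during the impulse can be neglected relative to the large impulsive force term $\lim_{\tau\to0}\int_t^{t+\tau}Q_j^{EXC}\,dt$. Thus it can be assumed that
\[ \Delta p_j = \lim_{\tau\to0}\int_t^{t+\tau}Q_j^{EXC}\,dt = \tilde{Q}_j \tag{6.78} \]
where $\tilde{Q}_j$ is the generalized impulse associated with coordinate $q_j$, $j = 1, 2, 3, \ldots, n$. This generalized impulse can be derived from the time integral of the impulsive forces $\mathbf{P}_i$ given by equation 2.135 using the time integral of equation 6.77, that is
\[ \Delta p_j = \tilde{Q}_j = \lim_{\tau\to0}\int_t^{t+\tau}Q_j^{EXC}\,dt \equiv \lim_{\tau\to0}\int_t^{t+\tau}\sum_i\mathbf{P}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\,dt = \sum_i\tilde{\mathbf{P}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.79} \]
Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{\mathbf{P}}$ with a corresponding translational variable, or an angular impulsive torque $\tilde{\boldsymbol{\tau}}$ with a corresponding angular variable.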
Impulsive force problems usually are solved in two stages. First, either equation 6.76 or equation 6.79 is used to determine the conditions of the system immediately following the impulse. If $\tau\to0$ then the impulse changes the generalized velocities $\dot{q}_j$ but not the generalized coordinates $q_j$. Second, the subsequent motion is determined using the Lagrangian equations of motion with the impulsive generalized force being zero, taking the result of the impulse calculation as the initial condition.
\[ T = \frac{1}{2}(m_1+m_2)L_1^2\dot{\theta}_1^2 + m_2L_1L_2\dot{\theta}_1\dot{\theta}_2 + \frac{1}{2}m_2L_2^2\dot{\theta}_2^2 \]
The total potential energy is
Use equation 679 to transform to the generalized coordinates 1 and 2 with the corresponding generalized
impulsive torques
̃1 = ̃ 1
̃2 = ̃ ( − 1 )
Since the system starts at rest where 1 = 2 = 0, then using equation 677 gives the change in angular
momentum immediately following the impulse to be
³ ´
1 21 ̇1 + 2 1 1 ̇1 + 2 ̇2 = ̃ 1
³ ´
2 2 1 ̇1 + 2 ̇2 = ̃ ( − 1 )
These two equations determine $\dot{\theta}_1$ and $\dot{\theta}_2$ immediately after the impulse; these can be used with $\theta_1 = \theta_2 = 0$ as initial conditions for solving the subsequent force-free motion when the generalized impulsive force is zero. As described in example 12.5, the subsequent motion of this series-coupled pendulum will be a superposition of the two normal modes with amplitudes determined by the result of the impulse calculation.
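The first stage of this calculation amounts to solving two linear equations for the angular velocities just after the impulse. The sketch below is illustrative only: it assumes the small-angle kinetic energy given above, a horizontal impulse of magnitude `P`, and a hypothetical symbol `d` for the distance from the support to the point where the impulse is applied (with `d > L1`), so that the generalized impulses are `P*L1` and `P*(d - L1)`.

```python
def rates_after_impulse(m1, m2, L1, L2, P, d):
    """Angular velocities (theta1_dot, theta2_dot) immediately after the impulse.

    Solves  p1 = (m1 + m2) L1^2 w1 + m2 L1 L2 w2 = P L1
            p2 = m2 L1 L2 w1 + m2 L2^2 w2       = P (d - L1)
    where p1, p2 are the generalized momenta of the small-angle kinetic energy.
    The name d is an assumption introduced for this sketch.
    """
    Q1 = P * L1            # generalized impulse for theta1
    Q2 = P * (d - L1)      # generalized impulse for theta2
    a11 = (m1 + m2) * L1**2
    a12 = m2 * L1 * L2
    a22 = m2 * L2**2
    det = a11 * a22 - a12 * a12      # equals m1 m2 L1^2 L2^2, never zero
    w1 = (Q1 * a22 - a12 * Q2) / det
    w2 = (a11 * Q2 - a12 * Q1) / det
    return w1, w2

# Equal masses and lengths, unit impulse applied at the lower bob (d = L1 + L2).
w1, w2 = rates_after_impulse(1.0, 1.0, 1.0, 1.0, P=1.0, d=2.0)
```

For this choice only the lower bob is set in motion at the instant of the impulse, with speed $L_2\dot{\theta}_2 = \tilde{P}/m_2$. The returned values then serve as the initial conditions for the second stage.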
6.14 Summary
Newtonian plausibility argument for Lagrangian mechanics:
A justification for introducing the calculus of variations to classical mechanics becomes apparent when the concept of the Lagrangian $L \equiv T - U$ is used in the functional and time $t$ is the independent variable. It was shown that Newton’s equation of motion can be rewritten as
\[ \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{EX} \tag{6.12} \]
where $Q_j^{EX}$ are the excluded forces of constraint plus any other conservative or non-conservative forces not included in the potential $U$. This corresponds to the Euler-Lagrange equation for determining the minimum of the time integral of the Lagrangian.
The excluded force $Q_j^{EX}$ can be partitioned into the holonomic constraint part, which can be represented by the Lagrange multiplier term, plus a remainder $Q_j^{EXC}$:
\[ Q_j^{EX} \equiv \sum_k\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{6.14} \]
where the Lagrange multiplier term accounts for holonomic constraint forces, and $Q_j^{EXC}$ includes all additional forces not accounted for by the scalar potential $U$ or the Lagrange multiplier terms. As discussed in chapter 6.6.3, the constraint forces can be included explicitly as generalized forces in the excluded term $Q_j^{EXC}$ of equation 6.15.
Note that for unconstrained, purely conservative forces, equation 6.15 can be simplified to the Euler-Lagrange equation for independent coordinates $q_j$.
\[ \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = 0 \tag{6.16} \]
This is equivalent to using the calculus of variations to minimize the action integral $S = \int_{t_1}^{t_2}L\,dt$, that is
\[ \delta S = \delta\int_{t_1}^{t_2}L(q_j,\dot{q}_j;t)\,dt = 0 \tag{6.17} \]
where the functional is the Lagrangian $L$ and the independent variable is time $t$.
d’Alembert’s Principle
It was shown that d’Alembert’s Principle
\[ \sum_i\left(\mathbf{F}_i^A - \dot{\mathbf{p}}_i\right)\cdot\delta\mathbf{r}_i = 0 \tag{6.25} \]
cleverly transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual
work to statics primarily leads to algebraic equations between the forces, whereas d’Alembert’s principle
applied to dynamics leads to differential equations.
Lagrange equations of motion
Lagrange used d’Alembert’s Principle to derive the basic equations of Lagrangian mechanics. This proof clearly illustrates the role of the calculus of variations in Lagrangian mechanics as well as elucidating the role of forces in the theory. The d’Alembert Principle leads to Euler’s variational equation for the kinetic energy $T$ plus the active forces $Q_j$ for each coordinate $q_j$.
\[ \sum_j\left[\left\{\frac{d}{dt}\left(\frac{\partial T}{\partial\dot{q}_j}\right) - \frac{\partial T}{\partial q_j}\right\} - Q_j\right]\delta q_j = 0 \tag{6.38} \]
If the coordinates $q_j$ are independent then for each value of $j$ the square bracket equals zero, which corresponds to Euler’s equation.
The Lagrangian method concentrates solely on active forces, completely ignoring all other internal forces. In Lagrangian mechanics the generalized forces, corresponding to each generalized coordinate, can be partitioned three ways
\[ Q_j = -\nabla_{q_j}U + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \]
where the velocity-independent conservative forces can be absorbed into a scalar potential $U$, the holonomic constraint forces can be handled using the Lagrange multiplier term $\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$, and the remaining part of the active forces can be absorbed into the generalized force $Q_j^{EXC}$. The scalar potential energy $U$ is handled by absorbing it into the standard Lagrangian $L = T - U$. If the constraint forces are holonomic then these forces are easily and elegantly handled by use of Lagrange multipliers. All remaining forces, including dissipative forces, can be handled by including them explicitly in the generalized force $Q_j^{EXC}$.
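Combining the above two equations gives
\[ \sum_j\left[\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} - Q_j^{EXC} - \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right]\delta q_j = 0 \tag{6.56} \]
Use of the Lagrange multipliers to handle the $m$ constraint forces ensures that all $n$ infinitesimals $\delta q_j$ are independent, implying that the expression in the square bracket must be zero for each of the $n$ values of $j$. This leads to $n$ Lagrange equations plus $m$ constraint relations
\[ \left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{6.60} \]
where $j = 1, 2, 3, \ldots, n$.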
Application of Lagrangian mechanics:
The optimal way to exploit Lagrangian mechanics is as follows:
\[ U = q\Phi - q\,\mathbf{v}\cdot\mathbf{A} \tag{6.74} \]
leads to the Lorentz force where $\Phi$ is the scalar electric potential and $\mathbf{A}$ the vector potential.
Time-dependent forces:
It was shown that time-dependent forces can lead to complicated motion having both stable regions and
unstable regions of motion that can exhibit chaos.
Impulsive forces:
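A generalized impulse $\tilde{Q}_j$ can be derived for an instantaneous impulsive force from the time integral of the impulsive forces $\mathbf{P}_i$ given by equation 2.135 using the time integral of equation 6.77, that is
\[ \Delta p_j = \tilde{Q}_j = \lim_{\tau\to0}\int_t^{t+\tau}Q_j^{EXC}\,dt \equiv \lim_{\tau\to0}\int_t^{t+\tau}\sum_i\mathbf{P}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\,dt = \sum_i\tilde{\mathbf{P}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.79} \]
Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{\mathbf{P}}$ with a corresponding translational variable, or an angular impulsive torque $\tilde{\mathbf{T}}$ with a corresponding angular variable.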
Comparison of Newtonian and Lagrangian mechanics:
In contrast to Newtonian mechanics, which is based on knowing all the vector forces acting on a system,
Lagrangian mechanics can derive the equations of motion using generalized coordinates without requiring
knowledge of the constraint forces acting on the system. Lagrangian mechanics provides a remarkably
powerful, and incredibly consistent, approach to solving for the equations of motion in classical mechanics
which is especially powerful for handling systems that are subject to holonomic constraints.
176 CHAPTER 6. LAGRANGIAN DYNAMICS
Workshop exercises
1. A disk of mass and radius rolls without slipping down a plane inclined from the horizontal by an angle
. The disk has a short weightless axle of negligible radius. From this axis is suspended a simple pendulum of
length and whose bob has a mass . Assume that the motion of the pendulum takes place in the plane
of the disk.
3. Consider a particle of mass moving in a plane and subject to an inverse square attractive force.
4. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term “generalized mechanics” is used.
(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations, and assuming that Hamilton’s principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at the end points, show that the corresponding Lagrange equation is
\[ \frac{d^2}{dt^2}\left(\frac{\partial L}{\partial\ddot{q}}\right) - \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}}\right) + \frac{\partial L}{\partial q} = 0 \]
Such equations of motion have interesting applications in chaos theory.
(b) Apply this result to the Lagrangian
\[ L = -\frac{m}{2}q\ddot{q} - \frac{k}{2}q^2 \]
Do you recognize the equations of motion?
5. A bead of mass slides under gravity along a smooth wire bent in the shape of a parabola 2 = in the
vertical ( ) plane.
(c) Set up Lagrange’s equations of motion for both and with the constraint adjoined and a Lagrangian
multiplier introduced.
(d) Show that the same equation of motion for results from either of the methods used in part (b) or part
(c).
(e) Express in terms of and ̇.
(f) What are the and components of the force of constraint in terms of and ̇?
7. Consider the double pendulum comprising masses 1 and 2 connected by inextensible strings as shown in
the figure. Assume that the motion of the pendulum takes place in a vertical plane.
(a) Are there any equations of constraint? If so, what are they?
(b) Find Lagrange’s equations for this system.
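[Figure: double pendulum suspended from point O, with strings of length L1 and L2, masses m1 and m2, and weights m1 g and m2 g.]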
8 Consider the system shown in the figure which consists of a mass suspended via a constrained massless link
of length where the point is acted upon by a spring of spring constant . The spring is unstretched when
the massless link is horizontal. Assume that the holonomic constraints at and are frictionless.
a Derive the equations of motion for the system using the method of Lagrange multipliers.
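[Figure: mass suspended from a constrained massless link of length L; labels x0, x, y, spring force kx, and weight mg.]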
Problems
1. A sphere of radius $\rho$ is constrained to roll without slipping on the lower half of the inner surface of a hollow cylinder of radius $R$. Determine the Lagrangian function, the equation of constraint, and the Lagrange equations of motion. Find the frequency of small oscillations.
2. A particle moves in a plane under the influence of a force $F = -Ar^{\alpha-1}$ directed toward the origin; $A$ and $\alpha$ $(>0)$ are constants. Choose generalized coordinates with the potential energy zero at the origin.
a) Find the Lagrangian equations of motion.
b) Is the angular momentum about the origin conserved?
c) Is the total energy conserved?
3. Two blocks, each of mass $M$, are connected by an extensionless, uniform string of length $l$. One block is placed on a frictionless horizontal surface, and the other block hangs over the side, the string passing over a frictionless pulley. Describe the motion of the system:
a) when the mass of the string is negligible
b) when the string has mass $m$.
4. Two masses $m_1$ and $m_2$ $(m_1 \neq m_2)$ are connected by a rigid rod of length $d$ and of negligible mass. An extensionless string of length $l_1$ is attached to $m_1$ and connected to a fixed point of the support $P$. Similarly a string of length $l_2$ $(l_1 \neq l_2)$ connects $m_2$ and $P$. Obtain the equation of motion describing the motion in the plane of $m_1$, $m_2$ and $P$, and find the frequency of small oscillation around the equilibrium position.
5. A thin uniform rigid rod of length 2 and mass is suspended by a massless string of length . Initially the
system is hanging vertically downwards in the gravitational field . Use as generalized coordinates the angles
given in the diagram.
a) Derive the Lagrangian for the system.
b) Use the Lagrangian to derive the equations of motion.
c) A horizontal impulsive force in the direction strikes the bottom end of the rod for an infinitessimal
time . Derive the initial conditions for the system immediately after the impulse has occurred.
d) Draw a diagram showing the geometry of the pendulum shortly after the impulse when the displacement
angles are significant.
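[Figure: rod of length 2L and weight Mg hanging from point O on a string; axes x and y; horizontal impulsive force Fx applied at the bottom end of the rod.]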
Chapter 7
7.1 Introduction
The discussion of Lagrangian dynamics illustrates the power of Lagrangian mechanics for deriving the equations of motion. In contrast to Newtonian mechanics, which is given in terms of force vectors acting on a system, the Lagrangian method, based on d’Alembert’s Principle or Hamilton’s Principle, is expressed in terms of the scalar kinetic and potential energies of the system. The Lagrangian approach is a sophisticated alternative to Newton’s laws of motion that provides a simpler derivation of the equations of motion and allows constraint forces to be ignored. In addition, the use of Lagrange multipliers or generalized forces allows the Lagrangian approach to determine the constraint forces when these forces are of interest. The equations of motion, derived either from Newton’s Laws or Lagrangian dynamics, can be non-trivial to solve mathematically. It is necessary to integrate second-order differential equations, which for $n$ degrees of freedom imply $2n$ constants of integration.
Chapter 7 will explore the remarkable connection between symmetry and invariance of a system under
transformation, and the related conservation laws that imply the existence of constants of motion. Even
when the equations of motion cannot be solved easily, it is possible to derive important physical principles
regarding the first-order integrals of motion of the system directly from the Lagrange equation, as well as
elucidating the underlying symmetries plus invariance. This property is contained in Noether’s theorem
which states that conservation laws are associated with differentiable symmetries of a physical system.
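\[ \frac{\partial L}{\partial\dot{x}_i} = \frac{\partial T}{\partial\dot{x}_i} - \frac{\partial U}{\partial\dot{x}_i} = \frac{\partial T}{\partial\dot{x}_i} \tag{7.1} \]
\[ = \frac{\partial}{\partial\dot{x}_i}\sum_{i=1}^{N}\frac{1}{2}m_i\left(\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2\right) = m_i\dot{x}_i = p_{i,x} \]
Thus
\[ p_{i,x} = \frac{\partial L}{\partial\dot{x}_i} \tag{7.2} \]
which is the $x$ component of the linear momentum for the $i$th particle.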
180 CHAPTER 7. SYMMETRIES, INVARIANCE AND THE HAMILTONIAN
This result suggests an obvious extension of the concept of momentum to generalized coordinates. The generalized momentum associated with the coordinate $q_j$ is defined to be
\[ p_j \equiv \frac{\partial L}{\partial\dot{q}_j} \tag{7.3} \]
Note that $p_j$ also is called the conjugate momentum or canonical momentum to $q_j$, where $q_j, p_j$ are conjugate, or canonical, variables. Remember that the linear momentum $\mathbf{p}$ is the first-order time integral given by equation 2.10. If $q_j$ is not a spatial coordinate, then $p_j$ is the generalized momentum, not the kinematic linear momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum. That is, the generalized momentum may differ from the usual linear or angular momentum since the definition (7.3) is more general than the usual definition of momentum, $p_x = m\dot{x}$, in classical mechanics. This is illustrated by the case of a moving charged particle in an electromagnetic field. Chapter 6 showed that electromagnetic forces on a charge $q$ can be described in terms of a scalar potential $U$ where
\[ U = q(\Phi - \mathbf{A}\cdot\mathbf{v}) \tag{7.4} \]
The generalized momentum conjugate to the coordinate $x_i$ for charge $q$ and mass $m$ is given by the above Lagrangian
\[ p_i = \frac{\partial L}{\partial\dot{x}_i} = m\dot{x}_i + qA_i \tag{7.6} \]
Note that this includes the mechanical linear momentum plus the correct electromagnetic momentum. The fact that the electromagnetic field carries momentum should not be a surprise since electromagnetic waves also carry energy, as is illustrated by the radiant energy from the sun.
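The definition $p_i = \partial L/\partial\dot{x}_i$ can be checked numerically for the charged particle. The sketch below is an illustration, not the author's method: it differentiates $L = \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} - q(\Phi - \mathbf{A}\cdot\mathbf{v})$ by central differences and compares the result with $m\mathbf{v} + q\mathbf{A}$. The uniform-field vector potential $\mathbf{A} = \frac{1}{2}\mathbf{B}\times\mathbf{r}$ with $\mathbf{B}$ along $z$, the zero scalar potential, and all numerical values are assumptions chosen for the example.

```python
def lagrangian(v, r, m, q, phi, A):
    """L = (1/2) m v.v - q (Phi(r) - A(r).v) for a single charge."""
    v_dot_v = sum(c * c for c in v)
    A_dot_v = sum(a * c for a, c in zip(A(r), v))
    return 0.5 * m * v_dot_v - q * (phi(r) - A_dot_v)

def canonical_momentum(v, r, m, q, phi, A, eps=1e-6):
    """p_i = dL/dv_i, estimated with a central difference in each velocity."""
    p = []
    for i in range(3):
        v_plus, v_minus = list(v), list(v)
        v_plus[i] += eps
        v_minus[i] -= eps
        dL = lagrangian(v_plus, r, m, q, phi, A) - lagrangian(v_minus, r, m, q, phi, A)
        p.append(dL / (2 * eps))
    return p

B0 = 4.0                                   # assumed uniform field along z
A = lambda r: (-0.5 * B0 * r[1], 0.5 * B0 * r[0], 0.0)
phi = lambda r: 0.0                        # assumed: no electrostatic potential
m, q = 2.0, 3.0
r, v = (1.0, 2.0, 0.0), (0.5, -1.0, 2.0)

p_numeric = canonical_momentum(v, r, m, q, phi, A)
p_expected = [m * vi + q * Ai for vi, Ai in zip(v, A(r))]
```

The two lists agree to within the finite-difference error, which shows that the canonical momentum differs from the mechanical momentum $m\mathbf{v}$ by the term $q\mathbf{A}$.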
Φ
N() = −
The initial angular momentum in the electromagnetic field can be derived using equation 7.6 plus Stokes’ theorem (Appendix 3). Equation 2.142 gives that the final angular momentum equals the angular impulse
Z I I I Z
L
= ̇ = = = B · dS =Φ
I Z
where Φ = = B · dS is the initial total magnetic flux through the solenoid. Thus the total initial
angular momentum is given by
L
= 0 + L
= Φ
Since the final electromagnetic field is zero the final total angular momentum is given by
L
= L
+ 0 = Φ
Note that the total angular momentum is conserved. That is, initially all the angular momentum is stored in
the electromagnetic field, whereas the final angular momentum is all mechanical. This explains the paradox
that the mechanical angular momentum is not conserved, only the total angular momentum of the system is
conserved, that is, the sum of the mechanical and electromagnetic angular momenta.
The new set of generalized coordinates satisfies Lagrange’s equations of motion with the new Lagrangian
The Lagrangian is a scalar, with units of energy, which does not change if the coordinate representation is changed. Thus $L(q', \dot{q}', t)$ can be derived from $L(q, \dot{q}, t)$ by substituting the inverse relation $q_i = q_i(q'_1, q'_2, \ldots, q'_n; t)$ into $L(q, \dot{q}, t)$. That is, the value of the Lagrangian is independent of which coordinate representation is used. Although the general form of Lagrange’s equations of motion is preserved in any point transformation, the explicit equations of motion for the new variables usually look different from those with the old variables. A typical example is the transformation from cartesian to spherical coordinates. For a given system, there can be particular transformations for which the explicit equations of motion are the same for both the old and new variables. Transformations where the equations of motion are invariant are called invariant transformations. It will be shown that if the Lagrangian does not explicitly contain a particular coordinate of displacement $q_i$, then the corresponding conjugate momentum, $p_i$, is conserved. This relation is called Noether’s theorem which states “For each symmetry of the Lagrangian, there is a conserved quantity”.
Noether’s Theorem will be used to consider invariant transformations for two dependent variables, $x(t)$ and $\theta(t)$, plus their conjugate momenta $p_x$ and $p_\theta$. For a closed system, these provide up to six possible conservation laws for the three axes. Then we will discuss the independent variable $t$ and its relation to the Generalized Energy Theorem, which provides another possible conservation law. For simplicity, these discussions assume that the systems are holonomic and conservative.
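The Lagrange equations using generalized coordinates for holonomic systems were given by equation 6.60 to be
\[ \left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{7.9} \]
or equivalently as
\[ \dot{p}_j = \frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] \tag{7.11} \]
Note that if the Lagrangian does not contain $q_j$ explicitly, that is, the Lagrangian is invariant to a linear translation, or equivalently, is spatially homogeneous, and if the Lagrange multiplier constraint force and generalized force terms are zero, then
\[ \left[\frac{\partial L}{\partial q_j} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] = 0 \tag{7.12} \]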
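A short numerical illustration of this statement, given as a sketch under stated assumptions rather than as part of the text: for a particle moving in a plane under an attractive inverse-square force, the Lagrangian does not contain the polar angle, so the conjugate angular momentum should be constant, whereas the Lagrangian does depend on $x$, so $p_x$ is not conserved. The function name `orbit`, the velocity-Verlet integrator, and the initial conditions are illustrative choices.

```python
def orbit(steps=2000, dt=1e-3, m=1.0, k=1.0):
    """Integrate planar motion in U = -k/r with velocity Verlet.
    Returns a list of (angular momentum, p_x) pairs, one per step."""
    x, y, vx, vy = 1.0, 0.0, 0.0, 1.2      # assumed initial conditions

    def acc(x, y):
        r3 = (x * x + y * y) ** 1.5
        return -k * x / (m * r3), -k * y / (m * r3)

    ax, ay = acc(x, y)
    history = []
    for _ in range(steps):
        vx += 0.5 * dt * ax                # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                       # drift
        y += dt * vy
        ax, ay = acc(x, y)
        vx += 0.5 * dt * ax                # half kick
        vy += 0.5 * dt * ay
        history.append((m * (x * vy - y * vx), m * vx))
    return history

history = orbit()
p_theta = [h[0] for h in history]
p_x = [h[1] for h in history]
spread_p_theta = max(p_theta) - min(p_theta)   # cyclic coordinate: tiny
spread_p_x = max(p_x) - min(p_x)               # non-cyclic coordinate: large
```

The angular momentum stays constant to rounding error because each kick is parallel to $\mathbf{r}$ and each drift is parallel to $\mathbf{v}$, while $p_x$ changes appreciably along the orbit.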
r = θ × r
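Similarly, the products of the generalized velocities $\dot{q}_j$ with the corresponding derivatives of $T_2$, $T_1$ and $T_0$ give
\[ \sum_j\dot{q}_j\frac{\partial T_2}{\partial\dot{q}_j} = 2T_2 \tag{7.27} \]
\[ \sum_j\dot{q}_j\frac{\partial T_1(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_j} = T_1(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.28} \]
\[ \sum_j\dot{q}_j\frac{\partial T_0(\mathbf{q},t)}{\partial\dot{q}_j} = 0 \tag{7.29} \]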
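Equation 7.25 gives that $T = T_2$ when the transformed system is scleronomic, i.e. the transformation to generalized coordinates does not depend explicitly on time, and then the kinetic energy is a quadratic function of the generalized velocities $\dot{q}_j$. Using the definition of the generalized momentum, equation 7.3, assuming $T = T_2$ and that the potential $U$ is velocity independent, gives that
\[ p_j \equiv \frac{\partial L}{\partial\dot{q}_j} = \frac{\partial T}{\partial\dot{q}_j} - \frac{\partial U}{\partial\dot{q}_j} = \frac{\partial T_2}{\partial\dot{q}_j} \tag{7.30} \]
Then equation 7.27 reduces to the useful relation that
\[ T_2 = \frac{1}{2}\sum_j\dot{q}_jp_j = \frac{1}{2}\dot{\mathbf{q}}\cdot\mathbf{p} \tag{7.31} \]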
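The Lagrange equations for a conservative force are given by equation 6.60 to be
\[ \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{7.33} \]
The holonomic constraints can be accounted for using the Lagrange multiplier terms while the generalized force $Q_j^{EXC}$ includes non-holonomic forces or other forces not included in the potential energy term of the Lagrangian, or holonomic forces not accounted for by the Lagrange multiplier terms.
Substituting equation 7.33 into equation 7.32 gives
\[ \frac{dL}{dt} = \sum_j\dot{q}_j\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] + \sum_j\frac{\partial L}{\partial\dot{q}_j}\ddot{q}_j + \frac{\partial L}{\partial t} \]
\[ = \sum_j\frac{d}{dt}\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] + \frac{\partial L}{\partial t} \tag{7.34} \]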
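Jacobi’s generalized momentum, equation 7.3, can be used to express the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ in terms of the canonical coordinates $q_j$ and $p_j$, plus time $t$. Define the Hamiltonian function to equal the generalized energy expressed in terms of the conjugate variables $(q_j, p_j)$, that is,
\[ H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j\left(\dot{q}_jp_j\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37} \]
This Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ underlies Hamiltonian mechanics which plays a profoundly important role in most branches of physics, as illustrated in chapters 8, 14 and 17.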
1 Most textbooks call the function (q q̇ ) Jacobi’s energy integral. This book adopts the more descriptive name Generalized
\[ \frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = -\frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{7.39} \]
Thus the Hamiltonian is time independent if both $\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$ and the Lagrangian is time-independent. For an isolated closed system having no external forces acting, the Lagrangian is time independent because the velocities are constant, and there is no external potential energy. That is, the Lagrangian is time-independent, and
\[ \frac{d}{dt}\left[\sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L\right] = \frac{dH}{dt} = -\frac{\partial L}{\partial t} = 0 \tag{7.40} \]
As a consequence, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ and generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ both are constants of motion if the Lagrangian is not an explicit function of time, and if the external non-potential forces are zero. This is an example of Noether’s theorem, where the symmetry of time independence leads to conservation of the conjugate variable, which is the Hamiltonian or Generalized energy.
If the potential energy $U$ does not depend explicitly on velocities $\dot{q}_j$ or time, then
\[ p_j = \frac{\partial L}{\partial\dot{q}_j} = \frac{\partial(T-U)}{\partial\dot{q}_j} = \frac{\partial T}{\partial\dot{q}_j} \tag{7.42} \]
Using equations 7.27, 7.28 and 7.29 gives that the total generalized Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ equals
\[ H(\mathbf{q},\mathbf{p},t) = \sum_j\dot{q}_j\frac{\partial T}{\partial\dot{q}_j} - (T - U) = (2T_2 + T_1) - (T_2 + T_1 + T_0) + U = T_2 - T_0 + U \]
But the sum of the kinetic and potential energies equals the total energy. Thus equation 7.44 can be rewritten in the form
\[ H(\mathbf{q},\mathbf{p},t) = (T + U) - (T_1 + 2T_0) = E - (T_1 + 2T_0) \tag{7.45} \]
Note that, in general, Jacobi’s generalized energy and the Hamiltonian do not equal the total energy $E$. However, in the special case where the transformation is scleronomic, so that $T_1 = T_0 = 0$, and the potential energy $U$ does not depend explicitly on $\dot{q}_j$, the generalized energy (Hamiltonian) equals the total energy, that is, $H = E$. Recognition of the relation between the Hamiltonian and the total energy facilitates determining the equations of motion.
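The relation $H = E - (T_1 + 2T_0)$ can be illustrated with a standard rheonomic system that is not taken from this section: a bead of mass $m$ sliding without friction on a straight wire that rotates in a horizontal plane at constant angular velocity $\omega$. The sketch assumes $U = 0$, so $T_2 = \frac{1}{2}m\dot{r}^2$, $T_1 = 0$ and $T_0 = \frac{1}{2}m\omega^2r^2$, and uses the known solution $r = r_0\cosh\omega t$ for a bead released with $\dot{r} = 0$. The function name and parameter values are illustrative.

```python
import math

def bead_energies(t, m=1.0, omega=2.0, r0=0.5):
    """Return (H, E, T1 + 2*T0) for a bead on a uniformly rotating wire."""
    r = r0 * math.cosh(omega * t)            # solution of r'' = omega^2 r
    rdot = r0 * omega * math.sinh(omega * t)
    T2 = 0.5 * m * rdot**2                   # quadratic in the velocity
    T1 = 0.0                                 # no term linear in the velocity
    T0 = 0.5 * m * (omega * r)**2            # velocity-independent term
    L = T2 + T1 + T0                         # Lagrangian, since U = 0
    E = T2 + T1 + T0                         # total energy, since U = 0
    p = m * rdot                             # generalized momentum dL/d(rdot)
    H = p * rdot - L                         # generalized energy
    return H, E, T1 + 2 * T0

H0, E0, c0 = bead_energies(0.0)
H1, E1, c1 = bead_energies(1.0)
```

Here $H$ keeps the same value at both times because the Lagrangian has no explicit time dependence, while $E$ grows as the rotating wire does work on the bead; the difference between them is $T_1 + 2T_0$ at each instant.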
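\[ \frac{dH}{dt} = -\frac{\partial L}{\partial t} \tag{7.47} \]
Also, when $\sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$ and if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion. That is, $H$ is conserved if, and only if, the Lagrangian, and consequently the Hamiltonian, are not explicit functions of time, and if the external forces are zero.
$\frac{dH}{dt} = -\frac{\partial L}{\partial t} = 0$: $H$ conserved, $H = E$; $H$ conserved, $H \neq E$
$\frac{dH}{dt} = -\frac{\partial L}{\partial t} \neq 0$: $H$ not conserved, $H = E$; $H$ not conserved, $H \neq E$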
Note the following general facts regarding the Lagrangian and the Hamiltonian.
(1) the Lagrangian is indefinite with respect to addition of a constant to the scalar potential,
(2) the Lagrangian is indefinite with respect to addition of a constant velocity,
(3) there is no unique choice of generalized coordinates.
(4) the Hamiltonian is a scalar function that is derived from the Lagrangian scalar function.
(5) the generalized momentum is derived from the Lagrangian.
These facts, plus the ability to recognize the conditions under which $H$ is conserved, and when $H = E$, can greatly facilitate solving problems as shown by the following two examples.
7.10. HAMILTONIAN INVARIANCE 189
The Hamiltonian in the fixed frame is conserved and equals the total energy, that is = + .
Rotating frame of reference 0
The above inertial fixed-frame Lagrangian can be written in terms of the primed (non-inertial rotating
frame) coordinates as
µ ´2 ¶
³ 2 2
´ 02 ³ 0
= − = ̇ + 2 ̇ − () = ̇ + 02 ̇ + − (0 )
2 2
The generalized momenta derived from this Lagrangian are
³ 0 ´
0 = 0 = ̇ 02
̇ + = 00 + 02
̇
0 = = ̇02 =
̇0
The Hamiltonian expressed in terms of the non-inertial rotating frame coordinates is
⎛ ³ ´⎞
0 1 00 + 2
0 (0 0 0 0 ) = 0 ̇0 + 0 ̇ − =
⎝02
+
⎠ + (0 )
̇ ̇ 2 2
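Note that $H'(r', \theta', p_{r'}, p_{\theta'})$ is time independent and therefore is conserved, but $H'(r', \theta', p_{r'}, p_{\theta'}) \neq E$ because the generalized coordinates are time dependent. In addition, $p_{\theta'}$ is conserved since
\[ \dot{p}_{\theta'} = \frac{\partial L}{\partial\theta'} = -\frac{\partial H'}{\partial\theta'} = 0 \]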
Note that the Lagrangian and Hamiltonian are not explicit functions of time, therefore they are conserved.
Also the potential is velocity independent and there is no coordinate transformation, thus the Hamiltonian
equals the total energy which is a constant of motion.
2
= − cos =
22
() = − ( + ) = 0
Using the Lagrangian, plus the one equation of constraint, requires one Lagrange multiplier. Then the
Lagrange equations of motion for and are
∙ ¸
− + = 0
̇
∙ ¸
− + = 0
̇
Substitute the Lagrangian and the equation of constraint gives two equations of motion
− ( − ) sin − ( − )2 ̈ + ( − ) = 0
1
− 2 ̈ − = 0
2
The lower equation of motion gives that
1
= − ̈
2
Substitute this into the equation of constraint gives
1
= − ( − ) ̈
2
Substitute this into the first equation of motion gives the equation of motion for to be
2
̈ = sin
3 ( − )
that is
=− sin
3
The torque acting on the small cylinder due to the frictional force is
1
= 2 ̈ = −
2
Thus the frictional force is
= − = sin
3
Noether’s theorem can be used to ascertain if the angular momentum is a constant of motion. The
derivative of the Lagrangian
= ( − ) sin
and thus the Lagrange equations tells us that ̇ = ( − ) sin . Therefore is not a constant of motion.
The Lagrangian is not an explicit function of which would suggest that is a constant of motion.
But this is incorrect because the constraint equation = (−) couples and , that is, they are not
independent variables, and thus and are coupled by the constraint equation. As a result is not a
constant of motion because it is directly coupled to = ( − ) sin which is not a constant of motion.
Thus neither nor are constants of motion. This illustrates that one must account carefully for equations
of constraint, and the concomitant constraint forces, when applying Noether’s theorem which tacitly assumes
independent variables.
The Hamiltonian can be derived using the generalized momenta
= = ( − )2 ̇
̇
1
= = 2 ̇
̇ 2
Then the Hamiltonian is given by
2 2
= ̇ + ̇ − = + + [ − ( − ) cos ]
2 ( − )2 2
Note that the transformation to generalized coordinates is time independent and the potential is not velocity
dependent, thus the Hamiltonian also equals the total energy. Also the Hamiltonian is conserved since
= 0.
7.11. HAMILTONIAN FOR CYCLIC COORDINATES 193
The importance of the relations between invariance and symmetry cannot be overemphasized. It extends beyond classical mechanics to quantum physics and field theory. For a three-dimensional closed system, there are three possible constants for linear momentum, three for angular momentum, and one for energy. It is especially interesting that these, and only these, seven integrals have the property that they are additive for the particles comprising a system, and this occurs independent of whether there is an interaction among the particles. That is, this behavior is obeyed by the whole assembly of particles for finite systems. Because of their profound importance to physics, these relations between symmetry and invariance are used extensively.
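It is more convenient to write the $n$ generalized coordinates $q_i$ plus their generalized momenta $p_i$ as vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, $\mathbf{p} \equiv (p_1, p_2, \ldots, p_n)$. The generalized momenta conjugate to the coordinate $q_i$, defined by 7.3, then can be written in the form
\[ p_i = \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_i} \tag{7.50} \]
Substituting this definition of the generalized momentum into the Hamiltonian defined in (7.37), and expressing it in terms of the coordinate $\mathbf{q}$ and its conjugate generalized momenta $\mathbf{p}$, leads to
\[ H(\mathbf{q},\mathbf{p},t) = \sum_ip_i\dot{q}_i - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.51} \]
\[ = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.52} \]
Note that the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_ip_i\dot{q}_i$ equals $2T$ for systems that are scleronomic and when the potential is velocity independent.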
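The construction $H = p\dot{q} - L$ can be sketched numerically for one degree of freedom. The example below is an assumption-laden illustration, not the text's procedure: it assumes $L$ is quadratic in $\dot{q}$, so that $p = \partial L/\partial\dot{q}$ is linear in $\dot{q}$ and can be inverted from two finite-difference evaluations. The simple pendulum Lagrangian and its parameter values are chosen only as a test case.

```python
import math

def hamiltonian(L, q, p, eps=1e-6):
    """H(q, p) = p*qdot - L(q, qdot), with qdot eliminated in favour of p.
    Assumes L is quadratic in qdot, so dL/dqdot = a*qdot + b."""
    dL = lambda qd: (L(q, qd + eps) - L(q, qd - eps)) / (2 * eps)
    b = dL(0.0)            # momentum at zero velocity
    a = dL(1.0) - b        # slope of momentum versus velocity
    qdot = (p - b) / a     # invert p = a*qdot + b
    return p * qdot - L(q, qdot)

# Assumed test case: simple pendulum, L = (1/2) m l^2 thdot^2 + m g l cos(th)
m, l, g = 2.0, 0.5, 9.81
pendulum_L = lambda th, thdot: 0.5 * m * l**2 * thdot**2 + m * g * l * math.cos(th)

theta, p_theta = 0.3, 0.7
H_numeric = hamiltonian(pendulum_L, theta, p_theta)
H_closed = p_theta**2 / (2 * m * l**2) - m * g * l * math.cos(theta)
```

The numerical value agrees with the closed form $H = p_\theta^2/(2ml^2) - mgl\cos\theta$, and for this scleronomic system $p_\theta\dot{\theta} = 2T$, consistent with the note above.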
The crucial feature of the Hamiltonian is that it is expressed as $H(\mathbf{q},\mathbf{p},t)$, that is, it is a function of the $n$ generalized coordinates $\mathbf{q}$ and their conjugate momenta $\mathbf{p}$, which are taken to be independent, in addition to the independent variable, $t$. This is in contrast to the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ which is a function of the $n$ generalized coordinates $q_j$, the corresponding velocities $\dot{q}_j$, and time $t$. The velocities $\dot{\mathbf{q}}$ are the time derivatives of the coordinates $\mathbf{q}$ and thus these are related. In physics, the fundamental conjugate coordinates are $(\mathbf{q},\mathbf{p})$, which are the coordinates underlying the Hamiltonian. This is in contrast to $(\mathbf{q},\dot{\mathbf{q}})$, which are the coordinates that underlie the Lagrangian. Thus the Hamiltonian is more fundamental than the Lagrangian, which is a reason why Hamiltonian mechanics, rather than Lagrangian mechanics, was used as the foundation for development of quantum and statistical mechanics.
Hamiltonian mechanics will be derived two other ways. Chapter 8 uses the Legendre transformation between the conjugate variables $(\mathbf{q},\dot{\mathbf{q}},t)$ and $(\mathbf{q},\mathbf{p},t)$ where the generalized coordinate $\mathbf{q}$ and its conjugate generalized momentum $\mathbf{p}$ are independent. This shows that Hamiltonian mechanics is based on the same variational principles as those used to derive Lagrangian mechanics. Chapter 13 derives Hamiltonian mechanics directly from Hamilton’s Principle of Least Action. Chapter 8 will introduce the algebraic Hamiltonian mechanics that is based on the Hamiltonian. The powerful capabilities provided by Hamiltonian mechanics will be described in chapter 14.
7.14 Summary
This chapter has explored the importance of symmetries and invariance in Lagrangian mechanics and has
introduced the Hamiltonian. The following are the main points introduced in this chapter.
Noether’s theorem:
Noether’s theorem explores the remarkable connection between symmetry, plus the invariance of a system under transformation, and the related conservation laws which imply the existence of important physical principles and constants of motion. Transformations where the equations of motion are invariant are called invariant transformations. Variables that are invariant to a transformation are called cyclic variables. It was shown that if the Lagrangian does not explicitly contain a particular coordinate of displacement $q_i$, then the corresponding conjugate momentum, $p_i$, is conserved. This is Noether’s theorem which states “For each symmetry of the Lagrangian, there is a conserved quantity”. In particular it was shown that translational invariance in a given direction leads to the conservation of linear momentum in that direction, and rotational invariance about an axis leads to conservation of angular momentum about that axis. These are the first-order spatial and angular integrals of the equations of motion. Noether’s theorem also relates the properties of the Hamiltonian to time invariance of the Lagrangian, namely:
(1) $H$ is conserved if, and only if, the Lagrangian, and consequently the Hamiltonian, are not explicit functions of time.
(2) The Hamiltonian gives the total energy if the constraints and coordinate transformations are time independent and the potential energy is velocity independent. This is equivalent to stating that if the constraints, or generalized coordinates, for the system are time independent then $H = E$.
Noether’s theorem is of importance since it underlies the relation between symmetries and invariance in all of physics; that is, its applicability extends beyond classical mechanics.
Generalized momentum:
The generalized momentum associated with the coordinate $q_j$ is defined to be
\[ p_j \equiv \frac{\partial L}{\partial\dot{q}_j} \tag{7.3} \]
where $p_j$ is also called the conjugate momentum (or canonical momentum) to $q_j$, where $q_j, p_j$ are conjugate, or canonical, variables. Remember that the linear momentum $\mathbf{p}$ is the first-order time integral given by equation 2.10. Note that if $q_j$ is not a spatial coordinate, then $p_j$ is not linear momentum, but is the conjugate momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum.
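Kinetic energy in generalized coordinates:
It was shown that the kinetic energy can be expressed in terms of generalized coordinates by
$$T(\mathbf{q},\dot{\mathbf{q}},t) = \sum_{\alpha}\sum_{j,k}\frac{1}{2}m_{\alpha}\frac{\partial \mathbf{r}_{\alpha}}{\partial q_j}\cdot\frac{\partial \mathbf{r}_{\alpha}}{\partial q_k}\,\dot{q}_j\dot{q}_k + \sum_{\alpha}\sum_{j}m_{\alpha}\frac{\partial \mathbf{r}_{\alpha}}{\partial q_j}\cdot\frac{\partial \mathbf{r}_{\alpha}}{\partial t}\,\dot{q}_j + \sum_{\alpha}\frac{1}{2}m_{\alpha}\left(\frac{\partial \mathbf{r}_{\alpha}}{\partial t}\right)^2 \tag{7.19}$$
$$= T_2(\mathbf{q},\dot{\mathbf{q}},t) + T_1(\mathbf{q},\dot{\mathbf{q}},t) + T_0(\mathbf{q},t) \tag{7.53}$$
For scleronomic systems with a potential that is velocity independent, the kinetic energy can be expressed as
$$T = T_2 = \frac{1}{2}\sum_j \dot{q}_j p_j = \frac{1}{2}\,\dot{\mathbf{q}}\cdot\mathbf{p} \tag{7.31}$$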
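Generalized energy
Jacobi’s Generalized Energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ was defined as
$$h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.36}$$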
Hamiltonian function
The Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ was defined in terms of the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ and by introducing the generalized momentum. That is
$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j p_j\dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$
Note that if all the generalized non-potential forces are zero, then the bracket in equation 7.38 is zero, and if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.
Generalized energy and total energy:
The generalized energy, and corresponding Hamiltonian, equal the total energy if:
1) The kinetic energy has a homogeneous quadratic dependence on the generalized velocities and the transformation to generalized coordinates is independent of time, $\partial \mathbf{r}_{\alpha}/\partial t = 0$.
2) The potential energy is not velocity dependent, thus the terms $\partial U/\partial \dot{q}_j = 0$.
Chapter 8 will introduce Hamiltonian mechanics that is built on the Hamiltonian, and chapter 14 will explore applications of Hamiltonian mechanics.
Workshop exercises
1. Consider a particle of mass $m$ moving in a plane and subject to an inverse square attractive force.
2. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term “generalized mechanics” is used.
(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations, and assuming that Hamilton’s principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at the end points, show that the corresponding Lagrange equation is
$$\frac{d^2}{dt^2}\left(\frac{\partial L}{\partial \ddot{q}}\right) - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) + \frac{\partial L}{\partial q} = 0$$
Such equations of motion have interesting applications in chaos theory.
(b) Apply this result to the Lagrangian
$$L = -\frac{m}{2}\,q\ddot{q} - \frac{k}{2}\,q^2$$
Do you recognize the equations of motion?
3. A uniform solid cylinder of radius $R$ and mass $M$ rests on a horizontal plane and an identical cylinder rests on it touching along the top of the first cylinder with the axes of both cylinders parallel. The upper cylinder is given an infinitesimal displacement so that both cylinders roll without slipping in the directions shown by the arrows.
[Figure: the two cylinders shown at t = 0 and at t > 0, with x and y axes.]
4. Consider a diatomic molecule which has a symmetry axis along the line through the center of the two atoms
comprising the molecule. Consider that this molecule is rotating about an axis perpendicular to the symmetry
axis and that there are no external forces acting on the molecule. Use Noether’s Theorem to answer the
following questions:
a) Is the total angular momentum conserved?
b) Is the projection of the total angular momentum along a space-fixed axis conserved?
c) Is the projection of the angular momentum along the symmetry axis of the rotating molecule conserved?
d) Is the projection of the angular momentum perpendicular to the rotating symmetry axis conserved?
5. A bead of mass $m$ slides under gravity along a smooth wire bent in the shape of a parabola $x^2 = az$ in the vertical $(x, z)$ plane.
Problems
1. Let the horizontal plane be the $x$-$y$ plane. A bead of mass $m$ is constrained to slide with speed $v$ along a curve described by the function $y = f(x)$. What force does the curve apply to the bead? (Ignore gravity)
2. Consider the Atwood’s machine shown. The masses are $4m$, $5m$, and $3m$. Let $x$ and $y$ be the heights of the right two masses relative to their initial positions.
a) Solve this problem using the Euler-Lagrange equations.
b) Use Noether’s theorem to find the conserved momentum.
[Figure: Atwood’s machine with masses 4m, 5m and 3m; x and y mark the heights of the 5m and 3m masses.]
3. A cube of side $2b$ and center of mass $C$ is placed on a fixed horizontal cylinder of radius $r$ and center $O$ as shown in the figure. Originally the cube is placed such that $C$ is centered above $O$, but it can roll from side to side without slipping. (a) Assuming that $b < r$, use the Lagrangian approach to find the frequency for small oscillations about the top of the cylinder. For simplicity make the small angle approximation for $L$ before using the Lagrange-Euler equations. (b) What will be the motion if $b > r$? Note that the moment of inertia of the cube about the center of mass is $\frac{2}{3}mb^2$.
[Figure: cube resting on top of the fixed cylinder with center O; labels h and b.]
4. Two equal masses of mass $m$ are glued to a massless hoop of radius $R$ that is free to rotate about its center in a vertical plane. The angle between the masses is $2\theta$, as shown. Find the frequency of oscillations.
5. Three massless sticks, each of length $2r$ and carrying a mass $m$ at its center, are hinged at their ends as shown. The bottom end of the lower stick is hinged at the ground. They are held so that the lower two sticks are vertical, and the upper one is tilted at a small angle $\varepsilon$ with respect to the vertical. They are then released. At the instant of release, what are the three equations of motion derived from the Lagrangian, assuming that $\varepsilon$ is small? Use these to determine the initial angular accelerations of the three sticks.
[Figure: three hinged sticks, each labelled with mass m.]
Chapter 8
Hamiltonian mechanics
8.1 Introduction
The three major formulations of classical mechanics are
1. Newtonian mechanics, which is the most intuitive vector formulation used in classical mechanics.
2. Lagrangian mechanics, which is a powerful algebraic formulation of classical mechanics derived using either d’Alembert’s Principle or Hamilton’s Principle. The latter states “A dynamical system follows a path that minimizes the time integral of the difference between the kinetic and potential energies”.
3. Hamiltonian mechanics, which has a beautiful superstructure that, like Lagrangian mechanics, is built upon variational calculus, Hamilton’s principle, and Lagrangian mechanics.
Hamiltonian mechanics is introduced at this juncture since it is closely interwoven with Lagrangian mechanics. Hamiltonian mechanics plays a fundamental role in modern physics, but the discussion of that role will be deferred until chapters 14 and 17 where applications to modern physics are addressed.
The following important concepts were introduced in chapter 7:
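The generalized momentum was defined to be given by
$$p_j \equiv \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j} \tag{8.1}$$
Note that, as discussed in chapter 7.2, if the potential is velocity dependent, such as for the Lorentz force, then the generalized momentum includes terms in addition to the usual mechanical momentum.
Jacobi’s generalized energy function $h(\mathbf{q},\dot{\mathbf{q}},t)$ was introduced where
$$h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j\left(\dot{q}_j\frac{\partial L}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.2}$$
The Hamiltonian function was defined to be given by expressing the generalized energy function, equation 8.2, in terms of the generalized momentum. That is, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ is expressed as
$$H(\mathbf{q},\mathbf{p},t) = \sum_j p_j\dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.3}$$
The symbols $\mathbf{q}$, $\mathbf{p}$ designate vectors of $n$ generalized coordinates and momenta, $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, $\mathbf{p} \equiv (p_1, p_2, \ldots, p_n)$. Equation 8.3 can be written compactly in a symmetric form using the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_j p_j\dot{q}_j$:
$$H(\mathbf{q},\mathbf{p},t) + L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} \tag{8.4}$$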
A crucial feature of Hamiltonian mechanics is that the Hamiltonian is expressed as $H(\mathbf{q},\mathbf{p},t)$, that is, it is a function of the $n$ generalized coordinates and their conjugate momenta, which are taken to be independent, plus the independent variable, time. This contrasts with the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ which is a function of the $n$ generalized coordinates $q_j$, and the corresponding velocities $\dot{q}_j$, that is the time derivatives of the coordinates $q_j$, plus the independent variable, time.
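$$\mathbf{v} = \nabla_{\mathbf{u}} F(\mathbf{u},\mathbf{w}) \tag{8.5}$$
and where $\mathbf{w}$ designates passive variables. The function $\nabla_{\mathbf{u}} F(\mathbf{u},\mathbf{w})$ is the first-order derivative (gradient) of $F(\mathbf{u},\mathbf{w})$ with respect to the components of the vector $\mathbf{u}$. The Legendre transform states that the inverse formula can always be written as a first-order derivative
$$\mathbf{u} = \nabla_{\mathbf{v}} G(\mathbf{v},\mathbf{w}) \tag{8.6}$$
The relationship between the functions $F(\mathbf{u},\mathbf{w})$ and $G(\mathbf{v},\mathbf{w})$ is symmetrical and each is said to be the Legendre transform of the other.
The general Legendre transform can be used to relate the Lagrangian and Hamiltonian by identifying the active variables $\mathbf{v}$ with $\mathbf{p}$ and $\mathbf{u}$ with $\dot{\mathbf{q}}$, the passive variables $\mathbf{w}$ with $\mathbf{q}, t$, and the corresponding functions $F(\mathbf{u},\mathbf{w}) = L(\dot{\mathbf{q}},\mathbf{q},t)$ and $G(\mathbf{v},\mathbf{w}) = H(\mathbf{p},\mathbf{q},t)$. Thus the generalized momentum (8.1) corresponds to
$$\mathbf{p} = \nabla_{\dot{\mathbf{q}}} L(\dot{\mathbf{q}},\mathbf{q},t)$$
where $(\mathbf{q}, t)$ are the passive variables. Then the Legendre transform states that the transformed variable $\dot{\mathbf{q}}$ is given by the relation
$$\dot{\mathbf{q}} = \nabla_{\mathbf{p}} H(\mathbf{q},\mathbf{p},t) \tag{8.10}$$
Since the functions $L(\mathbf{q},\dot{\mathbf{q}},t)$ and $H(\mathbf{q},\mathbf{p},t)$ are the Legendre transforms of each other, they satisfy the relation
$$H(\mathbf{q},\mathbf{p},t) + L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} \tag{8.11}$$
The function $H(\mathbf{q},\mathbf{p},t)$, which is the Legendre transform of the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$, is called the Hamiltonian function and equation (8.11) is identical to our original definition of the Hamiltonian given by equation (8.3). The variables $\mathbf{q}$ and $t$ are passive variables, thus equation (8.8) gives that
$$\nabla_{\mathbf{q}} L(\dot{\mathbf{q}},\mathbf{q},t) = -\nabla_{\mathbf{q}} H(\mathbf{p},\mathbf{q},t), \qquad \frac{\partial L}{\partial t} = -\frac{\partial H}{\partial t} \tag{8.12}$$
Written in component form equation 8.12 gives the partial derivative relations
$$\frac{\partial L(\dot{\mathbf{q}},\mathbf{q},t)}{\partial q_j} = -\frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial q_j} \tag{8.13}$$
$$\frac{\partial L(\dot{\mathbf{q}},\mathbf{q},t)}{\partial t} = -\frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial t} \tag{8.14}$$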
Note that equations 8.13 and 8.14 are strictly a result of the Legendre transformation. To complete the transformation from Lagrangian to Hamiltonian mechanics it is necessary to invoke the calculus of variations via the Lagrange-Euler equations. The symmetry of the Legendre transform is illustrated by equation 8.11.
Equation 7.31 gives that the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = 2T_2$. For scleronomic systems with velocity-independent potentials $U$, the standard Lagrangian is $L = T - U$ and $H = 2T - T + U = T + U$. Thus, for this simple case, equation 8.11 reduces to the identity $H + L = 2T$.
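The Legendre-transform relation between $L$ and $H$ can be checked numerically. The following is a minimal sketch, not part of the original text, that assumes a one-dimensional free particle with $L = \frac{1}{2}mv^2$ and evaluates the transform as the maximum over $v$ of $pv - L(v)$, which for this convex $L$ coincides with the definition used above. The function names, the search interval and the numerical values are illustrative assumptions.

```python
# Numerical Legendre transform for an assumed free-particle Lagrangian.

def lagrangian(v, m=2.0):
    """L = (1/2) m v^2 (illustrative choice, velocity-only dependence)."""
    return 0.5 * m * v * v

def legendre_transform(p, m=2.0, lo=-10.0, hi=10.0, iters=200):
    """Return (H, v_star) with H = max_v [p*v - L(v)].

    A ternary search is used; it is valid here because p*v - L(v)
    is concave in v for this Lagrangian.
    """
    f = lambda v: p * v - lagrangian(v, m)
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            lo = a
        else:
            hi = b
    v_star = 0.5 * (lo + hi)
    return f(v_star), v_star

m, p = 2.0, 3.0
H, v_star = legendre_transform(p, m)
L = lagrangian(v_star, m)
print(H, p * p / (2.0 * m))   # H agrees with p^2 / (2m)
print(H + L, p * v_star)      # H + L = p * qdot
```

The maximizing velocity satisfies $p = \partial L/\partial v$, so the search recovers $v = p/m$ and the familiar $H = p^2/2m$.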
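$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.16}$$
This gives the corresponding Hamilton equation for the time derivative of $p_j$ to be
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = \dot{p}_j = \frac{\partial L}{\partial q_j} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.17}$$
Substituting equation 8.13 into equation 8.17 leads to the second Hamilton equation of motion
$$\dot{p}_j = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial q_j} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.18}$$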
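One can explore further the implications of Hamiltonian mechanics by taking the time differential of (8.3), giving
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j\left(\dot{q}_j\frac{dp_j}{dt} + p_j\frac{d\dot{q}_j}{dt} - \frac{\partial L}{\partial q_j}\dot{q}_j - \frac{\partial L}{\partial \dot{q}_j}\frac{d\dot{q}_j}{dt}\right) - \frac{\partial L}{\partial t} \tag{8.19}$$
Inserting the conjugate momenta $p_j \equiv \partial L/\partial \dot{q}_j$ and equation 8.17 into equation 8.19 results in
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j\left(\dot{q}_j\dot{p}_j + p_j\frac{d\dot{q}_j}{dt} - \dot{q}_j\left[\dot{p}_j - \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} - Q_j^{EXC}\right] - p_j\frac{d\dot{q}_j}{dt}\right) - \frac{\partial L}{\partial t} \tag{8.20}$$
The second and fourth terms cancel, as well as the $\dot{q}_j\dot{p}_j$ terms, leaving
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j\left(\left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right]\dot{q}_j\right) - \frac{\partial L}{\partial t} \tag{8.21}$$
The chain rule gives the total time derivative of $H(\mathbf{q},\mathbf{p},t)$ as
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j\left(\frac{\partial H}{\partial q_j}\dot{q}_j + \frac{\partial H}{\partial p_j}\dot{p}_j\right) + \frac{\partial H}{\partial t} \tag{8.22}$$
Using equations 8.15 and 8.18 to substitute for $\partial H/\partial p_j$ and $\partial H/\partial q_j$ in equation 8.22 gives
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j\left(\left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right]\dot{q}_j\right) + \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial t} \tag{8.23}$$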
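Note that equation 8.23 must equal the generalized energy theorem, equation 8.21. Therefore,
$$\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{8.24}$$
In summary, Hamilton’s equations of motion are given by
$$\dot{q}_j = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial p_j} \tag{8.25}$$
$$\dot{p}_j = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right] \tag{8.26}$$
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j\left(\left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right]\dot{q}_j\right) - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{8.27}$$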
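The symmetry of Hamilton’s equations of motion is illustrated when the Lagrange multiplier and generalized forces are zero. Then
$$\dot{q}_j = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial p_j} \tag{8.28}$$
$$\dot{p}_j = -\frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial q_j} \tag{8.29}$$
$$\frac{dH(\mathbf{p},\mathbf{q},t)}{dt} = \frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial t} = -\frac{\partial L(\dot{\mathbf{q}},\mathbf{q},t)}{\partial t} \tag{8.30}$$
This simplified form illustrates the symmetry of Hamilton’s equations of motion. Many books present the Hamiltonian only for this simplified case where the system is holonomic and conservative, and generalized coordinates are used.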
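$$x = \rho\cos\phi \tag{8.31}$$
$$y = \rho\sin\phi$$
$$z = z$$
$$\dot{p}_{\rho} = -\frac{\partial H}{\partial \rho} = \frac{p_{\phi}^2}{m\rho^3} - \frac{\partial U}{\partial \rho} \tag{8.38}$$
$$\dot{p}_{\phi} = -\frac{\partial H}{\partial \phi} = -\frac{\partial U}{\partial \phi} \tag{8.39}$$
$$\dot{p}_{z} = -\frac{\partial H}{\partial z} = -\frac{\partial U}{\partial z} \tag{8.40}$$
$$\dot{\rho} = \frac{\partial H}{\partial p_{\rho}} = \frac{p_{\rho}}{m} \tag{8.41}$$
$$\dot{\phi} = \frac{\partial H}{\partial p_{\phi}} = \frac{p_{\phi}}{m\rho^2} \tag{8.42}$$
$$\dot{z} = \frac{\partial H}{\partial p_{z}} = \frac{p_{z}}{m} \tag{8.43}$$
Note that if $\phi$ is cyclic, that is $\partial H/\partial \phi = 0$, then the angular momentum about the $z$ axis, $p_{\phi}$, is a constant of motion. Similarly, if $z$ is cyclic, then $p_z$ is a constant of motion.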
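The Lagrangian is
$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) - U(r,\theta,\phi) \tag{8.45}$$
The conjugate momenta are
$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r} \tag{8.46}$$
$$p_{\theta} = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} \tag{8.47}$$
$$p_{\phi} = \frac{\partial L}{\partial \dot{\phi}} = mr^2\sin^2\theta\,\dot{\phi} \tag{8.48}$$
Assuming a conservative force then $H$ is conserved. Since the transformation from cartesian to generalized spherical coordinates is time independent, then $H = E$. Thus using (8.46 - 8.48) the Hamiltonian is given by
$$H(\mathbf{q},\mathbf{p},t) = \sum_i p_i\dot{q}_i - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.49}$$
$$= \left(p_r\dot{r} + p_{\theta}\dot{\theta} + p_{\phi}\dot{\phi}\right) - \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) + U(r,\theta,\phi) \tag{8.50}$$
$$= \frac{1}{2m}\left(p_r^2 + \frac{p_{\theta}^2}{r^2} + \frac{p_{\phi}^2}{r^2\sin^2\theta}\right) + U(r,\theta,\phi) \tag{8.51}$$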
Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian is a constant of motion.
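Hamilton’s equations give that
$$\dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m}, \qquad -\dot{p}_x = \frac{\partial H}{\partial x} = 0$$
$$\dot{y} = \frac{\partial H}{\partial p_y} = \frac{p_y}{m}, \qquad -\dot{p}_y = \frac{\partial H}{\partial y} = 0$$
$$\dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m}, \qquad -\dot{p}_z = \frac{\partial H}{\partial z} = mg$$
Combining these gives that $\ddot{x} = 0$, $\ddot{y} = 0$, $\ddot{z} = -g$. Note that the linear momenta $p_x$ and $p_y$ are constants of motion whereas the rate of change of $p_z$ is given by the gravitational force $mg$. Note also that $H = T + U$ for this conservative system.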
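The Hamiltonian is
$$H = \sum_i p_i\dot{q}_i - L = p_x\dot{x} - L = \frac{p_x^2}{m} - \frac{p_x^2}{2m} + \frac{1}{2}kx^2 = \frac{p_x^2}{2m} + \frac{1}{2}kx^2$$
Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian will be a constant of motion.
Hamilton’s equations give that
$$\dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m}$$
or
$$p_x = m\dot{x}$$
In addition
$$-\dot{p}_x = \frac{\partial H}{\partial x} = \frac{\partial U}{\partial x} = kx$$
Combining these gives that
$$\ddot{x} + \frac{k}{m}x = 0$$
which is the equation of motion for the harmonic oscillator.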
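Hamilton’s first-order equations for the harmonic oscillator, $\dot{x} = p_x/m$ and $\dot{p}_x = -kx$, lend themselves to direct numerical integration. The sketch below is an illustration only; the leapfrog scheme, the function names and the parameter values are assumptions and not part of the original text. It integrates over one period and compares the final state and energy with the initial ones.

```python
import math

def hamiltonian(x, p, m, k):
    """H = p^2/(2m) + (1/2) k x^2 for the harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * x * x

def leapfrog(x, p, m, k, dt, steps):
    """Integrate xdot = dH/dp = p/m and pdot = -dH/dx = -k x."""
    for _ in range(steps):
        p -= 0.5 * dt * k * x   # half step in momentum
        x += dt * p / m         # full step in position
        p -= 0.5 * dt * k * x   # half step in momentum
    return x, p

m, k = 1.0, 4.0                 # assumed values, omega = sqrt(k/m) = 2
x0, p0 = 1.0, 0.0
period = 2.0 * math.pi / math.sqrt(k / m)
steps = 10000
x1, p1 = leapfrog(x0, p0, m, k, period / steps, steps)

H0 = hamiltonian(x0, p0, m, k)
H1 = hamiltonian(x1, p1, m, k)
print(x1, p1)    # close to the initial state (1, 0) after one period
print(H1 - H0)   # energy change is very small
```

The near-constant value of $H$ reflects the fact that the Lagrangian has no explicit time dependence.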
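$$L = \frac{1}{2}ml^2\dot{\theta}^2 + mgl\cos\theta$$
The momentum conjugate to $\theta$ is
$$p_{\theta} = \frac{\partial L}{\partial \dot{\theta}} = ml^2\dot{\theta}$$
which is the angular momentum about the pivot point.
The Hamiltonian is
$$H = \sum_i p_i\dot{q}_i - L = p_{\theta}\dot{\theta} - L = \frac{1}{2}ml^2\dot{\theta}^2 - mgl\cos\theta = \frac{p_{\theta}^2}{2ml^2} - mgl\cos\theta$$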
horizontal axis when $|E| > mgl$, that is, the pendulum swings around a circle continuously, i.e. it rotates continuously in one direction about the horizontal axis. The phase change occurs at $E = mgl$ and is designated by the separatrix trajectory.
The plot of $p_{\theta}$ versus $\theta$ for the plane pendulum is better presented on a cylindrical phase space representation since $\theta$ is a cyclic variable that cycles around the cylinder, whereas $p_{\theta}$ oscillates equally about zero having both positive and negative values. When wrapped around a cylinder then the unstable and stable equilibrium points will be at diametrically opposite locations on the surface of the cylinder at $p_{\theta} = 0$. For small oscillations about equilibrium, also called librations, the correlation between $p_{\theta}$ and $\theta$ is given by the clockwise closed ellipses wrapped on the cylindrical surface, whereas for energies $|E| > mgl$ the positive $p_{\theta}$ corresponds to counterclockwise rotations while the negative $p_{\theta}$ corresponds to clockwise rotations.
[Figure: Phase-space diagrams for the plane pendulum. The separatrix (bold line) separates the oscillatory solutions from the rolling solutions. The upper (a) shows one complete cycle while the lower (b) shows two complete cycles.]
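The division of the pendulum phase space by the separatrix can be expressed as a simple energy test. The sketch below assumes the plane-pendulum Hamiltonian $H = p_{\theta}^2/(2ml^2) - mgl\cos\theta$, with $l$ the pendulum length, and classifies a phase-space point as a libration, a rotation, or a point on the separatrix $E = mgl$. The function names, default parameter values and tolerance are illustrative assumptions.

```python
import math

def pendulum_energy(theta, p_theta, m=1.0, l=1.0, g=9.81):
    """E = p_theta^2 / (2 m l^2) - m g l cos(theta)."""
    return p_theta**2 / (2.0 * m * l**2) - m * g * l * math.cos(theta)

def classify(theta, p_theta, m=1.0, l=1.0, g=9.81, tol=1e-12):
    """Classify a phase-space point relative to the separatrix E = m g l."""
    e = pendulum_energy(theta, p_theta, m, l, g)
    e_sep = m * g * l
    if abs(e - e_sep) <= tol:
        return "separatrix"
    return "libration" if e < e_sep else "rotation"

print(classify(0.1, 0.0))       # small swing about the stable equilibrium
print(classify(0.0, 7.0))       # enough momentum to go over the top
print(classify(math.pi, 0.0))   # at rest at the unstable equilibrium
```

The sign of $p_{\theta}$ then distinguishes the two senses of rotation above the separatrix.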
$$x^2 + y^2 = R^2$$
$$\mathbf{F} = -k\mathbf{r}$$
the potential is the same as for the harmonic oscillator, that is
$$U = \frac{1}{2}kr^2 = \frac{1}{2}k\left(\rho^2 + z^2\right)$$
This is independent of $\phi$ and thus $\phi$ is cyclic.
[Figure: Mass attracted to origin by force proportional to distance from origin with the motion constrained to the surface of a cylinder.]
In cylindrical coordinates the velocity is
$$v^2 = \dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2$$
$$\rho = R$$
$$\dot{\rho} = 0$$
= ̇ + ̇ + ̇ −
1 ³ 2 2
´ 1
= ̇ + 2 ̇ + ̇ 2 − + 2 ̇
2 2
µ ¶2
2 1 1 2 2
= + + + −
2 22 2 2
" µ ¶2 #
1 2 1 2
= + + + −
2 2
Note that the Hamiltonian is not an explicit function of time, therefore it is a constant of motion which
equals the total energy. " #
µ ¶2
1 2 1 2
= + + + − =
2 2
Since ̇ = −
and if is not an explicit function of then ̇ = 0 that is, is a constant of motion.
Thus and are constants of motion.
Consider the initial conditions = ̇ = ̇ = ̇ = 0. Then
1 1
= = 2 ̇ − 2 = − 2
̇ 2 2
= 0
" µ ¶2 #
1 2 1 2 ln( )
= + + + + 0 = 0
2 2 ln( )
Note that at = then is given by the last equation since the Hamiltonian equals a constant 0 . That
is, assuming that then
1
2 = 20 − ( )2
2
Define a critical magnetic field by r
2 20
≡
then
¡ 2¢ ¡ ¢ 1
= = 2 − 2 ( )2
2
Note that if then is real at = . However, if then is imaginary at =
implying that there must be a maximum orbit radius 0 for the electron where 0 . That is, the electron
trajectories are confined spatially to coaxial cylindrical orbits concentric with the magnetron electromagnetic
fields. These closed electron trajectories excite the microwave cavities located in the nearby outer cylindrical
wall of the anode.
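$$H(q_1,\ldots,q_n;\,p_1,\ldots,p_n;\,t) = \sum_{i=1}^{n}p_i\dot{q}_i - L = \sum_{cyclic}p_i\dot{q}_i + \sum_{noncyclic}p_i\dot{q}_i - L \tag{8.58}$$
Routh’s clever idea was to define a new function, called the Routhian, that includes only one of the two partitions of the kinetic energy terms. This makes the Routhian a Hamiltonian for the coordinates for which the kinetic energy terms are included, while the Routhian acts like a negative Lagrangian for the coordinates where the kinetic energy term is omitted. This book defines two Routhians.
$$R_{cyclic}(q_1,\ldots,q_n;\,\dot{q}_1,\ldots,\dot{q}_s;\,p_{s+1},\ldots,p_n;\,t) \equiv \sum_{cyclic}p_i\dot{q}_i - L \tag{8.59}$$
$$R_{noncyclic}(q_1,\ldots,q_n;\,p_1,\ldots,p_s;\,\dot{q}_{s+1},\ldots,\dot{q}_n;\,t) \equiv \sum_{noncyclic}p_i\dot{q}_i - L \tag{8.60}$$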
The first Routhian, called $R_{cyclic}$, includes the kinetic energy terms only for the cyclic variables, and behaves like a Hamiltonian for the cyclic variables and like a Lagrangian for the non-cyclic variables. The second Routhian, called $R_{noncyclic}$, includes the kinetic energy terms for only the non-cyclic variables, and behaves like a Hamiltonian for the non-cyclic variables and like a negative Lagrangian for the cyclic variables. These two Routhians complement each other in that they make the Routhian either a Hamiltonian for the cyclic variables, or the converse where the Routhian is a Hamiltonian for the non-cyclic variables. The Routhians use $(q_i, \dot{q}_i)$ to denote those coordinates for which the Routhian behaves like a Lagrangian, and $(q_i, p_i)$ for those coordinates where the Routhian behaves like a Hamiltonian. For uniformity, it is assumed that the degrees of freedom between $1 \le i \le s$ are non-cyclic, while those between $s+1 \le i \le n$ are ignorable cyclic coordinates.
The Routhian is a hybrid of Lagrangian and Hamiltonian mechanics. Some textbooks minimize discussion
of the Routhian on the grounds that this hybrid approach is not fundamental. However, the Routhian is
used extensively in engineering in order to derive the equations of motion for rotating systems. In addition
it is used when dealing with rotating nuclei in nuclear physics, rotating molecules in molecular physics, and
rotating galaxies in astrophysics. The Routhian reduction technique provides a powerful way to calculate
the intrinsic properties for a rotating system in the rotating frame of reference. The Routhian approach is
included in this textbook because it plays an important role in practical applications of rotating systems, plus
it nicely illustrates the relative advantages of the Lagrangian and Hamiltonian formulations in mechanics.
8.6. ROUTHIAN REDUCTION 211
The first two terms on the right can be combined to give the Hamiltonian $H_{cyclic}$ for only the cyclic variables, $i = s+1, s+2, \ldots, n$, that is
$$R_{cyclic}(q_1,\ldots,q_n;\,\dot{q}_1,\ldots,\dot{q}_s;\,p_{s+1},\ldots,p_n;\,t) = H_{cyclic} - L_{noncyclic} \tag{8.63}$$
The Routhian $R_{cyclic}(q_1,\ldots,q_n;\,\dot{q}_1,\ldots,\dot{q}_s;\,p_{s+1},\ldots,p_n;\,t)$ also can be written in an alternate form
$$R_{cyclic}(q_1,\ldots,q_n;\,\dot{q}_1,\ldots,\dot{q}_s;\,p_{s+1},\ldots,p_n;\,t) \equiv \sum_{cyclic}p_i\dot{q}_i - L = \sum_{i=1}^{n}p_i\dot{q}_i - L - \sum_{noncyclic}p_i\dot{q}_i \tag{8.64}$$
$$= H - \sum_{noncyclic}p_i\dot{q}_i \tag{8.65}$$
which is expressed as the complete Hamiltonian minus the kinetic energy term for the noncyclic coordinates. The Routhian $R_{cyclic}$ behaves like a Hamiltonian for the cyclic coordinates and behaves like a negative Lagrangian $L_{noncyclic}$ for all the $s = n - m$ noncyclic coordinates $i = 1, 2, \ldots, s$. Thus the equations of motion for the $s$ non-cyclic variables are given using Lagrange’s equations of motion, while the Routhian behaves like a Hamiltonian $H_{cyclic}$ for the $m$ ignorable cyclic variables $i = s+1, \ldots, n$.
Ignoring both the Lagrange multiplier and generalized forces, the partitioned equations of motion for the non-cyclic and cyclic generalized coordinates are given in Table 8.1.
Table 8.1: Equations of motion for the Routhian $R_{cyclic}$
              | Lagrange equations        | Hamilton equations
Coordinates   | Noncyclic: 1 ≤ i ≤ s      | Cyclic: (s + 1) ≤ i ≤ n
Thus there are $m$ cyclic (ignorable) coordinates $(q,p)_{s+1},\ldots,(q,p)_n$ which obey Hamilton’s equations of motion, while the first $s = n - m$ non-cyclic (non-ignorable) coordinates $(q,\dot{q})_1,\ldots,(q,\dot{q})_s$ for $i = 1, 2, \ldots, s$ obey Lagrange equations. The solution for the cyclic variables is trivial since they are constants of motion, and thus the Routhian has reduced the number of equations of motion that must be solved from $n$ to the $s = n - m$ non-cyclic variables. This Routhian provides an especially useful way to reduce the number of equations of motion for rotating systems.
Note that there are several definitions used to define the Routhian, for example some books define this
Routhian as being the negative of the definition used here so that it corresponds to a positive Lagrangian.
However, this sign usually cancels when deriving the equations of motion, thus the sign convention is unim-
portant if a consistent sign convention is used.
This Routhian behaves like a Hamiltonian for the non-cyclic variables, which are expressed in terms of $q_i$ and $p_i$ appropriate for a Hamiltonian. This Routhian writes the cyclic coordinates in terms of $q_i$ and $\dot{q}_i$, appropriate for a Lagrangian, which are treated assuming the Routhian $R_{noncyclic}$ is a negative Lagrangian for these cyclic variables as summarized in table 8.2.
This non-cyclic Routhian is especially useful since it equals the Hamiltonian for the non-cyclic variables, that is, the kinetic energy for motion of the cyclic variables has been removed. Note that since the cyclic variables are constants of motion, then $R_{noncyclic}$ is a constant of motion if $H$ is a constant of motion. However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time dependent, that is, $R_{noncyclic}$ corresponds to the energy of the non-cyclic parts of the motion. For example, when used to describe rotational motion, $R_{noncyclic}$ corresponds to the energy in the non-inertial rotating body-fixed frame of reference. This is especially useful in treating rotating systems such as rotating galaxies, rotating machinery, molecules, or rotating strongly-deformed nuclei as discussed in chapter 10.9.
The Lagrangian and Hamiltonian are the fundamental algebraic approaches to classical mechanics. The Routhian reduction method is a valuable hybrid technique that exploits a trick to reduce the number of variables that have to be solved for complicated problems encountered in science and engineering. The Routhian $R_{noncyclic}$ provides the most useful approach for solving the equations of motion for rotating molecules, deformed nuclei, or astrophysical objects in that it gives the Hamiltonian in the non-inertial body-fixed rotating frame of reference ignoring the rotational energy of the frame. By contrast, the cyclic Routhian $R_{cyclic}$ is especially useful to exploit Lagrangian mechanics for solving problems in rigid-body rotation such as the Tippe Top described in example 11.14.
Note that the Lagrangian, Hamiltonian, plus both the $R_{cyclic}$ and $R_{noncyclic}$ Routhians, all are scalars under rotation, that is, they are rotationally invariant. However, they may be expressed in terms of the coordinates in either the stationary or a rotating frame. The major difference is that the Routhian includes only subsets of the kinetic energy term $\sum_j p_j\dot{q}_j$. The relative merits of using Lagrangian, Hamiltonian, and both the $R_{cyclic}$ and $R_{noncyclic}$ Routhian reduction methods are illustrated by the following examples.
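Taking the time derivative of the equation for $\dot{\theta}$, and using the equation for $\dot{p}_{\theta}$ to substitute for $\dot{p}_{\theta}$, gives that
$$\ddot{\theta} - \frac{p_{\phi}^2\cos\theta}{m^2b^4\sin^3\theta} + \frac{g}{b}\sin\theta = 0$$
Note that the Hamilton equation for $\dot{p}_{\phi}$ shows that $\phi$ is a cyclic coordinate. Thus
$$p_{\phi} = mb^2\sin^2\theta\,\dot{\phi} = \text{constant}$$
that is, the angular momentum about the vertical axis is conserved. Note that although $p_{\phi}$ is a constant of motion, $\dot{\phi} = \frac{p_{\phi}}{mb^2\sin^2\theta}$ is a function of $\theta$ and thus in general it is not conserved. There are various solutions depending on the initial conditions. If $p_{\phi} = 0$ then the pendulum is just the simple pendulum discussed previously that can oscillate, or rotate, in the $\theta$ direction. The opposite extreme is where $p_{\theta} = 0$, where the pendulum rotates in the $\phi$ direction with constant $\theta$. In general the motion is a complicated coupling of the $\theta$ and $\phi$ motions.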
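$$p_{\phi} = \frac{\partial L}{\partial \dot{\phi}} = mb^2\sin^2\theta\,\dot{\phi}$$
The Routhian $R_{cyclic}(\theta,\dot{\theta},\phi,p_{\phi})$ behaves like a Hamiltonian for $\phi$ and like a Lagrangian $L' = -R_{cyclic}$ for $\theta$. Use of Hamilton’s canonical equations for $\phi$ gives
$$\dot{\phi} = \frac{\partial R_{cyclic}}{\partial p_{\phi}} = \frac{p_{\phi}}{mb^2\sin^2\theta}$$
$$-\dot{p}_{\phi} = \frac{\partial R_{cyclic}}{\partial \phi} = 0$$
These two equations show that $p_{\phi}$ is a constant of motion given by
$$p_{\phi} = mb^2\sin^2\theta\,\dot{\phi} = \text{constant}$$
Note that the Hamiltonian only includes the kinetic energy for the $\phi$ motion which is a constant of motion, but this energy does not equal the total energy. This is what is predicted by Noether’s theorem due to the symmetry of the Lagrangian about the vertical $\phi$ axis.
Since $R_{cyclic}(\theta,\dot{\theta},\phi,p_{\phi})$ behaves like a Lagrangian for $\theta$, the Lagrange equation for $\theta$ is
$$\Lambda_{\theta}R_{cyclic} = \frac{d}{dt}\frac{\partial R_{cyclic}}{\partial \dot{\theta}} - \frac{\partial R_{cyclic}}{\partial \theta} = 0$$
where the negative sign of the Lagrangian in $R_{cyclic}(\theta,\dot{\theta},\phi,p_{\phi})$ cancels. This leads to
$$mb^2\ddot{\theta} = \frac{p_{\phi}^2\cos\theta}{mb^2\sin^3\theta} - mgb\sin\theta$$
that is
$$\ddot{\theta} - \frac{p_{\phi}^2\cos\theta}{m^2b^4\sin^3\theta} + \frac{g}{b}\sin\theta = 0$$
This result is identical to the one obtained using Lagrangian mechanics in example 6.12 and Hamiltonian mechanics given in example 8.6. The Routhian $R_{cyclic}$ simplified the problem to one degree of freedom $\theta$ by absorbing into the Hamiltonian the cyclic, that is, ignorable, coordinate $\phi$ and its conserved conjugate momentum $p_{\phi}$. Note that the central term in this equation of motion is the centrifugal term which is due to rotation about the vertical axis. This term is zero for plane pendulum motion when $p_{\phi} = 0$.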
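The role of the centrifugal term can be illustrated with the steady conical-pendulum solution, for which $\theta$ is constant. The sketch below assumes the reduced equation of motion $\ddot{\theta} = \frac{p_{\phi}^2\cos\theta}{m^2b^4\sin^3\theta} - \frac{g}{b}\sin\theta$, with $b$ the pendulum length and $\theta$ measured from the downward vertical, and checks that $\ddot{\theta}$ vanishes when $\dot{\phi}^2 = g/(b\cos\theta)$. The function names and numerical values are illustrative assumptions.

```python
import math

def theta_ddot(theta, p_phi, m, b, g):
    """Angular acceleration from the reduced one-dimensional equation:
    centrifugal term minus the gravitational restoring term."""
    centrifugal = p_phi**2 * math.cos(theta) / (m**2 * b**4 * math.sin(theta)**3)
    return centrifugal - (g / b) * math.sin(theta)

def conical_p_phi(theta0, m, b, g):
    """p_phi = m b^2 sin^2(theta0) phi_dot for steady conical motion,
    using phi_dot^2 = g / (b cos(theta0))."""
    phi_dot = math.sqrt(g / (b * math.cos(theta0)))
    return m * b**2 * math.sin(theta0)**2 * phi_dot

m, b, g = 0.5, 2.0, 9.81            # assumed values
theta0 = math.radians(30.0)
p_phi = conical_p_phi(theta0, m, b, g)

print(theta_ddot(theta0, p_phi, m, b, g))   # close to zero: steady conical motion
print(theta_ddot(theta0, 0.0, m, b, g))     # p_phi = 0: plane pendulum, -(g/b) sin(theta0)
```

With $p_{\phi} = 0$ only the gravitational term survives, which recovers the plane pendulum.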
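$$R_{noncyclic} = \frac{p_{\theta}^2}{2mb^2} + \frac{p_{\phi}^2}{2mb^2\sin^2\theta} - mgb\cos\theta - p_{\phi}\dot{\phi}$$
$$= \frac{p_{\theta}^2}{2mb^2} - \frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2 - mgb\cos\theta$$
This behaves like a negative Lagrangian for $\phi$ and a Hamiltonian for $\theta$. The conjugate momenta are
$$p_{\phi} = \frac{\partial L}{\partial \dot{\phi}} = -\frac{\partial R_{noncyclic}}{\partial \dot{\phi}} = mb^2\sin^2\theta\,\dot{\phi}$$
$$\dot{p}_{\phi} = \frac{\partial L}{\partial \phi} = -\frac{\partial R_{noncyclic}}{\partial \phi} = 0$$
that is, $p_{\phi}$ is a constant of motion.
Hamilton’s equations of motion give
$$\dot{\theta} = \frac{\partial R_{noncyclic}}{\partial p_{\theta}} = \frac{p_{\theta}}{mb^2}$$
$$-\dot{p}_{\theta} = \frac{\partial R_{noncyclic}}{\partial \theta} = -\frac{p_{\phi}^2\cos\theta}{mb^2\sin^3\theta} + mgb\sin\theta$$
The first of these equations gives that
$$\frac{d\dot{\theta}}{dt} = \ddot{\theta} = \frac{\dot{p}_{\theta}}{mb^2}$$
Inserting this into the second equation gives
$$\ddot{\theta} - \frac{p_{\phi}^2\cos\theta}{m^2b^4\sin^3\theta} + \frac{g}{b}\sin\theta = 0$$
which is identical to the equation of motion derived using $R_{cyclic}$. The Hamiltonian in the rotating frame is a constant of motion given by $R_{noncyclic}$ but it does not include the total energy.
Note that these examples show that both forms of the Routhian, as well as the complete Lagrangian formalism, shown in example 6.12, and the complete Hamiltonian formalism, shown in example 8.6, all give the same equations of motion. This illustrates that Lagrangian, Hamiltonian, and Routhian mechanics all give the same equations of motion, and this applies both in the static inertial frame and in a rotating frame since the Lagrangian, Hamiltonian and Routhian all are scalars under rotation, that is, they are rotationally invariant.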
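8.9 Example: Single particle moving in a vertical plane under the influence of an inverse-square central force
The Lagrangian for a single particle of mass $m$, moving in a vertical plane and subject to an inverse-square central force, is specified by two generalized coordinates, $r$ and $\theta$:
$$L = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + \frac{k}{r}$$
The ignorable coordinate is $\theta$ since it is cyclic. Let the constant conjugate momentum be denoted by $p_{\theta} = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta}$. Then the corresponding cyclic Routhian is
$$R_{cyclic}(r,\dot{r},p_{\theta}) = p_{\theta}\dot{\theta} - L = \frac{p_{\theta}^2}{2mr^2} - \frac{1}{2}m\dot{r}^2 - \frac{k}{r}$$
This Routhian is the equivalent one-dimensional potential $U(r)$ minus the kinetic energy of radial motion.
Applying Hamilton’s equation to the cyclic coordinate $\theta$ gives
$$\dot{p}_{\theta} = 0, \qquad \dot{\theta} = \frac{p_{\theta}}{mr^2}$$
implying a solution
$$p_{\theta} = mr^2\dot{\theta} = l$$
where the angular momentum $l$ is a constant.
The Lagrange-Euler equation can be applied to the non-cyclic coordinate $r$
$$\Lambda_r R_{cyclic} = \frac{d}{dt}\frac{\partial R_{cyclic}}{\partial \dot{r}} - \frac{\partial R_{cyclic}}{\partial r} = 0$$
where the negative sign of $R_{cyclic}$ cancels. This leads to the radial solution
$$m\ddot{r} - \frac{p_{\theta}^2}{mr^3} + \frac{k}{r^2} = 0$$
where $p_{\theta} = l$, which is a constant of motion in the centrifugal term. Thus the problem has been reduced to a one-dimensional problem in radius $r$ that is in a rotating frame of reference.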
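The reduced one-dimensional radial problem can be explored numerically. The sketch below assumes the radial equation $m\ddot{r} = p_{\theta}^2/(mr^3) - k/r^2$ with constant $p_{\theta}$, checks that the radial acceleration vanishes on the circular orbit $r = p_{\theta}^2/(mk)$, and integrates a slightly non-circular orbit while monitoring the one-dimensional energy $\frac{1}{2}m\dot{r}^2 + \frac{p_{\theta}^2}{2mr^2} - \frac{k}{r}$. The velocity-Verlet integrator, the function names and the parameter values are illustrative assumptions.

```python
def radial_accel(r, p_theta, m, k):
    """r'' from m r'' = p_theta^2 / (m r^3) - k / r^2."""
    return (p_theta**2 / (m * r**3) - k / r**2) / m

def radial_energy(r, r_dot, p_theta, m, k):
    """Radial kinetic energy plus centrifugal and inverse-square potential terms."""
    return 0.5 * m * r_dot**2 + p_theta**2 / (2.0 * m * r**2) - k / r

def integrate(r, r_dot, p_theta, m, k, dt, steps):
    """Velocity-Verlet integration of the one-dimensional radial motion."""
    a = radial_accel(r, p_theta, m, k)
    for _ in range(steps):
        r += r_dot * dt + 0.5 * a * dt * dt
        a_new = radial_accel(r, p_theta, m, k)
        r_dot += 0.5 * (a + a_new) * dt
        a = a_new
    return r, r_dot

m, k, p_theta = 1.0, 1.0, 1.0          # assumed values
r_circ = p_theta**2 / (m * k)          # radius of the circular orbit
print(radial_accel(r_circ, p_theta, m, k))   # zero on the circular orbit

e0 = radial_energy(1.2, 0.0, p_theta, m, k)
r1, v1 = integrate(1.2, 0.0, p_theta, m, k, 1e-3, 5000)
e1 = radial_energy(r1, v1, p_theta, m, k)
print(e1 - e0)                         # energy change is very small
```

The angle $\theta$ never has to be integrated to obtain $r(t)$; it can be recovered afterwards from $\dot{\theta} = p_{\theta}/(mr^2)$.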
Note that since the drag force is dissipative the dominant component of the drag force must point in the
opposite direction to the velocity vector. For example, for a simple linear velocity dependence the generalized
drag force could be of the form = −̄
8.7. DISSIPATIVE DYNAMICAL SYSTEMS 217
Lagrangian mechanics
Consider $n$ equations of motion for the $n$ degrees of freedom, and assume that the dissipation depends linearly on velocity. Then, allowing all possible cross coupling of the equations of motion for $q_j$, the equations of motion can be written in the form
$$\sum_{j=1}^{n}\left[m_{ij}\ddot{q}_j + b_{ij}\dot{q}_j + c_{ij}q_j\right] - Q_i(t) = 0 \tag{8.70}$$
Multiplying equation 8.70 by $\dot{q}_i$, taking the time integral, and summing over $i$ gives the following energy equation
$$\sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t m_{ij}\ddot{q}_j\dot{q}_i\,dt + \sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t b_{ij}\dot{q}_j\dot{q}_i\,dt + \sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t c_{ij}q_j\dot{q}_i\,dt = \sum_{i=1}^{n}\int_0^t Q_i(t)\dot{q}_i\,dt \tag{8.71}$$
The right-hand term is the total energy supplied to the system by the external generalized forces $Q_i(t)$ during the time $t$. The first time-integral term on the left-hand side is the total kinetic energy, while the third integral term equals the potential energy. The second integral term on the left equals the time integral of $2\mathcal{F}$, where the Rayleigh dissipation function $\mathcal{F}$ is defined as
$$\mathcal{F} \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}b_{ij}\dot{q}_i\dot{q}_j \tag{8.72}$$
and the summations are over all $n$ particles of the system. This definition allows for complicated cross-coupling effects between the $n$ particles. Fortunately the particle-particle coupling effects usually can be neglected, allowing use of the simpler definition that includes only the diagonal terms. Then the diagonal form of the Rayleigh dissipation function can be written as
$$\mathcal{F} \equiv \frac{1}{2}\sum_{i=1}^{n}b_i\dot{q}_i^2 \tag{8.73}$$
which is the rate of energy (power) loss due to the dissipative forces involved. The same relation is obtained
after summing over all the particles involved.
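Transforming the frictional force into generalized coordinates requires the relation
$$\dot{\mathbf{r}}_i = \sum_j\frac{\partial \mathbf{r}_i}{\partial q_j}\dot{q}_j + \frac{\partial \mathbf{r}_i}{\partial t} \tag{8.78}$$
Using equations 6.17 and 6.47, the $j$ component of the generalized frictional force $Q_j^f$ is given by
$$Q_j^f = \sum_{i=1}^{n}\mathbf{F}_i^f\cdot\frac{\partial \mathbf{r}_i}{\partial q_j} = \sum_{i=1}^{n}\mathbf{F}_i^f\cdot\frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = -\sum_{i=1}^{n}\nabla_{\mathbf{v}_i}\mathcal{F}\cdot\frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = -\frac{\partial \mathcal{F}}{\partial \dot{q}_j} \tag{8.80}$$
Thus the Lagrange equations 6.47 can be written including the Rayleigh dissipation function in the form
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] - \frac{\partial \mathcal{F}}{\partial \dot{q}_j} \tag{8.81}$$
where $Q_j^{EXC}$ corresponds to the generalized forces remaining after removal of the generalized linear, velocity-dependent, frictional force $Q_j^f$, and the holonomic forces of constraint are absorbed into the Lagrange multiplier term.
Linear dissipative forces can be directly, and elegantly, included in Lagrangian mechanics by use of Rayleigh’s dissipation function. Equation 8.81 facilitates solving the equations of motion when linear velocity-dependent dissipative forces are acting on the system.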
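The energy bookkeeping implied by the Rayleigh dissipation function can be checked for a single damped oscillator. The sketch below assumes $L = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2$ and $\mathcal{F} = \frac{1}{2}b\dot{x}^2$, so that the frictional force is $-\partial\mathcal{F}/\partial\dot{x} = -b\dot{x}$, and verifies numerically that the loss of mechanical energy equals the time integral of $2\mathcal{F}$. The fourth-order Runge-Kutta integrator, the function names and the parameter values are illustrative assumptions.

```python
def rayleigh(v, b):
    """Rayleigh dissipation function F = (1/2) b v^2 for one coordinate."""
    return 0.5 * b * v * v

def deriv(state, m, k, b):
    """Time derivatives of (x, v, w); w accumulates dissipated energy at rate 2F."""
    x, v, w = state
    return (v, (-k * x - b * v) / m, 2.0 * rayleigh(v, b))

def rk4_step(state, dt, m, k, b):
    """One classical fourth-order Runge-Kutta step."""
    def shift(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = deriv(state, m, k, b)
    k2 = deriv(shift(state, k1, 0.5 * dt), m, k, b)
    k3 = deriv(shift(state, k2, 0.5 * dt), m, k, b)
    k4 = deriv(shift(state, k3, dt), m, k, b)
    return tuple(s + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
                 for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4))

def energy(x, v, m, k):
    """Mechanical energy T + U of the oscillator."""
    return 0.5 * m * v * v + 0.5 * k * x * x

m, k, b = 1.0, 4.0, 0.3             # assumed values
state = (1.0, 0.0, 0.0)             # x, v, dissipated energy
e0 = energy(state[0], state[1], m, k)
for _ in range(5000):
    state = rk4_step(state, 1e-3, m, k, b)
e1 = energy(state[0], state[1], m, k)
dissipated = state[2]
print(e0 - e1, dissipated)          # the two numbers agree closely
```

The agreement illustrates that $2\mathcal{F}$ is the instantaneous rate of energy dissipation for linear damping.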
Hamiltonian mechanics
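If the nonconservative forces depend linearly on velocity, and are derivable from Rayleigh’s dissipation function according to equation 8.81, then using the definition of generalized momentum gives
$$\dot{p}_j = \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = \frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] - \frac{\partial \mathcal{F}}{\partial \dot{q}_j} \tag{8.82}$$
$$\dot{p}_j = -\frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] - \frac{\partial \mathcal{F}}{\partial \dot{q}_j} \tag{8.83}$$
The Rayleigh dissipation function provides an elegant and convenient way to account for the frequently encountered special case of linear dissipative forces in Lagrangian and Hamiltonian mechanics. The following two examples illustrate the usefulness of the Rayleigh dissipation function when applied to both classical mechanics and electromagnetism.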
gives
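These two coupled equations can be decoupled and simplified by making a transformation to normal coordinates $\eta_1$, $\eta_2$ where
$$\eta_1 = x_1 - x_2, \qquad \eta_2 = x_1 + x_2$$
Thus
$$x_1 = \frac{1}{2}\left(\eta_1 + \eta_2\right), \qquad x_2 = \frac{1}{2}\left(\eta_2 - \eta_1\right)$$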
Insert these into the equations of motion gives
Add and subtract these two equations gives the following two decoupled equations
( + 20 ) 0
̈ 1 + ̇1 + 1 = cos ()
0
̈2 + ̇ 2 + 2 = cos ()
q p
(+20 ) 0
Define Γ = 1 = 2 = = . Then the two independent equations of motion become
This solution is a superposition of two independent, linearly-damped, driven normal modes $\eta_1$ and $\eta_2$ that have different natural frequencies $\omega_1$ and $\omega_2$. For weak damping these two driven normal modes each undergo damped oscillatory motion, with the $\eta_1$ and $\eta_2$ normal modes exhibiting resonances at $\omega_1' = \sqrt{\omega_1^2 - 2\left(\frac{\Gamma}{2}\right)^2}$ and $\omega_2' = \sqrt{\omega_2^2 - 2\left(\frac{\Gamma}{2}\right)^2}$.
Thus the total magnetic energy $W_{mag}$, which is analogous to kinetic energy $T$, is given by summing over all $n$ circuits to be
$$T = W_{mag} = \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}M_{jk}\dot{q}_j\dot{q}_k$$
Similarly the electrical energy $W_{elect}$ stored in the mutual capacitance $C_{jk}$ between the $n$ circuits, which is analogous to potential energy $U$, is given by
$$U = W_{elect} = \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}\frac{q_jq_k}{C_{jk}}$$
Assuming that Ohm’s Law is obeyed, that is, the dissipation force depends linearly on velocity, then the Rayleigh dissipation function can be written in the form
$$\mathcal{F} \equiv \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}R_{jk}\dot{q}_j\dot{q}_k$$
where $R_{jk}$ is the resistance matrix, taken here to be symmetric. Thus the dissipation force, expressed in volts, is given by
$$Q_j^f = -\frac{\partial \mathcal{F}}{\partial \dot{q}_j} = -\sum_{k=1}^{n}R_{jk}\dot{q}_k$$
Inserting these expressions for $T$, $U$ and $\mathcal{F}$ into equation 8.81, plus making the assumption that an additional generalized electrical force $Q_j = \xi_j(t)$ volts is acting on circuit $j$, then the Euler-Lagrange equations give the following equations of motion.
$$\sum_{k=1}^{n}\left[M_{jk}\ddot{q}_k + R_{jk}\dot{q}_k + \frac{q_k}{C_{jk}}\right] = \xi_j(t)$$
This is a generalized version of Kirchhoff’s loop rule, which can be seen by considering the case where the diagonal term $j = k$ is the only non-zero term. Then
$$\left[M_{jj}\ddot{q}_j + R_{jj}\dot{q}_j + \frac{q_j}{C_{jj}}\right] = \xi_j(t)$$
This sum of the voltages is identical to the usual expression for Kirchhoff’s loop rule. This example illustrates the power of variational methods when applied to fields beyond classical mechanics.
8.8 Summary
Hamilton’s equations of motion
Inserting the generalized momentum into Jacobi’s generalized energy relation was used to define the Hamiltonian function to be
$$H(\mathbf{q},\mathbf{p},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.3}$$
The Legendre transform of the Lagrange-Euler equations led to Hamilton’s equations of motion.
$$\dot{q}_j = \frac{\partial H}{\partial p_j} \tag{8.25}$$
$$\dot{p}_j = -\frac{\partial H}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right] \tag{8.26}$$
where
$$\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{8.24}$$
The $q_j, p_j$ are treated as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics. Hamilton’s equations give $2n$ first-order differential equations for $q_j, p_j$ for each of the $n$ degrees of freedom. Lagrange’s equations give $n$ second-order differential equations for the variables $q_j, \dot{q}_j$.
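To illustrate how the pair of first-order Hamilton equations is used in practice, here is a minimal sketch that integrates them for a one-dimensional harmonic oscillator with no constraint or generalized forces. The oscillator, its mass and spring constant, and the leapfrog scheme are all assumptions chosen for illustration; they are not taken from the text.

```python
import math

MASS = 1.0     # assumed oscillator mass
SPRING = 4.0   # assumed spring constant

def hamiltonian(q, p):
    """H(q, p) = p^2/(2m) + k q^2/2 for a one-dimensional harmonic oscillator."""
    return p * p / (2.0 * MASS) + 0.5 * SPRING * q * q

def dH_dq(q):
    return SPRING * q

def dH_dp(p):
    return p / MASS

def leapfrog_step(q, p, dt):
    """Advance (q, p) by dt using q' = dH/dp and p' = -dH/dq (kick-drift-kick)."""
    p_half = p - 0.5 * dt * dH_dq(q)
    q_new = q + dt * dH_dp(p_half)
    p_new = p_half - 0.5 * dt * dH_dq(q_new)
    return q_new, p_new

def integrate(q0, p0, t_end, n_steps):
    """Integrate Hamilton's equations from t = 0 to t_end in n_steps equal steps."""
    dt = t_end / n_steps
    q, p = q0, p0
    for _ in range(n_steps):
        q, p = leapfrog_step(q, p, dt)
    return q, p

PERIOD = 2.0 * math.pi * math.sqrt(MASS / SPRING)
q_end, p_end = integrate(1.0, 0.0, PERIOD, 10_000)
print(q_end, p_end, hamiltonian(q_end, p_end) - hamiltonian(1.0, 0.0))
```

After one full period the state returns close to its starting point and the Hamiltonian is unchanged to within the integration error, as expected for a time-independent Hamiltonian.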
Routhian reduction technique
The Routhian reduction technique is a hybrid of Lagrangian and Hamiltonian mechanics that exploits
the advantages of both approaches for solving problems involving cyclic variables. It is especially useful for
solving motion in rotating systems in science and engineering. Two Routhians are used frequently for solving
the equations of motion of rotating systems. Assuming that the variables between $1 \le i \le s$ are non-cyclic, while the $m$ variables between $s+1 \le i \le n$ are ignorable cyclic coordinates, then the two Routhians are:
$$R_{cyclic}(q_1,\dots,q_n;\dot{q}_1,\dots,\dot{q}_s;p_{s+1},\dots,p_n;t) = \sum_{i=s+1}^{n} p_i\dot{q}_i - L = H - \sum_{i=1}^{s} p_i\dot{q}_i \tag{8.65}$$
$$R_{noncyclic}(q_1,\dots,q_n;p_1,\dots,p_s;\dot{q}_{s+1},\dots,\dot{q}_n;t) = \sum_{i=1}^{s} p_i\dot{q}_i - L = H - \sum_{i=s+1}^{n} p_i\dot{q}_i \tag{8.68}$$
The Routhian $R_{cyclic}$ is a negative Lagrangian for the non-cyclic variables between $1 \le i \le s$, where $s = n - m$, and is a Hamiltonian for the $m$ cyclic variables between $s+1 \le i \le n$. Since the cyclic variables are constants of the Hamiltonian, their solution is trivial, and the number of variables included in the Lagrangian is reduced from $n$ to $s = n - m$. The Routhian $R_{cyclic}$ is useful for solving some problems in classical mechanics. The Routhian $R_{noncyclic}$ is a Hamiltonian for the non-cyclic variables between $1 \le i \le s$, and is a negative Lagrangian for the $m$ cyclic variables between $s+1 \le i \le n$. Since the cyclic variables are constants of motion, the Routhian $R_{noncyclic}$ also is a constant of motion but it does not equal the total energy since the coordinate transformation is time dependent. The Routhian $R_{noncyclic}$ is especially valuable for solving rotating many-body systems such as galaxies, molecules, or nuclei, since the Routhian $R_{noncyclic}$ is the Hamiltonian in the rotating body-fixed coordinate frame.
Dissipative systems:
There are three different approaches to Lagrangian or Hamiltonian mechanics that can be used to derive
the equations of motion for dissipative systems. The first, and most straightforward approach, is to introduce
the drag force as a generalized force in the Euler-Lagrange equations. The second approach uses Rayleigh’s
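dissipation scalar function $\mathcal{F}$ which applies when drag forces depend linearly on velocity. If the dissipative force can be expressed as
$$\mathbf{F}^f = -\nabla_{\dot{\mathbf{q}}}\mathcal{F} \tag{8.75}$$
then the Lagrange equations can be written in terms of the Rayleigh dissipation function as
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] - \frac{\partial\mathcal{F}}{\partial\dot{q}_j} \tag{8.81}$$
The third approach, discussed in chapter 13.7, uses non-standard Lagrangians or Hamiltonians that are derived from the required equations of motion using the inverse variational problem.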
Comparison of Lagrangian and Hamiltonian mechanics
Lagrangian and the Hamiltonian dynamics are two powerful and related algebraic formulations of me-
chanics that are based on the same variational principle. They both concentrate solely on active forces and
can ignore internal forces. They can handle many-body systems and allow convenient generalized coordinates
of choice, which is impractical or impossible using Newtonian mechanics. Thus it is natural to compare the
relative advantages of these two algebraic formalisms in order to decide which should be used for a specific
problem.
For a system with $n$ generalized coordinates, plus $m$ constraint forces that are not required to be known, then the Lagrangian approach, using a minimal set of generalized coordinates, reduces to only $s = n - m$ second-order differential equations and unknowns compared to the Newtonian approach where there are $n + m$ unknowns. Alternatively, use of Lagrange multipliers allows determination of the constraint forces resulting in $n + m$ second order equations and unknowns. The Lagrangian potential function is limited to conservative forces; Lagrange multipliers can be used to handle holonomic forces of constraint, while generalized forces can be used to handle non-conservative and non-holonomic forces. The advantage of the Lagrange equations of motion is that they can deal with any type of force, conservative or non-conservative, and they directly determine $q$, $\dot{q}$ rather than $q$, $p$ which then requires relating $p$ to $\dot{q}$.
For a system with $n$ generalized coordinates, the Hamiltonian approach determines $2n$ first-order differential equations which are easier to solve than second-order equations. But the $2n$ solutions then must be combined to determine the equations of motion. The Hamiltonian approach is superior to the Lagrange approach in its ability to obtain an analytical solution of the integrals of the motion. Hamiltonian dynamics also has a means of determining the unknown variables for which the solution assumes a soluble form. Important applications of Hamiltonian mechanics are to quantum mechanics and statistical mechanics, where quantum analogs of $q_i$ and $p_i$ can be used to relate to the fundamental variables of Hamiltonian mechanics. This does not apply for the variables $q_i$ and $\dot{q}_i$ of Lagrangian mechanics. The Hamiltonian approach is especially powerful when the system has $m$ cyclic variables, since then the $m$ conjugate momenta $p_i$ are constants. Thus the $m$ conjugate variables $(q_i, p_i)$ can be factored out of the Hamiltonian, which reduces the number of conjugate variables required to $n - m$. This is not possible using the Lagrangian approach since, even though the $m$ coordinates $q_i$ can be factored out, the velocities $\dot{q}_i$ still must be included, thus the $n$ conjugate variables must be included. The Lagrange approach is advantageous for obtaining a numerical solution of systems in classical mechanics. However, Hamiltonian mechanics expresses the variables in terms of the fundamental canonical variables $(\mathbf{q}, \mathbf{p})$ which provides a more fundamental insight into the underlying physics.¹
1 Recommended reading: "Classical Mechanics" H. Goldstein, Addison-Wesley, Reading (1950). The present chapter
closely follows the notation used by Goldstein to facilitate cross-referencing and reading the many other textbooks that have
adopted this notation.
Workshop exercises
1. A block of mass rests on an inclined plane making an angle with the horizontal. The inclined plane (a
triangular block of mass ) is free to slide horizontally without friction. The block of mass is also free to
slide on the larger block of mass without friction.
2. Discuss among yourselves the following four conditions that can exist for the Hamiltonian and give several
examples of systems exhibiting each of the four conditions.
(a) The Hamiltonian is conserved and equals the total mechanical energy
(b) The Hamiltonian is conserved but does not equal the total mechanical energy
(c) The Hamiltonian is not conserved but does equal the total mechanical energy
(d) The Hamiltonian is not conserved and does not equal the total mechanical energy.
3. A block of mass rests on an inclined plane making an angle with the horizontal. The inclined plane (a
triangular block of mass ) is free to slide horizontally without friction. The block of mass is also free to
slide on the larger block of mass without friction.
4. Discuss among yourselves the following four conditions that can exist for the Hamiltonian and give several
examples of systems exhibiting each of the four conditions.
a) The Hamiltonian is conserved and equals the total mechanical energy
b) The Hamiltonian is conserved but does not equal the total mechanical energy
c) The Hamiltonian is not conserved but does equal the total mechanical energy
d) The Hamiltonian is not conserved and does not equal the total mechanical energy
5. Compare the Lagrangian formalism and the Hamiltonian formalism by creating a two-column chart. Label one
side “Lagrangian” and the other side “Hamiltonian” and discuss the similarities and differences. Here are some
ideas to get you started:
6. It can be shown that if ( ̇ ) is the Lagrangian of a particle moving in one dimension, then = 0 where
0 ( ̇ ) = ( ̇ ) +
and ( ) is an arbitrary function. This problem explores the consequences of
this on the Hamiltonian formalism.
(a) Relate the new canonical momentum 0 , for 0 , to the old canonical momentum , for .
(b) Express the new Hamiltonian 0 ( 0 0 ) for 0 in terms of the old Hamiltonian ( ) and .
(c) Explicitly show that the new Hamilton’s equations for 0 are equivalent to the old Hamilton’s equations
for .
7. A massless hoop of radius is rotating about an axis perpendicular to its central axis at constant angular
velocity . A mass can freely slide around the hoop.
8. Consider a pendulum of length attached to the end of rod of length . The rod is rotating at constant
angular velocity in the plane. Assume the pendulum is always taut.
Problems
1) A particle of mass in a gravitational field slides on the inside of a smooth parabola of revolution whose axis is
vertical. Using the distance from the axis and the azimuthal angle as generalized coordinates, find the following.
a) The Lagrangian of the system.
b) The generalized momenta and the corresponding Hamiltonian
c) The equation of motion for the coordinate as a function of time.
d) If
= 0 show that the particle can execute small oscillations about the lowest point of the paraboloid and
find the frequency of these oscillations.
2) Consider a particle of mass which is constrained to move on the surface of a sphere of radius . There are no
external forces of any kind acting on the particle.
a) What is the number of generalized coordinates necessary to describe the problem?
b) Choose a set of generalized coordinates and write the Lagrangian of the system.
c) What is the Hamiltonian of the system? Is it conserved?
d) Prove that the motion of the particle is along a great circle of the sphere.
3. A block of mass is attached to a wedge of mass by a spring with spring constant . The inclined frictionless
surface of the wedge makes an angle to the horizontal. The wedge is free to slide on a horizontal frictionless surface
as shown in the figure.
a) Given that the relaxed length of the spring is , find the values 0 when both block and wedge are stationary.
b) Find the Lagrangian for the system as a function of the coordinate of the wedge and the length of spring .
Write down the equations of motion.
c) What is the natural frequency of vibration?
4. A fly-ball governor comprises two masses connected by 4 hinged arms of length to a vertical shaft and to a
mass which can slide up or down the shaft without friction in a uniform vertical gravitational field as shown in
the figure. The assembly is constrained to rotate around the axis of the vertical shaft with same angular velocity as
that of the vertical shaft. Neglect the mass of the arms, air friction, and assume that the mass has a negligible
moment of inertia. Assume that the whole system is constrained to rotate with a constant angular velocity 0 .
a) Choose suitable coordinates and use the Lagrangian to derive equations of motion of the system around the
equilibrium position.
b) Determine the height of the mass above its lowest position as a function of 0 .
c) Find the frequency of small oscillations about this steady motion.
d) Derive a Routhian that provides the Hamiltonian in the rotating system.
e) Is the total energy of the fly-ball governor in the rotating frame of reference constant in time?
f) Suppose that the shaft and assembly are not constrained to rotate at a constant angular velocity 0 , that is,
it is allowed to rotate freely at angular velocity ̇. What is the difference in the overall motion?
5. A rigid straight, frictionless, massless, rod rotates about the $z$ axis at an angular velocity $\dot{\theta}$. A mass $m$ slides along the frictionless rod and is attached to the rod by a massless spring of spring constant $\kappa$.
a) Derive the Lagrangian and the Hamiltonian
b) Derive the equations of motion in the stationary frame using Hamiltonian mechanics.
c) What are the constants of motion?
d) If the rotation is constrained to have a constant angular velocity $\dot{\theta} = \omega$ then is the non-cyclic Routhian $R_{noncyclic} = H - p_\theta\dot{\theta}$ a constant of motion, and does it equal the total energy?
e) Use the non-cyclic Routhian $R_{noncyclic}$ to derive the radial equation of motion in the rotating frame of reference for the cranked system with $\dot{\theta} = \omega$.
6. A thin uniform rod of length 2 and mass is suspended from a massless string of length tied to a nail. Initially
the rod hangs vertically. A weak horizontal force is applied to the rod’s free end.
a) Write the Lagrangian for this system.
b) For very short times such that all angles are small, determine the angles that string and the rod make with
the vertical. Start from rest at = 0
c) Draw a diagram to illustrate the initial motion of the rod.
7. A uniform ladder of mass $M$ and length $2L$ is leaning against a frictionless vertical wall with its feet on a frictionless horizontal floor. Initially the stationary ladder is released at an angle $\theta_0 = 60^\circ$ to the floor. Assume that the gravitational field $g = 9.81\ \mathrm{m/s^2}$ acts vertically downward and that the moment of inertia of the ladder about its midpoint is $I = \frac{1}{3}ML^2$.
a) Derive the Lagrangian
b) Derive the Hamiltonian
c) Explain if the Hamiltonian is conserved and/or if it equals the total energy
d) Use the Lagrangian to derive the equations of motion
e) Derive the angle at which the ladder loses contact with the vertical wall.
8. The classical mechanics exam induces Jacob to try his hand at bungee jumping. Assume Jacob’s mass
is suspended in a gravitational field by the bungee of unstretched length and spring constant . Besides the
longitudinal oscillations due to the bungee jump, Jacob also swings with plane pendulum motion in a vertical plane.
Use polar coordinates , neglect air drag, and assume that the bungee always is under tension.
a) Derive the Lagrangian
b) Determine Lagrange’s equation of motion for angular motion and identify by name the forces contributing to the angular motion.
c) Determine Lagrange’s equation of motion for radial oscillation and identify by name the forces contributing to the tension in the spring.
d) Derive the generalized momenta
e) Determine the Hamiltonian and give all of Hamilton’s equations of motion.
Chapter 9
Conservative two-body central forces
9.1 Introduction
Conservative two-body central forces are of tremendous importance in physics because of the pivotal role that
the Coulomb and the gravitational forces play in nature. The Coulomb force plays a role in electrodynamics,
molecular, atomic, and nuclear physics, while the gravitational force plays an analogous role in celestial
mechanics. Therefore this chapter focusses on the physics of systems involving conservative two-body central forces.
A conservative two-body central force has the following three important attributes.
1. Conservative: A conservative force depends only on the particle position, that is, the force is not time dependent. Moreover the work done by the force moving a body between any two points 1 and 2 is path independent. Conservative fields are discussed in chapter 2.8.
2. Two-body: A two-body force between two bodies depends only on the relative locations of the two interacting bodies and is not influenced by the proximity of additional bodies. For two-body forces acting between $n$ bodies, the force on body 1 is the vector superposition of the two-body forces due to the interactions with each of the other $n - 1$ bodies. This differs from three-body forces where the force between any two bodies is influenced by the proximity of a third body.
3. Central: A central force field depends on the distance $r_{12}$ from the origin of the force at point 1 to the body location at point 2, and the force is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.
A conservative, two-body, central force combines the above three attributes and can be expressed as
$$\mathbf{F}_{21} = f(r_{12})\,\hat{\mathbf{r}}_{12}$$
The force field $\mathbf{F}_{21}$ has a magnitude $f(r_{12})$ that depends only on the magnitude of the relative separation vector $\mathbf{r}_{12} = \mathbf{r}_2 - \mathbf{r}_1$ between the origin of the force at point 1 and point 2 where the force acts, and the force is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.
Chapter 2.8 showed that if a two-body central force is conservative, then it can be written as the gradient of a scalar potential energy $U(r)$ which is a function of the distance $r$ from the center of the force field.
That is, the two vectors $\mathbf{r}_1, \mathbf{r}_2$ are written in terms of the position vector for the center of mass $\mathbf{R}$ and the position vector $\mathbf{r}$ for relative motion in the center of mass.
Assuming that the two-body central force is conservative and represented by $U(r)$, then the Lagrangian of the two-body system can be written as
$$L = \frac{1}{2}m_1|\dot{\mathbf{r}}_1|^2 + \frac{1}{2}m_2|\dot{\mathbf{r}}_2|^2 - U(r) \tag{9.11}$$
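Differentiating equations 9.10 with respect to time, and inserting them into the Lagrangian, gives
$$L = \frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2 + \frac{1}{2}\mu|\dot{\mathbf{r}}|^2 - U(r) \tag{9.12}$$
where the total mass $M$ is defined as
$$M = m_1 + m_2 \tag{9.13}$$
and the reduced mass $\mu$ is defined by
$$\mu \equiv \frac{m_1 m_2}{m_1 + m_2} \tag{9.14}$$
or equivalently
$$\frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2} \tag{9.15}$$
The total Lagrangian can be separated into two independent parts
$$L = \frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2 + L_{cm} \tag{9.16}$$
where
$$L_{cm} = \frac{1}{2}\mu|\dot{\mathbf{r}}|^2 - U(r) \tag{9.17}$$
Assuming that no external forces are acting, then $\frac{\partial L}{\partial\mathbf{R}} = 0$ and the three Lagrange equations for each of the three coordinates of the $\mathbf{R}$ coordinate can be written as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\mathbf{R}}} = \frac{d\mathbf{P}_{cm}}{dt} = 0 \tag{9.18}$$
That is, for a pure central force, the center-of-mass momentum $\mathbf{P}_{cm}$ is a constant of motion where
$$\mathbf{P}_{cm} = \frac{\partial L}{\partial\dot{\mathbf{R}}} = M\dot{\mathbf{R}} \tag{9.19}$$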
It is convenient to work in the center-of-mass frame using the effective Lagrangian $L_{cm}$. In the center-of-mass frame of reference, the translational kinetic energy $\frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2$ associated with center-of-mass motion is ignored, and only the energy in the center-of-mass is considered. This center-of-mass energy is the energy involved in the interaction between the colliding bodies. Thus, in the center-of-mass, the problem has been reduced to an equivalent one-body problem of a mass $\mu$ moving about a fixed force center with a path given by $\mathbf{r}$ which is the separation vector between the two bodies, as shown in figure 9.2. In reality, both masses revolve around their center of mass, also called the barycenter, in the center-of-mass frame as shown in figure 9.2. Knowing $\mathbf{r}$ allows the trajectory of each mass about the center of mass, $\mathbf{r}'_1$ and $\mathbf{r}'_2$, to be calculated. Of course the true path in the laboratory frame of reference must take into account the translational motion of the center of mass, in addition to the motion of the equivalent one-body representation relative to the barycenter. Be careful to remember the difference between the actual trajectories of each body, and the effective trajectory assumed when using the reduced mass which only determines the relative separation $\mathbf{r}$ of the two bodies. This reduction to an equivalent one-body problem greatly simplifies the solution of the motion, but it misrepresents the actual trajectories and the spatial locations of each mass in space. The equivalent one-body representation will be used extensively throughout this chapter.
Figure 9.2: Orbits of a two-body system with mass ratio of 2 rotating about the center-of-mass, O. The dashed ellipse is the equivalent one-body orbit with the center of force at the focus O.
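The split of the two-body kinetic energy into a center-of-mass part and a reduced-mass part can be verified numerically. The following is a minimal sketch; it assumes the sign convention $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$ used for the separation vector in the introduction, and all function names and numerical values are illustrative.

```python
def reduced_mass(m1, m2):
    """mu = m1*m2/(m1 + m2), equation 9.14."""
    return m1 * m2 / (m1 + m2)

def to_cm_and_relative(m1, v1, m2, v2):
    """Map two vectors to (center-of-mass vector, relative vector v2 - v1).

    The same linear map applies to positions and to velocities.
    """
    total = m1 + m2
    cm = tuple((m1 * a + m2 * b) / total for a, b in zip(v1, v2))
    rel = tuple(b - a for a, b in zip(v1, v2))
    return cm, rel

def from_cm_and_relative(m1, m2, cm, rel):
    """Recover the individual vectors from the center-of-mass and relative vectors."""
    total = m1 + m2
    first = tuple(c - (m2 / total) * r for c, r in zip(cm, rel))
    second = tuple(c + (m1 / total) * r for c, r in zip(cm, rel))
    return first, second

def squared_norm(v):
    return sum(c * c for c in v)

def kinetic_energy_lab(m1, v1, m2, v2):
    """Kinetic energy written in terms of the two individual velocities."""
    return 0.5 * m1 * squared_norm(v1) + 0.5 * m2 * squared_norm(v2)

def kinetic_energy_split(m1, v1, m2, v2):
    """Kinetic energy written as (1/2) M |R'|^2 + (1/2) mu |r'|^2, as in equation 9.12."""
    cm_velocity, rel_velocity = to_cm_and_relative(m1, v1, m2, v2)
    return (0.5 * (m1 + m2) * squared_norm(cm_velocity)
            + 0.5 * reduced_mass(m1, m2) * squared_norm(rel_velocity))

# Assumed example: mass ratio of 2, arbitrary positions and velocities.
m1, m2 = 2.0, 1.0
r1, r2 = (1.0, 0.0, 0.0), (4.0, 3.0, 0.0)
v1, v2 = (0.0, 1.0, 0.0), (2.0, -1.0, 0.5)
R, r = to_cm_and_relative(m1, r1, m2, r2)
print(R, r, from_cm_and_relative(m1, m2, R, r))
print(kinetic_energy_lab(m1, v1, m2, v2), kinetic_energy_split(m1, v1, m2, v2))
```

The round trip through `from_cm_and_relative` shows how knowing the relative vector and the center-of-mass position recovers the actual location of each body.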
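The center-of-mass Lagrangian leads to the following two general properties regarding the angular momentum vector $\mathbf{L}$.
1) The motion lies entirely in a plane perpendicular to the fixed direction of the total angular momentum vector. This is because
$$\mathbf{L}\cdot\mathbf{r} = \mathbf{r}\times\mathbf{p}\cdot\mathbf{r} = 0 \tag{9.21}$$
that is, the radius vector is in the plane perpendicular to the total angular momentum vector. Thus, it is possible to express the Lagrangian in polar coordinates, $(r, \psi)$, rather than spherical coordinates. In polar coordinates the center-of-mass Lagrangian becomes
$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) - U(r) \tag{9.22}$$
2) If the potential is spherically symmetric, then the polar angle $\psi$ is cyclic and therefore Noether’s theorem gives that the angular momentum $\mathbf{p}_\psi \equiv \mathbf{L} = \mathbf{r}\times\mathbf{p}$ is a constant of motion. That is, since $\frac{\partial L_{cm}}{\partial\boldsymbol{\psi}} = 0$, the Lagrange equations imply that
$$\dot{\mathbf{p}}_\psi = \frac{\partial L_{cm}}{\partial\boldsymbol{\psi}} = 0 \tag{9.23}$$
where the vectors $\dot{\mathbf{p}}_\psi$ and $\dot{\boldsymbol{\psi}}$ imply that equation 9.23 refers to three independent equations corresponding to the three components of these vectors. Thus the angular momentum $\mathbf{p}_\psi$ conjugate to $\boldsymbol{\psi}$ is a constant of motion. The generalized momentum $\mathbf{p}_\psi$ is a first integral of the motion which equals
$$\mathbf{p}_\psi = \frac{\partial L_{cm}}{\partial\dot{\boldsymbol{\psi}}} = \mu r^2\dot{\boldsymbol{\psi}} = l\,\hat{\mathbf{p}}_\psi \tag{9.24}$$
where the magnitude of the angular momentum $l$, and the direction $\hat{\mathbf{p}}_\psi$, both are constants of motion.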
A simple geometric interpretation of equation 9.24 is illustrated in figure 9.3. The radius vector sweeps out an area $d\mathbf{A}$ in time $dt$ where
$$d\mathbf{A} = \frac{1}{2}\mathbf{r}\times\mathbf{v}\,dt \tag{9.25}$$
and the vector $d\mathbf{A}$ is perpendicular to the $x$-$y$ plane. The rate of change of area is
$$\frac{d\mathbf{A}}{dt} = \frac{1}{2}\mathbf{r}\times\mathbf{v} \tag{9.26}$$
But the angular momentum is
$$\mathbf{L} = \mathbf{r}\times\mathbf{p} = \mu\,\mathbf{r}\times\mathbf{v} = 2\mu\frac{d\mathbf{A}}{dt} \tag{9.27}$$
Thus the conservation of angular momentum implies that the areal velocity $\frac{d\mathbf{A}}{dt}$ also is a constant of motion. This fact is called Kepler’s second law of planetary motion which he deduced in 1609 based on Tycho Brahe’s 55 years of observational records of the motion of Mars. Kepler’s second law implies that a planet moves fastest when closest to the sun and slowest when farthest from the sun. Note that Kepler’s second law is a statement of the conservation of angular momentum which is independent of the radial form of the central potential.
Figure 9.3: Area swept out by the radius vector in the time dt.
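That the areal velocity is constant for any central force, and not only for the inverse-square law, can be checked numerically. The following is a minimal sketch that integrates planar motion under three assumed radial force laws with a velocity-Verlet scheme and records $\frac{1}{2}(x v_y - y v_x)$ at every step; the force laws, initial conditions, and step size are illustrative assumptions.

```python
import math

def areal_velocities(force, pos, vel, mu=1.0, dt=1e-3, n_steps=20_000):
    """Integrate mu * r'' = force(r) * r_hat and return dA/dt = (x*vy - y*vx)/2 per step."""
    x, y = pos
    vx, vy = vel

    def acceleration(x, y):
        r = math.hypot(x, y)
        f = force(r)  # signed radial force; negative means attractive
        return f * x / (mu * r), f * y / (mu * r)

    ax, ay = acceleration(x, y)
    history = []
    for _ in range(n_steps):
        history.append(0.5 * (x * vy - y * vx))
        # Velocity Verlet: half kick, drift, new force, half kick.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return history

def spread(values):
    """Difference between the largest and smallest recorded value."""
    return max(values) - min(values)

# Assumed attractive force laws with different radial dependence.
FORCE_LAWS = {
    "inverse-square": lambda r: -1.0 / r**2,
    "linear (harmonic)": lambda r: -r,
    "r^-2.5": lambda r: -1.0 / r**2.5,
}

for name, law in FORCE_LAWS.items():
    history = areal_velocities(law, pos=(1.0, 0.0), vel=(0.0, 0.8))
    print(name, history[0], spread(history))
```

The spread stays at the level of floating-point round-off for all three laws, because each kick is parallel to the radius vector and each drift is parallel to the velocity, so neither changes $\mathbf{r}\times\mathbf{v}$.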
9.4 Equations of motion
$$\mu\ddot{r} = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{9.30}$$
Similarly, for the angular coordinate, the operator equation $\Lambda_\psi L_{cm} = 0$ leads to equation 9.24. That is, the angular equation of motion for the magnitude of $\mathbf{p}_\psi$ is
$$p_\psi = \frac{\partial L_{cm}}{\partial\dot{\psi}} = \mu r^2\dot{\psi} = l \tag{9.31}$$
Lagrange’s equations have given two equations of motion, one dependent on radius $r$ and the other on the polar angle $\psi$. Note that the radial acceleration is just a statement of Newton’s Laws of motion for the radial force $F_r$ in the center-of-mass system of
$$F_r = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{9.32}$$
This can be written in terms of an effective potential
$$U_{eff}(r) \equiv U(r) + \frac{l^2}{2\mu r^2} \tag{9.33}$$
which leads to an equation of motion
$$F_r = \mu\ddot{r} = -\frac{\partial U_{eff}(r)}{\partial r} \tag{9.34}$$
Since $\frac{l^2}{\mu r^3} = \mu r\dot{\psi}^2$, the second term in equation (9.33) is the centrifugal potential, whose radial derivative gives the centrifugal force.
It is remarkable that the six-dimensional equations of motion, for two bodies interacting via a two-body central force, have been reduced to trivial center-of-mass translational motion, plus a one-dimensional one-body problem given by (9.34) in terms of the relative separation $r$ and an effective potential $U_{eff}(r)$.
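The effective potential can be explored numerically. The following is a minimal sketch that locates the circular-orbit radius, where the radial force of equation 9.34 vanishes, for an assumed attractive inverse-square potential $U = k/r$ with $k < 0$. The bisection search, the finite-difference derivative, and the unit values of $k$, $l$, and $\mu$ are illustrative assumptions.

```python
def effective_potential(r, potential, l, mu):
    """U_eff(r) = U(r) + l^2 / (2 mu r^2), equation 9.33."""
    return potential(r) + l * l / (2.0 * mu * r * r)

def radial_force(r, potential, l, mu, h=1e-6):
    """F_r = -dU_eff/dr (equation 9.34), estimated by a central difference."""
    upper = effective_potential(r + h, potential, l, mu)
    lower = effective_potential(r - h, potential, l, mu)
    return -(upper - lower) / (2.0 * h)

def circular_orbit_radius(potential, l, mu, r_lo, r_hi, tol=1e-10):
    """Bisect for F_r = 0, assuming F_r > 0 at r_lo and F_r < 0 at r_hi."""
    while r_hi - r_lo > tol:
        r_mid = 0.5 * (r_lo + r_hi)
        if radial_force(r_mid, potential, l, mu) > 0.0:
            r_lo = r_mid
        else:
            r_hi = r_mid
    return 0.5 * (r_lo + r_hi)

# Assumed unit values; K < 0 denotes an attractive force.
K, L_ANG, MU = -1.0, 1.0, 1.0

def kepler_potential(r):
    return K / r

r0 = circular_orbit_radius(kepler_potential, L_ANG, MU, 0.2, 5.0)
u_min = effective_potential(r0, kepler_potential, L_ANG, MU)
print(r0, -L_ANG**2 / (MU * K))                 # numerical and analytic radius
print(u_min, -MU * K**2 / (2.0 * L_ANG**2))     # numerical and analytic minimum
```

For this potential the minimum of $U_{eff}$ sits at $r_0 = -l^2/(\mu k)$ with depth $-\mu k^2/(2l^2)$, which the numerical search reproduces.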
9.6 Hamiltonian
Since the center-of-mass Lagrangian is not an explicit function of time, then
$$\frac{dH_{cm}}{dt} = -\frac{\partial L_{cm}}{\partial t} = 0 \tag{9.40}$$
Thus the center-of-mass Hamiltonian $H_{cm}$ is a constant of motion. However, since the transformation to center of mass can be time dependent, then $H_{cm} \neq E$, that is, it does not equal the total energy because the kinetic energy of the center-of-mass motion has been omitted from $H_{cm}$. Also, since no transformation is involved, then
$$H_{cm} = T_{cm} + U = E_{cm} \tag{9.41}$$
That is, the center-of-mass Hamiltonian $H_{cm}$ equals the center-of-mass total energy. The center-of-mass Hamiltonian then can be written using the effective potential (9.33) in the form
$$H_{cm} = \frac{p_r^2}{2\mu} + \frac{p_\psi^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + \frac{l^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + U_{eff}(r) = E_{cm} \tag{9.42}$$
It is convenient to express the center-of-mass Hamiltonian $H_{cm}$ in terms of the energy equation for the orbit in a central field using the transformed variable $u = \frac{1}{r}$. Substituting equations 9.33 and 9.37 into the Hamiltonian equation 9.42 gives the energy equation of the orbit
$$\frac{l^2}{2\mu}\left[\left(\frac{du}{d\psi}\right)^2 + u^2\right] + U\left(u^{-1}\right) = E_{cm} \tag{9.43}$$
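Energy conservation allows the Hamiltonian to be used to solve problems directly. That is, since
$$H_{cm} = \frac{\mu\dot{r}^2}{2} + \frac{l^2}{2\mu r^2} + U(r) = E_{cm} \tag{9.44}$$
then
$$\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} \tag{9.45}$$
The time dependence can be obtained by integration
$$t = \int\frac{\pm dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{9.46}$$
An inversion of this gives the solution in the standard form $r = r(t)$. However, it is more interesting to find the relation between $r$ and $\psi$. From relation 9.46 for $t$ then
$$dt = \frac{\pm dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{9.47}$$
Since equation 9.31 gives $d\psi = \frac{l}{\mu r^2}dt$, therefore
$$\psi = \int\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{9.49}$$
which can be used to calculate the angular coordinate. This gives the relation between the radial and angular coordinates which specifies the trajectory.
Although equations (9.45) and (9.49) formally give the solution, the actual solution can be derived analytically only for certain specific forms of the force law and these solutions differ for attractive versus repulsive interactions.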
2) $E_{cm} = 0$: It can be shown that the orbit for this case is parabolic.
3) $0 > E_{cm} > U_{min}$: For this case the equivalent orbit has both a maximum and minimum radial distance at which $\dot{r} = 0$. At the turning points the radial kinetic energy term is zero so $E_{cm} = U + \frac{l^2}{2\mu r^2}$. For the attractive inverse square law force the path is an ellipse with the focus at the center of attraction (Figure 9.5), which is Kepler’s First Law. During the time that the radius ranges from $r_{min}$ to $r_{max}$ and back the radius vector turns through an angle $\Delta\psi$ which is given by
$$\Delta\psi = 2\int_{r_{min}}^{r_{max}}\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{9.50}$$
The general path prescribes a rosette shape which is a closed curve only if $\Delta\psi$ is a rational fraction of $2\pi$.
4) $E_{cm} = U_{min}$: In this case $r$ is a constant implying that the path is circular since
$$\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} = 0 \tag{9.51}$$
5) $E_{cm} < U_{min}$: For this case the square root is imaginary and there is no real solution.
In general the orbit is not closed, and such open orbits do not repeat. Bertrand’s Theorem states that the inverse-square central force, and the linear harmonic oscillator, are the only radial dependences of the central force that lead to stable closed orbits.
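The angle swept during one radial cycle can be evaluated numerically for the two force laws singled out by Bertrand’s Theorem. The following is a minimal sketch of equation 9.50 rewritten in terms of $u = 1/r$; the substitution $u = u_c + u_a\cos\theta$, the midpoint rule, and the parameter values are assumptions made for illustration, and the turning points are supplied analytically for each potential.

```python
import math

def apsidal_sweep(potential_of_u, energy, l, mu, u_min, u_max, n=4000):
    """Angle swept while r goes from r_min to r_max and back (equation 9.50).

    With u = 1/r the integral becomes
        2 * integral from u_min to u_max of l du / sqrt(2 mu (E - U) - l^2 u^2).
    Writing u = u_c + u_a*cos(theta) removes the square-root singularities at the
    turning points, so a plain midpoint rule in theta is sufficient.
    """
    u_c = 0.5 * (u_max + u_min)
    u_a = 0.5 * (u_max - u_min)
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        u = u_c + u_a * math.cos(theta)
        radicand = 2.0 * mu * (energy - potential_of_u(u)) - (l * u) ** 2
        total += l * u_a * math.sin(theta) / math.sqrt(radicand) * d_theta
    return 2.0 * total

MU, L_ANG = 1.0, 1.0  # assumed unit values

# Attractive inverse-square law: U = k/r = k*u with k < 0 (assumed values).
K, E_KEPLER = -1.0, -0.32
eps = math.sqrt(1.0 + 2.0 * E_KEPLER * L_ANG**2 / (MU * K**2))
c = -MU * K / L_ANG**2
kepler_sweep = apsidal_sweep(lambda u: K * u, E_KEPLER, L_ANG, MU,
                             c * (1.0 - eps), c * (1.0 + eps))

# Linear harmonic oscillator: U = kappa*r^2/2 = kappa/(2 u^2) (assumed values).
KAPPA, E_SHO = 1.0, 1.25
disc = math.sqrt((MU * E_SHO) ** 2 - KAPPA * MU * L_ANG**2)
r2_min = (MU * E_SHO - disc) / (KAPPA * MU)
r2_max = (MU * E_SHO + disc) / (KAPPA * MU)
sho_sweep = apsidal_sweep(lambda u: KAPPA / (2.0 * u * u), E_SHO, L_ANG, MU,
                          1.0 / math.sqrt(r2_max), 1.0 / math.sqrt(r2_min))

print(kepler_sweep / math.pi, sho_sweep / math.pi)
```

In this sketch the inverse-square case sweeps $2\pi$ per radial cycle and the harmonic oscillator sweeps $\pi$, both rational fractions of $2\pi$, consistent with closed orbits.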
$$\frac{d^2u(\psi)}{d\psi^2} + u(\psi) = 0$$
A solution of this is
$$u(\psi) = \frac{1}{r_0}\cos(\psi - \delta)$$
where $r_0$ and $\delta$ are arbitrary constants. This can be rewritten as
$$r(\psi) = \frac{r_0}{\cos(\psi - \delta)}$$
This is the equation of a straight line in polar coordinates as illustrated in the adjacent figure, which shows the trajectory of a free body. This shows that a free body moves in a straight line if no forces are acting on the body.
9.8 Inverse-square, two-body, central force
Equation 9.58 is the polar equation of a conic section. Equation 9.58 also can be derived with the origin at a focus by inserting the inverse square law potential $U = \frac{k}{r} = ku$ into equation 9.49 which gives
$$\psi = \int\frac{\pm du}{\sqrt{\frac{2\mu E_{cm}}{l^2} - \frac{2\mu k}{l^2}u - u^2}} + \text{constant} \tag{9.60}$$
The value of $\psi_0$ merely determines the orientation of the major axis of the equivalent orbit. Without loss of generality, it is possible to assume that the angle $\psi$ is measured with respect to the major axis of the orbit, that is $\psi_0 = 0$. Then the equation can be written as
$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos(\psi)\right] = -\frac{\mu k}{l^2}\left[1 + \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}}\cos(\psi)\right] \tag{9.63}$$
This is the equation of a conic section where $\epsilon$ is the eccentricity of the conic section. The conic section is a hyperbola if $\epsilon > 1$, parabola if $\epsilon = 1$, ellipse if $\epsilon < 1$, and a circle if $\epsilon = 0$. All the equivalent one-body orbits for an attractive force have the origin of the force at a focus of the conic section. The orbits depend on whether the force is attractive or repulsive, on the conserved angular momentum $l$, and on the center-of-mass energy $E_{cm}$.
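The classification of the orbit by its eccentricity can be expressed compactly in code. The following is a minimal sketch based on equation 9.63, assuming the convention that $k < 0$ for an attractive force; the function names, the tolerance, and the numerical values are illustrative.

```python
import math

def eccentricity(energy, l, mu, k):
    """epsilon = sqrt(1 + 2 E l^2 / (mu k^2)), read off from equation 9.63."""
    argument = 1.0 + 2.0 * energy * l * l / (mu * k * k)
    if argument < 0.0:
        raise ValueError("energy lies below the minimum of the effective potential")
    return math.sqrt(argument)

def classify_orbit(eps, tol=1e-12):
    """Name the conic section that corresponds to the eccentricity eps."""
    if eps < tol:
        return "circle"
    if eps < 1.0 - tol:
        return "ellipse"
    if abs(eps - 1.0) <= tol:
        return "parabola"
    return "hyperbola"

def radius(psi, l, mu, k, eps):
    """r(psi) from 1/r = -(mu k / l^2) [1 + eps cos(psi)], with k < 0 for attraction."""
    return 1.0 / (-(mu * k / (l * l)) * (1.0 + eps * math.cos(psi)))

# Assumed unit values for l, mu and an attractive k = -1.
for energy in (-0.5, -0.32, 0.0, 0.5):
    eps = eccentricity(energy, 1.0, 1.0, -1.0)
    print(energy, eps, classify_orbit(eps))
```

For a bound orbit, `radius(0.0, ...)` gives the closest approach and `radius(math.pi, ...)` the farthest distance.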
foci of the elliptical orbit. The terms periapsis and pericenter both are used to designate the closest distance of approach, while apoapsis and apocenter are used to designate the farthest distance of approach. Attaching the terms "peri-" and "apo-" to the general term "-apsis" is preferred over having different names for each object in the solar system. For example, frequently used terms are "-helion" for orbits of the sun, "-gee" for orbits around the earth, and "-cynthion" for orbits around the moon.
The maximum distance, $r = r_{max}$, which is called the apoapsis, occurs when $\psi = 180^\circ$
$$r_{max} = -\frac{l^2}{\mu k\left[1 - \epsilon\right]} \tag{9.65}$$
Remember that since $k < 0$ for bound orbits, the negative signs in equations 9.64 and 9.65 lead to $r > 0$.
The most bound orbit is a circle having $\epsilon = 0$ which implies that $E_{cm} = -\frac{\mu k^2}{2l^2}$.
The shape of the elliptical orbit also can be described with respect to the center of the elliptical equivalent orbit by deriving the lengths of the semi-major axis $a$ and the semi-minor axis $b$ shown in figure 9.5
µ ¶
1 1 2 2 2
= (min + max ) = + = (9.66)
2 2 [1 + ] [1 − ] [1 − 2 ]
p 2
= 1 − 2 = p (9.67)
[1 − 2 ]
Remember that the predicted bound elliptical orbit corresponds to the equivalent one-body representation for the two-body motion as illustrated in figure 9.2. This can be transformed to the individual spatial trajectories of each of the two bodies in an inertial frame.
The eccentricity of the major planets ranges from $\epsilon = 0.2056$ for Mercury, to $\epsilon = 0.0068$ for Venus. The Earth has an eccentricity of $\epsilon = 0.0167$ with $r_{min} = 91\cdot10^6$ miles and $r_{max} = 95\cdot10^6$ miles. On the other hand, $\epsilon = 0.967$ for Halley’s comet, that is, the radius vector ranges from 0.6 to about 35 times the radius of the orbit of the Earth.
The orbit energy can be derived by substituting the eccentricity, given by equation 9.62, into the semi-major axis length $a$, given by equation 9.66, which leads to the center-of-mass energy of
= − (9.73)
2
However, the Hamiltonian, given by equation 942 implies that is
µ ¶
1
= 2 + − =− (9.74)
2 2
For the simple case of a circular orbit, = then the velocity equals
s
= (9.75)
For a circular orbit, the drag on a satellite lowers the total energy resulting in a decrease in the radius of the orbit and a concomitant increase in velocity. That is, when the orbit radius is decreased, part of the potential energy released accounts for the work done against the drag, and the remaining part goes towards increase of the kinetic energy. Also note that, as predicted by the Virial Theorem, the time-averaged kinetic energy is half the magnitude of the time-averaged potential energy for the attractive inverse square law force; for a circular orbit this holds at every instant.
$$\dot{\mathbf{p}} = f(r)\,\hat{\mathbf{r}} \tag{9.79}$$
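Note that the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is conserved for a central force, that is $\dot{\mathbf{L}} = 0$. Therefore the time derivative of the product $\mathbf{p}\times\mathbf{L}$ reduces to
$$\frac{d}{dt}(\mathbf{p}\times\mathbf{L}) = \dot{\mathbf{p}}\times\mathbf{L} = f(r)\,\hat{\mathbf{r}}\times(\mathbf{r}\times\mu\dot{\mathbf{r}}) = f(r)\frac{\mu}{r}\left[\mathbf{r}(\mathbf{r}\cdot\dot{\mathbf{r}}) - r^2\dot{\mathbf{r}}\right] \tag{9.80}$$
This can be simplified using the fact that
$$\mathbf{r}\cdot\dot{\mathbf{r}} = \frac{1}{2}\frac{d}{dt}(\mathbf{r}\cdot\mathbf{r}) = r\dot{r} \tag{9.81}$$
thus
$$f(r)\frac{\mu}{r}\left[\mathbf{r}(\mathbf{r}\cdot\dot{\mathbf{r}}) - r^2\dot{\mathbf{r}}\right] = -\mu f(r)r^2\left[\frac{\dot{\mathbf{r}}}{r} - \frac{\mathbf{r}\dot{r}}{r^2}\right] = -\mu f(r)r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{9.82}$$
This allows equation 9.80 to be reduced to
$$\frac{d}{dt}(\mathbf{p}\times\mathbf{L}) = -\mu f(r)r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{9.83}$$
Assume the special case of the inverse-square law, equation 9.52, then the central force equation 9.83 reduces to
$$\frac{d}{dt}(\mathbf{p}\times\mathbf{L}) = -\mu k\frac{d}{dt}(\hat{\mathbf{r}}) \tag{9.84}$$
or
$$\frac{d}{dt}\left[(\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}})\right] = 0 \tag{9.85}$$
Define the eccentricity vector $\mathbf{A}$ as
$$\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}}) \tag{9.86}$$
then equation 9.85 corresponds to
$$\frac{d\mathbf{A}}{dt} = 0 \tag{9.87}$$
This is a statement that the eccentricity vector is a constant of motion for an inverse-square, central force.
The definition of the eccentricity vector $\mathbf{A}$ and angular momentum vector $\mathbf{L}$ implies a zero scalar product,
$$\mathbf{A}\cdot\mathbf{L} = 0 \tag{9.88}$$
Thus the eccentricity vector $\mathbf{A}$ and angular momentum $\mathbf{L}$ are mutually perpendicular, that is, $\mathbf{A}$ is in the plane of the orbit while $\mathbf{L}$ is perpendicular to the plane of the orbit. The eccentricity vector $\mathbf{A}$ always points along the major axis of the ellipse from the focus to the periapsis as illustrated on the left side in figure 9.7.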
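The constancy of the eccentricity vector can be checked directly on a Kepler orbit. The following is a minimal sketch that evaluates $\mathbf{A} = (\mathbf{p}\times\mathbf{L}) + \mu k\hat{\mathbf{r}}$ at several points on the orbit $1/r = -(\mu k/l^2)[1 + \epsilon\cos\psi]$, assuming planar motion with $\mathbf{L}$ along $\hat{\mathbf{z}}$ and $k < 0$ for attraction; the function names and the values of $l$, $\mu$, $k$, and $\epsilon$ are illustrative assumptions.

```python
import math

def state_on_kepler_orbit(psi, l, mu, k, eps):
    """Position and momentum on the orbit 1/r = -(mu k / l^2) [1 + eps cos(psi)]."""
    c = -mu * k / (l * l)
    r = 1.0 / (c * (1.0 + eps * math.cos(psi)))
    psi_dot = l / (mu * r * r)               # from l = mu r^2 psi_dot
    r_dot = c * eps * l * math.sin(psi) / mu  # chain rule applied to r(psi)
    x, y = r * math.cos(psi), r * math.sin(psi)
    vx = r_dot * math.cos(psi) - r * psi_dot * math.sin(psi)
    vy = r_dot * math.sin(psi) + r * psi_dot * math.cos(psi)
    return (x, y), (mu * vx, mu * vy)

def eccentricity_vector(pos, mom, mu, k):
    """In-plane components of A = p x L + mu k r_hat for motion in the x-y plane."""
    x, y = pos
    px, py = mom
    l_z = x * py - y * px                     # z component of L = r x p
    r = math.hypot(x, y)
    # With L = l_z z_hat, the product p x L equals (py*l_z, -px*l_z, 0).
    return (py * l_z + mu * k * x / r, -px * l_z + mu * k * y / r)

# Assumed values: unit l and mu, attractive k = -1, eccentricity 0.75.
L_ANG, MU, K, EPS = 1.0, 1.0, -1.0, 0.75
for psi in (0.0, 0.7, math.pi / 2.0, 2.5, math.pi, 4.0):
    pos, mom = state_on_kepler_orbit(psi, L_ANG, MU, K, EPS)
    print(psi, eccentricity_vector(pos, mom, MU, K))
```

At every sampled angle the vector points along the positive x axis, toward the periapsis at $\psi = 0$, and for these values its magnitude equals $-\mu k\epsilon$.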
2 The symmetry underlying the eccentricity vector is less intuitive than the energy or angular momentum invariants, leading to it being discovered independently several times during the past three centuries. Jakob Hermann was the first to identify this invariant for the special case of the inverse-square central force. Bernoulli generalized his proof in 1710. Laplace derived the invariant at the end of the 18th century using analytical mechanics. Hamilton derived the connection between the invariant and the orbit eccentricity. Gibbs derived the invariant using vector analysis. Runge published Gibbs’ derivation in his textbook which was referenced by Lenz in a 1924 paper on the quantal model of the hydrogen atom. Goldstein named this invariant the "Laplace-Runge-Lenz vector", while others have named it the "Runge-Lenz vector" or the "Lenz vector". This book uses Hamilton’s more intuitive name of "eccentricity vector".
240 CHAPTER 9. CONSERVATIVE TWO-BODY CENTRAL FORCES
Figure 9.7: The elliptical trajectory and eccentricity vector A for two bodies interacting via the inverse-
square, central force for eccentricity ε = 0.75. The left plot shows the elliptical spatial trajectory where
the semi-major axis is assumed to be on the x-axis and the angular momentum L = lẑ is out of the page.
The force centre is at one focus of the ellipse. The vector coupling relation A ≡ (p × L) + (μk r̂) is illustrated
at four points on the spatial trajectory. The right plot is a hodograph of the linear momentum p for this
trajectory. The periapsis is denoted by the number 1 and the apoapsis is marked as 3 on both plots. Note
that the eccentricity vector A is a constant that points parallel to the major axis towards the periapsis.
As a consequence, the two orthogonal vectors A and L completely define the plane of the orbit, plus the
orientation of the major axis of the Kepler orbit, in this plane. The three vectors A, p × L, and (μk r̂) obey
the triangle rule as illustrated in the left side of figure 9.7.
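Hamilton noted the direct connection between the eccentricity vector A and the eccentricity ε of the
conic section orbit. This can be shown by considering the scalar product
$$\mathbf{A}\cdot\mathbf{r} = Ar\cos\theta = \mathbf{r}\cdot(\mathbf{p}\times\mathbf{L}) + \mu k r \qquad (9.89)$$
The triple scalar product can be permuted to give
$$\mathbf{r}\cdot(\mathbf{p}\times\mathbf{L}) = (\mathbf{r}\times\mathbf{p})\cdot\mathbf{L} = \mathbf{L}\cdot\mathbf{L} = l^{2} \qquad (9.90)$$
Inserting equation 9.90 into 9.89 gives
$$\frac{1}{r} = -\frac{\mu k}{l^{2}}\left(1 - \frac{A}{\mu k}\cos\theta\right) \qquad (9.91)$$
Note that equations 9.63 and 9.91 are identical if θ0 = 0. This implies that the eccentricity ε and A are
related by
$$\epsilon = -\frac{A}{\mu k} \qquad (9.92)$$
where k is defined to be negative for an attractive force. The relation between the eccentricity and total
center-of-mass energy E_cm can be used to rewrite equation 9.62 in the form
$$A^{2} = \mu^{2}k^{2} + 2\mu E_{cm}l^{2} \qquad (9.93)$$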
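The link between the eccentricity vector and the orbit eccentricity can be checked numerically for any single state of the system. The sketch below evaluates the eccentricity once from |A|/|μk| and once from the energy relation ε = [1 + 2E l²/(μk²)]^(1/2), taking the potential as U = k/r for the force (k/r²) r̂. The function name and the numerical state are illustrative assumptions.

```python
import math

def eccentricity_two_ways(mu, k, x, y, px, py):
    """Return (eps from |A|, eps from energy and angular momentum) for planar motion."""
    r = math.hypot(x, y)
    l = x * py - y * px                          # angular momentum along z
    energy = (px**2 + py**2) / (2 * mu) + k / r  # U = k / r for F = (k / r**2) r_hat
    # A = p x L + mu*k*r_hat, with p x (l z_hat) = (py * l, -px * l, 0)
    a_x = py * l + mu * k * x / r
    a_y = -px * l + mu * k * y / r
    eps_from_a = math.hypot(a_x, a_y) / abs(mu * k)
    eps_from_energy = math.sqrt(1 + 2 * energy * l**2 / (mu * k**2))
    return eps_from_a, eps_from_energy

# Assumed sample state for a bound orbit with an attractive force (k < 0).
eps_a, eps_e = eccentricity_two_ways(mu=2.0, k=-3.0, x=1.5, y=0.5, px=-0.4, py=1.7)
print(eps_a, eps_e)
```

The two values agree to rounding error because A² = μ²k² + 2μE l² holds identically for every state, not only along one particular orbit.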
The combination of the eccentricity vector A and the angular momentum vector L completely specifies
the orbit for an inverse square-law central force. The trajectory is in the plane perpendicular to the angu-
lar momentum vector L, while the eccentricity, plus the orientation of the orbit, both are defined by the
eccentricity vector A. The eccentricity vector and angular momentum vector each have three independent
coordinates, that is, these two vector invariants provide six constraints, while the scalar invariant energy
adds one additional constraint. The exact location of the particle moving along the trajectory is not defined
and thus there are only five independent coordinates governed by the above seven constraints. Thus the
eccentricity vector, angular momentum, and center-of-mass energy are related by the two equations 9.88 and
9.93.
Noether’s theorem states that each conservation law is a manifestation of an underlying symmetry.
Identification of the underlying symmetry responsible for the conservation of the eccentricity vector A is
elucidated using equation 9.86 to give
$$\mu k\,\hat{\mathbf{r}} = \mathbf{A} - (\mathbf{p}\times\mathbf{L}) \qquad (9.94)$$
Take the scalar product
$$(\mu k\,\hat{\mathbf{r}})\cdot(\mu k\,\hat{\mathbf{r}}) = (\mu k)^{2} = p^{2}l^{2} + A^{2} - 2\mathbf{A}\cdot(\mathbf{p}\times\mathbf{L}) \qquad (9.95)$$
Choose the angular momentum to be along the z-axis, that is, L = lẑ, and, since p and A are perpendicular
to L, then p and A are in the x̂ − ŷ plane. Assume that the semimajor axis of the elliptical orbit is along
the x-axis, then the locus of the momentum vector on a momentum hodograph has the equation
$$p_{x}^{2} + \left(p_{y} - \frac{A}{l}\right)^{2} = \left(\frac{\mu k}{l}\right)^{2} \qquad (9.96)$$
Equation 9.96 implies that the locus of the momentum vector is a circle of radius |μk/l| with the center
displaced from the origin at coordinates (0, A/l) as shown by the momentum hodograph on the right side of
figure 9.7. The angle β and eccentricity ε are related by
$$\cos\beta = -\frac{A/l}{\mu k/l} = -\frac{A}{\mu k} = \epsilon \qquad (9.97)$$
The circular orbit is centered at the origin for ε = −A/(μk) = 0, and thus the magnitude |p| is a constant around
the whole trajectory.
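The circular hodograph can be verified directly from the orbit equation. The following sketch builds the momentum at many points of a Kepler ellipse from 1/r = −(μk/l²)(1 + ε cos θ) and from conservation of angular momentum, then tests whether every point lies on a circle of radius |μk/l| centered at (0, A/l). The numerical values, with ε = 0.75 as in figure 9.7, and the helper name are illustrative assumptions.

```python
import math

mu, k, l, eps = 1.0, -1.0, 1.0, 0.75   # assumed values; k < 0 is attractive
A = -eps * mu * k                      # from eps = -A / (mu k)
p_semi = -l**2 / (mu * k)              # semi-latus rectum l^2 / (mu |k|)

def momentum(theta):
    """Cartesian momentum at polar angle theta measured from the periapsis."""
    r = p_semi / (1 + eps * math.cos(theta))
    p_radial = l * eps * math.sin(theta) / p_semi   # mu * dr/dt = (dr/dtheta) * l / r**2
    p_angular = l / r                               # mu * r * dtheta/dt
    px = p_radial * math.cos(theta) - p_angular * math.sin(theta)
    py = p_radial * math.sin(theta) + p_angular * math.cos(theta)
    return px, py

max_residual = 0.0
for n in range(360):
    px, py = momentum(math.radians(n))
    residual = px**2 + (py - A / l)**2 - (mu * k / l)**2
    max_residual = max(max_residual, abs(residual))
print("largest deviation from the hodograph circle:", max_residual)
```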
The inverse-square, central, two-body, force is unusual in that it leads to stable closed bound orbits
because the radial and angular frequencies are degenerate, i.e. ω_r = ω_θ. In momentum space, the locus of
the linear momentum vector p is a perfect circle which is the underlying symmetry responsible for both the
fact that the orbits are closed, and the invariance of the eccentricity vector. Mathematically this symmetry
for the Kepler problem corresponds to the body moving freely on the boundary of a four-dimensional sphere
in space and momentum. The invariance of the eccentricity vector is a manifestation of the special property
of the inverse-square, central force under certain rotations in this four-dimensional space; this O(4) symmetry
is an example of a hidden symmetry.
in polar coordinates. In addition, since the force is spherically symmetric, then the angular momentum is
conserved. The orbit solutions are conic sections as described in chapter 9.7. The shape of the orbit for
the harmonic two-body central force can be derived using either polar or cartesian coordinates as illustrated
below.
The right-hand side of equation 9.105 is a constant. The solution of 9.105 must be a sine or cosine function
of the doubled polar angle 2θ. That is
$$u' - \frac{E_{cm}\mu}{l^{2}} = \left[\left(\frac{E_{cm}\mu}{l^{2}}\right)^{2} + \frac{k\mu}{l^{2}}\right]^{\frac{1}{2}}\cos 2(\theta - \theta_{0}) \qquad (9.106)$$
That is,
$$u' = \frac{1}{r^{2}} = \frac{E_{cm}\mu}{l^{2}}\left(1 + \left(1 + \frac{kl^{2}}{E_{cm}^{2}\mu}\right)^{\frac{1}{2}}\cos 2(\theta - \theta_{0})\right) \qquad (9.107)$$
Equation 9.107 corresponds to a closed orbit centered at the origin of the elliptical orbit as illustrated in
figure 9.8. The eccentricity of this closed orbit is given by
$$\left(1 + \frac{kl^{2}}{E_{cm}^{2}\mu}\right)^{\frac{1}{2}} = \frac{\epsilon^{2}}{2 - \epsilon^{2}} \qquad (9.108)$$
Equations 9.66 and 9.67 give that the eccentricity is related to the semi-major and semi-minor axes by
$$\epsilon^{2} = 1 - \left(\frac{b}{a}\right)^{2} \qquad (9.109)$$
Note that for a repulsive force k > 0, then ε ≥ 1 leading to unbound hyperbolic or parabolic orbits centered
on the origin. An attractive force, k < 0, allows for bound elliptical, as well as unbound parabolic and
hyperbolic orbits.
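The relation between the orbit parameters and the eccentricity can be illustrated with a short calculation. The sketch below assumes the orbit has the form 1/r² = (Eμ/l²)[1 + s cos 2(θ − θ0)] with s = [1 + k l²/(E²μ)]^(1/2) and k negative for an attractive force, reads the semi-axes off the extreme values of 1/r², and compares s with ε²/(2 − ε²). The function name and numerical inputs are illustrative assumptions.

```python
import math

def harmonic_orbit(mu, k, energy, l):
    """Semi-axes and eccentricity of the bound orbit for the linear force (k < 0 attractive)."""
    s = math.sqrt(1 + k * l**2 / (energy**2 * mu))
    u_min = (energy * mu / l**2) * (1 - s)   # smallest 1/r^2, reached at r = a
    u_max = (energy * mu / l**2) * (1 + s)   # largest 1/r^2, reached at r = b
    a = 1 / math.sqrt(u_min)
    b = 1 / math.sqrt(u_max)
    eps = math.sqrt(1 - (b / a)**2)
    return a, b, eps, s

# Assumed sample values giving a = sqrt(2), b = 1/sqrt(2).
a, b, eps, s = harmonic_orbit(mu=1.0, k=-1.0, energy=1.25, l=1.0)
print(a, b, eps, s, eps**2 / (2 - eps**2))
```

For these inputs the ratio b/a is 0.5, so ε² = 0.75, and both s and ε²/(2 − ε²) evaluate to 0.6.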
9.9. ISOTROPIC, LINEAR, TWO-BODY, CENTRAL FORCE 243
[Figure 9.8 plots: left panel, spatial trajectory r plotted as y versus x; right panel, momentum hodograph p plotted as py versus px.]
Figure 9.8: The elliptical equivalent trajectory for two bodies interacting via the linear, central force for
eccentricity ε = 0.75. The left plot shows the elliptical spatial trajectory where the semi-major axis is
assumed to be on the x-axis and the angular momentum L = lẑ is out of the page. The force center is at
the center of the ellipse. The right plot is a hodograph of the linear momentum p for this trajectory.
Solutions for the independent coordinates, and their corresponding momenta, are
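$$r^{2} = x^{2} + y^{2} = [A\cos(\omega t + \alpha)]^{2} + [B\cos(\omega t + \beta)]^{2} \qquad (9.113)$$
$$= \frac{A^{2} + B^{2}}{2} + \frac{\sqrt{A^{4} + B^{4} + 2A^{2}B^{2}\cos 2(\alpha - \beta)}}{2}\cos(2\omega t + \psi_{0})$$
where
$$\cos\psi_{0} = \frac{A^{2}\cos 2\alpha + B^{2}\cos 2\beta}{\sqrt{A^{4} + B^{4} + 2A^{2}B^{2}\cos 2(\alpha - \beta)}} \qquad (9.114)$$
For a phase difference α − β = ±π/2 this equation describes an ellipse centered at the origin which agrees
with equation 9.107 that was derived using polar coordinates.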
The two normal modes of the isotropic harmonic oscillator are degenerate, therefore x and y are equally good
normal modes with two corresponding total energies, E_1 and E_2, while the corresponding angular momentum l
points in the z direction.
$$E_{1} = \frac{p_{x}^{2}}{2\mu} + \frac{1}{2}kx^{2} \qquad (9.115)$$
$$E_{2} = \frac{p_{y}^{2}}{2\mu} + \frac{1}{2}ky^{2} \qquad (9.116)$$
$$l = (xp_{y} - yp_{x}) \qquad (9.117)$$
Figure 9.8 shows the closed elliptical equivalent orbit plus the corresponding momentum hodograph for
the isotropic harmonic two-body central force. Figures 9.7 and 9.8 contrast the differences between the
elliptical orbits for the inverse-square force, and those for the harmonic two-body central force. Although
bound systems with the harmonic two-body force, and with the inverse-square force, both have
elliptical bound orbits, there are important differences. Both the radial motion and momentum are two
valued per cycle for the reflection-symmetric harmonic oscillator, whereas the radius and momentum have
only one maximum and one minimum per revolution for the inverse-square law. Although the inverse-square,
and the isotropic, harmonic, two-body central forces both lead to closed bound elliptical orbits for which the
angular momentum is conserved and the orbits are planar, there is another important difference between the
orbits for these two interactions. The orbit equation for the Kepler problem is expressed with respect to a
focus of the elliptical equivalent orbit, as illustrated in figure 9.7, whereas the orbit equation for the isotropic
harmonic oscillator orbit is expressed with respect to the center of the ellipse as illustrated in figure 9.8.
The diagonal matrix elements A′_11 = E_1 and A′_22 = E_2 are constants of motion. The off-diagonal
term is given by
$$A'^{2}_{12} \equiv \left(\frac{p_{x}p_{y}}{2\mu} + \frac{1}{2}kxy\right)^{2} = \left(\frac{p_{x}^{2}}{2\mu} + \frac{1}{2}kx^{2}\right)\left(\frac{p_{y}^{2}}{2\mu} + \frac{1}{2}ky^{2}\right) - \frac{k}{4\mu}(xp_{y} - yp_{x})^{2} = E_{1}E_{2} - \frac{kl^{2}}{4\mu} \qquad (9.119)$$
The terms on the right-hand side of equation 9.119 all are constants of motion, therefore A′²_12 also is a
constant of motion. Thus the 3 × 3 symmetry tensor A′ can be reduced to a 2 × 2 symmetry tensor for which
all the matrix elements are constants of motion, and the trace of the symmetry tensor is equal to the total
energy.
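The constancy of the symmetry-tensor elements can be checked numerically along a harmonic orbit. The sketch below assumes solutions of the form x = a_x cos(ωt + α), y = a_y cos(ωt + β) with ω = (k/μ)^(1/2) and a positive restoring constant k, as in the energies E_1 and E_2 above, and evaluates the off-diagonal element p_x p_y/(2μ) + k x y/2 at several times. All numerical values and names are illustrative assumptions.

```python
import math

mu, k = 1.5, 2.0                                  # assumed values; k > 0 is the restoring constant
omega = math.sqrt(k / mu)
amp_x, amp_y, alpha, beta = 1.0, 0.6, 0.3, 1.4    # assumed amplitudes and phases

def state(t):
    """Position and momentum of the two-dimensional isotropic oscillator at time t."""
    x = amp_x * math.cos(omega * t + alpha)
    y = amp_y * math.cos(omega * t + beta)
    px = -mu * omega * amp_x * math.sin(omega * t + alpha)
    py = -mu * omega * amp_y * math.sin(omega * t + beta)
    return x, y, px, py

def tensor_invariants(t):
    """Return (E1, E2, l, A12) evaluated at time t."""
    x, y, px, py = state(t)
    e1 = px**2 / (2 * mu) + 0.5 * k * x**2
    e2 = py**2 / (2 * mu) + 0.5 * k * y**2
    l = x * py - y * px
    a12 = px * py / (2 * mu) + 0.5 * k * x * y
    return e1, e2, l, a12

samples = [tensor_invariants(0.37 * n) for n in range(20)]
residuals = [abs(a12**2 - (e1 * e2 - k * l**2 / (4 * mu))) for e1, e2, l, a12 in samples]
spread = max(s[3] for s in samples) - min(s[3] for s in samples)
print("largest residual:", max(residuals), "variation of A12:", spread)
```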
In summary, the inverse-square, and harmonic oscillator two-body central interactions both lead to closed,
elliptical equivalent orbits, the plane of which is perpendicular to the conserved angular momentum vector.
However, for the inverse-square force, the origin of the equivalent orbit is at the focus of the ellipse and
ω_r = ω_θ, whereas the origin is at the center of the ellipse and ω_r = 2ω_θ for the harmonic force. As a
consequence, the elliptical orbit is reflection symmetric for the harmonic force but not for the inverse square
force. The eccentricity vector and symmetry tensor both specify the major axes of these elliptical orbits,
the plane of which is perpendicular to the angular momentum vector. The eccentricity vector, and the
symmetry tensor, both are directly related to the eccentricity of the orbit and the total energy of the two-
body system. Noether's theorem states that the invariance of the eccentricity vector and symmetry tensor,
plus the corresponding closed orbits, are manifestations of underlying symmetries. The dynamical SU(3)
symmetry underlies the invariance of the symmetry tensor, whereas the dynamical O(4) symmetry underlies
the invariance of the eccentricity vector. These symmetries lead to stable closed elliptical bound orbits only
for these two specific two-body central forces, and not for other two-body central forces.
9.10. CLOSED-ORBIT STABILITY 245
$$F(r_{0}) = -\frac{l^{2}}{\mu r_{0}^{3}} = -\mu r_{0}\dot{\theta}^{2} \qquad (9.123)$$
which can be written in terms of the central force for a stable orbit as
$$-\left(\frac{3F(r_{0})}{r_{0}} + \left(\frac{\partial F}{\partial r}\right)_{r_{0}}\right) > 0 \qquad (9.128)$$
To the extent that this linear restoring force dominates over higher-order terms, then a perturbation of the
stable orbit will undergo simple harmonic oscillations about the stable orbit with angular frequency
$$\omega = \sqrt{\frac{\left(\frac{d^{2}U_{eff}}{dr^{2}}\right)_{r=r_{0}}}{\mu}} \qquad (9.133)$$
The above discussion shows that a small amplitude radial oscillation about the stable orbit with amplitude
will be of the form
= sin(2 + )
The orbit will be closed if the product of the oscillation frequency and the orbit period is an integer
value.
The fact that planetary orbits in the gravitational field are observed to be closed is strong evidence
that the gravitational force field must obey the inverse square law. Actually there are small precessions of
planetary orbits due to perturbations of the gravitational field by bodies other than the sun, and due to
relativistic effects. Also the gravitational field near the earth departs slightly from the inverse square law
because the earth is not a perfect sphere, and the field does not have perfect spherical symmetry. The study
of the precession of satellites around the earth has been used to determine the oblate quadrupole and slight
octupole (pear shape) distortion of the shape of the earth.
The most famous test of the inverse square law for gravitation is the precession of the perihelion of
Mercury. If the attractive force experienced by Mercury is of the form
$$\mathbf{F}(r) = -\frac{GmM}{r^{2+\alpha}}\,\hat{\mathbf{r}}$$
where |α| is small, then it can be shown that, for approximately circular orbits, the perihelion will advance
by a small angle απ per orbit period. That is, the precession is zero if α = 0, corresponding to an inverse
square law dependence which agrees with Bertrand's theorem. The position of the perihelion of Mercury has
been measured with great accuracy showing that, after correcting for all known perturbations, the perihelion
advances by 43(±5) seconds of arc per century, that is 5 × 10^−7 radians per revolution. This corresponds to
α = 1.6 × 10^−7 which is small but still significant. This precession remained a puzzle for many years until
1915 when Einstein predicted that one consequence of his general theory of relativity is that the planetary
orbit of Mercury should precess at 43 seconds of arc per century, which is in remarkable agreement with
observations.
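The conversion from 43 seconds of arc per century to an advance per revolution, and then to the exponent correction, is simple arithmetic that can be reproduced as follows. The sketch assumes a sidereal orbital period for Mercury of about 87.969 days and a Julian century of 36525 days, neither of which is quoted in the text, and it assumes that the advance per orbit equals π times the exponent correction, which is consistent with the two numbers quoted above.

```python
import math

ARCSEC_PER_CENTURY = 43.0        # observed residual perihelion advance quoted in the text
MERCURY_PERIOD_DAYS = 87.969     # assumed sidereal orbital period of Mercury
DAYS_PER_CENTURY = 36525.0       # assumed Julian century

advance_per_century = math.radians(ARCSEC_PER_CENTURY / 3600.0)    # radians per century
revolutions_per_century = DAYS_PER_CENTURY / MERCURY_PERIOD_DAYS
advance_per_revolution = advance_per_century / revolutions_per_century

# Assumption: the advance per orbit equals pi times the exponent correction.
exponent_correction = advance_per_revolution / math.pi

print("radians per revolution:", advance_per_revolution)
print("exponent correction:", exponent_correction)
```

Under these assumptions the results are close to 5 × 10^−7 radians per revolution and 1.6 × 10^−7 for the exponent correction.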
$$U_{eff} = \frac{1}{2}kr^{2} + \frac{l^{2}}{2\mu r^{2}}$$
At the minimum
$$\left(\frac{dU_{eff}}{dr}\right)_{r=r_{0}} = kr_{0} - \frac{l^{2}}{\mu r_{0}^{3}} = 0$$
Thus
$$r_{0} = \left(\frac{l^{2}}{\mu k}\right)^{\frac{1}{4}}$$
and
$$\left(\frac{d^{2}U_{eff}}{dr^{2}}\right)_{r=r_{0}} = k + \frac{3l^{2}}{\mu r_{0}^{4}} = 4k > 0$$
which is a stable orbit. Small perturbations of such a stable circular orbit will have an angular frequency
$$\omega = \sqrt{\frac{\left(\frac{d^{2}U_{eff}}{dr^{2}}\right)_{r=r_{0}}}{\mu}} = 2\sqrt{\frac{k}{\mu}}$$
Note that this is twice the frequency for the planar harmonic oscillator with the same restoring coefficient.
This is because, owing to the central repulsion, the effective potential well for this rotating oscillator example has about
half the width of that for the corresponding planar harmonic oscillator. Note that the kinetic energy for the rotational
motion, which is l²/(2μr_0²), equals the potential energy ½kr_0² at the minimum as predicted by the Virial Theorem
for a linear two-body restoring force.
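These results for the linear restoring force can be confirmed numerically. The sketch below assumes the effective potential U_eff = k r²/2 + l²/(2μr²), locates the minimum from r_0 = [l²/(μk)]^(1/4), estimates the curvature with a central finite difference, and compares the resulting angular frequency with 2(k/μ)^(1/2). The numerical values of μ, k, l and the step size are illustrative assumptions.

```python
import math

mu, k, l = 2.0, 3.0, 1.5        # assumed reduced mass, force constant, angular momentum

def u_eff(r):
    """Effective potential: harmonic term plus centrifugal term."""
    return 0.5 * k * r**2 + l**2 / (2 * mu * r**2)

r0 = (l**2 / (mu * k)) ** 0.25                      # radius of the circular orbit
h = 1.0e-4                                          # finite-difference step (assumed)
curvature = (u_eff(r0 + h) - 2 * u_eff(r0) + u_eff(r0 - h)) / h**2
omega = math.sqrt(curvature / mu)                   # small-oscillation angular frequency

rotational_kinetic = l**2 / (2 * mu * r0**2)
potential = 0.5 * k * r0**2
print(curvature, 4 * k, omega, 2 * math.sqrt(k / mu), rotational_kinetic, potential)
```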
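$$r = r_{0}e^{\alpha\theta}$$
That this is true can be shown by inserting this orbit into the differential orbit equation.
Using a Binet transformation to the variable u gives
$$u = \frac{1}{r} = \frac{1}{r_{0}}e^{-\alpha\theta}$$
$$\frac{du}{d\theta} = -\frac{\alpha}{r_{0}}e^{-\alpha\theta}$$
$$\frac{d^{2}u}{d\theta^{2}} = \frac{\alpha^{2}}{r_{0}}e^{-\alpha\theta}$$
Inserting these into the differential equation of the orbit
$$\frac{d^{2}u}{d\theta^{2}} + u = -\frac{\mu}{l^{2}}\frac{1}{u^{2}}F\left(\frac{1}{u}\right)$$
gives
$$\frac{\alpha^{2}}{r_{0}}e^{-\alpha\theta} + \frac{1}{r_{0}}e^{-\alpha\theta} = -\frac{\mu}{l^{2}}r_{0}^{2}e^{2\alpha\theta}F\left(\frac{1}{u}\right)$$
That is
$$F\left(\frac{1}{u}\right) = -\frac{\left(\alpha^{2} + 1\right)l^{2}}{\mu}r_{0}^{-3}e^{-3\alpha\theta} = -\frac{\left(\alpha^{2} + 1\right)l^{2}}{\mu r^{3}}$$
which is a central attractive inverse cubic force.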
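The inverse-cubic result can be checked numerically without repeating the algebra. The sketch below assumes a logarithmic spiral of the form r = r_0 exp(aθ), where the symbols r_0 and a and all numerical values are chosen here for illustration, evaluates the second derivative of u = 1/r by finite differences, and solves the differential orbit equation for the force. The product F r³ should then be the same at every angle.

```python
import math

mu, l, r0, a = 1.0, 2.0, 1.5, 0.3     # assumed illustrative values

def u(theta):
    """Binet variable u = 1/r for the assumed spiral r = r0 * exp(a * theta)."""
    return math.exp(-a * theta) / r0

def force(theta, h=1.0e-4):
    """Force from d2u/dtheta2 + u = -(mu / l**2) * F / u**2, using a finite difference."""
    d2u = (u(theta + h) - 2 * u(theta) + u(theta - h)) / h**2
    return -(l**2 / mu) * u(theta)**2 * (d2u + u(theta))

expected = -(a**2 + 1) * l**2 / mu                   # predicted constant value of F * r**3
products = [force(th) / u(th)**3 for th in (0.0, 0.5, 1.0, 2.0, 3.0)]
print(products, expected)
```

The negative sign of the constant confirms that the force is attractive.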
The time dependence of the spiral orbit can be derived since the angular momentum gives
$$\dot{\theta} = \frac{l}{\mu r^{2}} = \frac{l}{\mu r_{0}^{2}e^{2\alpha\theta}}$$
2
̇2 + + ( − ) =
22
The effective potential is
2
= + ( − )
22
which is shown in the adjacent figure. The stationary value occurs when
µ ¶
2
= − 3 + = 0
0 0
2 = 2 03
Note that 0 = 0 if = 0.
The stability of the solution is given by the second deriv-
ative µ 2 ¶
32 3
2
= 4 = 0
0 0 0
Therefore the stationary point is stable.
Note that the equation of motion for the minimum can be
expressed in terms of the restoring force on the two masses
µ 2 ¶
2̈ = − ( − 0 )
2 0
Even when all the bodies are interacting via two-body central forces, the problem usually is insoluble in
terms of known analytic integrals. Newton first posed the difficulty of the three-body Kepler problem which
has been studied extensively by mathematicians and physicists. No known general analytic integral solution
has been found. Each body for the n-body system has 6 degrees of freedom, that is, 3 for position and 3 for
momentum. The center-of-mass motion can be factored out, therefore the center-of-mass system for the
n-body system has 6n − 10 degrees of freedom after subtraction of 3 degrees for location of the center of mass,
3 for the linear momentum of the center of mass, 3 for rotation of the center of mass, and 1 for the total energy of
the system. Thus for n = 2 there are 12 − 10 = 2 degrees of freedom for the two-body system for which the
Kepler approach takes to be r and θ. For n = 3 there are 8 degrees of freedom in the center of mass system
that have to be determined.
Figure 9.10: A contour plot of the effective potential for the Sun-Earth gravitational system in the rotating
frame where the Sun and Earth are stationary. The 5 Lagrange points are saddle points where the net force
is zero. (Figure created by NASA)
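The counting argument above amounts to a one-line formula, sketched here for the two cases mentioned in the text. The function name is an illustrative assumption.

```python
def cm_degrees_of_freedom(n_bodies):
    """Degrees of freedom left in the center-of-mass system of an n-body problem.

    Each body contributes 6 (3 position, 3 momentum); 10 are removed for the
    center-of-mass location (3), its linear momentum (3), rotation (3), and energy (1).
    """
    return 6 * n_bodies - 10

two_body = cm_degrees_of_freedom(2)
three_body = cm_degrees_of_freedom(3)
print(two_body, three_body)
```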
Numerical solutions to the three-body problem can be obtained using successive approximation or per-
turbation methods in computer calculations. The problem can be simplified by restricting the motion to
either of the following two approximations:
1) Planar approximation
This approximation assumes that the three masses move in the same plane, that is, the number of degrees
of freedom is reduced from 8 to 6 which simplifies the numerical solution.
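The polar angle ψ is measured with respect to the symmetry axis of the two-body system which is along
the line of distance of closest approach as shown in figure 9.6. The geometry and symmetry show that the
scattering angle θ is related to the trajectory angle ψ∞ by
$$\theta = \pi - 2\psi_{\infty} \qquad (9.144)$$
Since
$$l^{2} = b^{2}p^{2} = 2b^{2}\mu E_{cm} \qquad (9.146)$$
then the scattering angle can be written as
$$\psi_{\infty} = \frac{\pi - \theta}{2} = \int_{r_{min}}^{\infty}\frac{b\,dr}{r^{2}\sqrt{\left(1 - \frac{U}{E_{cm}}\right) - \frac{b^{2}}{r^{2}}}} \qquad (9.147)$$
Let u = 1/r, so that u_max = 1/r_min, then
$$\psi_{\infty} = \frac{\pi - \theta}{2} = \int_{0}^{u_{max}}\frac{b\,du}{\sqrt{\left(1 - \frac{U}{E_{cm}}\right) - b^{2}u^{2}}} \qquad (9.148)$$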
$$r_{min} = \frac{k}{2E_{cm}}(1 + \epsilon) \qquad (9.156)$$
$$r_{min} = \frac{k}{2E_{cm}}\left(1 + \frac{1}{\sin\frac{\theta}{2}}\right) \qquad (9.157)$$
Note that for θ = 180° then
$$E_{cm} = \frac{k}{r_{min}} = U(r_{min}) \qquad (9.158)$$
which is what you would expect from equating the incident kinetic energy to the potential energy at the
distance of closest approach.
For scattering of two nuclei by the normal repulsive Coulomb force, when the impact parameter becomes
small enough, the attractive nuclear force also acts leading to impact-parameter dependent effective
potentials illustrated in figure 9.14. Trajectory 1 does not overlap the nuclear force and thus is pure Coulomb.
Trajectory 2 interacts at the periphery of the nuclear potential and the trajectory deviates from pure Coulomb
shown dashed. Trajectory 3 passes through the interior of the nuclear potential. These three trajectories all
can lead to the same scattering angle and thus there no longer is a one-to-one correspondence between
scattering angle and impact parameter.
Figure 9.14: Classical trajectories for scattering to a given angle θ by the repulsive Coulomb field plus the
attractive nuclear field for three different impact parameters. Path 1 is pure Coulomb. Paths 2 and 3 include
Coulomb plus nuclear interactions. The dashed parts of trajectories 2 and 3 correspond to only the Coulomb
force acting, i.e. zero nuclear force.
This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For scat-
tering of nuclei in the Coulomb potential, the constant k is given to be
$$k = \frac{Z_{p}Z_{T}e^{2}}{4\pi\varepsilon_{0}} \qquad (9.160)$$
The cross section, scattering angle θ and E_cm of equation 9.159 are in the center-of-mass coordinate system,
whereas usually two-body elastic scattering data involve scattering of the projectiles by a stationary target.
9.12. TWO-BODY SCATTERING 255
Geiger and Marsden performed scattering of 7.7 MeV α particles from a thin gold foil and proved that
the differential scattering cross section obeyed the Rutherford formula back to angles corresponding to a
distance of closest approach of 10^−14 m which is much smaller than the 10^−10 m size of the atom. This
validated the Rutherford model of the atom and immediately led to the Bohr model of the atom which
played such a crucial role in the development of quantum mechanics. Bohr showed that the agreement with
the Rutherford formula implies the Coulomb field obeys the inverse square law to small distances. This work
was performed at Manchester University, England between 1908 and 1913. It is fortunate that the classical
result is identical to the quantal cross section for scattering, otherwise the development of modern physics
could have been delayed for many years.
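The order of magnitude of the distance of closest approach can be estimated with a short calculation. The sketch below assumes the Coulomb relation r_min = (k/2E_cm)[1 + 1/sin(θ/2)], a force constant equal to the product of the two nuclear charges times e²/(4πε0) ≈ 1.44 MeV fm, and mass numbers as a stand-in for the nuclear masses. The function and variable names are illustrative.

```python
import math

E2_OVER_4PI_EPS0 = 1.44          # MeV fm, approximate value of e^2 / (4 pi epsilon_0)

def closest_approach(k, e_cm, theta_deg):
    """Distance of closest approach (fm) for repulsive Coulomb scattering to angle theta."""
    return k / (2 * e_cm) * (1 + 1 / math.sin(math.radians(theta_deg) / 2))

z_alpha, z_gold = 2, 79
m_alpha, m_gold = 4.0, 197.0     # mass numbers used in place of masses (assumption)
e_lab = 7.7                      # MeV, alpha-particle energy in the laboratory frame

e_cm = e_lab * m_gold / (m_alpha + m_gold)      # energy available in the cm frame
k = z_alpha * z_gold * E2_OVER_4PI_EPS0         # MeV fm
r_head_on = closest_approach(k, e_cm, 180.0)    # head-on collision
print("closest approach for 180 degrees: %.1f fm" % r_head_on)
```

The head-on value is about 30 fm, that is 3 × 10^−14 m, the same order of magnitude as the distance quoted above.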
Scattering of very heavy ions, such as ²⁰⁸Pb, can electromagnetically excite target nuclei. For the Coulomb
force the impact parameter b and the distance of closest approach, r_min, are directly related to the scattering
angle θ by equation 9.155. Thus observing the angle of the scattered projectile unambiguously determines the
hyperbolic trajectory and thus the electromagnetic impulse given to the colliding nuclei. This process, called
Coulomb excitation, uses the measured angular distribution of the scattered ions for inelastic excitation of
the nuclei to precisely and unambiguously determine the Coulomb excitation cross section as a function of
impact parameter. This unambiguously determines the shape of the nuclear charge distribution.
̇ = ̇ = =− = − cos ()
2
1
√
The initial energy gives that = 2 Hence the orbit equation is
√
1 2
= = sin ()
The above trajectory has a distance of closest approach, min , when min = 2 . Moreover, due to the
symmetry of the orbit, the scattering angle is given by
µ ¶
1
= − 2 0 = 1 −
Since 2 = 2 2 ̇∞
2
= 22 then
µ ¶− 12 µ ¶− 12
2
1− = 1+ 2 = 1+ 2
This gives that the impact parameter is related to scattering angle by
2
( − )
2 =
(2 − )
This impact parameter relation can be used in equation 9.141 to give the differential cross section
¯ ¯
¯¯ ¯¯ 2 ( − )
= =
Ω sin ¯ ¯ (2 − )2 2
These orbits are called Cotes spirals.
$$\mathbf{p}^{Initial}_{P} + \mathbf{p}^{Initial}_{T} = 0 \qquad (9.161)$$
Using the center-of-momentum frame, coupled with the conservation of linear momentum, implies that the
vector sum of the final momenta of the N reaction products also is zero. That is
$$\sum_{i=1}^{N}\mathbf{p}^{Final}_{i} = 0 \qquad (9.162)$$
An additional constraint is that energy conservation relates the initial and final kinetic energies by
$$\frac{\left(p^{Initial}_{P}\right)^{2}}{2m_{P}} + \frac{\left(p^{Initial}_{T}\right)^{2}}{2m_{T}} + Q = \frac{\left(p^{Final}_{P}\right)^{2}}{2m_{P}} + \frac{\left(p^{Final}_{T}\right)^{2}}{2m_{T}} \qquad (9.163)$$
where the Q value is the energy contributed to the final total kinetic energy by the reaction between the
incoming projectile and target. For exothermic reactions, Q > 0, the summed kinetic energy of the reaction products
exceeds the sum of the incoming kinetic energies, while for endothermic reactions, Q < 0, the summed kinetic
energy of the reaction products is less than that of the incoming channel.
For two-body kinematics, the following are three advantages to working in the center-of-momentum frame
of reference.
1. Two incident colliding bodies are colinear as are two final bodies.
2. The linear momenta for the two colliding bodies are identical in both the incident channel and also the
outgoing channel.
3. The total energy in the center-of-momentum coordinate frame is the energy available to the reac-
tion during the collision. The trivial kinetic energy of the center-of-momentum frame relative to the
laboratory frame is handled separately.
The kinematics for two-body reactions is easily determined using the conservation of linear momentum
along and perpendicular to the beam direction plus the conservation of energy, equations 9.161 to 9.163. Note that it is
common practice to use the name center-of-mass rather than center-of-momentum in spite of the fact that
for relativistic mechanics only the center-of-momentum is a meaningful concept.
General features of the transformation between the center-of-momentum and laboratory frames of refer-
ence are best illustrated by elastic or inelastic scattering of nuclei where the two reaction products in the final
channel are identical to the incident bodies. Inelastic excitation of an excited state energy of ΔE in either
reaction product corresponds to Q = −ΔE while elastic scattering corresponds to Q = −ΔE = 0.
For inelastic scattering the conservation of linear momenta for the outgoing channel in the center-of-
momentum frame simplifies to
$$\mathbf{p}^{Final}_{P} + \mathbf{p}^{Final}_{T} = 0 \qquad (9.164)$$
that is, the linear momenta of the two reaction products are equal and opposite.
Assume that the scattered projectile emerges at a given center-of-momentum angle relative to the direction
of the incoming projectile, so that the scattered target nucleus emerges at the supplementary
center-of-momentum angle, that is, π minus the projectile angle. Elastic scattering corresponds to simple
scattering for which the magnitudes of the incoming and outgoing projectile momenta are equal.
9.13. TWO-BODY KINEMATICS 257
Figure 9.15: Vector hodograph of the scattered projectile and target velocities for a projectile, with incident
velocity v_P, that is elastically scattered by a stationary target body. The circles show the magnitude of
the projectile and target body final velocities in the center of mass. The center-of-mass velocity vectors
are shown as dashed lines while the laboratory vectors are shown as solid lines. The left hodograph shows
normal kinematics where the projectile mass is less than the target mass. The right hodograph shows inverse
kinematics where the projectile mass is greater than the target mass. For elastic scattering u_P = u′_P.
Velocities
The transformation between the center-of-momentum and laboratory frames requires knowledge of the par-
ticle velocities which can be derived from the linear momenta since the particle masses are known. Assume
that a projectile, mass m_P, with incident energy E_P in the laboratory frame bombards a stationary target
with mass m_T. The incident projectile velocity v_P is given by
$$v_{P} = \sqrt{\frac{2E_{P}}{m_{P}}} \qquad (9.165)$$
The final velocities in the laboratory frame after the inelastic collision are denoted v′_P and v′_T.
In the center-of-momentum coordinate system, equation 9.10 implies that the initial center-of-momentum
velocities are
$$u_{P} = v_{P}\frac{m_{T}}{m_{P} + m_{T}}$$
$$u_{T} = v_{P}\frac{m_{P}}{m_{P} + m_{T}} \qquad (9.166)$$
It is simple to derive that the final center-of-momentum velocities after the inelastic collision are given
by
$$u'_{P} = \frac{m_{T}}{m_{P} + m_{T}}\sqrt{\frac{2\tilde{E}}{m_{P}}}$$
$$u'_{T} = \frac{m_{P}}{m_{P} + m_{T}}\sqrt{\frac{2\tilde{E}}{m_{P}}} \qquad (9.167)$$
Angles
The angles of the scattered recoils are written as
and
where
1 1
= q = q (9.170)
1 + 1 +
( +
(1 + ) )
and
is the energy per nucleon on the incident projectile.
Equation 9.169 can be rewritten as
sin
tan
=
(9.171)
cos
+
Another useful relation from equation 9.169 gives the center-of-momentum scattering angle in terms of
the laboratory scattering angle.
= sin
−1
( sin
) + (9.172)
This gives the difference in angle between the lab scattering angle and the center-of-momentum scattering
angle. Be careful with this relation since
is two-valued for inverse kinematics corresponding to the two
possible signs for the solution.
The angle relations between the lab and center-of-momentum for the recoiling target nucleus are connected
by
r
sin( − )
= ≡ ̃ (9.173)
sin ̃
That is
= sin−1 (̃ sin ) + (9.174)
Figure 9.16: The kinematic correlation of the laboratory and center-of-mass scattering angles of the recoiling
projectile and target nuclei for scattering for 4.3 MeV/nucleon ¹⁰⁴Pd on ²⁰⁸Pb (left) and for the inverse
4.3 MeV/nucleon ²⁰⁸Pb on ¹⁰⁴Pd (right). The projectile scattering angles are shown by solid lines while the
recoiling target angles are shown by dashed lines. The blue curves correspond to elastic scattering, that is
Q = 0, while the red curves correspond to inelastic scattering with Q = −5 MeV.
where
1 1
̃ = q =q (9.175)
1+ (1 +
) 1+ ( +
)
Note that τ̃ is the same under interchange of the two nuclei at the same incident energy/nucleon, and
that τ̃ is always larger than or equal to unity since Q is negative. For elastic scattering τ̃ = 1 which gives
1
= ( − ) (Recoil lab angle for elastic scattering)
2
sin
tan = (Target lab to CM angle conversion)
cos + ̃
Velocity vector hodographs provide useful insight into the behavior of the kinematic solutions. As shown
in figure 9.15, in the center-of-momentum frame the scattered projectile has a fixed final speed, that is,
the velocity vector describes a circle as a function of the center-of-momentum scattering angle. The vector
addition of this vector and the velocity of the center of mass gives the laboratory frame velocity. Note that
for normal kinematics, where the projectile mass is less than the target mass, the center-of-mass speed is
less than the final center-of-momentum speed of the projectile, leading to a monotonic one-to-one mapping
of the center-of-momentum angle and the laboratory angle. However, for inverse kinematics, where the
projectile mass is greater than the target mass, the center-of-mass speed exceeds the final center-of-momentum
speed of the projectile, leading to two valued solutions at any fixed laboratory scattering angle.
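Billiard ball collisions are an especially simple example where the two masses are identical and the collision
is essentially elastic. Then essentially τ = τ̃ = 1, the laboratory angle of the scattered ball is half the
center-of-momentum scattering angle, and the laboratory angle of the struck ball is half the supplement of
the center-of-momentum scattering angle, that is, the angle between the scattered billiard balls is π/2.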
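The billiard-ball result, and the bounded projectile angle that occurs for inverse kinematics, can both be reproduced by transforming velocities from the center-of-momentum frame to the laboratory frame. The sketch below assumes non-relativistic elastic scattering from a stationary target. The function name, the unit incident speed, and the 2:1 mass ratio used for the inverse-kinematics case are illustrative assumptions.

```python
import math

def lab_angles(m_proj, m_targ, theta_cm_deg, v_proj=1.0):
    """Laboratory angles (degrees) of projectile and target after elastic scattering.

    theta_cm_deg is the projectile scattering angle in the center-of-momentum
    frame; the target emerges back to back with the projectile in that frame.
    """
    th = math.radians(theta_cm_deg)
    total = m_proj + m_targ
    v_cm = m_proj * v_proj / total         # velocity of the center of mass
    u_proj = m_targ * v_proj / total       # projectile speed in the cm frame
    u_targ = m_proj * v_proj / total       # target speed in the cm frame
    proj = (v_cm + u_proj * math.cos(th), u_proj * math.sin(th))
    targ = (v_cm - u_targ * math.cos(th), -u_targ * math.sin(th))
    return (math.degrees(math.atan2(proj[1], proj[0])),
            math.degrees(math.atan2(targ[1], targ[0])))

# Equal masses (billiard balls): the two laboratory directions are 90 degrees apart.
billiard = [lab_angles(1.0, 1.0, th) for th in (30.0, 60.0, 90.0, 150.0)]
opening = [p - t for p, t in billiard]

# Inverse kinematics with a 2:1 mass ratio: the projectile laboratory angle is bounded.
max_proj_angle = max(lab_angles(2.0, 1.0, 0.1 * i)[0] for i in range(1, 1800))
print(opening, max_proj_angle)
```

For the 2:1 mass ratio the largest projectile laboratory angle found by the scan is 30°, the angle whose sine is the inverse of the mass ratio.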
Both normal and inverse kinematics are illustrated in figure 9.16 which shows the dependence of the
projectile and target scattering angles in the laboratory frame as a function of center-of-momentum scattering
angle for the Coulomb scattering of ¹⁰⁴Pd by ²⁰⁸Pb, that is, for a mass ratio of 2 : 1. Both normal and
inverse kinematics are shown for the same bombarding energy of 4.3 MeV/nucleon for elastic scattering and
for inelastic scattering with a Q-value of −5 MeV.
Figure 9.17: Recoil energies, in MeV, versus laboratory scattering angle, shown on the left for scattering of
447 MeV ¹⁰⁴Pd by ²⁰⁸Pb with Q = −5.0 MeV, and shown on the right for scattering of 894 MeV ²⁰⁸Pb on
¹⁰⁴Pd with Q = −5.0 MeV.
Since the sine of an angle cannot exceed unity, equation 9.173 implies that the product of τ̃ and the sine of
the laboratory scattering angle of the recoiling target nucleus cannot exceed 1. Since τ̃ is always larger than or
equal to unity there is a maximum scattering angle θ_max in the laboratory frame for the recoiling target nucleus
given by
$$\sin\theta_{max} = \frac{1}{\tilde{\tau}} \qquad (9.176)$$
For elastic scattering θ_max = sin^−1(1/τ̃) = 90° since τ̃ = 1 for both 894 MeV ²⁰⁸Pb bombarding ¹⁰⁴Pd, and
the inverse reaction using a 447 MeV ¹⁰⁴Pd beam scattered by a ²⁰⁸Pb target. A Q-value of −5 MeV
gives τ̃ = 1.002808 which implies a maximum scattering angle of θ_max = 85.71° for both 894 MeV ²⁰⁸Pb
bombarding ¹⁰⁴Pd, and the inverse reaction of a 447 MeV ¹⁰⁴Pd beam scattered by a ²⁰⁸Pb target. As a
consequence there are two solutions for the center-of-momentum angle for any allowed value of the target
laboratory angle, as illustrated in figure 9.16.
Since sin(
− ) ≤ 1 then equation 9150 implies that sin ≤ 1 For a 447
104
Pd beam
208
scattered by a Pb target = 050, thus = 05 for elastic scattering which implies that there is no
upper bound to
. This leads to a one-to-one correspondence between and for normal kinematics.
In contrast, the projectile has a maximum scattering angle in the laboratory frame for inverse kinematics
since
= 20 leading to an upper bound to given by
1
sin
max = (9.177)
For elastic scattering = 2 implying ◦
max = 30 . In addition to having a maximum value for , when
1 there also are two solutions for for any allowed value of . For the example of 894 208 Pb
bombarding 178 Hf leads to a maximum projectile scattering angle of ◦
= 300 for elastic scattering and
◦
= 29907 for = −5
Kinetic energies
In the laboratory frame the kinetic energies of the scattered projectile and recoiling target nucleus are
given by
µ ¶2 ³ ´
= 1 + 2 + 2 cos
̃ (9.180)
+
³ ´
2
= 2 1 + ̃ + 2̃ cos ̃ (9.181)
( + )
where
and are the center-of-mass scattering angles respectively for the scattered projectile and
target nuclei.
For the chosen incident energies the normal and inverse reactions give the same center-of-momentum
energy of 298 MeV which is the energy available to the interaction between the colliding nuclei. However,
the kinetic energy of the center-of-momentum is 447 − 298 = 149 MeV for normal kinematics and 894 − 298 =
596 MeV for inverse kinematics. This trivial center-of-momentum kinetic energy does not contribute to the
reaction. Note that inverse kinematics focusses all the scattered nuclei into the forward hemisphere which
reduces the required solid angle for particle detection.
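The energies quoted above follow from the non-relativistic expression for the energy available in the center-of-momentum frame when the target is stationary, namely the laboratory energy multiplied by the target mass divided by the total mass. The sketch below reproduces the arithmetic, using mass numbers in place of the nuclear masses, which is an assumption, together with an illustrative function name.

```python
def cm_energy(e_lab, m_proj, m_targ):
    """Kinetic energy available in the cm frame for a stationary target (non-relativistic)."""
    return e_lab * m_targ / (m_proj + m_targ)

# 447 MeV 104Pd on 208Pb (normal kinematics) and 894 MeV 208Pb on 104Pd (inverse kinematics).
normal_available = cm_energy(447.0, 104.0, 208.0)
inverse_available = cm_energy(894.0, 208.0, 104.0)

# The remainder is the kinetic energy of the center of momentum, which is not available.
normal_cm_motion = 447.0 - normal_available
inverse_cm_motion = 894.0 - inverse_available
print(normal_available, inverse_available, normal_cm_motion, inverse_cm_motion)
```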
Solid angles
The laboratory-frame solid angles for the scattered projectile and target are taken to be and
respectively, while the center-of-momentum solid angles are Ω and Ω respectively. The Jacobian relating
the solid angles is
à !2
sin ¯ ¯
¯ ¯
= ¯cos( − ¯
) (9.182)
Ω sin
à !2
sin ¯ ¯
¯ ¯
= ¯cos( − )¯ (9.183)
Ω sin
These can be used to transform the calculated center-of-momentum differential cross sections to the
laboratory frame for comparison with measured values. Note that relative to the center-of-momentum frame,
the forward focussing increases the observed differential cross sections in the forward laboratory frame and
decreases them in the backward hemisphere.
9.14 Summary
This chapter has focussed on the classical mechanics of bodies interacting via conservative, two-body, central
interactions. The following are the main topics presented in this chapter.
Equivalent one-body representation for two bodies interacting via a central interaction
The equivalent one-body representation of the motion of two bodies interacting via a two-body central interaction greatly simplifies solution of the equations of motion. The position vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ are expressed in terms of the center-of-mass vector $\mathbf{R}$ plus total mass $M = m_1 + m_2$, while the position vector $\mathbf{r}$ plus associated reduced mass $\mu = \frac{m_1 m_2}{m_1 + m_2}$ describe the relative motion of the two bodies in the center of mass. The total Lagrangian then separates into two independent parts
$$L = \frac{1}{2} M \left|\dot{\mathbf{R}}\right|^2 + L_{cm} \tag{9.16}$$
where the center-of-mass Lagrangian is
$$L_{cm} = \frac{1}{2}\mu\left|\dot{\mathbf{r}}\right|^2 - U(r) \tag{9.17}$$
Equations 9.10 and 9.11 can be used to derive the actual spatial trajectories of the two bodies, expressed in terms of $\mathbf{r}_1$ and $\mathbf{r}_2$, from the relative equations of motion, written in terms of $\mathbf{R}$ and $\mathbf{r}$ for the equivalent one-body solution.
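The change of variables between the two-body and equivalent one-body descriptions can be sketched in a few lines of Python. This is only an illustration: it assumes the relative vector is defined as r = r1 - r2 (the sign convention of equations 9.10 and 9.11 is not reproduced in this summary), and the function names are hypothetical.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1*m2/(m1 + m2)."""
    return m1 * m2 / (m1 + m2)

def to_cm_and_relative(m1, r1, m2, r2):
    """Return (R, r): center-of-mass vector and relative vector r = r1 - r2 (assumed convention)."""
    M = m1 + m2
    R = tuple((m1 * a + m2 * b) / M for a, b in zip(r1, r2))
    r = tuple(a - b for a, b in zip(r1, r2))
    return R, r

def to_individual(m1, m2, R, r):
    """Invert the transformation: recover r1 and r2 from R and r."""
    M = m1 + m2
    r1 = tuple(Rc + (m2 / M) * rc for Rc, rc in zip(R, r))
    r2 = tuple(Rc - (m1 / M) * rc for Rc, rc in zip(R, r))
    return r1, r2
```

Solving the one-body problem for r(t), and the free motion of R(t), then gives both trajectories through `to_individual`.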
Angular momentum
Noether’s theorem shows that the angular momentum is conserved if only a spherically-symmetric two-body central force acts between the two interacting bodies. The plane of motion is perpendicular to the angular momentum vector and thus the Lagrangian can be expressed in polar coordinates as
$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - U(r) \tag{9.22}$$
Differential orbit equation of motion
The Binet transformation $u = \frac{1}{r}$ allows the center-of-mass Lagrangian for a central force $\mathbf{F} = f(r)\hat{\mathbf{r}}$ to be used to express the differential orbit equation for the radial motion as
$$\frac{d^2u}{d\theta^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2} f\!\left(\frac{1}{u}\right) \tag{9.39}$$
Both the Lagrangian and the Hamiltonian were used to derive the equations of motion for two bodies interacting via a two-body, conservative, central interaction. The general features of the conservation of angular momentum and conservation of energy for a two-body, central potential were presented.
Inverse-square, two-body, central force
The inverse-square, two-body, central force is of pivotal importance in nature since it applies to both the gravitational force and the Coulomb force. The underlying symmetries of the inverse-square, two-body, central interaction lead to conservation of angular momentum, conservation of energy, Gauss’s law, and two-body orbits that are closed, degenerate conic sections for which the eccentricity vector is conserved. The radial dependence, relative to the force center which lies at one focus of the conic section, is given by
$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos(\theta - \theta_0)\right] \tag{9.58}$$
where the orbit eccentricity equals
$$\epsilon = \sqrt{1 + \frac{2El^2}{\mu k^2}} \tag{9.62}$$
These lead to Kepler’s three laws of motion for two bodies in a bound orbit due to the attractive gravitational force for which $k = -Gm_1m_2$. The inverse-square law is special in that the eccentricity vector $\mathbf{A}$ is a third invariant of the motion, where
$$\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\,\hat{\mathbf{r}}) \tag{9.86}$$
The eccentricity vector unambiguously defines the orientation and direction of the major axis of the elliptical orbit. The invariance of the eccentricity vector, and the existence of stable closed orbits, are manifestations of the dynamical O(4) symmetry.
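The orbit equation and eccentricity quoted above can be checked numerically. The sketch below assumes the sign convention in which the force is (k/r^2) r-hat, so that the potential is U = k/r with k negative for attraction; the function names and the sample numbers in the usage note are illustrative only.

```python
import math

def eccentricity(E, l, mu, k):
    """Orbit eccentricity from energy E, angular momentum l, reduced mass mu, force constant k."""
    # Clamp tiny negative round-off so a circular orbit returns exactly 0.
    return math.sqrt(max(0.0, 1.0 + 2.0 * E * l ** 2 / (mu * k ** 2)))

def inverse_radius(theta, l, mu, k, eps, theta0=0.0):
    """1/r along the conic section: -(mu k / l^2) [1 + eps cos(theta - theta0)]."""
    return -(mu * k / l ** 2) * (1.0 + eps * math.cos(theta - theta0))

def energy_at_turning_point(r, l, mu, k):
    """Total energy where the radial velocity vanishes: l^2/(2 mu r^2) + k/r."""
    return l ** 2 / (2.0 * mu * r ** 2) + k / r
```

For example, with mu = 1, k = -1, l = 1.2 and E = -0.2 (arbitrary units), the radius at theta = theta0 returned by `inverse_radius` reproduces E when passed to `energy_at_turning_point`, as expected for the distance of closest approach.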
Isotropic, harmonic, two-body, central force
The isotropic, harmonic, two-body, central interaction is of interest since, like the inverse-square law force, it leads to closed elliptical orbits described by
$$\frac{1}{r^2} = \frac{E\mu}{l^2}\left(1 + \left(1 + \frac{kl^2}{E^2\mu}\right)^{\frac{1}{2}}\cos 2(\theta - \theta_0)\right) \tag{9.107}$$
Orbit stability
Bertrand’s theorem states that only the inverse-square law and the linear radial dependences of the central forces lead to stable closed bound orbits that do not precess. These are manifestations of the dynamical symmetries that occur for these two specific radial forms of two-body forces.
The three-body problem
The difficulties encountered in solving the equations of motion for three bodies that are interacting via two-body central forces were discussed. The three-body motion can include the existence of chaotic motion. It was shown that solution of the three-body problem is simplified if either the planar approximation, or the restricted three-body approximation, is applicable.
Two-body scattering
The total and differential two-body scattering cross sections were introduced. It was shown that for the inverse-square law force there is a simple relation between the impact parameter $b$ and scattering angle $\theta$ given by
$$b = \frac{k}{2E_{cm}}\cot\frac{\theta}{2} \tag{9.155}$$
This led to the solution for the differential scattering cross section for Rutherford scattering due to the Coulomb interaction,
$$\frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{k}{2E_{cm}}\right)^2\frac{1}{\sin^4\frac{\theta}{2}} \tag{9.159}$$
This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For scattering of nuclei in the Coulomb potential the constant $k$ is given by
$$k = \frac{Z_P Z_T e^2}{4\pi\epsilon_0} \tag{9.160}$$
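The Rutherford relations summarized above are easy to evaluate numerically. The sketch below is a minimal illustration in which energies are in MeV and lengths in fm, using the standard value e^2/(4 pi epsilon_0) of about 1.44 MeV fm. The function names, and the choice to interpret the energy as the center-of-momentum energy, are assumptions made for this example.

```python
import math

E2_OVER_4PI_EPS0 = 1.43996  # MeV fm, value of e^2/(4 pi epsilon_0)

def coulomb_constant(z_projectile, z_target):
    """Force constant k = Zp Zt e^2/(4 pi eps0) in MeV fm."""
    return z_projectile * z_target * E2_OVER_4PI_EPS0

def impact_parameter(k, e_cm, theta):
    """Impact parameter b = (k / 2E) cot(theta/2), in fm, for scattering angle theta in radians."""
    return k / (2.0 * e_cm) / math.tan(theta / 2.0)

def rutherford_cross_section(k, e_cm, theta):
    """Differential cross section (1/4)(k/2E)^2 / sin^4(theta/2), in fm^2 per steradian."""
    return 0.25 * (k / (2.0 * e_cm)) ** 2 / math.sin(theta / 2.0) ** 4
```

As a usage example with hypothetical inputs, `rutherford_cross_section(coulomb_constant(2, 79), 5.0, math.pi / 3)` gives the cross section for a doubly charged projectile on a Z = 79 target at 60 degrees. Note the steep rise at small angles from the sin^4 term.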
Two-body kinematics
The transformation from the center-of-momentum frame to laboratory frames of reference was introduced. Such transformations are used extensively in many fields of physics for theoretical modelling of scattering, and for analysis of experimental data.
264 CHAPTER 9. CONSERVATIVE TWO-BODY CENTRAL FORCES
Workshop exercises
1. Listed below are several statements concerning central force motion. For each statement, give the reason why the statement is true. If a statement is only true in certain situations, then explain when it holds and when it doesn’t. The system referred to below consists of mass $m_1$ located at $\mathbf{r}_1$ and mass $m_2$ located at $\mathbf{r}_2$.
• The potential energy of the system depends only on the difference $\mathbf{r}_1 - \mathbf{r}_2$, not on $\mathbf{r}_1$ and $\mathbf{r}_2$ separately.
• The potential energy of the system depends only on the magnitude of $\mathbf{r}_1 - \mathbf{r}_2$, not the direction.
• It is possible to choose an inertial reference frame in which the center of mass of the system is at rest.
• The total energy of the system is conserved.
• The total angular momentum of the system is conserved.
2. A particle of mass $m$ moves in a potential $U(r) = -U_0 e^{-\lambda^2 r^2}$.
(a) Given the constant $l$, find an implicit equation for the radius of the circular orbit. A circular orbit at $r = \rho$ is possible if
$$\left(\frac{\partial V}{\partial r}\right)\bigg|_{r=\rho} = 0$$
where $V$ is the effective potential.
(b) What is the largest value of $l$ for which a circular orbit exists? What is the value of the effective potential at this critical orbit?
3. A particle of mass $m$ is observed to move in a spiral orbit given by the equation $r = k\theta$, where $k$ is a constant. Is it possible to have such an orbit in a central force field? If so, determine the form of the force function.
4. The interaction energy between two atoms of mass $m$ is given by the Lennard-Jones potential, $U(r) = \epsilon\left[(r_0/r)^{12} - 2(r_0/r)^{6}\right]$.
(a) Determine the Lagrangian of the system where $\mathbf{r}_1$ and $\mathbf{r}_2$ are the positions of the first and second mass, respectively.
(b) Rewrite the Lagrangian as a one-body problem in which the center-of-mass is stationary.
(c) Determine the equilibrium point and show that it is stable.
(d) Determine the frequency of small oscillations about the stable point.
5. Consider two bodies of mass $m$ in circular orbit of radius $r_0/2$, attracted to each other by a force $F(r)$, where $r$ is the distance between the masses.
(a) Determine the Lagrangian of the system in the center-of-mass frame (Hint: a one-body problem subject to a central force).
(c) Determine the equation of motion in $r$ in terms of the angular momentum and $|F(r)|$.
(d) Expand your result in (c) about an equilibrium radius $r_0$ and show that the condition for stability is $\frac{F'(r_0)}{F(r_0)} + \frac{3}{r_0} > 0$.
6. Consider two charges of equal magnitude connected by a spring of spring constant 0 in circular orbit. Can
the charges oscillate about some equilibrium? If so, what condition must be satisfied?
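7. Consider a mass $m$ in orbit around a mass $M$, which is subject to a force $\mathbf{F} = -\frac{k}{r^2}\hat{\mathbf{r}}$, where $r$ is the distance between the masses. Show that the Runge-Lenz vector $\mathbf{A} = \mathbf{p}\times\mathbf{L} - \mu k\hat{\mathbf{r}}$ is conserved.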
Problems
1. Show that the areal velocity is constant for a particle moving under the influence of an attractive force given by $F(r) = -kr$. Calculate the time averages of the kinetic and potential energies and compare with the results of the virial theorem.
2. Assume that the Earth’s orbit is circular and that the Sun’s mass suddenly decreases by a factor of two. (a) What orbit will the Earth then have? (b) Will the Earth escape the solar system?
3. Discuss the motion of a particle in a central inverse-square-law force field for a superimposed force whose magnitude is inversely proportional to the cube of the distance from the particle to the force center; that is
$$F(r) = -\frac{k}{r^2} - \frac{\lambda}{r^3} \qquad (k, \lambda > 0)$$
Show that the motion is described by a precessing ellipse. Consider the cases a) $\lambda < \frac{l^2}{\mu}$, b) $\lambda = \frac{l^2}{\mu}$, c) $\lambda > \frac{l^2}{\mu}$, where $l$ is the angular momentum and $\mu$ the reduced mass.
4. A communications satellite is in a circular orbit around the Earth at a radius $R$ and velocity $v$. A rocket accidentally fires quite suddenly, giving the rocket an outward velocity $v$ in addition to its original tangential velocity $v$.
a) Calculate the ratio of the new energy and angular momentum to the old.
b) Describe the subsequent motion of the satellite and plot $T(r)$, $U(r)$, the net effective potential, and $E(r)$ after the rocket fires.
5. Two identical point objects, each of mass $m$, are bound by a linear two-body force $\mathbf{F} = -k\mathbf{r}$ where $\mathbf{r}$ is the vector distance between the two point objects. The two point objects each slide on a horizontal frictionless plane subject to a vertical gravitational field $g$. The two-body system is free to translate, rotate and oscillate on the surface of the frictionless plane.
a) Derive the Lagrangian for the complete system including translation and relative motion.
b) Use Noether’s theorem to identify all constants of motion.
c) Use the Lagrangian to derive the equations of motion for the system.
d) Derive the generalized momenta and the corresponding Hamiltonian.
e) Derive the period for small amplitude oscillations of the relative motion of the two masses.
6. A bound binary star system comprises two spherical stars of mass $m_1$ and $m_2$ bound by their mutual gravitational attraction. Assume that the only force acting on the stars is their mutual gravitational attraction and let $r$ be the instantaneous separation distance between the centers of the two stars, where $r$ is much larger than the sum of the radii of the stars.
a) Show that the two-body motion of the binary star system can be represented by an equivalent one-body system
and derive the Lagrangian for this system.
b) Show that the motion for the equivalent one-body system in the center of mass frame lies entirely in a plane
and derive the angle between the normal to the plane and the angular momentum vector.
c) Show whether $H$ is a constant of motion and whether it equals the total energy.
d) It is known that a solution to the equation of motion for the equivalent one-body orbit for this gravitational force has the form
$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos\theta\right]$$
and that the angular momentum is a constant of motion $L = l$. Use these to prove that the attractive force leading to this bound orbit is
$$\mathbf{F} = \frac{k}{r^2}\hat{\mathbf{r}}$$
where $k$ must be negative.
7. When performing the Rutherford experiment, Geiger and Marsden scattered 7.7 MeV $^{4}$He particles (alpha particles) from $^{238}$U at a scattering angle in the laboratory frame of $\theta = 90^\circ$. Derive the following observables as measured in the laboratory frame.
(a) The recoil scattering angle of the $^{238}$U in the laboratory frame.
(b) The scattering angles of the $^{4}$He and $^{238}$U in the center-of-mass frame.
(c) The kinetic energies of the $^{4}$He and $^{238}$U in the laboratory frame.
(d) The impact parameter.
(e) The distance of closest approach $r_{min}$.
Chapter 10
Non-inertial reference frames
10.1 Introduction
Newton’s Laws of motion apply only to inertial frames of reference. Inertial frames of reference make it
possible to use Newton’s laws of motion, or Lagrangian, or Hamiltonian mechanics, to develop the necessary
equations of motion. There are certain situations where it is much more convenient to treat the motion
in a non-inertial frame of reference. Examples are motion in frames of reference undergoing translational
acceleration, rotating frames of reference, or frames undergoing both translational and rotational motion.
This chapter will analyze the behavior of dynamical systems in accelerated frames of reference, especially
rotating frames such as on the surface of the Earth. Newtonian mechanics, as well as the Lagrangian and
Hamiltonian approaches, will be used to handle motion in non-inertial reference frames by introducing extra
inertial forces that correct for the fact that the motion is being treated with respect to a non-inertial reference
frame. These inertial forces are often called fictitious even though they appear real in the non-inertial frame.
The underlying reasons for each of the inertial forces will be discussed followed by a presentation of important
applications.
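$$\mathbf{a}_{fix} = \mathbf{A}_{fix} + \mathbf{a}'_{mov} \tag{10.3}$$
where $\mathbf{a}_{fix} = \frac{d^2\mathbf{r}_{fix}}{dt^2}$, $\mathbf{a}'_{mov} = \frac{d^2\mathbf{r}'_{mov}}{dt^2}$ and $\mathbf{A}_{fix} = \frac{d^2\mathbf{R}_{fix}}{dt^2}$.
In the fixed frame, Newton’s laws give that
$$\mathbf{F}_{fix} = m\mathbf{a}_{fix} \tag{10.4}$$
The force in the fixed frame can be separated into two terms, one due to the acceleration of the accelerating frame of reference $\mathbf{A}_{fix}$ plus one due to the acceleration with respect to the accelerating frame $\mathbf{a}'_{mov}$.
The accelerating frame of reference can exploit Newton’s Laws of motion using an effective translational force $\mathbf{F}'_{mov} \equiv \mathbf{F}_{fix} - m\mathbf{A}_{fix}$. The additional $-m\mathbf{A}_{fix}$ term is called an inertial force; it can be altered by choosing a different non-inertial frame of reference, that is, it is dependent on the frame of reference in which the observer is situated.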
Consider that during a time $dt$ the position vector in the fixed primed reference frame moves by an arbitrary infinitesimal distance $d\mathbf{r}'_{mov}$. As illustrated in figure 10.2, this infinitesimal distance in the primed non-rotating frame can be split into two parts:
a) $d\boldsymbol{\theta}\times\mathbf{r}'_{mov}$, which is due to rotation of the rotating frame with respect to the translating primed frame.
b) $d\mathbf{r}''_{rot}$, which is the motion with respect to the rotating (double-primed) frame.
That is, the motion has been arbitrarily divided into a part that is due to the rotation of the double-primed frame, plus the vector displacement measured in this rotating (double-primed) frame. It is always possible to make such a decomposition of the displacement as long as the vector sum can be written as
$$d\mathbf{r}'_{mov} = d\mathbf{r}''_{rot} + d\boldsymbol{\theta}\times\mathbf{r}'_{mov} \tag{10.8}$$
Figure 10.2: Infinitesimal displacement in the non-rotating primed frame and in the rotating double-primed reference frame.
10.3. ROTATING REFERENCE FRAME 269
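Since $d\boldsymbol{\theta} = \boldsymbol{\omega}\,dt$ then the time differential of the displacement, equation 10.8, can be written as
$$\left(\frac{d\mathbf{r}'}{dt}\right)_{mov} = \left(\frac{d\mathbf{r}''}{dt}\right)_{rot} + \boldsymbol{\omega}\times\mathbf{r}'_{mov} \tag{10.9}$$
The important conclusion is that a velocity measured in a non-rotating reference frame $\left(\frac{d\mathbf{r}'}{dt}\right)_{mov}$ can be expressed as the sum of the velocity $\left(\frac{d\mathbf{r}''}{dt}\right)_{rot}$ measured relative to a rotating frame, plus the term $\boldsymbol{\omega}\times\mathbf{r}'_{mov}$ which accounts for the rotation of the frame. The division of the $d\mathbf{r}'_{mov}$ vector into two parts, a part due to rotation of the frame plus a part with respect to the rotating frame, is valid for any vector as shown below.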
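The inertial-frame time derivative taken with components along the rotating coordinate basis $\hat{\mathbf{e}}_i^{rot}$, equation 10.11, is
$$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \sum_{i=1}^{3}\left(\frac{dG_i}{dt}\right)_{rot}\hat{\mathbf{e}}_i^{rot} + \sum_{i=1}^{3}(G_i)_{rot}\frac{d\hat{\mathbf{e}}_i^{rot}}{dt} \tag{10.14}$$
Substituting the unit vector $\hat{\mathbf{e}}^{rot}$ for $\mathbf{r}'$ in equation 10.9, plus using equation 10.12, gives that
$$\left(\frac{d\hat{\mathbf{e}}^{rot}}{dt}\right)_{fix} = \boldsymbol{\omega}\times\hat{\mathbf{e}}^{rot} \tag{10.15}$$
Substituting this into the second term of equation 10.14 gives
$$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \left(\frac{d\mathbf{G}}{dt}\right)_{rot} + \boldsymbol{\omega}\times\mathbf{G} \tag{10.16}$$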
This important identity relates the time derivatives of any vector expressed in both the inertial frame and the rotating non-inertial frame bases. Note that the $\boldsymbol{\omega}\times\mathbf{G}$ term originates from the fact that the unit basis vectors of the rotating reference frame are time dependent with respect to the non-rotating frame basis vectors as given by equation (10.15). Equation (10.16) is used extensively for problems involving rotating frames. For example, for the special case where $\mathbf{G} = \mathbf{r}'$, then equation (10.16) relates the velocity vectors in the fixed and rotating frames as given in equation (10.9).
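As another example, consider the vector $\dot{\boldsymbol{\omega}}$
$$\dot{\boldsymbol{\omega}} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} + \boldsymbol{\omega}\times\boldsymbol{\omega} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} \tag{10.17}$$
That is, the angular acceleration $\dot{\boldsymbol{\omega}}$ has the same value in both the fixed and rotating frames of reference.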
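The identity relating time derivatives in the fixed and rotating frames can be verified numerically. The sketch below is an illustration under simplifying assumptions: the rotating basis turns about the fixed z axis at a constant rate `omega`, and the test vector `G_rot` is an arbitrary hypothetical choice. It compares a finite-difference derivative in the fixed frame with the rotating-frame derivative plus the omega cross G term.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def to_fixed(components_rot, omega, t):
    """Fixed-frame components of a vector given on a basis rotating about z at rate omega."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    gx, gy, gz = components_rot
    return (gx * c - gy * s, gx * s + gy * c, gz)

def G_rot(t):
    """Hypothetical vector: components measured along the rotating basis."""
    return (1.0 + 0.5 * t, 2.0, 0.3 * t * t)

def dG_rot(t):
    """Time derivative of those components, i.e. (dG/dt) as seen in the rotating frame."""
    return (0.5, 0.0, 0.6 * t)

def fixed_frame_derivative(omega, t, h=1e-5):
    """Central finite-difference estimate of (dG/dt) in the fixed frame."""
    plus = to_fixed(G_rot(t + h), omega, t + h)
    minus = to_fixed(G_rot(t - h), omega, t - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def transport_rhs(omega, t):
    """(dG/dt)_rot expressed in fixed components, plus omega x G."""
    rot_part = to_fixed(dG_rot(t), omega, t)
    spin_part = cross((0.0, 0.0, omega), to_fixed(G_rot(t), omega, t))
    return tuple(a + b for a, b in zip(rot_part, spin_part))
```

For any time and rotation rate the two results should agree to within the finite-difference error.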
270 CHAPTER 10. NON-INERTIAL REFERENCE FRAMES
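Now we wish to use the general transformation to a rotating frame basis which requires inclusion of the time dependence of the unit vectors in the rotating frame, that is,
$$\left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{fix} = \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{rot} + \boldsymbol{\omega}\times\mathbf{v}''_{rot} \tag{10.23}$$
$$\left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix}\times\mathbf{r}'_{mov} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot}\times\mathbf{r}'_{mov} \tag{10.24}$$
$$\boldsymbol{\omega}\times\left(\frac{d\mathbf{r}'_{mov}}{dt}\right)_{fix} = \boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) \tag{10.25}$$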
10.6. LAGRANGIAN MECHANICS IN A NON-INERTIAL FRAME 271
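$$\mathbf{a}_{fix} = \mathbf{A}_{fix} + \mathbf{a}''_{rot} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov} \tag{10.26}$$
where the acceleration in the rotating frame is $\mathbf{a}''_{rot} = \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{rot}$ while the velocity is $\mathbf{v}''_{rot} = \left(\frac{d\mathbf{r}''_{rot}}{dt}\right)_{rot}$ and $\mathbf{A}_{fix}$ is with respect to the fixed frame.
Newton’s laws of motion are obeyed in the inertial frame, that is
$$\mathbf{F}_{fix} = m\mathbf{a}_{fix} = m\left(\mathbf{A}_{fix} + \mathbf{a}''_{rot} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov}\right) \tag{10.27}$$
In the double-primed frame, which may be both rotating and accelerating in translation, one can ascribe an effective force $\mathbf{F}''_{rot}$ that obeys an effective Newton’s law for the acceleration $\mathbf{a}''_{rot}$ in the rotating frame
$$\mathbf{F}''_{rot} = m\mathbf{a}''_{rot} = \mathbf{F}_{fix} - m\left(\mathbf{A}_{fix} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov}\right) \tag{10.28}$$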
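$$L = \frac{1}{2}m\left[\mathbf{V}_{fix}\cdot\mathbf{V}_{fix} + \mathbf{v}''_{rot}\cdot\mathbf{v}''_{rot} + 2\mathbf{V}_{fix}\cdot\mathbf{v}''_{rot} + 2\mathbf{V}_{fix}\cdot(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + 2\mathbf{v}''_{rot}\cdot(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + (\boldsymbol{\omega}\times\mathbf{r}'_{mov})^2\right] - U(r) \tag{10.30}$$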
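This can be used to derive the canonical momentum in the rotating frame
$$\mathbf{p}''_{rot} = \frac{\partial L}{\partial\mathbf{v}''_{rot}} = m\left[\mathbf{V}_{fix} + \mathbf{v}''_{rot} + \boldsymbol{\omega}\times\mathbf{r}'_{mov}\right] \tag{10.31}$$
The Lagrange equations can be used to derive the equations of motion in terms of the variables evaluated in the rotating reference frame. The required Lagrange derivatives are
$$\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}''_{rot}} = m\left[\mathbf{A}_{rot} + \mathbf{a}''_{rot} + (\boldsymbol{\omega}\times\mathbf{v}''_{rot}) + (\dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov})\right] \tag{10.32}$$
and
$$\frac{\partial L}{\partial\mathbf{r}'_{mov}} = -m\left[(\boldsymbol{\omega}\times\mathbf{V}_{fix}) + (\boldsymbol{\omega}\times\mathbf{v}''_{rot}) + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov})\right] - \nabla U \tag{10.33}$$
where the scalar triple product, equation 21 has been used. Thus the Lagrange equations give for the rotating frame basis that
$$m\mathbf{a}''_{rot} = -\nabla U - m\left[\mathbf{A}_{rot} + \boldsymbol{\omega}\times\mathbf{V}_{fix} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov}\right] \tag{10.34}$$
The external force is identified as $\mathbf{F}_{fix} = -\nabla U$. Equation 10.16 can be used to transform between the fixed and the rotating bases.
$$\mathbf{A}_{fix} = \mathbf{A}_{rot} + (\boldsymbol{\omega}\times\mathbf{V}_{fix}) \tag{10.35}$$
This leads to an effective force in the non-inertial translating plus rotating frame that corresponds to an effective Newtonian force of
$$\mathbf{F}''_{rot} = m\mathbf{a}''_{rot} = \mathbf{F}_{fix} - m\left[\mathbf{A}_{fix} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + (\dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov})\right] \tag{10.36}$$
where $\mathbf{A}_{fix}$ is expressed in the fixed frame. The derivation of equation 10.36 using Lagrangian mechanics confirms the identical formula 10.28 derived using Newtonian mechanics.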
The four correction terms for the non-inertial frame basis correspond to the following effective forces.
Translational acceleration: $\mathbf{F}_{acc} = -m\mathbf{A}_{fix}$ is the usual inertial force experienced in a linearly accelerating frame of reference, where $\mathbf{A}_{fix}$ is with respect to the fixed frame.
Coriolis force: $\mathbf{F}_{cor} = -2m\boldsymbol{\omega}\times\mathbf{v}''_{rot}$. This is a new type of inertial force that is present only when a particle is moving in the rotating frame. This force is proportional to the velocity in the rotating frame and is independent of the position in the rotating frame.
Centrifugal force: $\mathbf{F}_{cf} = -m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov})$. This is due to the centripetal acceleration of the particle owing to the rotation of the moving axis about the axis of rotation.
Transverse (azimuthal) force: $\mathbf{F}_{az} = -m\dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov}$. This is a straightforward term due to acceleration of the particle due to the angular acceleration of the rotating axes.
The above inertial forces are correction terms arising from trying to extend Newton’s laws of motion to
a non-inertial frame involving both translation and rotation. These correction forces are often referred to as
“fictitious” forces. However, these non-inertial forces are very real when located in the non-inertial frame.
Since the centrifugal and Coriolis terms are unusual they are discussed below.
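The four inertial-force terms listed above can be collected into one helper for numerical work. The following Python sketch is illustrative only: the function and dictionary key names are hypothetical, vectors are plain 3-tuples, and all inputs are assumed to be expressed on the same basis.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def scale(c, a):
    """Multiply a 3-vector by a scalar."""
    return tuple(c * x for x in a)

def inertial_forces(m, A_fix, omega, omega_dot, r_mov, v_rot):
    """Return the four inertial forces acting on mass m in the translating plus rotating frame."""
    return {
        "translational": scale(-m, A_fix),                         # -m A
        "coriolis": scale(-2.0 * m, cross(omega, v_rot)),          # -2m omega x v''
        "centrifugal": scale(-m, cross(omega, cross(omega, r_mov))),  # -m omega x (omega x r')
        "azimuthal": scale(-m, cross(omega_dot, r_mov)),           # -m omega_dot x r'
    }
```

For a position vector perpendicular to the rotation axis, the centrifugal entry reduces to m omega^2 r' directed outward. For instance, m = 1.5, omega = (0, 0, 2) and r' = (3, 0, 0) give (18, 0, 0).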
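Note that
$$\boldsymbol{\omega}\cdot\mathbf{F}_{cf} = 0 \tag{10.38}$$
therefore the centrifugal force is perpendicular to the axis of rotation.
Using the vector identity, equation 24 allows the centrifugal force to be written as
$$\mathbf{F}_{cf} = -m\left[(\boldsymbol{\omega}\cdot\mathbf{r}'_{mov})\boldsymbol{\omega} - \omega^2\mathbf{r}'_{mov}\right] \tag{10.39}$$
For the case where the radius $\mathbf{r}'_{mov}$ is perpendicular to $\boldsymbol{\omega}$ then $\boldsymbol{\omega}\cdot\mathbf{r}'_{mov} = 0$ and thus for this case
$$\mathbf{F}_{cf} = m\omega^2\mathbf{r}'_{mov}$$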
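$$p_r = \frac{\partial L}{\partial\dot{r}} = m\dot{r} - mat\cos\theta$$
$$p_\theta = \frac{\partial L}{\partial\dot{\theta}} = mr^2\dot{\theta} + matr\sin\theta$$
These lead to the corresponding velocities of
$$\dot{r} = \frac{p_r}{m} + at\cos\theta$$
$$\dot{\theta} = \frac{p_\theta}{mr^2} - \frac{at}{r}\sin\theta$$
and thus the Hamiltonian is given by
$$H = p_r\dot{r} + p_\theta\dot{\theta} - L = \frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} - \frac{at}{r}p_\theta\sin\theta + atp_r\cos\theta + \frac{k}{2}(r - r_0)^2 + \frac{1}{2}mgat^2 - mgr\cos\theta$$
The Hamilton equations of motion give that
$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} + at\cos\theta$$
$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} - \frac{at}{r}\sin\theta$$
These radial and angular velocities are the same as obtained using Lagrangian mechanics.
The Hamilton equations for $\dot{p}_r$ and $\dot{p}_\theta$ are given by
$$\dot{p}_r = -\frac{\partial H}{\partial r} = -\frac{at}{r^2}p_\theta\sin\theta - k(r - r_0) + mg\cos\theta + \frac{p_\theta^2}{mr^3}$$
Similarly
$$\dot{p}_\theta = -\frac{\partial H}{\partial\theta} = \frac{at}{r}p_\theta\cos\theta + atp_r\sin\theta - mgr\sin\theta$$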
The transformation equations relating the generalized coordinates are time dependent so the Hamiltonian $H$ does not equal the total energy $E$. In addition neither the Lagrangian nor the Hamiltonian is conserved since both are time dependent. The fact that the Hamiltonian is not conserved is obvious since the whole system is accelerating upwards leading to increasing kinetic and potential energies. Moreover, the time derivative of the angular momentum $\dot{p}_\theta$ is non-zero so the angular momentum $p_\theta$ is not conserved.
Non-inertial fulcrum frame:
This system also can be addressed in the accelerating non-inertial fulcrum frame of reference which is fixed to the fulcrum of the spring of the pendulum. In this non-inertial frame of reference, the acceleration of the frame can be taken into account using an effective acceleration $a$ which is added to the gravitational acceleration; that is, $g$ is replaced by an effective gravitational acceleration $(g + a)$. Then the Lagrangian in the fulcrum frame simplifies to
$$L_{fulcrum} = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + m(g + a)(r\cos\theta) - \frac{1}{2}k(r - r_0)^2$$
The Lagrange equations of motion in the fulcrum frame are given by
10.8. CORIOLIS FORCE 275
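$$\Lambda_r L_{fulcrum} = 0$$
$$m\ddot{r} - mr\dot{\theta}^2 - m(g + a)\cos\theta + k(r - r_0) = 0$$
$$\Lambda_\theta L_{fulcrum} = 0$$
$$\ddot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} + \frac{(g + a)}{r}\sin\theta = 0$$
These are identical to the Lagrange equations of motion derived in the inertial frame.
The Lagrangian $L_{fulcrum}$ can be used to derive the momenta in the non-inertial fulcrum frame
$$\tilde{p}_r = \frac{\partial L_{fulcrum}}{\partial\dot{r}} = m\dot{r}$$
$$\tilde{p}_\theta = \frac{\partial L_{fulcrum}}{\partial\dot{\theta}} = mr^2\dot{\theta}$$
which comprise only a part of the momenta derived in the inertial frame. These partial fulcrum momenta lead to a fulcrum-frame Hamiltonian
$$H_{fulcrum} = \tilde{p}_r\dot{r} + \tilde{p}_\theta\dot{\theta} - L_{fulcrum} = \frac{\tilde{p}_r^2}{2m} + \frac{\tilde{p}_\theta^2}{2mr^2} + \frac{1}{2}k(r - r_0)^2 - m(g + a)r\cos\theta$$
Both $L_{fulcrum}$ and $H_{fulcrum}$ are time independent and thus the fulcrum Hamiltonian $H_{fulcrum}$ is a constant of motion in the fulcrum frame. However, $H_{fulcrum}$ does not equal the total energy, which is increasing with time due to the acceleration of the fulcrum frame relative to the inertial frame. This example illustrates that use of non-inertial frames can simplify solution of accelerating systems.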
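The claim that the fulcrum-frame Hamiltonian is a constant of motion can be checked by integrating the fulcrum-frame equations of motion numerically. The sketch below uses a fourth-order Runge-Kutta step; the parameter values (mass, spring constant, rest length, frame acceleration) and the initial conditions are arbitrary choices made for illustration, not values from the example.

```python
import math

# Hypothetical parameters: mass, spring constant, rest length, gravity, frame acceleration.
M, K, R0, G, A = 1.0, 40.0, 1.0, 9.81, 2.0

def derivs(state):
    """Time derivatives of (r, theta, r_dot, theta_dot) from the fulcrum-frame Lagrange equations."""
    r, th, rdot, thdot = state
    rddot = r * thdot ** 2 + (G + A) * math.cos(th) - (K / M) * (r - R0)
    thddot = -2.0 * rdot * thdot / r - (G + A) * math.sin(th) / r
    return (rdot, thdot, rddot, thddot)

def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step."""
    k1 = derivs(state)
    k2 = derivs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = derivs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = derivs(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def fulcrum_hamiltonian(state):
    """Fulcrum-frame Hamiltonian written in terms of velocities (p_r = m r_dot, p_theta = m r^2 theta_dot)."""
    r, th, rdot, thdot = state
    return (0.5 * M * rdot ** 2 + 0.5 * M * r ** 2 * thdot ** 2
            + 0.5 * K * (r - R0) ** 2 - M * (G + A) * r * math.cos(th))

def max_hamiltonian_drift(state, dt=1e-3, steps=2000):
    """Largest deviation of the fulcrum Hamiltonian from its initial value during the integration."""
    h0 = fulcrum_hamiltonian(state)
    drift = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        drift = max(drift, abs(fulcrum_hamiltonian(state) - h0))
    return drift
```

Calling `max_hamiltonian_drift((1.2, 0.3, 0.0, 0.0))` should return a value limited only by the integration error, consistent with conservation of the fulcrum Hamiltonian.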
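$$\mathbf{F} = \mathbf{F}' + m\mathbf{g}$$
zero, that is
$$m\mathbf{a}'' = 0 = \mathbf{F}' + m\left(\mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}')\right)$$
$$\mathbf{g}_{eff} = -g\hat{\mathbf{z}} + \omega^2\rho\hat{\boldsymbol{\rho}}$$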
This is the equation of a paraboloid and corresponds to a parabolic gravitational equipotential energy surface.
Astrophysicists build large parabolic mirrors for telescopes by continuously spinning a large vat of glass while
it solidifies. This is much easier than grinding a large cylindrical block of glass into a parabolic shape.
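$$\dot{\boldsymbol{\omega}} = -\frac{2\dot{r}''}{r''}\boldsymbol{\omega}$$
that is, the rotational frequency decreases if the radius is increased. Note that, as shown in equation 10.17, $\dot{\boldsymbol{\omega}} = \dot{\boldsymbol{\omega}}''$. This nonzero value of $\dot{\boldsymbol{\omega}}$ obviously leads to an azimuthal force in addition to the Coriolis force.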
Consider the rate of change of angular momentum for the rotating mass $m$ assuming that the angular momentum comes purely from the rotation $\boldsymbol{\omega}$. Then in the rotating frame
$$\dot{\mathbf{p}}'' = \frac{d}{dt}\left(mr''^2\boldsymbol{\omega}\right) = 2mr''\dot{r}''\boldsymbol{\omega} + mr''^2\dot{\boldsymbol{\omega}}$$
Substituting the above equation for $\dot{\boldsymbol{\omega}}$ in the second term gives
$$\dot{\mathbf{p}}'' = 2mr''\dot{r}''\boldsymbol{\omega} - 2mr''\dot{r}''\boldsymbol{\omega} = 0$$
That is, the two terms cancel. Thus the angular momentum is conserved for this case where the velocity is radial. Note that, since $\mathbf{p}''$ is assumed to be collinear with $\boldsymbol{\omega}$ then it is the same in both the stationary and rotating frames of reference and thus angular momentum is conserved in both frames. In addition, in the fixed frame, the angular momentum is conserved if no external torques are acting as assumed above.
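Note that the rotational energy is
$$E_{rot} = \frac{1}{2}I\omega^2$$
Also the angular momentum is conserved, that is
$$\mathbf{p} = I\boldsymbol{\omega} = l\hat{\boldsymbol{\omega}}$$
Substituting $\boldsymbol{\omega} = \frac{\mathbf{p}}{I}$ in the rotational energy gives
$$E_{rot} = \frac{p^2}{2I} = \frac{l^2}{2I}$$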
Therefore the rotational energy actually increases as the moment of inertia decreases when the ice skater
pulls her arms close to her body. This increase in rotational energy is provided by the work done as the
dancer pulls her arms inward against the centrifugal force.
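The energy bookkeeping for the pirouette can be made concrete with a short calculation. The sketch below assumes that angular momentum is conserved while the moment of inertia changes; the function names and numerical values are illustrative assumptions.

```python
def rotational_energy(l, inertia):
    """Rotational energy l^2 / (2 I) for angular momentum l and moment of inertia I."""
    return l * l / (2.0 * inertia)

def spin_up(omega_initial, inertia_initial, inertia_final):
    """Return (final angular frequency, initial energy, final energy) at fixed angular momentum."""
    l = inertia_initial * omega_initial      # conserved angular momentum
    omega_final = l / inertia_final
    return (omega_final,
            rotational_energy(l, inertia_initial),
            rotational_energy(l, inertia_final))
```

For example, halving the moment of inertia from 4.0 to 2.0 (arbitrary units) at an initial angular frequency of 3.0 doubles both the angular frequency and the rotational energy. The difference is the work done in pulling the arms inward.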
10.9. ROUTHIAN REDUCTION FOR ROTATING SYSTEMS 277
$$R_{cyclic}(q_1,\ldots,q_n;\dot{q}_1,\ldots,\dot{q}_s;p_{s+1},\ldots,p_n;t) = H_{cyclic} - L_{noncyclic} = \boldsymbol{\omega}\cdot\mathbf{J} - L \tag{10.43}$$
This Routhian behaves like a Hamiltonian for the ignorable cyclic coordinates $\boldsymbol{\omega}$, $\mathbf{J}$. Simultaneously it behaves like a negative Lagrangian for all the other coordinates.
The non-cyclic Routhian complements $R_{cyclic}$ in that it is defined as
$$R_{noncyclic}(q_1,\ldots,q_n;p_1,\ldots,p_s;\dot{q}_{s+1},\ldots,\dot{q}_n;t) = H_{noncyclic} - L_{cyclic} = H - \boldsymbol{\omega}\cdot\mathbf{J} \tag{10.44}$$
This non-cyclic Routhian behaves like a Hamiltonian for all the non-cyclic variables and behaves like a negative Lagrangian for the two cyclic variables $\boldsymbol{\omega}$, $\mathbf{J}$. Since the cyclic variables are constants of motion, then $R_{noncyclic}$ is a constant of motion that equals the energy in the rotating frame if $H$ is a constant of motion. However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time dependent, that is, the Routhian $R_{noncyclic}$ corresponds to the energy of the non-cyclic parts of the motion.
For example, the Routhian $R_{noncyclic}$ for a system that is being cranked about the $\phi$ axis at some fixed angular frequency $\dot{\phi} = \omega$ with corresponding total angular momentum $\mathbf{p}_\phi = \mathbf{J}$ can be written as [1]
[1] For clarity sections 10.1 to 10.8 of this chapter adopted a naming convention that uses unprimed coordinates with the subscript fix for the inertial frame of reference, primed coordinates with the subscript mov for the translating coordinates, and double-primed coordinates with the subscript rot for the translating plus rotating frame. For brevity the subsequent discussion omits the redundant subscripts fix, mov, rot since the single and double prime superscripts completely define the moving and rotating frames of reference.
= − ω · J
where it is assumed that the deformed nucleus has the symmetry axis along the direction and rotates about
the axis. Since the Routhian is for a non-inertial rotating frame of reference it does not include the total
energy but, if the shape is constant in time, then and the corresponding body-fixed Hamiltonian
are conserved and the energy levels for the nucleons bound in the spheroidal potential well can be calculated
using a conventional quantum mechanical model.
For a prolate spheroidal deformed potential well, the nucleon orbits that have the angular momentum
nearly aligned to the symmetry axis correspond to nucleon trajectories that are restricted to the narrowest
part of the spheroid, whereas trajectories with the angular momentum vector close to perpendicular to the
symmetry axis have trajectories that probe the largest radii of the spheroid. The Heisenberg Uncertainty
Principle, mentioned in chapter 312, describes how orbits restricted to the smallest dimension will have
the highest linear momentum, and corresponding kinetic energy, and vice versa for the larger sized orbits.
Thus the binding energy of different nucleon trajectories in the spheroidal potential well depends on the angle
between the angular momentum vector and the symmetry axis of the spheroid as well as the deformation of
the spheroid. A quantal nuclear model Hamiltonian is solved for assumed spheroidal-shaped potential wells.
The corresponding orbits each have angular momenta j for which the projection of the angular momentum
along the symmetry axis Ω is conserved, but the projection of j in the laboratory frame is not conserved
since the potential well is not spherically symmetric. However, the total Hamiltonian is spherically symmetric
in the laboratory frame, which is satisfied by allowing the deformed spheroidal potential well to rotate freely in
the laboratory frame, and then 2 and Ω all are conserved quantities. The attractive residual nucleon-
nucleon pairing interaction results in pairs of nucleons being bound in time-reversed orbits ( × )0 , that
is, with resultant total spin zero, in this spheroidal nuclear potential. Excitation of an even-even nucleus
can break one pair and then the total projection of the angular momentum along the symmetry axis is
= |Ω1 ± Ω2 |, depending on whether the projections are parallel or antiparallel. More excitation energy
can break several pairs and the projections continue to be additive. The binding energies calculated in the
spheroidal potential well must be added to the rotational energy $E_{rot} = \frac{J^2}{2\mathcal{J}}$ to get the total energy, where $\mathcal{J}$ is the moment of inertia. Nuclear structure measurements are in good agreement with the predictions of
nuclear structure calculations that employ the Routhian approach.
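$$\begin{aligned}\mathbf{a}' &= \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times[\mathbf{r}' + \mathbf{R}]) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right)\\ &= \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right)\end{aligned}$$
where $\mathbf{r}$ is with respect to the center of the Earth. This is as expected directly from equation 10.36. Since the angular frequency of the Earth is a constant then $\dot{\boldsymbol{\omega}}\times\mathbf{r}' = 0$. Thus the acceleration can be written as
$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \left[\mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r})\right] - 2\boldsymbol{\omega}\times\mathbf{v}' \tag{10.52}$$
The term in the square brackets combines the gravitational acceleration plus the centrifugal acceleration.
A measurement of the Earth’s gravitational acceleration actually measures the term in the square brackets in equation 10.52, that is, an effective gravitational acceleration where
$$\mathbf{g}_{eff} = \mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) \tag{10.53}$$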
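This is quite small for the Earth since $\omega = 0.73\times10^{-4}$ rad/s and $R = 6371$ km, leading to a correction term $\omega^2R\cos\lambda = 0.03\cos\lambda$ m/s$^2$, where $\lambda$ is the latitude. Since the horizontal component of $\mathbf{g}_{eff}$ is
$$(g_{eff})_{horizontal} = \omega^2R\cos\lambda\sin\lambda \tag{10.55}$$
and the vertical component is
$$(g_{eff})_{vertical} = g - \omega^2R\cos^2\lambda \tag{10.56}$$
Then the angle $\alpha$ between $\mathbf{g}_{eff}$ and $\mathbf{g}$ is given by
$$\alpha \simeq \tan\alpha = \frac{(g_{eff})_{horizontal}}{(g_{eff})_{vertical}} = \frac{\omega^2R\cos\lambda\sin\lambda}{g - \omega^2R\cos^2\lambda} \tag{10.57}$$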
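The size of the angle between the effective and true gravitational accelerations is easy to estimate numerically. The sketch below uses the values of the Earth's angular frequency and radius quoted above, together with an assumed g = 9.81 m/s^2; the function name and the treatment of the argument as geographic latitude in radians are choices made for this illustration.

```python
import math

OMEGA = 0.73e-4      # Earth's angular frequency in rad/s, as quoted above
R_EARTH = 6.371e6    # Earth's radius in m
G_TRUE = 9.81        # assumed magnitude of the gravitational acceleration in m/s^2

def plumb_line_deflection(latitude):
    """Angle (radians) between effective and true gravity at the given latitude (radians)."""
    horizontal = OMEGA ** 2 * R_EARTH * math.cos(latitude) * math.sin(latitude)
    vertical = G_TRUE - OMEGA ** 2 * R_EARTH * math.cos(latitude) ** 2
    return math.atan2(horizontal, vertical)
```

The deflection vanishes at the equator and at the poles, and is largest near 45 degrees latitude where it is roughly a tenth of a degree.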
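$$\mathbf{a}' = \mathbf{g}_{eff} - 2\boldsymbol{\omega}\times\mathbf{v}' \tag{10.58}$$
Neglect the centrifugal correction term since it is very small, that is, let $\mathbf{g}_{eff} = \mathbf{g}$. Using the coordinate axes shown in figure 10.7, the surface-frame vectors have components
$$\boldsymbol{\omega} = 0\,\hat{\mathbf{i}}' + \omega\cos\lambda\,\hat{\mathbf{j}}' + \omega\sin\lambda\,\hat{\mathbf{k}}' \tag{10.59}$$
and
$$\mathbf{g} = -g\hat{\mathbf{k}}' \tag{10.60}$$
Thus the Coriolis term is
$$2\boldsymbol{\omega}\times\mathbf{v}' = 2\begin{vmatrix}\hat{\mathbf{i}}' & \hat{\mathbf{j}}' & \hat{\mathbf{k}}'\\ 0 & \omega\cos\lambda & \omega\sin\lambda\\ \dot{x}' & \dot{y}' & \dot{z}'\end{vmatrix} = 2\left[\left(\omega\dot{z}'\cos\lambda - \omega\dot{y}'\sin\lambda\right)\hat{\mathbf{i}}' + \left(\omega\dot{x}'\sin\lambda\right)\hat{\mathbf{j}}' - \left(\omega\dot{x}'\cos\lambda\right)\hat{\mathbf{k}}'\right]$$
Figure 10.7: Rotating frame fixed on the surface of the Earth. (Labels: x (East), Equator.)
$$\ddot{\mathbf{r}}' = -g\hat{\mathbf{k}}' - 2\omega\left[\hat{\mathbf{i}}'\left(\dot{z}'\cos\lambda - \dot{y}'\sin\lambda\right) + \hat{\mathbf{j}}'\dot{x}'\sin\lambda - \hat{\mathbf{k}}'\dot{x}'\cos\lambda\right] \tag{10.61}$$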
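where $\dot{x}'_0$, $\dot{y}'_0$, $\dot{z}'_0$ are the initial velocities. Substituting the above velocity relations into the equation of motion for $\ddot{x}'$ gives
$$\ddot{x}' = 2\omega gt\cos\lambda - 2\omega\left(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda\right) - 4\omega^2x' \tag{10.64}$$
The last term $4\omega^2x'$ is small and can be neglected, leading to a simple uncoupled second-order differential equation in $x'$. Integrating this twice, assuming that $x'_0 = y'_0 = z'_0 = 0$ plus the fact that $2\omega g\cos\lambda$ and $2\omega\left(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda\right)$ are constant, gives
$$x' = \frac{1}{3}\omega gt^3\cos\lambda - \omega t^2\left(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda\right) + \dot{x}'_0t \tag{10.65}$$
Similarly,
$$y' = \left(\dot{y}'_0t - \omega\dot{x}'_0t^2\sin\lambda\right) \tag{10.66}$$
$$z' = -\frac{1}{2}gt^2 + \dot{z}'_0t + \omega\dot{x}'_0t^2\cos\lambda \tag{10.67}$$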
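The approximate closed-form trajectory can be compared with a direct numerical integration of the equation of motion that includes the Coriolis term. The sketch below is illustrative: it takes x' East, y' North and z' vertically upward, starts at the origin, uses the angular frequency quoted earlier, and assumes g = 9.81 m/s^2. The function names are hypothetical.

```python
import math

OMEGA = 0.73e-4   # Earth's angular frequency in rad/s, as quoted in the text
G = 9.81          # assumed gravitational acceleration in m/s^2

def acceleration(v, lat):
    """Acceleration -g k' - 2 omega x v' in the surface frame (x East, y North, z up)."""
    vx, vy, vz = v
    c, s = math.cos(lat), math.sin(lat)
    return (-2.0 * OMEGA * (vz * c - vy * s),
            -2.0 * OMEGA * vx * s,
            -G + 2.0 * OMEGA * vx * c)

def integrate(v0, lat, t_end, steps=4000):
    """Runge-Kutta integration from the origin; returns the final position (x', y', z')."""
    dt = t_end / steps
    state = (0.0, 0.0, 0.0) + tuple(v0)

    def deriv(s):
        return s[3:] + acceleration(s[3:], lat)

    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(state, k1)))
        k3 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(state, k2)))
        k4 = deriv(tuple(a + dt * b for a, b in zip(state, k3)))
        state = tuple(a + dt * (p + 2.0 * q + 2.0 * r + w) / 6.0
                      for a, p, q, r, w in zip(state, k1, k2, k3, k4))
    return state[:3]

def closed_form(v0, lat, t):
    """First-order-in-omega trajectory from the three approximate position equations above."""
    vx0, vy0, vz0 = v0
    c, s = math.cos(lat), math.sin(lat)
    x = OMEGA * G * t ** 3 * c / 3.0 - OMEGA * t ** 2 * (vz0 * c - vy0 * s) + vx0 * t
    y = vy0 * t - OMEGA * vx0 * t ** 2 * s
    z = -0.5 * G * t ** 2 + vz0 * t + OMEGA * vx0 * t ** 2 * c
    return (x, y, z)
```

For an object released from rest at 45 degrees latitude and falling for 4 s, both approaches give an eastward deflection of about a centimeter, with differences only at second order in the Earth's angular frequency.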
10.11. FREE MOTION ON THE EARTH 283
Note that the velocity equals zero when = 0 assuming that is finite. That is, the velocity reaches a
maximum at a radius
1 1
= (1 + ) (10.73)
4 sin
10.12. WEATHER SYSTEMS 285
Figure 10.9: Hurricane Katrina over the Gulf of Mexico on 28 August 2005. [Published by the NOAA]
which occurs at the wall of the eye of the circulating low-pressure system.
Low pressure regions are produced by heating of air causing it to rise and resulting in an inflow of air
to replace the rising air. Hurricanes form over warm water when the temperature exceeds 26 °C and the
moisture levels are above average. They are created at latitudes between 10° and 15° where the sea is warmest,
but not closer to the equator where the Coriolis force drops to zero. About 90% of the heating of the air comes
from the latent heat of vaporization due to the rising warm moist air condensing into water droplets in the
cloud similar to what occurs in thunderstorms. For hurricanes in the northern hemisphere, the air circulates
anticlockwise inwards. Near the wall of the eye of the hurricane, the air rises rapidly to high altitudes at
which it then flows clockwise and outwards and subsequently back down in the outer reaches of the hurricane.
Both the wind velocity and pressure are low inside the eye which can be cloud free. The strongest winds
are in the vortex surrounding the eye of the hurricane, while weak winds exist in the counter-rotating vortex of
sinking air that occurs far outside the hurricane.
Figure 10.9 shows the satellite picture of the hurricane Katrina, recorded on 28 August 2005. The eye of the hurricane is readily apparent in this picture. The central pressure was 90200 N/m$^2$ (902 mb) compared with the standard atmospheric pressure of 101300 N/m$^2$ (1013 mb). This 111 mb pressure difference produced steady winds in Katrina of 280 km/h (175 mph) with gusts up to 344 km/h which resulted in 1833 fatalities.
Tornadoes are another example of a vortex low-pressure system that are the opposite extreme in both size and duration compared with a hurricane. Tornadoes may last only ∼ 10 minutes and be quite small in radius. Pressure drops of up to 100 mb have been recorded, but since they may only be a few hundred meters in diameter, the pressure gradient can be much higher than for hurricanes leading to localized winds thought to approach 500 km/h. Unfortunately, the instrumentation and buildings hit by a tornado often are destroyed making study difficult. Note that the pressure gradient in small-diameter rope tornadoes is much more destructive than that in the larger 1/4 mile diameter tornadoes, resulting in much higher winds.
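$$\frac{v^2}{r} = 2\Omega v\sin\lambda - \frac{1}{\rho}\left|\nabla P\right| \tag{10.74}$$
This quadratic has real solutions for the wind speed $v$ only if
$$\left|\nabla P\right| \le \rho r\left(\Omega\sin\lambda\right)^2 \tag{10.76}$$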
As a consequence, high-pressure regions tend to have weak pressure gradients and light winds, in contrast
to the large pressure gradients plus concomitant damaging winds possible for low-pressure systems such as
hurricanes or tornadoes.
The circulation behavior, exhibited by weather patterns, also applies to ocean currents and other liquid
flow on earth. However, the residual angular momentum of the liquid often can overcome the Coriolis terms.
Thus often it will be found experimentally that water exiting the bathtub does not circulate anticlockwise in
the northern hemisphere as predicted by the Coriolis force. This is because it was not stationary originally,
but rotating slowly.
Reliable prediction of weather is an extremely difficult, complicated and challenging task, which is of
considerable importance in modern life. As discussed in chapter 15.8, fluid flow can be much more complicated
than assumed in this discussion of air flow and weather. Both turbulent and laminar flow are possible. As a
consequence, computer simulations of weather phenomena are difficult because the air flow can be turbulent
and the transition from ordered to chaotic flow is very sensitive to the initial conditions. Typically the air
flow can involve macroscopic ordered coherent structures, over a wide dynamic range of dimensions,
coexisting with chaotic regions. Computer simulations of fluid flow often are performed based on Lagrangian
mechanics to exploit the scalar properties of the Lagrangian. Ordered coherent structures, ranging from
microscopic bubbles to hurricanes, can be recognized by exploiting Lyapunov exponents to identify the
ordered motion buried in the underlying chaos. Thus the techniques discussed in classical mechanics are of
considerable importance outside of physics.
that is, the true gravitational field g0 corrected for the centrifugal
force.
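Assume the small angle approximation for the deflection angle $\beta$ of the pendulum; then $T_z = T\cos\beta \simeq T$
and $T_z = mg$, thus $T \simeq mg$. Then, as shown in figure 10.10, the horizontal components of the restoring
force are
$$T_x = -mg\frac{x}{l} \tag{10.79}$$
$$T_y = -mg\frac{y}{l} \tag{10.80}$$
Since g is vertical, and neglecting terms involving $\dot{z}$, evaluating the cross product in equation (10.78)
simplifies to
$$\ddot{x} = -g\frac{x}{l} + 2\dot{y}\,\Omega\cos\theta \tag{10.81}$$
$$\ddot{y} = -g\frac{y}{l} - 2\dot{x}\,\Omega\cos\theta \tag{10.82}$$
where $\theta$ is the colatitude, which is related to the latitude $\lambda$ by $\cos\theta = \sin\lambda$. Writing the natural
angular frequency of the pendulum as $\omega_0 = \sqrt{g/l}$ and the vertical component of the earth's angular velocity as
$$\Omega_z = \Omega\cos\theta \tag{10.85}$$
equations (10.81) and (10.82) become
$$\ddot{x} - 2\Omega_z\dot{y} + \omega_0^2 x = 0$$
$$\ddot{y} + 2\Omega_z\dot{x} + \omega_0^2 y = 0 \tag{10.86}$$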
These are two coupled equations that can be solved by making a coordinate transformation.
Define a new coordinate that is a complex number
$$\eta = x + iy \tag{10.87}$$
Multiplying the second of the coupled equations 10.86 by $i$ and adding it to the first equation gives
$$\ddot{\eta} + 2i\Omega_z\dot{\eta} + \omega_0^2\eta = 0 \tag{10.89}$$
Note that the complex number $\eta$ contains the same information regarding the position in the $x$−$y$ plane
as equations 10.86. The plot of $\eta$ in the complex plane, the Argand diagram, is a bird's-eye view of the
position coordinates $(x, y)$ of the pendulum. This second-order homogeneous differential equation has two
independent solutions that can be derived by guessing a solution of the form $\eta(t) = Ae^{-i\alpha t}$, which on
substitution gives
$$\alpha^2 - 2\Omega_z\alpha - \omega_0^2 = 0$$
Therefore
$$\alpha = \Omega_z \pm \sqrt{\Omega_z^2 + \omega_0^2} \tag{10.91}$$
Assume that the angular velocity of the pendulum $\omega_0$ is very much higher than the angular velocity of
the earth, i.e. $\omega_0 \gg \Omega_z$, then
$$\alpha \simeq \Omega_z \pm \omega_0 \tag{10.92}$$
Thus the solution is of the form
$$\eta(t) = e^{-i\Omega_z t}\left(A_+ e^{i\omega_0 t} + A_- e^{-i\omega_0 t}\right) \tag{10.93}$$
This can be written as
$$\eta(t) = A e^{-i\Omega_z t}\cos(\omega_0 t + \delta) \tag{10.94}$$
where the phase $\delta$ and amplitude $A$ depend on the initial conditions. Thus the plane of oscillation of the
pendulum is defined by the ratio of the $x$ and $y$ coordinates, that is, the phase angle $\Omega_z t$. This phase angle
rotates with angular velocity $\Omega_z$ where
$$\Omega_z = \Omega\cos\theta = \Omega\sin\lambda$$
At the north pole the earth rotates under the pendulum with angular velocity $\Omega$ and the axis of the
pendulum is fixed in an inertial frame of reference. At lower latitudes, the pendulum precesses at the lower
angular frequency $\Omega_z = \Omega\sin\lambda$ that goes to zero at the equator. For example, in Rochester, NY, $\lambda = 43^\circ$
and therefore a Foucault pendulum precesses at $\Omega_z = 0.682\,\Omega$. That is, the pendulum precesses 245.5°/day.
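The latitude dependence of the precession rate can be checked numerically. The following is a minimal Python sketch, assuming only the relation Ω_z = Ω sin λ quoted above and taking one full rotation of the Earth as 360°. The function name `foucault_precession` is introduced here for illustration and does not come from the text.

```python
import math

def foucault_precession(latitude_deg):
    """Return (Omega_z / Omega, precession in degrees per Earth rotation).

    Assumes Omega_z = Omega * sin(latitude), the small-amplitude result above.
    """
    ratio = math.sin(math.radians(latitude_deg))
    return ratio, 360.0 * ratio

# Rochester, NY example from the text (latitude 43 degrees).
ratio, degrees_per_day = foucault_precession(43.0)
print(f"Omega_z/Omega = {ratio:.3f}, precession = {degrees_per_day:.1f} deg/day")
```

For a latitude of 43° this reproduces the quoted values 0.682 and 245.5°/day. The same function gives a ratio of 0 at the equator and 1 at the pole.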
10.14 Summary
This chapter has focussed on describing motion in non-inertial frames of reference. It has been shown that
the force and acceleration in non-inertial frames can be related using either Newtonian or Lagrangian
mechanics by introducing additional inertial forces in the non-inertial reference frame.
Rotating reference frame It was shown that the time derivatives of a general vector G in both an
inertial frame and a rotating reference frame are related by
$$\left(\frac{d\mathbf{G}}{dt}\right)_{\text{fixed}} = \left(\frac{d\mathbf{G}}{dt}\right)_{\text{rotating}} + \boldsymbol{\omega}\times\mathbf{G} \tag{10.16}$$
where the ω × G term originates from the fact that the unit vectors in the rotating reference frame are time
dependent with respect to the inertial frame.
Reference frame undergoing both rotation and translation Both Newtonian and Lagrangian me-
chanics were used to show that for the case of translational acceleration plus rotation, the effective force in
the non-inertial (double-primed) frame can be written as
These inertial correction forces result from describing the system in a non-inertial frame. These inertial
forces are felt when in the rotating-translating frame of reference. Thus the notion of these inertial forces
can be very useful for solving problems in non-inertial frames. For the case of rotating frames, two important
inertial forces are the centrifugal force, $-m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}')$, and the Coriolis force, $-2m\,\boldsymbol{\omega}\times\mathbf{v}''$.
Routhian reduction for rotating systems It was shown that for non-inertial systems, identical equa-
tions of motion are derived using Newtonian, Lagrangian, Hamiltonian, and Routhian mechanics.
Terrestrial manifestations of rotation Examples of motion in rotating frames presented in the chapter
included projectile motion with respect to the surface of the Earth, rotation alignment of nucleons in rotating
nuclei, and weather phenomena.
Workshop exercises
1. Consider a fixed reference frame $S$ and a rotating frame $S'$. The origins of the two coordinate systems always
coincide. By carefully drawing a diagram, derive an expression relating the coordinates of a point in the two
systems. (This was covered in Chapter 2, but it is worth reviewing now.)
2. The effective force observed in a rotating coordinate system is given by equation 10.28.
3. A plumb line is carried along in a moving train, with $m$ the mass of the plumb bob. Neglect any effects due to
the rotation of the Earth and work in the noninertial frame of reference of the train.
(a) Find the tension in the cord and the deflection from the local vertical if the train is moving with constant
acceleration $a_0$.
(b) Find the tension in the cord and the deflection from the local vertical if the train is rounding a curve of
radius $\rho$ with constant speed $v_0$.
4. A bead on a rotating rod is free to slide without friction. The rod has a length $L$ and rotates about its end
with angular velocity $\omega$. The bead is initially released from rest (relative to the rod) at the midpoint of the
rod.
(a) Find the displacement of the bead along the wire as a function of time.
(b) Find the time when the bead leaves the end of the rod.
(c) Find the velocity (relative to the rod) of the bead when it leaves the end of the rod.
5. Here is a “thought experiment” for you to consider. Suppose you are in a small sailboat of mass $M$ at the
Earth’s equator. At the equator there is very little wind (this is known as the “equatorial doldrums”), so your
sailboat is, more or less, sitting still. You have a small anchor of mass $m$ on deck and a single mast of height
$h$ in the middle of the boat. How can you use the anchor to put the boat into motion? In which direction will
the boat move?
6. Does water really flow in the other direction when you flush a toilet in the southern hemisphere? What (if
anything) does the Coriolis force have to do with this?
7. We are presently at a latitude $\lambda$ (with respect to the equator) and Earth is rotating with constant angular
velocity $\omega$. Consider the following two scenarios: Scenario A: A particle is thrown upward with initial speed
$v_0$. Scenario B: An identical particle is dropped (at rest) from the maximum height of the particle in Scenario
A. Circle all the true statements regarding the Coriolis deflection assuming that the particles have landed for
a) and b).
Problems
1. If a projectile is fired due east from a point on the surface of the Earth at a northern latitude $\lambda$ with a velocity
of magnitude $V_0$ and at an inclination to the horizontal of $\alpha$, show that the lateral deflection when the projectile
strikes the Earth is
$$d = \frac{4V_0^3}{g^2}\,\omega\sin\lambda\,\sin^2\alpha\,\cos\alpha$$
where $\omega$ is the rotation frequency of the Earth.
2. In the preceding problem, if the range of the projectile is $R_0'$ for the case $\omega = 0$, show that the change of range
due to rotation of the Earth is
$$\Delta R' = \sqrt{\frac{2R_0'^3}{g}}\,\omega\cos\lambda\left(\cot^{1/2}\alpha - \frac{1}{3}\tan^{3/2}\alpha\right)$$
3. Obtain an expression for the angular deviation of a particle projected from the North Pole in a path that lies
close to the surface of the earth. Is the deviation significant for a missile that makes a 4800-km flight in 10
minutes? What is the “miss distance” if the missile is aimed directly at the target? Is the miss distance
greater for a 19300-km flight at the same velocity?
Chapter 11
Rigid-body rotation
11.1 Introduction
Rigid-body rotation features prominently in science, engineering, and sports. Prior chapters have focussed
primarily on motion of point particles. This chapter extends the discussion to motion of finite-sized rigid
bodies. A rigid body is a collection of particles where the relative separations remain rigidly fixed. In real
life, there is always some motion between individual atoms, but usually this microscopic motion can be
neglected when describing macroscopic properties. Note that the concept of perfect rigidity has limitations
in the theory of relativity: information cannot travel faster than the velocity of light, so signals
cannot be transmitted instantaneously between the ends of a body, as perfect rigidity would require.
The description of rigid-body rotation is most easily handled by specifying the properties of the body
in the rotating body-fixed coordinate frame whereas the observables are measured in the stationary iner-
tial laboratory coordinate frame. In the body-fixed coordinate frame, the primary observable for classical
mechanics is the inertia tensor of the rigid body which is well defined and independent of the rotational
motion. By contrast, in the stationary inertial frame the observables depend sensitively on the details of
the rotational motion. For example, when observed in the stationary fixed frame, rapid rotation of a pencil
about the longitudinal symmetry axis gives a time-averaged shape of the pencil that looks like a thin cylin-
der, whereas the time-averaged shape is a flat disk for rotation perpendicular to the symmetry axis of the
pencil. In spite of this, the pencil always has the same unique inertia tensor in the body-fixed frame. Thus
the best solution for describing rotation of a rigid body is to use a rotation matrix that transforms from
the stationary fixed frame to an instantaneous body-fixed frame for which the moment of inertia tensor can
be evaluated. Moreover, the problem can be greatly simplified by transforming to a body-fixed coordinate
frame that is aligned with any symmetry axes of the body since then the inertia tensor can be diagonal; this
is called a principal axis system.
Rigid-body rotation can be broken into the following two classifications.
1) Rotation about a fixed axis:
A body can be constrained to rotate about an axis that has a fixed location and orientation relative to
the body. The hinged door is a typical example. Rotation about a fixed axis is straightforward since the
axis of rotation, plus the moment of inertia about this axis, are well defined, and this case was discussed in
chapter 2.12.7.
2) Rotation about a point:
A body can be constrained to rotate about a fixed point of the body but the orientation of this rotation
axis about this point is unconstrained. One example is rotation of an object flying freely in space which can
rotate about the center of mass with any orientation. Another example is a child’s spinning top which has
one point constrained to touch the ground but the orientation of the rotation axis is unconstrained.
The prior discussion in chapter 2.12.7 showed that rigid-body rotation is more complicated than assumed
in introductory treatments of rigid-body rotation. It is necessary to expand the concept of moment of inertia
to the concept of the inertia tensor, and to recognize that the angular momentum may not point along the
rotation axis. The most general case requires consideration of rotation about a body-fixed point where the
orientation of the axis of rotation is unconstrained. The concept of the inertia tensor of a rotating body is
crucial for describing rigid-body motion. It will be shown that working in the body-fixed coordinate frame of
a rotating body allows a description of the equations of motion in terms of the inertia tensor for a given point
of the body, and that it is possible to rotate the body-fixed coordinate system into a principal axis system
where the inertia tensor is diagonal. The angular momentum is parallel to the angular velocity when the
angular velocity is aligned with a principal axis. The use of a principal axis system greatly simplifies treatment
of rigid-body rotation and exploits the powerful and elegant matrix algebra mentioned in the appendix.
The following discussion of rigid-body rotation is broken into three topics, (1) the inertia tensor of the
rigid body, (2) the transformation between the rotating body-fixed coordinate system and the laboratory
frame, i.e., the Euler angles specifying the orientation of the body-fixed coordinate frame with respect to the
laboratory frame, and (3) Lagrange and Euler’s equations of motion for rigid bodies. This is followed by a
discussion of practical applications.
There are two especially convenient choices for the fixed point $O$. If no point in the body is fixed with
respect to an inertial coordinate system, then it is best to choose $O$ as the center of mass. If one point of
the body is fixed with respect to a fixed inertial coordinate system, such as a point on the ground where a
child’s spinning top touches, then it is best to choose this stationary point as the body-fixed point $O$.
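where $\delta_{ij}$ is the Kronecker delta,
$$\delta_{ij} = 1 \quad i = j$$
$$\delta_{ij} = 0 \quad i \neq j$$
In most cases it is more useful to express the components of the inertia tensor in an integral form over
the mass distribution rather than a summation for discrete bodies. That is,
$$I_{ij} = \int\rho(\mathbf{r}')\left(\delta_{ij}\left(\sum_k^3 x_k^2\right) - x_i x_j\right)dV \tag{11.13}$$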
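The inertia tensor is easier to understand when written in cartesian coordinates $\mathbf{r}'_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ rather
than in the form $\mathbf{r}'_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$. Then, the diagonal moments of inertia of the inertia tensor are
$$I_{xx} \equiv \sum_\alpha m_\alpha\left[x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - x_\alpha^2\right] = \sum_\alpha m_\alpha\left[y_\alpha^2 + z_\alpha^2\right] \tag{11.14}$$
$$I_{yy} \equiv \sum_\alpha m_\alpha\left[x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - y_\alpha^2\right] = \sum_\alpha m_\alpha\left[x_\alpha^2 + z_\alpha^2\right]$$
$$I_{zz} \equiv \sum_\alpha m_\alpha\left[x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - z_\alpha^2\right] = \sum_\alpha m_\alpha\left[x_\alpha^2 + y_\alpha^2\right]$$
The above notation for the inertia tensor allows the angular momentum (11.12) to be written as
$$L_i = \sum_j^3 I_{ij}\,\omega_j \tag{11.17}$$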
Note that every fixed point in a body has a specific inertia tensor. The components of the inertia tensor
at a specified point depend on the orientation of the coordinate frame whose origin is located at the specified
fixed point. For example, the inertia tensor for a cube is very different when the fixed point is at the center
of mass compared with when the fixed point is at a corner of the cube.
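For a body made of discrete point masses, the summation form of the inertia tensor can be evaluated directly. The sketch below is illustrative only: the function `inertia_tensor` and the two-mass test body are assumptions introduced here, with positions measured from the chosen fixed point. It also shows how the tensor changes when the fixed point is moved.

```python
def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a * (delta_ij * |r_a|^2 - r_a[i] * r_a[j]).

    positions are measured from the chosen fixed point O.
    """
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r_squared = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                tensor[i][j] += m * (delta * r_squared - r[i] * r[j])
    return tensor

# Hypothetical body: two unit masses on the x axis, two units apart.
masses = [1.0, 1.0]

# Fixed point at the midpoint (the center of mass).
about_center = inertia_tensor(masses, [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])

# Fixed point at the location of the first mass.
about_end = inertia_tensor(masses, [(0.0, 0.0, 0.0), (-2.0, 0.0, 0.0)])

print(about_center[1][1], about_end[1][1])
```

For this test body the moment about the y axis doubles, from 2 to 4, when the fixed point moves from the midpoint to one of the masses, while the moment about the x axis stays zero in both cases.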
As discussed in appendix 2, equation (11.18) now can be written in tensor notation as an inner product
of the form
$$\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} \tag{11.21}$$
Note that the above notation uses boldface for the inertia tensor I, implying a rank-2 tensor representation,
while the angular velocity ω and the angular momentum L are written as column vectors. The inertia tensor
is a 9-component rank-2 tensor defined as the ratio of the angular momentum vector L and the angular
velocity ω.
$$\{\mathbf{I}\} = \frac{\mathbf{L}}{\boldsymbol{\omega}} \tag{11.22}$$
Note that, as described in the appendix, the inner product of a vector ω, which is a rank-1 tensor, and a
rank-2 tensor {I} leads to the vector L. This compact notation exploits the fact that the matrix and tensor
representations are completely equivalent, and are ideally suited to the description of rigid-body rotation.
where the $I$ are real numbers, which are called the principal moments of inertia of the body, and are
usually written as $I_j$. When the angular velocity vector ω points along any principal axis unit vector $\hat{\mathbf{j}}$, then
the angular momentum L is parallel to ω and the magnitude of the principal moment of inertia about this
principal axis is given by the relation
$$L_j\,\hat{\mathbf{j}} = I_j\,\omega_j\,\hat{\mathbf{j}} \tag{11.24}$$
The principal axes are fixed relative to the shape of the rigid body and they are invariant to the orientation
of the body-fixed coordinate system used to evaluate the inertia tensor. The advantage of having the body-
fixed coordinate frame aligned with the principal axis coordinate frame is that then the inertia tensor is
diagonal, which greatly simplifies the matrix algebra. Even when the body-fixed coordinate system is not
aligned with the principal axis frame, if the angular velocity is specified to point along a principal axis then
the corresponding moment of inertia will be given by (11.24).
In principle it is possible to locate the principal axes by varying the orientation of the angular velocity
vector ω to find those orientations for which the angular momentum L and angular velocity ω are parallel
which characterizes the principal axes. However, the best approach is to diagonalize the inertia tensor.
These equations are solved for the ratios $\omega_{11} : \omega_{21} : \omega_{31}$, which are the direction numbers of the principal axis
corresponding to solution $I_1$. This principal axis is defined relative to the original coordinate
system. This procedure is repeated to find the orientation of the other two mutually perpendicular principal
axes.
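$$\mathbf{R} = \mathbf{a} + \mathbf{r} \tag{11.34}$$
where a is the vector connecting the origins of the coordinate systems $O$ and $Q$ illustrated in figure 11.2.
The elements of the inertia tensor with respect to axis system $Q$ are given by equation 11.12 to be
$$J_{ij} \equiv \sum_\alpha m_\alpha\left[\delta_{ij}\left(\sum_k^3 X_{\alpha,k}^2\right) - X_{\alpha,i}X_{\alpha,j}\right] \tag{11.35}$$
Figure 11.2: Transformation between two parallel body-coordinate systems, O and Q.
The components along the three axes for each of the two coordinate systems are related by
$$X_i = a_i + x_i \tag{11.36}$$
Substituting these into equation 11.35 and expanding gives one summation containing only the $x_{\alpha,i}$, one
containing only the $a_i$, and one containing the cross terms.
The first summation on the right-hand side corresponds to the elements $I_{ij}$ of the inertia tensor in the
center-of-mass frame. Thus the terms can be regrouped to give
$$J_{ij} \equiv I_{ij} + \sum_\alpha m_\alpha\left(\delta_{ij}\sum_k^3 a_k^2 - a_i a_j\right) + \sum_\alpha m_\alpha\left[2\delta_{ij}\sum_k^3 x_{\alpha,k}a_k - a_i x_{\alpha,j} - a_j x_{\alpha,i}\right] \tag{11.38}$$
However, each term in the last bracket involves a sum of the form $\sum_\alpha m_\alpha x_{\alpha,k}$. Take the coordinate system
$O$ to be with respect to the center of mass for which
$$\sum_\alpha m_\alpha\mathbf{r}'_\alpha = 0 \tag{11.39}$$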
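The last bracket in the regrouped sum therefore vanishes, leaving
$$J_{ij} = I_{ij} + M\left(a^2\delta_{ij} - a_i a_j\right)$$
where $I_{ij}$ is the center-of-mass inertia tensor, $M = \sum_\alpha m_\alpha$ is the total mass, and $a^2 = \sum_k a_k^2$.
This is the general form of Steiner’s parallel-axis theorem.
As an example, the moment of inertia around the $X_1$ axis is given by
$$J_{11} \equiv I_{11} + M\left(\left(a_1^2 + a_2^2 + a_3^2\right)\delta_{11} - a_1^2\right) = I_{11} + M\left(a_2^2 + a_3^2\right) \tag{11.43}$$
which corresponds to the elementary statement that the difference in the moments of inertia equals the
mass of the body multiplied by the square of the distance between the parallel axes $x_1$ and $X_1$. Note that the
moment of inertia about an axis is a minimum when that axis passes through the center of mass.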
11.1 Example: Inertia tensor of a solid cube rotating about the center of mass.
The complicated expressions for the inertia tensor can be understood using the example of a uniform solid
cube with side $b$, density $\rho$ and mass $M = \rho b^3$ rotating about different axes. Assume that the origin of the
coordinate system $O$ is at the center of mass with the axes perpendicular to the centers of the faces of
the cube.
[Figure: Inertia tensor of a uniform solid cube of side $b$ about the center of mass $O$ and a corner of the
cube $Q$. The vector $\mathbf{a}$ is the vector distance between $O$ and $Q$.]
The components of the inertia tensor can be calculated using (11.13) written as an integral over the mass
distribution rather than a summation.
$$I_{ij} = \int\rho(\mathbf{r}')\left(\delta_{ij}\left(\sum_k^3 x_k^2\right) - x_i x_j\right)dV$$
Thus
$$I_{11} = \rho\int_{-b/2}^{b/2}\int_{-b/2}^{b/2}\int_{-b/2}^{b/2}\left(x_2^2 + x_3^2\right)dx_3\,dx_2\,dx_1 = \frac{1}{6}\rho b^5 = \frac{1}{6}Mb^2 = I_{22} = I_{33}$$
By symmetry the diagonal moments of inertia about each face are identical. Similarly the products of inertia
are given by
$$I_{12} = -\rho\int_{-b/2}^{b/2}\int_{-b/2}^{b/2}\int_{-b/2}^{b/2}\left(x_1 x_2\right)dx_3\,dx_2\,dx_1 = 0$$
a) Direct calculation Let one corner of the cube be the origin of the coordinate system and assume
that the three adjacent sides of the cube lie along the coordinate axes. The components of the inertia tensor
can be calculated using (11.13). Thus
$$I_{11} = \rho\int_0^b\int_0^b\int_0^b\left(x_2^2 + x_3^2\right)dx_3\,dx_2\,dx_1 = \frac{2}{3}\rho b^5 = \frac{2}{3}Mb^2$$
$$I_{12} = -\rho\int_0^b\int_0^b\int_0^b\left(x_1 x_2\right)dx_3\,dx_2\,dx_1 = -\frac{1}{4}\rho b^5 = -\frac{1}{4}Mb^2$$
Thus, evaluating all the nine components gives
$$\mathbf{I} = \frac{1}{12}Mb^2\begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix}$$
b) Parallel-axis theorem This inertia tensor also can be calculated using the parallel-axis theorem to
relate the moment of inertia about the corner to that at the center of mass. As shown in the figure, the
vector $\mathbf{a}$ has components
$$a_1 = a_2 = a_3 = \frac{b}{2}$$
Applying the parallel-axis theorem gives
$$J_{11} = I_{11} + M\left(a^2 - a_1^2\right) = I_{11} + M\left(a_2^2 + a_3^2\right) = \frac{1}{6}Mb^2 + \frac{1}{2}Mb^2 = \frac{2}{3}Mb^2$$
and similarly for $J_{22}$ and $J_{33}$. The off-diagonal terms are given by
$$J_{12} = I_{12} + M\left(-a_1 a_2\right) = -\frac{1}{4}Mb^2$$
Thus the inertia tensor, transferred from the center of mass to the corner of the cube, is
$$\mathbf{I} = \begin{pmatrix} \frac{2}{3}Mb^2 & -\frac{1}{4}Mb^2 & -\frac{1}{4}Mb^2 \\ -\frac{1}{4}Mb^2 & \frac{2}{3}Mb^2 & -\frac{1}{4}Mb^2 \\ -\frac{1}{4}Mb^2 & -\frac{1}{4}Mb^2 & \frac{2}{3}Mb^2 \end{pmatrix} = \frac{1}{12}Mb^2\begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix}$$
This inertia tensor about the corner of the cube is the same as that obtained by direct integration.
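The parallel-axis step can be cross-checked with exact rational arithmetic. This is a minimal sketch in units where M = 1 and b = 1, so that every entry is a multiple of Mb². The helper name `parallel_axis` is an assumption introduced here; it implements the general relation J_ij = I_ij + M(a²δ_ij − a_i a_j), of which the component expressions above are special cases.

```python
from fractions import Fraction

def parallel_axis(i_cm, mass, a):
    """Shift a center-of-mass inertia tensor by the displacement vector a.

    Implements J_ij = I_ij + M * (a^2 * delta_ij - a_i * a_j).
    """
    a_squared = sum(c * c for c in a)
    return [[i_cm[i][j] + mass * ((a_squared if i == j else 0) - a[i] * a[j])
             for j in range(3)]
            for i in range(3)]

# Center-of-mass tensor of the uniform cube: (1/6) M b^2 times the unit matrix.
sixth = Fraction(1, 6)
i_cm = [[sixth if i == j else Fraction(0) for j in range(3)] for i in range(3)]

# Displacement from the center of mass to a corner: a = (b/2, b/2, b/2).
half = Fraction(1, 2)
j_corner = parallel_axis(i_cm, Fraction(1), (half, half, half))

for row in j_corner:
    print([str(entry) for entry in row])
```

Each row printed has 2/3 on the diagonal and −1/4 elsewhere, matching the tensor obtained by direct integration.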
c) Principal moments of inertia The coordinate axis frame used for rotation about the corner of the
cube is not a principal axis frame. Therefore let us diagonalize the inertia tensor to find the principal
axis frame and the principal moments of inertia about a corner. To achieve this requires solving the secular
determinant
$$\begin{vmatrix} \frac{2}{3}Mb^2 - I & -\frac{1}{4}Mb^2 & -\frac{1}{4}Mb^2 \\ -\frac{1}{4}Mb^2 & \frac{2}{3}Mb^2 - I & -\frac{1}{4}Mb^2 \\ -\frac{1}{4}Mb^2 & -\frac{1}{4}Mb^2 & \frac{2}{3}Mb^2 - I \end{vmatrix} = 0$$
The value of a determinant is not affected by adding or subtracting any row or column from any other
row or column. Subtracting row 1 from row 2 gives
$$\begin{vmatrix} \frac{2}{3}Mb^2 - I & -\frac{1}{4}Mb^2 & -\frac{1}{4}Mb^2 \\ -\frac{11}{12}Mb^2 + I & \frac{11}{12}Mb^2 - I & 0 \\ -\frac{1}{4}Mb^2 & -\frac{1}{4}Mb^2 & \frac{2}{3}Mb^2 - I \end{vmatrix} = 0$$
where the second subscript 1 attached to $\omega_{i1}$ signifies that this solution corresponds to $I_{11}$. This gives
$$2\omega_{11} - \omega_{21} - \omega_{31} = 0$$
$$-\omega_{11} + 2\omega_{21} - \omega_{31} = 0$$
$$-\omega_{11} - \omega_{21} + 2\omega_{31} = 0$$
Solving these three equations gives the unit vector for the first principal axis, for which $I_{11} = \frac{1}{6}Mb^2$, to be
$$\hat{\mathbf{e}}_1 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$
This can be repeated to find the other two principal axes by substituting $I_{22} = \frac{11}{12}Mb^2$. This
gives for the second principal moment $I_{22}$
$$\left(\{\mathbf{I}\} - I_{22}\{\mathbb{1}\}\right)\cdot\boldsymbol{\omega} = \frac{1}{12}Mb^2\begin{pmatrix} -3 & -3 & -3 \\ -3 & -3 & -3 \\ -3 & -3 & -3 \end{pmatrix}\begin{pmatrix} \omega_{12} \\ \omega_{22} \\ \omega_{32} \end{pmatrix} = 0$$
This results in three identical equations for the components of $\boldsymbol{\omega}$, namely
$$\omega_{12} + \omega_{22} + \omega_{32} = 0$$
This does not uniquely determine the direction of $\boldsymbol{\omega}$. However, it does imply that $\boldsymbol{\omega}_2$ corresponding to the
second principal axis has the property that
$$\hat{\boldsymbol{\omega}}_2\cdot\hat{\mathbf{e}}_1 = 0$$
that is, any direction of $\hat{\mathbf{e}}_2$ that is perpendicular to $\hat{\mathbf{e}}_1$ is acceptable. In other words, any two orthogonal unit
vectors $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$ that are perpendicular to $\hat{\mathbf{e}}_1$ are acceptable. This ambiguity exists whenever two eigenvalues
are equal; the three principal axes are only uniquely defined if all three eigenvalues are different. The same
ambiguity exists when all three eigenvalues are identical, as occurs for the principal moments of inertia about
the center of mass of a uniform solid cube. This explains why the principal moment of inertia for the diagonal
of the cube, that passes through the center of mass, has the same moment as when the principal axes pass
through the center of the faces of the cube.
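The degeneracy described above can be verified by testing candidate directions directly against the corner inertia tensor. The following is a small sketch using exact fractions, in units of Mb². The names `CORNER`, `apply`, and `principal_moment` are introduced here for illustration, and the test vectors are left unnormalized since only their direction matters.

```python
from fractions import Fraction

# Inertia tensor about a corner of the cube in units of M*b^2 (from part a).
CORNER = [[Fraction(n, 12) for n in row]
          for row in ((8, -3, -3), (-3, 8, -3), (-3, -3, 8))]

def apply(tensor, vec):
    """Return the matrix-vector product tensor . vec."""
    return [sum(tensor[i][j] * vec[j] for j in range(3)) for i in range(3)]

def principal_moment(tensor, vec):
    """Return the eigenvalue if vec is an eigenvector of tensor, else None."""
    out = apply(tensor, vec)
    k = next(i for i in range(3) if vec[i] != 0)
    lam = out[k] / vec[k]
    return lam if all(out[i] == lam * vec[i] for i in range(3)) else None

print(principal_moment(CORNER, (1, 1, 1)))    # body diagonal
print(principal_moment(CORNER, (1, -1, 0)))   # perpendicular to the diagonal
print(principal_moment(CORNER, (1, 1, -2)))   # also perpendicular to the diagonal
print(principal_moment(CORNER, (1, 0, 0)))    # cube edge: not a principal axis
```

The body diagonal returns 1/6, both directions perpendicular to it return 11/12, and the cube edge returns `None`, consistent with one unique principal axis and a degenerate pair.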
$$I_x + I_y = \int\rho\left(x^2 + y^2\right)dV + 2\int\rho z^2\,dV \ge \int\rho\left(x^2 + y^2\right)dV = I_z \tag{11.44}$$
Note that for any body the three principal moments of inertia must satisfy the triangle rule that the sum of
any pair must exceed or equal the third. Moreover, if the body is a thin lamina with thickness $z = 0$, that
is, a thin plate in the $x$−$y$ plane, then
$$I_x + I_y = I_z \tag{11.45}$$
This perpendicular-axis theorem can be very useful for solving problems involving rotation of plane laminae.
The opposite of a plane lamina is a long thin cylindrical needle of mass $m$, length $L$, and radius $R$.
Along the symmetry axis the principal moment is $I_z = \frac{1}{2}mR^2 \to 0$ as $R \to 0$, while perpendicular to the
symmetry axis $I_x = I_y = \frac{1}{12}mL^2$. These satisfy the triangle rule.
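The perpendicular-axis theorem can be illustrated with a handful of point masses confined to a plane. This is a sketch under stated assumptions: the function `planar_moments` and the three-mass lamina are hypothetical, and the moments are taken about the coordinate axes rather than the principal axes, since the sum rule holds for any pair of perpendicular in-plane axes through the same point.

```python
def planar_moments(masses, points):
    """Moments of inertia about the x, y and z axes for masses in the z = 0 plane."""
    i_x = sum(m * y * y for m, (x, y) in zip(masses, points))
    i_y = sum(m * x * x for m, (x, y) in zip(masses, points))
    i_z = sum(m * (x * x + y * y) for m, (x, y) in zip(masses, points))
    return i_x, i_y, i_z

# Hypothetical lamina: three point masses at integer coordinates in the x-y plane.
i_x, i_y, i_z = planar_moments([2, 1, 3], [(1, 2), (-3, 1), (0, -2)])
print(i_x, i_y, i_z)
```

For this lamina the three moments are 21, 11, and 32, so the two in-plane moments sum exactly to the moment about the normal, which is the equality limit of the triangle rule.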
Spherical top: $I_1 = I_2 = I_3$
A spherical top is a body having three degenerate principal moments of inertia. Such a body has the same
symmetry as the inertia tensor about the center of a uniform sphere. For a sphere it is obvious from the
symmetry that any orientation of three mutually orthogonal axes about the center of the uniform sphere is an
equally good principal axis system. For a uniform cube the principal axes of the inertia tensor about the center
of mass were shown to be aligned such that they pass through the center of each face, and the three principal
moments are identical; that is, inertially it is equivalent to a spherical top. A less obvious consequence of the
spherical symmetry is that any orientation of three mutually perpendicular axes about the center of mass of
a uniform cube is an equally good principal axis system.
Symmetric top: $I_1 = I_2 \neq I_3$
The equivalent ellipsoid for a body with two degenerate principal moments of inertia is a spheroid which has
cylindrical symmetry with the cylindrical axis aligned along the third axis. A body with $I_3 < I_1 = I_2$ is a
prolate spheroid while a body with $I_3 > I_1 = I_2$ is an oblate spheroid. Examples with a prolate spheroidal
equivalent inertial shape are a rugby ball, a pencil, or a baseball bat. Examples of an oblate spheroid are an
orange or a frisbee. A uniform sphere, or a uniform cube, rotating about a point displaced from the center
of mass also behaves inertially like a symmetric top. The cylindrical symmetry of the equivalent spheroid
makes it obvious that any mutually perpendicular axes that are normal to the axis of cylindrical symmetry
are equally good principal axes even when the cross section in the 1−2 plane is square as opposed to circular.
A rotor is a diatomic-molecule shaped body which is a special case of a symmetric top where $I_1 = 0$
and $I_2 = I_3$. The rotation of a rotor is perpendicular to the symmetry axis since the rotational energy and
angular momentum about the symmetry axis are zero because the principal moment of inertia about the
symmetry axis is zero.
Asymmetric top: $I_1 \neq I_2 \neq I_3$
A body where all three principal moments of inertia are distinct, $I_1 \neq I_2 \neq I_3$, is called an asymmetric
top. Some molecules and nuclei have asymmetric, triaxially-deformed shapes.
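The left-hand sides of these equations are identical since the inertia tensor is symmetric, that is, $I_{ij} = I_{ji}$.
Therefore subtracting these equations gives
$$\sum_i I_m\,\omega_{im}\omega_{in} - \sum_k I_n\,\omega_{km}\omega_{kn} = 0 \tag{11.51}$$
That is
$$\left(I_m - I_n\right)\sum_i\omega_{im}\omega_{in} = 0 \tag{11.52}$$
or
$$\left(I_m - I_n\right)\boldsymbol{\omega}_m\cdot\boldsymbol{\omega}_n = 0 \tag{11.53}$$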
If $I_m \neq I_n$ then
$$\boldsymbol{\omega}_m\cdot\boldsymbol{\omega}_n = 0 \tag{11.54}$$
which implies that the $m$ and $n$ principal axes are perpendicular. However, if $I_m = I_n$ then equation
11.53 does not require that $\boldsymbol{\omega}_m\cdot\boldsymbol{\omega}_n = 0$; that is, these axes are not necessarily perpendicular, but, with
no loss of generality, these two axes can be chosen to be perpendicular with any orientation in the plane
perpendicular to the symmetry axis.
Summarizing the above discussion, the inertia tensor has the following properties.
1) Diagonalization may be accomplished by an appropriate rotation of the axes in the body.
2) The principal moments (eigenvalues) and principal axes (eigenvectors) are obtained as roots of the
secular determinant and are real.
3) The principal axes (eigenvectors) are real and orthogonal.
4) For a symmetric top with two identical principal moments of inertia, any orientation of two orthogonal
axes perpendicular to the symmetry axis gives satisfactory eigenvectors.
5) For a spherical top with three identical principal moments of inertia, the principal axis system can
have any orientation with respect to the origin.
where ω is the angular velocity, {I} the inertia tensor, and L the corresponding angular momentum.
Two important consequences of equation 11.55 are that:
• The angular momentum L and angular velocity ω are not necessarily collinear.
• In general the principal axis system of the rotating rigid body is not aligned with either the angular
momentum or angular velocity vectors.
An exception to these statements occurs when the angular velocity ω is aligned along a principal axis
for which the inertia tensor is diagonal, i.e. $I_{ij} = I_i\delta_{ij}$, and then both L and ω point along this principal
axis. In general the angular momentum L and angular velocity ω precess around each other. An important
special case is for torque-free systems where Noether’s theorem implies that the angular momentum vector
L is conserved in both magnitude and direction. In this case, the angular velocity ω and the principal axis
system both precess around the angular momentum vector L. That is, the body appears to tumble with
respect to the laboratory fixed frame. Understanding rigid-body rotation requires care not to confuse the
body-fixed principal axis coordinate frame, used to determine the inertia tensor, and the fixed laboratory
frame where the motion is observed.
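To make the non-collinearity of L and ω concrete, the angle between them can be computed for the inertia tensor of the uniform cube about a corner, derived in the earlier worked example. This is an illustrative sketch in units of Mb² with unit angular speed; the helper names are assumptions introduced here.

```python
import math

# Inertia tensor of a uniform cube about a corner, in units of M*b^2
# (taken from the worked example; axes along the cube edges).
CORNER_TENSOR = [[8 / 12, -3 / 12, -3 / 12],
                 [-3 / 12, 8 / 12, -3 / 12],
                 [-3 / 12, -3 / 12, 8 / 12]]

def angular_momentum(tensor, omega):
    """Return L = {I} . omega."""
    return [sum(tensor[i][j] * omega[j] for j in range(3)) for i in range(3)]

def angle_between_deg(u, v):
    """Angle between two vectors in degrees, clamped against rounding error."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))

omega_edge = [1.0, 0.0, 0.0]   # rotation about a cube edge
omega_diag = [1.0, 1.0, 1.0]   # rotation about the body diagonal (a principal axis)

angle_edge = angle_between_deg(angular_momentum(CORNER_TENSOR, omega_edge), omega_edge)
angle_diag = angle_between_deg(angular_momentum(CORNER_TENSOR, omega_diag), omega_diag)
print(f"edge: {angle_edge:.1f} deg, diagonal: {angle_diag:.1f} deg")
```

For rotation about an edge the angular momentum is tilted by roughly 28° from the angular velocity, whereas for rotation about the body diagonal the two vectors are parallel to within rounding error.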
Consider that the body is rotated about a diagonal of the cube for which the center of mass will be on
the rotation axis. Then the angular velocity vector is written as $\boldsymbol{\omega} = \frac{\omega}{\sqrt{3}}\left(1, 1, 1\right)^T$, where the components
$\omega_x = \omega_y = \omega_z = \frac{\omega}{\sqrt{3}}$ with the angular velocity magnitude $\sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2} = \omega$.
$$\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} = \frac{1}{6}Mb^2\frac{\omega}{\sqrt{3}}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6}Mb^2\frac{\omega}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6}Mb^2\,\boldsymbol{\omega}$$
Note that L and ω again are collinear, showing it also is a principal axis. Moreover, the magnitude of L
is identical whether the rotation axis through the center of mass passes through the center of a face or
along a diagonal of the cube, implying that the principal moments of inertia about these axes
are identical. This illustrates the important property that, when the three principal moments of inertia are
identical, then any orientation of the coordinate system is an equally good principal axis system. That is,
this corresponds to the spherical top where all orientations are principal axes, not just along the obvious
symmetry axes.
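This is a general expression for the kinetic energy that is valid for any choice of the origin from which the
body-fixed vectors $\mathbf{r}'_\alpha$ are measured. However, if the origin is chosen to be the center of mass, then, and only
then, the middle term cancels. That is, since V and ω are independent of the specific particle, then
$$\sum_\alpha m_\alpha\mathbf{V}\cdot\boldsymbol{\omega}\times\mathbf{r}'_\alpha = \mathbf{V}\cdot\boldsymbol{\omega}\times\left(\sum_\alpha m_\alpha\mathbf{r}'_\alpha\right) \tag{11.59}$$
where $\sum_\alpha m_\alpha\mathbf{r}'_\alpha = M\mathbf{R}$, and R = 0 in the body-fixed frame if the selected point in the body is the center
of mass. Thus, when using the center of mass frame, the middle term of equation 11.58 is zero. Therefore, for
the center of mass frame, the kinetic energy separates into two terms in the body-fixed frame
$$T = T_{trans} + T_{rot} \tag{11.61}$$
where
$$T_{trans} = \frac{1}{2}\sum_\alpha m_\alpha V^2 \tag{11.62}$$
$$T_{rot} = \frac{1}{2}\sum_\alpha m_\alpha\left(\boldsymbol{\omega}\times\mathbf{r}'_\alpha\right)\cdot\left(\boldsymbol{\omega}\times\mathbf{r}'_\alpha\right)$$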
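The rotational kinetic energy can be expressed in terms of components of ω and $\mathbf{r}'_\alpha$ in the body-fixed
frame. Also the following formulae are greatly simplified if $\mathbf{r}'_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ in the rotating body-fixed frame
is written in the form $\mathbf{r}'_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$ where the axes are defined by the numbers 1, 2, 3 rather than
$x, y, z$. In this notation the rotational kinetic energy is written as
$$T_{rot} = \frac{1}{2}\sum_\alpha m_\alpha\left[\left(\sum_i\omega_i^2\right)\left(\sum_k x_{\alpha,k}^2\right) - \left(\sum_i\omega_i x_{\alpha,i}\right)\left(\sum_j\omega_j x_{\alpha,j}\right)\right] \tag{11.65}$$
Each component of the angular velocity can be written as $\omega_i = \sum_j^3\omega_j\delta_{ij}$,
where $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$.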
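Then the kinetic energy can be written more compactly
$$\begin{aligned} T_{rot} &= \frac{1}{2}\sum_\alpha m_\alpha\left[\left(\sum_i\omega_i^2\right)\left(\sum_k x_{\alpha,k}^2\right) - \left(\sum_i\omega_i x_{\alpha,i}\right)\left(\sum_j\omega_j x_{\alpha,j}\right)\right] \\ &= \frac{1}{2}\sum_\alpha\sum_{i,j}^3 m_\alpha\left[\left(\omega_i\omega_j\delta_{ij}\right)\left(\sum_k^3 x_{\alpha,k}^2\right) - \left(\omega_i x_{\alpha,i}\right)\left(\omega_j x_{\alpha,j}\right)\right] \\ &= \frac{1}{2}\sum_{i,j}^3\omega_i\omega_j\left[\sum_\alpha m_\alpha\left[\delta_{ij}\left(\sum_k^3 x_{\alpha,k}^2\right) - x_{\alpha,i}x_{\alpha,j}\right]\right] \end{aligned} \tag{11.67}$$
The term in the outer square brackets is the inertia tensor defined in equation 11.12 for a discrete body. The
inertia tensor components for a continuous body are given by equation 11.13.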
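Thus the rotational component of the kinetic energy can be written in terms of the inertia tensor as
$$T_{rot} = \frac{1}{2}\sum_{i,j}^3 I_{ij}\,\omega_i\omega_j \tag{11.68}$$
Note that when the inertia tensor is diagonal, the evaluation of the kinetic energy simplifies to
$$T_{rot} = \frac{1}{2}\sum_i^3 I_{ii}\,\omega_i^2 \tag{11.69}$$
which is the familiar relation in terms of the scalar moment of inertia discussed in elementary mechanics.
Equation 11.68 also can be factored in terms of the angular momentum $\mathbf{L}$.
$$T_{rot} = \frac{1}{2}\sum_{i,j}^3 \omega_i I_{ij}\omega_j = \frac{1}{2}\sum_i^3 \omega_i \sum_j^3 I_{ij}\omega_j = \frac{1}{2}\sum_i^3 \omega_i L_i \tag{11.70}$$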
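As mentioned earlier, tensor algebra is an elegant and compact way of expressing such matrix operations.
Thus it is possible to express the rotational kinetic energy as
$$T_{rot} = \frac{1}{2}\begin{pmatrix}\omega_1 & \omega_2 & \omega_3\end{pmatrix}\cdot\begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33}\end{pmatrix}\cdot\begin{pmatrix}\omega_1 \\ \omega_2 \\ \omega_3\end{pmatrix} \tag{11.71}$$
$$T_{rot} \equiv T = \frac{1}{2}\,\boldsymbol{\omega}\cdot\{\mathbf{I}\}\cdot\boldsymbol{\omega} \tag{11.72}$$
where the rotational energy $T$ is a scalar. Using equation 11.55 the rotational component of the kinetic
energy also can be written as
$$T_{rot} \equiv T = \frac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} \tag{11.73}$$
which is the same as given by (11.70). It is interesting to realize that even though $\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega}$ is the inner
product of a tensor and a vector, it is a vector, as illustrated by the fact that the inner product
$T_{rot} = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L} = \frac{1}{2}\boldsymbol{\omega}\cdot(\{\mathbf{I}\}\cdot\boldsymbol{\omega})$ is a scalar. Note that the translational kinetic energy must be added to the rotational
kinetic energy to get the total kinetic energy as given by equation 11.61.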
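The equivalence of the three forms of the rotational kinetic energy can be checked numerically. The following is a minimal sketch in plain Python; the masses, positions, and angular velocity are arbitrary illustrative values, and the helper names (`cross`, `dot`, `inertia_tensor`, `matvec`) are hypothetical, not taken from the text.

```python
# Rotational kinetic energy of a few point masses, evaluated three ways:
# (a) direct sum of (1/2) m |omega x r|^2, (b) (1/2) sum I_ij w_i w_j,
# (c) (1/2) omega . L with L = I . omega.  All numbers are illustrative.

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def inertia_tensor(masses, positions):
    """I_ij = sum_alpha m_alpha (delta_ij |r|^2 - x_i x_j) for a discrete body."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = dot(r, r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                tensor[i][j] += m * (delta * r2 - r[i] * r[j])
    return tensor

def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]

masses = [1.0, 2.0, 1.5]
positions = [[1.0, 0.0, 0.5], [-0.5, 1.0, 0.0], [0.0, -1.0, 2.0]]
omega = [0.3, -0.2, 1.0]

# (a) direct particle sum
T_direct = 0.5 * sum(m * dot(cross(omega, r), cross(omega, r))
                     for m, r in zip(masses, positions))

# (b) double sum over the inertia tensor
I = inertia_tensor(masses, positions)
T_tensor = 0.5 * sum(I[i][j] * omega[i] * omega[j]
                     for i in range(3) for j in range(3))

# (c) half the scalar product of omega with the angular momentum
L = matvec(I, omega)
T_from_L = 0.5 * dot(omega, L)

print(T_direct, T_tensor, T_from_L)
```

The three printed values should agree to rounding error, which reflects the vector identity used to pass from the particle sum to the inertia-tensor form.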
11.13. EULER ANGLES 307
1) Rotation about the space-fixed ẑ axis from the space x̂ axis to the line of nodes n̂ : The
first rotation $(\mathbf{x}, \mathbf{y}, \mathbf{z})\cdot\lambda_\phi \rightarrow (\mathbf{n}, \mathbf{y}', \mathbf{z})$ is in a right-handed direction through an angle $\phi$ about the space-fixed
z axis. Since the rotation takes place in the x − y plane, the transformation matrix is
$$\{\lambda_\phi\} = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{11.75}$$
1 The space-fixed coordinate frame and the body-fixed coordinate frames are unambiguously defined, that is, the space-fixed
frame is stationary while the body-fixed frame is the principal-axis frame of the body. There are several possible intermediate
frames that can be used to define the Euler angles. The $z$-$x$-$z$ sequence of rotations, used here, is used in most physics
textbooks in classical mechanics. Unfortunately scientists and engineers use slightly different conventions for defining the Euler
angles. As discussed in Appendix A of "Classical Mechanics" by Goldstein, nuclear and particle physicists have adopted the
$z$-$y$-$z$ sequence of rotations while the US and UK aerodynamicists have adopted an $x$-$y$-$z$ sequence of rotations.
This leads to the intermediate coordinate system $(\mathbf{n}, \mathbf{y}', \mathbf{z})$ where the rotated x axis now is colinear with the
n axis of the intermediate frame, that is, the line of nodes.
The precession angular velocity $\dot{\phi}$ is the rate of change of angle of the line of nodes with respect to the space
$x$ axis about the space-fixed $z$ axis.
2) Rotation about the line of nodes n̂ from the space ẑ axis to the body-fixed 3̂ axis: The
second rotation
$$(\mathbf{n}, \mathbf{y}', \mathbf{z})\cdot\lambda_\theta \rightarrow (\mathbf{n}, \mathbf{y}'', \mathbf{3}) \tag{11.77}$$
is in a right-handed direction through the angle $\theta$ about the n̂ axis (line of nodes) so that the "z" axis becomes
colinear with the body-fixed 3̂ axis. Because the rotation now is in the ẑ − 3̂ plane, the transformation matrix
is
$$\{\lambda_\theta\} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \tag{11.78}$$
The line of nodes, which is at the intersection of the space-fixed and body-fixed planes, shown in figure 11.3,
points in the n̂ = ẑ × 3̂ direction. The new "z" axis now is the body-fixed 3̂ axis. The angular velocity $\dot{\theta}$ is
the rate of change of angle of the body-fixed 3̂-axis relative to the space-fixed ẑ-axis about the line of nodes.
3) Rotation about the body-fixed 3̂ axis from the line of nodes to the body-fixed 1̂ axis: The
third rotation
$$(\mathbf{n}, \mathbf{y}'', \mathbf{3})\cdot\lambda_\psi \rightarrow (\hat{1}, \hat{2}, \hat{3}) \tag{11.79}$$
is in a right-handed direction through the angle $\psi$ about the new body-fixed 3̂ axis. This third rotation
transforms the rotated intermediate $(\mathbf{n}, \mathbf{y}'', \mathbf{3})$ frame to the final body-fixed coordinate system (1̂, 2̂, 3̂). The
transformation matrix is
$$\{\lambda_\psi\} = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{11.80}$$
The spin angular velocity $\dot{\psi}$ is the rate of change of the angle of the body-fixed 1-axis with respect to the
line of nodes about the body-fixed 3 axis.
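The total rotation matrix {λ} is given by
$$\{\lambda\} = \{\lambda_\psi\}\cdot\{\lambda_\theta\}\cdot\{\lambda_\phi\} \tag{11.81}$$
Thus the complete rotation from the space-fixed (x, y, z) axis system to the body-fixed (1, 2, 3) axis system
is given by
$$(1, 2, 3) = \{\lambda\}\cdot(x, y, z) \tag{11.82}$$
where {λ} is given by the triple product equation (11.81) leading to the rotation matrix
$$\{\lambda\} = \begin{pmatrix}
\cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & \sin\theta\sin\psi \\
-\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & \sin\theta\cos\psi \\
\sin\phi\sin\theta & -\cos\phi\sin\theta & \cos\theta
\end{pmatrix} \tag{11.83}$$
The inverse transformation from the body-fixed axis system to the space-fixed axis system is given by
$$\{\lambda\}^{-1} = \{\lambda\}^{T} = \begin{pmatrix}
\cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & \sin\phi\sin\theta \\
\sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & -\cos\phi\sin\theta \\
\sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta
\end{pmatrix} \tag{11.85}$$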
11.14. ANGULAR VELOCITY ω 309
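Since the inverse (11.85) is the transpose of (11.83), the product $\{\lambda\}\{\lambda\}^{T} = 1$, which shows that the rotation matrix is orthogonal. Being a product of three proper rotations, it also has determinant +1.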
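The composition of the three rotations can be checked numerically. The sketch below, in plain Python with hypothetical helper names, multiplies the three single-axis matrices in the order $\{\lambda_\psi\}\{\lambda_\theta\}\{\lambda_\phi\}$, compares the result with the closed form of equation 11.83, and tests orthogonality. The test angles are arbitrary.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_about_z(angle):
    """Passive rotation of the axes about the current z (or 3) axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_about_x(angle):
    """Passive rotation of the axes about the current x axis (line of nodes)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def euler_matrix(phi, theta, psi):
    """z-x-z Euler rotation: lambda = lambda_psi . lambda_theta . lambda_phi."""
    return matmul(rot_about_z(psi), matmul(rot_about_x(theta), rot_about_z(phi)))

def euler_matrix_closed_form(phi, theta, psi):
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [[cf * cp - sf * ct * sp, sf * cp + cf * ct * sp, st * sp],
            [-cf * sp - sf * ct * cp, -sf * sp + cf * ct * cp, st * cp],
            [sf * st, -cf * st, ct]]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

phi, theta, psi = 0.4, 1.1, -0.7  # arbitrary test angles in radians
lam = euler_matrix(phi, theta, psi)
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

closed_form_error = max_abs_diff(lam, euler_matrix_closed_form(phi, theta, psi))
orthogonality_error = max_abs_diff(matmul(lam, transpose(lam)), identity)
print(closed_form_error, orthogonality_error)
```

Both printed errors should be at the level of floating-point rounding.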
The use of three different coordinate systems, space-fixed, the intermediate line of nodes, and the
body-fixed frame can be confusing at first glance. Basically the angle $\phi$ specifies the rotation about the space-fixed
$z$ axis between the space-fixed $x$ axis and the line of nodes of the Euler angle intermediate frame. The angle $\psi$
specifies the rotation about the body-fixed 3 axis between the line of nodes and the body-fixed 1 axis. Note
that although the space-fixed and body-fixed axes systems each are orthogonal, the Euler angle basis in
general is not orthogonal. For rigid-body rotation the rotation angle $\phi$ about the space-fixed $z$ axis is time
dependent, that is, the line of nodes is rotating with an angular velocity $\dot{\phi}$ with respect to the space-fixed
coordinate frame. Similarly the body-fixed coordinate frame is rotating about the body-fixed 3 axis with
angular velocity $\dot{\psi}$ relative to the line of nodes.
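Note that the precession angular velocity $\dot{\phi}$ is the angular velocity that the body-fixed ê3 and ẑ × 3̂ axes
precess around the space-fixed ẑ axis. Table 11.1 gives the Euler angular velocities required to calculate
the components of the angular velocity ω for the body-fixed (1, 2, 3) axis system. Collecting the individual
components of ω gives the components of the angular velocity of the body, relative to the space-fixed axes,
in the body-fixed axis system (1, 2, 3)
$$\omega_1 = \dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi \tag{11.86}$$
$$\omega_2 = \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi \tag{11.87}$$
$$\omega_3 = \dot{\phi}\cos\theta + \dot{\psi} \tag{11.88}$$
The angular velocity of the body about the body-fixed 3-axis, $\omega_3$, is the sum of the projection of the
precession angular velocity of the line-of-nodes $\dot{\phi}$ with respect to the space-fixed x-axis, plus the angular
velocity $\dot{\psi}$ of the body-fixed 3-axis with respect to the rotating line-of-nodes.
Similarly, the components of the body angular velocity ω for the space-fixed axis system $(x, y, z)$ can be
derived to be
$$\omega_x = \dot{\theta}\cos\phi + \dot{\psi}\sin\theta\sin\phi \tag{11.89}$$
$$\omega_y = \dot{\theta}\sin\phi - \dot{\psi}\sin\theta\cos\phi \tag{11.90}$$
$$\omega_z = \dot{\phi} + \dot{\psi}\cos\theta \tag{11.91}$$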
Note that when $\theta = 0$ then the Euler angles are singular in that the space-fixed $z$ axis is parallel with
the body-fixed 3 axis and there is no way of distinguishing between precession $\dot{\phi}$ and spin $\dot{\psi}$, leading to
$\omega_z = \omega_3 = \dot{\phi} + \dot{\psi}$. When $\theta = \pi$ then the $z$ axis and 3 axis are antiparallel and $\omega_z = \dot{\phi} - \dot{\psi} = -\omega_3$. The other
special case is when $\cos\theta = 0$ for which the Euler angle system is orthogonal and the space-fixed $\omega_z = \dot{\phi}$,
that is, it equals the precession, while the body-fixed $\omega_3 = \dot{\psi}$, that is, it equals the spin. When the Euler
angle basis is not orthogonal then equations (11.86 to 11.88) and (11.89 to 11.91) are needed for expressing the
Euler equations of motion in either the body-fixed frame or the space-fixed frame respectively.
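Equations 11.86 to 11.88 for the components of the angular velocity in the body-fixed frame can be expressed
in terms of the Euler angle velocities in a matrix form as
$$\begin{pmatrix}\omega_1 \\ \omega_2 \\ \omega_3\end{pmatrix} = \begin{pmatrix} \sin\theta\sin\psi & \cos\psi & 0 \\ \sin\theta\cos\psi & -\sin\psi & 0 \\ \cos\theta & 0 & 1 \end{pmatrix}\cdot\begin{pmatrix}\dot{\phi} \\ \dot{\theta} \\ \dot{\psi}\end{pmatrix} \tag{11.92}$$
Note again that the transformation matrix is not orthogonal, which is to be expected since the Euler angular
velocities are about axes that do not form a rectangular system of coordinates. Similarly equations 11.89 to 11.91
for the angular velocity in the space-fixed frame can be expressed in terms of the Euler angle velocities in
matrix form as
$$\begin{pmatrix}\omega_x \\ \omega_y \\ \omega_z\end{pmatrix} = \begin{pmatrix} 0 & \cos\phi & \sin\theta\sin\phi \\ 0 & \sin\phi & -\sin\theta\cos\phi \\ 1 & 0 & \cos\theta \end{pmatrix}\cdot\begin{pmatrix}\dot{\phi} \\ \dot{\theta} \\ \dot{\psi}\end{pmatrix} \tag{11.93}$$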
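Using equations 11.86 to 11.88 for the body-fixed angular velocities gives the rotational kinetic energy in terms
of the Euler angular velocities and principal-frame moments of inertia to be
$$T_{rot} = \frac{1}{2}\left[I_1\left(\dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi\right)^2 + I_2\left(\dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi\right)^2 + I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2\right] \tag{11.95}$$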
11.16. ROTATIONAL INVARIANTS 311
Similarly, the scalar product can be calculated using the Euler angle velocities for the space-fixed frame
using equations 11.89 to 11.91.
$$\boldsymbol{\omega}\cdot\boldsymbol{\omega} = |\omega|^2 = \omega_x^2 + \omega_y^2 + \omega_z^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta$$
This shows the obvious result that the scalar product $\boldsymbol{\omega}\cdot\boldsymbol{\omega} = |\omega|^2$ is invariant to rotations of the coordinate
frame, that is, it is identical when evaluated in either the space-fixed, or body-fixed frames.
Note that for $\theta = 0$, the 3̂ and ẑ axes are parallel, and perpendicular to the n̂ axis, then
$$|\omega|^2 = \left(\dot{\phi} + \dot{\psi}\right)^2 + \dot{\theta}^2$$
For the case when $\theta = 180^\circ$, the 3̂ and ẑ axes are antiparallel, and perpendicular to the n̂ axis, then
$$|\omega|^2 = \left(\dot{\phi} - \dot{\psi}\right)^2 + \dot{\theta}^2$$
For the case when $\theta = 90^\circ$, the 3̂, ẑ, and n̂ axes are mutually perpendicular, that is, orthogonal, and then
$$|\omega|^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2$$
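The rotational invariance of $|\omega|^2$ can be illustrated numerically. The following sketch, in plain Python with hypothetical function names, evaluates the body-fixed and space-fixed components of ω from the z-x-z Euler angles (φ, θ, ψ) and their rates, assuming the standard component expressions for this convention, and compares both squared magnitudes with the closed-form expression above. The angle and rate values are arbitrary.

```python
import math

def omega_body(phi, theta, psi, dphi, dtheta, dpsi):
    """Components of omega along the body-fixed (1, 2, 3) axes."""
    w1 = dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi)
    w2 = dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi)
    w3 = dphi * math.cos(theta) + dpsi
    return (w1, w2, w3)

def omega_space(phi, theta, psi, dphi, dtheta, dpsi):
    """Components of omega along the space-fixed (x, y, z) axes."""
    wx = dtheta * math.cos(phi) + dpsi * math.sin(theta) * math.sin(phi)
    wy = dtheta * math.sin(phi) - dpsi * math.sin(theta) * math.cos(phi)
    wz = dphi + dpsi * math.cos(theta)
    return (wx, wy, wz)

def norm_squared(v):
    return sum(c * c for c in v)

# Arbitrary Euler angles (radians) and Euler angular velocities (rad/s).
angles = (0.3, 1.2, -0.8)
rates = (0.5, -0.4, 2.0)

body_sq = norm_squared(omega_body(*angles, *rates))
space_sq = norm_squared(omega_space(*angles, *rates))
dphi, dtheta, dpsi = rates
closed_form = dphi**2 + dtheta**2 + dpsi**2 + 2 * dphi * dpsi * math.cos(angles[1])

print(body_sq, space_sq, closed_form)
```

All three numbers should agree, independent of the values chosen for φ and ψ.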
The time-averaged shape of a rapidly-rotating body, as seen in the fixed inertial frame, is very different
from the actual shape of the body, and this difference depends on the rotational frequency. For example, a
pencil rotating rapidly about an axis perpendicular to the body-fixed symmetry axis has an average shape
that is a flat disk in the laboratory frame which bears little resemblance to a pencil. The actual shape of the
pencil could be determined by taking high-speed photographs which display the instantaneous body-fixed
shape of the object at given times. Unfortunately for fast rotation, such as rotation of a molecule or a
nucleus, it is not possible to take photographs with sufficient speed and spatial resolution to observe the
instantaneous shape of the rotating body. What is measured is the average shape of the body as seen in the
fixed laboratory frame. In principle the shape observed in the fixed inertial frame can be related to the shape
in the body-fixed frame, but this requires knowing the body-fixed shape which in general is not known. For
example, a deformed nucleus may be both vibrating and rotating about some triaxially deformed average
shape which is a function of the rotational frequency. This is not apparent from the shapes measured in the
fixed frame for each of the excited states.
The fact that scalar products are rotationally invariant provides a powerful means of transforming
products of observables in the body-fixed frame to those in the laboratory frame. In 1971 Cline developed
a powerful model-independent method that utilizes rotationally-invariant products of the electromagnetic
quadrupole operator $E2$ to relate the electromagnetic $E2$ properties for the observed levels of a rotating
nucleus measured in the laboratory frame, to the electromagnetic $E2$ properties of the deformed rotating
nucleus measured in the body-fixed frame [Cli71, Cli72, Cli86]. The method uses the fact that scalar products
of the electromagnetic multipole operators are rotationally invariant. This allows transforming scalar
products of a complete set of measured electromagnetic matrix elements, measured in the laboratory frame, into
the electromagnetic properties in the body-fixed frame of the rotating nucleus. These rotational invariants
provide a model-independent determination of the magnitude, triaxiality, and vibrational amplitudes of the
average shapes in the body-fixed frame for individual observed nuclear states that may be undergoing both
rotation and vibration. When the bombarding energy is below the Coulomb barrier, the scattering of a
projectile nucleus by a target nucleus is due purely to the electromagnetic interaction since the distance
of closest approach exceeds the range of the nuclear force. For such pure Coulomb collisions, the
electromagnetic excitation of collective nuclei populates many excited states, as illustrated in figure 12.13, with
cross sections that are a direct measure of the $E2$ matrix elements. These measured matrix elements are
precisely those required to evaluate, in the laboratory frame, the $E2$ rotational invariants from which it is
possible to deduce the intrinsic quadrupole shapes of the rotating-vibrating nuclear states in the body-fixed
frame [Cli86].
Note that this relation is expressed in the inertial space-fixed frame of reference, not the non-inertial
body-fixed frame. The subscript "space" is added to emphasize that this equation is written in the inertial space-fixed
frame of reference. However, as already discussed, it is much more convenient to transform from the
space-fixed inertial frame to the body-fixed frame for which the inertia tensor of the rigid body is known. Thus the
next stage is to express the rotational motion in terms of the body-fixed frame of reference. For simplicity,
translational motion will be ignored.
The rate of change of angular momentum can be written in terms of the body-fixed value, using the
transformation from the space-fixed inertial frame (x̂, ŷ, ẑ) to the rotating frame (ê1, ê2, ê3) as given in
chapter 10.3,
$$\mathbf{N} = \left(\frac{d\mathbf{L}}{dt}\right)_{space} = \left(\frac{d\mathbf{L}}{dt}\right)_{body} + \boldsymbol{\omega}\times\mathbf{L} \tag{11.99}$$
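However, the body axis $\hat{\mathbf{e}}_i$ is chosen to be the principal axis such that
$$L_i = I_i\,\omega_i \tag{11.100}$$
where the principal moments of inertia are written as $I_i$. Thus the equation of motion can be written using
the body-fixed coordinate system as
$$\mathbf{N} = I_1\dot{\omega}_1\hat{\mathbf{e}}_1 + I_2\dot{\omega}_2\hat{\mathbf{e}}_2 + I_3\dot{\omega}_3\hat{\mathbf{e}}_3 + \begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ I_1\omega_1 & I_2\omega_2 & I_3\omega_3 \end{vmatrix} \tag{11.101}$$
$$= \left(I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3\right)\hat{\mathbf{e}}_1 + \left(I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1\right)\hat{\mathbf{e}}_2 + \left(I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2\right)\hat{\mathbf{e}}_3 \tag{11.102}$$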
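$$\begin{aligned}
N_1 &= I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 \\
N_2 &= I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1 \\
N_3 &= I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2
\end{aligned} \tag{11.103}$$
These are the Euler equations for a rigid body in a force field expressed in the body-fixed coordinate
frame. They are applicable for any applied external torque N.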
The motion of a rigid body depends on the structure of the body only via the three principal moments
of inertia $I_1$, $I_2$, and $I_3$. Thus all bodies having the same principal moments of inertia will behave exactly the
same even though the bodies may have very different shapes. As discussed earlier, the simplest geometrical
shape of a body having three different principal moments is a homogeneous ellipsoid. Thus, the rigid-body
motion often is described in terms of the equivalent ellipsoid that has the same principal moments.
A deficiency of Euler's equations is that the solutions yield the time variation of ω as seen from the
body-fixed reference frame axes, and not in the observer's fixed inertial coordinate frame. Similarly the components
of the external torques in the Euler equations are given with respect to the body-fixed axis system which
implies that the orientation of the body is already known. Thus for non-zero external torques the problem
cannot be solved until the orientation is known in order to determine the components $N_i$. However,
these difficulties disappear when the external torques are zero, or if the motion of the body is known and it
is required to compute the applied torques necessary to produce such motion.
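For the torque-free case, Euler's equations can be integrated directly in the body-fixed frame. The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta step and arbitrary principal moments and initial angular velocity; the function names are hypothetical. It checks that the rotational kinetic energy and the squared angular momentum stay constant along the computed motion.

```python
def euler_rates(w, I, N=(0.0, 0.0, 0.0)):
    """Solve Euler's equations N_i = I_i dw_i/dt - (I_j - I_k) w_j w_k for dw/dt."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return ((N[0] + (I2 - I3) * w2 * w3) / I1,
            (N[1] + (I3 - I1) * w3 * w1) / I2,
            (N[2] + (I1 - I2) * w1 * w2) / I3)

def rk4_step(w, I, dt):
    """One torque-free Runge-Kutta step for the body-frame angular velocity."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = euler_rates(w, I)
    k2 = euler_rates(shifted(w, k1, dt / 2), I)
    k3 = euler_rates(shifted(w, k2, dt / 2), I)
    k4 = euler_rates(shifted(w, k3, dt), I)
    return tuple(wi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))

def kinetic_energy(w, I):
    return 0.5 * sum(Ii * wi ** 2 for Ii, wi in zip(I, w))

def angular_momentum_squared(w, I):
    return sum((Ii * wi) ** 2 for Ii, wi in zip(I, w))

I = (1.0, 2.0, 3.0)      # illustrative principal moments
w = (0.1, 1.0, 0.1)      # illustrative initial angular velocity (rad/s)
T0 = kinetic_energy(w, I)
L2_0 = angular_momentum_squared(w, I)

for _ in range(5000):
    w = rk4_step(w, I, 0.002)

T_drift = abs(kinetic_energy(w, I) - T0)
L2_drift = abs(angular_momentum_squared(w, I) - L2_0)
print(w, T_drift, L2_drift)
```

The solution is the time variation of ω in the body-fixed frame only; relating it to the space-fixed frame still requires the orientation, for example through the Euler angles.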
since the $\psi$ and ê3 axes are colinear. This can be rewritten as
$$I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2 = N_3 \tag{11.109}$$
Any axis could have been designated the ê3 axis, thus the above equation can be generalized to all three
axes to give
$$\begin{aligned}
I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 &= N_1 \\
I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1 &= N_2 \\
I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2 &= N_3
\end{aligned} \tag{11.110}$$
These are Euler's equations given previously in (11.103). Note that although the $\dot{\omega}_3$ equation is the equation
of motion for the $\psi$ coordinate, this is not true for the $\phi$ and $\theta$ rotations which are not along the body-fixed
1 and 2 axes as given in table 11.1.
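Because L is perpendicular to the shaft, and L rotates around ω as the shaft rotates, let ê2 be along L,
with ê3 along the shaft. Let $\alpha$ denote the angle between ω and the shaft, and $b$ the distance of each mass from the origin. Then
$$\mathbf{L} = L_2\,\hat{\mathbf{e}}_2$$
$$\omega_1 = 0 \qquad \omega_2 = \omega\sin\alpha \qquad \omega_3 = \omega\cos\alpha$$
$$\begin{aligned}
L_1 &= I_1\omega_1 = 0 \\
L_2 &= I_2\omega_2 = (m_1 + m_2)\,b^2\,\omega\sin\alpha \\
L_3 &= I_3\omega_3 = 0
\end{aligned}$$
[Figure: Rotation of a dumbbell.]
which is consistent with the angular momentum being along the ê2 axis.
Using Euler's equations, and assuming that the angular velocity is constant, i.e. $\dot{\boldsymbol{\omega}} = 0$, then the
components of the torque required to satisfy this motion are
$$N_1 = -(m_1 + m_2)\,b^2\,\omega^2\sin\alpha\cos\alpha \qquad N_2 = 0 \qquad N_3 = 0$$
That is, this motion can only occur in the presence of the above applied torque which is in the direction
−ê1, that is, mutually perpendicular to ê2 and ê3. This torque can be written as N = ω × L.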
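The dumbbell torque can be cross-checked numerically. The sketch below is an illustration under stated assumptions: the two masses sit at distance `b` from the origin on a massless shaft along ê3, so that I1 = I2 = (m1 + m2)b² and I3 = 0, and `alpha` is the angle between ω and the shaft. These symbol names and the numerical values are chosen for the example and are not taken from the text.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dumbbell_torque(m1, m2, b, omega, alpha):
    """Torque needed to keep a dumbbell rotating at constant omega.

    Returns the torque from Euler's equations (with d(omega)/dt = 0)
    and, for comparison, the vector product omega x L.
    """
    I1 = I2 = (m1 + m2) * b ** 2   # point masses a distance b from the origin
    I3 = 0.0                       # no moment about the shaft itself
    w = (0.0, omega * math.sin(alpha), omega * math.cos(alpha))
    L = (I1 * w[0], I2 * w[1], I3 * w[2])
    N_euler = (-(I2 - I3) * w[1] * w[2],
               -(I3 - I1) * w[2] * w[0],
               -(I1 - I2) * w[0] * w[1])
    return N_euler, cross(w, L)

m1, m2, b, omega, alpha = 1.0, 2.0, 0.5, 3.0, 0.6   # illustrative values
N_euler, N_cross = dumbbell_torque(m1, m2, b, omega, alpha)
N1_closed_form = -(m1 + m2) * b ** 2 * omega ** 2 * math.sin(alpha) * math.cos(alpha)
print(N_euler, N_cross, N1_closed_form)
```

Both routes give a torque with only a negative ê1 component, in line with N = ω × L for constant ω.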
11.19. HAMILTONIAN EQUATIONS OF MOTION FOR RIGID-BODY ROTATION 315
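$$(I_2 - I_3)\,\omega_2\omega_3 - I_1\dot{\omega}_1 = 0 \tag{11.111}$$
$$(I_3 - I_1)\,\omega_3\omega_1 - I_2\dot{\omega}_2 = 0 \tag{11.112}$$
$$I_3\dot{\omega}_3 = 0 \tag{11.113}$$
Since $I_1 = I_2$ and, from (11.113), $\omega_3$ is constant, the first two equations can be written as
$$\dot{\omega}_1 + \Omega\,\omega_2 = 0 \tag{11.114}$$
$$\dot{\omega}_2 - \Omega\,\omega_1 = 0 \tag{11.115}$$
where the precession angular velocity $\Omega = -\dot{\psi}$ with respect to the body-fixed frame is defined to be
$$\Omega \equiv \left(\frac{I_3 - I_1}{I_1}\right)\omega_3 \tag{11.116}$$
Combining the time derivatives of equations 11.114 and 11.115 leads to two uncoupled equations
$$\ddot{\omega}_1 + \Omega^2\omega_1 = 0 \tag{11.117}$$
$$\ddot{\omega}_2 + \Omega^2\omega_2 = 0 \tag{11.118}$$
These are the differential equations for a harmonic oscillator with solutions
$$\omega_1 = A\cos\Omega t \qquad \omega_2 = A\sin\Omega t \tag{11.119}$$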
These equations describe a vector of magnitude $A$ rotating in a circle of radius $A$ about the $\hat{\mathbf{e}}_3$ axis, that
is, rotating in the $\hat{\mathbf{e}}_1 - \hat{\mathbf{e}}_2$ plane, which is perpendicular to $\hat{\mathbf{e}}_3$, with angular frequency $\Omega = -\dot{\psi}$. Note that
$$\omega_1^2 + \omega_2^2 = A^2 \tag{11.120}$$
which is a constant. In addition $\omega_3$ is constant, therefore the magnitude of the total angular velocity
$$|\boldsymbol{\omega}| = \sqrt{\omega_1^2 + \omega_2^2 + \omega_3^2} = \text{constant} \tag{11.121}$$
The motion of the torque-free symmetric body is that the angular velocity ω precesses around the
symmetry axis $\hat{\mathbf{e}}_3$ of the body at an angle $\alpha$ with a constant precession frequency Ω with respect to the
body-fixed frame as shown in figure 11.4. Thus, to an observer on the body, ω traces out a cone around the
body-fixed symmetry axis. Note from (11.116) that the vectors $\Omega\hat{\mathbf{e}}_3$ and $\omega_3\hat{\mathbf{e}}_3$ are parallel when Ω is positive,
that is, $I_3 > I_1$ (oblate shape) and antiparallel if $I_3 < I_1$ (prolate shape).
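The harmonic solution can be verified against Euler's equations for a symmetric top. The sketch below, in plain Python with hypothetical function names and illustrative numbers, assumes I1 = I2, takes Ω = ((I3 − I1)/I1) ω3, and uses a centered finite difference to check that ω1 = A cos Ωt, ω2 = A sin Ωt leave no residual in the first two torque-free Euler equations. It also shows the sign of Ω for oblate and prolate bodies.

```python
import math

def body_precession_rate(I1, I3, w3):
    """Omega = ((I3 - I1) / I1) * w3 for a symmetric top with I1 = I2."""
    return (I3 - I1) / I1 * w3

def omega_body(t, A, Omega, w3):
    """Assumed solution: omega circles the symmetry axis at rate Omega."""
    return (A * math.cos(Omega * t), A * math.sin(Omega * t), w3)

def euler_residuals(t, A, I1, I3, w3, h=1e-6):
    """Left-hand sides of the first two torque-free Euler equations (I1 = I2)."""
    Omega = body_precession_rate(I1, I3, w3)
    plus = omega_body(t + h, A, Omega, w3)
    minus = omega_body(t - h, A, Omega, w3)
    dw1 = (plus[0] - minus[0]) / (2 * h)   # centered finite difference
    dw2 = (plus[1] - minus[1]) / (2 * h)
    w1, w2, _ = omega_body(t, A, Omega, w3)
    return ((I1 - I3) * w2 * w3 - I1 * dw1,
            (I3 - I1) * w3 * w1 - I1 * dw2)

oblate_rate = body_precession_rate(I1=1.0, I3=2.0, w3=3.0)    # disk-like
prolate_rate = body_precession_rate(I1=2.0, I3=1.0, w3=3.0)   # rod-like
residuals = euler_residuals(t=0.7, A=0.4, I1=1.0, I3=2.0, w3=3.0)
print(oblate_rate, prolate_rate, residuals)
```

A positive Ω for the oblate case and a negative Ω for the prolate case match the parallel and antiparallel orientations described above.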
For the system considered, the orientation of the angular momentum vector L must be stationary in the
space-fixed inertial frame since the system is torque free, that is, L is a constant of motion. Also we have
that the projection of the angular momentum on the body-fixed symmetry axis is a constant of motion, that
is, it is a cyclic variable. Thus
$$L_3 = I_3\,\omega_3 = \frac{I_1 I_3}{(I_3 - I_1)}\,\Omega \tag{11.122}$$
Understanding the relation between the angular momentum and angular velocity is facilitated by
considering another constant of motion for the torque-free symmetric rotor, namely the rotational kinetic energy.
$$T_{rot} = \frac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} = \text{constant} \tag{11.123}$$
Since L is a constant for torque-free motion, and also the magnitude of ω was shown to be constant, therefore
the angle between these two vectors must be a constant to ensure that also $T_{rot} = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L}$ = constant. That
is, ω precesses around L at a constant angle $(\theta - \alpha)$ such that the projection of ω onto L is constant. Note
that
$$\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = \omega_2\,\hat{\mathbf{e}}_1 - \omega_1\,\hat{\mathbf{e}}_2 \tag{11.124}$$
and, for a symmetric rotor,
$$\mathbf{L}\cdot\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = I_1\omega_1\omega_2 - I_2\omega_1\omega_2 = 0 \tag{11.125}$$
since $I_1 = I_2$ for the symmetric rotor. Because $\mathbf{L}\cdot\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = 0$ for a symmetric top then L, ω and ê3 are
coplanar.
Figure 11.5 shows the geometry of the motion for both oblate and prolate axially-deformed bodies. To
an observer in the space-fixed inertial frame, the angular velocity ω traces out a cone, called the space cone, that precesses with
angular velocity Ω around the space-fixed L axis. For convenience, figure 11.5 assumes
that L and the space-fixed inertial frame ẑ axis are colinear. The angular velocity ω also traces out the
body cone as it precesses about the body-fixed ê3 axis. Since L, ω and ê3 are coplanar, then the ω vector is
at the intersection of the space and body cones as the body cone rolls around the space cone. That is, the
space and body cones have one generatrix in common which coincides with ω. As shown in figure 11.5, for
a needle the body cone appears to roll without slipping on the outside of the space cone at the precessional
velocity of $\Omega = -\dot{\psi}$. By contrast, as shown in figure 11.5 for an oblate (disc-shaped) symmetric top the
space cone rolls inside the body cone and the precession Ω is faster than $\omega$.
Since no external torques are acting for torque-free motion, then the magnitude and direction of the total
angular momentum are conserved. The description of the motion is simplified if L is taken to be along the
space-fixed ẑ axis, then the Euler angle $\theta$ is the angle between the body-fixed basis vector ê3 and space-fixed
basis vector ẑ. If at some instant in the body frame, it is assumed that ê2 is aligned in the plane of L, ω
and ê3 then
$$L_1 = 0 \qquad L_2 = L\sin\theta \qquad L_3 = L\cos\theta \tag{11.126}$$
If $\alpha$ is the angle between the angular velocity ω and the body-fixed ê3 axis, then at the same instant
$$\omega_1 = 0 \qquad \omega_2 = \omega\sin\alpha \qquad \omega_3 = \omega\cos\alpha \tag{11.127}$$
Figure 11.5: Torque-free rotation of symmetric tops; (a) circular flat disk, (b) circular rod. The space-fixed
and body-fixed cones are shown by fine lines. The space-fixed axis system is designated by the unit vectors
(x̂, ŷ, ẑ) and the body-fixed principal axis system by unit vectors (1̂, 2̂, 3̂).
The components of the angular momentum also can be derived from $\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega}$ to give
$$L_1 = I_1\omega_1 = 0 \qquad L_2 = I_2\omega_2 = I_1\,\omega\sin\alpha \qquad L_3 = I_3\omega_3 = I_3\,\omega\cos\alpha \tag{11.128}$$
Equations 11.126 and 11.128 give two relations for the ratio $L_2/L_3$, that is,
$$\frac{L_2}{L_3} = \tan\theta = \frac{I_1}{I_3}\tan\alpha \tag{11.129}$$
For a prolate spheroid $I_1 > I_3$ therefore $\theta > \alpha$ while Ω and $\omega_3$ have opposite signs.
For an oblate spheroid $I_1 < I_3$ therefore $\alpha > \theta$ while Ω and $\omega_3$ have the same sign.
The sense of precession can be understood if the body cone rolls without slipping on the outside of the
space cone with Ω in the opposite orientation to $\omega$ for the prolate case, while for the oblate case the space
cone rolls inside the body cone with Ω and $\omega$ oriented in similar directions. Note from (11.129) that $\theta = 0$
if $\alpha = 0$, that is L, ω and the 3 axis are aligned corresponding to a principal axis. Similarly, $\theta = 90^\circ$ if
$\alpha = 90^\circ$, then again L and ω are aligned corresponding to them being principal axes.
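The relation tan θ = (I1/I3) tan α between the tilt θ of the symmetry axis from L and the tilt α of the symmetry axis from ω can be explored with a short sketch. The function name and the numerical moments below are illustrative assumptions; angles are in radians and restricted to the range 0 to π/2.

```python
import math

def theta_from_alpha(alpha, I1, I3):
    """Angle between L and the symmetry axis, from tan(theta) = (I1/I3) tan(alpha).

    atan2 is used so that alpha = pi/2 is handled without dividing by zero.
    """
    return math.atan2(I1 * math.sin(alpha), I3 * math.cos(alpha))

alpha = 0.3
theta_prolate = theta_from_alpha(alpha, I1=2.0, I3=1.0)   # rod-like, I1 > I3
theta_oblate = theta_from_alpha(alpha, I1=1.0, I3=2.0)    # disk-like, I1 < I3
theta_sphere = theta_from_alpha(alpha, I1=1.0, I3=1.0)

print(theta_prolate, theta_oblate, theta_sphere)
```

For the prolate body θ exceeds α, for the oblate body θ is smaller than α, and for a sphere the two coincide so that L and ω are parallel.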
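Lagrangian mechanics has been used to calculate the motion with respect to the body-fixed principal
axis system. However, the motion needs to be known relative to the space-fixed inertial frame where the
motion is observed. This transformation can be done using the following relation
$$\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{body} + \boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = \boldsymbol{\omega}\times\hat{\mathbf{e}}_3 \tag{11.130}$$
since the unit vector ê3 is stationary in the body-fixed frame. The vector product of ω × ê3 and ê3 gives
$$\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \hat{\mathbf{e}}_3\times\left(\boldsymbol{\omega}\times\hat{\mathbf{e}}_3\right) = \left(\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3\right)\boldsymbol{\omega} - \left(\hat{\mathbf{e}}_3\cdot\boldsymbol{\omega}\right)\hat{\mathbf{e}}_3 = \boldsymbol{\omega} - \omega_3\hat{\mathbf{e}}_3$$
therefore
$$\boldsymbol{\omega} = \hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + \omega_3\hat{\mathbf{e}}_3 \tag{11.131}$$
The angular momentum equals $\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega}$. Since $\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}$ is perpendicular to the ê3 axis, then
for the case with $I_1 = I_2$,
$$\mathbf{L} = I_1\,\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + I_3\,\omega_3\hat{\mathbf{e}}_3 \tag{11.132}$$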
Thus the angular momentum for a torque-free symmetric rigid rotor comprises two components, one being
the perpendicular component that precesses around ê3, and the other is $L_3$.
In the space-fixed frame assume that the ẑ axis is colinear with L. Then taking the scalar product of ê3
and L, using equation 11.132 gives
$$L_3 = \hat{\mathbf{e}}_3\cdot\mathbf{L} = I_1\,\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + I_3\,\omega_3\,\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3 \tag{11.133}$$
The first term on the right is zero and thus equations 11.133 and 11.126 give
$$L_3 = I_3\,\omega_3 = L\cos\theta \tag{11.134}$$
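The time dependence of the rotation of the body-fixed symmetry axis with respect to the space-fixed
axis system can be obtained by taking the vector product ê3 × L using equation 11.132 and using equation
24 to expand the triple vector product,
$$\begin{aligned}
\hat{\mathbf{e}}_3\times\mathbf{L} &= I_1\,\hat{\mathbf{e}}_3\times\left(\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right) + I_3\,\omega_3\,\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3 \\
&= I_1\left[\left(\hat{\mathbf{e}}_3\cdot\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right)\hat{\mathbf{e}}_3 - \left(\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3\right)\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right] + 0
\end{aligned} \tag{11.135}$$
since $(\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3) = 0$. Moreover $(\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3) = 1$, and $\hat{\mathbf{e}}_3\cdot\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = 0$ since they are perpendicular, then
$$\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \frac{\mathbf{L}}{I_1}\times\hat{\mathbf{e}}_3 \tag{11.136}$$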
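This equation shows that the body-fixed symmetry axis ê3 precesses around L, where L is a constant
of motion for torque-free rotation. The true rotational angular velocity ω in the space-fixed frame, given by
equation 11.131, can be evaluated using equation 11.136. Remembering that it was assumed that L is in
the ẑ direction, that is, $\mathbf{L} = L\hat{\mathbf{z}}$, then
$$\begin{aligned}
\boldsymbol{\omega} &= \hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + \omega_3\hat{\mathbf{e}}_3 \\
&= \frac{L}{I_1}\,\hat{\mathbf{e}}_3\times\left(\hat{\mathbf{z}}\times\hat{\mathbf{e}}_3\right) + \frac{L\cos\theta}{I_3}\,\hat{\mathbf{e}}_3 \\
&= \frac{L}{I_1}\,\hat{\mathbf{z}} + L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)\hat{\mathbf{e}}_3
\end{aligned} \tag{11.137}$$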
That is, the symmetry axis of the axially-symmetric rigid rotor makes an angle $\theta$ to the angular momentum
vector $L\hat{\mathbf{z}}$ and precesses around $L\hat{\mathbf{z}}$ with a constant angular velocity $L/I_1$ while the axial spin of the rigid body
has a constant value $\omega_3 = L\cos\theta/I_3$. Thus, in the precessing frame, the rigid body appears to rotate about its fixed
symmetry axis with a constant angular velocity
$\frac{L\cos\theta}{I_3} - \frac{L\cos\theta}{I_1} = L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)$. The precession of the
symmetry axis looks like a wobble superimposed on the spinning motion about the body-fixed symmetry
axis. The angular precession rate in the space-fixed frame can be deduced by using the fact that
the precession rate is $\dot{\phi} = L/I_1$, with $L$ expressed in terms of $\omega$ and $\alpha$ through equation 11.128,
which gives the precession rate about the space-fixed axis in terms of the angular velocity $\omega$. Note that the
precession rate $\dot{\phi} > \omega$ if $I_3/I_1 > 1$, that is, for oblate shapes, and $\dot{\phi} < \omega$ if $I_3/I_1 < 1$, that is, for prolate shapes.
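The space-frame precession and spin rates can be sketched numerically from the relations above. The example assumes that the precession rate about L is L/I1, that L follows from the body-frame components L2 = I1 ω sin α and L3 = I3 ω cos α, and that the spin rate seen in the precessing frame is ω3 (I1 − I3)/I1. The function name and the three test shapes are illustrative choices.

```python
import math

def space_frame_rates(omega, alpha, I1, I3):
    """Return (precession rate about L, spin rate about the symmetry axis).

    omega : magnitude of the angular velocity
    alpha : angle between omega and the symmetry axis (radians)
    I1, I3: perpendicular and axial principal moments (I1 = I2 assumed)
    """
    L = omega * math.hypot(I1 * math.sin(alpha), I3 * math.cos(alpha))
    precession = L / I1
    spin = omega * math.cos(alpha) * (I1 - I3) / I1
    return precession, spin

omega, alpha = 1.0, 0.01   # omega tilted slightly from the symmetry axis

needle = space_frame_rates(omega, alpha, I1=1.0, I3=0.0)
sphere = space_frame_rates(omega, alpha, I1=1.0, I3=1.0)
disk = space_frame_rates(omega, alpha, I1=1.0, I3=2.0)

print("needle:", needle)
print("sphere:", sphere)
print("disk:  ", disk)
```

For a small tilt the precession rate approaches ω I3/I1, so it is slower than ω for prolate shapes and faster than ω for oblate shapes, with the thin disk precessing at close to twice ω.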
11.20. TORQUE-FREE ROTATION OF AN INERTIALLY-SYMMETRIC RIGID ROTOR 319
Since the generalized momenta $p_\phi$ and $p_\psi$ are constants of motion, then the precessional angular velocity $\dot{\phi}$ about the space-fixed ẑ
axis, and the spin angular velocity $\dot{\psi}$, which is the spin frequency about the body-fixed 3̂ axis, are constants
that depend directly on $I_1$, $I_3$, and $\theta$.
There is one additional constant of motion available if no dissipative forces act on the system, that is,
energy conservation which implies that the total energy
$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{11.153}$$
will be a constant of motion. But the second term on the right-hand side also is a constant of motion since
$p_\psi$ and $I_3$ both are constants, that is
$$\frac{1}{2}I_3\,\omega_3^2 = \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \text{constant} \tag{11.154}$$
Thus energy conservation implies that the first term on the right-hand side also must be a constant given by
$$\frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) = E - \frac{p_\psi^2}{2I_3} = \text{constant} \tag{11.155}$$
These results are identical to those given in equations 11.120 and 11.121 which were derived using Euler's
equations. These results illustrate that the underlying physics of the torque-free rigid rotor is more easily
extracted using Lagrangian mechanics rather than using the Euler-angle approach of Newtonian mechanics.
11.9 Example: Precession rate for torque-free rotating symmetric rigid rotor
Table 11.2 lists the precession and spin angular velocities, in the space-fixed frame, for torque-free rotation
of three extreme symmetric-top geometries spinning with constant angular momentum when the motion
is slightly perturbed such that $\omega$ is at a small angle $\alpha$ to the symmetry axis. Note that this assumes the
perpendicular axis theorem, equation 11.45, which states that for a thin lamina $I_1 + I_2 = I_3$ giving, for a
thin circular disk, $I_1 = I_2$ and thus $I_3 = 2I_1$.
Table 11.2: Precession and spin rates for torque-free axial rotation of symmetric rigid rotors
Rigid-body symmetric shape    Principal moment ratio $I_3/I_1$    Precession rate $\dot{\phi}$    Spin rate $\dot{\psi}$
Symmetric needle              0                                   0                               $\omega$
Sphere                        1                                   $\omega$                        0
Thin circular disk            2                                   $2\omega$                       $-\omega$
The precession angular velocity in the space frame ranges between 0 and $2\omega$ depending on whether the
body-fixed spin angular velocity is aligned or anti-aligned with the rotational frequency $\omega$. For an extreme
prolate spheroid $I_3/I_1 = 0$, the body-fixed spin angular velocity $\Omega = -\omega_3$ which cancels the angular velocity $\omega$
of the rotating frame resulting in a zero precession angular velocity of the body-fixed ê3 axis around the
space-fixed frame. The spin Ω = 0 in the body-fixed frame for the rigid sphere $I_3/I_1 = 1$ and thus the precession
rate of the body-fixed ê3 axis of the sphere around the space-fixed frame equals $\omega$. For an extreme oblate spheroid, that is,
a thin disk such as a frisbee, $I_3/I_1 = 2$ making the body-fixed precession angular velocity $\Omega = +\omega$ which adds
to the angular velocity $\omega$ and increases the precession rate up to $2\omega$ as seen in the space-fixed frame. This
illustrates that the spin angular velocity can add constructively or destructively with the angular velocity $\omega$.²
2 In his autobiography Surely You're Joking, Mr. Feynman!, Feynman wrote "I was in the [Cornell] cafeteria and some guy, fooling
around, throws a plate in the air. As the plate went up in the air I saw it wobble, and noticed that the red medallion of
Cornell on the plate going around. It was pretty obvious to me that the medallion went around faster than the wobbling. I
started to figure out the motion of the rotating plate. I discovered that when the angle is very slight, the medallion rotates
twice as fast as the wobble rate. It came out of a very complicated equation!". The quoted ratio (2 : 1) is incorrect; it should
be (1 : 2). Benjamin Chao in Physics Today of February 1989 speculated that Feynman's error in inverting the factor of
two might be "in keeping with the spirit of the author and the book, another practical joke meant for those who do physics
without experimenting". He pointed out that this story occurred on page 157 of a book of length 314 pages (1:2). Observe the
dependence of the ratio of wobble to rotation angular velocities on the tilt angle.
11.21. TORQUE-FREE ROTATION OF AN ASYMMETRIC RIGID ROTOR 321
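$$\begin{aligned}
I_1\dot{\omega}_1 &= (I_2 - I_3)\,\omega_2\omega_3 \\
I_2\dot{\omega}_2 &= (I_3 - I_1)\,\omega_3\omega_1 \\
I_3\dot{\omega}_3 &= (I_1 - I_2)\,\omega_1\omega_2
\end{aligned} \tag{11.156}$$
Since $L_i = I_i\omega_i$ for $i = 1, 2, 3$, then equation 11.156 gives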
The bracket is equivalent to $\frac{d}{dt}\left(L_1^2 + L_2^2 + L_3^2\right) = 0$ which implies
that the total rotational angular momentum is a constant of
motion as expected for this torque-free system, even though the
individual components $L_1$, $L_2$, $L_3$ may vary. That is
$$L_1^2 + L_2^2 + L_3^2 = L^2 \tag{11.159}$$
Note that equation 11.159 is the equation of a sphere of radius $L$.
Figure 11.6: Rotation of an asymmetric rigid rotor. The dark lines correspond to contours of constant total rotational kinetic energy T, which has an ellipsoidal shape, projected onto the angular momentum L sphere in the body-fixed frame.
Multiply the first equation of 11.157 by $L_1$, the second by $L_2$, and the third by $L_3$, and sum gives
Thus, for a given value of $L$, when $T = T_{min} = \frac{L^2}{2I_3}$
the orientation of L in the body-fixed frame is either
$(0, 0, +L)$ or $(0, 0, -L)$, that is, aligned with the ê3 axis along which the principal moment of inertia is largest.
For slightly higher kinetic energy the trajectory of $\mathbf{L}$ follows closed paths precessing around ê3. When the
kinetic energy $T = \frac{L^2}{2I_2}$ the angular momentum vector $\mathbf{L}$ follows either of the two thin-line trajectories, each
of which is a separatrix. These do not have closed orbits around ê2 and they separate the closed solutions
around either ê3 or ê1. For higher kinetic energy the precessing angular momentum vector follows closed
trajectories around ê1 and becomes fully aligned with ê1 at the upper-bound kinetic energy.
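The energy ranges described above can be expressed as a small classifier. The sketch assumes principal moments ordered I1 < I2 < I3 and a torque-free rotor, so that for a given L the kinetic energy lies between L²/(2 I3) and L²/(2 I1), with the separatrix at L²/(2 I2). The function names, the tolerance, and the returned labels are illustrative.

```python
def energy_bounds(L, I):
    """Return (T_min, T_separatrix, T_max) for angular momentum magnitude L.

    I = (I1, I2, I3) must be ordered I1 < I2 < I3.
    """
    I1, I2, I3 = I
    if not (I1 < I2 < I3):
        raise ValueError("expected principal moments ordered I1 < I2 < I3")
    return (L ** 2 / (2 * I3), L ** 2 / (2 * I2), L ** 2 / (2 * I1))

def classify_trajectory(L, T, I, tol=1e-12):
    """Describe the path traced by L in the body-fixed frame for energy T."""
    T_min, T_sep, T_max = energy_bounds(L, I)
    if T < T_min - tol or T > T_max + tol:
        raise ValueError("T is outside the range allowed for this L")
    if abs(T - T_sep) <= tol:
        return "separatrix"
    if T < T_sep:
        return "closed path around e3"
    return "closed path around e1"

I = (1.0, 2.0, 3.0)   # illustrative principal moments
L = 1.0
bounds = energy_bounds(L, I)
labels = [classify_trajectory(L, T, I) for T in (0.2, 0.25, 0.4)]
print(bounds, labels)
```

Energies just above the minimum give paths around the axis of largest moment, and energies just below the maximum give paths around the axis of smallest moment.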
Note that for the special case when $I_3 > I_2 = I_1$ then the asymmetric rigid rotor equals the symmetric
rigid rotor for which Euler's equations were solved exactly in chapter 11.19. For the symmetric
rigid rotor the $T$-ellipsoid becomes a spheroid aligned with the symmetry axis and thus the intersections
with the $L$-sphere lead to circular paths around the ê3 body-fixed principal axis, while the separatrix circles
the equator corresponding to the ê3 axis separating clockwise and anticlockwise precession about $L_3$. This
discussion shows that energy, plus angular momentum conservation, provide the general features of the
solution for the torque-free symmetric top that are in agreement with those derived using Euler's equations
of motion.
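$$\boldsymbol{\omega} = \omega_1\hat{\mathbf{e}}_1 \tag{11.163}$$
Consider that a small perturbation is applied causing the angular velocity vector to be
$$\boldsymbol{\omega} = \omega_1\hat{\mathbf{e}}_1 + \lambda\hat{\mathbf{e}}_2 + \mu\hat{\mathbf{e}}_3 \tag{11.164}$$
where $\lambda$ and $\mu$ are small. Euler's equations for torque-free motion then give
$$\begin{aligned}
(I_2 - I_3)\,\lambda\mu - I_1\dot{\omega}_1 &= 0 \\
(I_3 - I_1)\,\mu\,\omega_1 - I_2\dot{\lambda} &= 0 \\
(I_1 - I_2)\,\lambda\,\omega_1 - I_3\dot{\mu} &= 0
\end{aligned}$$
Assuming that the product $\lambda\mu$ in the first equation is negligible, then $\dot{\omega}_1 = 0$, that is, $\omega_1$ is constant.
The other two equations can be solved to give
$$\dot{\lambda} = \left(\frac{(I_3 - I_1)}{I_2}\,\omega_1\right)\mu \tag{11.165}$$
$$\dot{\mu} = \left(\frac{(I_1 - I_2)}{I_3}\,\omega_1\right)\lambda \tag{11.166}$$
Take the time derivative of the first equation
$$\ddot{\lambda} = \left(\frac{(I_3 - I_1)}{I_2}\,\omega_1\right)\dot{\mu} \tag{11.167}$$
and substitute for $\dot{\mu}$ gives
$$\ddot{\lambda} + \left(\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}\,\omega_1^2\right)\lambda = 0 \tag{11.168}$$
The solution of this equation is
$$\lambda(t) = A e^{i\Omega_1 t} + B e^{-i\Omega_1 t} \tag{11.169}$$
where
$$\Omega_1 = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}} \tag{11.170}$$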
Note that since it was assumed that $I_3 > I_2 > I_1$, $\Omega_1$ is real. The solution for $\lambda(t)$ therefore represents a
stable oscillatory motion with precession frequency $\Omega_1$. The identical result is obtained for $\mu(t)$, that is, $\Omega_{1\lambda} = \Omega_{1\mu} = \Omega_1$.
Thus the motion corresponds to a stable minimum about the ê1 axis with oscillations about the $\lambda = \mu = 0$
minimum with angular frequency
$$\Omega_1 = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}} \qquad (11.171)$$
Permuting the indices shows that perturbations applied to rotation about either the 2 or 3 axes give
precession frequencies
$$\Omega_2 = \omega_2\sqrt{\frac{(I_2 - I_1)(I_2 - I_3)}{I_1 I_3}} \qquad (11.172)$$
$$\Omega_3 = \omega_3\sqrt{\frac{(I_3 - I_2)(I_3 - I_1)}{I_1 I_2}} \qquad (11.173)$$
Since $I_3 > I_2 > I_1$, $\Omega_1$ and $\Omega_3$ are real while $\Omega_2$ is imaginary. Thus, whereas rotation about either
the 3 or the 1 axis is stable, the imaginary solution about ê2 corresponds to a perturbation increasing
with time. Thus, only rotation about the axes with the largest or smallest moment of inertia is stable. Moreover, for
the symmetric rigid rotor with $I_1 = I_2 \neq I_3$, stability exists only about the symmetry axis ê3, independent
of whether the body is prolate or oblate. This result was implied from the use of energy and angular
momentum conservation in chapter 11.20. Friction was not included in the above discussion. In the presence
of dissipative forces, such as friction or drag, only rotation about the principal axis corresponding to the
maximum moment of inertia is stable.
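The stability criterion can be checked numerically. The following is a minimal sketch, not part of the original text, that evaluates the three precession frequencies $\Omega_k = \omega_k\sqrt{(I_k - I_i)(I_k - I_j)/(I_i I_j)}$ for assumed principal moments and flags an axis as unstable when the frequency is imaginary. The function names and the sample moments (1, 2, 3) are illustrative assumptions.

```python
import cmath


def precession_frequencies(I1, I2, I3, omega=1.0):
    """Return (Omega1, Omega2, Omega3) as complex numbers.

    Each Omega_k is the small-perturbation frequency for steady rotation
    at angular velocity omega about principal axis k.
    """
    O1 = omega * cmath.sqrt((I1 - I3) * (I1 - I2) / (I2 * I3))
    O2 = omega * cmath.sqrt((I2 - I1) * (I2 - I3) / (I1 * I3))
    O3 = omega * cmath.sqrt((I3 - I2) * (I3 - I1) / (I1 * I2))
    return O1, O2, O3


def is_stable(Omega, tol=1e-12):
    """A real frequency means bounded oscillation; an imaginary one means growth."""
    return abs(Omega.imag) < tol


# Hypothetical moments obeying I3 > I2 > I1.
freqs = precession_frequencies(1.0, 2.0, 3.0)
stability = [is_stable(O) for O in freqs]
print(stability)  # [True, False, True]: only the intermediate axis is unstable
```

The complex square root is used so that the intermediate-axis case returns an imaginary number rather than raising an error.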
Stability of rigid-body rotation has broad applications to rotation of satellites, molecules and nuclei.
The first U.S. satellite, Explorer 1, was launched in 1958 with the rotation axis aligned with the cylindrical
axis which was the minimum principal moment of inertia. After a few hours the satellite started tumbling
with increasing amplitude due to a flexible antenna dissipating and transferring energy to the perpendicular
axis which had the largest moment of inertia. Torque-free motion of a deformed rigid body is a ubiquitous
phenomenon in many branches of science, engineering, and sports as illustrated by the following examples.
The imaginary precession frequency $\Omega_1$ about the 1 axis implies unstable rotation leading to tumbling,
whereas the minimum moment $I_{22}$ and maximum moment $I_{33}$ imply stable rotation about the 2 and 3 axes.
This rotational behavior is easily demonstrated by throwing a tennis racquet and is called the tennis racquet
theorem. The center of percussion, example 2.14, also is an important inertial property of a tennis racquet.
324 CHAPTER 11. RIGID-BODY ROTATION
and $P_{\lambda\mu}(\cos\theta)$ is an associated Legendre function of $\cos\theta$. Spherical harmonics are the angular portion of a
set of solutions to Laplace's equation. Represented in a system of spherical coordinates, Laplace's spherical
harmonics $Y_{\lambda\mu}(\theta, \phi)$ are a specific set of spherical harmonics that form an orthogonal system. Spherical
harmonics are important in many theoretical and practical applications.
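In the principal axis frame of the body, there are three non-zero quadrupole deformation parameters
which can be written in terms of the deformation parameters $\beta, \gamma$ where $\alpha_{20} = \beta\cos\gamma$, $\alpha_{21} = \alpha_{2-1} = 0$ and
$\alpha_{22} = \alpha_{2-2} = \frac{1}{\sqrt{2}}\beta\sin\gamma$. Using these in equations () gives the three semi-axis dimensions in the principal
axis frame (primed frame),
$$\delta R_k = \sqrt{\frac{5}{4\pi}}\,\beta R_0 \cos\left(\gamma - \frac{2\pi k}{3}\right) \qquad ()$$
Note that for $\gamma = 0$, then $\delta R_1 = \delta R_2 = -\frac{1}{2}\sqrt{\frac{5}{4\pi}}\,\beta R_0$ while $\delta R_3 = +\sqrt{\frac{5}{4\pi}}\,\beta R_0$, that is, the body has prolate
deformation with the symmetry axis along the 3 axis. The same prolate shape is obtained for $\gamma = \frac{2\pi}{3}$ and
$\gamma = \frac{4\pi}{3}$ with the prolate symmetry axes along the 1 and 2 axes respectively. For $\gamma = \frac{\pi}{3}$ then
$\delta R_1 = \delta R_3 = +\frac{1}{2}\sqrt{\frac{5}{4\pi}}\,\beta R_0$ while $\delta R_2 = -\sqrt{\frac{5}{4\pi}}\,\beta R_0$, that is, the body has oblate deformation with the symmetry axis along
the 2 axis. The same oblate shape is obtained for $\gamma = \pi$ and $\gamma = \frac{5\pi}{3}$ with the oblate symmetry axes along
the 3 and 1 axes respectively. For other values of $\gamma$ the shape is ellipsoidal.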
For the asymmetric deformed rigid body, the rotational Hamiltonian can be expressed in the form [Dav58]
$$H = \sum_{k=1}^{3} \frac{|R_k|^2}{4B\beta^2\sin^2\left(\gamma' - \frac{2\pi k}{3}\right)}$$
where the rotational angular momentum is $\mathbf{R}$. The principal moments of inertia are related by the triaxiality
parameter $\gamma'$ which they assumed is identical to the shape parameter $\gamma$. For axial symmetry the moment of
inertia about the symmetry axis is taken to be zero for a quantal system since rotation of the potential well
about the symmetry axis corresponds to no change in the potential well, or corresponding rotation of the bound
nucleons. That is, the nucleus is not a rigid body, the nucleons only rotate to the extent that the ellipsoidal
potential well is cranked around such that the nucleons must follow the rotation of the potential well. In
addition, vibrational modes coexist about the average asymmetric deformation, plus octupole deformation
often coexists with the above quadrupole deformed modes.
11.23. SYMMETRIC RIGID ROTOR SUBJECT TO TORQUE ABOUT A FIXED POINT 325
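$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 + Mgh\cos\theta \qquad (11.183)$$
will be a constant of motion. But the middle term on the right-hand
side also is a constant of motion
$$\frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \frac{1}{2}I_3\omega_3^2 = \text{constant} \qquad (11.184)$$
The effective potential $U(\theta)$ is shown in figure 11.8. It is clear that the motion of a symmetric top with
effective energy $E'$ is confined to angles $\theta_1 < \theta < \theta_2$.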
Note that the above result also is obtained if the Routhian is used, rather than the Lagrangian, as
mentioned in chapter 8.7, and defined by equation (8.65). That is, the Routhian can be written as
The Routhian $R_{cyclic}(\theta, \dot{\theta}, p_\phi, p_\psi)$ acts like a Hamiltonian for the $(\phi, p_\phi)$ and $(\psi, p_\psi)$ variables which are
constants of motion, and thus are ignorable variables. The Routhian acts as the negative Lagrangian for the
remaining variable $\theta$ with rotational kinetic energy $\frac{1}{2}I_1\dot{\theta}^2$ and effective potential energy
$$U_{eff} = \frac{(p_\phi - p_\psi\cos\theta)^2}{2I_1\sin^2\theta} + \frac{p_\psi^2}{2I_3} + Mgh\cos\theta = U(\theta) + \frac{p_\psi^2}{2I_3}$$
The equation of motion describing the system in the rotating frame is given by one Lagrange equation
$$\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial\dot{\theta}}\right) - \frac{\partial R_{cyclic}}{\partial\theta} = 0$$
The negative sign of the Routhian cancels out when used in the Lagrange equation. Thus, in the rotating
frame of reference, the system is reduced to a single degree of freedom, the nutation angle $\theta$, with effective
energy $E'$ given by equations 11.186 − 11.188.
Figure 11.9: Nutational motion of the body-fixed symmetry axis projected onto the space-fixed unit sphere.
The three cases are (a) $\dot{\phi}$ never vanishes, (b) $\dot{\phi} = 0$ at $\theta = \theta_2$, (c) $\dot{\phi}$ changes sign between $\theta_1$ and $\theta_2$.
The motion of the symmetric top is simplest at the minimum value of the effective potential curve, where
$E' = U_{min}$, at which the nutation is restricted to a single value $\theta = \theta_0$. The motion is a steady precession
at a fixed angle of inclination, that is, the "sleeping top". Solving for $\left(\frac{dU}{d\theta}\right)_{\theta=\theta_0} = 0$ gives that
$$p_\phi - p_\psi\cos\theta_0 = \frac{p_\psi\sin^2\theta_0}{2\cos\theta_0}\left[1 \pm \sqrt{1 - \frac{4MghI_1\cos\theta_0}{p_\psi^2}}\right] \qquad (11.190)$$
If $\theta_0 < \frac{\pi}{2}$ then to ensure that the solution is real requires a minimum value of the angular momentum on the
body-fixed axis of $p_\psi^2 \geq 4MghI_1\cos\theta_0$. If $\theta_0 > \frac{\pi}{2}$ then there is no minimum angular momentum projection
on the body-fixed axis. There are two possible solutions to the quadratic relation corresponding to either a
slow or fast precessional frequency. Usually the slow precession is observed.
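As an illustration of the two roots, the sketch below evaluates the steady precession rates implied by the quadratic relation, using $\dot{\phi} = (p_\phi - p_\psi\cos\theta_0)/(I_1\sin^2\theta_0)$ so that $\dot{\phi} = \frac{p_\psi}{2I_1\cos\theta_0}\left[1 \pm \sqrt{1 - 4MghI_1\cos\theta_0/p_\psi^2}\right]$. It is a minimal sketch, not part of the original text: the function name, the argument `Mgh` (the product of mass, gravitational acceleration, and center-of-mass distance from the fixed point), and the sample numbers are assumptions, and it assumes $0 < \theta_0 < \pi/2$.

```python
import math


def steady_precession_rates(p_psi, I1, Mgh, theta0):
    """Slow and fast steady precession rates at inclination theta0.

    Returns None when the spin angular momentum p_psi is too small for
    steady precession, that is, when p_psi**2 < 4*Mgh*I1*cos(theta0).
    Assumes 0 < theta0 < pi/2 so that cos(theta0) > 0.
    """
    disc = 1.0 - 4.0 * Mgh * I1 * math.cos(theta0) / p_psi**2
    if disc < 0.0:
        return None
    prefactor = p_psi / (2.0 * I1 * math.cos(theta0))
    slow = prefactor * (1.0 - math.sqrt(disc))
    fast = prefactor * (1.0 + math.sqrt(disc))
    return slow, fast


# Hypothetical fast top: p_psi = 10, I1 = 1, Mgh = 1, tilted 60 degrees.
rates = steady_precession_rates(10.0, 1.0, 1.0, math.pi / 3)
print(rates)
```

For a rapidly spinning top the slow root approaches the familiar gyroscopic estimate $Mgh/p_\psi$, which is consistent with the remark that the slow precession is the one usually observed.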
For the general case, where $E' > U_{min}$, the nutation angle $\theta$ between the space-fixed and body-fixed 3
axes varies in the range $\theta_1 < \theta < \theta_2$. This axis exhibits a nodding variation which is called nutation. Figure
11.9 shows the projection of the body-fixed symmetry axis on the unit sphere in the space-fixed frame. Note
that the observed nutation behavior depends on the relative sizes of $p_\phi$ and $p_\psi\cos\theta$. For certain values, the
precession $\dot{\phi}$ changes sign between the two limiting values of $\theta$, producing a looping motion as shown in figure
11.9. Another condition is where the precession is zero at $\theta_2$, producing a cusp at $\theta_2$ as illustrated in figure
11.9. This behavior can be demonstrated using the gyroscope or the symmetric top.
6
10̇ 1 − 6 2 3 = sin sin (a)
6
10̇ 2 − 6 1 3 = sin cos (b)
4̇ 3 = 0 (c)
Equation () relates the spin about the 3 axis, the precession, and the angle to the vertical that is
The bracket must be positive to have stable sinusoidal oscillations. That is, the spin angular velocity
required for the jack to spin about a stable vertical axis is given by.
3Ω 3
+
2 2Ω
This illustrates the conditions required for stable rotation of any axially-symmetric top.
of degrees of freedom from 5 to 2, namely $\theta$ which is the tilt angle, and $\phi'$ which is the orientation of the
tilt. This Routhian is a Lagrangian in two dimensions that was used to derive the equations of motion
via the Lagrange-Euler equations
$$\frac{d}{dt}\left(\frac{\partial R}{\partial\dot{\theta}}\right) - \frac{\partial R}{\partial\theta} = Q_\theta$$
$$\frac{d}{dt}\left(\frac{\partial R}{\partial\dot{\phi}'}\right) - \frac{\partial R}{\partial\phi'} = Q_{\phi'}$$
where the $Q$ are generalized torques about the 2 angles that take into account the sliding frictional
forces. This sophisticated Routhian reduction approach provides an exhaustive and refined solution for the
Tippe Top and confirms that sliding friction plays a key role in the unusual behavior of the Tippe Top.
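$$\omega_1 = \dot{\theta} \qquad (11.191)$$
$$\omega_2 = \dot{\phi}\sin\theta$$
$$\omega_3 = \dot{\phi}\cos\theta$$
The frame fixed in the rotating wheel must include the additional angular velocity of the disk $\dot{\psi}$ about
the ê3 axis, that is
$$\Omega_1 = \omega_1 = \dot{\theta} \qquad (11.192)$$
$$\Omega_2 = \omega_2 = \dot{\phi}\sin\theta$$
$$\Omega_3 = \omega_3 + \dot{\psi} = \dot{\phi}\cos\theta + \dot{\psi}$$
where $\boldsymbol{\Omega}$ designates the angular velocity of the rotating disk, while $\boldsymbol{\omega}$ designates the rotation of the moving
frame (1, 2, 3).
For a thin disk the moments of inertia are related by the perpendicular axis theorem (chapter 11.9)
$$I_1 + I_2 = I_3$$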
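This leads to the following relations for the three components in the moving frame
$$F_1 = M\left(\dot{v}_1 + \omega_2 v_3 - \omega_3 v_2\right) \qquad (11.194)$$
$$F_2 - Mg\sin\theta = M\left(\dot{v}_2 + \omega_3 v_1 - \omega_1 v_3\right)$$
$$F_3 - Mg\cos\theta = M\left(\dot{v}_3 + \omega_1 v_2 - \omega_2 v_1\right)$$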
Figure 11.10: Uniform disk rolling on a horizontal plane. The space-fixed axis system is (x, y, z), while
the moving reference frame (1, 2, 3) is centered at the center of mass of the disk with the 1, 2 axes in the
plane of the disk. The disk is rotating with a uniform angular velocity $\dot{\psi}$ about the 3 axis and rolling in the
direction that is an angle $\phi$ relative to the $x$ axis.
Equations 11.199 are non-linear, and a closed-form solution is possible only for limited cases such as when
$\theta = 90^\circ$.
Note that the above equations of motion also can be derived using Lagrangian mechanics knowing that
$$L = \frac{1}{2}M\left(v_1^2 + v_2^2 + v_3^2\right) + \frac{1}{2}I_1\left(\Omega_1^2 + \Omega_2^2\right) + \frac{1}{2}I_3\Omega_3^2 - MgR\cos\theta$$
The differential equations of constraint can be derived from equations 11.197 to be
$$\dot{x} - R\dot{\psi}\cos\phi = 0$$
$$\dot{y} - R\dot{\psi}\sin\phi = 0$$
Use of generalized forces plus the Lagrange-Euler equations (6.47) can be used to derive the equations of
motion and solve for the components of the constraint force $F_1$, $F_2$ and $F_3$.
to gyroscopic effects. Excellent articles on this subject have been written by D.E.H. Jones Physics Today 23(4) (1970) 34, and
also by J. Lowell & H.D. McKell, American Journal of Physics 50 (1982) 1106.
The parallel-axis theorem relates the moment of inertia with respect to the pivot point and center of mass
The angular velocities of the center of mass, and about the center of mass, are identical since the pivot point
is fixed, that is
= =
Thus the angular momentum about the pivot point is given by the sum of the angular momenta
That is, the angular momentum is the sum of the angular momentum of the body about the center of mass
plus the angular momentum of the center of mass about the pivot point. This is an example of Chasles'
theorem.
The kinetic energy is given only by the rotational energy since the pivot point is stationary
$$T = \frac{1}{2}I_{pivot}\omega^2 = \frac{1}{2}MR^2\omega^2 + \frac{1}{2}I_{cm}\omega^2 = \frac{1}{2}Mv^2 + \frac{1}{2}I_{cm}\omega^2$$
That is, it equals the kinetic energy of rotation about the center of mass plus the instantaneous kinetic energy
for translation of the center of mass in agreement with Chasles' theorem. Thus for pivoting the angular
momentum and kinetic energy are the same if evaluated using either center of mass coordinates or using the
pivot point as the reference point.
That is, the angular momentum only includes the angular momentum about the center of mass which is
smaller than the angular momentum for the same body pivoting about a point on the periphery of the cylinder.
The kinetic energy is given by
1 1 1 1
= 2 + 2 = 2 + 2
2 2 2 2
Thus the angular momentum is significantly smaller for rolling relative to pivoting of a given body, whereas
the kinetic energy is the same for both rolling or pivoting of a given body.
11.25. DYNAMIC BALANCING OF WHEELS 333
$$N_1 = N_3 = 0$$
$$N_2 = -\frac{1}{4}MR^2\sin\theta\cos\theta\,\omega^2$$
That is, the torque is in the $\hat{\mathbf{e}}_2$ direction. Thus the forces on the bearings can be calculated since $\mathbf{N} = \mathbf{r} \times \mathbf{F}$,
thus
$$|F| = \frac{|N_2|}{2d} = MR^2\omega^2\frac{\sin 2\theta}{16d}$$
Estimate the size of these forces for the front wheel of your car travelling at 70 m.p.h. if the rotation axis is
displaced by 2° from the symmetry axis of the wheel.
Figure 11.11: Forward two-and-a-half somersaults with two twists demonstrates unequivocally that a diver
can initiate continuous twisting in midair. In the illustrated maneuver the diver does more than one full
somersault before he starts to twist. To maintain the twisting the diver does not have to move his legs.[Fro80]
11.27 Summary
This chapter has introduced the important topic of rigid-body rotation which has many applications in
physics, engineering, sports, etc.
Inertia tensor The concept of the inertia tensor was introduced where the 9 components of the inertia
tensor are given by
$$I_{ij} = \int \rho(\mathbf{r}')\left(\delta_{ij}\left(\sum_{k}^{3} x_k^2\right) - x_i x_j\right) dV \qquad (11.14)$$
Steiner's parallel-axis theorem
$$J_{11} \equiv I_{11} + M\left(\left(a_1^2 + a_2^2 + a_3^2\right)\delta_{11} - a_1^2\right) = I_{11} + M\left(a_2^2 + a_3^2\right) \qquad (11.43)$$
relates the inertia tensor about the center-of-mass to that about a parallel axis system not through the center
of mass.
Diagonalization of the inertia tensor about any point was used to find the corresponding principal axes
of the rigid body.
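Angular momentum The angular momentum L for rigid-body rotation is expressed in terms of the
inertia tensor and angular frequency by
$$\mathbf{L} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \qquad (11.56)$$
Euler angles The Euler angles relate the space-fixed and body-fixed principal axes. The angular velocity
$\boldsymbol{\omega}$ expressed in terms of the Euler angles has components for the angular velocity in the body-fixed axis system
(1, 2, 3)
$$\omega_1 = \dot{\phi}_1 + \dot{\theta}_1 + \dot{\psi}_1 = \dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi \qquad (11.86)$$
$$\omega_2 = \dot{\phi}_2 + \dot{\theta}_2 + \dot{\psi}_2 = \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi \qquad (11.87)$$
$$\omega_3 = \dot{\phi}_3 + \dot{\theta}_3 + \dot{\psi}_3 = \dot{\phi}\cos\theta + \dot{\psi} \qquad (11.88)$$
Similarly, the components of the angular velocity for the space-fixed axis system (x, y, z) are
$$\omega_x = \dot{\theta}\cos\phi + \dot{\psi}\sin\theta\sin\phi \qquad (11.89)$$
$$\omega_y = \dot{\theta}\sin\phi - \dot{\psi}\sin\theta\cos\phi \qquad (11.90)$$
$$\omega_z = \dot{\phi} + \dot{\psi}\cos\theta \qquad (11.91)$$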
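The two sets of components describe the same vector, so its magnitude must agree in both frames. The sketch below, which is not part of the original text, codes the body-fixed components $\omega_1 = \dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi$, $\omega_2 = \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi$, $\omega_3 = \dot{\phi}\cos\theta + \dot{\psi}$ and the space-fixed components $\omega_x = \dot{\theta}\cos\phi + \dot{\psi}\sin\theta\sin\phi$, $\omega_y = \dot{\theta}\sin\phi - \dot{\psi}\sin\theta\cos\phi$, $\omega_z = \dot{\phi} + \dot{\psi}\cos\theta$, then compares $|\boldsymbol{\omega}|^2$. The function names and sample angles are illustrative assumptions.

```python
import math


def omega_body(phi, theta, psi, phi_dot, theta_dot, psi_dot):
    """Angular velocity components along the body-fixed (1, 2, 3) axes."""
    w1 = phi_dot * math.sin(theta) * math.sin(psi) + theta_dot * math.cos(psi)
    w2 = phi_dot * math.sin(theta) * math.cos(psi) - theta_dot * math.sin(psi)
    w3 = phi_dot * math.cos(theta) + psi_dot
    return (w1, w2, w3)


def omega_space(phi, theta, psi, phi_dot, theta_dot, psi_dot):
    """Angular velocity components along the space-fixed (x, y, z) axes."""
    wx = theta_dot * math.cos(phi) + psi_dot * math.sin(theta) * math.sin(phi)
    wy = theta_dot * math.sin(phi) - psi_dot * math.sin(theta) * math.cos(phi)
    wz = phi_dot + psi_dot * math.cos(theta)
    return (wx, wy, wz)


def norm_squared(vec):
    """Squared magnitude of a 3-component tuple."""
    return sum(c * c for c in vec)


# Arbitrary sample angles (rad) and rates (rad/s).
state = (0.3, 1.1, -0.7, 0.5, -0.2, 2.0)
print(norm_squared(omega_body(*state)), norm_squared(omega_space(*state)))
```

Both expressions reduce analytically to $\dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta$, so the two printed numbers should agree to rounding error.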
Rotational invariants The powerful concept of the rotational invariance of scalar properties was intro-
duced. Important examples of rotational invariants are the Hamiltonian, Lagrangian, and Routhian.
Euler equations of motion for rigid-body motion The dynamics of rigid-body rotational motion was
explored and the Euler equations of motion were derived using both Newtonian and Lagrangian mechanics.
$$N_1 = I_1\dot{\omega}_1 - (I_2 - I_3)\omega_2\omega_3 \qquad (11.103)$$
$$N_2 = I_2\dot{\omega}_2 - (I_3 - I_1)\omega_3\omega_1$$
$$N_3 = I_3\dot{\omega}_3 - (I_1 - I_2)\omega_1\omega_2$$
Lagrange equations of motion for rigid-body motion The Euler equations of motion for rigid-body
motion, given in equation 11.103, were derived using the Lagrange-Euler equations.
Torque-free motion of rigid bodies The Euler equations and Lagrangian mechanics were used to study
torque-free rotation of both symmetric and asymmetric bodies including discussion of the stability of torque-
free rotation.
Rotating symmetric body subject to a torque The complicated motion exhibited by a symmetric top,
that is spinning about one fixed point and subject to a torque, was introduced and solved using Lagrangian
mechanics.
The rolling wheel The non-holonomic motion of rolling wheels was introduced, as well as the importance
of static and dynamic balancing of rotating machinery.
Rotation of deformable bodies The complicated non-holonomic motion involving rotation of deformable
bodies was introduced.
Workshop exercises
1. Three objects are described below. Break up into three groups, one group per object, and determine the inertia
tensor.
• A very thin sheet with a mass density = where is a positive constant. The sheet lies in the
plane and its sides are both of length .
• An inclined-plane shaped block of mass is oriented with one corner at the origin as shown.
• An equilateral triangle made up of three thin rods of length and uniform mass density .
(a) For the first object (the thin sheet), determine the principal moments of inertia.
(b) For the second object (the inclined plane), determine the principal axes.
(c) For the third object (the equilateral triangle), determine the products of inertia.
(a) Calculate the inertia tensor for a set of coordinates whose origin is at the center of mass of the shell.
(b) Now suppose that the shell is rolling without slipping toward a step of height , where . The shell
has a linear velocity . What is the angular momentum of the shell relative to the tip of the step?
(c) The shell now strikes the tip of the step inelastically (so that the point of contact sticks to the step,
but the shell can still rotate about the tip of the step). What is the angular momentum of the shell
immediately after contact?
(d) Finally, find the minimum velocity which enables the shell to surmount the step. Express your result in
terms of and .
5. The vectors $\hat{x}$, $\hat{y}$, and $\hat{z}$ constitute a set of orthogonal right-handed axes. The vectors $\hat{x} + \hat{y} - 2\hat{z}$, $-\hat{x} + \hat{y}$, and
$\hat{x} + \hat{y} + \hat{z}$ are also perpendicular to one another.
(a) Write out the set of direction cosines relating the new axes to the old.
(b) How are the Eulerian angles defined? Describe this transformation by a set of Eulerian angles.
338 CHAPTER 11. RIGID-BODY ROTATION
6. A torsional pendulum consists of a vertical wire attached to a mass which can rotate about the vertical axis.
Consider three torsional pendula which consist of identical wires from which identical homogeneous solid cubes
are hung. One cube is hung from a corner, one from midway along an edge, and one from the middle of a face
as shown. What are the ratios of the periods of the three pendula?
7. A dumbbell comprises two equal point masses connected by a massless rigid rod of length 2 which is
constrained to rotate about an axle fixed to the center of the rod at an angle as shown in the figure. The
center of the rod is at the origin of the coordinates, the axle along the -axis, and the dumbbell lies in the
− plane at = 0. The angular velocity is a constant in time and is directed along the axis.
a) Calculate all elements of the inertia tensor. Be sure to specify the coordinate system used.
b) Using the calculated inertia tensor find the angular momentum of the dumbbell in the laboratory frame as
a function of time.
c) Using the equation $\mathbf{L} = \mathbf{r} \times \mathbf{p}$, calculate the angular momentum and show that it is equal to the answer
of part (b).
d) Calculate the torque on the axle as a function of time.
e) Calculate the kinetic energy of the dumbbell.
8. A heavy symmetric top has a mass with the center of mass a distance from the fixed point about which
it spins and $I_1 = I_2 \neq I_3$. The top is precessing at a steady angular velocity Ω about the vertical space-fixed
axis. What is the minimum spin 0 about the body-fixed symmetry axis, that is, the 3 axis assuming that
the 3 axis is inclined at an angle = with respect to the vertical axis. Solve the problem at the instant
when the 3 1 axes all are in the same plane as shown in the figure.
9. Consider an object with the center of mass at the origin and inertia tensor,
$$\mathbf{I} = \begin{pmatrix} 1/2 & -1/2 & 0 \\ -1/2 & 1/2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
(a) Determine the principal moments of inertia and the principal axes. Guess the object.
(b) Determine the rotation matrix and compute † . Do the diagonal elements match with your results
from (a)? Note: columns of are eigenvectors of .
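(c) Assume $\boldsymbol{\omega} = \frac{\omega}{\sqrt{2}}(\hat{x} + \hat{y})$. Determine $\mathbf{L}$ in the rotating coordinate system. Are $\mathbf{L}$ and $\boldsymbol{\omega}$ in the same
direction? What does this mean?
(d) Repeat (c) for $\boldsymbol{\omega} = \frac{\omega}{\sqrt{2}}(\hat{x} - \hat{y})$. What is different and why?
(e) For which case will there be a non-zero torque required?
(f) Determine the rotational kinetic energy for the case $\boldsymbol{\omega} = \frac{\omega}{\sqrt{2}}(\hat{x} - \hat{y})$.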
10. Consider a wheel (solid disk) of mass and radius . The wheel is subject to angular velocities = ̂
where ̂ is normal to the surface and = ̂ .
11. Determine the principal moments of inertia of an ellipsoid given by the equation,
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$
12. Determine the principal moments of inertia of a sphere of radius with a cavity of radius located from the
center of the sphere.
13. Three equal masses form the vertices of an equilateral triangle of side length $a$. The masses are located at
$\left(0, 0, \frac{a}{\sqrt{3}}\right)$, $\left(0, \frac{a}{2}, -\frac{a}{2\sqrt{3}}\right)$, and $\left(0, -\frac{a}{2}, -\frac{a}{2\sqrt{3}}\right)$, such that the center-of-mass is located at the origin.
Problems
1. Calculate the moments of inertia $I_1$, $I_2$, $I_3$ for a homogeneous cone of mass $M$ whose height is $h$ and whose
base has a radius $R$. Choose the $x_3$-axis along the symmetry axis of the cone.
a) Choose the origin at the apex of the cone, and calculate the elements of the inertia tensor.
b) Make a transformation such that the center of mass of the cone is the origin and find the principal moments
of inertia.
2. Four masses, all of mass $m$, lie in the $x$-$y$ plane at positions $(x, y) = (a, 0)$, $(-a, 0)$, $(0, +2a)$, $(0, -2a)$.
These are joined by massless rods to form a rigid body.
(a) Find the inertia tensor, using the $x$, $y$, $z$ axes as a reference system. Exhibit the tensor as a matrix.
(b) Consider a direction given by the unit vector $\hat{n}$ that lies equally between the positive $x$, $y$, $z$ axes; that is,
it makes equal angles with these three directions. Find the moment of inertia for rotation about this $\hat{n}$ axis.
(c) Given that at a certain time the angular velocity vector lies along the above direction $\hat{n}$, find, for that
instant, the angle between the angular momentum vector and $\hat{n}$.
3. A homogeneous cube, each edge of which has a length $b$, initially is in a position of unstable equilibrium with
one edge of the cube in contact with a horizontal plane. The cube then is given a small displacement causing
it to tip over and fall. Show that the angular velocity of the cube when one face strikes the plane is given by
$$\omega^2 = A\frac{g}{b}\left(\sqrt{2} - 1\right)$$
where $A = \frac{3}{2}$ if the edge cannot slide on the plane, and where $A = \frac{12}{5}$ if sliding can occur without friction.
4. A symmetric body moves without the influence of forces or torques. Let $x_3$ be the symmetry axis of the body
and $\mathbf{L}$ be along $x_3'$. The angle between $\boldsymbol{\omega}$ and $x_3$ is $\alpha$. Let $\boldsymbol{\omega}$ and $\mathbf{L}$ initially be in the $x_2$-$x_3$ plane. What is
the angular velocity of the symmetry axis about $\mathbf{L}$ in terms of $I_1$, $I_3$, $\omega$ and $\alpha$?
5. Consider a thin rectangular plate with dimensions $a$ by $b$ and mass $M$. Determine the torque necessary to
rotate the thin plate with angular velocity $\omega$ about a diagonal. Explain the physical behavior for the case when
$a = b$.
Chapter 12
12.1 Introduction
Chapter 3 discussed the behavior of a single linearly-damped linear oscillator subject to a harmonic force.
No account was taken for the influence of the single oscillator on the driver for the case of forced oscillations.
Many systems in nature comprise complicated free or forced oscillations of coupled-oscillator systems.
Examples of coupled oscillators are: automobile suspension systems, electronic circuits, electromagnetic fields,
musical instruments, atoms bound in a crystal, neural circuits in the brain, networks of pacemaker cells in
the heart, etc. Energy can be transferred back and forth between coupled oscillators as the motion evolves.
However, it is possible to describe the motion of coupled linear oscillators in terms of a sum over independent
normal coordinates, i.e. normal modes, even though the motion may be very complicated. These normal
modes are constructed from the original coordinates in such a way that the normal modes are uncoupled.
The topic of finding the normal modes of coupled oscillator systems is a ubiquitous problem encountered in
all branches of science and engineering. As discussed in chapter 4 oscillatory motion of non-linear systems
can be complicated. Fortunately most oscillatory systems are approximately linear when the amplitude of
oscillation is small. This discussion assumes that the oscillation amplitudes are sufficiently small to ensure
linearity.
342 CHAPTER 12. COUPLED LINEAR OSCILLATORS
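$$\eta_1 \equiv x_1 - x_2 \qquad (12.12)$$
$$\eta_2 \equiv x_1 + x_2$$
that is
$$x_1 = \frac{1}{2}(\eta_2 + \eta_1) \qquad (12.13)$$
$$x_2 = \frac{1}{2}(\eta_2 - \eta_1)$$
Substituting these into the equations of motion (12.1) gives
$$m\left(\ddot{\eta}_1 + \ddot{\eta}_2\right) + (\kappa + 2\kappa')\eta_1 + \kappa\eta_2 = 0 \qquad (12.14)$$
$$m\left(\ddot{\eta}_1 - \ddot{\eta}_2\right) + (\kappa + 2\kappa')\eta_1 - \kappa\eta_2 = 0$$
Adding and subtracting these two equations gives
$$m\ddot{\eta}_1 + (\kappa + 2\kappa')\eta_1 = 0 \qquad (12.15)$$
$$m\ddot{\eta}_2 + \kappa\eta_2 = 0$$
Figure 12.3: Motion of two coupled harmonic oscillators in the $(x_1, x_2)$ spatial configuration space and in terms of the
normal modes $(\eta_1, \eta_2)$. The initial conditions are a non-zero displacement $x_2$ with $x_1 = \dot{x}_1 = \dot{x}_2 = 0$.
Note that the two coordinates $\eta_1$ and $\eta_2$ are uncoupled and therefore independent. The solutions of these equations are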
where $\eta_1$ corresponds to angular frequencies $\omega_1$, and $\eta_2$ corresponds to $\omega_2$. The two coordinates $\eta_1$ and $\eta_2$ are
called the normal coordinates and the two solutions are the normal modes with corresponding
angular frequencies, $\omega_1$ and $\omega_2$.
The $(\eta_1, \eta_2)$ axes of the two normal modes correspond to a rotation of 45° in configuration space, figure 12.3. The initial
conditions chosen correspond to $\eta_1 = -\eta_2$ and thus both modes are excited with equal intensity. Note that there are 5 lobes along
the $\eta_2$ axis versus 4 lobes along the $\eta_1$ axis reflecting the ratio of the eigenfrequencies $\omega_1$ and $\omega_2$. Also note that the diamond
shape of the motion in the $(x_1, x_2)$ configuration space illustrates that the extreme amplitudes for $x_2$ are a maximum when $x_1$ is
zero, and vice versa. This is equivalent to the statement that the energies in the two modes are coupled with the energy for
the first oscillator being a maximum when the energy is a minimum for the second oscillator, and vice versa. By contrast, in
the $(\eta_1, \eta_2)$ configuration space, the motion is bounded by a rectangle parallel to the $(\eta_1, \eta_2)$ axes reflecting the fact that the
extreme amplitudes, and corresponding energies, for the $\eta_1$ normal mode are constant and independent of the motion for the $\eta_2$
normal mode, and vice versa. The decoupling of the two normal modes is best illustrated by considering the case when only one
of these two normal modes is excited. For the initial conditions $x_1(0) = -x_2(0)$ and $\dot{x}_1(0) = -\dot{x}_2(0)$ then $\eta_2(t) = 0$. That is,
only the $\eta_1(t)$ normal mode is excited with frequency $\omega_1$, which corresponds to motion confined to the $\eta_1$ axis of figure 12.3.
Figure 12.4: Normal modes for two coupled oscillators: the antisymmetric mode $\eta_1$ (out of phase) and the symmetric mode $\eta_2$ (in phase).
As shown in figure 12.4, $\eta_1(t)$ is the antisymmetric mode in which the two masses oscillate out of phase
such as to keep the center of mass of the two masses stationary. For the initial conditions $x_1(0) = x_2(0)$
and $\dot{x}_1(0) = \dot{x}_2(0)$ then $\eta_1(t) = 0$, that is, only the $\eta_2(t)$ normal mode is excited. The $\eta_2(t)$ normal mode
is the symmetric mode where the two masses oscillate in phase with frequency $\omega_2$; it corresponds to motion
along the $\eta_2$ axis. For the symmetric mode, both masses move together leading to a constant extension of
the coupling spring. As a result the frequency $\omega_2$ of the symmetric mode $\eta_2(t)$ is lower than the frequency
$\omega_1$ of the antisymmetric mode $\eta_1(t)$. That is, the antisymmetric mode is stiffer since all three springs provide
active restoring forces, compared to the symmetric mode where the coupling spring is uncompressed. In
general, for attractive forces the lowest frequency always occurs for the mode with the highest symmetry.
2 = + 1 + + 0 + 2 = 2 + 0 + 2
= ( + 0 + 2 ) − ( + 1 ) = 0 − 1
1 = 0 − (12.17)
0
2 = 2 − 2 −
The $\eta_1$ mode, which has angular frequency $\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$, corresponds to an oscillation of the relative
separation, while the center-of-mass location is stationary. By contrast, the $\eta_2$ mode, with angular
frequency $\omega_2 = \sqrt{\frac{\kappa}{m}}$, corresponds to an oscillation of the center of mass with the relative separation
being a constant.
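These two frequencies can be reproduced from the coupled equations of motion, assumed here to have the form $m\ddot{x}_1 = -(\kappa + \kappa')x_1 + \kappa' x_2$ and the same with the indices exchanged. The sketch below is not part of the original text: the function name, the argument `kappa_c` (standing for the coupling constant $\kappa'$), and the sample values are illustrative assumptions.

```python
import math


def normal_mode_frequencies(m, kappa, kappa_c):
    """Angular frequencies (omega1, omega2) of two coupled equal oscillators.

    m       : mass of each oscillator
    kappa   : spring constant tying each mass to its wall
    kappa_c : coupling spring constant (kappa' in the text)
    """
    a = (kappa + kappa_c) / m  # diagonal element of the dynamical matrix
    b = -kappa_c / m           # off-diagonal element
    # The symmetric matrix [[a, b], [b, a]] has eigenvalue a - b for the
    # eigenvector (1, -1), i.e. eta1 = x1 - x2, and eigenvalue a + b for
    # the eigenvector (1, 1), i.e. eta2 = x1 + x2.
    omega1 = math.sqrt(a - b)  # antisymmetric mode: sqrt((kappa + 2 kappa_c)/m)
    omega2 = math.sqrt(a + b)  # symmetric mode: sqrt(kappa/m)
    return omega1, omega2


# Hypothetical values chosen so that the frequencies come out as integers.
print(normal_mode_frequencies(2.0, 8.0, 5.0))  # (3.0, 2.0)
```

The antisymmetric mode is always the higher of the two because the coupling spring is stretched in that mode and left undisturbed in the symmetric one.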
Figure 12.5 illustrates the decoupled center-of-mass, and relative motions for both normal modes of
the coupled double-oscillator system. The difference in
while
$$\omega_2 = \sqrt{\frac{\kappa}{m}} \approx \omega_0(1 - \varepsilon) \qquad (12.26)$$
That is, the two solutions are split, equally spaced about the single uncoupled oscillator value given by
$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} \approx \sqrt{\frac{\kappa}{m}}(1 + \varepsilon)$. Note that the single uncoupled oscillator frequency
$\omega_0$ depends on the coupling strength $\kappa'$.
This splitting of the characteristic frequencies is a feature exhibited by many systems of $n$ identical oscillators where
half of the frequencies are shifted upwards and half downward. If $n$ is odd, then the central frequency is unshifted as
illustrated for the case of $n = 3$. An example of this behavior is the Zeeman effect where the magnetic field couples the
atomic motion resulting in a hyperfine splitting of the energy levels of the form illustrated.
Figure 12.6: Normal-mode frequencies for n=2 and n=3 weakly-coupled oscillators.
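The size of the splitting can be checked numerically. The sketch below is not part of the original text and assumes that the small parameter is $\varepsilon = \kappa'/2\kappa$, which is the value implied by $\sqrt{(\kappa + \kappa')/m} \approx \sqrt{\kappa/m}\,(1 + \varepsilon)$. It compares the exact mode frequencies $\sqrt{(\kappa + 2\kappa')/m}$ and $\sqrt{\kappa/m}$ with the weak-coupling estimates $\omega_0(1 \pm \varepsilon)$. The function name, the argument `kappa_c` (standing for $\kappa'$), and the sample values are illustrative.

```python
import math


def weak_coupling_split(m, kappa, kappa_c):
    """Compare exact mode frequencies with the estimates omega0 * (1 +/- eps).

    kappa_c stands for the coupling constant kappa'; eps = kappa_c / (2 kappa)
    is assumed to be much smaller than 1.
    """
    eps = kappa_c / (2.0 * kappa)
    omega0 = math.sqrt((kappa + kappa_c) / m)       # single oscillator, other mass held fixed
    exact_upper = math.sqrt((kappa + 2.0 * kappa_c) / m)
    exact_lower = math.sqrt(kappa / m)
    return {
        "eps": eps,
        "omega0": omega0,
        "exact": (exact_upper, exact_lower),
        "approx": (omega0 * (1.0 + eps), omega0 * (1.0 - eps)),
    }


# Hypothetical weak coupling: kappa' is 2 percent of kappa.
result = weak_coupling_split(1.0, 1.0, 0.02)
print(result["exact"], result["approx"])
```

The estimates differ from the exact values only at second order in $\varepsilon$, so for this sample the agreement is better than one part in a thousand.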
There are myriad examples involving weakly-coupled oscillators in many aspects of the natural world.
The example of collective modes in nuclear physics, illustrated in example 12.13, is typical of applications to
physics, while there are many examples applied to musical instruments, acoustics, and engineering. Weakly-
coupled oscillators are a dominant theme throughout biology as illustrated by congregations of synchronously
flashing fireflies, crickets that chirp in unison, an audience clapping at the end of a performance, networks
of pacemaker cells in the heart, insulin-secreting cells in the pancreas, and neural networks in the brain and
spinal cord that control rhythmic behaviors such as breathing, walking, and eating. Synchronous motion of
a large number of weakly-coupled oscillators often leads to large collective motion of weakly-coupled systems
as discussed in chapter 12.12.
Schematic diagram of the action for a grand piano, including the strings, bridge and sounding board. Labelled
parts: hitchpin, damper, string, bridge, hammer, pin block, jack, soundboard, ribs, key. Note
that there are either two or three parallel strings per note all hit by a single hammer.
The grand piano provides an excellent example of a weakly-coupled harmonic oscillator system that has
normal modes. There are either two or three parallel strings per note that are stretched tightly parallel to the
top of the horizontal sounding board. The strings press downwards on the bridge that is attached to the top of
the sounding board. The strings for each note are excited when struck vertically upwards by a single hammer.
In the bass section of the piano each note comprises two strings tuned to nearly the same frequency. The
coupling of the motion of the strings is via the bridge plus sounding board. Normally, the hammer strikes both
strings simultaneously exciting the vertical symmetric mode, not the vertical antisymmetric mode. The bridge
is connected to the sounding board which moves the largest amount for the symmetric mode where both strings
move the bridge in phase. This strong coupling produces a loud sound. The antisymmetric mode does not
move the sounding board much since the strings at the bridge move out of phase. Consequently, the symmetric
mode, that is strongly coupled to the sounding board, damps out more rapidly than the antisymmetric mode
which is weakly coupled to the sounding board and thus has a longer time constant for decay since the radiated
sound energy is lower than the symmetric mode.
The una-corda pedal (soft pedal) for a grand piano moves the action sideways such that the hammer strikes
only one of the two strings, or two of the three strings, resulting in both the symmetric and antisymmetric
modes being excited equally. The una-corda pedal produces a characteristically different tone than when
the hammer simultaneously hits all the strings; that is, it produces a smaller transient component. The
symmetric mode rapidly damps due to energy propagation by the sounding board. Thus the longer lasting
antisymmetric mode becomes more prominent when both modes are equally excited using the una-corda pedal.
The symmetric and antisymmetric modes have slightly different frequencies and produce beats which also
contributes to the different timbre produced using the una-corda pedal. For the mid and upper frequency
range, the piano has three strings per note which have one symmetric mode and two separate antisymmetric
modes. To further complicate matters, the strings also can oscillate horizontally which couples weakly to the
bridge plus sounding board. The strengths with which these different modes are excited depend on subtle differences
in the shape and roughness of the hammer head striking the strings. Primarily the hammer excites the two
vertical modes rather than the horizontal modes.
12.6. GENERAL ANALYTIC THEORY FOR COUPLED LINEAR OSCILLATORS 347
Expressing these in terms of generalized coordinates $x_{\alpha,i} = x_{\alpha,i}(q_j, t)$ where $j = 1, 2, \ldots, n$, then the generalized
velocities are given by
$$\dot{x}_{\alpha,i} = \sum_{j=1}^{n} \frac{\partial x_{\alpha,i}}{\partial q_j}\,\dot{q}_j + \frac{\partial x_{\alpha,i}}{\partial t} \tag{12.30}$$
As discussed in chapter 7.6, if the system is scleronomic then the partial derivative
$$\frac{\partial x_{\alpha,i}}{\partial t} = 0 \tag{12.31}$$
Thus the kinetic energy, equation 12.29, of a scleronomic system can be written as a homogeneous quadratic
function of the generalized velocities
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\,\dot{q}_j \dot{q}_k \tag{12.32}$$
Note that if the velocities $\dot{q}$ correspond to translational velocity, then the kinetic energy tensor T corresponds
to an effective mass tensor, whereas if the velocities correspond to angular rotational velocities, then the
kinetic energy tensor T corresponds to the inertia tensor.
348 CHAPTER 12. COUPLED LINEAR OSCILLATORS
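It is possible to make an expansion of the $T_{jk}$ about the equilibrium values of the form
$$T_{jk}(q_1, q_2, \ldots, q_n) = T_{jk}(q_{i0}) + \sum_{l}\left(\frac{\partial T_{jk}}{\partial q_l}\right)_0 q_l + \cdots \tag{12.34}$$
Only the first, constant, term will be kept since the second and higher terms are of the same order as the
higher-order terms ignored in the Taylor expansion of the potential. Thus, at the equilibrium point, assume that
$\left(\frac{\partial T_{jk}}{\partial q_l}\right)_0 = 0$ where $l = 1, 2, 3, \ldots, n$.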
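That is
$$U'(q_1, q_2, \ldots, q_n) = \frac{1}{2}\sum_{j,k} V_{jk}\, q_j q_k \tag{12.38}$$
where the components of the kinetic energy tensor T and potential energy tensor V are
$$T_{jk} \equiv \left(\sum_{\alpha} m_\alpha \sum_{i}^{3} \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\right)_0 \tag{12.43}$$
$$V_{jk} \equiv \left(\frac{\partial^2 U'}{\partial q_j\, \partial q_k}\right)_0 \tag{12.44}$$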
Note that $T_{jk}$ and $V_{jk}$ may have different units, but all the terms in the summations for both $T$ and $U'$ have
units of energy. The $T_{jk}$ and $V_{jk}$ values are evaluated at the equilibrium point, and thus both $T_{jk}$ and $V_{jk}$
are $n \times n$ arrays of values evaluated at the equilibrium location.
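and
$$\frac{\partial L}{\partial \dot{q}_j} = \sum_{k} T_{jk}\,\dot{q}_k \tag{12.48}$$
Thus the Lagrange equations reduce to the following set of equations of motion,
$$\sum_{k}\left(V_{jk}\, q_k + T_{jk}\,\ddot{q}_k\right) = 0 \tag{12.49}$$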
For each $j$ where $1 \le j \le n$ there exists a set of $n$ second-order linear homogeneous differential equations
with constant coefficients. Since the system is oscillatory, it is natural to try a solution of the form
$$q_k(t) = a_k\, e^{i(\omega t - \delta)} \tag{12.50}$$
Assuming that the system is conservative, then this implies that $\omega$ is real, since an imaginary term for
$\omega$ would lead to an exponential damping term. The arbitrary constants are the real amplitude $a_k$ and the
phase $\delta$. Substitution of this trial solution for each $j$ leads to a set of equations
$$\sum_{k}\left(V_{jk} - \omega^2 T_{jk}\right) a_k = 0 \tag{12.51}$$
where the common factor $e^{i(\omega t - \delta)}$ has been removed. Equation 12.51 corresponds to a set of $n$ linear
homogeneous algebraic equations that the amplitudes $a_k$ must satisfy for each $j$. For a non-trivial solution
to exist, the determinant of the coefficients must vanish, that is
$$\begin{vmatrix}
V_{11} - \omega^2 T_{11} & V_{12} - \omega^2 T_{12} & V_{13} - \omega^2 T_{13} & \cdots \\
V_{12} - \omega^2 T_{12} & V_{22} - \omega^2 T_{22} & V_{23} - \omega^2 T_{23} & \cdots \\
V_{13} - \omega^2 T_{13} & V_{23} - \omega^2 T_{23} & V_{33} - \omega^2 T_{33} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{vmatrix} = 0 \tag{12.52}$$
where the symmetry $V_{jk} = V_{kj}$ has been included. This is the standard eigenvalue problem for which
the above determinant gives the secular equation or the characteristic equation. It is an equation
of degree $n$ in $\omega^2$. The roots of this equation are $\omega_r^2$ where $\omega_r$ are the characteristic frequencies or
eigenfrequencies of the normal modes.
Substitution of $\omega_r^2$ into equation 12.52 determines the ratio $a_{1r} : a_{2r} : a_{3r} : \cdots : a_{nr}$ for this solution
which defines the components of the $n$-dimensional eigenvector $\mathbf{a}_r$. That is, solution of the secular equations
has determined the eigenvalues and eigenvectors of the solutions of the coupled-channel system.
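The eigenvalue problem described above can also be solved numerically. The following is a minimal sketch, assuming symmetric tensors with a positive-definite kinetic energy tensor; the function name `normal_modes` and the numerical values are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def normal_modes(T, V):
    """Solve (V - w^2 T) a = 0 for symmetric V and symmetric positive-definite T.

    Returns the eigenfrequencies w_r in ascending order and a matrix whose
    columns are the eigenvectors a_r, normalized so that a^T T a = identity.
    """
    T = np.asarray(T, dtype=float)
    V = np.asarray(V, dtype=float)
    # The Cholesky factor T = L L^T converts the generalized problem into an
    # ordinary symmetric eigenvalue problem for L^-1 V L^-T.
    L = np.linalg.cholesky(T)
    L_inv = np.linalg.inv(L)
    w2, y = np.linalg.eigh(L_inv @ V @ L_inv.T)
    a = L_inv.T @ y                # back-transform to the original coordinates
    w2 = np.clip(w2, 0.0, None)    # guard against tiny negative round-off
    return np.sqrt(w2), a

# Hypothetical two-coordinate system with unit masses (values are assumed).
T_example = np.eye(2)
V_example = np.array([[1.5, -0.5],
                      [-0.5, 1.5]])
omega, a = normal_modes(T_example, V_example)
```

With this normalization the columns of `a` are orthonormal with respect to the kinetic energy tensor, so `a.T @ T_example @ a` is the identity matrix to within round-off.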
12.6.4 Superposition
The equations of motion $\sum_k \left(V_{jk}\, q_k + T_{jk}\,\ddot{q}_k\right) = 0$ are linear equations that satisfy superposition. Thus the
most general solution $q_j(t)$ can be a superposition of the eigenvectors $\mathbf{a}_r$, that is
$$q_j(t) = \sum_{r} a_{jr}\, e^{i(\omega_r t - \delta_r)} \tag{12.53}$$
Thus the most general solution of these linear equations involves a sum over the eigenvectors of the
system which are cosine functions of the corresponding eigenfrequencies.
Multiply equation 12.55 by $a_{jr}$ and sum over $j$. Similarly multiply equation 12.56 by $a_{ks}$ and sum over $k$.
These summations lead to
$$\sum_{j,k} V_{jk}\, a_{jr} a_{ks} = \omega_s^2 \sum_{j,k} T_{jk}\, a_{jr} a_{ks} \tag{12.57}$$
$$\sum_{j,k} V_{jk}\, a_{jr} a_{ks} = \omega_r^2 \sum_{j,k} T_{jk}\, a_{jr} a_{ks} \tag{12.58}$$
Note that the left-hand sides of these two equations are identical. Thus taking the difference between these
equations gives
$$\left(\omega_r^2 - \omega_s^2\right)\sum_{j,k} T_{jk}\, a_{jr} a_{ks} = 0 \tag{12.59}$$
Note that if $\left(\omega_r^2 - \omega_s^2\right) \neq 0$, that is, assuming that the eigenfrequencies are not degenerate, then to ensure
that equation 12.59 is zero requires that
$$\sum_{j,k} T_{jk}\, a_{jr} a_{ks} = 0 \qquad r \neq s \tag{12.60}$$
This shows that the eigenfunctions are orthogonal. If the eigenfrequencies are degenerate, i.e. $\omega_r^2 = \omega_s^2$,
then, with no loss of generality, the axes $a_{jr}$ and $a_{ks}$ can be chosen to be orthogonal.
The eigenfunction normalization can be chosen freely since only ratios of the eigenfunction components
$a_{jr}$ are determined when $\omega_r$ is used in equation 12.51. The kinetic energy, given by equation 12.32,
must be positive, or zero for the case of a static system. That is
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\,\dot{q}_j \dot{q}_k \geq 0 \tag{12.61}$$
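Using the time derivative of equation 12.54 to determine $\dot{q}_j$ and inserting into equation 12.61 gives that the kinetic
energy is
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\,\dot{q}_j \dot{q}_k = \frac{1}{2}\sum_{r,s} \omega_r \omega_s \cos(\omega_r t - \delta_r)\cos(\omega_s t - \delta_s) \sum_{j,k} T_{jk}\, a_{jr} a_{ks} \tag{12.62}$$
Since this sum must be a positive number, and the magnitude of the amplitudes can be chosen freely, then
it is possible to normalize the eigenfunction amplitudes to unity. That is, choose that
$$\sum_{j,k} T_{jk}\, a_{jr} a_{kr} = 1 \tag{12.65}$$
The orthogonality equation 12.60 and the normalization equation 12.65 can be combined into a single
orthonormalization equation
$$\sum_{j,k} T_{jk}\, a_{jr} a_{ks} = \delta_{rs} \tag{12.66}$$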
where eb are the unit vectors for the generalized coordinates.
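$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa x_2^2 + \frac{1}{2}\kappa'(x_2 - x_1)^2 = \frac{1}{2}(\kappa + \kappa')\,x_1^2 + \frac{1}{2}(\kappa + \kappa')\,x_2^2 - \kappa' x_1 x_2$$
while the kinetic energy is given by
$$T = \frac{1}{2} m \dot{x}_1^2 + \frac{1}{2} m \dot{x}_2^2$$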
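2) The second stage is to evaluate the potential energy and kinetic energy tensors. The potential
energy tensor is nondiagonal since $V_{jk}$ gives
$$V_{11} \equiv \left(\frac{\partial^2 U}{\partial x_1 \partial x_1}\right)_0 = \kappa + \kappa' = V_{22}$$
$$V_{12} = \left(\frac{\partial^2 U}{\partial x_1 \partial x_2}\right)_0 = -\kappa' = V_{21}$$
Since $T_{11} = T_{22} = m$ and $T_{12} = T_{21} = 0$ then the kinetic energy tensor is
$$\mathbf{T} = \begin{Bmatrix} m & 0 \\ 0 & m \end{Bmatrix}$$
Note that for this case, the kinetic energy tensor equals the mass tensor, which is diagonal, whereas the
potential energy tensor equals the spring constant tensor, which is nondiagonal.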
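3) The third stage is to use the potential energy and kinetic energy tensors to evaluate the secular
determinant using equation 12.52
$$\begin{vmatrix} \kappa + \kappa' - m\omega^2 & -\kappa' \\ -\kappa' & \kappa + \kappa' - m\omega^2 \end{vmatrix} = 0$$
That is
$$\left(\kappa + \kappa' - m\omega^2\right) = \pm\kappa'$$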
12.7. TWO-BODY COUPLED OSCILLATOR SYSTEMS 353
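which simplifies to
$$a_{11} = -a_{21}$$
Similarly, for the other eigenfrequency $\omega_2$, that is, $j = 1$, $r = 2$
$$(\kappa + \kappa' - \kappa)\, a_{12} - \kappa' a_{22} = 0$$
which simplifies to
$$a_{12} = a_{22}$$
5) The final stage is to write the general coordinates in terms of the normal coordinates $\eta_r(t)$.
Thus
$$x_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$
and
$$x_2 = a_{21}\eta_1 + a_{22}\eta_2 = -a_{11}\eta_1 + a_{22}\eta_2$$
Adding or subtracting gives that the normal modes are
$$\eta_1 = \frac{1}{2a_{11}}(x_1 - x_2)$$
$$\eta_2 = \frac{1}{2a_{22}}(x_2 + x_1)$$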
then $T_{11} = T_{22} = m$ and $T_{12} = T_{21} = 0$. Thus the kinetic energy tensor is
$$\mathbf{T} = \begin{Bmatrix} m & 0 \\ 0 & m \end{Bmatrix}$$
Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal.
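3) The third stage is to use the potential energy and kinetic energy tensors to evaluate the secular
determinant using equation 12.52
$$\begin{vmatrix} 2\kappa - m\omega^2 & -\kappa \\ -\kappa & \kappa - m\omega^2 \end{vmatrix} = 0$$
The expansion of this secular determinant yields
$$\left(2\kappa - m\omega^2\right)\left(\kappa - m\omega^2\right) - \kappa^2 = 0$$
That is
$$\omega^4 - 3\frac{\kappa}{m}\omega^2 + \frac{\kappa^2}{m^2} = 0$$
The solutions are
$$\omega_1 = \frac{\sqrt{5}+1}{2}\sqrt{\frac{\kappa}{m}} \qquad \omega_2 = \frac{\sqrt{5}-1}{2}\sqrt{\frac{\kappa}{m}}$$
4) The fourth step is to insert these eigenfrequencies into the secular equation 12.51
$$\sum_{k}\left(V_{jk} - \omega^2 T_{jk}\right) a_k = 0$$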
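The closed-form eigenfrequencies of this example can be cross-checked numerically. The sketch below assumes the tensors implied by the secular determinant above, with illustrative values $\kappa = m = 1$ that are not taken from the text.

```python
import numpy as np

kappa, m = 1.0, 1.0   # illustrative values (assumed, not from the text)

# Tensors implied by the secular determinant of this example.
T = m * np.eye(2)
V = kappa * np.array([[2.0, -1.0],
                      [-1.0, 1.0]])

# T is a multiple of the identity, so omega^2 are the eigenvalues of V / m.
w2 = np.linalg.eigvalsh(V / m)
omega = np.sqrt(w2)[::-1]     # descending order, so omega[0] is omega_1

# Closed-form solutions of omega^4 - 3 (kappa/m) omega^2 + (kappa/m)^2 = 0.
expected = np.array([(np.sqrt(5.0) + 1.0) / 2.0,
                     (np.sqrt(5.0) - 1.0) / 2.0]) * np.sqrt(kappa / m)
```

The two arrays `omega` and `expected` agree to within round-off, which confirms the roots of the quartic quoted above.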
Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal.
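The third stage is to evaluate the secular determinant
$$\begin{vmatrix} mgb + \kappa b^2 - mb^2\omega^2 & -\kappa b^2 \\ -\kappa b^2 & mgb + \kappa b^2 - mb^2\omega^2 \end{vmatrix} = 0$$
Consider $j = 1$
$$\left(mgb + \kappa b^2 - \omega_r^2 mb^2\right) a_{1r} - \kappa b^2 a_{2r} = 0$$
Then for the first eigenfrequency, $\omega_1$, the subscripts are $j = 1$, $r = 1$
$$\left(mgb + \kappa b^2 - \frac{g}{b}\, mb^2\right) a_{11} - \kappa b^2 a_{21} = 0$$
which simplifies to
$$a_{11} = a_{21}$$
Similarly, for $j = 1$, $r = 2$
$$\left(mgb + \kappa b^2 - \left(\frac{g}{b} + \frac{2\kappa}{m}\right) mb^2\right) a_{12} - \kappa b^2 a_{22} = 0$$
which simplifies to
$$a_{12} = -a_{22}$$
The final stage is to write the general coordinates in terms of the normal coordinates
$$\theta_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 - a_{22}\eta_2$$
and
$$\theta_2 = a_{21}\eta_1 + a_{22}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$
Adding or subtracting these equations gives that the normal modes are
$$\eta_1 = \frac{1}{2a_{11}}(\theta_1 + \theta_2) \qquad \eta_2 = \frac{1}{2a_{22}}(\theta_2 - \theta_1)$$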
As for the case of the double oscillator discussed in example 12.2, the symmetric normal mode corresponds
to an oscillation of the center-of-mass, with zero relative motion of the two pendula, which has the lower
frequency $\omega_1 = \sqrt{g/b}$. This frequency is the same as for one independent pendulum, as expected since they
vibrate in unison and thus the only restoring force is gravity. The antisymmetric mode corresponds to
relative motion of the two pendula with stationary center-of-mass and has the frequency $\omega_2 = \sqrt{\frac{g}{b} + \frac{2\kappa}{m}}$
since the restoring force includes both the coupling spring and gravity.
This example introduces the role of degeneracy which occurs in this system if the coupling of the pendula
is zero, that is, $\kappa = 0$, leading to both frequencies being equal, i.e. $\omega_1 = \omega_2 = \sqrt{g/b}$. When $\kappa = 0$, then both
{T} and {V} are diagonal and thus in the $(\theta_1, \theta_2)$ space the two pendula are independent normal modes.
However, the symmetric and antisymmetric normal modes, as derived above, are equally good normal modes.
In fact, since the modes are degenerate, any linear combination of the motions of the independent pendula is an
equally good normal mode, and thus one can use any set of orthogonal normal modes to describe the motion.
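The symmetric and antisymmetric modes, and the degeneracy at zero coupling, can be illustrated numerically. This is a minimal sketch assuming the tensors implied by the secular determinant of this example; the function name `pendula_modes` and all parameter values are illustrative assumptions.

```python
import numpy as np

def pendula_modes(m, b, g, kappa):
    """Eigenfrequencies and eigenvectors of two spring-coupled plane pendula."""
    # T = m b^2 * identity, so (V - w^2 T) a = 0 reduces to an ordinary
    # symmetric eigenvalue problem for V / (m b^2).
    V = np.array([[m * g * b + kappa * b**2, -kappa * b**2],
                  [-kappa * b**2, m * g * b + kappa * b**2]])
    w2, a = np.linalg.eigh(V / (m * b**2))
    return np.sqrt(w2), a

# Illustrative parameters (assumed): mass, length, gravity, spring constant.
m, b, g, kappa = 0.5, 2.0, 9.81, 3.0
omega, a = pendula_modes(m, b, g, kappa)

# Switching off the coupling spring makes the two frequencies degenerate.
omega_uncoupled, _ = pendula_modes(m, b, g, 0.0)
```

The first column of `a` has components of equal sign (the in-phase mode) and the second has components of opposite sign (the out-of-phase mode); the overall sign of each column is arbitrary.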
As shown in the adjacent figure, the normal modes for this system
are
$$\eta_1 = \frac{1}{2a_{11}}\left(\theta_1 + \frac{\theta_2}{\sqrt{2}}\right) \qquad \eta_2 = \frac{1}{2a_{22}}\left(\theta_1 - \frac{\theta_2}{\sqrt{2}}\right)$$
[Figure: Normal modes for two series-coupled plane pendula.]
The second mass has a $\sqrt{2}$ larger amplitude that is in phase for solution 1 and out of phase for solution 2.
b) Large amplitude chaotic regime
Stachowiak and Okada [Sta05] used computer simulations to numerically analyze the behavior of this
system with increase in the oscillation amplitudes. Poincaré sections, bifurcation diagrams, and Lyapunov
exponents all confirm that this system evolves from regular normal-mode oscillatory behavior in the linear
regime at low energy, to chaotic behavior at high excitation energies where non-linearity dominates. This
behavior is analogous to that of the driven, linearly-damped, harmonic pendulum described in chapter 4.5.
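$$U = \frac{mgb}{2}\left(\theta_1'^2 + \theta_2'^2 + \theta_3'^2\right) = \frac{mgb}{2}\left(\theta_1^2 + \theta_2^2 + \theta_3^2 - 2\epsilon\theta_1\theta_2 - 2\epsilon\theta_1\theta_3 - 2\epsilon\theta_2\theta_3\right)$$
The kinetic energy evaluated at the equilibrium location is
$$T = \frac{1}{2}m\left(b\dot{\theta}_1\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_2\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_3\right)^2$$
The next stage is to evaluate the {T} and {V} tensors
$$\mathbf{T} = mb^2\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Bmatrix} \qquad \mathbf{V} = mgb\begin{Bmatrix} 1 & -\epsilon & -\epsilon \\ -\epsilon & 1 & -\epsilon \\ -\epsilon & -\epsilon & 1 \end{Bmatrix}$$
The third stage is to evaluate the secular determinant which can be written as
$$\begin{vmatrix} 1 - \frac{b}{g}\omega^2 & -\epsilon & -\epsilon \\ -\epsilon & 1 - \frac{b}{g}\omega^2 & -\epsilon \\ -\epsilon & -\epsilon & 1 - \frac{b}{g}\omega^2 \end{vmatrix} = 0$$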
while for $r = 3$, $j = 2$
$$-a_{13} + 2a_{23} - a_{33} = 0$$
Solving these gives
$$a_{13} = a_{23} = a_{33}$$
Assuming that the eigenfunction is normalized
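The normal modes are obtained by taking the inverse matrix $\{\mathbf{a}\}^{-1}$ and using $\{\boldsymbol{\eta}\} = \{\mathbf{a}\}^{-1}\{\boldsymbol{\theta}\}$. Note
that since $\{\mathbf{a}\}$ is real and orthogonal, then $\{\mathbf{a}\}^{-1}$ equals the transpose of $\{\mathbf{a}\}$. That is;
$$\begin{Bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{Bmatrix} = \begin{Bmatrix} \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} & 0 \\ \frac{1}{6}\sqrt{6} & \frac{1}{6}\sqrt{6} & -\frac{1}{3}\sqrt{6} \\ \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} \end{Bmatrix}\begin{Bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{Bmatrix}$$
The normal mode $\eta_3$ has eigenfrequency
$$\omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\epsilon}$$
and eigenvector
$$\boldsymbol{\eta}_3 = \frac{1}{\sqrt{3}}(\theta_1, \theta_2, \theta_3)$$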
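The general analytic approach requires the $T$ and $V$ energy tensors given by
$$\mathbf{T} = mb^2\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Bmatrix} \qquad \mathbf{V} = \begin{Bmatrix} mgb + \kappa b^2 & -\kappa b^2 & 0 \\ -\kappa b^2 & mgb + 2\kappa b^2 & -\kappa b^2 \\ 0 & -\kappa b^2 & mgb + \kappa b^2 \end{Bmatrix}$$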
12.8. THREE-BODY COUPLED LINEAR OSCILLATOR SYSTEMS 361
Note that in contrast to the prior case of three fully-coupled pendula, for the nearest neighbor case the potential
energy tensor {V} is non-zero only on the diagonal and ±1 components parallel to the diagonal.
The third stage is to evaluate the secular determinant of the $\left(\mathbf{V} - \omega^2\mathbf{T}\right)$ matrix, that is
$$\begin{vmatrix} mgb + \kappa b^2 - mb^2\omega^2 & -\kappa b^2 & 0 \\ -\kappa b^2 & mgb + 2\kappa b^2 - mb^2\omega^2 & -\kappa b^2 \\ 0 & -\kappa b^2 & mgb + \kappa b^2 - mb^2\omega^2 \end{vmatrix} = 0$$
which results in the three non-degenerate eigenfrequencies for the normal modes.
The normal modes are similar to the prior case of complete linear
coupling, as shown in the adjacent figure.
$\omega_1 = \sqrt{g/b}$: This lowest mode $\eta_1$ involves the three pendula oscillating
in phase such that the springs are not stretched or compressed, thus the
period of this coherent oscillation is the same as an independent pendulum
of mass $m$ and length $b$. That is
$$\boldsymbol{\eta}_1 = \frac{1}{\sqrt{3}}(\theta_1, \theta_2, \theta_3)$$
$\omega_2 = \sqrt{\frac{g}{b} + \frac{\kappa}{m}}$: This second mode $\eta_2$ has the central mass stationary with
the outer pendula oscillating with the same amplitude and out of phase.
That is
$$\boldsymbol{\eta}_2 = \frac{1}{\sqrt{2}}(\theta_1, 0, -\theta_3)$$
$\omega_3 = \sqrt{\frac{g}{b} + \frac{3\kappa}{m}}$: This third mode $\eta_3$ involves the outer pendula in phase
with the same amplitude while the central pendulum oscillates with angle
$\theta_2 = -2\theta_1$. That is
$$\boldsymbol{\eta}_3 = \frac{1}{\sqrt{6}}(\theta_1, -2\theta_2, \theta_3)$$
Similar to the prior case of three completely-coupled pendula, the coherent
normal mode $\eta_1$ corresponds to an oscillation of the center-of-mass with
no relative motion, while $\eta_2$ and $\eta_3$ correspond to relative motion of
the pendula with stationary center of mass motion. In contrast to the
prior example of complete coupling, for nearest neighbor coupling the two
higher lying solutions are not degenerate. That is, the nearest neighbor
coupling solutions differ from when all masses are linearly coupled.
It is interesting to note that this example combines two coupling mechanisms
that can be used to predict the solutions for two extreme cases
by switching off one of these coupling mechanisms. Switching off the
coupling springs, by setting $\kappa = 0$, makes all three normal frequencies
degenerate with $\omega_1 = \omega_2 = \omega_3 = \sqrt{g/b}$. This corresponds to three independent
identical pendula each with frequency $\omega = \sqrt{g/b}$. The three
linear combinations $\eta_1$, $\eta_2$, $\eta_3$ also have this same frequency; in particular
$\eta_1$ corresponds to an in-phase oscillation of the three pendula. The three
uncoupled pendula are independent and any combination of the three modes is allowed since the three frequencies
are degenerate.
[Figure: Normal modes of three plane pendula with nearest-neighbour coupling.]
The other extreme is to let $g = 0$, that is, switch off the gravitational field, or let $b \rightarrow \infty$; then the only
coupling is due to the two springs. This results in $\omega_1 = 0$ because there is no restoring force acting on the
coherent motion of the three in-phase coupled oscillators. As a result, oscillatory motion cannot be sustained:
this solution corresponds to a center of mass oscillation with no external forces acting, which is spurious. That
is, this spurious solution corresponds to constant linear translation.
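The three eigenfrequencies and the two limiting cases discussed above can be checked numerically. The following is a sketch only, assuming the nearest-neighbour tensors quoted above; the function name `three_pendula_w2` and the parameter values are illustrative assumptions.

```python
import numpy as np

def three_pendula_w2(m, b, g, kappa):
    """Return omega^2 (ascending) and eigenvectors for three plane pendula
    with nearest-neighbour spring coupling."""
    V = np.array([
        [m * g * b + kappa * b**2, -kappa * b**2, 0.0],
        [-kappa * b**2, m * g * b + 2 * kappa * b**2, -kappa * b**2],
        [0.0, -kappa * b**2, m * g * b + kappa * b**2],
    ])
    # T = m b^2 * identity, so dividing V by m b^2 gives an ordinary
    # symmetric eigenvalue problem for omega^2.
    return np.linalg.eigh(V / (m * b**2))

# Illustrative parameters (assumed, not from the text).
m, b, g, kappa = 0.3, 1.5, 9.81, 2.0

w2, a = three_pendula_w2(m, b, g, kappa)        # general case
w2_no_springs, _ = three_pendula_w2(m, b, g, 0.0)   # kappa = 0: degenerate
w2_no_gravity, _ = three_pendula_w2(m, b, 0.0, kappa)  # g = 0: spurious zero mode
```

In the general case `w2` equals $g/b$ plus $0$, $\kappa/m$ and $3\kappa/m$; with the springs removed all three values collapse to $g/b$, and with gravity removed the lowest value is zero to within round-off.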
Note that for this case the kinetic energy tensor is diagonal whereas k
Longitudinal modes
The coordinate system used is illustrated in the adjacent figure.
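The Lagrangian for this system is
$$L = \left(\frac{m}{2}\dot{x}_1^2 + \frac{M}{2}\dot{x}_2^2 + \frac{m}{2}\dot{x}_3^2\right) - \frac{\kappa}{2}\left[(x_2 - x_1)^2 + (x_3 - x_2)^2\right]$$
Evaluating the kinetic energy tensor gives
$$\mathbf{T} = \begin{Bmatrix} m & 0 & 0 \\ 0 & M & 0 \\ 0 & 0 & m \end{Bmatrix}$$
Note that the same answer is obtained using Newtonian mechanics. That is, the force equation gives
$$m\ddot{x}_1 - \kappa(x_2 - x_1) = 0$$
$$M\ddot{x}_2 + \kappa(x_2 - x_1) - \kappa(x_3 - x_2) = 0$$
$$m\ddot{x}_3 + \kappa(x_3 - x_2) = 0$$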
This leads to the same secular determinant as given above with the matrix elements clustered along the
diagonal for nearest-neighbor problems.
Transverse modes
The solutions are:
¡ ¢
4) 4 = 2 2+ This is the only non-spurious transverse mode 4 which corresponds to the two
outside masses vibrating in unison transverse to the symmetry axis while the central mass vibrates oppositely.
This mode radiates electric dipole radiation since the electric dipole is oscillating.
5) $\omega_5 = 0$. This transverse solution $\eta_5$ has all three nuclei vibrating in unison transverse to the symmetry
axis and corresponds to a spurious center of mass oscillation.
6) $\omega_6 = 0$. This transverse solution $\eta_6$ corresponds to a stationary central mass with the two outside
masses vibrating oppositely. This corresponds to a rotational oscillation of the molecule which is spurious
since there are no torques acting on the molecule for a central force. Rotational motion usually is taken into
account separately.
The normal modes for the bent triatomic molecule are similar except that the oscillator coupling strength
is reduced by the factor cos where is the bend angle.
12.9. MOLECULAR COUPLED OSCILLATOR SYSTEMS 365
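$$L = \frac{1}{2}\sum_{j=1}^{n+1}\left(m\dot{q}_j^2 - \frac{\tau}{d}\left(q_{j-1} - q_j\right)^2\right) \tag{12.83}$$
Figure 12.8: Transverse motion of a linear discrete lattice chain
Using this Lagrangian in the Lagrange Euler equations gives the following second-order equation of motion
for transverse oscillations
$$\ddot{q}_j = \omega_0^2\left(q_{j-1} - 2q_j + q_{j+1}\right) \tag{12.84}$$
where $j = 1, 2, \ldots, n$ and
$$\omega_0 \equiv \sqrt{\frac{\tau}{md}} \tag{12.85}$$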
The normal modes for the transverse modes comprise standing waves that satisfy the same boundary
conditions as for the longitudinal modes. The equations of motion for longitudinal motion, equation
12.77, or transverse motion, equation 12.84, are identical in form. The major difference is that $\omega_0$ for the
transverse normal modes, $\omega_0 \equiv \sqrt{\tau/md}$, differs from that for the longitudinal modes which is $\omega_0 \equiv \sqrt{\kappa/m}$. Thus
the following discussion of the normal modes on a discrete lattice chain is identical in form for both transverse
and longitudinal waves.
This secular determinant corresponds to the special case of nearest neighbor interactions with the kinetic
energy tensor T being diagonal and the potential energy tensor V involving coupling only to adjacent
masses. The secular determinant is of order $n$ and thus determines exactly $n$ eigenfrequencies for each
polarization mode.
For large $n$ the solution of this problem is more efficiently obtained by using a recursion relation approach,
rather than solving the above secular determinant. The trick is to assume that the phase differences $\phi_r$
between the motion of adjacent masses all are identical for a given polarization. Then the amplitude for the
$j^{th}$ mass for the $r^{th}$ frequency mode $\omega_r$ is of the form
which reduces to
$$\omega_r^2 = 2\omega_0^2 - 2\omega_0^2\cos\phi_r = 4\omega_0^2\sin^2\frac{\phi_r}{2}$$
that is
$$\omega_r = 2\omega_0\sin\frac{\phi_r}{2} \tag{12.91}$$
where $r = 1, 2, 3, \ldots, n$.
Now it is necessary to determine the phase angle $\phi_r$ which can be done by applying the boundary
conditions for standing waves on the lattice chain. These boundary conditions for stationary modes require
that the ends of the lattice chain are nodes, that is $a_{0r} = a_{(n+1)r} = 0$. Using the fact that only the real
part of $a_{jr}$ has physical meaning, leads to the amplitude for the $j^{th}$ mass for the $r^{th}$ mode to be
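Therefore
$$(n + 1)\,\phi_r = r\pi \tag{12.95}$$
where $r = 1, 2, 3, \ldots, n$. That is
$$\phi_r = \frac{r\pi}{n+1} = \frac{r\pi d}{(n+1)d} = \frac{r\pi d}{L} = k_r d \tag{12.96}$$
$$\omega_r = 2\omega_0\sin\frac{r\pi}{2(n+1)} = 2\omega_0\sin\frac{r\pi d}{2(n+1)d} = 2\omega_0\sin\frac{r\pi d}{2L} = 2\omega_0\sin\frac{k_r d}{2} \tag{12.97}$$
where the corresponding wavenumber is given by
$$k_r = \frac{2\pi}{\lambda_r} = \frac{r\pi}{(n+1)d} = \frac{r\pi}{L} \tag{12.98}$$
This implies that the normal modes are quantized with half-wavelengths $\frac{\lambda_r}{2} = \frac{L}{r}$.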
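The quantized eigenfrequencies can be compared with a direct numerical diagonalization of the nearest-neighbour equations of motion with fixed ends. This is a minimal sketch, assuming the form of equation 12.84 with the chain ends held at zero displacement; the function names and the choice $n = 5$, $\omega_0 = 1$ are illustrative assumptions.

```python
import numpy as np

def chain_frequencies(n, omega0=1.0):
    """Eigenfrequencies of an n-mass chain with both ends fixed, obtained by
    diagonalizing the tridiagonal matrix of the nearest-neighbour equations."""
    # Substituting q_j ~ exp(i w t) into q_j'' = w0^2 (q_{j-1} - 2 q_j + q_{j+1})
    # gives w^2 q = w0^2 D q, with D = 2 on the diagonal and -1 next to it.
    D = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return np.sqrt(np.linalg.eigvalsh(omega0**2 * D))

def analytic_frequencies(n, omega0=1.0):
    """Closed form w_r = 2 w0 sin(r pi / (2 (n + 1))) for r = 1..n."""
    r = np.arange(1, n + 1)
    return 2.0 * omega0 * np.sin(r * np.pi / (2 * (n + 1)))

n = 5                      # illustrative chain length (assumed)
numeric = chain_frequencies(n)
analytic = analytic_frequencies(n)
```

Both arrays are in ascending order and agree to within round-off, and every frequency lies below $2\omega_0$.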
12.10. DISCRETE LATTICE CHAIN 369
[Figure 12.9 plot panels for r = 1, 2, 3, 4, 5, 6; vertical axis from -1.0 to 1.0, horizontal axis from 0.1 to 1.0]
Figure 12.9: Plots of the maximal vibrational amplitudes for the $r^{th}$ frequency sinusoidal mode, versus
distance along the chain, for transverse normal modes of a vibrating discrete lattice with $n = 5$. Only $r =
1, 2, 3, 4, 5$ are distinct modes because $r = 6$ is a null mode. Note that the modes with $r = 7, 8, 9, 10, 11, 12$,
shown dashed, duplicate the locations of the mass displacement given by the lower-order modes.
Combining equations 12.96 and 12.93 gives the maximum amplitudes for the eigenvectors to be
= sin
(12.99)
2
For $n$ independent linear oscillators there are only $n$ independent normal modes, that is, for $r = n + 1$ the
sine function in equation 12.99 must be zero. Beyond $r = n$ the equations do not describe physically new
situations. This is illustrated by figure 12.9 which shows the transverse modes of a lattice chain with $n = 5$.
There are only $n = 5$ independent normal modes of this system since $r = n + 1 = 6$ corresponds to a null
mode with all $q_j(t) = 0$. Also note that the solutions for $r > n + 1$, shown dashed, replicate the mass
locations of modes with $r < n + 1$, that is, the modes with $r > 6$ are replicas of the lower-order modes.
Note that $\omega_r$ has a maximum value $\omega_r \leq 2\omega_0$ since the sine function cannot exceed unity. This leads
to a maximum frequency $\omega_c = 2\omega_0$, called the cut-off frequency, which occurs when $\phi_r = \pi$. That is, the
null-mode occurs when $r = n + 1$ for which equation 12.99 equals zero. The range of quantized normal
modes that can occur is intuitive. That is, the longest half-wavelength $\frac{\lambda_{max}}{2} = L = (n + 1)d$ equals the total
length of the discrete lattice chain. The shortest half-wavelength $\frac{\lambda_{min}}{2} = d$ is set by the lattice spacing.
Thus the discrete wavenumbers of the normal modes, for each polarization, range from 1 to 1 where is
an integer.
Assuming real the normal coordinate and corresponding frequency are,
= (12.100)
Equations 12.97 and 12.99 give the angular frequency and displacement. Note that superposition applies
since this system is linear. Therefore the most general solution for each polarization can be any superposition
of the form ∙ ¸
X
() = sin (12.101)
=1
( + 1)
where the distance along the chain $x = jd$, that is, it is quantized in units of the cell spacing $d$, with $j$ being
an integer. The positive sign in the exponent corresponds to a wave travelling in the $-x$ direction while
the negative sign corresponds to a wave travelling in the $+x$ direction. The velocity of a fixed phase of the
travelling wave must satisfy that $\omega t \pm kx$ is a constant. This will occur if the phase velocity of the wave is
given by
$$v_{phase} = \frac{dx}{dt} = \frac{\omega}{k} \tag{12.103}$$
The wave has a frequency $\nu = \frac{\omega}{2\pi}$ and wavelength $\lambda = \frac{2\pi}{k}$,
thus the phase velocity $v_{phase} = \nu\lambda = \frac{\omega}{k}$.
Inserting the travelling wave 12.102 into the transverse equation of motion 12.84 for the discrete lattice
chain gives
The lattice chain has a phase velocity for the wave given by
$$v_{phase} = \frac{\omega}{k} = \omega_0 d\,\frac{\left|\sin\frac{kd}{2}\right|}{\frac{kd}{2}} \tag{12.110}$$
$$\omega = 2\omega_0\cosh\frac{\Gamma}{2} \tag{12.116}$$
which increases with $\Gamma$. Thus, when $\omega > \omega_c = 2\omega_0$ then the amplitude of the wave is of the form
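For this special case, it was shown in chapter 8 that the Lagrange equations can be written in terms of the
Rayleigh dissipation function $\mathcal{F}$ as
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} + \frac{\partial \mathcal{F}}{\partial \dot{q}_j} = Q_j \tag{12.120}$$
where $Q_j$ are generalized forces acting on the system that are not absorbed into the potential $U$. Using
equations 12.43, 12.44 and 12.120 allows the equations of motion for damped coupled linear oscillators to
be written in a matrix form as
$$\{\mathbf{T}\}\,\ddot{\mathbf{q}} + \{\mathbf{C}\}\,\dot{\mathbf{q}} + \{\mathbf{V}\}\,\mathbf{q} = \{\mathbf{Q}\} \tag{12.121}$$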
where the symmetric matrices {T}, {C} and {V} are positive definite for positive definite systems. Rayleigh
pointed out that in the special case where the damping matrix {C} is a linear combination of the {T} and
{V} matrices, the transformation to normal coordinates that diagonalizes {T} and {V} also diagonalizes {C},
leading to a separation of the damped system into normal
modes. As discussed in chapter 4, many systems in nature are linear for small amplitude oscillations, allowing
use of the Rayleigh dissipation function which provides an analytic solution. However, in general, except for
when {C} is small, this separation into normal modes is not possible for damped systems and the solutions
must be obtained numerically.
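Rayleigh's special case can be illustrated numerically. The sketch below assumes a hypothetical two-coordinate system with a damping matrix built as $\alpha\{T\} + \beta\{V\}$; the matrices and the coefficients `alpha` and `beta` are illustrative assumptions, not values from the text.

```python
import numpy as np

# Hypothetical two-coordinate system (all values are assumed).
masses = np.array([1.0, 2.0])
T = np.diag(masses)
V = np.array([[3.0, -1.0],
              [-1.0, 2.0]])

# Damping matrix chosen as a linear combination of T and V.
alpha, beta = 0.1, 0.05
C = alpha * T + beta * V

# Undamped normal modes: scale by T^(-1/2) to obtain an ordinary symmetric
# eigenvalue problem, then transform the eigenvectors back.
s = 1.0 / np.sqrt(masses)
w2, y = np.linalg.eigh(s[:, None] * V * s[None, :])
a = s[:, None] * y            # columns satisfy a^T T a = identity

# In the normal-mode basis the damping matrix becomes diagonal.
C_modal = a.T @ C @ a
```

Here `C_modal` is diagonal with entries $\alpha + \beta\,\omega_r^2$, so each normal mode behaves as an independent damped oscillator. For a damping matrix that is not of this form, the off-diagonal elements of `a.T @ C @ a` are generally non-zero.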
The following two examples illustrate approaches used to handle linearly-damped coupled-oscillator sys-
tems.
Figure 12.12: Kuramoto model of collective synchronization of coupled oscillators. The left and center
plots show the time and coupling strength dependence of the order parameter $r$. The right plot shows the
frequency dependence including coupling (solid line) and without coupling (dashed line).
where $i = 1, 2, \ldots, N$. Kuramoto recognized that mean-field coupling was the most tractable system to solve,
that is, a system where the coupling is applicable equally to all the oscillators. Moreover, he assumed an
equally-weighted, pure sinusoidal coupling for the coupling term $\Gamma_{ij}(\theta_j - \theta_i)$ between the coupled oscillators.
That is, he assumed
$$\Gamma_{ij}(\theta_j - \theta_i) = \frac{K}{N}\sin(\theta_j - \theta_i) \tag{12.124}$$
where $K \geq 0$ is the coupling strength, and the factor $\frac{1}{N}$ ensures that the model is well behaved as $N \rightarrow \infty$.
Kuramoto assumed that the frequency distribution $g(\omega)$ was unimodal and symmetric about the mean
frequency $\Omega$, that is $g(\Omega + \omega) = g(\Omega - \omega)$.
This problem can be simplified by exploiting the rotational symmetry and transforming to a frame of
reference that is rotating at an angular frequency $\Omega$. That is, use the transformation $\theta_i \rightarrow \theta_i - \Omega t$ where
$\theta_i$ is measured in the rotating frame. This makes $g(\omega)$ unimodal with a symmetric frequency distribution
about $\omega = 0$. The phase velocity in this rotating frame is
$$\dot{\theta}_i = \omega_i + \frac{K}{N}\sum_{j=1}^{N}\sin(\theta_j - \theta_i) \tag{12.125}$$
Kuramoto observed that the phase-space distribution can be expressed in terms of the order parameters
$r$, $\psi$ in that equation 12.122 can be multiplied on both sides by $e^{-i\theta_i}$ to give
$$r e^{i(\psi - \theta_i)} = \frac{1}{N}\sum_{j=1}^{N} e^{i(\theta_j - \theta_i)} \tag{12.126}$$
angular frequency $\Omega$ and co-rotate with average phase $\psi(t)$, whereas those frequencies lying further from
the center continue to rotate independently at their natural frequencies and drift relative to the coherent
cluster frequency $\Omega$. As a consequence this mixed state is only partially synchronized as illustrated on the
right side of figure 12.12. The synchronized fraction has a $\delta$-function behavior for the frequency distribution
which grows in intensity with further increase in $K$. The unsynchronized component has nearly the original
frequency distribution $g(\omega)$ except that it is depleted in the region of the locked frequency due to strength
absorbed by the $\delta$-function component.
Kuramoto's toy model nicely illustrates the essential features of the evolution of collective synchronization
with coupling strength. It has been applied to the study of neuronal synchronization in the brain [Cum07]. The
model illustrates that the collective synchronization of coupled oscillators leads to a component that has a
single frequency for correlated motion which can be much narrower than the inherent frequency distribution
of the ensemble of coupled oscillators.
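The onset of synchronization can be explored with a short simulation. This is a sketch under stated assumptions: the natural frequencies are drawn from a unit-width Gaussian in the rotating frame, a simple Euler step is used, and the mean-field form of the coupling, $K r \sin(\psi - \theta_i)$, replaces the explicit double sum. The function name and all numerical settings are illustrative and are not taken from the text.

```python
import numpy as np

def kuramoto_order_parameter(K, N=500, dt=0.05, steps=600, seed=0):
    """Integrate the Kuramoto model in the rotating frame and return the
    final magnitude r of the order parameter."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, 1.0, N)            # symmetric, unimodal g(omega)
    theta = rng.uniform(0.0, 2.0 * np.pi, N)   # random initial phases
    for _ in range(steps):
        z = np.mean(np.exp(1j * theta))        # r * exp(i psi)
        r, psi = np.abs(z), np.angle(z)
        # Mean-field identity: (K/N) sum_j sin(theta_j - theta_i)
        # equals K r sin(psi - theta_i).
        theta = theta + dt * (omega + K * r * np.sin(psi - theta))
    return np.abs(np.mean(np.exp(1j * theta)))

r_uncoupled = kuramoto_order_parameter(0.0)   # no coupling
r_coupled = kuramoto_order_parameter(4.0)     # strong coupling
```

With no coupling the phases stay incoherent and the order parameter remains small, whereas for the strong coupling chosen here most oscillators lock to the mean phase and the order parameter approaches unity.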
Figure 12.13: Collective rotational bands in the nucleus $^{238}$U excited by Coulomb excitation. [Sim98]
analogous to the correlated flow of individual water molecules in a tidal wave. The weaker octupole term in
the residual interaction leads to an octupole [pear-shaped] coupled oscillator coherent state lying slightly above
the quadrupole coherent state. In contrast to the rotational motion of strongly-deformed quadrupole-deformed
nuclei, the octupole deformation exhibits more vibrational-like properties than rotational motion of a charged
tidal wave. The observed large increase in moment of inertia at higher rotational frequencies, shown in the
insert, is due to the Coriolis force aligning the individual valence nucleons along the rotational axis. Thus,
although the nucleus $^{238}$U is the epitome of a complicated many-body quantal system, it is apparent that
basic classical mechanics of coupled oscillators, and rotation, underlie the physics phenomena exhibited by
synchronized collective motion in the nuclear many-body system.
The close correspondence between classical mechanics predictions and the excitation phenomena
observed for the $^{238}$U nucleus is surprising for a system that is the epitome of a many-body quantal fluid.
The following list identifies other manifestations of classical mechanics discussed in this book, that play a
role in this experimental study.
1. Coincident detection of the excited nuclei recoiling in vacuum was used to identify the exact scattering
angles, plus recoil velocities, of the scattered nuclei. This specifies the hyperbolic Rutherford trajectory
for each scattered nucleus, the nuclear masses, and their recoil velocities. The deexcitation $\gamma$-rays
emitted in flight by each recoiling nucleus were detected in coincidence with the scattered nuclei. Knowledge
of the recoil velocities and scattering angles enabled correction for the Doppler shift in energy of
each detected coincident $\gamma$-ray.
2. The transition energies and angular distribution of the deexcitation $\gamma$-rays determined the energies,
spins, and parities of the excited states in $^{235}$U.
3. The measured yields of the coincident deexcitation $\gamma$-rays determined the excitation cross section as a
function of the nuclear scattering angle.
12.13. SUMMARY 377
4. A full quantal calculation for this system is beyond the capabilities of modern computers since the
experiment involves excitation of ∼ 100 excited levels, coupled by ∼ 1000 electromagnetic matrix
elements, and the scattering involves inclusion of thousands of partial waves due to the long range of the
Coulomb potential and the heavy mass of the scattered nuclei. Therefore a semi-classical approximation
is used for the quantal calculation of the electromagnetic excitation cross sections as a function of time
as the scattered nuclei traverse Rutherford's hyperbolic Coulomb scattering trajectory for each scattered
nucleus.
5. The measured cross sections for the deexcitation $\gamma$-rays are compared with the predicted cross sections
to determine the ∼ 1000 electromagnetic matrix elements connecting the states in $^{235}$U.
6. The electromagnetic matrix elements have been measured in the laboratory frame of reference.
Much more insight into the collective motion in $^{235}$U is obtained by transforming the electromagnetic
matrix elements into the body-fixed frame of reference for this rotating deformed body. Rotational
invariants, described in chapter 11.16, are used to derive the electromagnetic properties in the rotating
body-fixed frame of reference which unambiguously determines the electromagnetic shape for each excited
nuclear state observed in $^{235}$U.
7. Hamiltonian mechanics, based on the Routhian, is used to make theoretical model calculations
of the nuclear structure of $^{235}$U in the rotating body-fixed frame for comparison with the experimental
data derived from this experiment.
This experiment illustrates that classical mechanics plays a key role in all aspects of the study of the
nuclear structure of this many-body quantal system.
12.13 Summary
This chapter has focussed on many-body coupled linear oscillator systems which are a ubiquitous feature in
nature. The main conclusions are summarized below.
Normal modes: It was shown that coupled linear oscillators exhibit normal modes and normal coordinates
that correspond to independent modes of oscillation with characteristic eigenfrequencies $\omega_r$.
General analytic theory for coupled linear oscillators: Lagrangian mechanics was used to derive the
general analytic procedure for solution of the many-body coupled oscillator problem which reduces to the
conventional eigenvalue problem. A summary of the procedure for solving coupled oscillator problems is as
follows:
1) Choose generalized coordinates $q_j$ and evaluate $T$ and $U$.
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\,\dot{q}_j \dot{q}_k \tag{12.41}$$
and
$$U = \frac{1}{2}\sum_{j,k} V_{jk}\, q_j q_k \tag{12.42}$$
and
$$V_{jk} \equiv \left(\frac{\partial^2 U}{\partial q_j\, \partial q_k}\right)_0 \tag{12.44}$$
4) From the initial conditions determine the complex scale factors where
5) Determine the normal coordinates $\eta_r$ where each $\eta_r$ is a normal mode. The normal coordinates can be
expressed as
$$\boldsymbol{\eta} = \{\mathbf{a}\}^{-1}\,\mathbf{q} \tag{12.61}$$
Few-body coupled oscillator systems: The general analytic theory was used to determine the solutions
for parallel and series couplings of two and three linear oscillators. The phenomena observed include degenerate
and non-degenerate eigenvalues and spurious center-of-mass oscillatory modes. There are two broad
classifications for three or more coupled oscillators, that is, either complete coupling of all oscillators, or
coupling of the nearest-neighbor oscillators. It is observed that the eigenvalue corresponding to the most
coherent, and thus most collective, motion of the coupled oscillators is displaced
the most in energy from the remaining eigenvalues. For some systems this coherent collective mode
corresponds to a center-of-mass motion with no internal excitation of the other modes, while the other
eigenvalues correspond to modes with internal excitation of the oscillators such that the center of mass
is stationary. The above procedure has been applied to two classifications of coupling: complete coupling of
many oscillators, and nearest neighbor coupling. Both degenerate and spurious center-of-mass modes were
observed. Strong collective shape degrees of freedom in nuclei are examples of complete coupling due to the
weak residual interactions between nucleons in the nucleus. It was seen that, for many coupled oscillators,
one coherent state separates from the other states and this coherent state carries the bulk of the collective
strength.
Discrete lattice chain: Transverse and longitudinal modes of motion on the discrete lattice chain were discussed because of the important role the lattice chain plays in nature, such as in crystalline lattice structures. Both normal modes and travelling waves were discussed, including the phenomena of dispersion and cut-off frequencies. Molecules and crystalline lattice chains are examples where nearest-neighbor coupling is manifest. It was shown that, for the $n$-oscillator discrete lattice chain, there are only $n$ independent longitudinal modes plus $2n$ modes for the two transverse polarizations, and that the angular frequency $\omega \le 2\omega_0$, that is, a cut-off frequency exists.
Damped coupled linear oscillators: It was shown that linearly-damped coupled oscillator systems can be solved analytically using the concept of the Rayleigh dissipation function.
Collective synchronization of coupled oscillators: The Kuramoto schematic phase model was used to illustrate how weak residual forces can cause collective synchronization of the motion of many coupled oscillators. This is applicable to biological systems as well as mechanical systems.
12.13. SUMMARY 379
Workshop exercises
1. Consider two masses (each of mass ) connected by a spring to each other and by springs to fixed positions.
Motion is only allowed along one dimension. (This is exactly the same system that is discussed in chapter 152
of the lecture notes on coupled oscillations.) Let each of the two oscillator springs have a force constant and
let the force constant of the coupling spring be 12 . Let 1 and 2 be the coordinates as described in the
textbook.
(a) Draw a picture of the two masses displaced by a small amount. Using the picture, try to make sense of
the equations of motion as given in the text:
2. Two particles, each with mass $m$, move in one dimension in a region near a local minimum of the potential energy where the potential energy is approximately given by
$$U = \frac{1}{2}k\left(7x_1^2 + 4x_2^2 + 4x_1x_2\right)$$
where $k$ is a constant.
5. A mechanical analog of the benzene molecule comprises a discrete lattice chain of 6 point masses connected
in a plane hexagonal ring by 6 identical springs each with spring constant and length .
a) List the wave numbers of the allowed undamped longitudinal standing waves.
b) Calculate the phase velocity and group velocity for longitudinal travelling waves on the ring.
c) Determine the time dependence of a longitudinal standing wave for a angular frequency = 2 , that
is, twice the cut-off frequency.
such that = 2 ,
(a) Determine the eigenfrequencies and normal coordinates.
(b) Choose a set of initial conditions such that the system oscillates at its highest eigenfrequency.
(c) Determine the solutions 1 () and 2 ().
Problems
1. Four identical masses are connected by four identical springs, spring constant and constrained to move
on a frictionless circle of radius as shown on the left in the figure.
a) How many normal modes of small oscillation are there?
b) What are the eigenfrequencies of the small oscillations?
c) Describe the motion of the four masses for each eigenfrequency.
2. Consider the two identical coupled oscillators given on the right in the figure assuming 1 = 2 = . Let both
oscillators be linearly damped with a damping constant . A force = 0 cos() is applied to mass 1 .
Write down the pair of coupled differential equations that describe the motion. Obtain a solution by expressing
the differential equations in terms of the normal coordinates. Show that the normal coordinates 1 and 2
exhibit resonance peaks at the characteristic frequencies 1 and 2 respectively.
3. As shown on the left below, the mass $M$ moves horizontally along a frictionless rail. A pendulum is hung from $M$ with a weightless rod of length $l$ with a mass $m$ at its end.
a) Prove that the eigenfrequencies are
$$\omega_1 = 0, \qquad \omega_2 = \sqrt{\frac{g}{Ml}(M+m)}$$
Chapter 13
13.1 Introduction
In two papers published in 1834 and 1835, Hamilton announced a dynamical principle upon which it is
possible to base all of mechanics, and indeed most of classical physics. Hamilton was seeking a theory of
optics when he developed Hamilton’s Principle, plus the field of Hamiltonian mechanics, both of which play
a pivotal role in classical mechanics.
Hamilton’s Principle is based on defining the action functional$^1$ $S$ for the $n$ generalized coordinates $\mathbf{q}$ and their corresponding velocities $\dot{\mathbf{q}}$.
$$S = \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \tag{13.1}$$
The scalar quantity $S$ is a functional of the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$. In principle, higher order time derivatives of the generalized coordinates could be included, but most systems in classical mechanics are described adequately by including only the generalized coordinates, plus their velocities. Note that the definition of the action functional does not limit the specific form of the Lagrangian. That is, it allows for more general Lagrangians than the standard Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t) = T(\dot{\mathbf{q}},t) - U(\mathbf{q},t)$ that was used throughout chapters 5 − 12. Hamilton stated that the actual trajectory of a mechanical system is given by requiring that the action functional is stationary. Written in terms of a virtual infinitesimal displacement $\delta$, the requirement that the action functional be stationary is
$$\delta S = \delta\int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt = 0 \tag{13.2}$$
Typically this stationary point corresponds to a minimum of the action functional. Applying variational calculus to the action functional leads to the Lagrange equations of motion for the system. That is, Hamilton’s Principle, applied to the Lagrangian function $L(\mathbf{q},\dot{\mathbf{q}},t)$, generates the Lagrangian equations of motion.
$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_j} - \frac{\partial L}{\partial q_j} = 0 \tag{13.3}$$
These Lagrange equations agree with those derived using d’Alembert’s Principle, if the $\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}$ generalized force terms are ignored.
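The stationarity condition can be illustrated numerically. The sketch below, which is an illustration under stated assumptions rather than part of the derivation, discretizes the action for a one-dimensional harmonic oscillator with the standard Lagrangian, and compares the true path with paths displaced by a variation that vanishes at both end points. The unit mass and spring constant, the time interval, the shape of the variation, and all function names are assumptions chosen for the example.

```python
import math

M, K = 1.0, 1.0                    # illustrative mass and spring constant
OMEGA = math.sqrt(K / M)           # natural angular frequency
T_END, N_STEPS = 1.0, 2000         # interval shorter than half a period

def discrete_action(path, dt):
    """Midpoint-rule approximation to S = integral of (M v^2/2 - K x^2/2) dt."""
    total = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        v = (x1 - x0) / dt         # velocity on the segment
        x_mid = 0.5 * (x0 + x1)    # position at the segment midpoint
        total += (0.5 * M * v * v - 0.5 * K * x_mid * x_mid) * dt
    return total

def action_of_variation(eps):
    """Action along x(t) = sin(OMEGA t) + eps * sin(pi t / T_END).
    The added variation vanishes at t = 0 and t = T_END."""
    dt = T_END / N_STEPS
    path = [math.sin(OMEGA * i * dt) + eps * math.sin(math.pi * i * dt / T_END)
            for i in range(N_STEPS + 1)]
    return discrete_action(path, dt)

s_true = action_of_variation(0.0)
eps = 0.1
# Symmetric difference isolates the first-order change in S.
first_order = (action_of_variation(eps) - action_of_variation(-eps)) / (2 * eps)
second_order = action_of_variation(eps) - s_true
print(f"S(true path)       = {s_true:.6f}")
print(f"first-order change = {first_order:.2e}")
print(f"S(eps) - S(0)      = {second_order:.6f}")
```

The first-order change vanishes to within the discretization error, while the remaining change is second order in the size of the variation and positive here, consistent with the stationary point being a minimum for this short time interval.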
Hamilton’s Principle can be considered to be the fundamental postulate of classical mechanics. It replaces Newton’s postulated three laws of motion. As illustrated in chapters 6 − 12, Lagrangian mechanics based on the standard Lagrangian $L = T - U$ provides a remarkably powerful and consistent approach to solving the equations of motion in classical mechanics. This chapter extends the discussion to non-standard Lagrangians.
Chapter 5.12 developed a plausibility argument, based on Newton’s laws of motion, that led to the Lagrange equations of motion using the standard Lagrangian. d’Alembert’s Principle of virtual work was used in chapter 6 to provide a more fundamental derivation of Lagrange’s equations of motion which was based on the standard Lagrangian. An important feature is that Hamilton’s Principle extends Lagrangian mechanics to the use of non-standard Lagrangians.
1 The term action functional often is abbreviated to action. It is called Hamilton’s Principal Function in older texts.
382 CHAPTER 13. HAMILTON’S PRINCIPLE OF LEAST ACTION
Expanding the integrand of $S$ in equation 13.6 gives that, relative to the extremum path $A$, the incremental change in action is
$$\delta S = S_B - S_A = \int_{t_1}^{t_2}\sum_j\left(\frac{\partial L}{\partial q_j}\delta q_j + \frac{\partial L}{\partial \dot q_j}\delta\dot q_j\right)dt + \left[L\Delta t\right]_{t_1}^{t_2} \tag{13.7}$$
The second term in the integral can be integrated by parts since $\delta\dot q_j = \frac{d(\delta q_j)}{dt}$, leading to
$$\delta S = \int_{t_1}^{t_2}\sum_j\left(\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_j}\right)\delta q_j\,dt + \left[\sum_j\frac{\partial L}{\partial \dot q_j}\delta q_j + L\Delta t\right]_{t_1}^{t_2} \tag{13.8}$$
Note that equation 13.8 includes contributions from the entire path of the integral as well as the variations at the ends of the curve and the $\Delta t$ terms. Equation 13.8 leads to the following two pioneering principles of least action in variational mechanics that were developed by Hamilton.
For independent generalized coordinates $q_j$, the integrand in brackets vanishes leading to the Euler-Lagrange equations. Conversely, if the Euler-Lagrange equations in 13.9 are satisfied, then $\delta S = 0$, that is, the path is stationary. This leads to the statement that the path in configuration space between two configurations $\mathbf{q}(t_1)$ and $\mathbf{q}(t_2)$ that the system occupies at times $t_1$ and $t_2$ respectively, is that for which the action is stationary. This is a statement of Hamilton’s Principle.
13.2. PRINCIPLE OF LEAST ACTION 383
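where $\delta q_j$ and $\dot q_j$ are evaluated at $t_1$ and $t_2$. Then equation 13.8 reduces to
$$\delta S = \left[\sum_j\frac{\partial L}{\partial \dot q_j}\delta q_j + L\Delta t\right]_{t_1}^{t_2} = \left[\sum_j\frac{\partial L}{\partial \dot q_j}\Delta q_j + \left(-\sum_j\frac{\partial L}{\partial \dot q_j}\dot q_j + L\right)\Delta t\right]_{t_1}^{t_2} \tag{13.11}$$
Since the generalized momentum $p_j = \frac{\partial L}{\partial \dot q_j}$, then equation 13.11 can be expressed in terms of the Hamiltonian and generalized momentum as
$$\delta S = \left[\sum_j p_j\Delta q_j - H\Delta t\right]_{t_1}^{t_2} = \left[\mathbf{p}\cdot\Delta\mathbf{q} - H\Delta t\right]_{t_1}^{t_2} \tag{13.12}$$
$$p_j = \frac{\partial S}{\partial q_j} \tag{13.13}$$
Equation 13.12 contains Hamilton’s Principle of Least-action. Equation 13.13 gives an alternative relation of the generalized momentum $p_j$ that is in terms of the action functional $S$.
Integrating the action $S$, equation 13.11, between the end points gives the action for the path between $t = t_1$ and $t = t_2$, that is, $S(q_j(t_1), t_1, q_j(t_2), t_2)$ to be
$$S(q_j(t_1), t_1, q_j(t_2), t_2) = \int_{t_1}^{t_2}\left[\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t)\right]dt \tag{13.14}$$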
The integrand in the modified Hamilton’s principle, $I = [\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t)]$, can be used in the Euler-Lagrange equations for $j = 1, 2, 3, \ldots, n$ to give
$$\frac{d}{dt}\left(\frac{\partial I}{\partial \dot q_j}\right) - \frac{\partial I}{\partial q_j} = \dot p_j + \frac{\partial H}{\partial q_j} = 0 \tag{13.16}$$
Thus Hamilton’s principle of least-action leads to Hamilton’s equations of motion, that is, equations 13.16 and 13.17.
The total time derivative of the action $S$, which is a function of the coordinates and time, is
$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \sum_j\frac{\partial S}{\partial q_j}\dot q_j = \frac{\partial S}{\partial t} + \mathbf{p}\cdot\dot{\mathbf{q}} \tag{13.18}$$
Combining equations 13.18 and 13.19 gives the Hamilton-Jacobi equation, which is discussed in chapter 14.5.
$$\frac{\partial S}{\partial t} + H(\mathbf{q},\mathbf{p},t) = 0 \tag{13.20}$$
In summary, Hamilton’s principle of least action led directly to Hamilton’s equations of motion (13.16, 13.17) plus the Hamilton-Jacobi equation (13.20). Note that both Hamilton’s Principle (13.8) and Hamilton’s equations of motion (13.16, 13.17) have been derived directly from Hamilton’s concept of Least Action without explicitly invoking the Lagrangian.
The abbreviated action can be simplified by assuming that the standard Lagrangian $L = T - U$ has a velocity-independent potential $U$; then equation 8.4 gives
$$S_0 \equiv \int_{t_1}^{t_2}\sum_j p_j\dot q_j\,dt = \int_{t_1}^{t_2}(L + H)\,dt = \int_{t_1}^{t_2} 2T\,dt = \int_{\mathbf{q}(t_1)}^{\mathbf{q}(t_2)}\mathbf{p}\cdot d\mathbf{q} \tag{13.23}$$
Abbreviated action provides for use of a simplified form of the principle of least action that is based on the kinetic energy and not potential energy. For conservative systems it determines the path of the motion, but not the time dependence of the motion. Consider virtual motions where the path satisfies energy conservation, and where the end points are held fixed, that is, $\delta q_j = 0$, but allow for a variation $\delta t$ in the final time. Then using equation 13.21
$$\delta S = -H\delta t = -E\delta t \tag{13.24}$$
However, equation 13.21 gives that
$$\delta S = \delta S_0 - E\delta t \tag{13.25}$$
Therefore
$$\delta S_0 = 0 \tag{13.26}$$
That is, the abbreviated action has a minimum with respect to all paths that satisfy the conservation of energy, which can be written as
$$\delta S_0 = \delta\int_{t_1}^{t_2} 2T\,dt = 0 \tag{13.27}$$
Equation 13.27 is called Maupertuis’ least-action principle, which he proposed in 1744 based on Fermat’s Principle in optics. Credit for the formulation of least action commonly is given to Maupertuis; however, the Maupertuis principle is identical to use of least action applied to the "vis viva", as was proposed by Leibniz four decades earlier. Maupertuis used teleological arguments, rather than scientific rigor, because of his limited mathematical capabilities. In 1744 Euler provided a scientifically rigorous argument, presented above, that underlies the Maupertuis principle. Euler derived the correct variational relation for the abbreviated action to be
$$\delta S_0 = \delta\int\sum_j p_j\,dq_j = 0 \tag{13.28}$$
Hamilton’s use of the principle of least action to derive both Lagrangian and Hamiltonian mechanics is a
remarkable accomplishment. It underlies Hamiltonian mechanics and confirmed the conjecture of Mauper-
tuis.
13.3. STANDARD LAGRANGIAN 385
Hamilton extended Lagrangian mechanics by defining Hamilton’s Principle, equation 13.2, which states that a dynamical system follows a path for which the action functional, that is, the time integral of the Lagrangian, is stationary. Chapter 6 showed that using the standard Lagrangian in the action functional leads to the Euler-Lagrange variational equations
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{13.30}$$
The Lagrange multiplier terms handle the holonomic constraint forces and $Q_j^{EXC}$ handles the remaining excluded generalized forces. Chapters 6 − 12 showed that the use of the standard Lagrangian, with the Euler-Lagrange equations (13.3), provides a remarkably powerful and flexible way to derive second-order equations of motion for dynamical systems in classical mechanics.
Note that the Euler-Lagrange equations, expressed solely in terms of the standard Lagrangian (13.29), that is, excluding the $Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$ terms, are valid only under the following conditions:
1. The forces acting on the system, apart from any forces of constraint, must be derivable from scalar
potentials.
2. The equations of constraint must be relations that connect the coordinates of the particles and may
be functions of time, that is, the constraints are holonomic.
The $Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$ terms extend the range of validity of using the standard Lagrangian in the
1. The Lagrangian is indefinite with respect to addition of a constant to the scalar potential which cancels
out when the derivatives in the Euler-Lagrange differential equations are applied.
3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form $L_2 \to L_1 + \frac{d}{dt}[\Lambda(q_i,t)]$ for any differentiable function $\Lambda(q_i,t)$ of the generalized coordinates plus time, that has continuous second derivatives.
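This last statement can be proved by considering a transformation between two related standard Lagrangians of the form
$$L_2(\mathbf{q},\dot{\mathbf{q}},t) = L_1(\mathbf{q},\dot{\mathbf{q}},t) + \frac{d\Lambda(\mathbf{q},t)}{dt} = L_1(\mathbf{q},\dot{\mathbf{q}},t) + \left(\sum_j\frac{\partial\Lambda(\mathbf{q},t)}{\partial q_j}\dot q_j + \frac{\partial\Lambda(\mathbf{q},t)}{\partial t}\right) \tag{13.31}$$
This leads to a standard Lagrangian $L_2$ that has the same equations of motion as $L_1$, as is shown by substituting equation 13.31 into the Euler-Lagrange equations. That is,
$$\frac{d}{dt}\left(\frac{\partial L_2}{\partial \dot q_j}\right) - \frac{\partial L_2}{\partial q_j} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot q_j}\right) - \frac{\partial L_1}{\partial q_j} + \frac{d}{dt}\frac{\partial\Lambda(\mathbf{q},t)}{\partial q_j} - \frac{\partial}{\partial q_j}\frac{d\Lambda(\mathbf{q},t)}{dt} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot q_j}\right) - \frac{\partial L_1}{\partial q_j} \tag{13.32}$$
Thus even though $L_1$ and $L_2$ are different, they are completely equivalent in that they generate identical equations of motion.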
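A numerical check can make this equivalence concrete. The sketch below assumes a unit-mass, unit-frequency oscillator for $L_1$ and a hypothetical gauge function $\Lambda(q,t) = q^2 t$, both chosen only for illustration. It evaluates the action for $L_1$ and for $L_2 = L_1 + d\Lambda/dt$ along two different trial paths with common end points. If the two Lagrangians are equivalent, the difference in action should equal the boundary value of $\Lambda$ for every path, so it cannot affect which path is stationary.

```python
def action(lagrangian, path, t_start, t_end, n_steps=4000):
    """Midpoint-rule approximation to the action along path(t)."""
    dt = (t_end - t_start) / n_steps
    total = 0.0
    for i in range(n_steps):
        t0 = t_start + i * dt
        q0, q1 = path(t0), path(t0 + dt)
        q_mid, v_mid, t_mid = 0.5 * (q0 + q1), (q1 - q0) / dt, t0 + 0.5 * dt
        total += lagrangian(q_mid, v_mid, t_mid) * dt
    return total

def lagrangian_1(q, v, t):
    """Standard Lagrangian T - U of a unit-mass, unit-frequency oscillator."""
    return 0.5 * v * v - 0.5 * q * q

def gauge_function(q, t):
    """Hypothetical gauge function Lambda(q, t) = q^2 t."""
    return q * q * t

def lagrangian_2(q, v, t):
    """L2 = L1 + dLambda/dt, where dLambda/dt = 2 q v t + q^2."""
    return lagrangian_1(q, v, t) + 2.0 * q * v * t + q * q

# Two different trial paths sharing the end points q(0) = 0 and q(1) = 1.
paths = {"linear": lambda t: t, "quadratic": lambda t: t * t}
boundary_term = gauge_function(1.0, 1.0) - gauge_function(0.0, 0.0)
shifts = {}
for name, path in paths.items():
    shifts[name] = (action(lagrangian_2, path, 0.0, 1.0)
                    - action(lagrangian_1, path, 0.0, 1.0))
    print(f"{name:9s}: S2 - S1 = {shifts[name]:.6f}, boundary term = {boundary_term}")
```

Both paths give the same shift, which depends only on the end points and not on the route taken between them.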
There is an unlimited range of equivalent standard Lagrangians that all lead to the same equations of
motion and satisfy the requirements of the Lagrangian. That is, there is no unique choice among the wide
range of equivalent standard Lagrangians expressed in terms of generalized coordinates. This discussion is
an example of gauge invariance in physics.
Modern theories in physics describe reality in terms of potential fields. Gauge invariance, which also is
called gauge symmetry, is a property of field theory for which different underlying fields lead to identical
observable quantities. Well-known examples are the static electric potential field and the gravitational
potential field where any arbitrary constant can be added to these scalar potentials with zero impact on the
observed static electric field or the observed gravitational field. Gauge theories constrain the laws of physics
in that the impact of gauge transformations must cancel out when expressed in terms of the observables.
Gauge symmetry plays a crucial role in both classical and quantal manifestations of field theory, e.g. it is
the basis of the Standard Model of electroweak and strong interactions.
Equivalent Lagrangians are a clear manifestation of gauge invariance as illustrated by equations 13.31 and 13.32, which show that adding any total time derivative of a scalar function $\Lambda(\mathbf{q},t)$ to the Lagrangian has no observable consequences on the equations of motion. That is, although addition of the total time derivative of the scalar function $\Lambda(\mathbf{q},t)$ changes the value of the Lagrangian, it does not change the equations of motion for the observables derived using equivalent standard Lagrangians.
In Lagrangian formulations of classical mechanics, the gauge invariance is readily apparent by direct
inspection of the Lagrangian.
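$$\mathbf{B} = \nabla\times\mathbf{A}$$
$$\mathbf{E} = -\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t}$$
The equations of motion for a charge $q$ in an electromagnetic field can be obtained by using the Lagrangian
$$L = \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} - q(\Phi - \mathbf{A}\cdot\mathbf{v})$$
Consider the transformations $(\mathbf{A},\Phi) \to (\mathbf{A}',\Phi')$ in the transformed Lagrangian $L'$ where
$$\mathbf{A}' = \mathbf{A} + \nabla\Lambda(\mathbf{r},t)$$
$$\Phi' = \Phi - \frac{\partial\Lambda(\mathbf{r},t)}{\partial t}$$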
13.5. NON-STANDARD LAGRANGIANS 387
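$$\mathbf{E}' = -\nabla\Phi' - \frac{\partial\mathbf{A}'}{\partial t} = -\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t} = \mathbf{E}$$
$$\mathbf{B}' = \nabla\times\mathbf{A}' = \nabla\times\mathbf{A} = \mathbf{B}$$
That is, the additive terms due to the scalar field $\Lambda(\mathbf{r},t)$ cancel. Thus the electromagnetic force fields following a gauge-invariant transformation are shown to be identical, in agreement with what is inferred directly by inspection of the Lagrangian.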
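The connection to equivalent Lagrangians can be written out in a few lines. The following is a sketch of the algebra, assuming the Lagrangian $L = \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} - q(\Phi - \mathbf{A}\cdot\mathbf{v})$ and a gauge function $\Lambda(\mathbf{r},t)$ evaluated along the trajectory, for which $d\Lambda/dt = \partial\Lambda/\partial t + \mathbf{v}\cdot\nabla\Lambda$.

```latex
\begin{align*}
L' &= \tfrac{1}{2} m\,\mathbf{v}\cdot\mathbf{v} - q\left(\Phi' - \mathbf{A}'\cdot\mathbf{v}\right) \\
   &= \tfrac{1}{2} m\,\mathbf{v}\cdot\mathbf{v}
      - q\left(\Phi - \frac{\partial\Lambda}{\partial t}
      - \mathbf{A}\cdot\mathbf{v} - \nabla\Lambda\cdot\mathbf{v}\right) \\
   &= L + q\left(\frac{\partial\Lambda}{\partial t} + \mathbf{v}\cdot\nabla\Lambda\right)
    = L + q\,\frac{d\Lambda(\mathbf{r},t)}{dt}
\end{align*}
```

The transformed Lagrangian therefore differs from the original only by the total time derivative of $q\Lambda(\mathbf{r},t)$, which is the form of equivalent Lagrangian that leaves the equations of motion unchanged.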
2. For the special case of linear dissipation, it is possible to use the Rayleigh dissipation function
$$\mathcal{F} \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\,\dot q_i\dot q_j \tag{13.33}$$
3. Extensions of Lagrangian mechanics using non-standard Lagrangians can be used that build dissipation directly into the Lagrangian. This can allow exploitation of Lagrangian mechanics for a wide range of dissipative systems.
The use of non-standard Lagrangians is based on the inverse variational problem where known second-
order equations of motion, plus the inverse variational approach, are used to derive a Lagrangian or Hamil-
tonian that generates the assumed equations of motion. Non-standard Lagrangians can have very different
functional dependences on $\dot{\mathbf{q}}$, $\mathbf{q}$, and $t$ compared with standard Lagrangians, and yet still can lead to the
required equations of motion, the generalized momenta, and the corresponding Hamiltonian, needed to solve
problems in classical mechanics. The reason for exploring the capabilities of use of non-standard Lagrangians
is that they have the potential to eliminate some of the limitations endemic to Lagrangian and Hamiltonian
mechanics.
Dissipation plays a prominent role in the burgeoning field of non-linear dynamical systems in classical
mechanics. This prominence has stimulated recent studies of the applicability of standard, and non-standard,
Lagrangians to a wide range of dissipative dynamical systems. Musielak et al, and others, [Mus08a, Mus08b,
Cei10] considered dynamical systems that were described by equations of motion with first-order time-
derivative dissipative terms of even and odd powers, and coefficients varying in time or space. They found
that there are at least three different classes of equations of motion, two of which use standard Lagrangians
and can be classified as general. However, the third class is special in that it can be derived only using non-
standard Lagrangians. Each general class has a subset of equations with non-standard Lagrangians. The
existence of standard Lagrangians is limited to equations of motion with either time-dependent coefficients
plus linear dissipative terms, or space-dependent coefficients and quadratic dissipative terms. However, the
equations of motion that can be derived from non-standard Lagrangians are restricted by conditions that must
be satisfied by the coefficients and functions of these equations. Although these non-standard Lagrangians
may have restricted applicability, they do provide hope that such techniques can be used to broaden the scope
of problems that can be addressed using the basic Lagrangian and Hamiltonian mechanics formalisms. Note
that, even though Lagrange published his treatise on analytical mechanics in 1788, fundamental problems
remain to be solved in order to attain the full potential capabilities of analytical mechanics.
13.8. LINEAR VELOCITY-DEPENDENT DISSIPATION 389
1. The equations of motion of a conservative linear dynamical system are given by a variational principle
only if the masses of the system are constant.
2. The equations of motion of a dissipative linear dynamical system are given by a variational principle if,
and only if, the dissipation coefficients are identically equal to the rates of change of the corresponding
masses.
Bateman[Bat31] pointed out that an isolated dissipative system is physically incomplete, that is, a com-
plete system must comprise at least two coupled subsystems where energy is transferred from a dissipating
subsystem to an absorbing subsystem. A complete system should comprise both the dissipating and ab-
sorbing systems to ensure that the total system Lagrangian and Hamiltonian are conserved, as is assumed
in conventional Lagrangian and Hamiltonian mechanics. Both Bateman and Dekker[Dek75] have illustrated
that the equations of motion for a linearly-damped, free, one-dimensional harmonic oscillator are derivable
using the Hamilton variational principle via introduction of a fictitious complementary subsystem that ab-
sorbs the energy, and is a function of a second variable that mirrors the function of the variable for the
dissipative subsystem of interest.
Example 13.2 illustrates that the linearly-damped, linear oscillator may be handled by three alternative equivalent non-standard Lagrangians that assume either: (1) a multidimensional system, (2) explicit time dependent Lagrangians and Hamiltonians, or (3) complex non-standard Lagrangians, to generate the equations of motion.
Similarly, minimizing by variation of the primary variable $x$, that is, $\Lambda_x L = 0$, leads to the uncoupled equation of motion for $y$
$$\frac{m}{2}\left[\ddot y - \Gamma\dot y + \omega_0^2 y\right] = 0$$
Note that the equation of motion for $x$, which was obtained by variation of the auxiliary variable $y$, corresponds to that for the usual free, linearly-damped, one-dimensional harmonic oscillator for the $x$ variable which dissipates energy, as is discussed in chapter 3.5. The equation of motion for $y$ is obtained by variation of the primary variable $x$ and corresponds to a free, linear, one-dimensional oscillator for the $y$ variable that is absorbing the energy dissipated by the dissipating system.
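The generalized momenta,
$$p_j \equiv \frac{\partial L}{\partial \dot q_j}$$
can be used to derive the corresponding Hamiltonian
$$H(x, p_x, y, p_y) = \left[p_x\dot x + p_y\dot y - L\right] = \frac{2p_xp_y}{m} - \frac{\Gamma}{2}\left[xp_x - yp_y\right] + \frac{m}{2}\left(\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2\right)xy$$
Note that this Hamiltonian is time independent, and thus is conserved for this complete dual-variable system. Using Hamilton’s equations of motion gives the same two uncoupled equations of motion as obtained using the Lagrangian, that is, the damped equation for $x$ and the complementary equation for $y$.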
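The conservation of the dual-variable Hamiltonian can be checked numerically. The sketch below assumes the dual Lagrangian $L = \frac{m}{2}\left[\dot x\dot y - \frac{\Gamma}{2}(y\dot x - x\dot y) - \omega_0^2xy\right]$, whose normalization is an assumption made for this illustration, together with the Hamiltonian that follows from it. The parameter values, the initial conditions, and the Runge-Kutta integrator are likewise illustrative choices.

```python
import math

M, GAMMA, OMEGA0 = 1.0, 0.4, 2.0   # illustrative mass, damping, natural frequency
C = 0.5 * M * (OMEGA0**2 - 0.25 * GAMMA**2)

def hamiltonian(state):
    """Assumed dual-variable Hamiltonian; it has no explicit time dependence."""
    x, y, px, py = state
    return 2.0 * px * py / M - 0.5 * GAMMA * (x * px - y * py) + C * x * y

def derivatives(state):
    """Hamilton's equations: dq/dt = dH/dp and dp/dt = -dH/dq."""
    x, y, px, py = state
    return (2.0 * py / M - 0.5 * GAMMA * x,   # dx/dt  =  dH/dpx
            2.0 * px / M + 0.5 * GAMMA * y,   # dy/dt  =  dH/dpy
            0.5 * GAMMA * px - C * y,         # dpx/dt = -dH/dx
            -0.5 * GAMMA * py - C * x)        # dpy/dt = -dH/dy

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, 0.5 * dt))
    k3 = derivatives(shifted(state, k2, 0.5 * dt))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# x = y = 1 with zero initial velocities, which fixes the initial momenta.
state = (1.0, 1.0, -0.25 * M * GAMMA, 0.25 * M * GAMMA)
h_start = hamiltonian(state)
dt, n_steps = 1.0e-3, 5000
for _ in range(n_steps):
    state = rk4_step(state, dt)
h_end = hamiltonian(state)

# Analytic damped solution for x with x(0) = 1 and zero initial velocity.
t_final = dt * n_steps
omega = math.sqrt(OMEGA0**2 - 0.25 * GAMMA**2)
x_exact = math.exp(-0.5 * GAMMA * t_final) * (
    math.cos(omega * t_final) + 0.5 * GAMMA / omega * math.sin(omega * t_final))
print(f"H(start) = {h_start:.10f}, H(end) = {h_end:.10f}")
print(f"x(t) numeric = {state[0]:.8f}, analytic = {x_exact:.8f}, y(t) = {state[1]:.8f}")
```

In this sketch the $x$ amplitude decays while the $y$ amplitude grows, yet the Hamiltonian of the combined system stays constant to within the integration error.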
2: Time-dependent Lagrangian:
The complementary subsystem of the above dual-component Lagrangian, that is added to the primary dissipative subsystem, is the adjoint to the equations for the primary subsystem of interest. In some cases, a set of the solutions of the complementary equations can be expressed in terms of the solutions of the primary subsystem, allowing the equations of motion to be expressed solely in terms of the variables of the primary subsystem. Inspection of the solutions of the damped harmonic oscillator, presented in chapter 3.5, implies that $x$ and $y$ must be related by the function
$$y = xe^{\Gamma t}$$
Therefore Bateman proposed a time-dependent, non-standard Lagrangian $L_2$ of the form
$$L_2 = \frac{m}{2}e^{\Gamma t}\left[\dot x^2 - \omega_0^2x^2\right]$$
This Lagrangian corresponds to a harmonic oscillator for which the mass $m = m_0e^{\Gamma t}$ is accreting exponentially with time in order to mimic the exponential energy dissipation. Use of this Lagrangian in the Euler-Lagrange equations gives the equation of motion
$$me^{\Gamma t}\left[\ddot x + \Gamma\dot x + \omega_0^2x\right] = 0$$
If the factor outside of the bracket is non-zero, then the expression in the bracket must be zero. The expression in the bracket is the required equation of motion for the linearly-damped linear oscillator. This Lagrangian generates a generalized momentum of
$$p_x = me^{\Gamma t}\dot x$$
and the Hamiltonian is
$$H_2 = p_x\dot x - L_2 = \frac{p_x^2}{2m}e^{-\Gamma t} + \frac{m}{2}\omega_0^2e^{\Gamma t}x^2$$
The Hamiltonian is time dependent as expected. This leads to Hamilton’s equations of motion
$$\dot x = \frac{\partial H_2}{\partial p_x} = \frac{p_x}{m}e^{-\Gamma t}$$
$$-\dot p_x = \frac{\partial H_2}{\partial x} = m\omega_0^2e^{\Gamma t}x$$
Taking the total time derivative of the first of these equations, and using the second to substitute for $\dot p_x$, gives
$$me^{\Gamma t}\left[\ddot x + \Gamma\dot x + \omega_0^2x\right] = 0$$
If the term $me^{\Gamma t}$ is non-zero, then the term in brackets is zero. The term in the bracket is the usual equation of motion for the linearly-damped harmonic oscillator.
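As a consistency check on the time-dependent formulation, the sketch below takes the familiar underdamped solution of $\ddot x + \Gamma\dot x + \omega_0^2x = 0$, forms the canonical momentum $p_x = me^{\Gamma t}\dot x$, and evaluates the residual of Hamilton's equation $-\dot p_x = m\omega_0^2e^{\Gamma t}x$ by finite differences. The parameter values, the initial conditions $x(0) = 1$, $\dot x(0) = 0$, the step size, and the function names are assumptions made for this illustration.

```python
import math

M, GAMMA, OMEGA0 = 1.0, 0.4, 2.0                 # illustrative parameters
OMEGA = math.sqrt(OMEGA0**2 - 0.25 * GAMMA**2)   # underdamped angular frequency

def x_of_t(t):
    """Free underdamped solution with x(0) = 1 and zero initial velocity."""
    return math.exp(-0.5 * GAMMA * t) * (
        math.cos(OMEGA * t) + 0.5 * GAMMA / OMEGA * math.sin(OMEGA * t))

def ddt(f, t, h=1.0e-4):
    """Central finite-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def p_of_t(t):
    """Canonical momentum p_x = m exp(Gamma t) dx/dt."""
    return M * math.exp(GAMMA * t) * ddt(x_of_t, t)

def hamilton_residual(t):
    """dp/dt + m omega0^2 exp(Gamma t) x, which vanishes for a true solution."""
    return ddt(p_of_t, t) + M * OMEGA0**2 * math.exp(GAMMA * t) * x_of_t(t)

residuals = [abs(hamilton_residual(0.5 * i)) for i in range(1, 11)]
print(f"largest residual on 0.5 <= t <= 5: {max(residuals):.2e}")
```

The residual is limited only by the finite-difference error, which supports the statement that the time-dependent Hamiltonian reproduces the usual damped oscillator motion even though the canonical momentum grows with the factor $e^{\Gamma t}$.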
3: Complex Lagrangian:
Dekker proposed use of complex dynamical variables for solving the linearly-damped harmonic oscillator.
It exploits the fact that, in principle, each second order differential equation can be expressed in terms of
a set of first-order differential equations. This feature is the essential difference between Lagrangian and
Hamiltonian mechanics. Let $z$ be complex and assume it can be expressed in the form of a real variable $x$ as
$$z = \dot x - i\left(\omega + i\frac{\Gamma}{2}\right)x$$
$$\ddot x + \Gamma\dot x + \omega_0^2x = 0$$
This is the desired equation of motion for the linearly-damped harmonic oscillator. This result also can be
shown by taking the time derivative of equation () and taking only the real part, i.e.
µ ¶
Γ Γ
̈ + ̇ + ̇ = ̈ + − ̇ + Γ̇ = ̈ + Γ̇ + 20 = 0 ()
2 2
= ̃ = ()
̇ ̇ ∗
The above Lagrangian plus canonically conjugate momenta lead to the complementary Hamiltonians
µ ¶
∗ Γ
( ̃ ) = + (̃∗ ∗ − ) ()
2
µ ¶
∗ Γ
̃ ( ̃ ) = − (̃∗ ∗ − ) ()
2
These Hamiltonians give Hamilton’s equations of motion that lead to the correct equations of motion for $z$ and $z^*$.
The above examples have shown that three very different, non-standard, Lagrangians, plus their corre-
sponding Hamiltonians, all lead to the correct equation of motion for the linearly-damped harmonic oscilla-
tor. This illustrates the power of using non-standard Lagrangians to describe dissipative motion in classical
mechanics. However, postulating non-standard Lagrangians to produce the required equations of motion
appears to be of questionable usefulness. A fundamental approach is needed to build a firm foundation upon
which non-standard Lagrangian mechanics can be based. Non-standard Lagrangian mechanics remains an
active, albeit narrow, frontier of classical mechanics.
13.9 Summary
This chapter introduced Hamilton’s use of least action to derive Hamilton’s Principle, and its application to
Lagrangian and Hamiltonian mechanics. Gauge invariance of the Lagrangian was discussed. The concept of
alternative standard, and non-standard, Lagrangians was introduced and their applicability was illustrated.
The following summarizes the conclusions.
Hamilton’s Principle: Hamilton’s Principle is based on use of variational calculus to determine the equations of motion for which the action functional $S$ has a stationary solution, where
$$S = \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \tag{13.1}$$
That is
$$\delta S = \delta\int_{t_1}^{t_2} L\,dt = 0 \tag{13.2}$$
Hamilton’s Principle of least action leads directly to the Lagrange-Euler equations without assuming that the Lagrangian is of the standard form. That is, Hamilton’s Principle allows for a wide range of allowable functional forms for the Lagrangian.
Hamilton’s Principle leads to a direct relation between the generalized momentum and the action.
$$p_j = \frac{\partial S}{\partial q_j} \tag{13.13}$$
It was shown that Hamilton’s Principle of least action predicts Hamilton’s equations of motion
$$\dot p_j + \frac{\partial H}{\partial q_j} = 0 \qquad -\dot q_j + \frac{\partial H}{\partial p_j} = 0$$
In addition, it predicts the Hamilton-Jacobi equation.
$$\frac{\partial S}{\partial t} + H(\mathbf{q},\mathbf{p},t) = 0 \tag{13.20}$$
Gauge invariance of the standard Lagrangian: It was shown that there is a continuum of equivalent
standard Lagrangians that lead to the same set of equations of motion for a system. This feature is related
to gauge invariance in mechanics. The following transformations change the standard Lagrangian, but leave
the equations of motion unchanged.
1. The Lagrangian is indefinite with respect to addition of a constant to the scalar potential which cancels
out when the derivatives in the Euler-Lagrange differential equations are applied.
2. Similarly the Lagrangian is indefinite with respect to addition of a constant kinetic energy.
3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form $L \to L + \frac{d}{dt}[\Lambda(q_i,t)]$ for any differentiable function $\Lambda(q_i,t)$ of the generalized coordinates, plus time, that has continuous second derivatives.
Non-standard Lagrangians: The flexibility and power of Lagrangian mechanics can be extended to a
broader range of dynamical systems by employing an extended definition of the Lagrangian that is allowed
by Hamilton’s variational action principle, equation 13.2. It was illustrated that the inverse variational
calculus formalism can be used to identify non-standard Lagrangians that generate the required equations
of motion. These non-standard Lagrangians can be very different from the standard Lagrangian and do not
separate into kinetic and potential energy components. These alternative Lagrangians can be used to handle
dissipative systems which are beyond the range of validity when using standard Lagrangians. That is, it
was shown that several very different Lagrangians and Hamiltonians can be equivalent for generating useful
equations of motion of a system. Currently the use of non-standard Lagrangians is a narrow, but active,
frontier of classical mechanics.
Chapter 14
14.1 Introduction
This study of classical mechanics has involved climbing a vast mountain of knowledge, while the pathway
to the top has led us to elegant and beautiful theories that underlie much of modern physics. Being so
close to the summit provides the opportunity to take a few extra steps in order to glimpse at applications of
variational techniques to physics at the summit. These are described next in chapters 14 − 17.
Hamilton’s development of Hamiltonian mechanics in 1834 is the crowning achievement for applying vari-
ational principles to classical mechanics. A fundamental advantage of Hamiltonian mechanics is that it uses
the conjugate coordinates $\mathbf{q}$, $\mathbf{p}$ plus time $t$, which is a considerable advantage in most branches of physics
and engineering. Compared to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader
arsenal of powerful techniques that can be exploited to obtain an analytical solution of the integrals of the
motion for complicated systems. In addition, Hamiltonian dynamics provides a means of determining the
unknown variables for which the solution assumes a soluble form, and is ideal for study of the fundamental
underlying physics in applications to fields such as quantum or statistical physics. As a consequence, Hamil-
tonian mechanics is the preeminent variational approach used in modern physics. This chapter introduces
the following four techniques in Hamiltonian mechanics: (1) the elegant Poisson bracket representation of
Hamiltonian mechanics, which played a pivotal role in the development of quantum theory; (2) the pow-
erful Hamilton-Jacobi theory coupled with Jacobi’s development of canonical transformation theory; (3)
action-angle variable theory; and (4) canonical perturbation theory.
Prior to further development of the theory of Hamiltonian mechanics, it is useful to summarize the major formulae relevant to Hamiltonian mechanics that have been presented in chapters 7, 8, and 13.
Action functional $S$:
As discussed in chapter 13.2, Hamiltonian mechanics is built upon Hamilton’s action functional
$$S(\mathbf{q},\mathbf{p},t) = \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \tag{14.1}$$
Generalized momentum $p$:
In chapter 7.2, the generalized (canonical) momentum was defined in terms of the Lagrangian $L$ to be
$$p_j \equiv \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot q_j} \tag{14.3}$$
Chapter 13.2 defined the generalized momentum in terms of the action functional $S$ to be
$$p_j = \frac{\partial S(\mathbf{q},\mathbf{p},t)}{\partial q_j} \tag{14.4}$$
394 CHAPTER 14. ADVANCED HAMILTONIAN MECHANICS
Hamiltonian function:
The Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ was defined in terms of the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ plus the generalized momentum. That is
$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j p_j\dot q_j - L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{14.6}$$
where $\mathbf{p}$, $\mathbf{q}$ correspond to $n$-dimensional vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, and the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_i p_i\dot q_i$. Chapter 8.2 used a Legendre transformation to derive this relation between the Hamiltonian and Lagrangian functions. Note that whereas the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus conjugate velocities $\dot{\mathbf{q}}$, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus their conjugate momenta $\mathbf{p}$. For scleronomic systems, plus assuming the standard Lagrangian, equations 7.44 and 7.29 give that the Hamiltonian simplifies to equal the total mechanical energy, that is, $H = T + U$.
Generalized energy theorem:
The equations of motion lead to the generalized energy theorem which states that the time dependence of the Hamiltonian is related to the time dependence of the Lagrangian.
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j\dot q_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{14.7}$$
Note that if all the generalized non-potential forces and Lagrange multiplier terms are zero, and if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.
Hamilton’s equations of motion:
Chapter 8.3 showed that a Legendre transform plus the Lagrange-Euler equations led to Hamilton’s equations of motion. Hamilton derived these equations of motion directly from the action functional, as shown in chapter 13.2.
$$\dot q_j = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial p_j} \tag{14.8}$$
$$\dot p_j = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right] \tag{14.9}$$
$$\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial t} = -\frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{14.10}$$
Note the symmetry of Hamilton’s two canonical equations. The canonical variables $p_j$, $q_j$ are treated as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics. Hamilton’s equations give $2n$ first-order differential equations for $p_j$, $q_j$ for each of the $n$ degrees of freedom. Lagrange’s equations give $n$ second-order differential equations for the variables $q_j$, $\dot q_j$.
Hamilton-Jacobi equation:
Hamilton used Hamilton's Principle to derive the Hamilton-Jacobi equation
$$\frac{\partial S}{\partial t} + H(\mathbf{q},\mathbf{p},t) = 0 \tag{14.11}$$
The solution of Hamilton's equations is trivial if the Hamiltonian is a constant of motion, or when a set of
generalized coordinates can be identified for which all the coordinates are constant, or are cyclic (also called
ignorable coordinates). Jacobi developed the mathematical framework of canonical transformations required
to exploit the Hamilton-Jacobi equation.
14.2. POISSON BRACKET REPRESENTATION OF HAMILTONIAN MECHANICS 395
Note that the above definition of the Poisson bracket leads to the following identity, antisymmetry, linearity,
Leibniz rules, and Jacobi identity.
$$[F, F] = 0 \tag{14.13}$$
$$[F, G] = -[G, F]$$
$$[G, F + Y] = [G, F] + [G, Y]$$
$$[G, FY] = [G, F]\,Y + F\,[G, Y]$$
$$[F,[G,Y]] + [G,[Y,F]] + [Y,[F,G]] = 0 \tag{14.17}$$
where $F$, $G$, and $Y$ are functions of the canonical variables plus time. Jacobi's identity (14.17) states that
the sum of the cyclic permutations of the double Poisson brackets of three functions is zero. Jacobi's identity
plays a useful role in Hamiltonian mechanics as will be shown.
Note that the Poisson bracket is antisymmetric under interchange of $p$ and $q$. It is interesting that the only
non-zero fundamental Poisson bracket is for conjugate variables where $j = k$, that is, $[q_j, p_j] = -[p_j, q_j] = 1$.
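Let $F = Q_k$ and replace $G$ by $F$, and use the fact that the fundamental Poisson brackets $[Q_k, Q_j]_{qp} = 0$
and $[Q_k, P_j]_{qp} = \delta_{kj}$; then equation 14.25 reduces to
$$[Q_k, F]_{qp} = \sum_j\left(\frac{\partial F}{\partial Q_j}[Q_k, Q_j]_{qp} + \frac{\partial F}{\partial P_j}[Q_k, P_j]_{qp}\right) = \sum_j \frac{\partial F}{\partial P_j}\delta_{kj} \tag{14.28}$$
That is
$$[F, Q_k]_{qp} = -\frac{\partial F}{\partial P_k} \tag{14.29}$$
Similarly
$$[P_k, F]_{qp} = \sum_j\left(\frac{\partial F}{\partial Q_j}[P_k, Q_j]_{qp} + \frac{\partial F}{\partial P_j}[P_k, P_j]_{qp}\right) \tag{14.30}$$
leading to
$$[F, P_k]_{qp} = \frac{\partial F}{\partial Q_k} \tag{14.31}$$
Substituting equations (14.29) and (14.31) into equation (14.27) gives
$$[F, G]_{qp} = \sum_k\left(\frac{\partial F}{\partial Q_k}\frac{\partial G}{\partial P_k} - \frac{\partial F}{\partial P_k}\frac{\partial G}{\partial Q_k}\right) = [F, G]_{QP} \tag{14.32}$$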
Thus the canonical variable subscripts $(q,p)$ and $(Q,P)$ can be ignored since the Poisson bracket is
invariant to any canonical transformation of canonical variables. The converse is that if the Poisson
bracket is independent of the transformation then the transformation is canonical.
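Since it has been shown that this transformation is canonical, it is possible to go further and determine
the function that generates this transformation. Solving the transformation equations for $q$ and $P$ gives
$$q = \left(e^{Q} - 1\right)^2\sec^2 p \qquad P = 2e^{Q}\left(e^{Q} - 1\right)\tan p$$
Since the transformation is canonical, there exists a generating function $F_3(p,Q)$ such that
$$q = -\frac{\partial F_3}{\partial p} \qquad P = -\frac{\partial F_3}{\partial Q}$$
Integrating the first relation with respect to $p$ at fixed $Q$ gives $F_3(p,Q) = -\left(e^{Q} - 1\right)^2\tan p + f(Q)$,
and the second relation then requires $df/dQ = 0$, so that, up to an irrelevant additive constant,
$$F_3(p,Q) = -\left(e^{Q} - 1\right)^2\tan p$$
This example illustrates how to determine a useful generating function and prove that the transformation is
canonical.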
These two Poisson Brackets for three functions can be used to derive the Poisson Bracket of four functions,
taken in pairs. This can be accomplished two ways using either equation 14.33 or 14.34.
These two alternate derivations give different relations for the same Poisson Bracket. Equating the alternative
equations 14.35 and 14.36 gives that
This can be separated into two relations, the left-hand side for body 1 and the right-hand side for body
2.
$$\frac{(u_1v_1 - v_1u_1)}{[u_1, v_1]} = \frac{(u_2v_2 - v_2u_2)}{[u_2, v_2]} = \lambda \tag{14.37}$$
Since the left-hand ratio holds for $u_1, v_1$ independent of $u_2, v_2$, and vice versa, both ratios must equal
a constant $\lambda$ that does not depend on $u_1, v_1$, does not depend on $u_2, v_2$, and must commute with
$(u_1v_1 - v_1u_1)$. That is, $\lambda$ must be a constant number independent of these variables.
$$(u_1v_1 - v_1u_1) = \lambda[u_1, v_1] \equiv \lambda\sum_i\left(\frac{\partial u_1}{\partial q_i}\frac{\partial v_1}{\partial p_i} - \frac{\partial u_1}{\partial p_i}\frac{\partial v_1}{\partial q_i}\right) \tag{14.38}$$
Equation 14.38 is an especially important result which states that, to within a multiplicative constant number
$\lambda$, there is a one-to-one correspondence between the Poisson Bracket and the commutator of two independent
functions. An important implication is that if two functions $u$, $v$ have a Poisson Bracket that is zero, then
the commutator of the two functions also must be zero, that is, $u$ and $v$ commute.
Consider the special case where the variables $u_1$ and $v_1$ correspond to the fundamental canonical
variables $(q_j, p_k)$. Then the commutators of the fundamental canonical variables are given by
$$q_jp_k - p_kq_j = \lambda[q_j, p_k] = \lambda\delta_{jk} \tag{14.39}$$
$$q_jq_k - q_kq_j = \lambda[q_j, q_k] = 0 \tag{14.40}$$
$$p_jp_k - p_kp_j = \lambda[p_j, p_k] = 0 \tag{14.41}$$
In 1925, Paul Dirac, a 23-year old graduate student at Cambridge, recognized that the formal correspondence
between the Poisson bracket in classical mechanics, and the corresponding commutator, provides a logical
and consistent way to bridge the chasm between the Hamiltonian formulation of classical mechanics, and
quantum mechanics. He realized that making the assumption that the constant $\lambda \equiv i\hbar$ leads to Heisenberg's
fundamental commutation relations in quantum mechanics, as is discussed in chapter 17.3.2. Assuming that
$\lambda \equiv i\hbar$ provides a logical and consistent way to build quantization directly into classical mechanics, rather
than using the ad-hoc, case-dependent hypotheses of the older quantum theory of Bohr.
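Time dependence:
The total time differential of a function $G(q_i, p_i, t)$ is defined by
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \sum_i\left(\frac{\partial G}{\partial q_i}\dot{q}_i + \frac{\partial G}{\partial p_i}\dot{p}_i\right) \tag{14.42}$$
Inserting Hamilton's canonical equations $\dot{q}_i = \partial H/\partial p_i$ and $\dot{p}_i = -\partial H/\partial q_i$ turns the sum into a Poisson bracket, that is
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{14.45}$$
This important equation states that the total time derivative of any function $G(q_i, p_i, t)$ can be expressed in
terms of the partial time derivative plus the Poisson bracket of $G(q_i, p_i, t)$ with the Hamiltonian.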
Any observable $G(q_i, p_i, t)$ will be a constant of motion if $dG/dt = 0$, and thus equation (14.45) gives
$$\frac{\partial G}{\partial t} + [G, H] = 0 \qquad \text{(if } G \text{ is a constant of motion)}$$
That is, it is a constant of motion when
$$\frac{\partial G}{\partial t} = [H, G] \tag{14.46}$$
Moreover, this can be extended further to the statement that if the constant of motion $G$ is not explicitly
time dependent then
$$[G, H] = 0 \tag{14.47}$$
The Poisson bracket with the Hamiltonian is zero for a constant of motion $G$ that is not explicitly time
dependent. Often it is more useful to turn this statement around: if $[G, H] = 0$ and
$\partial G/\partial t = 0$, then $dG/dt = 0$, implying that $G$ is a constant of motion.
Independence
Consider two observables $F(q_i, p_i, t)$ and $G(q_i, p_i, t)$. The independence of these two observables is determined
by the Poisson bracket
$$[F, G] = -[G, F] \tag{14.48}$$
If this Poisson bracket is zero, that is, if the two observables $F(q_i, p_i, t)$ and $G(q_i, p_i, t)$ commute, then their
values are independent and can be measured independently. However, if the Poisson bracket $[F, G] \neq 0$, that
is, $F(q_i, p_i, t)$ and $G(q_i, p_i, t)$ do not commute, then $F$ and $G$ are correlated since interchanging the order of
the Poisson bracket changes the sign, which implies that the measured value for $F$ depends on whether $G$ is
simultaneously measured.
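A useful property of Poisson brackets is that if $F$ and $G$ both are constants of motion, then the double
Poisson bracket $[H, [F, G]] = 0$. This can be proved using Jacobi's identity
$$[F, [G, H]] + [G, [H, F]] + [H, [F, G]] = 0 \tag{14.49}$$
If $[G, H] = 0$ and $[H, F] = 0$ then $[H, [F, G]] = 0$, that is, the Poisson bracket $[F, G]$ commutes with $H$. Note
that if $F$ and $G$ do not depend explicitly on time, that is, $\partial F/\partial t = \partial G/\partial t = 0$, then combining equations (14.45)
and (14.49) leads to Poisson's Theorem that relates the total time derivatives.
$$\frac{d}{dt}[F, G] = \left[\frac{dF}{dt}, G\right] + \left[F, \frac{dG}{dt}\right] \tag{14.50}$$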
$$[p_\phi, H] = 0$$
Since $p_\phi$ is not an explicit function of time, $\partial p_\phi/\partial t = 0$, then $dp_\phi/dt = 0$, that is, the angular momentum about
the $z$ axis, $L_z = p_\phi$, is a constant of motion.
The total angular momentum $L^2$ commutes with the Hamiltonian, that is
$$\left[L^2, H\right] = \left[p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}, H\right] = 0$$
Since the total angular momentum $L^2 = p_\theta^2 + p_\phi^2/\sin^2\theta$ is not explicitly time dependent, it also must be a
constant of motion. Note that Noether's theorem also gives that both the angular momenta $L^2$ and $p_\phi$ are
constants of motion. Also, since the Poisson brackets are
$$[p_\phi, H] = 0$$
$$\left[L^2, H\right] = 0$$
Jacobi's identity, equation 14.17, can be used to imply that
$$\left[[L^2, p_\phi], H\right] = 0$$
That is, the Poisson bracket $[L^2, p_\phi]$ is a constant of motion. Note that if $L^2$ and $p_\phi$ commute, that is,
$[L^2, p_\phi] = 0$, then they can be measured simultaneously with unlimited accuracy, and this also satisfies the requirement that
$[L^2, p_\phi]$ commutes with $H$.
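The $(x, y, z)$ components of the angular momentum are given by
$$L_x = \sum_{i=1}^{n}(\mathbf{r}_i\times\mathbf{p}_i)_x = \sum_{i=1}^{n}(y_ip_{z,i} - z_ip_{y,i})$$
$$L_y = \sum_{i=1}^{n}(\mathbf{r}_i\times\mathbf{p}_i)_y = \sum_{i=1}^{n}(z_ip_{x,i} - x_ip_{z,i})$$
$$L_z = \sum_{i=1}^{n}(\mathbf{r}_i\times\mathbf{p}_i)_z = \sum_{i=1}^{n}(x_ip_{y,i} - y_ip_{x,i})$$
$$\dot{q}_j = [q_j, H] = \frac{\partial H}{\partial p_j} \tag{14.57}$$
$$\dot{p}_j = [p_j, H] = -\frac{\partial H}{\partial q_j} \tag{14.58}$$
The above shows that the full structure of Hamilton's equations of motion can be expressed directly in
terms of Poisson brackets.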
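The bracket form of Hamilton's equations can be checked numerically. The following is a minimal sketch, not taken from the text: the helper `poisson_bracket`, the one-dimensional oscillator Hamiltonian, and the values of `m`, `k`, `q0`, `p0` are all illustrative assumptions, and the partial derivatives are estimated by central finite differences rather than evaluated analytically.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """Estimate [F, G] = sum_j (dF/dq_j dG/dp_j - dF/dp_j dG/dq_j) at the point (q, p)."""

    def partial(fun, var, j):
        # Central difference with respect to q_j (var == "q") or p_j (var == "p").
        q_hi, q_lo, p_hi, p_lo = list(q), list(q), list(p), list(p)
        if var == "q":
            q_hi[j] += h
            q_lo[j] -= h
        else:
            p_hi[j] += h
            p_lo[j] -= h
        return (fun(q_hi, p_hi) - fun(q_lo, p_lo)) / (2.0 * h)

    return sum(
        partial(F, "q", j) * partial(G, "p", j) - partial(F, "p", j) * partial(G, "q", j)
        for j in range(len(q))
    )


m, k = 2.0, 5.0  # illustrative mass and spring constant (assumed values)


def hamiltonian(q, p):
    return p[0] ** 2 / (2.0 * m) + 0.5 * k * q[0] ** 2


def coordinate(q, p):
    return q[0]


def momentum(q, p):
    return p[0]


q0, p0 = [0.3], [-0.7]  # arbitrary phase-space point
bracket_q_p = poisson_bracket(coordinate, momentum, q0, p0)  # fundamental bracket [q, p]
q_dot = poisson_bracket(coordinate, hamiltonian, q0, p0)     # compare with  dH/dp = p/m
p_dot = poisson_bracket(momentum, hamiltonian, q0, p0)       # compare with -dH/dq = -k q
```

Under these assumptions `bracket_q_p` should be close to 1, while `q_dot` and `p_dot` should reproduce $p/m$ and $-kq$, which is the content of equations (14.57) and (14.58) for this Hamiltonian.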
The elegant formulation of Poisson brackets has the same form in all canonical coordinates as the Hamil-
tonian formulation. However, the normal Hamilton canonical equations in classical mechanics assume implic-
itly that one can specify the exact position and momentum of a particle simultaneously at any point in time
which is applicable only to classical mechanics variables that are continuous functions of the coordinates,
and not to quantized systems. The important feature of the Poisson Bracket representation of Hamilton’s
equations is that it generalizes Hamilton's equations into a form (14.57, 14.58) where the Poisson bracket is
equally consistent with both classical and quantum mechanics in that it allows for non-commuting canonical
variables and Heisenberg’s Uncertainty Principle. Thus the generalization of Hamilton’s equations, via use
of the Poisson brackets, provides one of the most powerful analytic tools applicable to both classical and
quantal dynamics. It played a pivotal role in derivation of quantum theory as described in chapter 17.
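$$H = (\mathbf{p}\cdot\dot{\mathbf{x}}) - L = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\Phi$$
The Hamilton equations of motion give
$$\dot{\mathbf{x}} = [\mathbf{x}, H] = \frac{(\mathbf{p} - q\mathbf{A})}{m}$$
and
$$\dot{\mathbf{p}} = [\mathbf{p}, H] = -q\nabla\Phi + \frac{q}{m}\left\{(\mathbf{p} - q\mathbf{A})\times(\nabla\times\mathbf{A})\right\}$$
Define the magnetic field to be
$$\mathbf{B} \equiv \nabla\times\mathbf{A}$$
and the electric field to be
$$\mathbf{E} = -\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t}$$
then the Lorentz force can be written as
$$\mathbf{F} = \dot{\mathbf{p}} = q(\mathbf{E} + \dot{\mathbf{x}}\times\mathbf{B})$$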
conservative system of many identical coupled linear oscillators. Then evaluating the following Poisson
brackets gives
[ ] = 0
[ ] = 0
[ ] = 0
[ ] = 0
[ ] ≠ 0
[ ] ≠ 0
Thus one cannot simultaneously measure the conjugate variables ( ) or ( ). This is the Uncertainty
Principle manifest by all forms of wave motion in classical and quantal mechanics as discussed in chapter
3.11.3.
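$$\alpha \equiv \frac{1}{\sqrt{2}}(x + y) \qquad \beta \equiv \frac{1}{\sqrt{2}}(x - y)$$
Expressing the kinetic and potential energies in terms of the new coordinates gives
$$T(\dot{\alpha}, \dot{\beta}) = \frac{1}{4}m\left[\left(\dot{\alpha} + \dot{\beta}\right)^2 + \left(\dot{\alpha} - \dot{\beta}\right)^2\right] = \frac{1}{2}m\left(\dot{\alpha}^2 + \dot{\beta}^2\right)$$
$$U = \frac{1}{4}k\left[(\alpha + \beta)^2 + (\alpha - \beta)^2\right] + \frac{1}{2}\eta\left(\alpha^2 - \beta^2\right) = \frac{1}{2}(k + \eta)\alpha^2 + \frac{1}{2}(k - \eta)\beta^2$$
Note that the coordinate transformation makes the Lagrangian separable, that is
$$L = \frac{1}{2}m\left(\dot{\alpha}^2 + \dot{\beta}^2\right) - \frac{1}{2}(k + \eta)\alpha^2 - \frac{1}{2}(k - \eta)\beta^2 = L_\alpha + L_\beta$$
where
$$L_\alpha = \frac{1}{2}m\dot{\alpha}^2 - \frac{1}{2}(k + \eta)\alpha^2 \qquad L_\beta = \frac{1}{2}m\dot{\beta}^2 - \frac{1}{2}(k - \eta)\beta^2$$
This shows that the transformation has separated the system into two normal modes that are harmonic
oscillators with angular frequencies
$$\omega_1 = \sqrt{\frac{k + \eta}{m}} \qquad \omega_2 = \sqrt{\frac{k - \eta}{m}}$$
Note that the non-isotropic harmonic oscillator reduces to the isotropic linear oscillator when $\eta = 0$.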
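The change to normal coordinates can be spot-checked numerically. This is a small sketch under the assumption that the potential in the original coordinates is $U = \frac{1}{2}k(x^2 + y^2) + \eta xy$, which is the form implied by the normal-coordinate expression above; the function names and the numerical values of `m`, `k`, `eta`, `x`, `y` are illustrative only.

```python
import math


def to_normal(x, y):
    """Rotate (x, y) into the normal coordinates alpha = (x+y)/sqrt(2), beta = (x-y)/sqrt(2)."""
    s = math.sqrt(2.0)
    return (x + y) / s, (x - y) / s


def potential_xy(x, y, k, eta):
    # Assumed form of the coupled potential in the original coordinates.
    return 0.5 * k * (x * x + y * y) + eta * x * y


def potential_normal(alpha, beta, k, eta):
    # Separated form: one independent harmonic term per normal mode.
    return 0.5 * (k + eta) * alpha ** 2 + 0.5 * (k - eta) * beta ** 2


m, k, eta = 1.0, 4.0, 1.5  # illustrative parameters with k > eta so both modes are bound
x, y = 0.37, -0.52         # arbitrary configuration

alpha, beta = to_normal(x, y)
u_xy = potential_xy(x, y, k, eta)
u_normal = potential_normal(alpha, beta, k, eta)

omega_1 = math.sqrt((k + eta) / m)  # frequency of the alpha mode
omega_2 = math.sqrt((k - eta) / m)  # frequency of the beta mode
```

The two potential values should agree to rounding error, and setting `eta = 0` makes `omega_1` and `omega_2` coincide, which is the isotropic limit noted in the text.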
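b) HAMILTONIAN: The canonical momenta are given by
$$p_\alpha = \frac{\partial L}{\partial\dot{\alpha}} = m\dot{\alpha}$$
$$p_\beta = \frac{\partial L}{\partial\dot{\beta}} = m\dot{\beta}$$
The definition of the Hamiltonian gives
$$H = p_\alpha\dot{\alpha} + p_\beta\dot{\beta} - L = \frac{1}{2m}\left(p_\alpha^2 + p_\beta^2\right) + \frac{1}{2}(k + \eta)\alpha^2 + \frac{1}{2}(k - \eta)\beta^2$$
Note that this can be separated as
$$H = H_\alpha + H_\beta$$
where
$$H_\alpha = \frac{p_\alpha^2}{2m} + \frac{1}{2}(k + \eta)\alpha^2 \qquad H_\beta = \frac{p_\beta^2}{2m} + \frac{1}{2}(k - \eta)\beta^2$$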
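Using the Poisson Bracket expression for the time dependence, equation 14.45, and using the fact that
the Hamiltonian is not explicitly time dependent, that is, $\partial H_\alpha/\partial t = 0$, gives
$$\frac{dH_\alpha}{dt} = \frac{\partial H_\alpha}{\partial t} + [H_\alpha, H] = 0 + [H_\alpha, H_\alpha + H_\beta] = [H_\alpha, H_\beta]$$
$$= \frac{\partial H_\alpha}{\partial\alpha}\frac{\partial H_\beta}{\partial p_\alpha} + \frac{\partial H_\alpha}{\partial\beta}\frac{\partial H_\beta}{\partial p_\beta} - \frac{\partial H_\alpha}{\partial p_\alpha}\frac{\partial H_\beta}{\partial\alpha} - \frac{\partial H_\alpha}{\partial p_\beta}\frac{\partial H_\beta}{\partial\beta} = 0$$
Similarly $dH_\beta/dt = 0$. This implies that the Hamiltonians for both normal modes, $H_\alpha$ and $H_\beta$, are
time-independent constants of motion which are equal to the total energy for each mode.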
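c) ANGULAR MOMENTUM: The angular momentum for motion in the $\alpha\beta$ plane is perpendicular to
that plane with a magnitude of
$$L = (\alpha p_\beta - \beta p_\alpha)$$
The time dependence of the angular momentum is given by
$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + [L, H] = 0 + \frac{\partial L}{\partial\alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial L}{\partial p_\alpha}\frac{\partial H}{\partial\alpha} + \frac{\partial L}{\partial\beta}\frac{\partial H}{\partial p_\beta} - \frac{\partial L}{\partial p_\beta}\frac{\partial H}{\partial\beta}$$
$$= \frac{p_\alpha p_\beta}{m} + k\alpha\beta + \eta\alpha\beta - \frac{p_\alpha p_\beta}{m} - k\alpha\beta + \eta\alpha\beta = 2\eta\alpha\beta$$
Note that if $\eta = 0$ then the two eigenfrequencies are degenerate, $\omega_1 = \omega_2$, that is, the system reduces to
the isotropic harmonic oscillator in the $\alpha\beta$ plane that was discussed in chapter 9.9. In addition,
$dL/dt = 0$ for $\eta = 0$, that is, the angular momentum in the $\alpha\beta$ plane is a constant of motion when $\eta = 0$.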
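The result $dL/dt = [L, H] = 2\eta\alpha\beta$ can be verified with a numerical Poisson bracket. The sketch below is illustrative only: `poisson_bracket` is a hypothetical finite-difference helper, the coordinates are ordered as $q = (\alpha, \beta)$, $p = (p_\alpha, p_\beta)$, and the parameter values are arbitrary assumptions.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """Central-difference estimate of [F, G] for n degrees of freedom."""

    def partial(fun, var, j):
        q_hi, q_lo, p_hi, p_lo = list(q), list(q), list(p), list(p)
        if var == "q":
            q_hi[j] += h
            q_lo[j] -= h
        else:
            p_hi[j] += h
            p_lo[j] -= h
        return (fun(q_hi, p_hi) - fun(q_lo, p_lo)) / (2.0 * h)

    return sum(
        partial(F, "q", j) * partial(G, "p", j) - partial(F, "p", j) * partial(G, "q", j)
        for j in range(len(q))
    )


m, k, eta = 1.5, 4.0, 0.8  # illustrative parameters (assumed values)


def h_alpha(q, p):
    return p[0] ** 2 / (2.0 * m) + 0.5 * (k + eta) * q[0] ** 2


def h_beta(q, p):
    return p[1] ** 2 / (2.0 * m) + 0.5 * (k - eta) * q[1] ** 2


def hamiltonian(q, p):
    return h_alpha(q, p) + h_beta(q, p)


def angular_momentum(q, p):
    # L = alpha * p_beta - beta * p_alpha
    return q[0] * p[1] - q[1] * p[0]


q0, p0 = [0.4, -0.9], [0.2, 1.1]  # arbitrary phase-space point
dL_dt = poisson_bracket(angular_momentum, hamiltonian, q0, p0)
dH_alpha_dt = poisson_bracket(h_alpha, hamiltonian, q0, p0)
expected_dL_dt = 2.0 * eta * q0[0] * q0[1]
```

With these assumptions `dL_dt` should match `expected_dL_dt`, while `dH_alpha_dt` should vanish, in line with parts b) and c) of the example.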
d) SYMMETRY TENSOR: The symmetry tensor was defined in chapter 9.9.3 to be
$$A'_{ij} = \frac{p_ip_j}{2m} + \frac{1}{2}kx_ix_j$$
where $i$ and $j$ can correspond to either $x$ or $y$. The symmetry tensor defines the orientation of the major
axis of the elliptical orbit for the two-dimensional, isotropic, linear oscillator as described in chapter 9.9.3.
The isotropic oscillator has been shown to have two normal modes that are degenerate, therefore $x$ and $y$
are equally good normal modes. The Hamiltonian showed that, for $\eta = 0$, the total
energy is conserved, as well as the energies for each of the two normal modes, which are
$$H_x = \frac{p_x^2}{2m} + \frac{1}{2}kx^2 \qquad H_y = \frac{p_y^2}{2m} + \frac{1}{2}ky^2$$
Consider the matrix element
$$A'_{ij} = \frac{p_ip_j}{2m} + \frac{1}{2}kx_ix_j$$
where each $x_i$ can represent $x$ or $y$. Then for each matrix element
$$\frac{dA'_{ij}}{dt} = \frac{\partial A'_{ij}}{\partial t} + [A'_{ij}, H] = 0 + \frac{\partial A'_{ij}}{\partial x}\frac{\partial H}{\partial p_x} - \frac{\partial A'_{ij}}{\partial p_x}\frac{\partial H}{\partial x} + \frac{\partial A'_{ij}}{\partial y}\frac{\partial H}{\partial p_y} - \frac{\partial A'_{ij}}{\partial p_y}\frac{\partial H}{\partial y} = 0$$
That is, each matrix element $A'_{ij}$ commutes with the Hamiltonian
$$\left[A'_{ij}, H\right] = 0$$
Thus the Poisson Brackets representation of Hamiltonian mechanics has been used to prove that the
symmetry tensor $A'_{ij} = \frac{p_ip_j}{2m} + \frac{1}{2}kx_ix_j$ is a constant of motion for the isotropic harmonic oscillator. That is,
all the elements $A'_{11}$, $A'_{12}$, and $A'_{22}$ of the symmetric tensor $\mathbf{A}'$ commute with the Hamiltonian.
Note that the three constants of motion, $\mathbf{L}$, $\mathbf{A}'$ and $H$ for the isotropic, two-dimensional, linear oscillator
form a closed algebra under the Poisson Bracket formalism.
$$\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\,\hat{\mathbf{r}})$$
is a constant of motion that specifies the major axis of the elliptical orbit. The eccentricity vector for the
inverse-square-law force can be investigated using Poisson Brackets as was done for the symmetry tensor
above. It can be shown that
$$[L_i, A_j] = \epsilon_{ijk}A_k$$
$$[A_i, A_j] = -2\mu\left(\frac{\mathbf{p}^2}{2\mu} + \frac{k}{r}\right)\epsilon_{ijk}L_k \tag{a}$$
Note that the bracket on the right-hand side of equation (a) equals the Hamiltonian for the inverse
square-law attractive force, and thus the Poisson bracket equals
$$[A_i, A_j] = -2\mu\left(\frac{\mathbf{p}^2}{2\mu} + \frac{k}{r}\right)\epsilon_{ijk}L_k = -2\mu H\epsilon_{ijk}L_k$$
For the Hamiltonian it can be shown that the Poisson bracket
$$[H, \mathbf{A}] = 0$$
That is, the eccentricity vector commutes with the Hamiltonian and thus it is a constant of motion. Previously
this result was obtained directly using the equations of motion as given in equation 9.87. Note that the three
constants of motion, $\mathbf{L}$, $\mathbf{A}$ and $H$ form a closed algebra under the Poisson Bracket formalism similar to
the triad of constants of motion, $\mathbf{L}$, $\mathbf{A}'$ and $H$ that occur for the two-dimensional, isotropic linear oscillator
described above. Examples 14.5 and 14.6 illustrate that the Poisson Brackets representation of Hamiltonian
mechanics is a powerful probe of the underlying physics, as well as confirming the results obtained directly
from the equations of motion as described in chapters 9.8.4 and 9.9.3.
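Hence the net increase in $\rho$ in the infinitesimal rectangular element $dq_i\,dp_i$ due to flow in the horizontal
direction is
$$-\frac{\partial}{\partial q_i}(\rho\dot{q}_i)\,dq_i\,dp_i \tag{14.62}$$
Similarly, the net gain due to flow in the vertical direction is
$$-\frac{\partial}{\partial p_i}(\rho\dot{p}_i)\,dp_i\,dq_i \tag{14.63}$$
Thus the total increase in the element $dq_i\,dp_i$ per unit time is
$$-\left[\frac{\partial}{\partial q_i}(\rho\dot{q}_i) + \frac{\partial}{\partial p_i}(\rho\dot{p}_i)\right]dq_i\,dp_i \tag{14.64}$$
Assume that the total number of points must be conserved, then the total increase in the number of
points inside the element $dq_i\,dp_i$ must equal the net change in $\rho$ on the infinitesimal surface element per
unit time. That is
$$\left(\frac{\partial\rho}{\partial t}\right)dq_i\,dp_i \tag{14.65}$$
Thus summing over all possible values of $i$ gives
$$\frac{\partial\rho}{\partial t} + \sum_i\left[\frac{\partial}{\partial q_i}(\rho\dot{q}_i) + \frac{\partial}{\partial p_i}(\rho\dot{p}_i)\right] = 0 \tag{14.66}$$
or
$$\frac{\partial\rho}{\partial t} + \sum_i\left[\frac{\partial\rho}{\partial q_i}\dot{q}_i + \frac{\partial\rho}{\partial p_i}\dot{p}_i\right] + \rho\sum_i\left[\frac{\partial\dot{q}_i}{\partial q_i} + \frac{\partial\dot{p}_i}{\partial p_i}\right] = 0 \tag{14.67}$$
Inserting Hamilton's canonical equations into both brackets and differentiating the last bracket results in
$$\frac{\partial\rho}{\partial t} + \sum_i\left[\frac{\partial\rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial\rho}{\partial p_i}\frac{\partial H}{\partial q_i}\right] + \rho\sum_i\left[\frac{\partial^2H}{\partial q_i\partial p_i} - \frac{\partial^2H}{\partial p_i\partial q_i}\right] = 0 \tag{14.68}$$
The two terms in the last bracket cancel and thus
$$\frac{\partial\rho}{\partial t} + \sum_i\left[\frac{\partial\rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial\rho}{\partial p_i}\frac{\partial H}{\partial q_i}\right] = \frac{\partial\rho}{\partial t} + [\rho, H] = 0 \tag{14.69}$$
However, this just equals $d\rho/dt$, therefore
$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + [\rho, H] = 0 \tag{14.70}$$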
This is called Liouville's theorem which states that the rate of change of density of representative
points vanishes, that is, the density of points is a constant in the Hamiltonian phase space along a specific
trajectory. Liouville's theorem means that the system acts like an incompressible fluid that moves such as to
occupy an equal volume in phase space at every instant, even though the shape of the phase-space volume
may change, that is, the phase-space density of the fluid remains constant. Equation (14.70) is another
illustration of the basic Poisson bracket relation (14.45) and the usefulness of Poisson brackets in physics.
Liouville's theorem is crucially important to statistical mechanics of ensembles where the exact state
of the system is unknown and only statistical averages are known. An example is in focussing of beams of charged
particles by beam handling systems. At a focus of the beam, the transverse width in $x$ is minimized, while
the width in $p_x$ is largest since the beam is converging to the focus, whereas a parallel beam has maximum
width $x$ and minimum spreading width $p_x$. However, the product $x\,p_x$ remains constant throughout the
focussing system. For a two dimensional beam, this applies equally for the $x$ and $y$ coordinates, etc. It follows
that the final beam quality for any beam transport system is ultimately limited by the emittance of
the source of the beam, that is, the initial area of the phase space distribution. Note that Liouville's theorem
only applies to Hamiltonian $p$-$q$ phase space, not to $q$-$\dot{q}$ Lagrangian state space. As a consequence,
Hamiltonian dynamics, rather than Lagrange dynamics, is used to discuss ensembles in statistical physics.
Note that Liouville's theorem is applicable only for conservative systems, that is, where Hamilton's
equations of motion apply. For dissipative systems the phase space volume shrinks with time rather than
being a constant of the motion.
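The contrast between conservative and dissipative flow can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the text: it evolves the three corners of a small triangle in the $(q, p)$ plane for a linear oscillator with an optional drag force $-\gamma p$ added by hand (using the kinetic momentum $p = m\dot{q}$), integrates with a fourth-order Runge-Kutta scheme, and compares the triangle area before and after. Because the flow is linear, the triangle stays a triangle, and for the damped case the area is expected to shrink as $e^{-\gamma t}$.

```python
import math


def flow(state, gamma, m=1.0, k=1.0):
    """Time derivatives (dq/dt, dp/dt) for an oscillator with optional linear drag -gamma*p."""
    q, p = state
    return (p / m, -k * q - gamma * p)


def rk4_evolve(state, gamma, t_end, steps=2000):
    """Integrate one phase-space point from t = 0 to t_end with classical RK4."""
    dt = t_end / steps
    q, p = state
    for _ in range(steps):
        k1 = flow((q, p), gamma)
        k2 = flow((q + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1]), gamma)
        k3 = flow((q + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1]), gamma)
        k4 = flow((q + dt * k3[0], p + dt * k3[1]), gamma)
        q += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        p += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return (q, p)


def triangle_area(a, b, c):
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def area_ratio(gamma, t_end=2.0):
    """Ratio of final to initial area of a small triangle of representative points."""
    corners = [(1.0, 0.0), (1.2, 0.0), (1.0, 0.3)]
    moved = [rk4_evolve(c, gamma, t_end) for c in corners]
    return triangle_area(*moved) / triangle_area(*corners)


ratio_conservative = area_ratio(0.0)  # Hamiltonian flow: area should be preserved
ratio_damped = area_ratio(0.3)        # drag added: area should shrink as exp(-gamma * t)
```

With `gamma = 0` the ratio stays at 1 to within integration error, while the damped case should reproduce $e^{-0.6}$ for the assumed $\gamma = 0.3$ and $t = 2$.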
14.3. CANONICAL TRANSFORMATIONS IN HAMILTONIAN MECHANICS 407
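Similarly, applying Hamilton's Principle of least action to the new Lagrangian $\mathcal{L}(\mathbf{Q},\dot{\mathbf{Q}},t)$ gives
$$\delta S = \delta\int_{t_1}^{t_2}\mathcal{L}(\mathbf{Q},\dot{\mathbf{Q}},t)\,dt = \delta\int_{t_1}^{t_2}\left[\mathbf{P}\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t)\right]dt = 0 \tag{14.74}$$
The discussion of gauge-invariant Lagrangians, chapter 13.4, showed that $L$ and $\mathcal{L}$ can be related by the
total time derivative of a generating function $F$ where
$$\frac{dF}{dt} = L - \mathcal{L} \tag{14.75}$$
The generating function $F$ can be any well-behaved function with continuous second derivatives of both the
old and new canonical variables $\mathbf{p}$, $\mathbf{q}$, $\mathbf{P}$, $\mathbf{Q}$ and $t$. Thus the integrands of (14.73) and (14.74) are related by
$$\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \lambda\left[\mathbf{P}\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t)\right] + \frac{dF}{dt} \tag{14.76}$$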
where $\lambda$ is a possible scale transformation. A scale transformation, such as changing units, is trivial, and will
be assumed to be absorbed into the coordinates, making $\lambda = 1$. A transformation with $\lambda \neq 1$ is called an extended
canonical transformation.
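The total time derivative of the generating function $F = F_1(\mathbf{q},\mathbf{Q},t)$ is given by
$$\frac{dF_1(\mathbf{q},\mathbf{Q},t)}{dt} = \left[\frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial\mathbf{q}}\cdot\dot{\mathbf{q}} + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial\mathbf{Q}}\cdot\dot{\mathbf{Q}}\right] + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t} \tag{14.77}$$
Insert equation (14.77) into equation (14.76), and assume that the trivial scale factor $\lambda = 1$, then
$$\left[\mathbf{p} - \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial\mathbf{q}}\right]\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \left[\mathbf{P} + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial\mathbf{Q}}\right]\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t}$$
Assume that the generating function $F_1$ determines the canonical variables $\mathbf{p}$ and $\mathbf{P}$ to be
$$\mathbf{p} = \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial\mathbf{q}} \qquad \mathbf{P} = -\frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial\mathbf{Q}} \tag{14.78}$$
then the terms in each square bracket cancel, leading to the required canonical transformation
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t} \tag{14.79}$$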
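The total time derivative of the generating function $F = F_2(\mathbf{q},\mathbf{P},t) - \mathbf{Q}\cdot\mathbf{P}$ is given by
$$\frac{dF}{dt} = \left[\frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial\mathbf{q}}\cdot\dot{\mathbf{q}} + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial\mathbf{P}}\cdot\dot{\mathbf{P}}\right] - \mathbf{P}\cdot\dot{\mathbf{Q}} - \dot{\mathbf{P}}\cdot\mathbf{Q} + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t} \tag{14.80}$$
Insert this into equation (14.76) and assume that the trivial scale factor $\lambda = 1$, then
$$\left(\mathbf{p} - \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial\mathbf{q}}\right)\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \mathbf{P}\cdot\dot{\mathbf{Q}} - \mathbf{P}\cdot\dot{\mathbf{Q}} + \left[\frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial\mathbf{P}} - \mathbf{Q}\right]\cdot\dot{\mathbf{P}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t}$$
Assume that the generating function $F_2$ determines the canonical variables $\mathbf{p}$ and $\mathbf{Q}$ to be
$$\mathbf{p} = \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial\mathbf{q}} \qquad \mathbf{Q} = \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial\mathbf{P}} \tag{14.81}$$
then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t} \tag{14.82}$$
The total time derivative of the generating function $F = F_3(\mathbf{p},\mathbf{Q},t) + \mathbf{q}\cdot\mathbf{p}$ is given by
$$\frac{dF}{dt} = \left[\frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial\mathbf{p}}\cdot\dot{\mathbf{p}} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial\mathbf{Q}}\cdot\dot{\mathbf{Q}}\right] + \dot{\mathbf{q}}\cdot\mathbf{p} + \mathbf{q}\cdot\dot{\mathbf{p}} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t} \tag{14.83}$$
Insert this into equation (14.76) and assume that the trivial scale factor $\lambda = 1$, then
$$-\left[\mathbf{q} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial\mathbf{p}}\right]\cdot\dot{\mathbf{p}} - H(\mathbf{q},\mathbf{p},t) = \left[\mathbf{P} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial\mathbf{Q}}\right]\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t}$$
Assume that the generating function $F_3$ determines the canonical variables $\mathbf{q}$ and $\mathbf{P}$ to be
$$\mathbf{q} = -\frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial\mathbf{p}} \qquad \mathbf{P} = -\frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial\mathbf{Q}} \tag{14.84}$$
then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t} \tag{14.85}$$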
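The total time derivative of the generating function $F = F_4(\mathbf{p},\mathbf{P},t) + \mathbf{q}\cdot\mathbf{p} - \mathbf{Q}\cdot\mathbf{P}$ is given by
$$\frac{dF}{dt} = \left[\frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial\mathbf{p}}\cdot\dot{\mathbf{p}} + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial\mathbf{P}}\cdot\dot{\mathbf{P}}\right] + \dot{\mathbf{q}}\cdot\mathbf{p} + \mathbf{q}\cdot\dot{\mathbf{p}} - \dot{\mathbf{Q}}\cdot\mathbf{P} - \mathbf{Q}\cdot\dot{\mathbf{P}} + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t} \tag{14.86}$$
Insert this into equation (14.76) and assume that the trivial scale factor $\lambda = 1$, then
$$-\left[\mathbf{q} + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial\mathbf{p}}\right]\cdot\dot{\mathbf{p}} - H(\mathbf{q},\mathbf{p},t) = \left[\frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial\mathbf{P}} - \mathbf{Q}\right]\cdot\dot{\mathbf{P}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t}$$
Assume that the generating function $F_4$ determines the canonical variables $\mathbf{q}$ and $\mathbf{Q}$ to be
$$\mathbf{q} = -\frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial\mathbf{p}} \qquad \mathbf{Q} = \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial\mathbf{P}} \tag{14.87}$$
then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t} \tag{14.88}$$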
Note that the last three generating functions require the inclusion of additional bilinear products of
$q_ip_i$ and $Q_iP_i$ in order for the terms to cancel to give the required result. The addition of the bilinear terms
ensures that the resultant generating function $F$ is the same using any of the four generating functions
$F_1$, $F_2$, $F_3$, $F_4$. Frequently the $F_2(\mathbf{q},\mathbf{P},t)$ generating function is the most convenient. The four possible
generating functions of the first kind, given above, are related by Legendre transformations. A canonical
transformation does not have to conform to only one of the four generating functions $F_i$ for all the degrees
of freedom; it can be a mixture of different flavors for the different degrees of freedom. The properties of
the generating functions are summarized in table 14.1.
The partial derivatives of the generating functions $F_i$ determine the corresponding conjugate variables
not explicitly included in the generating function $F_i$. Note that, for the first trivial example $F_1 = q_iQ_i$, the
old momenta become the new coordinates, $Q_i = p_i$, and vice versa, $P_i = -q_i$. This illustrates that it is
better to name them "conjugate variables" rather than "momenta" and "coordinates".
In summary, Jacobi developed a mathematical framework for finding the generating function $F$
required to make a canonical transformation to a new Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ that has a known solution.
That is,
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F}{\partial t} \tag{14.89}$$
When $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ is a constant, then a solution has been obtained. The inverse transformation for this solution,
$\mathbf{Q}(t), \mathbf{P}(t) \rightarrow \mathbf{q}(t), \mathbf{p}(t)$, now can be used to express the final solution in terms of the original variables of the
system.
Note the special case when $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$; then equation 14.89 has been reduced to the Hamilton-Jacobi
relation (14.12)
$$H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} = 0 \tag{14.12}$$
In this case, the generating function $F$ determines the action functional $S$ required to solve the Hamilton-
Jacobi equation (14.12). Since equation (14.89) has transformed the Hamiltonian $H(\mathbf{q},\mathbf{p},t) \rightarrow \mathcal{H}(\mathbf{Q},\mathbf{P},t)$
for which $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$, the solution $\mathbf{Q}(t), \mathbf{P}(t)$ for the Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$ is obtained easily.
This approach underlies Hamilton-Jacobi theory presented in chapter 14.4.
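$$Q_j = \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial P_j} = q_j + \epsilon\frac{\partial G}{\partial P_j}$$
Thus the infinitesimal changes in $q_j$ and $p_j$ are given by
$$\delta q_j(\mathbf{q},\mathbf{p}) = Q_j - q_j = \epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial P_j} = \epsilon\frac{\partial G(\mathbf{q},\mathbf{p},t)}{\partial p_j} + O(\epsilon^2)$$
$$\delta p_j(\mathbf{q},\mathbf{p}) = P_j - p_j = -\epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial q_j} = -\epsilon\frac{\partial G(\mathbf{q},\mathbf{p},t)}{\partial q_j} + O(\epsilon^2)$$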
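$$H = \frac{p^2}{2m} + \frac{kq^2}{2} = \frac{1}{2m}\left(p^2 + m^2\omega^2q^2\right)$$
This form of the Hamiltonian is a sum of two squares, suggesting a canonical transformation for which
$H$ is cyclic in a new coordinate. A guess for a canonical transformation is of the form $p = m\omega q\cot Q$, which
is of the $F_1(q,Q)$ type where $F_1$ equals $F_1(q,Q) = \frac{m\omega q^2}{2}\cot Q$. Using (14.78) gives
$$p = \frac{\partial F_1(q,Q)}{\partial q} = m\omega q\cot Q$$
$$P = -\frac{\partial F_1(q,Q)}{\partial Q} = \frac{m\omega q^2}{2\sin^2Q}$$
Solving these two relations for $q$ and $p$ gives
$$q = \sqrt{\frac{2P}{m\omega}}\sin Q \qquad p = \sqrt{2Pm\omega}\cos Q \tag{a}$$
so that the transformed Hamiltonian is
$$\mathcal{H} = H = \omega P$$
Since
$$\dot{Q} = \frac{\partial\mathcal{H}}{\partial P} = \omega$$
then
$$Q = \omega t + \delta$$
Substituting into (a), with $P = E/\omega$, gives the well known solution of the one-dimensional harmonic oscillator
$$q = \sqrt{\frac{2E}{m\omega^2}}\sin(\omega t + \delta)$$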
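The canonical transformation used for the harmonic oscillator can be checked numerically. The following sketch assumes the inverse transformation $q = \sqrt{2P/m\omega}\,\sin Q$, $p = \sqrt{2Pm\omega}\,\cos Q$ that follows from $F_1(q,Q) = \frac{1}{2}m\omega q^2\cot Q$; the names `old_q`, `old_p`, `bracket_QP` and the numerical values are illustrative assumptions. It tests three things at one arbitrary point: the two generating-function relations, the transformed Hamiltonian $\mathcal{H} = \omega P$, and the fundamental bracket $[q, p]_{QP} = 1$.

```python
import math

m, omega = 1.5, 2.0  # illustrative mass and angular frequency (assumed values)


def old_q(Q, P):
    return math.sqrt(2.0 * P / (m * omega)) * math.sin(Q)


def old_p(Q, P):
    return math.sqrt(2.0 * P * m * omega) * math.cos(Q)


def bracket_QP(f, g, Q, P, h=1e-6):
    """Poisson bracket [f, g] taken with respect to the new variables (Q, P)."""
    df_dQ = (f(Q + h, P) - f(Q - h, P)) / (2.0 * h)
    df_dP = (f(Q, P + h) - f(Q, P - h)) / (2.0 * h)
    dg_dQ = (g(Q + h, P) - g(Q - h, P)) / (2.0 * h)
    dg_dP = (g(Q, P + h) - g(Q, P - h)) / (2.0 * h)
    return df_dQ * dg_dP - df_dP * dg_dQ


Q0, P0 = 0.8, 1.7  # arbitrary point in the new phase space
q0, p0 = old_q(Q0, P0), old_p(Q0, P0)

energy = p0 ** 2 / (2.0 * m) + 0.5 * m * omega ** 2 * q0 ** 2   # H in the old variables
fundamental = bracket_QP(old_q, old_p, Q0, P0)                  # should be 1 if canonical
p_from_F1 = m * omega * q0 / math.tan(Q0)                       # p =  dF1/dq
P_from_F1 = m * omega * q0 ** 2 / (2.0 * math.sin(Q0) ** 2)     # P = -dF1/dQ
```

If the transformation is canonical, `fundamental` is 1 to within finite-difference error, `energy` equals `omega * P0`, and the two generating-function relations return the starting `p0` and `P0`.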
The Hamilton-Jacobi equation (14.94) can be written more compactly using tensors $\mathbf{q}$ and $\nabla S$ to designate
$(q_1, \dots, q_n)$ and $\left(\frac{\partial S}{\partial q_1}, \dots, \frac{\partial S}{\partial q_n}\right)$
respectively. That is
$$H(\mathbf{q}, \nabla S, t) + \frac{\partial S}{\partial t} = 0 \tag{14.95}$$
Equation (14.95) is a first-order partial differential equation in $n + 1$ variables which are the old spatial
coordinates $q_i$ plus time $t$. The new momenta $P_i$ have not been specified except that they are constants
since $\mathcal{H} = 0$.
Assume the existence of a solution of (14.95) of the form $S(q_i, P_i, t) = S(q_1, \dots, q_n; \alpha_1, \dots, \alpha_{n+1}; t)$ where
the generalized momenta $P_i = \alpha_1, \alpha_2, \dots, \alpha_n$ plus $\alpha_{n+1}$ are the $n + 1$ independent constants of integration in the
transformed frame. One constant of integration is irrelevant to the solution since only partial derivatives of
$S(q_i, P_i, t)$ with respect to $q_i$ and $t$ are involved. Thus, if $S$ is a solution of the first-order partial differential
equation, then so is $S + \alpha$ where $\alpha$ is a constant. Thus it can be assumed that one of the $n + 1$ constants of
integration is just an additive constant which can be ignored, leading effectively to a solution
where none of the $n$ independent constants are solely additive. Such generating function solutions are called
complete solutions of the first-order partial differential equations since all constants of integration are known.
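It is possible to assume that the $n$ generalized momenta $P_j$ are constants $\alpha_j$, where the $\alpha_j$ are the
constants of integration. This allows the generalized momentum to be written as
$$p_j = \frac{\partial S(\mathbf{q},\boldsymbol{\alpha},t)}{\partial q_j} \tag{14.97}$$
Similarly, Hamilton's equations of motion give the conjugate coordinate $\mathbf{Q} = \boldsymbol{\beta}$ where $\beta_j$ are constants. That
is
$$Q_j = \beta_j = \frac{\partial S(\mathbf{q},\boldsymbol{\alpha},t)}{\partial\alpha_j} \tag{14.98}$$
The above procedure has determined the complete set of $2n$ constants $(\mathbf{Q} = \boldsymbol{\beta}, \mathbf{P} = \boldsymbol{\alpha})$. It is possible to
invert the canonical transformation to express the above solution, which is expressed in terms of $Q_j = \beta_j$
and $P_j = \alpha_j$, back to the original coordinates, that is, $q_j = q_j(\boldsymbol{\alpha},\boldsymbol{\beta},t)$ and momenta $p_j = p_j(\boldsymbol{\alpha},\boldsymbol{\beta},t)$, which is
the required solution.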
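Note that this equals the abbreviated action described in chapter 13.2.3, that is, $W(\mathbf{q},\boldsymbol{\alpha}) = S_0(\mathbf{q},\boldsymbol{\alpha})$.
Inserting the action $S(\mathbf{q},\boldsymbol{\alpha},t)$ into the Hamilton-Jacobi equation (14.12) gives
$$H\left(\mathbf{q}; \frac{\partial W(\mathbf{q},\boldsymbol{\alpha})}{\partial\mathbf{q}}\right) = E(\boldsymbol{\alpha}) \tag{14.104}$$
This is called the time-independent Hamilton-Jacobi equation. Usually it is convenient to have $\alpha_n$
equal the total energy $E$. However, sometimes it is more convenient to exclude the energy $E(\boldsymbol{\alpha})$ in the
set, in which case $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_{n-1})$; the Routhian exploits this feature.
The equations of the canonical transformation expressed in terms of $W(\mathbf{q},\boldsymbol{\alpha})$ are
$$p_i = \frac{\partial W(\mathbf{q},\boldsymbol{\alpha})}{\partial q_i} \qquad \beta_i + \frac{\partial E(\boldsymbol{\alpha})}{\partial\alpha_i}t = \frac{\partial W(\mathbf{q},\boldsymbol{\alpha})}{\partial\alpha_i} \tag{14.105}$$
These equations show that Hamilton's characteristic function $W(\mathbf{q},\boldsymbol{\alpha})$ is itself the generating function of a
time-independent canonical transformation from the old variables $(q, p)$ to a set of new variables
$$Q_i = \beta_i + \frac{\partial E(\boldsymbol{\alpha})}{\partial\alpha_i}t \qquad P_i = \alpha_i \tag{14.106}$$
Table 14.2 summarizes the time-dependent and time-independent forms of the Hamilton-Jacobi equation.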
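Hamilton-Jacobi equation
  Time dependent: $H\left(q_1, \dots, q_n; \frac{\partial S}{\partial q_1}, \dots, \frac{\partial S}{\partial q_n}; t\right) + \frac{\partial S}{\partial t} = 0$
  Time independent: $H\left(q_1, \dots, q_n; \frac{\partial W}{\partial q_1}, \dots, \frac{\partial W}{\partial q_n}\right) = E$
Transformation equations
  Time dependent: $p_i = \frac{\partial S}{\partial q_i}$, $Q_i = \frac{\partial S}{\partial\alpha_i} = \beta_i$
  Time independent: $p_i = \frac{\partial W}{\partial q_i}$, $Q_i = \frac{\partial W}{\partial\alpha_i} = \frac{\partial E}{\partial\alpha_i}t + \beta_i$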
14.4. HAMILTON-JACOBI THEORY 415
$$p_i = \frac{\partial S(\mathbf{q},\boldsymbol{\alpha},t)}{\partial q_i} \qquad Q_i = \frac{\partial S(\mathbf{q},\boldsymbol{\alpha},t)}{\partial\alpha_i} \tag{14.109}$$
$$\dot{Q}_i = \frac{\partial\mathcal{H}}{\partial P_i} = 0 \qquad \dot{P}_i = -\frac{\partial\mathcal{H}}{\partial Q_i} = 0 \tag{14.110}$$
$$\mathcal{H} = H + \frac{\partial S}{\partial t} = H - E = 0 \tag{14.111}$$
which has reduced the problem to a simple sum of one-dimensional first-order differential equations.
If the variable $q_i$ is cyclic, then the Hamiltonian is not a function of $q_i$ and the $i$th term in Hamilton's
characteristic function equals $W_i = \alpha_iq_i$, which separates out from the summation in equation 14.107. That
is, all cyclic variables can be factored out of $W(\mathbf{q},\boldsymbol{\alpha})$, which greatly simplifies solution of the Hamilton-Jacobi
equation. As a consequence, the ability of the Hamilton-Jacobi method to make a canonical transformation to
separate the system into many cyclic or independent variables, which can be solved trivially, is a remarkably
powerful way for solving the equations of motion in Hamiltonian mechanics.
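Since the Hamiltonian does not explicitly depend on the coordinates $(x, y, z)$, the coordinates are cyclic
and separation of the variables, 14.107, gives that the action
$$S = \boldsymbol{\alpha}\cdot\mathbf{r} - E(\boldsymbol{\alpha})\,t$$
Since, with $E = \alpha^2/2m$,
$$\mathbf{Q} = \frac{\partial S}{\partial\boldsymbol{\alpha}} = \mathbf{r} - \frac{\boldsymbol{\alpha}}{m}t$$
the equation of motion and the conjugate momentum are given by
$$\mathbf{r} = \mathbf{Q} + \frac{\boldsymbol{\alpha}}{m}t \qquad \mathbf{p} = \nabla S = \boldsymbol{\alpha}$$
Thus the Hamilton-Jacobi relation has given both the equation of motion and the linear momentum $\mathbf{p}$.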
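Assuming that the variables can be separated, $W = X(x) + Y(y) + Z(z)$ leads to
$$p_x = \frac{\partial X(x)}{\partial x} = \alpha_x$$
$$p_y = \frac{\partial Y(y)}{\partial y} = \alpha_y$$
$$p_z = \frac{\partial Z(z)}{\partial z} = \sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}$$
Thus by integration the total $W$ equals
$$W = \int_{x_0}^{x}\alpha_x\,dx + \int_{y_0}^{y}\alpha_y\,dy + \int_{z_0}^{z}\left(\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}\right)dz$$
$$x - x_0 = \left(\frac{\alpha_x}{m}\right)(t - t_0)$$
$$y - y_0 = \left(\frac{\alpha_y}{m}\right)(t - t_0)$$
$$z - z_0 = \left(\frac{\sqrt{2m(E - mgz_0) - \alpha_x^2 - \alpha_y^2}}{m}\right)(t - t_0) - \frac{1}{2}g(t - t_0)^2$$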
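$$S(q, E, t) = W(q, E) - Et$$
$$p = \frac{\partial W}{\partial q}$$
Inserting the generalized momentum into the Hamiltonian gives
$$\frac{1}{2m}\left(\left[\frac{\partial W}{\partial q}\right]^2 + m^2\omega^2q^2\right) = E$$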
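The left-hand side is independent of $\phi$ whereas the right-hand side is independent of $r$ and $\theta$. Both sides
must equal a constant which is set to equal $-\alpha_\phi^2$, that is
$$\frac{1}{2m}\left[\left(\frac{\partial R}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial\Theta}{\partial\theta}\right)^2\right] + U(r) + \frac{\alpha_\phi^2}{2mr^2\sin^2\theta} = E$$
$$\left(\frac{\partial\Phi}{\partial\phi}\right)^2 = \alpha_\phi^2$$
The equation in $r$ and $\theta$ can be rearranged in the form
$$2mr^2\left[\frac{1}{2m}\left(\frac{\partial R}{\partial r}\right)^2 + U(r) - E\right] = -\left[\left(\frac{\partial\Theta}{\partial\theta}\right)^2 + \frac{\alpha_\phi^2}{\sin^2\theta}\right]$$
The left-hand side is independent of $\theta$ and the right-hand side is independent of $r$ so both must equal a
constant which is set to be $-\alpha_\theta^2$
$$\frac{1}{2m}\left(\frac{\partial R}{\partial r}\right)^2 + U(r) + \frac{\alpha_\theta^2}{2mr^2} = E$$
$$\left(\frac{\partial\Theta}{\partial\theta}\right)^2 + \frac{\alpha_\phi^2}{\sin^2\theta} = \alpha_\theta^2$$
The variables now are completely separated and, by rearrangement plus integration, one obtains
$$R(r) = \sqrt{2m}\int\sqrt{E - U(r) - \frac{\alpha_\theta^2}{2mr^2}}\,dr$$
$$\Theta(\theta) = \int\sqrt{\alpha_\theta^2 - \frac{\alpha_\phi^2}{\sin^2\theta}}\,d\theta$$
$$\Phi(\phi) = \alpha_\phi\phi$$
$$W = \sqrt{2m}\int\sqrt{E - U(r) - \frac{\alpha_\theta^2}{2mr^2}}\,dr + \int\sqrt{\alpha_\theta^2 - \frac{\alpha_\phi^2}{\sin^2\theta}}\,d\theta + \alpha_\phi\phi$$
Hamilton's characteristic function $W$ is the generating function from coordinates $(r, \theta, \phi, p_r, p_\theta, p_\phi)$
to new coordinates, which are cyclic, and new momenta that are constant and taken to be the separation
constants $E$, $\alpha_\theta$, $\alpha_\phi$:
$$p_r = \frac{\partial W}{\partial r} = \sqrt{2m}\sqrt{E - U(r) - \frac{\alpha_\theta^2}{2mr^2}}$$
$$p_\theta = \frac{\partial W}{\partial\theta} = \sqrt{\alpha_\theta^2 - \frac{\alpha_\phi^2}{\sin^2\theta}}$$
$$p_\phi = \frac{\partial W}{\partial\phi} = \alpha_\phi$$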
These equations lead to the elliptical, parabolic, or hyperbolic orbits discussed in chapter 9.
$$H_2(q, p, t) = p\dot{q} - L_2(q, \dot{q}, t) = \frac{p^2}{2m}e^{-\Gamma t} + \frac{1}{2}m\omega_0^2q^2e^{\Gamma t}$$
Note that both the Lagrangian and Hamiltonian are explicitly time dependent and thus they are not
conserved quantities. This is as expected for this dissipative system.
Hamilton-Jacobi theory:
The form of the non-autonomous Hamiltonian $H_2(q, p, t)$ suggests use of the generating function for a canonical
transformation to an autonomous Hamiltonian, for which $\mathcal{H}$ is a constant of motion.
$$S(q, P, t) = F_2(q, P, t) = qPe^{\Gamma t/2} = QP$$
That is, the transformed Hamiltonian $\mathcal{H}(Q, P)$ is not explicitly time dependent, and thus is conserved.
Expressed in the original canonical variables $(q, p)$, the transformed Hamiltonian $\mathcal{H}(q, p, t)$
$$\mathcal{H}(q, p, t) = \frac{p^2}{2m}e^{-\Gamma t} + \frac{\Gamma}{2}qp + \frac{m\omega_0^2}{2}q^2e^{\Gamma t}$$
is a constant of motion which was not readily apparent when using the original Hamiltonian. This unexpected
result illustrates the usefulness of canonical transformations for solving dissipative systems. The Hamilton-Jacobi
theory now can be used to solve the equations of motion for the transformed variables $(Q, P)$ plus the
transformed Hamiltonian $\mathcal{H}(Q, P)$. The derivative of the generating function gives
$$P = \frac{\partial S}{\partial Q}$$
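Use $P = \partial S/\partial Q$ to substitute for $P$ in the transformed Hamiltonian $\mathcal{H}(Q, P)$; then the Hamilton-Jacobi
method gives
$$\frac{1}{2m}\left(\frac{\partial S}{\partial Q}\right)^2 + \frac{\Gamma}{2}Q\frac{\partial S}{\partial Q} + \frac{m\omega_0^2}{2}Q^2 + \frac{\partial S}{\partial t} = 0$$
This equation is separable as described in 14.107 and thus let
$$S(Q, \alpha, t) = W(Q, \alpha) - \alpha t$$
Solving the resulting quadratic equation for $\partial W/\partial Q$ leaves a choice of sign in front of the square-root term.
The choice of the sign is irrelevant for this case and thus the positive sign is chosen. There are three possible
cases for the solution depending on whether the square-root term is real, zero, or imaginary.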
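Case 1: $\frac{\Gamma}{2\omega_0} < 1$, that is, $\frac{\Gamma}{2} < \omega_0$
Define $C = \sqrt{1 - \left(\frac{\Gamma}{2\omega_0}\right)^2}$. Then the equation for $\partial W/\partial Q$ can be integrated to give
$$S = -\alpha t - \frac{m\Gamma Q^2}{4} + m\omega_0\int\sqrt{\left(\frac{2\alpha}{m\omega_0^2} - C^2Q^2\right)}\,dQ$$
and
$$\beta = \frac{\partial S}{\partial\alpha} = -t + \frac{1}{\omega_0}\int\frac{dQ}{\sqrt{\left(\frac{2\alpha}{m\omega_0^2} - C^2Q^2\right)}}$$
This integral gives
$$\sin^{-1}\left(CQ\sqrt{\frac{m\omega_0^2}{2\alpha}}\right) = C\omega_0(t + \beta) \equiv \omega t + \delta$$
where
$$\omega = \omega_0C = \omega_0\sqrt{1 - \left(\frac{\Gamma}{2\omega_0}\right)^2} = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$$
Transforming back to the original variable $q$ gives
$$q(t) = Ae^{-\frac{\Gamma t}{2}}\sin(\omega t + \delta)$$
where $A$ and $\delta$ are given by the initial conditions. This result is identical to the solution for the underdamped
linearly-damped linear oscillator given previously in equation 3.35.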
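Case 2: $\frac{\Gamma}{2\omega_0} = 1$, that is, $\frac{\Gamma}{2} = \omega_0$
In this case $C = \sqrt{1 - \left(\frac{\Gamma}{2\omega_0}\right)^2} = 0$ and thus the action simplifies to
$$S = -\alpha t - \frac{m\Gamma Q^2}{4} + \sqrt{2m\alpha}\,Q$$
and
$$\beta = \frac{\partial S}{\partial\alpha} = -t + \frac{Q}{\omega_0}\sqrt{\frac{m\omega_0^2}{2\alpha}}$$
Therefore the solution is
$$q(t) = e^{-\frac{\Gamma t}{2}}(F + Gt)$$
where $F$ and $G$ are constants given by the initial conditions. This is the solution for the critically-damped
linearly-damped, linear oscillator given previously in equation 3.38.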
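Case 3: $\frac{\Gamma}{2\omega_0} > 1$, that is, $\frac{\Gamma}{2} > \omega_0$
Define a real constant $D$ where $D = \sqrt{\left(\frac{\Gamma}{2\omega_0}\right)^2 - 1}$, so that $D^2 = -C^2$ with $C^2 \equiv 1 - \left(\frac{\Gamma}{2\omega_0}\right)^2$; then
$$S = -\alpha t - \frac{m\Gamma Q^2}{4} + m\omega_0\int\sqrt{\left(\frac{2\alpha}{m\omega_0^2} + D^2Q^2\right)}\,dQ$$
Then
$$\beta = \frac{\partial S}{\partial\alpha} = -t + \frac{1}{\omega_0}\int\frac{dQ}{\sqrt{\left(\frac{2\alpha}{m\omega_0^2} + D^2Q^2\right)}}$$
This last integral gives
$$\sinh^{-1}\left(DQ\sqrt{\frac{m\omega_0^2}{2\alpha}}\right) = D\omega_0(t + \beta) \equiv \omega t + \delta$$
where
$$\omega = \omega_0D = \omega_0\sqrt{\left(\frac{\Gamma}{2\omega_0}\right)^2 - 1}$$
Then the original variable gives
$$q(t) = Ae^{-\frac{\Gamma t}{2}}\sinh(\omega t + \delta)$$
This is the classic solution of the overdamped linearly-damped, linear harmonic oscillator given previously in
equation 3.37. The canonical transformation from a non-autonomous to an autonomous system allowed use
of Hamiltonian mechanics to solve the damped oscillator problem.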
Note that this example used Bateman’s non-standard Lagrangian, and corresponding Hamiltonian, for
handling a dissipative linear oscillator system where the dissipation depends linearly on velocity. This non-
standard Lagrangian led to the correct equations of motion and solutions when applied using either the
time-dependent Lagrangian, or time-dependent Hamiltonian, and these solutions agree with those given in
chapter 3.5 which were derived using Newtonian mechanics.
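The claim that the transformed Hamiltonian is conserved for the damped oscillator can be tested directly on the underdamped solution. The sketch below is an illustration under stated assumptions: it takes $q(t) = Ae^{-\Gamma t/2}\sin(\omega t + \delta)$ with $\omega = \sqrt{\omega_0^2 - (\Gamma/2)^2}$, the Bateman canonical momentum $p = m\dot{q}e^{\Gamma t}$, and arbitrary values for `m`, `omega0`, `Gamma`, `A`, `delta`. It evaluates both the original time-dependent Hamiltonian and the transformed one at several times.

```python
import math

m, omega0, Gamma = 1.3, 2.0, 0.5   # illustrative parameters (assumed, underdamped)
A, delta = 0.7, 0.4                # amplitude and phase set by initial conditions
omega = math.sqrt(omega0 ** 2 - (Gamma / 2.0) ** 2)


def trajectory(t):
    """Return (q, p) on the underdamped solution, with p = m * qdot * exp(Gamma * t)."""
    envelope = A * math.exp(-0.5 * Gamma * t)
    phase = omega * t + delta
    q = envelope * math.sin(phase)
    q_dot = envelope * (omega * math.cos(phase) - 0.5 * Gamma * math.sin(phase))
    return q, m * q_dot * math.exp(Gamma * t)


def original_hamiltonian(q, p, t):
    return p ** 2 * math.exp(-Gamma * t) / (2.0 * m) + 0.5 * m * omega0 ** 2 * q ** 2 * math.exp(Gamma * t)


def transformed_hamiltonian(q, p, t):
    # The original Hamiltonian plus the extra (Gamma / 2) * q * p term from the generating function.
    return original_hamiltonian(q, p, t) + 0.5 * Gamma * q * p


times = [0.0, 0.5, 1.0, 2.0, 5.0]
original_values = [original_hamiltonian(*trajectory(t), t) for t in times]
transformed_values = [transformed_hamiltonian(*trajectory(t), t) for t in times]
expected_constant = 0.5 * m * A ** 2 * omega ** 2
```

Along this trajectory every entry of `transformed_values` should equal $\frac{1}{2}mA^2\omega^2$, whereas `original_values` varies with time, which is the distinction the example draws between the two Hamiltonians.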
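Let $q_1 = x$, $q_2 = y$, $q_3 = z$, $p_1 = p_x$, $p_2 = p_y$, $p_3 = p_z$.
The momentum components are given by
$$p_i = \frac{\partial S(\mathbf{q},\boldsymbol{\alpha},t)}{\partial q_i} \tag{14.113}$$
which corresponds to
$$\mathbf{p} = \nabla S = \nabla W \tag{14.114}$$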
where for each cyclic variable the integral is taken over one complete period of oscillation. The cyclic variable
$I_i$ is called the action variable where
$$I_i \equiv \frac{1}{2\pi}J_i = \frac{1}{2\pi}\oint p_i\,dq_i \tag{14.117}$$
The variable canonically conjugate to the action variable $\mathbf{I}$ is the angle variable $\boldsymbol{\phi}$. Note that the name "action variable"
is used to differentiate $\mathbf{I}$ from the action functional $S = \int L\,dt$ which has the same units, i.e. angular
momentum.
The general principle underlying the use of action-angle variables is illustrated by considering one body, of mass $m$, subject to a one-dimensional bound conservative potential energy $U(q)$. The Hamiltonian is given by
\[ H(p, q) = \frac{p^2}{2m} + U(q) \tag{14.118} \]
This bound system has a $(q, p)$ phase space contour for each energy $H = E$
\[ p(q, E) = \pm\sqrt{2m(E - U(q))} \tag{14.119} \]
For an oscillatory system the two-valued momentum of equation 14.119 is non-trivial to handle. By contrast, the area $J \equiv \oint p \, dq$ of the closed loop in phase space is a single-valued scalar quantity that depends on $E$ and $U(q)$. Moreover, Liouville's theorem states that the area of the closed contour in phase space $J \equiv \oint p \, dq$ is invariant to canonical transformations. These facts suggest the use of a new pair of conjugate variables, $(\phi, I)$, where $I(E)$ uniquely labels the trajectory, and corresponding area, of a closed loop in phase space for each value of $E$, and the single-valued function $\phi$ is a corresponding angle that specifies the exact point along the phase-space contour as illustrated in Fig 14.3.
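For simplicity consider the linear harmonic oscillator where
\[ U(q) = \frac{1}{2} m \omega^2 q^2 \tag{14.120} \]
Then the Hamiltonian, equation 14.118, equals
\[ H(p, q) = \frac{p^2}{2m} + \frac{1}{2} m \omega^2 q^2 \tag{14.121} \]
Hamilton's equations of motion give that
\[ \dot{p} = -\frac{\partial H}{\partial q} = -m\omega^2 q \tag{14.122} \]
\[ \dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m} \tag{14.123} \]
which have the solution
\[ q = C \cos(\omega(t - t_0)) \tag{14.124} \]
\[ p = -m\omega C \sin(\omega(t - t_0)) \tag{14.125} \]
where $C$ is the amplitude and $t_0$ fixes the initial phase.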
[ ]() = 1
That is, the phase space has been mapped from ellipses, with area proportional to $E$ in the $(q, p)$ phase space, to a cylindrical $(\phi, I)$ phase space where $I = E/\omega$ are constant values that are independent of the angle, while $\phi$ increases linearly with time. Thus the variables $(q, p)$ are periodic with modulus $\Delta\phi = 2\pi$.
The period $\tau$ of the periodic oscillatory motion is given simply by $\Delta\phi = 2\pi = \omega\tau$, which is the well known result for the harmonic oscillator. Note that the action-angle variable canonical transformation has determined the frequency of the periodic motion without solving the detailed trajectory of the motion.
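The statement that the frequency follows from the action variable alone can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the textbook's own procedure: it evaluates $I = \frac{1}{2\pi}\oint p\,dq$ for the harmonic oscillator by midpoint quadrature, then estimates the angular frequency as $dE/dI$. The function names, the quadrature resolution, and the sample values of $E$, $m$, and $\omega$ are hypothetical choices.

```python
import math

def action_variable(energy, mass, omega, n=20000):
    """Return I = (1/2pi) * closed-loop integral of p dq for U = m w^2 q^2 / 2."""
    q_max = math.sqrt(2.0 * energy / (mass * omega**2))  # classical turning point
    dq = 2.0 * q_max / n
    upper_branch = 0.0
    for k in range(n):
        q = -q_max + (k + 0.5) * dq  # midpoint of the k-th interval
        p_squared = 2.0 * mass * (energy - 0.5 * mass * omega**2 * q**2)
        upper_branch += math.sqrt(max(p_squared, 0.0)) * dq
    # The closed loop covers the +p and -p branches, hence the factor of 2.
    return 2.0 * upper_branch / (2.0 * math.pi)

def frequency_from_action(energy, mass, omega, d_energy=1e-2):
    """Estimate the angular frequency as dE/dI by a central difference."""
    d_action = (action_variable(energy + d_energy, mass, omega)
                - action_variable(energy - d_energy, mass, omega))
    return 2.0 * d_energy / d_action

print(action_variable(3.0, 2.0, 1.5))        # close to E/omega = 2.0
print(frequency_from_action(3.0, 2.0, 1.5))  # close to omega = 1.5
```

No trajectory is integrated here; only the phase-space area as a function of energy is needed.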
14.5. ACTION-ANGLE VARIABLES 425
The above example of the harmonic oscillator has shown that, for integrable periodic systems, it is possible to identify a canonical transformation to $(\phi, I)$ such that the Hamiltonian is independent of the angle $\phi$ which specifies the instantaneous location on the constant energy contour $I$. If the phase space contour is a separatrix, then it divides phase space into invariant regions containing phase-space contours with differing behavior. The action-angle variables are not useful for separatrix contours. For rolling motion, the system rotates with continuously increasing, or decreasing, angle, and there is no natural boundary for the action-angle variable since the phase space trajectory is continuous and not closed. However, the action-angle approach still is valid if the motion involves periodic as well as rolling motion.
The example of the one-dimensional, one-body, harmonic oscillator can be expanded to the more general case for many bodies in three dimensions. This is illustrated by considering multiple periodic systems for which the Hamiltonian is conservative and where the equations of the canonical transformation are separable. The generalized momenta then can be written as
\[ p_i = \frac{\partial W_i(q_i; \alpha_1, \alpha_2, \dots, \alpha_n)}{\partial q_i} \tag{14.136} \]
for which each $p_i$ is a function of $q_i$ and the $n$ integration constants $\alpha_j$.
The momentum $p_i(q_i, \alpha_1, \alpha_2, \dots, \alpha_n)$ represents the trajectory of the system in the $(q_i, p_i)$ phase space that is characterized by Hamilton's characteristic function $W(q, \alpha)$. Combining equations 14.116 and 14.136 gives
\[ J_i \equiv \oint \frac{\partial W_i(q_i; \alpha_1, \alpha_2, \dots, \alpha_n)}{\partial q_i} \, dq_i \tag{14.138} \]
Since $q_i$ is merely a variable of integration, each active action variable $J_i$ is a function of the $n$ constants of integration in the Hamilton-Jacobi equation. Because of the independence of the separable-variable pairs $(q_i, p_i)$, the $J_i$ form $n$ independent functions of the $\alpha_i$ and hence are suitable for use as a new set of constant momenta. Thus the characteristic function can be written as
\[ W(q_1, \dots, q_n; J_1, \dots, J_n) = \sum_j W_j(q_j; J_1, \dots, J_n) \tag{14.139} \]
\[ \phi_i = 2\pi\nu_i t + \beta_i \tag{14.142} \]
that is, they are linear functions of time. The constants $\nu_i$ can be identified with the frequencies of the multiple periodic motions.
The action-angle variables appear to be no different than a particular set of transformed coordinates. Their merit appears when the physical interpretation is assigned to $\nu_i$. Consider the change $\delta\phi_i$ as the $q_j$ are changed infinitesimally
\[ \delta\phi_i = \sum_j \frac{\partial\phi_i}{\partial q_j} \, dq_j = \sum_j \frac{\partial^2 W}{\partial I_i \, \partial q_j} \, dq_j \tag{14.143} \]
The derivative with respect to $q_j$ vanishes except for the $W_j$ component of $W$. Thus equation 14.143 reduces to
\[ \delta\phi_i = \frac{\partial}{\partial I_i} \sum_j p_j(q_j, I) \, dq_j \tag{14.144} \]
Therefore, the total change in $\phi_i$ as the system goes through one complete cycle is
\[ \Delta\phi_i = \sum_j \frac{\partial}{\partial I_i} \oint p_j(q_j, I) \, dq_j = 2\pi \tag{14.145} \]
where $\frac{\partial}{\partial I_i}$ is outside the integral since the $I_i$ are constants for cyclic motion. Thus $\Delta\phi_i = 2\pi = \omega_i\tau_i$ where $\tau_i$ is the period for one cycle of oscillation, and the frequency $\nu_i$ is related to the angular frequency $\omega_i$ by
\[ \nu_i = \frac{\omega_i}{2\pi} = \frac{1}{\tau_i} \tag{14.146} \]
Thus the frequency associated with the periodic motion is the reciprocal of the period $\tau_i$. The secret here is that the derivative of $H$ with respect to the action variable $I_i$, given by equation 14.141, directly determines the frequency of the periodic motion without the need to solve the complete equations of motion. Note that
multiple periodic motion can be represented by a Fourier expansion of the form
∞
X ∞
X ∞
X
= 1 2(1 1 +2 2 +3 3 ++ ) (14.147)
1 =−∞ 2 =−∞ =−∞
Although the action-angle approach to Hamilton-Jacobi theory does not produce complete equations of
motion, it does provide the frequency decomposition that often is the physics of interest. The reason that
the powerful action-angle variable approach has been introduced here is that it is used extensively in celestial
mechanics. The action-angle concept also played a key role in the development of quantum mechanics, in
that Sommerfeld recognized that Bohr’s ad hoc assumption that angular momentum is quantized, could be
expressed in terms of quantization of the action variable, as is mentioned in chapter 17.
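Then the average mean square amplitude and velocity over one period are
\[ \langle\theta^2\rangle = \langle[\theta_0 \cos(\omega t + \phi_0)]^2\rangle = \frac{\theta_0^2}{2} \]
\[ \langle\dot{\theta}^2\rangle = \langle[-\omega\theta_0 \sin(\omega t + \phi_0)]^2\rangle = \frac{\omega^2\theta_0^2}{2} \]
Since, for the simple pendulum, $\omega^2 = g/L$, then the tension in the string
\[ T = mg\left(1 - \frac{\langle\theta^2\rangle}{2}\right) + mL\langle\dot{\theta}^2\rangle = mg\left(1 + \frac{\theta_0^2}{4}\right) \]
Assuming that $\theta_0$ is a small angle, and that the change in length $-\Delta L$ is very small during one period, then the work done is
\[ \Delta W = -T\Delta L = -mg\Delta L - \frac{mg\theta_0^2}{4}\Delta L \tag{a} \]
while the change in internal oscillator energy is
\[ \Delta(-mgL\cos\theta_0) = \Delta\left[-mgL\left(1 - \frac{\theta_0^2}{2}\right)\right] = -mg\Delta L + \frac{1}{2}mg\Delta(L\theta_0^2) = -mg\Delta L + \frac{1}{2}mg\theta_0^2\Delta L + mgL\theta_0\Delta\theta_0 \tag{b} \]
The work done must balance the increment in internal energy, therefore
\[ mgL\theta_0\Delta\theta_0 + \frac{3mg\theta_0^2\Delta L}{4} = 0 \]
or
\[ mgL\theta_0^2 \, \Delta\ln(\theta_0 L^{3/4}) = 0 \]
Therefore it follows that
\[ \theta_0 L^{3/4} = \text{constant} \tag{c} \]
or
\[ \theta_0 \propto L^{-3/4} \]
Thus shortening the length of the pendulum string from $L$ to $L/2$ adiabatically corresponds to the amplitude increasing by a factor 1.68.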
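Consider the action-angle integral for one closed period $\tau = 2\pi/\omega$ for this problem
\[ \begin{aligned}
J &= \oint p_\theta \, d\theta \\
  &= \oint mL^2\dot{\theta} \cdot \dot{\theta} \, dt \\
  &= mL^2 \langle\dot{\theta}^2\rangle \frac{2\pi}{\omega} \\
  &= \pi mL^2\omega\theta_0^2 \\
  &= \pi m g^{1/2} \theta_0^2 L^{3/2} = \text{constant}
\end{aligned} \]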
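The factor 1.68 and the constancy of the action can be checked with a few lines of arithmetic. This is a minimal sketch under the small-angle assumption used in the example; the function names and the sample mass, length, and amplitude are illustrative choices, not values from the text.

```python
import math

def amplitude_ratio(length_ratio):
    """theta0_new / theta0_old for L_new / L_old = length_ratio, using theta0 ~ L**(-3/4)."""
    return length_ratio ** (-0.75)

def action(mass, g, length, theta0):
    """Small-angle pendulum action J = pi * m * sqrt(g) * theta0**2 * L**(3/2)."""
    return math.pi * mass * math.sqrt(g) * theta0**2 * length**1.5

ratio = amplitude_ratio(0.5)   # string shortened from L to L/2
print(round(ratio, 2))         # 1.68

# The action is unchanged when the amplitude is rescaled by the same ratio.
before = action(1.0, 9.81, 1.0, 0.10)
after = action(1.0, 9.81, 0.5, 0.10 * ratio)
print(before, after)
```

The two printed actions agree to rounding error, which is the adiabatic invariance expressed by relation (c).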
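The constant $\alpha$ can be identified with the new momentum $P$. Then the transformation equations become
\[ p = \frac{\partial S}{\partial q} = \alpha \qquad Q = \frac{\partial S}{\partial \alpha} = q - \alpha t = \beta \]
That is
\[ q = \alpha t + \beta \]
which corresponds to motion with a uniform velocity $\alpha$ in the system.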
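(b) Consider that the Hamiltonian is perturbed by addition of potential $U = \frac{q^2}{2}$ which corresponds to the harmonic oscillator. Then
\[ H = \frac{1}{2}p^2 + \frac{q^2}{2} \]
Consider the transformed Hamiltonian
\[ \mathcal{H} = H + \frac{\partial S}{\partial t} = \frac{1}{2}p^2 + \frac{q^2}{2} - \frac{\alpha^2}{2} = \frac{q^2}{2} = \frac{1}{2}(\alpha t + \beta)^2 \]
Hamilton's equations of motion
\[ \dot{Q} = \frac{\partial\mathcal{H}}{\partial P} \qquad \dot{P} = -\frac{\partial\mathcal{H}}{\partial Q} \]
give that
\[ \dot{\beta} = (\alpha t + \beta)t \]
\[ \dot{\alpha} = -(\alpha t + \beta) \]
These two equations can be solved to give
\[ \ddot{\alpha} + \alpha = 0 \]
which is the equation of a harmonic oscillator, showing that $\alpha$ is harmonic of the form $\alpha = \alpha_0 \sin(t + \delta)$ where $\alpha_0, \delta$ are constants of motion. Thus
\[ \beta = -\dot{\alpha} - \alpha t = -\alpha_0[\cos(t + \delta) + t\sin(t + \delta)] \]
The transformation equations then give
\[ p = \alpha = \alpha_0 \sin(t + \delta) \]
\[ q = \beta + \alpha t = -\dot{\alpha} = -\alpha_0\cos(t + \delta) \]
Hence the solution for the perturbed system is harmonic, which is to be expected since the potential has a quadratic dependence on position.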
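The perturbed solution can be checked by substituting it back into Hamilton's equations for the transformed Hamiltonian $\frac{1}{2}(\alpha t + \beta)^2$. The sketch below assumes unit mass and unit frequency, as in the example; the constants `ALPHA0` and `DELTA`, the function names, and the finite-difference step are illustrative assumptions.

```python
import math

ALPHA0, DELTA = 0.7, 0.4  # arbitrary illustrative constants of motion

def alpha(t):
    """Transformed momentum alpha(t) = alpha0 sin(t + delta)."""
    return ALPHA0 * math.sin(t + DELTA)

def beta(t):
    """Transformed coordinate beta(t) = -alpha0 [cos(t + delta) + t sin(t + delta)]."""
    return -ALPHA0 * (math.cos(t + DELTA) + t * math.sin(t + DELTA))

def derivative(f, t, h=1e-5):
    """Central-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)

def residuals(t):
    """Residuals of beta' = (alpha t + beta) t and alpha' = -(alpha t + beta)."""
    q = beta(t) + alpha(t) * t  # original coordinate q = beta + alpha t
    return (derivative(beta, t) - q * t,
            derivative(alpha, t) + q,
            q + ALPHA0 * math.cos(t + DELTA))  # q should equal -alpha0 cos(t + delta)

for t in (0.0, 0.5, 2.0):
    print(t, residuals(t))  # all three residuals should be near zero
```

All three residuals vanish to within the finite-difference error, consistent with simple harmonic motion in the original variables.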
The symplectic matrix $\mathbf{J}$ is defined as being a $2n$ by $2n$ skew-symmetric, orthogonal matrix that is broken into four $n \times n$ null or unit matrices according to the scheme
\[ \mathbf{J} = \begin{pmatrix} [0] & +[1] \\ -[1] & [0] \end{pmatrix} \tag{14.152} \]
where $[0]$ is the $n$-dimensional null matrix, for which all elements are zero. Also $[1]$ is the $n$-dimensional unit matrix, for which the diagonal matrix elements are unity and all off-diagonal matrix elements are zero. The $\mathbf{J}$ matrix accounts for the opposite signs used in the equations for $\dot{\mathbf{q}}$ and $\dot{\mathbf{p}}$. The symplectic representation allows Hamilton's equations of motion to be written in the compact form
\[ \dot{\boldsymbol{\eta}} = \mathbf{J}\frac{\partial H}{\partial \boldsymbol{\eta}} \tag{14.153} \]
This textbook does not use the elegant symplectic representation since it excludes the important generalized forces and Lagrange multiplier forces.
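The block structure of $\mathbf{J}$ and the compact form of Hamilton's equations can be illustrated with plain Python lists. This is a sketch only, assuming the ordering $\boldsymbol{\eta} = (q_1, \dots, q_n, p_1, \dots, p_n)$ and a one-dimensional harmonic oscillator as the test Hamiltonian; the helper names are hypothetical.

```python
def symplectic_matrix(n):
    """Build the 2n x 2n matrix with blocks [[0, +1], [-1, 0]]."""
    size = 2 * n
    J = [[0] * size for _ in range(size)]
    for i in range(n):
        J[i][n + i] = 1    # upper-right block: +[1]
        J[n + i][i] = -1   # lower-left block: -[1]
    return J

def matvec(M, v):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix-matrix product for list-of-lists matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def grad_hamiltonian(eta, mass=1.0, omega=1.0):
    """dH/d(eta) for H = p^2/2m + m w^2 q^2/2, with eta = (q, p)."""
    q, p = eta
    return [mass * omega**2 * q, p / mass]

eta = [0.3, -1.2]
eta_dot = matvec(symplectic_matrix(1), grad_hamiltonian(eta))
print(eta_dot)  # (q_dot, p_dot) = (p/m, -m w^2 q)

J2 = symplectic_matrix(2)
print(matmul(J2, J2))  # equals minus the 4 x 4 unit matrix
```

The product $\mathbf{J}\mathbf{J} = -[1]$ together with skew-symmetry gives $\mathbf{J}^T\mathbf{J} = [1]$, which is the orthogonality stated above.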
The Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and the action functional $S(\mathbf{q}, \mathbf{p}, t)$ are scalar functions under rotation, but they determine the vector force fields and the corresponding equations of motion. Thus the use of rotationally-invariant functions $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ provides a simple representation of the vector force fields. This is analogous to the use of scalar potential fields $\phi(\mathbf{q}, t)$ to represent the electrostatic and gravitational vector force fields. Like scalar potential fields, Lagrangian and Hamiltonian mechanics represent the observables as derivatives of $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$, and the absolute values of $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ are undefined; only differences in $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ are observable. For example, the generalized momenta are given by the derivatives $p_i \equiv \frac{\partial L}{\partial \dot{q}_i}$ and $p_i = \frac{\partial S}{\partial q_i}$. The physical significance of the least action $S(\mathbf{q}, \boldsymbol{\alpha}, t)$ is illustrated when the canonically transformed momenta $\mathbf{P} = \boldsymbol{\alpha}$ is a constant. Then the generalized momenta and the Hamilton-Jacobi equation imply that the total time derivative of the action equals
\[ \frac{dS}{dt} = \frac{\partial S}{\partial q_i}\dot{q}_i + \frac{\partial S}{\partial t} = \mathbf{p}\cdot\dot{\mathbf{q}} - H = L \tag{14.154} \]
The indefinite integral of this equation reproduces the definite integral (14.1) to within an arbitrary constant, i.e.
\[ S(\mathbf{q}, \mathbf{p}, t) = \int L(\mathbf{q}, \dot{\mathbf{q}}, t) \, dt + \text{constant} \tag{14.155} \]
Lagrangian formulation:
Consider a system with $n$ independent generalized coordinates, plus $m$ constraint forces that are not required to be known. The Lagrangian approach can reduce the system to a minimal system of $s = n - m$ independent generalized coordinates leading to $s = n - m$ second-order differential equations. By comparison, the Newtonian approach uses $n + m$ unknowns. Alternatively, the Lagrange multipliers approach allows determination of the holonomic constraint forces resulting in $s = n + m$ second order equations to determine $s = n + m$ unknowns. The Lagrangian potential function is limited to conservative forces, but generalized forces can be used to handle non-conservative and non-holonomic forces. The advantage of the Lagrange equations of motion is that they can deal with any type of force, conservative or non-conservative, and they directly determine $q$, $\dot{q}$ rather than $q$, $p$ which then requires relating $p$ to $\dot{q}$. The Lagrange approach is superior to the Hamiltonian approach if a numerical solution is required for typical undergraduate problems in classical mechanics. However, Hamiltonian mechanics has a clear advantage for addressing more profound and philosophical questions in physics.
Hamiltonian formulation:
For a system with $n$ independent generalized coordinates, and $m$ constraint forces, the Hamiltonian approach determines $2s$ first-order differential equations. In contrast to Lagrangian mechanics, where the Lagrangian is a function of the coordinates and their velocities, the Hamiltonian uses the variables $\mathbf{q}$ and $\mathbf{p}$, rather than velocity. The Hamiltonian has twice as many independent variables as the Lagrangian which is a great advantage, not a disadvantage, since it broadens the realm of possible transformations that can be used to simplify the solutions. Hamiltonian mechanics uses the conjugate coordinates $\mathbf{q}, \mathbf{p}$ corresponding to phase space. This is an advantage in most branches of physics and engineering. Compared to Lagrangian mechanics,
Hamiltonian mechanics has a significantly broader arsenal of powerful techniques that can be exploited to
obtain an analytical solution of the integrals of the motion for complicated systems. These techniques include the Poisson bracket formulation, canonical transformations, the Hamilton-Jacobi approach, the action-angle variables, and canonical perturbation theory. In addition, Hamiltonian dynamics also provides
a means of determining the unknown variables for which the solution assumes a soluble form, and it is
ideal for study of the fundamental underlying physics in applications to other fields such as quantum or
statistical physics. However, the Hamiltonian approach endemically assumes that the system is conservative
putting it at a disadvantage with respect to the Lagrangian approach. The appealing symmetry of the
Hamiltonian equations, plus their ability to utilize canonical transformations, makes it the formalism of
choice for examination of system dynamics. For example, Hamilton-Jacobi theory, action-angle variables
and canonical perturbation theory are used extensively to solve complicated multibody orbit perturbations
in celestial mechanics by finding a canonical transformation that transforms the perturbed Hamiltonian to
a solved unperturbed Hamiltonian.
The Hamiltonian formalism features prominently in quantum mechanics since there are well established rules for transforming the classical coordinates and momenta into linear operators used in quantum mechanics. The variables $\mathbf{q}, \dot{\mathbf{q}}$ used in Lagrangian mechanics do not have simple analogs in quantum physics. As a consequence, the Poisson bracket formulation and action-angle variables of Hamiltonian mechanics played a key role in development of matrix mechanics by Heisenberg, Born, and Dirac, while the Hamilton-Jacobi formulation played a key role in development of Schrödinger's wave mechanics. Similarly, Hamiltonian mechanics is the preeminent variational approach used in statistical mechanics.
14.9 Summary
This chapter has gone beyond what is normally covered in an undergraduate course in classical mechanics,
in order to illustrate the power of the remarkable arsenal of methods available for solution of the equations of
motion using Hamiltonian mechanics. This has included the Poisson bracket representation of Hamiltonian
formulation of mechanics, canonical transformations, Hamilton-Jacobi theory, action-angle variables, and
canonical perturbation theory. The purpose was to illustrate the power of variational principles in Hamil-
tonian mechanics and how they relate to fields such as quantum mechanics. The following are the key points
made in this chapter.
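Poisson brackets: The elegant and powerful Poisson bracket formalism of Hamiltonian mechanics was introduced. The Poisson bracket of any two continuous functions of generalized coordinates $F(p, q)$ and $G(p, q)$ is defined to be
\[ [F, G] \equiv \sum_i \left( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right) \tag{14.13} \]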
[ ] = 0 (1422)
There is a one-to-one correspondence between the commutator and Poisson bracket of two independent functions,
\[ (F_1 G_1 - G_1 F_1) = \lambda [F_1, G_1] \tag{14.38} \]
where $\lambda$ is an independent constant. In particular $F_1$ and $G_1$ commute if the Poisson bracket $[F_1, G_1] = 0$.
Poisson bracket representation of Hamiltonian mechanics: It has been shown that the Poisson bracket formalism contains the Hamiltonian equations of motion and is invariant to canonical transformations. Also this formalism extends Hamilton's canonical equations to non-commuting canonical variables. Hamilton's equations of motion can be expressed directly in terms of the Poisson brackets
\[ \dot{q}_k = [q_k, H] = \frac{\partial H}{\partial p_k} \tag{14.57} \]
\[ \dot{p}_k = [p_k, H] = -\frac{\partial H}{\partial q_k} \tag{14.58} \]
An important result is that the total time derivative of any operator $G$ is given by
\[ \frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{14.45} \]
Poisson brackets provide a powerful means of determining which observables are time independent and
whether different observables can be measured simultaneously with unlimited precision. It was shown that
the Poisson bracket is invariant to canonical transformations, which is a valuable feature for Hamiltonian
mechanics. Poisson brackets were used to prove Liouville’s theorem which plays an important role in the use
of Hamiltonian phase space in statistical mechanics. The Poisson bracket is equally applicable to continuous
solutions in classical mechanics as well as discrete solutions in quantized systems.
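The Poisson bracket relations summarized above can be explored numerically for a single degree of freedom. The sketch below is an illustration, not the textbook's method: it approximates the partial derivatives in the bracket by central differences and applies them to a harmonic-oscillator Hamiltonian. The function names, step size, and the sample values of mass and frequency are assumptions.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """Estimate [F, G] = dF/dq dG/dp - dF/dp dG/dq at the point (q, p)."""
    dF_dq = (F(q + h, p) - F(q - h, p)) / (2 * h)
    dF_dp = (F(q, p + h) - F(q, p - h)) / (2 * h)
    dG_dq = (G(q + h, p) - G(q - h, p)) / (2 * h)
    dG_dp = (G(q, p + h) - G(q, p - h)) / (2 * h)
    return dF_dq * dG_dp - dF_dp * dG_dq

MASS, OMEGA = 2.0, 3.0  # illustrative values

def hamiltonian(q, p):
    """H = p^2/2m + m w^2 q^2 / 2."""
    return p**2 / (2 * MASS) + 0.5 * MASS * OMEGA**2 * q**2

def coordinate(q, p):
    return q

def momentum(q, p):
    return p

q0, p0 = 0.4, 1.5
print(poisson_bracket(coordinate, momentum, q0, p0))     # fundamental bracket, close to 1
print(poisson_bracket(coordinate, hamiltonian, q0, p0))  # q_dot, close to p/m
print(poisson_bracket(momentum, hamiltonian, q0, p0))    # p_dot, close to -m w^2 q
print(poisson_bracket(hamiltonian, hamiltonian, q0, p0)) # close to 0: H is conserved
```

The last line illustrates the use of the bracket with $H$ to test whether an observable without explicit time dependence is a constant of motion.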
Canonical transformations: A transformation between a canonical set of variables $(q, p)$ with Hamiltonian $H(q, p, t)$ to another set of canonical variables $(Q, P)$ with Hamiltonian $\mathcal{H}(Q, P, t)$ can be achieved using a generating function $F$ such that
\[ \mathcal{H}(Q, P, t) = H(q, p, t) + \frac{\partial F}{\partial t} \tag{14.89} \]
Possible generating functions are summarized in the following table.
If the canonical transformation makes $\mathcal{H}(Q, P, t) = 0$ then the conjugate variables $(Q, P)$ are constants of motion. Similarly if $\mathcal{H}(Q, P, t)$ is a cyclic function then the corresponding $P$ are constants of motion.
Hamilton-Jacobi theory: Hamilton-Jacobi theory determines the generating function required to perform canonical transformations that lead to a powerful method for obtaining the equations of motion for a system. The Hamilton-Jacobi theory uses the action function $S \equiv F_2$ as a generating function, and the canonical momentum is given by
\[ p_i = \frac{\partial S}{\partial q_i} \tag{14.4} \]
This can be used to replace $p_i$ in the Hamiltonian $H$ leading to the Hamilton-Jacobi equation
\[ H\left(q; \frac{\partial S}{\partial q}; t\right) + \frac{\partial S}{\partial t} = 0 \tag{14.94} \]
Solutions of the Hamilton-Jacobi equation were obtained by separation of variables. The close optical-
mechanical analogy of the Hamilton-Jacobi theory is an important advantage of this formalism that led to
it playing a pivotal role in the development of wave mechanics by Schrödinger.
Action-angle variables: The action-angle variables exploit a canonical transformation from $(q, p) \to (\phi, I)$ where
\[ I \equiv \frac{1}{2\pi} J = \frac{1}{2\pi} \oint p \, dq \tag{14.117} \]
For periodic motion the phase-space trajectory is closed with area given by $J$ and this area is conserved for the above canonical transformation. For a conserved Hamiltonian the action variable $I$ is independent of the angle variable $\phi$. The time dependence of the angle variable $\phi$ directly determines the frequency of the periodic motion without recourse to calculation of the detailed trajectory of the periodic motion.
Canonical perturbation theory: Canonical perturbation theory is a valuable method of handling multi-
body interactions. The adiabatic invariance of the action-angle variables provides a powerful approach for
exploiting canonical perturbation theory.
Comparison of Lagrangian and Hamiltonian formulations: The remarkable power, and intellectual
beauty, provided by use of variational principles to exploit the underlying principles of natural economy in
nature, has had a long and rich history. It has led to profound developments in many branches of theoretical
physics. However, it is noted that although the above algebraic formulations of classical mechanics have been
used for over two centuries, the important limitations of these algebraic formulations to non-linear systems
remain a challenge that still is being addressed.
It has been shown that the Lagrangian and Hamiltonian formulations represent the vector force fields, and the corresponding equations of motion, in terms of the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ or the action functional $S(\mathbf{q}, \mathbf{p}, t)$ which are scalars under rotation. The Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ is related to the action functional $S(\mathbf{q}, \mathbf{p}, t)$ by
\[ S(\mathbf{q}, \mathbf{p}, t) = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t) \, dt \tag{14.1} \]
These functions are analogous to electric potential, in that the observables are derived by taking derivatives of the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ or the action functional $S(\mathbf{q}, \mathbf{p}, t)$. The Lagrangian formulation is more convenient for deriving the equations of motion for simple mechanical systems. The Hamiltonian formulation has a greater arsenal of techniques for solving complicated problems, plus it uses the canonical variables $(q, p)$ which are the variables of choice for applications to quantum mechanics and statistical mechanics.
Workshop exercises
1. Poisson brackets are a powerful means of elucidating when observables are constants of motion and whether two observables can be simultaneously measured with unlimited precision. Consider a spherically symmetric Hamiltonian
\[ H = \frac{1}{2m}\left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2\theta} \right) + U(r) \]
for a mass $m$ where $U(r)$ is a central potential. Use the Poisson bracket plus the time dependence to determine the following:
(c) Show { } = . The following identity may be useful: = − .
(d) Show { 2 } = 0 .
p2 1 ¡ ¢
= + 21 12 + 22 22
2 2
What condition is satisfied if 2 a conserved quantity?
Problems
1. Consider the motion of a particle of mass $m$ in an isotropic harmonic oscillator potential $U = \frac{1}{2}kr^2$ and take the orbital plane to be the $x$-$y$ plane. The Hamiltonian is then
\[ H \equiv S_0 = \frac{1}{2m}(p_x^2 + p_y^2) + \frac{1}{2}k(x^2 + y^2) \]
Introduce the three quantities
\[ S_1 = \frac{1}{2m}(p_x^2 - p_y^2) + \frac{1}{2}k(x^2 - y^2) \]
\[ S_2 = \frac{1}{m}p_x p_y + kxy \]
\[ S_3 = \omega(x p_y - y p_x) \]
with $\omega = \sqrt{k/m}$. Use Poisson brackets to solve the following:
a) Show that $[S_0, S_i] = 0$ for $i = 1, 2, 3$ proving that $(S_1, S_2, S_3)$ are constants of motion.
b) Show that
\[ [S_1, S_2] = 2\omega S_3 \]
\[ [S_2, S_3] = 2\omega S_1 \]
\[ [S_3, S_1] = 2\omega S_2 \]
so that $(2\omega)^{-1}(S_1, S_2, S_3)$ have the same Poisson bracket relations as the components of a 3-dimensional angular momentum.
c) Show that
\[ S_0^2 = S_1^2 + S_2^2 + S_3^2 \]
2. Assume that the transformation equations between the two sets of coordinates $(q, p)$ and $(Q, P)$ are
\[ Q = \ln\left(\frac{\sin p}{q}\right) \]
\[ P = q \cot p \]
a) Assuming that $q, p$ are canonical variables, i.e. $[q, p] = 1$, show directly from the above transformation equations that $Q, P$ are canonical variables.
b) Show that
\[ p \, dq - P \, dQ = d(pq + q \cot p) \]
c) Find the explicit generating function $F_1(q, Q)$ that generates this transformation between these two sets of canonical variables. Note the integral $\int \sin^{-1} x \, dx = \sqrt{1 - x^2} + x \sin^{-1} x$.
3. Consider the uniform motion of a free particle of mass $m$. The Hamiltonian is a constant of motion and so is the function
\[ F(x, p, t) \equiv x - \frac{pt}{m} \]
(a) Compare the Poisson bracket $[H, F]$ with $\frac{\partial F}{\partial t}$ and prove that $F$ is a constant of motion.
(b) Prove that the Poisson bracket of two constants of motion is itself a constant of motion even if the constants $F(x, p, t)$ and $G(x, p, t)$ depend explicitly on time.
(c) Show in general that if the Hamiltonian and the quantity $F$ are constants of motion, then $\frac{\partial F}{\partial t}$ also is a constant of motion.
4. (a) Solve the Hamilton-Jacobi equation for the generating function $S(q, \alpha, t)$ for a single particle moving under the Hamiltonian $H = \frac{1}{2}p^2$. Find the canonical transformation $q = q(\beta, \alpha)$ and $p = p(\beta, \alpha)$ where $\beta$ and $\alpha$ are the transformed coordinate and momentum respectively. Interpret your result.
(b) If there is a perturbing Hamiltonian $\Delta H = \frac{1}{2}q^2$, then $\alpha$ no longer will be constant. Express the transformed Hamiltonian (using the same transformation found in part (a)) in terms of $\alpha$, $\beta$ and $t$. Solve for $\beta(t)$ and $\alpha(t)$ and show that the perturbed solution $q[\beta(t), \alpha(t)]$, $p[\beta(t), \alpha(t)]$ is simple harmonic.
Chapter 15
Analytical formulations for continuous systems
15.1 Introduction
Lagrangian and Hamiltonian mechanics have been used to determine the equations of motion for discrete systems having a finite, albeit sometimes large, number of discrete variables $q_i$ where $1 \le i \le n$. There are important classes of systems where it is more convenient to treat the system as being continuous. For example, the interatomic spacing in solids is a few $10^{-10}$ m, which is negligible compared with the size of typical macroscopic, three-dimensional solid objects. As a consequence, for wavelengths much greater than
the atomic spacing in solids, it is useful to treat macroscopic crystalline lattice systems as continuous three-
dimensional uniform solids, rather than as three-dimensional discrete lattice chains. Fluid and gas dynamics
are other examples of continuous mechanical systems. Another important class of continuous systems involves
the theory of fields, such as electromagnetic fields. Lagrangian and Hamiltonian mechanics of the continua
extend classical mechanics into the advanced topic of field theory. This chapter goes beyond the scope of a
typical undergraduate classical mechanics course in order to provide a brief glimpse of how Lagrangian and
Hamiltonian mechanics underlie advanced and important aspects of the mechanics of the continua, including
field theory.
\[ L = \frac{1}{2}\sum_{j=1}^{n+1}\left( m\dot{q}_j^2 - \kappa(q_{j-1} - q_j)^2 \right) \tag{15.1} \]
where the $n$ masses are attached in series to $n+1$ identical springs of length $d$ and spring constant $\kappa$. Assume that the spring has a uniform cross-section area $A$ and length $d$. Then each spring volume element $\Delta\tau = Ad$ has a mass $m$, that is, the volume mass density $\rho = \frac{m}{\Delta\tau}$ or $m = \rho\Delta\tau$. Chapter 15.5.3 will show that the spring constant $\kappa = \frac{EA}{d}$ where $E$ is Young's modulus, $A$ is the cross sectional area of the chain element, and $d$ is the length of the element. Then the spring constant can be written as $\kappa = \frac{E\Delta\tau}{d^2}$. Therefore equation 15.1 can be expressed as a sum over volume elements $\Delta\tau = Ad$
\[ L = \frac{1}{2}\sum_{j=1}^{n+1}\left( \rho\dot{q}_j^2 - E\left(\frac{q_{j-1} - q_j}{d}\right)^2 \right)\Delta\tau \tag{15.2} \]
In the limit that $n \to \infty$ and the spacing $d = dx \to 0$, the summation in equation 15.2 can be written as a volume integral where $x = jd$ is the distance along the linear chain and the volume element $\Delta\tau \to 0$. Then the Lagrangian can be written as the integral over the volume element $d\tau$ rather than a summation.
438 CHAPTER 15. ANALYTICAL FORMULATIONS FOR CONTINUOUS SYSTEMS
The coordinate $q_j(t)$ for the discrete chain has become a continuous function $q(x, t)$ for the uniform chain. Thus the integral form of the Lagrangian can be expressed as
\[ L = \frac{1}{2}\int\left( \rho\dot{q}^2 - E\left(\frac{\partial q(x, t)}{\partial x}\right)^2 \right) d\tau = \int \mathcal{L} \, d\tau \tag{15.4} \]
where the function $\mathcal{L}$ is called the Lagrangian density defined by
\[ \mathcal{L} \equiv \frac{1}{2}\left( \rho\dot{q}^2 - E\left(\frac{\partial q(x, t)}{\partial x}\right)^2 \right) \tag{15.5} \]
The variable $x$ in the Lagrangian density is not a generalized coordinate; it only serves the role of a continuous index played previously by the index $j$. For the discrete case, each value of $j$ defined a different generalized coordinate $q_j$. Now for each value of $x$ there is a continuous function $q(x, t)$ which is a function of both position and time.
Lagrange's equations of motion applied to the continuous Lagrangian in equation 15.4 give
\[ \rho\frac{\partial^2 q}{\partial t^2} - E\frac{\partial^2 q}{\partial x^2} = 0 \tag{15.6} \]
This is the familiar wave equation in one dimension for a longitudinal wave on the continuous chain with a phase velocity
\[ v_{phase} = \sqrt{\frac{E}{\rho}} \tag{15.7} \]
The continuous linear chain also can exhibit transverse modes which have a Lagrangian density where the Young's modulus $E$ is replaced by the tension $T$ in the chain, and $\rho$ is replaced by the linear mass density $\mu$ of the chain, leading to a phase velocity for a transverse wave $v_{phase} = \sqrt{T/\mu}$.
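The continuum limit can be illustrated by comparing the discrete chain with the phase velocity $\sqrt{E/\rho}$. The sketch below relies on the standard dispersion relation for a chain of identical masses and springs, $\omega(k) = 2\sqrt{\kappa/m}\,|\sin(kd/2)|$, which is an assumption brought in from elementary lattice dynamics and is not derived in this section. The material values are rough illustrative numbers for steel, and the function names are hypothetical.

```python
import math

def chain_phase_velocity(wavenumber, youngs_modulus, density, area, spacing):
    """Phase velocity omega/k of the discrete chain with m = rho*A*d and kappa = E*A/d."""
    mass = density * area * spacing
    kappa = youngs_modulus * area / spacing
    omega = 2.0 * math.sqrt(kappa / mass) * abs(math.sin(wavenumber * spacing / 2.0))
    return omega / wavenumber

def continuum_phase_velocity(youngs_modulus, density):
    """Phase velocity sqrt(E / rho) of the continuous chain."""
    return math.sqrt(youngs_modulus / density)

E_STEEL, RHO_STEEL = 200e9, 7850.0   # Pa and kg/m^3, illustrative values
AREA, SPACING = 1e-4, 1e-10          # m^2 and m, illustrative values

v_continuum = continuum_phase_velocity(E_STEEL, RHO_STEEL)
k_long = 2.0 * math.pi / 0.01        # 1 cm wavelength, far longer than the spacing
k_edge = math.pi / SPACING           # shortest wavelength the chain supports

print(v_continuum)
print(chain_phase_velocity(k_long, E_STEEL, RHO_STEEL, AREA, SPACING) / v_continuum)
print(chain_phase_velocity(k_edge, E_STEEL, RHO_STEEL, AREA, SPACING) / v_continuum)
```

For the long wavelength the ratio is indistinguishable from one, while at the shortest wavelength it drops to $2/\pi$, which shows where the continuous description stops being adequate.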
Following the same approach used in chapter 5.2, it is assumed that the stationary path for the action integral is described by the function $q(x, t)$. Define a neighboring function using a parametric representation $q(x, t; \epsilon)$ such that for $\epsilon = 0$ the function $q = q(x, t)$ yields the stationary action integral $S$.
15.3. THE LAGRANGIAN DENSITY FORMULATION FOR CONTINUOUS SYSTEMS 439
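Assume that an infinitesimal fraction $\epsilon$ of a neighboring function $\eta(x, t)$ is added to the extremum path $q(x, t)$. That is, assume
Then Hamilton's principle requires that the action integral be a stationary function value for $\epsilon = 0$, that is, $S(\epsilon)$ is independent of $\epsilon$, which is satisfied if
\[ \frac{\partial S(\epsilon)}{\partial \epsilon} = \int_{t_1}^{t_2}\int_{x_1}^{x_2}\left( \frac{\partial\mathcal{L}}{\partial q}\frac{\partial q}{\partial\epsilon} + \frac{\partial\mathcal{L}}{\partial\dot{q}}\frac{\partial\dot{q}}{\partial\epsilon} + \frac{\partial\mathcal{L}}{\partial q'}\frac{\partial q'}{\partial\epsilon} \right) dx \, dt = 0 \tag{15.13} \]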
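Since the auxiliary function $\eta(x, t)$ is arbitrary, the integrand term in the square brackets of equation 15.19 must equal zero. That is,
\[ \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{q}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial q'}\right) - \frac{\partial\mathcal{L}}{\partial q} = 0 \tag{15.20} \]
Equation 15.20 gives the equations of motion in terms of the Lagrangian density that has been derived based on Hamilton's principle.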
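equations of motion in terms of the Lagrangian density for three spatial dimensions involves the straightforward addition of the $y$ and $z$ coordinates. That is, in three dimensions the vector displacement is expressed by the vector $\mathbf{q}(x, y, z, t)$ and the Lagrangian density is related to the Lagrangian by integration over three dimensions. That is, they are related by the equation
\[ L = \int \mathcal{L}\left(\mathbf{q}, \frac{\partial\mathbf{q}}{\partial t}, \nabla\cdot\mathbf{q}, x, y, z, t\right) d\tau \tag{15.21} \]
where, in cartesian coordinates, the volume element $d\tau = dx \, dy \, dz$. The Lagrangian density is a function $\mathcal{L}(\mathbf{q}, \frac{\partial\mathbf{q}}{\partial t}, \nabla\cdot\mathbf{q}, x, y, z, t)$ where the one field quantity $q(x, t)$ has been extended to a spatial vector $\mathbf{q}(x, y, z, t)$ and the spatial derivatives $q'$ have been transformed into $\nabla\cdot\mathbf{q}$. Applying the method used for the one-dimensional spatial system to the three-dimensional system leads to the following set of equations of motion
\[ \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial t}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial x}}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial y}}\right) + \frac{\partial}{\partial z}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial z}}\right) - \frac{\partial\mathcal{L}}{\partial\mathbf{q}} = 0 \tag{15.22} \]
where the spatial derivatives have been written explicitly for clarity.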
Note that the equations of motion, equation 1522, treat the spatial and time coordinates symmetrically.
This symmetry between space and time is unchanged by multiplying the spatial and time coordinate by
arbitrary numerical factors. This suggests the possibility of introducing a four-dimensional coordinate system
≡ { }
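where the parameter is freely chosen. Using this 4-dimensional formalism allows equation 15.22 to be written more compactly as
\[ \sum_{\mu}^{4} \frac{\partial}{\partial x_\mu}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial x_\mu}}\right) - \frac{\partial\mathcal{L}}{\partial\mathbf{q}} = 0 \tag{15.23} \]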
As discussed in chapter 16 relativistic mechanics treats time and space symmetrically, that is, a four-
dimensional vector q ( ) can be used that treats time and the three spatial dimensions symmetrically
and equally. This four-dimensional space-time formulation allows the first four terms in equation 1522 to be
condensed into a single term which illustrates the symmetry underlying equation 1523. If the Lagrangian
density is Lorentz invariant, and if = then equation 1523 is covariant. Thus the Lagrangian density
formulation is ideally suited to the development of relativistically covariant descriptions of fields.
Chapter 15.3 illustrates, in general terms, how field theory can be expressed in a Lagrangian formulation via use of the Lagrange density. It is equally possible to obtain a Hamiltonian formulation for continuous systems analogous to that obtained for discrete systems. As summarized in chapter 14, the Hamiltonian and Hamilton's canonical equations of motion are related directly to the Lagrangian by use of a Legendre transformation. The Hamiltonian is defined as being
\[ H \equiv \sum_i \left( \dot{q}_i \frac{\partial L}{\partial\dot{q}_i} \right) - L \tag{15.24} \]
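In the limit that the coordinates are continuous, the summation in equation 15.26 can be transformed into a volume integral over the Lagrangian density $\mathcal{L}$. In addition, a momentum density can be represented by the vector field $\boldsymbol{\pi}$ where
\[ \boldsymbol{\pi} \equiv \frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}} \tag{15.27} \]
Then the obvious definition of the Hamiltonian density $\mathcal{H}$ is
\[ H = \int \mathcal{H} \, d\tau = \int (\boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathcal{L}) \, d\tau \tag{15.28} \]
\[ \mathcal{H} = \boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathcal{L} \tag{15.29} \]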
Unfortunately the Hamiltonian density formulation does not treat space and time symmetrically, making it more difficult to develop relativistically covariant descriptions of fields. Hamilton's principle can be used to derive the Hamilton equations of motion in terms of the Hamiltonian density analogous to the approach used to derive the Lagrangian density equations of motion. As described in Classical Mechanics, 2nd edition, by Goldstein, the resultant Hamilton equations of motion for one dimension are
\[ \frac{\partial\mathcal{H}}{\partial\pi} = \dot{q} \tag{15.30} \]
\[ \frac{\partial\mathcal{H}}{\partial q} - \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{H}}{\partial q'}\right) = -\dot{\pi} \tag{15.31} \]
\[ \frac{\partial\mathcal{H}}{\partial t} = -\frac{\partial\mathcal{L}}{\partial t} \tag{15.32} \]
Note that equation 15.31 differs from that for discrete systems.
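As a worked illustration, the Hamiltonian density equations can be applied to the uniform linear chain. The sketch below assumes the one-dimensional Lagrangian density of equation 15.5, with $q' \equiv \partial q/\partial x$, and the standard form of the Hamilton density equations in which a spatial-derivative term supplements $\partial\mathcal{H}/\partial q$; it is offered as a consistency check rather than as part of the original text.

```latex
% Lagrangian density of the uniform chain (equation 15.5)
\mathcal{L} = \tfrac{1}{2}\left(\rho\,\dot{q}^{2} - E\,q'^{2}\right)

% Momentum density and Hamiltonian density
\pi = \frac{\partial\mathcal{L}}{\partial\dot{q}} = \rho\,\dot{q},
\qquad
\mathcal{H} = \pi\dot{q} - \mathcal{L}
            = \frac{\pi^{2}}{2\rho} + \tfrac{1}{2}E\,q'^{2}

% First Hamilton equation recovers the definition of the momentum density
\frac{\partial\mathcal{H}}{\partial\pi} = \frac{\pi}{\rho} = \dot{q}

% Second Hamilton equation: \mathcal{H} has no explicit q dependence
\frac{\partial\mathcal{H}}{\partial q}
  - \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{H}}{\partial q'}\right)
  = -E\,\frac{\partial^{2}q}{\partial x^{2}} = -\dot{\pi}
  = -\rho\,\frac{\partial^{2}q}{\partial t^{2}}

% Rearranged, this is the one-dimensional wave equation
\rho\,\frac{\partial^{2}q}{\partial t^{2}} - E\,\frac{\partial^{2}q}{\partial x^{2}} = 0
```

The result agrees with the wave equation obtained from the Lagrangian density, and the spatial-derivative term is exactly the part that has no counterpart in the discrete Hamilton equations.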
The diagonal first term is the dilation term which corresponds to changes in the volume with no changes in shape. The off-diagonal second term involves the shear terms that correspond to changes of the shape of the body that also change the volume. The constants $\lambda$ and $\mu$ are Lamé's moduli of elasticity which are positive. Various moduli of elasticity, corresponding to different distortions in the shape and volume of any solid body, can be derived from Lamé's moduli for the material.
The components of the elastic forces can be derived from the gradient of the elastic potential energy, equation 15.42, by use of Gauss' law plus vector differential calculus. The components of the elastic force, derived from the strain tensor $\boldsymbol{\sigma}$, can be associated with the corresponding components of the stress tensor $\mathbf{T}$. Thus, for homogeneous isotropic linear materials, the components of the stress tensor are related to the strain tensor by the relation
\[ T_{ij} = \lambda\delta_{ij}\sum_k \frac{\partial\xi_k}{\partial x_k} + \mu\left(\frac{\partial\xi_i}{\partial x_j} + \frac{\partial\xi_j}{\partial x_i}\right) = \lambda\delta_{ij}\sum_k \sigma_{kk} + 2\mu\sigma_{ij} \tag{15.43} \]
where $\xi_i$ are the displacement components and it has been assumed that $\sigma_{ij} = \sigma_{ji}$. The two moduli of elasticity $\lambda$ and $\mu$ are material-dependent constants. Equation 15.43 can be written in tensor notation as
\[ \mathbf{T} = \lambda \, \mathrm{tr}(\boldsymbol{\sigma}) \, \mathbf{I} + 2\mu\boldsymbol{\sigma} \tag{15.44} \]
where $\mathrm{tr}(\boldsymbol{\sigma})$ is the trace of the strain tensor and $\mathbf{I}$ is the identity matrix.
Equation 15.44 can be inverted to give the strain tensor components in terms of the stress tensor components.
\[ \sigma_{ij} = \frac{1}{2\mu}\left[ T_{ij} - \frac{\lambda\delta_{ij}}{(3\lambda + 2\mu)}\sum_k T_{kk} \right] \tag{15.45} \]
The various moduli of elasticity relate combinations of different stress and strain tensor components. The following five elastic moduli are used frequently to describe elasticity in homogeneous isotropic media, and all are related to Lamé's two moduli of elasticity.
1) Young's modulus $E$ describes tensile elasticity, which is the axial stiffness of the length of a body to
deformation along the axis of the applied tensile force.
2) Bulk modulus $B$ defines the relative dilation or compression of a body's volume to pressure
applied uniformly in all directions.
$$B = \lambda + \frac{2}{3}\mu \tag{15.47}$$
The bulk modulus is an extension of Young's modulus to three dimensions and typically is larger than $E$.
The inverse of the bulk modulus is called the compressibility of the material.
3) Shear modulus $G$ describes the shear stiffness of a body to volume-preserving shear deformations.
The shear strain becomes a deformation angle given by the ratio of the displacement along the axis of the
shear force and the perpendicular moment arm. The shear modulus equals Lamé's constant $\mu$. That is,
$$G = \mu \tag{15.48}$$
4) Poisson's ratio $\nu$ is the negative ratio of the transverse to axial strain. It is a measure of the
volume-conserving tendency of a body to contract in the directions perpendicular to the axis along which it is
stretched. In terms of Lamé's constants, Poisson's ratio equals
$$\nu = \frac{\lambda}{2(\lambda + \mu)} \tag{15.49}$$
Note that for a stable, isotropic elastic material, Poisson's ratio is bounded between $-1.0 \le \nu \le 0.5$ to ensure
that the $E$, $B$, and $G$ moduli have positive values. At the incompressible limit, $\nu = 0.5$, and the bulk modulus
$B$ and Lamé parameter $\lambda$ are infinite, that is, the compressibility is zero. Typical solids have Poisson's ratios
of $\nu \approx 0.05$ if hard and $\nu = 0.25$ if soft.
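The relations between the elastic moduli and Lamé's two constants are easy to tabulate. The sketch below assumes the standard isotropic expression for Young's modulus, $E = \mu(3\lambda + 2\mu)/(\lambda + \mu)$, together with equations 15.47 to 15.49. The function name and the input values are hypothetical.

```python
def elastic_moduli(lam, mu):
    """Return the elastic moduli of an isotropic solid from Lamé's constants."""
    return {
        # Standard isotropic relation, assumed here for Young's modulus.
        "young": mu * (3.0 * lam + 2.0 * mu) / (lam + mu),
        "bulk": lam + 2.0 * mu / 3.0,           # equation 15.47
        "shear": mu,                            # equation 15.48
        "poisson": lam / (2.0 * (lam + mu)),    # equation 15.49
    }


# Illustrative case lam = mu, which gives Poisson's ratio 0.25.
moduli = elastic_moduli(1.0, 1.0)
for name, value in moduli.items():
    print(f"{name:8s} {value:.4f}")
```

Setting $\lambda = \mu$ reproduces the Poisson's ratio of 0.25 quoted above, and the results satisfy the familiar cross-check $E = 2G(1 + \nu)$.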
The stiffness of elastic solids in terms of the elastic moduli of solids can be complicated due to the
geometry and composition of solid bodies. Often it is more convenient to express the stiffness in terms of
the spring constant $k$ where
$$k = \frac{EA}{L} \tag{15.50}$$
with $A$ the cross-sectional area and $L$ the length. The spring constant is inversely proportional to the
length of the spring because the strain of the material is defined to be the fractional deformation, not the
absolute deformation.
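That is, the inner product of the del operator, ∇, and the rank-2 stress tensor T gives the vector force
density f. This force, acting on the enclosed mass for the closed volume, leads to an acceleration $\frac{d^2\boldsymbol{\xi}}{dt^2}$.
Thus
$$\mathbf{F} = \oint \mathbf{T}\cdot d\mathbf{A} = \int \nabla\cdot\mathbf{T}\,d\tau = \int \rho\frac{d^2\boldsymbol{\xi}}{dt^2}\,d\tau \tag{15.52}$$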
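Using equation 15.44 to relate the stress tensor T to the moduli of elasticity gives
$$\rho\frac{\partial^2 \xi_j}{\partial t^2} = \sum_i\left[(\lambda + \mu)\frac{\partial^2 \xi_i}{\partial x_i \partial x_j} + \mu\frac{\partial^2 \xi_j}{\partial x_i^2}\right] \tag{15.53}$$
where $j = 1, 2, 3$. In general this equation is difficult to solve. However, for the simple case of a plane wave
in the $i = 1$ direction, the problem reduces to the following three equations
$$\rho\frac{\partial^2 \xi_1}{\partial t^2} = (\lambda + 2\mu)\frac{\partial^2 \xi_1}{\partial x_1^2} \tag{15.54}$$
$$\rho\frac{\partial^2 \xi_2}{\partial t^2} = \mu\frac{\partial^2 \xi_2}{\partial x_1^2} \tag{15.55}$$
$$\rho\frac{\partial^2 \xi_3}{\partial t^2} = \mu\frac{\partial^2 \xi_3}{\partial x_1^2} \tag{15.56}$$
Equation 15.54 corresponds to a longitudinal wave travelling with velocity $v_L = \sqrt{\frac{(\lambda + 2\mu)}{\rho}}$. Equations
15.55 and 15.56 correspond to two perpendicular transverse waves travelling with velocity $v_T = \sqrt{\frac{\mu}{\rho}}$. This
illustrates the important fact that longitudinal waves travel faster than transverse waves in an elastic solid.
Seismic waves in the Earth, generated by earthquakes, exhibit this property. Note that shearing stresses do
not exist in ideal liquids and gases since they cannot maintain shear forces and thus $\mu = 0$.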
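The two wave speeds can be evaluated directly from Lamé's constants and the density. The following is a minimal sketch. The material constants are rough, assumed values for a steel-like solid and a water-like fluid, used only to illustrate the ordering of the speeds.

```python
import math


def wave_speeds(lam, mu, rho):
    """Return (longitudinal, transverse) elastic wave speeds in m/s."""
    v_long = math.sqrt((lam + 2.0 * mu) / rho)   # from equation 15.54
    v_trans = math.sqrt(mu / rho)                # from equations 15.55 and 15.56
    return v_long, v_trans


# Assumed, order-of-magnitude constants for a steel-like solid (Pa, Pa, kg/m^3).
v_long, v_trans = wave_speeds(1.0e11, 8.0e10, 7.8e3)
print(f"longitudinal: {v_long:.0f} m/s, transverse: {v_trans:.0f} m/s")

# An ideal fluid has mu = 0, so no transverse wave propagates.
fluid_long, fluid_trans = wave_speeds(2.2e9, 0.0, 1.0e3)
print(f"fluid longitudinal: {fluid_long:.0f} m/s, transverse: {fluid_trans:.0f} m/s")
```

Because $\lambda$ and $\mu$ are positive, $\lambda + 2\mu > \mu$ and the longitudinal speed always exceeds the transverse speed, as stated above.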
15.6 Electromagnetic field theory
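$$\mathbf{f} = (\rho\mathbf{E} + \mathbf{J}\times\mathbf{B}) \tag{15.58}$$
Maxwell's equations
$$\rho = \epsilon_0 \nabla\cdot\mathbf{E} \qquad \mathbf{J} = \frac{1}{\mu_0}\nabla\times\mathbf{B} - \epsilon_0\frac{\partial \mathbf{E}}{\partial t} \tag{15.59}$$
can be used to eliminate the charge and current densities in equation 15.57
$$\mathbf{f} = \epsilon_0(\nabla\cdot\mathbf{E})\mathbf{E} + \left(\frac{1}{\mu_0}\nabla\times\mathbf{B} - \epsilon_0\frac{\partial \mathbf{E}}{\partial t}\right)\times\mathbf{B} \tag{15.60}$$
Vector calculus gives that
$$\frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) = \frac{\partial \mathbf{E}}{\partial t}\times\mathbf{B} + \mathbf{E}\times\frac{\partial \mathbf{B}}{\partial t} \tag{15.61}$$
while Faraday's law gives
$$\frac{\partial \mathbf{B}}{\partial t} = -\nabla\times\mathbf{E} \tag{15.62}$$
Equation 15.62 allows equation 15.61 to be rewritten as
$$\frac{\partial \mathbf{E}}{\partial t}\times\mathbf{B} = \frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) - \mathbf{E}\times\frac{\partial \mathbf{B}}{\partial t} = \frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) + \mathbf{E}\times(\nabla\times\mathbf{E}) \tag{15.63}$$
Equation 15.63 can be inserted in equation 15.60. In addition, a term $\frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\mathbf{B}$ can be added since
$\nabla\cdot\mathbf{B} = 0$, which allows equation 15.60 to be written in the symmetric form
$$\mathbf{f} = \epsilon_0(\nabla\cdot\mathbf{E})\mathbf{E} + \frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\mathbf{B} + \frac{1}{\mu_0}(\nabla\times\mathbf{B})\times\mathbf{B} - \epsilon_0\frac{\partial \mathbf{E}}{\partial t}\times\mathbf{B} \tag{15.64}$$
$$= \epsilon_0(\nabla\cdot\mathbf{E})\mathbf{E} + \frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\mathbf{B} + \frac{1}{\mu_0}(\nabla\times\mathbf{B})\times\mathbf{B} - \epsilon_0\frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) - \epsilon_0\mathbf{E}\times(\nabla\times\mathbf{E}) \tag{15.65}$$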
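Using the vector identity
$$\nabla(\mathbf{A}\cdot\mathbf{B}) = \mathbf{A}\times(\nabla\times\mathbf{B}) + \mathbf{B}\times(\nabla\times\mathbf{A}) + (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} \tag{15.66}$$
Let $\mathbf{A} = \mathbf{B} = \mathbf{E}$, then
$$\nabla(E^2) = 2\mathbf{E}\times(\nabla\times\mathbf{E}) + 2(\mathbf{E}\cdot\nabla)\mathbf{E} \tag{15.67}$$
That is
$$\mathbf{E}\times(\nabla\times\mathbf{E}) = \frac{1}{2}\nabla(E^2) - (\mathbf{E}\cdot\nabla)\mathbf{E} \tag{15.68}$$
Similarly
$$\mathbf{B}\times(\nabla\times\mathbf{B}) = \frac{1}{2}\nabla(B^2) - (\mathbf{B}\cdot\nabla)\mathbf{B} \tag{15.69}$$
Inserting equations 15.68 and 15.69 into equation 15.65 gives
$$\mathbf{f} = \epsilon_0\left[(\nabla\cdot\mathbf{E})\mathbf{E} + (\mathbf{E}\cdot\nabla)\mathbf{E} - \frac{1}{2}\nabla E^2\right] + \frac{1}{\mu_0}\left[(\nabla\cdot\mathbf{B})\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{B} - \frac{1}{2}\nabla B^2\right] - \epsilon_0\frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) \tag{15.70}$$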
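This complicated formula can be simplified by defining the rank-2 Maxwell stress tensor T which has
components
$$T_{ij} \equiv \epsilon_0\left(E_i E_j - \frac{1}{2}\delta_{ij}E^2\right) + \frac{1}{\mu_0}\left(B_i B_j - \frac{1}{2}\delta_{ij}B^2\right) \tag{15.71}$$
The inner product of the del operator and the Maxwell stress tensor is a vector with components of
$$(\nabla\cdot\mathbf{T})_j = \epsilon_0\left[(\nabla\cdot\mathbf{E})E_j + (\mathbf{E}\cdot\nabla)E_j - \frac{1}{2}\nabla_j E^2\right] + \frac{1}{\mu_0}\left[(\nabla\cdot\mathbf{B})B_j + (\mathbf{B}\cdot\nabla)B_j - \frac{1}{2}\nabla_j B^2\right] \tag{15.72}$$
The above definition of the Maxwell stress tensor, plus the Poynting vector $\mathbf{S} = \frac{1}{\mu_0}(\mathbf{E}\times\mathbf{B})$, allows the force
density equation 15.58 to be written in the form
$$\mathbf{f} = \nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial \mathbf{S}}{\partial t} \tag{15.73}$$
The divergence theorem allows the total force acting on the volume to be written in the form
$$\mathbf{F} = \int\left(\nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial \mathbf{S}}{\partial t}\right)d\tau \tag{15.74}$$
$$= \oint \mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int \mathbf{S}\,d\tau \tag{15.75}$$
Note that, if the Poynting vector is time independent, then the second term in equation 15.75 is zero and the
Maxwell stress tensor T is the force per unit area (stress) acting on the surface. The fact that T is a rank-2
tensor is apparent since the stress represents the ratio of the force-density vector f and the infinitesimal
area vector a, which do not necessarily point in the same directions.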
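The component definition in equation 15.71 translates directly into a few lines of code. The sketch below is an illustration only: the function name and the field values are assumptions, and the fields are taken to be uniform so that a single tensor describes them.

```python
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU_0 = 1.25663706212e-6        # vacuum permeability, N/A^2


def maxwell_stress_tensor(e_field, b_field):
    """Return the 3x3 tensor T_ij of equation 15.71 for 3-component fields."""
    e_squared = sum(c * c for c in e_field)
    b_squared = sum(c * c for c in b_field)
    tensor = []
    for i in range(3):
        row = []
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            electric = EPSILON_0 * (e_field[i] * e_field[j] - 0.5 * delta * e_squared)
            magnetic = (b_field[i] * b_field[j] - 0.5 * delta * b_squared) / MU_0
            row.append(electric + magnetic)
        tensor.append(row)
    return tensor


# Hypothetical uniform field: 1 kV/m along z, no magnetic field.
T = maxwell_stress_tensor((0.0, 0.0, 1.0e3), (0.0, 0.0, 0.0))
energy_density = 0.5 * EPSILON_0 * 1.0e3 ** 2
print("T_zz =", T[2][2], " T_xx =", T[0][0], " energy density =", energy_density)
```

For a field along $z$ the component $T_{zz}$ equals the field energy density and $T_{xx} = T_{yy}$ equal its negative, which corresponds to tension along the field lines and pressure perpendicular to them.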
Then equation 15.76 implies that the total momentum density $\boldsymbol{\pi} = \boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}$ is related to Maxwell's
stress tensor by
$$\frac{\partial}{\partial t}(\boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}) = \nabla\cdot\mathbf{T} \tag{15.79}$$
That is, like the elasticity stress tensor, the divergence of Maxwell's stress tensor T equals the rate of change
of the total momentum density, that is, −T is the momentum flux density.
This discussion of the Maxwell stress tensor and its relation to momentum in the electromagnetic field
illustrates the role that analytical formulations of classical mechanics can play in field theory.
15.7 Ideal fluid dynamics
g = −ρ∇ (15.89)
v = −∇ (15.91)
This scalar potential field can be used to derive the vector velocity field for irrotational flow.
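Note that the $(\mathbf{v}\cdot\nabla)\mathbf{v}$ term in Euler's equation (15.90) can be rewritten using the vector identity
$$(\mathbf{v}\cdot\nabla)\mathbf{v} = \frac{1}{2}\nabla(v^2) - \mathbf{v}\times\nabla\times\mathbf{v} \tag{15.92}$$
Inserting equation 15.92 into Euler's equation 15.90 then gives
$$\frac{\partial \mathbf{v}}{\partial t} = \mathbf{v}\times\nabla\times\mathbf{v} - \nabla\left(\frac{1}{2}v^2 + \frac{P}{\rho} + \phi\right) \tag{15.93}$$
where $P$ is the pressure, $\rho$ the density, and $\phi$ the gravitational scalar potential.
Potential flow corresponds to time independent irrotational flow, that is, both $\frac{\partial \mathbf{v}}{\partial t} = 0$ and $\nabla\times\mathbf{v} = 0$. For
potential flow equation 15.93 reduces to
$$\nabla\left(\frac{1}{2}v^2 + \frac{P}{\rho} + \phi\right) = 0$$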
The Navier-Stokes equations are nonlinear due to the (v · ∇) v term as well as being a function of
velocity. This non-linearity leads to a wide spectrum of dynamic behavior ranging from ordered laminar
flow to chaotic turbulence. Numerical solution of the Navier-Stokes equations is extremely difficult because
of the wide dynamic range of the dimensions of the coherent structures involved in turbulent motion. For
example, simulation calculations require use of a high resolution mesh which is a challenge to the capabilities
of current generation computers.
The microscopic boundary condition at the interface of the solid and fluid is that the fluid molecules
have zero average tangential velocity relative to the normal to the solid-fluid interface. This implies that
there is a boundary layer for which there is a gradient in the tangential velocity of the fluid between the
solid-fluid interface and the free-stream velocity. This velocity gradient produces vorticity in the fluid. When
the viscous forces are negligible then the angular momentum in any coherent vortex structure is conserved
leading to the vortex motion being preserved as it propagates.
$$\mathrm{Re} \equiv \frac{\text{Inertial forces}}{\text{Viscous forces}} = \frac{\rho v L}{\mu} = \frac{vL}{\nu} \tag{15.99}$$
$C_D$ varies inversely with Re, leading to drag forces that are roughly linear with velocity, as described in chapter
2.10.5. The size and velocities of raindrops in a light rain shower correspond to such Reynolds numbers.
B) For $10 < \mathrm{Re} < 30$ the flow has two turbulent vortices immediately behind the body in the wake of
the cylinder, but the flow still is primarily laminar as illustrated.
C) For $40 < \mathrm{Re} < 250$ the pair of vortices peel off alternately, producing a regular periodic sequence of
vortices although the flow still is laminar. This periodic sequence is called a von Kármán vortex street, for which
the velocity at a given position, relative to the cylinder, is time dependent in contrast to the situation at
lower Reynolds numbers.
D) For $10^3 < \mathrm{Re} < 10^5$ viscous forces are negligible relative to the inertial effects of the vortices and
boundary-layer vortices have less time to diffuse into the larger region of the fluid, thus the boundary layer is
thinner. The boundary-layer flow exhibits a small scale chaotic turbulence in three dimensions superimposed
on regular alternating vortex structures. In this range $C_D$ is roughly constant and thus the drag forces are
proportional to the square of the velocity. This regime of Reynolds numbers corresponds to typical velocities
of moving automobiles.
E) For $\mathrm{Re} \approx 10^6$, which is typical of a flying aircraft, the inertial effects dominate except in the narrow
boundary layer close to the solid-fluid interface. The chaotic region works its way further forward on the
cylinder, reducing the volume of the chaotic turbulent boundary layer, which results in a significant decrease
in $C_D$. For a sailplane wing flying at about 50, the boundary layer
reduces to the order of a millimeter in thickness at the leading edge and a centimeter at the trailing edge. At
these Reynolds numbers the airflow comprises a thin boundary layer, where viscous effects are important,
plus fluid flow in the bulk of the fluid where the vortex inertial terms dominate and viscous forces can be
ignored. That is, the viscous stress tensor term $\nabla\cdot\mathbf{T}$ on the right-hand side of equation 15.97 can be
ignored, and the Navier-Stokes equation reduces to the simpler Euler equation for such inviscid fluid flow.
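As a rough numerical illustration of the regimes listed above, the Reynolds number can be evaluated from the standard definition $\mathrm{Re} = \rho v L/\mu$. The sketch below uses assumed sea-level air properties and a hypothetical wing chord and airspeed; none of these numbers come from the text.

```python
def reynolds_number(density, speed, length, dynamic_viscosity):
    """Ratio of inertial to viscous forces, Re = rho * v * L / mu."""
    return density * speed * length / dynamic_viscosity


# Assumed values: air density (kg/m^3) and dynamic viscosity (Pa s) near sea level.
AIR_DENSITY = 1.2
AIR_VISCOSITY = 1.8e-5

# Hypothetical sailplane wing: 1 m chord moving at 25 m/s.
re_wing = reynolds_number(AIR_DENSITY, 25.0, 1.0, AIR_VISCOSITY)

# The same number expressed with the kinematic viscosity nu = mu / rho.
kinematic_viscosity = AIR_VISCOSITY / AIR_DENSITY
re_wing_kinematic = 25.0 * 1.0 / kinematic_viscosity

print(f"Re = {re_wing:.2e} (dynamic form), {re_wing_kinematic:.2e} (kinematic form)")
```

With these assumed inputs the result is of order $10^6$, which is the regime described in item E.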
The importance of the inertia of the vortices is illustrated by the persistence of the vortex structure
and turbulence over a wide range of length scales characteristic of turbulent flow. The dynamic range of
the dimension of coherent vortex structures is enormous. For example, in the atmosphere the vortex size
ranges from $10^5$ m in diameter for hurricanes down to $10^{-3}$ m in thin boundary layers adjacent to an aircraft
wing. The transition from laminar to turbulent flow is illustrated by water flow over the hull of a ship which
involves laminar flow at the bow followed by turbulent flow behind the bow wave and at the stern of the
ship. The broad extent of the white foam of seawater along the side and the stern of a ship illustrates the
considerable energy dissipation produced by the turbulence. The boundary layer of a stalled aircraft wing
is another example. At a high angle of attack, the airflow on the lower surface of the wing remains laminar,
that is, the stream velocity profile, relative to the wing, increases smoothly from zero at the wing surface
outwards until it meets the ambient air velocity on the outer surface of the boundary layer which is the order
of a millimeter thick. The flow on the top surface of the wing initially is laminar before becoming turbulent
at which point the boundary layer rapidly increases in thickness. Further back the airflow detaches from
the wing surface and large-scale vortex structures lead to a wide boundary layer comparable in thickness to
the chord of the wing with vortex motion that leads to the airflow reversing its direction adjacent to the
upper surface of the wing which greatly increases drag. When the vortices begin to shed off the bounded
surface they do so at a certain frequency which can cause vibrations that can lead to structural failure if the
frequency of the shedding vortices is close to the resonance frequency of the structure.
Considerable time and effort are expended by aerodynamicists and hydrodynamicists designing aircraft
wings and ship hulls to maximize the length of laminar region of the boundary layer to minimize drag.
When the Reynolds number is large the slightest imperfections in the shape of the wing, such as a speck of
dust, can trigger the transition from laminar to turbulent flow. The boundaries between adjacent large-scale
coherent structures are sensitively identified in computer simulations by large divergence of the streamlines
at any separatrix. A large positive, finite-time, Lyapunov exponent identifies divergence of the streamlines
which occurs at a separatrix between adjacent large-scale coherent vortex structures, whereas the Lyapunov
exponents are negative for converging streamlines within any coherent structure. Computations of turbulent
flow often combine finite-time Lyapunov exponents, to identify coherent structures, with Lagrangian
mechanics for the equations of motion, since the Lagrangian is a frame-independent scalar function that
gives far better results for fluid motion than Newtonian mechanics. Thus the Lagrangian approach in
the continua is used extensively for calculations in aerodynamics, hydrodynamics, and studies of atmospheric
phenomena such as convection, hurricanes, tornadoes, etc.
15.9 Summary and implications
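Hamiltonian density formulation: In the limit that the coordinates are continuous, then the
Hamiltonian density can be expressed in terms of a volume integral over the momentum density $\boldsymbol{\pi}$ and the
Lagrangian density $\mathcal{L}$ where
$$\boldsymbol{\pi} \equiv \frac{\partial \mathcal{L}}{\partial \dot{\mathbf{q}}} \tag{15.27}$$
Then the obvious definition of the Hamiltonian density $\mathcal{H}$ is
$$H = \int \mathcal{H}\,dV = \int (\boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathcal{L})\,dV \tag{15.28}$$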
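Linear elastic solids: The theory of continuous systems was applied to the case of linear elastic solids.
The stress tensor T is a rank 2 tensor defined as the ratio of the force vector F and the surface element
vector A. That is, the force vector is given by the inner product of the stress tensor T and the surface
element vector A.
$$d\mathbf{F} = \mathbf{T}\cdot d\mathbf{A} \tag{15.33}$$
The strain tensor σ also is a rank 2 tensor defined as the ratio of the strain vector ξ and infinitesimal
area A
$$d\boldsymbol{\xi} = \boldsymbol{\sigma}\cdot d\mathbf{A} \tag{15.38}$$
where the component form of the rank 2 strain tensor is
$$\boldsymbol{\sigma} = \frac{1}{2}\begin{vmatrix} \frac{\partial \xi_1}{\partial x_1} & \frac{\partial \xi_1}{\partial x_2} & \frac{\partial \xi_1}{\partial x_3} \\ \frac{\partial \xi_2}{\partial x_1} & \frac{\partial \xi_2}{\partial x_2} & \frac{\partial \xi_2}{\partial x_3} \\ \frac{\partial \xi_3}{\partial x_1} & \frac{\partial \xi_3}{\partial x_2} & \frac{\partial \xi_3}{\partial x_3} \end{vmatrix} \tag{15.39}$$
The modulus of elasticity is defined as the slope of the stress-strain curve. For linear, homogeneous,
elastic matter, the potential energy density $U$ separates into diagonal and off-diagonal components of the
strain tensor
$$U = \frac{1}{2}\left[\lambda\sum_i (\sigma_{ii})^2 + 2\mu\sum_{ij}(\sigma_{ij})^2\right] \tag{15.42}$$
where the constants $\lambda$ and $\mu$ are Lamé's moduli of elasticity, which are positive. The stress tensor is related
to the strain tensor by
$$T_{ij} = \lambda\delta_{ij}\sum_k \frac{\partial \xi_k}{\partial x_k} + \mu\left(\frac{\partial \xi_i}{\partial x_j} + \frac{\partial \xi_j}{\partial x_i}\right) = \lambda\delta_{ij}\sum_k \sigma_{kk} + 2\mu\sigma_{ij} \tag{15.43}$$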
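Electromagnetic field theory: The rank 2 Maxwell stress tensor T has components
$$T_{ij} \equiv \epsilon_0\left(E_i E_j - \frac{1}{2}\delta_{ij}E^2\right) + \frac{1}{\mu_0}\left(B_i B_j - \frac{1}{2}\delta_{ij}B^2\right) \tag{15.71}$$
The divergence theorem allows the total electromagnetic force acting on the volume to be written as
$$\mathbf{F} = \int\left(\nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial \mathbf{S}}{\partial t}\right)d\tau = \oint \mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int \mathbf{S}\,d\tau \tag{15.74}$$
Viscous fluid dynamics: For incompressible flow the stress tensor term simplifies to $\nabla\cdot\mathbf{T} = \mu\nabla^2\mathbf{v}$. Then
the Navier-Stokes equation becomes
$$\rho\left[\frac{\partial \mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right] = -\nabla P + \mu\nabla^2\mathbf{v} + \mathbf{f} \tag{15.98}$$
where $\mu\nabla^2\mathbf{v}$ is the viscosity drag term. The left-hand side of equation 15.98 represents the rate of change
of momentum per unit volume while the right-hand side represents the summation of the forces per unit
volume that are acting.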
The Reynolds number is a dimensionless number that characterizes the ratio of inertial forces to viscous
forces in a viscous medium. The evolution of flow from laminar flow to turbulent flow, with increase of
Reynolds number, was discussed.
The classical mechanics of continuous fields encompasses a remarkably broad range of phenomena with
important applications to laminar and turbulent fluid flow, gravitation, electromagnetism, relativity, and
quantum fields.
Chapter 16
Relativistic mechanics
16.1 Introduction
Newtonian mechanics incorporates the Newtonian concept of the complete separation of space and time.
This theory reigned supreme from inception, in 1687, until November 1905 when Einstein pioneered the
Special Theory of Relativity. Relativistic mechanics undermines the Newtonian concept of the absoluteness of
time that is inherent to Newton's formulation, as well as when recast in the Lagrangian and Hamiltonian
formulations of classical mechanics. Relativistic mechanics has had a profound impact on twentieth-century
physics and the philosophy of science. Classical mechanics is an approximation of relativistic mechanics
that is valid for velocities much less than the velocity of light in vacuum. The term "relativity" refers to
the fact that physical measurements are always made relative to some chosen reference frame. Naively one
may think that the transformation between different reference frames is trivial and contains little underlying
physics. However, Einstein showed that the results of measurements depend on the choice of coordinate
system, which revolutionized our concept of space and time.
Einstein's work on relativistic mechanics comprised two major advances. The first advance was the 1905
Special Theory of Relativity, which refers to nonaccelerating frames of reference. The second major advance
was the 1916 General Theory of Relativity which considers accelerating frames of reference and their relation
to gravity. Thus the Special Theory is a limiting case of the General Theory of Relativity. The mathematically
complex General Theory of Relativity is required for describing accelerating frames, gravity, plus related
topics like Black Holes, or extremely accurate time measurements inherent to the Global Positioning System.
The present discussion will focus primarily on the mathematically simple Special Theory of Relativity since it
encompasses most of the physics encountered in atomic, nuclear and high energy physics. This chapter uses
the basic concepts of the Special Theory of Relativity to investigate the implications of extending Newtonian,
Lagrangian and Hamiltonian formulations of classical mechanics into the relativistic domain. The Lorentz-
invariant extended Hamiltonian and Lagrangian formalisms are introduced since they are applicable to the
Special Theory of Relativity. The General Theory of Relativity incorporates the gravitational force as a
geodesic phenomenon in a four-dimensional Riemannian structure based on space, time, and matter. A
superficial introduction will be given to the fundamental concepts and evidence that underlie the General
Theory of Relativity.
Consider two coordinate systems shown in figure 16.1, where the primed frame is moving with velocity $v$ along the $x_1$
axis of the fixed unprimed frame. A Galilean transformation implies that the following relations apply:
$$x'_1 = x_1 - vt \tag{16.1}$$
$$x'_2 = x_2$$
$$x'_3 = x_3$$
$$t' = t$$
= 0
= 0µ ¶
4
0
= 0 + 2 3
The Lorentz factor, defined above, is the key feature 2
Figure 16.4: The observer and mirror are at rest in the left-hand frame (a). The light beam takes a time
$\Delta t = \frac{d}{c}$ to travel to the mirror. In the right-hand frame (b) the source and mirror are travelling at a velocity
$v$ relative to the observer. The light travels further in the right-hand frame of reference (b) than in the
stationary frame (a). Since Einstein states that the velocity of light is the same in both frames of reference,
the time interval must be larger in frame (b) since the light travels further than in (a).
There are many experimental verifications of time dilation in physics. For example, a stationary muon
has a mean lifetime of $\tau = 2$ $\mu$sec, whereas the lifetime of a fast moving muon, produced in the upper
atmosphere by high-energy cosmic rays, was observed in 1941 to be longer and given by $\gamma\tau$ as described in
example 16.1. In 1972 Hafele and Keating used four accurate cesium atomic clocks to confirm time dilation.
Two clocks were flown on regularly scheduled airlines travelling around the world, one westward and the
other eastward. The other two clocks were used for reference. The westward moving clock gained
$(273 \pm 7)$ nsec compared to the predicted value of $(275 \pm 10)$ nsec. The Global Positioning System of 24
orbiting satellites is used for locating positions to within a few meters. It has an accuracy of a few
nanoseconds which requires allowance for time dilation and is a daily tribute to the correctness of Einstein's
Theory of Relativity.
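The muon example can be made concrete with a short calculation of the Lorentz factor and the dilated lifetime. In the sketch below the proper lifetime of 2 μsec is the value quoted above, while the muon speed of 0.998c is an assumed, illustrative value.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def lorentz_factor(speed):
    """gamma = 1 / sqrt(1 - v^2 / c^2)."""
    beta = speed / C
    return 1.0 / math.sqrt(1.0 - beta * beta)


proper_lifetime = 2.0e-6     # s, mean lifetime of a muon at rest (value quoted in the text)
speed = 0.998 * C            # assumed speed of a cosmic-ray muon

gamma = lorentz_factor(speed)
dilated_lifetime = gamma * proper_lifetime
range_without_dilation = speed * proper_lifetime
range_with_dilation = speed * dilated_lifetime

print(f"gamma = {gamma:.2f}")
print(f"mean path without time dilation: {range_without_dilation:.0f} m")
print(f"mean path with time dilation:    {range_with_dilation:.0f} m")
```

Under these assumptions the mean path grows from a few hundred meters to several kilometers, which is why muons created in the upper atmosphere can be detected near the ground.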
Since 2 = 1 , the measured lengths in the two frames are related by:
16.3.5 Simultaneity
The Lorentz transformations imply a new philosophy of space and time. A surprising consequence is that
the concept of simultaneity is frame dependent, in contrast to the prediction of Newtonian mechanics.
Consider that two events occur in the unprimed frame at $(x_1, t_1)$ and $(x_2, t_2)$. In the primed frame these two
events occur at $(x'_1, t'_1)$ and $(x'_2, t'_2)$. From the Lorentz transformation the time difference is
$$t'_2 - t'_1 = \gamma\left[(t_2 - t_1) - \frac{v(x_2 - x_1)}{c^2}\right] \tag{16.14}$$
Thus two events that are simultaneous in the unprimed frame are not simultaneous in the primed frame if
$(x_2 - x_1) = L \neq 0$. That is, events that are simultaneous
in one frame are not simultaneous in the other frame if the events are spatially separated. The equivalent
statement is that two clocks, spatially separated by a distance $L$, which are synchronized in their rest
frame, are not synchronized in a moving frame.
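Equation 16.14 can be evaluated for two events that are simultaneous in one frame but spatially separated. The sketch below is illustrative only: the 25 m separation and the relative velocity of 0.6c are assumed values.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def primed_time_difference(dt, dx, v):
    """Equation 16.14: t2' - t1' for events separated by dt and dx in the unprimed frame."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (dt - v * dx / C ** 2)


# Assumed case: events 25 m apart, simultaneous (dt = 0) in the unprimed frame,
# viewed from a frame moving at 0.6c along the line joining them.
dt_primed = primed_time_difference(0.0, 25.0, 0.6 * C)
print(f"time difference in the moving frame: {dt_primed:.3e} s")

# Events at the same place remain simultaneous in every frame.
print(primed_time_difference(0.0, 0.0, 0.6 * C))
```

The negative sign means that the event at the larger $x$ occurs first in the moving frame, and the difference vanishes when the spatial separation is zero.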
Figure 16.5: If lightning strikes the front and rear of the carriage simultaneously, according to the man in
the fixed frame, then the woman in the moving frame sees the flash from the front first since she is moving
towards that approaching wavefront during the transit time of the light. Thus if the length of the carriage
in the stationary frame is $(x_2 - x_1) = L$ then the time difference is $\Delta t' = \gamma\frac{Lv}{c^2}$.
Einstein discussed the example shown in figure 16.5, where lightning strikes both ends of a train
simultaneously in the stationary earth frame of reference. A woman on the train will see that the strikes are
not simultaneous since the wavefront from the front of the carriage will be seen first because she is moving
forward during the time the light from the two lightning flashes is travelling towards her. As a consequence
she observes that the two lightning flashes are not simultaneous. This explains why measurement of the
length of a moving rod, performed by simultaneously locating both ends in the fixed frame, implies that the
measurement occurs at different times for both ends in the moving frame, resulting in a shorter apparent
length. The lack of simultaneity explains the apparent inconsistency that a moving bicyclist
sees the stationary street block to be length contracted while, in contrast, a pedestrian sees that
the bicycle is length contracted.
The concept of causality breaks down since $(x'_2 - x'_1)$ can be either positive or negative, therefore the
corresponding $\Delta t$ can be positive or negative. A consequence of the lack of simultaneity is that the image
shown by a photograph of a rapidly moving object is not a true representation of the moving object. Not
only is the body contracted in the direction of travel, but also it appears distorted because light arriving
from the far side of the body had to be emitted earlier, that is, when the body was at an earlier location,
in order to reach the observer simultaneously with light from the near side. The relativistic snake paradox,
addressed in workshop exercise 1, is an excellent example of the role of simultaneity in relativistic mechanics.
$$n\lambda = (c - v)\Delta t$$
$$\nu = \frac{c}{\lambda} = \frac{cn}{(c - v)\Delta t}$$
According to the source, it emits $n$ waves of frequency $\nu_0$ during the proper time interval $\Delta t_0$, that is
$$n = \nu_0 \Delta t_0$$
This proper time interval $\Delta t_0$, in the source frame, corresponds to a time interval $\Delta t$ in the receiver frame
where
$$\Delta t = \gamma\Delta t_0$$
Thus
$$\nu = \frac{1}{\gamma}\frac{\nu_0}{(1 - \frac{v}{c})} = \nu_0\frac{\sqrt{1 - (\frac{v}{c})^2}}{(1 - \frac{v}{c})} = \nu_0\sqrt{\frac{1 + \beta}{1 - \beta}}$$
where $\beta \equiv \frac{v}{c}$. This formula for source and receiver approaching each other also gives the correct answer for
source and receiver receding if the sign of $\beta$ is changed.
This relativistic Doppler Effect accounts for the red shift observed for light emitted by receding stars and
galaxies, as well as many examples in atomic and nuclear physics involving moving sources of electromagnetic
radiation.
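A few lines of code show how the relativistic Doppler formula behaves for approaching and receding sources. This is a minimal sketch. The function name and the sign convention (positive $\beta$ for approach, negative for recession) are assumptions made for the example.

```python
import math


def doppler_frequency(nu0, beta):
    """Observed frequency for a source of proper frequency nu0.

    beta = v/c is positive when source and receiver approach each other
    and negative when they recede.
    """
    return nu0 * math.sqrt((1.0 + beta) / (1.0 - beta))


nu0 = 1.0  # proper frequency in arbitrary units
approaching = doppler_frequency(nu0, 0.6)
receding = doppler_frequency(nu0, -0.6)
print(f"approaching at 0.6c: {approaching:.3f}")
print(f"receding at 0.6c:    {receding:.3f}")
```

The approaching case gives a blue shift and the receding case a red shift, and the product of the two observed frequencies equals $\nu_0^2$.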
Consider the two parallel coordinate frames with the primed frame moving at a velocity $v$ along the $x'_1$ axis
as shown in figure 16.1. Velocities of an object measured in both frames are defined to be
$$u_i = \frac{dx_i}{dt} \tag{16.16}$$
$$u'_i = \frac{dx'_i}{dt'}$$
Using the Lorentz transformations 16.3, 16.5 between the two frames moving with relative velocity $v$ along
the $x_1$ axis, gives that the velocity along the $x'_1$ axis is
$$u'_1 = \frac{dx'_1}{dt'} = \frac{dx_1 - v\,dt}{dt - \frac{v}{c^2}dx_1} = \frac{u_1 - v}{1 - \frac{u_1 v}{c^2}} \tag{16.17}$$
Similarly we get the velocities along the perpendicular $x'_2$ and $x'_3$ axes to be
$$u'_2 = \frac{dx'_2}{dt'} = \frac{u_2}{\gamma\left(1 - \frac{u_1 v}{c^2}\right)} \tag{16.18}$$
$$u'_3 = \frac{dx'_3}{dt'} = \frac{u_3}{\gamma\left(1 - \frac{u_1 v}{c^2}\right)}$$
When $\frac{u_1 v}{c^2} \to 0$ these velocity transformations become the usual Galilean relations for velocity addition.
Do not confuse $\mathbf{u}$ and $\mathbf{u}'$ with $\mathbf{v}$; that is, $\mathbf{u}$ and $\mathbf{u}'$ are the velocities of some object measured in the unprimed
and primed frames of reference respectively, whereas $\mathbf{v}$ is the relative velocity of the origin of one frame with
respect to the origin of the other frame.
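The velocity transformations can be explored numerically. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the relative velocity is taken along the $x_1$ axis as above, and the test velocities are chosen to show the limiting cases.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def transform_velocity(u, v):
    """Transform velocity u = (u1, u2, u3) to a frame moving at v along x1."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    denominator = 1.0 - u[0] * v / C ** 2
    return ((u[0] - v) / denominator,
            u[1] / (gamma * denominator),
            u[2] / (gamma * denominator))


# Light travelling along x1 still moves at c in the primed frame.
light_parallel = transform_velocity((C, 0.0, 0.0), 0.5 * C)

# Light travelling along x2 is tilted in the primed frame, but its speed is unchanged.
light_transverse = transform_velocity((0.0, C, 0.0), 0.6 * C)
transverse_speed = math.sqrt(sum(c * c for c in light_transverse))

# Everyday speeds reduce to Galilean velocity addition.
slow = transform_velocity((30.0, 0.0, 0.0), 10.0)

print(light_parallel[0] / C, transverse_speed / C, slow[0])
```

The first two cases illustrate that the speed of light is the same in both frames, while the third shows the Galilean limit when $u_1 v/c^2$ is negligible.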
16.4.2 Momentum
Using the classical definition of momentum, that is $\mathbf{p} = m\mathbf{u}$, the linear momentum is not conserved using the
above relativistic velocity transformations if the mass $m$ is a scalar quantity. This problem originates from
the fact that both $\mathbf{x}$ and $t$ have non-trivial transformations and thus $\mathbf{u} = \frac{d\mathbf{x}}{dt}$ is frame dependent.
Linear momentum conservation can be retained by redefining momentum in a form that is identical in
all frames of reference, that is by referring to the proper time $\tau$ as measured in the rest frame of the moving
object. Therefore we define relativistic linear momentum as
$$\mathbf{p} \equiv m\frac{d\mathbf{x}}{d\tau} = m\frac{d\mathbf{x}}{dt}\frac{dt}{d\tau} \tag{16.19}$$
But we know the time dilation relation
$$dt = \frac{d\tau}{\sqrt{\left(1 - \frac{u^2}{c^2}\right)}} = \gamma\,d\tau \tag{16.20}$$
Note that the $\gamma$ in this relation refers to the velocity $u$ between the moving object and the frame; this is
quite different from the $\gamma = \frac{1}{\sqrt{(1 - \frac{v^2}{c^2})}}$ which refers to the transformation between the two frames of reference.
Thus the new relativistic definition of momentum is
$$\mathbf{p} \equiv m\frac{d\mathbf{x}}{d\tau} = m\frac{d\mathbf{x}}{dt}\frac{dt}{d\tau} = \gamma m\mathbf{u} \tag{16.21}$$
The relativistic definition of linear momentum is the same as the classical definition with the rest mass $m$
replaced by the relativistic mass $\gamma m$.1
1 Note that, until recently, the rest mass was denoted by $m_0$ and the relativistic mass was referred to as $m$. Modern texts
denote the rest mass by $m$ and the relativistic mass by $\gamma m$. This book follows the modern nomenclature for rest mass to avoid
confusion.
16.4.4 Force
Newton's second law $\mathbf{F} = \frac{d\mathbf{p}}{dt}$ is covariant under a Galilean transformation. In special relativity this definition
also applies using the relativistic definition of momentum $\mathbf{p}$. The fact that the relativistic momentum $\mathbf{p}$
is conserved in the force-free situation leads naturally to using the definition of force to be
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{16.22}$$
Then the relativistic momentum is conserved if $\mathbf{F} = 0$.
16.4.5 Energy
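The classical definition of the work done is
$$W_{12} = \int_1^2 \mathbf{F}\cdot d\mathbf{r} = T_2 - T_1 \tag{16.23}$$
Assuming $T_1 = 0$, letting $d\mathbf{r} = \mathbf{u}\,dt$, and inserting the relativistic force relation in equation 16.23 gives
$$W = T = \int_0^t \frac{d}{dt}(\gamma m\mathbf{u})\cdot\mathbf{u}\,dt = m\int_0^u u\,d(\gamma u) \tag{16.24}$$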
of a paper clip, provides $E = 9 \times 10^{13}$ joules. This is the daily output of a 1 GW nuclear power station or
the explosive power of the Nagasaki or Hiroshima bombs.
As the velocity of a particle approaches $c$ then $\gamma$ and the relativistic mass $\gamma m$ both approach infinity.
This means that the force needed to accelerate the mass also approaches infinity, and thus no particle can
exceed the velocity of light. The energy continues to increase not by increasing the velocity but by increase
of the relativistic mass. Although the relativistic relation for kinetic energy is quite different from the
Newtonian relation, the Newtonian form is obtained for the case of $u \ll c$ in that
$$T = mc^2\left(1 - \frac{u^2}{c^2}\right)^{-\frac{1}{2}} - mc^2 = mc^2\left(1 + \frac{1}{2}\frac{u^2}{c^2} + \cdots\right) - mc^2 \approx \frac{1}{2}mu^2 \tag{16.29}$$
An especially useful relativistic relation that can be derived from the above is
$$E^2 = p^2c^2 + E_0^2 \tag{16.30}$$
This is useful because it provides a simple relation between total energy of a particle and its relativistic
linear momentum plus rest energy.
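The energy relations above can be verified numerically. The sketch below assumes the standard expressions $E = \gamma mc^2$, $p = \gamma mu$, and $T = (\gamma - 1)mc^2$. The one-gram mass and the two test speeds are illustrative choices, not values taken from a specific example in the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def lorentz_factor(u):
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def momentum(m, u):
    return lorentz_factor(u) * m * u


def total_energy(m, u):
    return lorentz_factor(u) * m * C ** 2


def kinetic_energy(m, u):
    return (lorentz_factor(u) - 1.0) * m * C ** 2


mass = 1.0e-3                      # kg, an assumed one-gram mass
rest_energy = mass * C ** 2        # close to 9e13 J

# Check E^2 = p^2 c^2 + E_0^2 at an assumed speed of 0.8c.
u_fast = 0.8 * C
energy = total_energy(mass, u_fast)
p = momentum(mass, u_fast)
mismatch = (energy ** 2 - (p * C) ** 2 - rest_energy ** 2) / rest_energy ** 2

# Check the Newtonian limit of the kinetic energy at u = 0.001c.
u_slow = 3.0e5
newtonian_ratio = kinetic_energy(mass, u_slow) / (0.5 * mass * u_slow ** 2)

print(f"rest energy of one gram: {rest_energy:.3e} J")
print(f"relative mismatch in equation 16.30: {mismatch:.1e}")
print(f"T / (m u^2 / 2) at u = 0.001c: {newtonian_ratio:.8f}")
```

The mismatch should be at the level of floating-point rounding, and the kinetic energy ratio should differ from unity only by a correction of order $u^2/c^2$.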
Integrate the left-hand side between 0 and and the right-hand side between and gives
µ ¶ ³´
1 1 +
ln = − ln
2 1 −
This reduces to ¡ ¢2
1−
= ¡ ¢2
1+
When → 0 this equation reduces to the non-relativistic answer given in equation 2123.
16.5 Geometry of space-time
$$\mathbf{s} = x^0\hat{\mathbf{e}}_0 + x^1\hat{\mathbf{e}}_1 + x^2\hat{\mathbf{e}}_2 + x^3\hat{\mathbf{e}}_3 = x'^0\hat{\mathbf{e}}'_0 + x'^1\hat{\mathbf{e}}'_1 + x'^2\hat{\mathbf{e}}'_2 + x'^3\hat{\mathbf{e}}'_3 \tag{16.31}$$
The convention used is that Greek subscripts (covariant) or superscripts (contravariant) designate a four
vector with $0 \le \mu \le 3$. The covariant unit vectors $\hat{\mathbf{e}}_\mu$ are written with the subscript $\mu$ which has 4 values
$0 \le \mu \le 3$. As described in appendix 3, using the Einstein convention the components are written with
the contravariant superscript $x^\mu$ where the time axis $x^0 = ct$, while the spatial coordinates, expressed in
cartesian coordinates, are $x^1 = x$, $x^2 = y$, and $x^3 = z$. With respect to a different (primed) unit vector basis
$\hat{\mathbf{e}}'_\mu$ the displacement must be unchanged as given by equation 16.31. In addition, equation 16.43 shows that
the magnitude $|\mathbf{s}|^2$ of the displacement four vector is invariant to a Lorentz transformation.
The most general Lorentz transformation between the unprimed and primed inertial coordinate systems, in
relative motion with velocity $\mathbf{v}$, assuming that the two sets of axes are aligned, and that their origins overlap
when $t = t' = 0$, is given by the symmetric matrix $\lambda$ where
$$x'^\mu = \sum_\nu \lambda^\mu_{\ \nu} x^\nu \tag{16.32}$$
This Lorentz transformation of the four vector X components can be written in matrix form as
$$\mathbf{X}' = \boldsymbol{\lambda}\mathbf{X} \tag{16.33}$$
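Assuming that the two sets of axes are aligned, then the elements of the Lorentz transformation are
given by
$$\mathbf{X}' = \begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta_1 & -\gamma\beta_2 & -\gamma\beta_3 \\ -\gamma\beta_1 & 1 + (\gamma - 1)\frac{\beta_1^2}{\beta^2} & (\gamma - 1)\frac{\beta_1\beta_2}{\beta^2} & (\gamma - 1)\frac{\beta_1\beta_3}{\beta^2} \\ -\gamma\beta_2 & (\gamma - 1)\frac{\beta_1\beta_2}{\beta^2} & 1 + (\gamma - 1)\frac{\beta_2^2}{\beta^2} & (\gamma - 1)\frac{\beta_2\beta_3}{\beta^2} \\ -\gamma\beta_3 & (\gamma - 1)\frac{\beta_1\beta_3}{\beta^2} & (\gamma - 1)\frac{\beta_2\beta_3}{\beta^2} & 1 + (\gamma - 1)\frac{\beta_3^2}{\beta^2} \end{pmatrix} \cdot \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \tag{16.34}$$
where $\beta = \frac{v}{c}$ and $\gamma = \frac{1}{\sqrt{1 - \beta^2}}$, and assuming that the origin of the unprimed frame transforms to the origin
of the primed frame at $(0, 0, 0, 0)$.
For the case illustrated in figure 16.1 where the corresponding axes of the two frames are parallel and in
relative motion with velocity $v$ in the $x^1$ direction, then the Lorentz transformation matrix 16.34 reduces to
$$\begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \tag{16.35}$$
This Lorentz transformation matrix is called a standard boost since it only boosts from one frame to another
parallel frame. In general a rotation matrix also is incorporated into the transformation matrix for the
spatial variables.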
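The general boost matrix can be built and tested in a few lines. The sketch below is a plain-Python illustration with hypothetical function names. It assumes the $+ - - -$ sign convention for the interval and represents four vectors as lists $(ct, x^1, x^2, x^3)$.

```python
import math


def boost_matrix(beta):
    """Lorentz boost of equation 16.34 for beta = (beta1, beta2, beta3) = v/c."""
    beta_squared = sum(b * b for b in beta)
    gamma = 1.0 / math.sqrt(1.0 - beta_squared)
    matrix = [[0.0] * 4 for _ in range(4)]
    matrix[0][0] = gamma
    for i in range(3):
        matrix[0][i + 1] = matrix[i + 1][0] = -gamma * beta[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            # The spatial block reduces to the identity when beta = 0.
            extra = (gamma - 1.0) * beta[i] * beta[j] / beta_squared if beta_squared > 0.0 else 0.0
            matrix[i + 1][j + 1] = delta + extra
    return matrix


def apply(matrix, vector):
    return [sum(matrix[r][c] * vector[c] for c in range(4)) for r in range(4)]


def interval(x):
    """Invariant (ct)^2 - x^2 - y^2 - z^2, assuming the + - - - convention."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


standard = boost_matrix((0.6, 0.0, 0.0))       # reduces to the standard boost 16.35
general = boost_matrix((0.3, 0.4, 0.5))        # an arbitrary assumed direction
event = [2.0, 1.0, -0.5, 0.25]
boosted = apply(general, event)

print(standard[0][:2], standard[1][:2])
print(interval(event), interval(boosted))
```

For a boost along the first axis the matrix reduces to the standard boost, and for any direction the interval computed before and after the transformation should agree to rounding error.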
The correct sign of the inner product is obtained by inclusion of the Minkowski metric defined by
The contravariant metric component $g^{\mu\nu}$ is defined as the $\mu\nu$ component of the inverse metric matrix $\mathbf{g}^{-1}$
where
$$\mathbf{g}\mathbf{g}^{-1} = \mathbf{I} = \mathbf{g}^{-1}\mathbf{g} \tag{16.40}$$
where $\mathbf{I}$ is the four-vector identity matrix. The contravariant components of the four vector can be expressed
in terms of the covariant components as
$$X^\mu = \sum_{\nu=0}^{3} g^{\mu\nu}X_\nu \tag{16.41}$$
Thus equations 16.39 and 16.41 can be used to transform between covariant and contravariant four vectors,
that is, to raise or lower the index $\mu$.
The scalar inner product of two four vectors can be written compactly as the scalar product of a covariant
four vector and a contravariant four vector. The Minkowski metric matrix can be absorbed into either X or
Y thus
$$\mathbf{X}\cdot\mathbf{Y} = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g_{\mu\nu}X^\mu Y^\nu = \sum_{\mu=0}^{3} X_\mu Y^\mu = \sum_{\nu=0}^{3} X^\nu Y_\nu \tag{16.42}$$
If this covariant expression is Lorentz invariant in one coordinate system, then it is Lorentz invariant in all
coordinate systems obtained by proper Lorentz transformations.
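Lowering an index and forming the scalar inner product can be sketched as follows. The example assumes the $+ - - -$ Minkowski metric described in the footnote, and the function names and component values are hypothetical.

```python
# Minkowski metric in the + - - - sign convention (assumed here).
METRIC = [[1.0, 0.0, 0.0, 0.0],
          [0.0, -1.0, 0.0, 0.0],
          [0.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, -1.0]]


def lower_index(contravariant):
    """Covariant components X_mu = sum over nu of g_mu_nu X^nu."""
    return [sum(METRIC[mu][nu] * contravariant[nu] for nu in range(4))
            for mu in range(4)]


def raise_index(covariant):
    """Contravariant components; this metric is its own inverse."""
    return [sum(METRIC[mu][nu] * covariant[nu] for nu in range(4))
            for mu in range(4)]


def inner_product(x_up, y_up):
    """Scalar product X.Y = sum over mu of X_mu Y^mu (equation 16.42)."""
    return sum(x_low * y for x_low, y in zip(lower_index(x_up), y_up))


x = [5.0, 3.0, 0.0, 0.0]
y = [2.0, 1.0, 4.0, -1.0]
print(inner_product(x, x), inner_product(x, y), inner_product(y, x))
```

Absorbing the metric into either four vector gives the same scalar, so the inner product is symmetric, and raising a lowered index returns the original components.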
2 Older textbooks, such as all editions of Marion, and the first two editions of Goldstein, use the Euclidean Poincaré
4-dimensional space-time with the imaginary time axis $ict$. About half the scientific community, and modern physics textbooks
including this textbook and the 3rd edition of Goldstein, use the Bjorken-Drell $+, -, -, -$ sign convention given in equation
16.38 where $x^0 \equiv ct$ and $x^1, x^2, x^3$ are the spatial coordinates. The other half of the community, including mathematicians
and gravitation physicists, use the opposite $-, +, +, +$ sign convention. Further confusion is caused by a few books that assign
the time axis to be $x^4$ rather than $x^0$.
The scalar inner product of the invariant space-time interval is an especially important example.
$$(ds)^2 \equiv d\mathbf{X}\cdot d\mathbf{X} = c^2 (dt)^2 - (d\mathbf{r})^2 = (c\,dt)^2 - \sum_{i=1}^{3} dx_i^2 = (c\,d\tau)^2 \tag{16.43}$$
This is invariant to a Lorentz transformation as can be shown by applying the Lorentz standard boost
transformation given above. In particular, if $S'$ is the rest frame of the clock, then the invariant space-time
interval is simply given by the proper time interval $d\tau$.
time $\tau$, then the time observed in the fixed frame can be obtained by looking at the interval $ds$. Because of
the invariance of the interval $ds^2$, then
$$ds^2 = c^2 d\tau^2 = c^2 dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right] \tag{16.44}$$
That is,
$$d\tau = dt\left[1 - \frac{dx_1^2 + dx_2^2 + dx_3^2}{c^2\,dt^2}\right]^{1/2} = dt\left[1 - \frac{v^2}{c^2}\right]^{1/2} = \frac{dt}{\gamma} \tag{16.45}$$
that is $dt = \gamma\,d\tau$ which satisfies the normal expression for time dilation, 16.8.
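As a numerical illustration of how the proper time follows from the invariant interval, the sketch below evaluates $c^2 d\tau^2 = c^2 dt^2 - (dx_1^2 + dx_2^2 + dx_3^2)$ for a clock moving at $0.6c$. The function name `proper_time` and the chosen numbers are assumptions made for the example only.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def proper_time(dt, dx1, dx2, dx3, c=C):
    """Proper time d_tau from c^2 d_tau^2 = c^2 dt^2 - (dx1^2 + dx2^2 + dx3^2)."""
    interval_sq = (c * dt) ** 2 - (dx1 ** 2 + dx2 ** 2 + dx3 ** 2)
    if interval_sq < 0:
        raise ValueError("spacelike separation: no proper time between the events")
    return math.sqrt(interval_sq) / c

dt = 1.0e-6                 # elapsed time in the fixed frame (s)
v = 0.6 * C                 # clock speed along x1
dtau = proper_time(dt, v * dt, 0.0, 0.0)
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)

print(dtau / dt, 1.0 / gamma)  # both ratios equal sqrt(1 - v^2/c^2)
```

The moving clock records less elapsed time than the fixed-frame clock by the factor $1/\gamma$.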
Remember that the square of the four-dimensional space-time element of length $(ds)^2$ is invariant (16.43),
and is simply related to the proper time element $d\tau$. Thus the scalar product
$$d\mathbf{X}\cdot d\mathbf{X} = ds^2 = c^2 d\tau^2 = c^2 dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right] \tag{16.47}$$
Thus the proper time is an invariant.
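The ratio of the four-vector element $d\mathbf{X}$ and the invariant proper time interval $d\tau$ is a four-vector called
the four-vector velocity $\mathbf{U}$ where
$$\mathbf{U} = \frac{d\mathbf{X}}{d\tau} = \left(c\frac{dt}{d\tau}, \frac{d\mathbf{x}}{d\tau}\right) = \gamma\left(c, \frac{d\mathbf{x}}{dt}\right) = \gamma(c, \mathbf{u}) \tag{16.48}$$
where $\mathbf{u}$ is the particle velocity, and $\gamma = 1/\sqrt{1 - u^2/c^2}$.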
The four-vector momentum $\mathbf{P}$ can be obtained from the four-vector velocity by multiplying it by the
scalar rest mass $m$
$$\mathbf{P} = m\mathbf{U} = (\gamma m c, \gamma m \mathbf{u}) \tag{16.49}$$
However,
$$\gamma m c^2 = E \tag{16.50}$$
thus the momentum four vector can be written as
$$\mathbf{P} = \left(\frac{E}{c}, \mathbf{p}\right) \tag{16.51}$$
where the vector $\mathbf{p}$ represents the three spatial components of the relativistic momentum. It is interesting to
realize that the Theory of Relativity couples not only the spatial and time coordinates, but also their
conjugate variables, linear momentum $\mathbf{p}$ and total energy $E$.
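An additional feature of this momentum-energy four vector $\mathbf{P}$ is that the scalar inner product $\mathbf{P}\cdot\mathbf{P}$ is
invariant to Lorentz transformations and equals $(mc)^2$ in the rest frame
$$\mathbf{P}\cdot\mathbf{P} = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g_{\mu\nu} P^{\mu} P^{\nu} = \sum_{\mu=0}^{3} P_{\mu} P^{\mu} = \left(\frac{E}{c}\right)^2 - |\mathbf{p}|^2 = m^2 c^2 \tag{16.52}$$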
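The invariance of $\mathbf{P}\cdot\mathbf{P}$ can be illustrated with a short calculation. This is a sketch in units with $c = 1$, assuming the $(+,-,-,-)$ metric; the helper names `four_momentum` and `invariant`, and the sample mass and velocity, are hypothetical.

```python
import math

def four_momentum(m, u, c=1.0):
    """Return P = (E/c, p) for rest mass m and three-velocity u."""
    u_sq = sum(x * x for x in u)
    gamma = 1.0 / math.sqrt(1.0 - u_sq / c ** 2)
    energy = gamma * m * c ** 2          # E = gamma m c^2
    p = [gamma * m * x for x in u]       # p = gamma m u
    return [energy / c] + p

def invariant(P):
    """P.P = (E/c)^2 - |p|^2 with the (+,-,-,-) metric."""
    return P[0] ** 2 - sum(x * x for x in P[1:])

moving = four_momentum(2.0, (0.3, 0.4, 0.0))   # speed 0.5 c
at_rest = four_momentum(2.0, (0.0, 0.0, 0.0))
print(invariant(moving), invariant(at_rest))   # both equal (m c)^2
```

The energy and momentum both grow with $\gamma$, but the combination $(E/c)^2 - |\mathbf{p}|^2$ stays at its rest-frame value.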
where $L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)$ denotes the conventional Lagrangian. This approach implicitly assumes the Newtonian
concept of absolute time $t$ which is chosen to be the independent variable that characterizes the evolution
parameter of the system. The actual path $[\mathbf{q}(t), \dot{\mathbf{q}}(t)]$ the system follows is defined by the extremum of the
action integral $S(\mathbf{q}, \dot{\mathbf{q}}, t)$ which leads to the corresponding Euler-Lagrange equations. This assumption is
contrary to the Theory of Relativity which requires that the space and time variables be treated equally,
that is, the Lagrangian formalism must be covariant.
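The conventional action $S$ and extended action $\mathbb{S}$ address alternate characterizations of the same underlying
physical system, and thus the action principle implies that $\delta S = \delta\mathbb{S} = 0$ must hold simultaneously. That is,
$$\delta\int L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds}\,ds = \delta\int \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) ds \tag{16.57}$$
As discussed in chapter 13.3 there is a continuous spectrum of equivalent gauge-invariant Lagrangians for
which the Euler-Lagrange equations lead to identical equations of motion. Equation 16.57 is satisfied if the
conventional and extended Lagrangians are related by
$$\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} + \frac{d\Lambda(\mathbf{q}, t)}{ds} \tag{16.58}$$
where $\Lambda(\mathbf{q}, t)$ is a continuous function of $\mathbf{q}$ and $t$ that has continuous second derivatives. It is acceptable to
assume that $d\Lambda(\mathbf{q}, t)/ds = 0$; then the extended and conventional Lagrangians have a unique relation requiring
no simultaneous transformation of the dynamical variables. That is, assume
$$\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} \tag{16.59}$$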
Note that the time derivative of $\mathbf{q}$ can be expressed in terms of the $s$ derivatives by
$$\frac{d\mathbf{q}}{dt} = \frac{d\mathbf{q}/ds}{dt/ds} \tag{16.60}$$
Thus, for a conventional Lagrangian with $n$ variables, the corresponding extended Lagrangian is a function
of $n + 1$ variables while the conventional and extended Lagrangians are related using equations 16.59 and
16.60.
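The derivatives of the relation between the extended and conventional Lagrangians lead to
$$\frac{\partial\mathbb{L}}{\partial q^{\mu}} = \frac{\partial L}{\partial q^{\mu}}\frac{dt}{ds} \tag{16.61}$$
$$\frac{\partial\mathbb{L}}{\partial t} = \frac{\partial L}{\partial t}\frac{dt}{ds} \tag{16.62}$$
$$\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^{\mu}}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^{\mu}}{dt}\right)} \tag{16.63}$$
$$\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^{\mu}}{dt}\right)}\frac{dq^{\mu}}{dt} \tag{16.64}$$
where $1 \le \mu \le n$ since the $\mu = 0$ time derivatives are written explicitly in equations 16.62 and 16.64.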
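Equations 16.63 and 16.64, summed over the extended range $0 \le \mu \le n$ of time and spatial dynamical
variables, imply
$$\sum_{\mu=0}^{n}\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^{\mu}}{ds}\right)}\frac{dq^{\mu}}{ds} = \left(L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^{\mu}}{dt}\right)}\frac{dq^{\mu}}{dt}\right)\frac{dt}{ds} + \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^{\mu}}{dt}\right)}\frac{dq^{\mu}}{ds} = L\frac{dt}{ds} = \mathbb{L} \tag{16.65}$$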
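The homogeneity relation, in which the velocity-weighted sum of derivatives of the extended Lagrangian reproduces the extended Lagrangian itself, can be checked numerically. The sketch below assumes a one-dimensional harmonic oscillator for the conventional Lagrangian and builds the extended Lagrangian as $L\,dt/ds$; the names `L_conv`, `L_ext` and `partial`, and the test point, are illustrative. The time contribution is written with $t$ and $dt/ds$ directly, which gives the same product as using $q^0 = ct$.

```python
def L_conv(q, qdot, t, m=1.0, k=1.0):
    """Conventional Lagrangian of a harmonic oscillator (assumed example)."""
    return 0.5 * m * qdot ** 2 - 0.5 * k * q ** 2

def L_ext(q, q_s, t, t_s):
    """Extended Lagrangian L(q, dq/dt, t) * dt/ds with dq/dt = (dq/ds)/(dt/ds)."""
    return L_conv(q, q_s / t_s, t) * t_s

def partial(f, args, i, h=1e-6):
    """Central finite-difference derivative of f with respect to argument i."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

# Arbitrary point: q, dq/ds, t, dt/ds
point = (0.7, 1.3, 0.2, 1.9)

dL_dqs = partial(L_ext, point, 1)   # derivative with respect to dq/ds
dL_dts = partial(L_ext, point, 3)   # derivative with respect to dt/ds
euler_sum = dL_dqs * point[1] + dL_dts * point[3]

print(euler_sum, L_ext(*point))     # the two numbers agree
```

Agreement of the two printed values reflects the fact that an extended Lagrangian constructed this way is homogeneous of first order in the $s$ derivatives.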
Assume that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended Hamiltonian $\mathbb{H}$, are related
by a Legendre transformation, and are based on variational principles, analogous to the relation that exists
between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation requires defining
the extended generalized (canonical) momentum-energy four vector $\mathbb{P}(s) = \left(\frac{\mathcal{E}(s)}{c}, \mathbf{p}(s)\right)$. The momentum
components of the momentum-energy four vector are given by the $1 \le \mu \le n$ components
using equation 16.63
$$p_{\mu}(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^{\mu}}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^{\mu}}{dt}\right)} \tag{16.68}$$
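The $\mu = 0$ component of the momentum-energy four vector can be derived by recognizing that the right-hand
side of equation 16.64 is equal to $-H(\mathbf{q}, \mathbf{p}, t)$. That is, the corresponding generalized momentum $p_0$ that
is conjugate to $q^0 = ct$ is given by
$$p_0 = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^{0}}{ds}\right)} = \frac{1}{c}\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = \frac{1}{c}\left(L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^{\mu}}{dt}\right)}\frac{dq^{\mu}}{dt}\right) = -\frac{H(\mathbf{q}, \mathbf{p}, t)}{c} \tag{16.69}$$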
where the extended generalized force $\mathbb{Q}$ shown on the right-hand side of equation 16.70 accounts for all
forces not included in the potential energy term in the Lagrangian. The extended generalized force $\mathbb{Q}$ can
be factored into two terms as discussed in chapter 6, equation 6.47. The Lagrange multiplier term includes
$1 \le k \le m$ holonomic constraint forces where the holonomic constraints, which do no work, are expressed
in terms of the $m$ algebraic equations of holonomic constraint $g_k$. The other term includes the remaining
constraint forces and generalized forces that are not included in the Lagrange multiplier term or the potential
energy term of the Lagrangian.
For the case where $\mu = 0$, since $q^0 = ct$, then equation 16.70 reduces to
à !
L L X X
¡ ¢ − = − (16.71)
=1
=1
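These Euler-Lagrange equations of motion 16.70 and 16.71 determine the $1 \le \mu \le n$ generalized coordinates
$q^{\mu}(s)$ plus $q^0 = ct(s)$ in terms of the independent variable $s$.
If the holonomic equations of constraint are time independent, that is $\partial g_k/\partial t = 0$, and if $\mathbb{Q}_0 = 0$, then
the $\mu = 0$ term of the Euler-Lagrange equations simplifies to
$$\frac{d}{ds}\left(\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}\right) - \frac{\partial\mathbb{L}}{\partial t} = 0 \tag{16.72}$$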
One interpretation is to select $L$ to be primary. Then $\mathbb{L}$ is derived from $L$ using equation 16.59 and $\mathbb{L}$
must satisfy the identity given by equation 16.66 while the Euler-Lagrange equations containing $dt/ds$ yield an
identity which implies that $dt/ds$ does not provide an equation of motion in terms of $t(s)$. Conversely, if $\mathbb{L}$ is
chosen to be primary, then $\mathbb{L}$ is no longer a homogeneous function and equation 16.66 serves as a constraint
on the motion that can be used to deduce $L$, while $dt/ds$ yields a non-trivial equation of motion in terms of
$t(s)$. In both cases the occurrence of a constraint surface results from the fact that the extended space has
$2n + 2$ variables to describe $2n + 1$ degrees of freedom, that is, one more degree of freedom than required for
the actual system.
The constant third term in the bracket is included to ensure that the extended Lagrangian converges to the
standard Lagrangian in the limit $dt/ds \to 1$.
Note that this extended Lagrangian is not homogeneous to first order in the velocities $d\mathbf{q}/ds$
as is required.
Equation 16.66 must be used to ensure that it is homogeneous. That is, it must satisfy the
constraint relation
$$\left(\frac{dt}{ds}\right)^2 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - 1 = 0$$
Inserting this constraint into the extended Lagrangian yields that the square bracket must equal $-2$.
Thus
$$|\mathbb{L}| = \frac{1}{2} m c^2 [-2] = -mc^2$$
The constraint equation implies that
$$\frac{ds}{dt} = \sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{dt}\right)^2} = \frac{1}{\gamma}$$
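The step from the constraint relation to $ds/dt = 1/\gamma$ can be spelled out as a short worked derivation. It assumes the constraint has the form $(dt/ds)^2 - (1/c^2)(d\mathbf{q}/ds)^2 = 1$, uses the chain rule $d\mathbf{q}/ds = (d\mathbf{q}/dt)(dt/ds)$, and takes the positive root so that $s$ increases with $t$.

```latex
\begin{align}
\left(\frac{dt}{ds}\right)^2 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{dt}\right)^2\left(\frac{dt}{ds}\right)^2 &= 1 \\
\left(\frac{dt}{ds}\right)^2\left[1 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{dt}\right)^2\right] &= 1 \\
\frac{ds}{dt} &= \sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{dt}\right)^2} = \frac{1}{\gamma}
\end{align}
```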
Equation () is the conventional relativistic Lagrangian derived by assuming that the system evolution para-
meter is transformed to be along the world line where the invariant length replaces the proper time
interval
= = ()
The definition of the generalized (canonical) momentum
$$p_i = \frac{\partial L}{\partial\dot{q}_i} = \gamma m \dot{q}_i$$
leads to the relativistic expression for momentum given in equation 16.21.
The relativistic Lagrangian is an important example of a non-standard Lagrangian. It does not
equal the difference between the kinetic and potential energies, that is, the relativistic expression for kinetic
energy is given by 16.28 to be
$$T = (\gamma - 1) m c^2$$
The non-standard relativistic Lagrangian can be used with the Euler-Lagrange equations to derive the
second-order equations of motion for both relativistic and non-relativistic problems within the Special Theory
of Relativity.
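To illustrate how the relativistic kinetic energy $(\gamma - 1)mc^2$ relates to the Newtonian value $\frac{1}{2}mv^2$, the sketch below compares the two at a low and a high speed. The function names and the sample speeds are choices made for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_gamma(v):
    """Lorentz factor for speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def kinetic_relativistic(m, v):
    """T = (gamma - 1) m c^2."""
    return (lorentz_gamma(v) - 1.0) * m * C ** 2

def kinetic_newtonian(m, v):
    """T = m v^2 / 2."""
    return 0.5 * m * v ** 2

m = 1.0  # kg
ratio_slow = kinetic_relativistic(m, 0.01 * C) / kinetic_newtonian(m, 0.01 * C)
ratio_fast = kinetic_relativistic(m, 0.6 * C) / kinetic_newtonian(m, 0.6 * C)
print(ratio_slow, ratio_fast)
```

At one percent of the speed of light the two expressions differ by less than one part in $10^4$, while at $0.6c$ the relativistic value is larger by about 39 percent.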
16.6. LORENTZ-INVARIANT FORMULATION OF LAGRANGIAN MECHANICS 473
If we adopt the definition that the relativistic canonical momentum is = then the left hand side is
the relativistic force while the right-hand side is the well-known Lorentz force of electromagnetism. Thus
the extended Lagrangian formulation correctly reproduces the well-known Lorentz force for a charged particle
moving in an electromagnetic field.
Struckmeier[Str08] assumes that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended Hamiltonian
$\mathbb{H}$, are related by a Legendre transformation, and are based on variational principles, analogous to the
relation that exists between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation
requires defining the extended generalized (canonical) momentum-energy four vector $\mathbb{P}(s) = \left(\frac{\mathcal{E}(s)}{c}, \mathbf{p}(s)\right)$.
The momentum components of the momentum-energy four vector are given by the $1 \le \mu \le n$
components using either the conventional or the extended Lagrangians as given in equation 16.68
$$p_{\mu}(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^{\mu}}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^{\mu}}{dt}\right)} \tag{16.68}$$
where $\mathcal{E}(s)$ represents the instantaneous generalized energy of the conventional Hamiltonian at the point $s$
but not the functional form of $H(\mathbf{q}(s), \mathbf{p}(s), t(s))$. That is
$$\mathcal{E}(s) \overset{\not\equiv}{=} H(\mathbf{q}(s), \mathbf{p}(s), t(s)) \tag{16.76}$$
Note that $\mathcal{E}(s)$ does not give the function $H(\mathbf{q}, \mathbf{p}, t)$. Equations 16.68 and 16.69 give that
$$p_0(s) = -\frac{\mathcal{E}(s)}{c} \tag{16.77}$$
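The extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$, in an extended phase space, can be defined by the Legendre
transformation and the four-vector $\mathbb{P}$ to be
$$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s)) = \left(\mathbb{P}\cdot\frac{d\mathbf{q}}{ds}\right) - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) \tag{16.78}$$
$$= \sum_{\mu=0}^{n} p_{\mu}\frac{dq^{\mu}}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right)$$
$$= \sum_{\mu=1}^{n} p_{\mu}\frac{dq^{\mu}}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) \tag{16.79}$$
where the $p_0$ term has been written explicitly as $-\mathcal{E}\,dt/ds$ in equation 16.79. The extended Hamiltonian
$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ can carry all the information on the dynamical system that is carried by the extended
Lagrangian $\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right)$ if the Hesse matrix is non-singular. That is, if
$$\det\left(\frac{\partial^2\mathbb{L}}{\partial\left(\frac{dq^{\mu}}{ds}\right)\partial\left(\frac{dq^{\nu}}{ds}\right)}\right) \neq 0 \tag{16.80}$$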
16.7. LORENTZ-INVARIANT FORMULATIONS OF HAMILTONIAN MECHANICS 475
If the extended Lagrangian $\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right)$ is not homogeneous in the $n + 1$ velocities $dq^{\mu}/ds$, then the extended
set of Euler-Lagrange equations 16.72 is not redundant. Thus equation 16.66 is not an identity but it can be
regarded as an implicit equation that is always satisfied by the extended set of Euler-Lagrange equations. As
a result, the Legendre transformation to an extended Hamiltonian exists. That is, equation 16.66 is identical
to the Legendre transform for $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ which was shown to equal zero. Therefore
$$\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s)) = 0 \tag{16.81}$$
which means that the extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ directly defines the restricted hypersurface on
which the particle motion is confined.
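The extended canonical equations of motion, derived using the extended Hamiltonian $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$
with the usual Hamiltonian mechanics relations, are:
$$\frac{\partial\mathbb{H}}{\partial p_{\mu}} = \frac{dq^{\mu}}{ds} \tag{16.82}$$
$$\frac{\partial\mathbb{H}}{\partial q^{\mu}} = -\frac{dp_{\mu}}{ds} \tag{16.83}$$
$$\frac{\partial\mathbb{H}}{\partial t} = \frac{d\mathcal{E}}{ds} \tag{16.84}$$
$$\frac{\partial\mathbb{H}}{\partial\mathcal{E}} = -\frac{dt}{ds} \tag{16.85}$$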
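These canonical equations give that the total derivative of $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$ with respect to $s$ is
$$\frac{d\mathbb{H}}{ds} = \sum_{\mu=1}^{n}\left(\frac{\partial\mathbb{H}}{\partial p_{\mu}}\frac{dp_{\mu}}{ds} + \frac{\partial\mathbb{H}}{\partial q^{\mu}}\frac{dq^{\mu}}{ds}\right) + \frac{\partial\mathbb{H}}{\partial t}\frac{dt}{ds} + \frac{\partial\mathbb{H}}{\partial\mathcal{E}}\frac{d\mathcal{E}}{ds}$$
$$= \sum_{\mu=1}^{n}\left(\frac{dq^{\mu}}{ds}\frac{dp_{\mu}}{ds} - \frac{dp_{\mu}}{ds}\frac{dq^{\mu}}{ds}\right) + \frac{d\mathcal{E}}{ds}\frac{dt}{ds} - \frac{dt}{ds}\frac{d\mathcal{E}}{ds} = 0 \tag{16.86}$$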
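That is, in contrast to the total time derivative of $H(\mathbf{q}, \mathbf{p}, t)$, the total $s$ derivative of the extended Hamiltonian
$\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$ always vanishes, that is, $\mathbb{H}$ is autonomous which is ideal
for use with Hamilton's equations of motion. The constraints give that $\mathbb{H} = 0$ (equation
16.81) and $d\mathbb{H}/ds = 0$ (equation 16.86), implying that the correlation between the extended and conventional
Hamiltonians is given by
$$\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s)) = \sum_{\mu=1}^{n} p_{\mu}\frac{dq^{\mu}}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) \tag{16.87}$$
$$= \sum_{\mu=1}^{n} p_{\mu}\frac{dq^{\mu}}{ds} - \mathcal{E}\frac{dt}{ds} - L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} \tag{16.88}$$
$$= \sum_{\mu=1}^{n} p_{\mu}\frac{dq^{\mu}}{ds} - \mathcal{E}\frac{dt}{ds} + \left[H(\mathbf{q}, \mathbf{p}, t) - \sum_{\mu=1}^{n} p_{\mu}\frac{dq^{\mu}}{dt}\right]\frac{dt}{ds} \tag{16.89}$$
$$= \left(H(\mathbf{q}, \mathbf{p}, t) - \mathcal{E}\right)\frac{dt}{ds} = 0 \tag{16.90}$$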
since only the term with $\mu = 0$ does not cancel in equation 16.79. Equations 16.81 and 16.90 give that both the
left and right-hand sides of equation 16.90 are zero while equation 16.86 implies that $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$
is a constant of motion, that is, $s$ is a cyclic variable for $\mathbb{H}$. Formally one can consider
the extended Hamiltonian to be a constant which equals zero
$$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s)) = \mathbb{E}(s) = 0 \tag{16.91}$$
Equations 16.84 and 16.85 imply that $(\mathcal{E}, t)$ form a pair of canonically conjugate variables in addition to the
newly-introduced canonically-conjugate variables $(\mathbb{E}(s), s)$. Equation 16.90 shows that the motion in the
$2n + 2$ extended phase space is constrained to a surface, reflecting the fact that the observed system has
one less degree of freedom than used by the extended Hamiltonian.
In summary, the Lorentz-invariant extended canonical formalism leads to Hamilton's first-order equations
of motion in terms of derivatives with respect to $s$ where $s$ is related to the proper time $\tau$ for a relativistic
system.
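The extended canonical equations can be integrated numerically to see that the extended Hamiltonian stays on the surface where it vanishes. The sketch below is only an illustration under stated assumptions: it uses the simplest choice $\mathbb{H} = H(q, p, t) - \mathcal{E}$, which corresponds to $dt/ds = 1$, a driven one-dimensional oscillator as the conventional Hamiltonian, and a hand-written fourth-order Runge-Kutta step. The parameter values and function names are hypothetical.

```python
import math

M, K, F0, W = 1.0, 1.0, 0.5, 0.7  # hypothetical oscillator and drive parameters

def hamiltonian(q, p, t):
    """Conventional, explicitly time-dependent Hamiltonian H(q, p, t)."""
    return p * p / (2 * M) + 0.5 * K * q * q - q * F0 * math.cos(W * t)

def extended_hamiltonian(state):
    """Assumed extended Hamiltonian H_ext = H(q, p, t) - E (so dt/ds = 1)."""
    q, p, t, e = state
    return hamiltonian(q, p, t) - e

def rhs(state):
    """Right-hand sides of the extended canonical equations."""
    q, p, t, e = state
    dq = p / M                               # dq/ds =  dH_ext/dp
    dp = -K * q + F0 * math.cos(W * t)       # dp/ds = -dH_ext/dq
    dt = 1.0                                 # dt/ds = -dH_ext/dE
    de = q * F0 * W * math.sin(W * t)        # dE/ds =  dH_ext/dt
    return (dq, dp, dt, de)

def rk4_step(state, ds):
    """One classical Runge-Kutta step in the evolution parameter s."""
    def shift(s, k, f):
        return tuple(x + f * y for x, y in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shift(state, k1, ds / 2))
    k3 = rhs(shift(state, k2, ds / 2))
    k4 = rhs(shift(state, k3, ds))
    return tuple(x + ds / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def evolve(steps=2000, ds=0.005):
    q0, p0, t0 = 1.0, 0.0, 0.0
    # Start on the constraint surface H_ext = 0 by setting E = H(q0, p0, t0).
    state = (q0, p0, t0, hamiltonian(q0, p0, t0))
    for _ in range(steps):
        state = rk4_step(state, ds)
    return state

final = evolve()
print("t =", final[2], " E =", final[3], " H_ext =", extended_hamiltonian(final))
```

Although the conventional energy $\mathcal{E}$ changes because the drive is time dependent, the extended Hamiltonian remains zero to within the integration error, which is the behaviour expected of an autonomous extended system.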
As for the conventional Poisson bracket discussed in chapter 14, the extended Poisson bracket also leads to the
fundamental Poisson bracket relations
$$[[q^{\mu}, q^{\nu}]] = 0 \qquad [[p_{\mu}, p_{\nu}]] = 0 \qquad [[q^{\mu}, p_{\nu}]] = \delta^{\mu}_{\nu} \tag{16.93}$$
where $\mu, \nu = 0, 1, \ldots, n$. These are identical to the non-extended fundamental Poisson brackets.
The discussion of observables in Hamiltonian mechanics in chapter 14.3.4 can be trivially expanded to
the extended Poisson bracket representation. In particular, the total $s$ derivative of the function $F$ is given
by
$$\frac{dF}{ds} = \frac{\partial F}{\partial s} + [[F, \mathbb{H}]] \tag{16.94}$$
If $F$ commutes with the extended Hamiltonian, that is, the Poisson bracket equals zero, and if $\partial F/\partial s = 0$, then
$dF/ds = 0$. That is, the observable $F$ is a constant of motion.
Substituting the fundamental variables for $F$ gives
$$\frac{dp_{\mu}}{ds} = [[p_{\mu}, \mathbb{H}]] = -\frac{\partial\mathbb{H}}{\partial q^{\mu}} \qquad \frac{dq^{\mu}}{ds} = [[q^{\mu}, \mathbb{H}]] = \frac{\partial\mathbb{H}}{\partial p_{\mu}} \tag{16.95}$$
where $\mu = 0, 1, \ldots, n$. These are Hamilton's extended canonical equations of motion expressed in terms of
the system evolution parameter $s$. The extended Poisson bracket representation is a trivial extension of the
conventional canonical equations presented in chapter 14.3.
s s
2 4
4 2 Γ2 2 Γ2 (1 − 2 )
= Γ= 1− 2 2 = 2 = 1+
1 + cos[Γ( − 0 ] 1 − Γ2
The apses are min = (1+) for Γ( − 0 ) = 0 2 4 and max = (1−) for Γ( − 0 ) = 3. The
perihelion advances between cycles due to the change in relativistic mass during the trajectory as shown in
the adjacent figure. This precession leads to the fine structure observed in the optical spectra of the hydrogen
atom. The same precession of the perihelion occurs for planetary motion, however, there is a comparable
size effect due to gravity that requires use of general relativity to compute the trajectories.
Mach’s principle:
The 1883 work "The Science of Mechanics" by the philosopher/physicist, Ernst Mach, criticized Newton’s
concept of an absolute frame of reference, and suggested that local physical laws are determined by the
large-scale structure of the universe. The concept is that local motion of a rotating frame is determined by
the large-scale distribution of matter, that is, relative to the fixed stars. Einstein’s interpretation of Mach’s
statement was that the inertial properties of a body are determined by the presence of other bodies in the
universe, and he named this concept Mach’s Principle. Mach’s Principle has never been developed into a
quantitative physical theory that would explain a mechanism by which the large-scale distribution of matter
can produce such an effect.
Equivalence principle:
The equivalence principle comprises closely-related concepts dealing with the equivalence of gravitational and
inertial mass. The weak equivalence principle states that the inertial mass and gravitational mass of a
body are identical, leading to an acceleration that is independent of the nature of the body. This experimental
fact usually is attributed to Galileo. Recent measurements have shown that this weak equivalence principle
is obeyed to a sensitivity of $5 \times 10^{-13}$. Einstein's equivalence principle states that the outcome of
any local non-gravitational experiment, in a freely falling laboratory, is independent of the velocity of the
laboratory and its location in space-time. This principle implies that the result of local experiments must be
independent of the velocity of the apparatus. Einstein’s equivalence principle has been tested by searching
for variations of dimensionless fundamental constants such as the fine structure constant. The strong
equivalence principle combines the weak equivalence and Einstein equivalence principles, and implies
that the gravitational constant $G$ is constant everywhere in the universe. The strong equivalence principle
suggests that gravity is geometrical in nature and does not involve any fifth force in nature. Einstein’s
General Theory of Relativity satisfies the strong equivalence principle. Tests of the strong equivalence
principle have involved searches for variations in the gravitational constant and masses of fundamental
particles throughout the life of the universe.
Principle of covariance
A physical law expressed in a covariant formulation has the same mathematical form in all coordinate systems,
and is usually expressed in terms of tensor fields. Maxwell’s equations of electromagnetism are an example of
such a covariant formulation. In the Special Theory of Relativity, the Lorentz, rotational, translational and
reflection transformations between inertial coordinate frames all are covariant. The covariant quantities are
the 4-scalars, and 4-vectors in Minkowski space-time. Einstein recognized that the principle of covariance,
that is built into the Special Theory of Relativity, should apply equally to accelerated relative motion in
the General Theory of Relativity. He exploited tensor calculus to extend the Lorentz covariance to the
more general local covariance in the General Theory of Relativity. The reduction locally of the general
metric tensor to the Minkowski metric corresponds to free-falling motion, that is geodesic motion, and thus
encompasses gravitation. Unified field theory involves attempts to extend the General Theory of Relativity
to incorporate other physical phenomena within a covariant framework in a purely geometric representation
in space-time.
Correspondence principle
The Correspondence Principle states that the predictions of any new scientific theory must reduce to the pre-
dictions of well established earlier theories under circumstances for which the preceding theory was known
to be valid. This also is referred to as the "correspondence limit". The Correspondence Principle is an
important concept used both in quantum mechanics and relativistic mechanics. Einstein’s Special Theory
of Relativity satisfies the Correspondence Principle because it reduces to classical mechanics in the limit
of velocities small compared to the speed of light. The Correspondence Principle requires that the Gen-
eral Theory of Relativity must reduce to the Special Theory of Relativity for inertial frames, and should
approximate Newton’s Theory of Gravitation in weak fields and at low velocities.
Kepler problem In 1915 Einstein showed that relativistic mechanics explained the anomalous advance
of the perihelion of the planet Mercury, that is, the axes of the elliptical Kepler orbit precess. Example 16.1
discusses the analogue of this effect for the Bohr-Sommerfeld hydrogen atom.
Deflection of light Eddington travelled to the island of Príncipe, near Africa, to watch the solar eclipse
of 29 May 1919. During the eclipse, he took pictures of the stars in the region around the Sun. According
to the theory of general relativity, stars with light rays that passed near the Sun would appear to have been
slightly shifted because their light had been curved by the Sun's gravitational field. This effect is noticeable
only during eclipses, since otherwise the Sun’s brightness obscures the affected stars. The results confirmed
Einstein’s prediction of the deflection of light in a gravitational field which made Einstein famous.
Black holes When the mass to radius ratio of the massive object becomes sufficiently large, general
relativity predicts formation of a black hole, which is a region of space from which neither light nor matter
can escape. At the center of a galaxy there usually exists a supermassive black hole with a mass that is
$10^6$ to $10^9$ solar masses which is thought to have played an important role in formation of the galaxy.
16.10 Summary
Special theory of relativity: The Special Theory of Relativity is based on Einstein’s postulates;
1) The laws of nature are the same in all inertial frames of reference.
2) The velocity of light in vacuum is the same in all inertial frames of reference.
For a primed frame moving along the $x_1$ axis with velocity $v$, Einstein's postulates imply the following
Lorentz transformations between the moving (primed) and stationary (unprimed) frames
The General Theory of Relativity: An elementary summary was given of the fundamental concepts
of the General Theory of Relativity and the resultant unified description of the gravitational force plus
planetary motion as geodesic motion in a four-dimensional Riemannian structure. Variational mechanics
was shown to be ideally suited to applications of the General Theory of Relativity.
Philosophical implications: Newton's equations of motion, and his Law of Gravitation, that reigned
supreme from 1687 to 1905, have been toppled from the throne by Einstein's theories of relativistic mechanics.
By contrast, the complete independence of coordinate frames in the Lagrangian and Hamiltonian
formulations of classical mechanics, and the underlying Principle of Least Action, are equally valid in both
the relativistic and non-relativistic regimes. As a consequence, relativistic Lagrangian and Hamiltonian
formulations underlie much of modern physics, especially quantum physics, which explains why relativistic
mechanics is so important to classical dynamics.6
6 Recommended reading:
"Mr. Tompkins in Paperback" by George Gamow. An excellent elementary description of the implications of the Theory of
Relativity.
"Gravity: An Introduction to Einstein’s General Relativity" by James B. Hartle, Addison Wesley (2003).
"Classical Mechanics and Relativity" by H.J.W. Müller-Kirsten, World Scientific, Singapore (2008).
Workshop exercises
1. A relativistic snake of proper length 100 cm is travelling to the right across a butcher's table at $v = 0.6c$. You
hold two meat cleavers, one in each hand, which are 100 cm apart. You strike the table simultaneously with
both cleavers at the moment when the left cleaver lands just behind the tail of the snake. You rationalize that
since the snake is moving with $v = 0.6c$ then the length of the snake is Lorentz contracted by the factor $\gamma = 5/4$
and thus the Lorentz-contracted length of the snake is 80 cm and thus it will not be harmed. However, the snake
reasons that relative to it the cleavers are moving at $v = 0.6c$ and thus are only 80 cm apart when they strike
the 100 cm long snake and thus it will be severed. Use the Lorentz transformation to resolve this paradox.
2. Explain what is meant by the following statement: “Lorentz transformations are orthogonal transformations
in Minkowski space.”
(a) Energy
(b) Momentum
(c) Mass
(d) Force
(e) Charge
(f) The length of a vector
(g) The length of a four-vector
4. What does it mean for two events to have a spacelike interval? What does it mean for them to have a timelike
interval? Draw a picture to support your answer. In which case can events be causally connected?
Problems
1. A supply rocket flies past two markers on the Space Station that are 50 m apart in a time of 0.2 μs as measured
by an observer on the Space Station.
(a) What is the separation of the two markers as seen by the pilot riding in the supply rocket?
(b) What is the elapsed time as measured by the pilot in the supply rocket?
(c) What are the speeds calculated by the observer in the Space Station and the pilot of the supply rocket?
2. The Compton effect involves a photon of incident energy $E_i$ being scattered by an electron of mass $m_e$ which
initially is stationary. The photon scattered at an angle $\theta$ with respect to the incident photon has a final energy
$E_f$. Using the special theory of relativity derive a formula that relates $E_f$ and $E_i$ to $\theta$.
3. Pair creation involves production of an electron-positron pair by a photon. Show that such a process is
impossible unless some other body, such as a nucleus, is involved. Suppose that the nucleus has a mass $M$
and the electron mass $m_e$. What is the minimum energy that the photon must have in order to produce an
electron-positron pair?
4. A $K$ meson of rest energy 494 MeV decays into a $\mu$ meson of rest energy 106 MeV and a neutrino of zero
rest energy. Find the kinetic energies of the $\mu$ meson and the neutrino into which the $K$ meson decays while
at rest.
Chapter 17
17.1 Introduction
Classical mechanics, including extensions to relativistic velocities, embraces an unusually broad range of topics
ranging from astrophysics to nuclear and particle physics, from one-body to many-body statistical mechanics.
It is interesting to discuss the role of classical mechanics in the development of quantum mechanics which
plays a crucial role in physics. A valid question is "why discuss quantum mechanics in a classical mechanics
course?". The answer is that quantum mechanics supersedes classical mechanics as the fundamental the-
ory of mechanics. Classical mechanics is an approximation applicable for situations where quantization is
unimportant. Thus there must be a correspondence principle that relates quantum mechanics to classical
mechanics, analogous to the relation between relativistic and non-relativistic mechanics. It is illuminating to
study the role played by the Hamiltonian formulation of classical mechanics in the development of quantal
theory and statistical mechanics. The Hamiltonian formulation is expressed in terms of the phase-space
variables $\mathbf{q}, \mathbf{p}$ for which there are well-established rules for transforming to quantal linear operators.
$$E = h\nu = \hbar\omega \tag{17.1}$$
where $\nu$ is the frequency of the electromagnetic radiation and Planck's constant, $h = 6.626 \times 10^{-34}$ J·s, was
the best fit parameter of the interpolation. That is, Planck assumed that energy comes in discrete bundles
of energy equal to $h\nu$ which are called quanta. By making this extreme assumption, in an act of desperation,
Planck was able to reproduce the experimental black body radiation spectrum. The assumption that energy
was exchanged in bundles hinted that the classical laws of physics were inadequate in the microscopic
domain. The older generation of physicists initially refused to believe Planck's hypothesis which underlies
quantum theory. It was the new generation of physicists, like Einstein, Bohr, Heisenberg, Born, Schrödinger,
and Dirac, who developed Planck's hypothesis leading to the revolutionary quantum theory.
In 1905, Einstein predicted the existence of the photon, derived the theory of specific heat, and
derived the Theory of Special Relativity. It is remarkable to realize that he developed these three
revolutionary theories in one year, when he was only 26 years old. Einstein uncovered an inconsistency in
Planck's derivation of the black body spectral distribution in that it assumed the statistical part of the energy
is quantized, whereas the electromagnetic radiation assumed Maxwell's equations with oscillator energies
being continuous. Planck demanded that light of frequency $\nu$ be packaged in quanta whose energies were
multiples of $h\nu$, but Planck never thought that light would have particle-like behavior. Newton believed that
light involved corpuscles, and Hamilton developed the Hamilton-Jacobi theory seeking to describe light in
terms of the corpuscle theory. However, Maxwell had convinced physicists that light was a wave phenomenon;
interference plus diffraction effects were convincing manifestations of the wave-like properties of light. In
order to reproduce Planck's prediction, Einstein had to treat black-body radiation as if it consisted of a gas
of photons, each photon having energy $E = h\nu$. This was a revolutionary concept that returned to Newton's
corpuscle theory of light. Einstein realized that there were direct tests of his photon hypothesis, one of which
is the photo-electric effect. According to Einstein, each photon has an energy $E = h\nu$, in contrast to the
classical case where the energy of the photoelectron depends on the intensity of the light. Einstein predicted
that the ejected electron will have a kinetic energy
$$E_k = h\nu - W \tag{17.2}$$
where $W$ is the work function which is the energy needed to remove an electron from a solid.
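A short numerical example may help fix the scale of the photo-electric relation, kinetic energy equals photon energy minus work function. The sketch below uses the defined SI values of Planck's constant, the speed of light and the elementary charge; the work function of 2.3 eV and the wavelengths are illustrative assumptions, not values taken from the text.

```python
H_PLANCK = 6.62607015e-34    # Planck's constant, J s
C = 299_792_458.0            # speed of light, m/s
E_CHARGE = 1.602176634e-19   # joules per electron volt

def photon_energy_ev(wavelength_m):
    """Photon energy h*nu = h*c/lambda, expressed in eV."""
    return H_PLANCK * C / wavelength_m / E_CHARGE

def photoelectron_energy_ev(wavelength_m, work_function_ev):
    """Maximum kinetic energy h*nu - W in eV, or None below threshold."""
    excess = photon_energy_ev(wavelength_m) - work_function_ev
    return excess if excess > 0 else None

WORK_FUNCTION_EV = 2.3  # assumed illustrative value

violet = photoelectron_energy_ev(400e-9, WORK_FUNCTION_EV)
red = photoelectron_energy_ev(700e-9, WORK_FUNCTION_EV)
print(violet, red)
```

For the assumed work function, 400 nm light ejects electrons with a little under 1 eV of kinetic energy, while 700 nm light ejects none no matter how intense it is, which is the frequency dependence that distinguishes the photon picture from the classical one.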
Many older scientists, including Planck, accepted Einstein's theory of relativity but were skeptical of
the photon concept, even after Einstein's theory was vindicated in 1915 by Millikan who showed that, as
predicted, the energy of the ejected photoelectron depended on the frequency, and not the intensity, of the light.
In 1923 Compton demonstrated that electromagnetic radiation scattered by free electrons obeyed simple
two-body scattering laws, which finally convinced the many skeptics of the existence of the photon.
Table 17.1: Chronology of the development of quantum mechanics
Date  Author                  Development
1887  Hertz                   Discovered the photo-electric effect
1895  Röntgen                 Discovered x-rays
1896  Becquerel               Discovered radioactivity
1897  J.J. Thomson            Discovered the first fundamental particle, the electron
1898  Pierre & Marie Curie    Showed that thorium is radioactive which founded nuclear physics
1900  Planck                  Quantization $E = h\nu$ explained the black-body spectrum
1905  Einstein                Theory of special relativity
1905  Einstein                Predicted the existence of the photon
1906  Einstein                Used Planck's constant to explain specific heats of solids
1909  Millikan                The oil drop experiment measured the charge on the electron
1911  Rutherford              Discovered the atomic nucleus with radius $10^{-15}$ m
1912  Bohr                    Bohr model of the atom explained the quantized states of hydrogen
1914  Moseley                 X-ray spectra determined the atomic number of the elements
1915  Millikan                Used the photo-electric effect to confirm the photon hypothesis
1915  Wilson-Sommerfeld       Proposed quantization of the action-angle integral
1921  Stern-Gerlach           Observed space quantization in non-uniform magnetic field
1923  Compton                 Compton scattering of x-rays confirmed the photon hypothesis
1924  de Broglie              Postulated wave-particle duality for matter and EM waves
1924  Bohr                    Explicit statement of the correspondence principle
1925  Pauli                   Postulated the exclusion principle
1925  Goudsmit-Uhlenbeck      Postulated the spin of the electron of $s = \frac{1}{2}\hbar$
1925  Heisenberg              Matrix mechanics representation of quantum theory
1925  Dirac                   Related Poisson brackets and commutation relations
1926  Schrödinger             Wave mechanics
1927  G.P. Thomson/Davisson   Electron diffraction proved wave nature of electron
1928  Dirac                   Developed the Dirac relativistic wave equation
17.2. BRIEF SUMMARY OF THE ORIGINS OF QUANTUM THEORY 485
17.2.2 Quantization
By 1912 Planck, and others, had abandoned the concept that quantum theory was a branch of classical
mechanics, and were searching to see if classical mechanics was a special case of a more general quantum
physics, or quantum physics was a science altogether outside of classical mechanics. Also they were trying
to find a consistent and rational reason for quantization to replace the ad hoc assumption of Bohr.
In 1912 Sommerfeld proposed that, in every elementary process, the atom gains or loses a definite amount
of action between times $t_0$ and $t$ of
$$S = \int_{t_0}^{t} L(t')\,dt' \tag{17.3}$$
where $S$ is the quantal analogue of the classical action function. It has been shown that the classical
principle of least action states that the action function is stationary for small variations of the trajectory.
In 1915 Wilson and Sommerfeld recognized that the quantization of angular momentum could be expressed in
terms of the action-angle integral, that is, equation (14.116). They postulated that, for every coordinate,
the action-angle variable is quantized
$$\oint p_k\,dq_k = n_k h \tag{17.4}$$
where the action-angle variable integral is over one complete period of the motion. That is, they postulated
that Hamilton’s phase space is quantized, but the microscopic granularity is such that the quantization is
only manifest for atomic-sized domains. That is, $n_k$ is a small integer for atomic systems in contrast to
$n_k \approx 10^{74}$ for the Earth-Sun two-body system.
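As a numerical illustration of the Wilson-Sommerfeld rule (17.4), the following sketch evaluates the phase-space integral $\oint p\,dq$ for a one-dimensional harmonic oscillator. The oscillator is chosen here purely for illustration and is not taken from the text above; the function name `action_integral` and the unit choice $m = \omega = E = 1$ are assumptions. For this system the integral should approach $2\pi E/\omega$, so setting it equal to $nh$ gives $E = n\hbar\omega$.

```python
import numpy as np

def action_integral(mass, omega, energy, samples=200_001):
    """Approximate the closed integral of p dq over one period of a harmonic oscillator."""
    amplitude = np.sqrt(2.0 * energy / (mass * omega**2))
    q = np.linspace(-amplitude, amplitude, samples)
    # Momentum on the upper half of the phase-space ellipse, p = sqrt(2m(E - U(q))).
    p_squared = 2.0 * mass * (energy - 0.5 * mass * omega**2 * q**2)
    p = np.sqrt(np.clip(p_squared, 0.0, None))
    # Trapezoidal rule over the upper half; the lower half contributes equally.
    upper_half = np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(q))
    return 2.0 * upper_half

mass, omega, energy = 1.0, 1.0, 1.0   # arbitrary illustrative units (assumption)
numerical = action_integral(mass, omega, energy)
analytic = 2.0 * np.pi * energy / omega
print(numerical, analytic)
```

Dividing the result by Planck's constant $h$ gives the quantum number $n$ that the rule assigns to an orbit of energy $E$.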
486 CHAPTER 17. THE TRANSITION TO QUANTUM PHYSICS
Sommerfeld recognized that quantization of more than one degree of freedom is needed to obtain a more
accurate description of the hydrogen atom. Sommerfeld reproduced the experimental data by assuming
quantization of the three degrees of freedom,
$$\oint p_r\,dr = n_1 h \qquad \oint p_\theta\,d\theta = n_2 h \qquad \oint p_\phi\,d\phi = n_3 h \tag{17.5}$$
and solving Hamilton-Jacobi theory by separation of variables. In 1916 the Bohr-Sommerfeld model solved
the classical orbits for the hydrogen atom, including relativistic corrections as described in example 16.7.
This reproduced fine structure observed in the optical spectra of hydrogen. The use of the canonical trans-
formation to action-angle variables proved to be the ideal approach for solving many such problems in
quantum mechanics. In 1921 Stern and Gerlach demonstrated space quantization by observing the splitting
of atomic beams deflected by non-uniform magnetic fields. This result was a major triumph for quantum
theory. Sommerfeld declared that "With their bold experimental method, Stern and Gerlach demonstrated
not only the existence of space quantization, they also proved the atomic nature of the magnetic moment,
its quantum-theoretic origin, and its relation to the atomic structure of electricity."
In 1925 Pauli’s Exclusion Principle proposed that no more than one electron can have identical quantum
numbers and that the atomic electronic state is specified by four quantum numbers. Two students, Goudsmit
and Uhlenbeck, suggested that a fourth two-valued quantum number was the electron spin of $\pm\frac{\hbar}{2}$. This
provided an explanation for the structure of multi-electron atoms.
This relation, derived by de Broglie, is required to ensure that the particle travels at the group velocity
of the wave packet characterizing the particle. Note that although the relations used to characterize the
matter waves are purely classical, the physical content of such waves is beyond classical physics. In 1927 C.
Davisson and G.P. Thomson independently observed electron diffraction confirming wave/particle duality
for the electron. Ironically, J.J. Thomson discovered that the electron was a particle, while his son attributed
it to an electron wave.
Heisenberg developed the modern matrix formulation of quantum theory in 1925; he was 24 years old
at the time. A few months later Schrödinger developed wave mechanics based on de Broglie’s concept of
wave-particle duality. The matrix mechanics and wave mechanics formulations of quantum theory are radically
different. Heisenberg’s algebraic approach employs non-commuting quantities and unfamiliar mathematical
techniques that emphasize the discreteness characteristic of the corpuscle aspect. In contrast, Schrödinger
used the familiar analytical approach that is an extension of classical laws of motion and waves, which
stresses the element of continuity.
17.3. HAMILTONIAN IN QUANTUM THEORY 487
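gives the Hamiltonian function $H(\mathbf{p}, \mathbf{q})$ of the matrices $\mathbf{q}$ and $\mathbf{p}$ which leads to Hamilton’s canonical equations
$$\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}} \qquad \dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{q}} \tag{17.11}$$
The matrices satisfy the commutation relations
$$q_i p_j - p_j q_i = i\hbar\,\delta_{ij} \tag{17.12}$$
$$q_i q_j - q_j q_i = 0$$
$$p_i p_j - p_j p_i = 0$$
Born realized that equation (17.12) is the only fundamental equation for introducing $\hbar$ into the theory in a
logical and consistent way.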
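A minimal numerical sketch of relation (17.12) can be built from matrices truncated to a finite size. It assumes the standard harmonic-oscillator ladder-operator matrices in units with $\hbar = m = \omega = 1$, none of which appear in the text above; the names `lowering`, `raising`, `q`, `p` and `size` are illustrative. Because the infinite matrices are cut off, the commutator equals $i\hbar$ times the unit matrix everywhere except in the last diagonal element.

```python
import numpy as np

hbar, mass, omega = 1.0, 1.0, 1.0   # illustrative units, an assumption of this sketch
size = 6                            # the true matrices are infinite; truncate to size x size

# Lowering operator in the oscillator energy basis: <n-1| a |n> = sqrt(n).
lowering = np.diag(np.sqrt(np.arange(1, size)), k=1)
raising = lowering.T

q = np.sqrt(hbar / (2.0 * mass * omega)) * (lowering + raising)
p = 1j * np.sqrt(hbar * mass * omega / 2.0) * (raising - lowering)

commutator = q @ p - p @ q
# Away from the truncation edge the commutator is i*hbar times the unit matrix.
print(np.round(commutator, 12))
```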
Chapter 14.2.4 discussed the formal correspondence between the Poisson bracket, defined in chapter 14.3,
and the commutator in classical mechanics. It was shown that the commutator of two functions equals a
constant multiplicative factor times the corresponding Poisson Bracket. That is
Dirac recognized that the correspondence between the classical Poisson bracket and the quantum commutator
in equation (17.13) provides a logical and consistent way that builds quantization directly into the
theory, rather than using an ad-hoc, case-dependent hypothesis as used by the older quantum theory of
Bohr. The basis of Dirac’s quantization principle involves replacing the classical Poisson Bracket $[F, G]$
by the commutator $\frac{1}{i\hbar}(FG - GF)$. That is,
$$[F, G] \Longrightarrow \frac{1}{i\hbar}(FG - GF) \tag{17.17}$$
Hamilton’s canonical equations, as introduced in chapter 14, are only applicable to classical mechanics
since they assume that the position and conjugate momentum can be specified both exactly and
simultaneously, which contradicts Heisenberg’s Uncertainty Principle. In contrast, the Poisson bracket
generalization of Hamilton’s equations allows for non-commuting variables plus the corresponding uncertainty
principle. That is, the transformation from classical mechanics to quantum mechanics can be accomplished
simply by replacing the classical Poisson Bracket by the quantum commutator, as proposed by Dirac. The
formal analogy between classical Hamiltonian mechanics, and the Heisenberg representation of quantum
mechanics is strikingly apparent using the correspondence between the Poisson Bracket representation of
Hamiltonian mechanics and Heisenberg’s matrix mechanics.
The direct relation between the quantum commutator and the corresponding classical Poisson Bracket
can be applied to many observables. For example, the quantum analogs of Hamilton’s equations of motion
are given by use of Hamilton’s equations of motion, (14.53) and (14.56), and replacing each Poisson Bracket
by the corresponding commutator. That is
$$\dot{q}_k = \frac{\partial H}{\partial p_k} = [q_k, H] = \frac{1}{i\hbar}(q_k H - H q_k) \tag{17.18}$$
$$\dot{p}_k = -\frac{\partial H}{\partial q_k} = [p_k, H] = \frac{1}{i\hbar}(p_k H - H p_k) \tag{17.19}$$
Chapter 14.2.5 discussed the time dependence of observables in Hamiltonian mechanics. Equation (14.45)
gave the total time derivative of any observable $G$ to be
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{17.20}$$
Equation (17.17) can be used to replace the Poisson Bracket by the quantum commutator, which gives the
corresponding time dependence of observables in quantum physics.
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG) \tag{17.21}$$
In quantum mechanics, equation (17.21) is called the Heisenberg equation. Note that if the observable $G$ is
chosen to be a fundamental canonical variable, then $\frac{\partial q_k}{\partial t} = 0 = \frac{\partial p_k}{\partial t}$ and equation (17.21) reduces to Hamilton’s
equations (17.18) and (17.19). If $G$ is a constant of motion, that is $\frac{dG}{dt} = 0$, then
$$\frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG) = 0 \tag{17.22}$$
Moreover, if $G$ is not an explicit function of time, then
$$0 = \frac{1}{i\hbar}(GH - HG) \tag{17.23}$$
That is, the transition to quantum physics shows that, if $G$ is a constant of motion, and is not explicitly
time dependent, then $G$ commutes with the Hamiltonian $H$.
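Truncated harmonic-oscillator matrices give a hedged numerical check of equations (17.18), (17.19) and (17.23). The sketch assumes the oscillator Hamiltonian in its diagonal form $H = \hbar\omega(a^\dagger a + \frac{1}{2})$ with $\hbar = m = \omega = 1$, which is not part of the text above, and the names `heisenberg_rate`, `number` and `size` are illustrative. It compares $\frac{1}{i\hbar}(qH - Hq)$ with $p/m$, compares $\frac{1}{i\hbar}(pH - Hp)$ with $-m\omega^2 q$, and confirms that the number matrix $a^\dagger a$, a constant of motion, commutes with $H$.

```python
import numpy as np

hbar, mass, omega = 1.0, 1.0, 1.0   # illustrative units (assumption)
size = 6                            # truncation of the infinite matrices

lowering = np.diag(np.sqrt(np.arange(1, size)), k=1)
raising = lowering.T
number = raising @ lowering         # diagonal matrix with entries 0, 1, 2, ...

q = np.sqrt(hbar / (2.0 * mass * omega)) * (lowering + raising)
p = 1j * np.sqrt(hbar * mass * omega / 2.0) * (raising - lowering)
H = hbar * omega * (number + 0.5 * np.eye(size))

def heisenberg_rate(G):
    """Time derivative of an observable with no explicit time dependence, as in (17.21)."""
    return (G @ H - H @ G) / (1j * hbar)

q_dot = heisenberg_rate(q)            # should equal p / m, as in (17.18)
p_dot = heisenberg_rate(p)            # should equal -m omega^2 q, as in (17.19)
number_dot = heisenberg_rate(number)  # should vanish, as in (17.23)

print(np.allclose(q_dot, p / mass))
print(np.allclose(p_dot, -mass * omega**2 * q))
print(np.allclose(number_dot, 0.0))
```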
The above discussion has illustrated the close and beautiful correspondence between the Poisson Bracket
representation of classical Hamiltonian mechanics, and the Heisenberg representation of quantum mechanics.
Dirac provided the elegant and simple correspondence principle connecting the Poisson bracket representation
of classical Hamiltonian mechanics, to the Heisenberg representation of quantum mechanics.
where the action $S$ gives the phase of the wavefront, and $A$ the amplitude of the wave, as described in
chapter 14.4.4. The time dependence, that characterizes the motion of the wavefront, is contained in the
time dependence of $S$. This form for the wavefunction has the advantage that the wavefunction frequently
factors into a product of terms, e.g. $\psi = R(r)\Theta(\theta)\Phi(\phi)$, which corresponds to a summation of the exponents
$S = S_r + S_\theta + S_\phi - Et$. This summation form is exploited by separation of the variables, as discussed in
chapter 14.4.3.
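Insert (17.33) into equation (17.28) plus using the fact that
$$\frac{\partial^2 \psi}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial \psi}{\partial x}\right) = \frac{i}{\hbar}\frac{\partial}{\partial x}\left(\psi\frac{\partial S}{\partial x}\right) = -\frac{1}{\hbar^2}\left(\frac{\partial S}{\partial x}\right)^2\psi + \frac{i}{\hbar}\frac{\partial^2 S}{\partial x^2}\psi \tag{17.35}$$
leads to
$$-\frac{\partial S}{\partial t} = \frac{1}{2m}(\nabla S \cdot \nabla S) + U(\mathbf{r}) - \frac{i\hbar}{2m}\nabla^2 S \tag{17.36}$$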
Note that if Planck’s constant $\hbar = 0$ then the imaginary term in equation (17.36) is zero, leading to (17.36)
being real, and identical to the Hamilton-Jacobi result, equation 17.23. The fact that equation (17.36)
equals the Hamilton-Jacobi equation in the limit $\hbar \to 0$ illustrates the close analogy between the wave-
particle duality of the classical Hamilton-Jacobi theory, and de Broglie’s wave-particle duality in Schrödinger’s
quantum wave-mechanics representation.
The Schrödinger approach was rapidly adopted in 1926 and exploited extensively with tremendous success,
since it is much easier to grasp conceptually than is the algebraic approach of Heisenberg. Initially there
was much conflict between the proponents of these two contradictory approaches, but this was resolved by
Schrödinger who showed in 1926 that there is a formal mathematical identity between wave mechanics and
matrix mechanics. That is, these two quantal representations of Hamiltonian mechanics are equivalent, even
though they are built on either the Poisson bracket representation, or the Hamilton-Jacobi representation.
Wave mechanics is based intimately on the quantization rule of the action variable. Heisenberg’s Uncertainty
Principle is automatically satisfied by Schrödinger’s wave mechanics since the uncertainty principle is a
feature of all wave motion, as described in chapter 3.
In 1928 Dirac developed a relativistic wave equation which includes spin as an integral part. This Dirac
equation remains the fundamental wave equation of quantum mechanics. Unfortunately it is difficult to
apply.
Today the powerful and efficient Heisenberg representation is the dominant approach used in the field of
physics, whereas chemists tend to prefer the more intuitive Schrödinger wave mechanics approach. In either
case, the important role of Hamiltonian mechanics in quantum theory is undeniable.
The motivation for Feynman’s 1942 Ph.D. thesis, entitled "The Principle of Least Action in Quantum
Mechanics", was to quantize the classical action at a distance in electrodynamics. This theory adopted an
overall space-time viewpoint for which the classical Hamiltonian approach, as used in conventional formu-
lations of quantum mechanics, is inapplicable. Feynman used the Lagrangian, plus the principle of least
action, to underlie his development of quantum field theory. To paraphrase Feynman’s Nobel Lecture, he
used a physical approach that is quite different from the customary Hamiltonian point of view for which the
system is discussed in great detail as a function of time. That is, you have the field at this moment, then a
differential equation gives you the field at a later moment and so on; that is, the Hamiltonian approach is a
time differential method. In Feynman’s least-action approach the action describes the character of the path
throughout all of space and time. The behavior of nature is determined by saying that the whole space-time
path has a certain character. The use of action involves both advanced and retarded terms that make it
difficult to transform back to the Hamiltonian form. The Feynman space-time approach is far beyond the
scope of this course. This topic will be developed in advanced graduate courses on quantum field theory.
17.6 Summary
The important point of this discussion is that variational formulations of classical mechanics provide a
rational, and direct basis, for the development of quantum mechanics. It has been shown that the final form
of quantum mechanics is closely related to the Hamiltonian formulation of classical mechanics. Quantum
mechanics supersedes classical mechanics as the fundamental theory of mechanics in that classical mechanics
only applies for situations where quantization is unimportant, and is the limiting case of quantum mechanics
when $\hbar \to 0$, which is in agreement with Bohr’s Correspondence Principle. The Dirac relativistic theory
of quantum mechanics is the ultimate quantal theory for the relativistic regime.
This discussion has barely scratched the surface of the correspondence between classical and quantal
mechanics, which goes far beyond the scope of this course. The goal of this chapter is to illustrate that
classical mechanics, in particular, Hamiltonian mechanics, underlies much of what you will learn in your
quantum physics courses. An interesting similarity between quantum mechanics and classical mechanics is
that physicists usually use the more visual Schrödinger wave representation in order to describe quantum
physics to the non-expert, which is analogous to the similar use of Newtonian physics in classical mechan-
ics. However, practicing physicists invariably use the more abstract Heisenberg matrix mechanics to solve
problems in quantum mechanics, analogous to widespread use of the variational approach in classical me-
chanics, because the analytical approaches are more powerful and have fundamental advantages. Quantal
problems in molecular, atomic, nuclear, and subnuclear systems, usually involve finding the normal modes
of a quantal system, that is, finding the eigen-energies, eigen-functions, spin, parity, and other observables
for the discrete quantized levels. Solving the equations of motion for the modes of quantal systems is sim-
ilar to solving the many-body coupled-oscillator problem in classical mechanics, where it was shown that
use of matrix mechanics is the most powerful representation. It is ironic that the introduction of matrix
methods to classical mechanics is a by-product of the development of matrix mechanics by Heisenberg, Born
and Jordan. This illustrates that classical mechanics not only played a pivotal role in the development of
quantum mechanics, but it also has benefitted considerably from the development of quantum mechanics;
that is, the synergistic relation between these two complementary branches of physics has been beneficial to
both classical and quantum mechanics.
Recommended reading
"Quantum Mechanics" by P.A.M. Dirac, Oxford Press, 1947.
"Conceptual Development of Quantum Mechanics" by Max Jammer, McGraw-Hill, 1966.
Chapter 18
Epilogue
This book has introduced powerful analytical methods based on variational principles that play a pivotal
role in classical dynamics, as well as in many modern branches of science and engineering. The prologue
showed a road map of the pathways in advanced classical mechanics that have been explored in order to
introduce the reader to sophisticated and powerful new approaches to problem solving in science. In spite of
the considerable amount of material covered, there are major topics that had to be omitted, or mentioned
superficially.
This long and arduous study of classical mechanics has elucidated the remarkable developments, plus
their philosophical implications, implied by use of variational formulations in classical mechanics. This
approach was pioneered by Leibniz, Lagrange, Euler, Hamilton and Jacobi during the remarkable Age of
Enlightenment, and finally reached full fruition at the start of the 20th century. Philosophically, Newtonian
mechanics is straightforward in that it uses differential equations of motion that relate the instantaneous
forces with the instantaneous accelerations, while the concepts of momentum and force are intuitive to
visualize and both cause and effect are embedded in Newtonian mechanics. However, Newtonian mechanics is
incompatible with the relativistic concept of space-time, it is unable to correctly predict relativistic mechanics,
and it fails to provide the unified description of the gravitational force plus planetary motion as geodesic
motion in a four-dimensional Riemannian structure.
The philosophical implications embedded in applying variational principles to mechanics are remarkable.
The applicability of variational principles is based on the astonishing fact that motion of a constrained
system in nature follows a path that minimizes the action integral. As a consequence, solving the equations
of motion is reduced to finding the optimum path that minimizes the action integral. The fact that nature
follows optimization principles is nonintuitive, and was considered to be metaphysical by many scientists
and philosophers which delayed full acceptance of analytical mechanics until the development of the Theory
of Relativity. Variational formulations now are the preeminent approach to classical mechanics and modern
physics; they have toppled Newtonian mechanics from the throne of classical mechanics that it occupied for
two centuries. The importance of the variational approach to science and engineering justifies the trials and
tribulations endured learning this powerful approach.
This book has gone beyond the normal syllabus to glimpse how Lagrangian and Hamiltonian dynamics
provide the foundation upon which modern physics is built. It has illustrated that a solid foundation in
analytical mechanics is essential for the study of modern physics. The techniques and physics discussed in
this book reappear in new guises in many other courses, but the basic physics is unchanged. The fundamen-
tal developments and applications of variational principles in classical mechanics illustrate the intellectual
beauty, the tremendous philosophical implications, and the unity of the field of physics. The enormous
breadth of physics addressed by classical mechanics, and the underlying unity of the field, is epitomized
by the wide range of dimensions and complexity involved. The dimensions range from as large as $10^{27}$ m,
which is the current lower bound for the size of the universe derived from the Planck spacecraft, to quantal
analogues of classical mechanics of systems spanning in size down to the Planck length of $1.62 \times 10^{-35}$ m.
In complexity, classical mechanics spans from one body to the statistical mechanics of many-body systems.
Analytical variational methods have become the premier approach to describe systems from the very largest
to the smallest, and from one-body to many-body dynamical systems.
This book has illustrated the astonishing power of analytical variational methods for understanding the
physics underlying classical mechanics and many branches of modern physics. However, the present narrative
remains unfinished in that fundamental philosophical and technical questions remain to be solved in classical
mechanics. For example, analytical mechanics is based on the validity of the assumed principle of economy.
This book has not addressed the philosophical question, "is the principle of economy a fundamental law of
nature, or is it a fortuitous consequence of the fundamental laws of nature?"
Appendix A
Matrix algebra
A.2 Matrices
Matrix algebra provides an elegant and powerful representation of multivariate operators, and coordinate
transformations that feature prominently in classical mechanics. For example they play a pivotal role in
finding the eigenvalues and eigenfunctions for coupled equations that occur in rigid-body rotation, and
coupled oscillator systems. An understanding of the role of matrix mechanics in classical mechanics facilitates
understanding of the equally important role played by matrix mechanics in quantal physics.
It is interesting that although determinants were used by physicists in the late 19th century, the concept
of matrix algebra was developed by Arthur Cayley in England in 1855, but many of these ideas were the work
of Hamilton, and the discussion of matrix algebra was buried in a more general discussion of determinants.
Matrix algebra was an esoteric branch of mathematics, little known by the physics community, until 1925
when Heisenberg proposed his innovative new quantum theory. The striking feature of this new theory
was its representation of physical quantities by sets of time-dependent complex numbers and a peculiar
multiplication rule. Max Born recognized that Heisenberg’s multiplication rule is just the standard "row
times column" multiplication rule of matrix algebra; a topic that he had encountered as a young student in a
mathematics course. In 1924 Richard Courant had just completed the first volume of the new text Methods
of Mathematical Physics during which Pascual Jordan had served as his young assistant working on matrix
manipulation. Fortuitously, Jordan and Born happened to share a carriage on a train to Hanover during
which Jordan overheard Born talk about his problems trying to work with matrices. Jordan introduced
himself to Born and offered to help. This led to publication, in September 1925, of the famous Born-Jordan
paper[Bor25a] that gave the first rigorous formulation of matrix mechanics in physics. This was followed in
November by the Born-Heisenberg-Jordan sequel[Bor25b] that established a logically consistent general method
for solving matrix mechanics problems plus a connection between the mathematics of matrix mechanics and
linear algebra. Matrix algebra developed into an important tool in mathematics and physics during World
War II and now it is an integral part of undergraduate linear algebra courses.
Most applications of matrix algebra in this book are restricted to real, symmetric, square matrices. The
size of a matrix is defined by the rank, which equals the row rank and column rank, i.e. the number of
independent row vectors or column vectors in the square matrix. It is presumed that you have studied
matrices in a linear algebra course. Thus the goal of this review is to list simple manipulation of symmetric
matrices and matrix diagonalization that will be used in this course. You are referred to a linear algebra
textbook if you need further details.
Matrix definition
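A matrix is a rectangular array of numbers with $M$ rows and $N$ columns. The notation used for an element
of a matrix is $A_{ij}$ where $i$ designates the row and $j$ designates the column of this matrix element in the
matrix $\mathbf{A}$. Convention denotes a matrix $\mathbf{A}$ as
$$\mathbf{A} \equiv \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1(N-1)} & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2(N-1)} & A_{2N} \\ \vdots & \vdots & & \vdots & \vdots \\ A_{(M-1)1} & A_{(M-1)2} & \cdots & A_{(M-1)(N-1)} & A_{(M-1)N} \\ A_{M1} & A_{M2} & \cdots & A_{M(N-1)} & A_{MN} \end{pmatrix} \tag{A.1}$$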
Matrices can be square, $M = N$, or rectangular, $M \neq N$. Matrices having only one row or column are
called row or column vectors respectively, and need only a single subscript label. For example,
$$\mathbf{A} = \begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_{M-1} \\ A_M \end{pmatrix} \tag{A.2}$$
Matrix manipulation
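Matrices are defined to obey certain rules for matrix manipulation as given below.
1) Multiplication of a matrix by a scalar $\gamma$ simply multiplies each matrix element by $\gamma$
$$C_{ij} = \gamma A_{ij} \tag{A.3}$$
2) Addition of two matrices $\mathbf{A}$ and $\mathbf{B}$ having the same rank, i.e. the same number of rows and columns, is given by
$$C_{ij} = A_{ij} + B_{ij} \tag{A.4}$$
3) Multiplication of a matrix $\mathbf{A}$ by a matrix $\mathbf{B}$ is defined only if the number of columns in $\mathbf{A}$ equals the
number of rows in $\mathbf{B}$. The product matrix $\mathbf{C}$ is given by the matrix product
$$\mathbf{C} = \mathbf{A} \cdot \mathbf{B} \tag{A.5}$$
$$C_{ij} = [\mathbf{A}\mathbf{B}]_{ij} = \sum_k A_{ik} B_{kj} \tag{A.6}$$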
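For example, if both $\mathbf{A}$ and $\mathbf{B}$ are rank three symmetric matrices then
$$\mathbf{C} = \mathbf{A} \cdot \mathbf{B} = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{pmatrix}$$
$$= \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} + A_{13}B_{31} & A_{11}B_{12} + A_{12}B_{22} + A_{13}B_{32} & A_{11}B_{13} + A_{12}B_{23} + A_{13}B_{33} \\ A_{21}B_{11} + A_{22}B_{21} + A_{23}B_{31} & A_{21}B_{12} + A_{22}B_{22} + A_{23}B_{32} & A_{21}B_{13} + A_{22}B_{23} + A_{23}B_{33} \\ A_{31}B_{11} + A_{32}B_{21} + A_{33}B_{31} & A_{31}B_{12} + A_{32}B_{22} + A_{33}B_{32} & A_{31}B_{13} + A_{32}B_{23} + A_{33}B_{33} \end{pmatrix}$$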
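The multiplication rule (A.6) can be sketched directly as a triple loop and compared with a library routine. The function name `matrix_product` and the two symmetric matrices below are illustrative assumptions rather than values from the text. The last check shows that the product of two symmetric matrices need not itself be symmetric.

```python
import numpy as np

def matrix_product(A, B):
    """Form C = A . B element by element from C_ij = sum_k A_ik B_kj, equation (A.6)."""
    rows, inner = A.shape
    inner_b, cols = B.shape
    if inner != inner_b:
        raise ValueError("number of columns of A must equal number of rows of B")
    C = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                C[i, j] += A[i, k] * B[k, j]
    return C

# Two arbitrary rank three symmetric matrices (illustrative values).
A = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 5.0], [3.0, 5.0, 6.0]])
B = np.array([[2.0, 0.0, 1.0], [0.0, 3.0, -1.0], [1.0, -1.0, 4.0]])

C = matrix_product(A, B)
print(C)
print(np.allclose(C, A @ B))   # agreement with NumPy's built-in matrix product
print(np.allclose(C, C.T))     # whether the product is symmetric
```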
Transposed matrix $\mathbf{A}^T$
The transpose of a matrix $\mathbf{A}$ will be denoted by $\mathbf{A}^T$ and is given by interchanging rows and columns, that is
$$\left(\mathbf{A}^T\right)_{ij} = A_{ji} \tag{A.8}$$
The transpose of a column vector is a row vector. Note that older texts use the symbol à for the transpose.
Orthogonal matrix
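A matrix with real elements is orthogonal if
$$\mathbf{A}^T = \mathbf{A}^{-1} \tag{A.12}$$
That is
$$\sum_k \left(\mathbf{A}^T\right)_{ik} A_{kj} = \sum_k A_{ki} A_{kj} = \delta_{ij} \tag{A.13}$$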
Adjoint matrix A†
For a matrix with complex elements, the adjoint matrix, denoted by $\mathbf{A}^\dagger$, is defined as the transpose of the
complex conjugate
$$\left(\mathbf{A}^\dagger\right)_{ij} = A^*_{ji} \tag{A.14}$$
Hermitian matrix
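The Hermitian conjugate of a complex matrix $\mathbf{H}$ is denoted as $\mathbf{H}^\dagger$ and is defined as
$$\mathbf{H}^\dagger = \left(\mathbf{H}^T\right)^* = \left(\mathbf{H}^*\right)^T \tag{A.15}$$
Therefore
$$H^\dagger_{ij} = H^*_{ji} \tag{A.16}$$
A matrix is Hermitian if it is equal to its adjoint
$$\mathbf{H}^\dagger = \mathbf{H} \tag{A.17}$$
that is
$$H^\dagger_{ij} = H^*_{ji} = H_{ij} \tag{A.18}$$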
A matrix that is both Hermitian and has real elements is a symmetric matrix since complex conjugation has
no effect.
Unitary matrix
A matrix with complex elements is unitary if its inverse is equal to the adjoint matrix
$$\mathbf{U}^\dagger = \mathbf{U}^{-1} \tag{A.19}$$
which is equivalent to
$$\mathbf{U}^\dagger \mathbf{U} = \mathbf{I} \tag{A.20}$$
A unitary matrix with real elements is an orthogonal matrix as given in equation (A.12).
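The Hermitian and unitary conditions, (A.17) and (A.19) to (A.20), can be checked numerically. The sketch below assumes an arbitrary 2 × 2 Hermitian matrix chosen for illustration, and the helper name `adjoint` is hypothetical. It relies on NumPy's `numpy.linalg.eigh`, which returns the eigenvectors of a Hermitian matrix as the orthonormal columns of a matrix, so that matrix should be unitary.

```python
import numpy as np

def adjoint(M):
    """Adjoint (Hermitian conjugate): transpose of the complex conjugate."""
    return M.conj().T

# An arbitrary Hermitian matrix (illustrative values).
H = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])

is_hermitian = np.allclose(adjoint(H), H)

# Columns of U are orthonormal eigenvectors of H, so U should be unitary.
eigenvalues, U = np.linalg.eigh(H)
is_unitary = np.allclose(adjoint(U) @ U, np.eye(2))
inverse_is_adjoint = np.allclose(np.linalg.inv(U), adjoint(U))

print(is_hermitian, is_unitary, inverse_is_adjoint)
print(eigenvalues)   # eigenvalues of a Hermitian matrix are real
```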
The trace of a square matrix, denoted by $\mathrm{Tr}\,\mathbf{A}$, is defined as the sum of the diagonal matrix elements.
$$\mathrm{Tr}\,\mathbf{A} = \sum_{i=1}^{N} A_{ii} \tag{A.21}$$
Real vectors The generalization of the scalar (dot) product in Euclidean space is called the inner
product. Exploiting the rules of matrix multiplication requires taking the transpose of the first column vector
to form a row vector which then is multiplied by the second column vector using the conventional rules for
matrix multiplication. That is, for rank $N$ vectors
$$[\mathbf{X}] \cdot [\mathbf{Y}] = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix} \cdot \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix} = [\mathbf{X}]^T [\mathbf{Y}] = \begin{pmatrix} X_1 & X_2 & \cdots & X_N \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix} = \sum_{i=1}^{N} X_i Y_i \tag{A.22}$$
For rank $N = 3$ this inner product agrees with the conventional definition of the scalar product and gives a
result that is a scalar. For the special case when $[\mathbf{A}] \cdot [\mathbf{B}] = 0$ then the two matrices are called orthogonal.
The magnitude squared of a column vector is given by the inner product
$$[\mathbf{X}] \cdot [\mathbf{X}] = \sum_{i=1}^{N} (X_i)^2 \geq 0 \tag{A.23}$$
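Complex vectors For vectors having complex matrix elements the inner product is generalized to a form
that is consistent with equation (A.22) when the column vector matrix elements are real.
$$[\mathbf{X}]^* \cdot [\mathbf{Y}] = [\mathbf{X}]^\dagger [\mathbf{Y}] = \begin{pmatrix} X_1^* & X_2^* & \cdots & X_{N-1}^* & X_N^* \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_{N-1} \\ Y_N \end{pmatrix} = \sum_{i=1}^{N} X_i^* Y_i \tag{A.24}$$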
A.3 Determinants
Definition
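The determinant of a square matrix with $N$ rows equals a single number derived using the matrix elements
of the matrix. The determinant is denoted as $\det \mathbf{A}$ or $|\mathbf{A}|$ where
$$|\mathbf{A}| = \sum_{j_1, j_2, \ldots, j_N = 1}^{N} \varepsilon(j_1, j_2, \ldots, j_N)\, A_{1 j_1} A_{2 j_2} \cdots A_{N j_N} \tag{A.26}$$
where $\varepsilon(j_1, j_2, \ldots, j_N)$ is the permutation index, which is $+1$ or $-1$ depending on whether the number of
permutations required to go from the normal order $(1, 2, 3, \ldots, N)$ to the sequence $(j_1, j_2, j_3, \ldots, j_N)$ is even
or odd, and vanishes if any two indices are equal.
For example for $N = 3$ the determinant is
$$|\mathbf{A}| = A_{11}A_{22}A_{33} + A_{12}A_{23}A_{31} + A_{13}A_{21}A_{32} - A_{13}A_{22}A_{31} - A_{11}A_{23}A_{32} - A_{12}A_{21}A_{33} \tag{A.27}$$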
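The permutation expansion (A.26) translates almost directly into code. The following sketch assumes hypothetical helper names `permutation_sign` and `determinant_by_permutations`, uses zero-based indices, and takes an arbitrary 3 × 3 matrix as input. It is intended only to illustrate the definition, since the number of terms grows as $N!$, and it compares the result with NumPy's `numpy.linalg.det`.

```python
import itertools
import numpy as np

def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one, by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def determinant_by_permutations(A):
    """Sum over all permutations of sign * A[0, j0] * A[1, j1] * ... as in (A.26)."""
    n = A.shape[0]
    total = 0.0
    for perm in itertools.permutations(range(n)):
        term = float(permutation_sign(perm))
        for row, col in enumerate(perm):
            term *= A[row, col]
        total += term
    return total

# An arbitrary 3 x 3 matrix (illustrative values).
A = np.array([[2.0, 7.0, -4.0],
              [3.0, 1.0, -2.0],
              [-2.0, 0.0, 5.0]])

print(determinant_by_permutations(A), np.linalg.det(A))
```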
Properties
1. The value of a determinant $|\mathbf{A}| = 0$, if
3. The value of a determinant changes sign if any two rows, or any two columns, are interchanged.
4. Transposing a square matrix does not change its determinant. $\left|\mathbf{A}^T\right| = |\mathbf{A}|$
5. If any row (column) is multiplied by a constant factor then the value of the determinant is multiplied
by the same factor.
6. The determinant of a diagonal matrix equals the product of the diagonal matrix elements. That is,
when $A_{ij} = \lambda_i \delta_{ij}$ then $|\mathbf{A}| = \lambda_1 \lambda_2 \lambda_3 \cdots \lambda_N$
8. The determinant of the null matrix, for which all matrix elements are zero, |0| = 0
10. If each element of any row (column) appears as the sum (difference) of two or more quantities, then
the determinant can be written as a sum (difference) of two or more determinants of the same order.
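For example for order $N = 2$
$$\begin{vmatrix} A_{11} \pm B_{11} & A_{12} \pm B_{12} \\ A_{21} & A_{22} \end{vmatrix} = \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} \pm \begin{vmatrix} B_{11} & B_{12} \\ A_{21} & A_{22} \end{vmatrix}$$
11. The determinant of a matrix product equals the product of the determinants. That is, if $\mathbf{C} = \mathbf{A}\mathbf{B}$ then
$|\mathbf{C}| = |\mathbf{A}|\,|\mathbf{B}|$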
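Cofactors are used to expand the determinant of a square matrix in order to evaluate the determinant.
The elements of the inverse matrix are given by
$$A^{-1}_{ij} = \frac{1}{|\mathbf{A}|}\,\alpha_{ji} \tag{A.30}$$
where $\alpha_{ji}$ denotes the cofactor of the matrix element $A_{ji}$.
Equations (A.28) and (A.29) can be used to evaluate the $ij$ element of the matrix product $\left(\mathbf{A}^{-1}\mathbf{A}\right)$
$$\left(\mathbf{A}^{-1}\mathbf{A}\right)_{ij} = \sum_{k=1}^{N} A^{-1}_{ik} A_{kj} = \frac{1}{|\mathbf{A}|}\sum_{k=1}^{N} \alpha_{ki} A_{kj} = \frac{1}{|\mathbf{A}|}\,\delta_{ij}\,|\mathbf{A}| = \delta_{ij} = I_{ij} \tag{A.31}$$
$$\mathbf{A}^{-1} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}^{-1} = \frac{1}{|\mathbf{A}|}\begin{bmatrix} A & B & C \\ D & E & F \\ G & H & I \end{bmatrix}^{T} = \frac{1}{|\mathbf{A}|}\begin{bmatrix} A & D & G \\ B & E & H \\ C & F & I \end{bmatrix}$$
$$= \frac{1}{aA + bB + cC}\begin{bmatrix} A = (ei - fh) & D = -(bi - ch) & G = (bf - ce) \\ B = -(di - fg) & E = (ai - cg) & H = -(af - cd) \\ C = (dh - eg) & F = -(ah - bg) & I = (ae - bd) \end{bmatrix} \tag{A.33}$$
where the functions $A, B, \ldots, I$ are equal to rank 2 determinants listed in equation (A.33).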
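The cofactor construction of the inverse can be sketched for a matrix of any rank. The function names `cofactor_matrix` and `inverse_by_cofactors` are hypothetical, the test matrix is arbitrary, and each minor determinant is delegated to `numpy.linalg.det` for brevity. The inverse is the transposed cofactor matrix divided by the determinant, and the final line checks that $\mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$.

```python
import numpy as np

def cofactor_matrix(A):
    """Matrix of cofactors: signed determinants of the minors of A."""
    n = A.shape[0]
    cofactors = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # Delete row i and column j to form the minor.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cofactors[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cofactors

def inverse_by_cofactors(A):
    """Inverse as the transposed cofactor matrix divided by the determinant."""
    determinant = np.linalg.det(A)
    if np.isclose(determinant, 0.0):
        raise ValueError("matrix is singular and has no inverse")
    return cofactor_matrix(A).T / determinant

# An arbitrary non-singular 3 x 3 matrix (illustrative values).
A = np.array([[2.0, 7.0, -4.0],
              [3.0, 1.0, -2.0],
              [-2.0, 0.0, 5.0]])

A_inverse = inverse_by_cofactors(A)
print(np.allclose(A_inverse @ A, np.eye(3)))
```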
A.4. REDUCTION OF A MATRIX TO DIAGONAL FORM 501
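$$\mathbf{X}' = \mathbf{R} \cdot \mathbf{X} \qquad \mathbf{Y}' = \mathbf{R} \cdot \mathbf{Y} \tag{A.35}$$
$$\mathbf{R} \cdot (\mathbf{A} \cdot \mathbf{X}) = \mathbf{R} \cdot \mathbf{Y} \tag{A.36}$$
$$\mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^{-1} \cdot \mathbf{R} \cdot \mathbf{X} = \mathbf{R} \cdot \mathbf{Y} \tag{A.37}$$
$$\mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^{-1} \cdot \mathbf{X}' = \mathbf{A}' \cdot \mathbf{X}' = \mathbf{Y}' \tag{A.38}$$
using the fact that the identity matrix $\mathbf{I} = \mathbf{R} \cdot \mathbf{R}^{-1} = \mathbf{R} \cdot \mathbf{R}^T$ since the rotation matrix in $N$ dimensions is
orthogonal.
Thus we have that the rotated matrix
$$\mathbf{A}' = \mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^T \tag{A.39}$$
Let us assume that this transformed matrix is diagonal, then it can be written as the product of the unit
matrix $\mathbf{I}$ and a vector of scalar numbers $\lambda$ called the characteristic roots as
$$\mathbf{A}' = \mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^T = \lambda \mathbf{I} \tag{A.40}$$
or
$$\left[\lambda \mathbf{I} - \mathbf{A}'\right] \mathbf{X}' = 0 \tag{A.43}$$
This represents a set of $N$ homogeneous linear algebraic equations in $N$ unknowns $\mathbf{X}'$ where $\lambda$ is a set of
characteristic roots (eigenvalues) with corresponding eigenfunctions $\mathbf{X}'$. Ignoring the trivial case of $\mathbf{X}'$ being
zero, then (A.43) requires that the secular determinant of the bracket be zero, that is
$$\left|\lambda \mathbf{I} - \mathbf{A}'\right| = 0 \tag{A.44}$$
$$(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3) \cdots (\lambda - \lambda_N) = 0 \tag{A.45}$$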
eigenvalues are identical, then the reduction to a true diagonal form is not possible and one has the freedom
to select an appropriate eigenvector that is orthogonal to the remaining axes.
In summary, the matrix can only be fully diagonalized if (a) all the eigenvalues are distinct, (b) the real
matrix is symmetric, (c) it is unitary.
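A frequent application of matrices in classical mechanics is for solving a system of homogeneous linear
equations of the form
$$\begin{aligned} A_{11}x_1 + A_{12}x_2 + \cdots + A_{1N}x_N &= 0 \\ A_{21}x_1 + A_{22}x_2 + \cdots + A_{2N}x_N &= 0 \\ &\;\;\vdots \\ A_{N1}x_1 + A_{N2}x_2 + \cdots + A_{NN}x_N &= 0 \end{aligned} \tag{A.47}$$
Making the following definitions
$$\mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2N} \\ \vdots & \vdots & & \vdots \\ A_{N1} & A_{N2} & \cdots & A_{NN} \end{pmatrix} \tag{A.48}$$
$$\mathbf{X} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} \tag{A.49}$$
Then the set of linear equations can be written in a compact form using the matrices
$$\mathbf{A} \cdot \mathbf{X} = 0 \tag{A.50}$$
which can be solved using equation (A.43). Ensure that you are able to diagonalize matrices with rank
2 and 3. You can use Mathematica, Maple, MatLab, or other such mathematical computer programs to
diagonalize larger matrices.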
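This expands to
$$-\lambda(\lambda + 1)(\lambda - 1) = 0$$
Thus the three eigenvalues are $\lambda = -1, 0, 1$.
To find each eigenvector we substitute the corresponding eigenvalue into equation (A.48)
$$\begin{pmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$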
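This expands to
$$(1 - \lambda)(\lambda + 1)(\lambda - 1) = 0$$
Thus the three eigenvalues are $\lambda = -1, 1, 1$.
The eigenvectors are determined by substituting the corresponding eigenvalue into equation (A.42)
$$\begin{pmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
The eigenvalue $\lambda = -1$ yields $2x = 0$ and $y + z = 0$. Thus the eigenvector is $\mathbf{r}_1 = (0, \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}})$. The
eigenvalue $\lambda = 1$ yields $-y + z = 0$. The eigenvector $\mathbf{r}_2$ must be perpendicular to $\mathbf{r}_1$ and there are an infinite
number of choices. Let us assume that $\mathbf{r}_2 = (0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, which satisfies equation (A.50); then the eigenvector
$\mathbf{r}_3$ must be perpendicular to both $\mathbf{r}_1$ and $\mathbf{r}_2$. For rank three this is found using
$$\mathbf{r}_3 = \mathbf{r}_1 \times \mathbf{r}_2 = (1, 0, 0)$$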
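The worked example can be cross-checked numerically. The sketch below assumes the matrix whose secular determinant is $(1 - \lambda)(\lambda + 1)(\lambda - 1)$, namely rows $(1, 0, 0)$, $(0, 0, 1)$ and $(0, 1, 0)$, and uses NumPy's `numpy.linalg.eigh` for real symmetric matrices; the variable names are illustrative. Because the eigenvalue $\lambda = 1$ is degenerate, the library is free to return a different orthonormal pair of eigenvectors from the pair chosen above, so only quantities that are independent of that choice are compared.

```python
import numpy as np

# Real symmetric matrix assumed for this example (see the lead-in above).
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

eigenvalues, vectors = np.linalg.eigh(A)   # eigenvalues returned in ascending order

# Rows of R are the eigenvectors, so R is orthogonal and R A R^T is diagonal, as in (A.39).
R = vectors.T
A_diagonal = R @ A @ R.T

# Eigenvectors chosen by hand in the text.
r1 = np.array([0.0, 1.0, -1.0]) / np.sqrt(2.0)
r2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2.0)
r3 = np.cross(r1, r2)

print(eigenvalues)
print(np.round(A_diagonal, 12))
print(r3)
```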
Appendix B
Vector algebra
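$$\mathbf{a} \pm \mathbf{b} = \pm\mathbf{b} + \mathbf{a} \tag{B.2}$$
$$\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c}$$
$$\gamma(\mathbf{a} + \mathbf{b}) = \gamma\mathbf{a} + \gamma\mathbf{b}$$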
The manipulation of vectors is greatly facilitated by use of components along an orthogonal coordinate
system defined by three orthogonal unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$. For example the cartesian coordinate system
is defined by three unit vectors which, by convention, are called $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$.
where $\theta$ is the angle between the two vectors. It is a scalar and thus is independent of the orientation of
the coordinate axis system. Note that the scalar product commutes, is distributive, and associative with a
scalar multiplier, that is
Note that $\mathbf{a} \cdot \mathbf{a} = |\mathbf{a}|^2$ and if $\mathbf{a}$ and $\mathbf{b}$ are perpendicular then $\cos\theta = 0$ and thus $\mathbf{a} \cdot \mathbf{b} = 0$.
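If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, that is, they are orthogonal unit vectors,
then from equations (B.3) and (B.4)
$$\hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_k = \delta_{ik} \tag{B.5}$$
If $\hat{\mathbf{a}}$ is the unit vector for the vector $\mathbf{a}$ then the scalar product of a vector $\mathbf{a}$ with one of these unit vectors
$\hat{\mathbf{e}}_i$ gives the magnitude $|\mathbf{a}|$ times the cosine of the angle between the vector $\mathbf{a}$ and $\hat{\mathbf{e}}_i$, that is
$$\mathbf{a} \cdot \hat{\mathbf{e}}_1 = |\mathbf{a}|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_1) = |\mathbf{a}| \cos\alpha \tag{B.6}$$
$$\mathbf{a} \cdot \hat{\mathbf{e}}_2 = |\mathbf{a}|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_2) = |\mathbf{a}| \cos\beta$$
$$\mathbf{a} \cdot \hat{\mathbf{e}}_3 = |\mathbf{a}|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_3) = |\mathbf{a}| \cos\gamma$$
where the cosines are called the direction cosines since they define the direction of the vector $\mathbf{a}$ with respect
to each orthogonal basis unit vector. Moreover, $\mathbf{a} \cdot \hat{\mathbf{e}}_1 = |\mathbf{a}|\,\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_1 = |\mathbf{a}| \cos\alpha$ is the component of $\mathbf{a}$ along the
$\hat{\mathbf{e}}_1$ axis. Thus the three components of the vector $\mathbf{a}$ are fully defined by the magnitude $|\mathbf{a}|$ and the direction
cosines, corresponding to the angles $\alpha, \beta, \gamma$. That is,
$$a_1 = |\mathbf{a}|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_1) = |\mathbf{a}| \cos\alpha \tag{B.7}$$
$$a_2 = |\mathbf{a}|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_2) = |\mathbf{a}| \cos\beta$$
$$a_3 = |\mathbf{a}|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_3) = |\mathbf{a}| \cos\gamma$$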
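If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis then the vector is fully defined by
$$\mathbf{a} = a_1 \hat{\mathbf{e}}_1 + a_2 \hat{\mathbf{e}}_2 + a_3 \hat{\mathbf{e}}_3 \tag{B.8}$$
Consider two vectors
$$\mathbf{a} = a_1 \hat{\mathbf{e}}_1 + a_2 \hat{\mathbf{e}}_2 + a_3 \hat{\mathbf{e}}_3$$
$$\mathbf{b} = b_1 \hat{\mathbf{e}}_1 + b_2 \hat{\mathbf{e}}_2 + b_3 \hat{\mathbf{e}}_3$$
Then using (B.5)
$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3 = |\mathbf{a}|\,|\mathbf{b}| \cos\theta \tag{B.9}$$
where $\theta$ is the angle between the two vectors. In particular, since the direction cosine $\cos\alpha = \frac{a_1}{|\mathbf{a}|}$, then
equation (B.9) gives
$$\cos\theta = \cos\alpha_a \cos\alpha_b + \cos\beta_a \cos\beta_b + \cos\gamma_a \cos\gamma_b \tag{B.10}$$
Note that when $\theta = 0$ then (B.10) gives
$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1 \tag{B.11}$$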
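Relations (B.10) and (B.11) are easy to confirm numerically. In the sketch below the helper name `direction_cosines` is hypothetical, and the two vectors are borrowed from the workshop exercises at the end of this appendix. The direction cosines are simply the components divided by the magnitude, so their scalar product gives $\cos\theta$ and the sum of their squares for a single vector gives unity.

```python
import numpy as np

def direction_cosines(vector):
    """Return (cos alpha, cos beta, cos gamma), the components divided by the magnitude."""
    vector = np.asarray(vector, dtype=float)
    return vector / np.linalg.norm(vector)

a = np.array([3.0, 2.0, -9.0])
b = np.array([-2.0, 0.0, 3.0])

cosines_a = direction_cosines(a)
cosines_b = direction_cosines(b)

# Equation (B.10): cos(theta) built from the direction cosines of a and b.
cos_theta_from_cosines = np.dot(cosines_a, cosines_b)
# Equation (B.9): cos(theta) from the scalar product and the magnitudes.
cos_theta_direct = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
# Equation (B.11): the squared direction cosines of one vector sum to one.
sum_of_squares = np.sum(cosines_a**2)

print(cos_theta_from_cosines, cos_theta_direct, sum_of_squares)
```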
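where the (Levi-Civita) permutation symbol $\varepsilon_{ijk}$ has the following properties
$$\varepsilon_{ijk} = \begin{cases} 0 & \text{if an index is equal to any other index} \\ +1 & \text{if } i, j, k \text{ form an even permutation of } 1, 2, 3 \\ -1 & \text{if } i, j, k \text{ form an odd permutation of } 1, 2, 3 \end{cases} \tag{B.14}$$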
B.4. TRIPLE PRODUCTS 507
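For example, if the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, then $\hat{\mathbf{e}}_i\times\hat{\mathbf{e}}_j = \sum_k \varepsilon_{ijk}\hat{\mathbf{e}}_k$, i.e.
$\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3$, $\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3 = \hat{\mathbf{e}}_1$, $\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_1 = \hat{\mathbf{e}}_2$ (B.15)
$\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_1 = -\hat{\mathbf{e}}_3$, $\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_2 = -\hat{\mathbf{e}}_1$, $\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_3 = -\hat{\mathbf{e}}_2$ (B.16)
$\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_1 = 0$, $\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_2 = 0$, $\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3 = 0$ (B.17)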
a × b = −b × a (B.18)
a× (b + c) = a × b + a × c (B.19)
$(n\mathbf{a})\times\mathbf{b} = n(\mathbf{a}\times\mathbf{b})$ (B.20)
where $\theta$ is the angle between the two vectors and the determinant is evaluated for the top row. Examples of
vector products are torque N = r × F, angular momentum L = r × p, and the magnetic force $\mathbf{F} = q\mathbf{v}\times\mathbf{B}$.
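The permutation symbol described above lends itself to a direct implementation of the vector product. The sketch below assumes the standard component definition $(\mathbf{a}\times\mathbf{b})_i = \sum_{jk}\varepsilon_{ijk}a_jb_k$ and uses zero-based indices 0, 1, 2 in place of 1, 2, 3; the function names are illustrative.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices taken from {0, 1, 2}.

    The product (i-j)(j-k)(k-i) is +2 for even permutations, -2 for odd
    permutations, and 0 whenever two indices coincide.
    """
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """Vector product via (a x b)_i = sum_jk eps_ijk a_j b_k."""
    return tuple(
        sum(levi_civita(i, j, k) * a[j] * b[k]
            for j in range(3) for k in range(3))
        for i in range(3)
    )

e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Basis relations B.15 to B.17.
print(cross(e1, e2), cross(e2, e1), cross(e3, e3))

# Anticommutation, equation B.18, for two sample vectors (assumed values).
a, b = (3, 2, -9), (-2, 0, 3)
print(cross(a, b), cross(b, a))
```

Using integer components keeps every comparison exact, so the basis relations and the sign change under interchange can be tested with plain equality.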
a· (b × c) = c· (a × b) = b· (c × a) = (a × b) · c = −a· (c × b) (B.21)
That is, the scalar product is invariant to cyclic permutations of the three vectors but changes sign for
interchange of two vectors. The scalar triple product is unchanged by swapping the scalar (·) and vector (×) operations.
Because of the symmetry the scalar triple product can be denoted as [a b c] and
The scalar triple product can be written in terms of the components using a determinant
$[\mathbf{a}\,\mathbf{b}\,\mathbf{c}] = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$ (B.23)
a × (b × c) = (a · c) b − (a · b) c (B.24)
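The triple-product identities B.21, B.23 and B.24 can be confirmed with exact integer arithmetic. This is a minimal sketch; the vectors and the helper names `dot`, `cross` and `det3` are assumptions made for illustration.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors in cartesian components."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(rows):
    """Determinant of a 3x3 matrix, expanded along the top row."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

# Sample integer vectors (assumed values).
a, b, c = (1, 2, 3), (-1, 0, 2), (4, 1, -2)

triple = dot(a, cross(b, c))

# Equation B.21: invariance under cyclic permutation, sign change on interchange.
cyclic_ok = triple == dot(c, cross(a, b)) == dot(b, cross(c, a))
swap_ok = triple == -dot(a, cross(c, b))

# Equation B.23: the same number as the determinant of the components.
det_ok = triple == det3((a, b, c))

# Equation B.24: a x (b x c) = (a.c) b - (a.b) c.
lhs = cross(a, cross(b, c))
rhs = tuple(dot(a, c) * bi - dot(a, b) * ci for bi, ci in zip(b, c))

print(triple, cyclic_ok, swap_ok, det_ok, lhs == rhs)
```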
Workshop exercises
1. Partition the following exercises among the group. Once you have completed your problem, check with a
classmate before writing it on the board. After you have verified that you have found the correct solution,
write your answer in the space provided on the board, taking care to include the steps that you used to arrive
at your solution. The following information is needed.
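$\mathbf{a} = 3\mathbf{i} + 2\mathbf{j} - 9\mathbf{k}$, $\mathbf{b} = -2\mathbf{i} + 3\mathbf{k}$, $\mathbf{c} = -2\mathbf{i} + \mathbf{j} - 6\mathbf{k}$, $\mathbf{d} = \mathbf{i} + 9\mathbf{j} + 4\mathbf{k}$
$\mathbf{E} = \begin{pmatrix} 2 & 7 & -4 \\ 3 & 1 & -2 \\ -2 & 0 & 5 \end{pmatrix}$, $\mathbf{F} = \begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix}$, $\mathbf{G} = \begin{pmatrix} 2 & -4 \\ 7 & 1 \\ -1 & 1 \end{pmatrix}$, $\mathbf{H} = \begin{pmatrix} -8 & -1 & -3 \\ -4 & 2 & -2 \\ -1 & 0 & 0 \end{pmatrix}$
Calculate each of the following
1. $|\mathbf{a} - (\mathbf{b} + 3\mathbf{c})|$
2. Component of c along a
3. Angle between c and d
4. $(\mathbf{b}\times\mathbf{d})\cdot\mathbf{a}$
5. $(\mathbf{b}\times\mathbf{d})\times\mathbf{a}$
6. $\mathbf{b}\times(\mathbf{d}\times\mathbf{a})$
7. $(\mathbf{E}\mathbf{H})^T$
8. $|\mathbf{H}\mathbf{E}|$
9. $\mathbf{E}\mathbf{H}\mathbf{G}$
10. $\mathbf{E}\mathbf{G} - \mathbf{H}\mathbf{G}$
11. $\mathbf{E}\mathbf{H} - \mathbf{H}^T\mathbf{E}^T$
12. $\mathbf{F}^{-1}$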
Problems
[1] For what values of are the vectors A = 2̂ − 2̂ + ̂ and B = ̂ + 2̂ + 2̂ perpendicular?
Show also that the product is unaffected by interchange of the scalar and vector product operations or by change in
(A × B) · C = A · (B × C) = B · (C × A) =(C × A) · B
Therefore we may use the notation to denote the triple scalar product. Finally give a geometric interpre-
tation of by computing the volume of the parallelepiped defined by the three vectors A B C
Appendix C
The methods of vector analysis provide a convenient representation of physical laws. However, the manip-
ulation of scalar and vector fields is greatly facilitated by use of components with respect to an orthogonal
coordinate system.
= ( ) (C.1)
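$\mathbf{r} = x\hat{\mathbf{i}} + y\hat{\mathbf{j}} + z\hat{\mathbf{k}}$ (C.2)
Calculation of the time derivatives of the position vector is especially simple using cartesian coordinates
because the unit vectors $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$ are constant and independent of time. That is,
$\frac{d\hat{\mathbf{i}}}{dt} = \frac{d\hat{\mathbf{j}}}{dt} = \frac{d\hat{\mathbf{k}}}{dt} = 0$
Since the time derivatives of the unit vectors are all zero then the velocity $\dot{\mathbf{r}} = \frac{d\mathbf{r}}{dt}$ reduces to the time
derivatives of $x$, $y$ and $z$. That is,
$\dot{\mathbf{r}} = \dot{x}\hat{\mathbf{i}} + \dot{y}\hat{\mathbf{j}} + \dot{z}\hat{\mathbf{k}}$ (C.3)
Similarly the acceleration is given by
$\ddot{\mathbf{r}} = \ddot{x}\hat{\mathbf{i}} + \ddot{y}\hat{\mathbf{j}} + \ddot{z}\hat{\mathbf{k}}$ (C.4)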
510 APPENDIX C. ORTHOGONAL COORDINATE SYSTEMS
Curvilinear coordinate systems introduce a complication in that the unit vectors are time dependent in
contrast to cartesian coordinate system where the unit vectors (î ĵ k̂) are independent and constant in time.
The introduction of this time dependence warrants further discussion.
Each of the three axes $q_i$ in curvilinear coordinate systems can be expressed in cartesian coordinates
$(x, y, z)$ as surfaces of constant $q_i$ given by the function
$q_i = f_i(x, y, z)$ (C.5)
where $i = 1$, 2 or 3. An element of length $ds_i$ perpendicular to the surface $q_i$ is the distance between the
surfaces $q_i$ and $q_i + dq_i$ which can be expressed as
$ds_i = h_i\,dq_i$ (C.6)
where $h_i$ is a function of $(q_1, q_2, q_3)$. In cartesian coordinates $h_1$, $h_2$ and $h_3$ are all unity. The unit-length
vectors $\hat{\mathbf{q}}_1$, $\hat{\mathbf{q}}_2$, $\hat{\mathbf{q}}_3$ are perpendicular to the respective $q_1$, $q_2$, $q_3$ surfaces, and are oriented to have increasing
indices such that $\hat{\mathbf{q}}_1\times\hat{\mathbf{q}}_2 = \hat{\mathbf{q}}_3$. The correspondence of the curvilinear coordinates, unit vectors, and transform
coefficients to cartesian, polar, cylindrical and spherical coordinates is given in table 1.
$d\mathbf{s} = ds_1\hat{\mathbf{q}}_1 + ds_2\hat{\mathbf{q}}_2 + ds_3\hat{\mathbf{q}}_3 = h_1dq_1\hat{\mathbf{q}}_1 + h_2dq_2\hat{\mathbf{q}}_2 + h_3dq_3\hat{\mathbf{q}}_3$ (C.7)
$d\tau = ds_1ds_2ds_3 = h_1h_2h_3\,(dq_1dq_2dq_3)$ (C.8)
These are evaluated below for polar, cylindrical, and spherical coordinates.
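since the unit vector $\hat{\mathbf{r}}$ is a constant with $|\hat{\mathbf{r}}| = 1$. Note that the infinitesimal $d\hat{\mathbf{r}}$ is perpendicular to the unit
vector $\hat{\mathbf{r}}$, that is, $d\hat{\mathbf{r}}$ points in the tangential direction $\hat{\boldsymbol{\theta}}$.
Similarly, the infinitesimal
$d\hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}}_2 - \hat{\boldsymbol{\theta}}_1 = -d\theta\,\hat{\mathbf{r}}$ (C.10)
which is perpendicular to the tangential $\hat{\boldsymbol{\theta}}$ unit vector and therefore points in the direction $-\hat{\mathbf{r}}$. The minus
sign causes $-d\theta\,\hat{\mathbf{r}}$ to be directed in the opposite direction to $\hat{\mathbf{r}}$.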
C.2. CURVILINEAR COORDINATE SYSTEMS 511
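$\frac{d\hat{\mathbf{r}}}{dt} = \dot\theta\,\hat{\boldsymbol{\theta}}$ (C.12)
$\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\dot\theta\,\hat{\mathbf{r}}$ (C.13)
Note that the time derivatives of unit vectors are perpendicular to the corresponding unit vector, and the
unit vectors are coupled.
Consider that the velocity v is expressed as
$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d}{dt}(r\hat{\mathbf{r}}) = \dot{r}\hat{\mathbf{r}} + r\frac{d\hat{\mathbf{r}}}{dt} = \dot{r}\hat{\mathbf{r}} + r\dot\theta\,\hat{\boldsymbol{\theta}}$ (C.14)
The velocity is resolved into a radial component $\dot{r}$ and an angular, transverse, component $r\dot\theta$.
Similarly the acceleration is given by
$\mathbf{a} = \left(\ddot{r} - r\dot\theta^2\right)\hat{\mathbf{r}} + \left(r\ddot\theta + 2\dot{r}\dot\theta\right)\hat{\boldsymbol{\theta}}$ (C.15)
where the $r\dot\theta^2\,\hat{\mathbf{r}}$ term is the effective centripetal acceleration while the $2\dot{r}\dot\theta\,\hat{\boldsymbol{\theta}}$ term is called the Coriolis term.
For the case when $\dot{r} = \ddot{r} = 0$, then the first bracket in C.15 is the centripetal acceleration while the second
bracket is the tangential acceleration.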
This discussion has shown that in contrast to the time independence of the cartesian unit basis vectors,
the unit basis vectors for curvilinear coordinates are time dependent which leads to components of the velocity
and acceleration involving coupled coordinates.
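The polar form of the acceleration, with radial component $\ddot{r} - r\dot\theta^2$ and transverse component $r\ddot\theta + 2\dot{r}\dot\theta$, can be cross-checked against a finite-difference second derivative of the cartesian position. The sketch below assumes an arbitrary test trajectory $r(t) = 1 + t/2$, $\theta(t) = t^2$; the trajectory, the step size and the function names are illustrative only.

```python
import math

def r_of(t):
    """Assumed test trajectory: radius grows linearly in time."""
    return 1.0 + 0.5 * t

def theta_of(t):
    """Assumed test trajectory: angle grows quadratically in time."""
    return t * t

def position(t):
    """Cartesian position (x, y) of the point at time t."""
    r, th = r_of(t), theta_of(t)
    return (r * math.cos(th), r * math.sin(th))

def cartesian_acceleration(t, h=1e-4):
    """Central-difference estimate of (x'', y'')."""
    xm, ym = position(t - h)
    x0, y0 = position(t)
    xp, yp = position(t + h)
    return ((xp - 2.0 * x0 + xm) / h**2, (yp - 2.0 * y0 + ym) / h**2)

def polar_acceleration(t):
    """Radial and transverse acceleration from the polar formula."""
    r, r_dot, r_ddot = r_of(t), 0.5, 0.0
    th_dot, th_ddot = 2.0 * t, 2.0
    return (r_ddot - r * th_dot**2, r * th_ddot + 2.0 * r_dot * th_dot)

t = 0.7
ax, ay = cartesian_acceleration(t)
th = theta_of(t)

# Project the cartesian acceleration onto the local unit vectors r-hat, theta-hat.
a_r_numeric = ax * math.cos(th) + ay * math.sin(th)
a_theta_numeric = -ax * math.sin(th) + ay * math.cos(th)

a_r, a_theta = polar_acceleration(t)
print(a_r, a_r_numeric, a_theta, a_theta_numeric)
```

The projection step uses $\hat{\mathbf{r}} = \hat{\mathbf{i}}\cos\theta + \hat{\mathbf{j}}\sin\theta$ and $\hat{\boldsymbol{\theta}} = -\hat{\mathbf{i}}\sin\theta + \hat{\mathbf{j}}\cos\theta$, so agreement between the two routes tests the centripetal and Coriolis terms together.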
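Coordinates: $r, \theta$
Distance element: $d\mathbf{s} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}}$
Area element: $da = r\,dr\,d\theta$
Unit vectors: $\hat{\mathbf{r}} = \hat{\mathbf{i}}\cos\theta + \hat{\mathbf{j}}\sin\theta$
              $\hat{\boldsymbol{\theta}} = -\hat{\mathbf{i}}\sin\theta + \hat{\mathbf{j}}\cos\theta$
Time derivatives of unit vectors: $\frac{d\hat{\mathbf{r}}}{dt} = \dot\theta\,\hat{\boldsymbol{\theta}}$
                                  $\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\dot\theta\,\hat{\mathbf{r}}$
Velocity: $\mathbf{v} = \dot{r}\hat{\mathbf{r}} + r\dot\theta\,\hat{\boldsymbol{\theta}}$
Kinetic energy: $\frac{m}{2}\left(\dot{r}^2 + r^2\dot\theta^2\right)$
Acceleration: $\mathbf{a} = \left(\ddot{r} - r\dot\theta^2\right)\hat{\mathbf{r}} + \left(r\ddot\theta + 2\dot{r}\dot\theta\right)\hat{\boldsymbol{\theta}}$
Table 2: Differential relations plus a diagram of the unit vectors for 2-dimensional polar coordinates.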
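Coordinates: $\rho, \phi, z$
Distance element: $d\mathbf{s} = d\rho\,\hat{\boldsymbol{\rho}} + \rho\,d\phi\,\hat{\boldsymbol{\phi}} + dz\,\hat{\mathbf{z}}$
Volume element: $d\tau = \rho\,d\rho\,d\phi\,dz$
Unit vectors: $\hat{\boldsymbol{\rho}} = \hat{\mathbf{i}}\cos\phi + \hat{\mathbf{j}}\sin\phi$
              $\hat{\boldsymbol{\phi}} = -\hat{\mathbf{i}}\sin\phi + \hat{\mathbf{j}}\cos\phi$
              $\hat{\mathbf{z}} = \hat{\mathbf{k}}$
Time derivatives of unit vectors: $\frac{d\hat{\boldsymbol{\rho}}}{dt} = \dot\phi\,\hat{\boldsymbol{\phi}}$
                                  $\frac{d\hat{\boldsymbol{\phi}}}{dt} = -\dot\phi\,\hat{\boldsymbol{\rho}}$
                                  $\frac{d\hat{\mathbf{z}}}{dt} = 0$
Velocity: $\mathbf{v} = \dot\rho\,\hat{\boldsymbol{\rho}} + \rho\dot\phi\,\hat{\boldsymbol{\phi}} + \dot{z}\,\hat{\mathbf{z}}$
Kinetic energy: $\frac{m}{2}\left(\dot\rho^2 + \rho^2\dot\phi^2 + \dot{z}^2\right)$
Acceleration: $\mathbf{a} = \left(\ddot\rho - \rho\dot\phi^2\right)\hat{\boldsymbol{\rho}} + \left(\rho\ddot\phi + 2\dot\rho\dot\phi\right)\hat{\boldsymbol{\phi}} + \ddot{z}\,\hat{\mathbf{z}}$
Table 3: Differential relations plus a diagram of the unit vectors for cylindrical coordinates.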
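Coordinates: $r, \theta, \phi$
Distance element: $d\mathbf{s} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} + r\sin\theta\,d\phi\,\hat{\boldsymbol{\phi}}$
Volume element: $d\tau = r^2\sin\theta\,dr\,d\theta\,d\phi$
Unit vectors: $\hat{\mathbf{r}} = \hat{\mathbf{i}}\sin\theta\cos\phi + \hat{\mathbf{j}}\sin\theta\sin\phi + \hat{\mathbf{k}}\cos\theta$
              $\hat{\boldsymbol{\theta}} = \hat{\mathbf{i}}\cos\theta\cos\phi + \hat{\mathbf{j}}\cos\theta\sin\phi - \hat{\mathbf{k}}\sin\theta$
              $\hat{\boldsymbol{\phi}} = -\hat{\mathbf{i}}\sin\phi + \hat{\mathbf{j}}\cos\phi$
Time derivatives of unit vectors: $\frac{d\hat{\mathbf{r}}}{dt} = \hat{\boldsymbol{\theta}}\dot\theta + \hat{\boldsymbol{\phi}}\dot\phi\sin\theta$
                                  $\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\hat{\mathbf{r}}\dot\theta + \hat{\boldsymbol{\phi}}\dot\phi\cos\theta$
                                  $\frac{d\hat{\boldsymbol{\phi}}}{dt} = -\hat{\mathbf{r}}\dot\phi\sin\theta - \hat{\boldsymbol{\theta}}\dot\phi\cos\theta$
Velocity: $\mathbf{v} = \dot{r}\hat{\mathbf{r}} + r\dot\theta\,\hat{\boldsymbol{\theta}} + r\dot\phi\sin\theta\,\hat{\boldsymbol{\phi}}$
Kinetic energy: $\frac{m}{2}\left(\dot{r}^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2\right)$
Acceleration: $\mathbf{a} = \left(\ddot{r} - r\dot\theta^2 - r\dot\phi^2\sin^2\theta\right)\hat{\mathbf{r}}$
              $+ \left(r\ddot\theta + 2\dot{r}\dot\theta - r\dot\phi^2\sin\theta\cos\theta\right)\hat{\boldsymbol{\theta}}$
              $+ \left(r\ddot\phi\sin\theta + 2\dot{r}\dot\phi\sin\theta + 2r\dot\theta\dot\phi\cos\theta\right)\hat{\boldsymbol{\phi}}$
Table 4: Differential relations plus a diagram of the unit vectors for spherical coordinates.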
C.3. FRENET-SERRET COORDINATES 513
The distance and volume elements, the cartesian coordinate components of the spherical unit basis
vectors, and the unit vector time derivatives are shown in table 4. The time dependence
of the unit vectors is used to derive the acceleration. As for the case of cylindrical coordinates, the $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$ and
$\hat{\boldsymbol{\phi}}$ components of the acceleration involve coupling of the coordinates and their time derivatives.
It is important to note that the angular unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ are taken to be tangential to the circles of
rotation. However, for discussion of angular velocity or angular momentum it is more convenient to use the
axes of rotation defined by $\hat{\mathbf{r}}\times\hat{\boldsymbol{\theta}}$ and $\hat{\mathbf{r}}\times\hat{\boldsymbol{\phi}}$ for specifying the vector properties, which are perpendicular to
the unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$. Be careful not to confuse the unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ with those used for the angular
velocities $\dot\theta$ and $\dot\phi$.
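$\frac{d\hat{\mathbf{t}}}{ds} = \kappa\,\hat{\mathbf{n}}$ (C.16)
$\frac{d\hat{\mathbf{b}}}{ds} = -\tau\,\hat{\mathbf{n}}$ (C.17)
$\frac{d\hat{\mathbf{n}}}{ds} = -\kappa\,\hat{\mathbf{t}} + \tau\,\hat{\mathbf{b}}$ (C.18)
The curvature $\kappa = \frac{1}{\rho}$ where $\rho$ is the radius of curvature and $\tau$ is the torsion that can be either positive
or negative. For increasing $s$ a non-zero curvature $\kappa$ implies that the triad of unit vectors rotates in a
right-handed sense about $\hat{\mathbf{b}}$. If the torsion $\tau$ is positive (negative) the triad of unit vectors rotates in a right
(left) handed sense about $\hat{\mathbf{t}}$.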
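Distance element: $d\mathbf{s}(t) = \hat{\mathbf{t}}\left|\frac{d\mathbf{r}(t)}{dt}\right|dt = \hat{\mathbf{t}}\,v(t)\,dt$
Unit vectors: $\hat{\mathbf{t}}(t) = \frac{\mathbf{v}(t)}{|\mathbf{v}(t)|}$
              $\hat{\mathbf{n}}(t) = \frac{d\hat{\mathbf{t}}/dt}{\left|d\hat{\mathbf{t}}/dt\right|}$
              $\hat{\mathbf{b}}(t) = \hat{\mathbf{t}}\times\hat{\mathbf{n}}$
Time derivatives of unit vectors: $\frac{d}{dt}\begin{pmatrix}\hat{\mathbf{t}} \\ \hat{\mathbf{n}} \\ \hat{\mathbf{b}}\end{pmatrix} = |\mathbf{v}|\begin{pmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \tau \\ 0 & -\tau & 0 \end{pmatrix}\begin{pmatrix}\hat{\mathbf{t}} \\ \hat{\mathbf{n}} \\ \hat{\mathbf{b}}\end{pmatrix}$
Velocity: $\mathbf{v}(t) = \frac{d\mathbf{r}(t)}{dt}$
Acceleration: $\mathbf{a}(t) = \frac{dv}{dt}\hat{\mathbf{t}} + \kappa v^2\,\hat{\mathbf{n}}$
Table 5. The differential relations plus a diagram of the corresponding unit vectors for the Frenet-Serret
coordinate system.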
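The above equations also can be rewritten in the form using a new rotation vector $\boldsymbol{\omega}$ where
$\boldsymbol{\omega} = \tau\,\hat{\mathbf{t}} + \kappa\,\hat{\mathbf{b}}$ (C.19)
$\frac{d\hat{\mathbf{t}}}{ds} = \boldsymbol{\omega}\times\hat{\mathbf{t}}$ (C.20)
$\frac{d\hat{\mathbf{n}}}{ds} = \boldsymbol{\omega}\times\hat{\mathbf{n}}$ (C.21)
$\frac{d\hat{\mathbf{b}}}{ds} = \boldsymbol{\omega}\times\hat{\mathbf{b}}$ (C.22)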
In general the Frenet-Serret unit vectors are time dependent. If the curvature $\kappa = 0$ then the curve is a
straight line and n̂ and b̂ are not well defined. If the torsion is zero then the trajectory lies in a plane. Note
that a helix has constant curvature and constant torsion.
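The statement that a helix has constant curvature and constant torsion can be illustrated numerically. The sketch below assumes a helix $\mathbf{r}(t) = (R\cos t, R\sin t, ct)$ and uses the standard expressions $\kappa = |\mathbf{r}'\times\mathbf{r}''|/|\mathbf{r}'|^3$ and $\tau = (\mathbf{r}'\times\mathbf{r}'')\cdot\mathbf{r}'''/|\mathbf{r}'\times\mathbf{r}''|^2$, which are not quoted in the text; the parameter values and function names are illustrative.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def helix_derivatives(t, radius=2.0, pitch=1.0):
    """First three t-derivatives of r(t) = (R cos t, R sin t, c t)."""
    d1 = (-radius * math.sin(t), radius * math.cos(t), pitch)
    d2 = (-radius * math.cos(t), -radius * math.sin(t), 0.0)
    d3 = (radius * math.sin(t), -radius * math.cos(t), 0.0)
    return d1, d2, d3

def curvature_and_torsion(t, radius=2.0, pitch=1.0):
    """Curvature and torsion from the derivative formulas named in the lead-in."""
    d1, d2, d3 = helix_derivatives(t, radius, pitch)
    w = cross(d1, d2)
    kappa = math.sqrt(dot(w, w)) / math.sqrt(dot(d1, d1)) ** 3
    tau = dot(w, d3) / dot(w, w)
    return kappa, tau

# Sample several points along the curve; the values should not change.
samples = [curvature_and_torsion(t) for t in (0.0, 0.5, 1.7, 3.0)]
print(samples)
```

For this helix the closed forms are $\kappa = R/(R^2 + c^2)$ and $\tau = c/(R^2 + c^2)$, so with $R = 2$ and $c = 1$ every sample should return values close to 0.4 and 0.2.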
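The rate of change of a general vector field E along the trajectory can be written as
$\frac{d\mathbf{E}}{ds} = \left(\frac{dE_t}{ds}\hat{\mathbf{t}} + \frac{dE_n}{ds}\hat{\mathbf{n}} + \frac{dE_b}{ds}\hat{\mathbf{b}}\right) + \boldsymbol{\omega}\times\mathbf{E}$ (C.23)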
The Frenet-Serret coordinates are used in the life sciences to describe the motion of a moving organism
in a viscous medium. The Frenet-Serret coordinates also have applications to General Relativity.
Workshop exercises
1. The goal of this problem is to help you understand the origin of the equations that relate two different coordinate
systems. Refer to diagrams for cylindrical and spherical coordinates as your teaching assistant explains how to
arrive at expressions for $x_1$, $x_2$ and $x_3$ in terms of $\rho$, $\phi$ and $z$, and how to derive expressions for the velocity and
acceleration vectors in cylindrical coordinates. Now try to relate spherical and rectangular coordinate systems.
Your group should derive expressions relating the coordinates of the two systems, expressions relating the unit
vectors and their time derivatives of the two systems, and finally, expressions for the velocity and acceleration
in spherical coordinates.
Appendix D
Coordinate transformations
Coordinate systems can be translated, or rotated with respect to each other as well as being subject to spatial
inversion or time reversal. Scalars, vectors, and tensors are defined by their transformation properties under
rotation, spatial inversion and time reversal, and thus such transformations play a pivotal role in physics.
The velocities for a moving frame are given by the vector difference of the velocity in a stationary frame,
and the velocity of the origin of the moving frame. Linear accelerations can be handled similarly.
origins of both frames coincide. Rotation of a frame does not change the vector, only the vector components
of the unit basis states. Therefore
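$\mathbf{x} = \hat{\mathbf{e}}'_1x'_1 + \hat{\mathbf{e}}'_2x'_2 + \hat{\mathbf{e}}'_3x'_3 = \hat{\mathbf{e}}_1x_1 + \hat{\mathbf{e}}_2x_2 + \hat{\mathbf{e}}_3x_3$ (D.3)
Note that if one designates that the unit vectors for the unprimed coordinate frame are $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ and for
the primed coordinate frame $(\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3)$ then taking the scalar product of equation D.3 sequentially with
each of the unit base vectors $(\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3)$ leads to the following three relations
$x'_1 = (\hat{\mathbf{e}}'_1\cdot\hat{\mathbf{e}}_1)x_1 + (\hat{\mathbf{e}}'_1\cdot\hat{\mathbf{e}}_2)x_2 + (\hat{\mathbf{e}}'_1\cdot\hat{\mathbf{e}}_3)x_3$ (D.4)
$x'_2 = (\hat{\mathbf{e}}'_2\cdot\hat{\mathbf{e}}_1)x_1 + (\hat{\mathbf{e}}'_2\cdot\hat{\mathbf{e}}_2)x_2 + (\hat{\mathbf{e}}'_2\cdot\hat{\mathbf{e}}_3)x_3$
$x'_3 = (\hat{\mathbf{e}}'_3\cdot\hat{\mathbf{e}}_1)x_1 + (\hat{\mathbf{e}}'_3\cdot\hat{\mathbf{e}}_2)x_2 + (\hat{\mathbf{e}}'_3\cdot\hat{\mathbf{e}}_3)x_3$
Note that the $(\hat{\mathbf{e}}'_i\cdot\hat{\mathbf{e}}_j)$ are the direction cosines as defined by the scalar product of two unit vectors for axes
$i, j$, that is, they are the cosine of the angle between the two unit vectors.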
Equation 4 can be written in matrix form as
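$\mathbf{x}' = \boldsymbol{\lambda}\cdot\mathbf{x}$ (D.5)
where the "·" means the inner matrix product of the rotation matrix $\boldsymbol{\lambda}$ and the vector x where
$\mathbf{x}' \equiv \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix}$, $\mathbf{x} \equiv \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$, $\boldsymbol{\lambda} \equiv \begin{pmatrix} \hat{\mathbf{e}}'_1\cdot\hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_1\cdot\hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_1\cdot\hat{\mathbf{e}}_3 \\ \hat{\mathbf{e}}'_2\cdot\hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_2\cdot\hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_2\cdot\hat{\mathbf{e}}_3 \\ \hat{\mathbf{e}}'_3\cdot\hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_3\cdot\hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_3\cdot\hat{\mathbf{e}}_3 \end{pmatrix}$ (D.6)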
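The inverse procedure is obtained by taking the scalar product of equation D.3 successively with each of the unit basis
vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ leading to three equations
$x_1 = (\hat{\mathbf{e}}_1\cdot\hat{\mathbf{e}}'_1)x'_1 + (\hat{\mathbf{e}}_1\cdot\hat{\mathbf{e}}'_2)x'_2 + (\hat{\mathbf{e}}_1\cdot\hat{\mathbf{e}}'_3)x'_3$ (D.7)
$x_2 = (\hat{\mathbf{e}}_2\cdot\hat{\mathbf{e}}'_1)x'_1 + (\hat{\mathbf{e}}_2\cdot\hat{\mathbf{e}}'_2)x'_2 + (\hat{\mathbf{e}}_2\cdot\hat{\mathbf{e}}'_3)x'_3$
$x_3 = (\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}'_1)x'_1 + (\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}'_2)x'_2 + (\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}'_3)x'_3$
$\mathbf{x} = \boldsymbol{\lambda}^T\cdot\mathbf{x}'$ (D.8)
Thus
$\boldsymbol{\lambda}^T\cdot\boldsymbol{\lambda} = \mathbf{I}$
where I is the identity matrix. This implies that the rotation matrix $\boldsymbol{\lambda}$ is orthogonal with $\boldsymbol{\lambda}^T = \boldsymbol{\lambda}^{-1}$.
It is convenient to rename the elements of the rotation matrix to be
$\lambda_{ij} \equiv \hat{\mathbf{e}}'_i\cdot\hat{\mathbf{e}}_j$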
Consider an arbitrary rotation through an angle $\theta$. Equations (D.10) and (D.11) can be used to relate
six of the nine quantities in the rotation matrix, so only three of the quantities are independent. That
is, because of equation (D.11) we have three equations which ensure that the transformation is unitary.
The fact that the rotation matrix should have three independent quantities is due to the fact that all rotations
can be expressed in terms of rotations about three orthogonal axes.
$i$   $j$   $\theta_{ij}$ (degrees)   $\lambda_{ij} = \cos(\theta_{ij})$
1   1   0          1
1   2   90         0
1   3   90         0
2   1   90         0
2   2   60         0.500
2   3   90 − 60    0.866
3   1   90         0
3   2   90 + 60    −0.866
3   3   60         0.500
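The tabulated direction cosines can be assembled into a rotation matrix and tested for orthogonality. The sketch below takes the nine angles from the table (a 60° rotation about the 1-axis), builds $\lambda_{ij} = \cos\theta_{ij}$, and checks $\boldsymbol{\lambda}\cdot\boldsymbol{\lambda}^T = \mathbf{I}$ and $|\boldsymbol{\lambda}| = +1$; the helper names are illustrative.

```python
import math

# Angles theta_ij in degrees between primed axis i and unprimed axis j,
# copied from the table: 90 - 60 = 30 and 90 + 60 = 150.
angles_deg = [
    [0.0, 90.0, 90.0],
    [90.0, 60.0, 30.0],
    [90.0, 150.0, 60.0],
]

# Rotation matrix of direction cosines, lambda_ij = cos(theta_ij).
lam = [[math.cos(math.radians(angle)) for angle in row] for row in angles_deg]

def transpose(m):
    """Transpose of a square matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    """Matrix product of two 3x3 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix, expanded along the top row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

product = matmul(lam, transpose(lam))   # should be the identity matrix
determinant = det3(lam)                 # +1 for a proper rotation

for row in product:
    print([round(value, 12) for value in row])
print(round(determinant, 12))
```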
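$\boldsymbol{\lambda}_{BA} = \boldsymbol{\lambda}_A\boldsymbol{\lambda}_B$ (D.19)
That is:
$\boldsymbol{\lambda}_{BA} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix} \neq \boldsymbol{\lambda}_{AB}$ (D.20)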
An entirely different orientation results as illustrated in figure 1.
This behavior of finite rotations is a consequence of the fact that finite rotations do not commute, that
is, reversing the order does not give the same answer. Thus, if we associate the vectors A and B with
these rotations, then it implies that the vector product AB ≠ BA. That is, for finite rotation matrices, the
product does not behave like that for true vectors since they do not commute.
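The non-commutation of finite rotations can be seen directly by multiplying the two 90° rotation matrices that appear in equation D.20 in both orders. This is a minimal sketch; the names `first`, `second` and `matmul` are illustrative labels for the two matrices and the product routine.

```python
def matmul(p, q):
    """Matrix product of two 3x3 matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# The two matrices multiplied in equation D.20, in the order written there.
first = [[0, 1, 0],
         [-1, 0, 0],
         [0, 0, 1]]
second = [[1, 0, 0],
          [0, 0, 1],
          [0, -1, 0]]

product_as_written = matmul(first, second)   # the product shown in D.20
product_reversed = matmul(second, first)     # the same rotations, opposite order

print(product_as_written)
print(product_reversed)
print(product_as_written != product_reversed)
```

Because the entries are integers the comparison is exact: the two orderings give different matrices, which is the algebraic content of the different final orientations.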
D.2. ROTATIONAL TRANSFORMATIONS 519
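$d\mathbf{r} = d\boldsymbol{\theta}\times\mathbf{r}$ (D.21)
$d\mathbf{r}_1 = d\boldsymbol{\theta}_1\times\mathbf{r}$ (D.22)
and
$d\mathbf{r}_2 = d\boldsymbol{\theta}_2\times(\mathbf{r} + d\mathbf{r}_1)$ (D.23)
Thus the final position vector for $d\boldsymbol{\theta}_1$ followed by $d\boldsymbol{\theta}_2$ is
Note that the products of these two infinitesimal rotations, D.25 and D.27, are identical. That is, assuming
that second-order infinitesimals can be neglected, then the infinitesimal rotations commute, and thus $d\boldsymbol{\theta}_1$
and $d\boldsymbol{\theta}_2$ are correctly represented by vectors.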
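The claim that infinitesimal rotations commute once second-order terms are dropped can be illustrated by comparing the two orderings of a rotation about the 1-axis and a rotation about the 3-axis. In this sketch the angle values, the sign convention of the matrices and the helper names are assumptions made for illustration.

```python
import math

def rotation_about_1(angle):
    """Rotation matrix for a rotation by angle about the 1-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def rotation_about_3(angle):
    """Rotation matrix for a rotation by angle about the 3-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(p, q):
    """Matrix product of two 3x3 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def order_mismatch(angle):
    """Largest element of R1 R3 - R3 R1 for equal rotation angles."""
    ab = matmul(rotation_about_1(angle), rotation_about_3(angle))
    ba = matmul(rotation_about_3(angle), rotation_about_1(angle))
    return max(abs(ab[i][j] - ba[i][j]) for i in range(3) for j in range(3))

epsilon = 1e-3
small = order_mismatch(epsilon)        # of order epsilon squared
large = order_mismatch(math.pi / 2.0)  # of order one for finite rotations

print(small, large)
```

The mismatch for the small angle scales as the square of the angle, so it vanishes to first order, whereas for 90° rotations it is of order unity.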
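The fact that $d\boldsymbol{\theta}$ is a vector allows angular velocity to be represented by a vector. That is, angular
velocity is the ratio of an infinitesimal rotation to an infinitesimal time.
$\boldsymbol{\omega} = \frac{d\boldsymbol{\theta}}{dt}$ (D.28)
Note that this implies that the velocity of the point can be expressed as
$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d\boldsymbol{\theta}}{dt}\times\mathbf{r} = \boldsymbol{\omega}\times\mathbf{r}$ (D.29)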
It was shown in equation D.12 that, for such an orthogonal matrix, the inverse matrix $\boldsymbol{\lambda}^{-1}$ equals the
transposed matrix $\boldsymbol{\lambda}^T$
$\boldsymbol{\lambda}^{-1} = \boldsymbol{\lambda}^T$
Inserting the orthogonality relation for the rotation matrix leads to the fact that the square of the determinant
of the rotation matrix equals one,
$|\boldsymbol{\lambda}|^2 = 1$ (D.31)
that is
$|\boldsymbol{\lambda}| = \pm 1$ (D.32)
A proper rotation is the rotation of a normal vector and has
$|\boldsymbol{\lambda}| = +1$ (D.33)
For all proper rotations the determinant of $\boldsymbol{\lambda}$ is +1 and thus the cross product also acts like a proper vector
under rotation. This is not true for improper rotations where $|\boldsymbol{\lambda}| = -1$.
x 1‘
A() = −A(−) (D.37)
B() = −B(−) x2 x ‘2
C = A × B (D.39)
C = B × A = −A × B (D.40)
D.4. TIME REVERSAL TRANSFORMATION 521
That is, handedness corresponds to a definite ordering of the cross product. Proper orthogonal transforma-
tions are said to preserve chirality (Greek for handedness) of a coordinate system.
An example of the use of the right-handed system is the usual definition of cartesian unit vectors,
$\hat{\mathbf{i}}\times\hat{\mathbf{j}} = \hat{\mathbf{k}}$ (D.41)
An obvious question to be asked, is the handedness of a coordinate system merely a mathematical curiosity
or does it have some deep underlying significance? Consider the Lorentz force
$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$ (D.42)
Since force and velocity are proper vectors then the magnetic B field must be a pseudo vector. Note that
calculation of the B field occurs only in cross products such as,
$\nabla\times\mathbf{B} = \mu_0\mathbf{j}$ (D.43)
where the current density j is a proper vector. Another example is the Biot-Savart Law which expresses B
as
$d\mathbf{B} = \frac{\mu_0 I}{4\pi}\frac{d\mathbf{l}\times\hat{\mathbf{r}}}{r^2}$ (D.44)
Thus even though B is a pseudo vector, the force F remains a proper vector. Thus if a left-handed coordinate
definition of $d\mathbf{B} = \frac{\mu_0 I}{4\pi}\frac{\hat{\mathbf{r}}\times d\mathbf{l}}{r^2}$ is used in D.44, and $\mathbf{F} = q(\mathbf{E} + \mathbf{B}\times\mathbf{v})$ in D.42, then the same final physical
result would be obtained.
It was long thought that the laws of physics were symmetric with respect to spatial inversion (i.e. mirror
reflection), meaning that the choice between left-handed and right-handed representations (chirality) was
arbitrary. This is true for the gravitational, electromagnetic and strong forces, and is called the conservation
of parity. The fourth fundamental force in nature, the weak force, violates parity and favours handedness.
It turns out that right-handed ordinary matter is symmetrical with left-handed antimatter.
In addition to the two flavours of vectors, one has scalars and pseudoscalars defined by:
are invariant under time reversal. Since the force can be expressed as the gradient of a scalar potential for
a conservative field, then the potential also remains unchanged. That is
$\frac{d\mathbf{p}}{dt} = -\nabla U(\mathbf{r}) = \mathbf{F}$ (D.47)
It is necessary to introduce tensor algebra, given in appendix E, prior to discussion of the transformation
properties of observables which is the topic of appendix 5.
Workshop exercises
1. Suppose the $x_2$-axis of a rectangular coordinate system is rotated by 30° away from the $x_3$-axis around the
$x_1$-axis.
(a) Find the corresponding transformation matrix. Try to do this by drawing a diagram instead of going to
the book or the notes for a formula.
(b) Is this an orthogonal matrix? If so, show that it satisfies the main properties of an orthogonal matrix. If
not, explain why it fails to be orthogonal.
(c) Does this matrix represent a proper or an improper rotation? How do you know?
2. When you were first introduced to vectors, you most likely were told that a scalar is a quantity that is defined
by a magnitude, while a vector has both a magnitude and a direction. While this is certainly true, there is
another, more sophisticated way to define a scalar quantity and a vector quantity: through their transformation
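properties. A scalar quantity transforms as $\phi' = \phi$ while a vector quantity transforms as $A'_i = \sum_j\lambda_{ij}A_j$. To
show that the scalar product does indeed transform as a scalar, note that:
$\mathbf{A}'\cdot\mathbf{B}' = \sum_i A'_iB'_i = \sum_i\left(\sum_j\lambda_{ij}A_j\right)\left(\sum_k\lambda_{ik}B_k\right) = \sum_j\sum_k\left(\sum_i\lambda_{ij}\lambda_{ik}\right)A_jB_k$
$= \sum_j\sum_k\delta_{jk}A_jB_k = \sum_j A_jB_j = \mathbf{A}\cdot\mathbf{B}$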
Now you will show that the vector product transforms as a vector. Begin by writing out what you are trying
to show explicitly and show it to the teaching assistant. Once the teaching assistant has confirmed that you
have the correct expression, try to prove it. The vector product is a bit more difficult to work with than the
scalar product, so your teaching assistant is prepared to give you a hint if you get stuck.
3. Suppose you have two rectangular coordinate systems that share a common origin, but one system is rotated
by an angle $\theta$ with respect to the other. To describe this rotation, you have made use of the rotation matrix
$\lambda(\theta)$. (I'm changing the notation slightly to put the emphasis on the angle of rotation.)
(a) Verify that the product of two rotation matrices $\lambda(\theta_1)\lambda(\theta_2)$ is in itself a rotation matrix.
(b) In abstract algebra, a group $G$ is defined as a set of elements together with a binary operation ∗ acting
on that set such that four properties are satisfied:
i. (Closure) For any two elements $a$ and $b$ in the group $G$, the product of the elements, $a ∗ b$, is also
in the group $G$.
ii. (Associativity) For any three elements $a, b, c$ of the group $G$, $(a ∗ b) ∗ c = a ∗ (b ∗ c)$.
iii. (Existence of Identity) The group $G$ contains an identity element $e$ such that $a ∗ e = e ∗ a = a$ for
all $a \in G$.
iv. (Existence of Inverses) For each element $a \in G$, there exists an inverse element $a^{-1} \in G$ such that
$a ∗ a^{-1} = a^{-1} ∗ a = e$.
Show that if the product ∗ denotes the product of two matrices, then the set of rotation matrices together
with ∗ forms a group. This group is known as the special orthogonal group in two dimensions, also known
as $SO(2)$.
(c) Is this group commutative? In abstract algebra, a commutative group is called an abelian group.
4. When you look in a mirror the image of you appears left-to-right reversed, that is, the image of your left ear
appears to be the right ear of the image and vice versa. Explain why the image is left-right reversed rather
than up-down reversed or reversed about some other axis; i.e. explain what breaks the symmetry that leads to
these properties of the mirror image.
Problems
[1] Find the transformation matrix that rotates the axis $x_3$ of a rectangular coordinate system 45° toward $x_1$ around
the $x_2$ axis.
[2] For simplicity, take $\boldsymbol{\lambda}$ to be a two-dimensional transformation matrix. Show by direct expansion that $|\boldsymbol{\lambda}|^2 = 1$.
Appendix E
Tensor algebra
E.1 Tensors
Mathematically scalars and vectors are the first two members of a hierarchy of entities, called tensors,
that behave under coordinate transformations as described in appendix D. The use of the tensor notation
provides a compact and elegant way to handle transformations in physics.
A scalar is a rank 0 tensor with one component, that is invariant under change of the coordinate system.
$\phi(x', y', z') = \phi(x, y, z)$ (E.1)
A vector is a rank 1 tensor which has three components, that transform under rotation according to
matrix relation
x0 = λ · x (E.2)
where $\boldsymbol{\lambda}$ is the rotation matrix. Equation E.2 can be written in the suffix form as
$x'_i = \sum_{j=1}^{3}\lambda_{ij}x_j$ (E.3)
The above definitions of scalars and vectors can be subsumed into a class of entities called tensors of rank
$n$ that have $3^n$ components. A scalar is a tensor of rank $n = 0$, with only $3^0 = 1$ component, whereas a vector
has rank $n = 1$, that is, the vector x has one suffix $i$ and $3^1 = 3$ components.
A second-order tensor $T_{ij}$ has rank $n = 2$ with two suffixes, that is, it has $3^2 = 9$ components that
transform under rotation as
$T'_{ij} = \sum_{k=1}^{3}\sum_{l=1}^{3}\lambda_{ik}\lambda_{jl}T_{kl}$ (E.4)
For second-order tensors, the transformation formula given by equation E.4 can be written more compactly
using matrices. Thus the second-order tensor can be written as a 3 × 3 matrix
$\mathbf{T} \equiv \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix}$ (E.5)
The rotational transformation given in equation E.4 can be written in the form
$T'_{ij} = \sum_{k=1}^{3}\lambda_{ik}\left(\sum_{l=1}^{3}T_{kl}\lambda_{jl}\right) = \sum_{k=1}^{3}\lambda_{ik}\left(\sum_{l=1}^{3}T_{kl}\lambda^T_{lj}\right)$ (E.6)
where $\lambda^T_{lj}$ are the matrix elements of the transposed matrix $\boldsymbol{\lambda}^T$. The summations in E.6 can be expressed
in both the tensor and conventional matrix form as the matrix product
$\mathbf{T}' = \boldsymbol{\lambda}\cdot\mathbf{T}\cdot\boldsymbol{\lambda}^T$ (E.7)
Equation E.7 defines the rotational properties of a spherical tensor.
$T_{ij} = a_ib_j$ (E.9)
This second-order tensor product has a rank $n = 2$, that is, it equals the sum of the ranks of the two
vectors. Equation E.8 is called a dyad since it was derived by taking the dyadic product of two vectors. In
general, multiplication, or division, of two vectors leads to second-order tensors. Note that this second-order
tensor product completes the triad of tensors possible taking the product of two vectors. That is, the scalar
product a · b has rank $n = 0$, the vector product a × b has rank $n = 1$, and the tensor product a ⊗ b has rank
$n = 2$.
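The rank-2 transformation rule can be tried out on a dyad. The sketch below assumes a dyad with components $T_{ij} = a_ib_j$ and a sample rotation about the 3-axis, then compares the matrix form $\boldsymbol{\lambda}\cdot\mathbf{T}\cdot\boldsymbol{\lambda}^T$ with the dyad built from the separately rotated vectors; the vectors, the angle and the helper names are illustrative.

```python
import math

def matmul(p, q):
    """Matrix product of two 3x3 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    """Transpose of a square matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    """Rotation of a vector: x'_i = sum_j lambda_ij x_j."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def outer(a, b):
    """Dyadic (outer) product with components T_ij = a_i b_j."""
    return [[ai * bj for bj in b] for ai in a]

angle = 0.3  # sample rotation angle in radians (assumed value)
c, s = math.cos(angle), math.sin(angle)
lam = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

a = [1.0, 2.0, 3.0]
b = [-1.0, 0.5, 2.0]
T = outer(a, b)

# Rank-2 rule in matrix form: T' = lambda . T . lambda^T.
T_prime = matmul(matmul(lam, T), transpose(lam))

# The same tensor built from the rotated vectors a' and b'.
T_from_vectors = outer(matvec(lam, a), matvec(lam, b))

# The trace of the dyad is the scalar product a.b, a rank-0 invariant.
trace_before = sum(T[i][i] for i in range(3))
trace_after = sum(T_prime[i][i] for i in range(3))

print(trace_before, trace_after)
```

The two routes to the primed tensor agree element by element, and the trace is unchanged by the rotation, which ties the rank-2 rule back to the invariance of the scalar product.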
Higher-order tensors can be created by taking more complicated tensor products. For example, a rank-3
tensor can be created by taking the tensor outer product of the rank-2 tensor and a vector which, for
a dyadic tensor, can be written as the tensor product of three vectors. That is,
In summary, the rank of the tensor product equals the sum of the ranks of the tensors included in the tensor
product.
The scalar product a · b is a scalar number, and thus the inner-product tensor is the vector c renormalized
by the magnitude of the scalar product a · b. That is, it has a rank $n = 2 + 1 - 2 = 1$. Thus the inner product
of this rank-2 tensor with a vector gives a vector. The inner product of a rank-2 tensor with a rank-1 tensor
is used in this book for handling the rotation matrix, the inertia tensor for rigid-body rotation, and for the
stress and the strain tensors used to describe elasticity in solids.
Then the vector φ can be expressed compactly as the inner product of G and x, that is
φ = G·x
Equation 13 relates the contravariant components in the unprimed and primed frames.
Derivatives of a scalar function , such as
X X
0 = = = (E.14)
That is, covariant components of the tensor transform according to the relation
X
0 = (E.15)
It is important to differentiate between contravariant and covariant vectors. The Einstein superscript/subscript
convention for distinguishing between these two flavours of tensors is given in table 1
In linear algebra one can map from one coordinate system to another as illustrated in appendix D. That
is, the tensor x can be expressed as components with respect to either the unprimed or primed coordinate
frames
$\mathbf{x} = \hat{\mathbf{e}}'_1x'_1 + \hat{\mathbf{e}}'_2x'_2 + \hat{\mathbf{e}}'_3x'_3 = \hat{\mathbf{e}}_1x_1 + \hat{\mathbf{e}}_2x_2 + \hat{\mathbf{e}}_3x_3$ (E.16)
For an $N$-dimensional manifold the unit basis column vectors $\hat{\mathbf{e}}$ transform according to the transformation
matrix $\boldsymbol{\lambda}$
$\hat{\mathbf{e}}' = \boldsymbol{\lambda}\cdot\hat{\mathbf{e}}$ (E.17)
Since the tensor x is independent of the coordinate basis, the components of x must have the opposite
transform
$\mathbf{x}' = \left(\boldsymbol{\lambda}^{-1}\right)^T\cdot\mathbf{x}$ (E.18)
This normal vector x is called a "contravariant vector" because it transforms contrary to the basis column
vector transformation.
The inverse of equation 18 gives that the column vector element
X
= λ 0 (E.19)
E.5. GENERALIZED INNER PRODUCT 527
Consider the case of a gradient with respect to the coordinate x in both the unprimed and primed bases.
Using the chain rule for the partial derivative then the component of the gradient in the primed frame can
be expanded as
X X
(∇ )0 = = = λ = (E.20)
0
0
Again the summation cancels the superscript and subscript. The Kronecker delta symbol is written as
X
= (E.24)
where is a unitary matrix called a covariant metric. The covariant metric transforms a contravariant to
a covariant tensor. For example the matrix element of a covariant tensor can be written as
X
= (E.26)
By association of the covariant metric with either of the vectors in the inner product gives
X X X
= = = (E.27)
Then
X
= (E.29)
Association of the contravariant metric with one of the vectors in the inner product gives the inner
product X X X
= = = (E.30)
For most situations in this book the metric is diagonal and unitary.
528 APPENDIX E. TENSOR ALGEBRA
Table 2: Transformation properties of scalar, vector, pseudovector, and tensor observables
under rotation, spatial inversion, and time reversal (note 2)
Physical observable                        Rotation        Space       Time       Name
                                           (tensor rank)   inversion   reversal
1) Classical Mechanics
Mass density $\rho$                        0               Even        Even       Scalar
Kinetic energy $p^2/2m$                    0               Even        Even       Scalar
Potential energy $U(r)$                    0               Even        Even       Scalar
Lagrangian $L$                             0               Even        Even       Scalar
Hamiltonian $H$                            0               Even        Even       Scalar
Gravitational potential $\phi$             0               Even        Even       Scalar
Coordinate r                               1               Odd         Even       Vector
Velocity v                                 1               Odd         Odd        Vector
Momentum p                                 1               Odd         Odd        Vector
Angular momentum L = r × p                 1               Even        Odd        Pseudovector
Force F                                    1               Odd         Even       Vector
Torque N = r × F                           1               Even        Even       Pseudovector
Gravitational field g                      1               Odd         Even       Vector
Inertia tensor I                           2               Even        Even       Tensor
Elasticity stress tensor T                 2               Even        Even       Tensor
2) Electromagnetism
Charge density $\rho$                      0               Even        Even       Scalar
Current density j                          1               Odd         Odd        Vector
Electric field E                           1               Odd         Even       Vector
Polarization P                             1               Odd         Even       Vector
Displacement D                             1               Odd         Even       Vector
Magnetic field B                           1               Even        Odd        Pseudovector
Magnetization M                            1               Even        Odd        Pseudovector
Magnetic field H                           1               Even        Odd        Pseudovector
Poynting vector S = E × H                  1               Odd         Odd        Vector
Dielectric tensor K                        2               Even        Even       Tensor
Maxwell stress tensor T                    2               Even        Even       Tensor
Note 2: Based on table 6.1 in "Classical Electrodynamics", 2nd edition, by J.D. Jackson [?]
Appendix F
Multivariate calculus provides the framework for handling systems having many variables associated with
each of several bodies. It is assumed that the reader has studied linear differential equations plus multivariate
calculus and thus has been exposed to the calculus used in classical mechanics. Chapter 5 of this book
introduced variational calculus which covers several important aspects of multivariate calculus such as Euler’s
variational calculus and Lagrange multipliers. This appendix provides a brief review of a selection of other
aspects of multivariate calculus that feature prominently in classical mechanics.
530 APPENDIX F. ASPECTS OF MULTIVARIATE CALCULUS
typically comprise partial derivatives that act on scalar, vector, or tensor fields. Table 1 lists a few
elementary examples of the use of linear operators in this textbook. The first four linear operators involve
the widely used del operator ∇ to generate the gradient, divergence and curl as described in appendices
and . The fifth and sixth linear operators act on the Lagrangian in Lagrangian mechanics applications.
The final two linear operators act on the wavefunction for wave mechanics.
There are three ways of expressing operations such as addition, multiplication, transposition or inversion
of operations that are completely equivalent because they all are based on the same principles of linear
algebra. For example, a transformation O acting on a vector A can produce the vector B. The simplest
way to express this transformation is in terms of components
$B_i = \sum_{j=1}^{3}O_{ij}A_j$ (F.6)
Another way is to use matrix mechanics where the 3 × 3 matrix (O) transforms the column vector (A) to
the column vector (B), that is,
(B) = (O) (A) (F.7)
The third approach is to assume an operator O acts on the vector A
B = OA (F.8)
In classical mechanics, and quantum mechanics, these three equivalent approaches are used and exploited
extensively and interchangeably. In particular the rules of matrix manipulation, that are given in appendix
are synonymous, and equivalent to, those that apply for operator manipulation. If the operator is complex
then the operator properties are summarized as follows.
The generalization of the transpose for complex operators is the Hermitian conjugate $O^\dagger$
$O^\dagger_{ij} = O^*_{ji}$ (F.9)
For a real matrix the complex conjugation has no effect so the matrix is real and symmetric.
The generalization of orthogonal is unitary for which the operator $O$ is unitary if it is non-singular and
$O^{-1} = O^\dagger$ (F.12)
which implies
$O^\dagger O = I = OO^\dagger$ (F.13)
F.3. TRANSFORMATION JACOBIAN 531
As shown in table 4, $h_1h_2h_3 = r^2\sin\theta$, that is, the Jacobian equals $r^2\sin\theta$. Thus equation F.16
can be written as
∙ ¸
3 (1 2 3 ) 3 2 2 ( )
1 2 3 = (sin ) = Ω (F.17)
1 2 3 1 2 3 Ω
The differential cross section is defined by
2 ( ) 3
≡ 2 (F.18)
Ω 1 2 3
where the 2 factor is absorbed into the cross section and the solid angle term is factored out
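the geometric relations $x_1 = r\sin\theta\cos\phi$, $x_2 = r\sin\theta\sin\phi$, $x_3 = r\cos\theta$. For this transformation the Jacobian
determinant equals
$J(r, \theta, \phi) = \begin{vmatrix} \sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\ \sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\ \cos\theta & -r\sin\theta & 0 \end{vmatrix} = r^2\sin\theta$
Thus the three-dimensional volume integral transforms to
$\int f(x_1, x_2, x_3)\,dx_1dx_2dx_3 = \int f(r, \theta, \phi)\,J(r, \theta, \phi)\,dr\,d\theta\,d\phi = \int f(r, \theta, \phi)\,r^2\sin\theta\,dr\,d\theta\,d\phi$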
1. Below you will find a set of integrals. Your teaching assistant will divide you into groups and each group will
be assigned one integral to work on. Once your group has solved the integral, write the solution on the board
in the space provided by the teaching assistant.
R 2 R 4 R cos
(a) 2 sin
R0¡ ṙ 0
¢0
(b) − ṙ2
R
(c) A · a where A = ̂ + ̂ + k̂ and is the sphere 2 + 2 + 2 = 9.
R
(d) (∇ × A) · a where A = ̂ + ̂ + k̂ and is the surface defined by the paraboloid = 1 − 2 − 2 ,
where ≥ 0.
Appendix G
This appendix reviews vector differential calculus which is used extensively in both classical mechanics and
electromagnetism.
0
= (G.1)
That is, differentiation of scalar or vector fields with respect to a scalar operator does not change the
rotational behavior. In particular, the scalar differentials of vectors continue to obey the rules of ordinary
proper vectors. The scalar operator is used for calculation of velocity or acceleration.
0 = (G.3)
then the partial differential with respect to one component of the vector x0 gives
0 X
0 = (G.4)
0
533
534 APPENDIX G. VECTOR DIFFERENTIAL CALCULUS
Therefore
X 0 X
= = = (G.6)
0 0
Thus
0 X
0 = (G.7)
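That is, the vector derivative acting on a scalar field transforms like a proper vector.
Define the gradient, or ∇ operator, as
$\nabla \equiv \sum_i\hat{\mathbf{e}}_i\frac{\partial}{\partial x_i}$ (G.8)
where $\hat{\mathbf{e}}_i$ is the unit vector along the $x_i$ axis. In cartesian coordinates, the del vector operator is,
$\nabla \equiv \hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z}$ (G.9)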
The gradient was applied to the gravitational and electrostatic potential to derive the corresponding field. For example, for electrostatics it was shown that the electric field is minus the gradient of the scalar electrostatic potential field $\phi$, which can be written in cartesian coordinates as
$$\mathbf{E} = -\nabla\phi \tag{G.10}$$
Note that the gradient of a scalar field produces a vector field. You are familiar with this if you are a skier
in that the gravitational force pulls you down the line of steepest descent for the ski slope.
By contrast to the scalar product, both the gradient of a scalar field, and the vector product, are vector
fields for which the components along the coordinate axes transform in a specific manner, such as to keep the
length of the vector constant, as the coordinate frame is rotated. The gradient, scalar and vector products
with the ∇ operator are the first order derivatives of fields that occur most frequently in physics.
Second derivatives of fields also are used. Let us consider some possible combinations of the product of
two del operators.
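1) $\nabla\cdot(\nabla\phi) = \nabla^2\phi$
The scalar product of two del operators is a scalar under rotation. Evaluating the scalar product in cartesian coordinates gives
$$\left(\hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z}\right)\cdot\left(\hat{\mathbf{i}}\frac{\partial\phi}{\partial x} + \hat{\mathbf{j}}\frac{\partial\phi}{\partial y} + \hat{\mathbf{k}}\frac{\partial\phi}{\partial z}\right) = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2} \tag{G.13}$$
This also can be obtained without confusion by writing this product as
$$\nabla\cdot(\nabla\phi) = \nabla\cdot\nabla\phi = (\nabla\cdot\nabla)\phi \tag{G.14}$$
where the scalar product of the del operator is a scalar, called the Laplacian $\nabla^2$, given by
$$\nabla\cdot\nabla = \nabla^2 \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{G.15}$$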
The Laplacian operator is encountered frequently in physics.
2) $\nabla\times(\nabla\phi) = 0$
Note that the vector product of two identical vectors is zero:
$$\mathbf{A}\times\mathbf{A} = 0 \tag{G.16}$$
Therefore
$$\nabla\times(\nabla\phi) = 0 \tag{G.17}$$
This can be confirmed by evaluating the separate components along each axis.
3) $\nabla\cdot(\nabla\times\mathbf{A}) = 0$
This is zero because the cross product $\nabla\times\mathbf{A}$ is perpendicular to $\nabla$, and thus the dot product is zero.
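4) $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$
The identity
$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - (\mathbf{A}\cdot\mathbf{B})\mathbf{C} \tag{G.18}$$
gives $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - (\nabla\cdot\nabla)\mathbf{A} = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$, since $\nabla\cdot\nabla = \nabla^2$.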
There are pitfalls in the discussion of second derivatives in that it is assumed that both del operators
operate on the same variable, otherwise the results are different.
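The second-derivative identities 2) and 3) can be checked numerically. The sketch below is an illustration added here, not part of the original text; the helper names `partial`, `grad`, `curl`, `div`, the step size `H`, and the test fields are all assumptions chosen for the example. It uses central finite differences to evaluate the curl of a gradient and the divergence of a curl at an arbitrary point.

```python
import math

H = 1e-3  # finite-difference step (an arbitrary illustrative choice)

def partial(f, point, axis):
    """Central-difference estimate of the partial derivative of f along axis."""
    up, down = list(point), list(point)
    up[axis] += H
    down[axis] -= H
    return (f(*up) - f(*down)) / (2.0 * H)

def grad(f):
    """Return the three component functions of the gradient of scalar f."""
    return [lambda x, y, z, a=a: partial(f, (x, y, z), a) for a in range(3)]

def curl(F):
    """Return the three component functions of the curl of the vector field F."""
    def comp(j, k):
        # component i = dF_k/dx_j - dF_j/dx_k for cyclic (i, j, k)
        return lambda x, y, z: (partial(F[k], (x, y, z), j)
                                - partial(F[j], (x, y, z), k))
    return [comp(1, 2), comp(2, 0), comp(0, 1)]

def div(F):
    """Return the divergence of the vector field F as a scalar function."""
    return lambda x, y, z: sum(partial(F[a], (x, y, z), a) for a in range(3))

# Hypothetical test fields
phi = lambda x, y, z: x * x * y + y * math.sin(z)
A = [lambda x, y, z: x * y * z,
     lambda x, y, z: math.sin(x) + z * z,
     lambda x, y, z: x * y * y]

point = (0.3, -0.7, 1.1)
curl_grad = [c(*point) for c in curl(grad(phi))]
div_curl = div(curl(A))(*point)
print(curl_grad, div_curl)
```

Both results should vanish to within rounding error, because the mixed central differences commute in the same way as the mixed partial derivatives.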
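G.3.1 Gradient:
The gradient of a scalar function $f$ in curvilinear coordinates $(q_1, q_2, q_3)$ with scale factors $h_1, h_2, h_3$ is
$$\nabla f = \frac{1}{h_1}\frac{\partial f}{\partial q_1}\hat{\mathbf{q}}_1 + \frac{1}{h_2}\frac{\partial f}{\partial q_2}\hat{\mathbf{q}}_2 + \frac{1}{h_3}\frac{\partial f}{\partial q_3}\hat{\mathbf{q}}_3 \tag{G.20}$$
In cylindrical coordinates $(\rho, \varphi, z)$
$$\nabla f = \frac{\partial f}{\partial \rho}\hat{\boldsymbol{\rho}} + \frac{1}{\rho}\frac{\partial f}{\partial \varphi}\hat{\boldsymbol{\varphi}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}} \tag{G.21}$$
In spherical coordinates $(r, \theta, \varphi)$
$$\nabla f = \frac{\partial f}{\partial r}\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial \theta}\hat{\boldsymbol{\theta}} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial \varphi}\hat{\boldsymbol{\varphi}} \tag{G.22}$$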
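G.3.2 Divergence:
The divergence can be expressed as
$$\nabla\cdot\mathbf{A} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}(A_1 h_2 h_3) + \frac{\partial}{\partial q_2}(A_2 h_3 h_1) + \frac{\partial}{\partial q_3}(A_3 h_1 h_2)\right] \tag{G.23}$$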
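G.3.3 Curl:
$$\nabla\times\mathbf{A} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} h_1\hat{\mathbf{q}}_1 & h_2\hat{\mathbf{q}}_2 & h_3\hat{\mathbf{q}}_3 \\ \frac{\partial}{\partial q_1} & \frac{\partial}{\partial q_2} & \frac{\partial}{\partial q_3} \\ h_1 A_1 & h_2 A_2 & h_3 A_3 \end{vmatrix} \tag{G.26}$$
In cylindrical coordinates the curl is
$$\nabla\times\mathbf{A} = \frac{1}{\rho}\begin{vmatrix} \hat{\boldsymbol{\rho}} & \rho\hat{\boldsymbol{\varphi}} & \hat{\mathbf{z}} \\ \frac{\partial}{\partial \rho} & \frac{\partial}{\partial \varphi} & \frac{\partial}{\partial z} \\ A_\rho & \rho A_\varphi & A_z \end{vmatrix} \tag{G.27}$$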
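G.3.4 Laplacian:
Taking the divergence of the gradient of a scalar $f$ gives
$$\nabla^2 f = \nabla\cdot\nabla f = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\left(\frac{h_2 h_3}{h_1}\frac{\partial f}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{h_3 h_1}{h_2}\frac{\partial f}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{h_1 h_2}{h_3}\frac{\partial f}{\partial q_3}\right)\right] \tag{G.29}$$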
The gradient, divergence, curl and Laplacian are used extensively in curvilinear coordinate systems when
dealing with vector fields in Newtonian mechanics, electromagnetism, and fluid flow.
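As an illustration of equation (G.29), the following Python sketch evaluates the curvilinear Laplacian for spherical coordinates, assuming the standard scale factors $h_1 = 1$, $h_2 = r$, $h_3 = r\sin\theta$. The function name `laplacian_spherical`, the finite-difference step, and the test functions are assumptions made for this example and do not come from the text.

```python
import math

def laplacian_spherical(f, r, theta, phi, h=1e-3):
    """Evaluate (G.29) for f(r, theta, phi) using central differences.

    Assumes the spherical scale factors h1 = 1, h2 = r, h3 = r sin(theta).
    """
    def d(g, axis, q):
        up, dn = list(q), list(q)
        up[axis] += h
        dn[axis] -= h
        return (g(*up) - g(*dn)) / (2.0 * h)

    scale = (lambda r_, t_, p_: 1.0,
             lambda r_, t_, p_: r_,
             lambda r_, t_, p_: r_ * math.sin(t_))

    def volume(r_, t_, p_):
        return scale[0](r_, t_, p_) * scale[1](r_, t_, p_) * scale[2](r_, t_, p_)

    total = 0.0
    for i in range(3):
        # inner = (h1 h2 h3 / h_i^2) * df/dq_i, the bracketed term in (G.29)
        inner = lambda r_, t_, p_, i=i: (volume(r_, t_, p_)
                                         / scale[i](r_, t_, p_) ** 2
                                         * d(f, i, (r_, t_, p_)))
        total += d(inner, i, (r, theta, phi))
    return total / volume(r, theta, phi)

# f = r^2 = x^2 + y^2 + z^2 has Laplacian 6; f = r cos(theta) = z has Laplacian 0.
print(laplacian_spherical(lambda r, t, p: r * r, 2.0, 1.0, 0.5))
print(laplacian_spherical(lambda r, t, p: r * math.cos(t), 2.0, 1.0, 0.5))
```

The two test functions are chosen because their cartesian Laplacians are known by inspection, which gives an independent check of the curvilinear form.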
Appendix H
Field equations, such as for electromagnetic and gravitational fields, require both line integrals, and surface
integrals, of vector fields to evaluate potential, flux and circulation. These require use of the gradient, the
Divergence Theorem and Stokes Theorem which are discussed in the following sections.
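$$\Delta\phi = (\nabla\phi)\cdot d\mathbf{l} \tag{H.1}$$
since the gradient of $\phi$, that is, $\nabla\phi$, is the rate of change of $\phi$ with $\mathbf{l}$. Discussions of gravitational and electrostatic potential show that the line integral between points $a$ and $b$ is given in terms of the del operator by
$$\phi_b - \phi_a = \int_a^b (\nabla\phi)\cdot d\mathbf{l} \tag{H.2}$$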
This relates the difference in values of a scalar field at two points to the line integral of the dot product of
the gradient with the element of the line integral.
common to both $S_1$ and $S_2$ are equal and in the same direction. Then the net flux through the sum of $S_1$ and $S_2$ is given by
$$\oint_{S_1}\mathbf{F}\cdot d\mathbf{S} + \oint_{S_2}\mathbf{F}\cdot d\mathbf{S} = \oint_{S}\mathbf{F}\cdot d\mathbf{S} \tag{H.4}$$
since the contributions of the common surface cancel in that the flux out of $S_1$ is equal and opposite to the flux into $S_2$ over the surface $S_{cut}$. That is, independent of how many times the volume enclosed by $S$ is subdivided, the net flux for the sum of all the Gaussian surfaces enclosing these subdivisions of the volume still equals $\oint_S \mathbf{F}\cdot d\mathbf{S}$.
Figure H.1: A volume $V$ enclosed by a closed surface $S$ is cut into two pieces at the surface $S_{cut}$. This gives $V_1$ enclosed by $S_1$ and $V_2$ enclosed by $S_2$.
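Consider that the volume enclosed by $S$ is subdivided into $n$ subdivisions where $n \to \infty$. Then even though $\oint_{S_i}\mathbf{F}\cdot d\mathbf{S} \to 0$ as $n \to \infty$, the sum over surfaces of all the infinitesimal volumes remains unchanged
$$\Phi = \oint_S \mathbf{F}\cdot d\mathbf{S} = \sum_i^{n\to\infty}\oint_{S_i}\mathbf{F}\cdot d\mathbf{S} \tag{H.5}$$
Thus we can take the limit of a sum of an infinite number of infinitesimal volumes as is needed to obtain a differential form. The surface integral for each infinitesimal volume will equal zero which is not useful, that is, $\oint_{S_i}\mathbf{F}\cdot d\mathbf{S} \to 0$ as $n \to \infty$. However, the flux per unit volume has a finite value as $n \to \infty$. This ratio is called the divergence of the vector field;
$$\operatorname{div}\mathbf{F} = \lim_{\Delta V \to 0}\frac{\oint_{S_i}\mathbf{F}\cdot d\mathbf{S}}{\Delta V} \tag{H.6}$$
where $\Delta V$ is the infinitesimal volume enclosed by surface $S_i$. The divergence of the vector field is a scalar quantity.
Thus the sum of flux over all infinitesimal subdivisions of the volume enclosed by a closed surface $S$ equals
$$\Phi = \oint_S \mathbf{F}\cdot d\mathbf{S} = \sum_i^{n\to\infty}\frac{\oint_{S_i}\mathbf{F}\cdot d\mathbf{S}}{\Delta V_i}\,\Delta V_i = \sum_i^{n\to\infty}\operatorname{div}\mathbf{F}\,\Delta V_i \tag{H.7}$$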
This is called the Divergence Theorem or Gauss’s Theorem. To avoid confusion with Gauss’s law in electro-
statics, it will be referred to as the Divergence theorem.
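Thus the net flux out of the box due to the z component of $\mathbf{F}$ is
$$\Delta\Phi_z = \Delta\Phi_z^{top} - \Delta\Phi_z^{bottom} = \frac{\partial F_z}{\partial z}\,\Delta x\,\Delta y\,\Delta z \tag{H.11}$$
Adding the similar $x$ and $y$ components for $\Delta\Phi$ gives
$$\Delta\Phi = \left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right)\Delta x\,\Delta y\,\Delta z \tag{H.12}$$
since $\Delta V = \Delta x\,\Delta y\,\Delta z$. But the right hand side of the equation equals the scalar product $\nabla\cdot\mathbf{F}$, that is,
$$\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} \tag{H.14}$$
Figure H.2: Computation of flux out of an infinitesimal rectangular box, $\Delta x\,\Delta y\,\Delta z$.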
The divergence is a scalar quantity. The physical meaning of the divergence is that it gives the net flux per unit volume flowing out of an infinitesimal volume. A positive divergence corresponds to a net outflow of flux from the infinitesimal volume at any location while a negative divergence implies a net inflow of flux to this infinitesimal volume.
It was shown that for an infinitesimal rectangular box
$$\Delta\Phi = \left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right)\Delta x\,\Delta y\,\Delta z = \nabla\cdot\mathbf{F}\,\Delta V \tag{H.15}$$
Integrating over the finite volume $V$ enclosed by the surface $S$ gives
$$\Phi = \oint_S \mathbf{F}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{F}\,dV \tag{H.16}$$
The divergence theorem, developed by Gauss, is of considerable importance; it relates the surface integral of a vector field, that is, the outgoing flux, to a volume integral of $\nabla\cdot\mathbf{F}$ over the enclosed volume.
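The divergence theorem (H.16) can be illustrated with a simple numerical check. The following Python sketch is an added example rather than part of the original text. The test field $\mathbf{F} = (x^2 y,\ yz,\ z^2 x)$, the unit cube, and the function names are assumptions chosen so that both sides can also be evaluated by hand (each equals 3/2).

```python
def flux_out_of_unit_cube(F, n=40):
    """Outward flux of F through the six faces of the cube 0 <= x, y, z <= 1."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for u in mids:
        for v in mids:
            # Each pair of opposite faces: outward normal is +axis on the far
            # face and -axis on the near face.
            total += (F(1.0, u, v)[0] - F(0.0, u, v)[0]) * h * h
            total += (F(u, 1.0, v)[1] - F(u, 0.0, v)[1]) * h * h
            total += (F(u, v, 1.0)[2] - F(u, v, 0.0)[2]) * h * h
    return total

def volume_integral(div_F, n=40):
    """Midpoint-rule volume integral of the divergence over the unit cube."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    return sum(div_F(x, y, z) for x in mids for y in mids for z in mids) * h ** 3

def field(x, y, z):
    return (x * x * y, y * z, z * z * x)

def div_field(x, y, z):
    # Analytic divergence of the test field above
    return 2.0 * x * y + z + 2.0 * z * x

surface_side = flux_out_of_unit_cube(field)
volume_side = volume_integral(div_field)
print(surface_side, volume_side)
```

The agreement of the two numbers is the content of the theorem: the flux summed over the bounding surface equals the divergence summed over the enclosed volume.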
This is true independent of the shape of the Gaussian surface leading to the differential form of Gauss’s law
for B
∇·B=0
That is, the local value of the divergence of B is zero everywhere.
[∇ + ()ḡ(z)] · r =0
The right hand side of this equation equals minus the weight of the displaced fluid. That is, the buoyancy force
equals the weight of the fluid displaced by the empty volume. Note that this proof applies both to compressible
fluids, where the density depends on pressure, as well as to incompressible fluids where the density is constant.
It also applies to situations where local gravity is position dependent. If an object of mass $m$ is completely
submerged then the net force on the object is $mg - \int \rho(\mathbf{r})\,g(\mathbf{r})\,dV$. If the object floats on the surface
of a fluid then the buoyancy force must be calculated separately for the volume under the fluid surface and
the upper volume above the fluid surface. The buoyancy due to displaced air usually is negligible since the
density of air is about 10−3 times that of fluids such as water.
because the contributions along the common boundary cancel since they are taken in opposite directions if $C_1$ and $C_2$ both are taken in the same direction. Note that the line integral, and corresponding enclosed area, are vector quantities related by the right-hand rule and this must be taken into account when subdividing the area. Thus the area can be subdivided into an infinite number of pieces for which
$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \sum_i^{n\to\infty}\oint_{C_i}\mathbf{F}\cdot d\mathbf{l} = \sum_i^{n\to\infty}\frac{\oint_{C_i}\mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}}\,\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}} \tag{H.19}$$
where $\Delta\mathbf{S}_i$ is the infinitesimal area bounded by the closed sub-loop $C_i$ and $\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}$ is the normal component of this area pointing along the $\hat{\mathbf{n}}$ direction which is the direction along which the line integral points.
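The component of the curl of the vector function along the direction $\hat{\mathbf{n}}$ is defined to be
$$(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{n}} \equiv \lim_{\Delta S \to 0}\frac{\oint_{C_i}\mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}} \tag{H.20}$$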
which is identical to the right hand side of the relation for the curl in cartesian coordinates. That is,
$$\nabla\times\mathbf{F} = \operatorname{curl}\mathbf{F} \tag{H.29}$$
The physical meaning of the curl is that it is the circulation per unit area, or rotation, for an infinitesimal loop at any location. In the German literature the curl is called the rotation, written rot F.
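The definition of the curl as circulation per unit area, equation (H.20), can be tested numerically. In this added Python sketch the function name `circulation_per_area`, the square loop, and the test field $\mathbf{F} = (-y^2,\ xy,\ 0)$ are illustrative assumptions. For this field the z component of $\nabla\times\mathbf{F}$ is $3y$.

```python
def circulation_per_area(F, x0, y0, side=1e-3, n=20):
    """Counterclockwise line integral of F around a small square centred on
    (x0, y0) in the xy plane, divided by the area of the square."""
    half = side / 2.0
    step = side / n
    total = 0.0
    for i in range(n):
        s = -half + (i + 0.5) * step
        total += F(x0 + s, y0 - half)[0] * step   # bottom edge, +x direction
        total += F(x0 + half, y0 + s)[1] * step   # right edge, +y direction
        total -= F(x0 + s, y0 + half)[0] * step   # top edge, -x direction
        total -= F(x0 - half, y0 + s)[1] * step   # left edge, -y direction
    return total / side ** 2

def field(x, y):
    return (-y * y, x * y)

curl_z = circulation_per_area(field, 1.0, 2.0)
print(curl_z)
```

With the loop traversed counterclockwise, the right-hand rule gives a normal along $+z$, so the result is to be compared with the z component of the curl at the chosen point, $3y = 6$.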
$$\mathbf{F} = \nabla\phi \tag{H.32}$$
since
$$\nabla\times(\nabla\phi) = 0 \tag{H.33}$$
That is, any curl-free vector field can be expressed in terms of the gradient of a scalar field.
The scalar field $\phi$ is not unique, that is, any constant $c$ can be added to $\phi$ since $\nabla c = 0$, that is, the addition of the constant does not change the gradient. This independence to addition of a number to the scalar potential is called a gauge invariance, discussed in chapter 13.2, for which
$$\mathbf{F} = \nabla\phi' = \nabla(\phi + c) = \nabla\phi \tag{H.34}$$
That is, this gauge-invariant transformation does not change the observable F. The electrostatic field E
and the gravitation field g are examples of irrotational fields that can be expressed as the gradient of scalar
potentials.
∇·F=0 (H.35)
everywhere. This is automatically obeyed if the field F is expressed in terms of the curl of a vector field G
such that
F=∇×G (H.36)
since ∇ · ∇ × G = 0. That is, any divergence-free vector field can be written as the curl of a related vector
field.
As discussed in chapter 13.2, the vector potential $\mathbf{G}$ is not unique in that a gauge transformation can be made by adding the gradient of any scalar field, that is, the gauge transformation $\mathbf{G}' = \mathbf{G} + \nabla\varphi$ gives
$$\mathbf{F} = \nabla\times\mathbf{G}' = \nabla\times(\mathbf{G} + \nabla\varphi) = \nabla\times\mathbf{G} \tag{H.37}$$
This gauge invariance for transformation to the vector potential $\mathbf{G}'$ does not change the observable vector field $\mathbf{F}$. The magnetic field $\mathbf{B}$ is an example of a solenoidal field that can be expressed in terms of the curl of a vector potential $\mathbf{A}$.
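$$\nabla\times\mathbf{E} = 0$$
Therefore theorem 1 states that it is possible to express this static electric field as the gradient of the scalar electric potential $\phi$, where
$$\mathbf{E} = -\nabla\phi$$
$$\nabla\cdot\mathbf{E} = -\nabla^2\phi - \frac{\partial(\nabla\cdot\mathbf{A})}{\partial t} = \frac{\rho}{\varepsilon_0} \tag{$\alpha$}$$
Similarly insertion of the vector potential $\mathbf{A}$ in Ampère's Law gives
$$\nabla\times\mathbf{B} = \nabla\times(\nabla\times\mathbf{A}) = \mu_0\mathbf{j} + \mu_0\varepsilon_0\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{j} - \mu_0\varepsilon_0\nabla\left(\frac{\partial\phi}{\partial t}\right) - \mu_0\varepsilon_0\left(\frac{\partial^2\mathbf{A}}{\partial t^2}\right)$$
Using the vector identity $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$ allows the above equation to be rewritten as
$$\left(\nabla^2\mathbf{A} - \mu_0\varepsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2}\right) - \nabla\left(\nabla\cdot\mathbf{A} + \mu_0\varepsilon_0\frac{\partial\phi}{\partial t}\right) = -\mu_0\mathbf{j} \tag{$\beta$}$$
The use of the scalar potential $\phi$ and vector potential $\mathbf{A}$ leads to two coupled equations $\alpha$ and $\beta$. These coupled equations can be transformed into two uncoupled equations by exploiting the freedom to make a gauge transformation for the vector potential such that the middle brackets in both equations $\alpha$ and $\beta$ are zero. That is, choosing the Lorentz gauge
$$\nabla\cdot\mathbf{A} = -\mu_0\varepsilon_0\frac{\partial\phi}{\partial t}$$
simplifies equations $\alpha$ and $\beta$ to be
$$\nabla^2\phi - \mu_0\varepsilon_0\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\varepsilon_0}$$
$$\nabla^2\mathbf{A} - \mu_0\varepsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu_0\mathbf{j}$$
The virtue of using the Lorentz gauge, rather than the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$, is that it separates the equations for the scalar and vector potentials. Moreover, these two equations are the wave equations for these two potential fields corresponding to a velocity $c = \frac{1}{\sqrt{\mu_0\varepsilon_0}}$. This example illustrates the power of using the concept of potentials in describing vector fields.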
Appendix I
Waveform analysis
where $\omega_0$ is the lowest (fundamental) frequency solution. For an aperiodic function a cosine decomposition can be of the form
$$F(t) = \int_0^\infty A(\omega)\cos(\omega t + \phi(\omega))\,d\omega \tag{I.2}$$
Either of the complementary functions () ⇔ (), or () ⇔ ( ) are equivalent representations of
the harmonic content that can be used to describe signals and waves. The following two sections give an
introduction to Fourier analysis.
$$F(t) = \frac{a_0}{2} + \sum_{n=0}^{\infty} B_n\sin(n\omega_0 t + \beta_n) \tag{I.5}$$
where $n$ is an integer, and $\beta_n$ are phase shifts fit to the initial conditions.
The normal modes of a discrete system form a complete set of solutions that satisfy the following orthog-
onality relation Z 2
() () = (I.6)
0
where is the Kronecker delta symbol defined in equation (10). Orthogonality can be used to determine
the coefficients for equations (3) to be
Z +
1
0 = () (I.7)
−
Z +
1
= () cos () (I.8)
−
Z +
1
= () sin () (I.9)
−
Similarly the coefficients for (4) and (5) are related to the above coefficients by
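Instead of the simple trigonometric form used in equations (I.3) to (I.5), the cosine and sine functions can be expanded into the exponential form where
$$\cos n\omega_0 t = \frac{1}{2}\left(e^{in\omega_0 t} + e^{-in\omega_0 t}\right) \tag{I.10}$$
$$\sin n\omega_0 t = -\frac{i}{2}\left(e^{in\omega_0 t} - e^{-in\omega_0 t}\right)$$
then equation (I.3) becomes
$$F(t) = \sum_{n=-\infty}^{\infty} c_n e^{in\omega_0 t} \tag{I.11}$$
where $n$ is any integer and, from the orthogonality, the Fourier coefficients are given by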
Z +
1
= () (I.12)
2 −
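These coefficients are related to the cosine plus sine series amplitudes by
$$c_n = \frac{1}{2}(a_n - i b_n) \quad \text{(when } n \text{ is positive)}$$
$$c_n = \frac{1}{2}(a_{|n|} + i b_{|n|}) \quad \text{(when } n \text{ is negative)}$$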
These results show that the coefficients of the exponential series are in general complex, and that they occur in conjugate pairs (that is, the imaginary part of a coefficient $c_n$ is equal but opposite in sign to that for the coefficient $c_{-n}$). Although the introduction of complex coefficients may appear unusual, it should be remembered that the real part of a pair of coefficients denotes the magnitude of the cosine wave of the relevant frequency, and that the imaginary part denotes the magnitude of the sine wave. If a particular pair of coefficients $c_n$ and $c_{-n}$ are real, then the component at the frequency $n\omega_0$ is simply a cosine; if $c_n$ and $c_{-n}$ are purely imaginary, the component is just a sine; and if, as is the general case, $c_n$ and $c_{-n}$ are complex, both cosine and sine terms are present.
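The relation between the exponential and trigonometric coefficients can be checked numerically. The Python sketch below is an added illustration; the function name `fourier_coefficients`, the sample waveform, and in particular the normalization conventions stated in the docstring are assumptions of this example (the standard one-period definitions) rather than a transcription of the text's equations.

```python
import cmath
import math

def fourier_coefficients(F, n, omega0, samples=1000):
    """Return (a_n, b_n, c_n) for a function F of period tau = 2*pi/omega0.

    Assumed conventions for this sketch:
      a_n = (2/tau) * integral over one period of F(t) cos(n omega0 t)
      b_n = (2/tau) * integral over one period of F(t) sin(n omega0 t)
      c_n = (1/tau) * integral over one period of F(t) exp(-i n omega0 t)
    """
    tau = 2.0 * math.pi / omega0
    dt = tau / samples
    a = b = 0.0
    c = 0j
    for k in range(samples):
        t = k * dt
        a += F(t) * math.cos(n * omega0 * t) * dt
        b += F(t) * math.sin(n * omega0 * t) * dt
        c += F(t) * cmath.exp(-1j * n * omega0 * t) * dt
    return 2.0 * a / tau, 2.0 * b / tau, c / tau

OMEGA0 = 3.0  # hypothetical fundamental angular frequency

def wave(t):
    # Hypothetical test waveform: constant + cosine at omega0 + sine at 2*omega0
    return 1.0 + 2.0 * math.cos(OMEGA0 * t) + 3.0 * math.sin(2.0 * OMEGA0 * t)

a2, b2, c2 = fourier_coefficients(wave, 2, OMEGA0)
_, _, c_minus2 = fourier_coefficients(wave, -2, OMEGA0)
print(a2, b2, c2, c_minus2)
```

For this waveform the $n = 2$ component is a pure sine, so $c_2$ and $c_{-2}$ come out purely imaginary and complex conjugates of each other, as described above.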
The use of the exponential form of the Fourier series gives rise to the notion of ‘negative frequency’. Of course, $F(t) = A\cos\omega_0 t$ is a wave of a single frequency $\omega = \omega_0$ radians/second, and may be represented by a single line of height $A$ in a normal spectral diagram. However, using the exponential form of the Fourier series results in both positive and negative components.
The coexistence of both negative and positive angular frequencies $\pm\omega$ can be understood by consideration of the Argand diagram where the real component is plotted along the $x$-axis and the imaginary component along the $y$-axis. The function $Ae^{+i\omega t}$ represents a vector of length $A$ that rotates with an angular velocity $\omega$ in a positive direction, that is counterclockwise, whereas $Ae^{-i\omega t}$ represents the vector rotating in a negative direction, that is clockwise. Thus the sum of the two rotating vectors, according to equations (I.10), leads to cancellation of the opposite components on the imaginary $y$ axis and addition of the two $A\cos\omega t$ real components on the $x$ axis. Subtraction leads to cancellation of the real components and addition of the imaginary axis components.
where $\tau$ is the period of the periodic force. Let $G(\omega) = \tau c_n$, $\omega = n\omega_0$ and take the limit for $\tau \to \infty$; then equation (I.12) can be written as
$$G(\omega) = \int_{-\infty}^{+\infty} F(t)\,e^{-i\omega t}\,dt \tag{I.14}$$
Similarly making the same limit for $\tau \to \infty$, then $\omega_0 = \frac{2\pi}{\tau} \to d\omega$ and equation (I.11) becomes
$$F(t) = \sum_{n=-\infty}^{\infty}\frac{G(\omega)}{\tau}e^{in\omega_0 t} = \sum_{n=-\infty}^{\infty}\frac{\omega_0}{2\pi}G(\omega)e^{i\omega t} = \frac{1}{2\pi}\int_{-\infty}^{+\infty}G(\omega)\,e^{i\omega t}\,d\omega \tag{I.15}$$
Equation (I.15) shows how a non-repetitive time-domain wave form is related to its continuous spectrum. These are known as Fourier integrals or Fourier transforms. They are of central importance for signal processing. For convenience the transforms often are written in the operator formalism using the $\mathcal{F}$ symbol in the form
$$F(t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty}G(\omega)\,e^{i\omega t}\,d\omega \equiv \mathcal{F}^{-1}\left[G(\omega)\right] \tag{I.16}$$
$$G(\omega) = \int_{-\infty}^{+\infty}F(t)\,e^{-i\omega t}\,dt \equiv \mathcal{F}\left[F(t)\right] \tag{I.17}$$
It is very important to grasp the significance of these two equations. Equation (I.17) tells us that the Fourier transform $G(\omega)$ of the waveform $F(t)$ is continuously distributed in the frequency range between $\omega = \pm\infty$, whereas equation (I.16) shows how, in effect, the waveform may be synthesized from an infinite set of exponential functions of the form $e^{\pm i\omega t}$, each weighted by the relevant value of $G(\omega)$. It is crucial to realize that this transformation can go either way equally, that is, from $F(t)$ to $G(\omega)$ or vice versa.¹
¹ The only asymmetry in the Fourier transform relations comes from the $2\pi$ factor originating from the fact that by convention physicists use the angular frequency $\omega = 2\pi\nu$ rather than the frequency $\nu$. In order to restore symmetry many papers use the factor $\frac{1}{\sqrt{2\pi}}$ in both relations rather than using the $\frac{1}{2\pi}$ factor in equation (I.16) and unity in equation (I.17).
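That is, assume that the amplitude of the pulse is unity between $-\frac{\tau}{2} \le t \le \frac{\tau}{2}$. Then the Fourier transform
$$G(\omega) = \int_{-\tau/2}^{+\tau/2} 1\cdot e^{-i\omega t}\,dt = \tau\left(\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right)$$
which is an unnormalized sinc function, $\sin(x)/x$ with $x = \omega\tau/2$. Note that the width of the pulse $\Delta t = \pm\frac{\tau}{2}$ leads to a frequency envelope that has the first zeros at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Thus the product of these widths $\Delta t\cdot\Delta\omega = \pm\pi$ which is independent of the width of the pulse, that is $\Delta\omega = \frac{\pi}{\Delta t}$, which is an example of the uncertainty principle which is applicable to all forms of wave motion.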
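The transform of the rectangular pulse can be verified by direct numerical integration. This Python sketch is an added illustration; the function names, the pulse width, and the midpoint-rule integration are assumptions of the example. It compares the integral of $e^{-i\omega t}$ over the pulse with the closed form $\tau\sin(\omega\tau/2)/(\omega\tau/2)$ and confirms that the first zero lies at $\omega = 2\pi/\tau$.

```python
import cmath
import math

def pulse_transform_numeric(omega, tau, n=2000):
    """Midpoint-rule integral of exp(-i omega t) over -tau/2 <= t <= tau/2."""
    dt = tau / n
    total = 0j
    for k in range(n):
        t = -tau / 2.0 + (k + 0.5) * dt
        total += cmath.exp(-1j * omega * t) * dt
    return total

def pulse_transform_exact(omega, tau):
    """Closed form tau * sin(x)/x with x = omega * tau / 2."""
    x = omega * tau / 2.0
    return tau if x == 0.0 else tau * math.sin(x) / x

TAU = 2.0  # hypothetical pulse width
for omega in (0.0, 1.3, 2.0 * math.pi / TAU):
    print(omega, pulse_transform_numeric(omega, TAU), pulse_transform_exact(omega, TAU))
```

Because the pulse is symmetric about $t = 0$, the imaginary part of the numerical transform cancels and only the real sinc envelope survives.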
The Dirac $\delta$ function, which is sometimes referred to as the impulse function, has many important applications to physics and signal processing. For example, a shell shot from a gun is given a mechanical impulse
imparting a certain momentum to the shell in a very short time. Other things being equal, one is interested
only in the impulse imparted to the shell, that is, the time integral of the force accelerating the shell in the
gun, rather than the details of the time dependence of the force. Since the force acts for a very short time
the Dirac delta function can be employed in such problems.
As described in section 311 and appendix , the Dirac delta function is employed in signal processing
when signals are sampled for short time intervals. The Fourier transform of the delta function is needed for discussion of sampling of signals
$$G(\omega) = \int_{-\infty}^{+\infty}\delta(t - t_0)\,e^{-i\omega t}\,dt = e^{-i\omega t_0}$$
Since $e^{-i\omega t}$ essentially is constant over the infinitesimal time duration of the $\delta(t - t_0)$ function, and the time integral of the $\delta$ function is unity, thus the term $e^{-i\omega t_0}$ has unit magnitude for any value of $\omega$ and has a phase shift of $-\omega t_0$ radians. For $t_0 = 0$ the phase shift is zero and thus the Fourier transform of a Dirac $\delta(t)$ function is $G(\omega) = 1$. That is, this is a uniform white spectrum for all values of $\omega$.
Figure I.1: Response of an underdamped linear oscillator with $\omega_0 = 10$ and $\Gamma = 2$ to the following impulsive forces. (a) Step function force $F = 0$ for $t < 0$ and $F = a$ for $t > 0$. (b) Square-wave force where $F = a$ for $0 < t < \tau$ with $\tau = 3$ and $F = 0$ at other times. (c) Delta-function impulse $P = 1$.
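$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \frac{F(t)}{m} \tag{I.18}$$
and assume that a step function is applied at time $t = 0$. That is,
$$\frac{F(t)}{m} = 0 \quad (t < 0), \qquad \frac{F(t)}{m} = a \quad (t \ge 0) \tag{I.19}$$
where $a$ is a constant. The initial conditions are that $x(0) = \dot{x}(0) = 0$.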
The transient or complementary solution is the solution of the linearly-damped harmonic oscillator
$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{I.20}$$
This is independent of the driving force and the solution is given in the chapter 3.5 discussion of the linearly-damped harmonic oscillator.
The particular, steady-state, solution is easy to obtain just by inspection since the force is a constant, that is, the particular solution is
$$x_P = 0 \quad (t < 0), \qquad x_P = \frac{a}{\omega_0^2} \quad (t \ge 0)$$
Taking the sum of the transient and particular solutions, using the initial conditions, gives the final solution to be
$$x(t) = \frac{a}{\omega_0^2}\left[1 - e^{-\frac{\Gamma}{2}t}\cos\omega_1 t - \frac{\Gamma}{2\omega_1}e^{-\frac{\Gamma}{2}t}\sin\omega_1 t\right] \tag{I.21}$$
where $\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$. This functional form is shown in figure I.1. Note that the amplitude of the transient response equals $-\frac{a}{\omega_0^2}$ at $t = 0$ to cancel the particular solution when it jumps to $+\frac{a}{\omega_0^2}$. The oscillatory behavior then is just that of the transient response.
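The step-function response (I.21) can be compared with a direct numerical integration of the equation of motion (I.18). The following Python sketch is an added illustration. The function names and the choice of a fourth-order Runge-Kutta integrator are assumptions of this example; the parameter values $\omega_0 = 10$ and $\Gamma = 2$ are those quoted for figure I.1, with $a = 1$ chosen arbitrarily.

```python
import math

def step_response_exact(t, a, omega0, gamma):
    """Closed-form response (I.21) to a step F/m = a applied at t = 0."""
    omega1 = math.sqrt(omega0 ** 2 - (gamma / 2.0) ** 2)
    decay = math.exp(-gamma * t / 2.0)
    return a / omega0 ** 2 * (1.0 - decay * math.cos(omega1 * t)
                              - gamma / (2.0 * omega1) * decay * math.sin(omega1 * t))

def step_response_rk4(t_end, a, omega0, gamma, steps=2000):
    """Integrate x'' + gamma x' + omega0^2 x = a from rest using RK4."""
    def deriv(x, v):
        return v, a - gamma * v - omega0 ** 2 * x

    dt = t_end / steps
    x = v = 0.0  # initial conditions x(0) = x'(0) = 0
    for _ in range(steps):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x

numeric = step_response_rk4(1.5, 1.0, 10.0, 2.0)
exact = step_response_exact(1.5, 1.0, 10.0, 2.0)
print(numeric, exact)
```

At late times both values approach the particular solution $a/\omega_0^2$, since the transient decays as $e^{-\Gamma t/2}$.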
A square impulse can be generated by the superposition of two opposite-sign step functions separated by a time $\tau$ as shown in figure I.1.
The square impulse can be taken to the limit where the width $\tau$ is negligibly small relative to the response times of the system. It can be shown that letting $\tau \to 0$ but keeping the magnitude of the total impulse $P = F\tau$ finite for the impulse at time $t_0$, leads to the solution for the $\delta$-function impulse occurring at $t_0$
$$x(t) = \frac{P}{m\omega_1}e^{-\frac{\Gamma}{2}(t - t_0)}\sin\omega_1(t - t_0) \qquad t > t_0 \tag{I.22}$$
This response to a delta function impulse is shown in figure I.1 for the case where $t_0 = 0$. An example is the response when the hammer strikes a piano string at $t = 0$.
Figure I.2: Decomposition of the function $F(t) = 2\sin(\omega t) + \sin(5\omega t) + \frac{1}{3}\sin(15\omega t) + \frac{1}{5}\sin(25\omega t)$ into a time-ordered sequence of $\delta$-function samples.
As illustrated in figure I.2, discrete-time waveform analysis involves repeatedly sampling the instantaneous amplitude in a regular and repetitive sequence of $\delta$-function impulses. Since the superposition principle applies for this linear system, the waveform can be described by a sum of an ordered series of delta-function impulses where $t'$ is the time of an impulse. Integrating over all the $\delta$-function responses that have occurred at time $t'$, that is, prior to the time of interest $t$, leads to
$$x(t) = \int_{-\infty}^{t}\frac{F(t')}{m\omega_1}e^{-\frac{\Gamma}{2}(t - t')}\sin\omega_1(t - t')\,dt' \qquad t \ge t' \tag{I.24}$$
Superposition allows the summed response of the system to be written in an integral form
$$x(t) = \int_{-\infty}^{t}F(t')\,G(t - t')\,dt' \tag{I.26}$$
which gives the final time dependence of the forced system. This repetitive time-sampling approach avoids the need of using Fourier analysis. Note that the Green’s function $G(t - t')$ includes implicitly the frequency of the free undamped linear oscillator $\omega_0$, the frequency of the free damped linear oscillator $\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$, as well as the damping coefficient $\Gamma$. Access to the combination of fast microcomputers coupled to fast digital sampling techniques has made digital signal sampling the pre-eminent technique for signal recording of audio, video, and detector signal processing.
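The superposition integral (I.26) can be illustrated by summing delta-function responses numerically. In this added Python sketch the function names, the unit mass, and the midpoint-rule quadrature are assumptions of the example. The Green's function is taken to be the impulse response of equation (I.22) for unit impulse, and the force is the step function of equation (I.19), so the result can be compared with the closed form (I.21).

```python
import math

def green(s, omega0, gamma, mass=1.0):
    """Assumed Green's function: response at delay s >= 0 to a unit impulse."""
    omega1 = math.sqrt(omega0 ** 2 - (gamma / 2.0) ** 2)
    return math.exp(-gamma * s / 2.0) * math.sin(omega1 * s) / (mass * omega1)

def response(force, t, omega0, gamma, mass=1.0, t_start=0.0, n=4000):
    """Midpoint-rule estimate of the superposition integral.

    The lower limit is t_start instead of minus infinity because the force
    is assumed to vanish before t_start.
    """
    dt = (t - t_start) / n
    total = 0.0
    for k in range(n):
        tp = t_start + (k + 0.5) * dt
        total += force(tp) * green(t - tp, omega0, gamma, mass) * dt
    return total

def step_response_exact(t, a, omega0, gamma):
    """Closed-form step response (I.21) used as the reference."""
    omega1 = math.sqrt(omega0 ** 2 - (gamma / 2.0) ** 2)
    decay = math.exp(-gamma * t / 2.0)
    return a / omega0 ** 2 * (1.0 - decay * math.cos(omega1 * t)
                              - gamma / (2.0 * omega1) * decay * math.sin(omega1 * t))

summed = response(lambda tp: 1.0, 1.5, 10.0, 2.0)
closed = step_response_exact(1.5, 1.0, 10.0, 2.0)
print(summed, closed)
```

The same `response` function accepts any sampled force, which is the practical advantage of the time-sampling approach over a term-by-term Fourier treatment.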
Bibliography
[Ki85] T.W.B. Kibble, F.H. Berkshire. "Classical Mechanics, (5th edition)", Imperial College Press,
London, 2004. Based on the textbook written by Kibble that was published in 1966 by McGraw-
Hill. The 4th and 5th editions were published jointly by Kibble and Berkshire. This excellent
and well-established textbook addresses the same undergraduate student audience as the present
textbook. This book covers the variational principles and applications with minimal discussion of
the philosophical implications of the variational approach.
[La49] C. Lanczos, "The Variational Principles of Mechanics", University of Toronto Press, Toronto,
(1949)
An outstanding graduate textbook that has been one of the founding pillars of the field since
1949. It gives an excellent introduction to the philosophical aspects of the variational approach
to classical mechanics, and introduces the extended formulations of Lagrangian and Hamiltonian
mechanics that are applicable to relativistic mechanics.
[La60] L. D. Landau, E. M. Lifshitz, "Mechanics", Volume 1 of a Course in Theoretical Physics, Perga-
mon Press (1960)
An outstanding, succinct, description of analytical mechanics that is devoid of any superfluous
text. This Course in Theoretical Physics is a masterpiece of scientific writing and is an essential
component of any physics library. The compactness and lack of examples makes this textbook
less suitable for most undergraduate students.
[Li94] Yung-Kuo Lim, "Problems and Solutions on Mechanics" (1994)
This compendium of 408 solved problems, which are taken from graduate qualifying examinations
in physics at several U.S. universities, provides an invaluable resource that complements this
textbook for study of Lagrangian and Hamiltonian mechanics.
[Ma65] J. B. Marion, "Classical Dynamics of Particles and Systems", Academic Press, New York, (1965)
This excellent undergraduate text played a major role in introducing analytical mechanics to
the undergraduate curriculum. It has an outstanding collection of challenging problems. The 5th
edition has been published by S. T. Thornton and J. B. Marion, Thomson, Belmont, (2004).
[Me70] L. Meirovitch, "Methods of Analytical Dynamics", McGraw-Hill New York, (1970)
An advanced engineering textbook that emphasizes solving practical problems, rather than the
underlying theory.
[Mu08] H. J. W. Müller-Kirsten, "Classical Mechanics and Relativity", World Scientific, Singapore, (2008)
This modern graduate-level textbook emphasizes relativistic mechanics making it an excellent
complement to the present textbook.
[Pe82] I. Percival and D. Richards, "Introduction to Dynamics" Cambridge University Press, London,
(1982)
Provides a clear presentation of Lagrangian and Hamiltonian mechanics, including canonical
transformations, Hamilton-Jacobi theory, and action-angle variables.
[Sy60] J.L. Synge, "Principles of Classical Mechanics and Field Theory", Volume III/1 of "Handbuch
der Physik", Springer-Verlag, Berlin (1960).
A classic graduate-level presentation of analytical mechanics.
[Ta05] J. R. Taylor, "Classical Mechanics", University Science Books, Sausalito, (2006)
This undergraduate book gives a well-written descriptive introduction to analytical mechanics.
The scope of the book is limited and the problems are easy.
[Sta05] T. Stachowiak and T. Okada, Chaos, Solitons, and Fractals, 29 (2006) 417.
[Str00] S.H. Strogatz, Physica D 143 (2000) 1
[Str05] J. Struckmeier, J. Phys. A: Math. Gen. 38 (2005) 1257
[Str08] J. Struckmeier, Int. J. of Mod. Phys. E18 (2008) 79
[Win67] A.T. Winfree, J. Theoretical Biology 16 (1967) 15