
Gerald Jay Sussman and Jack Wisdom's book *Structure and Interpretation of Classical Mechanics, Second Edition*, from the MIT Press, is available free online.

Description:

We now know that there is much more to classical mechanics than previously suspected. Derivations of the equations of motion, the focus of traditional presentations of mechanics, are just the beginning. This innovative textbook, now in its second edition, concentrates on developing general methods for studying the behavior of classical systems, whether or not they have a symbolic solution. It focuses on the phenomenon of motion and makes extensive use of computer simulation in its explorations of the topic. It weaves recent discoveries in nonlinear dynamics throughout the text, rather than presenting them as an afterthought. Explorations of phenomena such as the transition to chaos, nonlinear resonances, and resonance overlap help the student develop appropriate analytic tools for understanding. The book uses computation to constrain notation, to capture and formalize methods, and for simulation and symbolic analysis. The requirement that the computer be able to interpret any expression provides the student with strict and immediate feedback about whether an expression is correctly formulated.

This second edition has been updated throughout, with revisions that reflect insights gained by the authors from using the text every year at MIT. In addition, because of substantial software improvements, this edition provides algebraic proofs of more generality than those in the previous edition; this improvement permeates the new edition.

From the 1st ed. preface:

The contents of our class began with ideas from a class on nonlinear dynamics and solar system dynamics by Wisdom and ideas about how computation can be used to formulate methodology developed in an introductory computer science class by Abelson and Sussman. When we started we expected that using this approach to formulate mechanics would be easy. We quickly learned that many things we thought we understood we did not in fact understand. Our requirement that our mathematical notations be explicit and precise enough that they can be interpreted automatically, as by a computer, is very effective in uncovering puns and flaws in reasoning. The resulting struggle to make the mathematics precise, yet clear and computationally effective, lasted far longer than we anticipated. We learned a great deal about both mechanics and computation by this process. We hope others, especially our competitors, will adopt these methods, which enhance understanding while slowing research.

It looks like a useful reference, beginning with configuration spaces, action, and Lagrangians, then moving on to general rigid bodies, quaternions, Hamiltonians, phase space, and perturbation theory.

## Comments

Great reference, thanks. Nice application of differential geometry -- to configuration spaces.


I have a basic question. The classical Lagrangian is defined to be the difference between kinetic and potential energy. The actual paths of motion are then found to be those that minimize the integral of this function over the path (the action).

So, this "works."

But is there an intuitive explanation for why it works? Does the difference between kinetic and potential energy -- at a point in time -- have a _physical_ meaning that is explainable?
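To make the statement in the question concrete, here is a minimal numerical sketch. It assumes a particle of mass 1 thrown straight up in uniform gravity that returns to its starting height after one second; the discretization, the function names, and the sinusoidal perturbations are illustrative choices, not anything taken from the book. It only shows *that* the Newtonian path has lower discretized action than nearby paths with the same endpoints, not *why*.

```python
import math

def action(path, dt, m=1.0, g=9.8):
    """Discretized action: sum of (kinetic - potential) * dt over each segment."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        v = (x1 - x0) / dt                    # segment velocity
        kinetic = 0.5 * m * v * v
        potential = m * g * 0.5 * (x0 + x1)   # potential at the segment's average height
        total += (kinetic - potential) * dt
    return total

n = 100
duration = 1.0
dt = duration / n
g = 9.8
times = [i * dt for i in range(n + 1)]

# Newtonian trajectory with x(0) = x(duration) = 0: a parabola.
true_path = [0.5 * g * t * (duration - t) for t in times]

def perturbed(eps, mode=1):
    """Add a bump that (up to rounding) vanishes at both endpoints."""
    return [x + eps * math.sin(mode * math.pi * t / duration)
            for x, t in zip(true_path, times)]

s_true = action(true_path, dt)
s_up = action(perturbed(0.1), dt)
s_down = action(perturbed(-0.1), dt)
s_wiggle = action(perturbed(0.1, mode=3), dt)

print(s_true, s_up, s_down, s_wiggle)
```

Pushing the path up, down, or adding a faster wiggle all raise the action in this example, because the kinetic term penalizes the extra velocity more than the potential term rewards the change in height.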

I don't understand it myself; maybe somebody else has a good explanation. Or maybe not; the opening quote of Sussman and Wisdom is:

> “In almost all textbooks, even the best, this principle is presented so that it is impossible to understand.” (K. Jacobi, Lectures on Dynamics, 1842-1843). I have not chosen to break with tradition.
>
> V. I. Arnold, Mathematical Methods of Classical Mechanics [5], footnote, p. 246

It would be great to have such an explanation, and I wouldn't want us to give up prematurely, but perhaps...it is an axiom of classical nature -- equivalent to Newton's laws of motion -- for which there is not a basic mechanical intuition?


Looking at the unit of action (kg m^2/s), it is the same as angular momentum, like Planck's constant. Usually action is seen as the time integral of energy, or sometimes as the spatial integral of momentum, but it also has the same unit as a time derivative of moment of inertia, or you can take the time integral of force to get momentum then the spatial integral of momentum to get angular momentum. Action also has the same units as the double spatial integral of mass flow (kg/s), or even the triple spatial integral of viscosity. I'm not sure if anybody integrates over charge, but the double charge integral of electric resistance or the charge integral of magnetic flux also have the same units as action. You could even integrate kinematic viscosity (area/s) over mass and get the units of action.

Action, like so many important quantities, has a factor of exactly length^2. So it looks like a spatial bivector sort of thing, and I have been told Lie algebras are basically bivector algebras. Other unit types that also have m^2 as an exact factor are: area, kinematic viscosity (area/s), moment of inertia, energy, power, magnetic flux, electric potential, inductance, resistance, and elastance (inverse capacitance). [Edit: Probability current has dimensions of 1/(m^2 s) and current density is in coulombs/(m^2 s). ]

Expressing action as "the time rate of change of moment of inertia" seems to me like the closest to being intelligible; minimizing the rate at which moment of inertia changes seems like a plausible rule.

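The unit bookkeeping in the comment above is easy to check mechanically. The sketch below represents each quantity by its SI base-unit exponents (kg, m, s, A) and multiplies quantities by adding exponents. The helper names are made up for illustration, and "integral over X" is treated simply as "multiply by the unit of X".

```python
def dim(kg=0, m=0, s=0, A=0):
    """Dimension as a tuple of exponents of (kg, m, s, A)."""
    return (kg, m, s, A)

def times(*dims):
    """Multiply quantities: add their exponents."""
    return tuple(sum(parts) for parts in zip(*dims))

def power(d, n):
    """Raise a quantity to an integer power: scale its exponents."""
    return tuple(n * e for e in d)

MASS = dim(kg=1)
LENGTH = dim(m=1)
TIME = dim(s=1)
CHARGE = dim(s=1, A=1)                       # coulomb = ampere second

ENERGY = dim(kg=1, m=2, s=-2)
MOMENTUM = dim(kg=1, m=1, s=-1)
MOMENT_OF_INERTIA = dim(kg=1, m=2)
MASS_FLOW = dim(kg=1, s=-1)
VISCOSITY = dim(kg=1, m=-1, s=-1)            # pascal second
KINEMATIC_VISCOSITY = dim(m=2, s=-1)
RESISTANCE = dim(kg=1, m=2, s=-3, A=-2)      # ohm
MAGNETIC_FLUX = dim(kg=1, m=2, s=-2, A=-1)   # weber

ACTION = dim(kg=1, m=2, s=-1)

candidates = {
    "time integral of energy": times(ENERGY, TIME),
    "spatial integral of momentum": times(MOMENTUM, LENGTH),
    "time derivative of moment of inertia": times(MOMENT_OF_INERTIA, power(TIME, -1)),
    "double spatial integral of mass flow": times(MASS_FLOW, power(LENGTH, 2)),
    "triple spatial integral of viscosity": times(VISCOSITY, power(LENGTH, 3)),
    "double charge integral of resistance": times(RESISTANCE, power(CHARGE, 2)),
    "charge integral of magnetic flux": times(MAGNETIC_FLUX, CHARGE),
    "mass integral of kinematic viscosity": times(KINEMATIC_VISCOSITY, MASS),
}

all_match = all(d == ACTION for d in candidates.values())
for name, d in candidates.items():
    print(name, d == ACTION)
```

Matching units only show that these combinations are dimensionally possible readings of action; they say nothing about whether any of them is physically meaningful.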

My favorite explanation is in [Brian Lee Beers: Geometric Nature of Lagrange's Equations](http://dx.doi.org/10.1119/1.1987001). It starts with Newton's second law and projects it onto the coordinate basis vectors. For a single particle this is

$$ \mathbf{F} \cdot \frac{\partial \mathbf{r}}{\partial q^i} = m \mathbf{\ddot{r}} \cdot \frac{\partial \mathbf{r}}{\partial q^i} $$

After converting to components, the Lagrange equations just drop out after a little algebraic manipulation. It is never assumed that $T - V$ is minimized or even special in any way.

The derivation generalizes to arbitrary systems, including multiple particle systems and rigid bodies, by replacing the mass term with the inertia tensor in Newton's second law. In traditional tensor notation with the Einstein convention this is:

$$ F_i = I_{ij} \ddot{q}^j $$

A similar derivation is also in [James Casey: Geometrical derivation of Lagrange’s equations for a system of particles](http://dx.doi.org/10.1119/1.17470). Casey wrote a series of follow-on papers extending the idea to rigid bodies, fluid dynamics, ...

There is a more advanced version of this idea dating back at least to [Synge: On the geometry of dynamics](http://www.math.cornell.edu/~rand/randdocs/classics/synge_geometry_of_dynamics.pdf), where the inertia tensor is treated as the Riemannian metric on the configuration manifold. The principle of least action then becomes equivalent to the principle of least distance on the configuration manifold under this metric. In particular, conservative systems follow geodesic trajectories in the intrinsic geometry of the configuration manifold under the inertia metric. By itself that is not a huge gain in insight, but in the absence of torsion, geodesic trajectories are characterised locally by $\nabla_\mathbf{\hat{v}} \mathbf{\hat{v}} = 0$ where $\mathbf{\hat{v}}$ is the unit tangent vector of the path, i.e. geodesic trajectories have a constant intrinsic direction, or: the straightest paths are the shortest. This shows the relationship between the Principle of Least Action and Hertz's Principle of Least Curvature. It also largely takes the teleological voodoo out of the Principle of Least Action.

Geodesics are equivalent to paths of least curvature only in the absence of torsion. [Gabriel Kron](http://dx.doi.org/10.1063/1.1745376) and more recently Hagen Kleinert have shown that, in the presence of torsion, it is the principle of least curvature that is correct.

[Crouch: Geometric structures in systems theory](http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4642080&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D4642080) also has a nice exposition of geometrical views of dynamics.

This also shows that the relationship of the Lagrangian and Newtonian formulations of classical mechanics is really that of intrinsic and extrinsic approaches to differential geometry.

The same question was also asked on Quora and has an [answer](https://www.quora.com/Why-is-Lagrangian-defined-as-Kinetic-energy-minus-potential-energy/answer/Dan-Piponi?srid=uYrm) by Dan Piponi.

I prefer the Beers explanation though.

Thanks Daniel, that's a lot of good material to think about!

The view of paths of least action as geodesics has a ring of elegance.

I'm also interested in your reference to Dan Piponi's explanation (previous message), because it is self-contained and sounds basic.

He uses motion through discrete time as a model to explain the meaning of action. Okay. Then differentiation gets replaced with the matrix for the finite difference operator, $D$.

Then he says:

> Notice how apart from the edges, $D$ is minus its own transpose. Ultimately, _this_ is the minus sign that appears in the definition of the Lagrangian.

It sounds promising, but can someone explain what "apart from the edges, $D$ is minus its own transpose" means?

To me, $D$ doesn't look at all like minus its own transpose, even "in the middle" of the matrix:

* The diagonals of $D$ and of $D^T$ are identical -- they are filled with -1 values.
* On the off-diagonal, where $D$ has a 1, $D^T$ has a zero.

Maybe he means this:

If you remove the first row from $D^T$, then that equals the minus of $D$.


I think Dan Piponi's point is that $D$ is a finite discrete approximation to a continuous operator. As you approach the continuum limit, it gets harder to tell the difference between $D^T$ and $-D$ "from a distance" so to speak. Both have a diagonal line of -1s adjacent to a diagonal line of 1s just below it. The only difference is a small shift in which line is the actual diagonal.

I think his argument could be simplified if $D$ had the -1s on the subdiagonal instead of the diagonal. It would still be a valid approximation of the differential, and it really would be antisymmetric.
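The two observations above can be checked on a small matrix. The sketch assumes $D$ is the forward-difference matrix with -1 on the diagonal and +1 on the superdiagonal, which is consistent with the descriptions in this thread but is not copied from the Quora answer. It then builds the central-difference variant $(D - D^T)/2$, which has its $\pm\tfrac{1}{2}$ entries on the sub- and superdiagonal. All function names are illustrative.

```python
def forward_difference(n):
    """n x n matrix with -1 on the diagonal and +1 on the superdiagonal (assumed form of D)."""
    return [[-1 if j == i else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def negate(m):
    return [[-v for v in row] for row in m]

n = 5
D = forward_difference(n)
DT = transpose(D)
minus_DT = negate(DT)

# Away from the top edge, row i of -D^T equals row i-1 of D:
# the same two bands of numbers, shifted by one row.
shift_matches = all(minus_DT[i] == D[i - 1] for i in range(1, n))

# Central difference: +1/2 on the superdiagonal, -1/2 on the subdiagonal.
C = [[(D[i][j] - DT[i][j]) / 2 for j in range(n)] for i in range(n)]
is_antisymmetric = all(C[i][j] == -C[j][i] for i in range(n) for j in range(n))

for row in D:
    print(row)
print(shift_matches, is_antisymmetric)
```

So $D \neq -D^T$ entry by entry, but the two differ only by a one-row shift, while the central-difference matrix is antisymmetric exactly.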

Here is another way to look at where $L$ comes from. For a dissipative system it is more natural to write the Lagrange equations just in terms of the kinetic energy $T$ instead of $L$:

$$ \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial T}{\partial q_i} = F_i = \mathbf{F} \cdot \frac{\partial \mathbf{r}}{\partial q^i} $$

This is the equation that Beers actually derives. It is the general form. The conservative case is obtained by setting $ \mathbf{F} = -\nabla V$.

$$ \begin{align} \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial T}{\partial q_i} &= - \frac{\partial V}{\partial q_i}, \text{ since } \frac{\partial V}{\partial q_i} = \nabla V \cdot \frac{\partial \mathbf{r}}{\partial q^i} \\ \therefore \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial (T - V) }{\partial q_i} &= 0 \\ \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial L }{\partial q_i} &= 0 \end{align} $$

Now $\frac{\partial V}{\partial \dot{q}_i} = 0 $ since by definition $V$ is a function of only the $q_i$s and thus independent of $\dot{q}_i$, so

$$ \begin{align} \frac{d}{dt} \left( \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial V}{\partial \dot{q}_i} \right) - \frac{\partial L }{\partial q_i} &= 0 \\ \frac{d}{dt} \frac{\partial (T - V)}{\partial \dot{q}_i} - \frac{\partial L }{\partial q_i} &= 0 \\ \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L }{\partial q_i} &= 0 \end{align} $$

Seen this way $L$ looks like a bit of a hack. It is really an artifact of the system being conservative, and even then replacing $T$ by $L$ in the first term of the equation is kind of redundant.
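As a sanity check of the $T$-only form, here is a worked example for a single particle of mass $m$ moving in a plane, described in polar coordinates $(r, \theta)$. The choice of coordinates and the symbols $F_r$, $F_\theta$, $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$ are assumptions made for illustration and are not taken from Beers' paper. The kinetic energy is

$$ T = \tfrac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) $$

and the basis vectors are $\frac{\partial \mathbf{r}}{\partial r} = \hat{\mathbf{r}}$ and $\frac{\partial \mathbf{r}}{\partial \theta} = r \hat{\boldsymbol{\theta}}$, so the generalized forces are $F_r = \mathbf{F} \cdot \hat{\mathbf{r}}$ and $r F_\theta = r\, \mathbf{F} \cdot \hat{\boldsymbol{\theta}}$. Substituting into the general equation gives

$$ \begin{align} \frac{d}{dt} \frac{\partial T}{\partial \dot{r}} - \frac{\partial T}{\partial r} &= m \ddot{r} - m r \dot{\theta}^2 = F_r \\ \frac{d}{dt} \frac{\partial T}{\partial \dot{\theta}} - \frac{\partial T}{\partial \theta} &= \frac{d}{dt} \left( m r^2 \dot{\theta} \right) = m r^2 \ddot{\theta} + 2 m r \dot{r} \dot{\theta} = r F_\theta \end{align} $$

These are the radial and tangential components of Newton's second law in polar coordinates, with the centripetal and Coriolis terms appearing automatically. No potential was needed, so the force can be dissipative.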

I have also written this up as an [answer](https://www.quora.com/Why-is-Lagrangian-defined-as-Kinetic-energy-minus-potential-energy/answer/Daniel-Mahler?prompt_topic_bio=1#) to the Quora question.

In that answer on Quora, you conclude:

> Seen this way defining $L=T−V$ looks like a bit of a hack to tidy up the equations for a conservative system rather than something fundamental. One can just use $L=T$ at least for classical mechanics.

Yet nature treats it as fundamental, because this is the magnitude whose path integral gets minimized.

Perhaps kinetic-minus-potential-energy is a primitive magnitude for which our brains have not evolved an intuition -- a foundation for a more abstract view of nature. So it would be fundamental, but not easy to relate to the primitive concepts that biology has pre-wired us to understand, like pushes, pulls, and speeds.

If someone were to tell me that in a system of colliding particles, the sum of the masses times velocities is always preserved, I might at first wonder whether there is some meaning to the product of mass and velocity -- or if this was just one of nature's abstract invariants. But we've given a nice name to that product, and it can be related to a Newtonian intuition about how much "oomph" a moving mass has. Furthermore, it is intuitive that oomph is proportional to mass, and also to velocity. And, it even "makes sense" that the total oomph in the system is preserved.

But $m v$ could be on an equal footing with $T - V$, in terms of how intrinsically fundamental they are.

This view is also strengthened by the fact that the principle of least action can be used to derive Newton's laws of motion.


This is the reasoning behind my statements, which were qualified to classical mechanics.

The Beers derivation shows that

$$ \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial T}{\partial q_i} = F_i = \mathbf{F} \cdot \frac{\partial \mathbf{r}}{\partial q^i} $$

is the equation that is equivalent to Newton's laws. That already seems to make $T$ a kind of Lagrangian, since this is an inhomogeneous Euler-Lagrange equation. This equation holds for dissipative as well as conservative systems.

Conservation of energy is expressed by requiring $\mathbf{F} = -\nabla V$, yielding

$$ \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial T}{\partial q_i} = - \frac{\partial V}{\partial q_i} $$

Using $L = T - V$ is a device to make the conservative form of the equation homogeneous, i.e. to remove the RHS force term. This is what enables mechanics to be formulated as a minimum principle. To me that part of the derivation looks a little contrived though (which of course does not mean a whole lot in the greater scheme of things :) )

Also, while it is mathematically very beautiful, I have always found the principle of least action somewhat unsatisfying as a physical axiom (I am very happy with it as a derived theorem), and I really like Synge's geometrical reformulation as

$$\nabla_\mathbf{\hat{v}} \mathbf{\hat{v}} = 0$$

which seems like a much smaller leap of faith. If $T$ is used to construct the metric instead of $L$, Synge's approach yields

$$\nabla_\mathbf{\hat{v}} \mathbf{\hat{v}} = F_i$$

which is still very elegant, and what little it loses in elegance it makes up in physical insight, namely that deviations from geodesy are due to applied forces.

This is why I tend to view using $L = T - V$ primarily as a technical device to make the math prettier at the expense of physical insight, and to view $T$-based formulations as more fundamental, though this is of course subjective.

This view is specific to classical mechanics, as I qualified in the earlier statements. My understanding is that the force concept does not really make sense in quantum theories, and minimum principles are the only game in town then. However, my understanding of quantum theories is much more superficial.

I think rather that it is a fundamental principle. For me it means that the balance between potential (simplified stored ) and kinetic (simplified spent) energy should on average be kept as close as possible despite constraints.


> it means that the balance between potential (simplified stored ) and kinetic (simplified spent) energy should on average be kept as close as possible despite constraints.

Is that what it is saying though? It is the difference that is being minimized, not the absolute difference. That seems to favor big gaps in favour of PE.

The principle of least action is really the principle of stationary action, which does not have any preference for the actual size of the gap.

Even if it is true, how would one justify it as an axiom, as opposed to deriving it from other principles?
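The "stationary, not least" point can be illustrated numerically. The sketch below assumes a harmonic oscillator with $m = k = 1$ (so half a period is $\pi$) and looks at the path that stays at rest at $x = 0$, which satisfies the equation of motion and has zero action. It then evaluates the discretized action of a small bump with the same endpoints, for a short and for a long time interval. The function names and the bump shape are illustrative choices.

```python
import math

def oscillator_action(path, dt, m=1.0, k=1.0):
    """Discretized action for L = m v^2 / 2 - k x^2 / 2."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        v = (x1 - x0) / dt
        x_mid = 0.5 * (x0 + x1)
        total += (0.5 * m * v * v - 0.5 * k * x_mid * x_mid) * dt
    return total

def bump_action(duration, eps=0.1, n=400):
    """Action of x(t) = eps * sin(pi t / duration), a perturbation of the rest path x = 0."""
    dt = duration / n
    path = [eps * math.sin(math.pi * i / n) for i in range(n + 1)]
    return oscillator_action(path, dt)

half_period = math.pi                       # for m = k = 1
rest_action = oscillator_action([0.0] * 401, 0.01)

short = bump_action(0.5 * half_period)      # interval shorter than half a period
long_ = bump_action(1.5 * half_period)      # interval longer than half a period

print(rest_action, short, long_)
```

For the short interval the bump raises the action above zero, so the rest path is a local minimum there. For the long interval the same kind of bump lowers the action below zero, so the rest path is stationary but not a minimum.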

Axioms, by definition, are not required to have a justification. They need to agree with the evidence, but that's it. For instance, the axioms of quantum theory.

My understanding is that the principle of least action and Newton's laws can each be used to prove the other. So which is the axiom and which is the theorem is a matter of choice and perspective. That Newton's laws appear more axiomatic to us than the principle of least action may have only to do with the kind of intuitions that evolution has hard-wired into our brains. Pushes, pulls, and accelerations are magnitudes that animals need to immediately process in order to survive.

> Is that what it is saying though? It is the difference that is being minimized, not the absolute difference. Seems to favor big gaps in favour of PE.

The difference is extremized (on average). What I had in mind was [Gauss's principle of least constraint](https://en.wikipedia.org/wiki/Gauss's_principle_of_least_constraint). That is, if you can express your force as $$F= - \nabla U$$ then you have some kind of balancing condition between a kinetic term and a potential term.

David wrote:

> But is there an intuitive explanation for why it works?

Here's the first great realization about energy:

**Kinetic energy** measures how much is actually happening. **Potential energy** measures how much could be happening, but isn't. The sum of these two, called **energy**, is conserved: when more is happening, energy moves from potential to kinetic form.

But this isn't enough to derive the laws of physics. For that we need the second great realization:

It's also important to think about the difference: kinetic energy minus potential energy. This is called the **Lagrangian**. It measures how much is happening, minus how much could be but isn't. If we integrate this over time we get the **action**, which is very well named. It's the total amount that happened over some interval of time, minus the amount that could have happened but didn't. Nature tries to minimize this. More precisely: if we fix initial and final positions of some collection of objects at the beginning and end of some interval of time, Nature will take the path from initial to final positions that minimizes the action.

A wonderful fact is that this principle, the principle of least action, _implies_ conservation of energy.

I'm not sure what it would mean for you to have an intuitive explanation of "why this works". For that, you need to have some pre-established concept of what it means for it to "work". If for example you already believe in Newton's $F = m a$, then maybe deriving that principle from the principle of least action would be satisfying. You can't derive the principle of least action from $F = m a$ because not all forces are allowed by the principle of least action. In that sense the principle of least action is "deeper". But you can see exactly which forces are allowed, and why.

I'm very slowly writing a book on this stuff, and you can see a draft here:

• John Baez, Blair Smith and Derek Wise, [Lectures on Classical Mechanics](http://math.ucr.edu/home/baez/classical/#lagrangian).

There's a bunch of talk near the beginning, and it already includes some of what I just said, and a lot of other stuff, like a historical introduction to the principle of least action, and how it emerged from the simpler "principle of least potential" in classical _statics_.

Someday I should include a better answer to your question! But it would help to know what your question actually means - that is, what could count as a satisfying answer. There is, of course, a limit on how much we can explain why the Universe has to be the way it is. This limit may change with time, as we gain deeper understandings, but there's always some limit.
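As a worked illustration of deriving $F = m a$ from the principle of least action, here is the standard one-dimensional calculation. It assumes a single particle of mass $m$ at position $x(t)$ with potential energy $V(x)$; these symbols are chosen for the sketch and are not taken from the comment above. Requiring the action to be stationary under small variations of the path with fixed endpoints gives the Euler-Lagrange equation, and substituting the Lagrangian $L = T - V$ into it yields Newton's law for a force derived from a potential:

$$
\begin{aligned}
L(x, \dot{x}) &= \tfrac{1}{2} m \dot{x}^2 - V(x), \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} &= 0, \\
m \ddot{x} &= -\frac{dV}{dx} = F .
\end{aligned}
$$

Only forces of the form $F = -dV/dx$ show up this way, which is one way to read the remark that not all forces are allowed by the principle of least action.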

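David wrote:

> Pushes, pulls, and accelerations are magnitudes that animals need to immediately process in order to survive.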

If you try to explain acceleration to students, you'll see how unintuitive this concept can be. Most students, even after a course on classical mechanics, can't predict what will happen when you swing a ball around on a string and the string suddenly breaks.

And if you show them a big car pushing a little car through some mud and ask them to compare the force exerted by the big car on the little one to the force exerted by the little car on the big one, they'll usually get it wrong.

Classical mechanics is profoundly counterintuitive at first, which is why it required geniuses like Galileo, Newton and Leibniz to invent it and overthrow the more intuitive Aristotelian picture of physics.

It is hard to define an "intuitive explanation" for why something is true, but one knows it when it is at hand.

Example: Why does a pin puncture a strong rubber sheet? The force is applied to a very small area, so the pressure $F/A$ is enormous.

Of course, intuition is elastic, it can be trained, and such explanations might or might not exist.

Now suppose that someone came up with a theory for rotational motion which postulated that the square root of torque divided by the cube of the moment of inertia is minimized along all paths. Let's call this magnitude $Z$. Now I do have a physical intuition for what torque is, and what the moment of inertia is, but it's hard to get a _physical_ sense of what $Z$ could be designating -- it just looks like a mathematical combination. On the other hand, $(1/2) m v^2$ is also a mathematical combination, but I have a physical sense of what it quantifies.

That was my question about the meaning of action $T - V$. It feels like a mathematical combination, not a physical concept. So, although the principle of least action is elegant and powerful, I don't find the statement of it to be physically intuitive.

I could say that, one-hundred times over, for quantum theory.

That may be just the way the cookie crumbles. And it may say more about the subjectivity of intuition -- which is conditioned by evolution and experience -- than about the physics itself.

Sidenote: because potential energy doesn't have an objective zero point, the sign of $T - V$ is not objective.

On the other hand, here is a formulation -- which is logically equivalent -- that I find more intuitively elementary:

1. Energy = $T + V$ is conserved.
2. Nature seeks paths that minimize kinetic energy $T$. Or, equivalently, that maximize potential energy $V$.

The simpler statement (2) becomes possible, once we take the conservation of energy as given.

How does this formulation compare to the statement that action is minimized?

Logically, they are _identical_, but:

* That $T - V$ is minimized is more parsimonious, being that it is one statement.
* This formulation is more "direct" in the sense that it makes simple statements about magnitudes that we are already familiar with -- rather than about an unfamiliar combination of familiar magnitudes.

But now that I see that these are the same, the standard statement is looking more intuitive to me! :)

Total energy is conserved, and potential is preferred over kinetic. Nature is lazy.

Example: when you throw a ball upwards, it moves quickly towards the higher altitude, and there slows down, in order to maximize the time that is spent at high potential and low kinetic energy.

Remember that Sussman's goal for that text was not necessarily physics but this:

> "The task of formulating a method as a computer-executable program and debugging that program is a powerful exercise in the learning process. Also, once formalized procedurally, a mathematical idea becomes a tool that can be used directly to compute results." Sussman and Wisdom, with Meinhard Mayer, have produced a textbook, Structure and Interpretation of Classical Mechanics, to capture these new ideas.

I do most of my work in Prolog, but of course Sussman prefers Lisp or Scheme, deriving from the Abelson & Sussman text on Lisp programming.

Here is a comparison of a Sussman algorithm in Lisp and then my version in Prolog.

> The [following puzzle](https://mitpress.mit.edu/sicp/full-text/sicp/book/node90.html) (taken from Dinesman 1968) is typical of a large class of simple logic puzzles:
>
> Baker, Cooper, Fletcher, Miller, and Smith live on different floors of an apartment house that contains only five floors. Baker does not live on the top floor. Cooper does not live on the bottom floor. Fletcher does not live on either the top or the bottom floor. Miller lives on a higher floor than does Cooper. Smith does not live on a floor adjacent to Fletcher's. Fletcher does not live on a floor adjacent to Cooper's. Where does everyone live?

We can determine who lives on each floor in a straightforward way by enumerating all the possibilities and imposing the given restrictions:

```scheme
(define (multiple-dwelling)
  (let ((baker (amb 1 2 3 4 5))
        (cooper (amb 1 2 3 4 5))
        (fletcher (amb 1 2 3 4 5))
        (miller (amb 1 2 3 4 5))
        (smith (amb 1 2 3 4 5)))
    (require
     (distinct? (list baker cooper fletcher miller smith)))
    (require (not (= baker 5)))
    (require (not (= cooper 1)))
    (require (not (= fletcher 5)))
    (require (not (= fletcher 1)))
    (require (> miller cooper))
    (require (not (= (abs (- smith fletcher)) 1)))
    (require (not (= (abs (- fletcher cooper)) 1)))
    (list (list 'baker baker)
          (list 'cooper cooper)
          (list 'fletcher fletcher)
          (list 'miller miller)
          (list 'smith smith))))
```

Evaluating the expression `(multiple-dwelling)` produces the result

`((baker 3) (cooper 2) (fletcher 4) (miller 5) (smith 1))`

And here is my version in Prolog, which I didn't transcribe from his Lisp but directly from the textual description of the problem.

```prolog
% all_distinct/1 comes from library(clpfd) (assuming SWI-Prolog).
:- use_module(library(clpfd)).

floors_live(Baker, Cooper, Fletcher, Miller, Smith) :-
    % Generate: each person lives on one of the five floors.
    member(Baker,    [1,2,3,4,5]),
    member(Cooper,   [1,2,3,4,5]),
    member(Fletcher, [1,2,3,4,5]),
    member(Miller,   [1,2,3,4,5]),
    member(Smith,    [1,2,3,4,5]),
    all_distinct([Baker, Cooper, Fletcher, Miller, Smith]),
    % Test: the restrictions stated in the puzzle.
    Baker \= 5,
    Cooper \= 1,
    Fletcher \= 5,
    Fletcher \= 1,
    Miller > Cooper,
    abs(Smith - Fletcher) > 1,
    abs(Fletcher - Cooper) > 1.
```

Thanks for the Dinesman link. I get the same answer as you with this code. I purposely left it as just a test, so that extending it can maybe be used as an exercise for some Haskell students.

```haskell
-- A resident's name and the floor they live on.
data Lives = Lives { n :: String, fl :: Int }

-- The known answer, hard-coded in the order
-- Baker, Cooper, Fletcher, Miller, Smith.
db :: [Lives]
db = [b, c, f, m, s]

b, c, f, m, s :: Lives
b = Lives { n = "b",   fl = 3 }
c = Lives { n = "c",   fl = 2 }
f = Lives { n = "fl1", fl = 4 }
m = Lives { n = "m",   fl = 5 }
s = Lives { n = "s",   fl = 1 }

name :: Lives -> String
name (Lives n _) = n

flr :: Lives -> Int
flr (Lives _ f) = f

-- True when the assignment satisfies the puzzle's constraints.
isOK :: [Lives] -> Bool
isOK db =
     flr (db !! 0) /= 5                       -- Baker is not on the top floor
  && flr (db !! 1) /= 1                       -- Cooper is not on the bottom floor
  && flr (db !! 2) /= 5                       -- Fletcher is not on the top floor
  && flr (db !! 2) /= 1                       -- ... nor on the bottom floor
  && flr (db !! 3) > flr (db !! 1)            -- Miller lives above Cooper
  && abs (flr (db !! 4) - flr (db !! 2)) > 1  -- Smith is not adjacent to Fletcher
  && abs (flr (db !! 2) - flr (db !! 1)) > 1  -- Fletcher is not adjacent to Cooper

main :: IO ()
main = print $ isOK db
```

Running `main` prints `True`.

Jim, Does your Haskell code have the automatic enumerating of all possibilities, or is it simply checking the truth of the initial assertion of where each person lives?

The Prolog enumerates and selects by an automatic backtracking mechanism, while the Lisp I believe does an expansion on the list elements (the first let statement) and culls only those possibilities that meet the constraints.

That is the difference between declarative (Prolog) and functional (Lisp) programming right there.
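The "enumerate everything, then cull" strategy mentioned above can be sketched in a few lines of Python. This is only an illustration of generate-and-test, not a translation of the `amb` evaluator or of Prolog's backtracking. The function name `multiple_dwelling` is borrowed from the Lisp version, and iterating over `itertools.permutations` is a shortcut chosen here that builds the "different floors" requirement into the generator instead of testing it afterwards.

```python
from itertools import permutations


def multiple_dwelling():
    """Yield every floor assignment that satisfies the puzzle's constraints."""
    # Each permutation of 1..5 puts the five people on five distinct floors.
    for baker, cooper, fletcher, miller, smith in permutations(range(1, 6)):
        if (
            baker != 5                          # Baker not on the top floor
            and cooper != 1                     # Cooper not on the bottom floor
            and fletcher not in (1, 5)          # Fletcher not on top or bottom
            and miller > cooper                 # Miller above Cooper
            and abs(smith - fletcher) != 1      # Smith not adjacent to Fletcher
            and abs(fletcher - cooper) != 1     # Fletcher not adjacent to Cooper
        ):
            yield {
                "baker": baker,
                "cooper": cooper,
                "fletcher": fletcher,
                "miller": miller,
                "smith": smith,
            }


solutions = list(multiple_dwelling())

if __name__ == "__main__":
    for solution in solutions:
        print(solution)
```

The search visits all 120 permutations and keeps the ones that pass every test; for this puzzle exactly one assignment survives, the same one given by the Lisp and Prolog versions.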

I am bringing this up because what I am going to try to do with the ENSO analysis is write a Prolog rule that will automatically test all the lunisolar frequencies and apply biennial and yearly modulation sideband splits to those frequencies and determine where they best align with the ENSO spectra. This will use recursion as the splits can propagate.

Data analysis is often exploratory analysis and that's one of the reasons I am using a language with all the search mechanisms ready to apply.

Paul

Paul, I just wrote a test to check that my constraints were correct by hard-coding the correct answer. I help an old friend tutor students in Haskell, and thought it might be a good exercise for them to code an exhaustive search, given only the correct answer. I'll look at how you and Lisp coded the real answer and maybe comment on functional logic programming when I've got the code in #27 to format like literate Haskell :).

If you are interested in doing logic programming (Prolog etc.) in functional languages (Haskell, Scheme, Lisp, ...), look at Kanren, MiniKanren and "backtracking monads". The [MiniKanren](http://minikanren.org/) and [Oleg Kiselyov](http://okmij.org/) homepages seem to have links to everything relevant to the subject.

Good suggestions, Daniel. If anybody else here is interested in functional logic programming, I've added the needed Applicative and Alternative instances to Dan Doel's logict package, which implements Oleg's LogicT paper, and will post them on github.com/jimstutt. Wren Romano posted a list of useful links:

* http://web.cecs.pdx.edu/~mpj/thih/
* http://web.cecs.pdx.edu/~sheard/papers/generic.ps
* http://www.cs.chalmers.se/~emax/wired/documents/LP_HFL07.pdf
* http://okmij.org/ftp/papers/LogicT.pdf
* http://citeseer.ist.psu.edu/318776.html
* http://citeseer.ist.psu.edu/claessen00typed.html
* http://www.curry-language.org/

I also fixed the code in the blog post Wren commented on:

http://propella.blogspot.co.uk/2009/04/prolog-in-haskell.html

David wrote:

> On the other hand, here is a formulation -- which is logically equivalent -- that I find more intuitively elementary:
>
> 1. Energy = $T + V$ is conserved.
> 2. Nature seeks paths that minimize kinetic energy $T$. Or, equivalently, that maximize potential energy $V$.

That's fine as far as it goes. Everyone who understands the principle of least action has thought about this. It's important.

But it's really wonderful how

"Nature takes paths that minimize T - V"

implies _both_ of these. Indeed, _all_ conservation laws can be derived from the principle of least action together with symmetries, thanks to Noether's theorem - and symmetry under time translation gives conservation of energy. So we wind up deciding the principle of least action is fundamental, even if it takes a while for each one of us to come around to that viewpoint.

Have you read [Feynman's little story](http://www.feynmanlectures.caltech.edu/II_19.html) about how he learned about the principle of least action in high school? Somewhere else he wrote about how he fought it, because he found it unintuitive. He didn't like it, so he would cleverly solve all his mechanics problems using just $F = ma$, which is usually much harder. But ironically, one of his lasting claims to fame is understanding how quantum mechanics generalizes the principle of least action: in fact nature takes _all_ paths, each with amplitude $\exp(i S / \hbar)$, where $S$ is the action and $\hbar$ is Planck's constant.
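The link between time-translation symmetry and energy conservation can be checked directly in the simplest case. The sketch below assumes one particle in one dimension with a Lagrangian $L(x, \dot{x})$ that has no explicit dependence on $t$; the symbol $E$ and the choice $L = \tfrac{1}{2} m \dot{x}^2 - V(x)$ are introduced here for illustration. Differentiating $L$ along a path and using the Euler-Lagrange equation $\frac{\partial L}{\partial x} = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}}$ gives:

$$
\begin{aligned}
\frac{dL}{dt} &= \frac{\partial L}{\partial x}\,\dot{x} + \frac{\partial L}{\partial \dot{x}}\,\ddot{x}
 = \frac{d}{dt}\!\left(\dot{x}\,\frac{\partial L}{\partial \dot{x}}\right), \\
\frac{dE}{dt} &= 0 \quad \text{where} \quad E = \dot{x}\,\frac{\partial L}{\partial \dot{x}} - L, \\
E &= m \dot{x}^2 - \left(\tfrac{1}{2} m \dot{x}^2 - V(x)\right) = \tfrac{1}{2} m \dot{x}^2 + V(x) = T + V .
\end{aligned}
$$

So for this Lagrangian the conserved quantity attached to time translation is exactly the total energy $T + V$ from statement 1 in the quote.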

From the Feynman lectures:

> "In other words, the laws of Newton could be stated not in the form F=ma but in the form: the average kinetic energy less the average potential energy is as little as possible for the path of an object going from one point to another."

In [19](https://forum.azimuthproject.org/discussion/comment/15298/#Comment_15298) I wrote deliberately:

> The difference is extremized (on average).

and not "is minimized on average" (as Feynman says). Why? Because it seems the action may not always be minimized.

Arnold has an example where trajectories are geodesics, and, like the geodesics on the sphere, those need not be shortest. Moreover, I just checked an old undergraduate physics textbook, and there it is even stated that if one has constraints then the action is always maximized. This "theorem" is, however, stated there without proof or reference, so I don't know how trustworthy it is.

The Sussman and Wisdom text discusses some interesting, basic techniques for searching for a path of least action, using a software-experimental approach.

Suppose you have a general path $(t,x(t))$, along with a Lagrangian function, which when integrated over the path, gives the action for that path.

Suppose that $x(t)$ is a point in $n$-dimensional space.

Suppose that we constrain the endpoints of the path to $(t_0,x_0)$ and $(t_1,x_1)$, and we want to search for a path between these points that minimizes the action.

They describe a search approach that is based on interpolation.

Suppose that we have a given interpolation algorithm. (They mention polynomial/spline interpolation.)

For fixed $k$, let $u_1, ..., u_k$ be an increasing sequence of values intermediate between $t_0$ and $t_1$.

Then by choosing values for $x(u_1), ..., x(u_k)$, the interpolation algorithm gives us a path, which we can then feed into the action function, to get a numerical value for the action on that path.

This gives us a $k \cdot n$ dimensional space of paths to search through. We can then apply e.g. a gradient descent search algorithm to find a path of (locally) minimum action.
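The search described above can be sketched in Python for a one-dimensional example. This is not the code from the Sussman and Wisdom text, and it makes several assumptions of its own: the Lagrangian is that of a ball in uniform gravity, $L = \tfrac{1}{2} m \dot{x}^2 - m g x$; the interpolation is piecewise linear through equally spaced times instead of polynomial or spline; and the search is plain gradient descent with a fixed step size. The names `action` and `minimize_action` and all parameter defaults are hypothetical.

```python
def action(xs, t0, t1, m=1.0, g=9.8):
    """Action of the piecewise-linear path through equally spaced samples xs."""
    dt = (t1 - t0) / (len(xs) - 1)
    total = 0.0
    for a, b in zip(xs, xs[1:]):
        kinetic = 0.5 * m * ((b - a) / dt) ** 2   # constant speed on a segment
        potential = m * g * 0.5 * (a + b)         # mean height on a linear segment
        total += (kinetic - potential) * dt
    return total


def minimize_action(k, x0=0.0, x1=0.0, t0=0.0, t1=10.0, m=1.0, g=9.8, steps=5000):
    """Gradient descent over k interior points with the endpoints held fixed."""
    dt = (t1 - t0) / (k + 1)
    # Start from the straight line joining the two endpoints.
    xs = [x0 + (x1 - x0) * i / (k + 1) for i in range(k + 2)]
    # The action is quadratic in xs; this step size is below the stability limit.
    rate = dt / (4.0 * m)
    for _ in range(steps):
        # Partial derivative of the discretized action with respect to each
        # interior point, computed before any point is moved.
        grad = [
            m * (2.0 * xs[j] - xs[j - 1] - xs[j + 1]) / dt - m * g * dt
            for j in range(1, k + 1)
        ]
        for j, gj in enumerate(grad, start=1):
            xs[j] -= rate * gj
    return xs


if __name__ == "__main__":
    for k in (1, 3, 7, 15):
        path = minimize_action(k)
        print(k, action(path, 0.0, 10.0))
```

For this Lagrangian with both endpoints at height zero, the exact least action is $-m g^2 T^3 / 24$ with $T = t_1 - t_0$. The values printed for increasing $k$ should approach it from above, since piecewise-linear paths are only a subset of all paths.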

That of course only covers the range of paths that are generated by the interpolation algorithm, with $k$ intermediate points.

So it gives an upper bound on the least action.

The next step they describe is to run a loop in which $k$ is increased, and observe how these upper bounds decrease.

In the example they describe, it clearly converges to the value for least action which is obtained by analysis.

As an exercise, they ask the following interesting question.

Suppose that the interpolation algorithm is constrained so as to produce paths that are not physically possible. For instance, for the situation where a ball is thrown into the air from the ground and then returns to the ground ten seconds later, suppose that the derivatives at the endpoints are constrained so that the ball begins by moving upward, but also is moving upward at the ending time.

Then, they ask, what will the above search method produce, as a progressive sequence of paths, and what will the associated sequence of action-estimates converge to?

They suggest actually programming it, and say that it is an instructive exercise to perform.

If you take the geodesic form of the Lagrange equations, then you can take the starting positions and generalized velocities and keep parallel transporting the velocities along themselves. That will produce a solution that AFAICT is always physical and the process does not require solving an optimization problem.

Daniel, isn't that the general idea that Einstein used in his formulation of relativity?

In any case, this thread is kind of split in two. Part of it is a discussion about Sussman using an interpreted language in doing physics and another part is about a specific physics problem.

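Addressing the former, I thought I would show what I have been doing in the past few years with Prolog, specifically the SWI-Prolog (SWIPL) distribution. The semantic web infrastructure I lean on has incorporated the notebook paradigm elegantly.

Add the SWISH package and it appears in the menu:

![swish1](http://imageshack.com/a/img923/2989/aVVNPf.png)

Query the knowledge base using an RDF triple relation, i.e. rdf(Subject,Predicate,Object).

![swish2](http://imageshack.com/a/img924/121/DbA2zv.png)

In this wildcard case, you see query results from the NASA JPL SWEET ontology.

![swish3](http://imageshack.com/a/img923/4536/HodFLk.png)

It handles declarative representations of charts. Format the data structure and the notebook rule will attempt to render it.

![swish4](http://imageshack.com/a/img922/8649/tJGcpF.png)

The same goes for directed graphs.

![swish5](http://imageshack.com/a/img922/5448/WuUlbb.png)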

David, Enon, Daniel, John and all, thank you for this question, which I hope to absorb along with all of your deep answers.

As regards intuition, I am thinking that another way to say what has been said here is:

Nature minimizes kinetic energy, but relative to what? Relative to the potential energy. So the potential energy is just the baseline, the "zero", the absolute, and kinetic energy is what is relative to that. That's why we subtract it.

Note that potential energy is typically infinite, as in the case of an object's gravitational field or electromagnetic field. Whereas kinetic energy is definitely finite.

So we have to restrict ourselves to considering a finite range of potential energy.

And within that range we consider the exerted action which is also finitely delineated. And the action is the (integrated) sum of what the force exerts over space and time.

I am picking up a distinction from Dan Piponi's answer at Quora (https://www.quora.com/Why-is-Lagrangian-defined-as-Kinetic-energy-minus-potential-energy/answer/Dan-Piponi?srid=uYrm): the "bottom-up" view (building up a space with vectors) versus the "top-down" view (tearing down a space in terms of hyperplanes/reflections/covectors). It keeps coming up as I try to understand tensors. So it seems perhaps that kinetic energy is the bottom-up view of energy (going up as the space is built up) and potential energy is the top-down view of energy (going down as the space is broken down).

I've also been trying to understand the intuitive difference amongst the classical Lie groups, the unitary, orthogonal and symplectic groups. And it seems to boil down to how to "undo" an action:

* Unitary: conjugate transpose
* Orthogonal: symmetric transpose
* Symplectic: anti-symmetric transpose

And the transpose has to do with switching from vectors to covectors in their duality. Given this commonality, then there are three "easy" ways to invert the action, the matrix. And I think the three have to do with ways of thinking about the relationship between the complexes and the reals, and in particular, the philosophical relationship between unmarked opposites (like the two solutions to the square root of -1, call them i and j) and marked opposites (like i and -i). But I'm just imagining as I try to understand.
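For concreteness, here is a small numerical check of the three "undo" rules under their usual matrix definitions, which may be a narrower reading than the post intends. The unitary inverse is the conjugate transpose, the orthogonal inverse is the plain transpose, and the symplectic inverse is the transpose conjugated by the standard antisymmetric form Ω. The specific matrices and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unitary: the Q factor of a complex QR decomposition satisfies U^dagger U = I.
z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
u, _ = np.linalg.qr(z)

# Orthogonal: the Q factor of a real QR decomposition satisfies O^T O = I.
o, _ = np.linalg.qr(rng.normal(size=(3, 3)))

# Symplectic: M^T Omega M = Omega, with Omega the standard antisymmetric form.
eye, zero = np.eye(2), np.zeros((2, 2))
omega = np.block([[zero, eye], [-eye, zero]])
a = np.array([[2.0, 1.0], [0.5, 1.5]])   # any invertible block works here
s = np.array([[1.0, 0.3], [0.3, -2.0]])  # must be symmetric
shear = np.block([[eye, s], [zero, eye]])
scale = np.block([[a, zero], [zero, np.linalg.inv(a).T]])
m = shear @ scale                        # product of two symplectic matrices

# Omega^{-1} = -Omega, so the symplectic inverse is -Omega M^T Omega.
unitary_ok = np.allclose(u.conj().T @ u, np.eye(3))
orthogonal_ok = np.allclose(o.T @ o, np.eye(3))
symplectic_ok = np.allclose(-omega @ m.T @ omega @ m, np.eye(4))

print(unitary_ok, orthogonal_ok, symplectic_ok)
```

In all three cases the inverse is obtained from a transpose without solving a linear system, which is the commonality the list above points at.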

Hi Andrius, welcome to Azimuth.

I am sure these questions can be looked at from different perspectives, but I must say that I am not understanding how you are making these connections. Sometimes the more sweeping statements can be confusing, and the smaller points can have more educational value.

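Daniel wrote:

> If you take the geodesic form of the Lagrange equations, then you can take the starting positions and generalized velocities and keep parallel transporting the velocities along themselves. That will produce a solution that AFAICT is always physical and the process does not require solving an optimization problem.

Paul wrote:

> Daniel, Isn't that the general idea that Einstein used in his relativity formulation?

What is the general relationship between Lagrangian dynamics and the general theory of relativity? Since both have a geodesic formulation, their metrics must agree enough to give the same paths of motion in gravitational fields -- the way that nature actually behaves.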

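> What is the general relationship between the Lagrangian dynamics and the general theory of relativity?

Here is a comparison of some aspects of the Riemannian geometry behind general relativity and classical mechanics.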

In general relativity, inertial trajectories are geodesics on the space-time 4-manifold, and the metric is derived from the stress-energy tensor via the Einstein equations. Paths on this manifold represent (possible) trajectories of individual particles.
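In standard notation (not taken from the thread, and written without a cosmological constant), the two statements above read as follows, where $x^\mu(\tau)$ is a particle's world line, $\Gamma^\mu_{\alpha\beta}$ are the Christoffel symbols of the space-time metric, $G_{\mu\nu}$ is the Einstein tensor built from that metric, and $T_{\mu\nu}$ is the stress-energy tensor:

```latex
% Inertial motion: the world line is a geodesic of the space-time metric
\frac{d^2 x^\mu}{d\tau^2}
  + \Gamma^\mu_{\alpha\beta}\,\frac{dx^\alpha}{d\tau}\,\frac{dx^\beta}{d\tau} = 0

% The metric is tied to the matter content by the Einstein equations
G_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}
```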

There are several different ways to treat classical mechanics geometrically. The Synge paper talks about four of them.

The simplest one I discussed uses the inertia tensor as the metric on the configuration manifold. So there are as many dimensions as degrees of freedom, but time is not part of the manifold. In this setting only trajectories with no forces other than forces of constraint are geodesics; otherwise the covariant derivative of the trajectory along itself is equal to the generalised forces being applied (including conservative ones).
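Written out, this is the usual form of Lagrange's equations with the kinetic energy as metric. The symbols are my notation, not Synge's: $g_{ij}(q)$ is the inertia tensor, $\Gamma^j_{kl}$ are its Christoffel symbols, and $Q_i$ are the generalised forces.

```latex
% Kinetic energy defines the metric on the configuration manifold
T = \tfrac{1}{2}\, g_{ij}(q)\, \dot q^i \dot q^j

% Covariant acceleration equals the applied generalised forces
g_{ij}\left( \ddot q^j + \Gamma^j_{kl}\, \dot q^k \dot q^l \right) = Q_i

% With Q_i = 0 this reduces to the geodesic equation
\ddot q^j + \Gamma^j_{kl}\, \dot q^k \dot q^l = 0
```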

Synge's paper concentrates on a slightly more complex setting, which also uses the configuration manifold but has a slightly more complex metric based on the Lagrangian. This incorporates conservative forces into the geometry. The trajectories of all conservative systems are geodesics, and the covariant derivative of a trajectory along itself is equal to the generalised non-conservative forces.
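I have not checked the following against the paper, so treat it as an assumption: the best-known metric of this kind is the Jacobi (action) metric, which rescales the kinetic-energy metric $g_{ij}$ by a factor depending on the potential $V(q)$ at a fixed total energy $E$.

```latex
% Jacobi metric at fixed total energy E (assumed form, valid where E > V)
\tilde g_{ij}(q) = 2\,\bigl(E - V(q)\bigr)\, g_{ij}(q)

% Its arc length is Maupertuis' abbreviated action along the path
d\tilde s^{\,2} = 2\,\bigl(E - V(q)\bigr)\, g_{ij}\, dq^i\, dq^j
```

Geodesics of $\tilde g_{ij}$ trace the same curves in configuration space as the trajectories of the conservative system with energy $E$, although the geodesic parameter is not the physical time.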

These geometric formulations of classical mechanics relate to Hertz's Principle of Least Curvature and Gauss's Principle of Least Constraint, which are other formulations of classical mechanics like Lagrange's, Hamilton's and D'Alembert's.

One major difference from GR is that paths on the above manifolds represent the evolution of the entire system rather than that of a single particle as in GR.

Synge's paper also briefly discusses two formulations that incorporate time into the manifold, but I am less clear on the geometry of those.
