Negative Entropy

edited March 29 in General

I'd like to start a discussion about negative entropy. Consider the idea of the arc of history bending to a positive resolution.

Alexander the Great conquered large portions of Europe and Asia. At the time, being conquered was a disaster, but centuries later it provided languages with a unified basis. So negative entropy has a component of duration. This can be seen as moving toward a global optimum.

I am interested in the levels of the Ackermann function for modeling non-local optima.

Do folks have other examples demonstrating negative entropy and its principles?

Comments

  • 1.

    Negative entropy is a measure of order and I apply it quite frequently.

  • 2.
    edited April 12

    Proposed Principles

    Negative entropy is more complex than positive entropy.

    I submit that Maslow's hierarchy of needs is a good model of negative entropy, particularly his later models. See https://www.simplypsychology.org/maslow.html . Note the anthropomorphic aspect.

    The Category Theory Zulip group discusses music and aesthetics as unavoidable human traits.

  • 3.

    I can't help feeling that the concept of negative entropy is central to the betterment of mankind.

  • 4.

    Negative entropy is just entropy with a reversed sign and so is a measure of order, or of lower complexity. I have some good recent examples but this paper I wrote using Shannon entropy lays out some practical applications https://www.intechopen.com/books/applications-of-digital-signal-processing/entropic-complexity-measured-in-context-switching

  • 5.

    Here's the way to think about using negative entropy to find an optimal solution. Consider autonomous vs non-autonomous differential equations. One way to think about the distinction is that the transfer function for the non-autonomous case depends only on the presenting input. Thus, it acts like an op-amp with infinite bandwidth or, below saturation, gives perfectly linear amplification.

    In contrast, for an autonomous formulation, the amplification depends on prior values, so it requires a time-domain convolution or a frequency-domain transfer function.

    Yet there are many other non-autonomous formulations that aren't linear, for example a companding transfer that takes the square root of the input (used for compressing the dynamic range of a signal).
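    To make the distinction concrete, here is a minimal sketch contrasting a memoryless transfer (the square-root companding mentioned above) with a convolution that depends on prior values. The function names, the test signal, and the two-tap kernel are illustrative assumptions, not anything taken from the original figures.

```python
import math

def compand(x):
    """Memoryless square-root companding: each output depends only on the current sample."""
    return [math.copysign(math.sqrt(abs(v)), v) for v in x]

def convolve(x, h):
    """Causal convolution: each output mixes the current sample with earlier ones."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

signal = [0.0, 1.0, 4.0, -9.0, 0.25]      # hypothetical input samples
companded = compand(signal)               # dynamic range is compressed sample by sample
smoothed = convolve(signal, [0.5, 0.5])   # hypothetical two-tap averaging kernel
```

    Reordering the input samples simply reorders the companded output, whereas the convolved output changes because each value carries a memory of its predecessor.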

    What does this have to do with entropy? Well that transfer function can get very strange but still possess underlying order. Yet that order or pattern may be difficult to discern without adequate information. So consider if the non-autonomous transfer function itself is something odd, such as an unknown and potentially complex sinusoidal modulation. This occurs in Mach-Zehnder modulation. The effect is to distort the input enough to fold the amplitude at certain points.
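    As a rough illustration of the folding effect, the sketch below passes a steadily rising input through a sinusoidal transfer, y = sin(k x). This form is assumed here as a stand-in for a Mach-Zehnder-like modulation, and the value of k is arbitrary.

```python
import math

def mach_zehnder_like(x, k):
    """Sinusoidal, memoryless transfer assumed as a stand-in for the modulation."""
    return math.sin(k * x)

# A monotonically increasing input ramp from 0 to 1.
ramp = [i / 100 for i in range(101)]

# With k = 3*pi the argument sweeps 1.5 cycles, so the output folds back on itself.
folded = [mach_zehnder_like(x, 3 * math.pi) for x in ramp]

# Count the places where the output reverses direction.
turning_points = sum(
    1
    for a, b, c in zip(folded, folded[1:], folded[2:])
    if (b - a) * (c - b) < 0
)
```

    The input never reverses direction, yet the output does so several times, which is why the amplitude alone gives little hint of the underlying forcing.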

    The difficulty is that if we have little knowledge of the input forcing or the modulation, we will not be able to decode anything. But with a measure such as negative Shannon entropy, we can see how far we can go with limited info.

    So consider this output waveform that we are told is due to Mach-Zehnder modulation of an unknown input:

    All we know is that there may be a basis forcing that consists of a couple of sinusoids, and that there is an obvious non-autonomous complex modulation that is generating the above waveform.

    The idea is that we test out various combinations of sinusoidal parameters and then maximize the negative Shannon entropy of the power spectrum of the transfer from input to output (see the paper cited in my previous comment). We can do this by calculating a discrete Fourier transform (for example with an FFT) and multiplying by the complex conjugate to get the power spectrum. For a perfectly linear amplification as in the first example, it is essentially a delta function at a frequency of zero, indicating maximum order with a maximum negative Shannon entropy. And for a single sinusoidal frequency modulation, the power spectrum would be a delta shifted to the frequency of the modulation. Again this will be a maximally ordered amplification, and again with a maximum in negative Shannon entropy. Yet, in practical terms, perhaps something such as a Rényi or Tsallis entropy measure would work even better than Shannon entropy. Actually, the Tsallis entropy is close to describing a mean-square variance error in a signal, whereby it exaggerates clusters or strong excursions when compared against a constant background.
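    A small sketch of that calculation, under stated assumptions: the spectrum is normalized to sum to one so it can be treated like a probability distribution, and a naive DFT is used so the example needs only the standard library (in practice an FFT routine would replace it). The signal length, tone frequency, and random seed are arbitrary choices.

```python
import cmath
import math
import random

def power_spectrum(x):
    """One-sided power spectrum from a naive O(N^2) DFT."""
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        spectrum.append((s * s.conjugate()).real)  # multiply by the complex conjugate
    return spectrum

def negative_shannon_entropy(power):
    """Return sum(p * log p) for the spectrum normalized to unit total."""
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    return sum(p * math.log(p) for p in probs)

n = 128
# A single sinusoid: its spectrum is close to a delta at one frequency bin.
tone = [math.sin(2 * math.pi * 8 * j / n) for j in range(n)]
# White noise: its spectrum is spread across all bins.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

neg_h_tone = negative_shannon_entropy(power_spectrum(tone))
neg_h_noise = negative_shannon_entropy(power_spectrum(noise))
```

    The negative entropy is bounded above by zero, a bound the single tone essentially reaches, while the noise spectrum scores well below it.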

    So this is what I have used that works quite well. I essentially maximize the normalized mean-squared variance of the power spectrum

    $$\frac{\sum_\omega \left(F(\omega)-\langle F(\omega)\rangle\right)^2}{\sum_\omega F(\omega)}$$

    The result of a search algorithm of input sinusoidal factors to maximize the power spectrum variance value is this power spectrum

    which stems from this optimal input forcing

    Note that this is not the transfer modulation, which we still need to extract from the power spectrum.

    As a result, this negative entropy algorithm is able to deconstruct or decode a Mach-Zehnder modulation of two sinusoidal factors that is encoding an input forcing of another pair of sinusoidal factors. So essentially we are able to find 4 unknown factors (or 8 if both amplitude and phase are included) by only searching on 2 factors (or 4 if amplitude and phase are included). But how is that possible? It's actually not a free lunch, because the power spectrum calculation is essentially testing all possible modulations in parallel, and the negative entropy calculation is keeping track of the frequency components that maximize the delta functions in the spectrum. That is, the mean-square variance weights strong excursions more heavily than a flat, highly random background.

    From the paper, this is the general idea. For negative entropy we are looking for the upper spectrum, not the lower, which is a maximum entropy.

    Good luck; this works well for certain applications. It may even work better in a search algorithm than a pure RMS minimization that fits the 4 sinusoidal factors directly against the output, as it may not fall into local minima as easily. I think that working with the power spectrum helps to immediately broaden the search.
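    The search described in this comment can be sketched as a toy example. Everything specific here is an assumption made for illustration: the modulation is taken to be sin(k x), the forcing amplitudes are fixed so that only two frequencies are searched, and the candidate list, sample spacing, and grid size are arbitrary. It is one reading of the approach, not the original code.

```python
import bisect
import cmath
import math

def power_spectrum(x):
    """One-sided power spectrum of the mean-removed signal (naive DFT)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    out = []
    for k in range(n // 2 + 1):
        s = sum(centered[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        out.append(abs(s) ** 2)
    return out

def spectral_variance_metric(power):
    """sum((F - <F>)^2) / sum(F), the quantity maximized in the text."""
    mean = sum(power) / len(power)
    return sum((p - mean) ** 2 for p in power) / sum(power)

def forcing(t, w1, w2):
    """Candidate input forcing: two sinusoids with fixed (assumed) amplitudes."""
    return math.sin(w1 * t) + 0.6 * math.sin(w2 * t)

def score(times, output, w1, w2, grid_size=128):
    """Resample the output against a candidate forcing, then score its spectrum."""
    pairs = sorted((forcing(t, w1, w2), y) for t, y in zip(times, output))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    lo, hi = xs[0], xs[-1]
    grid = []
    for i in range(grid_size):
        g = lo + (hi - lo) * i / (grid_size - 1)
        j = min(max(bisect.bisect_left(xs, g), 1), len(xs) - 1)
        x0, x1 = xs[j - 1], xs[j]
        frac = 0.0 if x1 == x0 else (g - x0) / (x1 - x0)
        grid.append(ys[j - 1] + frac * (ys[j] - ys[j - 1]))  # linear interpolation
    return spectral_variance_metric(power_spectrum(grid))

# Synthetic "observed" waveform: hidden forcing passed through a hidden modulation.
times = [0.05 * i for i in range(2000)]
true_w1, true_w2, hidden_k = 0.9, 2.3, 12.0
output = [math.sin(hidden_k * forcing(t, true_w1, true_w2)) for t in times]

# Search only over the forcing frequencies; the modulation is never specified.
candidates = [(0.9, 2.3), (1.7, 3.1), (0.5, 2.9), (1.3, 2.3)]
best = max(candidates, key=lambda w: score(times, output, w[0], w[1]))
```

    When the candidate forcing is right, the output plotted against it collapses onto a clean sinusoid, so the spectrum along the forcing axis is sharply peaked and the hidden modulation wavenumber can be read off from the peak. Wrong candidates scatter the points and flatten the spectrum.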

  • 6.

    Wonderful response Paul. I'll work on mastering your material and then reply.

  • 7.

    In 1979 I rolled into Cocoa Beach, Florida as a new Air Force recruit. I would work at AFTAC, which monitors nuclear events. I was assigned to TGS, the geophysical division, as a seismologist.

    It was the end of one era and the beginning of another. We had access to interactive signal processing software that would have made Stephen Wolfram proud. Spectrums, cepstrums, convolution filters and so on. Unfortunately, we were constrained to the realm of statistics and statistical sigmas. From what I understand, we still are.

  • 8.
    edited April 16

    Daniel, if you haven't come across it you might enjoy Brillouin's 1956 originating text "Science and Information Theory": https://www.informationphilosopher.com/solutions/scientists/brillouin/

  • 9.

    Regarding Laplace and his demon and the possibility of determinism, these items from the past week:

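    1. On the Shoulders of Laplace: https://www.sciencedirect.com/science/article/pii/S0031920121000510

    2. BBC podcast on Laplace: https://podcasts.google.com/feed/aHR0cDovL3d3dy5yc3NtaXguY29tL3UvODMyMjkzNi9yc3MueG1s/episode/dXJuOmJiYzpwb2RjYXN0Om0wMDB0d2dq?ep=14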

  • 10.
    edited April 18

    Paul, I'm not getting the connection between determinism and negative entropy, although I'm happy to discuss almost any subject. I will say that, to my mind, Laplace's equation and Hamilton's equations are two of the most beautiful constructs in mathematics and physics. My own research delves into dynamics, or so-called chaos theory. I'd love to try to put something together on the link between dynamics and negative entropy.

  • 11.

    If determinism follows an ordered pattern then that will show up as a higher value of negative entropy. That's all there is to it, which is shown in the last pic I posted.

  • 12.

    Oops, I should have got that myself. OK, determinism and negative entropy are connected.

  • 13.

    Some climate scientists may be figuring out that adding a "control signal" to a non-linear and potentially chaotic signal may make it deterministic:

    "Control Simulation Experiment with the Lorenz’s Butterfly Attractor" in Nonlinear Processes in Geophysics https://npg.copernicus.org/preprints/npg-2021-24/

    My comment

  • 14.
    edited July 9

    In the realm of complex systems there can be a large number of arbitrary properties that two systems share. Elegance is important in the understanding of negative entropy, as well as many other things. I do suspect that there is a direct connection between negative entropy and elegance. Given that a number of models of ENSO are not commonly accepted, and that negative entropy itself is not set on solid ground, I have a difficult time making a useful connection. Is ENSO an example of negative entropy?

  • 15.

    Daniel, any behavior or pattern that looks complex but that you can describe with a short algorithm is an example of negative entropy (or low entropy). So if a model of ENSO can be described with a short algorithm, then conceivably the mechanism underlying ENSO is a low-entropy, ordered behavior.

    How short is the algorithm for describing tides? The example below comes from a very short, low-entropy algorithm based on the astronomical factors, and it explains essentially everything except for the obvious storm-surge spike in the middle.
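    One rough way to illustrate the "short algorithm" idea is to use compressed size as a stand-in for description length. This is an assumption made for illustration, not a method from this thread: a general-purpose compressor only gives a crude upper bound on how short a description can be. The sequence lengths, period, and seed below are arbitrary.

```python
import math
import random
import zlib

length = 5000

# An ordered pattern: a sinusoid with a 50-sample period, quantized to bytes.
ordered = bytes(
    int(127.5 + 127.5 * math.sin(2 * math.pi * j / 50)) for j in range(length)
)

# A disordered pattern: pseudo-random bytes of the same length.
rng = random.Random(0)
disordered = bytes(rng.randrange(256) for _ in range(length))

ordered_size = len(zlib.compress(ordered, 9))
disordered_size = len(zlib.compress(disordered, 9))
```

    The periodic sequence compresses to a small fraction of its length, while the pseudo-random one barely compresses at all, which mirrors the low-entropy versus high-entropy distinction.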

    So we take that SAME set of astronomical factors using the data from NASA JPL ephemeris or an astronomical algorithm and we can get this for ENSO by applying Laplace's Tidal Equations:

    So we can conclude that ENSO is not a very complex system since it can be described succinctly.

    But if anyone wants to debunk it or reject the model with their own better low entropy model then they can go ahead and try.

  • 16.
    edited July 10

    Paul, I agree with you regarding low entropy and ENSO, but I don't agree that low entropy and negative entropy are the same. I'm currently reading about negative entropy https://en.wikipedia.org/wiki/Entropy_and_life#Negative_entropy.

  • 17.

    Living systems are the canonical examples of negative entropy. What can be done to use technology to create negative entropy?

  • 18.

    "I don't agree that low entropy and negative entropy are the same"

    Since entropy is defined by taking the logarithm of a measure, it's actually impossible for the entropy to switch sign. So the concept of negative entropy is essentially the same as low entropy -- i.e. by definition. The negative is simply a modifier to indicate that one is characterizing order as opposed to disorder.

  • 19.

    "What can be done to use technology to create negative entropy?"

    Create a waveguide. The waveguide allows only certain wavelengths to constructively interfere. A Mach-Zehnder-like modulation will disguise the amplitude so it will appear chaotic, even though it isn't. That's why the ENSO model shown above works so well. The input is already highly structured, as it is generated from a highly ordered tidal time series.

    The amplitude as shown in the spectrum matches very well the accepted ordering of strength of tidal forcing, with the top 3 being the Mf, Mf', and Mm signals (corresponding to 13.66, 13.63 and 27.55 days). This is how the equatorial Pacific waveguide creates negative entropy.

    We can likely harness this energy on a large scale if someone wants to fund a Panama Canal-like structure along the equator and let the moon and sun cycle up some forces with El Niño intensity levels.

  • 20.

    p.s. That forcing spectrum appears complex (i.e. high entropy) but is actually quite simple, as it's just the result of an orbit-based calculation of tractive gravitational forcing that varies with the lunar range R and declination D according to ~sin(D)/R^3. Everything else factors as weaker harmonics via nonlinear Taylor series expansion.

    If climate research were a scientific discipline with the same kind of rigor and competitiveness as condensed matter or solid-state physics, I would venture that someone/anyone would be jumping on this approach to either debunk it or to support it. Way too much is at stake in understanding how climate changes in its natural state. Billions or even trillions of dollars globally can be saved in agriculture planning and disaster control if we can anticipate when the next El Niño will occur. It's disappointing how little interest there is in these ideas -- I have tried, presenting at 3 AGU meetings and one overseas EGU meeting, with virtually zero feedback.

    People on this forum have probably not seen this video but I asked the esteemed climate scientist Raymond Pierrehumbert a question during a recent online session. As context, in astrophysics/astronomy circles it often occurs that an amateur scientist will make an amazing discovery, with the full backing of professional and academic astronomers. Alas, this does not often happen in climate science as many of the "amateurs" are not as interested in advancing science as they are in making a political statement, which Pierrehumbert explains:

    https://youtu.be/XdtTapL9fLg

    "That amateur scientists are more accepted in astrophysics uh than climate science uh and there's actually quite a lot of there's actually quite a lot of citizen science done in client and climate uh the phenomenon in climate science though is that a lot of so-called citizen science has been really sloppy science not citizen science. Citizen science can be very good but but some people with a political agenda to try to so-called disprove global warming uh uh have you know have have been making a lot of a lot of onerous data requests and uh uh that scientists have hard time keeping up with and then just raising spurious questions that that uh that just try to confuse the issue rather than doing honest science but but the data is almost completely open from climate modeling and so forth there's a lot of open source code but unfortunately some some uses of citizen science and climate science has has has given it a bad name"

    from a Facebook presentation on the Climate of Exoplanets

    So the explanation is essentially that the topic of climate sciences and likely earth sciences in general has been poisoned by lots of cr@nks and cr@ckpots harboring some political agenda. This doesn't happen in other disciplines such as semiconductor research since a fraudster is quickly uncovered, allowing everyone else to move on.

    A follow-up question that I asked Pierrehumbert (but he didn't answer) pertained to the use of machine learning in climate science. What happens when naive machine learning algorithms start "explaining" climate behaviors? Will these get more readily accepted by the climate science community, as machines have no political agenda? Or will this just poison the atmosphere with more junk models that will take years more to weed out?

    Science is like tending a garden: it doesn't matter how much water & fertilizer you give the plants -- if you don't weed out the bad stuff and let the good stuff room to grow, you won't be making any progress.

    So if climate science remains stuck in the land of chaos, blindly obeying the pronouncements of Edward Lorenz and his butterfly theory, it's likely that little progress will be made. Between a belief in high-entropy chaos on one side and cr@nks & cr@ckpots on the other, the outlook for climate science looks bleak.


    Here's a trending news item from a NASA JPL study: A Study Predicts Record Flooding In The 2030s, And It's Partly Because Of The Moon

  • 21.
    edited August 8

    From https://johncarlosbaez.wordpress.com/2021/08/05/information-geometry-part-18/, a discussion of entropy analogies using probability:

    "Similarly, a probability distribution ‘wants’ to flatten out, to maximize entropy, and p_i says how eager it is to increase the probability q_i in order to do this."

    In the figure (https://pbs.twimg.com/media/E8DUfIQUYAIOuCQ.jpg), the lower right panel is essentially the power spectrum of the upper right panel; see this discussion: https://johncarlosbaez.wordpress.com/2021/08/08/information-geometry-part-19/#comment-171330

    Comment Source:From https://johncarlosbaez.wordpress.com/2021/08/05/information-geometry-part-18/ a discussion of entropy analogies using probability > "Similarly, a probability distribution ‘wants’ to flatten out, to maximize entropy, and p_i says how eager it is to increase the probability q_i in order to do this." > ![](https://pbs.twimg.com/media/E8DUfIQUYAIOuCQ.jpg) The lower right is essentially the power spectrum of the upper right, see this discussion: https://johncarlosbaez.wordpress.com/2021/08/08/information-geometry-part-19/#comment-171330
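To make the "flattening out" in the quote concrete, here is a minimal sketch in Python. It assumes one common convention for negative entropy (the gap between the maximum possible Shannon entropy, log2 of the number of outcomes, and the actual entropy); the function names and the two example distributions are mine, for illustration only, and are not taken from the linked posts.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def negentropy(probs):
    """Assumed convention: distance (in bits) from the uniform, maximum-entropy case."""
    return math.log2(len(probs)) - shannon_entropy(probs)


# Hypothetical distributions over four outcomes.
peaked = [0.85, 0.05, 0.05, 0.05]   # ordered: most of the weight on one outcome
uniform = [0.25, 0.25, 0.25, 0.25]  # fully flattened out

for name, dist in (("peaked", peaked), ("uniform", uniform)):
    print(name, shannon_entropy(dist), negentropy(dist))
```

Under this convention the uniform distribution has zero negative entropy, and any departure from flatness shows up as a positive value.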
  • 22.

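    Further, there is a recent series of articles in the AGU journal Water Resources Research under the heading "Debates: Does Information Theory Provide a New Paradigm for Earth Science?":

    - Introduction: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2019WR026398
    - Hypothesis Testing: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR024918
    - Causality, Interaction, and Feedback: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR024940
    - Emerging concepts and pathways of information physics: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR025270
    - Sharper Predictions Using Occam's Digital Razor: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR026471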

    My book Mathematical Geoenergy anticipated many of these ideas, and you can find plenty of examples and derivations there (many of them centered on Maximum Entropy).

    Here is an excerpt from the "Emerging concepts" entry, which indirectly addresses negative entropy:

    "While dynamical system theories have a long history in mathematics and physics and diverse applications to the hydrological sciences (e.g., Sangoyomi et al., 1996; Sivakumar, 2000; Rodriguez-Iturbe et al., 1989, 1991), their treatment of information has remained probabilistic akin to what is done in classical thermodynamics and statistics. In fact, the dynamical system theories treated entropy production as exponential uncertainty growth associated with stochastic perturbation of a deterministic system along unstable directions (where neighboring states grow exponentially apart), a notion linked to deterministic chaos. Therefore, while the kinematic geometry of a system was deemed deterministic, entropy (and information) remained inherently probabilistic. This led to the misconception that entropy could only exist in stochastically perturbed systems but not in deterministic systems without such perturbations, thereby violating the physical thermodynamic fact that entropy is being produced in nature irrespective of how we model it.

    In that sense, classical dynamical system theories and their treatments of entropy and information were essentially the same as those in classical statistical mechanics. Therefore, the vast literature on dynamical systems, including applications to the Earth sciences, was never able to address information in ways going beyond the classical probabilistic paradigm."

    That is, there are likely many earth system behaviors that are highly ordered, but the complexity and non-linearity of their mechanisms make them appear stochastic or chaotic (high positive entropy). In reality they follow a complicated but deterministic model (negative entropy). We just aren't looking hard enough to discover the underlying patterns in most of this stuff.

    An excerpt from the Occam's Razor entry, which echoes my own citation of Gell-Mann:

    "Science and data compression have the same objective: discovery of patterns in (observed) data, in order to describe them in a compact form. In the case of science, we call this process of compression “explaining observed data.” The proposed or resulting compact form is often referred to as “hypothesis,” “theory,” or “law,” which can then be used to predict new observations. There is a strong parallel between the scientific method and the theory behind data compression. The field of algorithmic information theory (AIT) defines the complexity of data as its information content. This is formalized as the size (file length in bits) of its minimal description in the form of the shortest computer program that can produce the data. Although complexity can have many different meanings in different contexts (Gell-Mann, 1995), the AIT definition is particularly useful for quantifying parsimony of models and its role in science. "

    Parsimony of models is a measure of negative entropy.

    Comment Source:Further, there is a recent sequence of articles in an AGU journal on Water Resources Research under the heading: **"Debates: Does Information Theory Provide a New Paradigm for Earth Science?"** - [Introduction](https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2019WR026398) - [Hypothesis Testing](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR024918) - [Causality, Interaction, and Feedback](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR024940) - [Emerging concepts and pathways of information physics](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR025270) - [Sharper Predictions Using Occam's Digital Razor](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019WR026471) By anticipating all these ideas, you can find plenty of examples and derivations (with many centered on the ideas of Maximum Entropy) in my book Mathematical Geoenergy. Here is an excerpt from the "Emerging concepts" entry, which indirectly addresses negative entropy: > "While dynamical system theories have a long history in mathematics and physics and diverse applications to the hydrological sciences (e.g., Sangoyomi et al., 1996; Sivakumar, 2000; Rodriguez-Iturbe et al., 1989, 1991), their treatment of information has remained probabilistic akin to what is done in classical thermodynamics and statistics. In fact, the dynamical system theories treated entropy production as exponential uncertainty growth associated with stochastic perturbation of a deterministic system along unstable directions (where neighboring states grow exponentially apart), a notion linked to deterministic chaos. Therefore, while the kinematic geometry of a system was deemed deterministic, entropy (and information) remained inherently probabilistic. 
This led to the misconception that entropy could only exist in stochastically perturbed systems but not in deterministic systems without such perturbations, thereby violating the physical thermodynamic fact that entropy is being produced in nature irrespective of how we model it. >In that sense, classical dynamical system theories and their treatments of entropy and information were essentially the same as those in classical statistical mechanics. Therefore, the vast literature on dynamical systems, including applications to the Earth sciences, was never able to address information in ways going beyond the classical probabilistic paradigm." That is, there are likely many earth system behaviors that are highly ordered, but the complexity and non-linearity of their mechanisms makes them appear stochastic or chaotic (high positive entropy) yet the reality is that they are just a complicated deterministic model (negative entropy). We just aren't looking hard enough to discover the underlying patterns on most of this stuff. An excerpt from the Occam's Razor entry, lifts from [my cite of Gell-Mann](https://www.google.com/search?q=Gell-Mann+%22mathematical+geoenergy%22&source=lnms&tbm=bks&sa=X&ved=2ahUKEwit_KGXvM_yAhXXXc0KHTAZDMMQ_AUoAXoECAEQCw&biw=868&bih=465) > "Science and data compression have the same objective: discovery of patterns in (observed) data, in order to describe them in a compact form. In the case of science, we call this process of compression “explaining observed data.” The proposed or resulting compact form is often referred to as “hypothesis,” “theory,” or “law,” which can then be used to predict new observations. There is a strong parallel between the scientific method and the theory behind data compression. The field of algorithmic information theory (AIT) defines the complexity of data as its information content. 
This is formalized as the size (file length in bits) of its minimal description in the form of the shortest computer program that can produce the data. Although complexity can have many different meanings in different contexts (Gell-Mann, 1995), the AIT definition is particularly useful for quantifying parsimony of models and its role in science. " Parsimony of models is a measure of negative entropy
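As a rough illustration of the compression argument in the Occam's Razor excerpt, here is a small Python sketch. The true AIT complexity (the shortest program that produces the data) is not computable, so this uses the size of zlib-compressed data as a stand-in upper bound. That substitution, the helper name, and the two test sequences are my assumptions and do not come from the paper.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Bytes needed by zlib at maximum compression: a crude proxy for description length."""
    return len(zlib.compress(data, 9))


n = 10_000

# Highly ordered data: a short repeating pattern, so a compact "law" describes all of it.
ordered = bytes(i % 16 for i in range(n))

# Pseudo-random bytes from a fixed seed: zlib finds no pattern to exploit.
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(n))

print("ordered:", compressed_size(ordered), "bytes")
print("noisy:  ", compressed_size(noisy), "bytes")
```

The noisy sequence is itself generated by a tiny deterministic program (a seeded generator), which a general-purpose compressor cannot detect. That is the same situation as a deterministic earth-system model whose output looks stochastic until the right pattern is found.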
  • 23.

    Minimum Description Length is such a parsimonious and easy-to-understand description of entropy and randomness. Anything by Chaitin is well worth reading.

  • 24.
    edited September 1

    This review paper explains why mean-squared variance works similarly to Shannon entropy as applied in comment #5:

    The Energy of Data (https://doi.org/10.1146/annurev-statistics-060116-054026)

    "The energy of data is the value of a real function of distances between data in metric spaces. The name energy derives from Newton’s gravitational potential energy, which is also a function of distances between physical objects. One of the advantages of working with energy functions (energy statistics) is that even if the data are complex objects, such as functions or graphs, we can use their real-valued distances for inference. Other advantages are illustrated and discussed in this review. Concrete examples include energy testing for normality, energy clustering, and distance correlation."

    "The duality between powers of distances and their Fourier transforms is similar to the duality between probability density functions of random variables and their characteristic functions (especially of normal distributions whose probability density functions have the same form as their characteristic functions). This duality was called a “beautiful theorem of probability theory (Schönes Theorem der Wahrscheinlichkeitrechnung)” by Gauss (Fischer 2011, p. 46)."

    Comment Source:This is a review paper that explains why mean-squared variance works similarly to Shannon entropy as applied in comment #5 [The Energy of Data](https://doi.org/10.1146/annurev-statistics-060116-054026) > "The energy of data is the value of a real function of distances between data in metric spaces. The name energy derives from Newton’s gravitational potential energy, which is also a function of distances between physical objects. One of the advantages of working with energy functions (energy statistics) is that even if the data are complex objects, such as functions or graphs, we can use their real-valued distances for inference. Other advantages are illustrated and discussed in this review. Concrete examples include energy testing for normality, energy clustering, and distance correlation." ![](https://imagizer.imageshack.com/img922/7094/4wVfI3.png) >"The duality between powers of distances and their Fourier transforms is similar to the duality between probability density functions of random variables and their characteristic functions (especially of normal distributions whose probability density functions have the same form as their characteristic functions). This duality was called a “beautiful theorem of probability theory (Sch¨ones Theorem der Wahrscheinlichkeitrechnung)” by Gauss (Fischer 2011, p. 46)."
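For anyone who wants to try the idea from the review, here is a minimal pure-Python sketch of the two-sample energy distance for real-valued data, 2E|X-Y| - E|X-X'| - E|Y-Y'|, estimated by averaging over all pairs. The helper names, sample sizes, and Gaussian test samples are hypothetical choices of mine; the paper covers far more general settings (metric spaces, distance correlation) that this does not attempt.

```python
import random


def mean_abs_diff(xs, ys):
    """Average absolute difference over all pairs (x, y)."""
    return sum(abs(x - y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Plug-in estimate of 2E|X-Y| - E|X-X'| - E|Y-Y'| for two 1-D samples."""
    return (2 * mean_abs_diff(xs, ys)
            - mean_abs_diff(xs, xs)
            - mean_abs_diff(ys, ys))


rng = random.Random(1)
a = [rng.gauss(0, 1) for _ in range(200)]  # reference sample
b = [rng.gauss(0, 1) for _ in range(200)]  # same distribution as a
c = [rng.gauss(3, 1) for _ in range(200)]  # shifted mean

print("same distribution:", energy_distance(a, b))
print("shifted mean:     ", energy_distance(a, c))
```

Only pairwise distances go into the statistic, which is the point the quoted abstract makes about being able to work with complex objects as long as a distance is defined.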
  • 25.

    Update on maximizing negative entropy, under the heading of inverting non-autonomous functions: https://geoenergymath.com/2021/05/17/inverting-non-autonomous-functions/
