It looks like you're new here. If you want to get involved, click one of these buttons!

- All Categories 2.2K
- Applied Category Theory Course 355
- Applied Category Theory Seminar 4
- Exercises 149
- Discussion Groups 49
- How to Use MathJax 15
- Chat 480
- Azimuth Code Project 108
- News and Information 145
- Azimuth Blog 149
- Azimuth Forum 29
- Azimuth Project 189
- - Strategy 108
- - Conventions and Policies 21
- - Questions 43
- Azimuth Wiki 711
- - Latest Changes 701
- - - Action 14
- - - Biodiversity 8
- - - Books 2
- - - Carbon 9
- - - Computational methods 38
- - - Climate 53
- - - Earth science 23
- - - Ecology 43
- - - Energy 29
- - - Experiments 30
- - - Geoengineering 0
- - - Mathematical methods 69
- - - Meta 9
- - - Methodology 16
- - - Natural resources 7
- - - Oceans 4
- - - Organizations 34
- - - People 6
- - - Publishing 4
- - - Reports 3
- - - Software 21
- - - Statistical methods 2
- - - Sustainability 4
- - - Things to do 2
- - - Visualisation 1
- General 39

Options

As one of the ongoing projects -- understanding El Nino behavior -- I thought I would review the status of my ENSO model. It is in a relatively stable configuration but I occasionally revisit it and tweak the parameters.

The model essentially emulates a 2nd-order wave equation as described by Allan Clarke [1]. The characteristic frequency $\omega$ has a period of approximately 4.25 years. This is modulated by variations due to TSI and forced by QBO wind variations, angular forcing variations characterized by the Chandler wobble effect, and by TSI. The result is a differential formulation with the characteristics of a Mathieu or Hill equation . This formulation is well-known in the hydrodynamics literature describing sloshing of liquid volumes [2][3].

$ f''(t) + (\omega^2 + k TSI(t)) f(t) = QBO(t) + TSI(t) + CW(t) $

To solve this equation, I tried to create closed-form expressions for the known forcings and then applied the expressions to solve the differential equation using Mathematica. The number above each chart is the correlation coefficient multiplied by 100. These formulations are pseudo-periodic so that the tweaking I applied is to best emulate the experimentally measured oscillations, without going overboard in fidelity. The curve labeled SOIM is the model of the ENSO SOI data.

As an example of a tweak, the TSI curve was broken into two pieces at 1980. Before that point in time, the 22-year period cycle was more evident and after that the 11-year was stronger. The CW correlation coefficient seems low because the period appears to exhibit a phase reversal before 1930. Whether or not this is a real physical transition, I am not sure, so I left it alone.

The million dollar question is whether such an overall fit is possible just by chance selection of these factors. Each one of the factors, QBO [4], CW [5], and TSI [] is suggested as important in ENSO behavior mechanics in the literature.

I wrote up the paper here on ARXIV. I submitted this to a journal but it got rejected w/o review.

[1] A. J. Clarke, S. Van Gorder, and G. Colantuono, “Wind stress curl and ENSO discharge/recharge in the equatorial Pacific,” Journal of physical oceanography, vol. 37, no. 4, pp. 1077–1091, 2007.

[2] O. M. Faltinsen and A. N. Timokha, Sloshing. Cambridge University Press, 2009.

[3] J. B. Frandsen, “Sloshing motions in excited tanks,” Journal of Computational Physics, vol. 196, no. 1, pp. 53–87, 2004.

[4] W. M. Gray, J. D. Sheaffer, and J. A. Knaff, “Inﬂuence of the stratospheric QBO on ENSO variability,” J. Meteor: Soc. Japan, vol. 70, pp. 975–995, 1992.

[5] R. S. Gross, “The excitation of the Chandler wobble,” Geophysical Research Letters, vol. 27, no. 15, pp. 2329–2332, 2000.

[66] L. Kuai, R.-L. Shia, X. Jiang, K. K. Tung, and Y. L. Yung, “Modulation of the period of the quasi-biennial oscillation by the solar cycle,” Journal of the Atmospheric Sciences, vol. 66, no. 8, pp. 2418–2428, 2009.

## Comments

This is a compilation of El Nino and La Nina ratings since the year 1950.

What I did was annotate my SOI model with the locations and weightings (strong, moderate, weak) of each of the ENSO events. The locations labelled with ? are regions that the model diverges to some extent.

For example, in 1967, the model has a stronger LaNina excursion than is identified by the historical record. Similarly, in 1980 there is a very broad LaNina excursion. From 1984 to 1986 the transition from ElNino to LaNina is delayed.

I haven't seen the identifications of LaNina/ElNino events before 1950, but looking at the model vs SOI fit, it should be an equally good match. The exception is an interval around 1945, in which the ElNino vs LaNina excursions appears 180 degrees out of-phase.

`[This is a compilation](http://ggweather.com/enso/oni.htm) of El Nino and La Nina ratings since the year 1950. What I did was annotate my SOI model with the locations and weightings (strong, moderate, weak) of each of the ENSO events. The locations labelled with ? are regions that the model diverges to some extent. ![ElNinos](http://imageshack.com/a/img537/2931/yzHrkB.gif) For example, in 1967, the model has a stronger LaNina excursion than is identified by the historical record. Similarly, in 1980 there is a very broad LaNina excursion. From 1984 to 1986 the transition from ElNino to LaNina is delayed. I haven't seen the identifications of LaNina/ElNino events before 1950, but looking at the model vs SOI fit, it should be an equally good match. The exception is an interval around 1945, in which the ElNino vs LaNina excursions appears 180 degrees out of-phase.`

That's a very information-rich graphic, I wish I had the fu. It'll be very interesting to see a hindcast up to 1945.

`That's a very information-rich graphic, I wish I had the fu. It'll be very interesting to see a hindcast up to 1945.`

Jim, All I have to do is find a list of identified ElNino and LaNina events prior to 1950. Asking a question, I am not sure if anyone has done this?

I also have an additional insight into what goes into pattern matching and whether a model is a good fit or not.

This is a stretched analogy but consider the case of Robert Durst that is in the news. There is an infamous comparison of his authenticated hand-writing sample with the hand-writing of someone writing a threatening note.

There are various elements to comparing the two samples. One can look at the overall style (block capitals), the stroke, and the spelling. At a finer detail, one can look at the L's and notice that the base differs, but not entirely.

So you ask yourself is there any doubt that the two samples are written by the same person?

I would say no, that there is no doubt that the same author wrote both..

But with ENSO, the comparison is not so cut-and-dried. For one, what I am doing isn't exactly performing a pattern match, but coming up with the best representative model of the underlying physics. For example, it is entirely possible that I am missing some important factor. This could mean that the model can be easily made better, but at the risk of over-fitting. Further, some inconsistency between the model and data comparison may be a dead giveway that the model is wrong.

But even with the Durst letter, consider that what would happen if the comparison was with a sample where the word Beverly was spelled correctly and one where it wasn't. Would that have invalidated the match? Not necessarily, as the model would just need to be changed to that the author was an inconsistent speller.

That's kind of similar to the noise in the data. We actually don't know the true signal that we are comparing the model to. There is enough noise in the underlying data that deviations may be simply spurious.

These are issues that I am always chewing on.

`Jim, All I have to do is find a list of identified ElNino and LaNina events prior to 1950. Asking a question, I am not sure if anyone has done this? I also have an additional insight into what goes into pattern matching and whether a model is a good fit or not. This is a stretched analogy but consider the case of Robert Durst that is in the news. There is an infamous comparison of his authenticated hand-writing sample with the hand-writing of someone writing a threatening note. ![durst](http://img1.nymag.com/imgs/daily/vulture/2015/03/14/durst3.nocrop.w529.h280.2x.jpg) There are various elements to comparing the two samples. One can look at the overall style (block capitals), the stroke, and the spelling. At a finer detail, one can look at the L's and notice that the base differs, but not entirely. So you ask yourself is there any doubt that the two samples are written by the same person? I would say no, that there is no doubt that the same author wrote both.. But with ENSO, the comparison is not so cut-and-dried. For one, what I am doing isn't exactly performing a pattern match, but coming up with the best representative model of the underlying physics. For example, it is entirely possible that I am missing some important factor. This could mean that the model can be easily made better, but at the risk of over-fitting. Further, some inconsistency between the model and data comparison may be a dead giveway that the model is wrong. But even with the Durst letter, consider that what would happen if the comparison was with a sample where the word Beverly was spelled correctly and one where it wasn't. Would that have invalidated the match? Not necessarily, as the model would just need to be changed to that the author was an inconsistent speller. That's kind of similar to the noise in the data. We actually don't know the true signal that we are comparing the model to. There is enough noise in the underlying data that deviations may be simply spurious. These are issues that I am always chewing on.`

Wouldn't it be better to do a hindcast using your equation and see what your 3C, weak, medium and strong criteria identify as El Ninos/La Ninas.

`> All I have to do is find a list of identified ElNino and LaNina events prior to 1950. Wouldn't it be better to do a hindcast using your equation and see what your 3C, weak, medium and strong criteria identify as El Ninos/La Ninas.`

I would rather someone else do that. This is what else I found for events prior to 1950.

Here is one that identifies I assume are strong El Nino events http://apollo.lsc.vsc.edu/classes/met130/notes/chapter10/elnino.html

Here is another for both El Nino and La Nino events https://sites.google.com/site/medievalwarmperiod/Home/historic-el-nino-events

`I would rather someone else do that. This is what else I found for events prior to 1950. Here is one that identifies I assume are strong El Nino events http://apollo.lsc.vsc.edu/classes/met130/notes/chapter10/elnino.html Here is another for both El Nino and La Nino events https://sites.google.com/site/medievalwarmperiod/Home/historic-el-nino-events`

Here is an annotated set of El Nino and La Nina events prior to 1950. In contrast to those shown above post-1950, these are not sorted according to Strong, Medium, and Weak, so "s" is used everywhere.

Just a few peculiar excursions noted by ?. Some modeled oscillations are noted during the late WWII years which are distinctly out-of-phase with the data. The modeled valley at 1949 didn't register as an El Nino as the SOI data was weaker. A La Nina peak at 1934 didn't register either.

Other than that, it looks as if all documented La Nina and El Nino events match to a modeled SOI peak and valley, respectively. The El Nino at 1912 was only modeled as a shoulder dip.

`Here is an annotated set of El Nino and La Nina events prior to 1950. In contrast to those shown above post-1950, these are not sorted according to Strong, Medium, and Weak, so "s" is used everywhere. ![Elnino1950](http://imageshack.com/a/img673/1185/ARziwx.gif) Just a few peculiar excursions noted by ?. Some modeled oscillations are noted during the late WWII years which are distinctly out-of-phase with the data. The modeled valley at 1949 didn't register as an El Nino as the SOI data was weaker. A La Nina peak at 1934 didn't register either. Other than that, it looks as if all documented La Nina and El Nino events match to a modeled SOI peak and valley, respectively. The El Nino at 1912 was only modeled as a shoulder dip.`

Here is another view of the SOI dipole. The way to read this is that when the atmospheric pressure at Darwin is low then it is high at Tahiti (and vice versa). So this contour plot is essentially a series of 1D cross-sections as the pressure varies from west (Darwin) to east (Tahiti) with time increasing downward. The model is on the left and the somewhat noisy data on the right.

`Here is another view of the SOI dipole. The way to read this is that when the atmospheric pressure at Darwin is low then it is high at Tahiti (and vice versa). So this contour plot is essentially a series of 1D cross-sections as the pressure varies from west (Darwin) to east (Tahiti) with time increasing downward. The model is on the left and the somewhat noisy data on the right. ![SOI dipole](http://imageshack.com/a/img910/3114/vt52VD.png)`

Another interesting aside is the connection of the differential equation that I am solving to the elliptical orbit discussion on the Azimuth blog -- http://johncarlosbaez.wordpress.com/2015/03/17/planets_in_the_4th_dimension

I am applying a nonlinear term to the second-order differential equation describing a sloshing of a volume of liquid. When this is a single sinusoidal term, it is known as the Mathieu equation. More generally, when the factor is a set of sinusoidal terms of different frequencies, it is known as the Hill differential equation. I am specifically using Hill because I am trying to approximate the TSI, which is roughly a harmonic combination of 11 year and 22 year periods.

The connection to orbital mechanics comes about because this is the same Hill, George William who developed the first comprehensive perturbation theory for determining corrections to the lunar orbital period. The correction essentially comes about because of the non-linear effects of approximating the 3-body problem of the sun-earth-moon system. I believe that the variational corrections to official lunar calendars all have some basis in this physics.

A fascinating account of Hill and others involved in the development of the math is found in the review by Gutzwiller, Moon-Earth-Sun: The oldest three-body problem.

Gutzwiller also refers to Hill and Poincaré in his account of the development of Quantum Chaos math. I imagine that this is the basis of Poincaré's quote :)

`Another interesting aside is the connection of the differential equation that I am solving to the elliptical orbit discussion on the Azimuth blog -- http://johncarlosbaez.wordpress.com/2015/03/17/planets_in_the_4th_dimension I am applying a nonlinear term to the second-order differential equation describing a sloshing of a volume of liquid. When this is a single sinusoidal term, it is known as the [Mathieu equation](http://en.wikipedia.org/wiki/Mathieu_function#Mathieu_equation). More generally, when the factor is a set of sinusoidal terms of different frequencies, it is known as the [Hill differential equation](http://en.wikipedia.org/wiki/Hill_differential_equation). I am specifically using Hill because I am trying to approximate the TSI, which is roughly a harmonic combination of 11 year and 22 year periods. The connection to orbital mechanics comes about because this is the same Hill, [George William](http://en.wikipedia.org/wiki/George_William_Hill) who developed the first comprehensive perturbation theory for determining corrections to the lunar orbital period. The correction essentially comes about because of the non-linear effects of approximating the 3-body problem of the sun-earth-moon system. I believe that the variational corrections to official lunar calendars all have some basis in this physics. A fascinating account of Hill and others involved in the development of the math is found in the review by Gutzwiller, [Moon-Earth-Sun: The oldest three-body problem](http://sites.apam.columbia.edu/courses/ap1601y/moon-earth-sin%20rmp.70.589.pdf). > 'The motion of the perigee c0 is obtained to very high accuracy. Poincaré remarks in his preface to Hill’s Collected Papers: ‘‘In this work, one is allowed to perceive the germ of most of the progress that Science has made ever since.’' Gutzwiller also refers to Hill and Poincaré in his account of the development of [Quantum Chaos](http://www.stealthskater.com/Documents/Math_02.pdf) math. I imagine that this is the basis of Poincaré's quote :)`

Here is an interesting idea concerning dipoles that I have been playing around with. The Southern Oscillation of ENSO is defined as the magnitude of the atmospheric pressure dipole between Tahiti and Darwin. See the following figure, where x represents the longitudinal distance between Darwin (to the west) and Tahiti (to the east).

So when Darwin is positive, Tahiti is generally negative (and vise versa).

However the data is fairly noisy, so it is difficult to trust any particular data point. It is possible that one or the other of the data points is more trustworthy than the other. It may be that instead of using Tahiti-Darwin, one could use 2×Tahiti or -2×Darwin, if that reflected the "true" value of the dipole.

My idea is that the SOI model could be compared to either T-D, 2×T, or -2×D, depending on what minimizes the error between data and model. Remember that the model does not have a lot of parameter wiggle room, as the factors are set.

It amounts to a simple Mathematica one-liner algorithm, and shown in the figure below are the results of picking the "best" SOI data and comparing to the SOI model (SOIM) in red.

I am not sure if this is invoking Morton's demon confirmation bias, or is an appropriate kind of learning classifier that is useful for handling uncertain data.

`Here is an interesting idea concerning dipoles that I have been playing around with. The Southern Oscillation of ENSO is defined as the magnitude of the atmospheric pressure dipole between Tahiti and Darwin. See the following figure, where x represents the longitudinal distance between Darwin (to the west) and Tahiti (to the east). ![darwintahiti](http://imageshack.com/a/img633/8527/eMcDb7.gif) So when Darwin is positive, Tahiti is generally negative (and vise versa). However the data is fairly noisy, so it is difficult to trust any particular data point. It is possible that one or the other of the data points is more trustworthy than the other. It may be that instead of using Tahiti-Darwin, one could use 2×Tahiti or -2×Darwin, if that reflected the "true" value of the dipole. My idea is that the SOI model could be compared to either T-D, 2×T, or -2×D, depending on what minimizes the error between data and model. Remember that the model does not have a lot of parameter wiggle room, as the factors are set. It amounts to a simple Mathematica one-liner algorithm, and shown in the figure below are the results of picking the "best" SOI data and comparing to the SOI model (SOIM) in red. ![dipole](http://imageshack.com/a/img908/1848/oYEzb6.gif) I am not sure if this is invoking <a href="http://en.wikipedia.org/wiki/Demon_%28thought_experiment%29">Morton's demon</a> confirmation bias, or is an appropriate kind of learning classifier that is useful for handling uncertain data.`

This is an annotated comparison where I flood-filled the regions that diverge.

Only a few regions stand out as raising concerns. The 1982-1985 interval has a lot going on -- not only was it a very significant El Nino, but the El Chichon volcano erupted in 1982 (and the TSI showed a natural break in behavior around 1980). Could the blanket cooling of El Chichon impact the equatorial Pacific atmospheric pressure that much?

At another blog, a mysterious effort is underway to model 10Be levels in paleo proxy data. They claim that a simple orbital resonance is responsible for the long-term quasi-oscillations. The fit is impressive but are withholding further explanation because they have IP=Intellectual Property concerns. I am not making that up!

The indications are that only a few oscillating factors are involved. The claim is that these are enough to impact the solar wind and the magnetic properties of the 10Be material that Steinhilber et al first analyzed ftp://ftp.ncdc.noaa.gov/pub/data/paleo/climate_forcing/solar_variability/

`This is an annotated comparison where I flood-filled the regions that diverge. ![diverge](http://imageshack.com/a/img538/3505/8obR7X.gif) Only a few regions stand out as raising concerns. The 1982-1985 interval has a lot going on -- not only was it a very significant El Nino, but the El Chichon volcano erupted in 1982 (and the TSI showed a natural break in behavior around 1980). Could the blanket cooling of El Chichon impact the equatorial Pacific atmospheric pressure that much? --- At another blog, a mysterious effort is underway to model 10Be levels in paleo proxy data. [They claim](https://tallbloke.wordpress.com/2015/03/24/much-alarmist-ado-about-amoc-and-the-subpolar-gyre-collapsing/comment-page-1/#comment-99202) that a simple orbital resonance is responsible for the long-term quasi-oscillations. The fit is impressive but are withholding further explanation because they have IP=Intellectual Property concerns. I am not making that up! ![be10](https://tallbloke.files.wordpress.com/2015/03/update-steinhibler-correl-to-resonance-model.jpg) The indications are that only a few oscillating factors are involved. The claim is that these are enough to impact the solar wind and the magnetic properties of the 10Be material that Steinhilber et al first analyzed ftp://ftp.ncdc.noaa.gov/pub/data/paleo/climate_forcing/solar_variability/`

Has anyone actually seen a fit of the delay differential equation (or even a Lorenz model) for modeling ENSO? The delay differential of a charge-discharge oscillator seems to be the consensus model for ENSO but don't ever recall seeing anybody comparing an impressive model solution against experimental results.

The model I am using is clearly related but a significant simplification to the delay differential equation. I understand why there isn't for a Lorenz model because it is much too sensitive to initial conditions. What likely provides the stability in the solution is the impact of the forcing function. Like the Lorenz, the delay differential will not respond predictably at all to changes in the forcing, as it is close to the edge of chaotic behavior. Whereas the 2nd-order wave equation with perturbation that I am applying is much better behaved with slight changes in forcing.

This paper by Tziperman, Scher, Zebiak, Cane is close to what I am thinking about http://dash.harvard.edu/bitstream/handle/1/3425922/Tziperman_ControllingSpatiotemporal.pdf

In climate science research, a "narrative" description of the behavior always seems to be of primary importance. To me this is endlessly frustrating because I like to go straight to simplified first-order physics and math to see what kind of headway one can make. Somebody once compared these narratives to the "just so stories" of Rudyard Kipling. I can see how people can get drawn into these because they are often beguiling and leave room open for lots of discussion.

`Has anyone actually seen a fit of the delay differential equation (or even a Lorenz model) for modeling ENSO? The delay differential of a charge-discharge oscillator seems to be the consensus model for ENSO but don't ever recall seeing anybody comparing an impressive model solution against experimental results. The model I am using is clearly related but a significant simplification to the delay differential equation. I understand why there isn't for a Lorenz model because it is much too sensitive to initial conditions. What likely provides the stability in the solution is the impact of the forcing function. Like the Lorenz, the delay differential will not respond predictably at all to changes in the forcing, as it is close to the edge of chaotic behavior. Whereas the 2nd-order wave equation with perturbation that I am applying is much better behaved with slight changes in forcing. This paper by Tziperman, Scher, Zebiak, Cane is close to what I am thinking about http://dash.harvard.edu/bitstream/handle/1/3425922/Tziperman_ControllingSpatiotemporal.pdf In climate science research, a "narrative" description of the behavior always seems to be of primary importance. To me this is endlessly frustrating because I like to go straight to simplified first-order physics and math to see what kind of headway one can make. Somebody once compared these narratives to the ["just so stories"](http://en.wikipedia.org/wiki/Just_So_Stories) of Rudyard Kipling. I can see how people can get drawn into these because they are often beguiling and leave room open for lots of discussion.`

Somebody, not I, wrote a delayed action oscillator as an Azimuth code project in Java; it built for me but I was only testing and never used it before transferring it from a google code repo to github.

Here's the relevant Azimuth wiki page with ENSO references and some other open ie. suggested Azimuth projects are here. DAO is marked as closed but only in the sense that some code was written: further code would be great.

`Somebody, not I, wrote a [delayed action oscillator](https://github.com/azimuth-project/SDEModels/tree/master/azimuth-sde-models/src/main/java/org/azimuthproject/enso/delayedoscillator) as an Azimuth code project in Java; it built for me but I was only testing and never used it before transferring it from a google code repo to github. [Here](http://www.azimuthproject.org/azimuth/show/ENSO#DelayedActionOscillator)'s the relevant Azimuth wiki page with ENSO references and some other open ie. suggested Azimuth projects are [here](http://www.azimuthproject.org/azimuth/show/Open+projects). DAO is marked as closed but only in the sense that some code was written: further code would be great.`

Jim, That is good and I see many cases of algorithms, but not much in trying to match to the experimental observations. I don't know what it is ... are they saying the problem is intractable, but here are some algorithms that you can play around with?

That is what is intriguing about doing the modeling, in that if you can start to make some headway then the possibilities open up regarding real analysis. I have yet to see someone make some obvious headway. Dara's approach is still one of the best alternatives .. to consider a model-based prior history and see how well it can predict in the near-term time frame.

`Jim, That is good and I see many cases of algorithms, but not much in trying to match to the experimental observations. I don't know what it is ... are they saying the problem is intractable, but here are some algorithms that you can play around with? That is what is intriguing about doing the modeling, in that if you can start to make some headway then the possibilities open up regarding real analysis. I have yet to see someone make some obvious headway. Dara's approach is still one of the best alternatives .. to consider a model-based prior history and see how well it can predict in the near-term time frame.`

In comment #10, I had a mistake in the algorithm that picks the "better" of the two SOI values of Darwin vs Tahiti. The following is the corrected figure.

It didn't really change much with respect to the intervals that show some divergence, but I thought to highlight the La Nina episode at 2000 at being much stronger in the model than in the data. This has something to do with the change of forcing behavior after 1980. The TSI became stronger on an 11 year cycle, and I added a linearly increasing forcing as well to the RHS. The impact of this ramping factor is to push the SOI into the positive La Nina territory.

`In comment #10, I had a mistake in the algorithm that picks the "better" of the two SOI values of Darwin vs Tahiti. The following is the corrected figure. ![BestSOIMCorrection](http://imagizer.imageshack.us/a/img537/3800/O8F8JK.png) It didn't really change much with respect to the intervals that show some divergence, but I thought to highlight the La Nina episode at 2000 at being much stronger in the model than in the data. This has something to do with the change of forcing behavior after 1980. The TSI became stronger on an 11 year cycle, and I added a linearly increasing forcing as well to the RHS. The impact of this ramping factor is to push the SOI into the positive La Nina territory.`

I am answering my own question here -- I doubt that any of the models that claim ENSO follows a red noise pattern [1,2,3] are justified. As long as the source of the forcing is known and that the response can be characterized, as I have shown above, the stochastic nature is essentially eliminated as a candidate to explain ENSO variability. This follows the typical pattern of whatever is not understood is characterized as noise, but since we understand the forcing and response, the result is not noise, but signal. For example, that hum you hear on your speakers is not noise if you can trace it to the 60 Hz AC power source and subsequent amplification.

However apart from noise, there is a possibility that I have not experimented with -- and that is the strong likelihood that two different ENSO dipole mechanisms may be at play. The canonical ENSO is the one associated with the eastern Pacific, while another one is associated with the central Pacific (see [3]). In modeling the SOI, I have assumed that a single canonical ENSO was the primary mechanism at work and so solved a single wave equation.

I think that the best way to work this is to find another dipole that follows the second ENSO mechanism, while maintaining the original model for the canonical SOI dipole. Then once these are independently modeled, the two can be "mixed" to see if they improve both dipole model fits. One of the other SST NINO indices may be a potential complementary candidate, but upon examination they all seem to correlate to the SOI index rather well (or more precisely anti-correlate since SOI has the opposite sign to temperature) . This means that it will be hard to isolate the second dipole.

[1] D. L. Rudnick and R. E. Davis, “Red noise and regime shifts,” Deep Sea Research Part I: Oceanographic Research Papers, vol. 50, no. 6, pp. 691–699, 2003.

[2] “REDFIT: estimating red-noise spectra directly from unevenly spaced paleoclimatic time series$.” [Online]. Available: http://www.manfredmudelsee.com/publ/pdf/redfit.pdf. [Accessed: 01-Dec-2012].

[3 ] M. Newman, S. Shin, and M. A. Alexander, “Natural variation in ENSO flavors,” Geophysical Research Letters, vol. 38, no. 14, 2011. http://onlinelibrary.wiley.com/doi/10.1029/2011GL047658/full

`I am answering my own question here -- I doubt that any of the models that claim ENSO follows a red noise pattern [1,2,3] are justified. As long as the source of the forcing is known and that the response can be characterized, as I have shown above, the stochastic nature is essentially eliminated as a candidate to explain ENSO variability. This follows the typical pattern of whatever is not understood is characterized as noise, but since we understand the forcing and response, the result is not noise, but signal. For example, that hum you hear on your speakers is not noise if you can trace it to the 60 Hz AC power source and subsequent amplification. However apart from noise, there is a possibility that I have not experimented with -- and that is the strong likelihood that two different ENSO dipole mechanisms may be at play. The canonical ENSO is the one associated with the eastern Pacific, while another one is associated with the central Pacific (see [3]). In modeling the SOI, I have assumed that a single canonical ENSO was the primary mechanism at work and so solved a single wave equation. I think that the best way to work this is to find another dipole that follows the second ENSO mechanism, while maintaining the original model for the canonical SOI dipole. Then once these are independently modeled, the two can be "mixed" to see if they improve both dipole model fits. One of the other SST NINO indices may be a potential complementary candidate, but upon examination they all seem to correlate to the SOI index rather well (or more precisely anti-correlate since SOI has the opposite sign to temperature) . This means that it will be hard to isolate the second dipole. [1] D. L. Rudnick and R. E. Davis, “Red noise and regime shifts,” Deep Sea Research Part I: Oceanographic Research Papers, vol. 50, no. 6, pp. 691–699, 2003. [2] “REDFIT: estimating red-noise spectra directly from unevenly spaced paleoclimatic time series$.” [Online]. Available: http://www.manfredmudelsee.com/publ/pdf/redfit.pdf. [Accessed: 01-Dec-2012]. [3 ] M. Newman, S. Shin, and M. A. Alexander, “Natural variation in ENSO flavors,” Geophysical Research Letters, vol. 38, no. 14, 2011. http://onlinelibrary.wiley.com/doi/10.1029/2011GL047658/full`

The different flavours of El Ninos must imply some difference in mechanisms. Separating them out also seems to me to be a sensible move.

Watching the currents and temperatures on earth.nullschool.net over the winter this year's El Nino seems to have a very anomalous path to higher latitudes dumping lots of hot water off the Californian coast; I've never seen a description of any previous one showing such behaviour.

`> the strong likelihood that two different ENSO dipole mechanisms may be at play. The different flavours of El Ninos must imply some difference in mechanisms. Separating them out also seems to me to be a sensible move. Watching the currents and temperatures on earth.nullschool.net over the winter this year's El Nino seems to have a very anomalous path to higher latitudes dumping lots of hot water off the Californian coast; I've never seen a description of any previous one showing such behaviour.`

Jim, Yes, the usual historical observation for El Nino is lots of warm water off the coast of Peru and Equador. This causes a collapse in the local sardine fishery as the warm water holds less nutrients. That behavior in fact was how the original El Nino was characterized and why it got a Spanish name.

Now it may be occurring to a greater extent in California. If this had happened earlier in the historical record, instead of "El Nino" it would have been called "The Dude". :)

As far as the possibility of two dipoles, separating out the effects will have to wait. There is no use for me to get ahead of the analysis. Before I start adding new solutions to the mix I have to get acceptance for the general approach. It is a judgement call, but at some point you have to step back and present the salient factor without extra layers of modeling. I wanted to at least acknowledge the potential of two dipoles interacting just to show that we know that difficulties may still exist.

`Jim, Yes, the usual historical observation for El Nino is lots of warm water off the coast of Peru and Equador. This causes a collapse in the local sardine fishery as the warm water holds less nutrients. That behavior in fact was how the original El Nino was characterized and why it got a Spanish name. Now it may be occurring to a greater extent in California. If this had happened earlier in the historical record, instead of "El Nino" it would have been called "The Dude". :) As far as the possibility of two dipoles, separating out the effects will have to wait. There is no use for me to get ahead of the analysis. Before I start adding new solutions to the mix I have to get acceptance for the general approach. It is a judgement call, but at some point you have to step back and present the salient factor without extra layers of modeling. I wanted to at least acknowledge the potential of two dipoles interacting just to show that we know that difficulties may still exist.`

lol pardner,

Iirc it's been suggested that there my be at least 3 flavours of El Ninos. Are central El Ninos as periodic as eastern ones? The Californian blocking behaviour seems similar to blocking of the NH jet stream (the 'arctic vortex').

Perhaps there is only one dipole and some aperiodic phenomenon modifies it?

I'd love to visualise the data for the ocean conveyor and how it's changed since the deeper ocean buoys were deployed. Dara, did you download any of the Argo data?

`lol pardner, Iirc it's been suggested that there my be at least 3 flavours of El Ninos. Are central El Ninos as periodic as eastern ones? The Californian blocking behaviour seems similar to blocking of the NH jet stream (the 'arctic vortex'). Perhaps there is only one dipole and some aperiodic phenomenon modifies it? I'd love to visualise the data for the ocean conveyor and how it's changed since the deeper ocean buoys were deployed. Dara, did you download any of the Argo data?`

The eastern El Nino is considered the "canonical" El Nino. As you say it may be that the central (Modoki?) is an overlapping or modifying effect.

Here is a paper that at least shows the components: [1]F. Xie, J. Li, W. Tian, J. Zhang, and C. Sun, “The relative impacts of El Niño Modoki, canonical El Niño, and QBO on tropical ozone changes since the 1980s,” Environmental Research Letters, vol. 9, no. 6, p. 064020, May 2014. http://iopscience.iop.org/1748-9326/9/6/064020/pdf/1748-9326_9_6_064020.pdf

Parse this sentence:

These narratives are so obtuse.

`>" Are central El Ninos as periodic as eastern ones?" The eastern El Nino is considered the "canonical" El Nino. As you say it may be that the central (Modoki?) is an overlapping or modifying effect. Here is a paper that at least shows the components: [1]F. Xie, J. Li, W. Tian, J. Zhang, and C. Sun, “The relative impacts of El Niño Modoki, canonical El Niño, and QBO on tropical ozone changes since the 1980s,” Environmental Research Letters, vol. 9, no. 6, p. 064020, May 2014. http://iopscience.iop.org/1748-9326/9/6/064020/pdf/1748-9326_9_6_064020.pdf Parse this sentence: > "Modoki activity can affects the tropical TCO is likely to cause by directly transports ozone-poor air in the tropics from the troposphere to the stratosphere, which leads to significant ozone decreases in the lower-middle stratosphere (figures 3 and 4), since El Niño Modoki can significantly enhance the BD circulation (figure 5). " These narratives are so obtuse.`

The Xie paper in #19, although in need of editing, covers the same territory that I am trying to do -- infer both the role of QBO and Solar cycle TSI in ENSO.

Clearly the solar cycle on the decadal range is correlating with ENSO.

The Xie paper brings up another issue for me to ponder.

I am rationalizing why the SOIM wave equation model provides such a good fit to ENSO while no one (to my knowledge) has reported on this approach. Although Clarke and others have described this wave equation formulation, it appears to be comatose in practice. The apparent success that I have been seeing probably comes down to a few reinforcing factors that have never been combined effectively in past studies:

Applying machine learning via symbolic regression to efficiently push the solution in the right direction.

I could see how researchers using Fourier analysis and linear DiffEq approaches might quickly give up, as including a modulated forcing and Hill factor is not conducive to traditional methods of solving the basic DiffEq.

The old adage is always at play to prevent from fooling oneself -- "if it's such a good idea, why hasn't anyone thought of it before?" The answer is that the playing field is complicated. The climate science community has bitten the big bullet and committed themselves to working on immense general circulation models (GCMs) . To consider a radical simplification of a GCM -- to the point that it maps to a simplified wave equation as I am evaluating would be a huge step backward. Yet, that happens over and over again in physics, that a primary effect provides a first-order model to the overall behavior.

`The Xie paper in #19, although in need of editing, covers the same territory that I am trying to do -- infer both the role of QBO and Solar cycle TSI in ENSO. > "An 11-year period is found in the PC1 of EOF analysis of tropical TCO variations (figure 1(a)), which is deemed related to the solar cycle (Shindell et al1999). .... For another, as El Niño Modoki has an 11-year cycle since the 1980s (Ashok et al 2007) and the variations of EMI lead the tropical TCO changes, whether the 11-year cycle of topical TCO also relatives to El Niño Modoki activity which also deserve further investigation by simulation" Clearly the solar cycle on the decadal range is correlating with ENSO. --- The Xie paper brings up another issue for me to ponder. I am rationalizing why the SOIM wave equation model provides such a good fit to ENSO while no one (to my knowledge) has reported on this approach. Although Clarke and others have described this wave equation formulation, it appears to be comatose in practice. The apparent success that I have been seeing probably comes down to a few reinforcing factors that have never been combined effectively in past studies: 1. Combining the periodic QBO, TSI, and wobble terms as an aggregate forcing. 2. Adding a Hill/Mathieu modulation factor to the wave-equation, scaled to the TSI factor. 3. Filtering the forcing terms so they are smooth but modulate in sync with each of the source frequencies 4. Applying machine learning via symbolic regression to efficiently push the solution in the right direction. I could see how researchers using Fourier analysis and linear DiffEq approaches might quickly give up, as including a modulated forcing and Hill factor is not conducive to traditional methods of solving the basic DiffEq. The old adage is always at play to prevent from fooling oneself -- "if it's such a good idea, why hasn't anyone thought of it before?" The answer is that the playing field is complicated. The climate science community has bitten the big bullet and committed themselves to working on immense general circulation models (GCMs) . To consider a radical simplification of a GCM -- to the point that it maps to a simplified wave equation as I am evaluating would be a huge step backward. Yet, that happens over and over again in physics, that a primary effect provides a first-order model to the overall behavior.`

I think you can't really be on your own; it must be possible to get some appropriate climate scientist to comment on your results.

`I think you can't really be on your own; it must be possible to get some appropriate climate scientist to comment on your results.`

Jim, thanks for the advice.

`Jim, thanks for the advice.`

Thinking about it some more, I don't hold much hope that a credentialed climate scientist would be interested in the approach I am taking. I think it may be too barebones. By that I mean that I tend to work the physics down to the bone so that all is left is the first-order impacts.

Recently I have been running a machine learning algorithm on the ENSO SOI data set. The only objective constraint I applied was that the learning algorithm was trying to minimize the wave equation LHS and RHS difference.

$ \frac{d^2 x(t)}{dt^2} = -k(t) x(t) + f(t) $

So it essentially tries to find symbolic representations of f(t) ( and k(t) /= constant for the Hill/Mathieu variant of the wave equation).

What is not unexpected but certainly validating is that the machine learning discovers solutions that substantiate the results that I get from solving the DiffEq via Mathematica.

This is the raw SOI data:

This is an example solution that the machine learning finds, where

Dis the derivative operator -- applied twice -- andsmais a smoothing filter -- applied as a triple window to remove ringing:The algorithm is Eueqa's symbolic regression. The second derivative of the data is plotted alongside that generated by Eureqa (blue=data, red=model fit).

As I have found earlier, the f(t) function has a profile similar to the Quasi-Biennial Oscillation (QBO), which is known to have a relationship to ENSO as a wind-forcing mechanism. The SOI data 1900 to 1980 was used as a training interval, and then the symbolic expression extended outside this interval:

The QBO data is only available after 1953 as radiosonde measurements became common. If we plot f(t) and QBO(t) together one can see how they correlate for the lower altitude 70 hPa QBO data. Remember that the SOI data from 1900 to 1980 was used and it correlates to the QBO

after1980.The fundamental frequency of the QBO corresponds to 2.33 years and this lines up with the higher altitude 20 hPa QBO data:

Another period found by the machine learning corresponds to the 6.47 years of the Chandler Wobble (CW) cycle. Like the QBO this was inferred previously and has its origins as a mechanical forcing due to changes in the angular momentum of the earth.

So the next idea is to feed the machine learning these periods, the QBO=2.33 and CW=6.47 as well as the Hale sunspot cycle of 11 years and see how it can fit the data.

The twist is to use Clarke's estimated value of K=2.2 rad/yr^2 for the ocean's wave resonance and group the x(t) on the LHS, assuming that k(t)=K is constant over time. (It is likely not according to liquid sloshing theory but it is a perturbing simplification here)

$ \frac{d^2 x(t)}{dt^2} + K x(t) = f(t) $

In other words we have the LHS representing mainly data and the RHS the unknown forcing f(t).

LHS data ~ RHS forcing

The other factor we add is a conditional that enables the algorithm to evaluate the Pacific climate shift that occurred around 1976-1977, so that we can use the entire time-series from 1880-2014. Eureqa was also set to minimize the correlation coefficient error, as this works most efficiently to home in on a solution.

The result I picked from the set of solutions is the highest complexity in the Pareto front. This is actually not as a high a complexity as it looks because the 1976 transition essentially doubled the solution space complexity. The blue is the data and the red is the symbolic solution discovered. Look carefully and you can see how well the peaks are covered and how the 1976 transition reveals itself.

The correlation coefficient is illustrated by a model vs data scatter plot:

The symbolic regression fit that Eureqa found is also presented in a parameterized form. Each of the factors has some meaning which I will go through in a separate post. The behavior after 1976 is also very interesting as this could be an effect of a change in TSI, a change in ocean characteristics, or perhaps a resonance in the wave equation or an instability manifested in the sloshing Hill/Mathieu variant. I see this in DiffEq solutions if k(t) modulates too strongly and at a specific frequency.

These are the statistics. The correlation coefficient of 0.76 is very good for such an erratic and noisy signal source.

One can do all this with a spreadsheet as well, by implementing an averager in a cell and then doing a discretized second derivative from neighboring cells.

The overall approach is all very straightforward, as it maps well to the brute-force Mathematica DiffEq integration that I have been favoring elsewhere. Moreover it provides further substantiation for the simple first-order physics that I am applying.

`Thinking about it some more, I don't hold much hope that a credentialed climate scientist would be interested in the approach I am taking. I think it may be too barebones. By that I mean that I tend to work the physics down to the bone so that all is left is the first-order impacts. Recently I have been running a machine learning algorithm on the ENSO SOI data set. The only objective constraint I applied was that the learning algorithm was trying to minimize the wave equation LHS and RHS difference. $ \frac{d^2 x(t)}{dt^2} = -k(t) x(t) + f(t) $ So it essentially tries to find symbolic representations of f(t) ( and k(t) /= constant for the Hill/Mathieu variant of the wave equation). What is not unexpected but certainly validating is that the machine learning discovers solutions that substantiate the results that I get from solving the DiffEq via Mathematica. This is the raw SOI data: ![SOI Raw](http://imageshack.com/a/img673/4210/pIUv0i.gif) This is an example solution that the machine learning finds, where *D* is the derivative operator -- applied twice -- and *sma* is a smoothing filter -- applied as a triple window to remove ringing: ![Example Solution](http://imageshack.com/a/img673/2871/lxC5GR.gif) The algorithm is Eueqa's symbolic regression. The second derivative of the data is plotted alongside that generated by Eureqa (blue=data, red=model fit). ![Example Profile](http://imageshack.com/a/img538/622/uYpOrO.gif) As I have found earlier, the f(t) function has a profile similar to the Quasi-Biennial Oscillation (QBO), which is known to have a relationship to ENSO as a wind-forcing mechanism. The SOI data 1900 to 1980 was used as a training interval, and then the symbolic expression extended outside this interval: ![Training Interval](http://imageshack.com/a/img673/3219/IizGGc.gif) The QBO data is only available after 1953 as radiosonde measurements became common. If we plot f(t) and QBO(t) together one can see how they correlate for the lower altitude 70 hPa QBO data. Remember that the SOI data from 1900 to 1980 was used and it correlates to the QBO _after_ 1980. ![Training Interval Fit](http://imageshack.com/a/img673/5668/5qAAXc.gif) The fundamental frequency of the QBO corresponds to 2.33 years and this lines up with the higher altitude 20 hPa QBO data: ![Simple QBO fit](http://imageshack.com/a/img673/8396/MYxsTj.gif) Another period found by the machine learning corresponds to the 6.47 years of the Chandler Wobble (CW) cycle. Like the QBO this was inferred previously and has its origins as a mechanical forcing due to changes in the angular momentum of the earth. So the next idea is to feed the machine learning these periods, the QBO=2.33 and CW=6.47 as well as the Hale sunspot cycle of 11 years and see how it can fit the data. The twist is to use Clarke's estimated value of K=2.2 rad/yr^2 for the ocean's wave resonance and group the x(t) on the LHS, assuming that k(t)=K is constant over time. (It is likely not according to liquid sloshing theory but it is a perturbing simplification here) $ \frac{d^2 x(t)}{dt^2} + K x(t) = f(t) $ In other words we have the LHS representing mainly data and the RHS the unknown forcing f(t). LHS data ~ RHS forcing The other factor we add is a conditional that enables the algorithm to evaluate the Pacific climate shift that occurred around 1976-1977, so that we can use the entire time-series from 1880-2014. Eureqa was also set to minimize the correlation coefficient error, as this works most efficiently to home in on a solution. ![SOI Solution](http://imageshack.com/a/img540/3766/SUcFcK.gif) The result I picked from the set of solutions is the highest complexity in the Pareto front. This is actually not as a high a complexity as it looks because the 1976 transition essentially doubled the solution space complexity. The blue is the data and the red is the symbolic solution discovered. Look carefully and you can see how well the peaks are covered and how the 1976 transition reveals itself. ![SOI Profile](http://imageshack.com/a/img913/5517/EvH9gl.gif) The correlation coefficient is illustrated by a model vs data scatter plot: ![SOI Scatter](http://imageshack.com/a/img913/4093/p2cLf3.gif) The symbolic regression fit that Eureqa found is also presented in a parameterized form. Each of the factors has some meaning which I will go through in a separate post. The behavior after 1976 is also very interesting as this could be an effect of a change in TSI, a change in ocean characteristics, or perhaps a resonance in the wave equation or an instability manifested in the sloshing Hill/Mathieu variant. I see this in DiffEq solutions if k(t) modulates too strongly and at a specific frequency. ![SOI Model](http://imageshack.com/a/img673/295/gFsHEX.gif) These are the statistics. The correlation coefficient of 0.76 is very good for such an erratic and noisy signal source. ![Model Stats](http://imageshack.com/a/img661/6949/cq18H1.gif) One can do all this with a spreadsheet as well, by implementing an averager in a cell and then doing a discretized second derivative from neighboring cells. The overall approach is all very straightforward, as it maps well to the brute-force Mathematica DiffEq integration that I have been favoring elsewhere. Moreover it provides further substantiation for the simple first-order physics that I am applying.`

0.75 correlation for a forecast seems like good evidence against overfitting.

Perhaps I should have suggested running your model past some 'local' statisticians rather than a climatologist; Steve Venner and Nick Stokes spring to mind.

Have you a link for "Clarke's estimated value of K=2.2 rad/yr^2"?

Eureqa looks dangerously powerful. Do you think it merits an intro in the software pages of the wiki?

I've never tried implementing a 2nd derivative in a spreadsheet. Is an averager the smoothing function?

`0.75 correlation for a forecast seems like good evidence against overfitting. Perhaps I should have suggested running your model past some 'local' statisticians rather than a climatologist; Steve Venner and Nick Stokes spring to mind. Have you a link for "Clarke's estimated value of K=2.2 rad/yr^2"? Eureqa looks dangerously powerful. Do you think it merits an intro in the software pages of the wiki? I've never tried implementing a 2nd derivative in a spreadsheet. Is an averager the smoothing function?`

Jim, The Clarke ref is the first cite at the top of this thread. He uses the basic wave equation, his Eq(4.6) and Eq(4.8)

$ f''(t) + \omega ^2 f(t) = 0 , \ \omega \approx 2\pi/4.25 yr$

which gives K=2.2

I will also have to look at this Clarke ref: A. J. Clarke, “Analytical theory for the quasi-steady and low-frequency equatorial ocean response to wind forcing: The ‘tilt’ and ‘warm water volume’ modes,” Journal of Physical Oceanography, vol. 40, no. 1, pp. 121–137, 2010.

The best antidote for overfitting is to use parameters that have physical significance and are out-of-the-box based on other evidence. Doing a sensitivity analysis around these points might be useful as well.

I have bothered Nick Stokes on his blog before. He is obviously a very experienced hydrodynamics researcher with skills at numerical modeling and statistics. He is a member here, but I don't think he reads it routinely. Like I said, I place comments on his blog, but usually only get feedback from naysayers. A frequent nemesis of mine is Pekka Pirrila. This is what he said after a comment I posted to Nick's blog

## ---

The Eureqa suggestion is intriguing. I have so many tricks up my sleeve as to how to use it properly might warrant a page.

The averager or smoother is interchangeable. Nick Stokes actually has a recent post on his blog pertaining to the challenges of getting a derivative off of a noisy function. One thing he finds is that smoothing is commutative with differentiation -- i.e. one can filter and then differentiate or differentiate and then filter. But some filtering is required otherwise the data is buried in seasonal noise. To bring it full circle, I suggested that this is one application that could benefit from his ideas with regards to a second derivative. I don't recall seeing him reply though.

I can appreciate that he may think that what I am trying to do is trivial in comparison to the compute-heavy 3-D finite element fluid modeling he has probably done over the years. If he is carrying that baggage I can understand. As I said, I am only interested in the first-order effects and don't carry any of that baggage.

`Jim, The Clarke ref is the first [cite](http://yly-mac.gps.caltech.edu/AGU/AGU_2008/Zz_Others/Li_agu08/Clarke2007.pdf) at the top of this thread. He uses the basic wave equation, his Eq(4.6) and Eq(4.8) $ f''(t) + \omega ^2 f(t) = 0 , \ \omega \approx 2\pi/4.25 yr$ which gives K=2.2 I will also have to look at this Clarke ref: A. J. Clarke, “Analytical theory for the quasi-steady and low-frequency equatorial ocean response to wind forcing: The ‘tilt’ and ‘warm water volume’ modes,” Journal of Physical Oceanography, vol. 40, no. 1, pp. 121–137, 2010. The best antidote for overfitting is to use parameters that have physical significance and are out-of-the-box based on other evidence. Doing a sensitivity analysis around these points might be useful as well. I have bothered Nick Stokes on his blog before. He is obviously a very experienced hydrodynamics researcher with skills at numerical modeling and statistics. He is a member here, but I don't think he reads it routinely. Like I said, I place comments on his blog, but usually only get feedback from naysayers. A frequent nemesis of mine is Pekka Pirrila. This is what he said after a comment I posted to Nick's blog > "The paper of Clarke et al is looking at the mechanisms of ENSO and makes conclusions at the level their research can support referring to approx. 2 pi / 4.25 yr as a reasonable interannual frequency, and stating also that "The above idealized model errs in several respects". They present justification for their approach, but consider many details only illustrative. A very different style than what I have seen from you." --- --- --- The Eureqa suggestion is intriguing. I have so many tricks up my sleeve as to how to use it properly might warrant a page. The averager or smoother is interchangeable. Nick Stokes actually has a recent post on his blog pertaining to the challenges of getting a derivative off of a noisy function. One thing he finds is that smoothing is commutative with differentiation -- i.e. one can filter and then differentiate or differentiate and then filter. But some filtering is required otherwise the data is buried in seasonal noise. To bring it full circle, I suggested that this is one application that could benefit from his ideas with regards to a second derivative. I don't recall seeing him reply though. I can appreciate that he may think that what I am trying to do is trivial in comparison to the compute-heavy 3-D finite element fluid modeling he has probably done over the years. If he is carrying that baggage I can understand. As I said, I am only interested in the first-order effects and don't carry any of that baggage.`

Here is a graphic from Los Alamos showing the currents of the world’s oceans

from http://www.lanl.gov/newsroom/picture-of-the-week/pic-week-2.php

I have seen much discussion by skeptics that it is futile to try to simulate turbulence of the ocean and atmosphere due to the extreme resolution needed to solve the Navier-Stokes formula at all spatio-temporal scales.

This brings up a conundrum: why is it possible that the

reallylarge-scale turbulence, such as El Nino and ENSO, can be solved -- as I am suggesting with the SOI model ? Shouldn't that be intractable?I think a solution is actually possible due to the over-riding forcing stimulus. If the external factors did not exist, it is likely that ENSO would not be as strong. Just like tides would not exist without the presence of the moon.

Further, I consider the seemingly chaotic vortices that appear in the graphic above analogous to the scattering of the electrons in a crystal lattice. An individual electron will take a chaotic route to get from point A to B, but overall it is the external potential that enables the electronic device to oscillate its current in interesting ways depending on the specific forcing applied. That is statistical mechanics in action.

To extend the analogy, when I first heard what a dipole was in sophomore E-M theory, it came only after separating the temporal behavior from the spatial characteristics of Maxwell's equation. That was known as a "standing wave" dipole and it was quite common to see such a field for a waveguide or resonant cavity. In this case the math derivation preceded the observation in my education.

So when I started studying the Southern Oscillation of ENSO and saw mention of the dipole nature of this phenomenon, things started to click and I decided to see how far one can go from this kind of first-order analysis.

This forum thread is essentially a status of the results so far, and I am more convinced that ENSO is solvable than when I first started on this project.

And that is why I think the skeptics are a bit mad about suggesting that climate can't be simulated at the ENSO scale.

`Here is a graphic from Los Alamos showing the currents of the world’s oceans ![img](http://assets.climatecentral.org/images/made/images/remote/http_assets.climatecentral.org/images/uploads/news/4_1_15_Brian_LosAlamosOceanModel_720_494_s_c1_c_c.jpg) from http://www.lanl.gov/newsroom/picture-of-the-week/pic-week-2.php > "How currents and eddies move that heat around is of particular interest to scientists, but the current crop of climate models can’t fully reproduce them." -- http://www.climatecentral.org/news/one-image-future-climate-models-18844 I have seen much discussion by skeptics that it is futile to try to simulate turbulence of the ocean and atmosphere due to the extreme resolution needed to solve the [Navier-Stokes](http://en.wikipedia.org/wiki/Navier%E2%80%93Stokes_equations) formula at all spatio-temporal scales. This brings up a conundrum: why is it possible that the *really* large-scale turbulence, such as El Nino and ENSO, can be solved -- as I am suggesting with the [SOI model](http://contextearth.com/2014/11/18/paper-on-sloshing-model-for-enso/) ? Shouldn't that be intractable? I think a solution is actually possible due to the over-riding forcing stimulus. If the external factors did not exist, it is likely that ENSO would not be as strong. Just like tides would not exist without the presence of the moon. Further, I consider the seemingly chaotic vortices that appear in the graphic above analogous to the scattering of the electrons in a crystal lattice. An individual electron will take a chaotic route to get from point A to B, but overall it is the external potential that enables the electronic device to oscillate its current in interesting ways depending on the specific forcing applied. That is statistical mechanics in action. To extend the analogy, when I first heard what a dipole was in sophomore E-M theory, it came only after separating the temporal behavior from the spatial characteristics of Maxwell's equation. That was known as a "standing wave" dipole and it was quite common to see such a field for a waveguide or resonant cavity. In this case the math derivation preceded the observation in my education. So when I started studying the Southern Oscillation of ENSO and saw mention of the dipole nature of this phenomenon, things started to click and I decided to see how far one can go from this kind of first-order analysis. This forum thread is essentially a status of the results so far, and I am more convinced that ENSO is solvable than when I first started on this project. And that is why I think the skeptics are a bit mad about suggesting that climate can't be simulated at the ENSO scale.`

I am rationalizing why solving the wave equation is so difficult. This is a 2nd-order differential equation, with a few forcing factors applied. It shouldn't be hard to solve via a multiple linear regression or by applying a Fourier or Laplace transform and solving in reciprocal space. I am sure that researchers have tried this, but perhaps became frustrated. I think the reason for the difficulty is that any Hill/Mathieu modulation on the characteristic frequency invalidates the traditional linear approaches of solving such a model. The best way is to brute force a solution by solving numerically. That's why I have leaned so heavily on using Mathematica.

I mentioned a couple of electrical engineering analogies in the last comment -- that of carrier scattering representing statistical noise and Maxwell's equation modeling a dipole. Another one is solving Schroedinger's equation in the context of a semiconductor lattice and how this relates. Fundamentally, the math behind semiconductor band formation is no different than the Hill or Mathieu equation applied to the wave equation.

But no one harbors any illusions that solving the band structure of a semiconductor lattice is a walk in the park. The periodic modulation of a lattice potential as applied to the electron will cause all sorts of odd spatial wiggles in the allowed states. This is the basis of the infamous Bloch wave. I recognized this in the first blog post I wrote when I started looking at ENSO.

The first iteration of finding the Hill/Mathieu modulation is reviewed below.

The equation I am solving in Mathematica is

where CF is the characteristic frequency term squared, Hill is the modulation, and RHS is the forcing.

The factors contributing to the Hill and RHS modulation are based on models of QBO, Chandler Wobble, and TSI as I described earlier. The shape of these profiles are captured as a Fourier series of sine waves with a fidelity that achieves at least 70% of a unity correlation coefficient. Those numbers are shown in panels 3 through 5 below with the correlation coefficient percentage shown to upper left of each graph. The last two panels show the modulation. Interestingly, the Hill modulation appears dominated by the TSI factor while the RHS shows the more rapid wiggles of the QBO.

Isn't applied mathematical physics fun?

`I am rationalizing why solving the wave equation is so difficult. This is a 2nd-order differential equation, with a few forcing factors applied. It shouldn't be hard to solve via a multiple linear regression or by applying a Fourier or Laplace transform and solving in reciprocal space. I am sure that researchers have tried this, but perhaps became frustrated. I think the reason for the difficulty is that any Hill/Mathieu modulation on the characteristic frequency invalidates the traditional linear approaches of solving such a model. The best way is to brute force a solution by solving numerically. That's why I have leaned so heavily on using Mathematica. I mentioned a couple of electrical engineering analogies in the last comment -- that of carrier scattering representing statistical noise and Maxwell's equation modeling a dipole. Another one is solving Schroedinger's equation in the context of a semiconductor lattice and how this relates. Fundamentally, the math behind semiconductor band formation is no different than the Hill or Mathieu equation applied to the wave equation. But no one harbors any illusions that solving the band structure of a semiconductor lattice is a walk in the park. The periodic modulation of a lattice potential as applied to the electron will cause all sorts of odd spatial wiggles in the allowed states. This is the basis of the infamous [Bloch wave](http://en.wikipedia.org/wiki/Bloch_wave). I recognized this in the [first blog post](http://contextearth.com/2014/02/10/the-southern-oscillation-index-model/) I wrote when I started looking at ENSO. The first iteration of finding the Hill/Mathieu modulation is reviewed below. The equation I am solving in Mathematica is NDSolve[{y''[t]+(CF+Hill[t])*y[t] == RHS[t] where CF is the characteristic frequency term squared, Hill is the modulation, and RHS is the forcing. The factors contributing to the Hill and RHS modulation are based on models of QBO, Chandler Wobble, and TSI as I described earlier. The shape of these profiles are captured as a Fourier series of sine waves with a fidelity that achieves at least 70% of a unity correlation coefficient. Those numbers are shown in panels 3 through 5 below with the correlation coefficient percentage shown to upper left of each graph. The last two panels show the modulation. Interestingly, the Hill modulation appears dominated by the TSI factor while the RHS shows the more rapid wiggles of the QBO. ![ENSO](http://imageshack.com/a/img538/7056/n8NiX1.gif) Isn't applied mathematical physics fun?`

This is all very interesting!

The best I can instantly do is this page, which provides the Southern Oscillation Index (SOI) since 1882. I doubt the Niño 3.4 goes back before 1950, since it involves measuring the temperature of a large patch of water in the Pacific Ocean. But the SOI is also a good El Niño diagnostic.

`This is all very interesting! > All I have to do is find a list of identified El Niño and La Niña events prior to 1950. The best I can instantly do is [this page](http://www.cpc.ncep.noaa.gov/data/indices/), which provides the Southern Oscillation Index (SOI) since 1882. I doubt the Niño 3.4 goes back before 1950, since it involves measuring the temperature of a large patch of water in the Pacific Ocean. But the SOI is also a good El Niño diagnostic.`

Thanks John, A convenient place to get the SOI data is the Climate Explorer.

It also has a seasonal filter to get the underlying signal http://climexp.knmi.nl/getindices.cgi?WMO=CRUData/soi&STATION=SOI&TYPE=i&id=someone@somewhere

`Thanks John, A convenient place to get the SOI data is the Climate Explorer. It also has a seasonal filter to get the underlying signal http://climexp.knmi.nl/getindices.cgi?WMO=CRUData/soi&STATION=SOI&TYPE=i&id=someone@somewhere`

Here is the RHS of the SOI DiffEq model based on two different techniques. The representation in blue is from maximizing the correlation coefficient of a DiffEq solution via Mathematica (see Comment #17 ). The red dotted profile is based on letting loose the Eureqa symbolic regression solver on the LHS constraint and unknown RHS: The solver then combined algebraic operators and sinusoids to arrive at an analytical expression which matched the residual RHS data. This is an extremely powerful technique.

Amazing that so little guidance is required for Eureqa to find a solution that essentially maps to the hypothesized aggregate input of QBO, TSI, and CW that was fed into the Mathematica DiffEq solution.

The only RHS constraint that was provided was a

suggestionthat a conditional building block centered at 1974 (i.e Time > 1974 or Time < 1974) could be used as a "no penalty" factor in arriving at a solution. The transition between 1974 and 1980 is obvious and shows a clear shift from an approximate 22 year beat envelope to a stronger 11 year beat envelope.This is also clearly seen in the wavelet spectrum that I described in the white paper. In the figure below, month 1200 corresponds to the year 1980. Above ~1980, a strong lower frequency beat component is seen in the scalogram.

Things are still holding together, and if anything the ENSO solution is as far from being a chaotic mess as I can imagine. Certainly, the answer is more complex than a single sinusoid, but not that unusual considering what is routinely solved in other scientific and engineering disciplines -- e.g. band structures, etc.

`Here is the RHS of the SOI DiffEq model based on two different techniques. The representation in blue is from maximizing the correlation coefficient of a DiffEq solution via Mathematica (see [Comment #17](https://forum.azimuthproject.org/discussion/comment/14461/#Comment_14461) ). The red dotted profile is based on letting loose the Eureqa symbolic regression solver on the LHS constraint and unknown RHS: ![eureqa](http://imageshack.com/a/img909/4924/k298G8.png) The solver then combined algebraic operators and sinusoids to arrive at an analytical expression which matched the residual RHS data. This is an extremely powerful technique. ![fit comparison](http://imageshack.com/a/img911/6858/BaRrOP.gif) Amazing that so little guidance is required for Eureqa to find a solution that essentially maps to the hypothesized aggregate input of QBO, TSI, and CW that was fed into the Mathematica DiffEq solution. The only RHS constraint that was provided was a *suggestion* that a conditional building block centered at 1974 (i.e Time > 1974 or Time < 1974) could be used as a "no penalty" factor in arriving at a solution. The transition between 1974 and 1980 is obvious and shows a clear shift from an approximate 22 year beat envelope to a stronger 11 year beat envelope. This is also clearly seen in the wavelet spectrum that I described in the white paper. In the figure below, month 1200 corresponds to the year 1980. Above ~1980, a strong lower frequency beat component is seen in the scalogram. ![wavelet](http://imageshack.com/a/img904/714/jkgkWf.gif) Things are still holding together, and if anything the ENSO solution is as far from being a chaotic mess as I can imagine. Certainly, the answer is more complex than a single sinusoid, but not that unusual considering what is routinely solved in other scientific and engineering disciplines -- e.g. band structures, etc.`

I posted a reply to your 11 April 2015 comment on the Azimuth blog. I think it might be interesting and useful to try to publish this work in an academic journal... but I mentioned two things that I believe would greatly improve your (or our) odds of success.

There's also another very minor thing: it would be better if you spelled El Niño correctly in your paper! Such details can make a big difference when a referee is trying to size up a paper by an author they don't know.

`I posted a reply to [your 11 April 2015 comment on the Azimuth blog](https://johncarlosbaez.wordpress.com/2014/06/24/el-nino-project-part-2/#comment-65699). I think it might be interesting and useful to try to publish this work in an academic journal... but I mentioned two things that I believe would greatly improve your (or our) odds of success. There's also another very minor thing: it would be better if you spelled El Niño correctly in your paper! Such details can make a big difference when a referee is trying to size up a paper by an author they don't know.`

Thanks John, I checked for spellings of El Nino in the paper and didn't find anything odd.

`Thanks John, I checked for spellings of El Nino in the paper and didn't find anything odd.`

I mentioned that I would place a canonical representation of the SOI model on the forum. This is a typical combination of the factors representing the QBO, TSI, Chandler Wobble (CW), plus a slowly varying multi-decadal term (MW). Each of these is represented as a RHS forcing and as a Mathieu/Hill modulation term. The characteristic frequency of the wave equation is embodied in the CF term.

That's it, with initial conditions specified and x in years.

These are the fundamental frequencies for each of the factors. The characteristic frequency was determined by Clarke at FSU.

For the terms representing the four factors, I did my best to emulate the real data as a combination of sinusoidal signals leading to the following closed-form expressions. This makes it easier to solve the differential equation -- as the Mathematica solver can interpolate as necessary to converge to a solution:

The following figure shows how well the waveforms match the actual data for QBO, CW, and TSI. The correlation coefficient multiplied by 100 is shown above each time-series. The closer one can correlate to these waveforms , the better the fit to the SOI data.

And here is the fit of the canonical model to the SOI data for the years ranging from 1880 to 2013:

The bottom two graphs are the wavelet scalograms of the data and of the model. I am still looking for a goodness of fit measure for these scalograms.

The bar chart is the classifier result. Calculated for every month of data, it shows the probability that the model and SOI data have the same sign, which is above 70% of the time. Although this may not look so good on first glance, it is excellent considering that the two dipole values of Tahiti and Darwin only have the opposite signed excursion only about 2/3 of the time! That is what we are dealing with in terms of measuring the actual climate dipole in action.

EDITJim reminded me that the sigmoid utility functions were inadvertently omitted`I mentioned that I would place a canonical representation of the SOI model on the forum. This is a typical combination of the factors representing the QBO, TSI, Chandler Wobble (CW), plus a slowly varying multi-decadal term (MW). Each of these is represented as a RHS forcing and as a Mathieu/Hill modulation term. The characteristic frequency of the wave equation is embodied in the CF term. (* The SOIM differential equation solved with the Mathematica NDSolve function *) RHS[x_]:= .9*(.063*qbo[x+.51]*(1+.108*tsi[x-1.1])-.0192*cw[x+1.7]-.012*tsi[x-4.2])+.018*mw[x]; Hill[x_] := 1.32*(.37*tsi[x-1.6]-.52*cw[x+2.94]-.24*qbo[x+.6]*(1-.54*tsi[x-3.1])) ; SOLN = NDSolve[{y''[x]+(CF+Hill[x])*y[x] == RHS[x], y'[52]==.0, y[52]==.0015}, y, {x, -20, 160}]; That's it, with initial conditions specified and x in years. These are the fundamental frequencies for each of the factors. The characteristic frequency was determined by Clarke at FSU. (* constants for QBO=2.33yr, TS=10.6yr, CW=6.46yr(432day beat with 365day) and the Clarke CharFreq=4.24yr *) QB=2.696; TS:=0.5909; CF = 2.197; CW = 0.9726; MW := 0.116; For the terms representing the four factors, I did my best to emulate the real data as a combination of sinusoidal signals leading to the following closed-form expressions. This makes it easier to solve the differential equation -- as the Mathematica solver can interpolate as necessary to converge to a solution: (* closed form representations of the QBO, TSI, CW, MW data *) qbo[x_] := 1.1*Cos[QB*x+1.5+.35*Sin[.515*x-2.] ]; (* QBO time series *) tsi[x_] := (Cos[TS*x+.357]*(1+3.5*Sin[(x-35.4)/217]) + (* TSI time series low-pass filtered *) .6*sig2[x,5.7,25.1,28.]+ 2.5*sig2[x,1.6,87.9,89.6])*((1+.3*sigmoid[x,106.4])/1.3); cw[x_] := 1.11*Sin[CW*(x+0.88)+0.2*Sin[0.361*(x+5.3)]] ; (* Chandler wobble time series *) mw[x_] := Cos[MW*x-0.7]; (* inferred multi-decadal forcing, LOD ? Markowitz wobble ? *) The following figure shows how well the waveforms match the actual data for QBO, CW, and TSI. The correlation coefficient multiplied by 100 is shown above each time-series. The closer one can correlate to these waveforms , the better the fit to the SOI data. ![terms](http://imageshack.com/a/img537/3733/bvRsJf.gif) And here is the fit of the canonical model to the SOI data for the years ranging from 1880 to 2013: ![fit](http://imageshack.com/a/img537/3760/4vmQ0H.gif) The bottom two graphs are the wavelet scalograms of the data and of the model. I am still looking for a goodness of fit measure for these scalograms. The bar chart is the classifier result. Calculated for every month of data, it shows the probability that the model and SOI data have the same sign, which is above 70% of the time. Although this may not look so good on first glance, it is excellent considering that the two dipole values of Tahiti and Darwin only have the opposite signed excursion only about 2/3 of the time! That is what we are dealing with in terms of measuring the actual climate dipole in action. ![TahitiVsDarwin](http://imageshack.com/a/img538/8718/v3ZrA7.gif) --- *EDIT* Jim reminded me that the sigmoid utility functions were inadvertently omitted (* shaping functions for capturing model of TSI *) sig2[x_, W_, T1_, T2_] := -1/(1+Exp[W*(x-T1)])+1/(1+Exp[W*(x-T2)]); sigmoid[x_, T_] := 1/(1+Exp[2.7*(x-T)]);`

The TSI plots here look quite differently from yours.

`The TSI plots <a href="http://lasp.colorado.edu/home/sorce/data/tsi-data/#plots">here</a> look quite differently from yours.`

Because I do a mean filter adjustment. That creates a balanced waveform about zero. I still have the clear dip around 1970 :) Long wavelength modulation goes in to the mw term. Next question, please.

`Because I do a mean filter adjustment. That creates a balanced waveform about zero. I still have the clear dip around 1970 :) Long wavelength modulation goes in to the mw term. Next question, please.`

Urghh, I've just tried to find the mathematica sig2 and sigmoid functions in the mathematica documentation and can only find a LogisticSigmoid and 2 references in google books to a sig2 = sig[x] * sig[x], sig = and sig = . Any links Paul?

Nonetheless, computing for physics is real hard fun :) I'm still wading throught all your references.

ps, fwiw might Physical Oceanography where Clarke publishes a lot of papers be a good target?

`Urghh, I've just tried to find the mathematica sig2 and sigmoid functions in the mathematica documentation and can only find a LogisticSigmoid and 2 references in google books to a sig2 = sig[x] * sig[x], sig = <a covariance matrix> and sig = <a random integer>. Any links Paul? Nonetheless, computing for physics is real hard fun :) I'm still wading throught all your references. ps, fwiw might Physical Oceanography where Clarke publishes a lot of papers be a good target?`

Jim, Those are utility functions I wrote that I forgot to include. I will attach them in a bit. But now that you mention it, I should probably use the LogisticSigmoid instead as that is built in!

In the meantime you can blank those out as they are only used to shape the TSI time-series from a sinusoid to a modulated sine. The sigmoid serves the same role as a tanh function as it provides a gradual envelope from one peak-to-peak level to another level. The sig2 combines two back-to-back sigmoids to provide a modulated peak or valley on an envelope. That is how I got the dip in TSI around 1970.

Ostensibly, I should find an algorithm that can interpolate the QBO, TSI, and CW time series with enough resolution that the DiffEq solver can continuously integrate across the time span.

thanks

`Jim, Those are utility functions I wrote that I forgot to include. I will attach them in a bit. But now that you mention it, I should probably use the LogisticSigmoid instead as that is built in! In the meantime you can blank those out as they are only used to shape the TSI time-series from a sinusoid to a modulated sine. The sigmoid serves the same role as a tanh function as it provides a gradual envelope from one peak-to-peak level to another level. The sig2 combines two back-to-back sigmoids to provide a modulated peak or valley on an envelope. That is how I got the dip in TSI around 1970. Ostensibly, I should find an algorithm that can interpolate the QBO, TSI, and CW time series with enough resolution that the DiffEq solver can continuously integrate across the time span. thanks`

Jim, I updated #33 with the missing sigmoid functions.

Incidentally, besides Mathematica, I can also compute the DiffEq using an R solver. Intriguing also to consider is the JavaScript version in jsxgraph that is being used in the Azimuth github project.

`Jim, I updated #33 with the missing sigmoid functions. Incidentally, besides Mathematica, I can also compute the DiffEq using an R solver. Intriguing also to consider is the JavaScript version in jsxgraph that is being used in the Azimuth github project.`

Tnx Paul, I'm looking some spectra of your equation and learning a lot :).

`Tnx Paul, I'm looking some spectra of your equation and learning a lot :).`

Jim, As in wavelet spectra or classic Fourier?

The impact of modulating a wave equation is that the well-understood spectral lines can split in complex ways. And then with a forcing added to the mix, it can really get weird.

`> "Tnx Paul, I'm looking some spectra of your equation and learning a lot" Jim, As in wavelet spectra or classic Fourier? The impact of modulating a wave equation is that the well-understood spectral lines can split in complex ways. And then with a forcing added to the mix, it can really get weird.`

As food for thought, I think what's missing in various El Nino analyses is that the phenomenon itself exists on a continuum. So certainly there are cases of strong El Nino that get the top billing, but the rest of the peaks and valleys are still there and that's what this style of SOI analysis is all about.

The fundamental concept of the SOI dipole is that a sloshing of the ocean and of the thermocline is taking place and due to the enormous inertial mass of the ocean's waters, this sloshing can't stop on a dime. But what it can do is respond to forcing functions that are at approximately at the same frequency as the characteristic (i.e. resonant) frequency of the underlying wave equation.

That means that frequencies close to the Clarke characteristic period of 4.25 years are very effective at controlling the dynamics of ENSO. So what frequencies are close to 4 years? The QBO at 2.33 years is an obvious candidate to consider. As is the Chandler wobble beat frequency of 6.4 years. The TSI signal has a period of about 10.6 years, which makes it less of a factor, but then again it does likely have higher harmonics due to the fast rise on a cycle. That is part of the reason that they are included in the analysis. It doesn't hurt that the literature is full of assertions that the QBO clearly interacts with ENSO, that the Chandler wobble is associated with oceanic motion, and suggestions that the Solar Hale cycle of sunspot activity has interactions with the QBO and certainly with the ocean absorbing periodic incoming radiation.

(The other frequencies such as tides, seasonal, and daily operate too quickly and are not as close to creating resonances with the 4.25 year characteristic period. Although one of the tidal beat frequencies is 1/2 of the 8.85 year lunar precession)

In essence, the DiffEq solution provides the fundamental behavior that any strong El Nino behavior will likely spring from. It appears to be a low probability occurence that an El Nino year will not coincide with a negative excursion corresponding to the SOI model. See comments #1 and #6.

As a diversion to John's suggestion of applying statistical measures, I have been experimenting with the machine learning Eureqa tool, and with its AIC-based error measure. Sure enough, the tool isolates the 2.33 year QBO, the 6.5 year CW, and a 10-11 year period on certain intervals. And the 4 year characteristic period is found when cast as a wave equation. The AIC error metric penalizes too complicated a model and thus it appears that it is providing evidence that this (DiffEq in comment #33) could be the simplest / canonical formulation for the SOI dipole. So there is at least some Aikike Information Criteria statistical support that the model is on the right track. Criteria are often useless unless other candidate models are supplied to evaluate the target model. So by creating a constantly evolving set of alternative symbolic formulations, Eureqa can gauge the most efficient information-weighted model on a Pareto curve.

I was able to mimic what Eureqa is doing to some extent on an Excel spreadsheet and the results are very interesting. This basically parameterizes the symbolic expression in comment #33 and the Excel evolutionary solver does a nice job of finding the optimal set in the context of an error-minimizing regression analysis. It really excels (!) at evaluating a training interval and then showing whether the correlation extends outside of the training interval. I will show these graphs when I finish polishing them up.

`As food for thought, I think what's missing in various El Nino analyses is that the phenomenon itself exists on a continuum. So certainly there are cases of strong El Nino that get the top billing, but the rest of the peaks and valleys are still there and that's what this style of SOI analysis is all about. The fundamental concept of the SOI dipole is that a sloshing of the ocean and of the thermocline is taking place and due to the enormous inertial mass of the ocean's waters, this sloshing can't stop on a dime. But what it can do is respond to forcing functions that are at approximately at the same frequency as the characteristic (i.e. resonant) frequency of the underlying wave equation. That means that frequencies close to the Clarke characteristic period of 4.25 years are very effective at controlling the dynamics of ENSO. So what frequencies are close to 4 years? The QBO at 2.33 years is an obvious candidate to consider. As is the Chandler wobble beat frequency of 6.4 years. The TSI signal has a period of about 10.6 years, which makes it less of a factor, but then again it does likely have higher harmonics due to the fast rise on a cycle. That is part of the reason that they are included in the analysis. It doesn't hurt that the literature is full of assertions that the QBO clearly interacts with ENSO, that the Chandler wobble is associated with oceanic motion, and suggestions that the Solar Hale cycle of sunspot activity has interactions with the QBO and certainly with the ocean absorbing periodic incoming radiation. (The other frequencies such as tides, seasonal, and daily operate too quickly and are not as close to creating resonances with the 4.25 year characteristic period. Although one of the tidal beat frequencies is 1/2 of the 8.85 year [lunar precession](http://en.wikipedia.org/wiki/Lunar_month#Anomalistic_month)) In essence, the DiffEq solution provides the fundamental behavior that any strong El Nino behavior will likely spring from. It appears to be a low probability occurence that an El Nino year will not coincide with a negative excursion corresponding to the SOI model. See comments #1 and #6. --- As a diversion to John's suggestion of applying statistical measures, I have been experimenting with the machine learning Eureqa tool, and with its AIC-based error measure. Sure enough, the tool isolates the 2.33 year QBO, the 6.5 year CW, and a 10-11 year period on certain intervals. And the 4 year characteristic period is found when cast as a wave equation. The AIC error metric penalizes too complicated a model and thus it appears that it is providing evidence that this (DiffEq in comment #33) could be the simplest / canonical formulation for the SOI dipole. So there is at least some Aikike Information Criteria statistical support that the model is on the right track. Criteria are often useless unless other candidate models are supplied to evaluate the target model. So by creating a constantly evolving set of alternative symbolic formulations, Eureqa can gauge the most efficient information-weighted model on a Pareto curve. I was able to mimic what Eureqa is doing to some extent on an Excel spreadsheet and the results are very interesting. This basically parameterizes the symbolic expression in comment #33 and the Excel evolutionary solver does a nice job of finding the optimal set in the context of an error-minimizing regression analysis. It really excels (!) at evaluating a training interval and then showing whether the correlation extends outside of the training interval. I will show these graphs when I finish polishing them up.`

I'm trying to learn something about wavelets (I have my own crude fft code) using the http://climexp.knmi.nl/diffdat.cgifun KNMI Climate Explorer tool

I annually detrended and chose 64 bins (cf default 3) for the observed soim data as as a comparative model to your equation.

The other week I did some comparison between the observed data for 1880-1930, 1930-1975 and 1976-2014 and thought I could see different periodicities decreasing very roughly from 60, 40 to 20 yrs. Trouble is, I can't remember if I saved the images or how I produced them which is about as much help as a pogostick in quicksand :(.

`I'm trying to learn something about wavelets (I have my own crude fft code) using the http://climexp.knmi.nl/diffdat.cgifun [KNMI Climate Explorer tool](http://climexp.knmi.nl/diffdat.cgi) I annually detrended and chose 64 bins (cf default 3) for the observed soim data as as a comparative model to your equation. ![morlet6soim18602014](https://dl.dropboxusercontent.com/u/61621163/Images/Morlet61860-2014-720x640.png) The other week I did some comparison between the observed data for 1880-1930, 1930-1975 and 1976-2014 and thought I could see different periodicities decreasing very roughly from 60, 40 to 20 yrs. Trouble is, I can't remember if I saved the images or how I produced them which is about as much help as a pogostick in quicksand :(.`

Jim, I am using the Mathematica wavelet scalogram in #33. Yours looks different partly because you have the period increasing on the vertical axis while mine is decreasing upward.

Here is an amazing machine learning run using Excel that I mentioned in my previous comment :

The solver only operated on post 1980 data using the fixed periods of 2.33 years (QBO), 6.4 years (CW), and

actual TSI dataand deduced the LHS and RHS of the model DiffEq by minimizing the difference and maximizing the correlation between LHS and RHS alongonly the training intervalfrom 1980 onwards. All the dozen parameters are found by the Excel solver using the Evolutionary selector.Although the fit is not the greatest on the training interval, it projects backwards remarkably well over the validation interval extending back to 1880 ! Each of the peaks and valleys is identified save for a few between 1895 and 1900 and one right after 1921 -- the fact that there are a few is actually good because this eliminates an artifactual basis for the fit.

This is what I might consider excellent statistical validation for the model.

BTW, I do listen to all suggestions so that John's advice to doing more statistical eval and Nad's suggestion to take the actual TSI data was taken to heart. Thanks.

`Jim, I am using the Mathematica wavelet scalogram in #33. Yours looks different partly because you have the period increasing on the vertical axis while mine is decreasing upward. --- Here is an amazing machine learning run using Excel that I mentioned in my previous comment : ![machineLearn](http://imageshack.com/a/img909/2582/YJfdLh.gif) The solver only operated on post 1980 data using the fixed periods of 2.33 years (QBO), 6.4 years (CW), and *actual TSI data* and deduced the LHS and RHS of the model DiffEq by minimizing the difference and maximizing the correlation between LHS and RHS along *only the training interval* from 1980 onwards. All the dozen parameters are found by the Excel solver using the Evolutionary selector. Although the fit is not the greatest on the training interval, it projects backwards remarkably well over the validation interval extending back to 1880 ! Each of the peaks and valleys is identified save for a few between 1895 and 1900 and one right after 1921 -- the fact that there are a few is actually good because this eliminates an artifactual basis for the fit. This is what I might consider excellent statistical validation for the model. BTW, I do listen to all suggestions so that John's advice to doing more statistical eval and Nad's suggestion to take the actual TSI data was taken to heart. Thanks.`

I found these 3 periodograms I made:

which show some changes. I'm just following the heuristics of random waving and the open source principle of publish first,think later. Good training though.

`I found these 3 periodograms I made: ![Soim periodogram 1860-1930](https://dl.dropboxusercontent.com/u/61621163/Images/periodisoim1-1860-1930h.png) ![Soim periodogram1931-1975](https://dl.dropboxusercontent.com/u/61621163/Images/periodisoim1-1930-1976h.png) ![SoimPeriodogram 1976-2014](https://dl.dropboxusercontent.com/u/61621163/Images/periodisoim1-1976-2014h.png) which show some changes. I'm just following the heuristics of random waving and the open source principle of publish first,think later. Good training though.`

What I may have learned about this data is that conventional time series analysis will drive you crazy. The SOI data has to be transformed somehow so that it starts to show the underlying periodic nature. Something as simple as adding the 2nd-derivative to the waveform itself will partially cancel the characteristic frequency and begin to reveal the details.

As a case in point, Jim's three periodograms do not show the 2.33 year QBO frequency but which are readily revealed via comment #43. So the key is trying to re-evaluate the data in different ways, and definitely wavelets are part of that strategy.

And again as I mentioned before, I am looking for a metric that can correlate between wavelet scalograms. It appears that this gives another degree of freedom, i.e. 2 axis instead of 1, so that it will be much more difficult to obtain fortuitous correlations, which is a common problem of time series analysis, especially with overfitted models. Yet in two dimensions, no two fingerprints are alike so that it might help to substantiate the agreement between model and data.

`What I may have learned about this data is that conventional time series analysis will drive you crazy. The SOI data has to be transformed somehow so that it starts to show the underlying periodic nature. Something as simple as adding the 2nd-derivative to the waveform itself will partially cancel the characteristic frequency and begin to reveal the details. As a case in point, Jim's three periodograms do not show the 2.33 year QBO frequency but which are readily revealed via comment #43. So the key is trying to re-evaluate the data in different ways, and definitely wavelets are part of that strategy. And again as I mentioned before, I am looking for a metric that can correlate between wavelet scalograms. It appears that this gives another degree of freedom, i.e. 2 axis instead of 1, so that it will be much more difficult to obtain fortuitous correlations, which is a common problem of time series analysis, especially with overfitted models. Yet in two dimensions, no two fingerprints are alike so that it might help to substantiate the agreement between model and data.`

I only just noticed #43. Is your .xls file downloadable from somewhere? If you haven't come across it there's an azimuth-project repo on github. If you (and/or Dara and/or Nick Stokes) have code Azimuth folks might use then just post your github.com account name and I'll add you as a committer.

`I only just noticed #43. Is your .xls file downloadable from somewhere? If you haven't come across it there's an [azimuth-project repo](https://github.com/azimuth-project) on github. If you (and/or Dara and/or Nick Stokes) have code Azimuth folks might use then just post your github.com account name and I'll add you as a committer.`

Jim,

Yes, certainly -- my GitHub account name is pukpr

And I can put you on my code project if you are interested http://github.com/pukpr/context (this is private but I add people upon request)

`Jim, Yes, certainly -- my GitHub account name is pukpr And I can put you on my code project if you are interested http://github.com/pukpr/context (this is private but I add people upon request)`

I think I've managed to add you to the azimuth-project github as an owner. Please upload anything you think might be interesting. I'd like to know what codes your have on pukpr but I guess there's a low prob of me contributing any commits(but I might be able to nudge you to open source anything useful).

`I think I've managed to add you to the azimuth-project github as an owner. Please upload anything you think might be interesting. I'd like to know what codes your have on pukpr but I guess there's a low prob of me contributing any commits(but I might be able to nudge you to open source anything useful).`

Jim, Most of the code I have on my context project is in Prolog, so it can interact with a graph database, RDF, ontologies, and then do all sorts of logical reasoning on the data. It is all open source but I somehow don't see anybody jumping at the chance to work it. BTW, this was an outgrowth from a $4M research project that I finished up a couple years ago..

`Jim, Most of the code I have on my context project is in Prolog, so it can interact with a graph database, RDF, ontologies, and then do all sorts of logical reasoning on the data. It is all open source but I somehow don't see anybody jumping at the chance to work it. BTW, this was an outgrowth from a $4M research project that I finished up a couple years ago..`

Does the second sentence mean that all that $4m's work was too premature, obscure, useless or ano.? I don't think xml is much use, nor xslt. I'd like to know what Nad makes of your answer. I had a go at classifying the universe in comprehensive language with the late 90s-early noughties Java ontology fad; the laws of Codd are still the only micro-epistemology (DSL) I've found I've needed in practice.

`Does the second sentence mean that all that $4m's work was too premature, obscure, useless or ano.? I don't think xml is much use, nor xslt. I'd like to know what Nad makes of your answer. I had a go at classifying the universe in comprehensive language with the late 90s-early noughties Java ontology fad; the laws of Codd are still the only micro-epistemology (DSL) I've found I've needed in practice.`