
Right now I feel Graham is way ahead of me; I need to slowly catch up by writing blog articles explaining his work. That's my main way to understand things.

However, I think it would be good to talk about what to do next. Dara O Shayda is now offering us a lot of computer power, data storage, and his own abilities. Daniel Mahler is now starting to produce interesting visualizations of what look like "empirical orthogonal functions" for the Earth's climate. WebHubTel is working on simple physical models of El Niños. I'm having trouble keeping up with what they're doing! It would be nice if we could agree on some goals and then work toward those goals... leaving room for different people to do what they enjoy, of course.

Here are some rough ideas:

1) It should be possible to simplify Ludescher *et al*'s prediction algorithm while having it work just as well, or almost as well.

a) Graham has already found at least one example: replace correlations by covariances! Covariances are logically simpler, but they work about equally well:

b) Another possibility is this. Ludescher *et al* work with

- quantities $C_{i,j}^t(\tau)$ that say how air temperatures at times $t$ at locations $i$ *inside* the El Niño basin are correlated to air temperatures a time $\tau$ earlier *outside* the El Niño basin

and

- quantities $C_{i,j}^t(-\tau)$ that say how air temperatures at times $t$ at locations $i$ *outside* the El Niño basin are correlated to air temperatures a time $\tau$ earlier *inside* the El Niño basin.

These are logically quite distinct, and we might hope that one is more important than another for El Niño prediction. We can find out.

c) Ludescher *et al* define their link strengths $S_{i,j}$ in terms of $C_{i,j}^t(\tau)$ and $C_{i,j}^t(-\tau)$ where $\tau$ ranges from 0 to 200 days. Do we really need this, or would some specific choice of $\tau$ work almost as well?

Etcetera. The point of this sub-project is to better understand what really matters.
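To make ideas (a) to (c) concrete, here is a minimal sketch of a time-delayed covariance and correlation between two temperature series. It is an illustration under assumptions, not Ludescher *et al*'s or Graham's code: the function names, the 365-day averaging window, and the toy series are all invented for the example, and a negative $\tau$ is handled by swapping which series is taken earlier.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def lagged_pairs(T_i, T_j, t, tau, window=365):
    """Pair T_i with T_j taken tau days earlier, over the window ending at day t.

    For negative tau the roles swap: T_j is paired with T_i taken |tau| days
    earlier, so no data later than day t is ever used.
    """
    if t - window + 1 - abs(tau) < 0:
        raise ValueError("not enough history before day t for this window and lag")
    days = range(t - window + 1, t + 1)
    if tau >= 0:
        return [(T_i[d], T_j[d - tau]) for d in days]
    return [(T_i[d + tau], T_j[d]) for d in days]

def lagged_covariance(T_i, T_j, t, tau, window=365):
    """Mean of the products minus the product of the means, over the window."""
    pairs = lagged_pairs(T_i, T_j, t, tau, window)
    a = [p[0] for p in pairs]
    b = [p[1] for p in pairs]
    return _mean([x * y for x, y in pairs]) - _mean(a) * _mean(b)

def lagged_correlation(T_i, T_j, t, tau, window=365):
    """The lagged covariance divided by the two standard deviations."""
    pairs = lagged_pairs(T_i, T_j, t, tau, window)
    a = [p[0] for p in pairs]
    b = [p[1] for p in pairs]
    cov = _mean([x * y for x, y in pairs]) - _mean(a) * _mean(b)
    var_a = _mean([x * x for x in a]) - _mean(a) ** 2
    var_b = _mean([y * y for y in b]) - _mean(b) ** 2
    return cov / math.sqrt(var_a * var_b)

# Toy data (hypothetical): series i repeats series j with a 10-day delay.
def f(d):
    return math.sin(2 * math.pi * d / 50) + 0.5 * math.sin(2 * math.pi * d / 17)

T_j = [f(d) for d in range(600)]
T_i = [f(d - 10) for d in range(600)]

corr_at_10 = lagged_correlation(T_i, T_j, t=599, tau=10)
cov_neg = lagged_covariance(T_i, T_j, t=599, tau=-5)
cov_swapped = lagged_covariance(T_j, T_i, t=599, tau=5)
```

In this toy case the correlation at $\tau = 10$ comes out as 1 up to rounding, and a negative lag gives the same value as swapping the two series. Restricting to one sign of $\tau$, or to a single value of $\tau$, is then just a matter of which lags are fed to the later steps.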

## Comments

2) Besides "simplifications" we can also look at more general "modifications". Graham has already listed some on the blog:

a) Change the basin.

b) Change the area outside the basin (eg omit Australia and Mexico, add the Atlantic).

c) Change the period over which covariances are calculated (the 364 above).

d) Change the range of $\tau$ (the 200 above).

e) Treat positive and negative $\tau$ differently.

f) Treat positive and negative correlations differently.

g) Use covariances not correlations, or a different type of normalisation.

h) Use a different formula for converting correlations to link strengths.

i) Use a different criterion for the alarm.

These include some of the "simplifications". When we consider modifications, we should try to avoid making things more and more complicated. We should also try to seek modifications that "make sense" in terms of how the Earth's weather actually works.

3) Thus, it would be good for all of us to get a better understanding of El Niños. I want an "intuitive feeling" for what they're like. (They aren't all the same, and people argue about whether there are distinct kinds.) So, I should read more papers and write about them in blog articles. But also, it would be nice if those who are experts on data visualization could create a bunch of "movies" showing what weather patterns in the Pacific have been like... say, since 1948. One option would be to create movies showing air surface temperatures taken from the data we've already been looking at:

• National Centers for Environmental Prediction and the National Center for Atmospheric Research Reanalysis I Project, [worldwide daily average temperatures from 1948 to 2010 on a 2.5° latitude × 2.5° longitude grid](http://www.esrl.noaa.gov/psd/cgi-bin/db_search/DBSearch.pl?Dataset=NCEP+Reanalysis+Daily+Averages+Surface+Level&Variable=Air+Temperature&group=0&submit=Search).

focused on the Pacific. But there is a lot of other data out there, and some of the movies I'd like to see have already been created.

The goal here is to better understand "what really matters" for El Niño prediction.

4) Simple climate models will be the most useful if they're well-explained and really easy for people to use and/or passively watch. The purpose of these models is in part *educational*. Not everyone has the ability to run Mathematica or R code. So, if we can use that code to make animated gifs or YouTube videos, we'll expand the audience a lot. Some of today's smart high school or even elementary school students will be tomorrow's climate scientists! And some of today's scientists will be more likely to get interested in ecological issues if we can show them interesting movies or animations.

That's 4 ideas. There are obviously many more, but it would be nice if we could roughly agree on what directions are the most promising in the short term, so we can cooperate and get more done.

John, please delete some of my comments if they clutter the discussions, but I have to add this in response to:

> The purpose of these models is in part educational. Not everyone has the ability to run Mathematica or R code.

[Algebra on Kuratowski's Monoid](http://mathematica.lossofgenerality.com/2014/06/26/14-sets-algebra-code/)

Here is a challenge: I have a bunch of non-numeric, horrifically tedious symbolic computations that I need to evaluate interactively to educate a person who needs to learn both math and code. So what I did was encapsulate the computations into buttons in the browser (CDF); pressing a button executes the computations and shows the interim computations as output in the button's panel. I could easily add R code to the button, mixed with Mathematica and C ... Mind you, the code could be removed from the document, leaving only the buttons, English and plots.

For many of your ideas here I could make such, let's say for now, tutorials to make sure I understand what is going on. I did that already for your blog on differential equations and learned a lot:

[Delay Differential Equation: El Nino](http://mathematica.lossofgenerality.com/2014/06/24/delay-differential-equation-el-nino/)

[Noisy Hopf Bifurcation](http://mathematica.lossofgenerality.com/2014/06/24/noisy-hopf-bifurcation/)

Again I must emphasize that I myself need to learn, so this is important to my thinking and learning.

Dara

What I am struggling with is the mechanism that turns the ordinary ENSO quasi-periodic oscillations into an El Nino or La Nina. I have been working from the point of view that this is just a continuum, and if one can model the continuum adequately, then the criterion for an El Nino is exceeding a fixed threshold on the continuum scale.

So my question is: is the most applicable model formulated as a 2-part process, where the first part is to establish the underlying quasi-periodic foundation and the second part is to determine the threshold for a more specific El Nino prediction?

Elsewhere I have read that an El Nino condition is based on the criterion that the sea-surface temperature anomaly exceeds a certain value for a number of consecutive months. Is that enough to go on?


WebHubTel - to approach the problem scientifically it's probably better not to consider an El Niño as a yes-or-no thing: that is, a binary distinction, like a switch that's on or off. There are human conventions for making this binary distinction: for example, Ludescher *et al* say there's an El Niño when the Niño 3.4 index is over 0.5°C for at least 3 months, and there's very worthwhile stuff to read here:

• Kevin E. Trenberth, [The definition of El Niño](http://www.cgd.ucar.edu/staff/trenbert/trenberth.papers/defnBAMS.pdf), *Bulletin of the American Meteorological Society* **78** (1997), 2771–2777.

But nature doesn't know our human conventions, and indeed Trenberth writes:

> A brief review is given of the various uses of the term and attempts to define it. It is even more difficult to come up with a satisfactory quantitative definition. However, one such definition is explored here, and the resulting times and durations of El Niño and La Niña events are given. While unsatisfactory, the main recommendation is that any use of the term should state which definition is being used, as the definition is still evolving.

I think it might be wiser to focus, not on a boolean yes-or-no variable, but on a continuous variable like the Niño 3.4 index itself. Can you try to predict that?

If you look at graphs of this index, you'll see that in addition to "official El Niños" there are smaller spikes which don't quite count as El Niños... but are still real physical things.

<img src = "http://math.ucr.edu/home/baez/ecological/el_nino/soi_nino34.gif" alt = ""/>

Of course I'm not saying the Niño 3.4 index is the only choice of continuous variable to look at... but experts have decided it's useful, so it's probably one thing to look at. The Southern Oscillation Index or SOI is another, but it's highly correlated. You can get both online.
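For comparison, the binary convention mentioned above (index over a threshold for some number of consecutive months) is easy to derive from a continuous index after the fact. The sketch below is only an illustration: `find_episodes`, its default values, and the toy numbers are assumptions for the example, not anyone's official definition.

```python
def find_episodes(index, threshold=0.5, min_months=3):
    """Return (start, end) pairs, end exclusive, for every run of at least
    `min_months` consecutive entries of `index` strictly above `threshold`."""
    episodes = []
    run_start = None
    for k, value in enumerate(index):
        if value > threshold:
            if run_start is None:
                run_start = k          # a new run begins here
        else:
            if run_start is not None and k - run_start >= min_months:
                episodes.append((run_start, k))
            run_start = None           # the run (if any) has ended
    # A run may still be open when the series ends.
    if run_start is not None and len(index) - run_start >= min_months:
        episodes.append((run_start, len(index)))
    return episodes

# Hypothetical monthly anomalies in degrees C, for illustration only.
toy_index = [0.1, 0.6, 0.7, 0.2, 0.6, 0.8, 0.9, 1.1, 0.3, 0.7, 0.6, 0.9]
episodes = find_episodes(toy_index)
```

The two-month spike at positions 1 and 2 is dropped by the convention even though it is just as real as the longer runs, which is one reason to predict the continuous index and apply the threshold only at the end.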

On why replacing the correlation with covariance has little effect. The normalising constant is

$$D_{ij}^{(t)}(\tau) = \sqrt{\langle T_i(t)^2\rangle - \langle T_i(t)\rangle^2 } \; \sqrt{\langle T_j(t-\tau)^2 \rangle - \langle T_j(t-\tau)\rangle^2} $$

When calculating the link strength for points $i,j$ at time $t$, only the *relative* values for different $\tau$ matter. Any overall scaling has no effect on the (mean-max)/sd value. $D_{ij}^{(t)}(\tau)$ does not vary much with $\tau$. The first square root is constant, and the second only reflects changes in the variability, measured over a year, of the temperatures at a point, as the start and end of the year change by up to 200 days. I wouldn't expect that to change much, and these graphs show that mostly it doesn't. The graphs are arranged geographically, with Australia bottom left. Adjacent graphs represent points 15 degrees apart. Land areas show up as areas with high variation.

<img src = "http://www.azimuthproject.org/azimuth/files/stddevs_1957_365_5_13" alt = ""/>
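The scaling argument can be checked numerically. This is a small sketch, assuming the link strength is computed as (max − mean)/sd over the values for different $\tau$ (the sign convention does not affect the argument); the function name and the numbers are made up for illustration.

```python
import statistics

def link_strength(values):
    """(max - mean) / standard deviation of a list of values indexed by tau."""
    return (max(values) - statistics.mean(values)) / statistics.pstdev(values)

# Hypothetical covariances for a handful of lags tau.
covariances = [0.2, -0.1, 0.5, 0.3, 0.0]

# Dividing by a normalising constant that does not depend on tau
# is an overall rescaling, so the link strength is unchanged.
constant_D = 3.7
rescaled = [c / constant_D for c in covariances]

# If D varies only slightly with tau, the link strength changes only slightly.
slowly_varying_D = [3.70, 3.72, 3.69, 3.71, 3.70]
normalised = [c / d for c, d in zip(covariances, slowly_varying_D)]

s_cov = link_strength(covariances)
s_rescaled = link_strength(rescaled)
s_normalised = link_strength(normalised)
```

Here `s_cov` and `s_rescaled` agree to rounding error, while `s_normalised` differs from them only by a small amount, which is the sense in which covariances and correlations "work about equally well".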

If I were to study this system using machine learning I would have started like this, a standard approach:

1. Choose a series of points on the equator (as the said paper does).

2. Form a space of tuples A = {(Ti, Tj, delayM, delayN)}, where Ti is the temperature for the entire period and delayM is a right-shift or left-shift of the signal T by M (positive or negative numbers).

3. K-means clustering on A, using the Correlation Distance as metric, to visualize how the temperature histories cluster (NOTE that there is no taking of averages):

[Correlation Distance](http://reference.wolfram.com/mathematica/ref/CorrelationDistance.html)

[K-means](http://en.wikipedia.org/wiki/K-means_clustering)

3.5 Make the list of chosen equator points smaller and larger and analyze the clusters several times.

4. To find specific relationships, run a Nearest Neighbour algorithm (Correlation Distance) on a particular Ti for (Ti, TX, delayM, delayN), but let TX run over a larger area up to the poles. The algorithm might issue a list of SORTED nearest neighbours like:

{(Ti1, Tj1, delayM1, delayN1), (Ti2, Tj2, delayM2, delayN2), ...}

Note that I did not take any averages. Also, #4 uses a lot of computing to find the nearest neighbours, but we need to do this only once per period.

I then use the first member of the Nearest list in 4 and tabulate. I get a linkage list and a measure of correlation between the nodes.

You will find from actual experience that wavelet denoising Ti in step 2 does produce different results, perhaps better results.
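The nearest-neighbour step can be sketched roughly as follows. This is a simplified illustration, not Dara's Mathematica code: it uses a single delay per candidate rather than the (delayM, delayN) pair, takes the correlation distance to be 1 minus the Pearson correlation, and all names and toy series are hypothetical.

```python
import math

def correlation_distance(u, v):
    """1 minus the Pearson correlation of two equal-length sequences."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    dot = sum(x * y for x, y in zip(du, dv))
    norm = math.sqrt(sum(x * x for x in du) * sum(y * y for y in dv))
    return 1.0 - dot / norm

def sorted_neighbours(target, candidates, max_lag):
    """Return (distance, name, lag) triples sorted by increasing distance.

    A positive lag compares the target at time t with the candidate at time
    t - lag. Every comparison uses the same number of samples.
    """
    n = len(target)
    a = target[max_lag:n - max_lag]
    results = []
    for name, series in candidates.items():
        for lag in range(-max_lag, max_lag + 1):
            b = series[max_lag - lag:n - max_lag - lag]
            results.append((correlation_distance(a, b), name, lag))
    return sorted(results)

# Toy series (hypothetical): candidate "A" shows the target's signal 4 steps early.
def f(d):
    return math.sin(d / 7) + 0.5 * math.sin(d / 3)

target = [f(d) for d in range(300)]
candidates = {
    "A": [f(d + 4) for d in range(300)],
    "B": [math.sin(d / 11 + 1) for d in range(300)],
}
neighbours = sorted_neighbours(target, candidates, max_lag=6)
best_distance, best_name, best_lag = neighbours[0]
```

The first entry of the sorted list plays the role of "the first member of the Nearest list": here it picks out candidate "A" at a lag of 4 steps. No averaging over locations is involved.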

We could also try to directly predict some aspect of weather, like the El Nino index, some period into the future. Train a model that takes the whole map at some date as input and outputs either the predicted number, or the change in that number over the prediction period.

I would split the data into test and train sets at some date, or do cross-validation based on contiguous date blocks. Splitting randomly would be bad, since that would mean that for each test case there would be a train case from only a day or 2 away.

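A minimal sketch of cross-validation on contiguous date blocks, assuming the samples are already in time order. The function name and the optional `gap` parameter (which drops training samples within `gap` steps of the test block) are assumptions added for illustration, not something proposed in the thread.

```python
def contiguous_block_folds(n_samples, n_folds, gap=0):
    """Yield (train_indices, test_indices) pairs.

    Each test set is one contiguous block of time. Training samples closer
    than `gap` steps to either end of the test block are left out, so no
    training case sits right next to a test case.
    """
    bounds = [k * n_samples // n_folds for k in range(n_folds + 1)]
    for k in range(n_folds):
        start, stop = bounds[k], bounds[k + 1]
        test = list(range(start, stop))
        train = [i for i in range(n_samples)
                 if i < start - gap or i >= stop + gap]
        yield train, test

# Example: 100 daily samples, 5 folds, 10 days dropped on each side.
folds = list(contiguous_block_folds(100, 5, gap=10))
train, test = folds[1]
closest = min(abs(i - j) for i in train for j in test)
```

With a random split, `closest` would almost always be 1; with blocks and a gap it is at least `gap + 1`, so the test score is less likely to be flattered by day-to-day persistence in the weather.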

> Train a model that takes the whole map at some date as input and outputs either the predicted number, or the change in that number over the prediction period.

Great, I am ready to do that. However, I need a clearly defined function that maps the input into the output, so I can build the forecast models:

1. SVR (Support Vector Regression)
2. NN (Neural Networks)
3. K-nn

I prefer the output to be a continuous range indicating the gradations of how severe the El Nino is. I am really uneducated about this index, so please point me in the right direction.

I am sure THE WHOLE MAP might not work at first, so perhaps we use a coarser grid. For NN we can flatten the map into a large array. For SVR, same thing, but the solver for this algorithm might take a long time since its equation has as many variables as the input. But we could try.

So my suggestion is to start with a small grid to get the code going.

Now on the real issue: training these algorithms is very, very hard work. I do it on a daily basis for other time series. It is backbreaking work to find the proper training parameters for the said algorithms. Please let us have several months of continuous effort to train the algorithms for optimal performance. But once we learn how to train them, it is sweet! Sit back and enjoy the ride :)

Dara

Since we want to predict a continuous number we should use a regression model. To get a baseline I usually start with a random forest and an L1-regularized linear model. Decent implementations of these should have no trouble handling the entire dataset. The world maps have around 10k data points per day, the Pacific region has around 2k data points per day, and there are around 10k days' worth of data. Training a linear model or a random forest even on the 10k * 10k dataset is feasible on a laptop. I could do this too, but I am not sure what value we should be trying to predict.

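For readers who have not met an L1-regularized linear model, here is a toy sketch of one way to fit it, by cyclic coordinate descent with soft thresholding. It is meant only to show the idea (the penalty drives unhelpful weights exactly to zero); a real baseline would use a tested library implementation. The function names, the objective scaling, and the synthetic data are assumptions for the example, and there is no intercept term.

```python
import random

def soft_threshold(value, alpha):
    """Shrink value towards zero by alpha, returning exactly 0.0 inside [-alpha, alpha]."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimise (1/(2n)) * sum((y - Xw)^2) + alpha * sum(|w|) over weights w."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)                      # equals y - Xw, with w = 0 initially
    for _ in range(n_sweeps):
        for j in range(p):
            column = [row[j] for row in X]
            z = sum(x * x for x in column) / n
            # Covariance of feature j with the residual that leaves feature j out.
            rho = sum(x * (r + w[j] * x) for x, r in zip(column, residual)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            residual = [r - delta * x for r, x in zip(residual, column)]
            w[j] = new_w
    return w

# Synthetic data (hypothetical): only the first of three features matters.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [2.0 * row[0] for row in X]
weights = lasso_coordinate_descent(X, y, alpha=0.1)
```

In this example the two irrelevant features get weight exactly zero and the relevant weight is shrunk slightly below 2. With thousands of grid points as inputs, that sparsity is what makes the fitted model readable: it names a small set of locations that carry the prediction.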

Random forest should be fine for this. However, I have little experience with it, though I could get up to speed fast, so I cannot tell you how accurate the forecasts will be.

> but I am not sure what value we should be trying to predict.

My problem too. We need some kind of El Nino index variable to indicate the severity of El Nino.

Daniel wrote:

> I could do this too, but I am not sure what value we should be trying to predict.

If it's okay, let's talk for at least a week about what we want to do, before you do something. I'm really impressed by your and Dara Shayda's programming ability and level of energy, but I am unfamiliar with a lot of the techniques you're talking about, so it takes me time to understand them, and I'd like to make sure we're doing something good before we do it.

(Maybe this is wrong; maybe you should just try stuff and tell me what works. But perhaps because I'm a mathematician, I like to talk a lot before taking action.)

Anyway, I think I know what we should be trying to predict. Either:

- the **Niño 3.4 index**, which is the sea surface temperature anomaly in the Niño 3.4 region. You can get monthly values of this from 1870 to 2011 [here](http://www.esrl.noaa.gov/psd/gcos_wgsp/Timeseries/Nino34/) or [here](http://www.cgd.ucar.edu/cas/catalog/climind/TNI_N34/index.html#Sec5) (in different formats).

or

- the **Oceanic Niño Index (ONI)**, which is the 3-month running mean of the Niño 3.4 index.

When the ONI is over 0.5 °C, people say there's an **El Niño**. When it's below -0.5 °C, some people say there's a **La Niña**, though other people think this criterion for a La Niña is suboptimal.

If I had to pick one of these for you to predict, it would be the ONI. However, it might be worthwhile to take the same prediction algorithm, keep everything else the same, and use it to predict both ONI and Niño 3.4. Since the former is just the 3-month running average of the latter, I guess the question is whether the "smoothed-out" quantity is significantly easier to predict.

We could try to predict either of these quantities using data obtained *at least 3 months earlier*, *at least 6 months earlier*, *at least 9 months earlier*, or *at least 12 months earlier*. People believe it gets much harder to predict them more than 6 months in advance. So, to get people interested, we should succeed in predicting them more than 6 months in advance. However, it will be very interesting to see how our predictive ability degrades as we try predictions that are 3, 6, 9 or 12 months in advance.
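As a sketch of how the two targets and the lead times could be set up: the helper names are hypothetical, the running mean here is aligned so that entry `k` averages months `k` to `k + 2` (the alignment convention is an assumption), and the numbers are toy values rather than real Niño 3.4 data.

```python
def running_mean(values, width=3):
    """Mean of each run of `width` consecutive entries (output is shorter than input)."""
    return [sum(values[k:k + width]) / width
            for k in range(len(values) - width + 1)]

def make_lead_pairs(inputs, target, lead):
    """Pair the input available at month s with the target at month s + lead."""
    return [(inputs[s], target[s + lead])
            for s in range(len(target) - lead)]

# Toy monthly values, for illustration only.
nino34 = [1, 2, 3, 4, 5]
oni_like = running_mean(nino34)

# Inputs could be whole temperature maps; letters stand in for them here.
maps = list("abcdef")
target = [0, 1, 2, 3, 4, 5]
pairs_3_months = make_lead_pairs(maps, target, lead=3)
```

Running the same model on pairs built with `lead` set to 3, 6, 9 and 12, once with the raw index and once with the smoothed one as `target`, would give the comparison described above with everything else held fixed.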

Daniel wrote:

> Since we want to predict a continuous number we should use a regression model. To get a baseline I usually start with random forest and an l1 regularized linear model.

Could you please explain these, or point me to a good explanation? I have a lot to learn.

I will start with this:

• [Decision tree](https://en.wikipedia.org/wiki/Decision_tree_learning), Wikipedia.

and then read this:

• [Random forest](https://en.wikipedia.org/wiki/Random_forest), Wikipedia.

These articles say a lot more about classification trees than regression trees, so I don't really know how a regression tree works! Okay, it says that here:

• Cosma Shalizi, [Lecture 10: regression trees](http://www.stat.cmu.edu/~cshalizi/350-2006/lecture-10.pdf).

Later maybe I should try this:

• Arnaud Joly *et al*, [$L_1$-based compression of random forest models](https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2012-43.pdf).

However, right now I need to understand the basic ideas, not the ways to do things more efficiently.

What are other good *very elementary* introductions to these methods? More precisely: I know lots of math, but almost nothing about predictive algorithms.
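In case it helps, the basic move in a regression tree fits in a few lines: pick the threshold on one input that minimises the total squared error when each side is predicted by its own mean. A full tree repeats this on each side, and a random forest averages many such trees grown on resampled data. The sketch below shows a single split only; the names and data are made up for illustration.

```python
def _mean(values):
    return sum(values) / len(values)

def _squared_error(values):
    """Sum of squared deviations from the mean."""
    m = _mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) for the best single split on x."""
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue                       # cannot split between equal x values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        cost = _squared_error(left) + _squared_error(right)
        if best is None or cost < best[0]:
            threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best = (cost, threshold, _mean(left), _mean(right))
    if best is None:
        raise ValueError("all x values are equal, so no split is possible")
    return best[1:]

def predict(x, split):
    threshold, left_mean, right_mean = split
    return left_mean if x < threshold else right_mean

# Toy data: the response jumps from about 1 to about 5 between x = 3 and x = 10.
xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
split = best_split(xs, ys)
```

The fitted function is piecewise constant: every input below the threshold gets the mean response of the left group, and every other input gets the mean of the right group.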

By the way: I mentioned a while ago that I'm giving a talk at [NIPS](http://nips.cc/) (**Neural Information Processing Systems**, a big annual conference on neural networks, machine learning and other things), in early December 2014. I did something crazy: I said I would give a talk on climate networks and El Niño prediction.

This is crazy because I'm not an expert on these things. But, if we can do something interesting, I'll talk about it. I will call it a talk "by the Azimuth Project" and cite by name everybody who has helped out.

We could try different things.

*Among other things*, I would like to try a neural network or Bayesian network approach, since the audience will be interested in those. But the random forest method also looks interesting (though I don't know much about it yet).

Graham wrote:

> On why replacing the correlation with covariance has little effect.

Very nice! Are the red numbers here the El Niño basin? Why does one square have _big_ red numbers?

Squares (5,19) and (5,21) have a curious pattern. What's that about? Is that Central America?

Squares (7,15) to (7,19) have a quite different interesting pattern.

How can I translate these coordinates into latitude/longitude or the goofy system used by NOAA (latitudes going from 1 to 73 as we go from North pole to South pole, longitudes going from 1 to 145 as we go east starting at Greenwich)?

<img src="http://www.azimuthproject.org/azimuth/files/stddevs_1957_365_5_13" alt=""/>

The big 5,13 doesn't mean anything; it's just accidentally there because I was using the same code to plot link strengths between a particular point and the rest.

The numbers like 5,13 are indices in Ludescher _et al_'s 9 by 23 grid - I'm not going to do the arithmetic to convert into degrees!
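For the NOAA half of the question, the arithmetic can be sketched as follows, assuming the 73 by 145 index system described above sits on a regular 2.5° grid (73 latitudes from pole to pole, 145 longitudes from 0° to 360° east). The helper `subgrid_to_noaa` is hypothetical: the offset and stride that place the 9 by 23 grid inside the NOAA grid are not stated in this thread, so they are left as parameters rather than guessed.

```python
def noaa_index_to_latlon(lat_index, lon_index, step=2.5):
    """Convert 1-based NOAA grid indices to (latitude, longitude east) in degrees.

    Assumes latitude index 1 is the North pole, longitude index 1 is
    Greenwich, and the spacing is `step` degrees in both directions.
    """
    if not (1 <= lat_index <= 73 and 1 <= lon_index <= 145):
        raise ValueError("index outside the 73 x 145 grid")
    latitude = 90.0 - step * (lat_index - 1)   # decreases towards the South pole
    longitude = step * (lon_index - 1)         # increases going east
    return latitude, longitude


def subgrid_to_noaa(row, col, first_lat_index, first_lon_index, stride):
    """Map 1-based indices of a coarser subgrid to NOAA indices.

    first_lat_index, first_lon_index and stride are hypothetical parameters:
    they must be taken from the code that built the subgrid.
    """
    return (first_lat_index + stride * (row - 1),
            first_lon_index + stride * (col - 1))
```

With these conventions, `noaa_index_to_latlon(37, 1)` is the point on the equator at Greenwich.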

John asked

> Could you please explain these, or point me to a good explanation?

There's a book _The Elements of Statistical Learning_ by Trevor Hastie, Robert Tibshirani, Jerome Friedman. It is available free online, but I can't find a "clean" link.

This book is very good - especially for mathematicians I think - if a little dated:

* B.D. Ripley, _Pattern Recognition and Neural Networks_, Cambridge University Press, 7 January 1996. [http://www.stats.ox.ac.uk/~ripley/PRbook/](http://www.stats.ox.ac.uk/~ripley/PRbook/)

(Nothing against _The Elements of Statistical Learning_, but I'm not so familiar with it.)

Thanks, Graham! So, the answer to my question

> Squares (5,19) and (5,21) have a curious pattern. What's that about? Is that Central America?

is that these are the two most easterly red dots right on the equator here... right off the coast of Ecuador:

<a href="http://www.pnas.org/content/early/2013/06/26/1309353110.full.pdf+html"> <img width="450" src="http://math.ucr.edu/home/baez/ecological/el_nino/ludescher_el_nino_cooperativity_1a.jpg" alt="" /></a>

The weather must be different there.

> Squares (7,15) to (7,19) have a quite different interesting pattern.

And these are off the coast of Peru.

> I did something crazy: I said I would give a talk on climate networks and El Niño prediction.... This is crazy because I'm not an expert on these things. But, if we can do something interesting, I'll talk about it.

'Crazy' motivates me :)

Might I suggest, and of course you have the final say, introducing the proper usage of wavelets in this field of endeavour for El Niño and other atmospheric computations.

If we are running daily machine learning algorithms by that time, then you could present them as well.

Dara

John, thanks for the ideas; I will try to focus on the SOI, SST, and other proxies.

My next topic to ponder: what do people think is mainly responsible for the see-saw of water from the east Pacific to the west Pacific and back?

1. Is it mainly driven by oscillations in the trade winds that pile water up on the leeward or downwind side of the ocean? We know this happens on a lake on a windy day, where the water level will rise on the shoreline taking a beating from the wind. The oscillation reverses when the winds subside or reverse direction.

2. Is it driven more by deep upwelling and downwelling of water in the Pacific? This creates a natural sloshing of water back-and-forth, roughly analogous to the effect of tides. The winds come about as a result of this action, as upwelling cold water will cause lower atmospheric pressure creating a gradient for regions of higher atmospheric pressure to flow toward, thus creating a wind.

Which came first, the chicken (wind) or the egg (hydrodynamics)?

The origin of either wind forcing or hydrodynamic forcing is also uncertain. Is it long term cyclic solar or lunar effects, or a natural stochastic resonance invoked by random forcing? Or is it some mutual interaction between the wind and the hydrodynamics creating a network pair?

Again, my strategy is to work out the hydrodynamic angle and see how far we can go with that. Here are a couple of interesting GIF animations for sloshing on a much smaller scale:

<http://imageshack.com/a/img856/4043/kr7l.gif> -- Dynamic instability (0.5 MB)

<http://imageshack.com/a/img823/6842/6fd5.gif> -- Detuning effect (1.5 MB)

The important point to consider is that sloshing can be very erratic and only quasi-periodic, even if the forcing is strictly periodic. A periodic forcing may not match the natural characteristic frequency of the liquid volume, thereby creating very interesting beat periods. See the detuning animation in particular and watch how erratic it gets. And then when multiple forcing periods are applied, the behavior turns even more erratic (which is partly why tidal charts are not so simple, as they are a mix of diurnal and semi-diurnal periods).

More info here, where these animations come from: <http://contextearth.com/2014/07/05/sloshing-animation/>
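The beat periods mentioned above can be made concrete with a worked example. This assumes an idealized undamped linear oscillator, which is a drastic simplification of real sloshing; the symbols $\omega_0$ (natural frequency), $\omega$ (forcing frequency) and $F$ (forcing amplitude) are introduced only for this illustration. Starting from rest,

$$ \ddot{x} + \omega_0^2 x = F\cos(\omega t), \qquad x(0) = \dot{x}(0) = 0, \qquad \omega \neq \omega_0 $$

has the solution

$$ x(t) = \frac{F}{\omega_0^2 - \omega^2}\left(\cos\omega t - \cos\omega_0 t\right) = \frac{2F}{\omega_0^2 - \omega^2}\,\sin\!\left(\frac{(\omega_0 - \omega)\,t}{2}\right)\sin\!\left(\frac{(\omega_0 + \omega)\,t}{2}\right). $$

So even with strictly periodic forcing, the response is a fast oscillation at the mean frequency inside a slow envelope that repeats with period $2\pi/|\omega_0 - \omega|$. The closer the forcing is to the natural frequency, the longer the beat.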

Hello WebHubTel

Not being fresh-mouth, but I suspect that whatever is causing the weather problems is not simply an increase or decrease of a couple of things in the environment, nor can simple mechanisms like sloshing explain them away.

The planetary physics is one of the largest systems we have encountered, with correlations to sunspots and CME outbursts of particles and magnetic pole shifts of the sun, all the way to microbial life forms in clouds and in rain on the leaves that manipulate the nucleation and crystallization of water.

To start, IMHO, I would collect a large number of parameters and make a long vector out of them:

(temp, humidity, wind speed, leapyear, gravitation, longitude, latitude, number of sunspots, CO2 level,...)

And I would make a large collection of these vectors.

I will then add several parameters: El Niño index or other similar measure of severity.

Then compute a K-means clustering algorithm (or any other clustering) using a weighted Euclidean metric e.g. Canberra Distance and a small number of clusters, say 30.

If the parameters are not related, then the resulting clusters would mostly be of similar, very large sizes. I then look at the distribution of the severity parameters amongst the clusters. If they are uniformly distributed, then no patterns have been found from this computation.

However, usually smaller clusters show up, along with odd distributions of parameters in the clusters, and these then could tell a NON-COGNITIVE DATA-DRIVEN narrative of what is happening.

Then again and again add and drop parameters and review the clusters.

Then we could find patterns of similarities that are not human cognizable, i.e. machine learnt, since the system is just too large.

Of course these are raw ideas, but we have reached a tech curve which allows us to investigate them.

Dara
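A rough sketch of the clustering workflow described above, under stated assumptions: it uses plain k-means (Lloyd's algorithm) with ordinary squared Euclidean distance rather than the Canberra distance mentioned in the post, because the mean-based update step of k-means is only justified for Euclidean distance. The names `kmeans` and `severity_by_cluster` and the severity labels are hypothetical, and real inputs would need to be rescaled so that no one parameter dominates the distance.

```python
import random
from collections import Counter


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm) with squared Euclidean distance.

    points is a list of equal-length tuples. Returns (labels, centers).
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers: k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: give every point the label of its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels, centers


def severity_by_cluster(labels, severities):
    """Count how often each severity label occurs inside each cluster."""
    table = {}
    for label, severity in zip(labels, severities):
        table.setdefault(label, Counter())[severity] += 1
    return table
```

If the severity counts look the same in every cluster, the clustering has found nothing related to severity; clusters dominated by one severity label are the ones worth a closer look.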

Dara wrote:

> Might I suggest, and of course you have the final say, to introduce the proper usage of Wavelets in this field of endeavour for El Nino and other atmospherics computations.

> If we run daily machine learning algorithms by that time then you could present them as well.

The audience for my talk will be more interested in machine learning and networks (Bayesian networks, decision trees, neural networks etc.) than wavelets. So, until December I need to move in the direction of machine learning and networks for El Niño prediction.

El Niño seems to be a highly nonlinear phenomenon, not periodic. So, while fitting it with linear combinations of periodic functions may reveal which kinds of oscillations are significant, it doesn't sound like a good way to _predict_ El Niños.

I could be wrong! But I'm saying that our predictive algorithm shouldn't model Niño 3.4 as a linear combination of periodic functions and then use that to extrapolate into the future.

I should say more about how people already try to predict El Niños. Here you can see the results of 15 dynamical models and 7 statistical models:

<img src="http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/enso_advisory/figure6.gif" alt=""/>

You can read more about this [here](http://www.cpc.ncep.noaa.gov/products/precip/CWlink/MJO/enso.shtml#discussion).

I wrote:

> One option would be to create movies showing air surface temperatures taken from the data we've already been looking at:

> • [National Centers for Environmental Prediction and the National Center for Atmospheric Research Reanalysis I Project](http://www.esrl.noaa.gov/psd/cgi-bin/db_search/DBSearch.pl?Dataset=NCEP+Reanalysis+Daily+Averages+Surface+Level&Variable=Air+Temperature&group=0&submit=Search%22), worldwide daily average air temperatures from 1948 to 2010 on a 2.5° latitude × 2.5° longitude grid.

> focused in the Pacific.

I still want to see such movies! But graphics like this are also useful. These are sea surface temperatures, not air temperatures:

<img src="http://math.ucr.edu/home/baez/ecological/el_nino/SST_anomalies_august_2013_to_july_2014.jpg" alt=""/>

See [here](http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/lanina/enso_evolution-status-fcsts-web.pdf) for more.

We could probably create these using the TAO/TRITON data. Are any of you good at this sort of thing?

Interesting slide deck. Unfortunately they do not provide references to the models they are using. I can guess what types of models some of these might be, but the problem representation and data preprocessing are likely to be more important than the algorithm itself. It would be nice to get a fuller specification of each model and its performance on unseen data.

Also the slides are from May. We should already be able to start verifying the predictions. Most of them predict we should have crossed the 0.5 line by now.

Daniel wrote:

> Also the slides are from May. We should already be able to start verifying the predictions. Most of them predict we should have crossed the 0.5 line by now.

Right! The [National Weather Service](http://www.cpc.noaa.gov/products/analysis_monitoring/ensostuff/detrend.nino34.ascii.txt) has the Niño 3.4 data up to June:

~~~~
YR    MON  TOTAL  ClimAdjust  ANOM
2014    1  26.03       26.68  -0.64
2014    2  26.08       26.84  -0.76
2014    3  26.87       27.34  -0.47
2014    4  27.68       27.81  -0.13
2014    5  28.16       27.91   0.24
2014    6  28.16       27.69   0.47
~~~~

The second column lists the month, and the last one, ANOM, is the Niño 3.4 index corrected for global warming (and thus a bit lower than some other estimates).

So, it's not quite 0.5 yet (June was 0.47), but it was shooting up in the last couple of months. We need about 5 months of it being over 0.5 to count as an El Niño.

I really recommend the [Weekly ENSO evolution, status, and prediction presentation](http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/lanina/enso_evolution-status-fcsts-web.pdf) to get a sense of the latest developments!
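The rule of thumb above is easy to check mechanically. This is a minimal sketch that applies the simplified rule as stated in the comment (about 5 consecutive months with the anomaly at or above 0.5) to the six months quoted in the table. It is not NOAA's operational definition, and the function names are invented for this example.

```python
NINO34_2014 = """\
YR    MON  TOTAL  ClimAdjust  ANOM
2014    1  26.03       26.68  -0.64
2014    2  26.08       26.84  -0.76
2014    3  26.87       27.34  -0.47
2014    4  27.68       27.81  -0.13
2014    5  28.16       27.91   0.24
2014    6  28.16       27.69   0.47
"""


def parse_anomalies(text):
    """Parse the whitespace-separated table into (year, month, anomaly) rows."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        year, month, _total, _clim_adjust, anom = line.split()
        rows.append((int(year), int(month), float(anom)))
    return rows


def longest_run_at_or_above(rows, threshold=0.5):
    """Length of the longest run of consecutive months at or above threshold."""
    best = current = 0
    for _, _, anom in rows:
        current = current + 1 if anom >= threshold else 0
        best = max(best, current)
    return best


def looks_like_el_nino(rows, threshold=0.5, months_needed=5):
    """Simplified rule from the comment above, not an official criterion."""
    return longest_run_at_or_above(rows, threshold) >= months_needed
```

On these six months the longest run is zero, so by this rule there is no El Niño yet, matching the reading of the table.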

If the NOAA is already running 20+ predictive models, we really ought to have some new angle. How will we be better/different from the rest? It looks like they already have at least a neural net and a Markov model in the mix. Though that may not mean much without knowing how they are using them. Is there a way to find out more about the models they are running?


In the graph in comment #22 of various simulated and statistical EN3.4 prediction there appear to be 4 statistical (analytic) projections of < 0.5C out of 17 in total. This makes me ask whether this is suggestive of over-fitting in the simulations.

I'll go off now and try and find the relevant graph for another question. In Graham's covariance graph there were fairly steep rises in the covariance of link strengths to 70km? but not further. What happens at longer distances?


Here is information about their [regression models](http://www.cpc.ncep.noaa.gov/products/precip/CWlink/ENSO/regressions).

A radio program discusses the possible El Niño in 2014, at about 17:30:

http://www.bbc.co.uk/programmes/b048034h


Daniel wrote:

> If the NOAA is already running 20+ predictive models, we really ought to have some new angle. How will we be better/different from the rest? It looks like they already have at least a neural net and a Markov model in the mix.

I think the first step is to be different; then we can become better. <img src="http://math.ucr.edu/home/baez/emoticons/tongue2.gif" alt=""/>

One obvious way to be different is to use "average link strength" as one ingredient of how we predict El Niños. This is the new idea of Ludescher _et al_, which they claim can predict El Niños more than 6 months in advance... unlike other models, which have a lot of trouble making predictions more than 6 months in advance.

But Ludescher _et al_ are merely trying to make a yes-or-no prediction of whether an El Niño will occur in the next calendar year. Now that we have software that computes the average link strength, we could use that together with machine learning ideas to attempt to predict the Niño 3.4 index.

I presume there are some off-the-shelf machine learning packages that take a bunch of time series inputs and learn to predict some time series as output? If so, the cheapest idea is to use one of those.

I can generate more intelligent suggestions; this is just supposed to be an easy thing to try.

> Is there a way to find out more about the models they are running?

I'm sure there must be, and I'll find out how. Clearly it's important to learn the state of the art.
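As a baseline for the "easy thing to try", here is a minimal sketch of predicting one time series from another at a fixed lag with ordinary least squares, plus a hold-out check on later data. It assumes two equally sampled series, for example average link strength as predictor and the Niño 3.4 index as target; the function names and the choice of a single straight-line fit are illustrative assumptions, not anything Ludescher _et al_ or the thread prescribe.

```python
def lagged_pairs(predictor, target, lag):
    """Pair predictor[t] with target[t + lag] for every t where both exist."""
    return list(zip(predictor[:len(predictor) - lag], target[lag:]))


def fit_line(pairs):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def backtest(predictor, target, lag, train_size):
    """Fit on the first train_size pairs, return mean absolute error on the rest."""
    pairs = lagged_pairs(predictor, target, lag)
    slope, intercept = fit_line(pairs[:train_size])
    errors = [abs(y - (slope * x + intercept)) for x, y in pairs[train_size:]]
    return sum(errors) / len(errors)
```

Scoring only on data after the training window is what keeps a comparison between methods honest: any fancier model should beat this baseline's hold-out error before it is taken seriously.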

"I presume there are some off-the-shelf machine learning packages that take a bunch of time series inputs and learn to predict some time series as output? If so, the cheapest idea is to use one of those."

John, One of the most interesting off-the-shelf machine learning packages is Eureqa from Nutonian.

http://www.nutonian.com/

Nothing beats this for ease of use. Copy & paste a set of ENSO time-series pairs into the spreadsheet-like entry form, label the columns as "time" and "ENSO", and then configure a solution as ENSO=f(time) before hitting run.

It will generate a Pareto front of possible mathematical formulations.

It also handles multi-dimensional data, so multiple ENSO arrays can be entered.

It has a delay(Data, Delay) function so one can experiment with time shifts from one data set to another.

Both differential equations and delay differential equations can be evaluated (but watch out mixing the two).

You can still get this for free if you are an academic. I use it all the time and let it run in the background.

> http://www.nutonian.com/

What accuracy rates are you getting using this tool? That is, run the tool against the known historical data and then compare the forecast to the actual values.

D

Dara, I use the Nutonian Eureqa tool mainly for exploration as it doesn't know anything about physics, but only fits waveforms to various mathematical formulations.

Yet, having said that, it would be interesting to see what it would give for the "Plume of Model ENSO Predictions" that John included in comment #22 above.

What I can do is take Nino3.4 data prior to 2014, feed it into Eureqa, let it run for a few hours, then take the best result along the Pareto front and extrapolate it for this year.

From my experience, Eureqa tends to create compositions of sine waves of different frequencies, but it also does "chirp"-type fits such as sin(wt^2) or modulated sin waves such as sin(wt+cos(vt)). The latter are understandable because it is trying to fit the quasi-periodic nature of ENSO. The fits can get fairly bizarre, which makes it fun to interpret.


Here are the results of Eureqa machine learning on the El Nino 3.4 SST measurements since 1982.

These are screen snapshots of two fits, one a moderate complexity and one a high complexity. I left them as links because they are full screen shots.

The first is near a Pareto frontier inflection point and it consists of 4 sine waves http://imageshack.com/a/img841/9503/csc.gif

The extrapolation from 2014 to 2016 for this waveform is shown here. The peak doesn't get above 0.5, so no El Nino in the near future http://imageshack.com/a/img841/4567/bl8h.gif

Beyond that Pareto inflection point, the waveforms become much more complex, and Eureqa starts to modulate the sine waves in a nested fashion. This is the most complex and shows the smallest fitting error http://imageshack.com/a/img856/4692/4cgo.gif

The extrapolation from 2014 onwards for this fit shows a higher peak for next year into 2016 http://imageshack.com/a/img849/4509/bkdw.gif

But take these with a grain of salt as there is no physics behind any of these fits. It is just Eureqa trying to reduce the absolute error by picking a mathematical heuristic which best matches the data. More dimensions can be added but it will still be a heuristic fit.

I am working toward a more physical model and ran these to give everyone a chance to see how Eureqa evaluates the raw one-dimensional data.


WOW terrific!

A bit longer term goal: Why don't you do some regression from these and I do SVR and NN and compare results.

D


Dara, Unfortunately I don't have any confidence in those particular fits, as they appear to be cases of extreme over-fitting. Unless you are thinking that they may be useful as "naive" fits and serve as a benchmark to compare against?

I do have all the regression error values displayed on the charts.


> Unless you are thinking that they may be useful as "naive" fits and serve as a benchmark to compare against?

I was looking at them trying to see if there is such a thing as an equation-fit for our data; see FindFit[] in Mathematica.

I know the answer is NO, but I was looking at the stuff you did, and I was trying to see what those pieces of software you had quoted actually do.

In reality, for forecasting and realistic signal processing which could be used for serious daily operations, there are not that many sources of software and systems. We still need to hand code most of the algorithms and their variations.

Dara

I am using a parametric variation of FindFit[] for fitting the nonlinear equations. It is a combination of Mathematica's ParametricNDSolve and NonlinearModelFit, and then Mathematica attempts to sweep through the constrained parameter space.

I am not entirely satisfied with the approach because it seems to get trapped in local optima, and I can do a better job manually for the time being. I know that shouldn't be the case but that's my situation at the moment. I know roughly the parameters required and I fiddle with these a bit until the waveforms start to align, using a correlation coefficient for feedback.


> I am not entirely satisfied with the approach because it seems to get trapped in local optima,

Heh heh, that is why you should use SVR, which uses a GLOBAL max or min for its algorithm, and then use the parallelized version based upon DIFFERENTIAL EVOLUTION.

Dara, I bow to your expertise on this matter. I think differential evolution is what Eureqa does, but I can't be certain. It holds on to a candidate set and combines them while keeping track of those with the best fit-vs-complexity score.

A few months ago, I had an epiphany and realized that I should enter into Eureqa the differential equation directly.

So I entered the following Mathieu equation with forcing function as a candidate fit into Eureqa

$ \frac{d^2x(t)}{dt^2}+[a-2q\cos(2\omega t)]x(t)=F(t) $

this gets transcribed as the Eureqa candidate solution:

D(x, t, 2) = f(x, t)

and the results are actually useful !

What Eureqa eventually finds are values for the terms a, q cos(), and F.

Of course, these need to be fed back into the differential equation, and the equation solved for initial conditions, to see how well it matches the actual time series.

I will give an example of this later on for x=Nino3.4, but I have it described in this blog post for the x=SOI data set: http://contextearth.com/2014/05/27/the-soim-differential-equation/
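The step of feeding fitted parameters back into the differential equation can be sketched numerically. The following is only a minimal illustration in Python, not the Mathematica or Eureqa workflow described above: the values of `a`, `q`, `omega`, the sinusoidal `example_forcing`, and the initial conditions are all placeholders chosen for the example, and the fixed-step RK4 integrator is just one simple choice of solver.

```python
import math

def mathieu_rhs(t, x, v, a, q, omega, forcing):
    """x'' + [a - 2 q cos(2 omega t)] x = F(t), written as a first-order system (x' = v)."""
    return v, forcing(t) - (a - 2.0 * q * math.cos(2.0 * omega * t)) * x

def solve_mathieu(a, q, omega, forcing, x0, v0, t_end, n_steps):
    """Integrate from t = 0 to t_end with classical fixed-step RK4; returns (times, x values)."""
    h = t_end / n_steps
    t, x, v = 0.0, x0, v0
    ts, xs = [t], [x]
    for _ in range(n_steps):
        k1x, k1v = mathieu_rhs(t, x, v, a, q, omega, forcing)
        k2x, k2v = mathieu_rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v, a, q, omega, forcing)
        k3x, k3v = mathieu_rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v, a, q, omega, forcing)
        k4x, k4v = mathieu_rhs(t + h, x + h * k3x, v + h * k3v, a, q, omega, forcing)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
        ts.append(t)
        xs.append(x)
    return ts, xs

def example_forcing(t):
    """Hypothetical forcing F(t): a small annual sinusoid (t in years). Placeholder only."""
    return 0.1 * math.sin(2.0 * math.pi * t)

# Placeholder parameters, NOT fitted values from Eureqa.
ts, xs = solve_mathieu(a=2.0, q=0.5, omega=math.pi, forcing=example_forcing,
                       x0=1.0, v0=0.0, t_end=10.0, n_steps=10000)
print(f"x(t_end) = {xs[-1]:.4f}")
```

The resulting `xs` series is what would then be compared against the measured time series, for example with a correlation coefficient.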

I would break the data into wavelets with known frequencies; I just posted something to that effect for Nino 3.4 for John.

Then I would do what you are doing in post #40 for each decomposition; however, I would know ω and could roughly get q.

In other words, I would make a system of differential equations, one for each decomposition I could get from the localized time-frequency analysis of the data generated by the wavelets.

This is a premature thought, but I am seeing lots of papers linking wavelets to differential equations.

Dara


Another idea: SVR actually computes a nonlinear equation, the regressor, so substitute that as x(t) in your differential equation in post #40. Then try to solve for a, q and ω, if you know F(t).

The only problem is that the SVR regressor has ABS[] in its equation so there are points in the solution space of parameters that have no derivatives.

D


CORRECTION:

Dara wrote in #42:

> The only problem is that the SVR regressor has ABS[] in its equation so there are points in the solution space of parameters that have no derivatives.

This is not correct; there is no ABS[]. I got mixed up with another equation in SVR.

In a little bit I will post the SVR regressor's equation for nino34.

Apologies

Dara

I downloaded the Nino 3.4 anomaly from the NCDC NOAA Equatorial Pacific SST site and let Eureqa crunch on it with the candidate solution shown in #40, with a Eureqa supplied 12-month smoother (sma) applied to the data. The following screenshot highlights one high correlation coefficient result along the Pareto front of solutions.

http://imageshack.com/a/img87/5505/358d30.gif

The absolutely fascinating result -- which I have seen in the past with other data sets -- is that it isolates periodicities of less than one calendar month, and very close to the lunar month periods. http://en.wikipedia.org/wiki/Lunar_months

In this case, one of the sine wave components has a frequency of 83.28373 radians/year (Eureqa generates more significant digits than is displayed in the screenshot, and this extra precision is recovered by a copy&paste). That looks like unnecessary precision but watch what happens when we convert it to a period.

$ 83.28373 = \frac{2 \pi}{T} $

or

$ T = \frac{2 \pi}{83.28373} \times 365.242 = 27.555 \text{ days} $

As it happens, the anomalistic lunar month is 27.55455 days ! This is the average time the Moon takes to go from perigee to perigee, or the point in the Moon's orbit when it is closest to Earth, and is a key factor in establishing the long term tidal periods.
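The conversion above is easy to reproduce. A minimal sketch in Python, using only the numbers quoted in this comment (the constant and function names are just labels for this example):

```python
import math

DAYS_PER_YEAR = 365.242            # year length used in the conversion above
ANOMALISTIC_MONTH_DAYS = 27.55455  # anomalistic lunar month quoted above

def period_days(omega_rad_per_year):
    """Convert an angular frequency in radians/year to a period in days."""
    return 2.0 * math.pi / omega_rad_per_year * DAYS_PER_YEAR

period = period_days(83.28373)
relative_error = abs(period - ANOMALISTIC_MONTH_DAYS) / ANOMALISTIC_MONTH_DAYS
print(f"period = {period:.3f} days, relative error = {relative_error:.1e}")
```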

So that is either an amazing coincidence of a random period, or this is telling us that the Nino 3.4 SST is sensitive to the anomalistic lunar month.

Yet this result is troubling as well, since the sampling period for Nino 3.4 is a calendar month and Eureqa is picking out periods shorter than 30 days. This violates the Nyquist sampling criterion, which says that a sine wave must have a period of at least two sampling intervals, about 60 days here, to be detected unambiguously. So somehow Eureqa is generating a "subsample" period that ordinarily would get aliased to a longer period. I have seen this before when applying Eureqa to the QBO data set -- see http://contextearth.com/2014/06/17/the-qbom/
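To make the aliasing concern concrete, here is a rough sketch in Python of the period that a 27.55455-day sinusoid would fold to under ideal, evenly spaced monthly sampling. The even spacing of 365.242/12 days is an assumption made for this example; the real data are sampled by calendar month.

```python
DAYS_PER_YEAR = 365.242
SAMPLE_INTERVAL_DAYS = DAYS_PER_YEAR / 12.0  # assumption: evenly spaced "average" months

def aliased_period(true_period, sample_interval):
    """Apparent period of a sinusoid of true_period when sampled every sample_interval."""
    f_true = 1.0 / true_period
    f_sample = 1.0 / sample_interval
    # Fold the true frequency into the baseband [0, f_sample / 2].
    f_alias = abs(f_true - round(f_true / f_sample) * f_sample)
    return float("inf") if f_alias == 0.0 else 1.0 / f_alias

nyquist_period = 2.0 * SAMPLE_INTERVAL_DAYS
alias_days = aliased_period(27.55455, SAMPLE_INTERVAL_DAYS)
print(f"shortest resolvable period: {nyquist_period:.1f} days")
print(f"aliased period: {alias_days:.0f} days ({alias_days / DAYS_PER_YEAR:.2f} years)")
```

Under that assumption the lunar period would show up as a cycle of a bit under ten months, which is the longer-period waveform one would ordinarily expect a fit to find.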

This is good news but perplexing. It is good because it is all machine learning and completely hands off. All I did was apply a 12-month smoother to the data and Eureqa produced this result after a day of machine learning data crunching. I can't see how I could have influenced the results. But how it decides to violate the Nyquist criterion is beyond me. Is it because the sampling is on the same day of the month, yet each month is of different length, and thus the data is showing subtle inflection points that Eureqa's differential evolution algorithm is picking up? That is amazing sensitivity if that is the case.

Feel free to question the result, try to duplicate the result, or explain the result. I cannot figure it out.

Perhaps Dara can try his alternative machine learning magic on the data and maybe we can solve the puzzle.


Hello

Check out the regressor I just posted. If you like, we could change the calendar to lunar and re-compute a new regressor.

> This is good news but perplexing. It is good because it is all machine learning and completely hands off.

Spread the good news...

Dara

I have yet to figure out why the Eureqa machine learning algorithm is identifying a component period corresponding very closely to the anomalistic lunar month of 27.55455 days, see #44. Eureqa is finding a sine wave of period 27.555 days in the Nino 3.4 data series, which differs by less than 1 part in 10,000 from the official lunar tables value.

I don't know how this can happen, as the sampling period is a calendar month, a little over 30 days.

According to the Nyquist criterion, one would think that the algorithm would default to finding an aliased version of this waveform, as it contains exactly the same information content. In the figure below, the sampled data points at monthly intervals intersect with the longer-period aliased waveform.

![anomalistic](http://imageshack.com/a/img853/271/bid.gif)

What might be happening is an error in the timing of the monthly measurement, and since the higher frequency creates a more sensitive cross-section, this period may be preferred by Eureqa.

Also possible is that because Eureqa is trying to create the lowest complexity Fourier series, it is eliminating a sine/cosine pair in favor of a single component and then is able to more easily adjust for the actual phase shift via the higher frequency waveform. In other words, it is easier to "register" the phase by sliding the higher frequency waveform than the aliased low frequency waveform.

I would also add that searching for these cycles is often maddening. To propose an attribution for an oscillation, one need only identify a period and a phase. There is nothing more to it than that. Yet, this agreement could be completely coincidental. So the larger question is what does it take to make the leap from a possible coincidental agreement to something that is causal with high certainty?

The possibility that the ENSO oscillations are keyed to the long-term perigee-to-perigee lunar tidal force is extremely plausible. This is a real gravitational forcing function that could provide a definite stimulus to a sloshing behavior. After all, tides are a form of very slight sloshing of a water volume.

So the question is whether this anomalistic forcing period is emerging via a stochastic resonance with the hydrodynamic equations of the deeper water volume. That is where the physics intersects with the math, and the machine learning of Eureqa is giving us some hints of where to look.
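The point about sample timing can be illustrated with a small numerical experiment. This is only a sketch in Python under stated assumptions and says nothing about what Eureqa does internally: it compares a 27.55455-day sinusoid with its alias at evenly spaced sample times (365.242/12 days apart) and at calendar-month starts, using fixed month lengths with leap years ignored.

```python
import math

DAYS_PER_YEAR = 365.242
T_SAMPLE = DAYS_PER_YEAR / 12.0        # assumption: ideal, evenly spaced months
F_FAST = 1.0 / 27.55455                # cycles/day, anomalistic month
F_ALIAS = F_FAST - 1.0 / T_SAMPLE      # cycles/day, alias under even sampling

# Calendar month lengths in days; leap years are ignored in this sketch.
MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def max_difference(sample_times):
    """Largest gap between the fast sinusoid and its alias over the given sample times."""
    return max(abs(math.sin(2.0 * math.pi * F_FAST * t) -
                   math.sin(2.0 * math.pi * F_ALIAS * t)) for t in sample_times)

n_samples = 120  # ten years of monthly values

uniform_times = [n * T_SAMPLE for n in range(n_samples)]

calendar_times = []
t = 0.0
for n in range(n_samples):
    calendar_times.append(t)           # first day of each calendar month
    t += MONTH_LENGTHS[n % 12]

uniform_gap = max_difference(uniform_times)
calendar_gap = max_difference(calendar_times)
print(f"even sampling:     max difference = {uniform_gap:.2e}")
print(f"calendar sampling: max difference = {calendar_gap:.2f}")
```

With perfectly even sampling the two waveforms are indistinguishable at the sample points, while the uneven calendar spacing separates them. Whether that separation survives the 12-month smoothing and the noise in the real data is a separate question.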

I think this is a good example of multi-trend data and the problematic inferences it might cause.

Dara


By that token, the superposition of lunar effects on tides is multi-trend and yet still can be predicted.

With tides as an exemplar of what could be done, I am still operating under the premise that the underlying forcing causing ENSO may be more or less periodic, but the resulting effects that emerge are quasi-periodically erratic.

So is the similarity of a revealed 27.555 day period to the anomalistic month length of 27.55455 days just a coincidental inference, or is it meaningful?

Unless we build on this with other evidence, it is simply a standalone and isolated piece of the puzzle.


> So is the similarity of a revealed 27.555 day period to the anomalistic month length of 27.55455 days just a coincidental inference, or is it meaningful?

Later on today I will look at the exact time series you were using and run the DGaussian wavelet on it to see if there is such a period.

This is why I was asking John about multi-trend data, because it seems most of the data here is like that.

Personally, I believe there are other induced trends due to sunspots and magnetic storms, and, as a really off-the-chart idea, that microbial transport from surface waters to clouds and back also has an impact on this weather situation.

What makes the attribution more difficult is that some of the revealed frequencies may be characteristic of a natural resonance within the system, so like the swing starting to oscillate in the wind that John has used as an example, the forcing frequency can become entangled with the characteristic frequency, leading to a beating multi-trend resultant signal.

So as attribution goes, values of frequencies that match known oscillating systems -- such as the perigee-to-perigee anomalistic lunar month -- are candidates for primary forcings while unmatched frequencies could be induced as secondary resonances.

The further complicating factor is the likelihood of nonlinear responses -- such as the Mathieu-like modulation -- which make the traditional approaches for decomposition more difficult to apply. For example, spectral decomposition into sine waves via multiple linear regression is a standard approach, but if the basis functions are elliptical MathieuC/MathieuS pairs, some other technique will need to take its place. That's why I think the machine learning that Dara is advocating is effective -- it's often brute force, but if it works in the end, what's not to like?
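For reference, the standard approach mentioned above can be sketched as follows. This is an illustrative Python example on synthetic data, not an analysis of the Nino 3.4 series: the frequencies are assumed known in advance, the helper names are made up for this sketch, and the fit works only because the model is linear in the sine and cosine amplitudes. That linearity is what would be lost with Mathieu-type basis functions whose shape depends on a and q.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_sinusoids(times, values, omegas):
    """Least-squares (sin, cos) amplitudes for each known angular frequency in omegas."""
    basis = []
    for w in omegas:
        basis.append([math.sin(w * t) for t in times])
        basis.append([math.cos(w * t) for t in times])
    # Normal equations (B^T B) c = B^T y for the design matrix B.
    gram = [[sum(u * v for u, v in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(u * y for u, y in zip(bi, values)) for bi in basis]
    coefs = solve_linear(gram, rhs)
    return [(coefs[2 * k], coefs[2 * k + 1]) for k in range(len(omegas))]

# Synthetic test signal with two known periods (12 and 27 time steps).
omegas = [2.0 * math.pi / 12.0, 2.0 * math.pi / 27.0]
times = list(range(240))
values = [1.5 * math.sin(omegas[0] * t) + 0.5 * math.cos(omegas[0] * t)
          - 0.7 * math.sin(omegas[1] * t) for t in times]

fitted = fit_sinusoids(times, values, omegas)
for w, (s, c) in zip(omegas, fitted):
    print(f"period {2.0 * math.pi / w:.0f}: sin coef = {s:.3f}, cos coef = {c:.3f}")
```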


> so like the swing starting to oscillate in the wind that John has used as an example ... leading to a beating multi-trend resultant signal.

a) That is why I am here, since we need John's input and skills.

b) These weather pattern signals/data are really complex in comparison to, say, the stock signals from Wall Street.

In comparison, the stock signals are actually very simple objects.

The signals for El Nino and other natural atmospherics are telling tales of multiple trends, as you so elegantly described, and if each trend comes from a different dynamical system then we have amazing complexity to deal with in modeling.

My intuition says: decompose the data into as many trends as possible, model each, and then put them back together.

D