
# Continuous statistic measuring "El Niño-ness"?

John asked if there were any obvious "turnkey" machine learning algorithms to try on the El Niño problem that haven't been suggested yet. In the course of thinking about that, I became aware that I haven't read all of the posts on the forum over the past 6 months. Has there been any thought about whether one could come up with a continuous statistic that measures how much of an El Niño is/isn't occurring? (e.g. a severe El Niño, a mild El Niño, behaviour that's close to an El Niño but doesn't cross the threshold, nothing like an El Niño, etc.) Complete fidelity isn't important; it's just that some "regression type" machine learning techniques perform a lot worse when you represent very different examples of outputs with class labels 0 and 1 rather than with a more continuous measure.

1.

The Niño 3.4 index is pretty standard.
2.

The existence of El Niños is derived from a continuous index. Here is [one example](http://www.azimuthproject.org/azimuth/show/Oceanic+Ni%C3%B1o+index). There are others, and various ways of turning the continuous index into 0 or 1.

We have talked about the issue you raise, but from the other end. It would be more convincing to have a decent prediction of a continuous index than a two-class classifier trained on 30 years and tested on another 30. Personally I am more experienced with classification than regression, so I tend to think "you can do regression with a support vector machine, but should you?" as opposed to "you can do classification with a bog-standard neural net, but should you?".
3.

So there aren't any hidden bear traps in looking at the average temperature anomaly in the Niño 3.4 region over 5 months as that kind of continuous measure? (I can't see any, but they might not be obvious, which is why I'm asking.)
4.

The issue with SST measurements is that the oscillating factor can be conflated with a secular warming trend, and the latter is not part of ENSO but likely has other causes such as GHG-driven AGW.

That is why the atmospheric-pressure-derived Southern Oscillation Index (SOI) is often considered a handy continuous measure for characterizing an El Niño. Whenever the SOI is in an extreme negative direction (see 1983 or 1998 for example) it indicates El Niño, while positive indicates La Niña.

![SOI](http://www.cgd.ucar.edu/cas/catalog/climind/soi4.gif)
5.

David Tweed wrote:

> Complete fidelity isn't important, it's just that some "regression type" machine learning techniques perform a lot worse when you represent very different examples of outputs with class labels 0 and 1 rather than a more continuous measure.

That is correct and really troubling. I refer to them as IF-THEN-ELSE output forecasts or classifications: any introduction of an expression or formula that has IF-THEN-ELSE in its algorithm (e.g. ABS) causes a great loss of accuracy.

Example from the stock market:

A 1-5% error rate on the forecast of a stock price translates to a 45-50% error rate if one instead asks the question "does the price go up or down tomorrow?", which is a Boolean output.

With my colleague Michael Thorek, we have developed (still infantile) machine learning algorithm extensions incorporating wavelets to gain a few percent accuracy on the IF-THEN-ELSE outputs.

Sadly the El Niño definition has an IF-THEN-ELSE in it: being above some threshold. Therefore when I hear there is an algorithm which is predicting El Niño correctly I am awe-shocked!

I posted an SVR regressor here if you care to look. It is for some temperature data, which you might call continuous data, and the regressor is quite accurate; but as soon as we use it to predict whether the volatility goes above some level, the accuracy can drop to almost flipping a coin.

Dara
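Dara's stock-market numbers can be reproduced with a toy simulation. Everything below is hypothetical (made-up price scale and noise levels, not Dara's actual model); it just shows how a forecast with a small *price* error can still get the up/down question nearly as wrong as a coin flip, because the day-to-day change is tiny compared with the price itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# A hypothetical stock trading around 100, whose day-to-day change
# is about 1% of its price.
today = 100.0
change = rng.normal(0.0, 1.0, n)               # true change, std = 1 unit (~1%)
tomorrow = today + change

# A regressor that predicts tomorrow's *price* to within a few percent.
forecast = tomorrow + rng.normal(0.0, 3.0, n)  # ~3% price error

# Continuous view: relative RMSE of the price forecast looks great.
price_error = np.sqrt(np.mean((forecast - tomorrow) ** 2)) / today

# Boolean IF-THEN-ELSE view: did we get the direction right?
sign_error = np.mean(np.sign(forecast - today) != np.sign(change))
```

Here `price_error` comes out around 3%, while `sign_error` comes out around 40%: the small continuous error is swamped by the threshold question.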
6.
edited July 2014

If it's of any help, I remember reading a paper by Yann LeCun demonstrating better performance of deep (i.e. more than one hidden layer) convolutional networks than support vector machines for some problems.

Maybe somewhere here:

http://arxiv.org/find/all/1/all:+AND+yann+lecun/0/1/0/all/0/1

There is also Orange, a Python machine learning framework that includes support vector machines:

http://orange.biolab.si/

One of the authors, Alex Jakulin, wrote his PhD thesis on the analysis of attribute interactions:

http://arxiv.org/find/all/1/all:+jakulin/0/1/0/all/0/1
7.

Hello Jim, thanks.

Orange, RapidMiner and Pentaho are all unsuitable for our purposes here; I did due diligence a year ago. I have the lower-level algorithms, so we could easily modify the underlying parameters and parallelize, but I do not have all of the known ones.

I'll review the papers you mentioned and see if something is usable; I'll keep you posted.

D
8.
edited July 2014

Great stuff Dara, especially as now I don't have to review any more packages, so thanks a bunch :).
9.
edited July 2014

David wrote:

> Has there been any thought about whether one could come up with a continuous statistic that measures how much of an El Niño is/isn't occurring? (e.g. a severe El Niño, a mild El Niño, behaviour that's close to an El Niño but doesn't cross the threshold, nothing like an El Niño, etc.) Complete fidelity isn't important, it's just that some "regression type" machine learning techniques perform a lot worse when you represent very different examples of outputs with class labels 0 and 1 rather than a more continuous measure.

That's what I thought - I'm glad to hear my intuition confirmed!

Over in our discussion [El Niño project - thinking about the next steps](http://forum.azimuthproject.org/discussion/1382/el-nino-project-thinking-about-the-next-steps/?Focus=11382#Comment_11382), I wrote:

> I think I know what we should be trying to predict. Either:
>
> • the **Niño 3.4 index**, which is the sea surface temperature anomaly in the Niño 3.4 region. You can get monthly values of this from 1870 to 2011 [here](http://www.esrl.noaa.gov/psd/gcos_wgsp/Timeseries/Nino34/) or [here](http://www.cgd.ucar.edu/cas/catalog/climind/TNI_N34/index.html#Sec5) (in different formats),
>
> or
>
> • the **Oceanic Niño Index (ONI)**, which is the 3-month running mean of the Niño 3.4 index.
>
> When the ONI is over 0.5 °C, people say there's an **El Niño**. When it's below -0.5 °C, some people say there's a **La Niña**, though other people think this criterion for a La Niña is suboptimal.
>
> If I had to pick one of these for you to predict, it would be the ONI. However, it might be interesting to take the same prediction algorithm, keep everything else the same, and use it to predict both the ONI and Niño 3.4. Since the former is just the 3-month running average of the latter, I guess the question is whether the "smoothed-out" quantity is significantly easier to predict.
>
> We could try to predict either of these quantities using data obtained _at least 3 months earlier_, _at least 6 months earlier_, _at least 9 months earlier_, or _at least 12 months earlier_. People believe it gets much harder to predict them more than 6 months in advance. So, to get people interested, we should succeed in predicting them more than 6 months in advance. However, it will be very interesting to see how our predictive ability degrades as we try predictions that are 3, 6, 9 or 12 months in advance.

What I really want now is some standard "off the shelf" software that takes several time series $a_n, b_n, \dots$ and tries to predict the values of one of them, say $a_n$, given all of them up to some time $m$ steps before step $n$: that is, $a_i, b_i, \dots$ for $i \le n - m$.

Blake Pollard suggested that I try this:

• [R packages: random forest](http://cran.r-project.org/web/packages/randomForest/randomForest.pdf)

I want to do something easy and fairly standard in time to talk about it at NIPS in December. Then the people at NIPS can tell me what I _should_ be doing.

There is probably something simpler than the "random forest" method that's worth doing, too.

I would like a fairly standard way to rate how "good" different prediction methods are. Maybe just the root mean square of the difference between the predicted value and the (later) observed value?
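Both halves of this comment fit in a few lines of code. The sketch below computes the ONI as a 3-month running mean and applies the ±0.5 °C thresholds, then builds a lagged design matrix of the kind described (predict $a_n$ from values at least $m$ steps old). The index values and the helper `lagged_design` are made up for illustration; this is not real index data or an established API.

```python
import numpy as np

# Toy monthly Niño 3.4 anomalies (°C) -- made-up values, not real data.
nino34 = np.array([0.1, 0.3, 0.6, 0.9, 1.2, 1.0, 0.7, 0.4, 0.0, -0.3, -0.6, -0.8])

# ONI: 3-month running mean of the Niño 3.4 index.
oni = np.convolve(nino34, np.ones(3) / 3, mode="valid")

# The conventional thresholds turn the continuous index into classes.
el_nino = oni > 0.5
la_nina = oni < -0.5

def lagged_design(target, predictors, m, p):
    """One row per predictable step n: the p most recent values of each
    predictor series that are at least m steps older than n (most recent
    first), paired with the target value at n."""
    start = m + p - 1
    rows = []
    for n in range(start, len(target)):
        row = []
        for s in predictors:
            row.extend(s[n - m - p + 1 : n - m + 1][::-1])
        rows.append(row)
    return np.array(rows), np.asarray(target)[start:]

# Predict the ONI 2 steps ahead from 3 past values of two series.
X, y = lagged_design(oni, [oni, nino34[: len(oni)]], m=2, p=3)
```

Any regression package (random forests included) can then be pointed at `X` and `y`; the RMSE of the prediction against the held-out `y` gives the scoring rule suggested at the end.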
10.
edited July 2014

David wrote:

> So there aren't any hidden bear traps in looking at the average temperature anomaly in the Niño 3.4 region over 5 months as that kind of continuous measure?

WebHubTel wrote:

> The issue with SST (sea surface temperature) measurements is that the oscillating factor can be conflated with a secular warming trend, and the latter is not part of ENSO (El Niño Southern Oscillation) but likely has other causes such as GHG-driven AGW.

That's true. However, the US National Weather Service now computes the Niño 3.4 data in a way that corrects for the gradual warming trend:

• [Monthly Niño 3.4 index](http://www.cpc.noaa.gov/products/analysis_monitoring/ensostuff/ONI_change.shtml), Climate Prediction Center, National Weather Service.

The idea is that instead of taking the temperature and subtracting the temperature at the same time of year averaged over _all years on record_, they subtract an average over _nearby years_. See the website for details and a nice graph showing how significant this correction is:

![30-year base periods for Niño 3.4](http://www.cpc.noaa.gov/products/analysis_monitoring/ensostuff/30yrbaseperiods_Nino34.png)
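A minimal sketch of the "nearby years" idea, assuming monthly temperatures arranged as a `(year, month)` array. This only illustrates the principle; it is not the CPC's exact procedure (they use fixed 30-year base periods, updated every 5 years). The synthetic data is a pure warming trend, which is exactly the case the moving baseline is meant to absorb.

```python
import numpy as np

def anomalies(temps, base="all", width=30):
    """temps: array of shape (n_years, 12) of monthly mean temperatures.
    base="all":    subtract the same-month mean over every year on record.
    base="moving": subtract the same-month mean over the ~width nearest
                   years, so a slow warming trend is absorbed into the
                   baseline instead of inflating the anomaly."""
    n_years = temps.shape[0]
    if base == "all":
        return temps - temps.mean(axis=0)
    out = np.empty_like(temps, dtype=float)
    half = width // 2
    for y in range(n_years):
        hi = min(n_years, max(0, y - half) + width)  # clamp window at the ends
        lo = max(0, hi - width)
        out[y] = temps[y] - temps[lo:hi].mean(axis=0)
    return out

# Synthetic data: a pure 0.01 °C/year warming trend, no ENSO signal at all.
temps = 0.01 * np.arange(40)[:, None] + np.zeros((40, 12))
fixed = anomalies(temps, base="all")
moving = anomalies(temps, base="moving", width=30)
```

With the all-years baseline the final year shows a spurious anomaly of about 0.2 °C from the trend alone; the moving baseline shrinks it noticeably, which is the point of the correction.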
11.

Dara wrote:

> That is correct and really troubling. I refer to them as IF-THEN-ELSE output forecasts or classifications: any introduction of an expression or formula that has IF-THEN-ELSE in its algorithm (e.g. ABS) causes a great loss of accuracy.

FWIW, what I was referring to was more just that a regression-based system is designed to reproduce the training outputs as accurately as possible. When the training outputs are continuous it can often find a "quite natural" approximation function which then generalises well. In contrast, when the outputs are classes encoded as numbers, which it then tries to reproduce as accurately as it can, it often has to "contort" the approximation function to make that happen, which generally significantly reduces how well it generalises.
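The "contortion" is easy to see with a generic example (mine, not David's): fit the same smooth model class to a continuous target and to the equivalent 0/1 labels. A degree-9 polynomial reproduces a continuous ramp essentially exactly, but forced to reproduce the step function carrying the same threshold information, it has to bend hard near the jump and its worst-case error stays large.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 401)
ramp = x                                  # a continuous target
step = (x > 0).astype(float)              # the same threshold, as 0/1 labels

deg = 9
ramp_fit = np.polyval(np.polyfit(x, ramp, deg), x)
step_fit = np.polyval(np.polyfit(x, step, deg), x)

# Worst-case fitting error of the same smooth model on each target:
ramp_err = np.max(np.abs(ramp_fit - ramp))   # essentially zero
step_err = np.max(np.abs(step_fit - step))   # large: the polynomial must
                                             # "contort" around the jump
```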
12.

Hello Jim and David,

Thanks for the references.

I reviewed the papers Jim mentioned, and generally speaking they have features here and there that we could use, but we need large servers to make any sensible use of them.

Each new variation of an algorithm could require up to 6 months of training, i.e. learning how to process the data and developing the expertise to use a variation of the learning algorithm to make reasonable forecasts and classifications.

None of the papers, at least those I skimmed through, had any serious workbench or backtesting setup we could use for evaluation, so all of that we need to do ourselves.

Dara
13.

Dara wrote:

> Sadly the El Niño definition has an IF-THEN-ELSE in it: being above some threshold. Therefore when I hear there is an algorithm which is predicting El Niño correctly I am awe-shocked!

Then you will be less awe-shocked when you hear people admit that most El Niño prediction algorithms are quite bad. In fact they've been getting _less_ good in the last 10 years, apparently because the weather is doing strange things.

I think this means we should _not_ try to predict whether there is an El Niño. We should try to predict the Niño 3.4 index or its 3-month running average (the ONI).
14.

> I think this means we should not try to predict whether there is an El Niño. We should try to predict the Niño 3.4 index or its 3-month running average (the ONI).

Done! I will work on SVR and later neural network forecast models.

Perhaps through most of August.
15.
edited July 2014

Dara,

I thought the Hessian-free LLE and FFT optimisations might be useful. Ian Ross has a good description of LLE and El Niño in his thesis (arXiv:0901.0537).

As usual, I forgot the machine learning package I wanted to mention, used by the Courant people: torch.ch. Have you come across it? LeCun's work is nearly all image processing, and how to implement either of the optimisation techniques I read about wasn't deducible by me from the papers, but perhaps there are more clues in Torch. I'm trying to get a different deep-learning library to work.

At the rate you're going I wouldn't bet you can't get 6 months' learning done in a month :)
16.

> I forgot the machine learning package I wanted to mention, used by the Courant people: torch.ch.

If you go here:

[LIBSVM from Taiwan](https://github.com/koraykv/torch-svm/blob/master/libsvm/COPYRIGHT)

you will see that these Torch people are actually using Chih-Chung Chang and Chih-Jen Lin's LIBSVM, which is the industry standard. They placed their own layer on top of libsvm, as did the scikit-learn people.

Now here is the problem: I need to parallelize libsvm. I wrote to the Taiwanese developers and they do not know how to do it and have no code for it. Solution: I wrote my own SVR which allows me to plug in a Differential Evolution parallelizer for GPU servers (not fully optimized yet, but there is good hope it will work super fast soon).

Our issue is not machine learning; our issue is parallelized machine learning on multi-core CPU servers and GPU servers. If we do not parallelize, there is nothing we could provide for John's research worth mentioning.

For NN:

[Neural Network](https://github.com/torch/nn)

They have CUDA support for GPU servers, so there is hope; now I need to choose between their code and mine (I have OpenMP parallelization on the Intel platform working perfectly, and am porting over to CUDA).
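For readers unfamiliar with the optimizer mentioned here, this is a minimal textbook sketch of Differential Evolution (the classic rand/1/bin variant) applied to a toy function. It is a generic illustration, not Dara's implementation; the relevant point is that the trial-vector evaluations across the population are independent, which is what makes DE a natural fit for multi-core or GPU parallelization.

```python
import numpy as np

def differential_evolution(f, bounds, pop_size=30, F=0.8, CR=0.9, iters=200, seed=0):
    """Textbook DE/rand/1/bin minimizer. bounds: list of (lo, hi) per dimension."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(lo)
    pop = rng.uniform(lo, hi, (pop_size, dim))
    fit = np.array([f(p) for p in pop])
    for _ in range(iters):
        for i in range(pop_size):
            # Mutate: combine three distinct other members.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            # Binomial crossover with at least one mutant component.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection -- each i is independent, hence parallelizable.
            ft = f(trial)
            if ft <= fit[i]:
                pop[i], fit[i] = trial, ft
    return pop[fit.argmin()], fit.min()

# Example: minimize the 3-dimensional sphere function.
best, val = differential_evolution(lambda v: float(np.sum(v ** 2)),
                                   bounds=[(-5, 5)] * 3)
```

In a parallel version, the inner `for i` loop (mutation, crossover, and the `f(trial)` evaluations) is the part that gets farmed out to CPU cores or GPU threads.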
17.

> At the rate you're going I wouldn't bet you can't get 6 months' learning done in a month :)

:)
18.
edited July 2014

That's very useful info, Dara; I didn't know any of it. It's great that there's an industry-standard lib. What language did you write your SVR code in?

FWIW I've been trying to learn to use some of the burgeoning plethora of [Haskell parallel libraries](http://hackage.haskell.org/packages/search?terms=parallel) like Data Parallel Haskell (dph), Cloud Haskell, and Repa (flat data parallel arrays), but not CUDA accelerate as I don't have the hardware.
19.

Haskell would be useful for John's Petri net simulation ideas. You need to parallelize the firings somehow or it would take a long time to run simple simulations.

I have a CUDA server for the upcoming new algorithms.

Dara
20.

With all this CUDA enthusiasm, I'm :-( that for the past two years my work has (in theory) been on a compiler for OpenCL!
21.

BTW Dara, does your CUDA server have an OpenCL interface? (I gather that most recent NVIDIA setups do?)
22.

We are fully NVIDIA-based, but I will check about OpenCL; I believe the answer is yes.

I am going to move the neural network code and Differential Evolution (the non-linear optimizer you were asking about) onto the CUDA GPU servers.

It will take a few months of full-time coding, but then we are free to experiment with real algorithms.

D
23.

Hi Dara, I don't suppose you've managed to find out about OpenCL support yet?
24.

Hello David, the answer is a resounding yes:

[NVIDIA CUDA OpenCL](https://developer.nvidia.com/opencl)