What I think I am up against is determining what the "true" data is. For SOI, I can use Darwin, or Tahiti, or the difference between the two. But none of these is likely to be right, since the correlation between Darwin and Tahiti is only -0.55. So unless Darwin is more correct than Tahiti, or taking the difference acts as a noise reducer, the optimization will be fitting to an incorrect view of the true ENSO process, and any predictions will tend to disperse from the actual result the further out they are made. In other words, getting a 0.9 model correlation with the Darwin data, which itself has only a -0.55 correlation with Tahiti, has to be an example of over-fitting, barring the small chance that Darwin is a perfect proxy for the ENSO sloshing.
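To make that over-fitting argument concrete, here is a minimal sketch (my own toy model, not anything from the actual SOI analysis): assume Darwin and Tahiti are equally noisy proxies of one common ENSO signal, with Tahiti flipped in sign, and with the noise level tuned so the proxies correlate at about -0.55 as observed. Under that assumption, each proxy can correlate with the true signal at only about sqrt(0.55) ≈ 0.74, which is why a 0.9 fit to Darwin would exceed the signal content Darwin plausibly carries.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy model: Darwin and Tahiti = common ENSO signal + independent noise,
# Tahiti with opposite sign. Noise std dev sigma chosen so that
# corr(Darwin, Tahiti) = -1/(1 + sigma^2) is roughly -0.55.
truth = rng.standard_normal(n)
sigma = 0.90  # 1/(1 + 0.90^2) ~ 0.55
darwin = truth + sigma * rng.standard_normal(n)
tahiti = -truth + sigma * rng.standard_normal(n)

r_dt = np.corrcoef(darwin, tahiti)[0, 1]
r_d_true = np.corrcoef(darwin, truth)[0, 1]

print(f"corr(Darwin, Tahiti) ~ {r_dt:+.2f}")   # close to -0.55
print(f"corr(Darwin, truth)  ~ {r_d_true:+.2f}")  # close to +0.74
```

Of course the equal-noise assumption is the whole question: if Darwin is much cleaner than Tahiti, the ceiling for Darwin rises accordingly.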

In this case the model correlation to Darwin is 0.67, which is better than 0.55, making it a decent fit if Darwin happens to be a more correct ENSO measure than Tahiti.

That brings up another question:

Is Nino3.4 better than SOI in extracting the true measure of what we should be fitting the model to? Or is MEI better?

That said, even if this approach can only predict a few years out, it would still be quite an achievement in terms of laying to rest the idea that ENSO is a purely chaotic process.


Dara, good luck on getting the bugs fixed. I haven't run into anything so far, except for an occasional wild ride that Mathematica takes the computer on, using up all available memory.