The white paper I wrote on the Southern Oscillation Index Model, archived on arXiv [here](http://arxiv.org/abs/1411.0815), is probably in need of an edit.

This is actually good news. To create the model as described in the paper, I applied four factors: the QBO (quasi-biennial oscillation), the Chandler wobble (CW), the TSI (total solar irradiance), and an extra boundary condition defined by the Pacific Ocean climate shift of 1976/1977.

But are all these factors necessary? Is it possible to reduce the number of factors so that only the primary factors contribute and still maintain fidelity to the data?

The two I felt were the weakest and most in need of a scientific rationale were the TSI and climate shift factors. So I removed these and juggled only the QBO and CW factors. Sure enough, the model fit retained most of the erratic structure and the correlation coefficient was only a bit lower. I think part of the reason this works is that the more detailed QBO waveform I have recently been applying compensates for the extra degrees of freedom that the TSI data provided.

And as far as the climate shift is concerned, this factor vanished.

The implication of removing these factors is that the model should have better predictive power, given the higher determinism of the QBO and CW. Sudden climate shifts are impossible to predict, so that factor would have added uncertainty to the forecast capability. The same goes for TSI, since the length of its cycle varies around a nominal 10 to 11 years.

So as far as predictive power is concerned, we rely on the stationary aspects of the QBO (2.33 year average period) and of the Chandler wobble (6.5 year beat period). These each show slight jitter about the mean frequency, but they also show stationarity in terms of exhibiting coherence over long time scales -- in other words, if you advance the QBO by many multiples of a 2.33 year period, you will know where you are on the waveform (minus the jitter).
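To make the stationarity argument concrete, here is a minimal sketch of how the position on a periodic waveform follows from the elapsed time alone. It assumes an idealized, jitter-free sinusoid for each forcing, which is a simplification and not the detailed QBO waveform used in the model. The function names and the `phase0` parameter are hypothetical, introduced only for illustration.

```python
import math

# Mean periods quoted in the text (years).
QBO_PERIOD_YEARS = 2.33
CW_BEAT_PERIOD_YEARS = 6.5


def phase_fraction(elapsed_years, period_years):
    """Fraction of a cycle (0 <= f < 1) reached after elapsed_years.

    Assumes a perfectly stationary period, i.e. no jitter.
    """
    return (elapsed_years / period_years) % 1.0


def idealized_forcing(elapsed_years, period_years, phase0=0.0):
    """Value of a unit-amplitude sinusoid standing in for the forcing.

    phase0 is a hypothetical starting phase in radians.
    """
    return math.sin(2.0 * math.pi * elapsed_years / period_years + phase0)


if __name__ == "__main__":
    # Advancing by whole multiples of the period returns to the same phase,
    # which is the sense in which the forcing stays predictable far ahead.
    for cycles in (1, 10, 40):
        t = cycles * QBO_PERIOD_YEARS
        print(cycles, phase_fraction(t, QBO_PERIOD_YEARS),
              idealized_forcing(t, QBO_PERIOD_YEARS))
```

In a real forecast the jitter would accumulate as a phase error, so this shows only the deterministic part of the argument.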

---

Now, why did I take this detour in the first place? If you go back to blog posts from last year you will see that I started with only the CW and QBO factors. But then I somehow became motivated to achieve a higher correlation coefficient than the two factors could produce. Yet as I have become more and more convinced, chasing a high CC is pointless when the data itself has noisy characteristics. The fear is that the extra fitting is simply fitting the noise excursions.
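The point about noise limiting the achievable CC can be illustrated with a small sketch on synthetic data. Everything here is an assumption chosen for illustration: the sinusoidal "true" signal, its 28-sample period, the Gaussian noise level, and the helper names. None of it is the SOI data or the model fit. For independent noise, even a perfect model of the signal correlates with the noisy data at only about σ_signal / sqrt(σ_signal² + σ_noise²), so any fit that scores above that is likely tracking noise.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cc_ceiling(signal_std, noise_std):
    """Expected CC between a signal and that signal plus independent noise."""
    return signal_std / math.hypot(signal_std, noise_std)


# Synthetic stand-ins (assumptions, not real data).
rng = random.Random(0)          # fixed seed so the run is repeatable
n = 28 * 700                    # whole number of cycles of the toy signal
noise_std = 0.7                 # hypothetical noise level
signal = [math.sin(2.0 * math.pi * t / 28.0) for t in range(n)]
data = [s + rng.gauss(0.0, noise_std) for s in signal]

signal_std = math.sqrt(sum(s * s for s in signal) / n)
observed = pearson(signal, data)          # CC of the *perfect* model
expected = cc_ceiling(signal_std, noise_std)

if __name__ == "__main__":
    print("observed CC of perfect model:", observed)
    print("theoretical ceiling:", expected)
```

The observed and theoretical values should agree closely, and both sit well below 1 even though the "model" here is exactly the underlying signal.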

This is actually a good outcome. There are only 2 factors now to cover the year-over-year fluctuations of ENSO SOI, with a slow sinusoid of a 50-60 year period to capture the multi-decadal envelope.

To add a caveat, some of the residual still appears to contain a TSI factor, so this term may still have some significance, but not when the quest is to arrive at the least complex solution.