Steve Wenner has done a bit more statistical analysis of Ludescher _et al._'s paper, just in time for my NIPS talk, and I put it [here]( so I can link to it. He wrote:

> I computed Bayes probability intervals for the success probabilities of Ludescher’s predictions (and for my variations). My method is not perfect, since I assumed independence across years, which is surely not correct. However, taking into account the serial correlations would only increase the length of the error bars, and they are plenty long enough already! I’m not certain how to do better, but I suspect I would need to use Monte Carlo methods, and they have their own problems.

> Ludescher’s method has an expected posterior probability of successfully predicting an El Nino initiation event, when one actually occurs, of 0.601. This is very close to the frequentist estimate (11 successes / 18 El Ninos = 0.611); so, the prior distribution has little effect on the estimate of the mean. The 95% confidence interval is from 0.387 to 0.798; so, the data and method did succeed in narrowing the prior uniform interval (see the next paragraph) that extends from 0.286 to 1. The intervals for “non” El Nino events are shorter: 0.768 to 0.951 for the probability of successfully predicting a “non” event; however, the prior for the non-event is from 0.714 to 1, so Ludescher’s method doesn’t narrow the range very much!

> If we don’t condition on the outcome, then the estimate of the mean success probability is 0.795 using Ludescher’s method; but, if we simply use the “dumb” rule (always predict “no El Nino”) then we will be right with probability 0.714 - the data and Ludescher gain us very little!

> > **Truncated Uniform Prior**

> > I assume that any reasonable method will do at least as well as chance. Over many years we have experienced an El Nino initiation event in 28.6% of those years. So, a dumb method that simply declares “El Nino will happen next year!” with probability 28.6% and “will not happen!” with probability 71.4% will be successful in “predicting” 28.6% of all El Ninos. So, I set the minimum success probability at p0 = 28.6%, given that an El Nino actually occurs. Similarly, the dumb method successfully predicts 71.4% of the “no El Nino” years; so, I set the minimum success probability at p0 = 71.4% for any prediction method, given that the outcome is “no El Nino”. In both cases the upper limit is p1 = 1 for a perfect prediction method.

> For a binomial sampling situation with a truncated uniform prior the posterior density is expressible with the help of the beta distribution function (normalized incomplete beta function). The formulas can be found in _Bayesian Reliability Analysis_ by Martz & Waller, Wiley, 1982, pp. 262-264. The posterior mean has a closed form, but the Bayes probability intervals must be found by iterative methods (I used the Excel “solver” add-in).

> The details are on the spreadsheet. I greyed out superfluous stuff I used to help me get the formulas straight.
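
For anyone who wants to check the El Nino figures without the spreadsheet, here is a minimal sketch in Python. It is not Steve's method: instead of the closed-form expressions in Martz & Waller and the Excel solver, it integrates the posterior numerically on a grid. It also assumes the 95% interval is equal-tailed (2.5% of posterior mass cut from each end), which the comment above does not say. The function name `truncated_uniform_posterior` and its parameters are made up for illustration. The inputs are the ones quoted above: 11 successes out of 18 El Ninos, and a uniform prior on [0.286, 1].

```python
def truncated_uniform_posterior(successes, trials, p0, p1=1.0,
                                level=0.95, grid=20000):
    """Posterior mean and equal-tailed interval for a binomial success
    probability p, given a uniform prior restricted to [p0, p1].

    The posterior density is proportional to p^s * (1 - p)^(n - s) on
    [p0, p1]; here it is normalized numerically with the midpoint rule.
    (Equal-tailed interval is an assumption, not taken from the source.)
    """
    failures = trials - successes
    step = (p1 - p0) / grid
    # Midpoints of the grid cells covering [p0, p1].
    points = [p0 + (i + 0.5) * step for i in range(grid)]
    # Unnormalized posterior density at each midpoint.
    weights = [p ** successes * (1.0 - p) ** failures for p in points]
    total = sum(weights)

    mean = sum(p * w for p, w in zip(points, weights)) / total

    # Walk the cumulative distribution to find the two quantiles.
    tail = (1.0 - level) / 2.0
    lower = upper = None
    running = 0.0
    for p, w in zip(points, weights):
        running += w / total
        if lower is None and running >= tail:
            lower = p
        if running >= 1.0 - tail:
            upper = p
            break
    return mean, lower, upper


# Inputs quoted in the comment above: 11 of 18 El Ninos predicted,
# with the prior truncated below at the chance level 0.286.
mean, lower, upper = truncated_uniform_posterior(11, 18, p0=0.286)
print(f"posterior mean: {mean:.3f}")
print(f"95% interval:   {lower:.3f} to {upper:.3f}")
```

Under these assumptions the output should land close to the 0.601 mean and the 0.387 to 0.798 interval quoted above. Small differences are to be expected if the spreadsheet used a different kind of interval.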

He added:

> Hi, I just wanted to add a couple of comments:

> The reason I used both the training and the validation data in estimating confidence limits was because the validation data show a better fit to the model than the training data; so, it seemed more fair to use both data sets for these calculations.

> I did some rough calculations to estimate the increase in the error bars that might result if I were to take the serial correlations into account. For instance, I think that the lower limit for the probability of successfully predicting an El Nino with Ludescher’s method is actually closer to 0.366, rather than the 0.387 reported below, and the upper limit would increase from 0.798 to 0.815. Since my ideas for this adjustment are only half-baked, I won’t go into the details here.

I included a few of these figures in [my talk](; overall the talk is highly critical of Ludescher _et al._'s paper but concludes that there seems to be an interesting idea lurking in the vicinity.