
Critique of Ludescher et al.

Part of the abstract for John's NIPS paper talks about critically analyzing the technique used by Ludescher et al.

I'm starting this thread as a place for us to bat around ideas about this. Perhaps this will help John in his thinking about the subject.

Sorry I don't have much concrete to contribute here; I was busy with other things and lost track of the flow of what was going on.

Here are some possible points:

  • Definitions are more complex than they need to be; Graham showed that the same results can be achieved with simpler definitions.

  • A better goal is to predict a continuous index of El Nino.

  • Questions about the statistical significance of their results; someone posted about this, but I forget who or where.

  • Is a network of correlation strengths really a meaningful entity, which can form the basis of empirical predictions? Are there other cases where such correlation networks have led to predictive results? What are the underlying physical bases?

  • They put themselves out on a limb by predicting El Nino in 2014, but it now appears to be less likely...

Perhaps you guys who have been more active in the El Nino project can fill in some of the ideas here, or contradict them, or add some other points.

If 2014 turns out to be a non-El-Nino year, and this raises doubts about the Ludescher et al. approach, then part of your talk could be cast as a post-mortem: a return to the blackboard, and a drumming-up of more speculative ideas that take climate network theory in new directions. The trouble is that your talk occurs too soon to tell.

Now I read that the probability of El Nino this winter has been reduced to 58%. That means there is a 42% chance that a strong prediction made by the Ludescher et al. theory is incorrect. Doesn't that in itself cast some doubt on their "theory"?

Comments

  • 1.
    edited November 2014

    EDIT: deleted a comment expressing skepticism about machine learning that is not guided by physical theory. Needs more consideration.

  • 2.
    edited November 2014

    David I share most of your sentiments above.

    The Ludescher paper is bothersome to me, but I do not want to babble anything about it here, due to John's status in academia and being respectful of John, of course. However, I have very, very strong opinions on that paper.

    I would believe in such a network link model if it also included the vertical North-South variations, since I clearly showed that in most of the Sig data for the volumetric surface data there is a Trend corresponding to a North-South oscillation.

    Dara

  • 3.
    edited November 2014

    Dara,

    From the descriptions I've read, Pacific northward and southward meridional winds feed into the westerly zonal winds. This is the inter-tropical convergence zone (ITCZ), and there is at least one paper referring to a waveguide mechanism. Might this relate to your findings?

  • 4.

    > From what I remember I think this is called the tropical convergence zone (TCZ) and there is at least one paper referring to a waveguide mechanism. Might this relate to your findings?

    I think so, Jim; I saw that in a paper too. Let me search the databases I have and report back.

    The software animation that was posted here clearly shows the North-South Trend.

    Gimme a day

    Dara

  • 5.

    Jim

    One other thing about this paper, and in general about any similar research in climate, is that there is no backtesting!

    Imagine I went to a hedge fund manager and told him that my stock price forecast had worked accurately for the past 3 days! He would laugh and say: OK, what about the past 3000 days?

    Dara

  • 6.

    This is one of the animations of interest, showing wind speeds at various altitudes and latitudes: http://www.ugamp.nerc.ac.uk/hot/ajh/qboanim.movie

    Taken from this site: http://www.ugamp.nerc.ac.uk/hot/ajh/qbo.htm

    The animation is kind of hypnotic, but watch as the Northern and Southern hemispheres alternately appear to push volumes of air toward the equator, while along the equator the QBO, with its roughly 2.33-year period, descends from the stratosphere. The QBO cycle is the feature that really impacts ENSO, as it is more zonal, i.e. east-west.

  • 7.
    edited December 2014

    Steve Wenner has done a bit more statistical analysis of Ludescher et al's paper, just in time for my NIPS talk, and I put it here (http://johncarlosbaez.wordpress.com/2014/07/23/el-nino-project-part-6/#comment-60800) so I can link to it. He wrote:

    I computed Bayes probability intervals for the success probabilities of Ludescher’s predictions (and for my variations). My method is not perfect, since I assumed independence across years, which is surely not correct. However, taking into account the serial correlations would only increase the length of the error bars, and they are plenty long enough already! I’m not certain how to do better, but I suspect I would need to use Monte Carlo methods, and they have their own problems.

    Ludescher’s method has an expected posterior probability of successfully predicting an El Nino initiation event, when one actually occurs, of 0.601. This is very close to the frequentist estimate (11 successes / 18 El Ninos = 0.611); so, the prior distribution has little effect on the estimate of the mean. The 95% confidence interval is from 0.387 to 0.798; so, the data and method did succeed in narrowing the prior uniform interval (see the next paragraph) that extends from 0.286 to 1. The intervals for “non” El Nino events are shorter: 0.768 to 0.951 for the probability of successfully predicting a “non” event; however, the prior for the non-event is from 0.714 to 1, so Ludescher’s method doesn’t narrow the range very much!

    If we don’t condition on the outcome, then the estimate of the mean success probability is 0.795 using Ludescher’s method; but if we simply use the “dumb” rule (always predict “no El Nino”), then we will be right with probability 0.714. The data and Ludescher gain us very little!

    Truncated Uniform Prior

    I assume that any reasonable method will do at least as well as chance. Over many years we have experienced an El Nino initiation event in 28.6% of those years. So, a dumb method that simply declares “El Nino will happen next year!” with probability 28.6% and “will not happen!” with probability 71.4% will be successful in “predicting” 28.6% of all El Ninos. So, I set the minimum success probability at p0 = 28.6%, given that an El Nino actually occurs. Similarly, the dumb method successfully predicts 71.4% of the “no El Nino” years; so, I set the minimum success probability at p0 = 71.4% for any prediction method, given that the outcome is “no El Nino”. In both cases the upper limit is p1 = 1 for a perfect prediction method.

    For a binomial sampling situation with a truncated uniform prior, the posterior density is expressible with the help of the beta distribution function (normalized incomplete beta function). The formulas can be found in Bayesian Reliability Analysis by Martz & Waller (Wiley, 1982), pp. 262-264. The posterior mean has a closed form, but the Bayes probability intervals must be found by iterative methods (I used the Excel “solver” add-in).

    The details are on the spreadsheet (http://math.ucr.edu/home/baez/ecological/el_nino/wenner/ElNinoTemps_v2.xlsx). I greyed out superfluous stuff I used to help me get the formulas straight.
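    The computation Steve describes can be reproduced numerically without Excel. The sketch below (editorial; it assumes the 11-successes-out-of-18 count quoted above and the truncated uniform prior on [0.286, 1]) gets the posterior mean and a central 95% probability interval by plain Simpson integration and bisection:

```python
# Editorial sketch (not Steve's actual spreadsheet): posterior for a
# binomial success probability under a uniform prior truncated to
# [p0, p1], computed with composite Simpson integration and bisection.
# Counts assumed from the text: 11 successful predictions out of
# 18 El Nino initiation events, with p0 = 0.286.

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) panels."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def truncated_posterior(s, f, p0, p1=1.0):
    """Posterior mean and central 95% interval for a binomial success
    probability (s successes, f failures), uniform prior on [p0, p1]."""
    like = lambda p: p ** s * (1.0 - p) ** f   # unnormalized posterior
    z = simpson(like, p0, p1)                  # normalizing constant
    mean = simpson(lambda p: p * like(p), p0, p1) / z

    def cdf(x):
        return simpson(like, p0, x) / z

    def quantile(q):
        lo, hi = p0, p1
        for _ in range(50):                    # bisection on the CDF
            mid = 0.5 * (lo + hi)
            if cdf(mid) < q:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    return mean, quantile(0.025), quantile(0.975)

mean, lo, hi = truncated_posterior(s=11, f=7, p0=0.286)
print(round(mean, 3), round(lo, 3), round(hi, 3))
```

    This should land close to Steve's reported figures: a mean near 0.601 and an interval near (0.387, 0.798).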

    He added:

    Hi, I just wanted to add a couple of comments:

    The reason I used both the training and the validation data in estimating confidence limits was because the validation data show a better fit to the model than the training data; so, it seemed more fair to use both data sets for these calculations.

    I did some rough calculations to estimate the increase in the error bars that might result if I were to take the serial correlations into account. For instance, I think that the lower limit for the probability of successfully predicting an El Nino with Ludescher’s method is actually closer to 0.366, rather than the 0.387 reported above, and the upper limit would increase from 0.798 to 0.815. Since my ideas for this adjustment are only half-baked, I won’t go into the details here.

    I included a few of these figures in my talk (http://math.ucr.edu/home/baez/climate_networks/); overall the talk is highly critical of Ludescher et al's paper, but concludes that there seems to be an interesting idea lurking in the vicinity.
