Dara - no need to apologize; let's all just try to be nice to each other.

I hadn't heard about the weather in Ireland. Here in California we've been in a severe drought for several years, and farmers are letting orchards die due to lack of water. The drought seems to be caused by unusual high-pressure systems that block storm systems. Nobody is sure whether this is caused by global warming, though [some people have theories](http://www.latimes.com/science/sciencenow/la-sci-sn-climate-change-california-drought-20140929-story.html).

> it is my wish to have more guidelines from you to continue this most important work, solely focused on computations on actual data

I'd love someone to do the following thing by December 1st:

**Use some clearly defined, standard machine learning methods to predict the Niño 3.4 index from the average link strength at times $\ge m$ months earlier, and see how good these predictions are.**

I have so far failed to get anyone to help me with this. It is not a completely well-defined task, but if someone asks me specific questions about what I want, I can answer them, to make it well-defined.

To begin with:

The **Niño 3.4 index** is one number per month. The data is available [here](https://github.com/johncarlosbaez/el-nino/blob/master/R/nino3.4-anoms.txt). This is data from the Climate Prediction Center of the National Weather Service, from January 1950 to May 2014. The Niño 3.4 index is in the column called ANOM.

The **average link strength** is a quantity we've computed and put [here](https://github.com/johncarlosbaez/el-nino/blob/master/R/average-link-strength.txt). This file has the average link strength, called $S$, at 10-day intervals starting from day 730 and going until day 12040, where day 1 is the first of January 1948. (For an explanation of how this was computed, see [Part 4](http://johncarlosbaez.wordpress.com/2014/07/08/el-nino-project-part-4/) of the El Niño Project series.)
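For anyone who wants to get started, here is a minimal Python sketch for reading the two files once they have been saved as plain text. It rests on assumptions not stated above: that the Niño 3.4 file is whitespace-separated with a header row naming columns `YR`, `MON` and `ANOM` (only `ANOM` is confirmed above), and that the link-strength file has two whitespace-separated columns per line, the day number and then $S$. The function names are made up for illustration.

```python
def read_nino34(text):
    """Return {(year, month): anomaly} from the Niño 3.4 text.

    Assumes a header row containing the column names YR, MON and ANOM.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header = rows[0]
    yr, mon, anom = header.index("YR"), header.index("MON"), header.index("ANOM")
    return {(int(r[yr]), int(r[mon])): float(r[anom]) for r in rows[1:]}


def read_link_strength(text):
    """Return {day_number: S} from the average link strength text.

    Assumes two columns per line: day number, then S.
    """
    strengths = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        try:
            strengths[int(parts[0])] = float(parts[1])
        except ValueError:
            continue  # skip a header or comment line, if there is one
    return strengths
```

If the real files are laid out differently, only these two functions should need to change.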

The first goal is to **predict the Niño 3.4 index for any given month starting from the average link strengths during some roughly fixed-length interval of time ending around $m$ months before the given month**.

The second goal is to **measure how good the prediction is, and see how it gets worse as we increase $m$**.
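As a baseline for both goals, here is a sketch using ordinary least squares with a single feature per month (for instance, the mean of $S$ over the chosen interval). This is just one standard method picked for illustration, not the one that has to be used. The function names and the 70/30 split are assumptions. The split is chronological, so the fit only ever sees months earlier than the ones it is scored on.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # raises ZeroDivisionError if all xs are equal
    return slope, mean_y - slope * mean_x


def holdout_mse(xs, ys, train_fraction=0.7):
    """Fit on the earliest months, return mean squared error on the rest.

    xs[i] is the feature built from link strengths for month i,
    ys[i] is the observed Niño 3.4 index for month i, in time order.
    """
    cut = int(len(xs) * train_fraction)
    if cut < 2 or cut >= len(xs):
        raise ValueError("need at least 2 training months and 1 test month")
    slope, intercept = fit_line(xs[:cut], ys[:cut])
    errors = [(slope * x + intercept - y) ** 2
              for x, y in zip(xs[cut:], ys[cut:])]
    return sum(errors) / len(errors)
```

Calling `holdout_mse` once per value of $m$, with `xs` rebuilt for each lag, gives a curve of error against $m$.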

(Note: There is a certain technical annoyance here involving "months" versus "10-day intervals"! If every month contained exactly 30 days, life would be sweet. But no. Here's one idea on how to deal with this:

We predict the Niño 3.4 index for a given month given the values of $S$ over the $k$ 10-day periods whose final one occurs at the end of the month $m$ months before the given month. For example, if $m = 1$, $k = 3$ and the given month is February, we must predict the February Niño 3.4 index given the 3 average link strengths at 10-day periods ending at the end of January.

Ignore these details if you're just trying to get the basic idea. If you can think of a less annoying way to handle this annoying issue, it could be fine.)
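One way to make this bookkeeping concrete is sketched below in Python. It assumes that the value of $S$ labelled with day $d$ counts as the 10-day period ending on day $d$, and that "ending at the end of the month" means the last such day not later than the month's final day. Both are guesses, and the function names are hypothetical.

```python
from datetime import date

DAY_ONE = date(1948, 1, 1)                   # day 1 in the link-strength file
FIRST_DAY, LAST_DAY, STEP = 730, 12040, 10   # sampled days: 730, 740, ..., 12040


def day_number(d):
    """Day count with 1 January 1948 as day 1."""
    return (d - DAY_ONE).days + 1


def end_of_lagged_month(year, month, m):
    """Day number of the last day of the month m months before (year, month)."""
    # Index of the month that follows the lagged month, counted from year 0.
    idx = year * 12 + (month - 1) - (m - 1)
    y, mo = divmod(idx, 12)
    return day_number(date(y, mo + 1, 1)) - 1


def window_days(year, month, m, k):
    """The k sampled days whose last one is the final sample in the lagged month.

    Returns None when the file does not cover the whole window.
    """
    end = end_of_lagged_month(year, month, m)
    if end < FIRST_DAY or end > LAST_DAY + STEP - 1:
        return None
    last = FIRST_DAY + ((end - FIRST_DAY) // STEP) * STEP
    first = last - STEP * (k - 1)
    if first < FIRST_DAY:
        return None
    return list(range(first, last + 1, STEP))
```

With $m = 1$, $k = 3$ and February 1950, `window_days(1950, 2, 1, 3)` picks days 740, 750 and 760, since 31 January 1950 is day 762.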

Roughly speaking, I'd like predictions that minimize the time average of

$|\text{predicted Niño 3.4 index} - \text{observed Niño 3.4 index}|^2$
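
To spell out "time average", here is one possible formalization. The symbols are introduced only for illustration: $N(t)$ is the observed Niño 3.4 index in month $t$, $\hat{N}_m(t)$ is the prediction for month $t$ made from average link strengths ending about $m$ months earlier, and $T$ is the number of months for which such a prediction exists. Then the quantity to minimize is

$$ E(m) = \frac{1}{T} \sum_{t=1}^{T} \left| \hat{N}_m(t) - N(t) \right|^2 $$

and the second goal amounts to seeing how $E(m)$ grows as $m$ increases.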