David Tanzer wrote:

> Can anyone _boil down_ some of the main lines of these investigations that we are undertaking, or considering?

I'm really glad you've started this thread, since right now I'm running around like a headless chicken: teaching a course on network theory and a course on real analysis, directing 4 grad students, putting together some grant proposals, and trying to get my NIPS talk ready... without knowing exactly what it will be about!

You, more than most people in the Azimuth gang, are good at organization - by which I don't mean bossing people around, but simply talking about the big picture and our goals, and thinking about how we can accomplish something where the whole is more than the sum of its parts.

Here is one strand of investigation:

* One of the main things Graham has already done is begin to simplify the work of Ludescher _et al_, stripping it of complexity without robbing it of predictive power.

* One of the main things I'd like to do is take a more investigative approach. Ludescher _et al_ published a paper that basically claims climate networks are good at El Niño prediction. This is a good way to get newspapers to pay attention, but instead of trying to "beat the competition" and predict El Niños better than the last guy, I'd really like to _understand_ climate networks and _figure out ways to measure_ how good they are at predicting El Niños.

* David Tweed has emphasized that instead of treating an El Niño as a binary on-off condition as Ludescher _et al_ did, it's wiser to try to predict a continuously variable quantity. There are some great candidates: the Niño 3.4 index, and its time-averaged versions.

* I like the idea of using fairly simple machine learning procedures to study "how well X can predict Y", where X might be something like the "average link strength" in a climate network, and Y might be something like the Niño 3.4 index. This would be simplest if X is just a single time series, or a few. Then it's up to us to ponder which time series to use! Using the "average link strength" amounts to making a hypothesis about what's important for El Niño prediction. An alternative approach is to let X be a huge pile of time series, like the temperatures at hundreds of grid points. Then it's up to the machine learning algorithm to formulate its own hypotheses. As an old-fashioned scientist, I sort of like the idea of formulating hypotheses and testing them myself. But the two approaches are not mutually exclusive! They could go well together. (There's a minimal sketch of this "X predicts Y" setup after this list.)

* Very important is this: I _don't_ think the goal here is to become the world's experts on El Niño prediction. I think the goal is to have new ideas about climate networks, prediction, machine learning and other quite general things.
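
To make the "how well X can predict Y" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the data are synthetic stand-ins for a predictor X (playing the role of something like average link strength) and a target Y (playing the role of the Niño 3.4 index), and the "machine learning" is the simplest thing imaginable, lagged linear regression with an out-of-sample test:

    # Minimal sketch: measure how well a single predictor series x can
    # predict a target series y at a fixed lead time, using plain linear
    # regression and a chronological train/test split.  All data here are
    # synthetic stand-ins; x plays the role of something like "average
    # link strength" and y the role of the Niño 3.4 index.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)

    # A quasi-periodic "Niño-like" target and a noisy predictor observed
    # alongside it.  (Real data would be loaded here instead.)
    n_months = 600
    y = np.sin(2 * np.pi * np.arange(n_months) / 48) \
        + 0.3 * rng.standard_normal(n_months)
    x = y + 0.3 * rng.standard_normal(n_months)

    lead = 6  # try to predict y this many months in advance

    # Pair x at time t with y at time t + lead.
    X = x[:-lead].reshape(-1, 1)
    target = y[lead:]

    # Chronological split -- shuffling would leak the future into training.
    split = len(target) // 2
    model = LinearRegression().fit(X[:split], target[:split])

    # One simple measure of skill: out-of-sample correlation between the
    # predictions and what actually happened.
    pred = model.predict(X[split:])
    skill = np.corrcoef(pred, target[split:])[0, 1]
    print(f"out-of-sample correlation at {lead}-month lead: {skill:.2f}")

The point isn't the regression, which is deliberately as dumb as possible; it's the setup: a clearly stated X, a clearly stated Y, a lead time, and a skill score computed only on data the model never saw during fitting.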
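And to make "average link strength" a bit more concrete, here's a crude version of the kind of quantity that phrase refers to. To be clear, this is _not_ the actual Ludescher _et al_ construction - they use time-delayed cross-covariances normalized in a particular way - it's just the simplest correlation-based cousin, run on made-up data:

    # Rough sketch of the kind of quantity "average link strength" means:
    # for each pair of grid points, measure how strongly their temperature
    # anomalies are correlated, then average over all distinct pairs.
    # This is a simplification, not the Ludescher et al construction.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical data: daily temperature anomalies at a few grid points.
    n_points, n_days = 10, 365
    temps = rng.standard_normal((n_points, n_days))

    # Correlation matrix between all grid points.
    corr = np.corrcoef(temps)

    # Average absolute correlation over distinct pairs: one crude notion
    # of the network's average link strength.
    iu = np.triu_indices(n_points, k=1)
    avg_link_strength = np.abs(corr[iu]).mean()
    print(f"average link strength: {avg_link_strength:.3f}")

Computing this once per sliding time window turns the scalar into a time series - exactly the kind of X the previous sketch could be fed.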