
# Crunch time

201.
edited November 2014

Nad wrote:

> How is Nino3.4 computed from the temperatures?

WebHubTel answered:

> As an average.

Right. The precise details are here:

* [ENSO - Niño 3.4 and SOI](http://www.azimuthproject.org/azimuth/show/ENSO#Nino3.4), Azimuth Library.
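To make "as an average" a bit more concrete: the Niño 3.4 index is an area average of sea surface temperature anomalies over the box 5°S to 5°N, 170°W to 120°W. Here is a minimal sketch, assuming a hypothetical list of `(lat, lon, anomaly)` grid points and a cosine-of-latitude weighting; the function name, data layout and weighting are choices made for this example only, and the linked Azimuth Library page is the place to look for the actual definition.

```python
import math

# Niño 3.4 box: 5°S to 5°N, 170°W to 120°W (190°E to 240°E).
LAT_MIN, LAT_MAX = -5.0, 5.0
LON_MIN, LON_MAX = 190.0, 240.0


def nino34_average(points):
    """Cosine-latitude-weighted mean anomaly over the Niño 3.4 box.

    points: iterable of (lat_deg, lon_deg, anomaly) tuples. This layout
    is hypothetical, chosen only for illustration.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, anomaly in points:
        lon_east = lon % 360.0  # map e.g. -150 (150°W) to 210°E
        if LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon_east <= LON_MAX:
            weight = math.cos(math.radians(lat))  # assumed area weighting
            weighted_sum += weight * anomaly
            weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no grid points fall inside the Niño 3.4 box")
    return weighted_sum / weight_total


# Two points inside the box and one far outside it (made-up values).
sample = [(0.0, 200.0, 1.0), (0.0, -150.0, 3.0), (20.0, 200.0, 100.0)]
print(nino34_average(sample))
```

The point outside the box is ignored, so only the two equatorial values contribute to the average.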
202.
edited November 2014

> As an average.

Thanks Paul. Looking at that wind graphic, I suddenly thought that the wind speeds might also have entered, but now I found another [graphic](http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/enso_update/ssta_c.gif) which says that the indices are sea surface temperature anomalies... at least if I interpret the abbreviation SST correctly.

In particular I wrote:

> about the strange temperatures graphics

which is actually not what the graphic is supposed to show, but rather what I suspected it might show. That is, the data mentioned in #163 is actually not about temperatures but about so-called wind indices. I also don't know how these indices are computed from the actual wind speeds; in particular I haven't found anything on those in the [frequently asked questions file](http://www.cpc.ncep.noaa.gov/data/indices/Readme.index.shtml) next to the data and the graphic.
203.
edited November 2014

Yay! Blake computed Ludescher's "average link strength" on a _daily_ basis, and I put the data here:

* [https://github.com/azimuth-project/el-nino/blob/master/average-link-strength-daily.txt](https://github.com/azimuth-project/el-nino/blob/master/average-link-strength-daily.txt)

The second column in this file lists the average link strengths S as computed by Blake using a modified version of ludescher.R at daily intervals, starting from day 730, and going until day 24090, where day 1 is 1 January 1948. The first column numbers these items from 730 to 24090. For an explanation see Part 4 of the El Niño Project series.
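A note on reading the day numbers: 730 = 2 × 365 and 24090 = 66 × 365, which suggests that leap days are dropped and every year counts 365 days. The post does not say this, so treat it as an assumption. Under that assumption, a day number can be converted to a year and day-of-year roughly like this (the function name is made up for the example):

```python
def day_to_year_and_doy(day, start_year=1948, days_per_year=365):
    """Convert a 1-based day number to (year, day_of_year).

    Assumes every year has exactly 365 days (leap days dropped), which is
    an inference from 24090 == 66 * 365, not something stated in the post.
    """
    if day < 1:
        raise ValueError("day numbers start at 1")
    year = start_year + (day - 1) // days_per_year
    day_of_year = (day - 1) % days_per_year + 1
    return year, day_of_year


print(day_to_year_and_doy(1))      # (1948, 1)
print(day_to_year_and_doy(730))    # (1949, 365)
print(day_to_year_and_doy(24090))  # (2013, 365)
```

So, if the assumption holds, the file starts on the last day of 1949 and ends on the last day of 2013.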
204.
edited November 2014

Blake also computed the average link strengths on a _monthly_ basis, but he would like someone to check his work, e.g. by comparing it with Daniel's existing estimates of monthly average link strengths, or by recomputing it from the daily data.

I put his work here:

* [https://github.com/azimuth-project/el-nino/blob/master/average-link-strength-monthly.txt](https://github.com/azimuth-project/el-nino/blob/master/average-link-strength-monthly.txt)

The second column in this file lists the average link strengths S as computed by Blake using a modified version of ludescher.R at monthly intervals, starting from January 1950 and going until December 2013. The first column numbers these items from 1 to 768. For an explanation see [Part 4](http://johncarlosbaez.wordpress.com/2014/07/08/el-nino-project-part-4/) of the El Niño Project series.
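One possible way to do the "recompute from the daily data" check is to average the daily values within each calendar month. This is only a sketch: it assumes 365-day years with no leap days (inferred from the day range of the daily file, not stated anywhere), and it assumes a monthly value is a plain mean of the daily ones, which may not be how Blake's monthly file was produced. All names here are hypothetical.

```python
from collections import defaultdict

# Month lengths for a 365-day year (assumption: leap days are dropped).
MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def day_to_year_month(day, start_year=1948):
    """Map a 1-based day number (day 1 = 1 January 1948) to (year, month)."""
    year = start_year + (day - 1) // 365
    day_of_year = (day - 1) % 365 + 1
    for month, length in enumerate(MONTH_LENGTHS, start=1):
        if day_of_year <= length:
            return year, month
        day_of_year -= length
    raise AssertionError("unreachable: month lengths sum to 365")


def monthly_means(daily):
    """Average (day, S) pairs into a {(year, month): mean S} dict."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, s in daily:
        key = day_to_year_month(day)
        sums[key] += s
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}


# Made-up values: two days in January 1950 and one in February 1950.
example = [(731, 1.0), (732, 3.0), (762, 5.0)]
print(monthly_means(example))
```

Applied to the real daily file, the resulting 768 monthly means could then be compared row by row with the monthly file.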
205.

Nad wrote:

> Now I found another graphic which says that the indices are sea surface temperature anomalies…at least if I interpret the abbreviation SST correctly.

I gave a link to the precise definition of the Niño 3.4 index in comment 200. Now I just want to remind you (and everyone) of a non-obvious fact: "SST" means "sea surface temperature", but this means the temperature of the _**air**_ slightly above the sea surface.

(Apparently this is often very close to the temperature of the water, but anyway, it's the air temperature!)
206.

Nad - our comments passed each other, but in comment 200 I pointed you to the definition of the Niño 3.4 index:

* [ENSO - Niño 3.4 and SOI](http://www.azimuthproject.org/azimuth/show/ENSO#Nino3.4), Azimuth Library.

Yes, SST means sea surface temperature.
207.
edited November 2014

> Nad - our comments passed each other, but in comment 200 I pointed you to the definition of Niño 3.4 index:

Yes, sorry. I know I should have looked it up again instead of asking; I guess that happened because I sort of wanted to get any reaction at all.

What's with the [blog post about the temperature data](http://forum.azimuthproject.org/discussion/1501/how-good-is-climate-science-temperature-data/?Focus=13056#Comment_13056)?
208.

The monthly figures linked to from #203 look very similar to ones I calculated by interpolating the ten day figures using the code from #164. So we both made the same mistakes, if any.

Mine:

~~~~
"Year","Month","S"
1950,1,2.70178689291113
1950,2,2.60221854718809
1950,3,2.53372837232994
1950,4,2.49447059052603
1950,5,2.52343325029961
...
2013,7,2.73474337185331
2013,8,2.82223182873403
2013,9,2.87405506804477
2013,10,2.93929845827353
2013,11,2.98576616712402
2013,12,3.024712642373
~~~~

Blake's:

~~~~
"month" "S"
"1" 2.70389999886587
"2" 2.60019196693804
"3" 2.53369933967342
"4" 2.4932572592701
"5" 2.52305342960841
...
"763" 2.7349204284955
"764" 2.82233374599453
"765" 2.87416278205779
"766" 2.93793110581363
"767" 2.98613238800691
"768" 3.02519852113451
~~~~
209.

For #206, I would say the two analyses are identical for all intents and purposes.

210.
edited November 2014

Great! If the two ways of computing monthly link strengths differ by at most about 0.002, as they seem to here, there's no point in Daniel redoing any of his calculations.

But, it was worthwhile for Blake to do this crosscheck!
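The "about 0.002" figure can be checked against the eleven rows quoted in comment 208. This small sketch uses only those quoted rows (first five and last six months), not the full 768-row files, so it says nothing about the months in between; the variable names are made up for the example.

```python
# Values copied from the two excerpts in comment 208:
# January-May 1950 and July-December 2013.
interpolated = [
    2.70178689291113, 2.60221854718809, 2.53372837232994,
    2.49447059052603, 2.52343325029961,
    2.73474337185331, 2.82223182873403, 2.87405506804477,
    2.93929845827353, 2.98576616712402, 3.024712642373,
]
blake = [
    2.70389999886587, 2.60019196693804, 2.53369933967342,
    2.4932572592701, 2.52305342960841,
    2.7349204284955, 2.82233374599453, 2.87416278205779,
    2.93793110581363, 2.98613238800691, 3.02519852113451,
]

# Absolute difference for each quoted month, and the largest of them.
differences = [abs(a - b) for a, b in zip(interpolated, blake)]
max_difference = max(differences)

print(round(max_difference, 4))  # 0.0021
```

The largest gap among the quoted rows is in January 1950, and it is tiny compared with values between 2.5 and 3.0.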

211.
edited November 2014

Nad wrote:

> what's with the [blog post about the temperature data](http://forum.azimuthproject.org/discussion/1501/how-good-is-climate-science-temperature-data/?Focus=13056#Comment_13056)?

Thanks for putting that article on the wiki! I hadn't even noticed it, because I'm busy preparing this talk for Dec. 10th.

I edited your article a bit just now. I will get it ready to post shortly after Dec. 10th, or perhaps even sooner.
212.
edited November 2014

Daniel - what are the time intervals between the lines in this plot of the correlation between average link strength and Niño 3.4 index?

<img width = "500" src = "http://math.ucr.edu/home/baez/climate_networks/mahler_link_strength_nino3.4_correlation.png" alt = ""/>

Also: what time does the maximum occur?

This appears below In [24] in [your notebook](https://5619417f7fb3a489ed01c7f329cbd1e9b70a10d6-www.googledrive.com/host/0B4cyIPgV_VxrX0lxSUxHU2VLN28/link-anom.html). If I could read Python, I could probably figure it out. I figure the spacing between lines should either be 10 days or one month, but it makes a huge difference which one it is! 10 days seems more likely....
213.

I have posted a draft of my talk here:

* [Networks in Climate Science](http://johncarlosbaez.wordpress.com/2014/11/29/climate-networks/), Azimuth, 29 November 2014.

The main thing it's missing is a summary of Dara Shayda's results. I want to get comments as soon as possible!

I can't make the talk much longer, so as usual the comments I really need are not ones that suggest more material but ones that improve the clarity and effectiveness of what I'm saying... perhaps by omitting distracting irrelevant material.

(For example, I will probably omit the names of other teleconnections when I give the actual talk.)
214.

re #210

> Also: what time does the maximum occur?

The spacing is 1 month. The maximum correlation between link strength and the anomaly is at 10 months, but by then the correlation with the current anomaly is 0, so the overall predictability at 10 months will be lower, since it will come only from the link strength, whose correlation is already close to its maximum at 6 months. The peak cross correlation of the combined regression model is actually at 0, but it is skewed towards the future, so the cross correlation does not decay as fast as for the anomaly alone. The anomaly is still the dominant effect and the link strength just modifies it.

This suggests it might be worthwhile to look at predicting the change in anomaly over the next six months rather than predicting the anomaly itself. It is possible that the link strength predicts the delta while the current anomaly provides the baseline through inertia. This would be consistent with the fact that the ExtraRandomRegressor model based on the past 6 months of anomaly values got an $R^2$ of .32, but using the past 6 months of combined anomaly and link strength did only marginally better. This suggests that both the anomaly history and the link strength provide information about the trajectory of the anomaly. Also, adding more than 6 months of anomaly history did not improve the predictions.
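For readers who, like John, don't read the notebook's Python easily, here is a minimal sketch of what "correlation at a lag of k months" means. It is not the notebook's code: the function names are made up, and the data is synthetic (a random series and a copy of it delayed by 3 steps), chosen so that the best lag is known in advance.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(predictor, target, lag):
    """Correlation between predictor[t] and target[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(predictor, target)
    return pearson(predictor[:-lag], target[lag:])


def best_lag(predictor, target, max_lag):
    """Lag in 0..max_lag at which the lagged correlation is largest."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(predictor, target, lag))


# Synthetic stand-ins: 'signal' plays the role of link strength and
# 'delayed' the role of the anomaly, which follows it 3 steps later.
rng = random.Random(0)
signal = [rng.gauss(0.0, 1.0) for _ in range(300)]
delay = 3
delayed = [0.0] * delay + signal[:-delay]

print(best_lag(signal, delayed, max_lag=12))  # 3
```

With monthly series of link strength and Niño 3.4 anomaly in place of the synthetic lists, each line in a plot like the one above would correspond to one value of `lag`.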
215.
edited November 2014

Aha, thanks to [John's talk summary](http://johncarlosbaez.wordpress.com/2014/11/29/climate-networks/) I now realized that this [mysterious STD DEV](http://forum.azimuthproject.org/discussion/1523/crunch-time/?Focus=13710#Comment_13710) (strictly speaking it is called ST DEV here) also appears for the Tahiti Darwin SOI - however, again with [cryptic explanations](http://www.cgd.ucar.edu/cas/catalog/climind/soiAnnual.html).
216.

> The main thing it’s missing is a summary of Dara Shayda’s results.

John, let me know what you need so I can jot something down.
217.

The convention used is that anomalies are normalized against their RMS value, which is obtained by computing the standard deviation over the sample. Then one can get a feel for how big the outliers are with respect to the other values in the sample. So any time one sees a significant outlier like a strong El Niño, it will be at least a few standard deviations in value. Other than for that use, the unit is meaningless and you might as well call it A.U. (arbitrary units).
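In code, the normalization described above amounts to subtracting the sample mean and dividing by the sample standard deviation. This is a sketch with made-up numbers; whether a given index uses the population or the sample standard deviation, and over which base period, is not stated here, so the population form below is an assumption.

```python
import math


def standardize(values):
    """Return values in units of standard deviations from their mean.

    Uses the population standard deviation (divide by n); that choice is
    an assumption made for this example.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        raise ValueError("all values are equal; cannot standardize")
    return [(v - mean) / std for v in values]


# Made-up sample: mostly small fluctuations plus one large excursion.
sample = [0.1, -0.2, 0.0, 0.3, -0.1, 2.5, -0.3, 0.2, -0.2, 0.1]
standardized = standardize(sample)
print([round(z, 2) for z in standardized])
```

After this step the series has mean 0 and standard deviation 1, so the large excursion stands out as a value of several standard deviations while the rest stay well below 1.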
218.
edited December 2014

John, I know you do not want more stuff, but ...

I ran sparsity-regularized linear models against the data and got performance comparable to the extra random trees from a model that uses only 5 pressure values. One nice aspect is that the degree of regularization was determined by cross validation on the training set, with no manual tuning of the hyperparameters.

More interesting are the locations and coefficients of the pressure values used (map at the very bottom of the [notebook](https://www.googledrive.com/host/0B4cyIPgV_VxrX0lxSUxHU2VLN28/sl-pressure-anom-predict-omp.html)): two negatively weighted points in each of the north and south Horse Latitudes, and one positively weighted point in the Indonesian region, between Borneo and Java.

This means the model predicts high anomaly values when there is increased pressure in the west pushing back against the trade winds, and the subtropical high pressure ridges are weakened or moved, leading to a reduced driving force behind the trade winds. This is in line with the Walker Circulation mechanism discussed by others here [before](http://forum.azimuthproject.org/discussion/1360/paper-ludescher-et-al-improved-el-nino-forecasting-by-cooperativity-detection/?Focus=10890#Comment_10890).

However, lots of theories are consistent with just 5 points. I tweaked the models to be less sparse to see what pattern emerges. More extensive negatively correlated regions emerge in the Horse Latitudes and the Eastern Pacific, and new positively correlated regions emerge in the Central Pacific and Indian Ocean. Here are [some](https://www.googledrive.com/host/0B4cyIPgV_VxrX0lxSUxHU2VLN28/sl-pressure-anom-predict-bag-omp.html) [more](https://www.googledrive.com/host/0B4cyIPgV_VxrX0lxSUxHU2VLN28/sl-pressure-anom-predict-elnet0_2.html) [notebooks](https://www.googledrive.com/host/0B4cyIPgV_VxrX0lxSUxHU2VLN28/sl-pressure-anom-predict-elnet0_1.html). The most interesting part is the BW map at the bottom of each notebook. The lighter regions are negatively correlated and the darker regions are positively correlated.
219.

Posted this on the blog:

John, I think that you should try to cobble together a more positive message in the introduction to the talk.

Take a few steps back, and consider the question of why this material — at a very general level — could potentially be of interest to (1) you, and (2) the audience at NIPS. What would be the abstract for this talk?

Here are some possible ingredients:

– New area of application for network theory

– New area of application for machine learning

– Application area represents a pressing human concern

– Azimuth project is searching for ways that mathematicians, scientists and programmers can contribute to the understanding of significant environmental problems

– Made a decision to investigate a more concrete problem

– In this talk, I will begin by giving background and context on the El Nino phenomenon and its physics; then discuss climate network structures that have been posited as indicators for the occurrence of El Nino events; then proceed to evaluate a specific paper which uses this framework, and makes specific testable hypotheses about the preconditions for the occurrence of an El Nino event.

I would also suggest a section that talks about the role of machine learning in this study.

Good Luck!

220.

On the blog, John wrote:

> Preliminary throat-clearing
>
> I’m very flattered to be invited to speak here. I was probably invited because of my abstract mathematical work on networks and category theory. But when I got the invitation, instead of talking about something I understood, I thought I’d learn about something a bit more practical and talk about that. That was a bad idea. But I’ll try to make the best of it.

It's disarmingly honest, but I don't think that starting with this is the best way to make the best of it.

There was a reason why we were drawn to this subject, and we performed some preliminary explorations, for the cause of the Azimuth project. Now, having been through this, what are your reflections on this subject, and its prospects for further research? Imagine you were drafting a research agenda for networks in climate science. Where, if at all, would you fit the work that you have reviewed into this agenda?

I think that at the end of your conclusion, you should add some statements that summarize your perspective as a scientist on further avenues to pursue for the application of climate network theory.
221.
edited December 2014

Can anyone here who knows about machine learning, and the approach of Ludescher et al., give a little blurb about the role of machine learning _per se_ in this kind of research?

This could give John some ideas for his talk.

Time is of the essence here.

(Sorry, I would have given these comments earlier, had the words come to me then.)
222.
edited December 2014

Thanks, David. I probably won't be as disarmingly honest and negative in my actual talk as I was in the blog version - maybe I just needed to get it out of my system. I've given about 200 talks in my life, and I've rarely felt so poorly in control of the material I'm presenting. For this talk I had to learn about El Niños, "complex network theory" (which is different than my network theory), and smidgens of programming in R, statistics and machine learning - a huge range of new stuff. I feel like a raw newbie in all these fields.

Thanks for your suggestions. Most of them are good. But I don't like the idea in comment 219. There's no way I'm going to say anything really interesting about machine learning to this crowd of experts on machine learning. If I try to parrot a short blurb, it will probably come off sounding wrong, and I'll just set myself up for questions that puncture my paper-thin veneer of knowledge. I haven't even had time to learn about the 9 different statistical approaches to El Niño prediction listed here:

<img src = "http://math.ucr.edu/home/baez/climate_networks/2014-11-20-Nino34-predictions.jpg" alt = ""/>

On the other hand, there is enough material in my talk that I'll be hard-pressed to cover it in 50 minutes... I gave a version earlier this week in my seminar, and it took 90 minutes! So I don't really need more material. I just need to frame it better, zip through it, and make it sound exciting. But I'm pretty good at that; that's why people invite me to give lots of talks.

One thing I want to do is get the audience - experts on machine learning - to try their hand at El Niño prediction. So my actual strategy for disarming them - a bit different than in the blog article - will be to flatter them and say the world, and the Azimuth Project, needs their help.
223.

By the way, I hope to give more talks about roughly similar stuff after we do more work on it. I think in a year or two we could do something really interesting, like developing methods to predict El Niños and/or rate existing El Niño prediction methods. That's if people here are interested in this, of course!

I don't think the "climate network" stuff should be the central focus of further work if it's El Niños we're interested in. It suggests some interesting ideas but those ideas will probably wind up looking rather different by the time they've matured.

224.
edited December 2014

John wrote:

> But I don’t like the idea in comment 219. There’s no way I’m going to say anything really interesting about machine learning to this crowd of experts on machine learning. If I try to parrot a short blurb, it will probably come off sounding wrong, and I’ll just set myself up for questions that puncture my paper-thin veneer of knowledge.

I see your point.

Even if you didn't use it, I'd still be interested to read such a blurb, if anyone wants to take a shot at it.

I'd like to be convinced of the meaningfulness of machine learning.

Here is a quote from the article [Peter Norvig on Google's mistrust of machine learning](http://www.zdnet.com/article/peter-norvig-on-googles-mistrust-of-machine-learning/):

> So why isn't Google using this machine learning model for their search engine then? Well, Peter suggests that there are two reasons. The first is that those engineers who hand made the current algorithm don't think a machine could do better. The second, as Anand says, is more interesting. Google worries that machine-learned models may suffer "catastrophic errors on searches that look very different from the training data".

But, because I know so little about machine learning, I retain an open mind -- my skeptical instincts could be completely wrong!!
• Options
225.

So why isn’t Google using this machine learning model for their search engine then?

The price of their stock and their massive IPO were based on the promise that there is a secret sauce, i.e. a handmade algorithm that does much better than any other algorithm. Do you think they would risk their franchise by admitting that machine learning algorithms could produce much better results?

226.
edited December 2014

David wrote:

I’d like to be convinced of the meaningfulness of machine learning.

Are you convinced of the meaningfulness of animal learning? Say, human learning?

For all these things, it works when it works and it doesn't when it doesn't, and it's hard to say exactly when it does and when it doesn't... but it's still useful.

Anyway, I won't spend any time trying to tell the people at this conference anything about machine learning - it's a huge conference of experts on the subject, and all I can do is learn. In my talk, it's more important to get a few people interested in the Azimuth Project. I'm glad you pushed me away from that extremely humble self-introduction. If it were just me giving this talk, I might try it. But I'm speaking for the Azimuth Project, and some of you have done a lot of work leading up to this talk, so I shouldn't make it sound bad. I don't think the work we've done is bad... I just feel I'd need to do a lot more work myself to become an expert on the subjects I'm talking about here!

227.

Hello John

I hope your talk was a smash :)

I would like to post the forecasts I made, as articles in GSJOURNAL.net, if ok with you. They allow me to post code in my writings.

Dara

228.
edited December 2014

Hi -

I'm giving my talk tomorrow morning at 9 am. I hope it's a smash too!

You can see the slides here:

* [Networks in Climate Science](http://math.ucr.edu/home/baez/climate_networks/), NIPS 2014.

The talk may also be videotaped; if it is, I'll tell the world.

229.

Rock on, as they say.

230.

One last-minute thought. It may be worth explaining exactly what the anomaly is. Seasonal variations would easily account for more than 22% or even 36% of the mean temperature variation, so people might think we are shooting fish in a barrel if they do not understand what it is a percentage of. Good luck!
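
For anyone following along, here is a minimal sketch of what "anomaly" means in this sense: each monthly temperature minus the long-term mean for that calendar month. The function name `monthly_anomalies`, the made-up data, and the choice to average over the whole series are assumptions for illustration only; the actual Niño 3.4 index uses whatever base period its definition specifies.

```python
from collections import defaultdict
from statistics import mean


def monthly_anomalies(temps, start_month=1):
    """Return temps minus the mean for each calendar month.

    temps: monthly mean temperatures in time order.
    start_month: calendar month (1-12) of the first value.
    Assumption: the climatology is averaged over the whole series.
    """
    months = [(start_month - 1 + i) % 12 for i in range(len(temps))]
    by_month = defaultdict(list)
    for m, t in zip(months, temps):
        by_month[m].append(t)
    climatology = {m: mean(vals) for m, vals in by_month.items()}
    return [t - climatology[m] for m, t in zip(months, temps)]


# Made-up data: a 6-degree seasonal swing, second year 0.5 degrees warmer.
seasonal = [24, 25, 26, 27, 28, 29, 30, 29, 28, 27, 26, 25]
temps = seasonal + [t + 0.5 for t in seasonal]
anoms = monthly_anomalies(temps)

print(max(temps) - min(temps))  # spread of the raw temperatures
print(max(anoms) - min(anoms))  # spread left once the seasonal cycle is removed
```

In this toy series almost all of the raw spread is the seasonal cycle, and the anomalies keep only the year-to-year difference, which is the part any percentage of "explained variation" would refer to.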

231.
edited December 2014

Thanks, David and Daniel! I certainly do explain what a temperature anomaly is - it's on page 4 of [my slides](http://math.ucr.edu/home/baez/climate_networks/climate_networks.pdf), but I'll try to remind people a couple of times.

There are 2100 people at this conference - six hotels are completely sold out - and no other talks during mine. I'm giving my talk in a HUGE room that can actually hold all these people, so it will be quite exciting.

(I hope they don't all decide to stay in bed. Luckily the conference is serving breakfast on location - a clever incentive.)

232.

Best questions:

1) How well can you do predicting Niño 3.4 using the entire matrix of link strengths, not just the average link strength?

2) To what extent have people systematically checked that Niño 3.4 is the best quantity for predicting other El Niño-related quantities, e.g. ones that actually matter to farmers? There are a number of El Niño indices, but maybe machine learning could be used to look for optimal ones. (Optimal for different purposes.)

233.

1) How well can you do predicting Niño 3.4 using the entire matrix of link strengths, not just the average link strength?

There should be meaningful advantages to doing this, since we had computed a north-south trend for the temperature. My intuition tells me that the entire grid is far better than the averages.

the best quantity for predicting other El Niño-related quantities, e.g. ones that actually matter to farmers?

John I am very very interested in this

Dara

234.

Hello John

Could you find out whether anyone is using clustering algorithms on the climate data to create a network/graph model?

235.

1) How well can you do predicting Niño 3.4 using the entire matrix of link strengths, not just the average link strength?

Good one. I had tried a time window of average link strength and anomalies, but had not thought about their spatial distribution.

2) To what extent have people systematically checked that Niño 3.4 is the best quantity for predicting other El Niño-related quantities, e.g. ones that actually matter to farmers? There are a number of El Niño indices, but maybe machine learning could be used to look for optimal ones. (Optimal for different purposes.)

That is a really excellent question. I have been thinking about that. I would like to build models that try to predict economic/agricultural statistics from the El Niño index, link strength, or raw NOAA data. I asked about available data sets on the Open Data Stack Exchange site:

+ [Historical monthly farm/agricultural data 1950 to the present](http://opendata.stackexchange.com/questions/4085/historical-monthly-farm-agricultural-data-1950-to-the-present)

and got pointers to the Food and Agriculture Organization of the United Nations - Statistics Division and to the USDA NASS site.

236.

but had not thought about their spatial distribution.

I suggest we drop the grid and instead form clusters and measure the link strength between them, because I am sure the planet does not have a grid on its surface.

237.

Dear John

Thank you for mentioning my name for my minor contribution and I am honored to be a part of your publication and research, and my note of gratitude for everyone else.

Dara

238.
edited December 2014

My intuition tells me that the entire grid is far better than the averages.

Sure, and you will most probably see even more if you look at different time delays. If someone does the computations, please make images.

239.

If someone should do the computations then please make images.

I did a bunch of them, but to see the results you need to run the Wolfram CDF Player on your machine, which is a free plugin.

Actually it is quite telling about the North-South+East-West oscillations.

D

240.

Yes, thanks John for naming us in the talk.

241.
edited December 2014

John wrote:

Are you convinced of the meaningfulness of animal learning? Say, human learning?

For all these things, it works when it works and it doesn’t when it doesn’t, and it’s hard to say exactly when it does and when it doesn’t… but it’s still useful.

Now that I think about it, what was confusing me was the terminology, not the actual practice of ML. In this branch of computer science, the term "learning" has become very diluted, to the point where it may not involve any substantive knowledge representation.

Here is a quote from Tom M. Mitchell in the Wikipedia article:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Ok, but note that a consequence of this definition is that a linear regression function learns from the data points, though what it produces as output is just a heuristic set of coefficients -- and these may be completely misguided if the real relationship is not linear.
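
To make that concrete, here is a small sketch in plain Python. The helper `fit_line` and the toy data are made up for illustration; it is just ordinary least squares for a single input variable, not anyone's actual forecasting code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs = [-3, -2, -1, 0, 1, 2, 3]

# Truly linear data (y = 2x + 1): the coefficients recover the relationship.
linear_fit = fit_line(xs, [2 * x + 1 for x in xs])

# Quadratic data (y = x^2): the fit still returns coefficients,
# but they describe a flat line through the mean of y.
quadratic_fit = fit_line(xs, [x * x for x in xs])

print(linear_fit)
print(quadratic_fit)
```

In both cases the procedure minimizes squared error, so it "learns" in Mitchell's sense. For the symmetric quadratic data, though, the best line has zero slope, which says nothing useful about how y depends on x.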

To me it makes more sense to think of learning as the process of building declarative models from data. This is no small feat. The weaker notion is the algorithmic optimization of parameters for a given model.

I think I'll just call it ML :)

242.

To me it makes more sense to think of learning as the process of building declarative models from data.

But it seems these may be completely misguided as well - especially if they need to be nonlinear.

243.

I remember someone back in the 1980s saying that there were two reasons why people prefer 'learning' to 'parameter estimation'. Firstly, it is shorter. Secondly, it looks much more impressive in grant proposals.
