# Experiments in El Niño analysis and prediction

51.
edited June 2014

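I have added some images of correlations and covariances to [the page](http://www.azimuthproject.org/azimuth/show/Experiments+in+El+Ni%C3%B1o+detection+and+prediction+).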

52.
edited June 2014

John,

> Has anyone here spent time doing this kind of data analysis?

I haven't, but plan to start playing around with EOF analysis this summer, as a warmup to constructing reduced stochastic models of climate variability.

53.
edited June 2014

Graham wrote:

> I get why you might call covariances “2-point functions” but I thought “Green’s functions” were something more sophisticated.

Green's functions began life as a way of solving partial differential equations: very roughly, if we take as our initial conditions at time $t$ a delta function at a point $\mathbf{x}$ in space, the solution of the equation at some other time $t'$ and place $\mathbf{x}'$ is the Green's function $G(t,\mathbf{x}; t', \mathbf{x}')$.
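As a concrete illustration (not from the original post), take the heat equation in $n$ space dimensions with a constant diffusivity $\kappa$, both of which are assumptions made here for the example. Starting from a delta function at $\mathbf{x}$ at time $t$, the solution at a later time $t'$ and place $\mathbf{x}'$ is the familiar spreading Gaussian:

$$
\partial_{t'} u = \kappa \nabla^2 u, \qquad
G(t,\mathbf{x}; t', \mathbf{x}') = \frac{1}{\left(4\pi\kappa\,(t'-t)\right)^{n/2}}
\exp\!\left(-\frac{|\mathbf{x}'-\mathbf{x}|^2}{4\kappa\,(t'-t)}\right), \quad t' > t .
$$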

However, Green's functions take on a new life in quantum field theory. In the vacuum, a quantum field fluctuates in a random sort of way, and its covariance or 2-point function is sometimes the Green's function for some partial differential equation! More precisely, this happens for quantum field theories arising from the quantization of linear partial differential equations. A good example would be the quantized version of Maxwell's equations for light, ignoring charged matter.

So, quantum field theorists treat Green's functions, 2-point functions, and covariances as closely allied things. They're all about how random fluctuations at one place and time are correlated to random fluctuations at another place and time. Imagine a rubber membrane with little random quantum elves hopping on it, and you'll get the idea. If one of them does a hop here, it sends out waves that make the membrane wiggle at other places later on.

Quantum field theory sounds far removed from climate physics, but all these ideas also apply to stochastic field theory, which is much closer.

Still, I'm just using these ideas as inspiration rather than positing some precise connection.

54.
edited June 2014

Graham wrote:

> David, I have been thinking about making it easier for others to join in too. It's great that you're prepared to learn R, but not everyone will want to, and I'd rather that people can get on with playing around in their favourite language.

I agree, this is best!

Thanks for posting the data export program -- I will be trying it out.

> I get the sense you want to do something more organised ("a library of functions") and perhaps that you want to implement the Ludescher algorithm exactly as they did.

Not necessarily, though reproducing parts of their computation seems like a good way for us to get our engines started. I'm thinking more in terms of the next small milestones.

A good, easy first milestone would be to have all of the data read into memory at once, in a coherent structure. A next logical step would be to compute the temperature anomalies, since this is the starting point for going in many directions. And computing the temperature anomalies, because it involves passing over all the data, would be a good check on the performance of our platforms for a single pass over the whole data set.

55.
edited June 2014

John, thanks for explaining the link between Green’s functions and 2-point functions. This thing: $G(t,\mathbf{x}; t', \mathbf{x}')$ jogs a memory.

I am just using what I know about textures as inspiration (in case that wasn't obvious).

56.

Terminology digression: I don't think it's worthwhile to have an extended debate about the terminology -- for now we can each use the words that make most sense to us, as long as others know what we mean -- but since it's come up I'll add my opinion into the mix.

I think the original term that Graham posted, temperature anomaly, is the most accurate. "Seasonal adjusted temperature" sounds nicer, but has some confusing connotations. It's not really about "adjusting" the temperature -- that connotes an additive or multiplicative correction factor -- but rather about the deviations from the per-day expected temperature. So my proposal is temperature deviations, which is a crisper description than temperature anomaly, and if not that, then the standard term temperature anomaly.

57.
edited June 2014

I have now added a [PDF](http://www.azimuthproject.org/azimuth/files/covs-near-equator.pdf) to the wiki page. It is along the lines of John's challenge. A couple of points to add to what's in the PDF:

• the graphs show the median of the covariances over the region.

• black means zero geographical displacement, and paler greys show displacements increasing by 2.5 degrees.

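I used something called Sweave, which allows you to mix LaTeX and R and, in theory, document exactly what you did to produce the results, thus achieving [reproducible research](http://en.wikipedia.org/wiki/Reproducibility#Reproducible_research). I am a newbie to Sweave: it shows.

One-day delay:

<a href = "http://www.azimuthproject.org/azimuth/files/covs-near-equator.pdf"> <img width = "400" src = "http://math.ucr.edu/home/baez/ecological/el_nino/covs-near-equator_Page_6.jpg" alt = ""/></a>

Five-day delay:

<a href = "http://www.azimuthproject.org/azimuth/files/covs-near-equator.pdf"> <img width = "400" src = "http://math.ucr.edu/home/baez/ecological/el_nino/covs-near-equator_Page_7.jpg" alt = ""/></a>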

58.
edited June 2014

While on the topic of temperature deviations (or what have you), I have a question about the basic methodology here.

Suppose that we have data for 1950 - 2000, and are computing the deviation for June 1, 1975. Suppose the temperature on that day is T. Let U be the average temperature on June 1, over all the years 1950 - 2000. Then, as we've defined it, the deviation for that day is T' = T - U.

I get instinctively uncomfortable with the idea that the deviation for a day in 1975 depends upon an average that contains data extending into the future. Can someone reassure me that this is not necessarily a methodological problem?
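The definition T' = T - U can be sketched in a few lines. This is only an illustration, not code anyone in the thread has posted: the `records` dictionary keyed by `(year, month, day)` and the function names are hypothetical, and the toy data are made up.

```python
from collections import defaultdict

def calendar_day_means(records):
    """Average temperature U for each calendar day, taken over all years.

    `records` maps (year, month, day) -> temperature (a hypothetical layout).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (year, month, day), temp in records.items():
        sums[(month, day)] += temp
        counts[(month, day)] += 1
    return {key: sums[key] / counts[key] for key in sums}

def anomalies(records):
    """Deviation T' = T - U, with U the all-years mean for that calendar day."""
    means = calendar_day_means(records)
    return {date: temp - means[(date[1], date[2])]
            for date, temp in records.items()}

# Toy data: June 1 temperatures for three years.
records = {(1974, 6, 1): 20.0, (1975, 6, 1): 23.0, (1976, 6, 1): 20.0}
print(anomalies(records)[(1975, 6, 1)])  # 23.0 - 21.0 = 2.0
```

Note that the 1975 value depends on the 1976 temperature through U, which is exactly the dependence being questioned here.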

59.

I don't want a great debate on what to call things either, but I didn't invent the term [seasonal adjustment](http://en.wikipedia.org/wiki/Seasonal_adjustment), whereas I think Ludescher et al did give a new meaning to "temperature anomaly". In one of the papers pointed to by Nathan, I saw the phrase "The climatological seasonal cycle is subtracted from the SST data", which we might find handy.

60.
edited June 2014

Moreover, consider the global warming that has taken place during this interval. Then the deviations at the beginning of the period would have a bias towards negative numbers, and the deviations at the end would have a positive bias. That doesn't sound appropriate -- especially for the goal of computing covariances.

Perhaps we should be taking the average over a sliding window that extends backwards e.g. 10 years from the current date. But that leads us into the sticky question of how big the window should be. It would also mean discarding the first 10 years of data from the range of years in the analysis, which would be a loss given that we don't have that many years of data to work with.
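A rough sketch of the backward-looking window, under stated assumptions: the 10-year `window`, the `temps_by_year` dictionary for a single calendar day, and the artificial linear warming trend are all made up for illustration.

```python
def trailing_baseline(temps_by_year, year, window=10):
    """Mean over the `window` years strictly before `year`.

    Returns None when fewer than `window` earlier years are available,
    which is the data loss at the start of the record mentioned above.
    """
    past = [temps_by_year[y] for y in range(year - window, year)
            if y in temps_by_year]
    if len(past) < window:
        return None
    return sum(past) / window

# Toy series for one calendar day with a steady 0.02-degree-per-year trend.
temps = {y: 20.0 + 0.02 * (y - 1950) for y in range(1950, 2001)}
full_mean = sum(temps.values()) / len(temps)

for year in (1950, 1975, 2000):
    base = trailing_baseline(temps, year)
    trailing_dev = None if base is None else temps[year] - base
    print(year, temps[year] - full_mean, trailing_dev)
```

With the full-period mean, the toy deviations run from negative in 1950 to positive in 2000. With the trailing window they stay at a constant offset under this linear trend, so the drift across the period disappears, although a constant bias remains.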

61.

> I get instinctively uncomfortable with the idea that the deviation for a day in 1975 depends upon an average that contains data extending into the future.

Me too. If you are attempting to make predictions, you need to avoid anything from the future.

Kind of related. I have deliberately not downloaded data after 1980. I'm trying to keep myself honest, and form some hypotheses about what is going on, and then test them on later data.

62.
edited June 2014

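In this [online thread](http://dsp.stackexchange.com/questions/9590/how-to-determine-the-window-size-and-weights-in-weighted-moving-average-wma-g) the question of how to determine the window size for a moving average is posed. The reply recasts the problem in terms of filtering of frequencies.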

Or would an exponentially weighted moving average be appropriate? As for which exponent to use...

Heuristically speaking, I would say that -- for a given calendar day e.g. June 1 -- variations on a time scale of under a decade should be smoothed out, but variations on a larger scale should be retained. This is subjective, I know.

Perhaps by looking at the data and experimenting with exponents, we would find the exponent that just suffices to filter out what we consider to be short-term variations.
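To make the exponentially weighted option concrete, here is a minimal sketch for a single calendar day. It assumes the "exponent" is expressed as a smoothing factor `alpha` between 0 and 1 (a choice made here, not something fixed in the thread), and it only ever uses earlier years.

```python
def ewma_baseline(values, alpha):
    """Exponentially weighted moving average using only past values.

    `values` are temperatures for one calendar day, ordered by year.
    Entry i of the result is the baseline available *before* seeing
    values[i]; it is None for the first year, which has no history.
    """
    baselines = []
    average = None
    for value in values:
        baselines.append(average)
        if average is None:
            average = value
        else:
            average = alpha * value + (1 - alpha) * average
    return baselines

print(ewma_baseline([20.0, 22.0, 20.0], 0.5))  # [None, 20.0, 21.0]
```

A smaller `alpha` gives a longer memory, on the order of `1 / alpha` years, so smoothing out sub-decadal variations would suggest trying values around 0.1 and comparing the results.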

63.
edited June 2014

I don't want to debate about terminology either, even though it's hard to resist: note that David, after urging that we not do so, introduced his own preferred terminology and explained why.

I just want to make sure that 1) we understand each other and 2) when we get around to blogging about this, we use widely accepted terminology whenever it exists... so we sound like we know what we're doing. For the gatekeepers of science, correct use of jargon is the first way to distinguish insiders from outsiders.

64.

Ha.

65.
edited June 2014

David wrote:

> I get instinctively uncomfortable with the idea that the deviation for a day in 1975 depends upon an average that contains data extending into the future. Can someone reassure me that this is not necessarily a methodological problem?

Great point! If one is merely looking at climate data and trying to find patterns, it seems okay. But a method of predicting something is useless if it requires data from the future.

Ludescher et al confront this issue as follows:

> At each node $k$ of the network shown in Fig. 1, the daily atmospheric temperature anomalies $T_k(t)$ (actual temperature value minus climatological average for each calendar day; see below) at the surface area level are determined.

and then below:

> We would like to add that, for the calculation of the climatological average in the learning phase, all data within this time window were taken into account, whereas in the prediction phase, only data from the past up to the prediction date were considered.

So, in the "learning phase" $T_k(t)$ depends on data from times after $t$, while in the "prediction phase" it does not.

None of this matters much if the average temperature at a given day of the year depends only slightly on the "window" in which the average is computed... but otherwise, it might make a difference.
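The two phases can be contrasted in a small sketch. This is a reading of the quoted passage, not Ludescher et al's code: the function names are hypothetical, and treating "up to the prediction date" as including the current year is an assumption.

```python
def learning_baseline(temps_by_year, first, last):
    """Mean over all years in [first, last], including years after the target."""
    window = [temps_by_year[y] for y in range(first, last + 1)]
    return sum(window) / len(window)

def prediction_baseline(temps_by_year, first, year):
    """Mean over the years from `first` up to and including `year` only."""
    window = [temps_by_year[y] for y in range(first, year + 1)]
    return sum(window) / len(window)

# Toy temperatures for one calendar day in three consecutive years.
temps = {1950: 20.0, 1951: 21.0, 1952: 25.0}

print(temps[1951] - learning_baseline(temps, 1950, 1952))    # -1.0
print(temps[1951] - prediction_baseline(temps, 1950, 1951))  # 0.5
```

In this toy case the 1951 anomaly even changes sign between the two baselines, which is the kind of window dependence worth checking on the real data.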

66.
edited June 2014

John wrote:

> If one is merely looking at climate data and trying to find patterns, it seems okay.

I'm still uneasy about this, at least for our particular case of computing cross-covariances.

Are you comfortable with the idea that the link strength for two nodes in 1950 can depend on data from 2000?

Suppose that a node $X$ had substantial temperature anomalies over its whole history, but that node $Y$ had temperature anomalies of zero over its whole history.

Then all cross-covariances between $X$ and $Y$ will be zero.

Now consider a node $Y'$ that is exactly the same as $Y$, except that there is a giant spike on June 1, 2000.

How would you expect the link strengths of $X$ and $Y'$ to look, over all the years in the period?

Well, let's look at the cross-covariance between $X$ and $Y'$. Since we added a giant spike to $Y'$ on June 1, 2000, the climatological average for June 1 changes, so the temperature anomalies at $Y'$ would be shifted on June 1 of every year, say by $D$. This would add $D / 365$ to the yearly mean used in the covariance formula. Therefore, on every day $d$ in the whole history of $Y'$, the difference between the temperature anomaly for $d$ and the mean temperature anomaly for that year would be non-zero. These are the deltas that get multiplied with the corresponding deltas for $X$ and then averaged to form the covariances. Therefore, the covariances will generally be non-zero.

So, the spike on a single day in 2000 changes the link strength for all preceding years from zero to non-zero.
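The spike argument can be checked on a toy example. Everything here is made up for illustration: a "year" has only 4 days, there are 3 years, and the function names are hypothetical. The point is only to show a single late spike changing the deltas of an earlier year.

```python
def all_years_anomalies(series):
    """series[year] is a list of daily temperatures; subtract the per-day
    mean taken over all years, future years included."""
    years = sorted(series)
    n_days = len(series[years[0]])
    day_means = [sum(series[y][d] for y in years) / len(years)
                 for d in range(n_days)]
    return {y: [series[y][d] - day_means[d] for d in range(n_days)]
            for y in years}

def yearly_deltas(anoms):
    """Anomaly minus that year's mean anomaly, the factor used in a covariance."""
    mean = sum(anoms) / len(anoms)
    return [a - mean for a in anoms]

flat = {year: [10.0] * 4 for year in (1998, 1999, 2000)}
spiked = {year: list(temps) for year, temps in flat.items()}
spiked[2000][1] += 30.0  # a single spike in the final year

print(yearly_deltas(all_years_anomalies(flat)[1998]))    # [0.0, 0.0, 0.0, 0.0]
print(yearly_deltas(all_years_anomalies(spiked)[1998]))  # [2.5, -7.5, 2.5, 2.5]
```

The 1998 deltas go from all zero to all non-zero purely because of the spike in 2000, so any covariance built from them for 1998 can change too.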

Can anyone justify this as being reasonable?

To me it seems particularly unreasonable, given that the idea of the link strength is an abstraction from some type of physical structure in the present. How could that be affected by a single event in the distant future?

For this reason I think that it would be better to use a moving average to compute the baseline for each of the calendar days. I'm leaning towards an exponentially weighted moving average.

67.
edited June 2014

If we conclude that this is a methodology improvement, then I would be curious to see how it affects whatever measures we deem to be of interest, e.g., the mean link strength of links between nodes that are separated by a distance $d$.

We could also see whether it affects the forecasting signal presented in Ludescher et al.

I do understand, however, that we are now in pursuit of more fundamental results.

68.
edited June 2014

In physics, 2-point functions often involve "data from the future" of the sort you're worrying about, David. This is because physicists consider the future just as real (or unreal) as the past... since most of the fundamental laws of physics are symmetric under time reversal. The big difference between the future and the past is that (for reasons too lengthy to explain here) we have records of the past but not of the future. This is a huge practical difference, but it's more about us than the system we're studying.

So: if I felt nervous about some quantity depending on data from the future, I would also be nervous about it depending on data from the past. There are certainly advantages to working with quantities that depend only on what's going on at one moment in time. But means, standard deviations, correlations etc. often depend on data over a range of time.

> Are you comfortable with the idea that the link strength for two nodes in 1950 can depend on data from 2000?

Two answers:

1. Are you comfortable that it can depend on data from 1900?

2. Yes, I'm comfortable, except for the purposes of devising a method of prediction... which is in fact one of the main things I want to do here. So your point actually does resonate with me, but more as a practical one.

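We could get into an immense philosophicophysical discussion about the arrow of time, since it's a topic I'm fond of - I once wrote a book review about it:

* [The Physical Basis of the Direction of Time, by H. D. Zeh](http://math.ucr.edu/home/baez/time/time.html).

but I'd rather avoid it.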

I'd prefer to take a practical viewpoint now. I'm certainly happy to have people compute averages in different ways - including the future, only including the past, etc. - and see how it changes things. But any predictive algorithm can't use data from the future.

We can debate how much it's "cheating" to "train" an algorithm using means that involve data from the future (the future of past times, known to us now) and then run it using means that only involve data from the past (since the future of now is unknown to us now). Or we could see if it changes the answers much.

69.

I want to talk about Graham's new work but I just flew back from Banff to Riverside last night, arriving at 1 am, and I'm leaving for Singapore tonight at 8 pm. So for a little while I may only be able to do easy things like argue about the arrow of time.

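I used to be very interested in that, and wrote a book review about it:

* [The Physical Basis of the Direction of Time, by H. D. Zeh](http://math.ucr.edu/home/baez/time/time.html).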

70.
edited June 2014

By the way, I just noticed that the definition of "El Niño in year X" depends on temperature data for years after year X, in precisely the way David is worrying about!

I mention this not in the spirit of winning an argument (though that's always fun), but because it struck me just now, reading this:

* LuAnn Dahlmann, [Climate Variability: Oceanic Niño Index](http://www.climate.gov/news-features/understanding-climate/climate-variability-oceanic-ni%C3%B1o-index), 2009.

> To filter out month-to-month variability, average sea surface temperature in the Niño 3.4 region is calculated for each month, and then averaged with values from the previous month and following month. This running three-month average value is compared with average sea surface temperature for the same three months during 1971 - 2000. The departure from the 30-year average of the three-month average is known as the Oceanic Niño Index or ONI.

So, first of all, the ONI this month depends on next month's temperatures.

But it also depends on the average sea surface temperature for certain times in the interval 1971 - 2000, so whether we had an El Niño in 1975 could depend on the temperature in 1990.
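A minimal sketch of an index built the way the quoted passage describes, to make both dependencies visible. It is not NOAA's procedure in detail: `sst` is assumed to be a plain list of monthly mean temperatures for the region, `base_indices` is a hypothetical list of positions of the same calendar month in the base-period years, and the toy numbers are invented.

```python
def running_three_month(sst, index):
    """Average of the month at `index` with its previous and following month."""
    return (sst[index - 1] + sst[index] + sst[index + 1]) / 3.0

def oni_like_index(sst, index, base_indices):
    """Three-month running mean at `index` minus the mean of the same
    quantity over the base-period months listed in `base_indices`."""
    base = [running_three_month(sst, i) for i in base_indices]
    return running_three_month(sst, index) - sum(base) / len(base)

# Three toy years of monthly values, with one warm month in the middle year.
sst = [26.0] * 36
sst[18] = 29.0
value = oni_like_index(sst, 18, [6, 18, 30])
print(value)
```

Changing `sst[19]` (the following month) or any month near a base-period index changes `value`, which mirrors the two dependencies noted above.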

Comment Source:By the way, I just noticed that the definition of "El Ni&ntilde;o in year X" depends on temperature data for years after year X, in precisely the way David is worrying about! I mention this not in the spirit of winning an argument (though that's always fun), but because it struck me just now, reading this: * LuAnn Dahlmann, [Climate Variability: Oceanic Niño Index](http://www.climate.gov/news-features/understanding-climate/climate-variability-oceanic-ni%C3%B1o-index), 2009. > To filter out month-to-month variability, average sea surface temperature in the Niño 3.4 region is calculated for each month, and then averaged with values from the previous month and following month. This running three-month average value is compared with average sea surface temperature for the same three months during 1971 - 2000. The departure from the 30-year average of the three-month average is known as the Oceanic Niño Index or ONI. So, first of all, the ONI this month depends on next month's temperatures. But it also depends on the average sea surface temperature for certain times in the interval 1971 - 2000, so whether we had an El Niño in 1975 could depend on the temperature in 1990.
71.
edited June 2014

Here are the graphs Graham generated, taken from his PDF:

I'm not completely sure what these graphs show. Graham wrote:

For each 5 days from 1951 through 1979, for a region straddling the equator, for delays of 1 and 5 days, and for 0 to 7 eastwards steps of 2.5 degrees, find the covariances of the temperature over six months (183 days).

So maybe the first graph is 1-day delay and the second is a 5-day delay. They look quite similar.

He also wrote:

The PDF file Covariances near equator shows covariances between different places near the equator in the Pacific and at two different time delays. Most of the details are in the PDF, but some things are not:

• the graphs show the median of the covariances over the region

• black means zero geographical displacement, and paler greys show displacements increasing by 2.5 degrees.
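One possible reading of "covariance at a delay" is sketched below in R. This is a guess at the kind of quantity involved, not Graham's actual code: `lagged_cov` and its arguments are hypothetical names, and the series are synthetic. The spatial displacement would simply mean taking `x` and `y` from grid points some multiple of 2.5 degrees apart.

```r
# Hypothetical helper: covariance between series x over a window of days
# and series y over the same window shifted forward by `delay` days.
lagged_cov <- function(x, y, start, window = 183, delay = 1) {
  idx <- start:(start + window - 1)
  cov(x[idx], y[idx + delay])
}

# Synthetic example: y is x delayed by 5 days, so the 5-day lagged
# covariance equals the variance of x over the window.
set.seed(2)
x <- rnorm(400)
y <- c(rep(0, 5), x[1:395])

lagged_cov(x, y, start = 1, window = 183, delay = 5)
lagged_cov(x, y, start = 1, window = 183, delay = 1)   # unrelated days, near zero
```

The graphs would then show, for each start date, the median of such values over the pairs of points in the region.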

Comment Source:Here are the graphs Graham generated, taken <a href = "http://www.azimuthproject.org/azimuth/files/covs-near-equator.pdf">from his PDF</a>: <a href = "http://www.azimuthproject.org/azimuth/files/covs-near-equator.pdf"> <img width = "600" src = "http://math.ucr.edu/home/baez/ecological/el_nino/covs-near-equator_Page_6.jpg" alt = ""/></a> <a href = "http://www.azimuthproject.org/azimuth/files/covs-near-equator.pdf"> <img width = "600" src = "http://math.ucr.edu/home/baez/ecological/el_nino/covs-near-equator_Page_7.jpg" alt = ""/></a> I'm not completely sure what these graphs show. Graham wrote: > For each 5 days from 1951 through 1979, for a region straddling the equator, for delays of 1 and 5 days, and for 0 to 7 eastwards steps of 2.5 degrees, find the covariances of the temperature over six months (183 days). So maybe the first graph is 1-day delay and the second is a 5-day delay. They look quite similar. He also wrote: > The PDF file Covariances near equator shows covariances between different places near the equator in the Pacific and at two different time delays. Most of the details are in the PDF, but some things are not: > * the graphs show the median of the covariances over the region > * black means zero geographical displacement, and paler greys show displacements increasing by 2.5 degrees.
72.
edited June 2014

I like how the covariance has dramatic spikes in Graham's graphs - that's a hint we may get useful information this way.

There are high covariances in 1956, 1964, 1970, especially in 1973 and the two neighboring years, and also 1976 and 1978.

Puzzle: what is special about these years?

73.
edited June 2014

By the way, this is important: Ludescher et al claim that:

1. During an El Niño or La Niña, climate links between the El Niño basin (a certain region of the Pacific) and other regions are weak. This could make sense if when these phenomena are raging, the El Niño basin "does its own thing" and changes in temperature there are not strongly correlated to changes in other regions.

2. Shortly before an El Niño, climate links between the El Niño basin and other regions are strong. This could make sense if the formation of an El Niño requires a "consensus" in a large part of the Pacific.

I was confused, earlier, about these two statements.

For Ludescher et al the El Niño basin is the red dots and the "other regions" are the other dots:

Comment Source:By the way, this is important: Ludescher _et al_ claim that: 1. During an El Ni&ntilde;o or La Ni&ntilde;a, climate links between the El Ni&ntilde;o basin (a certain region of the Pacific) and other regions are _weak_. This could make sense if when these phenomena are raging, the El Ni&ntilde;o basin "does its own thing" and changes in temperature there are not strongly correlated to changes in other regions. 1. Shortly before an an El Ni&ntilde;o, climate links the El Ni&ntilde;o basin and other regions are _strong_. This could make sense if the formation of an El Ni&ntilde;o requires a "consensus" in a large part of the Pacific. I was confused, earlier, about these two statements. For Ludescher _et al_ the the El Ni&ntilde;o basin is the red dots and the "other regions" are the other dots: <img src = "http://math.ucr.edu/home/baez/ecological/ludescher_el_nino_cooperativity_1.jpg" alt = ""/>
74.

John said

So maybe the first graph is 1-day delay and the second is a 5-day delay.

You guessed right. I edited the page, and removed the "ambiguous", perhaps prematurely...

By the way, this is important: Ludescher et al claim that: During an El Niño or La Niña, climate links between the El Niño basin (a certain region of the Pacific) and other regions are weak.

I don't see them claiming this during La Niñas.

75.
edited June 2014

I don’t see them claiming this during La Niñas.

Okay, whoops.

I think some of the peaks of covariance in your graph occur in La Niñas... but it's actually a bit hard to tell.

76.

On using data to infer things about the past, I just want to point out that a lot of scientists do this, not just fundamental physicists. It is mostly what evolutionary biologists, geologists, and forensic scientists do, and it does not require using time-reversible models.

For the practical problem of predicting El Niños, there is only data going back to 1948, so if you use, say, a running 30-year average to estimate the climatological seasonal cycle, you'd have to start at 1978, losing a large proportion of the period. Using a running 30-year average for 1978 onwards, and the period 1948-1977 for earlier seems an OK compromise to me.

For covariances, the reason for wanting to subtract the climatological seasonal cycle is because we want to detect the correlations between different points and times that are not due to seasonal change. An alternative approach is to calculate covariances over shorter periods, say 10 days. If that is long enough to detect a link between two points, and short enough not to be affected by seasonal change, no seasonal adjustment is needed. Many 10-day periods could be found between the same two points to reduce variance. For 30-day periods, you could remove a linear trend from both temperature records before calculating a covariance.
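The two alternatives in the last paragraph can be sketched in R as follows. This is a rough illustration only: `detrended_cov` and `mean_short_cov` are hypothetical names, the data are synthetic, and the window lengths are just the ones mentioned above.

```r
# Hypothetical helper: remove a linear trend from both records, then
# take the covariance of what is left (intended for ~30-day windows).
detrended_cov <- function(x, y) {
  t <- seq_along(x)
  cov(residuals(lm(x ~ t)), residuals(lm(y ~ t)))
}

# Hypothetical helper: average the covariances of many short,
# non-overlapping windows (10 days by default), with no seasonal adjustment.
mean_short_cov <- function(x, y, width = 10) {
  n_windows <- floor(length(x) / width)
  covs <- sapply(seq_len(n_windows), function(k) {
    idx <- ((k - 1) * width + 1):(k * width)
    cov(x[idx], y[idx])
  })
  mean(covs)
}

# Synthetic 30-day records sharing the same fluctuations e, but with
# opposite "seasonal" trends.
set.seed(3)
t <- 1:30
e <- rnorm(30, sd = 0.3)
x <- 0.1 * t + e
y <- -0.2 * t + e

cov(x, y)             # dominated by the opposing trends
detrended_cov(x, y)   # recovers the shared fluctuations
```

In the synthetic example the raw covariance is negative because of the trends, while the detrended one is positive, which is the effect the seasonal adjustment is meant to achieve.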

77.
edited June 2014

Some things I've noticed in my results:

Increasing the displacement from 0 to 17.5 degrees (~2000km) shows very 'clean' behaviour in the one-day case, with covariances dropping off in a regular way. It's a bit more muddled in the 5-day case.

The dramatic peaks are very similar in the two cases, but during non-peak times, the one-day covariances are larger.

During peaks, the displacement hardly seems to matter. It looks like the region has become 'in-step' in the east-west direction.

I think there's some positive correlation between peaks and La Niñas (like John said) but there are peaks without La Niñas and La Niñas without peaks.

78.
edited June 2014

John wrote:

So: if I felt nervous about some quantity depending on data from the future, I would also be nervous about it depending on data from the past.

Ok, I'm persuaded. Little by little the concern that I raised was backing me into a corner. It would further imply that one couldn't even take the covariances themselves, because that involves looking at future data!

Actually, it doesn't "violate causality" to incorporate future data into the analysis of the present. A causal connection from A to B will naturally be manifest in a correspondence between an event at A and a corresponding future event at B. So that spike that I was worrying about in the future could, with some probability, be a manifestation of an event that took place much earlier at one of the other nodes. So it's not unreasonable that it increases our estimates of the strength of the connections between that node and others, even at an earlier time.

So, thanks for leading me in the right direction -- and I'm happy to move on with the analysis.

79.
edited June 2014

Graham, I've been reading through your R code that you "sweaved" into the pdf file. It looks good. I'd like to try to replicate the graph you made, using this code.

I see that you're referencing a file Pacific-1950-1979.txt, and there is a commented out reference to Scotland-1950-1952.txt. These I presume are files that you generated from the raw NOAA files.

A couple of requests, which would help me to get going with R on this data. First, can you post these files to our github repository? Second, can you post a complete R script that generates the graphs you shared with us? It looks like one could just paste the stuff that you have sweaved into the pdf file, but I'm not sure if anything is missing. One thing that would help is the final command that you used to generate the plots. Thanks.

(I won't ask for standalone scripts for everything, but some well chosen standalones that do something meaningful are a great way to get ramped up with a new language.)

Suggestion: can we make a folder called el_nino, which could have subfolders for each of us? If we put together something coherent as a group, we could organize it in the parent directory el_nino. Useful things to post would be data files, standalone scripts that produce graphs, and useful functions to build on. But each artist can organize their folder as they see fit!

My own personal plan is to get ramped up both in the R environment and the Python scientific programming platform (scipy/numpy/matplotlib/pandas). The more tools the merrier!

80.
edited June 2014

Graham, sorry I don't want to make you do extra work here. I just saw your netCDF converter on the wiki page -- there's been a burst of information here, and I overlooked it -- so it looks like I can generate the files myself. Also I can make and post some standalone demos.

If you have something already along the lines I asked for that's great, but otherwise I should be able to cover it.

81.
edited June 2014

I have put two R scripts onto github. I've put the convertor in R, since it is intended for others to use, test, debug. The other script makes the graphs in comments 58 and 73 above. I don't know who gets told about this by github.

82.
edited June 2014

I edited the paths in the various .r and .R scripts to my unix absolute paths /home/jim/...etc.

I managed to get AirSig.r to convert some individual air-sig995.XXXX.nc files with the command R CMD BATCH AirSig.r (although I've just read a post saying that R CMD BATCH is not the best syntax).

But I'm afraid I haven't a clue what the sets of 5x25 numbers (105 unique) per year are meant to represent.

rnc[[4]][1:5, 1:5, 1:5]
, , 1

       [,1]   [,2]   [,3]   [,4]   [,5]
[1,] 258.78 259.30 264.10 269.18 272.38
[2,] 258.78 259.55 264.47 269.57 272.35
[3,] 258.78 259.82 264.82 269.78 272.12
[4,] 258.78 260.00 265.05 269.88 271.80
[5,] 258.78 260.18 265.22 269.90 271.45

I also failed with:

netcef-displayyear.r

netcef-displayyear.r.Rout: ==>

yearmeans <- make.yearmeans(1951, lat30N, lat30S, lon120E, lon315E)
Error: NetCDF: Unknown file format
Execution halted
(END)

and

netconvertor.R

netconvertor.r.Rout ==>

> setof.Kvals <- NULL
> for (i in firstyear:lastyear) {
+   setof.Kvals <- rbind(setof.Kvals, make.Kvals.for.year(i))
+ }
Error: NetCDF: Unknown file format
Execution halted

I got a missing "Pacific...txt" message as well.

I'm definitely missing at least a couple of tricks. I have my gridded earth model but obviously no air-sig995 or other temperature data yet.

Thanks Graham for putting your stuff on github. If it helps, people have to "Watch" individual repos to get change notifications if that's what you meant.

83.
edited June 2014

Jim, I suggest you start with netcdf-convertor.R. I don't know what AirSig.r is (it's not my file name).

You will need to edit it to your requirements. As supplied, it converts 3 years for 4 grid points covering Scotland. I've put the Ludescher et al Pacific co-ordinates in comments. Instructions are in the script. Then start R, and then copy and paste the whole file into the R console.

I've put this info in the README.md

rnc[[4]][1:5, 1:5, 1:5]


You don't need to know about this if you don't want to use R. It displays a small part of a large 3D array. You do need to know that the actual values like 258.78 259.30 264.10 are in kelvin.

In order to do further processing, I've been using a data file called Pacific-1950-1979.txt made by netcdf-convertor.R and reading it in again.

84.
edited June 2014

I have the air.sig9951948.nc to air.sig995.2014.nc files in a directory on their own.

I launch R and copy and paste the path-amended netcdf-convertor.R script into the REPL.

==>

> setof.Kvals <- NULL
> for (i in firstyear:lastyear) {
+   setof.Kvals <- rbind(setof.Kvals, make.Kvals.for.year(i))
+ }
Error: NetCDF: Unknown file format

write.table(x=setof.Kvals, file=outputfi +

I got the same error with covariances-near-equator-1day-5day.R.

Perhaps my .nc files are corrupt?

85.
edited June 2014

I have nc files for 1948-1979, and netcdf-convertor.R uses 1950-1952 as it is, so maybe you're just missing the right years.

I updated netcdf-convertor.R on github with hopefully better instructions.

86.
edited June 2014

Just FYI, there is an RNetCDF package available as well. I've used this when I assimilated and processed HadCRUT4 and made it available to the public, as described at http://hypergeometric.wordpress.com/2014/01/22/hadcrut4-version-hadcrut-4-2-0-0-available-as-rdata-r-workspace-or-image/

87.

Jan, I am using RNetCDF.

88.
Very good. Saw a mention of netcdf-convertor.R.
89.

netcdf-convertor.R is my code, which uses RNetCDF. Its purpose is to convert to a format that can be easily read by, say, a Haskell user.

90.

I do have all the air.sig..nc files from 1948-2014. I've tried it again with the years set to 2010-2013 with the same: "Error: NetCDF: Unknown file format" msg. My .nc files are each 852 bytes long.

91.

Jim, is that a typo, 852 bytes? That's way too small, they are each about 7 megabytes big.

92.

That's it then. How my saves came out like that I've no idea. I'll just do it again.

93.
edited June 2014

I've found the ftp site and air.sig995.1948.nc > 30MB so no testing tonight. Thanks for the prompt help from you all. Sorry for the noise.

94.
edited June 2014

Greater than 30M is too big, look for 7M. Also the files are named e.g. air.sig995.1948.nc.

I got the files from this URL: http://www.esrl.noaa.gov/psd/cgi-bin/db_search/DBListFiles.pl?did=33&tid=41809&vid=668 (this is reachable by a few clicks from the link on our Experiments wiki page; later I'll add this direct link to the page).

It was a bit tedious to download each file by hand, but all told it took about ten minutes.

Comment Source:Greater than 30M is too big, look for 7M. Also the files are named e.g. air.sig995.1948.nc. I got the files from this [URL](http://www.esrl.noaa.gov/psd/cgi-bin/db_search/DBListFiles.pl?did=33&tid=41809&vid=668) (this is reachable by a few clicks from the link on our Experiments wiki page, later I'll add this direct link to the page). It was a bit tedious to download each file by hand, but all told it took about ten minutes.
95.

It was a bit tedious to download each file by hand, but all told it took about ten minutes.

I used

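for (year in 1950:1979) {
  download.file(url=paste0("ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis.dailyavgs/surface/air.sig995.", year, ".nc"),
                destfile=paste0("air.sig995.", year, ".nc"), mode="wb")
}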


This took about 2 minutes to write and about 15 to debug. ;-) (The mode="wb" was the bit I missed first time.)
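Given the 852-byte files earlier in the thread, a quick size check after downloading may save some head-scratching before `open.nc` complains. The sketch below is not part of the github scripts: `check_nc_sizes` is a hypothetical helper, and the 1 MB threshold is an arbitrary assumption, chosen only because the real files are reported above to be about 7 MB.

```r
# Hypothetical helper: list the expected yearly files with their sizes and
# flag any that are missing or implausibly small.
check_nc_sizes <- function(dir = ".", years = 1950:1979, min_bytes = 1e6) {
  files <- file.path(dir, paste0("air.sig995.", years, ".nc"))
  sizes <- file.info(files)$size          # NA where a file does not exist
  data.frame(file  = basename(files),
             bytes = sizes,
             ok    = !is.na(sizes) & sizes >= min_bytes)
}

# Example use, run from the directory holding the downloads:
# bad <- subset(check_nc_sizes(), !ok)
# print(bad)
```

Any row with `ok` equal to `FALSE` is worth downloading again before running the convertor.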

Comment Source:> It was a bit tedious to download each file by hand, but all told it took about ten minutes. I used ~~~~ for (year in 1950:1979) { download.file(url=paste0("ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis.dailyavgs/surface/air.sig995.", year, ".nc"), destfile=paste0("air.sig995.", year, ".nc"), mode="wb") } ~~~~ This took about 2 minutes to write and about 15 to debug. ;-) (The mode="wb" was the bit I missed first time.)
96.
edited June 2014

I realised that I must have got the wrong data file after Dave said his were ~7MB. Thanks very much for the url; the first Reanalysis link on the forum or the wiki didn't work when I first tried it. Your R script is currently downloading the correct files successfully :) Cheers

97.
edited June 2014

Downloads went ok (I think). Pasting netcdf-convertor.R at an R prompt errored out. I've posted an issue on github as that seems like the best place.

I also tried it on a different couple of years just in case...

(And I want to see who, if anybody, gets notified and by what settings.)

98.

Jim, I got an email. I commented on github:

This line looks wrong:

onc <- open.nc(paste0("air.sig995.


It should be

onc <- open.nc(paste0("air.sig995.", year, ".nc"))


PS, use four tildes for code in markdown.

99.
edited June 2014

### Covariance maps 1951-1979

The image shows one map of the Pacific for each quarter for the years 1951 through 1979. On the right, the NINO3.4 index is shown for the year.

The area is that used by Ludescher et al (2013). The "El Nino basin", as defined by Ludescher et al (2013), is the black region along the Equator towards the East, plus two pixels below. For every other pixel i, the sum TC(i) of the covariances between i and the 14 pixels in the basin is shown. The covariances are calculated over the previous year. The absolute values are "squashed" by a square root before conversion to colours. Negative values of TC(i) are red, positive values green, paler meaning bigger in absolute value. Very big values are shown by bright red and green.

#### More detail

The climatological seasonal cycle (mean over years for each grid point, each day-in-year) is subtracted. The data is spatially subsampled into 7.5 by 7.5 degree squares. There are 9 by 23 such squares. The covariances are calculated for a day in the middle of each quarter. The covariances are calculated over a period of 365 days. There is no time delay between the periods. The TC(i) values are squashed by sign(TC(i)) * sqrt(abs(TC(i))) before conversion to colours. The range from -3 to 3 is mapped to dull shades; below -3 is bright red, above 3 bright green.

The El Nino index is from http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/detrend.nino34.ascii.txt.
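For anyone wanting to reproduce the numbers behind the maps, the TC(i) sum and the squashing step described above could look roughly like this in R. It is a sketch rather than the script on github: `total_cov` and `squash` are hypothetical names, `anom` is assumed to be a days-by-pixels matrix of seasonally adjusted temperatures, and the example uses random numbers and a two-pixel "basin" instead of the real 14.

```r
# Squashing as described in the text: sign(TC) * sqrt(abs(TC)).
squash <- function(tc) sign(tc) * sqrt(abs(tc))

# Hypothetical helper: for every pixel outside the basin, sum its
# covariances with all basin pixels. `anom` is days x pixels.
total_cov <- function(anom, basin) {
  others <- setdiff(seq_len(ncol(anom)), basin)
  covs <- cov(anom[, others, drop = FALSE], anom[, basin, drop = FALSE])
  tc <- rowSums(covs)
  names(tc) <- others
  tc
}

# Synthetic example: 365 days, 6 pixels, the last two forming the "basin".
set.seed(4)
anom <- matrix(rnorm(365 * 6), ncol = 6)
basin <- c(5, 6)

tc <- total_cov(anom, basin)
squash(tc)   # values that would then be mapped to colours
```

Because covariance is linear, TC(i) is also the covariance between pixel i and the sum of the basin pixels, which gives a quick cross-check.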

Comment Source:### Covariance maps 1951-1979 <img width = "800" src = "http://www.azimuthproject.org/azimuth/files/cov-maps-1951-1959.png" alt = ""/> The image shows one map of the Pacific for each quarter for the years 1951 through 1979. On the right, the NINO3.4 index is shown for the year. The area is that used by Ludescher et al (2013). The "El Nino basin", as defined by Ludescher et al (2013) is the black region along the Equator towards the East, plus two pixels below. For every other pixel i, the sum TC(i) of the covariances between i and the 14 pixels in the basin is shown. The covariances are calculated over the previous year. The absolute values are "squashed" by before conversion to colours. Negative values of TC(i) are red, positive values green, paler meaning bigger in absolute value. Very big values are shown by bright red and green. #### More detail The climatological seasonal cycle (mean over years for each grid point, each day-in-year) is subtracted. The data is spatially subsampled into 7.5 by 7.5 degree squares. There are 9 by 23 such squares. The covariances are calculated for a day in the middle of each quarter. The covariances are calculated over a period of 365 days. There is no time delay between the periods. The TC(i) values are squashed by sign(TC(i)) * sqrt(abs(TC(i))) before conversion to colours. The range -3,3 is mapped to dull shades; below -3 is bright red, above 3 bright green. The El Nino index is from [http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/detrend.nino34.ascii.txt](http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/detrend.nino34.ascii.txt).
100.
edited June 2014

+3

I just went to get a list of El Nino years (different sets seem to be somewhat off by one from each other) and came across this:

I've never seen the site before so haven't checked it out:

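Doubts Surface Over 2014 El Nino Development (http://www.reportingclimatescience.com/news-stories/article/doubts-surface-over-2014-el-nino-development-as-warming-stalls.html), posted 8 hours ago.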

Comment Source:+3 I just went to get a list of El Nino years (different sets seem to be somewhat off by one from each other) and came across this: [](http://www.reportingclimatescience.com/news-stories/article/doubts-surface-over-2014-el-nino-development-as-warming-stalls.html) I've never seen the site before so haven't checked it out: ( [Doubts Surface Over 2014 El Nino Development](http://www.reportingclimatescience.com/news-stories/article/doubts-surface-over-2014-el-nino-development-as-warming-stalls.html), posted 8 hours ago.