## Comments

John wrote in #29:

> So, there is some mistake somewhere.

Gulp!

I looked at our dearest WebHubTel's code :)

He uses the MeanFilter:

[MeanFilter](http://reference.wolfram.com/language/ref/MeanFilter.html)

If you show me the R code for the other plots, I will show you where a smoothing algorithm is used.

John, the "RAW data" or "RAW signal", which in my terminology means the original untouched data, cannot be plotted usefully; it looks like this crap:

[Tahiti Darwin RAW plots](http://files.lossofgenerality.com/tahiti_darwinRAW.jpg)

So in most places these plots are smoothed or denoised.

OK John, are you ready? If these signals are denoised they have negative correlation; if RAW they have positive correlation!

So the diagram I plotted earlier (Blue vs. Red) for the two time series is indeed denoised! As you concluded, their correlation must be negative, and it is. But my diagram, WebHubTel's, and the other R plots are all denoised one way or another (unless I am bonked in my head), so people are accustomed to thinking that they are the actual RAW data plots. BUT THEY ARE NOT THE ACTUAL RAW DATA!

To avoid confusion:

1. Correlation between RAW-Darwin vs. RAW-Tahiti data in question is 0.58 > 0
2. Correlation between smoothed-Darwin vs. smoothed-Tahiti is -0.50 < 0

So what should we use:

1. denoised negative correlation
2. RAW positive correlation

For forecast algorithms involving output expressions of the IF THEN ELSE sort I pick the denoised; for correlation and covariance or any such algorithms, again I pick denoised.

I could be completely wrong, and I need to learn how to use these techs properly.

WebHubTel wrote:

> It is possible that Dara is doing something similar by way of taking the complement and just losing track of the sign.

I am completely infallible!

Dara

Hi Dara. I'm sure you're aware of it, but when you say

> If these signals are denoised they have negative correlation, if RAW they have positive correlation!

that would be extremely disturbing given that the normalized correlations are roughly -0.5 and +0.5, as it says that the _noise_ has a magnitude that can swamp the underlying signal to such a large extent. (Note that this holds only if it is _noise_; if it's something structured like an underlying seasonal change or an "increasing/decreasing trend", those can be of larger magnitude and successfully decoupled from a much weaker signal. The only case I can think of where I'd trust a pure denoising algorithm is if this was due to really intense "salt-and-pepper" noise, i.e. sparse, impulsive noise, whose extreme magnitude swung things around but which a median filter would remove quite effectively.)

One thing that might help to understand this is if you could put up plots of just some (identical) 24-month period of the RAW and denoised data so we could see some typical details in the graphs. (Unfortunately I don't have Mathematica so I can't do this myself.)
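To illustrate the salt-and-pepper remark, here is a minimal sketch comparing a mean filter with a median filter on made-up data. The ramp signal, spike sizes, and window radius are all assumptions chosen for illustration; none of them come from the Darwin or Tahiti series, and `sliding_filter` is a hypothetical helper, not the Mathematica `MeanFilter`.

```python
import statistics

def sliding_filter(values, radius, reducer):
    """Apply `reducer` to a centred window around each sample (window shrinks at the edges)."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        out.append(reducer(values[lo:hi]))
    return out

# Hypothetical data: a slow ramp plus two sparse, very large spikes.
clean = [0.1 * i for i in range(30)]
noisy = list(clean)
noisy[7] += 50.0
noisy[20] -= 50.0

median_filtered = sliding_filter(noisy, radius=2, reducer=statistics.median)
mean_filtered = sliding_filter(noisy, radius=2, reducer=statistics.mean)

# Worst-case deviation from the clean ramp for each filter.
median_error = max(abs(m - c) for m, c in zip(median_filtered, clean))
mean_error = max(abs(m - c) for m, c in zip(mean_filtered, clean))
print(f"worst-case error, median filter: {median_error:.3f}")
print(f"worst-case error, mean filter:   {mean_error:.3f}")
```

In this toy case the median filter discards each isolated spike, while the mean filter smears it across the whole window. That is the sense in which a median filter suits sparse, impulsive noise.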

"WebHubTel - can you remind me of your real name?"

Paul Pukite http://scholar.google.com/citations?user=B-gWBq8AAAAJ


Dara the Infallible clearly identified the distinction as the application of a seasonal filter. Darwin has a much larger seasonal effect, likely due to the land mass it is attached to. Tahiti is tempered by the surrounding ocean but still shows the effects. I am thinking that the significant seasonal fluctuations are substantially due to the occasional cyclone that will move through the area, and these usually occur during a specific part of the year. A cyclone will be a very low pressure event that will impact the monthly mean.

It may also be that the geographic low and high pressure areas, known as the doldrums and horse latitudes, shift seasonally, contributing to the oscillation.

So the question is how much these factors can be factored out to reveal the salient ENSO contributions.

btw, I use the NCAR SOI data.


John asked in post 44:

> Darwin air pressure anomalies are here, and Tahiti air pressure anomalies are here.

[Tahiti Darwin Anomaly Comparison](http://files.lossofgenerality.com/tahiti_darwin_comp.pdf)

zipped:

[zipped files](http://files.lossofgenerality.com/Baez2ZIP2.zip)

John, the scalograms are not decomposed and are from the original RAW data, as you asked. I also did what David Tweed asked: I took the last few hundred months and did plots for the RAW data.

In the plots, larger x-axis numbers are on the right; e.g. 500 means the 500 most recent sample values. I will fix that for you later on with better dates.

However, I decomposed the signals, and clearly the RAW signal is much more complex, with multiple trends with high energy fractions. More interestingly, the energy fractions for the Tahiti and Darwin wavelet decompositions are quite close numerically! WebHubTel could feast on that one!

I then did the calculations for the correlations: all negative! Happy?

And one other thing: I did the correlations for each decomposition level plus the corresponding periodicity. I looked at the higher wavelet orders/frequencies and Gabor; all came out quite similar.

I have no idea what any of it means or how to interpret it; I hope to learn from you and your other colleagues here.

Dara

WebHubTel wrote:

> Dara the Infallible clearly identified the distinction

Ah! much much better...

> am thinking that the significant seasonal fluctuations is substantially due to the occasional cyclone

What talent! I cannot fathom any of this, so I focus on computing and learn.

Dara

David Tweed wrote:

> that would be extremely disturbing given that the normalized correlations are negative 0.5ish and positive 0.5ish

I make no assumptions when computing and report the results as they come along. I am also disturbed that the signals are highly multi-trended but conclusions are drawn as if the signal has one trend; that is one of the things I hope to investigate and learn from John.

> One thing that might help to understand stuff is if you could put up plots of just some (identical) 24 months period of the RAW and denoised data so we could see some typical details in the graphs. (Unfortunately I don't have Mathematica so I can't do this myself.)

I did it for you in post 56 but with the new data John had asked to use.

Dara

Dara wrote:

> I then did the calculations for the Correlations all negative! happy?

Yes. Thanks for everything!

There is too much to say, but here's one thing Graham already mentioned. When you first reported _positive_ correlations, you were using data about _air pressures_ at Darwin and Tahiti. These swing up and down together according to the season.

Now you are using _air pressure anomalies_ at Darwin and Tahiti, where the (average) seasonal variations have been subtracted out. So now the obvious source of positive correlation is removed... and we see a _negative correlation_: that's the El Niño Southern Oscillation.

I think this issue is separate from the issue of how "denoising" changes correlations. Denoising should remove high-frequency signals like the

> really intense "salt-and-pepper" -- i.e., sparse, impulsive noise

that David Tweed mentioned. So, denoising would reduce correlations if there are events that suddenly reduce or increase the air pressure at both Darwin and Tahiti. A storm system or some other weather system might do this, as Paul Pukite suggested. But I don't know if weather systems are large enough to affect both Darwin and Tahiti: here is one place where having an actual meteorologist would be helpful!

Dara wrote:

> I have no assumptions when computing and report the results as they come along.

Indeed, it's also helpful to have people in the project who _don't_ have expert knowledge of what to expect, since that "expert knowledge" is also a source of bias.
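The sign flip described above can be reproduced on synthetic data. The sketch below is only a toy model: the shared seasonal amplitudes (3 and 2), the 48-month see-saw, and the noise level are invented for illustration and are not fitted to the Darwin or Tahiti records.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable
months = range(240)

# Assumed ingredients: a seasonal cycle shared by both stations,
# and a slower see-saw that enters the two stations with opposite signs.
seasonal = [math.cos(2 * math.pi * t / 12) for t in months]
see_saw = [math.sin(2 * math.pi * t / 48) for t in months]

darwin_anomaly = [s + rng.gauss(0, 0.5) for s in see_saw]
tahiti_anomaly = [-s + rng.gauss(0, 0.5) for s in see_saw]

# "Raw" pressures = seasonal part + anomaly (amplitudes are made up).
darwin_raw = [3.0 * c + a for c, a in zip(seasonal, darwin_anomaly)]
tahiti_raw = [2.0 * c + a for c, a in zip(seasonal, tahiti_anomaly)]

raw_corr = pearson(darwin_raw, tahiti_raw)
anomaly_corr = pearson(darwin_anomaly, tahiti_anomaly)
print(f"raw correlation:     {raw_corr:+.2f}")
print(f"anomaly correlation: {anomaly_corr:+.2f}")
```

With these assumptions the raw series correlate positively because the common seasonal term dominates, and the anomalies correlate negatively because only the opposite-signed see-saw remains. No smoothing is involved in either case.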

As an observation, consider what happens when one posts their real name online. I have been trying to publicize the Azimuth project on Prof. Curry's Climate Etc blog because she and her husband Prof. Peter Webster are very interested in ENSO, and I assume that some fraction of the readership is too.

Yet the minute I link in my real name here, guess what happens in the comments?

> Paul Pukite is an electrical engineer with zilch background in Earth sciences. It shows. http://scholar.google.com.au/citations?hl=en&user=TgqlMYcAAAAJ&view_op=list_works&cstart=60

and it gets worse from there with the guy saying it is all "bonkers science".

I do tend to stir things up on the skeptical sites, so I probably deserve this. This is the price we will pay for working in a hotly debated research field.

BTW, Peter Webster is likely aware of what we are doing, as he has commented on my blog. He is definitely an authority on ENSO and gave some very good advice that the seasonal component is the key to triggering ENSO. The late-year to early-year transition period is critical, he said. But once this is averaged out by a temporal filtering process, of course this will not be as apparent.

Okay, folks, I polished up the article a bit more and published it:

* [Exploring climate data (part 1)](http://johncarlosbaez.wordpress.com/2014/08/01/exploring-climate-data-part-1/), Azimuth Blog.

There is a lot more we could do, but we can do it in future articles; this one already has as much (or more) information than most people can absorb in one sitting.

Dara:

1. It would be great if you could add some information about yourself to this page: Dara O Shayda. Then I could link to it for people curious about you. This is what we usually do for any Azimuth Forum member whose name appears in a blog article. For examples see David Tweed, Nadja Kutz, John Baez, David Tanzer etc.
2. What correlation did you finally get between the Tahiti and Darwin pressure anomalies... _without_ any denoising? I'd like to put that number in the blog article.

WebHubTel wrote:

> I have been trying to publicize the Azimuth project on Prof. Curry's Climate Etc blog because she and her husband Prof Peter Webster are very interested in ENSO and I assume that some fraction of the readership is.

Well, I don't read that blog much but it seems to be frequented by some aggressive "climate skeptics", so unless you're one of them you can expect them to attack you.

On a vaguely related note, it would be great if, like Dara, you could create a page on the Azimuth Wiki saying a bit about yourself. This would be a nice corrective to claims that you're "bonkers". You could do this by clicking on WebHubTel or Paul Pukite and adding some information. I leave the choice to you, but as you know we strongly favor real names.

David Tanzer wrote:

> Perhaps say a bit about why in general the wavelet should have a mean of zero.

I actually already had: I was feeling very proud of myself for that. I said that the covariance $\langle x y \rangle - \langle x \rangle \langle y \rangle$ reduces to $\langle x y \rangle$ when our wavelet $y$ has mean zero. Indeed, the whole section "Very basic statistics" was a buildup to this reason for focusing on $\langle x y \rangle$ and using it as the definition of the continuous wavelet transform.

But okay, clearly I needed to amplify that point a bit... so I did in the [published version](http://johncarlosbaez.wordpress.com/2014/08/01/exploring-climate-data-part-1/). I also added more about this in Puzzle 3, and I also followed your other suggestions.
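Spelled out as a short worked step, using the same bracket notation for means (the constant $c$ is introduced here only for illustration):

$$
\operatorname{Cov}(x, y) = \langle x y \rangle - \langle x \rangle \langle y \rangle,
\qquad
\langle y \rangle = 0 \;\Longrightarrow\; \operatorname{Cov}(x, y) = \langle x y \rangle .
$$

One consequence is that a constant offset in the signal drops out when the wavelet has mean zero:

$$
\langle (x + c)\, y \rangle = \langle x y \rangle + c \,\langle y \rangle = \langle x y \rangle .
$$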

John asked:

> What correlation did you finally get between the Tahiti and Darwin pressure anomalies... without any denoising? I'd like to put that number in the blog article.

-0.253727

> without any denoising

That is {} in wavelet notation, or RAW in my English nomenclature.

Dara

John, an odd and surprising observation: when I decomposed the 1D time series for both the Darwin and Tahiti data, I got almost the same Energy Fractions (coefficients for each index level of multi-trends):

Darwin {{1} -> 0.226711, {0, 1} -> 0.120706, {0, 0, 1} -> 0.105509, {0, 0, 0, 1} -> 0.152306, {0, 0, 0, 0} -> 0.394767}

Tahiti {{1} -> 0.232513, {0, 1} -> 0.171043, {0, 0, 1} -> 0.142122, {0, 0, 0, 1} -> 0.137091, {0, 0, 0, 0} -> 0.317231}

From experience it is damned hard to get these numbers so close to each other from different signals!

It says: The amount of each Trend in both signals is almost the same.

I cannot further a thought!

Dara

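For readers unfamiliar with energy fractions, here is a minimal sketch of how such numbers can arise from a multi-level wavelet decomposition. It uses an orthonormal Haar transform purely for simplicity; the wavelet Dara actually used is not stated in the thread, the test signal is made up, and `haar_step` and `energy_fractions` are hypothetical helper names.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail

def energy_fractions(signal, levels):
    """Share of total energy in each detail level, followed by the final approximation."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    total = sum(v * v for v in signal)
    fractions = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        fractions.append(sum(v * v for v in detail) / total)
    fractions.append(sum(v * v for v in approx) / total)
    return fractions

# Made-up test signal: a slow oscillation plus a faster, weaker one.
signal = [math.sin(2 * math.pi * t / 16) + 0.3 * math.cos(2 * math.pi * t / 4)
          for t in range(64)]
fractions = energy_fractions(signal, levels=4)
for level, share in enumerate(fractions[:-1], start=1):
    print(f"detail level {level}: {share:.4f}")
print(f"final approximation: {fractions[-1]:.4f}")
```

Because the transform is orthonormal, the fractions sum to 1. The five values listed for each station above also sum to 1 to within rounding.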

John wrote:

> I actually already had: I was feeling very proud of myself for that.

Sorry, I was missing the flow there. I did see the point in the previous section that when the mean is zero, covariance is expressed by an inner product. But then I took the CWT section as introducing a new general idea, which still would be well-defined even without the mean of zero. What I had overlooked there was the motivation throughout this application to take "local covariances" of the signal with a given pattern.

John wrote:

> Now you are using air pressure anomalies at Darwin and Tahiti, where the (average) seasonal variations have been subtracted out.

Where is the reference that shows how this seasonal effect was subtracted? I'd like to see how this is done.

Dara

Hi David, so in relation to post 66, here are two other reasons:

1. The simple engineer's approach: If I've got a function (e.g., a wavelet) which has a non-zero mean, then I can split that into a sum of a constant function whose value is always equal to the mean of the function and a zero-mean function that deals with the deviation around that mean value. Of course, if one is able to use functions which actually have finite support then that expands to "constant functions over the compact support". So in a sense I can get everything from constant functions (possibly over a compact support) and functions of mean zero.
2. The [discussion here](http://math.stackexchange.com/questions/128165/what-is-a-vanishing-moment) says that more generally having zero low-order moments leads to a sparser representation.
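The first reason can be written out as a short worked equation. The symbols $\psi$, $S$, $m$ and $\psi_0$ are introduced here for illustration and do not appear elsewhere in the thread. For a function $\psi$ supported on a set $S$ of length $|S|$:

$$
m = \frac{1}{|S|} \int_S \psi(t)\, dt,
\qquad
\psi = m\, \mathbf{1}_S + \psi_0,
\qquad
\int_S \psi_0(t)\, dt = 0 .
$$

Taking the inner product with a signal $x$ then separates into a local-average term and a zero-mean term:

$$
\int x(t)\, \psi(t)\, dt = m \int_S x(t)\, dt + \int x(t)\, \psi_0(t)\, dt .
$$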

Hello John

The blog is really really kool!

I am working on some code to show what the continuous wavelet transform does. I did the equally spaced discrete version, and I might do some integrals on actual parametric functions.

The code now supports DGaussian and Gabor, but I could add more if you like.

What I humbly suggest is to develop live-code documents to educate serious researchers and students on these computing techniques, using the real data for the actual examples. As you add the mathematical parts, I could then support them by adding the live-code documents with sample data.

Dara


Dara O Shayda wrote:

> John wrote:
>
> > What correlation did you finally get between the Tahiti and Darwin pressure anomalies... without any denoising? I'd like to put that number in the blog article.
>
> -0.253727

Thanks. I've added that now, together with an explanatory remark:

> For example, if we compute the correlation between the air pressure anomalies at Darwin and Tahiti, measured monthly from 1866 to 2012, we get -0.253727. This indicates that when one goes up, the other tends to go down. But since we're not getting -1, it means they're not completely locked into a linear relationship where one is some negative number times the other.

Sorry for taking a while - I've been distracted by other work.

> > Now you are using air pressure anomalies at Darwin and Tahiti, where the (average) seasonal variations have been subtracted out.
>
> Where is the reference that shows how this seasonal effect was subtracted, I like to see how this is done.

All the information I know is available at this link in the blog article we just wrote:

* [Southern Oscillation Index based upon annual standardization](http://www.cgd.ucar.edu/cas/catalog/climind/soiAnnual.html), Climate Analysis Section, NCAR/UCAR.

There are 3 papers listed here, which may contain more details.
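As a rough illustration of what "subtracting out the average seasonal variation" can mean, the sketch below removes each calendar month's long-term mean from a monthly series. This is a generic climatology subtraction written under that assumption. It is not claimed to be the NCAR procedure, which may differ (for example in the choice of base period or in any standardization step). `monthly_anomalies` is a hypothetical helper and the pressure values are made up.

```python
import math

def monthly_anomalies(values, start_month=1):
    """Subtract each calendar month's mean (over all years) from a monthly series."""
    sums = [0.0] * 12
    counts = [0] * 12
    for i, value in enumerate(values):
        month = (start_month - 1 + i) % 12
        sums[month] += value
        counts[month] += 1
    climatology = [sums[m] / counts[m] if counts[m] else 0.0 for m in range(12)]
    return [value - climatology[(start_month - 1 + i) % 12]
            for i, value in enumerate(values)]

# Made-up pressures: a fixed seasonal cycle over 4 years, plus one unusual month.
pressures = [1010.0 + 3.0 * math.cos(2 * math.pi * (i % 12) / 12) for i in range(48)]
pressures[30] += 1.0  # a one-off departure from the usual seasonal value

anomalies = monthly_anomalies(pressures)
print(f"anomaly in an ordinary month: {anomalies[0]:+.2f}")
print(f"anomaly in the unusual month: {anomalies[30]:+.2f}")
```

The repeating seasonal cycle cancels exactly in this toy series, leaving only the one-off departure and its small footprint on that calendar month's mean.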

Thanx John, looks great, I will download those papers and see what they did to the data.

Dara
