
# Simple model for temperature variability and global warming.

By using just a few climate factors, one can reproduce the global temperature series with incredibly fine detail. The green curve is the NASA GISS temperature data and the dark blue is the model.

The overall trend is scaled log(CO2). The multidecadal wobble comes from deviations of the length of day (-dLOD).

The fine structure is made up of the ENSO Southern Oscillation Index (-SOI), which is the atmospheric pressure at Tahiti minus the pressure at Darwin.

Any place the fine structure is not represented well by the SOI, cooling spikes caused by significant volcanic eruptions are more than likely to blame. Thus, the model includes volcanic eruptions with a volcanic explosivity index (VEI) of at least 5.

Those are the main ingredients so far. Since the model works so well, I have a few more factors that I experiment with. One is a correction for the WWII years, where the temperature calibration appears to be very poor.
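For intuition, the recipe above amounts to a small multiple regression of the temperature anomaly on the named factors. The sketch below is a minimal illustration only, with assumed input arrays (the actual series come from NASA GISS, NOAA, and Sato's aerosol index); `fit_temperature` is a name I'm introducing here, not part of the original model code:

```python
import numpy as np

def fit_temperature(T, co2, dlod, soi, aerosol):
    """Regress temperature anomaly T on log(CO2), -dLOD, -SOI and a
    volcanic aerosol index (VEI >= 5 eruptions); returns the least-squares
    coefficients and the resulting model series."""
    X = np.column_stack([
        np.log(co2),        # overall warming trend
        -dlod,              # multidecadal wobble
        -soi,               # ENSO fine structure
        aerosol,            # volcanic cooling spikes
        np.ones_like(co2),  # intercept
    ])
    coef, *_ = np.linalg.lstsq(X, T, rcond=None)
    return coef, X @ coef
```

With the real series in hand, the residual of this fit is what would remain unexplained by the four ingredients.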

I have shown this around on various blogs, and the most common accusation I get concerns over-fitting. Fitting curves and manifolds to data is what science is all about when it comes down to it, so I guess that criticism comes with the territory.

1.
edited March 2015

I think it would be good if you posted some links to the most interesting of the blog discussions. Techniques for detecting overfitting apart from empirical prediction are pretty important and well worth some wiki pages imo.

2.

Jim, I got some feedback from the And Then There's Physics blog, in a thread dedicated to analyzing internal variability. http://andthentheresphysics.wordpress.com/2015/03/01/some-thoughts-on-internal-variability/

The two biggest complaints I got were (1) that using LOD had no physical justification, and (2) that using only CO2 to represent GHGs was wrong.

There was also a discussion about using the correct thermal impulse response in, for example, the volcanic forcings.

Having a discussion around LOD is perfect for a physics forum. What the LOD anomaly represents is a loss or gain in the earth's rotational kinetic energy. As a relative number it is obviously very small, but it can represent a number of different mechanisms in the earth's mantle, oceans, or atmosphere. The correlation to temperature was apparently first observed back in 1976:

Lambeck, K., and A. Cazenave, 1976: "Long-term variations in the length of day and climatic change." Geophys. J. Roy. Astron. Soc., 46, 555-573.

The general idea is that heat is produced or absorbed as the kinetic energy of the earth changes, and the association is noticed only through this proxy correlation. The significance of this finding was revealed by Dickey and colleagues at NASA JPL, who noticed that the correlation does not appear to include warming due to AGW, only natural temperature variations. This makes it perfect for compensating the temperature series for natural long-term variations, leaving the man-made portion alone.

Dickey, Jean O., Steven L. Marcus, and Olivier de Viron. “Air temperature and anthropogenic forcing: insights from the solid earth.” Journal of Climate 24.2 (2011): 569-574.

3.

As far as fine structure goes, note that around 1912 the model temperature does not drop quite as much as the data shows. Could this be a case where the model is not capturing a significant cooling?

Consider that in June 1912, the 20th century's most voluminous volcanic eruption occurred: Novarupta, on Alaska's Katmai Peninsula. http://pubs.usgs.gov/pp/1791/

![Novarupta](http://imageshack.com/a/img905/8353/vkaXaI.gif)

The chart above shows a magnified view of that time span. The model already includes a response to the volcano, according to Sato's data from NASA GISS. But perhaps there is something more to that specific volcano. Could more aerosols have been pushed into the atmosphere during this transient interval, thus suppressing temperatures more than estimated?

The temptation is to work this kind of model to its limits. When a model can capture the dynamics to a certain level of fidelity, that is not an indication to stop investigating but a sign that one can potentially plumb the depths to extract even more information.

4.

WHT. Great. I'm working my way through the posts. Would you prefer any questions or comments to go on the original posts or here?

5.

Jim, any comments are fine. I will need some feedback on one of my recent fitting procedures. This is something I noticed with respect to the Southern Oscillation Index measure. Since the SOI characteristic is a measure of the Darwin-Tahiti pressure dipole, I decided to modify the fitting procedure slightly.

What I did was to use one of three values:

1. If the modeling error was minimized by the Darwin-Tahiti value, I used that difference.
2. If the error was minimized by a value closer to the Darwin value, I used 2*Darwin.
3. If the error was minimized by a value closer to the Tahiti value, I used -2*Tahiti.

The rationale for this approach is that since the atmospheric pressure readings are very noisy, the true value may be revealed in one of the two dipole measures instead of the "average". If one of the locations is influenced by an occasional cyclone, which would transiently throw off the atmospheric pressure at that geographical location, that value is discarded in favor of the other.
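The three-way selection rule can be sketched in a few lines. This is a hypothetical rendering, not the actual fitting code: `target` stands in for whatever value the rest of the model would need the ENSO term to take at each time step, and the function simply picks the closest of the three candidates.

```python
import numpy as np

def select_soi(darwin, tahiti, target):
    """For each time step, pick whichever of Darwin-Tahiti, 2*Darwin,
    or -2*Tahiti comes closest to the target value -- the self-selection
    step whose statistical fairness is questioned in the text."""
    candidates = np.stack([darwin - tahiti, 2 * darwin, -2 * tahiti])
    pick = np.argmin(np.abs(candidates - target), axis=0)
    return candidates[pick, np.arange(len(target))]
```

Because the selection looks at the fitting error itself, each chosen value can only reduce the residual, which is exactly why the improved correlation needs independent justification.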

However, I am not sure how statistically sound this approach is. It obviously improves the correlation coefficient, but the concern is that this is an unfairly biased procedure.

The subjective reason I tried this procedure is that the purely unbiased approach of using Darwin-Tahiti looked tantalizingly close to reducing the residual error, but the occasional noisy value would pull the least-squares fit in the wrong direction. By self-selecting a value, the fit improved impressively.

There is no right or wrong way of analyzing this data when one is trying to get at the fundamental properties, and this is one of those approaches that could rightly be criticized but whose benefits may be worthwhile. I am trying to find a precedent for this type of approach in other areas of experimental science but have come up empty. It is kind of like playing best-ball golf -- scores will improve, but is it fair?

6.

A naive question I had re. your sloshing analysis pdf is how do your equations deal with what we think is a regime change circa 1976? Don't such events preclude prediction or even continuation with the same parameters?

If outlier cyclones distort the fit, how is ignoring half the information different from manually removing outliers?

Sorry I can't help with anything more than stats 101.

7.

Jim, something changed around 1976, which is most evident from the wavelet time series. It is possible that prediction is difficult at best and all we are doing is post-mortem analysis.

Yesterday NOAA announced that an El Nino has finally arrived. This is odd, given that the SOI is settling down to neutral conditions at the same time. Much of the internal discussion right now is on whether decades of AGW-induced warming of the Pacific are actually impacting the rules for what constitutes an El Nino. It used to be that a warming anomaly of 4 to 5 consecutive months was required before an El Nino was declared, but it could be that sustained warmth is screwing with the rules, so that this El Nino is an El Nino "by definition" only.

Yet, by the same token, it may be that AGW is providing a non-stationary stimulus to ENSO, so that we are really winging it by trying to predict anything.

> "If outlier cyclones distort the fit, how is ignoring half the information different from manually removing outliers?"

It would be better if I did have information on what the outliers are. As it is, I just have an idea that one of the two values of the SOI dipole is a better indicator of extreme ENSO conditions. A perfect dipole of Darwin and Tahiti would give a correlation coefficient of -1, but the actual value is only -0.7, which means we aren't picking out a "perfect" dipole to regress against. What I am doing when stressing either Tahiti or Darwin is exaggerating the dipole nature. I am still not sure that this is fair, but it is certainly an intriguing approach that shows how important ENSO is in *quantitatively* explaining natural temperature variability -- we already know how important ENSO and El Ninos are in *qualitatively* explaining global heat waves.
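A toy numerical illustration of why the correlation falls short of -1: if Darwin and Tahiti each record a common, oppositely signed dipole signal plus independent station noise, the cross-correlation is pulled well away from -1. This uses synthetic data only, not the real station records, and the noise amplitude is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
signal = rng.normal(size=n)                   # shared dipole signal
darwin = signal + 0.7 * rng.normal(size=n)    # dipole + local noise at Darwin
tahiti = -signal + 0.7 * rng.normal(size=n)   # opposite sign + noise at Tahiti
r = np.corrcoef(darwin, tahiti)[0, 1]         # comes out near -0.67, not -1
```

With signal variance 1 and noise variance 0.49 the expected correlation is -1/1.49, i.e. about -0.67, so an observed -0.7 is consistent with a genuine dipole seen through substantial local noise.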

8.

Ever heard of the so-called "pause" in global warming?

9.

?

10.

One claim is that many of these compensating temperature effects, such as ENSO, can lead to an extended pause. You can kind of see this in the chart above: a plateau starting around the year 2000. Michael Mann has referred to this behavior as a "Faux Pause".

11.

Did you come across Jan Galkowski's "Warming slowdown" posts: https://johncarlosbaez.wordpress.com/2014/05/29/warming-slowdown-2/ which imo didn't get wide enough circulation?

12.

Jan has just done another great post on his blog https://hypergeometric.wordpress.com/ about regime shifts (including 1978):

> Professor Peter Congdon reports on two Bayesian models for global temperature shifts in his textbook, Applied Bayesian Modelling, as "Example 6.12: Global temperatures, 1850-2010", on pages 252-253. A direct link is available online. The first is apparently original with Congdon, but the second is an adaptation of Thomson, Kimmerer, Brown, Newman, MacNally, Bennett, Feyrer, and Fleishman, "Bayesian change point analysis of abundance trends for pelagic fishes in the upper San Francisco Estuary", Ecological Applications, 20(5), 2010, 1431-1448.
>
> The first model supports changes in levels, and assumes change points are uniformly distributed across the entire history. It finds three change points during 1850-2010, approximately in the years 1929, 1978, and 1996, dividing the history into four regimes, the first two being relatively downlevel, and the last two uplevel, with 1996-2010 being especially pronounced. This offers an alternative and intriguing interpretation of the data discussed both at the Azimuth Project, and especially results of Fyfe, Gillett, and Zwiers reviewed therein, and by Tamino. The results are robust against alternative uniform priors, still yielding 1929, 1978, and 1996 as change points. The deviance information criterion ("DIC") value for the model is -552 and expected predictive deviance ("EPD") is 0.319.
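For intuition about what such a level-shift model is doing, here is a crude non-Bayesian stand-in: find the single split point that minimizes the total within-segment squared deviation when the series is modeled as two constant levels. This is my simplification for illustration, not the model Congdon actually fits (which is Bayesian and allows multiple change points):

```python
import numpy as np

def best_changepoint(y):
    """Return the index splitting y into two constant-level segments with
    the smallest total squared deviation -- a least-squares caricature of
    a single Bayesian level-shift change point."""
    n = len(y)
    costs = [y[:k].var() * k + y[k:].var() * (n - k) for k in range(2, n - 1)]
    return 2 + int(np.argmin(costs))
```

Applied recursively (or extended to three splits), the same idea would recover the multi-regime structure the quoted analysis finds.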

13.

Thanks Jim. A change point analysis certainly leaves room for thought. What I think is that the change points are *entirely* embedded in the ENSO signal. The rationale for this is that the ENSO signal itself is enough to capture most of the global temperature variability.


As far as the change points at 1929, 1978, and 1996 go, those are transition regions that have given me problems in modeling ENSO via my favored sloshing analysis. For an early analysis I did, I started from the year 1932, below:

![SOIM](http://imagizer.imageshack.us/a/img845/835/u1e3.gif)

Extending the solution to before 1930, the time series got out of sync.

Possibly some clues here:

Chao, Benjamin F., and Wei-Yung Chung. "Amplitude and phase variations of Earth's Chandler wobble under continual excitation." Journal of Geodynamics 62 (2012): 35-39.

> "The motivation of this study is the well-known, but so-far baffling, behavior of the Chandler wobble around 1925 and into the 1930s, when it was found to reach a near-zero minimum in amplitude and undergo a concurrent phase reversal"

14.

Jim asked:

> "If outlier cyclones distort the fit, how is ignoring half the information different from manually removing outliers?"

I don't have an answer, because I am not sure how to identify a real transient distortion. But it looks like we are in the middle of a transient event right now. This passage from the Australian BOM explains what might be happening.

![BOM](http://imageshack.com/a/img540/5397/nJyNMI.gif)

Two counter-rotating cyclones are whipping up westerlies, which will pile up water and potentially trigger a bigger El Nino if it is in phase with the underlying sloshing motion.

![cyclones](http://pbs.twimg.com/media/B_0iYJZWAAAc27s.png:small)

15.

Wow, this looks exciting:

![csalt](http://imageshack.com/a/img909/5498/nEjPVr.gif)

If the model that produces this is sufficiently simple, I think you can easily defend yourself against the charges of overfitting by a good statistical argument. You may need to recruit a statistician or two to make this argument... and I think that could be a fun project for Azimuth to concern itself with.

On a different but related note, I someday want to write a blog post accusing a lot of more popular climate models - much more complicated ones, full atmosphere-ocean general circulation models - of overfitting. I believe there's some good evidence that this is occurring. But I want to get my ducks lined up before I make any public claims.

I also think *that* would be a good project for Azimuth to concern itself with.

16.

John, the discussion within the climate science establishment over using simple models is evident. See this long transcript from a one-day [American Physical Society (APS) workshop on climate change with six invited climatologists](http://www.aps.org/policy/statements/upload/climate-seminar-transcript.pdf):

> *Dr. SANTER*: And then there is this issue of an unusual manifestation of natural variability in the observations associated with ENSO, the PDO or some combination thereof. So, let's look at this first. Do ENSO effects explain the stasis? So, what we did is we used some iterative regression-based method to remove ENSO effects. Turns out you have got to be a little careful because there's co-linearity between ENSO and Volcanos. And that matters over this period of record. So, if you just plug everything into some multiple regression framework, you get the wrong answer.

and then you get a contrary view:

> *Dr. HELD*: The tropical Pacific is powerful. ... And also it's motivated. This isn't arbitrary. It's motivated by the fact that it looks like the recent past has been La Niña-like. And so, just go in and let the model do it for you. And it looks pretty good.

The argument is over being able to compensate out the natural variability due to a few overriding factors such as ENSO, volcanos, etc. This characteristic has long been known -- see Foster & Rahmstorf, Lean et al., and Hansen et al., among many others. Yet you see a huge resistance to pushing this idea forward.

I really don't get it. I may indeed be over-fitting, but that is something scientists like to play around with -- to see how far they can go with an idea. But to just say that it gives the "wrong answer" is really questionable to me.

17.
edited March 2015

I would say your model is "overfitting" if you're obtaining this good agreement by having enough adjustable elements in your model - both parameters and conceptual elements - that you could get agreement approximately this good *no matter what* the small wiggles in the observed data look like. It's possible to do statistics to study whether this is going on. I'm hoping that it is *not* going on.

(My definition of overfitting here is rough. There are people who really know about this stuff, and I'd like to learn what they know.)

To prove that overfitting is not occurring, making the model more complicated to make it fit even better is not what we want to do; instead, it would be helpful to make the model as simple as possible while still fitting the data remarkably well.

The payoff of proving that there's no overfitting is that we'd get evidence that the mechanisms you say are important actually *are* important. And since some of these mechanisms are unorthodox (namely the changes in length-of-day), that would be important.
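One standard statistical check along these lines is out-of-sample validation: fit the coefficients on the early part of the record and see whether the prediction error on the held-out later part blows up relative to the in-sample error. A generic sketch under assumed inputs (a design matrix `X` of factor columns and a target series `y`), not tied to the model's actual data:

```python
import numpy as np

def holdout_errors(X, y, split):
    """Least-squares fit on the first `split` rows only; return
    (in-sample MSE, out-of-sample MSE). A large gap between the two
    is the classic signature of overfitting."""
    coef, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
    mse_in = float(np.mean((X[:split] @ coef - y[:split]) ** 2))
    mse_out = float(np.mean((X[split:] @ coef - y[split:]) ** 2))
    return mse_in, mse_out
```

If the temperature model's handful of factors predicts the withheld decades about as well as the fitted ones, that would directly counter the overfitting charge.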
