Some deep-learning algorithm out there will find the ENSO pattern eventually, but perhaps not anytime soon. This is how badly a neural network does on natural variability, as reported at last week's AGU meeting:
![](https://pbs.twimg.com/media/EMFAXFvWoAAPjOQ.png)

It's so bad that they punt and admit up front that it won't work on natural variability, yet later in the presentation they show that it does work on man-made forced variability.

![](http://imageshack.com/a/img923/3290/npNE0D.gif)

This is rather obvious, and it would be a wonder if it didn't work: all the NN has to do is train on the CO2 emissions time series, since the warming trend and the CO2 rise are already well correlated.
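The point can be made with a few lines of code. This is a minimal sketch on synthetic, illustrative data (the CO2 ramp and temperature series below are made up for demonstration, not observations): when the response is essentially a monotonic function of the forcing plus noise, even a trivial one-feature fit looks impressive.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1960, 2020)

# Synthetic CO2 concentration ramp (ppm) -- illustrative only,
# loosely shaped like the observed accelerating rise
co2 = 315 + 0.8 * (years - 1960) + 0.012 * (years - 1960) ** 2

# Synthetic temperature anomaly: logarithmic response to CO2 plus noise
temp = 2.0 * np.log(co2 / 315) + rng.normal(0, 0.05, years.size)

# "Training" on the emissions curve alone already captures the trend:
r = np.corrcoef(np.log(co2), temp)[0, 1]
print(f"correlation of log(CO2) with temperature: {r:.2f}")
```

A network fed this kind of forced signal has an easy job; reproducing the trend says little about whether it has learned any physics.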

But unless the NN knows the math behind the nonlinear Laplace's Tidal Equation (LTE) solution and the fact that tidal forcing plays a role, it will never get the ENSO natural variability correct. In other words, it can't train on itself if the pattern is deeply obscured by unknown forcing factors further modulated by an unknown nonlinear transfer function.
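Here is a toy sketch of why that obscuring matters. The periods and the sinusoidal modulation below are stand-ins, not the actual tidal constituents or the LTE solution: a response that is a nonlinear function of a multi-period forcing shows almost no linear correlation with the forcing itself, even though the relationship is exact once the transfer function is known.

```python
import numpy as np

t = np.linspace(0, 1000, 40000)

# Stand-in "tidal" forcing: two incommensurate periods (illustrative
# values, not real lunar cycles)
f = np.sin(2 * np.pi * t / 27.55) + 0.7 * np.sin(2 * np.pi * t / 13.66)

# Response modulated by a nonlinear transfer function (a sketch of the
# LTE-style idea, not the actual solution)
y = np.cos(5.0 * f)

# Linearly, the forcing looks unrelated to the response...
r_linear = abs(np.corrcoef(f, y)[0, 1])

# ...but with the correct transform the match is exact
r_true = np.corrcoef(np.cos(5.0 * f), y)[0, 1]
print(f"linear correlation: {r_linear:.2f}, known transform: {r_true:.2f}")
```

A learner that only ever sees the raw forcing, or only the response, has no gradient toward the right answer; the structure is invisible until the nonlinear mapping is supplied.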

The presentation is here: https://youtu.be/O67El-fqA9c

An ArXiv paper is here: https://arxiv.org/abs/1912.01752
"Physically Interpretable Neural Networks for the Geosciences: Applications to Earth System Variability"

> "Network interpretation techniques have become more advanced in recent years, however, and we therefore propose that the ultimate objective of using a neural network can also be the interpretation of what the network has learned rather than the output itself.
> We show that the interpretation of a neural network can enable the discovery of scientifically meaningful connections within geoscientific data. By training neural networks to use one or more components of the earth system to identify another, interpretation methods can be used to gain scientific insights into how and why the two components are related. In particular, we use two methods for neural network interpretation. These methods project the decision pathways of a network back onto the original input dimensions, and are called "optimal input" and layerwise relevance propagation (LRP). "