Hi John, I see that in the blog article you're building up the correlation coefficient from covariance. That's probably the right thing to do, particularly given the other parts of the El Niño series, but I'm just checking that you're aware that you can also develop some ideas about correlation as follows:

If we've got a time series of data $(x_i)$ and a fixed total amount of "energy" in $y$-stuff, equal to the energy in the $x_i$s (say due to normalisation), that we can distribute into a time series $(y_i)$, then it's clear that $\sum_{i=1:N} (x_i-y_i)^2$ gives small values for a very good match and progressively bigger values as the two series become less correlated (in the intuitive sense of the word). Observe that

\[
\sum_{i=1:N} (x_i-y_i)^2 = \sum_{i=1:N} (x_i^2 - 2 x_i y_i + y_i^2) = \sum_{i=1:N} x_i^2 - 2 \sum_{i=1:N} x_i y_i + \sum_{i=1:N} y_i^2
\]

Since $\sum_{i=1:N} x_i^2$ is a constant and we've assumed that the total energy $\sum_{i=1:N} y_i^2$ is fixed, the cross term $\sum_{i=1:N} x_i y_i$ contains all the varying behaviour of the $\sum_{i=1:N} (x_i-y_i)^2$ measure, only moving in the opposite direction (i.e., big values indicate good matches).
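In case a numerical check is useful, here is a minimal NumPy sketch of that identity. The helper `normalise`, the noise levels and the random test data are all made up for illustration. It also assumes each series is mean-centred as well as scaled to unit energy, which goes a bit beyond the setup above; under that extra assumption the cross term coincides with the usual correlation coefficient.

```python
import numpy as np

def normalise(v):
    """Centre v and scale it to unit energy (sum of squares equal to 1).

    The centring step is an extra assumption, added so that the
    cross term can be compared with the Pearson correlation.
    """
    v = np.asarray(v, dtype=float)
    v = v - v.mean()
    return v / np.sqrt(np.sum(v ** 2))

rng = np.random.default_rng(0)
x = normalise(rng.standard_normal(200))

results = []
# Candidate y series: x plus increasing amounts of noise, renormalised
# so that every y has the same energy as x.
for noise in (0.1, 1.0, 10.0):
    y = normalise(x + noise * rng.standard_normal(200))
    sq_dist = np.sum((x - y) ** 2)   # sum of squared differences
    cross = np.sum(x * y)            # the cross term sum x_i y_i
    pearson = np.corrcoef(x, y)[0, 1]
    results.append((sq_dist, cross, pearson))
    print(f"noise={noise:5.1f}  sq_dist={sq_dist:.4f}  "
          f"cross={cross:.4f}  pearson={pearson:.4f}")
```

With both energies equal to 1, the expansion reduces to `sq_dist = 2 - 2 * cross`, so the two printed columns should move in opposite directions.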

That's probably a more complicated setup than the covariance/correlation one, but I thought I'd mention it.
