Thanks to David Tanzer for helping to organise this discussion. As has been mentioned, I'm currently a bit flaky, so I would really advise against any plans that depend upon me delivering certain results in a certain timeframe. However, to add to the bullet points:

* I'm currently working on some software for doing sparse linear/bilinear regression against medium-to-large feature vectors. I hope to get this completed and run it against a big collection of min/max correlations between various "measurement points" at different temporal offsets (there's a sketch of that kind of feature just below). This is mainly exploratory, attempting to use (bi-)linear relationships to provide some ideas for more detailed, physically based models; as many people have observed, El Niño behaviour is definitely not just a simple linear phenomenon. (The kind of thing I'm thinking of is, say, that a positive correlation between SF bay and the sea around Japan at the same time is important, and so is a negative correlation between the areas at some distance around the El Niño 3.4 box and the points within the box 3 months later. This might be plausible because, say, due to energy conservation, behaviour outside the box has to move towards the mean as the area inside the box moves away from the mean. But a goal is to avoid making too many assumptions and just explore the data.) The code is being put up as I'm writing it [on github](https://github.com/davidtweed/multicoreBilinearRegression), and anyone is welcome to do anything they wish with it (especially if I complete it).
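To make the "min/max correlations at different temporal offsets" idea concrete, here's a rough sketch in Python of how such features might be computed. This is illustrative only, not the repository code; the function names, the use of Pearson correlation, and the lag handling are all assumptions on my part.

```python
# Illustrative sketch (not the repository code) of lagged min/max
# correlation features between pairs of "measurement point" time series.
import numpy as np

def lagged_correlation(x, y, lag):
    """Pearson correlation between x(t) and y(t + lag)."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    elif lag < 0:
        x, y = x[-lag:], y[:lag]
    return np.corrcoef(x, y)[0, 1]

def correlation_features(series, lags):
    """For every ordered pair of series, record the min and max
    correlation over the given temporal offsets."""
    feats = []
    for i in range(len(series)):
        for j in range(len(series)):
            if i == j:
                continue
            cs = [lagged_correlation(series[i], series[j], L) for L in lags]
            feats.extend([min(cs), max(cs)])
    return np.array(feats)
```

So, in the example above, one such feature would be the (negative) correlation between points around the El Niño 3.4 box and points inside it at a 3-month offset.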

A while back Dara asked whether this code could be used for doing non-linear fitting, and I didn't get around to answering. To address that, the code assumes that:

1. you've got a prediction function $f$ of a multivariate parameter $p$ such that
   $$f(p) = \sum_{j\in 1:K} f_j(p \cap P_j),$$
   i.e., the prediction can be broken into a simple sum of predictors that depend only on some particular subdivision of $p$ into subsets $P_j$;

2. to optimise $f_j(p \cap P_j)$ you get reasonable results by optimising over each scalar element of the parameters in turn for multiple cycles until there's no change. (This is true for things like linear models, but you can imagine predictors where the influences of the different variables are so deeply intertwined that optimising along one co-ordinate without also simultaneously considering the others will bounce around forever without converging.)

As such, it could be used for fitting against a _known in advance_ set of non-linear functions, provided they aren't so non-linear that assumption 2 no longer holds. A rough sketch of the coordinate-wise scheme follows below.
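For concreteness, here's a minimal sketch of the cyclic coordinate-wise optimisation in assumption 2, for the simplest case where each $f_j$ is linear and the objective is least squares. Again this is illustrative Python, not the actual repository code; the function name and the specific update formula are my assumptions.

```python
# Minimal sketch (not the repository code): fit an additive model by
# cyclically optimising one scalar parameter at a time until no
# coordinate moves, per assumption 2 above.
import numpy as np

def coordinate_descent(X, y, blocks, n_sweeps=100, tol=1e-8):
    """Least-squares fit of weights w by cyclic coordinate descent.

    X      : (n_samples, n_params) design matrix
    y      : (n_samples,) targets
    blocks : list of index arrays partitioning the columns of X,
             playing the role of the subsets P_j
    """
    w = np.zeros(X.shape[1])
    r = y - X @ w                      # current residual
    for _ in range(n_sweeps):
        max_delta = 0.0
        for block in blocks:
            for k in block:            # optimise one scalar coordinate
                xk = X[:, k]
                denom = xk @ xk
                if denom == 0.0:       # skip degenerate all-zero columns
                    continue
                # closed-form 1-D least-squares update for w[k]
                delta = xk @ r / denom
                w[k] += delta
                r -= delta * xk
                max_delta = max(max_delta, abs(delta))
        if max_delta < tol:            # no coordinate moved: converged
            break
    return w
```

For a linear model each one-dimensional subproblem has the closed form shown, which is why the "optimise each scalar in turn" cycling converges; for the deeply intertwined predictors mentioned above, no such clean per-coordinate update exists and the sweeps can bounce around indefinitely.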

I really hope to finish this software, at least to the point where I can provide some interesting plots for the blog article, and hopefully further.

----

Also, just to note that I'm not against binary classification: for classifiers that are inherently based upon a binary decision (e.g., SVMs, random hyperplane trees, etc) you really want to have a binary output to be trying to estimate. I'm just a little bit wary of any technique that takes a "real number-prediction model" and then makes it binary by applying some form of sigmoid function to the output (e.g., a logistic function in logistic regression). The sketch below shows the pattern I mean.
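To spell out that pattern, here's a minimal illustrative sketch (my own toy example, not any particular library's API): a real-valued linear score squashed through a logistic sigmoid and then thresholded to give a binary answer.

```python
# Illustrative only: a real-valued linear prediction made binary by
# sigmoid squashing plus thresholding, as in logistic regression.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_binary(X, w, threshold=0.5):
    """Real-valued score X @ w squashed to (0, 1), then thresholded."""
    scores = X @ w                 # the underlying real-number prediction
    probs = sigmoid(scores)        # logistic squashing
    return (probs >= threshold).astype(int)
```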
