I've been away from computers a lot recently, and what little time I have had has mostly been spent working on an improved mechanism for easy-to-write data crunching (in particular, model-based exploratory data analysis) for the language Julia. (This has partly been motivated by the well-known observation that programming languages incur "re-implementation costs for infrastructure", which means new languages only take off when their new elements are compelling enough to overcome this cost. Otherwise they just lie on the shelf unused.) Julia seems to have both the compelling new elements and enough general momentum that it might take off. So getting some data crunching (the way I, in my wisdom, think it's best done) in there might make working with data, including ecological data, easier and more productive. (This is a roundabout way of saying this is a bit related to Azimuth.)
Anyway, I'm reaching the point of being able to run stuff, and I'm looking for any interesting, primarily numerical data sets to crunch on (some categorical elements would be fine). I'm looking for stuff in the 1-4GB range: big enough to be a real test, small enough to fit in the main memory of a recent workstation, and not so big that splitting it over multiple machines is strongly advised. I've found there's a reasonable selection of financial or social network-type datasets available, but it would be nice to feel that at the very least my testing was working on environmental data, if only to show that it's something that can be done.
Obviously this is an odd request: caring about data size rather than content means it's difficult to look up on the main wiki, so I thought I'd briefly ask here.