@Igor It sounds like you probably know at least as much about them as I do; I haven't worked with them directly enough to have any useful insights into how someone more experienced might see them.

I don't know if any of this will help, but here are a few impressions about tensors that I've picked up (a far cry from the "tensors are magic, and nearly inexplicable" perspective I used to have; I attribute much of the change to Loring Tu's Introduction to Manifolds, where he talks about tensors like they're no big deal):

Tensors seem to be more about bookkeeping than anything else. The process seems only one notch more sophisticated than putting numbers into an ordered pair. (On that note, I don't recall the exact relationship between the cartesian product of two vector spaces and their tensor product; was it that the cartesian product maps into the tensor product via the canonical bilinear map $(v, w) \mapsto v \otimes w$, whose image is the set of simple tensors?)
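Since dimensions came up, here's a tiny Python sketch of the bookkeeping difference between the two constructions (purely illustrative; the choice of 3 and 3 is just an example):

```python
# Dimension bookkeeping for two vector spaces V (dim m) and W (dim n):
# the cartesian (direct) product V x W has a basis of m + n vectors,
# while the tensor product V (x) W has a basis of m * n simple tensors.
m, n = 3, 3
dim_cartesian = m + n   # 6: basis vectors (e_i, 0) and (0, f_j)
dim_tensor = m * n      # 9: basis tensors e_i (x) f_j
print(dim_cartesian, dim_tensor)  # 6 9
```

So for anything bigger than one-dimensional spaces, the two can't be isomorphic; the cartesian product only accounts for the simple tensors, not their sums.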

But the strength of the tensor notation seems to stem from the fact that we're keeping factors separate. If you can keep pieces of data separate all the way through a problem, then you never have to deal with the ambiguity introduced when numerous pairs of elements would, if you simply multiplied them, give the same result. Say my problem uses the tensor $4 \otimes 5$, treating the factors as formal symbols. If I multiply them, I have to start referring to my quantity as "two factors that multiplied to yield 20," and I may have to rediscover those factors later. Worse, I may not have a way to distinguish my original tensor from $5 \otimes 4$, $10 \otimes 2$, and so on. But in the tensor notation, I never introduce that ambiguity.
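That factor-collapsing point fits in a few lines of Python (a toy illustration of the bookkeeping idea, not real tensor code):

```python
# Three distinct "pairs of factors" that all collapse to the same product.
pairs = [(4, 5), (5, 4), (10, 2)]
products = [a * b for a, b in pairs]
print(products)         # [20, 20, 20] -- indistinguishable after multiplying
print(len(set(pairs)))  # 3 -- the pair representation keeps them apart
```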

About the dimension of the tensor product as a vector space... I'm no longer the kind of person who gives much thought to the geometry of things beyond three dimensions. (Projections back into three or fewer dimensions, sure.) To me, the fact that the tensor product of two 3-dimensional vector spaces is 9-dimensional just means that it takes nine pieces of data to uniquely specify an element of that tensor product. The programmer in me just shrugs and declares nine variables with little regard to their geometry. And once the problem is nearly done, the universal property of tensor products says that any bilinear map on the pair of spaces factors uniquely through the tensor product, so there will be a unique linear way to map each tensor into the codomain of the problem.
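Here's what that "nine variables" view looks like in plain Python, with a made-up bilinear map to illustrate the universal property (the names `bilinear` and `induced_linear`, and the choice of coefficients, are just for this sketch):

```python
# A simple tensor u (x) v in R^3 (x) R^3, stored as its 3x3 array of products.
u = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]
T = [[ui * vj for vj in v] for ui in u]  # the nine pieces of data

# A bilinear map B is determined by its values c[i][j] = B(e_i, f_j);
# here c is the identity matrix, so B is the ordinary dot product.
c = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def bilinear(x, y):
    return sum(c[i][j] * x[i] * y[j] for i in range(3) for j in range(3))

def induced_linear(t):
    # The unique linear map on the tensor product promised by the universal
    # property: contract the nine entries against the nine values c[i][j].
    return sum(c[i][j] * t[i][j] for i in range(3) for j in range(3))

# The two routes agree on simple tensors: B(u, v) == L(u (x) v).
print(bilinear(u, v), induced_linear(T))  # 32.0 32.0
```

The "unique way to map each tensor into the codomain" is just `induced_linear`: once you know what the bilinear map does to basis pairs, the nine stored numbers determine everything.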

btw, I don't have any experience actually doing that last part yet; it's just my understanding of how tensors are 'meant' to work. I hope to get a chance to put them to the test sooner or later.