Options

Github repository containing seeds of array programming extension availabe

Hi,

this is in chat because it's "half" related to Azimuth. Some of you may know the newly developing language Julia for scientific computing. One of the things in the "theory of technology" that seems well founded to me is the observation that a technology won't take off because it's just "a bit better" than what already exists, because any conversion has immediate costs that make it unwise to migrate just for a small benefit. A technology has to be "much better" (it's commony stated as "twice as good") in order to make converting existing stuff worthwhile. Julia is the first thing that I've seen in a while where I've felt it might be able to replace signficant amounts of Matlab/fortran/C++ (technical reason: it's actually being designed with JIT compilation in mind, whereas retrofitting a JIT to those languages runs into issues with their existing semantics.) It's also young enough that signficant changes in the language are still possible.

As such, I've been thinking about how to extent Julia (or, in the event the Julia guys don't want it, some other language) with array programming syntax. I've had to do this non-openly until I received written confirmation from my employer that this after-hours activity would be considered completely separate from anything I produce for them. That's finally come through, and so I'm going to start developing in the open. The repository is over at github. Note that it's not even functioning yet: after seeking some advice from Graham Jones about trees I think I've finally figured out a useful representation for loop nestings, but I still have to actually program that in. So it' s not remotely ready for actual use yet, but I'm putting it out there as part of doing open development (now that I'm finally able to do so).

(BTW: haven't mentioned the github repo being up on the Julia development mailinglist yet because I'd like to at least get the loop-nest tree search implemented before I get people looking at it, as they're more likely to persevere with it from that point.)

Comments

  • 1.

    David: Could you say a few words about what array programming syntax is, or point to a URL? There's a wikipedia article on Array programming. It mentions APL which I played around with a long time ago. Is that article a good background source for what you're working on? Also, what's a loop-nest tree search?

    Comment Source:David: Could you say a few words about what _array programming syntax_ is, or point to a URL? There's a wikipedia article on Array programming. It mentions APL which I played around with a long time ago. Is that article a good background source for what you're working on? Also, what's a loop-nest tree search?
  • 2.
    edited October 2012

    That's good news David!

    I've noticed you being very active on the Julia lists (where I lurk), and now I see why.. I think it's a great language too, and in my opinion would be better still if they adopted an array processing syntax --- more detailed discussion of why belongs on the Julia lists I guess, but let me just say I wish they could shake the mindset that just because they can write fast imperative code (unlike say in R, or Python) doesn't mean they always should.

    Comment Source:That's good news David! I've noticed you being very active on the Julia lists (where I lurk), and now I see why.. I think it's a great language too, and in my opinion would be better still if they adopted an array processing syntax --- more detailed discussion of why belongs on the Julia lists I guess, but let me just say I wish they could shake the mindset that just because they *can* write fast imperative code (unlike say in R, or Python) doesn't mean they always should.
  • 3.
    edited October 2012

    Hi Ken,

    Array programming syntax is just the sort of syntax used by array programming languages such as APL or Matlab (languages which have a lot of differences, but also a lot of similarities). It starts with the basic notation of being able to apply operations such as addition (or subraction, multiplcation, etc) to array entities directly

    C = A + B
    

    rather than

        for i=1:m {
            for j=1:n {
                 C[i,j] = A[i,j] + B[i,j]
            }
        }
    

    but also brings in other notions, using sum rather than an explicit loop to calculate the sum of the entires, and generalising this to reductions over various operators in addition to + and prefix scans. Part of the attraction is that it removes redundancy in the source code, since the system already knows what the dimensions are, there's no need to put in explicit loops for this kind of construct.

    This is relatively widely implemented but, as far as I know, none of the implementations routinely try to merge the "implied loops" between statements, eg

        C = A + B
        D = C - F
    

    giving

        for i=1:m {
            for j=1:n {
                 C[i,j] = A[i,j] + B[i,j]
                 D[i,j] = C[i,j] - F[i,j]
            }
        }
    

    rather than two separate loop nests. (This is obviously one of the simplest cases; things get a bit more complicated with multiple dimesions and reductions all in the mix.)

    The "loop nest tree" is just a term related to implementation: if you think about how loops can be nested they correspond to an ordered tree, so searching over possible loop implementations requires search over tree structures.)

    Comment Source:Hi Ken, Array programming syntax is just the sort of syntax used by array programming languages such as APL or Matlab (languages which have a lot of differences, but also a lot of similarities). It starts with the basic notation of being able to apply operations such as addition (or subraction, multiplcation, etc) to array entities directly C = A + B rather than ~~~~ for i=1:m { for j=1:n { C[i,j] = A[i,j] + B[i,j] } } ~~~~ but also brings in other notions, using <tt>sum</tt> rather than an explicit loop to calculate the sum of the entires, and generalising this to reductions over various operators in addition to + and prefix scans. Part of the attraction is that it removes redundancy in the source code, since the system already knows what the dimensions are, there's no need to put in explicit loops _for this kind of construct_. This is relatively widely implemented but, as far as I know, none of the implementations routinely try to merge the "implied loops" between statements, eg ~~~~ C = A + B D = C - F ~~~~ giving ~~~~ for i=1:m { for j=1:n { C[i,j] = A[i,j] + B[i,j] D[i,j] = C[i,j] - F[i,j] } } ~~~~ rather than two separate loop nests. (This is obviously one of the simplest cases; things get a bit more complicated with multiple dimesions and reductions all in the mix.) The "loop nest tree" is just a term related to implementation: if you think about how loops can be nested they correspond to an ordered tree, so searching over possible loop implementations requires search over tree structures.)
Sign In or Register to comment.