I woke up this morning with a vague but niggling idea that I had missed a more generalized problem in developing my ray tracer. It was going to eat at me all day unless I bothered finding out.
To take advantage of the array-oriented programming style of J I needed a single array of all the x, y coordinates on which to "draw" the ray lighting effects. I gave an example of one such array with "boxing" applied to demonstrate how the result was a self-indexed array:
coordinates=: [: (] ,"0 |:) 2&# $ ] # i. NB. y is square array width
<"1 coordinates 3
┌───┬───┬───┐
│0 0│0 1│0 2│
├───┼───┼───┤
│1 0│1 1│1 2│
├───┼───┼───┤
│2 0│2 1│2 2│
└───┴───┴───┘
I was already familiar with the catalogue verb ({
) which can
accomplish something very similar, but with the boxing applied
directly:
{ (i.3) ; (i.3)
┌───┬───┬───┐
│0 0│0 1│0 2│
├───┼───┼───┤
│1 0│1 1│1 2│
├───┼───┼───┤
│2 0│2 1│2 2│
└───┴───┴───┘
I had avoided using catalogue because of how slow it ends up being for large arrays. I think a big part of the slow-down is a result of boxing (and maybe unboxing) which, as I understand it, represent indirection from what are otherwise contiguous memory access.
This can be demonstrated with the basic profiling verb timeAndSpace
,
which returns time taken in seconds and space used in bytes:
timeAndSpace=: 6!:2 , 7!:2@]
timeAndSpace '{ ;~ (i. 1200)'
0.102982 2.01115e8
timeAndSpace '>{ ;~ (i. 1200)'
0.151136 2.34669e8
timeAndSpace 'coordinates 1200'
0.022646 6.71102e7
In the case of just calling catalogue the operation takes more than 4
times as long as my own coordinates
, and in order to match the array
format requires unboxing (>
) which takes longer still.
This was enough initially for me to be satisfied with my own implementation and I didn't bother thinking further until this morning. Stepping back a bit made me realize I'm poorly re-implementing a more limited subset of a Cartesian product (which is what catalogue is for). I accomplished this by created the 2 dimensional array:
(2&# $ ] # i.) 3
0 0 0
1 1 1
2 2 2
and then appending (rank 0) the transposition of the same array:
(] ,"0 |:) (2&# $ ] # i.) 3
0 0
0 1
0 2
1 0
1 1
1 2
2 0
2 1
2 2
Really this is a few more steps than necessary if you consider the (unacknowledged) goal of generating the cartesian product of a simple array. Instead it is possible to append over each atom of the same array, the nearly literal translation to J is:
over=: /
append=: ,
eachAtom=: "0
someArray=: i.3
itself=: ~
append eachAtom over itself someArray
0 0
0 1
0 2
1 0
1 1
1 2
2 0
2 1
2 2
A single verb to capture this can be collapsed down to just:
indexArray=: ,"0/~
timeAndSpace 'indexArray i. 1200'
0.015167 3.36382e7
Which is faster, uses less memory than either prior solution and is simpler to read. The performance improvement is even more apparent with larger arrays:
open=: <
catalogue=: {
timeAndSpace 'open catalogue ;~ i.10000'
39.0662 1.38739e10
timeAndSpace 'coordinates 10000'
4.0254 4.29497e9
timeAndSpace 'indexArray i. 10000'
1.59542 2.14814e9
I was generally feeling happy with this realization right up until the point where I opened the documentation for catalogue which notes:
Note: Table (/) has a similar action. It is the more usual primitive to use for this task.
0 1 (<@,"0)/ 7 8 9 +---+---+---+ |0 7|0 8|0 9| +---+---+---+ |1 7|1 8|1 9| +---+---+---+
Which is exactly what I've managed to rediscover here. Once again I find myself a little humbled by my own inability to, first, recognize the more general form of a problem and second, guess that it has already been solved. Oh well, at least I know a little bit more about the language and how to profile it.