The Life of Lil

Lil is the scripting language of Decker. Most of the time, scripts in Decker are very simple, like a script which changes cards when a user clicks on a button:

on click do
	go[thatCoolCard]
end

For simple scripts, performance may not matter at all. Sometimes, though, performance makes the difference between accomplishing your vision and sighing in frustration while the cooling fans on your unresponsive computer attempt unsuccessfully to make it levitate. In this article we’ll work through several approaches to implementing a computationally-intensive program with Lil, and along the way learn a little bit about Vector Programming and how it can help us write scripts that are more concise and more efficient.

Feel free to follow along and try your own ideas in the Lil playground; it includes a version of the final iteration described here as the life example program.

The Meaning of Life

A cellular automaton simulates a world made up of cells. We begin with some arrangement of cells in space, and use their relationships to calculate a next generation of cells. By repeatedly applying simple rules, cellular automata can produce fascinatingly complex patterns which resemble (in some ways) patterns found in nature.

John Conway’s Game of Life is one cellular automaton which has been studied extensively, exhibiting patterns which can self-replicate, move through space, and even perform computation.

The Game of Life takes place on a two-dimensional (2D) grid composed of cells which are either “dead” (above: white) or “alive” (above: black). On each generation, each cell in the grid considers each of its 8 orthogonal and diagonal neighbors, counts the number of neighbors which are alive, and computes its state in the next generation according to the following three rules:

Loopy Life

Let’s begin by translating the rules above into a Lil program with a straightforward, imperative approach. We will focus on writing a function named step which is given a game board representing all the cells in our miniature universe and which is responsible for computing and returning the next generation’s board.

One way to represent a Life board in Lil is to use a dictionary, associating the (x,y) coordinates of live cells with a value of 1. If we try to look up a coordinate pair that isn’t set, we’ll get a 0, which can represent dead cells. We’ll additionally define a global variable named SIZE which contains the (width,height) dimensions of the entire board.

As pseudocode:

Translated into Lil,

SIZE:20,20
board:()

...

on step board do
 next:()
 each y in range SIZE[1]
  each x in range SIZE[0]
   pos:x,y
   around:0
   each dx in -1,0,1
    each dy in -1,0,1
     if !(0,0)~(dx,dy)
      around:around+board[pos+(dx,dy)]
     end
    end
   end
   if board[pos]
    if around in 2,3 next[pos]:1 end
   else
    if around ~ 3    next[pos]:1 end
   end
  end
 end
 next
end

There are a few surface-level simplifications we could make to the above. For starters, we can tweak how we represent the rules to include the cell itself in our around count, saving a conditional:

on step board do
 next:()
 each y in range SIZE[1]
  each x in range SIZE[0]
   pos:x,y
   around:0
   each dx in -1,0,1
    each dy in -1,0,1
     around:around+board[pos+(dx,dy)]
    end
   end
   if board[pos]
    if around in 3,4 next[pos]:1 end
   else
    if around ~ 3    next[pos]:1 end
   end
  end
 end
 next
end

Next, we’ll focus on our inner loop:

around:0
each dx in -1,0,1
 each dy in -1,0,1
  around:around+board[pos+(dx,dy)]
 end
end

Instead of reassigning around in each iteration of our inner loops, we can take advantage of the fact that each loops naturally return a list of the results of their body expression and sum those together:

around:sum each dx in -1,0,1
 sum each dy in -1,0,1
  board[pos+(dx,dy)]
 end
end

Lil has a cross operator which can be used to calculate the cartesian product of two lists or ranges:

cross 3       # ((0,0),(1,0),(2,0),(0,1),(1,1),(2,1),(0,2),(1,2),(2,2))
-1 + 3 cross 3  # ((-1,-1),(0,-1),(1,-1),(-1,0),(0,0),(1,0),(-1,1),(0,1),(1,1))

This is exactly what we’re expressing with our nested dx/dy loops, so with cross we can collapse them together:

around:sum each delta in -1 + 3 cross 3
 board[pos + delta]
end

A pair (pos) doesn’t conform with a list of pairs (the deltas), but it does if we flip the list of deltas first. In this way we can add a position to an entire list of offsets at once:

deltas: -1 + 3 cross 3     # ((-1,-1),(0,-1),(1,-1),(-1,0),(0,0),(1,0),(-1,1),(0,1),(1,1))
(10,20) + deltas           # ((9,9),(20,19))
(10,20) + flip deltas      # ((9,10,11,9,10,11,9,10,11),(19,19,19,20,20,20,21,21,21))
flip (10,20) + flip deltas # ((9,19),(10,19),(11,19),(9,20),(10,20),(11,20),(9,21),(10,21),(11,21))

Taking advantage of this, we can “lift” the addition from the inner loop:

around:sum each x in flip pos + flip -1 + 3 cross 3
 board[x]
end

Which in turn allows us to write the expression more concisely with the @ operator instead of an explicit loop:

around:sum board @ flip pos + flip -1 + 3 cross 3

The cross operator can be used to simplify our outer loops as well:

on step board do
 next:()
 each pos in SIZE[0] cross SIZE[1]
  around:sum board @ flip pos + flip -1 + 3 cross 3
  if board[pos]
   if around in 3,4 next[pos]:1 end
  else
   if around ~ 3    next[pos]:1 end
  end
 end
 next
end

This is a reasonably concise and straightforward solution, but it is not particularly fast. Lil dictionaries, like all of Lil’s primitive datatypes, are immutable. Every time we write a 1 to some position in next, the Lil interpreter constructs an entirely new dictionary with the same keys and elements plus our added key-element pair, and stores the result in next. On a large board with many living cells, this overhead can add up to a huge amount of work! Lil’s copy-on-write semantics mean dictionary insertions are (in general) linear with respect to the size of the dictionary, rather than constant. It’s best to construct large dictionaries all at once, rather than a tiny piece at a time.

A Leap of Image-ination

In addition to primitive types, like lists and dictionaries, Lil is furnished with a number of interfaces, which expose system resources and provide mutable data structures. The Image interface is a perfect fit for our game board: it represents a 2D array of bytes.

We don’t have to change much to use an Image instead of a dictionary; Images can also be indexed with coordinate pairs. The SIZE constant is no longer needed, since the Image interface itself keeps track of its dimensions:

on step board do
 next:image[board.size]
 each pos in board.size[0] cross board.size[1]
  around:sum board @ flip pos + flip -1 + 3 cross 3
  if board[pos]
   if around in 3,4 next[pos]:1 end
  else
   if around ~ 3    next[pos]:1 end
  end
 end
 next
end

We avoided explicitly storing 0 values for dead cells in the dictionary to keep it “sparse” and minimize the number of entries required. An Image is always “dense”, so this is no longer a concern. As a result, we can simplify our conditionals a bit, taking advantage of the fact that if ... else ... end returns the value of the taken branch:

on step board do
 next:image[board.size]
 each pos in board.size[0] cross board.size[1]
  around:sum board @ flip pos + flip -1 + 3 cross 3
  next[pos]:if board[pos]
   around in 3,4
  else
   around~3
  end
 end
 next
end

In fact, we could do without conditionals entirely by reworking the above into a single logical expression:

on step board do
 next:image[board.size]
 each pos in board.size[0] cross board.size[1]
  around:sum board @ flip pos + flip -1 + 3 cross 3
  next[pos]:(board[pos] & around in 3,4) | around~3
 end
 next
end

And we could hoist the assignment of every cell out of the loop by using image.pixels to perform a bulk write. With writes occurring only when all reads have finished, we could even redesign the function to modify board in-place without need for a next buffer:

on step board do
 board.pixels:each pos in board.size[0] cross board.size[1]
  around:sum board @ flip pos + flip -1 + 3 cross 3
  (board[pos] & around in 3,4) | around~3
 end
end

This Image-based solution is significantly shorter, and a good bit faster. Instead of rebuilding the board dictionary for every write, the Lil interpreter only needs to allocate a fresh Image once at the beginning of the program. In Decker, the Image-based board representation has the additional benefit that we can easily display an Image graphically in the Listener or by pasting it to a Canvas Widget, or even the Card background Image.

Vector Thinking

Lil has many features inherited from Vector-oriented programming languages (sometimes also called “Array Languages”). The essence of vector programming is learning to think about manipulating entire data structures at once, in parallel, rather than serially.

For example, if we wanted to multiply a list by a scalar number in Lil we could write a loop which processes list elements serially:

each v in mylist
 3*v
end

But the vector-oriented approach is to take advantage of the parallelism built into the * operator:

3*mylist

The Image interface includes a number of functions for doing bulk operations on its pixels which we can take advantage of:

Since all of these operations modify the target Image in-place, we may need to use image.copy[] before applying them, to avoid destroying a dataset we need again later.

The inner loop of our current solution probes the board for the neighbors of each individual pixel by adding an offset to the coordinates of the central pixel. If we instead imagine using image.translate[], we could apply each of those offsets to the entire Image at once, and then use image.merge["+" ...] to sum all of the live cells in the translated Images together.

We were already able to convert the if ... else ... end statement in our outer loop into a logical expression that uses in, ~, & and | along with some constant numbers; given our original board Image and the neighbor-count Image, we can perform an equivalent calculation by using image.merge[] and image.map[]. Bringing it all together,

on step board do
 around:image[board.size]
 each delta in -1 + 3 cross 3
  around.merge["+" board.copy[].translate[delta]]
 end
 stay:around.copy[].map[((4,3) dict 1,1) 0].merge["&" board]
 next:around       .map[(   3  dict   1) 0].merge["|" stay ]
end

Give yourself a few moments to enjoy this solution. It describes the rules of Life in parallel, applying operations in bulk across the entire game board. The only explicit loop in this program iterates exactly nine times, no matter the size of the board. If we wanted, we could even “unroll” this loop and skip the (0,0) translation:

on step board do
 around:board.copy[]
 around.merge["+" board.copy[].translate[-1,-1]]
 around.merge["+" board.copy[].translate[-1, 0]]
 around.merge["+" board.copy[].translate[-1, 1]]
 around.merge["+" board.copy[].translate[ 0,-1]]
 around.merge["+" board.copy[].translate[ 0, 1]]
 around.merge["+" board.copy[].translate[ 1,-1]]
 around.merge["+" board.copy[].translate[ 1, 0]]
 around.merge["+" board.copy[].translate[ 1, 1]]
 stay:around.copy[].map[((4,3) dict 1,1) 0].merge["&" board]
 next:around       .map[(   3  dict   1) 0].merge["|" stay ]
end

While this is bulkier, avoiding the overhead of the loop and cross product is slightly more efficient, and it is clearer to a reader that the entire function is a straight “pipeline” of Image-manipulation operations.

With some additional pondering (or at least a perusal of the relevant literature), we might arrive at a slightly more elegant approach first attributed to Donald Knuth in the context of his METAFONT language. The key idea is to reorder the translations so that several neighbors can be summed together at the same time, and to double-count the cells surrounding the central cell, which simplifies the update rule:

on step board do
 horiz:board.copy[]
  .merge["+" board.copy[].translate[-1,0]]
  .merge["+" board.copy[].translate[ 1,0]]
 vert:horiz.copy[]
  .merge["+" horiz.copy[].translate[0,-1]]
  .merge["+" horiz.copy[].translate[0, 1]]
 next:vert.copy[]
  .merge["+" vert.merge["-" board]]
  .map[((5,6,7) dict 1,1,1) 0] 
end

All three of these formulations are tremendously faster than our earlier attempts, and scale better to larger boards. Instead of leaving the Lil interpreter to do most of the heavy lifting, iterating over and making decisions about individual pixels one at a time, we have pushed all of the work into image.translate[], image.map[] and image.merge[], which contain tight loops written in C (or JS, in Web-Decker).

Benchmarking

Let’s compare the runtime for several approaches to Life:

For each implementation, we will initialize an appropriately-sized board with random cells. Lilt’s random[] number generator is seeded consistently by default, so for a given board size each run will have the same configuration and final result. We will then step the board for 100 generations and print the final board. Each overall execution time will be measured on the same laptop with the unix time utility, taking the best total time (in seconds) of three attempts:

Approach 20x20 board 40x40 board 60x60 board
Dictionary, Naive 1.683 16.900 80.980
Dictionary, Simplified 0.489 3.925 38.885
Image, Iterative 0.329 1.262 2.838
Image, Vector 0.013 0.025 0.070

The usual lessons for improving program performance apply: Avoid unnecessary or redundant work, choose appropriate data structures, and parallelize.

Exercises

Throughout our examples, we have focused on a version of Life which implicitly fixes the edges of the board as dead. What modifications to each implementation would be necessary to instead simulate Life on a toroid, in which cells are considered adjacent across opposing edges? How does this impact performance and the overall elegance of the code?

Brian’s Brain is another interesting 2D cellular automaton. It uses 3 cell states (“Live”, “Dying”, and “Dead”) and slightly different rules for how cells change state. Try modifying our Life programs to simulate Brian’s Brain instead.

Rule 110 is a 1D cellular automaton which (with certain constraints and initial conditions) is capable of universal computation. Try your hand at writing an implementation! Using an Image to store the board state works, but you could also explore using Lil’s Array interface.

Further Reading

  1. The Lil Manual
  2. The Decker Manual
  3. Decker Project Page
  4. Life: Nasty, Brutish, and Short (APL)

back