Revamping Lush's Memory Management

Note: This is the project description. Go to my blog or to the project wiki to find out what's in the works right now. To play with the code, install GNU Arch and get a copy of lush--soc--1.4 from my archive juengling@pdx.edu--2008.

Synopsis

Lush ("Lisp Universal SHell") has evolved over more than twenty years from a neural network simulator to a programming environment for numerical applications. Object oriented language constructs, a Lisp-to-C compiler and other facilities were added over time. Not all those facilities fit well into the language because no provisions were made in the original design. In particular Lush's memory management is in disarray. It gets in the way with object-orient programming and further development of the Lush system itself.

In this project I want to revamp Lush's memory management from scratch. Most of the work will be spent on modifying the compiler and the language runtime to work with a new garbage collector. The immediate benefit will be to Lush users, as automatic memory management will work the same way in compiled code as in the interpreter. Long-term benefits to the Lush project result through simplification of the compiler, which will make it much easier to add new language features.

Applicant

Ralf Juengling (juenglin at cs.pdx.edu), hello. I am a senior CS graduate student at Portland State University, started using Lush in 2005 and have been contributing to the project for almost as long (I maintain the PSU Lush fork). I am fluent in C, Lisp, Python, and Matlab.

Project Details

Problem Details

Lush compiles to C rather than directly generating native machine code and it allows mixing Lisp with C code. Many of Lush's builtin types have the canonical C representation (strings, numbers, arrays of numbers, etc.). Thanks to these two features it is often not necessary to create an explicit binding to third-party libraries, one may just call external functions directly in inline C code.

There currently are two independent aspects in Lush's memory management that makes life difficult for programmers. The first is that there is no automatic memory management for objects created outside the interpreter. As a consequence, explicit interface code to third-party libraries need to be written to ensure that opaque data structures returned from a library function will be released when no longer referenced.

The other aspect is that the Lush-to-C compiler performs escape analysis and allocates objects on the stack whenever possible. Because of this, C functions generated by the compiler often have lots of hidden arguments through which memory for temporary variables or components of the return value are passed to the function. This is problematic when functions are to be used as arguments to other functions, or when methods are overloaded in subclasses (the compiler quits with an error when there is a mismatch in number of hidden arguments).

Solution Details

Both of the above problems may be fixed with addition of automatic memory management to the Lush runtime. The new facility will be used by the interpreter for internal data structures as well as for Lisp objects. The compiler will perform escape analysis and allocate objects on the stack just as today, except for methods and when explicitly prohibited by a new compiler directive.

Existing code that uses manually managed memory (i.e., malloc & free) will continue to work as before. However, a new facility will allow to register memory allocated in this way with the runtime. Part of this new facility will be a new pointer type "managed pointer". The API for externally allocated objects will take a conventional (unmanaged) pointer and a destructor function and will return a managed pointer.

It is not my plan to design and implement a new garbage collector from scratch, but to copy from an existing system. Nickle uses an elegant scheme for integrating automatic memory management in C code. I plan to cannibalize the implementation of the Nickel garbage collector and to adopt the scheme of keeping track of transient memory with a global stack of root pointers. I also plan to improve on Nickel's system by running the collector in a separate process (thanks to Keith Packard for the idea), and by adding language support for managed pointers as outlined above.

Technical Details

Lush is released under GPL v2 and so would be the proposed work. It is planned that PSU Lush and mainline Lush merge soon (independently of this SoC application). The proposed work would happen on a branch of Lush or PSU Lush, depending on whether the merger happens before the SoC start date.

Work Plan and Milestones

There are three large milestones:
  • The interpreter runs with the new GC
  • The compiler generates code with managed memory
  • The language supports managed pointers
  • If no unanticipated technical difficulties turn up I expect to easily reach the third milestone by the end of SoC (see my timeline). In any event, I am certain to get to milestone two.

    (The first two weeks of SoC overlap with the end of Spring term at my school. During that time I may not be able to spend more than three days per week on the project due to other commitments.)