A new dialect and implementation of a Python-like language that supports massively scalable cooperative multi-threading and, eventually, Erlang-style concurrency.
A VM that supports continuations. 'threads' will be built using continuations. The VM will be written in a new high-level language 'Irken' that compiles to C. Continuations will be available in both the target language and the implementation language, and since we want to support millions of threads, we cannot use the C stack. Our approach is to generate continuation-passing style (CPS) code in C.
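To make the "cannot use the C stack" point concrete, here is a minimal sketch in C. It is not Irken's generated code; the names `frame` and `sum_to` are made up for illustration. Each pending call is recorded as a heap-allocated continuation frame and the work is driven by loops instead of C recursion, so the depth is limited by heap size rather than by the C stack.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical continuation frame: lives on the heap, not the C stack. */
typedef struct frame {
    struct frame *next;   /* continuation to resume after this frame */
    long long n;          /* value saved for the pending addition */
} frame;

/* Computes n + (n-1) + ... + 1, i.e. sum_to(n) = n + sum_to(n-1),
 * without recursing on the C stack. */
long long sum_to(long long n)
{
    frame *k = NULL;      /* current continuation; NULL means "return to caller" */
    long long result = 0;

    /* "Call" phase: each would-be recursive call pushes a heap frame. */
    while (n > 0) {
        frame *f = malloc(sizeof *f);
        assert(f != NULL);
        f->next = k;
        f->n = n;
        k = f;
        n -= 1;
    }

    /* "Return" phase: invoke the saved continuations one at a time. */
    while (k != NULL) {
        frame *f = k;
        result += f->n;
        k = f->next;
        free(f);
    }
    return result;
}
```

Calling `sum_to(1000000)` builds a million frames on the heap; the same depth written as plain C recursion could overflow the C stack on many default configurations.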
Currently, I'm in stage 2 - though about 90% of the work is still in stage 1.
[2012] Actually, I skipped a few steps and rewrote the Irken compiler in Irken.
This should look as much like Python as possible, though it will start life as a much smaller language. My current thinking is that it will be dynamically typed (just like Python), though I may experiment with some kind of type inference. PLL will use a VM, implemented in Irken. Yes, I need a better name than 'PLL'. Originally, I had planned to call the two languages 'Irken-Low' and 'Irken-High', but this would just be confusing since they have almost nothing in common. Also, I believe Irken will be useful on its own.
Currently, this is a strange hybrid of Scheme and ML/Haskell - although it looks like Scheme, it uses ML-style Hindley-Milner let-polymorphism, with algebraic datatypes. The Lisp syntax was chosen for reasons of simplicity and flexibility. Once PLL is available, I may think about giving Irken a Pythonic syntax. The main feature missing from the ML/Haskell world is full pattern matching, though a simpler variant-case syntax works pretty well.
Irken compiles to a single large function in a single C file. Function calls are implemented as 'gotos' within this function. A simple register-machine model is used.
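A rough sketch of the idea, assuming gcc's address-of-label extension (`&&label` and `goto *ptr`). This is hand-written for illustration and is not actual compiler output; the names `vm_sum_squares`, `r0`, `r1` and `k` are hypothetical. The "registers" are ordinary locals, a "call" stores a return label and jumps, and a "return" jumps to the stored label.

```c
#include <assert.h>

/* Hypothetical example: computes a*a + b*b by "calling" one routine
 * from two different sites, all inside a single C function. */
long vm_sum_squares(long a, long b)
{
    long r0 = 0;      /* register 0: argument to 'square' */
    long r1 = 0;      /* register 1: result of 'square' */
    long acc = 0;     /* running total */
    void *k;          /* return address: a label address (gcc extension) */

    r0 = a;
    k = &&ret1;       /* where 'square' should jump when it finishes */
    goto square;      /* first "call" */
ret1:
    acc += r1;

    r0 = b;
    k = &&ret2;
    goto square;      /* second "call", different return point */
ret2:
    acc += r1;
    return acc;

square:               /* the "function" body: r1 = r0 * r0 */
    r1 = r0 * r0;
    goto *k;          /* "return" by jumping to the saved label */
}
```

Because `square` is entered from two sites, a plain `goto` cannot express the return; that is where the address-of-label extension comes in.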
I may consider a second backend targeting LLVM, although LLVM's thin support for GC will be useless without a stack.
Two problems I hope to solve using Erlang-style isolated 'threads':
First, garbage collection. If every thread has its own heap, garbage collection will be naturally 'concurrent' - i.e., we'll avoid having one gigantic collection that sweeps 12GB of memory.
Second, modules / external code. The runtime model of Irken makes it very difficult to link in new compiled code at run-time. If we instead demand Erlang-style isolation, it's not a problem - just load up a new 'VM' and run it in a separate process/thread.
IPC: the price to pay - now we need a very efficient way of communicating between processes/threads. But we needed that anyway.
My current testing platforms include FreeBSD/amd64, FreeBSD/i386, OSX/G5, both 32 and 64-bit. Irken should work on any platform that supports gcc (or any other C compiler that supports the address-of-label extension, which most do...) and some kind of rdtsc-like facility - though if you were desperate you could just strip out the tsc stuff - it's only used for timings.
[Note: looks like clang is out - at least on Snow Leopard the address-of-label feature is silently broken. Also seems to not like lexical functions.]
[2012 note: now that the compiler is self-hosted, you will find these files written in Irken, in the 'self' directory, with '.scm' extensions.]
These are considered 'official' library implementations of runtime features.
$ irken tests/tak20.scm
  - compile tests/tak20.scm to tests/tak20.c and then tests/tak20

Usage: irken <irken-src-file> [options]
  -c : don't compile .c file
  -v : verbose (very!) output
  -t : generate trace-printing code
  -f : set CFLAGS for C compiler
  -m : debug macro expansion
  -O : tell gcc to optimize
  -p : generate profile-printing code
Currently, the compiler assumes that its 'lib' directory and support files are in the current directory (i.e., it assumes it's being run where it was built). This will be fixed eventually with a proper install in /usr/local/.