These are my hobby software projects. Most of them are design experiments; a few are in active use.
NOTE: many of these links will break when Google Code is turned down. I should migrate them.
poly2 (2014 - 2015) - A second version of Poly, using a more granular multiprocess architecture. Poly v1 was around 30K lines of code. I'm aiming for a 10K line "kernel" in version 2. (Python, shell, C, JavaScript)
webpipe (2014) - Tools to bridge your terminal and web browser. Useful for working from home without X forwarding. Somewhat inspired by Mathematica. A few R users at Google are using this. (R, Python, JavaScript, shell)
lox (2013) - Personal web proxy (Node JS)
poly (2010 - 2013) - A multi-tenant web server, or "PaaS", for R and other languages. Many people at Google are using it. It motivated many of the projects below.
Lessons: UI matters and requires writing a lot of code. Shiny does well here.
Multiprocess architectures are robust.
To properly handle timeouts, you need events rather than threads.
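A minimal sketch of the timeout lesson, assuming Python's asyncio as the event loop (this is not Poly's code, and the names `slow_handler`, `handle_with_timeout`, and `run` are hypothetical). An event loop can cancel a slow handler at its next suspension point, whereas a blocked thread cannot be safely interrupted from outside.

```python
import asyncio


async def slow_handler(delay):
    # Hypothetical stand-in for a request handler that takes `delay` seconds.
    await asyncio.sleep(delay)
    return "done"


async def handle_with_timeout(delay, timeout):
    # wait_for cancels the handler if it has not finished within `timeout`.
    try:
        return await asyncio.wait_for(slow_handler(delay), timeout)
    except asyncio.TimeoutError:
        return "timed out"


def run(delay, timeout):
    # Convenience wrapper: start an event loop, run one request, shut down.
    return asyncio.run(handle_with_timeout(delay, timeout))
```

Calling `run(0.01, 1.0)` lets the handler finish, while `run(1.0, 0.01)` hits the deadline and the handler is cancelled rather than left running.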
annex (Winter 2012-2013) - Lexing and Parsing library, with self-hosted regex dialect "CRE". (Python)
Lessons: Lexing and parsing should indeed be separate. PEGs unify them, probably for theoretical reasons, but this makes the tool harder to use.
Python is not the right language for implementing languages. I will use OCaml for subsequent projects like this.
pulp (Winter 2012-2013) - Unix-style toolkit for documentation. I was unhappy with asciidoc, and built this system around markdown, JSON Template, and structured data. I used it to document Annex.
tnet (2012 - 2013) - An experiment with tnetstrings, a simple serialization format. Used in Poly v1.
Lessons: Different languages have inherently different data models -- R being a prime example. Protocol Buffers still suffer from this problem.
A simpler approach often works: JSON + regular netstrings.
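As a rough illustration of "JSON + regular netstrings", here is a sketch that frames a JSON payload as a netstring (`<length>:<payload>,`). The function names `encode` and `decode` are hypothetical and are not taken from tnet or Poly.

```python
import json


def encode(obj):
    """Serialize obj as JSON and frame it as a netstring: b'<length>:<payload>,'."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode(data):
    """Decode one netstring-framed JSON value; return (value, remaining bytes)."""
    length_str, sep, rest = data.partition(b":")
    if not sep or not length_str.isdigit():
        raise ValueError("missing or malformed length prefix")
    length = int(length_str)
    # The payload must be followed by exactly one comma.
    if len(rest) < length + 1 or rest[length:length + 1] != b",":
        raise ValueError("truncated payload or missing trailing comma")
    return json.loads(rest[:length].decode("utf-8")), rest[length + 1:]
```

The length prefix handles framing on a byte stream, and JSON handles the data model, so each layer stays simple. `decode` returns the unconsumed bytes so that several messages can be read from one buffer in turn.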
pry (2012) - tiny Python library for server process introspection. This is the only one I didn't write -- I packaged internal Google code for release.
tin (2012) - A simple tool to make self-contained Python executables. (In 2007 or so, I sent a patch to Python which added .zip file support, which led to the current __main__.py support.)
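To show the __main__.py mechanism in isolation, here is a small sketch: the interpreter will run a .zip archive directly if it contains a top-level __main__.py. This is not how tin itself is implemented (the source does not say), and the helper names `build_zip_app`, `run_zip_app`, and `demo` are hypothetical.

```python
import os
import subprocess
import sys
import tempfile
import zipfile


def build_zip_app(path, main_source):
    # Write an archive whose only member is a top-level __main__.py.
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("__main__.py", main_source)


def run_zip_app(path):
    # Equivalent to running `python hello.zip` from a shell.
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, check=True
    )
    return result.stdout


def demo():
    # Build a one-file app in a temporary directory and capture its output.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hello.zip")
        build_zip_app(path, "print('hello from a zip')\n")
        return run_zip_app(path)
```

A real tool would also bundle the application's modules and dependencies into the archive; this sketch packages a single entry point only.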
Torn (2012) - Single-threaded concurrency library with event loops and coroutines. Based on Tornado's event loop. (Python)
fly (2012) - A wrapper and protocol that lets command line tools hold state across invocations (e.g. network connections or big data structures.)
xmap (2012) - An xargs -P variant with stateful processes (like fly).
coopt (2012) - JSON Schema implementation in Python. Used in Poly.
rpeg (Summer 2011) - I forked the Lua LPeg library, which is based on the PEG abstraction. LPeg is a DSL embedded in Lua; I created an external DSL for it.
pyb (2009 - 2012) - A fully dynamic protocol buffer implementation in Python.
Lessons: This was a good experience with bootstrapping. To read a protocol buffer schema, you need to be able to parse protocol buffers.
It's possible to implement protocol buffers using Python metaprogramming -- that is, without code generation.
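The general idea of building message classes at runtime, rather than generating source files, can be sketched as follows. This is an illustration of the metaprogramming technique only: the schema format (a dict of field names to Python types) and the names `make_message_class` and `Person` are assumptions, it is not pyb's API, and it does not implement the protocol buffer wire format.

```python
def make_message_class(name, fields):
    """Build a message class at runtime from {field_name: python_type}."""

    def __init__(self, **kwargs):
        for field, field_type in fields.items():
            # Unset fields default to the type's zero value (0, "", ...).
            value = kwargs.pop(field, field_type())
            if not isinstance(value, field_type):
                raise TypeError("%s must be %s" % (field, field_type.__name__))
            setattr(self, field, value)
        if kwargs:
            raise AttributeError("unknown fields: %s" % sorted(kwargs))

    def to_dict(self):
        return {field: getattr(self, field) for field in fields}

    # type(name, bases, namespace) creates the class; no code is generated.
    namespace = {"__init__": __init__, "to_dict": to_dict, "FIELDS": dict(fields)}
    return type(name, (object,), namespace)


# Hypothetical schema, standing in for a parsed message definition.
Person = make_message_class("Person", {"name": str, "id": int})
```

With this, `Person(name="Ann", id=7).to_dict()` gives back the field values as a dict, and a new message type needs only a new schema, not a build step.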
taste (2009) - A multi-language test framework extracted from JSON Template, and used for JSON Pattern. (Python)
JSON Pattern (2009) - A pattern language to extract structured data from text (as regular expressions extract simple strings from a string). The complement of JSON Template.
JSON Template (2009 - ) - a minimal but powerful templating language, implemented in Python, JavaScript, and other languages.
It was inspired by EZT and Google ctemplate, and in turn inspired the Go template language. Initially, Go's implementation used the same syntax. As of Go 1.0, it uses a different syntax, but same semantic model: the "cursor".
Slightly before node.js came out, I was interested in JavaScript as a general purpose language and embeddable language.
I still like it, but I mostly use it in the browser now. Event loop concurrency is nice, but it's not the only thing you need. Events, threads, processes, and coroutines are all useful in different situations.
narcissus (2009) - A port of Narcissus, a JavaScript parser written in JavaScript, to CommonJS on Narwhal. I removed a lot of the Mozilla-specific JS extensions.
jsonschema (2010) - CommonJS packaging for JSON Schema.
gelatin (2009) - a bundling tool for CommonJS modules. As of 2015, there are a lot of tools that do similar things.
oil (2009) - Cross-platform JavaScript code. It worked on Narwhal, Node, and Windows Script Host. Gelatin uses it.