I've been playing a bit with Irony, an open-source .NET compiler construction toolkit created by Roman Ivantsov. My interest in Irony was sparked by watching the video of Roman's presentation at Lang.NET 2008 (I'd link to it, but the videos are unavailable at the moment).

Currently, I have a little side project where I've been experimenting with the Dynamic Language Runtime, for which I was using GPPG and GPLEX to build the parser and tokenizer. The tools are OK, but they certainly require quite a bit of manual work, you need to keep regenerating the code, and, frankly, the error messages both tools produce leave a bit to be desired.

I'm aware of ANTLR, but frankly I don't want to mess around with the whole Java generator thing at this time (I already have way too much crap around).

Irony is pretty interesting because all the tokenizer and grammar rules are expressed directly in C# code as a simple object model built from Terminal and NonTerminal objects. The syntax is fairly intuitive because Irony overloads several operators like | and & to make the grammar definition look a lot like BNF. Here's a sample of what a simple expression grammar in Irony looks like: http://www.codeplex.com/irony/Wiki/View.aspx?title=Expression%20grammar%20sample
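To give a feel for the general idea without guessing at Irony's actual API, here's a minimal Python sketch of a grammar object model that overloads | and & so rules read like BNF. Everything in it (the class names, the `rule` attribute, `to_bnf`, `describe`) is made up for illustration and is not how Irony itself is implemented; it only builds the rule objects and prints them back, it doesn't parse anything.

```python
class Element:
    """Base class for grammar elements; & builds a sequence, | builds a choice."""

    def __and__(self, other):
        return Sequence(self._parts(Sequence) + other._parts(Sequence))

    def __or__(self, other):
        return Choice(self._parts(Choice) + other._parts(Choice))

    def _parts(self, kind):
        # Flatten nested nodes of the same kind so a & b & c is one sequence.
        return list(self.items) if isinstance(self, kind) else [self]


class Terminal(Element):
    def __init__(self, text):
        self.text = text

    def to_bnf(self):
        return '"%s"' % self.text


class NonTerminal(Element):
    def __init__(self, name):
        self.name = name
        self.rule = None  # assigned later, which allows recursive rules

    def to_bnf(self):
        return "<%s>" % self.name

    def describe(self):
        return "%s ::= %s" % (self.to_bnf(), self.rule.to_bnf())


class Sequence(Element):
    def __init__(self, items):
        self.items = items

    def to_bnf(self):
        # Parenthesize nested choices so the printed rule stays unambiguous.
        return " ".join(
            "(%s)" % i.to_bnf() if isinstance(i, Choice) else i.to_bnf()
            for i in self.items
        )


class Choice(Element):
    def __init__(self, items):
        self.items = items

    def to_bnf(self):
        return " | ".join(i.to_bnf() for i in self.items)


# A tiny expression grammar. In Python, & binds tighter than |,
# so each alternative is a sequence, just like in BNF.
expr = NonTerminal("expr")
term = NonTerminal("term")
number = Terminal("number")

expr.rule = term | expr & Terminal("+") & term | expr & Terminal("-") & term
term.rule = number | Terminal("(") & expr & Terminal(")")

print(expr.describe())
print(term.describe())
```

The nice part of this style is that the grammar is just ordinary objects in the host language, so there's no separate generator step to rerun every time a rule changes.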

After testing Irony a bit, I'm very encouraged to give it a try on my pet project, so I've been playing around with my grammar (unfortunately broken at this time because of some ambiguities I haven't resolved yet), and it's looking very nice overall.

There's also a useful "Grammar Explorer" tool included with the Irony code which can be used to experiment with and diagnose a compiled Irony grammar. Unfortunately, right now it only has a predefined list of a few test grammars included with the project, but extending it to use others is trivial, so I've been using it to test my grammar before actually committing to it.

Another useful thing in Irony is the concept of TokenFilters, which are objects that filter the token stream as it is produced by the scanner. A filter can remove, modify, or add tokens as necessary, or provide extra validation. Right now Irony provides two built-in filters, but you can extend it with new ones as necessary:

  • CodeOutlineFilter, which can be used in languages where whitespace is significant (indentation and/or newlines).
  • BraceMatchingFilter for matching brace characters, for example '()' in Scheme/Lisp.
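
To illustrate the token filter idea in general terms, here's a small Python sketch where each filter is a generator sitting between a toy scanner and whatever consumes the tokens. This is not Irony's TokenFilter API; the `Token` type, the one-character-per-token scanner, and the function names are all assumptions made up for the example. One filter removes tokens and the other validates paren matching, adding an error token when something is off.

```python
from collections import namedtuple

Token = namedtuple("Token", "kind text")


def scan(source):
    """Toy scanner: one token per character, for illustration only."""
    for ch in source:
        if ch in "()":
            yield Token("brace", ch)
        elif ch.isspace():
            yield Token("space", ch)
        else:
            yield Token("char", ch)


def drop_whitespace(tokens):
    """Filter that removes tokens from the stream."""
    for tok in tokens:
        if tok.kind != "space":
            yield tok


def match_braces(tokens):
    """Validating filter: passes tokens through and adds error tokens
    when parentheses are unbalanced."""
    depth = 0
    for tok in tokens:
        if tok.kind == "brace" and tok.text == "(":
            depth += 1
        elif tok.kind == "brace" and tok.text == ")":
            if depth == 0:
                # Replace the stray closer with an error token.
                yield Token("error", "unmatched ')'")
                continue
            depth -= 1
        yield tok
    if depth > 0:
        yield Token("error", "%d unclosed '('" % depth)


def run_filters(tokens, filters):
    """Chain the filters in order, each one wrapping the previous stream."""
    for f in filters:
        tokens = f(tokens)
    return list(tokens)


result = run_filters(scan("(a (b)"), [drop_whitespace, match_braces])
for tok in result:
    print(tok)
```

Because each filter only sees a stream of tokens and produces another one, they can be stacked in any order, and the parser behind them never needs to know they exist.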

Tomas Restrepo

Software developer located in Colombia. Sr. PFE at Microsoft.