Thursday, April 13, 2006

Stratego is cool

I've been meaning to talk about this for a while. Stratego is a very cool language. Their raison d'être seems to be performing program transformations, such as compilers and program optimizations. But because it's a term rewriting system, it's also perfectly reasonable to use it as a framework for specifying operational semantics of programming languages.

Here's an example rewrite rule, specifying the evaluation of variable references in a CEKS machine:
EvalVar :
|[
Expr: $e,
Env: $r,
Cont: $k,
Store: $s
]|

|[
Value: $v,
Cont: $k,
Store: $s
]|
where <is-var> e
; x := <var-name> e
; v := <resolve>(x,r)
Unless you know Stratego, you won't realize the most remarkable thing about this code: it's using a custom syntax for the representation of the configurations in the CEKS machine! That is, I've extended the syntax of Stratego with a concrete syntax for configurations, bracketed between |[ and ]| tokens. I've also added unquotation to that concrete syntax in the form of $ tokens followed by Stratego terms. This allows me to bind meta-variables within the left-hand side of the rewrite rule and the where clauses and refer to bound meta-variables within the right-hand side of the rewrite rule and again in the where clauses.

Stratego is based in part on SDF, the Syntax Definition Formalism, which is a lovely system for defining modular grammars. SDF is based on Tomita's Generalized LR (GLR) parsing algorithm, which recognizes the full class of context-free grammars, even ambiguous ones. If you want extensible grammars, you have to work with a class of grammars that's closed under composition, so they've chosen the most permissive, the full CFG class. Unfortunately this means you can end up with ambiguities, but as far as I know I haven't run into any sneaky ambiguities yet.

The other important feature of Stratego is its computational meta-language. My two requirements for a term rewriting system were the ability to specify rewrite rules in terms of the concrete language, which Stratego provides with flying colors, and the ability to compute side conditions. Stratego's entire computational model is based on rewriting, so side conditions work by saving the current state of the world and performing a rewrite on the side. Their rewriting combinators, known as strategies, allow for some pretty powerful abstractions. It has the feel of programming in the backtracking monad, not unlike Prolog. It's a little unfamiliar for a functional programmer, but it's perfectly adequate for the job.

It is my hope, and I'm optimistic, that we'll use Stratego to model the whole of the ECMAScript Edition 4 semantics. It's not on the same level as modeling the language in a theorem prover and using it to prove safety properties of the language mechanically, but my experience so far has shown that just having an executable model of a language is extremely useful for testing and driving out bugs in the semantics.

3 comments:

Stevie said...

Wow! I hope to get the chance to mess around with this sometime. That's indeed a handy tool to have when working with operational semantics, just to have something with which to play around and test ideas.

Anonymous said...

Definitely cool. It's nice to see that they already have syntax definitions for several languages (e.g. Java; C; BibTex: interesting choice...; what, no C++?! ;). Thanks for bringing it to my attention!

Anonymous said...

BTW, I can't tell at a first glance if Stratego naturally supports extensible grammars (a la Polyglot for Java). Would you happen to know? This would make it a nice tool for testing language extensions.