LR & GLL Parsers Study

I've spent the last week studying parsers. I studied out various parsing algorithms, eventually sticking to studying LR and GLL closer.

If I want better IDEs or occassionally write an UIs similar to what there's in place in vim. Hand written parsers will quickly become cumbersome to use.

LR

LR parsers are interesting due to the few actions it takes in parsing. The pushdown automaton does zero or more reductions and then proceeds with a shift that consumes an input token.

You usually write a parser generator and generate a table to denote every st ate, but I wrote one by hand for demonstration and studying purposes.

LR state can be parametrized to implement precedence parsing without exploding the parser tables. The parser may be also made layout sensitive with same techniques.

GLL

GLL is a variation on LL parsing that can handle hidden ambiguous grammars with hidden left recursion. It does that by an immutable stack that may diverge or converge. Every entry in the stack is traversed only once.

I wrote a GLL recognizer that accepts input incrementally. Unless I messed anything, it's doing everything mentioned in the paper.

I read that if the GLL recognizer is transformed into a parser, it will be roughly as bad as the other parsers in worst case. The transformation is traightforward though, and described in Royal Holloway's learning material.

The GLL parser looks as if it was difficult to transform it to handle precednece and layout sensitivity. Especially if I want the tool to apply for anything else besides parsing afterwards. It may be worthwhile because these kind of parsers are easy and quick to configure.

Links:

Similar posts