[PATCH] DTC: Remove the need for the GLR Parser.

Tue Oct 23 12:54:12 EST 2007

On Mon, Oct 22, 2007 at 04:13:54PM -0500, Jon Loeliger wrote:
> Previously, there were a few shift/reduce and reduce/reduce
> errors in the grammar that were being handled by the not-so-popular
> GLR Parser technique.

I haven't actually heard anyone whinge about glr-parser...

> Flip a right-recursive stack-abusing rule into a left-recursive
> stack-friendly rule and clear up three messes in one shot: No more
> conflicts, no need for the GLR parser, and friendlier stackness.

Ouch.  I'm feeling a bit stupid now; I really thought our conflicts
were somewhere else.  Specifically, I thought the problem was that we
needed to look ahead more tokens than we were able to in order to
differentiate between property and subnode definitions, i.e. between:
	label propname =
and
	label propname {

Except... I'm almost certain the conflicts first appeared when I added
labels, and I can't see how that would affect this.  Well, colour me
baffled.

Especially since the comments and content of commit
4102d840d993e7cce7d5c5aea8ef696dc81236fc (second commit in the entire
history!) appear to back up my memory of this.  I used to have a
lookahead hack in the lexer to remove the conflict.

But this patch certainly seems to make the conflicts go away, so I'm
confused.

Well, regardless of that, I have a few concerns.

First, a trivial one: I remember leaving this as a right-recursion,
despite the stack-nastiness, because that way the properties end up in
the same order as in the source.  I think that behaviour is worth
preserving, but of course we can do it with left-recursion by changing
chain_property() to add to the end of the list instead of the
beginning.  Also, if we're going to avoid right-recursion here, we
should do so for the 'subnodes' productions as well, which is
completely analogous.
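
To illustrate the ordering point, here is a minimal sketch of the two
chaining strategies.  The struct below is a simplified, hypothetical
stand-in rather than dtc's real definition, and the function names
chain_property_front() and chain_property_back() are invented for this
example; only the prepend-versus-append idea comes from the discussion
above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for dtc's property type. */
struct property {
	const char *name;
	struct property *next;
};

/*
 * Prepend: suits a right-recursive rule, where the tail of the list
 * has already been built by the time the head item is reduced.
 */
static struct property *chain_property_front(struct property *first,
					     struct property *list)
{
	first->next = list;
	return first;
}

/*
 * Append: suits a left-recursive rule, where items are reduced in
 * source order and each new one must go on the end to keep that order.
 */
static struct property *chain_property_back(struct property *list,
					    struct property *new_prop)
{
	struct property **pp = &list;

	new_prop->next = NULL;
	while (*pp)		/* walk to the terminating NULL link */
		pp = &(*pp)->next;
	*pp = new_prop;
	return list;
}
```

Appending this way walks the whole list each time, so building n
properties costs O(n^2) link traversals.  If that ever matters, an
alternative is to keep prepending and reverse the list once when the
node is complete.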

More significantly, I don't know that we want to burn our bridges with
glr-parser.  GLR is a beautiful algorithm, and it means we can use
essentially whatever form of grammar is the easiest to work with,
without having to fiddle about to ensure it's LALR(1).  This could
still be useful if we encounter some less easily finessable grammar
construct in future.

And even without glr-parser, I'm still uncomfortable with the
lexer<->parser execution ordering issues with the current
/dts-version/ proposal.  It may now be true that the order is
guaranteed to be correct, but it's still not exactly obvious.

It seems to me that the version change introduces a lexical change to
the input format, and should therefore be handled at the lexical
level.  And I think there are other potential advantages to parsing
the version identifier as a token, rather than as an integer (such as
being able to define entirely different grammars for different
versions, if we have to).

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson