Tuesday, 14 July 2015

IncursionScript and the Accent Compiler Compiler

When I got a few minutes, I spent a bit more time looking at replacing the Accent usage in Incursion.  For the most part, Accent works much the same as yacc or bison, but in a much more user-friendly way.  Yacc and bison use BNF grammars, whereas Accent uses an EBNF grammar.  This allows a little regular expression syntax within rules, which avoids a lot of painful rule finagling.

This StackOverflow post quotes (from another web page) the type of work you have to do to convert from EBNF to BNF.  For a one-off case it seems only mildly inconvenient, but when you have several variations it becomes painful, especially if you're considering converting ~100k of rules for parsing IncursionScript.

monster_def:
  MONSTER { theMon->Attr[0] = theMon->Attr[1] =
            theMon->Attr[2] = theMon->Attr[3] =
            theMon->Attr[4] = theMon->Attr[5] = 0;
            CurrAttk = 0; CurrFeat = 0; theRes = theMon; }
  LITERAL<name>  { theMon->Name = name; }
  ':' cexpr3<n>    { theMon->MType[0] = n; }
    ( ',' cexpr3<n2> { theMon->MType[1] = n2; }
    ( ',' cexpr3<n3> { theMon->MType[2] = n3; } )? )?
    '{' (mon_entry)* '}' { theMon++; };

That was the initial part of a monster type definition. Note that Accent attaches a parameter name to a grammar symbol with < and >, as in LITERAL<name>, whereas bison's support is much more limited and uses [ and ]. Also, the right hand side of a rule can use the previously mentioned aspects of regular expression syntax. In this case ( and )? are used to indicate zero or one occurrences of a match, and + and * can be used similarly to indicate one or more, and zero or more, matches.  Note also that the parsed parameter values are used directly in the code blocks without any extra syntax, whereas bison requires that n be written $n.

Any use of this regular expression syntax would, when rewritten for bison, require more and more complicated contrivances and would break up the whole rule, making the grammar less readable.  In theory, it should be possible to construct a bison grammar that parses the Accent syntax, with a matching lexer, and use it to convert an Accent grammar to a more convoluted bison grammar.  The bison grammar for the Accent syntax isn't hard to do; in fact, the Accent author provides one in the documentation.  The bulk of the work would be in writing some subset of C parsing to transform the code blocks, and then there are the flexible Accent parameter lists, which bison cannot support at all.
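
To give a feel for the purely mechanical part of that conversion, here is a minimal Python sketch of how the ( )?, ( )* and ( )+ groups could be pulled out into BNF helper rules. Everything in it is an assumption made for illustration: the desugar and to_bison functions, the nested-list representation of a rule, and the generated helper names like monster_def_1 are hypothetical and are not part of Accent or bison. It also ignores the code blocks and the <name> parameters entirely, which is exactly where the real work would be.

```python
from itertools import count


def desugar(rules):
    """Rewrite EBNF groups into plain BNF helper rules.

    `rules` maps a rule name to a list of alternatives. Each alternative is a
    list of items. An item is either a symbol (a string) or a tuple
    (op, body), where op is '?', '*' or '+' and body is a nested item list.
    Returns a new mapping in the same shape that contains only strings.
    """
    out = {}
    fresh = count(1)  # numbering for the generated (hypothetical) helper names

    def flatten(owner, items):
        flat = []
        for item in items:
            if isinstance(item, str):
                flat.append(item)
                continue
            op, body = item
            helper = f"{owner}_{next(fresh)}"
            inner = flatten(owner, body)  # nested groups get their own helpers
            if op == "?":
                out[helper] = [[], inner]             # empty | body
            elif op == "*":
                out[helper] = [[], [helper] + inner]  # empty | helper body
            elif op == "+":
                out[helper] = [inner, [helper] + inner]  # body | helper body
            else:
                raise ValueError(f"unknown EBNF operator: {op!r}")
            flat.append(helper)
        return flat

    for name, alternatives in rules.items():
        out[name] = []  # reserve the rule's position ahead of its helpers
        out[name] = [flatten(name, alt) for alt in alternatives]
    return out


def to_bison(rules):
    """Render desugared rules in yacc/bison-style BNF, without actions."""
    lines = []
    for name, alternatives in rules.items():
        rhs = " | ".join(" ".join(alt) if alt else "/* empty */"
                         for alt in alternatives)
        lines.append(f"{name}: {rhs};")
    return "\n".join(lines)


# The monster_def rule from above, with code blocks and parameters dropped.
MONSTER_DEF = {
    "monster_def": [[
        "MONSTER", "LITERAL", "':'", "cexpr3",
        ("?", ["','", "cexpr3", ("?", ["','", "cexpr3"])]),
        "'{'", ("*", ["mon_entry"]), "'}'",
    ]],
}

print(to_bison(desugar(MONSTER_DEF)))
```

Even in this stripped-down form, the single rule turns into four, and the code blocks that sat inline in the Accent version would have to be redistributed across the helper rules by hand or by a much smarter tool.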

Old, with dated pre-ANSI code, and under a GPLv2 license that requires a commercial license to be able to use it freely, Accent is still not just a compelling alternative to bison.  It's almost impossible to justify downgrading from it, if one ignores the unfortunate license.  But for Incursion to become something where any inspired player can open up a script and start modding, it needs to downgrade from it.

Translating the grammar from EBNF to BNF?  Not enough.  Bison doesn't offer the additional support required.  This solution would still require hours of painstaking rewriting of the custom Accent grammar syntax.

The only remaining alternative is to write an Accent replacement, or to enhance bison or byacc to support EBNF and the additional functionality.  But then how many hours would that require?  It might be worth downgrading that 90k Accent grammar to a bison grammar after all.  Something to think about.

That reminds me, I've searched everywhere three times and I can't find my Dragon parsing book.  It's gone.  I imagine it's with my compact flash USB adapter, my micro SD card, my issue of Dragontales, and the other things I haven't noticed going missing yet.
