Eigenstate: myrddin-dev mailing list


Re: Parser Generator: Demo Exists.


A few reasons:

1) I'd like to unify the lexer and parser
2) I'd like to use a different table construction (IELR instead of LALR)
3) I've actually found that it's not so hard to do.
4) I'd like to have a platform to explore better error handling.
5) I'm aiming to get more code actually written in Myrddin to shake
   out compiler and library bugs.

I could probably adapt one of the LALR implementations in a weekend,
though.

On Mon, 03 Aug 2015 09:11:25 -0500, Ryan Gonzalez <rymg19@xxxxxxxxx> wrote:

> So...maybe I'm just saying the obvious...
> 
> ...but have you ever considered just porting Berkeley YACC (byacc)? It's public domain, so you can license it however the heck you want, and I'm pretty sure that'll end up easier than trying to do the whole thing from scratch.
> 
> 
> On August 3, 2015 1:49:00 AM CDT, Ori Bernstein <ori@xxxxxxxxxxxxxx> wrote:
> >So, it's far from done, and is currently completely useless, but enough
> >is working that I feel like I can announce that it exists:
> >
> >    http://git.eigenstate.org/ori/mpgen.git
> >
> >This currently implements a lexer and an LR(0) parser generator; the plan
> >is to move to IELR.
> >
> >An example input is here:
> >
> >        %pkg parse
> >        %tok id = "/[a-z]*/"
> >        %tok Lbra = "("
> >        %tok Rbra = ")"
> >        %tok Plus = "+"
> >        %start expr
> >
> >        expr: term  {std.put("got a term\n")}
> >            | expr Plus term
> >            ;
> >
> >        term: id {std.put("got an id\n")}
> >            | Lbra expr Rbra
> >            ;
> >
> >        %myr {
> >        use std
> >        }
> >
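To make "LR(0) parser generator" concrete, here is a minimal sketch of the textbook LR(0) item-set construction run on the example grammar above. It is written in Python rather than Myrddin, it is not mpgen's code, and every name in it (GRAMMAR, closure, goto, build_states, the augmented "start'" symbol) is an assumption made for illustration.

```python
# Example grammar from the post, plus an augmented start production.
# Each production is (lhs, rhs); an item is (production index, dot position).
GRAMMAR = [
    ("start'", ("expr",)),
    ("expr", ("term",)),
    ("expr", ("expr", "Plus", "term")),
    ("term", ("id",)),
    ("term", ("Lbra", "expr", "Rbra")),
]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add X -> . rhs for every nonterminal X that appears right after a dot."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in items:
                    items.add((i, 0))
                    work.append((i, 0))
    return frozenset(items)

def goto(state, sym):
    """Advance the dot over sym in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == sym}
    return closure(moved)

def build_states():
    """Build the canonical collection of LR(0) item sets and its transitions."""
    start = closure({(0, 0)})
    states, index, trans = [start], {start: 0}, {}
    i = 0
    while i < len(states):
        state = states[i]
        # Symbols that appear right after a dot in this state.
        syms = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in sorted(syms):
            nxt = goto(state, sym)
            if nxt not in index:
                index[nxt] = len(states)
                states.append(nxt)
            trans[(i, sym)] = index[nxt]
        i += 1
    return states, trans

states, trans = build_states()
print(len(states), "states,", len(trans), "transitions")
```

A real generator would go on to turn the states and transitions into shift/reduce tables; the sketch stops at the item sets, which is the part that LR(0), LALR and IELR refine in different ways.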
> >For documentation, you'll mostly have to read the source, but a
> >summary:
> >
> >    %pkg name
> >        Optional; this sets the package that the parse() function is in.
> >
> >    %tok name = pat
> >        Creates a token with the name 'name', and pattern 'pat'. Two
> >        types of pattern are accepted:
> >        - String patterns. These are quoted strings that are interpreted
> >          verbatim: "foo.*" will match the exact string 'foo.*'.
> >        - Regex patterns. These are quoted in slashes, and are treated
> >          as regexes. Capture groups, of course, aren't supported, since
> >          we compile to a DFA.
> >
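The difference between the two pattern kinds can be sketched in a few lines. This is Python, not Myrddin, and compile_pattern is a hypothetical helper, not part of mpgen. It assumes the slashes sit inside the double quotes, as in the example input, and it uses Python's backtracking re module as a stand-in for the DFA that mpgen compiles to.

```python
import re

def compile_pattern(pat):
    """Compile the text found between the double quotes of a %tok pattern."""
    if len(pat) >= 2 and pat.startswith("/") and pat.endswith("/"):
        # Regex pattern: strip the slashes and treat the body as a regex.
        return re.compile(pat[1:-1])
    # String pattern: escape everything so it is matched verbatim.
    return re.compile(re.escape(pat))

literal = compile_pattern("foo.*")
regex = compile_pattern("/[a-z]*/")

print(literal.fullmatch("foo.*") is not None)   # verbatim text matches
print(literal.fullmatch("foobar") is not None)  # '.*' is not a wildcard here
print(regex.fullmatch("abc") is not None)       # regex body is interpreted
```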
> >    %skip name = pat
> >        Skips tokens; useful for, e.g., ignoring whitespace. Should probably
> >        drop the name= bit.
> >
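As a rough illustration of how %tok and %skip rules could interact, here is a small tokenizer sketch in Python (not Myrddin, and not mpgen's code). The rule table, the "ws" skip rule, and the longest-match policy with ties going to the earliest rule are all assumptions; the post does not say how mpgen resolves overlapping matches.

```python
import re

# Hypothetical rule table: (name, compiled pattern, is_skip).
# The first four mirror the %tok lines of the example input; "ws" plays
# the role of a %skip rule and is invented for this sketch.
RULES = [
    ("id",   re.compile(r"[a-z]*"),          False),
    ("Lbra", re.compile(re.escape("(")),     False),
    ("Rbra", re.compile(re.escape(")")),     False),
    ("Plus", re.compile(re.escape("+")),     False),
    ("ws",   re.compile(r"[ \t\n]+"),        True),
]

def tokenize(text):
    """Split text into (name, lexeme) pairs, dropping skipped tokens."""
    pos, out = 0, []
    while pos < len(text):
        best = None
        for name, pat, skip in RULES:
            m = pat.match(text, pos)
            # Ignore empty matches ([a-z]* can match nothing) and keep
            # the longest match; earlier rules win ties.
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end(), skip)
        if best is None:
            raise ValueError(f"no token matches at offset {pos}")
        name, end, skip = best
        if not skip:
            out.append((name, text[pos:end]))
        pos = end
    return out

print(tokenize("(a + bc)"))
```

Skipped tokens are consumed like any other token and simply never reach the parser, which is why the name on a %skip rule carries no information.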
> >    %start sym
> >        Identifies the initial symbol that we use. This is mandatory.
> >
> >    %myr
> >        A single Myrddin blob. This is injected at the end of the input
> >        (although, thanks to the order independence, the exact location
> >        is never important).
> >
> >A single function is exported:
> >
> >    const parse : (input : byte[:] -> bool)
> >
> >Things that are not supported yet:
> >
> >    - Usefully powerful grammars (LR(0), hah)
> >    - Actions on reductions -- right now, the code snippets are no-ops.
> >    - Types and values
> >    - Unicode
> >    - Parsing from an input stream (currently uses a string)
> >    - Implicit tokens; I'd like to have strings usable directly in the
> >      grammar, instead of needing %tok directives.
> >    - Error handling; we just blow up.
> >
> >It's also buggy as fuck. So, 'tis a demo, not a useful tool. Yet.
> >
> >-- 
> >    Ori Bernstein
> 
> -- 
> Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.


-- 
    Ori Bernstein
