Tool parser_gen
transform the syntax definition as standard input into a a C++ generated code onto the standard output.
parser_gen <syntax_def.csf >syntax.hh
TODO.
Space, horizontal tabulation, new line, carriage return characters are ignored. Also double slash ("//") ignores the end of the line.
SpecialChar ::= "\\" | "n" | "r" | "t" | "\"" | "-" | "]" Char ::= [^\]\\\-] | "\\" SpecialChar Range ::= Char "-" Char CharClass ::= "[" (Range | Char)+ "]" | "[" "^" (Range | Char)+ "]" StringChar ::= [^\"\\] | "\\" SpecialChar String ::= "\"" StringChar* "\"" Symbol ::= ([A-Z] | [a-z] | "*" | "?" | "+" | "-" | "_")+ AttrRule ::= "$" CxxCode "$"
A syntax definition consists on a list of rules. Rules are defined as such:
Rule ::= ("<" AttrChainElt ">")? Symbol "::=" ListOfChains Filtering? ListOfChains ::= {(AttrChainElt* AttrRule?) "|"}+ AttrChainElt ::= AttrRule? ChainElt ChainElt ::= Symbol | String | CharClass
Filtering ::= "-/-" LookAhead+
The attribute rules (AttrRule) on the left of a parsing element (symbol, string, or character class) are prior to the descent onto parsing this element.
It must define next_frame
of type convertible to type aurelia::llstack::frame.
The default copies the current frame:
$ frame next_frame(f); $
The frame should never be modified in case of left recursion, otherwise it breaks the parsing algorithm and does not halt.
The last attribute rule must define ret
. If there is no return value, it must define it as type aurelia::llstack::void_return.
void_return ret;
If the code decides to reject the current parsing, then the code must return.
Aurelia provides template aurelia::llstack::ret_frame to stack easily data to pass through rules. The data must be hash-able and movable or copyable.
f
is the current frame. s
is the current stream.Type stream
must provide operators "++" and "*". Aurelia provides aurelia::buf_stream_iterator.
// We first check that the stream has the right first character. if (s.eof()) { if (!is_empty_first(S_0_0)) throw error(); } else if (!is_first(s, S_0_0)) { throw error(); } // We then define the boundary nodes of the graph. // end is the end iterator of the stream. node u1 = node::boundary(start_label(), end); node u0 = node::boundary(end_label(), end); u1.parents().insert(u0); // We define a queue of parsing contexts. r_t R; // We then add the start symbol in the queue. S start; start(R, empty_frame(), s, u1); // We then loop requesting descriptors from the queue, then calling // them. try { while (true) { descriptor d = R.pop(); d.L.call(R, d.f, d.s, d.n); } } catch (r_t::eof) { }