How to implement recursive grammar in Perl6

I'm trying to implement a Markdown parser with a Perl 6 grammar and got stuck on blockquotes. A blockquote paragraph cannot be expressed in terms of nested braces because it is a list of specifically formatted lines, yet semantically it is nested Markdown.

Basically, it all came down to the following definition:

    token mdBlockquote {
        <mdBQLine>+ {
            my $quoted = [~] $<mdBQLine>.map: { $_<mdBQLineBody> };
        }
    }

The actual implementation of the mdBQLine token is not relevant here. The only important thing to note is that the mdBQLineBody key holds the quoted line with the leading > already stripped off. So, for a block:

    > # quote1
    > quote2
    >
    > quote3
    quote3.1

the $quoted scalar will contain:

    # quote1
    quote2

    quote3
    quote3.1

Now, the whole point is to have the above data parsed and injected back into the Match object $/. This is where I'm completely stuck. The most obvious solution:

    token mdBlockquote {
        <mdBQLine>+ {
            my $quoted = [~] $<mdBQLine>.map: { $_<mdBQLineBody> };
            $<mdBQParsed> = self.parse( $quoted, actions => self.actions );
        }
    }

fails for two reasons at once: first, $/ is a read-only object; second, .parse modifies it, effectively making it impossible to inject anything into the original tree.

Is there any solution other than post-analysing the parsed data, extracting and re-parsing the blockquotes, and repeating...?


Expanding a little on @HåkonHægland's comment...

> $/ is a read-only object ... effectively making it impossible to inject anything into the original tree.

Not quite:

  • Pedantically speaking, $/ is a symbol and never an object whether or not it's bound to one. Like any other symbol in P6, it can always be freely rebound. ($/ := 42 will always work.)

  • But what you're referring to is assignment. The semantics of assignment is determined by the item(s) being assigned to. If they're ordinary objects that are not containers then they won't support lvalue semantics and you'll get a Cannot modify an immutable ... error if you try to assign to them. A Match object is immutable in this sense.

What you can do is hang arbitrary data off any Match object by using the .make method on it. (The make routine calls this method on $/.) This is how you store custom data in a parse tree.

To access what's made in a given node of a parse tree / Match object, call .made (or .ast which is a synonym) on that node.

Typically what you make for higher nodes in a parse tree includes what was made for lower level nodes.
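As a minimal sketch of how make and .made fit together, here is a toy grammar unrelated to Markdown. The names Sum and SumActions are made up for illustration. Each number node stores its numeric value, and the top node builds its own result from what the lower nodes made:

```raku
# Hypothetical toy grammar: integers separated by '+'.
grammar Sum {
    token TOP    { <number>+ % '+' }
    token number { \d+ }
}

# Hypothetical actions class: each method attaches data to its node with make.
class SumActions {
    # Lower-level node: store the numeric value of the matched digits.
    method number($/) { make +$/ }
    # Higher-level node: combine what the lower-level nodes made.
    method TOP($/)    { make [+] $<number>».made }
}

my $match = Sum.parse('2+3+4', actions => SumActions);
say $match.made;             # 9
say $match<number>[1].made;  # 3
```

The Match objects themselves are never assigned to. The custom data simply rides along on each node and is read back with .made.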

Please try the following untested code and see what you get. If it fails miserably and you can't figure out a way to make it work, leave a comment. Otherwise, build on it with the last two paragraphs above in mind, and comment on how it works out:

    token mdBlockquote {
        <mdBQLine>+ {
            make self.parse: [~] $<mdBQLine>.map: { $_<mdBQLineBody> };
        }
    }
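If calling .parse inside the token turns out to interfere with $/, one alternative is to do the re-parse from an actions class instead. The following is only a sketch under simplifying assumptions: Quote, QuoteActions, reparse and the token names are all hypothetical, every line must end with a newline, and lazy continuation lines (such as quote3.1 in the question) are not handled. The nested parse is wrapped in a helper method so that it works with that method's own $/ rather than the $/ of the action that calls make:

```raku
# Hypothetical stripped-down grammar: blockquotes and plain lines only.
grammar Quote {
    token TOP   { <block>* }
    token block { <quote> || <line> }
    token quote { <qline>+ }
    token qline { '>' ' '? <body> }
    token body  { \N* \n }
    token line  { <!before '>'> \N* \n }
}

class QuoteActions {
    method TOP($/)   { make $<block>.map(*.made).List }
    method block($/) { make ($<quote> // $<line>).made }
    method line($/)  { make $/.Str.chomp }

    method quote($/) {
        # Join the quoted lines, which already have one level of '>' removed.
        my $inner = [~] $<qline>.map: { .<body>.Str };
        # Attach the result of the nested parse to this node.
        make self.reparse($inner);
    }

    # The nested .parse runs here, against this method's own $/.
    method reparse(Str $text) {
        Quote.parse($text, actions => self).made
    }
}

my $tree = Quote.parse("> # quote1\n> quote2\n> > nested\nplain\n",
                       actions => QuoteActions).made;
say $tree;
```

With this shape, a blockquote node's .made is itself the list made by a complete nested parse, so nesting of any depth falls out of the recursion.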

