What's the meta-object rule for naming grammar rules?


As indicated in this issue, some token names clash with method names in the class hierarchy of Grammar (which includes Match, Capture, Cool, Any and obviously Mu). For instance, with Mu.item:

    grammar g {
        token TOP { <item> };
        token item { 'defined' }
    };
    say g.parse('defined');

issues an error like this one:

Too many positionals passed; expected 1 argument but got 2␤   in regex item at xxx 

item is one of Any's methods, too; I haven't found any other methods in the other classes whose names fail as rules, but then there are no other subs defined (except for item); most are multis or actually defined as method.

This also happens when submethods like TWEAK or BUILD are used for token names, but the error in this case is different:

Cannot find method 'match': no method cache and no .^find_method␤ at xxx 

However, other submethods like FALLBACK have no problem at all:

    grammar g {
        token TOP { <FALLBACK> };
        token FALLBACK { 'defined' }
    };
    say g.parse('defined') # OUTPUT: «「defined」␤ FALLBACK => 「defined」␤»

and ditto for some other methods in the class hierarchy of Grammar, such as rand or, in general, most methods defined as such.

What the problematic names seem to have in common is that they are declared as sub, but that is not always the case: CREATE, which caused the whole issue initially, is declared as a method. So it is not at all clear to me which names to avoid and which ones can be used legitimately. Can someone clarify?


This is almost entirely about multiple awkward bugs.

item etc.

See RT#127945 -- Mu methods cannot be used as grammar tokens due to default Actions class. Also token name conflict with internal name ?. Unfortunately this isn't easy to fix.

An explanation of this bug and its impact follows.

Per the Actions mechanism, if a grammar rule matches, the .parse call immediately tries to call a correspondingly named action method.
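To make that mechanism concrete, here is a minimal sketch. The names Greeting, GreetingActions and noun are made up for illustration, and were picked so that they do not collide with any inherited method:

```raku
grammar Greeting {
    token TOP  { <noun> }
    token noun { \w+ }
}

# Hypothetical actions class: one method per rule, named after the rule.
class GreetingActions {
    # Called with the Match object as soon as `noun` matches.
    method noun($/) { make $/.Str.uc }
    # Called when TOP matches; reuses the value attached by `noun`.
    method TOP($/)  { make $<noun>.made }
}

say Greeting.parse('hello', actions => GreetingActions).made;   # HELLO
```

Each action method receives the Match object as its only argument, which is exactly the call that goes wrong when the method that gets found is an inherited one such as item.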

If you don't explicitly pass an actions class/object to the .parse method then it uses the default, which is Mu. Then, when a rule in your grammar matches, it looks for a Mu method with the same name. If it doesn't find one, all is well. But if it finds one then it calls that method on Mu with the current Match object as the first and only argument. In almost all cases that'll go badly. item is an example of this.

If you do tell the .parse method to use a particular actions class/object, another wrinkle arises:

    grammar g           { rule all { all } };
    class actions       { }
    g.parse: 'all',
             rule    => 'all',
             actions => actions,

This yields a similar error to item, except this time the all method comes from Any. This is because the actions class's MRO includes Any:

say class actions   { }.^mro ; # ((actions) (Any) (Mu)) 

You can eliminate this wrinkle by declaring your actions classes with is Mu:

    grammar g           { rule all { all } };
    class actions is Mu { }
    g.parse: 'all',
             rule    => 'all',
             actions => actions,

This works fine because now the actions class only inherits from Mu -- and Mu doesn't have an all method.

It would be great if you could inherit from nothing, but you can't; is Mu is as minimal as you can get.
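If you want to check a candidate rule name against a given Rakudo, one option is to ask the meta-object whether Mu or Any can already answer it. This is only a sketch: my-rule is a made-up name added for contrast, and the result depends on which methods your Rakudo version ships.

```raku
# Candidate rule names; `my-rule` is a hypothetical name for contrast.
for <item all my-rule> -> $name {
    # .^can returns the matching methods; `so` turns that into True/False.
    say "$name: Mu {so Mu.^can($name)}, Any {so Any.^can($name)}";
}
```

Going by the behaviour described above, item should be reported for both Mu and Any, and all only for Any. A name reported for Mu is unsafe with the default actions; a name reported only for Any is unsafe with an actions class that isn't declared is Mu.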

What can we conclude about this first bug?

Because newer versions of Perl 6 and/or Rakudo may ship with new Mu methods, the safest thing to do to defend against this bug is to always declare an actions class and always declare a method corresponding to every single rule in your grammar. If you do this you don't need to follow any naming rules to avoid this bug.
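A sketch of that defensive style, assuming made-up names Keyword and KeywordActions. The rule is deliberately called all, which would otherwise resolve to the all method inherited from Any:

```raku
grammar Keyword {
    token TOP { <all> }
    token all { 'all' }
}

# Hypothetical actions class with a method for every rule in the grammar.
# Its own `all` method is found before the one inherited from Any.
class KeywordActions {
    method TOP($/) { make $<all>.made }
    method all($/) { make 'matched: ' ~ $/ }
}

say Keyword.parse('all', actions => KeywordActions).made;
```

Because every rule has an explicit method, the lookup never falls through to an inherited method, whatever future releases add to Mu or Any.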

TWEAK etc.

I will file an RT bug about this if I can't find an existing one.

Golfed:

grammar g { rule TWEAK {} } 

This blows up at compile-time (immediately after parsing the closing curly brace of the grammar declaration). So this is definitely not the same bug as the item bug -- because the latter is due to the run-time Actions mechanism that only kicks in after a rule matches.

This does not blow up:

grammar g { method TWEAK {} } 

Perhaps, as part of creating/finalizing a grammar package, some code introspects and/or manipulates any TWEAK "method" found in the new grammar package in a way that works fine if it's an ordinary method but blows up if it's not.

"However, other submethods like FALLBACK have no problem at all"

TWEAK and BUILD methods or submethods in a class are part of standard object construction. They have a very different role to play than FALLBACK (which is called if a method is missing).

What can we conclude about this second bug?

There's clearly something very specific going on with TWEAK and BUILD and they may well be the only two rule names with the problem they exhibit. So just avoid those two names and you'll hopefully be clear of this bug.

Accidentally using built-in rule names

See RT#125518 -- Grammar 'ident' override behaviour.

You can override built-in rules by just specifying your own version.
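A small sketch of what overriding looks like, using ws because the effect is easy to see. The grammar names and the replacement ws definition are made up for illustration:

```raku
# Uses the built-in ws, so ordinary whitespace separates the two atoms.
grammar Spaced {
    rule TOP { 'a' 'b' }
}

# Declares its own ws (a made-up definition), which replaces the built-in one.
grammar Dashed {
    rule TOP { 'a' 'b' }
    token ws { '-'* }
}

say so Spaced.parse('a b');   # True
say so Dashed.parse('a b');   # False
say so Dashed.parse('a-b');   # True
```

Nothing warns you that Dashed has replaced a built-in; the only visible difference is in what the grammar accepts.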

As dwarring notes, "It certainly causes confusion if you accidentally declare [a rule] with the same name as a built-in rule."

So the key question is, what's the definitive source for knowing built-in rules and how might one manage things given that they may change over time?

(Yes, very vague, I know. Also, I think Perl 6's built-ins must necessarily extend NQP's and that seems likely to be relevant. Also, there are multiple slangs in each overall language and perhaps that's relevant. I plan to discuss this issue more fully in a later edit.)

Other relevant bugs

See also Moritz' answer.
