RFC 54, by Damian Conway: Operators: Polymorphic comparisons

This RFC was originally proposed on August 7th 2020 and frozen in six weeks.

It described a frustration with comparison operations in Perl. The expression:

"cat" == "dog"   #True

Perl (and now Raku) has excellent support for generic programming because of dynamic typing, generic data types and interface polymorphism. Just about the only place where that DWIM genericity breaks down is in comparisons. There is no generic way to specify an ordering on dynamically typed data. For example, one cannot simply code a generic BST insertion method. Using <=> for the comparison will work well for numeric keys, but fails miserably on most string-based keys because <=> will generally return 0 for most pairs of strings. The above code would work correctly however if <=> detected the string/string comparison and automagically used cmp instead.

The Raku implementation is as follows:

There are three built-in comparison operators that can be used for sorting. They are sometimes called three-way comparators because they compare their operands and return a value meaning that the first operand should be considered less than, equal to or more than the second operand for the purpose of determining in which order these operands should be sorted. The leg operator coerces its arguments to strings and performs a lexicographic comparison. The <=> operator coerces its arguments to numbers (Real) and does a numeric comparison. The aforementioned cmp operator is the “smart” three-way comparator, which compares strings with string semantics and numbers with number semantics.[1]

The allomorph types IntStr, NumStr, RatStr and ComplexStr may be created as a result of parsing a string quoted with angle brackets…

my $f = <42.1>; say $f.^name; # OUTPUT: «RatStr␤»

Returns either Order::Less, Order::Same or Order::More object. Compares Pair objects first by key and then by value etc.

Evaluates Lists by comparing element @a[$i] with @b[$i] (for some Int $i, beginning at 0) and returning Order::Less, Order::Same, or Order::More depending on if and how the values differ. If the operation evaluates to Order::Same, @a[$i + 1] is compared with @b[$i + 1]. This is repeated until one is greater than the other or all elements are exhausted. If the Lists are of different lengths, at most only $n comparisons will be made (where $n = @a.elems min @b.elems). If all of those comparisons evaluate to Order::Same, the final value is selected based upon which List is longer.

If $a eqv $b, then $a cmp $b always returns Order::Same. Keep in mind that certain constructs, such as Sets, Bags, and Mixes care about object identity, and so will not accept an allomorph as equivalent of its components alone.

Now, we can leverage the Raku sort syntax to suit our needs:

my @sorted = sort { $^a cmp $^b }, @values;
my @sorted = sort -> $a, $b { $a cmp $b }, @values;
my @sorted = sort * cmp *, @values;
my @sorted = sort &infix:cmp», @values;

And neatly avoid duplication in a functional style:

# sort case-insensitively
say sort { $^a.lc cmp $^b.lc }, @words;
#          ^^^^^^     ^^^^^^  code duplication
# sort case-insensitively
say sort { .lc }, @words;

So this solution to RFC54 smoothly combines many of the individual capabilities of Raku – classes, allomorphs, dynamic typing, interface polymorphism, and functional programming to produce a set of practical solutions to suit your coding style.

[1] This is ass-backwards from the RFC54 request with ‘cmp’ as the dynamic “apex”, degenerating to ‘<=>’ for Numeric or ‘leg’ for Lexicographic variants.

RFC 64: New pragma ‘scope’ to change Perl’s default scoping

Let’s talk about a fun RFC that mostly did not make its way into current day Raku, nor is it planned for later implementation.

This is about RFC 64 by Nathan Wiger. Let me quote the abstract:

Historically, Perl has had the default “everything’s global” scope. This means that you must explicitly define every variable with my, our, or its absolute package to get better scoping. You can ‘use strict’ to force this behavior, but this makes easy things harder, and doesn’t fix the actual problem.

Those who don’t learn from history are doomed to repeat it. Let’s fix this.

It seems use strict; simply has won, despite Nathan Wiger’s dislike for it.

Raku enables it by default, even for one-liners with the -e option, Perl 5 enables it with use 5.012 and later versions, and these days there’s even talk to enable it by default in version 7 or 8.

I’d say the industry as a whole has moved in the direction of accepting the tiny inconvenience of having to declare variables over the massive benefit in safety and protection against typos and other errors. Hey, even javascript got a "use strict" and TypeScript enables it by default in modules. PHP 7 also got something comparable. The only holdout in the “no strict” realm seems to be python.

With pretty much universal acceptance of required variable definitions, there was no compelling reason to improve implicit declaration, so no scope pragma was needed.

But… there’s always a “but”, isn’t there?

One of the primary motivations for not wanting to declare variables was laziness, and Raku did introduce several features that allow to you avoid some declarations:

  • Parameters in signatures are an implicit declaration, The RFC’s example sub squre could be written in Raku simply as sub square($x) { $x * $x }. No explicit declaration necessary.
  • Self-declaring formal parameters with the ^ twigil also imply a declaration, for example sub square { $^x * $^x }.
  • There are many functional features in Raku that you can use to avoid explicit variables altogether, like meta operators, Whatever star currying etc.

I am glad Raku requires variable declarations by default, and haven’t seen any code in the wild that explicitly states no strict;. And without declarations, where would you even put the type constraint?

RFC 5, by Michael J. Mathews: Multiline comments

This is the first RFC proposed related to documentation. It asks for a common feature in most of the modern programming languages: multiline comments.

The problem of not having multi-line comments is quite obvious: if you need to comment a large chunk of code, you need to manually insert a # symbol at the beginning of every line (in Raku). This can be incredibly tedious if you do not have, for instance, a text editor to do this with a shortcut or similar. This practice is very common in large code bases. For that reason, Michael refers to C++ and Java as

popular languages that were designed from the outset as being useful for large projects, implementing both single line and multiline comments

In those languages you can type comments as follows:

// single line of code

 Several lines of code

But, in addition, in Java you have a special multiline comment syntax 1 for writing documentation:

* Here you can write the doc!

A lot of people proposed POD as a solution to this problem, but Michael lists some inconvenients:

  • “it’s not intuitive”: given that POD is only used by Perl, people coming from different languages will face some struggles learning an entire new syntax.

From my point of view, this not as big a problem since POD6 syntax is quite simple and it’s well documented. In addition, it is quite intuitive for newcomers: if you want a header, you use =head1, if you want italics, you use I<> and so on.

  • “it’s not documentation”: this one is still true. The main problem is that when you want to comment a big chunk of code, that’s probably not documentation, so using =begin pod ... =end pod it’s a little weird.
  • “it doesn’t encourage consistency”: another problem of POD is that you can use arbitrary terms in its syntax:

    While this behavior gives us a lot freedom, it also complicates consistency across different projects and users.

After some discussion, Perl chose POD for implementing multiline comments. Nonetheless, Michael proposal was taken into account and Raku supports multiline comments similar to those of C++ and Java, but with a slightly different syntax:

Raku is a large-project-friendly
language too!
say ":D";

And as a curiosity, Raku has embedded comments, that is:

if #`( embedded comment ) True {
    say "Raku is awesome";

In the end, as a modern, 100-year language, Raku gives you more than one way to do it, so choose whatever fits you best!

  1. It’s not really a multiline comment because you also need to type the * symbol at the beginning of every line.

RFC 225: Superpositions (aka Junctions)

Damian Conway is one of those names in the Perl and Raku world that almost doesn’t need explaining. He is one of the most prolific contributors to CPAN and was foundational in the design of Raku (then Perl 6). One of his more interesting proposals came in RFC225 on Superpositions, which suggested making his Perl Quantum::Superposition‘s features available in the core of the language.

What is a Superposition?¹

In the quantum world, there are measurable things that can exist in multiple states — simultaneously — until the point in time in which we measure them. For computer scientists, perhaps the most salient application of this is in qubits which, as a core element of quantum computing, threaten to destroy encryption as we know it, if quantum supremacy is borne out.

At the end of the day, though, for us it means being able to treat multiple values as if they were a single value so long as never actually need there to only be one, at which point we get a single state from them.

The Perl Implementation

In the original implementation, Dr. Conway adds two new operators, all and any. These converted a list of values into a single scalar value. How was this different from using a list or array? Consider the following Perl/Raku code:

my @foo = (0, 1, 2, 3, 4, 5);

We can easily access each of the values by using array notation:

print @names[0]; # 0
print @names[1]; # 1
print @names[2]; # 2

But what if we wanted to do stuff to this list of numbers? That’s a bit trickier. Functional programmers would probably say “But you have map!”. That’s true, of course. If I wanted to double everything, I could say

@foo = map { $_ * 2}  @foo; # Perl
@foo = map { $_ * 2}, @foo; # Raku

But it could also be nice if I could just say

@foo *= 2;

This is where the superposition can be helpful. Now imagine we have another array and wanted to add it to our now doubled set of values in @foo

my @bar = (0,20,40,60,80,100);
@foobar = @foo + @bar;          # (12); wait what?  Recall that arrays in numeric context are the number of elements, or 6 here.

Your instinctive reaction might be to say that we’d want to end up with (0,22,44,66,88,110) which is simple enough to handle in a basic map or for loop (using the zip operator, Raku can do this simply as @foo Z+ @bar). But remember what a superposition means: anything done happens to all the values, so each value in @foo needs to be handled with each value in @bar, which requires at least two loops if done via map or for (the cross operator in Raku can do this simply as @foo X+ @bar). We actually want (0, 2, 4, 6, 8, 10, 20, 22, 24, 26, 28, 30, 40, 42, 44, 46, 48, 50 … ). More difficult, then, would be to somehow compare this value:

@foobar > 10;

There is no map method we can attach to @foobar to check its values against 10, we’d need to instead map the > 10 into @foobar. But by using superpositioning, we can painless do all of the above with a single use of map, for, or anything else that generates line noise:

use Quantum::Superposition;
my $foo = any (0, 1, 2, 3, 4, 5);    # superposition of 0..5
$foo *= 2;                           # superposition of 0,2,4,6,8,10
my $bar = any (0,20,40,60,80,100);
my $foobar = $foo + $bar;            # superposition of 0,2,4,6,8,20,22,24,26…
$foobar > 10;                        # True
$foobar > 200;                       # False
$foobar < 50;                        # True
$foobar < 0;                         # False

In fact, comparison operators are where the power of superpositions really shine. Instead of checking if a string response is an an array of acceptable responses, or using a hash

The Raku proposal

In the original proposal, there were two types of superpositions possible: all and any. These were proposed to work exactly as described above (creating a single scalar value out of a list of values), with their most useful feature being evident when interpreted in a boolean context. For example, in the code

my $numbers = all 1,3,5,7,9;
say "Tiny"  if $numbers < 10;     # Tiny
say "Prime" if $numbers.is-prime; # (no output)

For those wishing to obtain the values, he proposed the using the sub eigenstates, which would retrieve the states without forcing it to collapse to a single one. The rest of the RFC argues why superpositions should not be left in module space, as even the Dr. Conway’s work had limitations that he himself readily admitted — namely, interacting with everything that assumes a single value for a scalar and (auto)threading. The former should be fairly obvious why it would be difficult for the Quantum::Superposition module to work perfectly outside of core, because “the use of superpositions changes the nature of subroutine and operator invocations that have superpositions as arguments”.² As well, if we had a superposition of a million values, doing each operation one by one on computers with multiple processors seems silly: it should be possible to take advantage of the multiple processors. While this seems like an obvious proposition today, we must recall the multicore processors were simply not common in the consumer market when the proposal was made. (Intel’s Pentium D chips didn’t arrive until 2005, IBM’s PowerPC970 MP in 2002.) By placing it in core, things can just work as intended and, in the rare event that a module author cares about receiving superimposed values, they could provide special support.

The Raku implementation

For the most part, RFC 225 was well received and expanded in scope. The most obvious change is the name. In the final implementation, Raku calls these superimposed values junctions. But on a practical level, two additional keywords were added, none and one which provide more options to those using the junctions.³A wildly different — and useful — option was added to provide syntax to create the junctions. Instead of using any 1,2,3, one can also write 1 | 2 | 3, and in lieu of all 1,2,3 it’s possible to write 1 & 2 & 3. Different situations might give rise to using one or the other form, which aids the Perl & Raku philosophy of TIMTOWTDI.

One feature that did not make the cut was the ability to introspect the current states. As late as 2009, it seems it was still planned (based on this issue), but at some point, it was taken out, probably because the way that junctions work means that any methods called on them ought to be fully passed through to their superimposed values, so it would be weird to have a single method that didn’t. Nonetheless, by abusing some of the multithreading that Raku does with junctions, it’s still possible if one really wants to do it:

sub eigenstates(Mu $j) {
    my @states;
    -> Any $s { @states.push: $s }.($j);


Junctions are, despite their internal complexity and rarity in programming languages are something that are so well thought out and integrated into common Raku coding styles that most use them without any thought. Who hasn’t written a signature with a parameter like $foo where Int|Rat or @bar where .all < 256? Who prefers

if $command eq 'quit' || $command eq 'exit'

to these versions? (because TIMTOWTDI)

if $command eq 'quit'|'exit'
if $command eq any <quit exit>
if $command eq @bye.any

None of these are implemented with syntactical sugar for conditionals, though it may seem otherwise. Instead, at their core, is a junction. Dr. Conway’s RFC 225 is a prime example of a modest proposal that is so simultaneously both crazy and natural that, while it fundamentally changed how we wrote code, we haven’t even realized it.

  1. I am not a physicist, much less a quantum one. I probably made mistakes here. /me is not sorry.
  2. Maybe there’s a super convoluted way to still pull it off, but to my knowledge, he’s the only person who wrote an entire regex to parse Perl itself in order to add a few new keywords, so if he deems it not possible… I’m gonna go with it’s not possible.
  3. Perhaps in the future others could be designed, such as at-least-half. The sky’s the limit after all in Raku.

Cover image by Sharon Hahn Darlin, licensed under CC-BY 2.0

RFC 168, by Johan Vromans: Built-in functions should be functions

Proposed on 27 August 2000, frozen on 20 September 2000, which was a generalization of RFC 26: Named operators versus functions proposed on 4 August 2000, frozen on 28 August 2000, also by Johan Vromans.

Johan’s proposal was to completely obliterate the difference between built-in functions, such as abs, and functions defined by the user. In Perl, abs can be called both as a prefix operator (without parentheses), as well as a function taking a single argument.

You see, Perl has this concept of built-in functions that are slightly different from “normal” subroutines for performance reasons. In Perl, as in Raku, the actual name of a subroutine, is prefixed with an ‘&‘. In Perl, you can take a reference to a subroutine with ‘\‘, but that doesn’t work for built-in functions.

Nowadays, in Raku, the difference between a subroutine taking a single positional argument, and a built-in prefix operator whose name is acceptable as an identifier, is already minimal. Well, actually absent. Suppose we want to define a prefix operator foo that has the same semantics as abs:

sub foo(Numeric:D $value) {
    $value < 0 ?? -$value !! $value

say abs -42;  # 42
say foo -42;  # 42

say abs(-42); # 42
say foo(-42); # 42

You can’t really see a difference, now can you? Well, the reason is simple: in Raku, there is no real difference between the foo subroutine, and the abs prefix operator. They’re both just subroutines: just look at the definition of the abs function for Real numbers.

But how does that function for infix operators? Those aren’t surely subroutines as well in Raku? How can they be? Something like “+” is not a valid identifier, so you cannot define a subroutine with it?

The genius in the process from the RFC to the implementation in Raku, has really been the idea to give a subroutine that represents an infix operator, a specially formatted name. In the case of infix + operator, the subroutine is known by the name infix:<+>. And if you look at its definition, you’ll see that it is actually quite simple: the left hand side of the infix operator becomes the first positional argument, and the right hand side the second positional argument. So something like:

say 42 + 666;

is really just syntactic sugar for:

say infix:<+>(42, 666);

Does this apply to all built-in operators in Raku? Well, almost. Some operators, such as ||, or, && and and are short-circuiting. This means that the value on the right hand side, might not be evaluated if the left hand side has a certain value.

A simple example using the say function (which always returns True):

say "foo" or say "bar"; # foo

Because the infix or operator sees that its left hand side is already True, it will not bother to evaluate the right hand side, and thus will not print “bar”. There is currently no way in Raku to mimic this short-circuiting behaviour in “ordinary” subroutines. But this will change when macro’s will finally also become first-class citizens in Raku land. Which is expected to be happening in the coming year as part of Jonathan Worthington‘s work on the RakuAST grant.

Going back to the original RFC, it also mentions:

In particular, it is desired that every built-in
- can be overridden by a user defined subroutine;
- can have a reference taken;
- has a useful prototype.

So, let’s check that those points:

can be overridden by a used defined subroutine

OK, so infix operators have a special name. So what happens if I declare a subroutine with that name? Well, let’s try:

sub infix:<+>(\a, \b) { a + b }
say 42 + 666;

Hmmm… that doesn’t show anything, that just hangs! Well, yeah, because we basically have a case of a subroutine here calling itself without ever returning!

This code example eats about 1GB of memory per second, so don’t do that too long unless you have a lot of memory available!

The easiest fix would be to not use the infix ‘+‘ operator in our version:

sub infix:<+>(\a, \b) { sum a, b }
say 42 + 666;  # 708

But what if we want to refer to original infix:<+> logic? It’s just a subroutine after all! But where does that subroutine live? Well, in the core of course! And for looking up things in the core, you use the CORE:: PseudoStash:

sub infix:<+>(\a, \b) {
    say "plussing";
    CORE::<&infix:<+>>(a, b)
say 42 + 666; # plussing\n708

You look in the CORE:: pseudostash for the full name of the infix operator: CORE::<&infix:<+>> will then give you the subroutine object of the core’s infix + operator, and you can call that as a subroutine with two parameters.

So that part of the RFC has been implemented!

can have a reference taken

For the infix + operator, that would be &infix:<+>, as basically is shown in the example above. You could actually store that in a variable, and use that later in an expression:

my $foo = &infix:<+>;
say $foo(42,666);  # 708

Note that contrary to Perl, you do not need to take a reference in Raku. Since everything in Raku is an object, &foo and &infix:<+> are just objects as well. You can just use them as they are. So literally this part of the RFC could never be implemented because Raku does not have reference. But for the use case, which is obtaining something that can be called, the RFC has also been implemented.

has a useful prototype

Perl’s prototypes basically morphed into Raku’s Signatures. But that’s at least one blog post all by itself. So for now, we just say that the “prototypes” of Perl in 2000 turned into signatures in Raku. And since you can ask for a subroutine’s signature:

sub foo(\a, \b) { }
say &foo.signature;  # (\a, \b)

You can also do that for the infix + operator:

say &infix:<+>.signature;  # ($?, $?, *%)

Hmmm… that looks different? Well, yes, it does a bit, but what is mainly different is that both positional parameters are optional. And that any named parameters will also be accepted. As to why that is, that’s really the topic of a yet another blog post about meta-operators. Which we’ll also leave for another time.


RFC’s 168 and 26 have been implemented completely, although maybe not in the way the original RFC’s envisioned. In a way that nowadays just feels very natural. Which allows us to build further, on the shoulders of giants!

RFC 145, by Eric J. Roode: Brace-matching for Perl Regular Expressions

Problem and proposal

The RFC 145 calls for a new regex mechanism to assist in matching paired characters like parentheses, ensuring that they are balanced. There are many “paired characters” in more or less daily use: (), [], {}, <>, «», "", '', depending on your local even »«, or in the fancy world of Unicode additionally ⟦⟧ and many, many more. In this article I will take up the RFC’s title and call all of them “braces”.

For example, consider the string ([b - (a + 1)] * 7). We might wish to extract all the subformulas

  • [b - (a + 1)] * 7,
  • b - (a + 1),
  • a + 1

from it, all of which are surrounded by a matching pair of braces using a global match. The reader is invited to try to write such a regex now.

The RFC author Eric Roode notes that this was still quite difficult in Perl in the year 2000. The task splits into two parts:

  1. Determining for an opening bracket what is its closing counterpart.
  2. Keeping track of the nesting levels and matching braces at each level.

The first subtask becomes hairy in a regex when there are multiple options for the opening bracket. The second subtask is hard for a more profound reason which goes by the name of “Dyck language“. The Dyck language is the set of all strings of properly paired parentheses (with hypothetical contents between them erased). It is the prototypical example of a language in the computer-science sense which is not regular but still context-free, meaning that it somehow needs a stack to keep track of nesting levels. Of course, regexes are more powerful than computer-scientific regular expressions but this fact may still justify why this is a difficult thing to do. Eric Roode recognized the gap between how easy this very common task in parsing structured data should be and how easy it is and wrote an RFC.

He proposed a pragma use matchpairs to solve subtask № 1 by providing a map from opening to closing braces. Pragmas are activated in a lexical scope and influence all regex matches in it. For subtask № 2 two new regex metacharacters were proposed, \m and \M for matching and remembering corresponding braces. Using these hooks, the nesting level business is offloaded onto the regex engine.

Spec and solution

RFC 145 is marked “developing”, meaning that it was not fully addressed in the Perl 6, and now Raku, specification. (Apocalypse 5 on pattern matching includes a response to RFC 145.) But there have been related improvements which I am going to use in this section to show how the problem posed in the beginning might be handled in Raku today.

The idea of using a pragma to set up a table of valid braces and then using “brace” regex metacharacters was not implemented, but the regex language was to be redesigned anyway and the designers extrapolated from brace matching and created a new regex operator for nesting structures, the tilde. This operator is used like this:

anon regex { '(' ~ ')' <body> }

and it achieves two things: it transposes body and closing brace so that the two delimiters are close to each other, even when <body> is long, and it sets up error reporting for when the closing brace was not found.

We can use this new feature to slightly improve the regex structure and get error reporting for free, but it does not keep track of nesting levels of the parentheses and it does not compute the closing brace for us if there had been multiple options for the opening one.

To compute the closing brace, it would suffice to have a way to capture the opening brace and pass it to a function whose return value is dynamically interpolated into the regex. This is now easy in Raku regexes and grammars:

grammar Formula {
    # Registry of understood braces.
    constant %braces =
        '(' => ')',
        '[' => ']',
        '{' => '}',

    # A parametric token which matches the closing brace
    # corresponding to its argument.
    token closing ($opening) {

    rule braced {
        $<opening>=@(%braces.keys) ~ <closing($<opening>)>
          [ <expr> {} ]

    rule expr {
        [ <:Letter>+ || <:Number>+ || <braced> ]+ % <[+*/-]>

The crucial part is rule braced.¹ We capture the opening brace and then later ask for its corresponding closing brace from a lookup in the %braces map.² The @(%braces.keys) interpolation of a list invokes longest-token matching, so it will DWIM when multiple braces with overlapping prefixes are present.

Notice that the mutually recursive use of the <expr> and <braced> rules ensures correct nesting of braces without needing a dedicated gear for this in the regex engine. It falls out of Raku’s improved regex structuring and reusing facilities. It is time for a test:

grammar Formula { … }
sub braced-subexprs ($expr) { … }

braced-subexprs Q|([b - (a + 1)] * 7)|;
-- ([b - (a + 1)] * 7) ---------------------------------------------------------
Braces: ( * ) ||| Subexpr: a + 1
Braces: [ * ] ||| Subexpr: b - (a + 1)
Braces: ( * ) ||| Subexpr: [b - (a + 1)] * 7


In summary, brace matching is obviously useful in parsing structured data. It was proposed by Eric Roode to make this simple in Perl 6 / Raku. Although the feature was not implemented in the proposed form, the task has indeed become easier to accomplish and the code much easier to read, notably due to the new regex syntax and grammar support.


If, like me, you are slightly bothered by the static brace table but are fine with heuristics, then the Unicode Consortium may be an unexpected ally. The Unicode Bidi_Mirroring_Glyph property gives hints about bidirectional writing, that is putting text on the screen when multiple scripts are involved, some of which write left-to-right and others right-to-left. Raku has built-in support for Unicode properties and we can use this one to let the Unicode Consortium pick closing braces for us:

    sub unicode-mirror ($_) {
        join '', .comb.reverse.map: {
                or .self

    token closing ($opening) {
        "{ unicode-mirror($opening) }"

    regex braced {
        $<opening>=<:Symbol + :Punctuation>+ ~ <closing($<opening>)>
        [ <expr> {} ]

The &unicode-mirror heuristic splits the argument into characters, reverses their order and then either picks its mirroring glyph, if one is defined, or leaves the character as-is, then reassembles them into a string. This function successfully turns <{ into }>, for example.

braced was tweaked in two regards: it accepts any sequence of symbols and punctuation as opening braces now and it has been turned into a regex for full backtracking power when it is too greedy in consuming opening braces.

With these tweaks, we can go nuts and have the grammar do free association and match everything that “looks like a brace pair”:

-- ([b - (a + 1)] * 7) ---------------------------------------------------------
Braces: ( * ) ||| Subexpr: a + 1
Braces: [ * ] ||| Subexpr: b - (a + 1)
Braces: ( * ) ||| Subexpr: [b - (a + 1)] * 7

-- (=^123^=) -------------------------------------------------------------------
Braces: (=^ * ^=) ||| Subexpr: 123

-- <<<123>> --------------------------------------------------------------------

-- >123< -----------------------------------------------------------------------
Braces: > * < ||| Subexpr: 123

-- >123> -----------------------------------------------------------------------

-- <{ (a + <b>) / !c! / e * »~d~« }> -------------------------------------------
Braces: < * > ||| Subexpr: b
Braces: ( * ) ||| Subexpr: a + <b>
Braces: ! * ! ||| Subexpr: c
Braces: »~ * ~« ||| Subexpr: d
Braces: <{ * }> ||| Subexpr: (a + <b>) / !c! / e * »~d~«


The function used to report braced subexpressions is this:

sub braced-subexprs ($expr) {
    # Get all submatches of the C<braced> subrule.
    class BracedCollector {
        has @.braced-subexprs;

        method braced ($/) {
            push @!braced-subexprs, $/

        method braced-subexprs {
            @!braced-subexprs.unique(as => *.pos)

    say "-- $expr ", '-' x (76 - $expr.chars);

    my BracedCollector $collect .= new;
    say "FAILED" and return
        unless Formula.parse($expr, :rule<expr>, :actions($collect));

    for $collect.braced-subexprs -> $/ {
        say "Braces: $<opening> * $<closing> ||| Subexpr: $<expr>";

¹ In case you are wondering about the use of an empty block in [ <expr> {} ], this is due to an implementation detail in Rakudo’s regex engine which does not make the capture $<opening> available to a later subrule closing unless it is forced to. The empty block is one way to force it; cf. RT#111518 and DOC#3478.

² The essential feature of interpolating back the return value of a function call closing $<op> which may depend on previous captures was added, to the best of my knowledge, also around the year 2000 (so about the time this RFC was posted), to Perl 5.6, in this case with the spelling (??{closing $+{op}}).

RFC 137: Perl OO should not be fundamentally changed.

Now, as you have read the title and already stopped laughing… Er, not all of you yet? Ok, I’ll give you another minute…

Good, let’s be serious now. RFC 137 was written by Damian Conway. Yes, the title too. No, I’m serious! Check it yourself! And then also read other RFCs from language-objects category. Turns out, it was the common intention back then: don’t break things beyond necessary, keep everything as backward compatible as possible.

A familiar stance, isn’t it?

I chose this RFC not for its title but because it might be considered the source of the river we now call the Raku OO model.

Let’s look closer at the RFC text. To be frank, this is only the second time I have read it. Gosh, three weeks ago I had no idea Perl 6 started with RFCs! My long-time belief was that the synopses were the first to come. And since even they are now pretty much outdated in many places, studying the RFCs is like studying the first stone tools of Homo habilis: there is not much in common with what we use nowadays, but how great it is to see the ways the human mind adapts and improves its own ideas! So, let’s get back to our archeological excavation and start examining our sample.

The first prominent feature we find states that:

It ain’t broken. Don’t fix it.

Here and below all quotes are from the RFC body.

And you know what? Back then it was my consideration too! All I really wanted was class and method keywords, private declarations… And that’s basically all. Perl with classes, what else?

Heh, young and stupid, aha…

The way I see it now? It’s still not broken. But neither is it good. And once you try it in Raku, there is no way back.

Perl’s current OO model has a number of well-known deficiencies: lack of (easy) encapsulation, poor support for hierarchical method calls (especially constructors and destructors), limited (single) dispatch mechanism, poor compile-time checking. More fundamentally, many people find that setting up reliable OO class hierarchies requires too much low-level coding.

Fairly long list, isn’t it? The good thing: these days there is the Moose family of toolkits providing solutions for many of the listed problems. Not all of them, though. But the very existence of these toolkits tells a lot about Perl’s flexibility, which was really something outstanding back then, in the late 90s. Even though Moose didn’t exist yet in 2000, there were already a few OO toolkits implementing different approaches.

The bad thing: all of them are external solutions. Yes, CPAN. Yes, easy to install. Still…

This is one of the aspects where Raku shines absolutely: every single issue from the list above has been taken care of. And even some problems beyond the listed ones. Eventually, this is what made Raku almost totally backward incompatible with Perl. But do I feel pity about it? I certainly do not!

Later in RFC text Damian writes:

The non-prescriptive, non-proscriptive nature of Perl’s OO model makes it possible to construct an enormous range of OO systems within the one language

And you know what? You can do it in Raku too! Yes, you can create your own OO model from scratch by utilizing Raku’s powerful meta-object protocol (MOP) capabilities. Admittedly, I won’t be discussing this matter in this article. But I can give you a hint:

say EXPORTHOW::<class>.^name; # Perl6::Metamodel::ClassHOW

Congratulations! We just found the class which is responsible for Raku’s class keyword. It is possible to implement your own keyword, say, myclass (sorry for the commonplace name), and make declarations like this possible:

myclass Foo { }

And make sure that the behavior of myclass type objects is different from class. How different is totally up to one’s imagination and demands.
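To give an impression of what such a declarator involves, here is a minimal sketch. The names MyClassHOW and the compose message are my own invention; the registration pattern is modeled on how ecosystem modules such as OO::Monitors declare their own class-like keywords:

```raku
# Hypothetical sketch of registering a custom declarator from a module.
# MyClassHOW is an invented name for illustration.
my class MyClassHOW is Metamodel::ClassHOW {
    method compose(Mu \type, |c) {
        note "composing a myclass: {type.^name}";
        nextsame
    }
}

# Exporting this from a module is what makes `myclass Foo { }`
# available to code that `use`s that module.
my module EXPORTHOW {
    package DECLARE {
        constant myclass = MyClassHOW;
    }
}
```

This is a module fragment, not a complete working slang; the point is only that the keyword-to-metaclass mapping lives in ordinary, user-writable code.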

Don’t like it? Create your own slang and extend the Raku grammar with it! Why not have something like:

myclass Foo;
   private Int attr1;
   public Str attr2;

It’s possible. The only remark to make: the power of Raku makes this kind of trick unnecessary most of the time.

Yet, the ability to declare your own class-like keyword which provides some kind of specialized functionality proves to be extremely useful, for example for creating an ORM with an exceptional level of flexibility and ease of use!

A polite cough from the audience reminds me that it’s time to change subject. My apologies, the MOP is my all-time fad!

Of course, the RFC is not about critics but proposals. Let’s see why I think that this one is the source of many things we have in the Raku language today.

A private keyword that lexically scopes hash keys to the current package

As Perl’s OO wasn’t planned for a drastic change in the v5 to v6 transition, it was supposed to remain based upon blessed hashes. This is not true for Raku. Why and how are questions not for an article but rather for a small book. Anyway, the idea of private class members is here, even though not the keyword private itself:

class Foo {
    has $!private;
    has $.public;
    method !only-mine { }
    method for-anyone { self!only-mine }
}
my $foo = Foo.new;
$foo.only-mine; # Error
say $foo.public;
say $foo.private; # Error

Instead of over-verbose public/private declarations, Raku uses concise twigil notation: . for public and ! for private. But what’s even more interesting is that the only difference between the two declared attributes is that $.public automatically receives an accompanying method public, which is the accessor for the attribute. Because, as a matter of fact, in Raku all attributes are actually private! Simplifying a bit, it can be said that the only thing a class exposes to the world is its public methods.
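A quick way to see this is to check what a class actually exposes. The Counter class below is my own illustration:

```raku
class Counter {
    has $!hidden = 0;   # truly private: no accessor is generated
    has $.count  = 42;  # also stored privately, but an accessor method is generated
}

my $c = Counter.new;
say $c.count;                                        # 42, via the generated accessor
say Counter.^methods.grep(*.name eq 'count').elems;  # 1: the accessor exists
say Counter.^methods.grep(*.name eq 'hidden').elems; # 0: nothing exposed
```

The `$.count` declaration produced a method; `$!hidden` produced nothing visible from the outside.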

A new special subroutine name — SETUP — to separate construction from initialization.

In fact, Raku’s object construction model is based upon three special methods: BUILDALL, BUILD, and TWEAK. The latter is the evolution of the SETUP method idea.
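A tiny illustration of the order in which these hooks fire during construction (the class name is mine):

```raku
my @order;
class Tracked {
    submethod BUILD { @order.push('BUILD') }  # attribute initialization
    submethod TWEAK { @order.push('TWEAK') }  # post-construction touch-up
}

Tracked.new;
say @order;  # [BUILD TWEAK] — BUILDALL drives BUILD first, then TWEAK
```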

But! (There is often a but in Raku.) The three are not the kind of methods we are used to thinking of. They are submethods, which are the sole property of the class or role that declares them. What this means in practice is:

class Foo {
    submethod foo { say "foo!" }
    method bar { say "bar!" }
}
class Bar is Foo { }
Foo.foo; # foo!
Bar.bar; # bar!
Bar.foo; # Error: no such method

Submethods are a kind of tool ideally suited for performing tasks totally specific to the class/role. Precisely like the construction/destruction tasks.

  • Changes to the semantics of bless so that, after associating an object with a class, the class’s SETUP methods are automatically called on the object. An additional trailing @ parameter for bless, to allow arguments to be passed to SETUP methods.
  • Changes to the semantics of DESTROY, so that all inherited destructors are, by default, automatically called when an object is destroyed.

Yes and no. Back then, when the RFC was written, the low-level architecture of Perl 6 hadn’t even started to be discussed. Thus many ideas were based on the Perl 5 design, in which the core is written in C and bundled with a bunch of modules. Correspondingly, bless is a core thing doing some kind of magic to the data provided as an argument. It seemed natural to teach bless a few additional tricks to get the desired outcome.

In Raku everything is different in this area. First of all, the language specification doesn’t prescribe exactly how the construction/destruction stages are to be implemented; it only demands that the constructor/destructor methods be supported and invoked in a specific order with specific parameters. So, the “Yes” at the beginning of the previous paragraph relates to the fact that the automatic invocation does take place and the arguments are passed from a call to the method new to the construction submethods:

class Foo {
    has $.foo;
    method TWEAK(:$foo) {
        $!foo = $foo * 10 if $foo < 10;
    }
}
say Foo.new(foo => 5).foo; # 50

But the “No” above relates to the fact that there is no low-level bless subroutine responsible for how things are done. For example, the way the Rakudo compiler implements the specification of object construction basically boils down to something like this incomplete pseudo-code:

method new(*%attrinit) {
    self.bless(|%attrinit)
}
method bless(*%attrinit) {
    nqp::create(self).BUILDALL(Empty, %attrinit)
}

So bless is no more than just an ordinary pre-defined method. If necessary, you can override it in your class:

class Foo {
    has $.foo;
    method bless(:$foo, *%c) {
        nextwith(|%c, foo => $foo * 2)
    }
}
say Foo.new(foo => 4).foo; # 8

The code will work because this part of the construction logic is partially implemented by class Mu from which all Raku classes indirectly inherit by default. So, if no special care is taken, when you do Foo.new it means the method new from Mu is invoked.

Besides, in Rakudo’s scenario, all the “magic” of object initialization eventually happens within the Mu::BUILDALL method, which uses the MOP to determine all the steps to be done to get an instance of Foo properly set up and ready for use.

Of all the above steps, only the nqp::create call is served by the low-level core executable (virtual or bytecode machine, in Rakudo terms), and its purpose is to allocate and initialize memory for an object representation.

Pre- and post-condition specifiers, which associate code blocks with particular subroutine/method names.

This idea never got developed. Instead, Raku got something way more powerful! A concept which applies not only to routines but to any object. And don’t forget: in Raku everything is an object! The concept I’m talking about is the trait.

A trait is a routine which gets invoked at code compilation time and receives the object it is applied to as its argument. It’s also possible to pass your own arguments to the trait, which can use the full power of the MOP to set up or even alter the object the way the user needs. For example:

class Foo {
    has $.foo is rw is default(42);
}

In this snippet, is default(42) is a simple example of passing an argument to a trait. Its meaning is to specify the value the attribute gets initially and every time it is assigned Nil.
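The effect is easy to observe; here is a small self-contained sketch (the class name is mine):

```raku
class Conf {
    has $.answer is rw is default(42);
}

my $conf = Conf.new;
say $conf.answer;    # 42: the default applies initially
$conf.answer = 7;
say $conf.answer;    # 7
$conf.answer = Nil;  # assigning Nil restores the default
say $conf.answer;    # 42
```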

is rw makes the attribute writable because, by default, all attributes in Raku are read-only. Remember I wrote earlier that everything is an object? Attributes are no exception! RO or RW status of an attribute is determined by… well, by an attribute value on an Attribute object! Thus, our is rw trait is actually as simple as:

multi sub trait_mod:<is>(Attribute:D $attr, :rw($)!) {
    $attr.set_rw;
    warn "useless use of 'is rw' on $attr.name()" unless $attr.has_accessor;
}

Just ignore all the syntax used here and consider the use of the set_rw() method. That’s basically all that’s needed to make an attribute writable.
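The flag that set_rw() flips can be inspected through the MOP; the Point class below is my own example:

```raku
class Point {
    has $.x is rw;
    has $.y;
}

# Walk the Attribute objects and report their RW status.
for Point.^attributes -> $attr {
    say "{$attr.name} is rw: {?$attr.rw}";
}
# $!x is rw: True
# $!y is rw: False
```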

If we now get back to the pre- and post-condition specifiers mentioned in the RFC, they also can be implemented with traits using method wrap of Method objects.
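As a sketch only: the trait name pre and the whole Account setup below are my invention, but they show the wrap-based approach the paragraph above describes:

```raku
# Hypothetical trait: `is pre(&check)` makes the method die unless
# the check passes for its arguments. Built on Routine.wrap.
multi sub trait_mod:<is>(Method $m, :$pre!) {
    $m.wrap(method (|args) {
        die "pre-condition failed for {$m.name}" unless $pre(|args);
        callsame
    });
}

class Account {
    has $.balance is rw = 0;
    method withdraw($amount) is pre(-> $amount { $amount > 0 }) {
        $!balance -= $amount;
    }
}

my $acc = Account.new(balance => 100);
$acc.withdraw(30);
say $acc.balance;     # 70
# $acc.withdraw(-5);  # would die: pre-condition failed for withdraw
```

Post-conditions could be sketched the same way by checking the result of callsame before returning it.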

I will now skip a few of the following items in the RFC list we’re walking over. Some were not implemented, some are just self-evident. Multiple dispatch is itself worth an article, and I hope somebody will pick it up for this advent calendar.

Let’s just fast-forward directly to the last one:

A new pragma — delegation — that would modify the dispatch mechanism to automatically delegate specific method calls to specified attributes of an object.

This is just another example where traits came to the rescue. There is no delegation pragma in Raku, but there is a trait named handles (already mentioned in a previous article):

class Book {
    has Str  $.title;
    has Str  $.author;
    has Str  $.language;
    has Cool $.publication;
}
class Product {
    has Book $.book handles('title', 'author', 'language', year => 'publication');
}

I chose it since it’s another example of a very elegant solution where no core intervention is needed to implement some advanced functionality. In two words, the only thing handles does is install new methods on the class to which its attribute belongs. Of course, it takes into account some edge cases and tries to optimize things where possible. But otherwise there is no magic in it. There is nothing you wouldn’t be able to do yourself!
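To see the delegation in action, here is a self-contained variation of the classes above (the sample data is mine):

```raku
class Book {
    has Str  $.title;
    has Str  $.author;
    has Cool $.publication;
}
class Product {
    has Book $.book handles('title', 'author', year => 'publication');
}

my $product = Product.new(
    book => Book.new(title => 'Camelia', author => 'A. Nonymous', publication => 2020),
);
say $product.title;  # Camelia — forwarded to $product.book.title
say $product.year;   # 2020 — 'year' mapped onto 'publication'
```

The Pair form year => 'publication' shows that the installed method doesn’t even have to share the name of the method it delegates to.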

This is what I’d like to conclude this article with. Years ago the word magic was kind of trendy among Perl developers. “Here we do some magic” – and then something really unexpected would happen. It was a lot of fun!

However, recently I found an interesting definition of what magic is: a kind of action a person performs to achieve a result which doesn’t logically follow from the action itself.

Sorry for the perhaps clumsy translation, but I hope it reflects the idea behind the definition. And it makes a good point: magic, while being fun, is not good for production.

In Raku the magic is eliminated. Instead, Raku brought in such a level of uniformity among the different levels of code that often the first impression of “wow, this is magical!” is soon replaced with “wow, it’s so logical!”

But you know what? When you put everything together and look at the language as a whole, it creates even bigger magic which can easily enchant you once and forever.


RFC 112 by Richard Proctor: Assignment within a regex

Richard wanted to

Provide a simple way of naming and picking out information from a regex without having to count the brackets.

I can say without hesitation that Raku (and before its rename, Perl 6) has achieved this goal — but all the details are different than proposed.

The reason is two-fold.

For one, Richard assumed a pretty straightforward extension to Perl 5’s regex syntax, and his proposed syntax, (?$hours=..), for a named capture made sense there. Instead, Raku regex syntax is pretty much a new thing, where all non-alphanumeric characters are potentially metacharacters, and thus either used or reserved for one purpose or another. This made an easier syntax than (?$name=regex) available for named captures.

The second is even more profound: The Raku designers realized that regexes could only be truly powerful if reuse was built in from the ground up. And the best way to make that happen was to make them first class.

I want to dwell on that point a bit: consider the power of functions (and closures) as first-class citizens in modern programming languages. Lisp has shown us what you can do with them, and now basically every programming language has got them. Dynamic languages like Perl, Ruby, Javascript and Python were pretty early adopters, modern statically typed languages like C# and F# also got them; even Java caught up eventually. Java didn’t even have functions, just methods, and now it’s got closures that you can pass around.

In my humble opinion, raising regexes to the level of first-class citizens and introducing a concise call syntax gave regexes a similar boost.

In the old days, it was common wisdom that you cannot parse XML (or other arbitrarily nested languages) with regexes, because such languages are not regular in the computer science sense. Perl 5 has some workarounds for that, but they are so clunky and verbose that I haven’t even seen them recommended much, and my general impression is that if you use them, it’s just for the lack of good alternatives.

Not so in Raku: <subrule> in a regex calls another regex called subrule, and so you have recursion (and, relevant to the discussion of RFC 112, named captures). This recursion moves regexes from regular into context-free language territory in the Chomsky Hierarchy. But more than recursion, the named regexes allow much easier reuse, testing in isolation and all that other wonderful stuff that first-classiness gave to functions. It also moved the sentiment towards parsing XML and other languages with regexes from “are you serious?” to “sure, it’s the best tool”.
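A minimal sketch of that recursion, parsing arbitrarily nested parentheses (Parens is my own example name):

```raku
grammar Parens {
    token TOP   { <group> }
    token group { '(' [ <group> | <-[()]>+ ]* ')' }  # group calls itself
}

say Parens.parse('(a(b)(c(d)))').so;  # True: nesting handled by recursion
say Parens.parse('(oops').so;         # False: unbalanced input fails
```

A regular expression in the strict sense cannot do this; the recursive subrule call is exactly what moves us into context-free territory.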

The call syntax <subrule> implies a named capture, and it turns out that’s convenient enough that explicit named captures (not tied to a call) are actually pretty rare in real-world parsers. An explicit syntax for that exists though, it’s $<capturename>=[...].
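For completeness, a small example of that explicit capture form (the date string is my own):

```raku
if 'Dec 2020' ~~ / $<month>=[ \w+ ] \s+ $<year>=[ \d+ ] / {
    say ~$<month>;  # Dec
    say ~$<year>;   # 2020
}
```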

Which brings us to the second interesting bit: RFC 112 doesn’t just talk about named captures, but implies that they are directly stored into variables of the same name.

This is problematic for a variety of reasons:

  • Scoping. A regex can be declared and used at two very different parts of the program. Forcing the variable to be in scope would make the capture syntax a source of variables with an unnecessarily large scope (dare I say global?), which is a clear anti-pattern.
  • Quantifiers. In RFC 112 syntax, what would have happened with a regex like (?$char=.)+ matching the string abc? What’s in $char? The sigil implies a scalar, so… maybe the last capture, c? And throw away all the other matches? Doesn’t sound too good. Or maybe (?@char=.) would have an array with all captures, but then, when writing a regex, you’d have to know if anybody later wants to use it inside a quantifier. Not stellar either.
  • Composition. Binding matches to a variable assumes the regex is used as a top-level construct, and not part of larger thing.
  • Recursion. Do I even need to elaborate? Probably not.
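For contrast, this is how Raku resolves the quantifier problem from the list above: a named capture under a quantifier simply accumulates a list of Match objects, no extra sigil required:

```raku
if 'abc' ~~ / [ $<char>=[ . ] ]+ / {
    say $<char>.elems;  # 3: one Match per repetition
    say ~$<char>[0];    # a
    say ~$<char>[2];    # c
}
```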

The solution that’s implemented in Raku now is more suited to a world of first-class regexes: for each regex match there’s a Match object. The top-level match object is stored in the variable $/, so accessing a named capture key is $/<key>, and there’s even a shorthand for that, $<key>. Just two characters longer than the originally proposed $key.

This solution shines, though, when used in the context of composition, for example. Since a named capture corresponds to a regex match, it’s also a Match object, and so we arrive at a tree of matches (all alike). Or, rephrased: a regex match already is a syntax tree.
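That tree is easy to observe with a small grammar (ISODate is my own example):

```raku
grammar ISODate {
    token TOP   { <year> '-' <month> '-' <day> }
    token year  { \d ** 4 }
    token month { \d ** 2 }
    token day   { \d ** 2 }
}

my $m = ISODate.parse('2020-12-25');
say $m<year>;        # ｢2020｣
say $m<year>.^name;  # Match: every named capture is itself a Match object
```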

I think this RFC is a good example of how a real pain point was identified and a solution proposed. Aspects of that solution have survived the language design process, but most details haven’t, because the language changed far more than merely becoming Perl 5 plus a few extensions through RFCs.

I have been participating in the Perl 6 project since around the year 2007, and have watched some of these transformations; for regexes, the majority of the consolidation and redesign work had already been done. I watched the implementations become more powerful, and even helped a little here and there.

Living in this process was a magical experience, just as magical as the result is now.