20 years ago tomorrow

20 years ago, on the first of August, the inception of a language started to, well, incept.

Actually, it started a bit earlier than that. Perl was in need of change, so it was decided that the community itself should propose what the language needed to do to go forward one step, from Perl 5 to Perl 6. A call for requests for change was made; every request should include possible changes to Perl as well as, if possible, an implementation proposal laying out how to proceed. The procedure didn’t lack criticism, but in general it was received with such enthusiasm that August 1st already saw the first RFC, pretty much at the same time as some instructions from Larry Wall on how to actually proceed.

The rest is history. It looks a bit like sacred history, since those RFCs were picked up (and apart) by Larry Wall’s apocalypses, explained later by Damian Conway’s exegeses, and roasted in the synopses, which eventually became the roast repository, the actual specification of the language.

Which is now called Raku. But that’s another story.

To celebrate this part of the history and the people that brought us where we are now, starting tomorrow, we’ll publish 20 articles, one a day, that will focus on one or a few RFCs and show what they eventually became in today’s Raku. So come back every day for a piece of Raku, of history, and of Raku history!

RFC 190, by Damian Conway: NEXT pseudoclass for method redispatch

In his series of object orientation RFCs, Perl/Raku luminary Damian Conway includes a proposal for method redispatch, RFC 190, which is the subject of today’s article.

On method dispatch

Perl has a pseudoclass named SUPER which an object can use to invoke a method of its parent class that it has overridden. It looks approximately like this:

sub dump_info {
    my $self = shift;           # obtain invocant
    $self->SUPER::dump_info;    # first dump parent
    say $self->{derived_info};  # then ourselves
}

In this example, taken loosely from the RFC, we define a method dump_info in a derived class which dumps the info of its parent class and then anything that itself added to the object, exemplified by the derived_info attribute. Conway notes that this breaks down under multiple inheritance because SUPER will only dispatch to the first parent class of $self and once you go SUPER you can’t go back. Supposing that all dump_info methods in parent classes are similarly implemented, only the family bonds indicated in the below diagram with double lines would be traversed, resulting in lots of info potentially undumped:

Grand11  Grand12      Grand21  Grand22
   ║        │            │        │
   ╟────────┘            ├────────┘
   ║                     │
Parent1               Parent2
   ║                     │
   ╟─────────────────────┘
   ║
Derived

One might think that to get this right, each class needs to dispatch to all of its parent classes somehow, which would be akin to a post-order traversal of the inheritance tree. This is correct insofar as it models the relevance of methods in the inheritance tree, supposing that left parents are more important than right ones.

The NEXT pseudoclass

Conway’s proposal is subtly different. Namely, he proposes to add a new pseudoclass named NEXT which is to be used just like SUPER above and which should, instead of continuing in the parent of the current package, resume the original method dispatch process with the next appropriate candidate as if the current one had not existed. Then, method redispatch is performed with respect to the original object’s class and sees the entire inheritance tree instead of cutting off all other branches below SUPER. Effectively, this offloads the responsibility of redispatching from each specific class onto the runtime method dispatch mechanism.

Grand11══Grand12══╗   Grand21══Grand22
   ║        │     ║      ║        │
   ╟────────┘     ║      ╟────────┘
   ║              ║      ║
Parent1           ╚═══Parent2
   ║                     │
   ╟─────────────────────┘
   ║
Derived

Concretely, when calling dump_info on an object blessed into the Derived class, there is an array of possible candidates. They are the methods of the same name of Derived, Parent1, Grand11, Grand12, Parent2, Grand21 and Grand22, in order of relevance. Given this array, each dump_info implementation just has to redispatch to the single next method in line. It is on the runtime to keep enough data around to continue the dispatch chain.

Notably, this mechanism can also provide an implementation of RFC 8, which is about the special method AUTOLOAD. AUTOLOAD is called as a fallback when some method name could not be resolved. Redispatching via NEXT can be used to decline to autoload a method in the current class and leave that task to another AUTOLOAD in the inheritance tree.

The Raku implementation

While the status of RFC 190 is “frozen”, hence accepted, this feature looks
different in Raku today. There is no NEXT and even SUPER is gone.
Instead we have three types of redispatch keywords:

  • callsame, callwith: calls the next candidate for the method, either
    using the same arguments or with the other, given ones.
  • nextsame, nextwith: the same as the call* keywords, except
    they do not return control to the current method.
  • samewith: calls the same candidate again with different arguments.

callsame and callwith implement the process that NEXT would have, but in the wider context of all dispatch-related goodies that Raku got. They work in all places that have a linearized hierarchy of “callable candidates”. This includes redispatch of methods along the inheritance tree, candidates for multi methods and subs, FALLBACK (erstwhile AUTOLOAD) and wrapped routines. Another speciality: if you ever find you have to redispatch from the current method but in another context, for example inside a Promise, the nextcallee keyword can be used to obtain a Callable which can continue the dispatch process from anywhere.
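As a rough illustration of the three keyword families, here is a minimal sketch using multi subs rather than methods. The names describe and factorial are made up for this example and do not come from the RFC.

```raku
# Hypothetical multi candidates; Raku orders them from most to least specific.
proto describe(|) {*}

multi describe(Any $x) { "Any: $x" }

multi describe(Int $x) {
    # callwith redispatches with new arguments and returns control here.
    my $rest = callwith($x + 1);
    "Int: $x, then $rest"
}

multi describe(Str $x) {
    # nextsame hands over to the next candidate and never comes back.
    nextsame;
    'never reached'
}

proto factorial(|) {*}
multi factorial(Int $n where * <= 1) { 1 }
# samewith restarts the dispatch from scratch with different arguments.
multi factorial(Int $n)              { $n * samewith($n - 1) }

say describe(1);       # Int: 1, then Any: 2
say describe('foo');   # Any: foo
say factorial(5);      # 120
```

The Int candidate gets control back after the Any candidate ran, whereas the Str candidate simply steps aside.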

We change NEXT:: to callsame and, after some localizations, our dump_info method looks like this and does not omit any parent class’s methods anymore:

method dump-info {
    callsame;            # first dump next
    put $!derived-info;  # then ourselves
}
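To see the whole dispatch chain at work, here is a sketch of the seven-class hierarchy from the diagrams. As an assumption made for easier inspection, each method appends its class name to a shared @log array instead of printing, and it records itself before redispatching, so the log follows the candidate order.

```raku
# Every class records itself, then redispatches to the next candidate.
class Grand11 { method dump-info(@log) { @log.push('Grand11'); callsame } }
class Grand12 { method dump-info(@log) { @log.push('Grand12'); callsame } }
class Grand21 { method dump-info(@log) { @log.push('Grand21'); callsame } }
class Grand22 { method dump-info(@log) { @log.push('Grand22'); callsame } }

class Parent1 is Grand11 is Grand12 {
    method dump-info(@log) { @log.push('Parent1'); callsame }
}
class Parent2 is Grand21 is Grand22 {
    method dump-info(@log) { @log.push('Parent2'); callsame }
}
class Derived is Parent1 is Parent2 {
    method dump-info(@log) { @log.push('Derived'); callsame }
}

my @log;
Derived.new.dump-info(@log);
say @log.join(' ');
# Derived Parent1 Grand11 Grand12 Parent2 Grand21 Grand22
```

Moving callsame in front of the push in every method would record the classes in the reverse order, parents first.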

In summary, being able to redispatch to a parent class’s method is useful. In a situation with multiple inheritance and multi subs, the fixation on the “parent class” is less helpful and is replaced by “less specific method”. Conway’s proposal to redispatch to the next less specific method made it into Raku and its usefulness is amplified way beyond the RFC by other Raku features and the careful design connecting them.

Curiously, a NEXT module was first shipped as a core module with Perl v5.7.3, released in 2002, written by… Damian Conway.


There is more than one way to dump info

For the particular pattern used as an example in the RFC, where each class in the inheritance tree independently throws in its own bit, there is another way in Raku to accomplish the same as redispatch. This is the method call operator .* (or its stricter variant .+, which dies if no candidate method is found). Consider

# Parent and grandparent classes look the same...

class Derived is Parent1 is Parent2 {
    method dump-info { put "Derived" }
}

Each dump-info method is only concerned with its own info and does not redispatch. The .* methodop walks the candidate chain and calls all of them, returning a list of return values (although in this case we care only about the side effect of printing to screen):

Derived.new.*dump-info
# Derived
# Parent1
# Grand11
# Grand12
# Parent2
# Grand21
# Grand22

The methods are called from most to least relevant, in pre-order of the inheritance tree. This way we can keep our dump-info methods oblivious to redispatch, whereas explicitly redispatching using callsame and co. would allow us to choose between pre- and post-order.
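A small sketch of the difference between the two operators, using two hypothetical classes (Base and Mixed) and a method name that neither of them defines:

```raku
class Base          { method tag { 'Base'  } }
class Mixed is Base { method tag { 'Mixed' } }

my $obj = Mixed.new;

# .* calls every candidate and collects the return values.
my @tags = $obj.*tag;
say @tags.join(', ');    # Mixed, Base

# With no candidate at all, .* quietly returns an empty list ...
my @none = $obj.*missing;
say @none.elems;         # 0

# ... whereas .+ insists on at least one candidate and throws otherwise.
my $outcome = (try { $obj.+missing; 'survived' }) // 'died';
say $outcome;            # died
```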

RFC 22: Control flow: Builtin switch statement, by Damian Conway

The problem

C has switch/case, and many other languages either copied it, or created a similar construct. Perl in 2000 didn’t have any such thing, and this was seen as a lack.

A Tale of Two Languages

This RFC not only became two (related) features, it did so in both Perl and Raku with dramatically different results: in Perl it’s widely considered the biggest design failure of the past two decades, whereas in Raku it’s an entirely non-controversial feature. The switch that Perl ended up with works very similarly to the original proposal. This is actually helpful in analysing what changes were necessary to make it a successful feature. It looks something like this (in both languages):

given $foo {
    when /foo/ {
        say "No dirty words here";
    }
    when "None" {

    }
    when 1 {
        say "One!";
    }
    when $_ > 42 {
        say "It's more than life, the universe and everything";
    }
}

The switch is actually two features (for the price of only one RFC): smartmatch and given/when on top of it. Smartmatch is an operator ~~ that checks if the left hand side fits the constraints of the right hand side. Given/when is a construct that smartmatches the given argument to a series of when arguments (e.g. $given ~~ $when) until one succeeds.

However, one of the distinguishing features of Perl is that it doesn’t generally overload operators; instead, it has different operators for different purposes (e.g. == for comparing numbers and eq for comparing strings). Smartmatch, however, is inherently all about overloading. This mismatch is essentially the source of all the trouble of smartmatch in Perl. Raku, on the other hand, has an extensive type system and is not so dependent on type-specific operators (though it still has some for convenience), and hence its smartmatch is much more predictable.

Most obviously, the type system of Raku means that it doesn’t use a table of possibilities, but instead $left ~~ $right just translates to $right.ACCEPTS($left). The right-first semantics makes it a lot easier to reason about (e.g. matching against a number will always do numeric equality).

It means it can easily distinguish between a string and an integer, unlike Perl which has to guess what you meant: $foo ~~ 1 always means $foo == 1, and $foo ~~ "foo" always means $foo eq "foo". In Perl, $foo ~~ "1" would do a numeric match.
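The following sketch shows both points: any object with an ACCEPTS method can sit on the right-hand side of ~~, and the type of the right-hand side alone decides the kind of comparison. The EvenNumber class is a hypothetical matcher invented for this example.

```raku
# Hypothetical matcher: the right-hand side of ~~ decides what "fits" means.
class EvenNumber {
    method ACCEPTS($topic) { $topic ~~ Int && $topic %% 2 }
}

my $even = EvenNumber.new;
say 4 ~~ $even;          # True
say 3 ~~ $even;          # False
say $even.ACCEPTS(4);    # True, the desugared form of 4 ~~ $even

# A number on the right always means numeric equality,
# a string on the right always means string equality.
say '1.0' ~~ 1;          # True
say '1.0' ~~ '1';        # False
```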

But perhaps the most important rule is smartmatching booleans. Perl doesn’t have them, and this makes when so much more complicated than most people realize. The problem is with statements like $_ > 42, which need boolean logic. Perl solves this using a complex heuristic that no one really can remember (no really, no one). Most surprisingly, that means that when /foo/ does not use smartmatching (this becomes obvious when the left hand side is an arrayref).

Raku uses a very different method to solve this problem. In Raku, when always smartmatches. Smartmatching against a Bool (like in $_ > 42), will always return that bool, so $foo ~~ True always equals True. This enables a wide series of boolean expressions to be used as when condition without problems. It’s a much simpler, and surprisingly effective method of dealing with this challenge.
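A minimal sketch of that rule, with a hypothetical classify sub that mixes a literal and a boolean expression as when conditions:

```raku
sub classify($n) {
    given $n {
        when 1       { 'one' }                  # $_ ~~ 1, numeric equality
        when $_ > 42 { 'more than 42' }         # $_ ~~ True or $_ ~~ False
        default      { 'something else' }
    }
}

say classify(1);      # one
say classify(100);    # more than 42
say classify(7);      # something else

# Smartmatching against a Bool ignores the left-hand side entirely.
my $flag = False;
say 'anything' ~~ $flag;    # False
```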

Other uses

The other difference between smartmatching in Perl versus Raku is that it is actually used outside of given/when. In particular, selecting methods such as grep and first use it to great effect: @array.grep(1), @array.grep(Str), @array.grep(/foo/), @array.grep(&function), @array.grep(1..3), and @array.grep((1, *, 3)) all do what you probably expect them to do. Likewise it’s used in a number of other places where one checks if a value is part of a certain group or not, like the ending argument of a sequence (e.g. 1000, 1001 ... *.is-prime) and the flip-flop operators.
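A few of these uses side by side, as a sketch with made-up sample data (the variable names and the is-small sub are only for illustration):

```raku
my @mixed = 1, 'foo', 2.5, 'bar', 3;

my @ints    = @mixed.grep(Int);                 # type match
my @strings = @mixed.grep(Str);
my $rat     = @mixed.first(Rat);                # first element that matches
my @ba      = <foo bar baz>.grep(/ba/);         # regex match
my @ranged  = (0, 1, 2.5, 7).grep(1..3);        # range membership

sub is-small($x) { $x < 3 }
my @small   = (1, 5, 2).grep(&is-small);        # callable as matcher

# The sequence stops at the first element that the end condition accepts.
my $prime   = (1000, 1001 ... *.is-prime).tail;

say @ints.join(' ');       # 1 3
say @strings.join(' ');    # foo bar
say $rat;                  # 2.5
say @ba.join(' ');         # bar baz
say @ranged.join(' ');     # 1 2.5
say @small.join(' ');      # 1 2
say $prime;                # 1009
```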

Smartmatch is all about making code do what you mean, and it’s pretty useful and reusable for that.

RFC 43: Integrate BigInts (and BigRats) Support Tightly With The Basic Scalars

Intro

RFC 43, titled ‘Integrate BigInts (and BigRats) Support Tightly With The Basic Scalars’ was submitted by Jarkko Hietaniemi on 5 August 2000. It remains at version 1 and was never frozen during the official RFC review process.

Despite this somewhat “unofficial” seeming status, the rational (or Rat) numeric type, by default, powers all of the fractional mathematics in the Raku programming language today.

You might say that this RFC was “adopted, and then some.”

A legacy of imprecision

There is a dirty secret at the core of computing: most computations of reasonable complexity are imprecise. Specifically, computations involving floating point arithmetic have known imprecision dynamics even while the imprecision effects are largely un-mapped and under-discussed.

The standard logic is that this imprecision is so small in its individual expression – in other words, because the imprecision is so negligible when considered in the context of an individual equation, it is taken for granted that the overall imprecision of the system is “fine.”

Testing the accuracy of this gut feeling in most systems, however, would involve re-creating the business logic of that given system to use a different, more accurate representation for fractional values. This is a luxury that most projects do not get.

Trust the data, question the handwaves

What could be much more worrisome, however, is the failure rate of systems that do attempt the conversion. In researching this topic I almost immediately encountered two third party anecdotes. Both come from a single individual and arrived within minutes of broaching the floating point imprecision question.[1]

One related the unresolved issue of a company that cannot switch their accounting from floating point to integer math – a rewrite that must necessarily touch every equation in the system – without “losing” money. Since it would “cost” the company too much to compute their own accounting accurately, they simply don’t.

Another anecdote related the story of a health equipment vendor whose equipment became less effective at curing cancer when they attempted to migrate from floating point to arbitrary precision.

In a stunning example of an exception that just might prove the rule of “it’s a feature, not a bug”, it turned out that the tightening up of the equation ingredients resulted in a less effective dose of radiation because the floating point was producing systemic imprecisions that made a device output radiation beyond its designed specification.

In both cases it could be argued that the better way to do things would be to re-work the systems so that expectations matched reality. I do have much more sympathy for continued use of the life-saving imprecision than I do for a company’s management to prefer living in a dream world of made up money than actually accounting for themselves in reality, though.

In fact, many financing and accounting departments have strict policies banning floating point arithmetic. Should we really only be so careful when money is on the line?

Precision-first programming

It is probably safe to say that when Jarkko submitted his RFC for native support of high-precision bigint/bigrat types, he didn’t necessarily imagine that his proposal might result in the adoption of “non-big” rats (that is, arbitrary precision rationals shortened into typically playful Perl-ese) as the default fractional representation in Perl 6.[2]

Quoting the thrust of Jarkko’s proposal, with emphasis added:

Currently Perl ‘transparently’ starts using double floating point numbers when the numeric values grow too large for the native integer types (int, long, quad) can no more hold quantities that large. Because double floats are at their heart a lie, they cannot truly represent large numbers accurately. Therefore sometimes when the application would prefer to stay accurate, the use of ‘bigints’ (and for division, ‘bigrats’) would be preferable.

Larry, it seems, decided to focus on a different phrase in the stated problem: “because double floats are at their heart a lie”. In a radical break from the dominant performance-first paradigm, it was decided that the core representation of fractional values would default to the most precise available – regardless of the performance implications.

Perl has always had a focus on “losing as little information as possible” when it comes to your data. Scalars dynamically change shape and type based on what you put into them at any given time in order to ensure that they can hold that new value.

Perl has also always had a focus on DWIM – Do What I Mean. Combining DWIM with the “lose approximately nothing” principle in the case of division of numbers, Perl 6 would thus default to understanding your meaning to be a desire to have a precise fractional representation of this math that you just asked the computer to compute.

Likewise, Perl has also always had a focus on putting in curbs where other languages build walls. Nothing would force the user to perform their math at the “speed of rational” (to turn a new phrase) as they would have the still-essential Num type available.

In this sense, nothing was removed – rather, Rat was added and the default behavior of representing parsed decimal values was to create a Rat instead of a Num. Many languages introduce rationals as a separate syntax, thus making precision opt-in (a la 1r5 in J).

In Perl 6 (and thus eventually Raku), the opposite is true – those who want imprecision must opt-in to floating point arithmetic instead via an explicit construction like Num.new(0.2), type coercion of an existing object a la $fraction.Num, or via the short-hand value notation of 0.2e0.
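The default and the two opt-ins can be checked directly. This is a small sketch; the variable names are only for illustration.

```raku
say 0.2.^name;        # Rat, the default for decimal literals
say 0.2e0.^name;      # Num, opt-in via exponent notation
say 0.2.Num.^name;    # Num, opt-in via coercion
say (1/3).^name;      # Rat, division of integers stays precise

# The classic floating point surprise only shows up after opting in.
my $exact   = 0.1 + 0.2 == 0.3;
my $inexact = 0.1e0 + 0.2e0 == 0.3e0;
say $exact;           # True
say $inexact;         # False

# A Rat keeps its numerator and denominator around.
say (0.1 + 0.2).nude.join('/');    # 3/10
```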

A visual example thanks to a matrix named Hilbert

In pursuit of a nice example of programming concerns solved by arbitrary precision rationals that are a bit less abstract than the naturally unfathomable “we have no actual idea how large the problem of imprecision is in effect on our individual systems let alone society as a whole”, I came across the excellent presentation from 2011 by Roger Hui of Dyalog where he demonstrates a development version of Dyalog APL which included rational numbers.

In his presentation he uses the example of Hilbert matrices, a very simple algorithm for generating a matrix of any size that will be of notorious difficulty for getting a “clean” identity matrix (if that’s not clear at the moment, don’t worry, the upcoming visual examples should make this clear enough to us non-experts in matrix algebra).

Here is a (very) procedural implementation for generating our Hilberts for comparison (full script in this gist):[3]

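# The numerator "1" in each representation: a Num stays floating point,
# an Int divided by an Int yields a Rat.
my %TYPES = :Num(1e0), :Rat(1);
# Only the keys of %TYPES are valid representation names.
subset FractionalRepresentation of Str where {%TYPES{$^t}:exists};
sub generate-hilbert($n, $type) {
    my @hilbert = [ [] xx $n ];    # $n empty rows
    for 1..$n -> $i {
        for 1..$n -> $j {
            # H[i;j] = 1 / (i + j - 1); the numerator's type decides Num vs Rat
            @hilbert[$i-1;$j-1] = %TYPES{$type} / ($i + $j - 1);
        }
    }
    @hilbert
}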

One of the most important aspects of having rationals as a first-class member of your numeric type hierarchy is that only extremely minimal changes are required of the math to switch between rational and floating point.
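To make the "minimal changes" claim concrete, here is a compact sketch (not the script from the gist) in which the only difference between the two representations is the numerator passed in. The hilbert sub and its signature are assumptions made for this illustration.

```raku
# Build an n×n Hilbert matrix; $one is either 1 (Rat results) or 1e0 (Num results).
sub hilbert(Int $n, Numeric $one) {
    my @rows;
    for 1..$n -> $i {
        @rows.push: [ (1..$n).map(-> $j { $one / ($i + $j - 1) }) ];
    }
    @rows
}

my @rat = hilbert(3, 1);
my @num = hilbert(3, 1e0);

say @rat[1][2].^name;      # Rat
say @num[1][2].^name;      # Num
say @rat[1][2] == 1/4;     # True
```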

There is a danger, as Roger Hui notes in both his video and in a follow-up email to a query I sent about “where the rationals went”, that rational math will seep out into your application and unintentionally slow everything down. This is a valid concern that I will return to in just a bit.

Floating Hilbert

Here are the results of the floating point dot product between a Hilbert and its inverse – an operation that generally results in an identity matrix (all 0’s except for a straight diagonal of 1’s from the top left corner down to the bottom right).

Floating Hilbert
         1       0.5  0.333333      0.25       0.2
       0.5  0.333333      0.25       0.2  0.166667
  0.333333      0.25       0.2  0.166667  0.142857
      0.25       0.2  0.166667  0.142857     0.125
       0.2  0.166667  0.142857     0.125  0.111111
 
Inverse of Floating Hilbert
     25    -300     1050    -1400     630
   -300    4800   -18900    26880  -12600
   1050  -18900    79380  -117600   56700
  -1400   26880  -117600   179200  -88200
    630  -12600    56700   -88200   44100

Floating Hilbert ⋅ Inverse of Floating Hilbert
            1             0            0             0            0
            0             1            0  -7.27596e-12  1.81899e-12
  2.84217e-14  -6.82121e-13            1  -3.63798e-12            0
  1.42109e-14  -2.27374e-13  2.72848e-12             1            0
            0  -2.27374e-13  9.09495e-13  -1.81899e-12            1

All those tiny floating point values are “infinitesimal” – yet some human needs to choose at what point of precision we determine the cutoff. Since we “know” that the inverse dot product is supposed to yield an identity matrix for Hilberts, we can code our algorithm to translate everything below e-11 into zeroes.
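Such a cutoff might look like the following sketch. The snap-to-zero sub and its 1e-11 default are choices made for this illustration, and the sample row is taken from the output above. Raku also ships an approximately-equal operator, =~=, which compares within the tolerance stored in $*TOLERANCE.

```raku
# Replace everything closer to zero than the chosen cutoff with an exact zero.
sub snap-to-zero(@row, :$cutoff = 1e-11) {
    @row.map({ abs($_) < $cutoff ?? 0 !! $_ }).list
}

my @row     = 0, 1, 0, -7.27596e-12, 1.81899e-12;
my @cleaned = snap-to-zero(@row);
say @cleaned.join(' ');    # 0 1 0 0 0

# The built-in alternative: approximate equality within $*TOLERANCE.
my $close-enough = 0.1e0 + 0.2e0 =~= 0.3e0;
say $close-enough;         # True
```

Either way, a human has picked the threshold, which is exactly the decision the rational representation makes unnecessary.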

But what about situations that aren’t so certain? Some programmers undoubtedly write formal proofs of the safety of using a given cutoff – however I doubt any claims that this population represents a significant subset of programmers based on lived experience.

Rational Hilbert

With the rational representation of Hilbert, it’s a lot easier to see what is going on with the Hilbert algorithm as the pattern in the rationals clearly evokes the progression. This delivers an ability to “reason backwards” about the numeric data in a way that is not quite as seamless with decimal notation of fractions.[4]

Rational Hilbert
    1  1/2  1/3  1/4  1/5
  1/2  1/3  1/4  1/5  1/6
  1/3  1/4  1/5  1/6  1/7
  1/4  1/5  1/6  1/7  1/8
  1/5  1/6  1/7  1/8  1/9

Inverse (same output, trimmed here for space)

Rational Hilbert ⋅ Inverse of Rational Hilbert
  1  0  0  0  0
  0  1  0  0  0
  0  0  1  0  0
  0  0  0  1  0
  0  0  0  0  1

Notice the lack of ambiguity in both the initial Hilbert data set and the final output of the inverse dot product. There is no need to choose any threshold values, thus no programmer decisions come in between the result of the equation provided by the computer and the result of the equation as stored by the computer or presented to the user.

Consider again the fact that it is useless to do an equality comparison in floating point math – the imprecision of the underlying representation forces users to choose some decimal place to end the number they check against. There is no possible model that can take into account the total degree of impact this human interaction (in aggregate) creates on top of the known and quantifiable imprecisions that are usually the basis of the “it’s good enough” argument.

The cost of precision

In summary, the nice equations about maximum loss per equation are far from the entire story and it is in recognition of that fact that Raku does its best to ensure that the eventual user of whatever Raku-powered system is not adversely impacted by unintentional or negligent errors in the underlying math used by the programmer.

It nevertheless remains a controversial decision that has a significant impact on baseline Raku performance. Roger Hui’s warning about rational math “leaking out” into the business application is well-served by the timing results I get for the current implementations in Math::Matrix:

raku rationale-matrix.p6 --timings-only 8
Created 'Num' Hilbert of 8x8: 0.03109376s
Created 'Rat' Hilbert of 8x8: 0.0045802s
Starting 'Num' Hilbert test at 2020-08-10T08:38:20.147506+02:00
'Num' Hilbert inverse calculation: 4.6506456s
'Num' dot product: 4.6612038s
Starting 'Rat' Hilbert test at 2020-08-10T08:38:24.808709+02:00
'Rat' Hilbert inverse calculation: 5.0791889s
'Rat' dot product: 5.0791889s

Considering it is quite unlikely for the floating point version to be so close to the speed of the rational one, this simple benchmark appears to prove the case for Roger’s warning about rational math “leaking out” into the larger system and causing significant performance degradation.[5]

So even though we ourselves took the extra step to try and get floating point (im-)precision and thus speeds, we were thwarted in our attempts. I believe this is a valid concern for the plenty of use cases that benefit from floating point math without causing any harm.

Potential futures of even more rational and precise rationality

The new capabilities provided by the upcoming introduction of Raku AST include the ability to modify/extend the behavior of dispatch. It then becomes quite feasible for a user-space module to be able to tweak the behavior of core to the extent that a more forceful approach to using floating point representation could be applied.

In the current preliminary conceptual phase, the idea would be to provide means to test the necessity of rational representation to the outcome of a system. This could be achieved by making the underlying numeric type “promotion” logic configurable. It then becomes possible to imagine replacing the Rat type with a Bell representation that can track the overall loss of precision throughout the system.

If that is below a given threshold, the same configurability could be used to remove Rat from the hierarchy entirely, thus ensuring floating point performance.

Balancing concerns

The concerns of unintentional and/or unrecoverable performance degradation remain extremely valid and up until beginning the research on this I was un-concerned about rational performance.

This lack of concern relied on a huge caveat – that floating point performance is available and guarantee-able to anyone who wants to ensure its use throughout their system.

Unfortunately this is not the case in current Rakudo (i.e., the reference implementation of Raku).

I am still likely to move forward with a requirement of provable precision as a personal requirement when judging a system for whether it meets my expectations for “21st century.”

However, I now disavow my previous feelings that rational math as a default fractional representation is the best – or, if you’ll allow me to put it more playfully, most rational – route for a given language implementation to take.[6]

Conclusion

Defaulting to rational representation was not intended to force users to the performance floor of rationals, and it is a natural extension of Perlish philosophy to strive towards being even better at giving the user exactly – without surprises – what they want.

Rather, Raku chooses a precision-first philosophy because it fits into the overall philosophy of keeping the truest representation of a user’s demands as possible regardless of current computation cost – without locking them into the expectations of the language implementor either.

To the extent that Raku does not yet get this quite perfect, there remains a certain quality of rebellion and original thinking on this topic specifically and also throughout the language design that puts Raku quite close to providing powerful yet seamless mechanisms for choosing the best fractional representation for any given system.


  1. I say this because it implies that these are hardly the only two occasions where people have chosen an imprecise representation, have seen that imprecision manifest at significant scale, and have chosen to continue with the imprecise representation for mostly dubious reasons.
  2. To date I’m not aware of any other programming languages designed and implemented in the 21st century that take this same “precision-first” approach. It apparently remains a controversial/unpopular choice in language design.
  3. This is called from inside some glue that separates the timings and type checking from the underlying details.
  4. I patched a local version of Math::Matrix to print its rationals always as fractions, so your output will look different for the rational Hilbert (i.e. the same as fractional Hilbert) if you end up running the code in this gist.
  5. If the timings of those calculations depress you, it is worth pointing out that Raku developer lichtkind has developed a version of the library that produces timings like these instead.
    raku rationale-matrix.p6 --timings-only 8
    Created 'Num' Hilbert of 8x8: 0s
    Created 'Rat' Hilbert of 8x8: 0.006514s
    Starting 'Num' Hilbert test at 2020-08-10T11:06:34.567898+02:00
    'Num' Hilbert inverse calculation: 0.2784531s
    'Num' dot product: 0.2784531s
    Starting 'Rat' Hilbert test at 2020-08-10T11:06:34.846350+02:00
    'Rat' Hilbert inverse calculation: 0.06900818s
    'Rat' dot product: 0.0846446s

    It is likely that this work will be incorporated in the default release at some point. Both systems work much better but here the asymmetry of Num to Rat performance is looking even more suspicious than the near-parity of the slower, released version. I will do a deeper dive into profiling both versions of the module in a post on my personal blog at some point soon.

  6. I want to extend a deep and heartfelt thanks to not only Roger and his email but especially dzaima, ngn, Marshall, and the whole APL Orchard for helping me to arrive at a much deeper understanding of the varieties of class and proportion that are countervailing forces for any hard-line stance relating to rational arithmetic. My positions were worded poorly and argued brusquely and not a faithful match for your patience. I hope it is some consolation, at least, that you have changed my mind.

RFC 54, by Damian Conway: Operators: Polymorphic comparisons

This RFC was originally proposed on August 7th 2000 and frozen in six weeks.

It described a frustration with comparison operations in Perl. Because == compares its operands as numbers, and both strings numify to 0, the following expression is true:

"cat" == "dog"   #True

Perl (and now Raku) has excellent support for generic programming because of dynamic typing, generic data types and interface polymorphism. Just about the only place where that DWIM genericity breaks down is in comparisons. There is no generic way to specify an ordering on dynamically typed data. For example, one cannot simply code a generic BST insertion method. Using <=> for the comparison will work well for numeric keys, but fails miserably on most string-based keys because <=> will generally return 0 for most pairs of strings. Such code would work correctly, however, if <=> detected the string/string comparison and automagically used cmp instead.

The Raku implementation is as follows:

There are three built-in comparison operators that can be used for sorting. They are sometimes called three-way comparators because they compare their operands and return a value meaning that the first operand should be considered less than, equal to or more than the second operand for the purpose of determining in which order these operands should be sorted. The leg operator coerces its arguments to strings and performs a lexicographic comparison. The <=> operator coerces its arguments to numbers (Real) and does a numeric comparison. The aforementioned cmp operator is the “smart” three-way comparator, which compares strings with string semantics and numbers with number semantics.[1]
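A quick sketch of how the three operators differ on the same operands:

```raku
say 10 <=> 9;          # More: numeric comparison
say '10' <=> '9';      # More: strings are coerced to numbers first
say 10 leg 9;          # Less: numbers are coerced to strings, and '1' sorts before '9'
say 10 cmp 9;          # More: numbers are compared as numbers
say '10' cmp '9';      # Less: strings are compared as strings
say 'cat' cmp 'dog';   # Less
```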

The allomorph types IntStr, NumStr, RatStr and ComplexStr may be created as a result of parsing a string quoted with angle brackets…

my $f = <42.1>; say $f.^name; # OUTPUT: «RatStr␤»

cmp returns either an Order::Less, Order::Same or Order::More object. It compares Pair objects first by key and then by value.

For Lists, cmp compares element @a[$i] with @b[$i] (for some Int $i, beginning at 0) and returns Order::Less, Order::Same, or Order::More depending on if and how the values differ. If the operation evaluates to Order::Same, @a[$i + 1] is compared with @b[$i + 1]. This is repeated until one is greater than the other or all elements are exhausted. If the Lists are of different lengths, at most only $n comparisons will be made (where `$n = @a.elems min @b.elems`). If all of those comparisons evaluate to Order::Same, the final value is selected based upon which List is longer.

If $a eqv $b, then $a cmp $b always returns Order::Same. Keep in mind that certain constructs, such as Sets, Bags, and Mixes care about object identity, and so will not accept an allomorph as equivalent of its components alone.
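
To make the quoted descriptions a bit more concrete, here is a minimal sketch of how the three comparators behave on the same pair of values. The variable names are made up for illustration; only the built-in operators leg, <=> and cmp, and the angle-bracket allomorph syntax, come from the text above.

```raku
# Hypothetical sample values: the same two "numbers" in three guises.
my ($n1, $n2) = 10, 9;        # plain Ints
my ($s1, $s2) = "10", "9";    # plain Strs
my ($a1, $a2) = <10>, <9>;    # IntStr allomorphs

say $n1 <=> $n2;   # More  (numeric comparison)
say $n1 leg $n2;   # Less  (string comparison: "10" sorts before "9")
say $n1 cmp $n2;   # More  (cmp sees two numbers)
say $s1 cmp $s2;   # Less  (cmp sees two strings)
say $a1 cmp $a2;   # More  (allomorphs are compared numerically first)
```

The same distinction shows up when sorting, since sort uses cmp unless told otherwise.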

Now, we can leverage the Raku sort syntax to suit our needs:

my @sorted = sort { $^a cmp $^b }, @values;
my @sorted = sort -> $a, $b { $a cmp $b }, @values;
my @sorted = sort * cmp *, @values;
my @sorted = sort &infix:«cmp», @values;

And neatly avoid duplication in a functional style:

# sort case-insensitively
say sort { $^a.lc cmp $^b.lc }, @words;
#          ^^^^^^     ^^^^^^  code duplication
# sort case-insensitively
say sort { .lc }, @words;

So this solution to RFC54 smoothly combines many of the individual capabilities of Raku – classes, allomorphs, dynamic typing, interface polymorphism, and functional programming to produce a set of practical solutions to suit your coding style.

[1] This is ass-backwards from the RFC54 request with ‘cmp’ as the dynamic “apex”, degenerating to ‘<=>’ for Numeric or ‘leg’ for Lexicographic variants.

RFC 64: New pragma ‘scope’ to change Perl’s default scoping

Let’s talk about a fun RFC that mostly did not make its way into current day Raku, nor is it planned for later implementation.

This is about RFC 64 by Nathan Wiger. Let me quote the abstract:

Historically, Perl has had the default “everything’s global” scope. This means that you must explicitly define every variable with my, our, or its absolute package to get better scoping. You can ‘use strict’ to force this behavior, but this makes easy things harder, and doesn’t fix the actual problem.

Those who don’t learn from history are doomed to repeat it. Let’s fix this.

It seems use strict; simply has won, despite Nathan Wiger’s dislike for it.

Raku enables it by default, even for one-liners with the -e option; Perl 5 enables it with use 5.012 and later versions; and these days there’s even talk of enabling it by default in version 7 or 8.

I’d say the industry as a whole has moved in the direction of accepting the tiny inconvenience of having to declare variables in exchange for the massive benefit in safety and protection against typos and other errors. Hey, even JavaScript got a "use strict", and TypeScript enables it by default in modules. PHP 7 also got something comparable. The only holdout in the “no strict” realm seems to be Python.

With pretty much universal acceptance of required variable definitions, there was no compelling reason to improve implicit declaration, so no scope pragma was needed.

But… there’s always a “but”, isn’t there?

One of the primary motivations for not wanting to declare variables was laziness, and Raku did introduce several features that allow you to avoid some declarations:

  • Parameters in signatures are an implicit declaration. The RFC’s example sub square could be written in Raku simply as sub square($x) { $x * $x }. No explicit declaration necessary.
  • Self-declaring formal parameters with the ^ twigil also imply a declaration, for example sub square { $^x * $^x }.
  • There are many functional features in Raku that you can use to avoid explicit variables altogether, like meta operators, Whatever star currying etc.
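
The three points above can be sketched in a few lines. The names square-placeholder, @doubled and $total are hypothetical and only serve the illustration; note that the two my declarations are there to hold results, not to feed the computation.

```raku
# 1. A signature parameter is its own declaration.
sub square($x) { $x * $x }

# 2. A placeholder variable with the ^ twigil declares itself.
sub square-placeholder { $^x * $^x }

# 3. Whatever-currying and a reduction metaoperator need no loop variables.
my @doubled = (1, 2, 3).map(* * 2);
my $total   = [+] 1, 2, 3;

say square(7);              # 49
say square-placeholder(7);  # 49
say @doubled;               # [2 4 6]
say $total;                 # 6
```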

I am glad Raku requires variable declarations by default, and haven’t seen any code in the wild that explicitly states no strict;. And without declarations, where would you even put the type constraint?

RFC 5, by Michael J. Mathews: Multiline comments

This is the first RFC proposed related to documentation. It asks for a common feature in most of the modern programming languages: multiline comments.

The problem of not having multi-line comments is quite obvious: if you need to comment a large chunk of code, you need to manually insert a # symbol at the beginning of every line (in Raku). This can be incredibly tedious if you do not have, for instance, a text editor to do this with a shortcut or similar. This practice is very common in large code bases. For that reason, Michael refers to C++ and Java as

popular languages that were designed from the outset as being useful for large projects, implementing both single line and multiline comments

In those languages you can type comments as follows:

// single line of code

/*
 Several lines of code
*/

But, in addition, in Java you have a special multiline comment syntax¹ for writing documentation:

/**
* Here you can write the doc!
*
*/

A lot of people proposed POD as a solution to this problem, but Michael lists some drawbacks:

  • “it’s not intuitive”: given that POD is only used by Perl, people coming from different languages will face some struggles learning an entire new syntax.

From my point of view, this is not as big a problem, since POD6 syntax is quite simple and well documented. In addition, it is quite intuitive for newcomers: if you want a header, you use =head1, if you want italics, you use I<> and so on.

  • “it’s not documentation”: this one is still true. The main problem is that when you want to comment out a big chunk of code, that’s probably not documentation, so using =begin pod ... =end pod is a little weird.
  • “it doesn’t encourage consistency”: another problem of POD is that you can use arbitrary terms in its syntax:
    =begin ARBITRARYTEXT 
    ... 
    =end ARBITRARYTEXT

    While this behavior gives us a lot of freedom, it also complicates consistency across different projects and users.

After some discussion, Perl chose POD for implementing multiline comments. Nonetheless, Michael’s proposal was taken into account and Raku supports multiline comments similar to those of C++ and Java, but with a slightly different syntax:

#`[
Raku is a large-project-friendly
language too!
]
say ":D";

And as a curiosity, Raku has embedded comments, that is:

if #`( embedded comment ) True {
    say "Raku is awesome";
}
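
As a small aside to the “it’s not documentation” point, Pod in Raku also offers a dedicated comment block, which sits side by side with the embedded-comment syntax. The following is only a sketch; the variable $answer is made up for the example.

```raku
=begin comment
say "this line never runs";
Anything in here is skipped, whether it is code or prose.
=end comment

# An embedded comment can sit in the middle of a statement.
my $answer = #`( the usual one ) 42;
say $answer;   # 42
```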

In the end, as a modern, 100-year language, Raku gives you more than one way to do it, so choose whatever fits you best!


  1. It’s not really a multiline comment because you also need to type the * symbol at the beginning of every line.

RFC 225: Superpositions (aka Junctions)

Damian Conway is one of those names in the Perl and Raku world that almost doesn’t need explaining. He is one of the most prolific contributors to CPAN and was foundational in the design of Raku (then Perl 6). One of his more interesting proposals came in RFC225 on Superpositions, which suggested making the features of his Perl module Quantum::Superpositions available in the core of the language.

What is a Superposition?¹

In the quantum world, there are measurable things that can exist in multiple states — simultaneously — until the point in time in which we measure them. For computer scientists, perhaps the most salient application of this is in qubits which, as a core element of quantum computing, threaten to destroy encryption as we know it, if quantum supremacy is borne out.

At the end of the day, though, for us it means being able to treat multiple values as if they were a single value so long as we never actually need there to be only one, at which point we get a single state from them.

The Perl Implementation

In the original implementation, Dr. Conway adds two new operators, all and any. These converted a list of values into a single scalar value. How was this different from using a list or array? Consider the following Perl/Raku code:

my @foo = (0, 1, 2, 3, 4, 5);

We can easily access each of the values by using array notation:

print @foo[0]; # 0
print @foo[1]; # 1
print @foo[2]; # 2

But what if we wanted to do stuff to this list of numbers? That’s a bit trickier. Functional programmers would probably say “But you have map!”. That’s true, of course. If I wanted to double everything, I could say

@foo = map { $_ * 2}  @foo; # Perl
@foo = map { $_ * 2}, @foo; # Raku

But it could also be nice if I could just say

@foo *= 2;

This is where the superposition can be helpful. Now imagine we have another array and wanted to add it to our now doubled set of values in @foo

my @bar = (0,20,40,60,80,100);
@foobar = @foo + @bar;          # (12); wait what?  Recall that arrays in numeric context are the number of elements, or 6 here.

Your instinctive reaction might be to say that we’d want to end up with (0,22,44,66,88,110) which is simple enough to handle in a basic map or for loop (using the zip operator, Raku can do this simply as @foo Z+ @bar). But remember what a superposition means: anything done happens to all the values, so each value in @foo needs to be handled with each value in @bar, which requires at least two loops if done via map or for (the cross operator in Raku can do this simply as @foo X+ @bar). We actually want (0, 2, 4, 6, 8, 10, 20, 22, 24, 26, 28, 30, 40, 42, 44, 46, 48, 50 … ). More difficult, then, would be to somehow compare this value:

@foobar > 10;

There is no map method we can attach to @foobar to check its values against 10; instead, we’d need to map the > 10 onto @foobar. But by using superpositioning, we can painlessly do all of the above without a single use of map, for, or anything else that generates line noise:

use Quantum::Superpositions;
my $foo = any (0, 1, 2, 3, 4, 5);    # superposition of 0..5
$foo *= 2;                           # superposition of 0,2,4,6,8,10
my $bar = any (0,20,40,60,80,100);
my $foobar = $foo + $bar;            # superposition of 0,2,4,6,8,10,20,22,24,26…
$foobar > 10;                        # True
$foobar > 200;                       # False
$foobar < 50;                        # True
$foobar < 0;                         # False

In fact, comparison operators are where the power of superpositions really shines. Instead of checking whether a string response is in an array of acceptable responses, or using a hash, we can compare it against a superposition directly.

The Raku proposal

In the original proposal, there were two types of superpositions possible: all and any. These were proposed to work exactly as described above (creating a single scalar value out of a list of values), with their most useful feature being evident when interpreted in a boolean context. For example:

my $numbers = all 1,3,5,7,9;
say "Tiny"  if $numbers < 10;     # Tiny
say "Prime" if $numbers.is-prime; # (no output)

For those wishing to obtain the values, he proposed using the sub eigenstates, which would retrieve the states without forcing the superposition to collapse to a single one. The rest of the RFC argues why superpositions should not be left in module space, as even Dr. Conway’s own work had limitations that he himself readily admitted: namely, interacting with everything that assumes a single value for a scalar, and (auto)threading. It should be fairly obvious why the former would make it difficult for the Quantum::Superpositions module to work perfectly outside of core: “the use of superpositions changes the nature of subroutine and operator invocations that have superpositions as arguments”.² As for the latter, if we had a superposition of a million values, doing each operation one by one on computers with multiple processors seems silly: it should be possible to take advantage of the multiple processors. While this seems like an obvious proposition today, we must recall that multicore processors were simply not common in the consumer market when the proposal was made. (Intel’s Pentium D chips didn’t arrive until 2005, IBM’s PowerPC970 MP in 2002.) By placing it in core, things can just work as intended and, in the rare event that a module author cares about receiving superimposed values, they could provide special support.

The Raku implementation

For the most part, RFC 225 was well received and expanded in scope. The most obvious change is the name: in the final implementation, Raku calls these superimposed values junctions. On a practical level, two additional keywords were added, none and one, which provide more options to those using junctions.³ A wildly different (and useful) option was added to provide syntax to create the junctions. Instead of using any 1,2,3, one can also write 1 | 2 | 3, and in lieu of all 1,2,3 it’s possible to write 1 & 2 & 3. Different situations might give rise to using one or the other form, which fits the Perl & Raku philosophy of TIMTOWTDI.
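
A short sketch of the four junction types and the operator forms mentioned above may help. The variable names are invented for the example; so is used only to collapse each junction to a plain Bool for printing.

```raku
my $any = 1 | 2 | 3;    # same as any(1, 2, 3)
my $all = 1 & 2 & 3;    # same as all(1, 2, 3)

say so $any == 2;           # True:  at least one value equals 2
say so $all == 2;           # False: not every value equals 2
say so $all > 0;            # True:  every value is positive
say so one(1, 2, 3) == 2;   # True:  exactly one value equals 2
say so none(1, 2, 3) > 5;   # True:  no value exceeds 5

# A hypothetical command check using the operator form.
my $command = 'exit';
say "bye" if $command eq 'quit' | 'exit';   # bye
```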

One feature that did not make the cut was the ability to introspect the current states. As late as 2009, it seems it was still planned (based on this issue), but at some point, it was taken out, probably because the way that junctions work means that any methods called on them ought to be fully passed through to their superimposed values, so it would be weird to have a single method that didn’t. Nonetheless, by abusing some of the autothreading that Raku does with junctions, it’s still possible if one really wants to do it:

sub eigenstates(Mu $j) {
    my @states;
    -> Any $s { @states.push: $s }.($j);
    @states;
}

Conclusion

Junctions, despite their internal complexity and rarity in programming languages, are so well thought out and integrated into common Raku coding styles that most of us use them without a second thought. Who hasn’t written a signature with a parameter like $foo where Int|Rat or @bar where .all < 256? Who prefers

if $command eq 'quit' || $command eq 'exit'

to these versions? (because TIMTOWTDI)

if $command eq 'quit'|'exit'
if $command eq any <quit exit>
if $command eq @bye.any

None of these are implemented with syntactical sugar for conditionals, though it may seem otherwise. Instead, at their core, is a junction. Dr. Conway’s RFC 225 is a prime example of a modest proposal that is so simultaneously both crazy and natural that, while it fundamentally changed how we wrote code, we haven’t even realized it.


  1. I am not a physicist, much less a quantum one. I probably made mistakes here. /me is not sorry.
  2. Maybe there’s a super convoluted way to still pull it off, but to my knowledge, he’s the only person who wrote an entire regex to parse Perl itself in order to add a few new keywords, so if he deems it not possible… I’m gonna go with it’s not possible.
  3. Perhaps in the future others could be designed, such as at-least-half. The sky’s the limit after all in Raku.

Cover image by Sharon Hahn Darlin, licensed under CC-BY 2.0

RFC 168, by Johan Vromans: Built-in functions should be functions

RFC 168 was proposed on 27 August 2000 and frozen on 20 September 2000. It was a generalization of RFC 26: Named operators versus functions, proposed on 4 August 2000 and frozen on 28 August 2000, also by Johan Vromans.

Johan’s proposal was to completely obliterate the difference between built-in functions, such as abs, and functions defined by the user. In Perl, abs can be called both as a prefix operator (without parentheses), as well as a function taking a single argument.

You see, Perl has this concept of built-in functions that are slightly different from “normal” subroutines for performance reasons. In Perl, as in Raku, the actual name of a subroutine is prefixed with an ‘&’. In Perl, you can take a reference to a subroutine with ‘\’, but that doesn’t work for built-in functions.

Nowadays, in Raku, the difference between a subroutine taking a single positional argument, and a built-in prefix operator whose name is acceptable as an identifier, is already minimal. Well, actually absent. Suppose we want to define a prefix operator foo that has the same semantics as abs:

sub foo(Numeric:D $value) {
    $value < 0 ?? -$value !! $value
}

say abs -42;  # 42
say foo -42;  # 42

say abs(-42); # 42
say foo(-42); # 42

You can’t really see a difference, now can you? Well, the reason is simple: in Raku, there is no real difference between the foo subroutine, and the abs prefix operator. They’re both just subroutines: just look at the definition of the abs function for Real numbers.

But how does that function for infix operators? Those aren’t surely subroutines as well in Raku? How can they be? Something like “+” is not a valid identifier, so you cannot define a subroutine with it?

The genius in the process from the RFC to the implementation in Raku, has really been the idea to give a subroutine that represents an infix operator, a specially formatted name. In the case of infix + operator, the subroutine is known by the name infix:<+>. And if you look at its definition, you’ll see that it is actually quite simple: the left hand side of the infix operator becomes the first positional argument, and the right hand side the second positional argument. So something like:

say 42 + 666;

is really just syntactic sugar for:

say infix:<+>(42, 666);
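
Since operators are just subroutines with special names, they can be stored, passed around and even invented. The following is a minimal sketch; &add and the avg operator are hypothetical names made up for this example, not something from the RFC or the core.

```raku
# Store the core addition operator under an ordinary name.
my &add = &infix:<+>;
say add(40, 2);                        # 42

# Pass an operator to a higher-order routine like any other sub.
say (1, 2, 3, 4).reduce(&infix:<*>);   # 24

# Declare a brand-new infix operator as a plain subroutine.
sub infix:<avg>($a, $b) { ($a + $b) / 2 }
say 10 avg 20;                         # 15
```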

Does this apply to all built-in operators in Raku? Well, almost. Some operators, such as ||, or, && and and are short-circuiting. This means that the value on the right hand side, might not be evaluated if the left hand side has a certain value.

A simple example using the say function (which always returns True):

say "foo" or say "bar"; # foo

Because the infix or operator sees that its left hand side is already True, it will not bother to evaluate the right hand side, and thus will not print “bar”. There is currently no way in Raku to mimic this short-circuiting behaviour in “ordinary” subroutines. But this will change when macros finally also become first-class citizens in Raku land, which is expected to happen in the coming year as part of Jonathan Worthington’s work on the RakuAST grant.

Going back to the original RFC, it also mentions:

In particular, it is desired that every built-in
- can be overridden by a user defined subroutine;
- can have a reference taken;
- has a useful prototype.

So, let’s check those points:

can be overridden by a user defined subroutine

OK, so infix operators have a special name. So what happens if I declare a subroutine with that name? Well, let’s try:

sub infix:<+>(\a, \b) { a + b }
say 42 + 666;

Hmmm… that doesn’t show anything, that just hangs! Well, yeah, because we basically have a case of a subroutine here calling itself without ever returning!

This code example eats about 1GB of memory per second, so don’t do that too long unless you have a lot of memory available!

The easiest fix would be to not use the infix ‘+’ operator in our version:

sub infix:<+>(\a, \b) { sum a, b }
say 42 + 666;  # 708

But what if we want to refer to original infix:<+> logic? It’s just a subroutine after all! But where does that subroutine live? Well, in the core of course! And for looking up things in the core, you use the CORE:: PseudoStash:

sub infix:<+>(\a, \b) {
    say "plussing";
    CORE::<&infix:<+>>(a, b)
}
say 42 + 666; # plussing\n708

You look in the CORE:: pseudostash for the full name of the infix operator: CORE::<&infix:<+>> will then give you the subroutine object of the core’s infix + operator, and you can call that as a subroutine with two parameters.

So that part of the RFC has been implemented!

can have a reference taken

For the infix + operator, that would be &infix:<+>, as basically is shown in the example above. You could actually store that in a variable, and use that later in an expression:

my $foo = &infix:<+>;
say $foo(42,666);  # 708

Note that contrary to Perl, you do not need to take a reference in Raku. Since everything in Raku is an object, &foo and &infix:<+> are just objects as well. You can just use them as they are. So literally this part of the RFC could never be implemented, because Raku does not have references. But for the use case, which is obtaining something that can be called, the RFC has also been implemented.

has a useful prototype

Perl’s prototypes basically morphed into Raku’s Signatures. But that’s at least one blog post all by itself. So for now, we just say that the “prototypes” of Perl in 2000 turned into signatures in Raku. And since you can ask for a subroutine’s signature:

sub foo(\a, \b) { }
say &foo.signature;  # (\a, \b)

You can also do that for the infix + operator:

say &infix:<+>.signature;  # ($?, $?, *%)

Hmmm… that looks different? Well, yes, it does a bit, but what is mainly different is that both positional parameters are optional, and that any named parameters will also be accepted. As to why that is, that’s really the topic of yet another blog post about meta-operators. Which we’ll also leave for another time.

Conclusion

RFCs 168 and 26 have been implemented completely, although maybe not in the way the original RFCs envisioned, but in a way that nowadays just feels very natural. Which allows us to build further, on the shoulders of giants!

RFC 145, by Eric J. Roode: Brace-matching for Perl Regular Expressions

Problem and proposal

RFC 145 calls for a new regex mechanism to assist in matching paired characters like parentheses, ensuring that they are balanced. There are many “paired characters” in more or less daily use: (), [], {}, <>, «», "", '', depending on your locale even »«, or in the fancy world of Unicode additionally ⟦⟧ and many, many more. In this article I will take up the RFC’s title and call all of them “braces”.

For example, consider the string ([b - (a + 1)] * 7). We might wish to extract all the subformulas

  • [b - (a + 1)] * 7,
  • b - (a + 1),
  • a + 1

from it using a global match; each of them is surrounded by a matching pair of braces. The reader is invited to try to write such a regex now.

The RFC author Eric Roode notes that this was still quite difficult in Perl in the year 2000. The task splits into two parts:

  1. Determining for an opening bracket what is its closing counterpart.
  2. Keeping track of the nesting levels and matching braces at each level.

The first subtask becomes hairy in a regex when there are multiple options for the opening bracket. The second subtask is hard for a more profound reason which goes by the name of “Dyck language”. The Dyck language is the set of all strings of properly paired parentheses (with hypothetical contents between them erased). It is the prototypical example of a language in the computer-science sense which is not regular but still context-free, meaning that it somehow needs a stack to keep track of nesting levels. Of course, regexes are more powerful than computer-scientific regular expressions, but this fact may still explain why this is a difficult thing to do. Eric Roode recognized the gap between how easy this very common task in parsing structured data should be and how easy it actually was, and wrote an RFC.

He proposed a pragma use matchpairs to solve subtask № 1 by providing a map from opening to closing braces. Pragmas are activated in a lexical scope and influence all regex matches in it. For subtask № 2 two new regex metacharacters were proposed, \m and \M for matching and remembering corresponding braces. Using these hooks, the nesting level business is offloaded onto the regex engine.

Spec and solution

RFC 145 is marked “developing”, meaning that it was not fully addressed in the Perl 6 (now Raku) specification. (Apocalypse 5 on pattern matching includes a response to RFC 145.) But there have been related improvements which I am going to use in this section to show how the problem posed in the beginning might be handled in Raku today.

The idea of using a pragma to set up a table of valid braces and then using “brace” regex metacharacters was not implemented, but the regex language was to be redesigned anyway and the designers extrapolated from brace matching and created a new regex operator for nesting structures, the tilde. This operator is used like this:

anon regex { '(' ~ ')' <body> }

and it achieves two things: it transposes body and closing brace so that the two delimiters are close to each other, even when <body> is long, and it sets up error reporting for when the closing brace was not found.

We can use this new feature to slightly improve the regex structure and get error reporting for free, but it does not keep track of nesting levels of the parentheses and it does not compute the closing brace for us if there had been multiple options for the opening one.
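
Before moving on to the full grammar, here is a minimal sketch of the tilde operator on its own, for a single fixed pair of delimiters. The name $paren-number and the sample string are assumptions made for the example; only the success case is shown.

```raku
# '(' ~ ')' (\d+) matches an opening paren, then the digits, then the closing paren.
my $paren-number = / '(' ~ ')' (\d+) /;

if '(123)' ~~ $paren-number {
    say ~$0;   # 123
}
```

The delimiters sit next to each other in the source, while the captured body still ends up between them in the match.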

To compute the closing brace, it would suffice to have a way to capture the opening brace and pass it to a function whose return value is dynamically interpolated into the regex. This is now easy in Raku regexes and grammars:

grammar Formula {
    # Registry of understood braces.
    constant %braces =
        '(' => ')',
        '[' => ']',
        '{' => '}',
    ;

    # A parametric token which matches the closing brace
    # corresponding to its argument.
    token closing ($opening) {
        "%braces{$opening}"
    }

    rule braced {
        $<opening>=@(%braces.keys) ~ <closing($<opening>)>
          [ <expr> {} ]
    }

    rule expr {
        [ <:Letter>+ || <:Number>+ || <braced> ]+ % <[+*/-]>
    }
}

The crucial part is rule braced.¹ We capture the opening brace and then later ask for its corresponding closing brace from a lookup in the %braces map.² The @(%braces.keys) interpolation of a list invokes longest-token matching, so it will DWIM when multiple braces with overlapping prefixes are present.

Notice that the mutually recursive use of the <expr> and <braced> rules ensures correct nesting of braces without needing a dedicated gear for this in the regex engine. It falls out of Raku’s improved regex structuring and reusing facilities. It is time for a test:

grammar Formula { … }
sub braced-subexprs ($expr) { … }

braced-subexprs Q|([b - (a + 1)] * 7)|;
-- ([b - (a + 1)] * 7) ---------------------------------------------------------
Braces: ( * ) ||| Subexpr: a + 1
Braces: [ * ] ||| Subexpr: b - (a + 1)
Braces: ( * ) ||| Subexpr: [b - (a + 1)] * 7

Summary

In summary, brace matching is obviously useful in parsing structured data. It was proposed by Eric Roode to make this simple in Perl 6 / Raku. Although the feature was not implemented in the proposed form, the task has indeed become easier to accomplish and the code much easier to read, notably due to the new regex syntax and grammar support.

Encore!

If, like me, you are slightly bothered by the static brace table but are fine with heuristics, then the Unicode Consortium may be an unexpected ally. The Unicode Bidi_Mirroring_Glyph property gives hints about bidirectional writing, that is putting text on the screen when multiple scripts are involved, some of which write left-to-right and others right-to-left. Raku has built-in support for Unicode properties and we can use this one to let the Unicode Consortium pick closing braces for us:

    sub unicode-mirror ($_) {
        join '', .comb.reverse.map: {
            .uniprop('Bidi_Mirroring_Glyph')
                or .self
        }
    }

    token closing ($opening) {
        "{ unicode-mirror($opening) }"
    }

    regex braced {
        :sigspace
        $<opening>=<:Symbol + :Punctuation>+ ~ <closing($<opening>)>
        [ <expr> {} ]
    }

The &unicode-mirror heuristic splits the argument into characters, reverses their order and then either picks its mirroring glyph, if one is defined, or leaves the character as-is, then reassembles them into a string. This function successfully turns <{ into }>, for example.

braced was tweaked in two regards: it accepts any sequence of symbols and punctuation as opening braces now and it has been turned into a regex for full backtracking power when it is too greedy in consuming opening braces.

With these tweaks, we can go nuts and have the grammar do free association and match everything that “looks like a brace pair”:

-- ([b - (a + 1)] * 7) ---------------------------------------------------------
Braces: ( * ) ||| Subexpr: a + 1
Braces: [ * ] ||| Subexpr: b - (a + 1)
Braces: ( * ) ||| Subexpr: [b - (a + 1)] * 7

-- (=^123^=) -------------------------------------------------------------------
Braces: (=^ * ^=) ||| Subexpr: 123

-- <<<123>> --------------------------------------------------------------------
FAILED

-- >123< -----------------------------------------------------------------------
Braces: > * < ||| Subexpr: 123

-- >123> -----------------------------------------------------------------------
FAILED

-- <{ (a + <b>) / !c! / e * »~d~« }> -------------------------------------------
Braces: < * > ||| Subexpr: b
Braces: ( * ) ||| Subexpr: a + <b>
Braces: ! * ! ||| Subexpr: c
Braces: »~ * ~« ||| Subexpr: d
Braces: <{ * }> ||| Subexpr: (a + <b>) / !c! / e * »~d~«

Footnotes

The function used to report braced subexpressions is this:

sub braced-subexprs ($expr) {
    # Get all submatches of the C<braced> subrule.
    class BracedCollector {
        has @.braced-subexprs;

        method braced ($/) {
            push @!braced-subexprs, $/
        }

        method braced-subexprs {
            @!braced-subexprs.unique(as => *.pos)
        }
    }

    say "-- $expr ", '-' x (76 - $expr.chars);

    my BracedCollector $collect .= new;
    say "FAILED" and return
        unless Formula.parse($expr, :rule<expr>, :actions($collect));

    for $collect.braced-subexprs -> $/ {
        say "Braces: $<opening> * $<closing> ||| Subexpr: $<expr>";
    }
}

¹ In case you are wondering about the use of an empty block in [ <expr> {} ], this is due to an implementation detail in Rakudo’s regex engine which does not make the capture $<opening> available to a later subrule closing unless it is forced to. The empty block is one way to force it; cf. RT#111518 and DOC#3478.

² The essential feature of interpolating back the return value of a function call closing $<op> which may depend on previous captures was added, to the best of my knowledge, also around the year 2000 (so about the time this RFC was posted), to Perl 5.6, in this case with the spelling (??{closing $+{op}}).

RFC 137: Perl OO should not be fundamentally changed.

Now, as you have read the title and already stopped laughing… Er, not all of you yet? Ok, I’ll give you another minute…

Good, let’s be serious now. RFC 137 was written by Damian Conway. Yes, the title too. No, I’m serious! Check it yourself! And then also read other RFCs from language-objects category. Turns out, it was the common intention back then: don’t break things beyond necessary, keep everything as backward compatible as possible.

A familiar stance, isn’t it?

I chose this RFC not over its title but because it might be considered as the source of the river we now call the Raku OO model.

Let’s look closer into the RFC text. To be frank, this is only going to be the second time I read it. Gosh, three weeks ago I had no idea Perl6 even started with RFCs! My long-time belief was that the synopses came first. And since even they are now pretty much outdated in many places, studying RFCs is like studying the first stone tools of Homo habilis: there is not much in common with what we use nowadays, but how great it is to see the ways the human mind adapts and improves its own ideas! So, let’s get back to our archeological excavation and start examining our sample.

The first prominent feature we find states that:

It ain’t broken. Don’t fix it.

Here and below all quotes are from the RFC body.

And you know what? Back then it was my consideration too! All I really wanted was class and method keywords, private declarations… And that’s basically all. Perl with classes, what else?

Heh, young and stupid, aha…

The way I see it now? It’s still not broken. But neither is it good. And once you try it in Raku – there is no way back.

Perl’s current OO model has a number of well-known deficiencies: lack of (easy) encapsulation, poor support for hierarchical method calls (especially constructors and destructors), limited (single) dispatch mechanism, poor compile-time checking. More fundamentally, many people find that setting up reliable OO class hierarchies requires too much low-level coding.

Fairly long list, isn’t it? The good thing: these days there is the Moose family of toolkits, which offers solutions for many of the listed problems. Not all of them though. But the very existence of these toolkits tells a lot about Perl’s flexibility, which was really something outstanding back then, in the late 90s. Even though Moose didn’t exist yet in 2000, there were already a few OO toolkits implementing different approaches.

The bad thing: all of them are external solutions. Yes, CPAN. Yes, easy to install. Still…

This is one of the aspects where Raku absolutely shines: every single issue from the list above has been taken care of, and even some problems beyond the listed ones. Eventually, this is what made Raku almost totally backward incompatible with Perl. But do I regret it? I certainly do not!

Later in the RFC text Damian writes:

The non-prescriptive, non-proscriptive nature of Perl’s OO model makes it possible to construct an enormous range of OO systems within the one language

And you know what? You can do it in Raku too! Yes, you can create your own OO model from scratch by utilizing Raku’s powerful meta-object protocol (MOP) capabilities. Obviously, I won’t be discussing this matter in this article. But I can give you a hint:

say EXPORTHOW::<class>.^name; # Perl6::Metamodel::ClassHOW

Congratulations! We just found the class which is responsible for Raku’s class keyword. It is possible to implement your own keyword like, say, myclass (sorry for the commonplace name) and make it possible to have declarations like:

myclass Foo { }

And make sure that the behavior of the myclass kind of type objects is different from class. How different is totally up to one’s imagination and demands.

Don’t like it? Create your own slang and extend the Raku grammar with it! Why not have something like:

myclass Foo;
   private Int attr1;
   public Str attr2;
end

It’s possible. The only remark to make about it: the power of Raku makes these kinds of tricks unnecessary most of the time.

Yet, the ability to declare your own class-like keyword which provides some kind of specialized functionality proves to be extremely useful for creating an ORM with an exceptional level of flexibility and ease of use!

A polite cough from the audience reminds me that it’s time to change subject. My apologies, the MOP is my all-time fad!

Of course, the RFC is not about criticism but about proposals. Let’s see why I think that this one is the source of many things we have in the Raku language today.

A private keyword that lexically scopes hash keys to the current package

As Perl’s OO wasn’t planned to undergo a drastic change in the v5 to v6 transition, it was supposed to remain based upon blessed hashes. This is not true for Raku. Why and how are questions not for an article, but rather for a small book. Anyway, the idea of private class members is here, even though the keyword private itself is not:

class Foo {
    has $!private;
    has $.public;
    method !only-mine { }
    method for-anyone { self!only-mine }
}
my $foo = Foo.new;
$foo.for-anyone;
$foo.only-mine; # Error
say $foo.public;
say $foo.private; # Error

Instead of over-verbose public/private declarations, Raku uses concise twigil notation: . for publics and ! for privates. But what’s even more interesting is that the only difference between the two declared attributes is that $.public automatically receives an accompanying method public, which is the accessor for the attribute. Because, as a matter of fact, in Raku all attributes are actually private! Simplifying a bit, it can be said that the only thing a class exposes to the world is its public methods.
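To make the accessor point a bit more tangible, here is a minimal sketch. The class names WithTwigil and ByHand are made up for illustration; the second class only approximates by hand what the . twigil arranges automatically:

```raku
# Public-looking attribute: the accessor method is generated for us.
class WithTwigil {
    has $.value;
}

# Roughly the same thing spelled out: a private attribute,
# a BUILD submethod to initialize it, and a hand-written accessor.
class ByHand {
    has $!value;
    submethod BUILD(:$!value) { }
    method value { $!value }
}

say WithTwigil.new(value => 42).value;   # 42
say ByHand.new(value => 42).value;       # 42

# Even the "public" attribute is stored under its private name.
say WithTwigil.^attributes[0].name;      # $!value
```

In both cases the outside world only ever talks to the method value, never to the storage itself.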

A new special subroutine name — SETUP — to separate construction from initialization.

In fact, Raku’s object construction model is based upon three special methods: BUILDALL, BUILD, and TWEAK. The latter is the evolution of the SETUP method idea.

But! (There is often a but in Raku.) The three are not the kind of methods we are used to. They are submethods, which are the sole property of the class or role which declares them. What it means in practice is:

class Foo {
    submethod foo { say "foo!" } 
    method bar { say "bar!" }
}
class Bar is Foo { }
Foo.foo; # foo!
Bar.bar; # bar!
Bar.foo; # Error: no such method

Submethods are ideally suited for performing tasks totally specific to the class or role, which is precisely what construction and destruction tasks are.
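As a small sketch of why this matters for construction (the class names Base and Derived and the @log array are hypothetical), each class in a hierarchy can have its own TWEAK, and each one runs exactly once, from the parent down to the child:

```raku
my @log;

class Base {
    # Belongs to Base only; it is not inherited by Derived.
    submethod TWEAK { @log.push: 'Base' }
}

class Derived is Base {
    # Derived's own construction step.
    submethod TWEAK { @log.push: 'Derived' }
}

Derived.new;
say @log;   # [Base Derived]
```

Had these been ordinary methods, the child’s TWEAK would have overridden the parent’s, which is not what one wants from a constructor.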

  • Changes to the semantics of bless so that, after associating an object with a class, the class’s SETUP methods are automatically called on the object. An additional trailing @ parameter for bless, to allow arguments to be passed to SETUP methods.
  • Changes to the semantics of DESTROY, so that all inherited destructors are, by default, automatically called when an object is destroyed.

Yes and no. Back when the RFC was written, the low-level architecture of Perl 6 had not even started to be discussed. Thus many ideas were based on the Perl 5 design, in which the core is written in C and bundled with a bunch of modules. Correspondingly, bless is the core thing doing some kind of magic to the data provided as an argument. It seemed natural to teach bless a few additional tricks to get the desired outcome.

In Raku everything is different in this area. First of all, the language specification doesn’t prescribe exactly how the construction/destruction stages are to be implemented; it only demands that the constructor/destructor methods be supported and invoked in a specific order with specific parameters. So, the “Yes” at the beginning of the previous paragraph relates to the fact that the automatic invocation does take place and the arguments are passed from a call to the method new to the construction submethods:

class Foo {
    has $.foo;
    submethod TWEAK(:$foo) {
        $!foo = $foo * 10 if $foo < 10;
    }
}
say Foo.new(foo => 5).foo; # 50

But the “No” above relates to the fact that there is no low-level bless subroutine responsible for how things are done. For example, the way the Rakudo compiler implements the specification of object construction basically boils down to something like this incomplete pseudo-code:

method new(*%attrinit) {
    self.bless(|%attrinit)
}
method bless(*%attrinit) {
    nqp::create(self).BUILDALL(Empty, %attrinit)
}

So bless is no more than just an ordinary pre-defined method. If necessary, you can override it in your class:

class Foo {
    has $.foo;
    method bless(:$foo, *%c) {
        nextwith(|%c, foo => $foo * 2) 
    } 
}
say Foo.new(foo => 4).foo; # 8

The code will work because this part of the construction logic is partially implemented by class Mu from which all Raku classes indirectly inherit by default. So, if no special care is taken, when you do Foo.new it means the method new from Mu is invoked.

Besides, in Rakudo’s scenario all the “magic” of object initialization eventually happens within the Mu::BUILDALL method, which uses the MOP to determine all the steps to be done for you to get an instance of Foo properly set up and ready for use.

Of all the above steps, only the nqp::create call is served by the low-level core executable (the virtual or bytecode machine, in Rakudo terms); its purpose is to allocate and initialize memory for an object representation.

Pre- and post-condition specifiers, which associate code blocks with particular subroutine/method names.

This idea never got developed. Instead Raku has gotten something way more powerful! A concept which applies not only to routines but to any object. And don’t forget: in Raku everything is an object! The concept I’m talking about is the trait.

A trait is a routine which gets invoked at code compilation time and gets the object it is applied to as its argument. It’s also possible to pass your own arguments to the trait, which is able to use the full power of the MOP to set up or even alter the object the way the user needs. For example:

class Foo {
    has $.foo is rw is default(42);
}

In this snippet is default(42) is a simple example of passing an argument to a trait. The meaning is to specify the value the attribute will get initially and every time it gets assigned with Nil.
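A short sketch of that behaviour, reusing the Foo class from above (the variable name $obj is just for illustration). Note that the assignments go through the accessor, which works only because the attribute is also marked is rw:

```raku
class Foo {
    has $.foo is rw is default(42);
}

my $obj = Foo.new;
say $obj.foo;     # 42, the default is used when nothing was passed to new

$obj.foo = 7;
say $obj.foo;     # 7

$obj.foo = Nil;   # assigning Nil resets the attribute to its default
say $obj.foo;     # 42
```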

is rw makes the attribute writable because, by default, all attributes in Raku are read-only. Remember I wrote earlier that everything is an object? Attributes are no exception! RO or RW status of an attribute is determined by… well, by an attribute value on an Attribute object! Thus, our is rw trait is actually as simple as:

multi sub trait_mod:<is>(Attribute:D $attr, :rw($)!) {
    $attr.set_rw();
    warn "useless use of 'is rw' on $attr.name()" unless $attr.has_accessor;
}

Just ignore all the syntax used here and consider the use of the set_rw() method. That’s basically all that is needed to make an attribute writable.

If we now get back to the pre- and post-condition specifiers mentioned in the RFC, they also can be implemented with traits using method wrap of Method objects.
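Here is a minimal sketch of what such a post-condition could look like when run as a plain script. The trait name non-negative and the Account class are invented for this example and are not part of Raku; only trait_mod:<is>, wrap and callsame are the real building blocks:

```raku
# Hypothetical trait: dies if the wrapped routine returns a negative value.
multi sub trait_mod:<is>(Routine $r, :$non-negative!) {
    $r.wrap(-> |args {
        my $result = callsame;    # invoke the original routine
        die "post-condition failed: got $result"
            unless $result >= 0;
        $result
    });
}

class Account {
    has $.balance = 100;
    # Computes the balance a withdrawal would leave, without changing it.
    method after-withdrawal($amount) is non-negative {
        $!balance - $amount
    }
}

say Account.new.after-withdrawal(30);              # 70
say (try Account.new.after-withdrawal(500)) // 'rejected';
```

A pre-condition would follow the same pattern, with the check placed before callsame and inspecting the arguments instead of the result.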

I will now skip the next few items in the RFC list we’re walking through. Some were not implemented, some are just self-evident. Multi-dispatch alone is worth an article, and I hope somebody will pick it up for this advent calendar.

Let’s just fast-forward directly to the last one:

A new pragma — delegation — that would modify the dispatch mechanism to automatically delegate specific method calls to specified attributes of an object.

This is just another example where traits came to the rescue. There is no delegation pragma in Raku, but there is a trait named handles (already mentioned in a previous article):

class Book {
    has Str  $.title;
    has Str  $.author;
    has Str  $.language;
    has Cool $.publication;
}
 
class Product {
    has Book $.book handles('title', 'author', 'language', year => 'publication');
}

I chose it since it’s another example of a very elegant solution where no core intervention is needed to implement some advanced functionality. In short, the only thing handles does is install new methods on the class to which its attribute belongs. Of course, it takes into account some edge cases and tries to optimize things where possible. But, otherwise, there is no magic in it. There is nothing in it which you wouldn’t be able to do yourself!
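To illustrate the “do it yourself” claim, here is a rough sketch with a trimmed-down Book. The class ProductByHand is hypothetical and only approximates what handles installs; the real trait deals with more edge cases:

```raku
class Book {
    has Str $.title;
    has Int $.publication;
}

# Delegation declared with the trait.
class Product {
    has Book $.book handles('title', year => 'publication');
}

# Approximately the same forwarding methods written by hand.
class ProductByHand {
    has Book $.book;
    method title(|c) { $!book.title(|c) }
    method year(|c)  { $!book.publication(|c) }
}

my $book = Book.new(title => 'Dune', publication => 1965);

say Product.new(:$book).title;         # Dune
say Product.new(:$book).year;          # 1965
say ProductByHand.new(:$book).year;    # 1965
```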

This is what I’d like to conclude this article with. Years ago the word magic was kind of trendy among Perl developers. “Here we do some magic” – and then something really unexpected was happening. It was a lot of fun!

However, recently I found an interesting definition of what magic is: a kind of action a person performs to achieve a result which doesn’t logically follow from the action itself.

Sorry for the perhaps clumsy translation, but I hope it reflects the idea behind the definition. And it makes a good point: magic, while being fun, is not good for production.

In Raku the magic is eliminated. Instead, Raku brought in such a level of uniformity among different levels of code that often the first impression of: wow, this is magical! – is soon replaced with: wow, it’s so logical!

But you know what? When you put everything together and look at the language as a whole, it creates even bigger magic which can easily enchant you once and forever.