20 years ago tomorrow

20 years ago, on the first of August, the inception of a language started to, well, incept.

Actually, it started a bit earlier than that. Perl was in need of change, so it was decided that the community itself should propose what the language needed to take one step forward, from Perl 5 to Perl 6. A call for requests for change was made; every request should include possible changes to Perl as well as, if possible, an implementation proposal laying out how to proceed. The procedure didn’t lack criticism, but in general it was received with such enthusiasm that August 1st already saw the first RFC, pretty much at the same time as some instructions from Larry Wall on how to actually proceed.

The rest is history. It looks a bit like sacred history, since those RFCs were picked up (and apart) by Larry Wall’s apocalypses, explained later by Damian Conway’s exegeses, and roasted in the synopses, which eventually became the roast repository, the actual specification of the language.

Which is now called Raku. But that’s another story.

To celebrate this part of the history and the people that brought us where we are now, starting tomorrow, we’ll publish 20 articles, one a day, that will focus on one or a few RFCs and show what they eventually became in today’s Raku. So come back every day for a piece of Raku, of history, and of Raku history!

RFC 265: Interface polymorphism considered lovely

A little off-topic preface first. While writing this post I was struck by a sysadmin’s worst nightmare: loss of servers followed by a bad backup. Until the very last moment I had well-grounded fears of not finishing the post at all. Luckily, I made a truce with life and got a temporary respite. The conclusion? Don’t use bareos with ESXi. Or, probably, just don’t use bareos…

While picking an RFC for my previous advent post I was totally focused on the language-objects section. It took me a few passes to find the right one to cover. But in the meantime I realized that a very important topic was actually missing from the list. “Impossible!” – I said to myself and went on another hunt later. Yet neither a search for “abstract class” nor one for “role” came up with any result. I was about to give up and conclude that the idea came to life later, when the synopses were written or thereabouts.

But, wait, what interface is mentioned as a topic of an OO-related RFC? Oh, that interface! As the request body states it:

Add a mechanism for declaring class interfaces with a further method for declaring that a class implements said interface.

At this point I realized once again that it is now a full 20 years behind us. That the text is from the times when many considered Java as the only right OO implementation! And indeed, by reading further we find the following statement, likely to be affected by some popular views of the time:

It’s now a compile time error if an interface file tries to do anything other than pre declare methods.

Reminds you of something, doesn’t it? And then, at the end of the RFC, we find another one:

Java is one language that springs to mind that uses interface polymorphism. Don’t let this put you off – if we must steal something from Java let’s steal something good.

Good? Good?!! Oh, my… Java’s attempt to solve the problems of C++’s multiple inheritance approach by simply denying it altogether is what drove me away from the language from the very beginning. I was fed up with Pascal controlling my writing style as far back as the early 90s!

Luckily, those involved in the early Perl 6 design must have shared my view of the problem (besides, Java itself has changed a lot since). So, we have roles now. What they have in common with abstract classes and modern interfaces is that a role can define an interface to communicate with a class, and provide an implementation of some role-specific behavior too. It can also do a little more than only that!

What makes roles different is the way a role is used in Raku’s OO model. A class doesn’t implement a role; nor does it inherit from it as it would with abstract classes. Instead it does the role; or, to use the other word I love for this, it consumes the role. Technically it means that roles are mixed into classes. The process can be figuratively described as if the compiler takes all methods and attributes contained in the role’s type object and re-plants them onto the class. Something like:

role Foo {
    has $.foo = 42;
    method bar {
        say "hello!"
    }
}
class Bar does Foo { }
my $obj = Bar.new;
say $obj.foo; # 42
$obj.bar;     # hello!

How is it different from inheritance? Let’s change the class Bar a little:

class Baz {
    method bar {
        say "hello from Baz!"
    }
}
class Bar does Foo is Baz {
    method bar {
        say "hello from Bar!";
        nextsame
    }
}
Bar.new.bar; # hello from Bar!
             # hello from Baz!

nextsame in this case re-dispatches a method call to the next method of the same name in the inheritance hierarchy. Simply put, it passes control over to the method Baz::bar, as one can see from the output we’ve received. And Foo::bar? It’s not there. When the compiler mixes the role into Bar it finds that the class does have a method named bar already. Thus the one from Foo is ignored. Since nextsame only considers classes in the inheritance hierarchy, Foo::bar is not invoked.

With another trick the difference from interface consumption can also be made clear:

class Bar {
    method bar {
        say "hello from Bar!"
    }
}
my $obj = Bar.new;
$obj.bar; # hello from Bar!
$obj does Foo;
$obj.bar; # hello!

In this example the role is mixed into an existing object, thanks to the dynamic nature of Raku which makes this possible. When a role is applied this way its content is enforced over the class content, similarly to a virus injecting its genetic material into a cell effectively overriding internal processes. This is why the second call to bar is dispatched to the Foo::bar method and Bar::bar is nowhere to be found on $obj this time.

To have this subject fully covered, let me show you a funny code example. The operator but used in it behaves like does, except that it doesn’t modify its LHS object; instead, but creates and returns a new one:

my $s1 = "not empty means true";
my $s2 = $s1 but role { method Bool { False } };
say $s1 ?? "true" !! "false";
say $s2 ?? "true" !! "false";

This snippet I’m leaving for you to try on your own because it’s time for my post to move onto another topic: role parameterization.

Consider the example:

role R[Str:D $desc] {
    has Str:D $.description = $desc;
}
class Foo does R["some info"] { }
say Foo.new.description; # some info

Or a more practical one:

role R[::T] {
    has T $.val is rw;
}
class ContInt does R[Int] { }
ContInt.new.val = "oops!"; # "Type check failed..." exception is thrown

The latter example utilizes a so-called type capture, where T is a generic type (a concept many of you are likely to know from other languages) that turns into a concrete type only when the role gets consumed and supplied with a parameter, as in the class ContInt declaration.

The final iteration for parametrics I’m going to present today would be this more extensive example:

role Vect[::TX] {
    has TX $.x;
    method distance(Vect $v) { ($v.x - $.x).abs }
}
role Vect[::TX, ::TY] {
    has TX $.x;
    has TY $.y;
    method distance(Vect $v) { 
        (($v.x - $.x)² + ($v.y - $.y)²).sqrt 
    }
}

class Foo1  does Vect[Rat]      { }
class Foo2 does Vect[Int, Int] { }

my $foo1 = Foo1.new(:x(10.0));
my $foo2 = Foo2.new(:x(10), :y(5));
say $foo1;                                   # Foo1.new(x => 10.0)
say $foo2;                                   # Foo2.new(x => 10, y => 5)
say $foo2.distance(Foo2.new(:x(11), :y(4))); # 1.4142135623730951

Hopefully, the code explains itself. Most certainly it nicely visualizes the long way made by the language designers since the initial RFC was made.

At the end I’d like to share a few interesting facts about Raku roles and their implementation by Rakudo.

  1. As of Raku v6.e, a role can define its own constructor/destructor submethods. They’re not mixed into a class as methods are. Instead, they’re used to build/destroy an object the same way as the constructors/destructors of classes are:
use v6.e.PREVIEW; # 6.e is not released yet
role R { submethod TWEAK { say "R" } }
class Foo { submethod TWEAK { say "Foo" } }
class Bar is Foo does R { submethod TWEAK { say "Bar" } }
Bar.new; # Foo
         # R
         # Bar
  2. The role body is a subroutine. Try this example:
role R { say "Role" }
class Foo { say "Foo" }
# Foo

Then modify class Foo so that it consumes R:

class Foo does R { say "Foo" }
# Role
# Foo

The difference in the output is explained by the fact that the role body gets invoked when the role itself is mixed into a class. Try adding one more class consuming R alongside Foo and see how the output changes. To make the distinction between class and role bodies even clearer, make your new class inherit from Foo. Even though is and does look alike, they act very differently.

  3. Square brackets in a role declaration enclose a signature. As a matter of fact, it is the signature of the role body subroutine! This makes a few very useful tricks possible:

# Limit role parameters to concrete numeric objects.
role R[Numeric:D ::T $default] {
    has T $.value = $default;
}
class Foo does R[42.13] { };
say Foo.new.value; # 42.13

Or even:

# Same as above but only allow specific values.
role R[Numeric:D ::T $default where * > 10] {
    has T $.value = $default;
} 

Moreover, when several parametric candidates are declared for a role, choosing the right one is the same kind of task as choosing the right routine among several multi candidates, and is based on matching signatures against the parameters passed.

  4. Rakudo implements a role using four different role types! Let me demonstrate one aspect of this with the following snippet based on the example for the previous fact:

for Foo.^roles -> \consumed {
    say R === consumed
}

=== is a strict object identity operator. In our case we can consider it as a strict type equivalence operator which tells us if two types are actually exactly the same one.

And as I hope to have this subject covered later in a more extensive article, at this point I would make it a classical abrupt open ending by providing just the output of the above snippet as a hint:

False

RFC 28, by Simon Cozens

RFC 28 – Perl Should Stay Perl

Originally submitted by Simon Cozens on August 4, 2000, RFC 28 asked the community to make sure that, whatever updates were made, Perl 6 was still definitely recognizable as Perl. After 20 years of design, proofs-of-concept, implementations, and two released language versions, we’ve ended up with something that is definitely Perlish, even if we’re no longer a Perl.

At the time the RFCs were submitted, the thought was that this language would be the next Perl in line, Perl 6. As time went on before an official language release, Perl 5 development picked up again, and that team & community wanted to continue on its own path. A few months ago, Perl 6 officially changed its name to Raku – not to get away from our Perl legacy, but to free the Perl 5 community to continue on their path as well. It was a difficult path to get to Raku, but we are happy with the language we’re shipping, even if we do miss having the Perl name on the tin.

“Attractive Nuisances”

Let’s dig into some of the specifics Simon mentions in his RFC.

We’ve got a golden opportunity here to turn Perl into whatever on earth we like. Let’s not take it.

This was a fine line that we ended up crossing, even before the rename. Specific design decisions were changed, we started with a fresh implementation (more than once if you count Pugs & Parrot & Niecza …). We are Perlish, inspired by Perl, but Raku is definitely different.

Nobody wins if we bend the Perl language out of all recognition, because it won’t be Perl any more.

I argue that eventually, everyone won – we got a new and improved Perl 5 (and soon, a 7), and we got a brand new language in Raku. The path wasn’t clear 20 years ago, but we ended up in a good place.

Some things just don’t need heavy object orientation.

Raku’s OO is everywhere: but it isn’t required. While you can treat everything as an object:

  3.sqrt.say;

You can still use the familiar Perlish forms for most features:

  say sqrt 3;

Even native scalars (which don’t have the overhead of objects) let you treat them as OO if you want.

  my uint32 $x = 32;
  say $x;
  $x.^name.say;

Even though $x here doesn’t start out as an object, by calling a meta-method on it, the compiler cheats on our behalf and outputs Int here, the closest class to our native int.

But we avoid going the extent of Java; for example, we don’t have to define a class with a main method in order to execute a program.

Strong typing does not equal legitimacy.

Similar to the OO approach, we don’t require typing, but allow you to gradually add it. You can start with an untyped scalar variable, but as you further develop your code, you can add a type to that declared variable, and to parameters to subs & methods. The types can be single classes, subsets, Junctions, where clauses with complicated logic: you can use as much or as little typing as you want. Raku’s multi routines (subs or methods with the same name but different arguments) give you a way to split up your code based on types that is then optimized by the compiler. But you can use as little or as much of it as you want.
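To make that a bit more tangible, here is a small sketch of gradual typing. The names (Positive, describe and the variables) are made up for this illustration:

```raku
# Untyped to start with: any value is accepted.
my $count = 42;
$count = "forty-two";

# Later on, a type constraint can be added to the declaration.
my Int $total = 42;

# A subset narrows an existing type with a where clause.
subset Positive of Int where * > 0;

# multi candidates are selected by the types of the arguments.
multi describe(Positive $n) { "positive integer $n" }
multi describe(Int $n)      { "integer $n" }
multi describe(Str $s)      { "string '$s'" }
multi describe($other)      { "something else" }

say describe(7);        # positive integer 7
say describe(-3);       # integer -3
say describe("seven");  # string 'seven'
say describe(3.14);     # something else
```

The untyped candidate at the end acts as a catch-all, so the typed ones can be added one at a time as the code matures.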

Just because Perl has a map operator, this doesn’t make it a functional programming language.

I think Raku stayed true to this point – while there are functional elements, the polyglot approach (supporting multiple different paradigms) means that any one of them, including functional, doesn’t take over the language. But you can declare routines pure, allowing the compiler to constant fold calls to that routine when the args are known at compile time.
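As a rough sketch of how the paradigms sit side by side (the square routine is just an illustration), the same calculation can be written in a functional or in an imperative style, using a routine marked with the is pure trait:

```raku
# Marked as pure: the same arguments always give the same result,
# and there are no side effects.
sub square(Int $n) is pure { $n * $n }

# Functional style: map, filter, reduce.
say (1..5).map(&square).grep(* %% 2).sum;   # 20

# The same calculation written imperatively.
my $sum = 0;
for 1..5 -> $n {
    my $sq = square($n);
    $sum += $sq if $sq %% 2;
}
say $sum;                                   # 20
```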

Perl is really hard for a machine to parse. … It’s meant to be easy for humans to understand.

Development of Raku definitely embraced this thought – “torture the implementors on behalf of the users”. This is one of the reasons it took us a while to get here. But on that journey, we designed and developed new language parsing tools that we not only use to build and run Raku, but also expose to our users, allowing them to implement their own languages and “Slangs” on top of our compiler.
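The most visible of those parsing tools is the grammar. A minimal sketch, with a made-up KeyValue grammar that parses a single key = value line:

```raku
# Hypothetical mini-language, just for illustration.
grammar KeyValue {
    token TOP   { <key> \s* '=' \s* <value> }
    token key   { \w+ }
    token value { \d+ }
}

my $m = KeyValue.parse("answer = 42");
say ~$m<key>;    # answer
say +$m<value>;  # 42
```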

fin

Finally, now that the Perl team is proposing a version jump to 7, I suspect the Perl community will raise similar concerns to those raised by Simon. Raku and Perl 7 have taken two different paths, but both will be recognizable to the Perl 5 RFC contributors from 20 years ago.

RFC 84 by Damian Conway: => => =>

RFC 84 by Damian Conway: Replace => (stringifying comma) with => (pair constructor)

Yet another nice goodie from Damian, truly what you might expect from the interlocutor and explicator!

The fat comma operator, =>, was originally used to separate values – with a twist. It behaved just like the , operator did, but modified parsing to stringify its left operand.

It saved you some quoting for strings and so this code for hash initialization:

my %h = (
'a', 1,
'b', 2,
);

could be written as:

my %h = (
a => 1,
b => 2,
);

Here, bare a and b are parsed correctly, without a need to quote them into strings. However, the usual hash assignment semantics is still the same: pairs of values are processed one by one, and given that => is just a “left-side stringifying” comma operator, interestingly enough the code above is equivalent to this piece:

my %h = ( a => 1 => b => 2 => );

The proposal suggested changing the meaning of this “special” operator to become a constructor of a new data type, Pair.

A Pair is constructed from a key and a value:

my @pairs = a => 42, 1 => 2;
say @pairs[0]; # a => 42
say @pairs[1]; # 1 => 2
say @pairs[1].key.^name; # Int, not a Str

The @pairs list contains just 2 values here, not 4: the key of the first is conveniently stringified for us, and the second just uses a bare Int literal as its key.

It turns out, introducing Pair is not only a convenient data type to operate on, but this change offers new opportunities for… subroutines.

Raku has first class support of signatures, both for the sake of the “first travel class” pun here and for the matter of it, yes, actually having Signature, Parameter and Capture as first-class objects, which allows for surprising solutions. It is not a surprise it supports named parameters with plenty of syntax for it. And Pair class has blended in quite naturally.

If a Pair is passed to a subroutine with a named parameter where keys match, it works just so, otherwise you have a “full” Pair, and if you want to insist, a bit of syntax can help you here:

sub foo($pos, :$named) {
say "$pos.gist(), $named.gist()";
}
foo(42); # 42, (Any)
try foo(named => 42); # Oops, no positionals were passed!
foo((named => 42)); # named => 42, (Any)
foo((named => 42), named => 42); # named => 42, 42

As we can see, designing a language is interesting: a change made in one part can have consequences in some other part, which might seem quite unrelated, and you’d better hope your choices will work out well when connected together. Thanks to Damian and all the people who worked on Raku design for putting an amazing amount of effort into it!

And last but not least: what happened to the => train we saw? Well, now it does what you mean if you mean what it does:

my %a = a => 1 => b => 2;
say %a.raku; # {:a(1 => :b(2))}

And yes, this is a key a pointing to a value of Pair of 1 pointing to a value of Pair of b pointing to value of 2, so at least the direction is nice this time. Good luck and keep your directions!

RFC 200, by Nathan Wiger: Revamp tie to support extensibility

Proposed on 7 September 2000, frozen on 20 September 2000, depends on RFC 159: True Polymorphic Objects proposed on 25 August 2000, frozen on 16 September 2000, also by Nathan Wiger and already blogged about earlier.

What is tie anyway?

RFC 200 was about extending the tie functionality as offered by Perl.

This functionality in Perl allows one to inject program logic into the system’s handling of scalars, arrays and hashes, among other things. This is done by assigning the name of a package to a data-structure such as an array (aka tying). That package is then expected to provide a number of subroutines (e.g. FETCH and STORE) that will be called by the system to achieve certain effects on the given data-structure.

As such, it is used by some of Perl’s core modules, such as threads, and many modules on CPAN, such as Tie::File. The tie functionality of Perl still suffers from the problems mentioned in the RFC.

It’s all tied

In Raku, everything is an object, or can be considered to be an object. Everything the system needs to do with an object, is done through its methods. In that sense, you could say that everything in Raku is a tied object. Fortunately, Rakudo (the most advanced implementation of the Raku Programming Language) can recognize when certain methods on an object are in fact the ones supplied by the system, and actually create short-cuts at compile time (e.g. when assigning to a variable that has a standard container: it won’t actually call a STORE method, but uses an internal subroutine to achieve the desired effect).

But apart from that, Rakudo has the capability of identifying hot code paths during execution of a program, and optimizing these in real time.

Jonathan Worthington gave two very nice presentations about this process: How does deoptimization help us go faster from 2017, and a Performance Update from 2019.

Because everything in Raku is an object and access occurs through the methods of the classes of these objects, this allows the compiler and the runtime to have a much better grasp of what is actually going on in a program. Which in turn gives better optimization capabilities, even optimizing down to machine language level at some point.

And because everything is “tied” in Raku (looking at it using Perl-filtered glasses), injecting program logic into the system’s handling of arrays and hashes can be as simple as subclassing the system’s class and providing a special version of one of the standard methods as used by the system. Suppose you want to see in your program when an element is fetched from an array: you need only add a custom AT-POS method:

class VerboseFetcher is Array {    # subclass core's Array class
    method AT-POS($pos) {           # method for fetching an element
        say "fetching #$pos";        # tell the world
        nextsame                     # provide standard functionality
    }
}

my @a is VerboseFetcher = 1,2,3;   # mark as special and initialize
say @a[1];  # fetching #1␤2

The Raku documentation contains an overview of which methods need to be supplied to emulate an Array and to emulate a Hash. By the way, the whole lemma about accessing data structure elements by index or key is recommended reading for someone wanting to grok those aspects of the internals of Raku.

Nothing is special

In a blog post about RFC 168 about making things less special, it was already mentioned that really nothing is special in Raku. And that (almost) all aspects of the language can be altered inside a lexical scope. So what the above example did to the Array class can be done to any of Raku’s core classes, or any other classes that have been installed from the ecosystem, or that you have written yourself.

But it can be overwhelming to have to supply all of the logic needed to fully emulate an array or a hash, especially when you first try to do this. Therefore the ecosystem actually has two modules with roles that help you with that: Array::Agnostic for arrays, and Hash::Agnostic for hashes.

Both modules only require you to implement 5 methods in a class that does these roles to get the full functionality of an array or a hash, completely customized to your liking.

In fact, the flexibility of the approach of Raku towards customizability of the language, actually allowed the implementation of Perl’s tie built-in function in Raku. So if you’re porting code from Perl to Raku, and the code in question uses tie, you can use this module as a quick intermediate solution.

Has the problem been fixed?

Let’s look at the problems that were mentioned with tie in RFC 200:

  1. It is non-extensible; you are limited to using functions that have been implemented with tie hooks in them already.

Raku is completely extensible and pluggable in (almost) all aspects of its implementation. There is no limitation to which classes one can and one cannot extend.

  2. Any additional functions require mixed calls to tied and OO interfaces, defeating a chief goal: transparency.

All interfaces use methods in Raku, since everything is an object or can be considered as one. Use of classes and methods should be clear to any programmer using Raku.

  3. It is slow. Very slow, in fact.

In Raku, it is all the same speed during execution. And every customization profits from the same optimization features as every other piece of code in Raku. And will be, in the end, optimized down to machine code when possible.

  4. You can’t easily integrate tie and operator overloading.

In Raku, operators are multi-dispatch subroutines that allow additional candidates for custom classes to be added.

  5. If defining tied and OO interfaces, you must define duplicate functions or use typeglobs.

Typeglobs don’t exist in Raku. All interfacing in Raku is done by supplying additional methods (or subroutines in case of operators). No duplication of effort is needed, so no such problem.

  6. Some parts of the syntax are, well, kludgey

One may argue that the kludgey syntax of Perl has been replaced by another kludgey syntax in Raku. That is probably in the eye of the beholder. Fact is that the syntax in Raku for injecting program logic, is not different from any other subclassing or role mixins one would otherwise do in Raku.
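To illustrate the point about operators being multi-dispatch subroutines, here is a minimal sketch that adds an infix + candidate for a custom class. The Money class is hypothetical and only serves as an example:

```raku
# Hypothetical class, made up for this illustration.
class Money {
    has Int $.cents;
}

# An additional candidate for the existing + operator.
multi sub infix:<+>(Money $a, Money $b) {
    Money.new(cents => $a.cents + $b.cents)
}

my $total = Money.new(cents => 150) + Money.new(cents => 275);
say $total.cents;  # 425
say 1 + 2;         # 3, the core candidates are untouched
```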

Conclusion

Nothing from RFC 200 was actually implemented in the way it was originally suggested. However, solutions to the problems mentioned have all been implemented in Raku.

RFC 159, by Nathan Wiger: True Polymorphic Objects

Proposed on 25 August 2000, frozen on 16 September 2000

On polymorphism

RFC159 introduces the concept of true polymorphic object.

Objects that can morph into numbers, strings, booleans and much more on-demand. As such, objects can be freely passed around and manipulated without having to care what they contain (or even that they’re objects).

When one looks at how 42, "foo", now work in Raku nowadays, one can only see that that vision has pretty much been implemented. Because most of the time, one doesn’t really care about the fact that 42 is really an Int object, "foo" is really a Str object and that now represents a new Instant object every time it is called. The only thing one cares about, is that they can be used in expressions:

say "foo" ~ "bar";  # foobar
say 42 + 666;       # 708
say now - INIT now; # 0.0005243

RFC159 lists a number of method names to be used to indicate how an object should behave under certain circumstances, with a fallback provided by the system if the class of the object does not provide that method. In most cases these methods did not make it into Raku, but some of them did with a different name:

Name in RFC   Name in Raku   When
STRING        Str            Called in a string context
NUMBER        Numeric        Called in a numeric context
BOOLEAN       Bool           Called in a boolean context

And some of them even retained their name:

Name in RFC   When
BUILD         Called in object blessing
STORE         Called in an lvalue = context
FETCH         Called in an rvalue = context
DESTROY       Called in object destruction

but with sometimes subtly different semantics from the RFC.

Only a few made it

In the end, only a limited set of special methods was decided on for Raku. All of the other methods in RFC159 have been implemented by polymorphic operators that coerce when needed. For instance the proposed PLUS method has been implemented as an infix + operator that has a “default” candidate that coerces its operands to a number.

So, effectively, if you have an object of class Foo and you want it to act as a number, you only need to add a Numeric method to that class. An expression such as:

my $foo = Foo.new;
say $foo + 42;

is effectively executing:

say infix:<+>( $foo, 42 );

and the infix:<+> candidate that takes Any objects, does:

return infix:<+>( $foo.Numeric, 42.Numeric );

And if such a class Foo does not provide a Numeric method, then it will throw an exception.
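As a small sketch of how this plays out in practice, here is a made-up Temperature class that supplies the three coercion methods from the table above (the class and its attribute are assumptions for the sake of the example):

```raku
# Hypothetical class, made up for this illustration.
class Temperature {
    has $.degrees;
    method Numeric { $!degrees }            # used in numeric context
    method Str     { "$!degrees degrees" }  # used in string context
    method Bool    { $!degrees > 0 }        # used in boolean context
}

my $t = Temperature.new(degrees => 20);
say $t + 1.5;                          # 21.5
say "It is " ~ $t;                     # It is 20 degrees
say $t ?? "above zero" !! "freezing";  # above zero
```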

The DESTROY method

In Raku, object destruction is non-deterministic. If an object is no longer in use, it will probably get garbage collected. The probable part is because Raku does not know a global destruction phase, unlike Perl. So when a program is done, it just does an exit (although that logic does honour any END blocks).

An object is marked “ready for removal” when it can no longer be “reached”. It then has its DESTROY method called when the garbage collection logic kicks in. Which can be any amount of time after it became unreachable.

If you need deterministic calling of the DESTROY method, you can use a LEAVE phaser. Or if that doesn’t allow you to scratch your itch, you can possibly use the FINALIZER module.
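A minimal sketch of the LEAVE approach. Instead of DESTROY, this example calls a hypothetical release method, and the @events array is only there to make the order of events visible:

```raku
my @events;   # records the order in which things happen

# Hypothetical class, made up for this illustration.
class Resource {
    has $.name;
    method release { @events.push("released $!name") }
}

sub work {
    my $r = Resource.new(name => "handle");
    LEAVE $r.release;   # runs whenever this block is left
    @events.push("working with {$r.name}");
}

work();
.say for @events;
# working with handle
# released handle
```

The cleanup happens at a known point (scope exit) rather than whenever the garbage collector gets around to it.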

STORE / FETCH on scalar values

Conceptually, you can think of a container in Raku as an object with STORE and FETCH methods. Whenever you set a value in a container, it conceptually calls the STORE method. And whenever the value inside the container is needed, it conceptually calls the FETCH method. In pseudo-code:

my $foo = 42;  # Scalar.new(:name<$foo>).STORE(42)

But what if you want to control access to a scalar value, similar to Perl’s tie? Well, in Raku you can, with a special type of container class called Proxy. An example of its usage:

sub proxier($value? is copy) {
    return-rw Proxy.new(
        FETCH => method { $value },
        STORE => method ($new) {
            say "storing";
            $value = $new
        }
    )
}

my $a := proxier(42);
say $a;    # 42
$a = 666;  # storing
say $a;    # 666

Subroutines return their result values de-containerized by default. There are basically two ways of making sure the actual container is returned: using return-rw (like in this example), or by marking the subroutine with the is rw trait.
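For comparison, a sketch of the second way: the same subroutine marked with the is rw trait, so that the Proxy produced by the last statement keeps its container:

```raku
sub proxier($value? is copy) is rw {
    # With `is rw`, the result of the last statement is not de-containerized.
    Proxy.new(
        FETCH => method { $value },
        STORE => method ($new) {
            say "storing";
            $value = $new
        }
    )
}

my $b := proxier(42);
say $b;    # 42
$b = 666;  # storing
say $b;    # 666
```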

STORE on compound values

Since FETCH only makes sense on scalar values, there is no support for FETCH on compound values, such as hashes and arrays, in Raku. I guess one could consider calling FETCH in such a case to be the Zen slice, but it was decided that that would just return the compound value itself.

The STORE method on compound values however, allows for some interesting functionality. The STORE method is called whenever there is an initialization of the entire compound value. For instance:

@a = 1,2,3;

basically executes:

@a := @a.STORE( (1,2,3) );

But what if you don’t have an initialized @a yet? Then the STORE method is supposed to actually create a new object and initialize this with the given values. And the STORE method can tell, because then it also receives an INITIALIZE named argument with a True value. So when you write this:

my @b = 1,2,3;

what basically gets executed is:

@b := Array.new.STORE( (1,2,3), :INITIALIZE );

Now, if you realize that:

my @b;

is actually short for:

my @b is Array;

it’s only a small step to realize that you can create your own class with customized array logic, that can replace the standard Array logic with your own. Observe:

class Foo {
    has @!array;
    method STORE(@!array) {
        say "STORED @!array[]";
        self
    }
}

my @b is Foo = 1,2,3;  # STORED 1 2 3

However, when you actually start using such an array, you are confronted with some weird results:

say @b[0]; # Foo.new
say @b[1]; # Index out of range. Is: 1, should be in 0..0

Without getting into the reasons for these results, it should be clear that to completely mimic an Array, a lot more is needed. Fortunately, there are ecosystem modules available to help you with that: Array::Agnostic for arrays, and Hash::Agnostic for hashes.

BUILD

The BUILD method also subtly changed its semantics. In Raku, method BUILD will be called as an object method and receive all of the parameters given to .new, after which it is fully responsible for initializing object attributes. This becomes more visible when you use the internal helper module BUILDPLAN. This module shows the actions that will be performed on an object of a class when built with the default .new method:

class Bar {
    has $.score = 42;
}
use BUILDPLAN Bar;
# class Bar BUILDPLAN:
#  0: nqp::getattr(obj,Bar,'$!score') = :$score if possible
#  1: nqp::getattr(obj,Bar,'$!score') = 42 if not set

This is internals speak for:

  – assign the value of the optional named argument score to the $!score attribute
  – assign the value 42 to the $!score attribute if it was not set already

Now, if we add a BUILD method to the class, the buildplan changes:

class Bar {
    has $.score = 42;
    method BUILD() { }
}
use BUILDPLAN Bar;
# class Bar BUILDPLAN:
#  0: call obj.BUILD
#  1: nqp::getattr(obj,Bar,'$!score') = 42 if not set

Note that there is no automatic attempt to take the value of the named argument score anymore. Which means that you need to do a lot of work in your custom BUILD method if you have many named arguments, and only one of them needs special handling. That’s why the TWEAK method was added:

class Bar {
    has $.score = 42;
    method TWEAK() { }
}
use BUILDPLAN Bar;
# class Bar BUILDPLAN:
#  0: nqp::getattr(obj,Bar,'$!score') = :$score if possible
#  1: nqp::getattr(obj,Bar,'$!score') = 42 if not set
#  2: call obj.TWEAK

Note that the TWEAK method is called after all of the normal checks and initializations. This is in most cases much more useful.
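A small sketch of why that ordering is handy: by the time TWEAK runs, $!score already holds either the named argument or its default, so a derived attribute can be based on it. The Exam class and its grade attribute are made up for this example, and it uses a submethod, which is the usual way to declare TWEAK:

```raku
# Hypothetical class, made up for this illustration.
class Exam {
    has $.score = 42;
    has $.grade;
    submethod TWEAK() {
        # $!score has already been set from .new or from its default.
        $!grade = $!score >= 50 ?? "pass" !! "fail";
    }
}

say Exam.new.grade;               # fail
say Exam.new(score => 70).grade;  # pass
```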

Conclusion

Although the idea of true polymorphic objects has been implemented in Raku, it turned out quite different from what was originally envisioned. In hindsight, one can see why it was deemed impractical to try to support an ever-increasing list of special methods for all objects. Instead, a choice was made to only implement a few key methods from the proposal, and for the others the approach of automatic coercions was taken.

RFC 188, by Damian Conway: Objects: Private keys and methods

Break someone’s code today!

On September 1st of 2000 Damian Conway sent a proposal №188, promoting the idea of Private Keys and Methods.

In those days, Perl’s object-oriented programming relied heavily on hashes. Indeed, a hash can store data values by keys, as well as references to routines, which is how you can describe an object with data attributes and methods. With bits of syntax sugar here and there one can make shortcuts to work with such an object’s state and behavior.

The same can be done in Raku nowadays:

my %tipsy-object;
# We have some data
%tipsy-object<created> = Date.new(now);
# And a "method"...
%tipsy-object<tell-age> = { say %tipsy-object<created> }
# Call it!
%tipsy-object<tell-age>(); # 2020-08-14
# Not method-y enough? Mkay, maybe this way...
%tipsy-object.<tell-age>(); # 2020-08-14
# Wow, what a dot it was... Now edit some data:
%tipsy-object<created> = Date.new(now).later(:1day);
%tipsy-object.<tell-age>(); # 2020-08-15, wow!

This sounds awesome in its simplicity (if we pretend Hash itself is not an object!) until you try to share your code. The moment you post this %tipsy-object, people all around the world start to play with it even when you sleep, and you have no idea what they might do. With the best intentions, someone might write this:

my %tipsier-object;
die 'Woah this was not expected, calm down!!' unless %tipsy-object<created> ~~ Date;
%tipsier-object<tipsy-origins> = %tipsy-object<created>;

And now, lying on a comfy sofa and thinking about how someone might stare at your code even when you sleep, you notice a scary thing: you wanted to know exactly when a %tipsy-object was created, and so Date is not nearly enough! Was it a morning, a bright day or a night? Who would tell the answer without the mighty DateTime?

So you hurry up and do a little patch:

my %tipsy-object = created => DateTime.new(now), tell-age => { say %tipsy-object<created> }
%tipsy-object.<tell-age>();

What we just saw is a small, not so uncommon step for every programmer and a big leap for computer science: we broke someone’s code!

After all the rants from %tipsy-object users the next day, a lesson lived is a lesson learned: it is hard to overestimate the number of assumptions people can make about your code, so spare them the guessing: just tell the rules and hide everything else.

In object-oriented programming this very, very generic concept is spread across a couple of different ideas, one of which is named “encapsulation”.

In short, encapsulation allows us to say “Nobody should touch this… (unless someone really, really wants to make it happen)”. It gives you the means to divide your coding efforts into a couple of different realms, one of which has a “Nobody but I can use this” policy, while its opposite takes the famous “Use it as you please and we will try really, really hard not to break anything knowingly” stance.

Encapsulation by itself is not a Great Code Problems Solver (you can write scary code anyway!), nor does it reduce the number of bugs (as if that ever happens!), but it certainly can save you some nerves when rewriting code (which sometimes is a Great Way To Solve Problems), and it reduces coupling between components (which does reduce the number of bugs, right? Hope never dies).

Here, inspired by the Tie::SecureHash module approach, Damian Conway has proposed a way to support encapsulation in a hash-based object-y system.

Sugar – the keyword way

The proposal describes a unary function named private, which can be applied to a single hash entry, a slice of entries or a whole hash:

private $hash{key};
# ^ can note the $ sigil to refer to a value
private @hash{qw(_name _rank _snum)};
# ^ can note the @ sigil to refer to a number of values
private %hash;
# ^ or, effectively, applied to any hash directly or via a reference

This function application restricts hash entries to the current package and makes all the spared ones “public”:

package MyClass;
sub new { bless private { secret => 'data' }, $_[0] }
package main;
my $obj = MyClass->new();
print $obj->{secret}; # dies, inaccessible entry

Such entries can be inherited:

package Base;
sub new {
    my ($class, @data) = @_;
    bless private { data => [@data] }, $class;
}
package SortableBase;
use base 'Base';
sub sorted {
    my ($self) = @_;
    print sort @{ $self->{Base::data} };
}

And, of course, for the aforementioned “someone really, really wants it to happen”, there was a Plan B prepared:

Although it is almost inevitably a Very Bad Idea and we shall probably All Come To Regret It Later, it ought to be possible to specify that certain entries of a private-ized hash are nevertheless public. This might, for example, be necessary when inheriting from legacy code.

The smartest hacks require the spookiest code anyway.

When it comes to methods, which are incidentally just subroutines with a twist or two, private becomes a prefix keyword:

package Base;
sub new { ... }
private sub check { ... }
sub do_check {
    my ($self) = @_;
    $self->check(); # okay
    $self->Base::check(); # okay
    check(); # okay
    Base::check(); # okay
}
package Derived;
use base 'Base';
sub do_check {
    my ($self) = @_;
    $self->check(); # dies, no suitable accessible method
    $self->Base::check(); # okay
    Base::check($self); # okay
}
package main;
my $obj = Base->new();
$obj->check(); # dies, no suitable accessible method
$obj->Base::check(); # dies, no suitable accessible method
Base::check($obj); # dies, inaccessible subroutine

What about Raku?

Conclusion: Huffmanize it!

Raku wisdom says “Huffmanize your syntax”: like the mighty Huffman coding, it tells you to make the things that are typed most often the easiest to type.

With the Raku object system being overhauled from the ground up, the question of encapsulation was not forgotten! Not! In! A! Bit! (sorry for all the line noise you just heard).

Private things in Raku are marked with an exclamation mark. Firstly, it indicates to the reader those code bits are important. Secondly, for attributes it serves as a secondary “sigil”, namely “twigil”, which tells the reader that some interesting scoping is involved here. Thirdly, it looks like it has some class. Fourthly… Do you really expect more reasons here? Please, come up with your ideas in the comments!

After using Raku for some years, I was suddenly struck by a simple consistency here:

class Class {
    # A public attribute
    has $.unchecked = True;
    # A private attribute
    has $!checked = False;
    # Just a Class package scoped sub
    sub i'm-just-a-sub {}
    # A public method
    method check { True }
    # A private method
    method !check { False }
}

It is kind of amusing to see how exclamation marks mean the thing they mean in both cases, so if you once just remembered the syntax as given, now you can hopefully chuckle a bit.
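The same marker shows up on the calling side. The following is only a sketch with a made-up Door class, showing that private attributes and methods are reached with ! inside the class and are not reachable as ordinary methods from outside:

```raku
class Door {
    has $.label  = 'front';   # public: an accessor method is generated
    has $!locked = True;      # private: only visible inside the class

    # A private method, declared and called with ! instead of .
    method !unlock { $!locked = False }

    method open {
        self!unlock;
        $.label ~ ' door is open';
    }
    method is-locked { $!locked }
}

my $door = Door.new;
say $door.is-locked;   # True
say $door.open;        # front door is open
say $door.is-locked;   # False
```

Calling $door.unlock from outside the class fails at run time, because no public method of that name exists.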

Another piece of common wisdom is “Similar things should look similar and different things should look different”, and this approach, which evolved from the original proposal, reflects it very nicely (just don’t tell anyone similar things also should look different, as that would be the next level of wisdom).

RFC 307, by Simon Cozens

In a context with apocalypses and exegeses, no wonder there’s also a PRAYER somewhere: precisely in this RFC 307, by Simon Cozens, which was actually rejected, but somehow ended up anyway in Apocalypse 12 together with the one that superseded it, RFC 189. Both proposals talk about what is going to happen when an object is created, an action that has traditionally been called blessing in the Perl world.

So let’s get first to the rejected one, which is simpler. It says

This RFC proposes a special sub, PRAYER, which is automatically called on blessing.

The term blessing is Perlese for welcoming a data structure into the object world. In Perl objects are little more than hashes with a tag tacked on them, so you bless a data structure with the class it’s going to belong to. However, that’s pretty static, and it’s only sticking together two pieces of data:

my $foo = bless { bar => "baz" }, "Foo";
say $foo; # Foo=HASH(0x1b05558)

Not even a formal definition of a package (again, Perlese for classes) or attributes or whatever is needed. But imagine we need to ensure the data structure includes a certain attribute, or need to create an additional one from existing attribute values. Well, use one of the existing object orientation packages such as Moo. Moose even has per-attribute triggers that can be used to check them once they have been set. But there is really no object-wide equivalent.

So we really need one. And we need one WITH CAPITAL LETTERS because that’s how we call the IMPORTANT things, or actually the things that happen under the hood when we do something. Perl already does have a few of those. UNIVERSAL is the ur-class (we call it Mu in Raku). That class includes DOES and VERSION; there’s also AUTOLOAD which is called when a method does not exist in a class, as well as DESTROY.

Other recently discussed RFCs proposed other pseudo-classes that used capital letters: NEXT, for instance.

DESTROY is interesting: it’s called at a certain phase in the object lifecycle, namely at the very end of it.

Ob-pun about the mixed metaphor of objects starting their life with a blessing and ending with simple destruction.

So Perl had a single phaser. This RFC advocated for getting, at least, a couple of symmetric ones, at both ends of the object lifecycle. Which was precisely in the turf of RFC 189, which proposed a set of hierarchical BUILD (lost chance to call them INCEPT) and DESTROY calls, invoked up and down the object inheritance chain.

Which is why this RFC was rejected, I guess. However, this RFC goes a little beyond that, or sideways from that, so it made sense to pick it up for Apocalypse 12, instead of subsuming it into 189. In a way, BUILD is what it asks for off the bat: it’s the “prayer” that gets called from bless. But rephrase it slightly as “what’s called when something has been blessed” and it becomes something totally different: not a building routine, but a, wait for it, tweaking routine (or method, or submethod, which is what it eventually became).

In this way, Raku got TWEAK. RFC 189 begat BUILD, which creates a nice and bundled object. But after this object has been created, and it’s solidly wrapped into a self data structure, you might still need to TWEAK it, and do stuff that can only be done when and if the object is already BUILT. This is what TWEAK is for. And it can be put to good use in cases like checking if the constructor has been called with a nonexistent attribute (which uses the self variable to grab all existing attributes), or to register self with some external object, or, actually, anything you might want to do with an object that has been already blessed.
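As an illustration of the “register self with some external object” case, here is a small sketch. The Widget class and the @registry array are hypothetical stand-ins and come from no RFC or library:

```raku
my @registry;   # hypothetical external object that wants to know about widgets

class Widget {
    has $.name;
    has $.label;

    submethod TWEAK() {
        # Attributes are already initialized at this point.
        $!label //= "widget: $!name";   # derive one attribute from another
        @registry.push: self;           # register the finished object
    }
}

my $w = Widget.new(name => 'knob');
say $w.label;          # widget: knob
say @registry.elems;   # 1
```

Since TWEAK runs after the attributes are set, it can safely read $!name and hand out self.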

This also says something about the process that created Raku: an RFC that was rejected eventually ended up as a useful, and interesting, feature of the final language. Because, in the true open source spirit, with many eyes all bugs are shallow, and many eyes also look at stuff all over again and find its real worth and meaning. Which is what you need when designing a 100 year language, I guess.

RFC 190, by Damian Conway: NEXT pseudoclass for method redispatch

In his series of object orientation RFC’s Perl/Raku luminary Damian Conway includes a proposal for method redispatch, RFC 190, which is the subject of today’s article.

On method dispatch

Perl has a pseudoclass named SUPER which an object can use to invoke a method of its parent class that it has overridden. It looks approximately like this:

sub dump_info {
    my $self = shift;           # obtain invocant
    $self->SUPER::dump_info;    # first dump parent
    say $self->{derived_info};  # then ourselves
}

In this example, taken loosely from the RFC, we define a method dump_info in a derived class which dumps the info of its parent class and then anything that itself added to the object, exemplified by the derived_info attribute. Conway notes that this breaks down under multiple inheritance because SUPER will only dispatch to the first parent class of $self and once you go SUPER you can’t go back. Supposing that all dump_info methods in parent classes are similarly implemented, only the family bonds indicated in the below diagram with double lines would be traversed, resulting in lots of info potentially undumped:

Grand11  Grand12      Grand21  Grand22
   ║        │            │        │
   ╟────────┘            ├────────┘
   ║                     │
Parent1               Parent2
   ║                     │
   ╟─────────────────────┘
   ║
Derived

One might think that to get this right, each class needs to dispatch to all of its parent classes somehow, which would be akin to a post-order traversal of the inheritance tree. This is correct insofar as it models the relevance of methods in the inheritance tree, supposing that left parents are more important than right ones.

The NEXT pseudoclass

Conway’s proposal is subtly different. Namely, he proposes to add a new pseudoclass named NEXT which is to be used just like SUPER above and which should, instead of continuing in the parent of the current package, resume the original method dispatch process with the next appropriate candidate as if the current one had not existed. Then, method redispatch is performed with respect to the original object’s class and sees the entire inheritance tree instead of cutting off all other branches below SUPER. Effectively, this offloads the responsibility of redispatching from each specific class onto the runtime method dispatch mechanism.

Grand11══Grand12══╗   Grand21══Grand22
   ║        │     ║      ║        │
   ╟────────┘     ║      ╟────────┘
   ║              ║      ║
Parent1           ╚═══Parent2
   ║                     │
   ╟─────────────────────┘
   ║
Derived

Concretely, when calling dump_info on an object blessed into the Derived class, there is an array of possible candidates. They are the methods of the same name of Derived, Parent1, Grand11, Grand12, Parent2, Grand21 and Grand22, in order of relevance. Given this array, each dump_info implementation just has to redispatch to the single next method in line. It is on the runtime to keep enough data around to continue the dispatch chain.

Notably, this mechanism can also provide an implementation of RFC 8, which is about the special method AUTOLOAD. AUTOLOAD is called as a fallback when some method name could not be resolved. Redispatching via NEXT can be used to decline to autoload a method in the current class and leave that task to another AUTOLOAD in the inheritance tree.

The Raku implementation

While the status of RFC 190 is “frozen”, hence accepted, this feature looks
different in Raku today. There is no NEXT and even SUPER is gone.
Instead we have three types of redispatch keywords:

  • callsame, callwith: calls the next candidate for the method, either
    using the same arguments or with the other, given ones.
  • nextsame, nextwith: the same as the call* keywords, except
    they do not return control to the current method.
  • samewith: calls the same candidate again with different arguments.

callsame and callwith implement the process that NEXT would have, but in the wider context of all dispatch-related goodies that Raku got. They work in all places that have a linearized hierarchy of “callable candidates”. This includes redispatch of methods along the inheritance tree, it naturally includes candidates for multi methods and subs, it includes FALLBACK (erstwhile AUTOLOAD) and wrapped routines. Another speciality is in case you ever find you have to redispatch from the current method but do so in another context, for example inside a Promise, then the nextcallee keyword can be used to obtain a Callable which can continue the dispatch process from anywhere.
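The same keywords work on multi candidates. A minimal sketch, with a made-up multi sub named describe:

```raku
multi sub describe(Any $x) { "Any($x)" }
multi sub describe(Int $x) {
    # Int is narrower than Any, so this candidate is tried first;
    # callwith hands a changed argument to the next candidate and comes back.
    my $rest = callwith($x + 1);
    "Int($x) then $rest";
}

say describe(41);      # Int(41) then Any(42)
say describe('foo');   # Any(foo)
```

Using callsame instead of callwith would pass the original argument along unchanged.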

We replace NEXT:: with callsame, and after some localization our dump_info method looks like this and does not omit any parent class’s methods anymore:

method dump-info {
    callsame;            # first dump next
    put $!derived-info;  # then ourselves
}
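A self-contained sketch of this with two parent classes follows. The classes are reduced to the bare minimum, and an array (a made-up @order) records the calls instead of printing, so that the order is easy to inspect:

```raku
my @order;   # records which dump-info ran, in order

class Parent1 { method dump-info { callsame; @order.push: 'Parent1' } }
class Parent2 { method dump-info { callsame; @order.push: 'Parent2' } }
class Derived is Parent1 is Parent2 {
    method dump-info { callsame; @order.push: 'Derived' }
}

Derived.new.dump-info;
say @order;   # [Parent2 Parent1 Derived]
```

Parent1 knows nothing about Parent2, yet its callsame continues with Parent2, because the candidate list is computed for the original object’s class. The last candidate’s callsame finds nothing more to call and does nothing.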

In summary, being able to redispatch to a parent class’s method is useful. In a situation with multiple inheritance and multi subs, the fixation on the “parent class” is less helpful and is replaced by “less specific method”. Conway’s proposal to redispatch to the next less specific method made it into Raku and its usefulness is amplified way beyond the RFC by other Raku features and the careful design connecting them.

Curiously, a NEXT module was first shipped as a core module with Perl v5.7.3, released in 2002, written by… Damian Conway.


There is more than one way to dump info

For the particular pattern used as an example in the RFC, where each class in the inheritance tree independently throws in its own bit, there is another way in Raku to accomplish the same as redispatch. This is the method call operator .* (or its stricter variant .+, which fails if no candidate is found). Consider

# Parent and grandparent classes look the same...

class Derived is Parent1 is Parent2 {
    method dump-info { put "Derived" }
}

Each dump-info method is only concerned with its own info and does not redispatch. The .* methodop walks the candidate chain and calls all of them, returning a list of return values (although in this case we care only about the side effect of printing to screen):

Derived.new.*dump-info
# Derived
# Parent1
# Grand11
# Grand12
# Parent2
# Grand21
# Grand22

The methods are called from most to least relevant, in pre-order of the inheritance tree. This way we can keep our dump-info methods oblivious to redispatch, whereas explicitly redispatching using callsame and co. would allow us to choose between pre- and post-order.
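A runnable miniature of this, with hypothetical class names Base1, Base2 and Child, and with the methods returning their info instead of printing it:

```raku
class Base1 { method dump-info { 'Base1' } }
class Base2 { method dump-info { 'Base2' } }
class Child is Base1 is Base2 {
    method dump-info { 'Child' }
}

# .* calls every candidate and collects the return values
say Child.new.*dump-info.join(', ');   # Child, Base1, Base2
```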

RFC 22: Control flow: Builtin switch statement, by Damian Conway

The problem

C has switch/case, and many other languages either copied it, or created a similar construct. Perl in 2000 didn’t have any such thing, and this was seen as a lack.

A Tale of Two Languages

This RFC not only became two (related) features, it did so in both Perl and Raku with dramatically different results: in Perl it’s widely considered the biggest design failure of the past two decades, whereas in Raku it’s entirely non-controversial. The switch that Perl ended up with works very similarly to the original proposal. This is actually helpful in analysing what changes were necessary to make it a successful feature. It looks something like this (in both languages):

given $foo {
    when /foo/ {
        say "No dirty words here";
    }
    when "None" {

    }
    when 1 {
        say "One!";
    }
    when $_ > 42 {
        say "It's more than life, the universe and everything";
    }
}

The switch is actually two features (for the price of only one RFC): smartmatch and given/when on top of it. Smartmatch is an operator ~~ that checks if the left hand side fits the constraints of the right hand side. Given/when is a construct that smartmatches the given argument to a series of when arguments (e.g. $given ~~ $when) until one succeeds.

However, one of the distinguishing features of Perl is that it doesn’t generally overload operators, instead it has different operators for different purposes (e.g. == for comparing numbers and eq for comparing strings). Smartmatch however is inherently all about overloading. This mismatch is essentially the source of all the trouble of smartmatch in Perl. Raku on the other hand has an extensive type-system, and is not so dependent on type specific operators (though it still has some for convenience), and hence is much more predictable.

Most obviously, the type system of Raku means that it doesn’t use a table of possibilities, but instead $left ~~ $right just translates to $right.ACCEPTS($left). The right-first semantics makes it a lot easier to reason about (e.g. matching against a number will always do numeric equality).

It means it can easily distinguish between a string and an integer, unlike Perl which has to guess what you meant: $foo ~~ 1 always means $foo == 1, and $foo ~~ "foo" always means $foo eq "foo". In Perl, $foo ~~ "1" would do a numeric match.
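A short sketch of both points: the right-hand side picks the comparison, and any class can take part by providing ACCEPTS. The Even class here is made up for illustration:

```raku
# The right-hand side decides how to compare.
say "1.0" ~~ 1;      # True:  matching against a number compares numerically
say "1.0" ~~ "1";    # False: matching against a string compares as strings

# A hypothetical class that takes part in smartmatching by defining ACCEPTS.
class Even {
    method ACCEPTS($n) { $n %% 2 }
}
say 4 ~~ Even.new;   # True
say 5 ~~ Even.new;   # False
```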

But perhaps the most important rule is smartmatching booleans. Perl doesn’t have them, and this makes when so much more complicated than most people realize. The problem is with statements like $_ > 42, which need boolean logic. Perl solves this using a complex heuristic that no one really can remember (no really, no one). Most surprisingly, that means that when /foo/ does not use smartmatching (this becomes obvious when the left hand side is an arrayref).

Raku uses a very different method to solve this problem. In Raku, when always smartmatches. Smartmatching against a Bool (like in $_ > 42), will always return that bool, so $foo ~~ True always equals True. This enables a wide series of boolean expressions to be used as when condition without problems. It’s a much simpler, and surprisingly effective method of dealing with this challenge.
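The following sketch, built around a made-up classify routine, mixes a type, a literal and a boolean expression in one given block. Each when is an ordinary smartmatch against its argument:

```raku
sub classify($x) {
    given $x {
        when Str      { 'a string' }       # type check
        when 1        { 'one' }            # numeric equality
        when $_ > 42  { 'more than 42' }   # Bool: matches when the expression is True
        default       { 'something else' }
    }
}

say classify('foo');   # a string
say classify(1);       # one
say classify(100);     # more than 42
say classify(7);       # something else
```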

Other uses

The other difference between smartmatching in Perl versus Raku is that it is actually used outside of given/when. In particular, selecting methods such as grep and first use it to great effect: @array.grep(1), @array.grep(Str), @array.grep(/foo/), @array.grep(&function), @array.grep(1..3), and @array.grep((1, *, 3)) all do what you probably expect them to do. Likewise it’s used in a number of other places where one checks if a value is part of a certain group or not, like the ending argument of a sequence (e.g. 1000, 1001 ... *.is-prime) and the flip-flop operators.
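A few of these in action, as a sketch with a small made-up array:

```raku
my @array = 1, 2, 'foo', 3.5, 'bar';

say @array.grep(Int);       # (1 2)
say @array.grep(Str);       # (foo bar)
say @array.grep(/oo/);      # (foo)
say @array.first(Rat);      # 3.5
say (1..10).grep(3..5);     # (3 4 5)
say (1..10).grep(* %% 4);   # (4 8)

# The sequence stops at the first value the end condition accepts.
say (1000, 1001 ... *.is-prime).tail;   # 1009
```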

Smartmatch is all about making code do what you mean, and it’s pretty useful and reusable for that.

RFC 43: Integrate BigInts (and BigRats) Support Tightly With The Basic Scalars

Intro

RFC 43, titled ‘Integrate BigInts (and BigRats) Support Tightly With The Basic Scalars’ was submitted by Jarkko Hietaniemi on 5 August 2000. It remains at version 1 and was never frozen during the official RFC review process.

Despite this somewhat “unofficial” seeming status, the rational (or Rat) numeric type, by default, powers all of the fractional mathematics in the Raku programming language today.

You might say that this RFC was “adopted, and then some.”

A legacy of imprecision

There is a dirty secret at the core of computing: most computations of reasonable complexity are imprecise. Specifically, computations involving floating point arithmetic have known imprecision dynamics even while the imprecision effects are largely un-mapped and under-discussed.

The standard logic is that because this imprecision is so negligible when considered in the context of an individual equation, it is taken for granted that the overall imprecision of the system is “fine.”

Testing the accuracy of this gut feeling in most systems, however, would involve re-creating the business logic of that given system to use a different, more accurate representation for fractional values. This is a luxury that most projects do not get.

Trust the data, question the handwaves

What could be much more worrisome, however, is the failure rate of systems that do attempt the conversion. In researching this topic I almost immediately encountered two third party anecdotes. Both come from a single individual and arrived within minutes of broaching the floating point imprecision question.1

One related the unresolved issue of a company that cannot switch their accounting from floating point to integer math – a rewrite that must necessarily touch every equation in the system – without “losing” money. Since it would “cost” the company too much to compute their own accounting accurately, they simply don’t.

Another anecdote related the story of a health equipment vendor whose equipment became less effective at curing cancer when they attempted to migrate from floating point to arbitrary precision.

In a stunning example of an exception that just might prove the rule of “it’s a feature, not a bug”, it turned out that the tightening up of the equation ingredients resulted in a less effective dose of radiation because the floating point was producing systemic imprecisions that made a device output radiation beyond its designed specification.

In both cases it could be argued that the better way to do things would be to re-work the systems so that expectations matched reality. I do have much more sympathy for continued use of the life-saving imprecision than I do for a company’s management to prefer living in a dream world of made up money than actually accounting for themselves in reality, though.

In fact, many financing and accounting departments have strict policies banning floating point arithmetic. Should we really only be so careful when money is on the line?

Precision-first programming

It is probably safe to say that when Jarkko submitted his RFC for native support of high-precision bigint/bigrat types, he didn’t necessarily imagine that his proposal might result in the adoption of “non-big” rats (that is, arbitrary precision rationals shortened into typically playful Perl-ese) as the default fractional representation in Perl 6.2

Quoting the thrust of Jarkko’s proposal, with emphasis added:

Currently Perl ‘transparently’ starts using double floating point numbers when the numeric values grow too large for the native integer types (int, long, quad) can no more hold quantities that large. Because double floats are at their heart a lie, they cannot truly represent large numbers accurately. Therefore sometimes when the application would prefer to stay accurate, the use of ‘bigints’ (and for division, ‘bigrats’) would be preferable.

Larry, it seems, decided to focus on a different phrase in the stated problem: “because double floats are at their heart a lie”. In a radical break from the dominant performance-first paradigm, it was decided that the core representation of fractional values would default to the most precise available – regardless of the performance implications.

Perl has always had a focus on “losing as little information as possible” when it comes to your data. Scalars dynamically change shape and type based on what you put into them at any given time in order to ensure that they can hold that new value.

Perl has also always had a focus on DWIM – Do What I Mean. Combining DWIM with the “lose approximately nothing” principle in the case of division of numbers, Perl 6 would thus default to understanding your meaning to be a desire to have a precise fractional representation of this math that you just asked the computer to compute.

Likewise, Perl has also always had a focus on putting in curbs where other languages build walls. Nothing would force the user to perform their math at the “speed of rational” (to turn a new phrase) as they would have the still-essential Num type available.

In this sense, nothing was removed – rather, Rat was added and the default behavior of representing parsed decimal values was to create a Rat instead of a Num. Many languages introduce rationals as a separate syntax, thus making precision opt-in (a la 1r5 in J).

In Perl 6 (and thus eventually Raku), the opposite is true – those who want imprecision must opt-in to floating point arithmetic instead via an explicit construction like Num.new(0.2), type coercion of an existing object a la $fraction.Num, or via the short-hand value notation of 0.2e0.
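The difference is easy to see with the classic 0.1 + 0.2 example. A minimal sketch:

```raku
say 0.1 + 0.2 == 0.3;          # True:  decimal literals are Rats, the math is exact
say 0.1e0 + 0.2e0 == 0.3e0;    # False: the e0 suffix opts in to floating point

say (0.1 + 0.2).^name;         # Rat
say 0.2.Num.^name;             # Num
say (1/3).nude;                # (1 3), the numerator and denominator
```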

A visual example thanks to a matrix named Hilbert

In pursuit of a nice example of programming concerns solved by arbitrary precision rationals that are a bit less abstract than the naturally unfathomable “we have no actual idea how large the problem of imprecision is in effect on our individual systems let alone society as a whole”, I came across the excellent presentation from 2011 by Roger Hui of Dyalog where he demonstrates a development version of Dyalog APL which included rational numbers.

In his presentation he uses the example of Hilbert matrices, a very simple algorithm for generating a matrix of any size that will be of notorious difficulty for getting a “clean” identity matrix (if that’s not clear at the moment, don’t worry, the upcoming visual examples should make this clear enough to us non-experts in matrix algebra).

Here is a (very) procedural implementation for generating our Hilberts for comparison (full script in this gist):3

my %TYPES = :Num(1e0), :Rat(1);
subset FractionalRepresentation of Str where {%TYPES{$^t}:exists};
sub generate-hilbert($n, $type) {
    my @hilbert = [ [] xx $n ];
    for 1..$n -> $i {
        for 1..$n -> $j {
            @hilbert[$i-1;$j-1] = %TYPES{$type} / ($i + $j - 1);
        }
    }
    @hilbert
}

One of the most important aspects of having rationals as a first-class member of your numeric type hierarchy is that only extremely minimal changes are required of the math to switch between rational and floating point.
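As a sketch of how small that switch can be, the made-up helper below sums the first row of a Hilbert matrix. The only thing that selects the number type is the “one” that is passed in, much like %TYPES does above:

```raku
# Hypothetical helper: sum of the first row of an n×n Hilbert matrix.
sub first-row-sum($one, $n) {
    (1..$n).map({ $one / $_ }).sum
}

say first-row-sum(1, 3).^name;     # Rat
say first-row-sum(1, 3).nude;      # (11 6), i.e. exactly 11/6
say first-row-sum(1e0, 3).^name;   # Num
```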

There is a danger, as Roger Hui notes in both his video and in a follow-up email to a query I sent about “where the rationals went”, that rational math will seep out into your application and unintentionally slow everything down. This is a valid concern that I will return to in just a bit.

Floating Hilbert

Here are the results of the floating point dot product between a Hilbert and its inverse – an operation that generally results in an identity matrix (all 0’s except for a straight diagonal of 1’s from the top left corner down to the bottom right).

Floating Hilbert
         1       0.5  0.333333      0.25       0.2
       0.5  0.333333      0.25       0.2  0.166667
  0.333333      0.25       0.2  0.166667  0.142857
      0.25       0.2  0.166667  0.142857     0.125
       0.2  0.166667  0.142857     0.125  0.111111
 
Inverse of Floating Hilbert
     25    -300     1050    -1400     630
   -300    4800   -18900    26880  -12600
   1050  -18900    79380  -117600   56700
  -1400   26880  -117600   179200  -88200
    630  -12600    56700   -88200   44100

Floating Hilbert ⋅ Inverse of Floating Hilbert
            1             0            0             0            0
            0             1            0  -7.27596e-12  1.81899e-12
  2.84217e-14  -6.82121e-13            1  -3.63798e-12            0
  1.42109e-14  -2.27374e-13  2.72848e-12             1            0
            0  -2.27374e-13  9.09495e-13  -1.81899e-12            1

All those tiny floating point values are “infinitesimal” – yet some human needs to choose at what point of precision we determine the cutoff. Since we “know” that the inverse dot product is supposed to yield an identity matrix for Hilberts, we can code our algorithm to translate everything below e-11 into zeroes.

But what about situations that aren’t so certain? Some programmers undoubtedly write formal proofs of the safety of using a given cutoff – however I doubt any claims that this population represents a significant subset of programmers based on lived experience.
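To make that human decision visible, here is a sketch that cleans up the third row of the product shown above. The cutoff value and the variable names are assumptions picked for this illustration, not part of the original script:

```raku
my $cutoff = 1e-11;   # chosen by a human, not by the math
my @row = 2.84217e-14, -6.82121e-13, 1e0, -3.63798e-12, 0e0;

# Everything smaller in magnitude than the cutoff is declared to be zero.
my @cleaned = @row.map: { abs($_) < $cutoff ?? 0 !! $_ };
say @cleaned;   # [0 0 1 0 0]
```

Pick a cutoff that is too large and real data disappears; pick one that is too small and the noise stays.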

Rational Hilbert

With the rational representation of Hilbert, it’s a lot easier to see what is going on with the Hilbert algorithm as the pattern in the rationals clearly evokes the progression. This delivers an ability to “reason backwards” about the numeric data in a way that is not quite as seamless with decimal notation of fractions.4

Rational Hilbert
    1  1/2  1/3  1/4  1/5
  1/2  1/3  1/4  1/5  1/6
  1/3  1/4  1/5  1/6  1/7
  1/4  1/5  1/6  1/7  1/8
  1/5  1/6  1/7  1/8  1/9

Inverse (same output, trimmed here for space)

Rational Hilbert ⋅ Inverse of Rational Hilbert
  1  0  0  0  0
  0  1  0  0  0
  0  0  1  0  0
  0  0  0  1  0
  0  0  0  0  1

Notice the lack of ambiguity in both the initial Hilbert data set and the final output of the inverse dot product. There is no need to choose any threshold values, thus no programmer decisions come in between the result of the equation provided by the computer and the result of the equation as stored by the computer or presented to the user.
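For readers who want to reproduce the exact result outside of Raku, the following is a rough sketch in Python, with `fractions.Fraction` standing in for Raku's Rat. The Gauss-Jordan `inverse` below is my own illustrative choice and is not claimed to be how Math::Matrix computes its inverse. The function names are likewise hypothetical.

```python
from fractions import Fraction


def hilbert(n):
    """n x n Hilbert matrix with exact entries 1 / (i + j + 1), 0-indexed."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def inverse(matrix):
    """Gauss-Jordan elimination over Fractions; assumes the matrix is invertible."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + ident for row, ident in zip(matrix, identity(n))]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot becomes exactly 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The right-hand half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


def dot(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


H = hilbert(5)
H_inv = inverse(H)
# Exact comparison, with no tolerance involved anywhere.
print(dot(H, H_inv) == identity(5))  # True
```

The first row of `H_inv` comes out as 25, -300, 1050, -1400, 630, matching the inverse printed earlier, and the final comparison is a plain `==` with no threshold.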

Consider again the fact that it is useless to do an equality comparison in floating point math – the imprecision of the underlying representation forces users to choose some decimal place to end the number they check against. There is no possible model that can take into account the total degree of impact this human interaction (in aggregate) creates on top of the known and quantifiable imprecisions that are usually the basis of the “it’s good enough” argument.
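A tiny illustration of that point, again sketched in Python rather than Raku (`math.isclose` and `fractions.Fraction` are standard library, and the tolerance values are arbitrary numbers chosen for the example):

```python
import math
from fractions import Fraction

float_sum = 0.1 + 0.2

# Direct equality fails because neither side is stored exactly.
print(float_sum == 0.3)                             # False

# The answer now depends on a tolerance that somebody had to pick.
print(math.isclose(float_sum, 0.3, rel_tol=1e-9))   # True
print(math.isclose(float_sum, 0.3, rel_tol=1e-17))  # False

# With exact rationals there is nothing to pick.
rational_sum = Fraction(1, 10) + Fraction(2, 10)
print(rational_sum == Fraction(3, 10))              # True
```

Both tolerance values are "reasonable" in some context, and they give opposite answers to the same question.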

The cost of precision

In summary, the nice equations about maximum loss per equation are far from the entire story. It is in recognition of that fact that Raku does its best to ensure that the eventual user of whatever Raku-powered system is not adversely impacted by unintentional or negligent errors in the underlying math used by the programmer.

It nevertheless remains a controversial decision that has a significant impact on baseline Raku performance. Roger Hui’s warning about rational math “leaking out” into the business application is borne out by the timing results I get for the current implementations in Math::Matrix:

raku rationale-matrix.p6 --timings-only 8
Created 'Num' Hilbert of 8x8: 0.03109376s
Created 'Rat' Hilbert of 8x8: 0.0045802s
Starting 'Num' Hilbert test at 2020-08-10T08:38:20.147506+02:00
'Num' Hilbert inverse calculation: 4.6506456s
'Num' dot product: 4.6612038s
Starting 'Rat' Hilbert test at 2020-08-10T08:38:24.808709+02:00
'Rat' Hilbert inverse calculation: 5.0791889s
'Rat' dot product: 5.0791889s

Considering how unlikely it is for a true floating point version to run so close to the speed of the rational one, this simple benchmark appears to support Roger’s warning about rational math “leaking out” into the larger system and causing significant performance degradation.5

So even though we ourselves took the extra step to try and get floating point (im-)precision and thus speeds, we were thwarted in our attempts. I believe this is a valid concern for the many use cases that benefit from floating point math without it causing any harm.

Potential futures of even more rational and precise rationality

The new capabilities provided by the upcoming introduction of Raku AST include the ability to modify/extend the behavior of dispatch. It then becomes quite feasible for a user-space module to be able to tweak the behavior of core to the extent that a more forceful approach to using floating point representation could be applied.

In the current preliminary conceptual phase, the idea would be to provide a means to test the necessity of rational representation to the outcome of a system. This could be achieved by making the underlying numeric type “promotion” logic configurable. It then becomes possible to imagine replacing the Rat type with a Bell representation that can track the overall loss of precision throughout the system.

If that is below a given threshold, the same configurability could be used to remove Rat from the hierarchy entirely, thus ensuring floating point performance.

Balancing concerns

The concerns of unintentional and/or unrecoverable performance degradation remain extremely valid, and until I began the research for this post I was unconcerned about rational performance.

This lack of concern relied on a huge caveat – that floating point performance is available and guarantee-able to anyone who wants to ensure its use throughout their system.

Unfortunately this is not the case in current Rakudo (i.e., the reference implementation of Raku).

I am still likely to keep provable precision as a personal requirement when judging whether a system meets my expectations for “21st century.”

However, I now disavow my previous feelings that rational math as a default fractional representation is the best – or, if you’ll allow me to put it more playfully, most rational – route for a given language implementation to take.6

Conclusion

Defaulting to rational representation was not intended to force users down to the performance floor of rationals, and it is a natural extension of Perlish philosophy to strive towards being even better at giving the user exactly – without surprises – what they want.

Rather, Raku chooses a precision-first philosophy because it fits into the overall philosophy of keeping as true a representation of a user’s demands as possible, regardless of current computation cost – without locking them into the expectations of the language implementor either.

To the extent that Raku does not yet get this quite perfect, there remains a certain quality of rebellion and original thinking on this topic specifically and also throughout the language design that puts Raku quite close to providing powerful yet seamless mechanisms for choosing the best fractional representation for any given system.


  1. I say this because it implies that these are hardly the only two occasions where people have chosen an imprecise representation, have seen that imprecision manifest at significant scale, and have chosen to continue with the imprecise representation for mostly dubious reasons.↩︎
  2. To date I’m not aware of any other programming languages designed and implemented in the 21st century that take this same “precision-first” approach. It apparently remains a controversial/unpopular choice in language design.↩︎
  3. This is called from inside some glue that separates the timings and type checking from the underlying details.↩︎
  4. I patched a local version of Math::Matrix to print its rationals always as fractions, so your output will look different for the rational Hilbert (i.e., the same as fractional Hilbert) if you end up running the code in this gist.↩︎
  5. If the timings of those calculations depress you, it is worth pointing out that Raku developer lichtkind has developed a version of the library that produces timings like these instead:
    raku rationale-matrix.p6 --timings-only 8
    Created 'Num' Hilbert of 8x8: 0s
    Created 'Rat' Hilbert of 8x8: 0.006514s
    Starting 'Num' Hilbert test at 2020-08-10T11:06:34.567898+02:00
    'Num' Hilbert inverse calculation: 0.2784531s
    'Num' dot product: 0.2784531s
    Starting 'Rat' Hilbert test at 2020-08-10T11:06:34.846350+02:00
    'Rat' Hilbert inverse calculation: 0.06900818s
    'Rat' dot product: 0.0846446s

    It is likely that this work will be incorporated in the default release at some point. Both systems work much better but here the asymmetry of Num to Rat performance is looking even more suspicious than the near-parity of the slower, released version. I will do a deeper dive into profiling both versions of the module in a post on my personal blog at some point soon.↩︎

  6. I want to extend a deep and heartfelt thanks to not only Roger and his email but especially dzaima, ngn, Marshall, and the whole APL Orchard for helping me to arrive at a much deeper understanding of the varieties of class and proportion that are countervailing forces for any hard-line stance relating to rational arithmetic. My positions were worded poorly and argued brusquely, and were not a faithful match for your patience. I hope it is some consolation, at least, that you have changed my mind.↩︎