It’s that time of the year

When we start all over again with advent calendars, publishing one article a day until Christmas. This is going to be the first full year with Raku being called Raku, and the second year since we moved to this new site. However, it’s going to be the 12th year (after this first article) in a row with a Perl 6 or Raku calendar, previously published in the Perl 6 Advent Calendar blog. It is also the 5th year since the Christmas release, which was announced in the advent calendar of that year.

Anyway. Here we go again! We have lined up a full (or eventually full by the time the Advent Calendar is finished) set of articles on many different topics, but all of them about our beloved Raku.

So, enjoy, stay healthy, and have -Ofun reading the nice list of articles this year will be bringing.

Day 4: Parsing Clojure namespace forms using Raku grammars

One day, I started wondering if it would be possible to parse Clojure namespace forms and generate a dependency graph of the various namespaces used in a real-world Clojure project. While that was the original motivation, I ended up down the Raku grammar rabbit hole, and had an enjoyable time learning how to use them. I’m glad you’re joining me in reliving that journey.

Background

Grammars

Informally speaking, grammars can be thought of as a set of rules that describe a language. With these rules, one can meaningfully parse (to make sense of, or deconstruct into its grammatical components) a piece of text. It turns out that this is a common task in computing. We need to frequently translate programs from one language to another. This is the job of a compiler. Before being able to translate it, the compiler needs to know whether the original program is even valid, according to the language’s grammar.

While we have explained in theory what grammars are, Raku grammars help us model abstract grammars as a programming construct (the grammar keyword and its adjacent helpers) using which we can perform parsing tasks. It is important to understand this distinction.

First class grammars are considered one of the revolutionary features of Raku. Normally, you’d find grammars as a library or a standalone tool, but Raku has embraced it wholesale, and has a powerful implementation of grammars which makes light work of most parsing tasks.

Clojure

Clojure is a modern lisp, and happens to be the language I use in $DAYJOB. At the top of most Clojure files, there is a namespace form importing various internal and external namespaces. This way we can neatly organize our project into many separate files, rather than having to put them all into one big file. We are shortly going to design a grammar that can parse these namespace forms.

Grammar::Tracer

Grammar::Tracer is helpful in figuring out where your grammar fails to match, and is invaluable in debugging. Make sure you do zef install Grammar::Tracer before running the code examples.

Let’s start cookin’

A trivial example

A Clojure namespace is a Lisp form, which as expected starts with an open paren and ends with a close paren. Let’s write a grammar for one of the simplest Clojure forms, the empty list.

()

grammar EmptyLispForm {
    token TOP { <lparen> <rparen> }

    token lparen { '(' }
    token rparen { ')' }
}

This is one of the simplest possible grammars we can write. Parsing always starts with the TOP token, and recurses into the various component tokens from there. We just have two such tokens lparen and rparen which are used to denote the left and right parens. Play around with trivial.raku to see how we can parse this.
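In case trivial.raku is not at hand, here is a minimal sketch of how the grammar above might be exercised. The variable names and the second, deliberately broken input are illustrative assumptions, not taken from the original script.

```raku
grammar EmptyLispForm {
    token TOP { <lparen> <rparen> }

    token lparen { '(' }
    token rparen { ')' }
}

# .parse returns a Match object on success and Nil on failure.
my $ok  = EmptyLispForm.parse('()');
my $bad = EmptyLispForm.parse('(]');   # hypothetical bad input

say $ok.defined  ?? "'()' parsed" !! "'()' did not parse";
say $bad.defined ?? "'(]' parsed" !! "'(]' did not parse";

# Named captures are available on the Match under the token name.
say $ok<lparen>.Str;
```

Adding use Grammar::Tracer; above the grammar definition should print a trace of every token that is tried, which is handy once the grammars get bigger.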

Warmup: Slaying a cliched interview question using Raku grammars

Q: Write a program that checks that parentheses are balanced.

For example:

() ;; balanced
(()) ;; balanced
(()(())) ;; balanced
(()(((()))) ;; unbalanced

grammar BalancedParens {
    token TOP { <balanced-paren> }

    token balanced-paren { <lparen> <balanced-paren>* <rparen> }

    token lparen { '(' }
    token rparen { ')' }
}

It’s likely I’ve made this more wordy than necessary, but that’s still only six rather-readable lines.

Now, under time pressure, which seems more appealing: coding up a stack and fiddling with corner cases, or using grammars with the awesomeness of Grammar::Tracer to quickly hack out a declarative solution?

It turns out that we have just tackled one of the trickiest aspects of writing a real-world grammar: dealing with nested structures. As programmers, we know that when we see nested structures we will have to deal with recursion in some form.

You can play around with the balanced-parens.raku program on the console, and observe how the input is parsed.

Note: It turns out there is a better way to parse nested structures, but this is fine for now.
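As a rough sketch of what such a script could look like (the loop and variable names are assumptions, not the contents of balanced-parens.raku), the grammar can be run over the four example inputs like this:

```raku
grammar BalancedParens {
    token TOP { <balanced-paren> }

    token balanced-paren { <lparen> <balanced-paren>* <rparen> }

    token lparen { '(' }
    token rparen { ')' }
}

# The four inputs from the question, without the trailing comments.
my @inputs = '()', '(())', '(()(()))', '(()(((())))';

for @inputs -> $input {
    # A Match is truthy, the Nil returned by a failed parse is falsy.
    my $verdict = BalancedParens.parse($input) ?? 'balanced' !! 'unbalanced';
    say "$input ;; $verdict";
}
```

Note that this grammar expects a single outermost pair of parens, so an input such as ()() is rejected as well, which is fine for Lisp forms but stricter than the interview question.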

Parsing our first namespace form

Let’s try to parse this simple namespace declaration:

;; 1   3
   |   |
;; (ns my-own.fancy.namespace)
    |                        |
;;  2                        4

While this is easy, it is going to be an important building block in tackling more complicated namespace forms. Let’s break this down into its four component lexemes. We can see the open paren, the ns keyword, the namespace name my-own.fancy.namespace, and finally the close paren. That’s it! Let’s tackle these individual pieces using a grammar.

grammar SimpleNs {
    token TOP { <simple-ns> }

    #                  1        2            3         4
    #                  |        |            |         |
    token simple-ns { <lparen> <ns-keyword> <ns-name> <rparen> }

    token ns-keyword { 'ns' }
    token ns-name { <.ns-name-component>+ % '.' }
    token ns-name-component { ( <.alnum> | '-' )+ }

    token lparen { '(' }
    token rparen { ')' }
}

Over here we can see that we have translated this into a simple Raku grammar. You could argue that defining simple-ns is not even required, we could have just put it in TOP directly, but anyway.

We will need to deal a little with regexes here. In the various flavours of regexes, + usually means one or more. | has a slightly different meaning from what you’re expecting, but you can check the regex documentation for all the details on the difference between | and ||. Loosely speaking, we are saying that a namespace component, i.e. the thing between two dots, is made up of one or more alphanumeric characters or hyphens. Now, if there is a rule saying a namespace has to start with an alphabetic character, and not a number, the grammar will become a little more complex, but this is a pedagogic example, so we’ll not be too pedantic.

I expect eagle-eyed readers to point out a few things:

  1. Where is <.alnum> being defined, and why does it have a dot before it?

alnum is predefined. The reason it has a . before it is that we are not interested in capturing each letter; it’s too low level. We are interested in capturing a top-level token like ns-name instead, and not each individual character. Play around with the code examples by adding and removing a dot from various tokens and see the difference in Grammar::Tracer’s output.
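To make the effect of the dot concrete, here is a small sketch with two throwaway grammars (Capturing and NonCapturing are hypothetical names used only for this comparison) that differ in nothing but the dot:

```raku
grammar Capturing {
    token TOP  { <word> }
    token word { <alnum>+ }     # every character is captured under 'alnum'
}

grammar NonCapturing {
    token TOP  { <word> }
    token word { <.alnum>+ }    # characters are matched but not captured
}

my $with    = Capturing.parse('abc');
my $without = NonCapturing.parse('abc');

# One sub-match per character when capturing ...
say $with<word><alnum>.elems;
# ... and no named sub-captures at all with the dot.
say $without<word>.hash.keys.elems;
```

Both grammars accept exactly the same input; the dot only changes how much detail ends up in the Match object.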

  2. What does % mean?

This is a very useful convenience to describe patterns where something is interspersed between a bunch of other things. For example, we can have a namespace like foo.bar.baz or we could have an IPv4 address 192.168.0.1 where integers are separated by dots.

token ns-name { <.ns-name-component>+ % '.' }

This means that ns-name is made up of one or more ns-name-components (denoted by the +), separated by a . (denoted by the %).
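The IPv4 case mentioned above can be sketched the same way. DottedQuad and octet are names made up for this illustration, and the grammar deliberately does not check that each number is at most 255.

```raku
grammar DottedQuad {
    # Exactly four octets, separated by (not terminated with) dots.
    token TOP   { <octet> ** 4 % '.' }
    token octet { \d ** 1..3 }
}

my $match = DottedQuad.parse('192.168.0.1');

# The separators are matched but are not part of the captures.
say $match<octet>.map(*.Str).join(' | ');

# A trailing separator is not allowed by %, so this input is rejected.
say DottedQuad.parse('192.168.0.').defined;
```

The % modifier attaches to whatever quantifier precedes it, so it works with ** 4 here just as it does with + in ns-name.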

Okay, so this should work I guess? Let’s see what happens when we run the code!

No, that didn’t work. As Grammar::Tracer helpfully tells us, we did not account for the space after ns. In traditional compiler theory, there is a process of tokenisation, where some lightweight regexes are used to separate the program into its component lexemes and discard all the extraneous spaces before the parser takes over. However, here we will not do that, and we’ll deal with it in the grammar itself. Now, it could be argued whether that is a good decision, but that’s another discussion. Let’s add some allowance for whitespace and see what happens. While building the full Clojure NS grammar, I got myself into a position where I liberally used to sprinkle <.ws>* indicating zero or more whitespace characters in places where I felt that we should allow optional whitespace, as you would expect in a real-world program.

token simple-ns { <lparen> <ns-keyword> <.ws> <ns-name> <rparen> }

With that tiny addition, we are now able to parse the simple namespace form. You can play around with simple-ns.raku.
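For reference, a self-contained sketch of the grammar with the whitespace fix applied, run against the example form. Only the two lines at the bottom are additions; they are an assumption about how one might drive it, not the contents of simple-ns.raku.

```raku
grammar SimpleNs {
    token TOP { <simple-ns> }

    token simple-ns { <lparen> <ns-keyword> <.ws> <ns-name> <rparen> }

    token ns-keyword { 'ns' }
    token ns-name { <.ns-name-component>+ % '.' }
    token ns-name-component { ( <.alnum> | '-' )+ }

    token lparen { '(' }
    token rparen { ')' }
}

my $match = SimpleNs.parse('(ns my-own.fancy.namespace)');
say $match<simple-ns><ns-name>.Str;

# <.ws> refuses to match in the middle of a word, so the keyword and the
# name cannot be run together.
say SimpleNs.parse('(nsmy-own.fancy.namespace)').defined;
```

The default ws token matches zero or more whitespace characters but fails between two word characters, which is why a single <.ws> is enough to demand a separator after ns.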

Let’s make our lives a little more difficult

Okay, now we are getting the hang of this, so let’s see a namespace form which is a little more realistic.

(ns my-amazing.module.core
  (:require [another-library.json.module :as json]
            [yet-another.http.library :as http]))

This is a realistic namespace form which we shall parse by adding support for the :require form where other libraries are imported and given a short nickname.

Can this be done? You bet!

grammar RealisticNs {
    token TOP { <realistic-ns> }

    token realistic-ns { <lparen>
                           <ns-keyword> <.ws> <ns-name> <.ws>
                           <require-form>
                         <rparen> }

    token ns-keyword { 'ns' }

    token ns-name { <.ns-name-component>+ % '.' }
    token ns-name-component { ( <.alnum> | '-' )+ }

    token require-form { <lparen>
                           <require-keyword> <.ws> <ns-imports>
                         <rparen> }

    token require-keyword { ':require' }

    token ns-imports { <ns-import>+ % <.ws> }

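    token ns-import { <lsquare>
                        <imported-ns-name> <.ws> ':as' <.ws> <ns-nickname>
                      <rsquare> }

    # A separate token (rather than reusing ns-name) so that imported
    # namespaces can be told apart from the namespace being declared.
    token imported-ns-name { <.ns-name-component>+ % '.' }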

    token ns-nickname { <.alnum>+ }

    token lsquare { '[' }
    token rsquare { ']' }
    
    token lparen { '(' }
    token rparen { ')' }
}

Nothing too scary as yet. We can see how the grammar evolves. At the top level, in realistic-ns we have added an extra token called <require-form> and we flesh out the details later. We can manage complexity in this manner, so that we have the ability to zoom in and out of the details as necessary.

Using the parsed data

Now that we have been able to parse the data, we need to make use of what we’ve parsed. This is where Actions come into the picture.

When we do RealisticNs.parse(...), that returns a Match object corresponding to the RealisticNs grammar. While we can query that object to get the pieces of data that we require, it is less cumbersome to use Actions to build up the data we are actually interested in.

Given a namespace, we want to extract:

  1. Namespace name
  2. Imported namespaces
  3. Imported namespace nicknames

The simple principle to follow is that for the token we are interested in, we create an Action method with the same name. When the Match object is being built up, the token Action methods run when the token is matched. Grammars are parsed top-down, but data is built up by the Action methods in a bottom-up fashion.

class RealisticNsActions {
    has $!ns-name;
    has $!imported-namespaces = SetHash.new;
    has $!ns-nicknames = SetHash.new;

    method TOP($/) {
        make {
            ns-name => $!ns-name,
            ns-imports => $!imported-namespaces,
            ns-nicknames => $!ns-nicknames
        }
    }
    
    method ns-name($/) {
        $!ns-name = $/.Str;
    }

    method imported-ns-name($/) {
        $!imported-namespaces{$/.Str}++;
    }

    method ns-nickname($/) {
        $!ns-nicknames{$/.Str}++;
    }
}

Here, we have created a RealisticNsActions class and created methods for the tokens whose data we want to do something with. We don’t need to touch the grammar definition at all (which keeps it clean). The only extra thing we need to do while parsing is to pass the Actions object like this, which instructs Raku to run the token Action methods when it matches those tokens.

sub MAIN() {
    # .trim drops a trailing newline, which TOP would otherwise fail to match
    my $s = RealisticNs.parse(slurp("realistic.clj").trim, actions => RealisticNsActions.new);
    say $s.made;
}

In an Actions class, the TOP method can be used to generate the final payload that we can access by calling the made method. For more information on make and made, see the official Raku grammar tutorial. In short, arbitrary payloads created using make can be accessed by using made.

When we run this program, this is what we see:

{ns-imports => SetHash(another-library.json.module
                       yet-another.http.library),
 ns-name => my-amazing.module.core,
 ns-nicknames => SetHash(http json)}

As expected, we can see the parsed data is collected in a nice Hash, but wouldn’t it be nice to know that yet-another.http.library’s nickname in the namespace is http? This is where we run into a design issue in the Actions class we have just written. We are building payloads at a lower level than we should.

We need more structure to be able to get the namespace -> namespace-nickname mapping that we want. A quick glance at the grammar tells us that we can find it at the ns-import level, because its subtokens are imported-ns-name and ns-nickname, and those two are the pieces of data that we want.

Let’s write an Action method for ns-import!

class RealisticNsActions {
    has $!ns-name;
    has $!imported-namespaces = SetHash.new;
    has %!ns-nicknames;

    method TOP($/) {
        make {
            ns-name => $!ns-name,
            ns-imports => $!imported-namespaces,
            ns-nicknames => %!ns-nicknames
        }
    }

    method ns-name($/) {
        $!ns-name = $/.Str;
    }

    method ns-import($match) {
        #say $match;
        
        my $imported-ns-name = $match<imported-ns-name>.Str;
        my $ns-nickname = $match<ns-nickname>.Str;

        $!imported-namespaces{$imported-ns-name}++;
        %!ns-nicknames{$imported-ns-name} = $ns-nickname;
    }
}

which results in the output:

{ns-imports => SetHash(another-library.json.module 
                       yet-another.http.library),
 ns-name => my-amazing.module.core,
 ns-nicknames => {another-library.json.module => json,
                  yet-another.http.library => http}}

Now we have all the information we require. You can play around with realistic-ns-with-actions.raku.
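As a variation on the class above (not the approach used in the article), the same mapping can be built without any attributes by letting each Action method attach its payload with make and having the parent collect the children with made. The sketch below uses a cut-down grammar, RequireOnly, that is an assumption made purely to keep the example short; it only understands a list of import vectors.

```raku
grammar RequireOnly {
    token TOP              { <ns-import>+ % <.ws> }
    token ns-import        { '[' <imported-ns-name> <.ws> ':as' <.ws> <ns-nickname> ']' }
    token imported-ns-name { <.name-part>+ % '.' }
    token name-part        { [ <.alnum> | '-' ]+ }
    token ns-nickname      { <.alnum>+ }
}

class RequireOnlyActions {
    # Each import attaches a Pair: imported namespace => nickname.
    method ns-import($/) {
        make ($<imported-ns-name>.Str => $<ns-nickname>.Str);
    }

    # TOP runs last and gathers the Pairs made by its children into a Hash.
    method TOP($/) {
        make $<ns-import>.map(*.made).Hash;
    }
}

my $input = '[another-library.json.module :as json] [yet-another.http.library :as http]';
my $match = RequireOnly.parse($input, actions => RequireOnlyActions.new);
my %nicknames = $match.made;

say %nicknames<yet-another.http.library>;
```

Whether to accumulate state in attributes or to pass payloads upwards with make is largely a matter of taste; the second style keeps the Actions object stateless, so one instance can be reused across parses.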

Match objects in Action classes

The result of a successful parse is a Match object. This contains the entire hierarchical parsed structure.

Whatever we did in the ns-import method, we could have done at a higher level, but the Match object at that level would need more querying. This is because the method will receive its “view” into the full Match object, i.e. TOP method will have the entire Match object, and the ns-import method will have a more restricted view using which we could easily extract out imported-ns-name and ns-nickname. This might not make sense immediately, but after dealing with Match for some time, you’ll see how it makes sense to extract out useful information at the lowest possible level which allows for easier querying. At the TOP level, to extract out ns-nickname we would have had to query realistic-ns -> require-form -> ns-imports -> ns-import -> ns-nickname which is cumbersome to say the least, and because there are multiple of these, there will be an array of ns-import.
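To see how cumbersome the top-level query is, here is a sketch that walks the whole path from the Match returned by parse. NestedNs is a condensed stand-in for the grammar discussed above (brackets and keywords are inlined as literals to save space), so treat it as an assumption rather than the article's exact grammar.

```raku
grammar NestedNs {
    token TOP              { <realistic-ns> }
    token realistic-ns     { '(' 'ns' <.ws> <ns-name> <.ws> <require-form> ')' }
    token require-form     { '(' ':require' <.ws> <ns-imports> ')' }
    token ns-imports       { <ns-import>+ % <.ws> }
    token ns-import        { '[' <imported-ns-name> <.ws> ':as' <.ws> <ns-nickname> ']' }
    token ns-name          { <.name-part>+ % '.' }
    token imported-ns-name { <.name-part>+ % '.' }
    token name-part        { [ <.alnum> | '-' ]+ }
    token ns-nickname      { <.alnum>+ }
}

my $source = q:to/END/.trim;
    (ns my-amazing.module.core
      (:require [another-library.json.module :as json]
                [yet-another.http.library :as http]))
    END

my $match = NestedNs.parse($source);

# From the top we have to spell out every level, and ns-import is an
# array because the token matched more than once.
my @imports = $match<realistic-ns><require-form><ns-imports><ns-import>.list;

for @imports -> $import {
    say "{$import<imported-ns-name>} is known as {$import<ns-nickname>}";
}
```

Inside an ns-import Action method the same two values are one lookup away, which is the argument for extracting data at the lowest level that has everything you need.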

To visualise what is happening in each Action method, add a say $match or say $/ as appropriate to see what the structure looks like at that level.

Development Style

Raku does not as yet have a full-featured REPL environment that a Lisp programmer may be used to. That is just something that needs to be worked around.

The Raku REPL could be used to quickly test short one-line snippets of code, mostly as the simplest of sanity checks, but it gets unwieldy with anything more complex than that.

To get around this, I used a TDD approach of sorts, where I would take a real-world Clojure project, and run the (rapidly evolving) grammar on all the Clojure files. With every “correct” change the number of parsed files would increase, and with every “wrong” change, the number of parsed files would decrease.
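A rough sketch of that feedback loop follows. The grammar TinyNs, the helper count-parsed and the inline sample sources are all hypothetical; in practice the sources would come from slurping the namespace forms out of the project's files.

```raku
# A deliberately incomplete grammar: it only knows the bare (ns name) form.
grammar TinyNs {
    token TOP     { '(' 'ns' <.ws> <ns-name> <.ws> ')' }
    token ns-name { [ <.alnum> | '-' | '.' ]+ }
}

# Returns how many of the given sources the grammar can parse.
sub count-parsed(@sources --> Int) {
    @sources.grep({ TinyNs.parse($_.trim).so }).elems
}

my @sources =
    '(ns app.core)',
    "(ns app.util )\n",
    '(ns app.db (:require [clojure.string :as str]))';   # not supported yet

say "parsed {count-parsed(@sources)} of {@sources.elems} sources";
```

Re-running this after every change to the grammar gives a single number to watch: it should go up as support for more syntax is added and never go down.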

Next steps

With what we’ve tackled so far, it is not much of a stretch to parse real world Clojure namespace declarations as well. For example, we have added support for importing Clojure namespaces using the :require form. Similarly, we could add support for the :import form using which we import Java libraries. The same iterative approach can be used to parse increasingly complicated code.

With the final Clojure NS Grammar, I have been able to parse hundreds of Clojure files. Using this grammar to generate the dependency graphs is a story left for another day. You may notice that I have to handle lots of syntactic variation, and optional whitespace. I believe we have extracted the crux of the implementation into the easily understood grammar that we have discussed in detail.

Caveat emptor (and a few big ones):

There is a chance that when a Grammar is improperly specified, the program just hangs. What is happening is similar to pathological backtracking. This is not something only Raku grammars, or grammars in general, suffer from. A badly written regex can have the same effect too, so one must be aware of this contingency before putting a regex/grammar in the hot path of a highly-critical application. There are high-profile post mortems discussing how poorly-written regexes brought web applications to their knees.

While dealing with regexes the common advice is to avoid using .* (any number of any character), and instead be more restrictive in what that matches with, and that advice holds true for grammars too. Be as restrictive as possible, and relax certain things when it is unavoidable. In the final code, I was able to parse hundreds of real Clojure namespace declarations, but while evolving the grammar I did run into this behaviour a few times, and that was remedied by tweaking the grammar as described. Being able to reliably fix it is difficult, but in time you’ll be able to do it intuitively.
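A small sketch of the "be restrictive" advice, using a quoted string as the example. The two token names are made up for this illustration and are not part of the Clojure grammar.

```raku
# Too permissive: .* swallows everything up to the end of the input,
# including the closing quote, and a token never gives characters back.
my token greedy-string  { '"' .* '"' }

# Restrictive: only non-quote characters are allowed between the quotes,
# so the match has exactly one place where it can end.
my token careful-string { '"' <-["]>* '"' }

my $input = '"hello" and "world"';

say so $input ~~ /^ <greedy-string>/;       # False
say ($input ~~ /^ <careful-string>/).Str;   # "hello"
```

Declared as a backtracking regex instead of a token, the permissive version would have to retry shorter and shorter prefixes to find a closing quote, which is the kind of work that gets out of hand on large or hostile inputs. The restrictive version never needs to reconsider.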

The other thing to be wary of in a production system is this: even if a grammar is well-specified, could a malicious user craft an input that triggers this pathological behaviour? Given enough motivation, anything is possible, so all bets are off. 🤞

Grammars are more powerful than regular expressions (https://en.wikipedia.org/wiki/Chomsky_hierarchy). The traditional warning of not using regexes to parse things they aren’t powerful enough to parse does not apply here, assuming the grammar is well-written, but a well-written grammar is as elusive as a bug-free program. I don’t mean the grammar in the original design of the language, but the grammar you are writing to parse something. Without properly dealing with corner-cases, and adequately handling errors, you are going to end up with a brittle and frustratingly difficult to debug program.

Conclusion

Now that we have reached the end, I hope this is another article floating out on the internet that will help fledgling Rakoons (can Rakoons fly? 🤔) understand how Raku grammars can be a powerful tool to tackle any parsing tasks that might come their way.

End Credits

  • All the Raku designers and contributors over the years.
  • Jonathan Worthington for creating Grammar::Tracer.
  • All the friendly Rakoons (@JJ) who helped with reviewing this article.

Day 3: Literate Programming with Raku

Different programming language communities have differing cultures. Some are more pragmatic, others more idealistic. Some place great emphasis on having code be thoroughly readable and understandable for anyone who joins an existing project, and some prefer writing out clear and in-depth documentation.

Raku, inheriting one of the best parts of Perl, has a community that writes great documentation.

What is Literate Programming?

Literate Programming is an alternate take on documentation. Instead of having the code be the central element and writing documentation around it, in Literate Programming we write a document that contains the essential parts of our program. In this way we integrate the code into our natural language in such a manner that the idea underlying the design is clear. In this way we also naturally start thinking explicitly about the operations our programs need to perform to fulfill the task we are setting out to undertake.

Literate programming is not to be confused with documentation generation; it’s not merely having a program that is well-documented with documentation lavishly surrounding the code and embedded into it, rather it’s having a document about the program in which the program itself is embedded. As Lewis Carroll had the Mad Hatter say in Alice in Wonderland:

You might just as well say that “I see what I eat” is the same thing as “I eat what I see”!

What is Org-mode?

Org-mode is a text-editing mode in the Emacs Lisp interpreter (which could arguably be called a text editor). When Org-mode is enabled one can edit text using a specific type of markup, called Org. Similar to Markdown, it supports basic text features like headers, basic text properties like bold or italic text, embedded hyperlinks, embedded code blocks, and so on.

However, due to the versatility of the Emacs platform Org-mode has been extended with a facility termed “Babel”, which allows authors to execute blocks of source code. Various languages are supported out of the box; unfortunately Raku, still being a rather young language despite being quite old, is not among those. To remedy that, I wrote a package called ob-raku that extends the Babel facilities to add Raku to the supported languages.

Using Ob-Raku

To use ob-raku in Emacs one will need to download the package (simply clone it from GitHub or download one of the release tarballs) and add it to their Emacs load path. This can either be done using the environment variable:

export EMACSLOADPATH="/path/to/ob-raku:$EMACSLOADPATH"

Or you can change the path in your configuration file (probably ~/.emacs or ~/.emacs.d/init.el):

(add-to-list 'load-path "/path/to/ob-raku")

After adding the path you can add Raku to the list of languages that Babel can use like this:

(org-babel-do-load-languages
 'org-babel-load-languages
 '((C . t)
   (emacs-lisp . t)
   ; ...
   (raku . t)))

With Raku added to the list, you can create a Raku code block like this:

#+BEGIN_SRC raku
"!dlroW ,olleH".flip
#+END_SRC

You can evaluate the block by putting the cursor in or at the end of it and either using the menu bar, or typing C-c C-c (which is the Emacs notation for Ctrl+c Ctrl+c). The result of the evaluation will be added after the code block.

Linking blocks

Unfortunately, the lack of session support for ob-raku means that functions declared in one block aren’t usable in other blocks. That said, the results of evaluating a block can be used as arguments to another block, to chain them together. So when editing a .org file we can write the following:

Let's make a list in Raku:

#+NAME: nested-list
#+BEGIN_SRC raku
my @a = (("A", "B"), ("C", "D"))
#+END_SRC

We can also include results from other languages:

#+NAME: elisp-list
#+BEGIN_SRC emacs-lisp :results vector
'(1 2)
#+END_SRC

And now we'll use the lists we just defined:

#+NAME: crosser
#+HEADER: :var a=nested-list() b=elisp-list()
#+BEGIN_SRC raku
my @crossed = @a X @b
#+END_SRC

When you evaluate the crosser block Babel will evaluate the nested-list and elisp-list blocks, which both return lists, and assign them to the @a and @b variables. The resulting crossed list will be returned underneath the crosser block.
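For readers without Emacs at hand, here is what the crosser block computes, written as plain Raku outside of Org. The two lists are typed in by hand here, which is an assumption standing in for whatever Babel passes along from the other blocks.

```raku
# Stand-ins for the results of the nested-list and elisp-list blocks.
my @a = (("A", "B"), ("C", "D"));
my @b = (1, 2);

# X is the cross-product operator: every element of @a paired with
# every element of @b.
my @crossed = @a X @b;

say @crossed.elems;
.say for @crossed;
```

Each element of @a is itself a list, so every entry in the result pairs one of those inner lists with one of the numbers.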

Editor’s notes

This post was drafted at the end of February 2020, before Daniel Sockwell wrote his excellent article on Literate Programming with Pod6. This post won’t cover using Pod6 to do Literate Programming; instead it talks about an Emacs package I wrote to use Raku in Emacs Org-mode.

Raku REPL interaction in Emacs was still underway when this article was written; thanks to Matías Linares, it has since solidified.

Conclusion and thanks

While the functionality of ob-raku is still limited, anyone who already uses Org-mode to write documents can now implement Raku code and thereby show reproducible data expressed in our lovely versatile language. Once session support lands, Org-mode’s built-in tangling facilities should bring true Literate Programming to the community.

Creating ob-raku would not have been possible without the work put in to raku-mode, so I would like to thank everyone involved in the project for their hard work. While our community is small, having people with passion work on improving the tools we use on a daily basis makes our work much more enjoyable and thereby helps us keep going to make our little slice of heaven the best it can be.

Day 2: Perl is dead. Long live Perl and Raku.

‘Perl is dead’ is a meme that’s just plain wrong. Perl isn’t dead. It’s just dead to some programmers. Complicated regexes? Sigils? There’s more than one way to do it (TMTOWTDI)? Sometimes when programmers encounter Perl in the wild they react with fear. “WTF!?”, they cry! But fear needn’t be a Perl killer. If you take the time to see past Perl’s imperfections and walk the learning curve, there are rich rewards: Perl is an imperfect but pragmatic and expressive language that for 30+ years has helped programmers get the job done.

When Larry Wall designed Raku, with the help of the Perl community, he fixed most of Perl’s imperfections and doubled down on Perl’s DNA. Perl values pragmatism, expressivity, and whipupitude and Raku does too! Why stop at sigils ($@%) when you can have twice the fun with twigils ($!, %!, @! etc)?

For some programmers, however, the mere sight of a twigil can induce fear. Like Perl, Raku’s expressive power is a double-edged sword – potentially stopping other programmers in their tracks. A Raku programmer’s “DWIM” (do what I mean) can be another programmer’s “WAT!?”

Fear-free code that flows

We write programs for two audiences: humans and the computer. It’s the humans that should come first. If I can’t understand my own code in a week’s time, what hope do my colleagues have? Fortunately we can help ourselves, and each other, to have a smooth ride up the Raku learning curve.

Learning Raku is never boring but when did you last encounter a bump while learning Raku? That’s a chance for you to help yourself and others. You could leave an empathic comment in your code, contribute some documentation, write a blog post, give a talk, ask and answer a question on StackOverflow etc.

The joy of programming is finding flow for ourselves and each other while getting things done. No Raku-riffing-rockstars required.

Surfing the learning curve

Maybe you haven’t started learning Raku yet? Now is the perfect time to add Raku to your toolbox. Here are some of the learning resources I’ve found useful and I hope you do too.

Firstly, there is the pithy and concise introduction to Raku that includes instructions on how to install Raku. The Raku interpreter itself is very helpful. If your Raku program contains errors Raku often suggests ways to fix them.

For programmers coming from other language(s), RosettaCode showcases coding solutions in different languages side-by-side. Prepare to be pleasantly surprised by the expressive power of Raku’s operators. Raku’s expressivity typically results in less lines of code (LLOC).

An idea for your first Raku program is to translate a program you know well from a different language. Here are some helpful guides for translating from other languages to Raku: Perl, Python, Ruby, Haskell and JavaScript.

There’s a growing list of books on Raku and a flowchart for choosing one.

Searching for Raku-related problems will often point to the official documentation at docs.raku.org or Raku answers on StackOverflow.

When you want to learn more about a specific sub-topic or dive into the deeper design philosophy of Raku check out Jonathan Worthington’s clear presentations and explanations.

Finally, if you’re stuck on something, or just want to share in the -Ofun of learning Raku, the #raku IRC channel on freenode is friendly and welcoming.

Both Perl and Raku are useful tools in any programmer’s toolbox: no fear needed, just remember to help the code flow!

Long live Perl and Raku.

Happy Christmas!

Day 1: Why Raku is the ideal language for Advent of Code

Now that it’s December, it’s time for two of my favorite traditions from the tech world: the Raku Advent Calendar and Advent of Code. These two holiday traditions have a fair amount in common – they both run from December 1 through Christmas, and both involve releasing something new every day during the event. Specifically, the Raku Advent Calendar releases a new blog post about the Raku programming language, while Advent of Code releases a new programming challenge – which can be solved in any language.

(In this post, I’ll be referring to Advent of Code as “AoC” – not to be confused with the American politician AOC who, to the best of my knowledge, does not program in Raku.)

For me, Raku and AoC are the chocolate and peanut butter of tech Advent season: each is great on its own, but they’re even better in combination. If your only goal is to solve AoC challenges, Raku is a great language to use; on the other hand, if your only goal is to learn Raku, then solving AoC challenges is a great way to do so. This post will explain how Raku and AoC are such a good fit and then provide some resources to help us all get started solving AoC challenges.

What is Raku? (And why should you care?)

Since Raku is a relatively new programming language, at least some of you may not be familiar with it or why it’s worth learning. Raku is notoriously hard to pin down in a single sentence, but here’s my attempt:

Raku: a concise, expressive, aggressively multiparadigm language with strongly inferred types, built-in concurrency, rich metaprogramming, and best-in-class string processing and pattern matching.

What this means in practice is that I find myself reaching for Raku increasingly often. When faced with pretty much any problem, I keep concluding that Raku is the language that will let me solve it in the clearest, fastest, and most elegant way possible. (The one exception is if solving my problem demands the raw speed or low resource use of a compiled language. But even in that relatively rare case, I’d probably write the performance-critical sections of my code in Rust and the rest in Raku, taking advantage of how well the two languages play together.)

That’s not to say that Raku is a language that tries to be all things to all people. In fact, Raku has a rare, laser-like focus on individual productivity and is willing to trade off some standard enterprise/large group features to achieve that goal, as I’ve previously discussed at length (part 1, part 2, part 3). But, even setting those big-picture ideas aside, Raku is a language that’s full of interesting ideas. If those ideas are even half as good as I believe them to be, then Raku is a language you’ll want in your toolbelt.

Why is Raku a great fit for solving AoC challenges?

I believe that Raku is a good fit for solving many different problems, but it’s definitely an excellent fit for Advent of Code challenges. To explain why, I’ll walk through what we’d want out of an ideal AoC language. Then I’ll present a Raku solution to last year’s first AoC challenge and compare it to our ideal. (Spoiler: they’re really similar!)

When thinking about the ideal AoC language, the first feature of AoC that comes to mind is that it’s a series of small, largely self-contained puzzles rather than one large project. This suggests that the ideal language would be concise and low-boilerplate. It’s annoying but bearable to deal with many lines of boilerplate when setting up a large project, but it’d be far worse to do so repeatedly on each day of AoC. And, since we’ll be sharing our code, keeping it concise will help others see our logic without being distracted by housekeeping details.

We can be a bit more specific: the AoC challenges typically provide textual input and look for textual output; they also usually provide several test cases that can help to craft a working solution. Thus, in addition to being concise in general, our ideal language should offer low-boilerplate solutions for scripting and testing.

Perhaps the most notable (and certainly most fun!) feature of AoC is that it’s community driven and educational: thousands of programmers are all solving the same puzzles on more or less the same schedule, and are then posting their solutions to the Advent of Code subreddit. It’s very common to learn as much or more from reading other solutions – including ones in different languages – as you do from solving the challenge yourself. This means that our ideal language should be readable and elegant, even for people without much experience in the language.

Of course, while it’s great to be able to write code that people unfamiliar with the language can appreciate, much of our teaching and learning will come from comparing our solutions to other solutions in the same language. After all, different solutions in the same language will often show approaches that make different tradeoffs and can help expand our programming toolset. Or at least that happens if different solutions in our language are, well, different: if our language pushes everyone towards a single, obvious solution, then there will be much less room for that sort of learning. So our ideal language should offer more than one way to solve any particular challenge.

I’m sure I could go on, but it seems like we have a pretty good list. To sum up, we’re looking for a language that’s concise and low-boilerplate, especially for scripting and testing, and that allows for multiple different readable and elegant solutions to each challenge. (Note that this list is about finding the best language for learning from and enjoying AoC. If your goal is to place highly on the AoC leaderboard, then Raku’s excellent string processing features would still make it a good fit. But, realistically, if that’s your goal, you should pick whatever language you know best.)

Now that we know what we’d want in an ideal language, let’s take a look at a Raku solution to last year’s first challenge.

AoC 2019 day 1 in Raku

You can read the full problem description for all the details, but the short version is that this challenge asks us to make a few different calculations about the fuel required to launch a spaceship based on its mass. Specifically, in Part 1 we are told:

to find the fuel required for a module, take its mass, divide by three, round down, and subtract 2.

Because Raku has an integer division operator, this is almost trivially easy:

sub fuel($mass) { +$mass div 3 - 2 }

Part 2 asks us to perform a similar calculation, but this time to take into account the extra mass added by the fuel we’re adding:

Fuel itself requires fuel just like a module – take its mass, divide by three, round down, and subtract 2. However, that fuel also requires fuel, and that fuel requires fuel, and so on.

To solve this part, we can use our fuel function from Part 1 to calculate how much fuel we need, and then add our initial result to the amount of fuel we need for the new mass.

multi total-fuel($mass) { fuel($mass).&{$_ + .&total-fuel} }

Part 2 also tells us:

Any mass that would require negative fuel should instead be treated as if it requires zero fuel.

Again, Raku offers a powerful feature that makes this simple: in this case, the powerful feature is Raku’s ability to pattern match against run-time values in function signatures.

multi total-fuel($mass where fuel($mass) ≤ 0) { 0 }

With those three lines, we’ve essentially solved the challenge. Of course, we want to be able to perform this calculation not just on single numbers but on our entire input (which, for this challenge, takes the form of a text file with a different number on each line). We also want our script to be executable and to expose a CLI with a user-friendly description and --help text. This CLI should allow the user to select whether our script solves Part 1 or Part 2 of the challenge.

Fortunately, Raku lets us add all of these niceties with a grand total of three additional lines of code and a shebang comment. This gets our full solution so far to the following:

#!/usr/bin/env raku
unit sub MAIN( #= Solve the 2019 AoC day 01 puzzle
Bool :$p2 #={ Solve p2 instead of p1 (the default)} );
sub fuel($mass) { +$mass div 3 - 2 }
multi total-fuel($mass) { fuel($mass).&{$_ + .&total-fuel} }
multi total-fuel($mass where fuel($mass) ≤ 0) { 0 }
say lines.map($p2 ?? &total-fuel !! &fuel).sum;

(The comments embedded in lines 2 and 3 produce the --help documentation.)

The challenge also provides 7 test cases that we should probably include. When working on a larger Raku project, the standard approach would be to split our tests out into a separate file. But we’re scripting and it would be a shame to give up our single-file simplicity. So, instead of using a separate file, we’ll use a technique that I previously blogged about to use Raku’s conditional compilation to include our tests in a single file without executing them every time we run our script.

Using that technique, here’s what we get for our final code, including tests:

#!/usr/bin/env raku
unit sub MAIN( #= Solve the 2019 AoC day 01 puzzle
Bool :$p2 #={ Solve p2 instead of p1 (the default)} );
sub fuel($mass) { +$mass div 3 - 2 }
multi total-fuel($mass) { fuel($mass).&{$_ + .&total-fuel} }
multi total-fuel($mass where fuel($mass) ≤ 0) { 0 }
say lines.map($p2 ?? &total-fuel !! &fuel).sum;
# Tests (run with `raku --doc -c $FILE`)
DOC CHECK { use Test;
subtest 'Part 1', { fuel(12).&is: 2;
fuel(14).&is: 2;
fuel(1_969).&is: 654;
fuel(100_756).&is: 33_583; }
subtest 'Part 2', { total-fuel(14).&is: 2;
total-fuel(1_969).&is: 966;
total-fuel(100_756).&is: 50_346; }
}

Comparing Raku to the ideal AoC language

So, how does Raku do on the metrics we came up with earlier? Well, in terms of being concise and low-boilerplate, this code seems to do reasonably well. True, it’s far from maximally concise; it’s about 7 times longer than the solution I came up with when I worked through this challenge last year in Dyalog APL. However, at 6 lines of code for the program and 9 for the 7 test cases, I’d still score it as highly concise. As a point of comparison, a strong Rust solution used 27 lines of code on the solution and 17 on the tests.

And when it comes to supporting scripting and testing while eliminating boilerplate, the Raku code is just about ideal. The shebang line at the beginning is boilerplate, but is essential to creating a standalone script in any language. Other than that line, the only bits that are even arguably boilerplate are the use Test line and the unit sub MAIN line. These lines give us a tested script with a fully documented CLI – something that neither the Rust nor the APL examples linked above provide. Considering what we get in return, I’m prepared to give this solution full marks for supporting scripting and testing without relying on boilerplate.

Judging how readable and elegant the Raku solution is certainly involves more subjectivity. And, arguably, someone who knows Raku is least qualified to judge how readable the code would be to a programmer who doesn’t know the language. That said, in my view this solution is readable enough that a non-Raku programmer could follow it – while still making use of enough of Raku’s clever features to be elegant. A non-Raku programmer probably wouldn’t know the div operator, but could likely figure it out from the context and the reasonable assumption that it must do something different from the / operator. The fuel($mass).&{ $_ + total-fuel($_) } line might also slow a non-Raku programmer down – especially if they hadn’t encountered the $_ topic variable in Perl. But, again, I believe they could work it out quickly enough from the context. And, once they do, I bet they’d appreciate the elegance of reusing (rather than recalculating) the fuel($mass) value without needing to create and name a temporary value.

The last snippet that would likely be unfamiliar to non-Raku programmers is the $mass where fuel($mass) ≤ 0 bit. But, in my (admittedly biased!) opinion, this snippet is both immediately clear and immediately elegant – it so perfectly captures the intent of “call this function only when the fuel required for $mass is less than or equal to 0” in a way that most other programming languages cannot express in a function signature. I know that elegance is in the eye of the beholder, but this snippet and this script as a whole fit my definition pretty much perfectly.

Our final criterion – whether Raku offers more than one way to solve this challenge – is inherently impossible to judge from a single solution. But there’s always more than one way to do it in Raku, regardless of what it refers to. One relatively minor change would be to lean into Raku’s type system. Because Raku is gradually typed, it is possible to write it just as you’d write a dynamically typed language. But Raku has a powerful type system and allows you to constrain the types your functions accept.

For example, in the current code, we have the function sub fuel($mass) { +$mass div 3 - 2 }. This accepts an argument of any type (which it later casts to a numeric type with the + operator) and makes no guarantee of its return type. This flexibility is nice – it lets us call fuel with a Str when we pass it input from stdin but with an Int when we call it recursively. If we wanted more type safety, though (and we very well might in a longer program), we could pin down both the parameter and return types like this:

sub fuel(Int $mass --> Int) { $mass div 3 - 2 }

Since this only accepts Ints, we wouldn’t need the + inside the function body, but we would need to parse the input before calling fuel. We could do so in many ways; I’d probably add a .map(+*) method call to our lines pipeline.

Another option would strike something of a middle ground between safety and flexibility using coercion type constraints. This would result in a signature of sub fuel(Int() $mass --> Int). The Int() bit constrains the function to accept a type that can be cast to an Int and automatically performs the cast. Using this signature, we’d avoid the need for either +$mass or .map(+*).
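As a minimal sketch of the coercion-type variant described above (the two sample calls are illustrative and not part of the original script), the same function can be called with either a Str or an Int:

```raku
# Coercion type constraint: accept anything that can be coerced to Int,
# and perform that coercion before the body runs.
sub fuel(Int() $mass --> Int) { $mass div 3 - 2 }

say fuel("1969");   # a Str, as it would arrive from stdin; prints 654
say fuel(1969);     # an Int, as in a recursive call; prints 654
```

Because the coercion happens in the signature, the body needs neither the + prefix nor a separate .map(+*) step in the pipeline.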

Adding type safety is just one way we could approach this problem differently. We could also use a sequence with a computed endpoint (e.g., $_, total-fuel($_) …^ * ≤ 0) as an alternative to the where block we currently use. Or we could use reduce (either as a method call or with the [ ] reduction metaoperator) instead of separate map and sum steps. Or we could step even further away from map by using gather and take. In short, even with a problem as simple as this, there are a vast number of ways we could use Raku – and that’s without mentioning any of the ways we could use Raku’s strong, built-in support for concurrency to parallelize our solution if, for some reason, we wanted to push performance. Regardless of how you personally solve an AoC challenge, I’m willing to bet that you’ll be able to learn something by taking a look at the variety of other solutions people come up with in Raku.
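To make two of those alternatives concrete, here is one possible sketch. It is an assumption about how they might look, not the code from the solution above: total-fuel is rewritten as a sequence whose endpoint is computed by a condition, and the sums use the reduction metaoperator and the reduce method. The @masses array simply holds the masses from the puzzle's test cases.

```raku
sub fuel(Int() $mass --> Int) { $mass div 3 - 2 }

# Start with the module's fuel, generate each next term by applying fuel
# to the previous one, and stop (excluding that term) at the first value
# that is zero or negative.
sub total-fuel(Int() $mass) { (fuel($mass), &fuel ...^ * <= 0).sum }

my @masses = 14, 1969, 100756;

say [+] @masses.map(&fuel);                  # reduction metaoperator; prints 34239
say @masses.map(&total-fuel).reduce(* + *);  # reduce as a method call; prints 51314
```

This version needs no second multi candidate, because the endpoint condition takes over the job of the where clause.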

I know that there’s a limit to how many conclusions we can draw from looking at a single AoC challenge – especially since the challenges generally increase in difficulty through December, which means Day 1 is not all that representative. And I know that not everyone will share my exact definition of what makes a language a good fit for AoC. Nevertheless, I hope that our fairly detailed examination of an AoC challenge has been enough to persuade you that Raku is a promising language to use for AoC. If you already know at least some Raku, I hope you’ll agree that working through the AoC challenges could be a good use of your time (both to get better at Raku and to share your knowledge). And if you don’t yet know Raku, I hope you’re at least tempted to use AoC as an opportunity to learn.

If so, I have some good news: AoC is an excellent way to learn Raku.

Why is AoC a great way to learn Raku?

To see why AoC is such a good way to learn Raku, I’d like to start with a different question: how hard is it to learn Raku?

There are two different perspectives on this question. From one perspective, Raku is extremely hard to learn – in pretty much exactly the way that Scheme is easy to learn. Scheme famously has almost no syntax; you can easily sit down and learn the entirety of Scheme’s syntax in a single sitting. (Of course mastering the language would take far longer.)

Raku occupies nearly the opposite extreme: it makes extensive use of syntax. I don’t think there’s any particularly principled way to measure the “size” of different programming languages, but the Learn X in Y minutes series of guides might provide a rough approximation. Comparing the Learn X in Y minutes, where X=CHICKEN Scheme guide to the Learn X in Y minutes, where X=Raku, I see that the Raku one has about 7× as many lines; I can understand why someone might conclude that the value of Y is very different in those two equations.

And syntax isn’t the only way that Raku is a large language. I previously described Raku as “aggressively multiparadigm”; an influential review called Raku “multi-paradigm, maybe omni-paradigm”. However you put it, it’s clear that you can write Raku in many different ways. You can write it procedurally, complete with explicit for loops if you’d like. It also has first-class support for functional programming (including support for function types, abstract data types, and other advanced functional features that multiparadigm languages often omit from their functional toolbelt). You can also write purely object-oriented Raku and indeed the language itself is built from an OO perspective. Raku also borrows many ideas from array programming, from constraint programming, and from dataflow programming; it supports concurrent programming, generic programming, and extremely strong introspection and metaprogramming.

Truly mastering Raku doesn’t just mean knowing all of its extensive syntax; it doesn’t even mean becoming familiar with the many different paradigms that Raku supports. To be fully proficient with Raku requires the judgment to decide which paradigm best fits today’s problem and the comfort to correctly apply that paradigm. From this perspective, Raku is an extremely hard language to learn – though the rewards of doing so make it worthwhile.

But there’s another perspective that says that Raku is actually easy to learn.

True, it’s hard to learn all of Raku’s syntax if you try to learn it all at once, without context. But you shouldn’t do that, any more than an English speaker would try to learn German by sitting down with a German dictionary. The only reason that learning all of a language’s syntax seems like a reasonable thing to do is because some programming languages are so minimal that an all-at-once approach is not immediately disastrous. But that doesn’t make that approach a good idea.

Instead, the way to learn Raku is to learn just enough to get by, and then to start using it and picking up more as you go. Viewed from this perspective, Raku’s large amount of syntax is irrelevant to how hard the language is to learn – you’re not trying to learn it all at once, so it doesn’t matter if the bit you start with is half the syntax or just 20%. And Raku’s multiparadigm nature actually makes Raku easier to learn rather than harder: because Raku supports so many different paradigms, you can start with whatever subset of Raku you’re initially most comfortable with and expand out from there.

By now, you’ve probably figured out why we’ve spent so long discussing whether learning Raku is easy or hard: If you try to learn Raku the way you might learn Scheme, you’ll be in for a hard slog. But if you take a more piecemeal approach, learning Raku is much easier. And that piecemeal approach maps perfectly onto learning Raku through Advent of Code.

Specifically, to succeed with the piecemeal approach, you need to work on a series of small projects, each of which is manageable even if you know only a corner of the language. They need to be projects with relatively low stakes – since you won’t yet know all the different ways to solve a problem, odds are good that you’ll sometimes implement a suboptimal solution. And, most importantly of all, you need to work through them in a context that lets you see alternate approaches. Starting with the object-oriented subset of Raku and gradually adding techniques from outside your comfort zone is a great way to learn Raku, but it only works if you actually do the “gradually add techniques” part. If you get stuck in a rut, you won’t learn nearly as much, and seeing other people use different techniques to solve the same problem more elegantly is the best way to avoid that sort of rut.

In short, if you’d like to get better at Raku, there’s really no better way than by working through as many of the Advent of Code challenges as you can, comparing and contrasting your solutions with other AoC solutions (in Raku and in other languages), and thinking critically about the advantages and disadvantages of various approaches.

Let’s do this together

To make it easier for us to all find each other’s Raku Advent of Code solutions, I’ve created a GitHub repo that can hold them all: codesections/advent-of-raku-2020. I’ll be keeping the repo intentionally minimalist – if Raku weren’t so good at avoiding boilerplate, I’d add a template for setting up an AoC solution. But, given how little boilerplate Raku requires, I plan to keep the repo as simply a host for our solutions and links to related resources.

If you would like your solutions to be included, please submit a PR that adds a $your-name folder that includes your solutions (details are in the README). Of course, posting your solutions to the Advent of Raku repo shouldn’t stop you from posting them anywhere else you might want to, whether that’s your own site, the Raku subreddit or the main AoC subreddit.

And please feel free to add your solutions to the repo even – especially! – if you aren’t sure that you’ll have time to complete all the AoC challenges or if you’re brand new to Raku and aren’t positive that you’ll stick with the language past day 1. The point of all of this is to learn together, after all, and we can do that best by bringing together people with as wide a set of perspectives and backgrounds as possible.

I’ve also registered a private leaderboard for the Raku community, which you can access by logging in, following that link and then entering the code 407451-52d64a27 (if necessary, I can change that code and distribute it more securely, but I don’t anticipate any issues). Despite the “leaderboard” name, I view this as less a competitive ranking and more a way to keep track of who is participating – speaking personally, I have no intention of rushing to finish the challenges or trying to increase my score on any of the time-based metrics (indeed, given my timezone and the schedule I typically keep, I doubt I’ll start most puzzles until hours after they’re released).

I look forward to seeing your solutions; I’m sure we can learn a lot from one another. I also look forward to discussing the different approaches we take, whether that conversation takes place in the GitHub issues, on the Raku or AoC subreddits, or on the #raku IRC channel (I’ll try to keep an eye on all of those). Good luck to everyone, and may we all have an -Ofun Advent.

RFC 265: Interface polymorphism considered lovely

A little off-topic preface first. In the process of writing this post I was struck by the worst sysadmin’s nightmare: loss of servers followed by a bad backup. Until the very last moment I had well-grounded fears of not finishing the post at all. Luckily, I made a truce with life to get temporary respite. A conclusion? Don’t use bareos with ESXi. Or, probably, just don’t use bareos…

While picking an RFC for my previous advent post I was totally focused on the language-objects section. It took me a few passes to find the right one to cover. But in the meantime I realized that a very important topic is actually missing from the list. “Impossible!” – I said to myself and went on another hunt later. Yet neither a search for “abstract class” nor one for “role” came up with any result. I was about to give up and conclude that the idea came to life later, around the time the synopses were written.

But, wait, what interface is mentioned as a topic of an OO-related RFC? Oh, that interface! As the request body states it:

Add a mechanism for declaring class interfaces with a further method for declaring that a class implements said interface.

At this point I realized once again that it is now a full 20 years behind us. That the text is from the times when many considered Java as the only right OO implementation! And indeed, by reading further we find the following statement, likely to be affected by some popular views of the time:

It’s now a compile time error if an interface file tries to do anything other than pre declare methods.

Reminds you of something, doesn’t it? And then, at the end of the RFC, we find another one:

Java is one language that springs to mind that uses interface polymorphism. Don’t let this put you off – if we must steal something from Java let’s steal something good.

Good? Good?!! Oh, my… Java’s attempt to solve the problems of the C++ multiple inheritance approach by simply denying it altogether is what drove me away from the language from the very beginning. I was fed up with Pascal controlling my writing style as far back as the early 90s!

Luckily, those involved in early Perl6 design must have shared my view of the problem (besides, Java itself has changed a lot since). So, we have roles now. What they have in common with abstract classes and the modern interfaces is that a role can define an interface to communicate with a class, and provide implementation of some role-specific behavior too. It can also do a little more than only that!

What makes roles different is the way a role is used in the Raku OO model. A class doesn’t implement a role; nor does it inherit from it as it would with abstract classes. Instead it does the role; or the other word I love to use for this: it consumes a role. Technically it means that roles are mixed into classes. The process can be figuratively described as if the compiler takes all methods and attributes contained by the role’s type object and re-plants them onto the class. Something like:

role Foo {
    has $.foo = 42;
    method bar {
        say "hello!"
    }
}
class Bar does Foo { }
my $obj = Bar.new;
say $obj.foo; # 42
$obj.bar;     # hello!

How is it different from inheritance? Let’s change the class Bar a little:

class Baz {
    method bar {
        say "hello from Baz!"
    }
}
class Bar does Foo is Baz {
    method bar {
        say "hello from Bar!";
        nextsame
    }
}
Bar.new.bar; # hello from Bar!
             # hello from Baz!

nextsame in this case re-dispatches a method call to the next method of the same name in the inheritance hierarchy. Simply put, it passes control over to the method Baz::bar, as one can see from the output we’ve received. And Foo::bar? It’s not there. When the compiler mixes the role into Bar it finds that the class does have a method named bar already. Thus the one from Foo is ignored. Since nextsame only considers classes in the inheritance hierarchy, Foo::bar is not invoked.

With another trick the difference from interface consumption can also be made clear:

class Bar {
    method bar {
        say "hello from Bar!"
    }
}
my $obj = Bar.new;
$obj.bar; # hello from Bar!
$obj does Foo;
$obj.bar; # hello!

In this example the role is mixed into an existing object, thanks to the dynamic nature of Raku which makes this possible. When a role is applied this way its content is enforced over the class content, similarly to a virus injecting its genetic material into a cell effectively overriding internal processes. This is why the second call to bar is dispatched to the Foo::bar method and Bar::bar is nowhere to be found on $obj this time.

To have this subject fully covered, let me show you a fun code example. The operator but used in it behaves like does except it doesn’t modify its LHS object; instead but creates and returns a new one:

my $s1 = "not empty means true";
my $s2 = $s1 but role { method Bool { False } };
say $s1 ?? "true" !! "false";
say $s2 ?? "true" !! "false";

This snippet I’m leaving for you to try on your own because it’s time for my post to move onto another topic: role parameterization.

Consider the example:

role R[Str:D $desc] {
    has Str:D $.description = $desc;
}
class Foo does R["some info"] { }
say Foo.new.description; # some info

Or a more practical one:

role R[::T] {
    has T $.val is rw;
}
class ContInt does R[Int] { }
ContInt.new.val = "oops!"; # "Type check failed..." exception is thrown

The latter example utilizes a so-called type capture, where T is a generic type (a concept many of you are likely to know from other languages) that turns into a concrete type only when the role gets consumed and supplied with a parameter, as in the class ContInt declaration.

The final iteration for parametrics I’m going to present today would be this more extensive example:

role Vect[::TX] {
    has TX $.x;
    method distance(Vect $v) { ($v.x - $.x).abs }
}
role Vect[::TX, ::TY] {
    has TX $.x;
    has TY $.y;
    method distance(Vect $v) { 
        (($v.x - $.x)² + ($v.y - $.y)²).sqrt 
    }
}

class Foo1  does Vect[Rat]      { }
class Foo2 does Vect[Int, Int] { }

my $foo1 = Foo1.new(:x(10.0));
my $foo2 = Foo2.new(:x(10), :y(5));
say $foo1;                                   # Foo1.new(x => 10.0)
say $foo2;                                   # Foo2.new(x => 10, y => 5)
say $foo2.distance(Foo2.new(:x(11), :y(4))); # 1.4142135623730951

Hopefully, the code explains itself. Most certainly it nicely visualizes the long way made by the language designers since the initial RFC was made.

At the end I’d like to share a few interesting facts about Raku roles and their implementation by Rakudo.

  1. As of Raku v6.e, a role can define its own constructor/destructor submethods. They’re not mixed into a class as methods are. Instead, they’re used to build/destroy an object the same way as constructors/destructors of classes do:
use v6.e.PREVIEW; # 6.e is not released yet
role R { submethod TWEAK { say "R" } }
class Foo { submethod TWEAK { say "Foo" } }
class Bar is Foo does R { submethod TWEAK { say "Bar" } }
Bar.new; # Foo
         # R
         # Bar
  2. Role body is a subroutine. Try this example:
role R { say "Role" }
class Foo { say "Foo" }
# Foo

Then modify class Foo so that it consumes R:

class Foo does R { say "Foo" }
# Role
# Foo

The difference in the output is explained by the fact that the role body gets invoked when the role itself is mixed into a class. Try adding one more class consuming R alongside Foo and see how the output changes. To make the distinction between class and role bodies even clearer, make your new class inherit from Foo. Even though is and does look alike, they act very differently.

  3. Square brackets in a role declaration enclose a signature. As a matter of fact, it is the signature of the role body subroutine! This makes a few very useful tricks possible:

# Limit role parameters to concrete numeric objects.
role R[Numeric:D ::T $default] {
    has T $.value = $default;
}
class Foo does R[42.13] { };
say Foo.new.value; # 42.13

Or even:

# Same as above but only allow specific values.
role R[Numeric:D ::T $default where * > 10] {
    has T $.value = $default;
} 

Moreover, when several parametric candidates are declared for a role, choosing the right one is a task of the same kind as choosing the right routine among several multi candidates: it is based on matching signatures against the parameters passed.

  4. Rakudo implements a role using four different role types! Let me demonstrate one aspect of this with the following snippet based on the example for the previous fact:

for Foo.^roles -> \consumed {
    say R === consumed
}

=== is a strict object identity operator. In our case we can consider it as a strict type equivalence operator which tells us if two types are actually exactly the same one.

And as I hope to have this subject covered later in a more extensive article, at this point I would make it a classical abrupt open ending by providing just the output of the above snippet as a hint:

False

RFC 28, by Simon Cozens

 

RFC 28 – Perl Should Stay Perl

Originally submitted by Simon Cozens on August 4, 2000, RFC 28 asked the community to make sure that, whatever updates were made, Perl 6 was still definitely recognizable as Perl. After 20 years of design, proofs-of-concept, implementations, and two released language versions, we’ve ended up with something that is definitely Perlish, even if we’re no longer a Perl.

At the time the RFCs were submitted, the thought was that this language would be the next Perl in line, Perl 6. As time went on before an official language release, Perl 5 development picked up again, and that team & community wanted to continue on its own path. A few months ago, Perl 6 officially changed its name to Raku – not to get away from our Perl legacy, but to free the Perl 5 community to continue on their path as well. It was a difficult path to get to Raku, but we are happy with the language we’re shipping, even if we do miss having the Perl name on the tin.

“Attractive Nuisances”

Let’s dig into some of the specifics Simon mentions in his RFC.

We’ve got a golden opportunity here to turn Perl into whatever on earth we like. Let’s not take it.

This was a fine line that we ended up crossing, even before the rename. Specific design decisions were changed, we started with a fresh implementation (more than once if you count Pugs & Parrot & Niecza …). We are Perlish, inspired by Perl, but Raku is definitely different.

Nobody wins if we bend the Perl language out of all recognition, because it won’t be Perl any more.

I argue that eventually, everyone won – we got a new and improved Perl 5 (and soon, a 7), and we got a brand new language in Raku. The path wasn’t clear 20 years ago, but we ended up in a good place.

Some things just don’t need heavy object orientation.

Raku’s OO is everywhere: but it isn’t required. While you can treat everything as an object:

  3.sqrt.say;

You can still use the familiar Perlish forms for most features.

  say sqrt 3;

Even native scalars (which don’t have the overhead of objects) let you treat them as OO if you want.

  my uint32 $x = 32;
  say $x;
  $x.^name.say;

Even though $x here doesn’t start out as an object, by calling a meta-method on it, the compiler cheats on our behalf and outputs Int here, the closest class to our native int.

But we avoid going to the extent of Java; for example, we don’t have to define a class with a main method in order to execute a program.

Strong typing does not equal legitimacy.

Similar to the OO approach, we don’t require typing, but allow you to gradually add it. You can start with an untyped scalar variable, but as you further develop your code, you can add a type to that declared variable, and to parameters to subs & methods. The types can be single classes, subsets, Junctions, where clauses with complicated logic: you can use as much or as little typing as you want. Raku’s multi routines (subs or methods with the same name but different arguments) give you a way to split up your code based on types that is then optimized by the compiler. But you can use as little or as much of it as you want.
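A small sketch of that progression follows. The names ($loose, $strict, Even, describe) are hypothetical and chosen only for illustration: it starts with an untyped variable, adds a type, then a subset with a where clause, and finally lets multi dispatch pick a candidate based on those types.

```raku
my $loose = "42";        # untyped: any value is accepted
my Int $strict = 42;     # typed: only Int values are accepted

# A subset narrows an existing type with an arbitrary condition.
subset Even of Int where * %% 2;

# Multi candidates split the logic by type.
multi describe(Even $n) { "$n is an even Int" }
multi describe(Int  $n) { "$n is an odd Int" }
multi describe($other)  { "{$other.raku} is not an Int" }

say describe($strict);   # 42 is an even Int
say describe(7);         # 7 is an odd Int
say describe($loose);    # "42" is not an Int
```

The untyped candidate at the end acts as a fallback, so the same routine still works for code that has not had any types added yet.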

Just because Perl has a map operator, this doesn’t make it a functional programming language.

I think Raku stayed true to this point – while there are functional elements, the polyglot approach (supporting multiple different paradigms) means that any one of them, including functional, doesn’t take over the language. But you can declare routines pure, allowing the compiler to constant fold calls to that routine when the args are known at compile time.
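A small sketch of both sides of that point: a routine marked with the is pure trait, next to a functional-style pipeline. The routine name square is only an example.

```raku
# Marking a routine as pure promises that it has no side effects and that
# the same arguments always give the same result.
sub square(Int $n) is pure { $n * $n }

say square(12);                            # 144

# Functional-style pipelines remain available where they fit.
say (1..5).map(* * 2).grep(* %% 4).sum;    # 12
```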

Perl is really hard for a machine to parse. … It’s meant to be easy for humans to understand.

Development of Raku definitely embraced this thought – “torture the implementors on behalf of the users”. This is one of the reasons it took us a while to get here. But on that journey, we designed and developed new language parsing tools that we not only use to build and run Raku, but also expose to our users, allowing them to implement their own languages and “Slangs” on top of our compiler.

fin

Finally, now that the Perl team is proposing a version jump to 7, I suspect the Perl community will raise similar concerns to those raised by Simon. Raku and Perl 7 have taken two different paths, but both will be recognizable to the Perl 5 RFC contributors from 20 years ago.

RFC 84 by Damian Conway: => => =>

RFC 84 by Damian Conway: Replace => (stringifying comma) with => (pair constructor)

Yet another nice goodie from Damian, truly what you might expect from the interlocutor and explicator!

The fat comma operator, =>, was originally used to separate values – with a twist. It behaved just like the , operator, but modified parsing to stringify its left operand.

It saved you some quoting for strings and so this code for hash initialization:

my %h = (
'a', 1,
'b', 2,
);

could be written as:

my %h = (
a => 1,
b => 2,
);

Here, bare a and b are parsed correctly, without a need to quote them into strings. However, the usual hash assignment semantics is still the same: pairs of values are processed one by one, and given that => is just a “left-side stringifying” comma operator, interestingly enough the code above is equivalent to this piece:

my %h = ( a => 1 => b => 2 => );

The proposal suggested changing the meaning of this “special” operator to become a constructor of a new data type, Pair.

A Pair is constructed from a key and a value:

my @pairs = a => 42, 1 => 2;
say @pairs[0]; # a => 42
say @pairs[1]; # 1 => 2
say @pairs[1].key.^name; # Int, not a Str

The @pairs list contains just 2 values here, not 4: the key of the first is conveniently stringified for us, and the second just uses a bare Int literal as its key.
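To make the "one value, two parts" idea concrete, here is a short sketch of working with a single Pair. The variable names are arbitrary.

```raku
my $pair = answer => 42;
say $pair.key;      # answer
say $pair.value;    # 42
say $pair.^name;    # Pair

# The colon-pair syntax builds the same kind of object.
my $colon = :answer(42);
say $colon eqv $pair;   # True

# A hash can be initialized from a list of Pairs.
my %h = $pair, other => 1;
say %h<answer>;     # 42
say %h.elems;       # 2
```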

It turns out, introducing Pair is not only a convenient data type to operate on, but this change offers new opportunities for… subroutines.

Raku has first-class support for signatures, both for the sake of the “first travel class” pun and in the literal sense: Signature, Parameter and Capture are actual first-class objects, which allows for surprising solutions. It is no surprise that Raku supports named parameters, with plenty of syntax for them. And the Pair class has blended in quite naturally.

If a Pair is passed to a subroutine with a named parameter where keys match, it works just so, otherwise you have a “full” Pair, and if you want to insist, a bit of syntax can help you here:

sub foo($pos, :$named) {
say "$pos.gist(), $named.gist()";
}
foo(42); # 42, (Any)
try foo(named => 42); # Oops, no positionals were passed!
foo((named => 42)); # named => 42, (Any)
foo((named => 42), named => 42); # named => 42, 42
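The same distinction shows up when the Pair lives in a variable. In this sketch (the sub name show is made up), a Pair in a variable is passed positionally, a | prefix turns it into a named argument, and quoting the key keeps a literal Pair positional:

```raku
sub show($pos, :$named) { "$pos.gist(), $named.gist()" }

my $pair = named => 42;

say show($pair);           # named => 42, (Any)
say show(1, |$pair);       # 1, 42
say show("named" => 42);   # named => 42, (Any)
```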

As we can see, designing a language is interesting: a change made in one part can have consequences in some other part, which might seem quite unrelated, and you had better hope your choices will work out well when connected together. Thanks to Damian and all the people who worked on the Raku design, for putting an amazing amount of effort into it!

And last but not least: what happened to the => train we saw? Well, now it does what you mean if you mean what it does:

my %a = a => 1 => b => 2;
say %a.raku; # {:a(1 => :b(2))}

And yes, this is a key a pointing to a value of Pair of 1 pointing to a value of Pair of b pointing to value of 2, so at least the direction is nice this time. Good luck and keep your directions!

RFC 200, by Nathan Wiger: Revamp tie to support extensibility

Proposed on 7 September 2000, frozen on 20 September 2000, depends on RFC 159: True Polymorphic Objects proposed on 25 August 2000, frozen on 16 September 2000, also by Nathan Wiger and already blogged about earlier.

What is tie anyway?

RFC 200 was about extending the tie functionality as offered by Perl.

This functionality in Perl allows one to inject program logic into the system’s handling of scalars, arrays and hashes, among other things. This is done by assigning the name of a package to a data-structure such as an array (aka tying). That package is then expected to provide a number of subroutines (e.g. FETCH and STORE) that will be called by the system to achieve certain effects on the given data-structure.

As such, it is used by some of Perl’s core modules, such as threads, and many modules on CPAN, such as Tie::File. The tie functionality of Perl still suffers from the problems mentioned in the RFC.

It’s all tied

In Raku, everything is an object, or can be considered to be an object. Everything the system needs to do with an object, is done through its methods. In that sense, you could say that everything in Raku is a tied object. Fortunately, Rakudo (the most advanced implementation of the Raku Programming Language) can recognize when certain methods on an object are in fact the ones supplied by the system, and actually create short-cuts at compile time (e.g. when assigning to a variable that has a standard container: it won’t actually call a STORE method, but uses an internal subroutine to achieve the desired effect).

But apart from that, Rakudo has the capability of identifying hot code paths during execution of a program, and optimizing these in real time.

Jonathan Worthington gave two very nice presentations about this process: How does deoptimization help us go faster from 2017, and a Performance Update from 2019.

Because everything in Raku is an object and access occurs through the methods of the classes of these objects, this allows the compiler and the runtime to have a much better grasp of what is actually going on in a program. Which in turn gives better optimization capabilities, even optimizing down to machine language level at some point.

And because everything is “tied” in Raku (looking at it using Perl-filtered glasses), injecting program logic into the system’s handling of arrays and hashes can be as simple as subclassing the system’s class and providing a special version of one of the standard methods as used by the system. Suppose you want to see in your program when an element is fetched from an array: you need only add a custom AT-POS method:

class VerboseFetcher is Array {    # subclass core's Array class
    method AT-POS($pos) {           # method for fetching an element
        say "fetching #$pos";        # tell the world
        nextsame                     # provide standard functionality
    }
}

my @a is VerboseFetcher = 1,2,3;   # mark as special and initialize
say @a[1];  # fetching #1␤2

The Raku documentation contains an overview of which methods need to be supplied to emulate an Array and to emulate a Hash. By the way, the whole lemma about accessing data structure elements by index or key is recommended reading for someone wanting to grok those aspects of the internals of Raku.

Nothing is special

In a blog post about RFC 168 about making things less special, it was already mentioned that really nothing is special in Raku. And that (almost) all aspects of the language can be altered inside a lexical scope. So what the above example did to the Array class can be done to any of Raku’s core classes, or any other classes that have been installed from the ecosystem, or that you have written yourself.

But it can be overwhelming to have to supply all of the logic needed to fully emulate an array or a hash. Especially when you first try to do this. Therefore the ecosystem actually has two modules with roles that help you with that: Array::Agnostic for arrays, and Hash::Agnostic for hashes.

Both modules only require you to implement 5 methods in a class that does these roles to get the full functionality of an array or a hash, completely customized to your liking.

In fact, the flexibility of the approach of Raku towards customizability of the language, actually allowed the implementation of Perl’s tie built-in function in Raku. So if you’re porting code from Perl to Raku, and the code in question uses tie, you can use this module as a quick intermediate solution.

Has the problem been fixed?

Let’s look at the problems that were mentioned with tie in RFC 200:

  1. It is non-extensible; you are limited to using functions that have been implemented with tie hooks in them already.

Raku is completely extensible and pluggable in (almost) all aspects of its implementation. There is no limitation on which classes one can extend.

  2. Any additional functions require mixed calls to tied and OO interfaces, defeating a chief goal: transparency.

All interfaces use methods in Raku, since everything is an object or can be considered as one. Use of classes and methods should be clear to any programmer using Raku.

  3. It is slow. Very slow, in fact.

In Raku, it is all the same speed during execution. Every customization profits from the same optimization features as every other piece of code in Raku, and will in the end be optimized down to machine code when possible.

  4. You can’t easily integrate tie and operator overloading.

In Raku, operators are multi-dispatch subroutines that allow additional candidates for custom classes to be added.

  5. If defining tied and OO interfaces, you must define duplicate functions or use typeglobs.

Typeglobs don’t exist in Raku. All interfacing in Raku is done by supplying additional methods (or subroutines in case of operators). No duplication of effort is needed, so no such problem.

  6. Some parts of the syntax are, well, kludgey

One may argue that the kludgey syntax of Perl has been replaced by another kludgey syntax in Raku. That is probably in the eye of the beholder. Fact is that the syntax in Raku for injecting program logic, is not different from any other subclassing or role mixins one would otherwise do in Raku.
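The answer about operator overloading above can be sketched in a few lines. The Money class is hypothetical and exists only to show an extra infix:<+> candidate being added for a custom class:

```raku
class Money {
    has $.amount;
}

# An additional candidate for the existing + operator.
multi infix:<+>(Money $a, Money $b) {
    Money.new(amount => $a.amount + $b.amount)
}

my $total = Money.new(amount => 5) + Money.new(amount => 7);
say $total.amount;   # 12
say 1 + 2;           # 3, the core candidates are untouched
```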

Conclusion

Nothing from RFC 200 actually was implemented in the way it was originally suggested. However, solutions to the problems mentioned have all been implemented in Raku.

RFC 159, by Nathan Wiger: True Polymorphic Objects

Proposed on 25 August 2000, frozen on 16 September 2000

On polymorphism

RFC159 introduces the concept of true polymorphic objects.

Objects that can morph into numbers, strings, booleans and much more on-demand. As such, objects can be freely passed around and manipulated without having to care what they contain (or even that they’re objects).

When one looks at how 42, "foo" and now work in Raku nowadays, one can see that that vision has pretty much been implemented. Because most of the time, one doesn’t really care about the fact that 42 is really an Int object, "foo" is really a Str object and that now represents a new Instant object every time it is called. The only thing one cares about is that they can be used in expressions:

say "foo" ~ "bar";  # foobar
say 42 + 666;       # 708
say now - INIT now; # 0.0005243

RFC159 lists a number of method names to be used to indicate how an object should behave under certain circumstances, with a fallback provided by the system if the class of the object does not provide that method. In most cases these methods did not make it into Raku, but some of them did with a different name:

Name in RFC   Name in Raku   When
STRING        Str            Called in a string context
NUMBER        Numeric        Called in a numeric context
BOOLEAN       Bool           Called in a boolean context

And some of them even retained their name:

Name in RFC   When
BUILD         Called in object blessing
STORE         Called in an lvalue = context
FETCH         Called in an rvalue = context
DESTROY       Called in object destruction

but with sometimes subtly different semantics from the RFC.

Only a few made it

In the end, only a limited set of special methods was decided on for Raku. All of the other methods in RFC159 have been implemented by polymorphic operators that coerce when needed. For instance the proposed PLUS method has been implemented as an infix + operator that has a “default” candidate that coerces its operands to a number.

So, effectively, if you have an object of class Foo and you want that to act as a number, one only needs to add a Numeric method to that class. An expression such as:

my $foo = Foo.new;
say $foo + 42;

is effectively executing:

say infix:<+>( $foo, 42 );

and the infix:<+> candidate that takes Any objects, does:

return infix:<+>( $foo.Numeric, 42.Numeric );

And if such a class Foo does not provide a Numeric method, then it will throw an exception.
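Put together, a minimal sketch of a class that supplies its own coercion methods could look like this. The attribute and the returned values are invented for the example.

```raku
class Foo {
    has $.value = 8;
    method Numeric() { $!value }        # used in numeric context
    method Str()     { "eight" }        # used in string context
    method Bool()    { $!value != 0 }   # used in boolean context
}

my $foo = Foo.new;
say $foo + 42;          # 50
say "value: " ~ $foo;   # value: eight
say so $foo;            # True
```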

The DESTROY method

In Raku, object destruction is non-deterministic. If an object is no longer in use, it will probably get garbage collected. The probable part is because Raku does not know a global destruction phase, unlike Perl. So when a program is done, it just does an exit (although that logic does honour any END blocks).

An object is marked “ready for removal” when it can no longer be “reached”. It then has its DESTROY method called when the garbage collection logic kicks in. Which can be any amount of time after it became unreachable.

If you need deterministic calling of the DESTROY method, you can use a LEAVE phaser. Or if that doesn’t allow you to scratch your itch, you can possibly use the FINALIZER module.
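A sketch of the LEAVE approach, assuming a hypothetical Resource class with an explicit release method (rather than relying on DESTROY being called by the garbage collector):

```raku
class Resource {
    has $.name;
    method release() { say "releasing $!name" }
}

sub work() {
    my $resource = Resource.new(name => 'db');
    LEAVE $resource.release;        # runs whenever this block is left
    say "working with $resource.name()";
}

work();
# working with db
# releasing db
```

The cleanup here happens at a known point, namely when the sub is left, regardless of when the object itself is garbage collected.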

STORE / FETCH on scalar values

Conceptually, you can think of a container in Raku as an object with STORE and FETCH methods. Whenever you set a value in a container, it conceptually calls the STORE method. And whenever the value inside the container is needed, it conceptually calls the FETCH method. In pseudo-code:

my $foo = 42;  # Scalar.new(:name<$foo>).STORE(42)

But what if you want to control access to a scalar value, similar to Perl’s tie? Well, in Raku you can, with a special type of container class called Proxy. An example of its usage:

sub proxier($value? is copy) {
    return-rw Proxy.new(
        FETCH => method { $value },
        STORE => method ($new) {
            say "storing";
            $value = $new
        }
    )
}

my $a := proxier(42);
say $a;    # 42
$a = 666;  # storing
say $a;    # 666

Subroutines return their result values de-containerized by default. There are basically two ways of making sure the actual container is returned: using return-rw (like in this example), or by marking the subroutine with the is rw trait.
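For comparison, here is a sketch of the second option, with the subroutine marked is rw so that its final value is returned as a container. The sub name counted and the counter are made up for the example.

```raku
sub counted($value? is copy) is rw {
    my $stores = 0;
    Proxy.new(
        FETCH => method ()     { $value },
        STORE => method ($new) {
            $stores++;
            say "store number $stores";
            $value = $new
        },
    )
}

my $b := counted(1);
$b = 2;      # store number 1
$b = 3;      # store number 2
say $b;      # 3
```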

STORE on compound values

Since FETCH only makes sense on scalar values, there is no support for FETCH on compound values, such as hashes and arrays, in Raku. I guess one could consider calling FETCH in such a case to be the Zen slice, but it was decided that that would just return the compound value itself.

The STORE method on compound values however, allows for some interesting functionality. The STORE method is called whenever there is an initialization of the entire compound value. For instance:

@a = 1,2,3;

basically executes:

@a := @a.STORE( (1,2,3) );

But what if you don’t have an initialized @a yet? Then the STORE method is supposed to actually create a new object and initialize this with the given values. And the STORE method can tell, because then it also receives an INITIALIZE named argument with a True value. So when you write this:

my @b = 1,2,3;

what basically gets executed is:

@b := Array.new.STORE( (1,2,3), :INITIALIZE );

Now, if you realize that:

my @b;

is actually short for:

my @b is Array;

it’s only a small step to realize that you can create your own class with customized array logic, that can replace the standard Array logic with your own. Observe:

class Foo {
    has @!array;
    method STORE(@!array) {
        say "STORED @!array[]";
        self
    }
}

my @b is Foo = 1,2,3;  # STORED 1 2 3

However, when you actually start using such an array, you are confronted with some weird results:

say @b[0]; # Foo.new
say @b[1]; # Index out of range. Is: 1, should be in 0..0

Without getting into the reasons for these results, it should be clear that to completely mimic an Array, a lot more is needed. Fortunately, there are ecosystem modules available to help you with that: Array::Agnostic for arrays, and Hash::Agnostic for hashes.

BUILD

The BUILD method also subtly changed its semantics. In Raku, method BUILD will be called as an object method and receive all of the parameters given to .new, after which it is fully responsible for initializing object attributes. This becomes more visible when you use the internal helper module BUILDPLAN. This module shows the actions that will be performed on an object of a class when built with the default .new method:

class Bar {
    has $.score = 42;
}
use BUILDPLAN Bar;
# class Bar BUILDPLAN:
#  0: nqp::getattr(obj,Bar,'$!score') = :$score if possible
#  1: nqp::getattr(obj,Bar,'$!score') = 42 if not set

This is internals speak for:

  - assign the value of the optional named argument score to the $!score attribute
  - assign the value 42 to the $!score attribute if it was not set already

Now, if we add a BUILD method to the class, the buildplan changes:

class Bar {
    has $.score = 42;
    method BUILD() { }
}
use BUILDPLAN Bar;
# class Bar BUILDPLAN:
#  0: call obj.BUILD
#  1: nqp::getattr(obj,Bar,'$!score') = 42 if not set

Note that there is no automatic attempt to take the value of the named argument score anymore. Which means that you need to do a lot of work in your custom BUILD method if you have many named arguments, and only one of them needs special handling. That’s why the TWEAK method was added:

class Bar {
    has $.score = 42;
    method TWEAK() { }
}
use BUILDPLAN Bar;
# class Bar BUILDPLAN:
#  0: nqp::getattr(obj,Bar,'$!score') = :$score if possible
#  1: nqp::getattr(obj,Bar,'$!score') = 42 if not set
#  2: call obj.TWEAK

Note that the TWEAK method is called after all of the normal checks and initializations. This is in most cases much more useful.
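The practical difference between the two can be sketched as follows. The class names WithBuild and WithTweak and the grade attribute are invented for the example.

```raku
class WithBuild {
    has $.score = 42;
    method BUILD() { }          # takes over, but ignores :score
}

class WithTweak {
    has $.score = 42;
    has $.grade;
    method TWEAK() {            # runs after the attributes are set
        $!grade = $!score >= 50 ?? 'pass' !! 'fail';
    }
}

say WithBuild.new(score => 80).score;   # 42
say WithTweak.new(score => 80).score;   # 80
say WithTweak.new(score => 80).grade;   # pass
say WithTweak.new.grade;                # fail
```

With the empty BUILD, the named argument is silently dropped and only the default applies. With TWEAK, the standard initialization has already happened by the time the custom logic runs.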

Conclusion

Although the idea of true polymorphic objects has been implemented in Raku, it turned out quite different from what was originally envisioned. In hindsight, one can see why it was judged impractical to try to support an ever-increasing list of special methods for all objects. Instead, a choice was made to only implement a few key methods from the proposal, and for the others the approach of automatic coercions was taken.