Featured

It’s that time of the year

When we start all over again with advent calendars, publishing one article a day until Christmas. This is going to be the first full year with Raku being called Raku, and the second year we have moved to this new site. However, it’s going to be the 12th year (after this first article) in a row with a Perl 6 or Raku calendar, previously published in the Perl 6 Advent Calendar blog. And also the 5th year since the Christmas release, which was announced in the advent calendar of that year.

Anyway. Here we go again! We have lined up a full (or eventually full, by the time the Advent Calendar is finished) set of articles on many different topics, but all of them about our beloved Raku.

So, enjoy, stay healthy, and have -Ofun reading the nice list of articles this year will bring.

Day 25: Reminiscence, refinement, revolution

By Jonathan Worthington

Raku release reminiscence

Christmas day, 2015. I woke up in the south of Ukraine – in the very same apartment where I’d lived for a month back in the spring, hacking on the NFG representation of Unicode. NFG was just one of the missing pieces that had to fall into place during 2015 in order for that Christmas – finally – to bring the first official release of the language we now know as Raku.

I sipped a coffee and looked out onto a snowy courtyard. That, at least, was reassuring. Snow around Christmas was relatively common in my childhood. It ceased to be the year after I bought a sledge. I opened my laptop, and took a look at the #perl6-dev IRC channel. Release time would be soon – and I would largely be a spectator.

My contributions to the Rakudo compiler had started eight years prior. I had no idea what I was getting myself into, although if I had known, I’m pretty sure I’d still have done it. The technical challenges were, of course, fascinating for somebody who had developed a keen interest in languages, compilers, and runtimes while at university. Larry designs languages with an eye on what’s possible, not on what’s easy, for the implementer. I learned, and continue to learn, a tremendous amount by virtue of working on Raku implementation. Aside from that, the regular speaking at workshops and conferences opened the door to spending some years as a teacher of software development and architecture, and after that left me with a wealth of knowledge to draw on as I moved on to focus on consultancy on developer tooling. Most precious of all, however, are the friendships forged with some of those sharing the Raku journey – which I can only imagine lasting a lifetime.

Eight years had gone by surprisingly quickly. When one is involved with the day-to-day development, the progress is palpable. This feature got done, this bug got fixed, this design decision got made, this library got written, this design decision got re-made for the third time based on seeing ways early adopters stubbed their toe, this feature got re-implemented for the fourth time because of the design change… From the outside, it all looks rather different; it’s simply taking forever, things keep getting re-done, and the developers of it “have failed us all” (all seasons have their grinches).

Similarly, while from the outside the Christmas release was “the big moment”, from the inside, it was almost routine. We shipped a Rakudo compiler release, just as we’d been doing every month for years on end. Only this time, we also declared the specification test suite – which constitutes the official specification of the language – as being an official language release. The next month would look much like the previous ones: more bug reports arrive, more things get fixed and improved, more new users show up asking for guidance.

The Christmas release was the end of the beginning. But the beginning is only the start of a story.

Regular Raku refinement

It’s been five years since That Christmas. Time continues to fly by. Each week brings its advances – almost always documented in what’s currently known as the Rakudo Weekly, the successor of the Perl 6 Weekly. Some things seem constant: there are always some new bug reports, there will always be something that’s under-documented, no matter what you make fast there will always be something else that is still slow, no matter how unlikely it seemed that someone would depend on an implementation detail they will have done so anyway, new Unicode versions require at least some effort, and the latest release of macOS seemingly always requires some kind of tweak. Welcome to the life of those working on a language implementation that’s getting some amount of real-world use.

Among the seemingly unending set of things to improve, it’s easy to lose sight of just how far the Raku Programming Language, its implementation in Rakudo, and the surrounding ecosystem have come over the last five years. Here I’ll touch on just a few areas worthy of mention.

Maturity

Maturity of a language implementation, its tools, and its libraries is really, really hard won. There are not all that many shortcuts. Experience helps, and whenever a problem can be avoided in the first place – by having somebody around with the background to know how – that’s great. Otherwise, it’s largely a case of making sure that when there are problems, they get fixed and something is done to try and avoid a recurrence. It’s OK to make mistakes, but making the same mistake twice is a bit careless.

The most obvious example of this is making sure that all implementation bugs have test coverage before the issue is considered resolved. However, being proactive matters too. Prior to every Rakudo compiler release, the tests of all modules in the ecosystem are run against it. Regressions are noted, and by now can even be automatically bisected to the very commit that caused the regression. Given releases take place every month, and this process can be repeated multiple times a month, there’s a good chance the developer whose work caused the regression will have it relatively fresh in their mind.

A huge number of things have been fixed and made more robust over the last five years, and there are tasks I can comfortably reach for Raku to do today that I wouldn’t have five years back. Just as important, the tooling supporting the development and release process has improved too, and continues to do so.

Modules

There are around 3.5x as many modules available as there were five years ago, which means there’s a much greater chance of finding a module to do what you need. The improvement in quantity is easy enough to quantify (duh!), but the increase in quality has also had a significant impact, given that many problems we need to solve draw on a relatively small set of data formats, protocols, and so forth.

Just to pick a few examples, 5 years ago we didn’t have:

  • Cro, which is now the most popular choice for building web applications and services in Raku. Cro wasn’t a port of any existing library, but rather designed from scratch to make the most of the Raku language.
  • DB::Pg and friends: while DBIish, which existed 5 years ago, has become far more mature, DB::Pg provides a well-engineered alternative that – at least to me – has an API that feels more natural in Raku.
  • Red – an ORM for Raku that puts the meta-programming powers of the language to good use. It’s marked up as a work in progress, but looks promising.
  • LibXML – an extensive Raku binding to the libxml native library
  • IO::Socket::Async::SSL – yes, at the time of the Christmas release, for all of the nice async things we had, there wasn’t yet an asynchronous binding to the OpenSSL library.

Given how regularly I’ve used many of these in my recent work using Raku, it’s almost hard to imagine that five years ago, none of them existed!

An IDE

I was somewhat reluctant to put this in as a headline item, given that I’m heavily involved with it, but the Comma IDE for Raku has been a game changer for my own development work using the language. Granted, IDEs aren’t for everyone or everything; if I’m at the command line and want to write a short script, I’ll open Vim, not fire up Comma. But for much beyond that, I’m glad of the live analysis of my code to catch errors, quick navigation, auto-complete, effortless renaming of many program elements, and integrated test runner. The timeline view can offer insight into asynchronous programs, especially Cro services, while the grammar live view comes in handy when writing parsers. While installing an IDE just for a REPL is a bit over the top, Comma also provides a REPL with syntax highlighting and auto-completion.

Five years ago, the answer to “is there an IDE for Raku” was “uhm, well…” Now, it’s an emphatic yes, and for some folks to consider using the language, that’s important.

Performance

The Christmas release of Raku was not speedy. While just about everyone would agree there’s more work needed in this area, the situation today is a vast improvement on where we were five years ago. This applies at all levels of the stack: MoarVM’s runtime optimizer and JIT have learned a whole bunch of new tricks, calls into native libraries have become far more efficient (to the benefit of all bindings using this), the CORE setting (Raku’s standard library) has seen an incredible number of optimizations, and some key modules outside of the core have received performance analysis and optimization too. Thanks to all of that, Raku can be considered “fast enough” for a much wider range of tasks than it could five years ago.

And yes, the name

Five years ago, Raku wasn’t called Raku. It was called Perl 6. A name change came up now and then, but it was a relatively fringe position. By 2019, it had won sufficient support to take place. A year and a bit down the line, I think we can say the rename wasn’t a panacea, but nor could it be considered a bad thing for Raku. While I’ve personally got a lot of positive associations with the name “Perl”, it does carry an amount of historical baggage. One of the more depressing moments for me was when we announced Comma, and then saw many of the comments from outside of the community consist of the same tired old Perl “jokes”. At least in that sense, a fresh brand is a relief. Time will tell what values people will come to attach to it.

Rational Raku revolution

With software, it feels like the ideal time to start working on something is after having already done most of the work on it. At that point, the required knowledge and insight is conveniently to hand, and at least a good number of lessons wouldn’t need to be learned the hard way again.

Alas, we don’t have a time machine. But we do have an architecture that gives us a chance of being able to significantly overhaul one part of the stack at a time, so we can use what we’ve learned. A number of such efforts are now underway, and stand to have a significant impact on Raku’s next five years.

The most user-facing one is known as “RakuAST”. It involves a rewrite of the Rakudo compiler frontend that centers around a user-facing AST – that is, a document object model of the program. This opens the door to features like macros and custom compiler passes. These may not sound immediately useful to the everyday Raku user, but will enable quite a few modules to do what they’re already doing in a better way, as well as opening up some new API design possibilities.

Aside from providing a foundation for new features, the compiler frontend rewrite that RakuAST entails is an opportunity to eliminate a number of long-standing fragilities in the current compiler implementation. This quite significant overhaul is made achievable by the things that need not change: the standard library, the object system implementation, the compiler backend, and the virtual machine.

A second ongoing effort is to improve the handling of dispatch in the virtual machine, by introducing a single more general mechanism that will replace a range of feature-specific optimizations today. It should also allow better optimization of some things we presently struggle with. For example, using deferral via callsame, or where clauses in multiple dispatch, comes with a high price today (made to stand out because many other constructs in that space have become much faster in recent years). The goal is to do more with less – or at least, with less low-level machinery in the VM.
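
To make those two constructs concrete, here is a small, self-contained sketch (the names are invented for illustration; it is ordinary Raku, not code from Rakudo or MoarVM) showing a where clause participating in multiple dispatch and a method deferring to its parent with callsame:

multi factorial(0) { 1 }
multi factorial(Int $n where * > 0) {       # the `where` clause takes part in dispatch
    $n * factorial($n - 1)
}

class Logger {
    method log($msg) { say $msg }
}
class TimestampedLogger is Logger {
    method log($msg) {
        print "{DateTime.now} ";
        callsame;                           # defer to the parent method with the same arguments
    }
}

say factorial(5);                           # 120
TimestampedLogger.new.log('deliveries started');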

It’s not just the compiler and runtime that matter, though. The recently elected Raku Steering Council stands to provide better governance and leadership than has been achieved in the last few years. Meanwhile, efforts are underway to improve the module ecosystem and documentation.

Today, we almost take for granted much of the progress of the last five years. It’s exciting to think what will become the Raku norm in the next five. I look forward to creating some small part of that future, and especially to seeing what others – perhaps you, dear reader – will create too.

Day 24: Christmas-oriented programming, part deux

In the previous installment of this series of articles, we started with a straightforward script, and we wanted to arrive at a sound object-oriented design using Raku.

Our (re)starting point was this user story:

Continue reading “Day 24: Christmas-oriented programming, part deux”

Day 23: Christmas-oriented design and implementation

Elves graduating from the community college, in front of a Santa Statue

Every year, by the beginning of the school year – which starts around January 8th at the North Pole, after every version of the Christmas gift-giving spirit has made their rounds – Santa needs to sit down to schedule the classes of the North Pole Community College. These elves need continuous education, and they really need to learn about those newfangled toys, apart from the tools and skills of the trade.

Continue reading “Day 23: Christmas-oriented design and implementation”

Day 22: What’s the point of pointfree programming?

He had taken a new name for most of the usual reasons, and for a few unusual ones as well, not the least of which was the fact that names were important to him.

— Patrick Rothfuss, The Name of the Wind

If you’re a programmer, there’s a good chance that names are important to you, too. Giving variables and functions well chosen names is one of the basic tenets of writing good code, and improving the quality of names is one of the first steps in refactoring low-quality code. And if you both are a programmer and are at all familiar with Raku (renamed from “Perl 6” in 2019), then you are even more likely to appreciate the power and importance of names.

This makes the appeal of pointfree programming – which advocates for removing many of the names in your code – a bit mysterious. Given how helpful good names are, it can be hard to understand why you’d want to eliminate them.

This isn’t necessarily helped by some of the arguments put forward by advocates of pointfree programming (which is also sometimes called tacit programming). For example, one proponent of pointfree programming said:

Sometimes, especially in abstract situations involving higher-order functions, providing names for tangential arguments can cloud the mathematical concepts underlying what you’re doing. In these cases, point-free notation can help remove those distractions from your code.

That’s not wrong, but it’s also not exactly helpful; when reading that, I find myself thinking “sometimes; OK, when? in abstract situations; OK, what sort of situations?” And it seems like I’m not the only one with a similar set of questions, as the top Hacker News comment shows. Given arguments like these, I’m not at all surprised that many programmers dismiss pointfree programming in essentially the same way Wikipedia does: according to Wikipedia, pointfree programming is “of theoretical interest” but can make code “unnecessarily obscure”.

This view – though understandable – is both mistaken and, I believe, deeply unfortunate. Programming in a pointfree style can make code far more readable; done correctly, it makes code less obscure rather than more. In the remainder of this post I’ll explain, as concretely as possible, the advantage of coding with fewer names. To keep myself honest, I’ll also refactor a short program into pointfree style (the code will be in Raku, but both the before and after versions should be approachable to non-Raku programmers). Finally, I’ll close by noting a handful of the ways that Raku’s “there’s more than one way to do it” philosophy makes it easier to write clear, concise, pointfree code (if you want to).

The fundamental point of pointfree

I said before that names are important, and I meant it. My claim is the one that G.K. Chesterton (or his dog) might have made if only he’d cared about writing good code: we should use fewer names not because names are unimportant but precisely because of how important names are.

Let’s back up for just a minute. Why do names help with writing clear code in the first place? Well, most basically, because good names convey information. sub f($a, $b) may show you that you’ve got a function that takes two arguments – but it leaves you totally in the dark about what the function does or what role the arguments play. But everything is much clearer as soon as we add names: sub days-to-birthday($person, $starting-date). Suddenly, we have a much better idea what the function is doing. Not a perfect idea, of course; in particular, we likely have a number of questions of the sort that would be answered by adding types to the code (something Raku supports). But it’s undeniable that the names added information to our code.

So if adding names adds info, it’ll make your code clearer and easier to understand, right? Well, sure … up to a point. But this is the same line of thinking that leads to pages and pages of loan “disclosures”, each of which is designed to give you more information about the loan. Despite these intentions, anyone who has confronted a stack of paperwork the approximate size of the Eiffel Tower can attest that the cumulative effect of this extra info is to confuse readers and obscure the important details. Excessive names in code can fall into the same trap: even if each name technically adds info, the cumulative effect of too many names is confusion rather than clarity.

Here’s the same idea in different words: what names add to your code is not just extra info but also extra emphasis. And the thing about emphasis – whether it comes from bold, all-caps, or naming – is that it loses its power when overused. Giving everything a name is the same sort of error as writing in ALL-CAPS. Basically, don’t be this guy:

<Khassaki>:      HI EVERYBODY!!!!!!!!!!  
<Judge-Mental>:  try pressing the Caps Lock key  
<Khassaki>:      O THANKS!!! ITS SO MUCH EASIER TO WRITE NOW!!!!!!!  
<Judge-Mental>:  f**k me  

source (expurgation added, mostly to have an excuse to use the word expurgation).

I believe that the fundamental benefit of using pointfree programming techniques to write code with fewer names is that it allows the remaining names to stand out more – which lets them convey more information than a sea of names would do.

What does it mean to “understand” a line of code?

Do you understand this line of Raku code?

$fuel += $mass

Let’s imagine how a very literal programmer – we’ll call them Literal Larry – might respond. (Literal Larry is, of course, not intended to refer to Raku founder Larry Wall. That Larry may have been accused of various stylistic flaws over the years, but never of excessive literalness.)

Literal Larry might say, “Of course I understand what that line does! There’s a $fuel variable, and it’s incremented by the value of the $mass variable. Could it be any more obvious?”. But my response to Larry (convenient strawman that he is) would be, “You just told me what that line says, but not what it does. Without knowing more of the context around that line, in fact, we can’t know what that line does. Understanding that single – and admittedly simple! – line requires that we hold the context of other lines in our head. Worse, because it’s changing the value of one variable based on the value of another, understanding it requires us to track mutable state – one of the fastest ways to add complexity to a piece of code.”

And that sets up my second claim about coding in a pointfree style: It often reduces the amount of context/state that you need in your head to understand any given line of code. Pointfree code reduces the reliance on context/state in two ways: first, to the extent that we totally eliminate some named variables, we obviously no longer need to mentally track the state of those variables. Less obviously (but arguably more importantly), a pointfree style naturally pushes you towards limiting the scope of your variables and reduces the number you need to keep track of at any one time. (You’ll see this in action as we work through the example below.)

A pointed example

Despite keeping our discussion as practical as possible, I worry that it has drifted a bit away from the realm of the concrete. Let’s remedy that by writing some actual code! I’ll present some code in a standard procedural style, refactor it into a more pointfree style, and discuss what we get out of the change.

But where should we get our before code? It needs to be decently written – my exchange with Literal Larry was probably enough strawmanning for one post, and I don’t want you to think that the refactored version is only an improvement because the original was awful. At the same time, it shouldn’t be great idiomatic Raku code, because that would mean using enough of Raku’s superpowers to reduce the code’s accessibility (I want to explain what’s going on in the after code, but don’t want to get bogged down teaching the before). It should also be just the right length – too short, and we won’t be able to see the advantages of reducing context; too long, and we won’t have space to walk through it in any detail.

Fortunately, the Raku docs provide the perfect before code: the Raku by example 101 code. This simple script is not idiomatic Raku; it’s a program that does real (though minimal) work while using only the very basics of Raku syntax. Here’s how that page describes the script’s task:

Suppose that you host a table tennis tournament. The referees tell you the results of each game in the format Player1 Player2 | 3:2, which means that Player1 won against Player2 by 3 to 2 sets. You need a script that sums up how many matches and sets each player has won to determine the overall winner.

The input data (stored in a file called scores.txt) looks like this:

Beth Ana Charlie Dave
Ana Dave | 3:0
Charlie Beth | 3:1
Ana Beth | 2:3
Dave Charlie | 3:0
Ana Charlie | 3:1
Beth Dave | 0:3

The first line is the list of players. Every subsequent line records a result of a match.

I believe that the code should be legible, even to programmers who have not seen any Raku. The one hint I’ll provide for those who truly haven’t looked at Raku (or Perl) is that @ indicates that a variable is array-like, % indicates that it’s hashmap-like, and $ is for all other variables. If any of the other syntax gives you trouble, check out the full walkthrough in the docs.

Here’s the 101 version:

use v6d;
# start by printing out the header.
say "Tournament Results:\n";

my $file  = open 'scores.txt'; # get filehandle and...
my @names = $file.get.words;   # ... get players.

my %matches;
my %sets;

for $file.lines -> $line {
    next unless $line; # ignore any empty lines

    my ($pairing, $result) = $line.split(' | ');
    my ($p1, $p2)          = $pairing.words;
    my ($r1, $r2)          = $result.split(':');

    %sets{$p1} += $r1;
    %sets{$p2} += $r2;

    if $r1 > $r2 {
        %matches{$p1}++;
    } else {
        %matches{$p2}++;
    }
}

my @sorted = @names.sort({ %sets{$_} }).sort({ %matches{$_} }).reverse;

for @sorted -> $n {
    my $match-noun = %matches{$n} == 1 ?? 'match' !! 'matches';
    my $set-noun   = %sets{$n} == 1 ?? 'set' !! 'sets';
    say "$n has won %matches{$n} $match-noun and %sets{$n} $set-noun";
}

OK, that was pretty quick. It uses my to declare 13 different variables; let’s see what it would look like if we declare 0. Before I start, though, one note: I said that the code above isn’t idiomatic Raku, and the code below won’t be either. I’ll introduce considerably more of Raku’s syntax where it makes the code more tacit, but I’ll still steer clear of some forms I’d normally use that aren’t related to refactoring the code in a more pointfree style. I also won’t make unrelated changes (e.g., removing mutable state) that I’d normally include. Finally, this code also differs from typical Raku (at least the way I write it) by being extremely narrow. I typically aim for a line length under 100 characters, but because I’d like this to be readable on pretty much any screen, these lines never go above 45.

With that throat-clearing out of the way, let’s get started. Our first step is pretty much the same as in the 101 code; we open our file and iterate through the lines.

open('scores.txt')
  ==> lines()

You can already see one of the key pieces of syntax we’ll be using to adopt a pointfree style: ==>, the feed operator. This operator takes the result from open('scores.txt') and passes it to lines() as its final argument. (This is similar to, but not exactly the same as, calling a .lines() method on open('scores.txt'). Most significantly, ==> passes a value as the last parameter to the following function; calling a method is closer to passing a value as the first parameter.)
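
If the feed’s argument-passing order is new to you, this tiny one-liner (not part of the tournament script, just an illustration) shows where the fed value ends up:

(1, 2, 3) ==> map(* + 1) ==> say();    # same as say(map(* + 1, (1, 2, 3))); prints (2 3 4)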

Now we’re dealing with a list of all the lines in our input file – but we don’t actually need all the lines, because some are useless (to us) header lines. We’ll solve this in basically the same way we would on the command line: by using grep to limit the lines to just those we care about. In this case, that means just those that have the “ | ” (space-pipe-space) delimiter that occurs in all valid input lines.

  ==> grep(/\s '|' \s/)

A few syntax notes in passing: first, Raku obviously has first-class support for regular expressions. Second, and perhaps more surprisingly, note that Raku regexes default to being insensitive to whitespace; /'foo' 'bar'/ matches ‘foobar’, not ‘foo bar’. Finally, Raku regexes require non-alphanumeric characters to be enclosed in quotes (as with '|' above) before they match literally.

After using grep to limit ourselves to the lines we care about, we’re dealing with a sequence of lines something like Ana Dave | 3:0. Our next task is to convert these lines into something more machine readable. Since we just went over the regex syntax, let’s stick with that approach.

  ==> map({
      m/ $<players>=( (\w+)  \s (\w+) ) 
                       \s    '|'  \s
         $<sets-won>=((\d+) ':' (\d+) )/;
      [$<players>[0], $<sets-won>[0]],
      [$<players>[1], $<sets-won>[1]]
  })

This uses the Regex syntax we introduced above and adds a bit on top. Most importantly, we’re now naming our capture groups: we have one capture group named players that captures the two space-separated player names before the | character. (Apparently our tournament only identifies players with one-word names, a limitation that was present in the 101 code as well.) And the sets-won named capture group extracts out the :-delimited set results.

Once we’ve captured the names and scores for that match, we associate the correct scores with the correct names and create a 2×2 matrix/nested array with our results.

Actually, though, we’re not quite done with everything we want to do inside this map – we’ve given meaning to the order of the elements within each row, but the order of the rows themselves is currently meaningless. Let’s fix that by sorting our returned array so that the winner is always at the front:

      [$<players>[0], $<sets-won>[0]],
      [$<players>[1], $<sets-won>[1]]
      ==> sort({-.tail})

With this addition, our code so far is:

open('scores.txt')
  ==> lines()
  ==> grep(/\s '|' \s/)
  ==> map({
      m/ $<players>=( (\w+)  \s (\w+) ) 
                       \s    '|'  \s
         $<sets-won>=((\d+) ':' (\d+) )/;
      [$<players>[0], $<sets-won>[0]],
      [$<players>[1], $<sets-won>[1]]      
      ==> sort({-.tail})
  })

At this point, we’ve processed our input lines into arrays; we’ve gone from something like Ana Dave | 3:0 to something a bit like

[ [Ana,  3],
  [Dave, 0] ]

Now it’s time to start combining our separate arrays into a data structure that represents the results of the entire tournament. As in most languages these days, Raku does this with reduce (some languages call the same operation fold). We’re going to use reduce to build a single hashmap out of our list of nested arrays. However, before we can do so, we’re going to need to add an appropriate initial value to reduce onto (here, an empty Hash).
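
As a quick aside before applying it to our data, here is reduce in its simplest feed-driven form, on a plain list of numbers rather than the tournament results:

(1, 2, 3, 4) ==> reduce(&[+]) ==> say();   # &[+] is the addition operator as a function; prints 10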

Raku gives us a solid half-dozen ways to do so – including specifying an initial value when you call reduce, much like you would in modern JavaScript. I’m going to accomplish the same thing differently, both because it’s more fun and because it lets me introduce you to 5 useful pieces of syntax in just 10 characters, which may be some sort of a record. Here’s the line:

  ==> {%, |$_}()

OK, there’s a lot packed in there! Let’s step through it. {...} is Raku’s anonymous block (i.e., lambda) syntax. So {...}() would normally create an anonymous block and then call it without any arguments. However, as we said above, ==> automatically passes the return value of its left-hand side as the final argument to its right-hand side. So ==> {...}() calls the block with the value that was fed into the ==>.

Since this block doesn’t specify a signature (more on that very shortly), it doesn’t have any named parameters at all; instead, any values the block is called with are placed in the topic variable – which is accessed with $_. Putting what we have so far together, we can show a complex (but succinct!) way to do nothing: ==> {$_}(). That expression feeds a value into a block, loads the value into the topic variable, and then returns it without doing anything at all.

Our line did something, however – after all, we have 4 more characters and 2 new concepts left in our line! Starting at the left, we have the % character, which you may recognize as the symbol that indicates that a variable is hash-like (Associative, if we’re being technical). On its own like this, it effectively creates an empty hash – which we could also have done with Hash.new, {}, or %(), but I like % best here. And the , operator, which we’ve already used without remarking on, combines its arguments into a list.

Here’s an example using the syntax we’ve covered so far:

[1, 2, 3] ==> {0, $_}()

That would build a list out of 0 and [1, 2, 3]. Specifically, it would build a two-element list; the second element would be the array [1, 2, 3]. That is not quite what we want, because we want to add % onto the front of our existing list instead of creating a new and more nested list.

As you may have guessed, the final character we have left – | – solves this problem for us. This Slip operator is one of my favorite bits of Raku cleverness. (Well, top 20 anyway – there are a lot of contenders!) The | operator transforms a list into a Slip, which is “a kind of List that automatically flattens”, as the docs put it. In practice, this means that Slips merge into lists instead of becoming single elements in them. To return to our earlier example,

[1, 2, 3] ==> {0, |$_}()

produces the four-element list (0, 1, 2, 3) instead of the two-element list (0, [1, 2, 3]) we got without the |.

Putting all this together, we are now in a position to easily understand the ~15 character (!) line of code we’ve been talking about. Recall that we’d just used map to transform our list of lines into a list of 2×2 matrices. If we’d printed them out, we would see something kind of like:

( [ [Ana,  3],
    [Dave, 0] ],
  ...
)

When we feed this array into the {%, |$_}() block, we slip it into a list with the empty hash, and end up with something like:

( {},
  [ [Ana,  3],
    [Dave, 0] ],
  ...
)

With that short-but-dense line out of the way, we can proceed on to calling reduce. As in many other languages, we’ll pass in a function for reduce to use to combine our values into a single result value. We’ll do this with the block syntax we just introduced (see, taking our time on that line is already starting to pay off!). So it will look something like this:

==> reduce( { 
    # Block body goes here
})

Before filling in that body, though, let’s say a word about signatures (I told you it’d come up soon). As we discussed, when you don’t specify a signature for a block, all the arguments passed to the block get loaded into the topic variable $_. We can do anything we need to by manipulating/indexing into the topic variable, but that could get pretty verbose. Fortunately, we can specify the signature for a block by placing parameter names between -> and the opening {. Thus, -> $a, $b { $a + $b } is a block that accepts exactly two arguments and returns the sum of its arguments.

In our case, we know that the first argument to reduce is going to be the hash we’re building up to track the total wins in the tournament and the second will be the 2×2 array that represents the results of the next match. That gives us a signature of

==> reduce(-> %wins, @match-results { 
    # Block body goes here
})

So, how do we fill in the body? Well, since we previously sorted the array we’re now calling @match-results, we know that the first row contains the person who won the most sets (and therefore the match). More specifically, the first element in the first row contains that person’s name. So we want the first element of the first row – that is, the element that would be at (0, 0) if our array were laid out in 2D. Fortunately, Raku supports directly indexing into multi-dimensional arrays, so accessing this name is as simple as @match-results[0;0]. This means we can update our hash to account for the match winner with

      %wins{@match-results[0;0]}<matches>++;

Handling the sets is very similar – the biggest difference is that we iterate through both rows of @match-results instead of indexing into the first row:

      for @match-results -> [$name, $sets] {
          %wins{$name}<sets> += $sets;
      }

Note the -> [$name, $sets] signature above. This shows Raku’s strong support for destructuring assignment, another key tool in avoiding explicit assignment statements. -> [$a, $b] tells Raku that the block accepts a single array with two elements in it and assigns names to each. It’s equivalent to writing -> @array { my $a = @array[0]; my $b = @array[1]; ... }.

(And if the idea of using destructuring assignment to avoid assignment feels like cheating in terms of pointfree style, then hold that thought because we’ll come back to it when we get to the end of this example.)

At the end of our reduce block, we need to return the %wins hash we’ve been building. Putting it all together gives us

  ==> reduce(-> %wins, @match-results {
      %wins{@match-results[0;0]}<matches>++;
      for @match-results -> [$name, $sets] {
          %wins{$name}<sets> += $sets;
      }
      %wins
  })

At this point, we’ve built a hash-of-hashes that contains all the info we need; we’re done processing our input. Specifically, our hash contains keys for each of the player names in the tournament; the value of each is a hash showing that player’s total match and set wins. It looks a bit like this:

{ Ana  => { matches => 2,
            sets    => 8 }
  Dave => ...,
  ...
}

This contains all the information we need but not necessarily in the easiest shape to work with for generating our output. Specifically, we would like to print results in a particular order (winners first) but we have our data in a hash, which is inherently unordered. Thus – as happens so often – we need to reshape our data from the shape that was the best fit for processing our input data into the shape that is the best fit for generating our output.

Here, that means going from a hash-of-hashes to a list of hashes. We do so by first transforming our hash into a list of key-value pairs and then mapping that list into a list of hashes. In that map, we need to add the player’s name (info that was previously stored in the key of the outer hash) into the inner hash – if we skipped that step, we wouldn’t know which scores went with which players.

Here’s how that looks:

  ==> kv()
  ==> map(-> $name, %_ { %{:$name, |%_} })

I’ll note, in passing, that our map uses both destructuring assignment and the | slip operator to build our new hash. After this step, our data looks something like

( { name    => "Ana",
    matches => 2,
    sets    => 8 }
  ...
)

This list isn’t inherently unordered the way a hash is, but we haven’t yet put it in any meaningful order. Let’s do so now.

  ==> sort({.<matches>, .<sets>, .<name>})
  ==> reverse()

Note that this preserves the somewhat wackadoodle sort order from the original code: sort by match wins, high to low; break ties in matches by set wins; break ties in set wins by reverse alphabetical order.

At this point, we have all our output data organized properly; all that is left is to format it for printing. When printing our output, we need to use the correct singular/plural affixes – that is, we don’t want to say someone won “1 sets” or “5 set”.

Let’s write a simple helper function to handle this for us. We could obviously write a function that tests whether we need a singular or plural affix, but instead let’s take this chance to look at one more Raku feature that makes it easier to write pointfree code: multi-dispatch functions that perform different actions based on how they’re called.

The function we want should accept a key-value pair and return the singular version of the key when the associated value is 1; otherwise, it should return the plural version of the key. Let’s start by stating what all versions of our function have in common using a proto statement:

proto kv-affix((Str, Int $v) --> Str) {{*}}

A few things to know about that proto statement: This is the first time we’ve added type constraints to our code, and they work just about as you’d expect. kv-affix can only be called with a string as its first argument and an integer as its second (this protects us from calling it with the key and value in the wrong order, for example). It’s also guaranteed to return a string. Additionally, note that we can destructure using a type (Str, here) without needing to declare a variable – handy for situations like this, where we want to match on a type without needing to use the value.

Finally, note that the proto is entirely optional; indeed, I don’t think that I’d necessarily use one here. But I would have felt remiss if we didn’t discuss Raku’s support for type constraints, which is generally quite helpful in writing pointfree code (even if we haven’t really needed it today).

Next, let’s handle the case where we need to return the singular version of the key:

multi kv-affix(($_, 1)) { S/e?s$// }

As you can see, Raku lets us destructure/pattern match with literals – this version of our multi will only be invoked when kv-affix is called with 1 as its second argument. Additionally, notice that we’re destructuring the first parameter into $_, the special topic variable. Setting the topic variable not only lets us use that variable without giving it a name, but it also enables all the tools Raku reserves for the current topic. (If we want these tools without destructuring into the topic variable, we can also set the topic with with or given.)

Setting the topic to the key we’re modifying is helpful here because it lets us use the S/// non-destructive substitution operator. This operator matches a regex against the topic and then returns the string that results from replacing the matched portion of the string. Here, we match 0 or 1 e’s (e?) followed by an ‘s’, followed by the end of the string ($). We then replace that ‘s’ or ‘es’ with nothing, effectively trimming the plural affix from the string.
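
Here is that behaviour in a tiny standalone snippet (the word ‘matches’ is just a convenient test value, not part of the tournament script):

given 'matches' {
    say S/e?s$//;    # «match» – the trimmed copy is returned...
    say $_;          # «matches» – ...while the topic itself is left untouched
}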

The final multi candidate is trivial. It just says to return the unaltered plural key when the previous multi candidate didn’t match (that is, when the value isn’t 1).

multi kv-affix(($k, $)) { $k }

(We use $ as a placeholder for a parameter when we don’t need to care about its type or its value.)

With those three lines of code, we now have a little helper function that will give us the correct singular/plural version of our keys. In all honesty, I’m not sure it was actually worth using a multi here. This might be a situation where a simple ternary condition – something like sub kv-affix(($_, $v)) { $v ≠ 1 ?? $_ !! S/e?s$// } – might have done the trick more concisely and just as clearly. But that wouldn’t have given us a reason to talk about multis, and those are just plain fun.

In any event, now that we have our helper function, formatting each line of our output is fairly trivial. Below, I do so with the venerable C-style sprintf, but Raku offers many other options for formatting textual output if you’d prefer something else.

  ==> map({
      "%s has won %d %s and %d %s".sprintf(
          .<name>,
          .<matches>, kv-affix(.<matches>:kv),
          .<sets>,    kv-affix(.<sets>:kv   )
      )})

And once we’ve formatted each line of our output, the final step is to add the appropriate header, concatenate our output lines, and print the whole thing.

  ==> join("\n", "Tournament Results:\n")
  ==> say();

And we’re done.

Evaluating our pointfree refactor

Let’s take a look at the code as a whole and talk about how it went.

use v6d;
open('scores.txt')
  ==> lines()
  ==> grep(/\s '|' \s/)
  ==> map({
      m/ $<players>=( (\w+)  \s (\w+) ) 
                       \s    '|'  \s
         $<sets-won>=((\d+) ':' (\d+) )/;
      [$<players>[0], $<sets-won>[0]],
      [$<players>[1], $<sets-won>[1]]
      ==> sort({-.tail}) })
  ==> {%, |$_}()
  ==> reduce(-> %wins, @match-results {
      %wins{@match-results[0;0]}<matches>++;
      for @match-results -> [$name, $sets] {
          %wins{$name}<sets> += $sets;
      }
      %wins })
  ==> kv()
  ==> map(-> $name, %_ { %{:$name, |%_} }) 
  ==> sort({.<matches>, .<sets>, .<name>})
  ==> reverse()
  ==> map({
      "%s has won %d %s and %d %s".sprintf(
          .<name>,
          .<matches>, kv-affix(.<matches>:kv),
          .<sets>,    kv-affix(.<sets>:kv) )})
  ==> join("\n", "Tournament Results:\n")
  ==> say();

proto kv-affix((Str, Int) --> Str) {{*}}
multi kv-affix(($_, 1)) { S/e?s$// }
multi kv-affix(($k, $)) { $k }

So, what can we say about this code? Well, at 32 lines of code, it’s longer than the 101 version (and, even though these lines are pretty short, it’s longer by character count as well). So this version doesn’t win any prizes for concision. But that was never our goal.

So how does it do on the goal we started out with – reducing assignments? Well, if we channel Literal Larry, we can say that it has zero assignment statements; it never assigns a value to a variable with my $name = 'value' or similar syntax. In contrast, the 101 code used my to assign to a variable over a dozen times. So, from a literal perspective, we succeeded.

But, as we already noted, ignoring destructuring assignment feels very much like cheating. Similarly, using named captures in a regex is essentially a form of assignment/naming. So, if we adopt an inclusive view of assignment, the 101 code has 15 assignments and our refactored code has 6. So a significant drop, but nothing like an order of magnitude difference.

But trying to evaluate our refactor by counting assignment statements is probably a fool’s errand to begin with. What I really care about – and, I suspect, what you care about too – is the clarity of our code. To some degree, that’s inherently subjective and depends on your personal familiarity and preferences – by my lights, ==> {%, |$_}() is extremely clear. Maybe, after we spent 3 paragraphs on that line, you might agree; or you might not – and I doubt anything further I could say would change your mind. So, by my lights, the refactored code looks clearer.

But clarity is not entirely a subjective matter. I argue that the refactored code is objectively clearer – and in exactly the ways the pointfree style is supposed to promote. Back at the beginning of this post, I claimed that writing tacit code has two main benefits: it provides better emphasis in your code, and it reduces the amount of context you need to hold in your head to understand any particular part of the code. Let’s look at each of these in turn.

In terms of emphasis, there’s one question I like to ask: what identifiers are in scope at the global program (or module) scope? Those identifiers receive the most emphasis; in an ideal world, they would be the most important. In the refactored code, there are no variables at all in the global scope and only one item: the kv-affix function. This function is appropriately in the global scope since it is of global applicability (indeed, it could even be a candidate to be factored out into a separate module if this program grew).

Conversely, in the 101 code the global-scope variables are $file, @names, %matches, %sets, and @sorted. At least a majority of those are pure implementation details, undeserving of that level of emphasis. And some (though this bleeds into the “context” point, discussed below) are downright confusing in a global scope. What does @names refer to, globally? How about %matches? (does it change your answer if I tell you that Match is a Raku type?) What about %sets? (also a Raku type). Of course, you could argue that these names are just poorly chosen, and I wouldn’t necessarily disagree. But coming up with good variable names is famously hard, and figuring out names that are clear in a global scope is even harder – there are simply more opportunities for conceptual clash.

To really emphasize this last point, take a look at the final line of the refactored code:

multi kv-affix(($k, $)) { $k }

If the name $k occurred in a global context, it would be downright inscrutable. It could be an iteration variable (old-school programmers tend to start with i, and then move on to j and k). It could stand for degrees Kelvin or, oddly enough, the Coulomb constant. Or it could be anything, really.

But because its scope is more limited, the meaning is clear. The function takes a key-value pair (typically generated in Raku with the .kv method or the :kv adverb) and is named kv-affix. Given those surroundings, it’s no mystery at all that $k stands for “key”. Keeping items out of the global scope both provides better emphasis and provides a less confusing context to evaluate the meaning of different names.

The second large benefit I claimed for pointfree code is that it reduces the amount of context/state you need to hold in your head to understand any given bit of code. Comparing these two scripts also supports this point. Take a look at the last line of the 101 code:

say "$n has won %matches{$n} $match-noun and %sets{$n} $set-noun";

Mentally evaluating this line requires you to know the value of $n (defined 3 lines above), $match-noun (2 lines above), $set-noun (1 line), %sets (24 lines), and %matches (25 lines). Considering how simple this script is, that is a lot of state to track!

In contrast, the equivalent portion of the refactored code is

"%s has won %d %s and %d %s".sprintf(
    .<name>,
    .<matches>, kv-affix(.<matches>:kv),
    .<sets>,    kv-affix(.<sets>:kv) )

Evaluating the value of this expression only requires you to know the value of the topic variable (defined one line up) and the pure function kv-affix (defined 3–5 lines below). This is not an anomaly: every variable in the refactored code is defined no more than 5 lines away from where it is last used.

(Of course, writing code in a pointfree style is neither sufficient nor necessary to limit the scope of variables. But as this example illustrates – and my other experience backs up – it certainly helps.)

Raku supports pragmatic (not pure) pointfree programming

A true devotee of pointfree programming would likely object to the refactored code on the grounds that it’s not nearly tacit enough. Despite avoiding explicit assignment statements, it makes fairly extensive use of named function parameters and destructuring assignment; it just isn’t pure.

Nevertheless, the refactored code sits in a pragmatic middle ground that I find highly productive: it’s pointfree enough to gain many of the clarity, context, and emphasis benefits of that style without being afraid to use a name or two when that adds clarity.

And this middle ground is exactly where Raku shines (at least in my opinion! It’s entirely possible to write Raku in a variety of different styles and many of them are not in the least bit pointfree).

Here are some of the Raku features that support pragmatic pointfree programming (most, but not all, of which we saw above):

  • the ==> feed operator, which pipes the result of one call straight into the next without any intermediate variables
  • blocks and the $_ topic variable, so that simple transformations need no named parameters at all
  • destructuring in signatures (-> [$name, $sets] and friends), which names pieces of data only right where they are used
  • the | Slip operator, for flattening values into a surrounding list without a temporary array
  • named regex captures, multi-dimensional indexing ([0;0]), and the :kv adverb, for pulling data apart in place
  • multi-dispatch with type and literal constraints, plus the S/// non-destructive substitution on the topic
  • the Whatever star (*), for building tiny anonymous functions without naming a single parameter

If you’re already a Raku pro, I hope this list and this post have given you some ideas for some other ways to do it. If you’re new to Raku, I hope this post has gotten you excited to explore some of the ways Raku could expand the way you program. And if you’re totally uninterested in writing Raku code – well, I hope you’ll reconsider, but even if you don’t, I hope that this post gave you something to think about and left you with some ideas to try out in your language of choice.

Day 21: The Story Of Elfs, and Roles, And Santas’ Enterprise

Let’s be serious. After all, we’re grown-up people and know the full truth about Santa: he is a showman, and he is a top manager of Santa’s family business. No one knows his exact position, because we must not forget about Mrs. Santa, whose share in running the company is at least equal. The position is not relevant to our story anyway. What is important, though, is that running such a huge venture requires a lot of skills. Not to mention that the venture itself is a tremendous show on its own, as one can find out from documentaries like The Santa Clause and many others filmed over the last several decades of human history.

What would be the hardest part of running The North Pole Inc.? Logistics? Yeah, but with all the magic of the sleds, and the reindeer, and the Christmas night, this task is not that hard to get done. Manufacturing? That task has been delegated to small outsourcing companies like Lego, Nintendo, and dozens of others across the globe.

What else remains? The employees. Elves. And, gosh, have you ever tried to organize them? Don’t even think of trying unless you have a backup in the form of a padded room, served by polite personnel with a reliable supply of pills, where you’d be spending your scarce vacation days. It’s an inhumane task, because when one puts together thousands, if not millions (as some estimates tell) of ambitious stage stars (no elf would ever consider himself a supporting actor!), each charged with an amount of energy more appropriate to a small nuclear reactor… You know…

How do the Santas manage? Sure, they’re open-hearted, all-forgiving beyond average human understanding. But that’s certainly not enough to build a successful business! So, there must be a secret ingredient, something common to both commercial structures and shows. And I think it’s well-done role assignment, which turns the Brownian motion of elf personalities into a self-organized structure.

In this article I won’t be telling how Raku helps the Santas sort things out or ease certain tasks. Instead I’ll try to describe some events happening within The North Pole company with the help of the Raku language, and specifically its OO capabilities.

Elves

Basically, an elf is:

class Elf is FairyCreature {...}

For some reason, many of them don’t like this definition; but who are we to judge them, as long as many humans still don’t consider themselves a kind of ape? Similarly, some elves consider fairy to be archaic and outdated and not applicable to them, modern beings.

But I digress…

The above definition is highly oversimplified, because if we start traversing the subtree of the FairyCreature class we’re gonna find such diverse species in it as unicorns, goblins, gremlins, etc., etc., etc. Apparently, there must be something else defining the difference – something that would provide properties sufficiently specific to each particular kind of creature. If we expand the definition of the Elf class we’re gonna see lines like these:

class Elf is FairyCreature {
    also does OneHead;
    also does UpperLimbs[2];
    also does LowerLimbs[2];
    also does Magic;
    ...
}

I must make a confession here: I wasn’t allowed to see the full sources. When requested for access to the fairy repository the answer was: “Hey, if we reveal everything it’s not gonna look like magic anymore!” So, some code here is real, and some has been guessed out. I won’t tell you which is which; after all, let’s keep it magic!

Each line is a role defining a property, or a feature, or a behavior intrinsic to a generic elf (don’t mix it up with the spherical cow). So, when we see a line like that we say: Elf does Magic; or, in other words: class Elf consumes the role Magic.

I apologize for not explaining in detail to Raku newcomers what a role is; hopefully the link will be helpful here. For those who know Java (I’m a dinosaur, I don’t), a role is somewhat similar to an interface, but better. It can define attributes and methods to be injected into a consuming class; it can require certain methods to be defined; and it can specify what other roles the class will consume, and what other classes it will inherit from.
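
For the curious, here is a tiny sketch of the first two of those abilities – the names below are invented for illustration and are not from the fairy repository:

role Magic {
    has Int $.mana = 100;                 # attributes get injected into the consumer
    method cast(Str $spell) {             # so do methods
        say "Casting $spell with { $!mana } mana";
    }
    method wand { ... }                   # a stubbed method the consuming class *must* provide
}

class WorkshopElf does Magic {
    method wand { 'holly twig' }          # satisfies the role's requirement
}

WorkshopElf.new.cast('tinsel-sparkle');   # OUTPUT: «Casting tinsel-sparkle with 100 mana␤»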

As a matter of fact, because of the complexity of the elf species, the number of roles they do is too high to mention all of them here. Normally, when a class consumes only a few of them, it’s OK to write the code in another way:

class DisgustingCreature is FairyCreature does Slimy does Tentacles[13] { ... }

But numerous roles are better put into the class body with the prefix also.

When Santas Hire An Elf

It would probably be incorrect to say that elves are hired by the Santas. In fact, as we know, all of them work for The North Pole Inc. exclusively. Yet, my thinking is that at some point in history the hiring did take place. Let’s try to imagine how it could have happened.

One way or another, there is a special role:

role Employee {...}

And there is a problem too: our class Elf is already composed and immutable. Moreover, each elf is an object of that class! Or, saying the same in the Raku language: $pepper-minstix.defined && $pepper-minstix ~~ Elf. OK, what’s the problem? If one tries to $pepper-minstix.hire(:position("security officer"), :company("The North Pole Inc."), ...) a big boom! will happen because of no such method ‘hire’ for invocant of type ‘Elf’. Surely, the boom! is expected, because a long, long time ago elves and work were as compatible as Christmas and summer! But then there were the Santas. And what they did is called mixing a role into an object:

$pepper-minstix does Employee;

From the outside, the does operator adds the content of the Employee role to the object on its left-hand side, making all of the role’s attributes and methods available on the object. Internally, it creates a new class which consumes the role Employee and for which the original $pepper-minstix.WHAT is the only parent class. Eventually, after the does operator, say $pepper-minstix.WHAT will output something like (Elf+{Employee}). This is now the new class of the object held by the $pepper-minstix variable.
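
A stripped-down, runnable version of that exchange (just the bare bones – the real Elf and Employee are, of course, far richer) could look like this:

class Elf      { }
role  Employee { method hire(:$position) { say "Hired as $position!" } }

my $pepper-minstix = Elf.new;
# $pepper-minstix.hire;          # boom! – No such method 'hire' for invocant of type 'Elf'
$pepper-minstix does Employee;   # mix the role into this one object
say $pepper-minstix.WHAT;        # (Elf+{Employee})
say $pepper-minstix ~~ Elf;      # True – the original class is the mixin's parent
$pepper-minstix.hire(:position('security officer'));   # Hired as security officer!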

Such a cardinal change in life made the elves much happier! Being joyful creatures anyway, they now got a great chance to also be useful by sharing their joy with all the children, and sometimes not only children. Only one thing worried them, though. You know, it’s really impossible to find two identical people; all the more so, there are no two identical elves. But work? Wouldn’t it level them all down in a way? The Santas wouldn’t be the Santas if they didn’t share these worries with their new friends. To understand their solution, let’s see what the Employee role does for us. Of most interest to us are the following lines:

has $.productivity;
has $.creativity;
has $.laziness;
has $.position;
has $.speciality;
has $.department;
has $.company;

For simplicity, I don’t use typed attributes in the snippet, though they’re there in the real original code. For example, the $.laziness attribute is, among other things, a coefficient used in a formula calculating how much time is spent on coffee or eggnog breaks. The core of the formula is something like:

method todays-coffee-breaks-length {
    $.company.work-hours * $.laziness * (1 + 0.2.rand)
}

Because they felt responsible for the children, the elves agreed to limit their maximum laziness level. Therefore the full definition of the attribute is something like:

has Num:D $.laziness where * < 0.3;

If anybody thinks that the maximum is too high, then they don’t have the Christmas spirit in their hearts! Santa Claus was happy with it, so why wouldn’t we be? I’m personally sure his satisfaction is easy to understand, because his own maximum is somewhere closer to 0.5, but – shh! – let’s keep it a secret!

With all these characteristics in place, the Santas wanted to find a way to set them to combinations as diverse as possible. And here is something similar to what they came up with:

role Employee {
    ...
    method my-productivity {...}
    method my-creativity {...}
    method my-laziness {...}
    submethod TWEAK {
        $!productivity //= self.my-productivity;
        $!creativity //= self.my-creativity;
        $!laziness //= self.my-laziness;
    }
}

Now it was up to each elf to define their own methods to set the corresponding characteristics. But most of them were OK with a special role proposed for this purpose:

role RandomizedEmployee {
    method my-productivity { 1 - 0.3.rand }
    method my-creativity { 1 - 0.5.rand }
    method my-laziness { 0.3.rand }
}

The hiring process now took the following form:

$pepper-minstix does Employee, RandomizedEmployee;

But wait! We have three more attributes left behind! Yes, because these were left up to the Santas to fill in. They knew what kind of workers they needed most, and where. Therefore the final version of the hiring code was more like:

$pepper-minstix does Employee(
        :company($santas-company),
        :department($santas-company.department("Security")),
        :position(HeadOfDepartment),
        :speciality(GuardianOfTheSecrets),
        ...
    ), 
    RandomizedEmployee;

With this line, Raku’s mixin protocol does the following:

  1. creates a new mixin class
  2. sets the attributes defined with named parameters
  3. invokes the role’s TWEAK constructor
  4. returns the new employee object

Because everybody knew that the whole thing was going to be a one-time venture, as the elves would never leave their new boss alone, the code was a kind of trade-off between efficiency and speed of coding. Still, there were some interesting tricks used, but discussing them is beyond the main line of this story. I think many readers can find their own solutions to the problems mentioned here.

I, in turn, move on to a story which took place not long ago…

When Time Is Too Scarce

It was one of those crazy December days when Mrs. Santa left for an urgent business trip. Already busy with mail and phone calls, Mr. Santa got additional duties in the logistics and packing departments, which are usually handled by his wife. There was no way he could skip those, or the risk of something going wrong on Christmas night would be too high. The only way to get everywhere on time was to cut down on phone calls. It meant telling the elf-receptionist to answer with a "Santa is not available" message.

Santa sighed. He could almost see and hear the elf staring at him with deep regret and asking: “Nicholas, are you asking me to lie?” Oh, no! Of course he wouldn’t ask, but…

But? But! After all, even if the time/space magic of Christmas is not available on other days of the year, Santa can still do other kinds of tricks! So, here is what he did:

role FirewallishReceptionist {
    has Bool $.santa-is-in-the-office;
    has Str $.not-available-message;
    method answer-a-call {
        if $.santa-is-in-the-office {
            self.transfer-call: $.santas-number;
        }
        else {
            self.reply-call: $.not-available-message, 
                                :record-reply,
                                :with-merry-christmas;
        }
    }
}

my $strict-receptionist = 
    $receptionist but FirewallishReceptionist(
        :!santa-is-in-the-office, 
        :not-available-message(
            "Unfortunately, Santa is not available at the moment."
            ~ ... #`{ the actual message is longer than this }
        )
    );

$company.give-a-day-off: $receptionist;
$company.santa-office-frontdesk.assign: :receptionist($strict-receptionist);

The but operator is similar to does, but instead of altering its left-hand-side operand, it first creates a clone and then mixes the right-hand-side role into that clone.
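A minimal sketch of the difference, using a made-up Answer role on a plain Int value:

role Answer { method is-answer { True } }

my $a = 42;
my $b = $a but Answer;   # $a itself stays untouched, $b is an enriched clone

say $b.is-answer;        # True
say $a.^name;            # Int
say $b.^name;            # Int+{Answer}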

Just imagine the amazement of the receptionist when he saw his own copy taking his place at his desk! But a day off is a day off; he wasn’t really against applying his laziness coefficient to the rest of that day…

As for Santa himself… He has never been really proud of what he did that day, even though it was needed in the name of saving Christmas. Besides, the existence of a clone created a few awkward situations later, especially when both elves were trying to do the same work while still sharing some data structures. But that’s a separate story on its own…

When New Magic Helps

Have you seen the elves this season? They’re always very strict in sticking to the latest trends of Christmas fashion: rich colors, spangles, all fun and joy! Yet this year is really something special!

It all started at the end of spring. Santa was sitting in his chair, having a well-deserved rest after the last Christmas he had served. The business did not demand as much attention as it usually does at the end of autumn. So he was sitting by the fireplace, drinking chocolate, and reading the news. Though the news was far from being the best part of Santa’s respite (no word about 2020!). Eventually, Santa put away his tablet, took a deep sip from his giant mug, and said aloud: "Time to change their caps!" No idea what pushed him to this conclusion, but from that moment on the elves knew that a new fashion was coming!

The idea Santa wanted to implement was to add a WiFi connection and LEDs to elvish caps, and to make the LEDs twinkle with patterns available from a local server of The North Pole Inc. Here is what he started with:

role WiFiConnect {
    has $.wifi-name is required;
    has $.wifi-user is required;
    has $.wifi-password is required;
    submethod TWEAK { 
        self.connect-wifi( $!wifi-name, $!wifi-user, $!wifi-password );
    }
}

role ShinyLEDs {
    submethod TWEAK { 
        if self.test-circuits {
            self.LED( :on );
        }
        if self ~~ WiFiConnect {
            self.set-LED-pattern: self.fetch( :config-key<LED-pattern> );
        }
    }
}

class ElfCap2020 is ElfCap does WiFiConnect does ShinyLEDs {...}

Please note that I don’t include the body of the class here, as it’s too big for this article.

But the attempt to compile the code resulted in:

Method 'TWEAK' must be resolved by class ElfCap2020 because it exists in multiple roles (ShinyLEDs, WiFiConnect)

“Oh, sure thing!” – Santa grumbled to himself. And added a TWEAK submethod to the class:

    submethod TWEAK {
        self.WiFiConnect::TWEAK;
        self.ShinyLEDs::TWEAK;
    }

This made the compiler happy, and ElfCap2020.new came up with a new and astonishingly fun cap instance! "Ho-ho-ho!" – Santa couldn’t help laughing with joy. It was time to start producing the new caps for all company employees; and this was the moment when it became clear that mass production of the new cap would require the coordinated efforts of so many third-party vendors and manufacturers that there was no way to equip everybody with the new toy by the time Christmas came.

Does Santa give up? No, he never does! What if we try to modernize the old caps? It would only require so many LEDs and controllers and should be feasible to handle on time!

Suit the action to the word! With a good design it should be no harder than:

$old-cap does (WiFiConnect(:$wifi-name, :$wifi-user, :$wifi-password), ShinyLEDs);

And… Boom!

Method 'TWEAK' must be resolved by class ElfCap+{WiFiConnect,ShinyLEDs} because it exists in multiple roles (ShinyLEDs, WiFiConnect)

Santa sighed. No doubt, this was expected. Because does creates an implicit empty class, the two submethods from both roles clash when the compiler tries to install them into that class. A dead end? No way! Happy endings are what Santa loves! And he knows what to do. He knows that there is a new version of the Raku language in development. It is not released yet, but it is available for testing with the Rakudo compiler if requested with use v6.e.PREVIEW at the very start of a compilation unit (which is normally a file).

Also, Santa knows that one of the changes the new language version brings is that it keeps submethods where they were declared, no matter what. It means that where previously a submethod was copied over from a role into the class consuming it, it now remains the sole property of the role. And the language itself now takes care of walking over all elements of a class inheritance hierarchy, including roles, and invoking their constructor and/or destructor submethods if there are any.

Not sure what it means? Check out the following example:

use v6.e.PREVIEW;
role R1 {
    submethod TWEAK { say ::?ROLE.^name, "::TWEAK" }
}
role R2 {
    submethod TWEAK { say ::?ROLE.^name, "::TWEAK" }
}
class C { };
my $obj = C.new;
$obj does (R1, R2);
# R1::TWEAK
# R2::TWEAK

So, adding use v6.e.PREVIEW at the beginning of the modernization script makes the $old-cap does (WiFiConnect, ShinyLEDs); line from above work as expected!

Moreover, switching to Raku 6.e also makes submethod TWEAK unnecessary for the ElfCap2020 class, if its only function is to dispatch to the role TWEAKs. Though, to be frank, Santa kept it anyway, as he needed a few adjustments to be done at construction time. But the good thing is that he no longer had to worry that much about the minor details of combining all the class components together.

And so the task was solved. In the first stage, all the old caps were modernized and made ready before the season started and Christmas preparations took up all the remaining spare time of The North Pole Inc. The new caps will now be produced without extra fuss and will be ready for the 2021 season. Santa used the time spared to adapt WiFiConnect and ShinyLEDs for use with his sleds too. When told by the Security Department that the additional illumination makes camouflaging a sled much harder, if at all possible, Santa only shrugged and replied: "You’ll manage, I have my trust in you!" And they did, but that’d be one more story…

Happy End

When it comes to The North Pole, it’s always hard to tell the truth from fairy tales, and to separate magic from science. But, after all, as the well-known law puts it, any sufficiently advanced technology is indistinguishable from magic. With Raku we try to bring a little bit of good magic into this life. It is so astonishing to know that Raku is supported by nobody else but the Santa family themselves!

Merry Christmas and Happy New Year!

Day 20: A Raku in the Wild

Quite a while ago, Santa got a feature request for a web application called AGRAMMON, developed by the elves of one of his sub-contractors, Oetiker+Partner AG, in what was then called Perl 5. When Santa asked the elf responsible for this application to get to work, the elf suggested that some refactoring was in order, as the application dated back almost 10 years and had been extended regularly.

As the previous year had seen a real Christmas wonder, namely the release of Perl 6.c, the elf suggested that instead of bolting yet another feature onto the web application’s Perl backend, a rewrite in Perl 6 would be a bold but also appropriate move. The reason being that the application used a specially developed format that allows non-programmers to describe its functionality. What better choice for rewriting the parser than Perl 6’s grammars, the elf reasoned. Fittingly, the new AGRAMMON was going to be version 6.

When Santa asked when the rewrite would be finished, the elf’s obvious answer was "by Christmas". And, as things go in Perl 6 land, by the time the rewrite is finally going into production, the backend is implemented in Raku.

AGRAMMON

Most people nowadays know about the negative side effects of agriculture on the climate, namely emissions of methane and nitrous oxide (strong greenhouse gases) and deforestation. Lesser known, but also significant, environmental problems are ammonia (NH3) and nitrogen oxide (NOx) emissions. In agricultural production, NH3 is the major gaseous pollutant while NOx is of minor importance.

The main source of these emissions is the excretions of farm animals, mainly cattle, pigs, and poultry. Both liquid and solid manure contain nitrogen compounds such as urea. These compounds decompose when the excretions are deposited onto farm surfaces, subsequently stored in manure stores, and finally applied to the field as fertilizer.

In addition to inducing negative environmental impacts, these emissions entail a substantial loss of nitrogen (N) from manure; they either result in diminished farm productivity, or the loss must be compensated with mineral fertilizers at additional cost to the farmers. In Switzerland alone, about 40,000 tonnes of nitrogen are lost every year, amounting to about 30% of the N load in manure.

In order to address these problems, the processes of ammonia volatilisation are studied, emission mitigation options developed, and the effects measured where possible under controlled conditions. However, as controlled conditions are difficult to implement at farm-scale, the effects of the reduction measures as well as the total amount of emissions can be simulated by model calculations. AGRAMMON is a tool that facilitates such simulations at the scale of a single farm. Such calculations can also be done at regional scale by means of simulating “typical farm types” using average process types and cumulated numbers of animals, storage facilities, and fertilizer application. The following picture shows the processes simulated by the model:

The model Agrammon calculates ammonia loss on the basis of the N-flux (mass-flow model). For the stages housing/yard, manure store, manure application, and other sources the loss is calculated by using emission rates as a proportion of the total ammoniacal nitrogen (TAN) in the system. For slurry stores, an emission rate per m2 of surface area of the slurry store is used. Important production variables such as feed, housing system, covering of slurry store or emission-mitigating application systems are taken into account in emission rates as correction factors.

The Application

AGRAMMON is a typical web application, with data stored in a PostgreSQL database, a web frontend implemented in JavaScript using the Qooxdoo framework, and a Raku backend. The physical and chemical processes are not directly implemented in the backend but, as already mentioned, in a non-programmer-friendly custom "language" describing (user) inputs, model parameters, calculations, and outputs (results).

Each process is broken down into smaller sub-processes, and each is described in its own file, including documentation and references to appropriate scientific sources. Here is a small example of such a file:

*** general ***

author   = Agrammon Group
date     = 2008-03-30
taxonomy = Livestock::DairyCow::Excretion

+short

Computes the annual N excretion of a number of dairy cows as a function of the
milk yield and the feed ration.

+description

This process calculates the annual N excretion (total N and Nsol (urea plus
measured total ammoniacal nitrogen)) of a number of dairy cows as a
function of the milk yield and the supplied feed ration. Nitrogen
surpluses from increased nitrogen uptake are primarily excreted as
Nsol in the urine. Eighty percent of the increased N excretion is
therefore added to the Nsol fraction.

*** input parameters ***

+dairy_cows
  type        = integer
  validator = ge(0)
  ++labels
       en = Number of animals
       de = Anzahl Tiere 
       fr   = Nombre d'animaux
  ++units
      en = -
  ++description
       Number of dairy cows in barn.
  ++help
      +++en 
          <p>Actual number of animals
                in the barn.</p>
        +++de  ...
        +++fr    ...

*** technical parameters ***

+standard_N_excretion
   value = 115
  ++units 
      en = kg N/year
      de = kg N/Jahr
      fr   = kg N/an
  ++description
    Annual standard N excretion for a
    dairy cow according to
    Flisch et al. (2009).

*** external ***

+Excretion::CMilk
+Excretion::CFeed

*** output ***
+n_excretion
  print = 7

  ++units
      en = kg N/year
      de = kg N/Jahr
      fr   = kg N/an

  ++formula
        Tech(standard_N_excretion)
      * Val(cmilk_yield,    Excretion::CMilk)
      * Val(c_feed_ration,Excretion::CFeed)
      * In(dairy_cows);

  ++description
      Annual total N excreted by a specified
      number of animals.

In the current version of the AGRAMMON model there are 133 such model files with 31,014 lines. From those, the backend can generate

  • the PDF documentation of the model (allowing LaTeX formatting in the files)
  • the actual model simulation using the user’s input data
  • a description of the web GUI which can be rendered by the frontend

The results are presented in the web GUI in tabular form (showing various subsets of the data, which can also be defined in the model files) and can be exported as a PDF report or an Excel file, together with the actual inputs provided by the user.

A special instance of AGRAMMON is used by a regional government agency in the evaluation of the environmental impact of modifications to local farms and the approval of the respective building applications. For this, the ammonia emissions before and after the planned modifications must be simulated by the applicant and can be submitted directly to the agency’s AGRAMMON account, including a notification of the agency by email with the PDF report attached.

The Raku backend

The refactored backend as of today consists of 59 .pm6 modules/packages with 6,942 lines and is covered by tests in 38 .t files with 5,854 lines. It uses the 13 Raku modules shown in the following excerpt of the META6.json file:

  "depends": [
    "Cro::HTTP",
    "Cro::HTTP::Session::Pg",
    "Cro::OpenAPI::RoutesFromDefinition",
    "Cro::WebApp::Template",
    "DB::Pg",
    "Digest::SHA1::Native",
    "Email::MIME",
    "LibXML:ver<0.5.10>",
    "Net::SMTP::Client::Async",
    "OO::Monitors",
    "Spreadsheet::XLSX:ver<0.2.1+>",
    "Text::CSV",
    "YAMLish"
  ],
  "build-depends": [],
  "test-depends": [
    "App::Prove6",
    "Cro::HTTP::Test",
    "Test::Mock",
    "Test::NoTabs"
  ],

Those modules can be found on the Raku Modules Directory. Note that Spreadsheet::XLSX was specifically implemented for this project. As a side-effect, just yesterday our expert elf (see below) submitted a pull request for LibXML used in Spreadsheet::XLSX leading to a factor of 2 performance improvement.

Speaking of the actual implementation, although our brave elf didn’t have much experience with either grammars, parsers, or even Perl 6 / Raku, he was smart enough to engage a real expert elf for that. This elf did most of the heavy lifting of the backend implementation and helped our elf with advice and code review for the parts he implemented himself.

Please note that the goal of this rewrite was to leave most of the syntax of the model implementation, and also the frontend, as is; so the blame for all the sub-optimal design decisions is solely on our primary elf, as is the responsibility for imperfect implementation details passing under the review radar.

Some Raku features used in AGRAMMON

In this section we’ll present a few Raku features used in AGRAMMON. This is not meant as a hardcore technical explanation for experts, but rather as a means to give a taste to people interested in Raku.

Most code examples are taken straight from the current implementation; sometimes the examples are slightly shortened by leaving out code that is irrelevant to the concept presented. There are many links to the original modules on GitHub. As AGRAMMON is still being worked on, those links might point at a more recent version of the module.

bin/agrammon.pl6

The actual AGRAMMON “executable” is just a three-liner (of which only two are Raku):

#!/usr/bin/env raku
use lib "lib"
use Agrammon::UI::CommandLine;

This exploits the fact that Rakudo (the Raku implementation used here) has a pretty nice pre-compilation feature, which is useful for minimizing the (still not negligible) startup time after the first run of the program.

Agrammon::UI::CommandLine

This module contains the main functions of the AGRAMMON application available from the command line.

Usage

Running ./bin/agrammon.pl6 gives the following output:

Usage:
  ./bin/agrammon.pl6 web <cfg-filename> <model-filename> [<technical-file>] -- Start the web interface
  ./bin/agrammon.pl6 [--language=<SupportedLanguage>] [--prints=<Str>] [--variants=<Str>] [--include-filters] [--include-all-filters] [--batch=<Int>] [--degree=<Int>] [--max-runs=<Int>] [--format=<OutputFormat>] run <filename> <input> [<technical-file>] -- Run the model
  ./bin/agrammon.pl6 [--variants=<Str>] [--sort=<SortOrder>] dump <filename> -- Dump model
  ./bin/agrammon.pl6 [--variants=<Str>] [--sort=<SortOrder>] latex <filename> [<technical-file>]
  ./bin/agrammon.pl6 create-user <username> <firstname> <lastname> -- Create Agrammon user
  
    <cfg-filename>        configuration file
    <model-filename>      top-level model file
    [<technical-file>]    optionally override model parameters from this file

    See https://www.agrammon.ch for more information about Agrammon.

This usage message is created automagically from the multi sub MAIN candidates, as shown here for the first line:

subset ExistingFile of Str where { .IO.e or note("No such file $_") && exit 1 }

#| Start the web interface
multi sub MAIN(
        'web',
        ExistingFile $cfg-filename,   #= configuration file
        ExistingFile $model-filename, #= top-level model file
        ExistingFile $technical-file? #= override model parameters from this file
    ) is export {
    my $http = web($cfg-filename, $model-filename, $technical-file);
    react {
        whenever signal(SIGINT) {
            say "Shutting down...";
            $http.stop;
            done;
        }
    }
}

Note that the parameter $technical-file is marked as optional by the trailing ? and that the usage message thus also marks this parameter as optional by enclosing it in [ ].

The first line in the above code example defines a subset ExistingFile of the data type Str, namely those strings that refer to a locally existing file. If called with a filename foo.cfg of a non-existing file, the program aborts with the message No such file foo.cfg.

The usage message also shows the command line calls for

  • running the model in batch mode from the command line (run),
  • showing the simulation flow by dumping the model structure (dump),
  • generating the model documentation (latex),
  • creating user accounts for the web application (create-user),
  • and, at the end, it lists those parameters that have “attached” #= comments in the source of the sub MAIN shown above.

sub web()

This subroutine is called to start the web service as shown in the first line of the above usage message.

sub web(Str $cfg-filename, Str $model-filename, Str $technical-file?) is export {

    # initialization
    # ...
    
    my $model = timed "Load model from $module-path/$module.nhd", {
        load-model-using-cache($*HOME.add('.agrammon'), $module-path, $module, preprocessor-options($variants));
    }

    my $db = DB::Pg.new(conninfo => $cfg.db-conninfo);
    PROCESS::<$AGRAMMON-DB-CONNECTION> = $db;

    my $ws = Agrammon::Web::Service.new(:$cfg, :$model, :%technical-parameters);

    # setup and start web server
    my $host = %*ENV<AGRAMMON_HOST> || '0.0.0.0';
    my $port = %*ENV<AGRAMMON_PORT> || 20000;
    my Cro::Service $http = Cro::HTTP::Server.new(
        :$host, :$port,
        application => routes($ws),
        after => [
            Cro::HTTP::Log::File.new(logs => $*OUT, errors => $*ERR)
        ],
        before => [
            Agrammon::Web::SessionStore.new(:$db)
        ]
    );
    $http.start;
    say "Listening at http://$host:$port";
    return $http;
}

The subroutine uses a signature to describe its arguments (all of them are of type Str, and the third argument is again marked as optional by the trailing ?).

sub run()

The AGRAMMON application can also be used directly from the command line by providing input data from a CSV file. This mode is used by scientists to automate running large numbers of simulations for regional and national projections. It is planned to make this mode available via a REST API call in the future.

sub run (IO::Path $path, IO::Path $input-path, $technical-file, $variants, $format, $language, $prints,
         Bool $include-filters, $batch, $degree, $max-runs, :$all-filters) is export {
         
    # initialization
    # ...
    
    my $rc = Agrammon::ResultCollector.new;
    my atomicint $n = 0;
    my class X::EarlyFinish is Exception {}
    race for $ds.read($fh).race(:$batch, :$degree) -> $dataset {
        my $my-n = ++⚛$n;

        my $outputs = timed "$my-n: Run $filename", {
            $model.run(
                input     => $dataset,
                technical => %technical-parameters,
            );
        }
        # create output
        # ...
    }

Here we use race, one of the various concurrency features of Raku, to run the actual model simulations on multiple threads in parallel to speed up execution.

The function’s signature again specifies the types of some parameters. In addition to (too many) positional arguments, :$all-filters is a named argument, which is optional by default.
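For readers new to race, here is a minimal sketch (not AGRAMMON code) of the same :batch/:degree pattern on a trivial workload:

# Square 1000 numbers on up to 4 worker threads, in batches of 100 items.
my @squares = (1..1000).race(:batch(100), :degree(4)).map(* ** 2);
say @squares.elems;   # 1000

Unlike hyper, race does not promise to deliver the results in their original order, which is fine in AGRAMMON’s case since every dataset produces an independent output.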

Agrammon::Web::Routes

Having already shown the start-up of the web service above, here we see an example of setting up the routes of AGRAMMON’s REST interface using Cro::HTTP::Router from Edument’s Cro Services:

use Cro::HTTP::Router;
use Cro::OpenAPI::RoutesFromDefinition;
use Agrammon::Web::Service;
use Agrammon::Web::SessionUser;

subset LoggedIn of Agrammon::Web::SessionUser where .logged-in;

sub routes(Agrammon::Web::Service $ws) is export {
    my $schema = 'share/agrammon.openapi';
    my $root = '';
    route {
        include static-content($root);
        include api-routes($schema, $ws);

        ...

        after {
            forbidden if .status == 401 && request.auth.logged-in;
            .status = 401 if .status == 418;
        }
    }
}

sub static-content($root) {
    route {
        get -> {
            static $root ~ 'public/index.html'
        }
        
        ...
    }
}

sub api-routes (Str $schema, $ws) {
    openapi $schema.IO, {
        # working
        operation 'createAccount', -> LoggedIn $user {
            request-body -> (:$email!, :$password!, :$key, :$firstname, :$lastname, :$org, :$role) {
                my $username = $ws.create-account($user, $email, $password, $key, $firstname, $lastname, $org, $role);
                content 'application/json', { :$username };
                CATCH {
                    note "$_";
                    when X::Agrammon::DB::User::CreateFailed  {
                        not-found 'application/json', %( error => .message );
                    }
                    when X::Agrammon::DB::User::AlreadyExists
                       | X::Agrammon::DB::User::CreateFailed  {
                        conflict 'application/json', %( error => .message );
                    }
                }
            }
        }
        
    ...
}

the latter using the (abbreviated) OpenAPI definition

openapi: 3.0.0
info:
    version: 1.0.0
    title: OpenApi Agrammon
paths:
    /create_account:
        post:
            summary: Create new user account
            operationId: createAccount
            requestBody:
                required: true
                content:
                    application/json:
                        schema:
                            type: object
                            required:
                                - email
                                - password
                            properties:
                                email:
                                    description: User's email used as username
                                    type: string
                                firstname:
                                    description: Firstname
                                    type: string
                                lastname:
                                    description: Lastname
                                    type: string
            responses:
                '200':
                    description: Account created.
                    content:
                        application/json:
                            schema:
                                type: object
                                required:
                                    - username
                                properties:
                                    username:
                                        type: string
                '404':
                    description: Couldn't create account
                    content:
                        application/json:
                            schema:
                                $ref: "#/components/schemas/CreationFailed"
                '409':
                    description: User already exists
                    content:
                        application/json:
                            schema:
                                $ref: "#/components/schemas/Error"

handled by Cro::OpenAPI::RoutesFromDefinition.

Agrammon::OutputFormatter::PDF

For documenting AGRAMMON calculations, PDF reports of the inputs to the model and the simulation results can be created by first creating a LaTeX file using the Cro::WebApp::Template module. While tailored towards the generation of HTML pages, it worked quite well for our purpose. The module does escape its input data appropriately for HTML; however, a simple-minded escaping of characters with special meaning in LaTeX was implemented outside the module:

sub latex-escape(Str $in) is export {
    my $out = $in // '';
    $out ~~ s:g/<[\\]>/\\backslash/;
    $out ~~ s:g/(<[%#{}&$|]>)/\\$0/;
    $out ~~ s:g/(<[~^]>)/\\$0\{\}/;
    # this is a special case for Agrammon as we use __ in
    # the frontend at the moment for indentation in the table
    $out ~~ s:g/__/\\hspace\{2em\}/;
    $out ~~ s:g/_/\\_/;
    return $out;
}

An additional function is used for the beautification of chemical molecules:

sub latex-chemify(Str $in) is export {
    my $out = $in // '';
    $out ~~ s:g/NOx/\\ce\{NO_\{\(x\)\}\}/;
    $out ~~ s:g/(N2O|NH3|N2|NO2)/\\ce\{$0\}/;
    return $out;
}

These functions use simple regular expression substitutions. A more generic handling of LaTeX special characters would need porting something like the LaTeX::Encode Perl module to Raku. Alternatively, Inline::Perl5 could be employed to utilize the Perl module.

This code fragment shows how a LaTeX file is created

    %data<titles>    = %titles;
    %data<dataset>   = $dataset-name // 'NO DATASET';
    %data<username>  = $user.username // 'NO USER';
    %data<model>     = $cfg.gui-variant // 'NO MODEL';
    %data<timestamp> = ~DateTime.now( formatter => sub ($_) {
        sprintf '%02d.%02d.%04d %02d:%02d:%02d',
            .day, .month, .year,.hour, .minute, .second,
    });
    %data<version>    = latex-escape($cfg.gui-title{$language} // 'NO  VERSION');
    %data<outputs>    = @output-formatted;
    %data<inputs>     = @input-formatted;
    %data<submission> = %submission;
    
    template-location $*PROGRAM.parent.add('../share/templates');
    my $temp-dir    = $*TMPDIR.add($temp-dir-name);
    my $source-file = "$temp-dir/$filename.tex".IO;

    my $latex-source = render-template('pdfexport.crotmp', %data);
    
    $source-file.spurt($latex-source);

by calling render-template with a %data hash and a template file pdfexport.crotmp like

\nonstopmode
\documentclass[10pt,a4paper]{article}

\begin{document}

\section*{<.titles.report>}
\section{<.titles.data.section>}
\begin{tabular}[t]{@{}l@{\hspace{2em}}p{7cm}}
    \textbf{<.titles.data.dataset>:} & <.dataset>\\
    \textbf{<.titles.data.user>:} & <.username>\\
    \textbf{Version:} & <.model>\\
\end{tabular}

\section{<.titles.outputs>}
<@outputs>
<?.section>
<!.first>
\bottomrule
\end{tabular}
</!>
\subsection{<.section>}
\noindent
\rowcolors{1}{LightGrey}{White}
\begin{tabular}[t]{lllrl}
\toprule
</?>
<!.section>
&  & <.label> & <.value> & <.unit>\\
</!>
</@>
\bottomrule
\end{tabular}
\end{document}

as arguments. The generated LaTeX source is then written to a file using the spurt method.

While the above template might seem a bit cryptic if you are not familiar with LaTeX, the relevant parts are the HTML-like tags: <.titles.report> accesses a value of the hash data structure passed to render-template, <@outputs> ... </@> iterates over an array in this data structure, and <?.section> ... </?> or <!.section> ... </!> are conditionals. For details please consult the documentation of the Cro::WebApp::Template module.

The LaTeX file is then rendered into a PDF file with the external program lualatex and the built-in Proc::Async class:

# setup temp dir and files
my $temp-dir = $*TMPDIR.add($temp-dir-name);
my $source-file = "$temp-dir/$filename.tex".IO;
my $pdf-file    = "$temp-dir/$filename.pdf".IO;
my $log-file    = "$temp-dir/$filename.log".IO;

# create PDF, discard STDOUT and STDERR (see .log file if necessary)
my $exit-code;
my $signal;
my $reason = 'Unknown';

my $proc = Proc::Async.new: :w, '/usr/bin/lualatex',
        "--output-directory=$temp-dir",  '--no-shell-escape', '--', $source-file, ‘-’;

react {
    # discard any output of the external program
    whenever $proc.stdout.lines {
    }
    whenever $proc.stderr {
    }
    # save exit code and signal if program was terminated
    whenever $proc.start {
        $exit-code = .exitcode;
        $signal    = .signal;
        done; # gracefully jump from the react block
    }
    # make sure we don't end up with a hung-up lualatex process
    whenever Promise.in(5) {
        $reason = 'Timeout';
        note ‘Timeout. Asking the process to stop’;
        $proc.kill; # sends SIGHUP, change appropriately
        whenever Promise.in(2) {
            note ‘Timeout. Forcing the process to stop’;
            $proc.kill: SIGKILL
        }
    }
}

# write appropriate error messages if the program didn't terminate successfully
if $exit-code {
    note "$pdf-prog failed for $source-file, exit-code=$exit-code";
    die X::Agrammon::OutputFormatter::PDF::Failed.new: :$exit-code;
}
if $signal {
    note "$pdf-prog killed for $source-file, signal=$signal, reason=$reason";
    die X::Agrammon::OutputFormatter::PDF::Killed.new: :$reason;
}

# read content of PDF file created in binary format for further use
my $pdf = $pdf-file.slurp(:bin);
# remove created files if successful, otherwise keep for debugging
unlink $source-file, $pdf-file, $aux-file, $log-file unless %*ENV<AGRAMMON_KEEP_FILES>;

A react block with several whenever blocks is used to handle the events from the asynchronously running external program, to avoid blocking the otherwise already asynchronous backend.

Typed exceptions are used to handle errors occurring in the external process.

Agrammon::OutputFormatter::Excel

Here we create Excel exports of the simulation results and the user inputs, using Spreadsheet::XLSX. This module allows reading and writing XLSX files from Raku. The current functionality is by no means complete, but it implements what was needed for AGRAMMON. Please feel free to provide pull requests or funds for the implementation of additional features.

# get data to be shown
my %data = collect-data();
# ...

my $workbook = Spreadsheet::XLSX.new;

# prepare sheets
my $output-sheet = $workbook.create-worksheet('Results');
my $input-sheet = $workbook.create-worksheet('Inputs');
my $timestamp = ~DateTime.now( formatter => sub ($_) {
    sprintf '%02d.%02d.%04d %02d:%02d:%02d',
            .day, .month, .year, .hour, .minute, .second,
});
# add some meta data to the sheets
for ($output-sheet, $input-sheet) -> $sheet {
    $sheet.set(0, 0, $dataset-name, :bold);
    $sheet.set(1, 0, $user.username);
    $sheet.set(2, 0, $model-version);
    $sheet.set(3, 0, $timestamp);
}

# set column width
for ($output-sheet, $input-sheet) -> $sheet {
    $sheet.columns[0] = Spreadsheet::XLSX::Worksheet::Column.new:
            :custom-width, :width(20);
    $sheet.columns[1] = Spreadsheet::XLSX::Worksheet::Column.new:
            :custom-width, :width(32);
    $sheet.columns[2] = Spreadsheet::XLSX::Worksheet::Column.new:
            :custom-width, :width(20);
    $sheet.columns[3] = Spreadsheet::XLSX::Worksheet::Column.new:
            :custom-width, :width(10);
}

# add input data to sheets
my $row = 0;
my $col = 0;
my @records := %data<inputs>;
for @records -> %rec {
    $input-sheet.set($row, $col+2, %rec<input>);
    $input-sheet.set($row, $col+3, %rec<value>, :number-format('#,#'), :horizontal-align(RightAlign));
    $input-sheet.set($row, $col+4, %rec<unit>);
    $row++;
}

# add output data to sheets
# ...

This example shows a variety of Raku basics:

  • %data, %rec are hash variables. Contrary to Perl, in Raku the sigils don’t change when accessing elements of variables.
  • for ($output-sheet, $input-sheet) -> $sheet { ... } and for @records -> %rec { ... } are loops over lists, each assigning the current element to a variable in the loop’s scope using the pointy block syntax.
  • my $timestamp = ~DateTime.now( formatter => sub ($_) { ... } ...) uses the builtin DateTime class to create a timestamp, using the ~ operator to coerce it into a string. The string is formatted by the anonymous subroutine sub ($_) { ... }, which uses the topic variable $_ as its argument; the various methods of the DateTime class are called on it by just prepending a dot. For example, .year is just a shortcut for $_.year.

Agrammon::Email

As mentioned above, PDF reports of simulations can be mailed to certain AGRAMMON users directly from the web application. First, a multi-part MIME message is created using the Email::MIME module.

# create PDF attachment
my $attachment = Email::MIME.create(
    attributes => {
        'content-type' => "application/pdf; name=$filename",
        'charset'      => 'utf-8',
        'encoding'     => 'base64',
    },
    body => $pdf,
);
# create main body part            
my $msg = Email::MIME.create(
    attributes => {
        'content-type' => 'text/plain',
        'charset'      => 'utf-8',
        'encoding'     => 'quoted-printable'
    },
    body-str => 'Attached please find a PDF report from an AGRAMMON simulation',
);
# build multi-part Email
my $from = 'support@agrammon.ch';
my $to   = 'foo@bar.com';
my $mail = Email::MIME.create(
    header-str => [
        'to'      => $to,
        'from'    => $from,
        'subject' => 'Mail from AGRAMMON'
    ],
    parts => [
        $msg,
        $attachment,
    ]
);

This message is then sent to the mail’s recipient using the promise-based Net::SMTP::Client::Async module:

# asynchronously send Email via AGRAMMON's SMTP server
with await Net::SMTP::Client::Async.connect(:host<mail.agrammon.ch>, :port(25), :!secure) {
    # wait for SMTP server's welcome response
    await .hello;
    # send message
    await .send-message(
        :$from,
        :to([ $to ]),
        :message(~$mail),
    );
    # terminate connection on exit
    LEAVE .quit;
    # catch exceptions and emit user friendly error message
    CATCH {
        when X::Net::SMTP::Client::Async {
            note "Unable to send email message: $_";
        }
    }
}

The await function is used to handle the asynchronous communication with the SMTP server. The LEAVE phaser is run upon exit from the with await ... { ... } block to close the connection to the server.
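If you haven’t met LEAVE before, here is a minimal sketch (the file name is made up): the phaser runs whenever its enclosing block is left, whether normally or via an exception.

sub write-report($text) {
    my $fh = open 'report.txt', :w;
    LEAVE $fh.close;        # always runs on the way out of this sub
    $fh.say($text);
}

write-report('All presents delivered.');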

Unicode operators

My favorite code fragments used in AGRAMMON demonstrate the use of Unicode codepoints in Raku source code:

  • my $my-n = ++⚛$n; is incrementing a variable of type atomicint,
  • and $var-print.split(',') ∩ @print-set gives the intersection of two sets.

While Unicode can also be used for other purposes, e.g. for numerical values like ⅓, 𝑒, π, or τ, or in variable names like my $Δ = 1;, its use in operators definitely makes for more readable code (compare ∩ to its ASCII equivalent (&)).

Apart from using the appropriate mathematical symbols, the existence of such powerful operators is by itself a great feature of Raku and makes for much shorter code than implementing such operations by hand, as one would have to in many other programming languages.
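A minimal sketch of the second fragment’s idea, with made-up lists:

my @requested = <LivestockSummary PlantSummary>;
my @available = <LivestockSummary FluxSummary>;

say @requested ∩ @available;     # Set(LivestockSummary)
say @requested (&) @available;   # the same operation, spelled in ASCII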

Parser and Compiler

Finally, a few words about the parser and compiler used to process the AGRAMMON model files shown above. Agrammon::ModuleParser is the top-level element for parsing the model files:

use v6;
use Agrammon::CommonParser;

grammar Agrammon::ModuleParser does Agrammon::CommonParser {
    token TOP {
        :my $*TAXONOMY = '';
        :my $*CUR-SECTION = '';
        <.blank-line>*
        <section>+
        [
        || $
        || <.panic('Confused')>
        ]
    }

    proto token section { * }

    token section:sym<general> {
        <.section-heading('general')>
        [
        | <option=.single-line-option>
        | <option=.multi-line-str-option('+')>
        | <.blank-line>
        ]*
    }

    token section:sym<external> {
        <.section-heading('external')>
        [
        | <.blank-line>
        | <external=.option-section>
        ]*
    }

    token section:sym<input> {
        <.section-heading('input')>
        [
        | <.blank-line>
        | <input=.option-section>
        ]*
    }

    token section:sym<technical> {
        <.section-heading('technical')>
        [
        | <.blank-line>
        | <technical=.option-section>
        ]*
    }

    token section:sym<output> {
        <.section-heading('output')>
        [
        | <.blank-line>
        | <output=.option-section>
        ]*
    }

    token section:sym<results> {
        <.section-heading('results')>
        [
        | <.blank-line>
        | <results=.option-section>
        ]*
    }

    token section:sym<tests> {
        <.section-heading('tests')>
        [
        | <.blank-line>
        | <tests=.option-section>
        ]*
    }
}

It handles the parsing of the various sections of the model files, using elements from the Agrammon::CommonParser role, such as:

    token section-heading($title) {
        \h* '***' \h* $title \h* '***' \h* \n
        { $*CUR-SECTION = $title }
    }

    token option-section {
        \h* '+' \h* <name> \h* \n
        [
        | <.blank-line>
        | <option=.single-line-option>
        | <option=.subsection-map>
        | <option=.multi-line-str-option('++')>
        ]*
    }

    token single-line-option {
        \h* <key> \h* '=' \h*
        $<value>=[[<!before \h*'#'>\N]*]
        \h* ['#'\N*]?
        [\n || $]
    }

    token blank-line {
        | \h* \n
        | \h* '#' \N* \n
        | \h+ $
    }

Raku grammars are basically built top-down from regular expressions. Such grammars can be extended by means of action classes that further process the match objects generated while parsing the data fed to the grammar.
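As a taste of how the two pieces fit together, here is a tiny, made-up grammar in the spirit of the single-line options above, plus an actions class that turns the match into a hash:

grammar KeyValue {
    token TOP   { <pair>+ %% \n }
    token pair  { <key> \h* '=' \h* <value> }
    token key   { \w+ }
    token value { \N+ }
}

class KeyValue::Actions {
    method pair($/) { make ~$<key> => ~$<value> }
    method TOP($/)  { make $<pair>».made.hash }
}

say KeyValue.parse("author = Agrammon Group\ndate = 2008-03-30",
                   actions => KeyValue::Actions.new).made;
# {author => Agrammon Group, date => 2008-03-30}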

Please consult this tutorial or other resources to learn more about those concepts.

If you want to know more about the (real-world) AGRAMMON parser/compiler, you can have a look at the other parser elements in the Agrammon::Formula::Parser, Agrammon::Formula::Builder, Agrammon::ModuleBuilder, Agrammon::TechnicalParser, Agrammon::TechnicalBuilder, and Agrammon::LanguageParser modules, the latter being a simple non-grammar-based function.

The compiler consists of the modules Agrammon::Formula::Compiler and Agrammon::Formula::Builtins.

Finally, as a recent addition, AGRAMMON also got a C-style preprocessor in Agrammon::Preprocessor for conditionally including or excluding parts of the model using the following syntax:

?if FOO
...
?elsif BAR
...
?else
...
?endif

with optional ?elsif and ?else parts. The keywords can also be negated, such as ?if !FOO.

So, which Christmas?

Well, as you can see from this presentation at the Swiss Perl Workshop 2018, the original plan was not quite met, mostly due to another project being given higher priority (which was a very poor decision, but this is another long story).

We had hoped to have AGRAMMON 6 deployed and in production before the appearance of this article, and almost succeeded. All the critical features are in place; a bit of polishing is still to be done. One of the biggest reliefs for our brave elf was that test calculations done with the Perl and Raku backends using the same model and input values gave identical results. As those implementations were done not only in two different languages, but also by different programmers with a completely different architecture, this gives a lot of trust in their correctness.

In addition, the customer has done a pretty extensive refactoring of the model files and is currently in the process of verifying both the model calculations and the functionality of the Raku based web application.

The current setup is already online as a demo/test version and you are welcome to give it a try. We expect the Raku implementation to finally go into production in early 2021 and to replace the current Perl implementation.

Conclusion

Is Raku ready for use in production? Definitely yes!

While we have already delivered a few smaller customer projects implemented in Raku, AGRAMMON 6 will be Oetiker+Partner AG’s first publicly accessible web application with a Raku backend, and we hope for many more to come. It was a great pleasure to work with our colleague on this project, and we also want to thank our customer and partners for this opportunity.

And the most important outcome: Santa now has another elf able to work on future Raku projects. Raku is a very rich language and, have no doubt, "There’s more than one way to do it", for any definition of "it". While it is not necessary to learn everything at once, it is certainly helpful to have some expert knowledge nearby to ask questions and to learn about the more elegant and often very concise options available. The Raku community is very friendly and welcoming to newcomers (and even to the occasional trolls hanging around).

Raku itself tries very hard to help programmers not shoot themselves in the foot. Error messages are often very helpful, and so are the problem reports and suggestions the Comma IDE has to offer (and there are more of them with every release). So, go ahead and take a dive!

Day 19: Typed Raku, Part 2: Taming Behaviour

In the previous part, I claimed that types can allow for more fluid, robust code, then wrote a bunch of restrictive types for chess that won’t allow for this to occur:

subset Chess::Index of Int:D where ^8;

class Chess::Position {
    has Chess::Index $.file is required;
    has Chess::Index $.rank is required;
}

enum Chess::Colour <White Black>;
enum Chess::Type   <Pawn Bishop Rook Knight Queen King>;

class Chess::Piece {
    has Chess::Colour:D $.colour is required;
    has Chess::Type:D   $.type   is required;
}

class Chess::Square {
    has Chess::Colour:D $.colour is required;
    has Chess::Piece:_  $.piece  is rw;
}

class Chess::Board {
    has Chess::Square:D @.squares[8;8];

    submethod BUILD(::?CLASS:D: --> Nil) { ... }
}

Branching with Multiple Dispatch

There’s a key concept we need to understand before we can fix the types we wrote, but we can’t use chess as an example without fixing the types first. Instead, we’ll use a small helper routine from my Trait::Traced module as an example.

When rendering prettified trace output, objects that can appear in a trace can be rendered in a few different ways: an exception will be rendered as a red exception name, a failure will be rendered as a yellow exception name, and anything else will be rendered as its gist. We can write a &prettify routine with conditions based around the type of a value parameter, which we’ll assume is Any:_:

sub prettify(Any:_ $value --> Str:D) {
    if $value ~~ Exception:D {
        "\e[31m$value.^name()\e[m"
    } elsif $value ~~ Failure:D {
        "\e[33m$value.exception.^name()\e[m"
    } else {
        $value.gist
    }
}

We have conditions based on smartmatching semantics that affect the value we wind up with, so maybe we have whens, not ifs. We’ll rewrite this with the given/when pattern:

sub prettify(Any:_ $value --> Str:D) {
    given $value {
        when Exception:D { "\e[31m$value.^name()\e[m" }
        when Failure:D   { "\e[33m$value.exception.^name()\e[m" }
        default          { $value.gist }
    }
}

We’re writing a routine that can potentially get called for every single traced event that occurs in the duration of a program; that given block introduces an extra scope that we don’t need, which comes with overhead that is unacceptable in this case. Because we already have a block to work with (&prettify itself), if we rename $value to $_, we can get rid of that:

sub prettify(Any:_ $_ --> Str:D) {
    when Exception:D { "\e[31m$_.^name()\e[m" }
    when Failure:D   { "\e[33m$_.exception.^name()\e[m" }
    default          { $_.gist }
}

But this doesn’t read very well. When we have conditions based around the type of a routine’s parameters, representing it not as one routine, but as multiple routines with differing signatures becomes a possibility:

multi sub prettify(Any:_ $value --> Str:D)           { $value.gist }
multi sub prettify(Exception:D $exception --> Str:D) { "\e[31m$exception.^name()\e[m" }
multi sub prettify(Failure:D $failure --> Str:D)     { "\e[33m$failure.exception.^name()\e[m" }

Each multi here is a dispatchee of the &prettify routine. When this is called, the dispatchee with the most specific signature that typechecks given its argument will be selected and invoked, or if none match, a typechecking exception will be thrown. Because we didn’t define one ourselves, the proto routine that handles this will be generated for us. This comes with :(|) as a signature, but we want something a little more specific here:

proto sub prettify(Any:_ --> Str:D) {*}

This constrains the type of &prettify‘s first parameter to Any:_ for all of its dispatchees. Where the {*} is written lies the multi invocation that will get made when this is called; because this is all we want to do in this case, we can write this in place of the routine body.

With the help of subsets, multiple dispatch can represent simpler if, when, or with branches in a program in a more extendable and testable way. There are ways we can avoid having these branches occur during runtime in certain cases, however.
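As a quick illustration of the first point, here is a made-up subset taking over the job of an if/when check via dispatch:

subset Negative of Numeric:D where * < 0;

multi sub describe(Negative  $n --> Str:D) { "$n is negative" }
multi sub describe(Numeric:D $n --> Str:D) { "$n is zero or positive" }

say describe(-3);   # -3 is negative
say describe(42);   # 42 is zero or positive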

Sharing with Roles

Getting back to chess, our Chess::Piece and Chess::Square types aren’t as optimal as they could be. Because we define their $.colour and $.type as attributes, we would wind up evaluating conditions related to these with each move made. This is unnecessary when we know exactly how a specific colour and type of piece will ever be able to move, and what colour each square of a chess board will ever be ahead of time.

Starting with Chess::Piece, we could maybe eliminate the $.colour and $.type attributes that define behaviour related to these, but in doing so, we lose direct access to Chess::Piece‘s attributes. Creating a hierarchical type system with classes here is overkill. When we want to share rather than extend behaviour, a role would be a more apt kind to give a type, or in this case, several:

role Chess::Piece { }
role Chess::Piece[White, Pawn] {
    method colour(::?CLASS:_: --> White) { }

    method type(::?CLASS:_: --> Pawn) { }
}
role Chess::Piece[Black, Pawn] {
    method colour(::?CLASS:_: --> Black) { }

    method type(::?CLASS:_: --> Pawn) { }
}
role Chess::Piece[Chess::Colour:D $colour, Bishop] {
    method colour(::?CLASS:_: --> Chess::Colour:D) { $colour }

    method type(::?CLASS:_: --> Bishop) { }
}
role Chess::Piece[Chess::Colour:D $colour, Rook] {
    method colour(::?CLASS:_: --> Chess::Colour:D) { $colour }

    method type(::?CLASS:_: --> Rook) { }
}
role Chess::Piece[Chess::Colour:D $colour, Knight] {
    method colour(::?CLASS:_: --> Chess::Colour:D) { $colour }

    method type(::?CLASS:_: --> Knight) { }
}
role Chess::Piece[Chess::Colour:D $colour, Queen] {
    method colour(::?CLASS:_: --> Chess::Colour:D) { $colour }

    method type(::?CLASS:_: --> Queen) { }
}
role Chess::Piece[Chess::Colour:D $colour, King] {
    method colour(::?CLASS:_: --> Chess::Colour:D) { $colour }

    method type(::?CLASS:_: --> King) { }
}

Roles are mixin types. We can declare an arbitrary number of these with the same name to form a role group, so long as they have differing and non-overlapping type parameters. With two type parameters, we can establish a relationship between colours and types of chess pieces when it comes to behaviour. That first role is an exception, which will contain behaviour pertaining to the absence of a chess piece in Chess::Square. The types we wind up with here resemble a multiple dispatch routine. In fact, we do have multiple dispatch; it’s just happening at the type level.

Similar to a class’s $?CLASS and ::?CLASS symbols, we have $?ROLE and ::?ROLE symbols we can use to reference an outer role type. Roles can’t do much besides parameterize without the help of a class somewhere along the way, so we still get the $?CLASS and ::?CLASS symbols in the end. In the context of a role, these are generic types to be filled in by whatever class the role ever winds up getting mixed into.

As mixin types, the methods and attributes of roles don’t really belong to the role itself, but to the class it eventually gets mixed into. However, Chess::Piece is one of the rarer cases where we don’t actually have any class to mix the type into. That’s OK, because roles are punnable. By default, when we attempt to call a method like new on a role, it is not called on the role itself, but on a pun produced by mixing the role into an empty class. This behaviour allows roles to be usable like classes for the most part.
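A minimal sketch of punning in isolation (Point2D is invented for this example):

role Point2D {
    has $.x = 0;
    has $.y = 0;
    method magnitude { sqrt $!x ** 2 + $!y ** 2 }
}

my $p = Point2D.new(:x(3), :y(4));   # .new puns the role into an implicit empty class
say $p.magnitude;                    # 5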

As with Chess::Piece, the behaviour of Chess::Square will depend on its colour in a predictable way, so we have a type parameter there too:

role Chess::Square[White] {
    has Chess::Piece:_ $.piece is rw = Chess::Piece.^pun;

    method colour(::?CLASS:_: --> White) { }
}
role Chess::Square[Black] {
    has Chess::Piece:_ $.piece is rw = Chess::Piece.^pun;

    method colour(::?CLASS:_: --> Black) { }
}

We change the default chess piece for a square to a pun of Chess::Piece because Chess::Piece refers to the entire role group, not the individual pun we mean here.

Wrapping Up Once More

With our types sorted out, we can now set up a chess board in Chess::Board.BUILD:

class Chess::Board {
    # ...

    submethod BUILD(::?CLASS:D: --> Nil) {
        @!squares = |(
            (flat (Chess::Square[Black].new, Chess::Square[White].new) xx 4),
            (flat (Chess::Square[White].new, Chess::Square[Black].new) xx 4),
        ) xx 4;

        @!squares[0;0].piece  = Chess::Piece[White, Rook].new;
        @!squares[0;1].piece  = Chess::Piece[White, Knight].new;
        @!squares[0;2].piece  = Chess::Piece[White, Bishop].new;
        @!squares[0;3].piece  = Chess::Piece[White, Queen].new;
        @!squares[0;4].piece  = Chess::Piece[White, King].new;
        @!squares[0;5].piece  = Chess::Piece[White, Bishop].new;
        @!squares[0;6].piece  = Chess::Piece[White, Knight].new;
        @!squares[0;7].piece  = Chess::Piece[White, Rook].new;
        @!squares[1;$_].piece = Chess::Piece[White, Pawn].new for ^8;

        @!squares[6;$_].piece = Chess::Piece[Black, Pawn].new for ^8;
        @!squares[7;0].piece  = Chess::Piece[Black, Rook].new;
        @!squares[7;1].piece  = Chess::Piece[Black, Knight].new;
        @!squares[7;2].piece  = Chess::Piece[Black, Bishop].new;
        @!squares[7;3].piece  = Chess::Piece[Black, King].new;
        @!squares[7;4].piece  = Chess::Piece[Black, Queen].new;
        @!squares[7;5].piece  = Chess::Piece[Black, Bishop].new;
        @!squares[7;6].piece  = Chess::Piece[Black, Knight].new;
        @!squares[7;7].piece  = Chess::Piece[Black, Rook].new;
    }
}

At this point, it’d be nice if we could see what we’re doing. We’ll define dispatchees for the gist method in relevant types, starting with Chess::Piece:

role Chess::Piece {
    multi method gist(::?CLASS:U: --> ' ') { }
}
role Chess::Piece[White, Pawn] {
    # ...

    multi method gist(::?CLASS:D: --> '♙') { }
}
role Chess::Piece[Black, Pawn] {
    # ...

    multi method gist(::?CLASS:D: --> '♟︎') { }
}
role Chess::Piece[Chess::Colour:D $colour, Bishop] {
    # ...

    multi method gist(::?CLASS:D: --> Str:D) {
        my constant %BISHOPS = :{ (White) => '♗', (Black) => '♝' };
        %BISHOPS{$colour}
    }
}
role Chess::Piece[Chess::Colour:D $colour, Rook] {
    # ...

    multi method gist(::?CLASS:D: --> Str:D) {
        my constant %ROOKS = :{ (White) => '♖', (Black) => '♜' };
        %ROOKS{$colour}
    }
}
role Chess::Piece[Chess::Colour:D $colour, Knight] {
    # ...

    multi method gist(::?CLASS:D: --> Str:D) {
        my constant %KNIGHTS = :{ (White) => '♘', (Black) => '♞' };
        %KNIGHTS{$colour}
    }
}
role Chess::Piece[Chess::Colour:D $colour, Queen] {
    # ...

    multi method gist(::?CLASS:D: --> Str:D) {
        my constant %QUEENS = :{ (White) => '♕', (Black) => '♛' };
        %QUEENS{$colour}
    }
}
role Chess::Piece[Chess::Colour:D $colour, King] {
    # ...

    multi method gist(::?CLASS:D: --> Str:D) {
        my constant %KINGS = :{ (White) => '♔', (Black) => '♚' };
        %KINGS{$colour}
    }
}

Chess::Square can wrap its piece’s gist with coloured brackets:

role Chess::Square[White] {
    # ...

    multi method gist(::?CLASS:D: --> Str:D) {
        "\e[37m[\e[m$!piece.gist()\e[37m]\e[m"
    }
}
role Chess::Square[Black] {
    # ...

    multi method gist(::?CLASS:D: --> Str:D) {
        "\e[30m[\e[m$!piece.gist()\e[30m]\e[m"
    }
}

And Chess::Board can glue these together:

class Chess::Board {
    # ...

    multi method gist(::?CLASS:D: --> Str:D) {
        @!squares.rotor(8).reverse.map(*».gist.join).join($?NL)
    }
}

my Chess::Board:D $board .= new;
say $board;

Now we can see some results:

bastille% raku chess.raku
[♜][♞][♝][♛][♚][♝][♞][♜]
[♟︎][♟︎][♟︎][♟︎][♟︎][♟︎][♟︎][♟︎]
[ ][ ][ ][ ][ ][ ][ ][ ]
[ ][ ][ ][ ][ ][ ][ ][ ]
[ ][ ][ ][ ][ ][ ][ ][ ]
[ ][ ][ ][ ][ ][ ][ ][ ]
[♙][♙][♙][♙][♙][♙][♙][♙]
[♖][♘][♗][♕][♔][♗][♘][♖]

Now let’s put Chess::Position to use. We’ll give it an explicit new method to make it a little easier to work with and a gist candidate to express its output in a way a chess player can read:

class Chess::Position {
    # ...

    method new(::?CLASS:_: Int:D $rank, Int:D $file --> ::?CLASS:D) {
        self.bless: :$rank, :$file
    }

    multi method gist(::?CLASS:D: --> Str:D) {
        my constant @FILES = <a b c d e f g h>;
        @FILES[$!file] ~ $!rank + 1
    }
}

If we return a list of offsets for all possible moves a knight can make:

role Chess::Piece[Chess::Colour:D $colour, Knight] {
    # ...

    method moves(::?CLASS:D: --> Seq:D) {
        gather {
            take slip (-2, 2) X (-1, 1);
            take slip (-1, 1) X (-2, 2);
        }
    }
}

Then we can grep for the valid moves to make from these in Chess::Board with the help of feed operators and v6.e’s || prefix operator:

use v6.e.PREVIEW;

class Chess::Board {
    # ...

    method moves(::?CLASS:D: Chess::Index $rank, Chess::Index $file --> Seq:D) {
        gather with @!squares[$rank;$file].piece -> Chess::Piece:D $piece {
            $piece.moves
        ==> map({ $rank + .[0], $file + .[1] })
        ==> grep((Chess::Index, Chess::Index))
        ==> grep({ not @!squares[||$_].piece.?colour ~~ $piece.colour })
        ==> map({ Chess::Position.new: |$_ })
        ==> slip()
        ==> take()
        }
    }
}

my Chess::Board:D $board .= new;
say $board.moves: 0, 1;

Now we can see what moves the white knight at b1 can make on the first turn:

bastille% raku chess.raku
(a3 c3)

Combined with multiple dispatch, Raku’s type system allows us to take a problem like a game of chess and split it up into smaller, more manageable components that can be tested more easily. Runtime exceptions can often be expressed as typechecking exceptions, and when types are denoted explicitly, they can make it easier to catch bugs ahead of time. Though they don’t come into play with chess, at this point you should know enough about how types in Raku work to be able to take advantage of generics and coercions.

Day 18: Typed Raku, Part 1: Taming State

When I started learning Raku a couple of years back, one of the first features that stuck out to me was its type system, which I feel gets overlooked at times. It was rather difficult to wrap my head around at first, but I found that relying on strict typing can lead to simpler, more robust code that copes better with change as time goes on. I’ll be using chess to demonstrate this, but there are some fundamentals to cover first.

Introspecting Types and Kinds

The type of any object in Raku can be introspected with WHAT:

say 42.WHAT;      # OUTPUT: (Int)
say WHAT 42 | 24; # OUTPUT: (Junction)

Though when introspecting, we’re often more interested in the type’s name than in the type object itself:

say 42.^name; # OUTPUT: Int

Typechecking is among the behaviours defined by an object’s HOW:

say 42.HOW.^name; # OUTPUT: Perl6::Metamodel::ClassHOW

In type theory jargon, this would be a kind, or a type of type.
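
For comparison, here are a few other kinds you’ll bump into (the exact names below assume current Rakudo metamodel naming):

say Int.HOW.^name;  # OUTPUT: Perl6::Metamodel::ClassHOW
say Bool.HOW.^name; # OUTPUT: Perl6::Metamodel::EnumHOW
say UInt.HOW.^name; # OUTPUT: Perl6::Metamodel::SubsetHOW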

Runtime Typechecking

Occasionally, there comes a time when you need to typecheck objects manually, such as when debugging. Smartmatching against a type object will perform a typecheck by default:

say 42 | 24 ~~ Junction; # OUTPUT: True
say 42 | 24 ~~ Int;      # OUTPUT: True

However, we’re smartmatching; ~~ can have any behaviour depending on how the RHS’s ACCEPTS method behaves. While this allows for smartmatching junctions of objects against type objects, for instance, sometimes a more literal typecheck is needed. Metamodel::Primitives.is_type will do in those cases:

say Metamodel::Primitives.is_type: 42 | 24, Junction; # OUTPUT: True
say Metamodel::Primitives.is_type: 42 | 24, Int;      # OUTPUT: False
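
To see how the two can disagree, here’s a small sketch with a made-up Evens matcher whose ACCEPTS has nothing to do with type membership:

class Evens {
    method ACCEPTS(Int:D $n --> Bool:D) { $n %% 2 }
}

say 4 ~~ Evens;                              # OUTPUT: True  - smartmatch calls Evens.ACCEPTS(4)
say Metamodel::Primitives.is_type: 4, Evens; # OUTPUT: False - 4 is an Int, not an Evens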

Typing Variables

Types may optionally be provided for any variable, parameter, or attribute, generally as a prefix to the variable name. For $-sigilled variables, this denotes its value’s type; for @-sigilled variables, this denotes the type of the list’s values; for %-sigilled variables, this denotes the type of the hash’s values; for &-sigilled variables, this denotes the type of the routine’s return value. With %-sigilled variables in particular, an additional key type may be given in curly braces after the variable name:

my Int  $x       = 0;
my Str  @ss      = <sup lmao>;
my Num  %ns{Str} = :pi(π);
my True &is-cool = sub is-cool(Cool $x --> True) { };

Alternatively, value types may be specified using the of trait:

my $x       of Int  = 0;
my @ss      of Str  = <sup lmao>;
my %ns{Str} of Num  = :pi(π);
my &is-cool of True = sub is-cool(Cool $x --> True) { };

Wait a minute, how is True a valid type? True, being a Bool enum value, is a Cool constant that can be used like a type with ACCEPTS semantics. String and numeric literals also fall under this category. We give one as the return constraint here, so we don’t need to return anything from the routine explicitly; it will always return True.

Definitely Typing Variables

Raku offers a way to restrict values of a type based on their definiteness using type smileys, which are placed after the type of a variable:

my Int:U $type;    # Contains an Int type object by default.
my Int:D $value    = 42;
my Int:_ $nullish;
$nullish = $value;

The :U smiley denotes an undefined type object; the :D smiley denotes defined values (or instances); the :_ smiley denotes either/or.

When no type smiley is given for a type, :_ semantics are used by default. This can be customized using the variables and attributes pragmas (a parameters pragma is NYI). For instance, types without smileys can be made definite, giving variable typings behaviour more akin to Haskell’s, like so:

use variables  :D;
use attributes :D;

Nil is an exception when it comes to typechecking of :U and :_ variables. It cannot be bound to a variable typed with these smileys, but when assigned to one, or returned from a routine typed with them, you’ll wind up with the type’s type object instead of Nil itself. Failure also being a nullish type, failures can be returned from routines with other :U/:_ typings.
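
For example, here’s a small sketch of that assignment behaviour:

my Int:_ $answer = 42;
$answer = Nil;    # assigning Nil reverts the variable to its default,
                  # which here is the Int type object rather than Nil itself
say $answer.raku; # OUTPUT: Int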

Definite types can help prevent common, annoying runtime errors related to treating type objects like instances or vice versa. For example, you’ve probably seen this sort of warning before:

put my $warning; # OUTPUT:
# Use of uninitialized value $warning of type Any in string context.
# Methods .^name, .raku, .gist, or .say can be used to stringify it to something meaningful.
#   in block <unit> at -e line 1

Emitting this warning is Mu‘s default behaviour for Str coercions of type objects. Giving variables a :D typing can help prevent it from appearing.
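
With a :D typing, the mistake is caught right at the declaration instead of warning at some distant point of use; a small sketch:

my Str:D $message = 'hello'; # fine: initialized with an instance
# my Str:D $oops;            # dies at compile time with (roughly):
                             #   Variable definition of type Str:D requires an initializer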

Grouping Data with Classes

If we were to represent a chess piece as data, we might have a colour and a type of piece. The question is how do we type this? We might have an Array:D of Str:D to start:

my Str:D @piece  = 'white', 'pawn';
my Str:D $colour = @piece[0];
my Str:D $type   = @piece[1];

0 and 1 don’t make good names for colours and types though. If we were to group data of mixed types, we would be stuck typing the entire thing with a less specific typing than we intend for them to have. We want types! We can bundle our data together in a class to type all of this:

class Chess::Piece {
    has Str:D $.colour is required;
    has Str:D $.type   is required;
}

$.colour and $.type are public attributes of the Chess::Piece class. Attributes are declared in the has scope, with public attributes having the . twigil and private attributes having the ! twigil. Because we have :D typings for these, they either must be required or have a default value. We can’t assume anything about their values at this point, so we mark these as required.

As in a traditional object-oriented language, a class can be constructed with data to produce a value. The new method is the default method we can use to do this, which accepts named parameters corresponding to the class’ public attributes:

my Chess::Piece:D $pawn = Chess::Piece.new: :colour<white>, :type<pawn>;

My pawn chess piece is a chess piece… since $pawn already carries Chess::Piece as its type, there is method-call assignment sugar (.=) we can use to make this less redundant:

my Chess::Piece:D $pawn   .= new: :colour<white>, :type<pawn>;
my Str:D          $colour  = $pawn.colour;
my Str:D          $type    = $pawn.type;

We can access the public attributes of $pawn with syntax similar to method calls. In fact, these are method calls. Raku doesn’t differentiate between public, private, or protected data when it comes to classes; it’s always private. What we call public attributes are really private attributes with an automatically generated getter method.
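
Roughly speaking, a public attribute amounts to something like this hand-written equivalent (a simplified sketch; WithGetter is just an illustrative name):

class WithGetter {
    has $!value;                  # the attribute itself is private
    method value { $!value }      # $.value auto-generates a reader much like this
    submethod BUILD(:$!value) { } # ...and lets .new set it by name
}

say WithGetter.new(value => 42).value; # OUTPUT: 42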

Chess pieces can exist within a square on a chess board, which comes with a colour. That might be a class too:

class Chess::Square {
    has Str:D          $.colour is required;
    has Chess::Piece:_ $.piece  is rw;
}

my Chess::Square:D $a1 .= new: :colour<black>;
$a1.piece .= new: :colour<white>, :type<rook>;

We give Chess::Square an rw $.piece attribute. is rw is a trait that makes the getter for a public attribute more of a combination of a getter and a setter, allowing for assignments to the attribute to be made from outside of Chess::Square.

We will need another class for the chess board itself. This will keep track of a grid of squares:

class Chess::Board {
    has Chess::Square:D @.squares[8;8];

    submethod BUILD(::?CLASS:D: --> Nil) { ... }
}

This wraps a @.squares multidimensional array with 8 ranks (rows) of 8 files (columns). We stub a BUILD submethod that will initialize @!squares once the board has been constructed. As a submethod, this will not get inherited by any potential subclasses. As a ... stub, this will fail when called (we’re not quite ready to implement this yet).

Typically we want methods, not submethods. Public methods can be declared by using the method routine declarator instead of submethod; private methods are declared the same way as public methods, but have a name prefixed with !; there are no protected methods in Raku.
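
For instance (Demo is just a throwaway class to show the syntax):

class Demo {
    method greeting(::?CLASS:D: --> Str:D) { self!secret } # public method
    method !secret(::?CLASS:D: --> Str:D)  { 'hello' }     # private method, ! prefix
}

say Demo.new.greeting; # OUTPUT: hello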

Within the scope of a class, we will always have $?CLASS and ::?CLASS symbols that act as aliases for it. ::?CLASS can be used as a typing, while $?CLASS would be preferable in other contexts. The type that comes before the : in BUILD‘s signature is an invocant typing. This is a parameter like any other, but we don’t give it a name because there already is a default symbol for it that’s usually good enough: self.

Closing Sets with Enums

We represent chess piece colours and types as strings at the moment. Strings are for text. In representing them this way, any text can be given as a colour or type, but we can only have black or white as a valid colour for a chess piece, and only pawns, rooks, bishops, knights, queens, and kings as types. In other words, we get values a human can understand, but a computer doesn’t interpret them how we’d like it to here. We can better represent these with enums:

enum Chess::Colour <White Black>;
enum Chess::Type   <Pawn Rook Bishop Knight Queen King>;

class Chess::Piece {
    has Chess::Colour:D $.colour is required;
    has Chess::Type:D   $.type   is required;
}

class Chess::Square {
    has Chess::Colour:D $.colour is required;
    has Chess::Piece:_  $.piece  is rw;
}

my Chess::Square:D $a1 .= new: :colour(Black);
$a1.piece .= new: :colour(White), :type(Bishop);

Now only the colours and types of chess pieces we intend to have can be used.

By default, an enum will have its index in the enum’s list of values as its value, alongside a key with the enum value’s name:

say White;       # OUTPUT: White
say White.key;   # OUTPUT: White
say White.value; # OUTPUT: 0

As for enum values themselves, they will be instances of their enum type, and will always be equivalent to their value, yet they will come with an Enumeration typing as well:

say White == 0;             # OUTPUT: True
say White ~~ Enumeration:D; # OUTPUT: True

The Enumeration type is what causes White to get output as White by &say instead of 0, for instance.

The default index values work fine in the case of chess piece colours and types, but not in all cases. Enums can have any type of value, though their values can only be instances as of this writing. The value type is inferred from the enum’s values by default, but may be denoted explicitly when a scope declarator (my, our, unit, etc.) is given:

my Int enum Chess::Colour <White Black>;
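
Values other than successive integers are also possible; pair syntax gives each key an explicit value. A small sketch with string values:

my Str enum Suit (Hearts => '♥', Diamonds => '♦', Clubs => '♣', Spades => '♠');

say Spades;       # OUTPUT: Spades
say Spades.value; # OUTPUT: ♠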

Constraining Types and Values with Subsets

When moving a chess piece, we will need to take tuples of its file and rank (indices in Chess::Board‘s @.squares) as parameters for a routine somewhere along the way. We could represent these with Int:D in a Chess::Position class:

class Chess::Position {
    has Int:D $.file is required;
    has Int:D $.rank is required;
}

But as with colours and types, this is overly broad; we only want to allow integers from 0-7. We could maybe represent files with an Int enum of a-h, but we don’t have any symbols to give ranks. We can use a subset to constrain valid values for either to the range we want instead:

subset Chess::Index of Int:D where ^8;

class Chess::Position {
    has Chess::Index $.file is required;
    has Chess::Index $.rank is required;
}

This constrains Int to a range of 0-7 via a runtime typecheck. Here, Int:D is our subset’s refinee, and ^8 is its refinement. Smartmatching against Chess::Index will first typecheck the LHS against its refinee, then smartmatch the LHS against the refinement should that succeed. While I’d normally include a type smiley with a type, writing Chess::Index:D is redundant when the refinee already includes :D.

We can write this subset in a more ad-hoc way using the where clause. This variable declaration is roughly equivalent to a Chess::Index-typed variable:

my Int:D $file where ^8 = 0;

However, when typed with a bare where clause in lieu of an explicit subset like this, we will wind up with a less readable exception should a typecheck fail.
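
As a quick check of what a failed typecheck looks like with the named subset (exact wording may vary a little between Rakudo versions):

subset Chess::Index of Int:D where ^8;

my Chess::Index $file = 8;
# Dies with something like:
#   Type check failed in assignment to $file;
#   expected Chess::Index but got Int (8)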

Wrapping Up

So far, we have a handful of types for chess. While we have type-safe representations of the state associated with them, we have little in the way of behaviour implemented. We don’t implement any at this point because our types are flawed; our dependence on classes will complicate the code we wind up with in the end. We focus on the more restrictive aspects of Raku’s type system first so we can fully take advantage of the more liberating aspects, which will be the focus of the next part.

Day 17: Becoming a Time Lord in Raku

I’ve lived within a few minutes of a time zone border for most of my life. The way we distinguished time wasn’t with the official monickers of “Eastern” and “Central” time. No, we used the much more folksy (and yet, also much cooler) terms “fast time” and “slow time”. Knowing which zone you were talking about was extremely important as many people like my mother lived in one zone and worked in the other.

When I started looking at implementing internationalized DateTime formatters in Raku using data from the Common Locale Data Repository (or CLDR), I came to a fairly surprising realization: Raku doesn’t understand timezones! Sure, DateTime objects have the .timezone method, but it’s just an alias for .offset to figure out the offset from GMT.

Having lived in countries that did daylight savings time at different times of the year, having family in places in my own zone that don’t observe daylight savings time, and knowing that there are weird places with thirty- and even forty-five-minute offsets from GMT, I knew time zones could be complicated.

The universe is big, it’s vast and complicated, and ridiculous

There is a huge database simply called tz that is a repository of timezone data: when transitions occurred, when daylight savings time went in and out of commission, offsets, everything. Unlike the Unicode code charts, Raku doesn’t include this as a part of its core because of its frequent updates and inherent instability (yay politicians). OTOH, probably owing in part to its origins as a real-life XKCD comic, it does include some very cool old-fashioned programmer musings which I fully advocate for us bringing back (when was the last time you saw a code base quoting literature in its header? Knuth?)

Alongside the database is a standard code library — one that’s likely on your computer if you’re using a *nix machine — to convert times from a variety of different representations while taking into consideration timezones. It’s written in C, so it’s highly portable.

We could have taken the easy way out and used NativeCall (a way to directly call compiled C code from within Raku) to pass in the data. But what’s the fun in that? Instead, I ported the code. After all, the algorithm is fairly simple, consisting of tons of constants and some basic math, a few binary searches and a pair of conditionals, but nothing that can’t be done in any language. Easy.

But once that’s done, there’s still a problem. How do we get DateTime to understand time zones?

Mastering time

Raku’s DateTime, as mentioned, doesn’t really understand time zones outside of knowing what a GMT offset is. I probably could have just made a new DateTimeTZ class for people to use in modules such as a date/time formatter that need to understand time zones, but then I’d need to spend a lot of time ensuring that my code coerced between the two and didn’t accept/return the wrong ones and… yeah, that would be annoying. Plus, even if I made it a subclass of DateTime, because most DateTime methods return new DateTime objects, I’d need to override virtually every method, and even then, if some other module created a DateTime manually from it, time zone information would be lost.

Another option could be to augment DateTime to give it new .timezone-id and .is-dst methods. Augmenting is the process of adding methods or attributes to a class outside of its original declaration. But it’s impossible to know, looking at a time, what its timezone ID is. While North America and South America share time zones by offset, they have different names (and adjust daylight savings time differently). I could try to infer, like Intl::UserTimezone does, but that would only work for timezones in the user’s current region at best, and still ultimately end up requiring the user to specify it in some way. Plus, when you augment something you break precompilation. Raku precompiles modules to reduce startup time, meaning that using time zones with any large module would wreck your startup time, especially if you use a few very large modules.
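
For reference, here’s roughly what that rejected approach would have looked like (a sketch; the olson-id body is just a placeholder):

use MONKEY-TYPING;

# Augmenting a core class requires the MONKEY-TYPING pragma,
# and doing so breaks precompilation of anything that uses it.
augment class DateTime {
    method olson-id { 'Etc/GMT' } # placeholder value, no real lookup here
}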

There had to be a better solution. It turned out to involve two techniques: one very common, and one rarer.

Adding a dash of wibbly wobbly timey wiminess.

The first thing that needed to be done was create a role that could be mixed in, that is, applied to a class. Roles are traditionally used to describe or modify behaviors (they are similar to Java’s interfaces), but they can also add extra information to existing classes. Roles also nicely allow typechecking to happen exactly as it would have for the base class, so by mixing one in with every DateTime there shouldn’t be any compatibility problems. A simple Timezone role might look like

role TimezoneAware {
    has $.olson-id;
    has $.is-dst;
}

I mean, this works. But I need to be able to set those attributes, and I’d rather not pollute things with a public instantiation method, since roles can’t be passed attributes like classes can. But they can be parameterized. This might be an abuse, but we can end up with …

role TimezoneAware[$tz-id,$dst] {
    method olson-id { $tz-id }
    method is-dst   { $dst   }
}
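
As a small sketch of what this parameterization buys us (the timezone values here are just placeholders):

my $dt = DateTime.now;
$dt does TimezoneAware['Europe/Brussels', False];

say $dt.olson-id; # OUTPUT: Europe/Brussels
say $dt.is-dst;   # OUTPUT: False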

Now we have a way to make a DateTime know about its time zone but… how do we apply it? Asking users to manually state DateTime.new(…) does TimezoneAware[…] would get very tedious, especially since they can’t control DateTime objects that might be created from those (since DateTime is immutable, any adjustments like .later create a new DateTime object, which wouldn’t have the mixin).

Never throw anything away, Harry

The way we can get this to work (and without throwing out precompilation!) is by using the wrap routine. Wrapping allows us to capture a call as it’s being made, and intervene as necessary.

A simple wrapper that just lets us know something was called would be:

Foo.^find_method('bar').wrap(
    method (|c) {
        say "Called 'bar' with arguments ", c;
        my $result = callsame;
        say "  --> ", $result;
        $result;
    }
);

Anytime someone calls .bar on a Foo object, Raku will output what the arguments are as well as the newly made object, and still return it so it doesn’t interfere with program flow. Because we can obtain the result of the original and then do something with it, we have the opportunity to mix in our role and have it affect every single DateTime that’s created by just saying $result does TimezoneAware[…].
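
Put together naively (and before the complications discussed next), the idea looks roughly like this sketch, with a hard-coded placeholder timezone:

DateTime.^find_method('new').wrap(
    method (|c) {
        my $result = callsame;                        # build the DateTime as usual
        $result does TimezoneAware['Etc/GMT', False]; # ...then mix in the role
        $result
    }
);

say DateTime.new(now).olson-id; # OUTPUT: Etc/GMT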

There was one small issue I found with using this technique and it’s due to DateTime’s .new being a multimethod. Using callsame (which would pass us to the original DateTime) uses all the original arguments, which makes it impossible for us to add new arguments like :daylight or :dst or whatever we want to call it because the original method will reject them.

If we use callwith, though, we can remove those extra arguments and even make modifications if our timezone processing called for it (and it ultimately did). But because of the way wrap interacts with multi methods, we end up calling the wrapped method again! When I was testing, I would occasionally get a DateTime with the role applied two or three times. Not good.

The solution was surprisingly simple. When the wrapped method was called again, I just needed to use callsame to get the original version. But how could I know whether I was calling it the first time or not? (Recall that we can’t add parameters and still use callsame!) Raku’s dynamic variables came to the rescue. At the beginning of the wrapped method, we do a quick check to see if we want the original or wrapped:

DateTime.^find_method('new').wrap(
    method (|c) {
        return (callsame) if $*USE-ORIGINAL;
        ...
    }
);

Unfortunately, the way that Rakudo compiles this means that we can’t actually set this variable, because the my $*USE-ORIGINAL would necessarily come after. But, if you haven’t guessed, Raku has a solution for that 🙂 We know that the variable will be somewhere in the caller chain. By using the pseudo-package CALLERS, it’s possible to locate the variable up the call chain, without causing the compiler to install its symbol in our scope.

DateTime.^find_method('new').wrap(
    method (|c) {
        return (callsame) if CALLERS::<$*USE-ORIGINAL>;
        ...
    }
);

It’s true that if someone uses this same name there could be a problem, because CALLERS goes all the way up the calling chain. It might be possible to use just CALLER::CALLER::<$*USE-ORIGINAL>, but the number of times to use CALLER:: might not be terribly consistent. For the actual module, I’ve chosen an even more unlikely name of $*USE-ORIGINAL-DATETIME-NEW. Magic variables are bad, I know, but the obscurity should be more than sufficient.

Dimensional transcendentalism is preposterous (but it works)

One issue of callsame, callwith and the like is that they work on the current method, which makes it harder to farm things out. There are some ways around it, but I ultimately found it easiest to include all logic in a single method.

To mimic the multi methods exactly, without calling my own subs, I used captures and signature literals. Note the wrapped method’s signature of |c, which collects all the arguments into c and allows for inspection thereof. As there are, effectively, two ways to create a DateTime, let’s tackle the easiest one first: from a single number.

        ...
        if c ~~ :(Instant:D $, *%)
        || c ~~ :(Int:D     $, *%) {
            my $posix = c.list.head;
            $posix = $posix.to-posix if $posix ~~ Instant;
            
            my $tz-id = c.hash<tz-id> // 'Etc/GMT';
            my $time = localtime get-timezone-data($tz-id), $posix;
            
            my $*USE-ORIGINAL = True;
            return callwith(self, $posix, :timezone($time.gmt-offset))
                but TimezoneAware[$tz-id, $time.is-dst];
        }

The result of localtime is (presently) a Raku equivalent of the old and ubiquitous tm struct used in virtually all *nix systems and time libraries. Since we already have the POSIX time, we just pass in the new “timezone” and mix in the role and done.

There’s one small annoyance though. Consider the following now:

say DateTime.new(now).WHAT; # DateTime+{TimeZoneAware[…]}

Ugh. That’s a veritable mouthful. Is there any way we can change that? As it turns out, there is. I’m not going to say that you necessarily should do this, but we want to be as in the background as possible. Before returning, we store the result and rename its type like so:

            my $result = callwith( … ) but TimezoneAware[…];
            $result.^set_name('DateTime');
            return $result

Et voilà, it looks totally normal, except it has those extra methods. Our new DateTime will pass for an old one, even if someone does a name-based comparison (of course, they should probably use .isa or smartmatching, which would work without the name change).

Now we tackle the second method, which is from being given discrete time units. There is a gmt-from-local routine that takes the aforementioned tm struct along with a timezone and tries to reconcile the two to get a POSIX time (if you ask me for 2:30 on a day we spring forward… we’ll have problems). Once we have the POSIX time, then we can create things like before. I’ll spare you all the different ways that this type of creation can happen, but it’s easy enough to imagine (or you can look in the code itself).

For methods outside of new there isn’t a lot of work that needs to be done. Things like .day, .month, etc, should all work the same, since the original DateTime understands GMT offset.

The important one is to-timezone where all we need to do, really, is wrap it and call .new(self, :$timezone).

Life depends on change and renewal

Wrapping is actually a very pervasive thing: you cannot lexically scope it, and so if we wrap at INIT (the phaser fired when a script is launched), its effects are seen globally from the get-go. Except… there are two phasers that fire before INIT. They are BEGIN and CHECK, and they fire in the compilation phase where we can’t touch them. If someone were to create a DateTime in one of these blocks, it will still be a regular DateTime without our mixin. Consider the following:

my $compile-time = BEGIN DateTime.new: now;

What should we do about this? If it gets used later, it won’t have the attributes that users might depend on. How can we help this out?

Firstly, if the user calls .day, there won’t be a difference, so no problems there. But if the user calls, say, olson-id, we’re in trouble. No such method. Or is there?

Raku objects have a special (pseudo) method called FALLBACK that is called when an unknown method is invoked. Since DateTime doesn’t already define a FALLBACK, there’s nothing for us to wrap (no .^find_method('FALLBACK').wrap(…) for us). Nonetheless, the same HOW that gives us ^find_method also gives us ^add_fallback, although its syntax is a bit trickier.

For example, for the Olson ID method, we can do the following:

INIT DateTime.^add_fallback:
    anon sub condition  ($invocant, $method-name --> Bool    ) { $method-name eq 'olson-id' },
    anon sub calculator ($invocant, $method-name --> Callable) { method { … }               };

If the condition sub returns True, then the method returned in calculator is run. Now, even if one of these old school DateTime objects manages to stick around, we can do something. But what can we do? As it turns out, a lot… depending.

We could just try to run a fresh set of calculations. If the same old-fashioned DateTime has us call the method on it regularly, though, then we’re wasting a lot of CPU cycles. Instead, we can actually replace (or… regenerate) the object! While the trait is rw is fairly well-known, much less well-known is that it can exist on the invocant! The only catch is we need to have a scalar container for the invocant, which is done by giving it a sigil:

method ($self is rw: |c) {
    $self = …
}

There is one small catch, though. If the DateTime is not in a container (for example, it’s a constant), we’re not only stuck, but the above method will error because is rw requires a writable container. In that case, we’ll need to fall back to recalculating each time. Small price to pay. But how can we even know? Or make it work, since the above errors with unwritable containers? Simple answer: multi methods. Miraculously, if you have two methods identical but for the trait is rw, dispatch will prefer the is rw one for writable containers, and the other for unwritables.

multi method foo ($self is rw: |c) { 
    $self = …  # upgrade for faster calls later
}
multi method foo ($self: |c) {
    calculate-with($self)  # slower style here
}

The catch is you can’t pass a multi method. In fact, multi methods can only be properly declared and referenced inside of a class declaration. The solution is to instead make a multi sub outside of wrap’s parentheses, and then refer to it by its &-sigilled name when wrapping:

proto sub foo (|) { * }
multi sub foo ($self is rw, |c) { 
   $self = …            # ^ notice the comma: subs don't have invocants,
}                       #   but the object is passed as the first argument
multi sub foo ($self, |c) { 
   calculate-with($self)
}                       
….wrap(&foo);

Wrapping, multiple dispatch, first-class functions, so much stuff going on but we avoid breaking precompilation and manage to not make a single use of MONKEY-TYPING 🙂

Bowties are cool

There are a lot of other little niceties that can be given to users. One of the primary issues is the naming of methods and parameters. In the above write-up, I’ve used some names, but could have used others. For instance, is there anything inherently better in using .is-dst versus .dst to determine if the given time is in daylight saving time? And outside of the is- question, should we use dst, daylight, or, like much of the world, summer-time?

Grabbing the timezone name presents similar issues. While stock DateTime has an .offset method to get the GMT offset, it also provides the exact same information from .timezone. Alternatives could be, as I used above, tz-id, timezone-id, or olson-id (Olson invented the IDs used when he made the database). Actually, on this one, I’ve cheated, slightly. By using the allomorphic IntStr, it’s possible for us to make something function differently in numeric and stringy contexts. So we can override timezone to return self.offset but self.timezone-id and it will probably give good DWIM functionality for everyone.

One that seemed fairly obvious was .tz-abbr, which gives information like EST or PDT for use in formatting. Nothing like having the method name exemplify what it gives 🙂

When creating a DateTime, it’s possible to also specify a formatter. The default follows an ISO standard, but a lot of people find, e.g. “CEST” much easier to recognize as being European than “+02:00”. Should the default formatter be changed? This is one I’ve not come to a conclusion on. The default provides a standard format, but since it can be changed, there’s no reason anyone should expect (or more importantly, depend upon) it to always produce the same string.

These may seem like trivial questions, but Raku prides itself on a culture of core and module developers really polishing things up to make them easy. Everything from code readability, integration with other modules and the core language, and functionality gets considered, and Raku in particular lends itself to giving developers the ability to do what’s best for both them and the users.

Don’t blink. Don’t even blink

Although I mentioned it briefly before, it bears repeating why Raku itself doesn’t contain support for time zones out of the box. Time zones aren’t fixed. Government and politics are the wibbly wobbly to time zones’ timey wimey. I’m in the United States, and just because today I expect daylight savings time to start on March 14th doesn’t mean that Congress or the US Secretary of Transportation (!) can’t change things tomorrow. Or my state, independently of the ones around it, may opt out of daylight savings entirely before then. Rinse and repeat for all the rest of the countries in the world.

Anything with baked in support needs to be incredibly stable (Unicode Character Database), or provide heads up for changes well in advance (leapseconds). This is because most people don’t use bleeding-edge distributions. Heck, Apple still distributes Perl 5.18.4 from 2013! In 2013 alone there were eight updates to the database, and since then there have been forty-five more updates up to today. Even for Python, which gets more update love from Apple, there have been twelve updates since the most recent version Apple distributes (2.7.16).

This is where modules can shine: by using zef or another module manager to upgrade DateTime::Timezones whenever there’s an update, users can always stay up to date. On the maintenance side, I’ve created a script that automates the entire update process, with me only needing to change the module’s version number and update documentation manually. This also means that if I don’t update the module for some reason, a local user can easily update the database locally on their own with zero knowledge of how the vagaries of the database work.

Event Two

As mentioned, I came to this project because of my work on bringing the CLDR data to Raku, specifically for formatting dates/times. One thing CLDR contemplates is support for non-Gregorian calendars, some of which differ quite substantially from the Gregorian in their manner of calculation. There are people like Jean Forget who are working on these for Raku, but they currently exist as their own separate classes that are not interchangeable with the built-in Date and DateTime classes. There is nothing stopping anyone from further extending DateTime with the above methods to add a new calendar attribute that can be set to gregorian or hebrew or persian. It would be a bit more involved than our work here, as some time calculations are hard-coded into DateTime, but it is well within the realm of possibility.

Our Perl brethren imagined different modules that shared common attributes, but with work, it ought to be possible for one DateTime to rule them all in Raku (wait, I’m changing cultural reference points, oops). Only … um, time, uh, will tell.

Day 16: Writing faster Raku code, Part II

By Wim Vanderbauwhede

This is the follow-on article about writing an expression parser in Raku. In the previous article, I explained the background and looked at some basic performance comparisons relating to data structures for parsing and ways to process them: lists, parse trees, recursive descent and iteration.

In this article, we’ll have a look at the performance of various ways of processing strings, and then see how it all fits together in the expression parser.

String processing: regular expressions, string comparisons or list operations?

How should we parse the expression string in Raku? The traditional way to build an expression parser is using a Finite State Machine, consuming one character at a time (if needed with one or more characters look-ahead) and keeping track of the identified portion of the string. This is very fast in a language such as C but in Raku I was not too sure, because in Raku a character is actually a string of length one, so every test against a character is a string comparison. On the other hand, Raku has a sophisticated regular expression engine. Yet another way is to turn the string into an array, and parse using list operations. Many possibilities to be tested:

constant NITERS = 100_000;
# CASE selects which variant to benchmark; here we assume it comes
# from the command line, defaulting to 0.
my \CASE = (@*ARGS[0] // 0).Int;
my $str='This means we need a stack per type of operation and run until the end of the expression';
my @chrs =  $str.comb;
if (CASE==0) { # 5.8 s
    for 1 .. NITERS -> $ct {
# map on an array of characters        
        my @words=();
        my $word='';
        map(-> \c { 
            if (c ne ' ') {
                $word ~= c;
            } else {
                push @words, $word;
                $word='';
            }
        }, @chrs);
        push @words, $word;
    }
} elsif CASE==1 { # 2.7 s    
     for 1 .. NITERS -> $ct {
# while with index through a string        
        my @words=();
        my $str='This means we need a stack per type of operation and run until the end of the expression';
        while my $idx=$str.index( ' ' ) {
            push @words, $str.substr(0,$idx);
            $str .= substr($idx+1);
        }
        push @words, $str;
    }         
} elsif CASE==2 {  # 11.7 s
    for 1 .. NITERS -> $ct {
# while on an array of characters        
        my @words=();
        my @chrs_ = @chrs; 
        my $word='';      
        while @chrs_ {
            my $chr = shift @chrs_;
            if ($chr ne ' ') {
                $word~=$chr;
            } else {
                push @words, $word;
                $word='';
            }
        }
        push @words, $word;
    }
} elsif CASE==3 { # 101 s
    for 1 .. NITERS -> $ct {
# while on a string using a regexp        
        my @words=();
        my $str='This means we need a stack per type of operation and run until the end of the expression';
        while $str.Bool {
            $str ~~ s/^$<w> = [ \w+ ]//;
            if ($<w>.Bool) {
                push @words, $<w>.Str;
            }
            else {
                $str ~~ s/^\s+//;
            } 
        }
    }   
} elsif CASE==4 { # 64 s
    for 1 .. NITERS -> $ct {
# reduce on an array of characters        
        my \res = reduce(
        -> \acc, \c { 
            if (c ne ' ') {
                acc[0],acc[1] ~ c;
            } else {
                ( |acc[0], acc[1] ),'';
            }
        }, ((),''), |@chrs);
        my @words = |res[0],res[1];
    }
}

For the list-based version, the overhead is 1.6 s; for the string-based versions, 0.8 s.

The results are rather striking. Clearly the regexp version is by far the slowest. This was a surprise, because in my Perl implementation the regexp version was twice as fast as the next best choice. Of the other implementations, the string-based FSM which uses the index and substr methods is by far the fastest: without the overhead it takes 1.9 s, which is more than 50 times faster than the regexp version. The map based version comes second but is nearly twice as slow. What is surprising, and actually a bit disappointing, is that the reduce based version, which works the same as the map based one but on immutable data, is also very slow, at 64 s.

In any case, the choice is clear. It is possible to make the fastest version marginally faster (1.6 s instead of 1.9 s) by not reducing the string but instead moving an index through it. However, for the full parser I want the convenience of the trim-leading and starts-with methods, so I chose to consume the string.

A faster expression parser

With the choices of string parsing and data structure (nested arrays with integer identifiers, see the first article) made, let’s focus on the structure of the overall algorithm. Without going into details on the theory, we use a Finite State Machine, so the basic approach is to loop through a number of states and in every state perform a specific action. In the Perl version this was simple because we use regular expressions to identify tokens, so most of the state transitions are implicit. I wanted to keep this structure so I emulate the regexp s/// operation with comparisons, indexing and substring operations.

my $prev_lev=0;
my $lev=0;
my @ast=();
my $op;
my $state=0;
while $str.chars > 0 {
     # Match unary prefix operations
     # Match terms
     # Add prefix operations if matched
     # Match binary operators
     # Append to the AST
}

The matching rules and operations are very simple (I use <pattern> and <integer> as placeholders for the actual values). Here is the Perl version for reference:

  • prefix operations:
if ( $str=~s/^<pattern>// ) { $state=<integer>; } 
  • terms:
if ( $str=~s/^(<pattern>)// ) { $expr_ast=[<integer>,$1]; }
  • operators:
$prev_lev=$lev;
if ( $str=~s/^<pattern>// ) { $lev=<integer>; $op=<integer>; }

In the Raku version I used the given/when construct, which is as fast as an if statement but a bit neater.

  • prefix operations:
given $str {
    when .starts-with(<token>) { 
        .=substr(<length of token>); 
        $state=<integer>; 
    }
}
  • terms:
given $str {
    when .starts-with(<token start>) { 
        $expr_ast=[<integer>,$term]; 
    }
}
  • operators:
given $str {
    when .starts-with(<token>) { 
        .=substr(<length of token>); 
        $lev=<integer>; 
        $op=<integer>; 
    }
}

One of the more complex patterns to match is the case of an identifier followed by an opening parenthesis with optional whitespace. Using regular expressions this pattern would be:

if  $str ~~ s:i/^ $<token> = [ [a .. z] \w*] \s* \( // { 
    my $var=$<token>.Str;
    ... 
}

Without regular expressions, we first check for a character between ‘a’ and ‘z’ using 'a' le .substr(0,1).lc le 'z'. If that matches, we remove it from $str and add it to $var. Then we go into a while loop for as long as there are characters that are alphanumeric or ‘_’. Then we strip any whitespace and test for ‘(‘.

when 'a' le (my $var = .substr(0,1)).lc le 'z' {
    my $idx=1;
    my $c = .substr($idx,1);
    while 'a' le $c.lc le 'z' or $c eq '_' 
        or '0' le $c le '9' {
        $var~=$c;
        $c = .substr(++$idx,1);
    }
    .=substr($idx);
    .=trim-leading;
    if .starts-with('(') {
        ...
    }
}

Another complex pattern is the one for a floating point number. In Fortran, the pattern is more complicated because the sub-pattern .e can be part of a floating-point constant but could also be part of the equality operator .eq.. Furthermore, the separator between the mantissa and the exponent can be not just e but also d or q. So the regular expression is rather involved:

if (                    	
    (
        !($str~~rx:i/^\d+\.eq/) and
        $str~~s:i/^([\d*\.\d*][[e|d|q][\-|\+]?\d+]?)//        
    )        	
    or 
    $str~~s:i/^(\d*[e|d|q][\-|\+]?\d+)//
) {
    $real_const_str=$/.Str;
} 

Without regular expressions, the implementation is as follows. We first detect a character between 0 and 9 or a dot. Then we try to match the mantissa, separator, sign and exponent. The latter three are optional; if they are not present and the mantissa does not contain a dot, we have matched an integer.

when '0' le .substr(0,1) le '9' or .substr(0,1) eq '.' { 
    my $sep='';
    my $sgn='';
    my $exp='';
    my $real_const_str='';

    # first char of mantissa
    my $mant = .substr(0,1);
    # try and match more chars of mantissa
    my $idx=1;
    my $h = .substr($idx,1);
    while '0' le $h le '9' or $h eq '.' {
        $mant ~=$h;
        $h = .substr(++$idx,1);
    }
    $str .= substr($idx);

    # reject .eq.
    if not ($mant.ends-with('.') and .starts-with('eq',:i)) { 
        if $h.lc eq 'e' | 'd' | 'q' {
            # we found a valid separator
            $sep = $h;            
            my $idx=1;
            $h =.substr(1,1);
            # now check if there is a sign
            if $h eq '-' or $h eq '+' {
                ++$idx;
                $sgn = $h;
                $h =.substr($idx,1);
            }
            # now check if there is an exponent            
            while '0' le $h le '9' {
                ++$idx;
                $exp~=$h;
                $h =.substr($idx,1);
            }
            $str .= substr($idx);
            if $exp ne '' {
                $real_const_str="$mant$sep$sgn$exp";
                $expr_ast=[30,$real_const_str];
            } else {
                # parse error
            }
        } elsif index($mant,'.').Bool {
            # a mantissa-only real number
            $real_const_str=$mant;
            $expr_ast=[30,$real_const_str];
        }
        else { # no dot and no sep, so an integer
            $expr_ast=[29,$mant];   
        }
    } else { # .eq., backtrack and carry on
        $str ="$mant$str";        
        proceed;
    }            
}

A final example of how to handle patterns is the case of whitespace in comparison and logical operators. Fortran has operators of the form <dot word dot>, for example .lt. and .xor.. But annoyingly, it allows whitespace between the dot and the word, e.g. . not .. Using regular expressions, this is of course easy to handle, for example:

if $str ~~ s/^ \. \s* ge \s* \. // {
    $lev=6;
    $op=20;
} 

I check for a pattern starting with a dot and which contains a space before the next dot. Then I remove all spaces from that substring using trans, and replace the original string with this trimmed version.

when .starts-with('.') and  .index( ' ' ) 
    and (.index( ' ' ) < (my $eidx = .index('.',2 ))) {
    
    # Find the keyword with spaces
    my $match = .substr(0, $eidx+1);
    # remove the spaces
    $match .= trans( ' ' => '' );
    # update the string
    $str = $match ~ .substr( $eidx+1);
    proceed;
}

Conclusion

Overall the optimised expression parser in Raku is still very close to the Perl version. The key difference is that the Raku version does not use regular expressions. With the above examples I wanted to illustrate how it is possible to write code with the same functionality as a regular expression s/// operation, using some of Raku’s built-in string operations:

  • substr : substring
  • index : locate a substring in a string
  • trim-leading : strip leading whitespace
  • starts-with
  • ends-with
  • trans : used to remove whitespace using the ' ' => '' pattern
  • lc : used in range tests instead of testing against both upper and lower case
  • le, lt, ge, gt: for very handy range comparisons, e.g. 'a' le $str le 'z'
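
For instance, a typical s/^ \s* token // step of the parser boils down to a combination of these built-ins (a tiny self-contained sketch):

my $str = '   foo(bar)';

$str .= trim-leading;             # strip leading whitespace
if $str.starts-with('foo') {      # match a literal token...
    $str .= substr('foo'.chars);  # ...and consume it
}

say $str; # OUTPUT: (bar)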

The resulting code is of course much longer but arguably more readable than regular expressions, and currently four times faster.

All code for the tests is available in my GitHub repo.