Day 8 – Yet More Abilities for Iterables

by Mustafa Aydın

Raku has a superb support for Iterables — for example, map is almost like a basis for whatever operation you’d like to do with your iterable or there is the Iterator protocol that is kind of a NAND gate that you can use to build any circuitry over your iteration logic.

But would it be even finer to have more abilities on Raku’s iterables (and also Strings!) that are packed in, or ab5tracted out, to high-level functions that hopefully speak for themselves in their names? I hope so!

The rest of this writing will be little “challenges” to do a task on a given Iterable/String using Raku’s batteries, and we’ll present an alternative black boxes along.

Task: Given a list of numbers, e.g., [2, 14, 111, 1787], double the primes in it and leave the non-primes as is.

Solutions: We can use a conditional in the map:

>>> [2, 14, 111, 1787].map({ .is-prime ?? $_ * 2 !! $_ })
(4, 14, 111, 3574).Seq

The proposition is a map-when function that takes a predicate and a mapper, and achives the same:

>>> [2, 14, 111, 1787].&map-when(&is-prime, * * 2)
(4, 14, 111, 3574).Seq

Supposed advantage is that it’s somewhat more concise and name might give it away what it’s doing quicker to the reader compared to the map call. Also, the speed of execution should be higher given that it’s designed for a specific task. These will be the theme all along.

Okay, this was not super exciting to wrap in a function. But we move on:

Task: Given an iterable, e.g., [1, 2, 4], infinitely yield its elements by wrapping to beginning, i.e., cycling.

Solutions: We can list-repeat the given iterable infinite amount of times with the xx operator. But it won’t flatten automatically, e.g., it’d yield [1, 2, 4], [1, 2, 4], ..., so we slap in a flat:

>>> flat [1, 2, 4] xx *
(...)
>>> _.head(5)
(1, 2, 4, 1, 2).Seq

The proposed alternative is cycle:

>>> (cycle [1, 2, 4]).head(5)
(1, 2, 4, 1, 2).Seq

A quick benchmark (100 runs) over 10,000-length array with 50,000 stream of elements requested, cycle turned out to be ~%30 faster! Well, this should be kind of expected as it’s specifically tailored for the exact task.

Task: Given an iterable, e.g., [7, 8, 9], yield index-value tuples, e.g., (0, 7), (1, 8), (2, 9). The starting index should be adjustable, e.g., if we want to enumerate from 1, then it should yield (1, 7), (2, 8), (3, 9).

Solutions: If not for the arbitrary starting index, this task is nothing but a .kv call. But it doesn’t recognize a starting index other than 0. map can do it though:

>>> my $start = 1
>>> [7, 8, 9].map({ $start + $++, $_ })
((1, 7), (2, 8), (3, 9)).Seq

It makes use of the anonymous state variable $ to keep track of the count 0, 1, 2… As usual, TIMTOWTDI:

>>> my $start = 1;
>>> $start .. * Z [7, 8, 9]
((1, 7), (2, 8), (3, 9)).Seq

Our proposition is to wrap this into an enumerate call:

>>> my $start = 1
>>> enumerate([7, 8, 9], start => 1)
((1, 7), (2, 8), (3, 9)).Seq

The need for index to start from a nonzero value (most often 1) is useful because human indexing is 1-based, so, for example, in a loop you’d like to log "{$n}th iteration..." and neither it starting from “0th iteration…” nor making it {$n + 1} feels aesthetically pleasing (to me at least), hence a thin wrapper.

Task: Given an iterable, detect if all of its elements are the same or not.

For example, [] and [2, 2, 2] should give True (former is vacuously true) and [4, 8, 8, 8 ... (ad infinitum)] should yield False. Note that the algorithm should short-circuit, i.e., once it finds 2 different elements, it should stop to report False.

Solutions: Since this is an aggregation, i.e., we want a Single value out of Multiple values (many to one if you will), map is not the goto tool.

A naive attempt would be get .unique values of the Iterable: if it has cardinality 1, return True; otherwise False. But it would fail on “obviously” False infinitely-long samples because it would need to exhaust the Iterable to construct the unique values only to measure its length and throw away the values themselves (that we didn’t really care about to begin with).

So, alternatively, we can check if all the elements of the Iterable are the same as the first one. This does seem doable in a lazy manner (unfortunately Junctions don’t short-circuit yet, so they are not to be used here):

>>> my \it = [4, |(8 xx *)]
>>> it.grep(* != it[0]).not
False

We are using double negation: one in the predicate to grep, one afterwards to query its emptiness. Thankfully, grep is clever that when it sees a chained .not call, it understands that we only care about whether there were no matches, and as soon as it finds a match, it returns False. In essence, we are applying De Morgan’s rule to mimic what the all-junction would do if it short-circuited. (Conveniently, it also works for an empty iterable, since there is nothing to grep about and it boils down to ().Seq.not, which is True.)

Our wrapper is is-all-same for this:

>>> my \it = [4, |(8 xx *)]
>>> it.&is-all-same
False

Now we don’t have to double-negatively think anymore 🙂

Task: Given an iterable and an index-value pair(s), “insert” the new value(s) into the desired index(es). For example, ["e", "f"] and 0 => "d" should yield ["d", "e", "f"].Seq.

Should also work for strings, e.g., "sing" with 1 => "tr" should yield "string".

Solutions: We can first Hashify the given pairs, then map the kv-ed iterable and slip the new value together with the old one:

>>> my \it = ["e", "f"]
>>> my @pairs = 0 => "d"

>>> my %ph = @pairs;
>>> it.kv.map({ %ph{$^idx}:exists ?? |(%ph{$idx}, $^val) !! $^val })
("d", "e", "f").Seq

# For strings, we comb -> operation -> join
>>> %ph = @pairs = 1 => "tr"
>>> "sing".comb.kv.map({ #`[same as above] }).join
"string"

We wrap this into insert-at:

>>> my \it = ["e", "f"]
>>> it.&insert-at(0 => "d")
("d", "e", "f").Seq

>>> "sing".&insert-at(1 => "tr")
"string"

The special case of 0 => $new-value is somewhat common, i.e., pre-pending a value to a given iterable. The direct way to go could be ($new-value, |@iterable) but this has the (undesired) effect of copying the entire iterable, whereas sometimes (oftentimes) all we want is first the new-value is yielded, then whatever the iterable has; we don’t need to face an O(N) operation here.

Task: Given an iterable, group the consecutive values into buckets using the given criterion (e.g., sameness) and under a possible transformation (values themselves by default, or, e.g., .keys of the items).

For example, [1, 2, 2, 6, 6, 6, 2]should yield (1 => (1,), 2 => (2, 2), 6 => (6, 6, 6), 2 => (2,)). Note that last 2 is apart from the first group, i.e., consecutiveness is broken, so it’s grouped differently.

This is actually the Unix’s uniq‘s way of working (on stream of data). An example with a transformation could be [a => 2, a => 3, b => 3] with *.key to get (a => (a => 2, a => 3), b => 3). This parameter is called “as” in many routines in Raku, e.g., classify.

An example with a criterion for comparison could be to use =:= to ensure container-level equality. This parameter is called “with” in many routines in Raku, e.g., unique.

Solutions: Left as an exercise for the reader 🙂 We propose group-conseq to achieve this:

# Elements themselves are the groupers by default
>>> [3, 4, 4, 5, 4].&group-conseq
(3 => (3,), 4 => (4, 4), 5 => (5,), 4 => (4,)).Seq

# They are all the same, really
>>> [1, -1, 1, -1, 1, -1].&group-conseq(as => &abs)
(1 => (1, -1, 1, -1, 1, -1)).Seq

# Respect the container for sameness
>>> my $a = 7
>>> ($a, $a, 7).&group-conseq(with => &[=:=])
(7 => (7, 7), 7 => (7,)).Seq

# Case insensitive detection of consecutive duplicates in a string; typos?
>>> my $s = "how aree youU?"
>>> $s.&group-conseq(as => &lc).grep(*.value > 1)
(e => (e, e), u => (u, U)).Seq

Task: Given an iterable, take values from until the given predicate fails. For example, [2, 5, 17, 8, 3, 13] with &is-prime should give (2, 5, 17).Seq.

Solutions: We can last the map if we see the predicate fail:

>>> [2, 5, 17, 8, 3, 13].map({ last if !.&is-prime; $_ })
(2, 5, 17).Seq

We provide take-while for this:

>>> [2, 5, 17, 8, 3, 13].&take-while(&is-prime)
(2, 5, 17).Seq

Conclusion

So we have seen several mundane-looking tasks (that are actually not that rare to come up in real scenarios) and some solutions from the batteries of Raku as well as ab5tracted out counterparts of them.

All the functions presented here are actually from the third party module Iter::Able (and they also support String and Iterator inputs whenever applicable) — there are currently ~30 functions implemented and aim is to implement many more, drawing inspirations from other languages and libraries therein, as well as our imaginations. I thank you for your presence.

2 thoughts on “Day 8 – Yet More Abilities for Iterables

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.