Iterator syntax¶
A common pattern in functional programming is the traversal of data structures,
particularly lists, in a specified order accumulating some values. If you've
used languages like Haskell or OCaml, you must have come across the "fold left"
(foldl) and "fold right" (foldr) higher-order functions which implement this
pattern. These functions are also available in Juvix. In this
blog post, I describe an iterator syntax I designed for Juvix which allows
expressing folds (and maps, filters and more) in a readable manner.
The next paragraph discusses some issues with using fold functions directly. Don't worry if you've never heard of folds - just skip this paragraph and the rest of the blog post will teach you how to use them in a nice iterator syntax.
The problem with folds is that they are hard to read and understand, which
results in code that is difficult to maintain. From a fold application, e.g.,
foldr \{acc x := body} a xs, it is not always immediately apparent how the
list traversal proceeds. This is especially the case when the function argument
is big and spans several lines - then the initial value a of the accumulator
and the list xs are syntactically "disconnected" from the accumulator variable
acc and the current list element x. Personally, I also find it hard to
remember which argument is which - this differs between different functional
languages. I'm not the first person who noticed this problem. For example, the
unreadability of folds was one of the motivations behind introducing a monadic
for .. in notation in Lean 4.
The essence of folds¶
The anatomy of a fold (left or right) is simple.
-
We have an accumulator variable
accwhich we initialise to some valuea. -
We go through a data structure (list) in some specified order (left-to-right or right-to-left).
-
At each step, we receive the current value of the accumulator
accand the current elementx. From those we need to compute the new value ofacc. -
After going through all elements, the final value of
accis the result of the fold expression.
The for-notation¶
The Juvix standard library defines two iterators on lists which correspond to list folds:
foras a syntactic sugar for fold left (foldl),rforas a syntactic sugar for fold right (foldr).
Iterator application has the syntax:
for (acc := a) (x in xs) {body}
If body is an atom, the braces can be omitted.
The above for iteration starts with the accumulator acc equal to a and
goes through the list xs from left to right (from beginning to end), at each
step updating the accumulator to the result of evaluating body. The variables
acc, x are locally bound in body where they denote the previous
accumulator value (acc) and the current element (x). The final value of the
accumulator becomes the value of the entire for expression.
For example, the following code computes the sum of all numbers in the list
xs:
for (acc := 0) (x in xs) {x + acc}
Product of all numbers in a list:
for (acc := 1) (x in xs) {x * acc}
Reversing a list:
for (acc := nil) (x in xs) {x :: acc}
Counting odd numbers in a list:
for (acc := 0) (x in xs) {
if
| mod x 2 == 0 := acc
| else := acc + 1
}
Sum of squares of positive numbers in a list:
for (acc := 0) (x in xs) {
if
| x > 0 := acc + x * x
| else := acc
}
The for iterator is complemented by the rfor iterator which goes through the
list from right to left (from end to beginning).
For example, the following code concatenates all lists from a list of lists:
rfor (acc := nil) (x in xs) {x ++ acc}
If we used the for iterator above, the order of concatenations would be
reversed.
Applying a function f to each element in a list may be implemented with:
rfor (acc := nil) (x in xs) {f x :: acc}
Filtering a list with a predicate p:
rfor (acc := nil) (x in xs) {
if
| p x := x :: acc
| else := acc
}
The above keeps only the elements that satisfy p. The order of the elements
would be reversed if we used for instead of rfor.
Maps, filters and more¶
If you're familiar with the map and filter higher-order functions, you
probably noticed that the last two examples above provide their implementations
using rfor. In fact, one can use the iterator notation directly with map and
filter, and several other list functions from the standard library. In this
case, there are no explicit accumulators in the notation.
The expression
map (x in xs) {body}
is equivalent to (assuming acc doesn't occur in body)
rfor (acc := nil) (x in xs) {(body) :: acc}
or if you're familiar with the standard map function:
map \{x := body} xs
Similarly, one can use the notation
filter (x in xs) {p x}
to filter xs with the predicate p.
Other functions that can be used with the iterator syntax are all and any
which check whether all, resp. any, elements x in a list satisfy body
(which would of course refer to x):
all (x in xs) {body}
any (x in xs) {body}
Multiple accumulators¶
In fact, the acc and x in the iterator syntax don't need to be variables -
they can be arbitrary patterns. This is especially useful in conjunction with
pairs, allowing to effectively operate on multiple accumulators.
For example, to compute the largest and the second-largest element of a list of non-negative numbers one can use:
for (n, n' := 0, 0) (x in lst) {
if
| x >= n := (x, n)
| x > n' := (n, x)
| else := (n, n')
}
where n is the largest and n' the second-largest element found so far.
One can also operate on multiple lists simultaneously. For example, the
following computes the dot product of the lists xs, ys (assuming they have
equal lengths):
for (acc := 0) (x, y in zip xs ys) {x * y + acc}
The zip function creates a list of pairs of elements in the two lists, e.g.,
zip (1 :: 2 :: nil) (3 :: 4 :: nil) = (1, 3) :: (2, 4) :: nil
Declaring iterators¶
Iterator syntax can be enabled for any identifier func with the declaration:
syntax iterator func;
Then any iterator application of the form
func (acc1 := a1; ..; accn := an) (x1 in xs1; ..; xk in xsk) {body}
is automatically replaced by
func \{acc1 .. accn x1 .. xk := body} acc1 .. accn xs1 .. xsk
The replacement is entirely syntactic and happens before type-checking.
It is possible to restrict the number of initialisers (acci := ai) and
ranges (xi in xsi) accepted:
syntax iterator func {init: n, range: k};
Further reading¶
More information on iterators can be found in the Juvix language reference and the Juvix tutorial.