Iterator syntax¶
A common pattern in functional programming is the traversal of data structures,
particularly lists, in a specified order accumulating some values. If you've
used languages like Haskell or OCaml, you must have come across the "fold left"
(foldl
) and "fold right" (foldr
) higher-order functions which implement this
pattern. These functions are also available in Juvix. In this
blog post, I describe an iterator syntax I designed for Juvix which allows
expressing folds (and maps, filters and more) in a readable manner.
The next paragraph discusses some issues with using fold functions directly. Don't worry if you've never heard of folds - just skip this paragraph and the rest of the blog post will teach you how to use them in a nice iterator syntax.
The problem with folds is that they are hard to read and understand, which
results in code that is difficult to maintain. From a fold application, e.g.,
foldr \{acc x := body} a xs
, it is not always immediately apparent how the
list traversal proceeds. This is especially the case when the function argument
is big and spans several lines - then the initial value a
of the accumulator
and the list xs
are syntactically "disconnected" from the accumulator variable
acc
and the current list element x
. Personally, I also find it hard to
remember which argument is which - this differs between different functional
languages. I'm not the first person who noticed this problem. For example, the
unreadability of folds was one of the motivations behind introducing a monadic
for .. in
notation in Lean 4.
The essence of folds¶
The anatomy of a fold (left or right) is simple.
-
We have an accumulator variable
acc
which we initialise to some valuea
. -
We go through a data structure (list) in some specified order (left-to-right or right-to-left).
-
At each step, we receive the current value of the accumulator
acc
and the current elementx
. From those we need to compute the new value ofacc
. -
After going through all elements, the final value of
acc
is the result of the fold expression.
The for-notation¶
The Juvix standard library defines two iterators on lists which correspond to list folds:
for
as a syntactic sugar for fold left (foldl
),rfor
as a syntactic sugar for fold right (foldr
).
Iterator application has the syntax:
for (acc := a) (x in xs) {body}
If body
is an atom, the braces can be omitted.
The above for
iteration starts with the accumulator acc
equal to a
and
goes through the list xs
from left to right (from beginning to end), at each
step updating the accumulator to the result of evaluating body
. The variables
acc
, x
are locally bound in body
where they denote the previous
accumulator value (acc
) and the current element (x
). The final value of the
accumulator becomes the value of the entire for
expression.
For example, the following code computes the sum of all numbers in the list
xs
:
for (acc := 0) (x in xs) {x + acc}
Product of all numbers in a list:
for (acc := 1) (x in xs) {x * acc}
Reversing a list:
for (acc := nil) (x in xs) {x :: acc}
Counting odd numbers in a list:
for (acc := 0) (x in xs) {
if
| mod x 2 == 0 := acc
| else := acc + 1
}
Sum of squares of positive numbers in a list:
for (acc := 0) (x in xs) {
if
| x > 0 := acc + x * x
| else := acc
}
The for
iterator is complemented by the rfor
iterator which goes through the
list from right to left (from end to beginning).
For example, the following code concatenates all lists from a list of lists:
rfor (acc := nil) (x in xs) {x ++ acc}
If we used the for
iterator above, the order of concatenations would be
reversed.
Applying a function f
to each element in a list may be implemented with:
rfor (acc := nil) (x in xs) {f x :: acc}
Filtering a list with a predicate p
:
rfor (acc := nil) (x in xs) {
if
| p x := x :: acc
| else := acc
}
The above keeps only the elements that satisfy p
. The order of the elements
would be reversed if we used for
instead of rfor
.
Maps, filters and more¶
If you're familiar with the map
and filter
higher-order functions, you
probably noticed that the last two examples above provide their implementations
using rfor
. In fact, one can use the iterator notation directly with map
and
filter
, and several other list functions from the standard library. In this
case, there are no explicit accumulators in the notation.
The expression
map (x in xs) {body}
is equivalent to (assuming acc
doesn't occur in body
)
rfor (acc := nil) (x in xs) {(body) :: acc}
or if you're familiar with the standard map
function:
map \{x := body} xs
Similarly, one can use the notation
filter (x in xs) {p x}
to filter xs
with the predicate p
.
Other functions that can be used with the iterator syntax are all
and any
which check whether all
, resp. any
, elements x
in a list satisfy body
(which would of course refer to x
):
all (x in xs) {body}
any (x in xs) {body}
Multiple accumulators¶
In fact, the acc
and x
in the iterator syntax don't need to be variables -
they can be arbitrary patterns. This is especially useful in conjunction with
pairs, allowing to effectively operate on multiple accumulators.
For example, to compute the largest and the second-largest element of a list of non-negative numbers one can use:
for (n, n' := 0, 0) (x in lst) {
if
| x >= n := (x, n)
| x > n' := (n, x)
| else := (n, n')
}
where n
is the largest and n'
the second-largest element found so far.
One can also operate on multiple lists simultaneously. For example, the
following computes the dot product of the lists xs
, ys
(assuming they have
equal lengths):
for (acc := 0) (x, y in zip xs ys) {x * y + acc}
The zip
function creates a list of pairs of elements in the two lists, e.g.,
zip (1 :: 2 :: nil) (3 :: 4 :: nil) = (1, 3) :: (2, 4) :: nil
Declaring iterators¶
Iterator syntax can be enabled for any identifier func
with the declaration:
syntax iterator func;
Then any iterator application of the form
func (acc1 := a1; ..; accn := an) (x1 in xs1; ..; xk in xsk) {body}
is automatically replaced by
func \{acc1 .. accn x1 .. xk := body} acc1 .. accn xs1 .. xsk
The replacement is entirely syntactic and happens before type-checking.
It is possible to restrict the number of initialisers (acci := ai
) and
ranges (xi in xsi
) accepted:
syntax iterator func {init: n, range: k};
Further reading¶
More information on iterators can be found in the Juvix language reference and the Juvix tutorial.