Compiling Juvix programs to arithmetic circuits via Vamp-IR¶

Since version 0.3.5, the Juvix compiler supports the vampir target which generates Vamp-IR input files that can be compiled to various proof systems based on arithmetic circuits, like Plonk or Halo 2. Vamp-IR is a proof-system-agnostic language for writing arithmetic circuits.

In this post, we will not be discussing the details of Vamp-IR or the circuit computation model. Instead, we will describe how high-level functional Juvix programs can be compiled to circuits, what the common pitfalls and current limitations are. The reader is assumed to have at least basic familiarity with Vamp-IR.

A simple circuit program¶

For a simple example of a Juvix program that can be compiled to an arithmetic circuit via Vamp-IR, we consider computing the 6-bit mid-square hash of a 16-bit number.

module MidSquareHash;

import Stdlib.Prelude open;

--- `pow N` is 2 ^ N
pow : Nat -> Nat
  | zero := 1
  | (suc n) := 2 * pow n;

--- `hash N` hashes a number with max N bits (i.e. smaller than 2^N) into 6 bits
--- (i.e. smaller than 64) using the mid-square algorithm.
hash : Nat -> Nat -> Nat
  | (suc n@(suc (suc m))) x :=
    if
      (x < pow n)
      (hash n x)
      (mod (div (x * x) (pow m)) (pow 6))
  | _ x := x * x;

main (x y : Nat) : Bool := hash 16 x == y;

To compile this file to Vamp-IR type

juvix compile -t vampir MidSquareHash.juvix

This should generate the MidSquareHash.pir file containing the Vamp-IR code.

The exact details of the hashing algorithm are not essential here. What matters is that Juvix can compile this ordinary high-level program, which uses recursion, pattern-matching, etc., into low-level Vamp-IR representation. The user does not need to understand arithmetic circuits or Vamp-IR beyond the basics.

The Juvix main function is compiled to a Vamp-IR main function which is then used in an equation which connects the inputs (arguments of main) to the output of main. For example, for the program above the generated equation is:

main x y = 1;

stating that main x y equals true. The variables x, y are Vamp-IR's private inputs.

Controlling generated equations¶

In principle, any Juvix program can be compiled to a circuit, subject to certain restrictions. When targeting Vamp-IR, the main function must have the type

main : ArgTy1 -> .. -> ArgTyN -> ResTy;

where ArgTyK and ResTy are Nat, Int or Bool. Since Vamp-IR natively supports only numbers (field elements), booleans are represented using 1 for true and 0 for false.

If the result type ResTy is a boolean (Bool), then the generated Vamp-IR file will contain the equation

main arg1 .. argN = 1;

where arg1, .., argN are the names of inputs to the main function. By default, these are inferred from the variable names in the first clause of main, e.g., compiling

main (x y : Nat) : Bool := x == y;

will generate Vamp-IR code similar to

def main x y = equal x y;

main x y = 1;

The Vamp-IR input variable names can also be explicitly specified with the argnames pragma, e.g., compiling

{-# argnames: [a, b] #-}
main (x y : Nat) : Bool := x == y;

will generate Vamp-IR code similar to

def main x y = equal x y;

main a b = 1;

If the result type ResTy is Nat or Int, then the generated equation is

main arg1 .. argN = out;

Currently, all Vamp-IR inputs (argK, out) are private, and it is not possible to change the name of out. These technical limitations will be lifted in future Juvix versions.

Recursion unrolling¶

Neither arithmetic circuits nor the Vamp-IR intermediate representation support recursion. This means that all Juvix recursive functions need to be unrolled up to a specified depth. Currently, the default unrolling depth is 140, which may be too much or too little depending on your particular program. The unrolling depth can be specified globally on the command line with the --unroll option, or on a per-function basis with the unroll pragma. For example, using

{-# unroll: 16 #-}
hash : Nat -> Nat -> Nat;

would limit the recursion depth (i.e. the number of possible nested recursive calls) for hash to 16. It is the responsibility of the user to ensure that the recursion unrolling depth is sufficient for all arguments that the function might be applied to in the program. In the above example, hash recurses on its first argument and the call to hash in main provides 16 as the first argument. Hence, no more than 16 nested recursive calls to hash are possible.

If the recursion unrolling depth is too small, i.e. smaller than the actual number of nested recursive calls, then the computation result may be incorrect. On the other hand, the circuit size grows with the unrolling depth, so it's advised to keep it as small as possible.

Compilation by normalization¶

Currently, the Juvix compiler uses a straightforward method to translate Juvix programs to Vamp-IR code: it simply computes the full normal form of the main function. Because of the restrictions we imposed on its type, the normal form of the main function must be an applicative expression built up from variables, constants and arithmetic and boolean operations. Such an expression can be directly translated to the Vamp-IR input format.

The disadvantage of performing full normalization is that it may super-exponentially blow up the size of the program. As explained below, this applies in particular to branching recursive functions with at least two recursive calls.

The branching problem¶

With the current compilation method, any recursive function which contains two or more recursive calls to itself in its body will cause an exponential blow-up in the generated code size, and thus will most likely fail to compile. As a rule of thumb, the size of the VampIR code generated for a Juvix function is proportional to k^n where k > 1 is the number of recursive calls in the function body and n is the unrolling depth; or proportional to n when k = 1.

For example, trying to compile the fast power function

{-# unroll: 30 #-}
terminating
power' (acc a b : Nat) : Nat :=
  if
    (b == 0)
    acc
    (if
      (mod b 2 == 0)
      (power' acc (a * a) (div b 2))
      (power' (acc * a) (a * a) (div b 2)));

power : Nat → Nat → Nat := power' 1;

makes the Juvix compiler hang. The pragma unroll: 30 doesn't help, because 2^30 = 1073741824 is still a large number - this is the factor by which the program size increases during compilation.

However, the fast power function may be reformulated to use only one recursive call:

{-# unroll: 30 #-}
terminating
power' (acc a b : Nat) : Nat :=
  let
    acc' : Nat := if (mod b 2 == 0) acc (acc * a);
  in if (b == 0) acc (power' acc' (a * a) (div b 2));

power : Nat → Nat → Nat := power' 1;

With the reformulated definition, the program size increases only by a factor of 30. Now compiling to Vamp-IR succeeds and it is possible to generate a ZK proof that e.g. 2^30 is indeed equal to 1073741824.