Monomorphization
Monomorphization refers to the process of converting polymorphic code to monomorphic code (no type variables) through static analysis.
Example:
id : (A : Type) → A → A;
id _ a ≔ a;
b : Bool;
b ≔ id Bool true;
n : Nat;
n ≔ id Nat zero;
Is translated into:
id_Bool : Bool → Bool;
id_Bool a ≔ a;
id_Nat : Nat → Nat;
id_Nat a ≔ a;
More examples
Mutual recursion
inductive List (A : Type) {
nil : List A;
cons : A → List A → List A;
};
even : (A : Type) → List A → Bool;
even A nil ≔ true ;
even A (cons _ xs) ≔ not (odd A xs) ;
odd : (A : Type) → List A → Bool;
odd A nil ≔ false ;
odd A (cons _ xs) ≔ not (even A xs) ;
-- main ≔ even Bool .. odd Nat;
Collection algorithm
This section describes the algorithm to collect the list of all concrete functions and inductive definitions that need to be generated.
Assumptions:
- Type abstractions only appear at the leftmost part of a type signature.
- All functions and constructors are fully type-applied: i.e. currying for types is not supported.
- There is at least one function with a concrete type signature.
- All axioms are monomorphic.
- No module parameters.
Definitions
-
Application. An application is an expression of the form
t₁ t₂ … tₙ
with n > 0. -
Sub application. If
t₁ t₂ … tₙ
is an application then for every0<i<n
t₁ t₂ … tᵢ
is a sub application.
Fix a juvix program P
. Let 𝒲
be the set of all applications that
appear in P
.
-
Maximal application. A maximal application is an application
A∈𝒲
such that for everyA'∈𝒲
we have thatA
is not a sub application ofA'
. -
Type application. If
t a₁ a₂ … aₙ
is a maximal application; andt
is either a function or an inductive type; anda₁, …, aₘ
are types; andaₘ₊₁
is not a type orm = n
.
Then
t a₁, …, aₘ
is a type application. -
Concrete type. A type is concrete if it involves no type variables.
-
Concrete type application. A type application
t a₁ a₂ … aₙ
ifa₁, a₂, …, aₙ
are concrete types.
Option 1
Let S₀
be the set of functions with a concrete type signature. Gather
all type applications (both in the type and in the body) in each of the
functions in S₀
. Clearly the collected type applications are all
concrete. We now have a stack c₁, c₂, …, cₙ
of concrete type
applications.
- If the stack is empty, we are done.
- Otherwise pop
c₁
from the stack. It will be of the formt a₁ … aₘ
, wheret
is either an inductive or a function anda₁, …, aₘ
are concrete types. - If the instantiation
t a₁ … aₘ
has already been registered go to step 1. Otherwise continue to the next step. - Register the instantiation
t a₁ … aₘ
. - If
t
is a function, then it has type variablesv₁, …, vₘ
. Consider the substitutionσ = {v₁ ↦ a₁, …, vₘ ↦ aₘ}
. Consider the list of type applications in the body oft
:d₁, …, dᵣ
. Addσ(d₁), …, σ(dᵣ)
to the stack and continue to step 1. It is easy to see that for anyi
,σ(dᵢ)
is a concrete type application. - If
t
is an inductive type, letd₁, …, dᵣ
be the type applications that appear in the type of its constructors, then proceed as before.
Correctness
It should be clear that the algorithm terminates and registers all instantiations of constructors and functions.
Generation algorithm
The input of this algorithm is the list of concrete type applications,
name it ℒ
, produced by the collection algorithm. Example:
List String
Pair Int Bool
Maybe String
Maybe Int
if (Maybe String)
if (Maybe Int)
if (Pair Int Bool)
Name generation
Let f â
be an arbitrary element of ℒ
, where â
is a list of
concrete types.
- If
f
is a function, assign a fresh name tof â
, call it⋆(f â)
. - If
f
is an inductive type, assign a fresh name tof â
, call it⋆(f â)
. Then, for each constructorcᵢ
off
, wherei
is the index of the constructor, assign a fresh name to it and call it⋆ᵢ(f â)
.
Function generation
Consider an arbitrary function f
in the original program. Then
consider the list of concrete type applications involving f
:
f â₁, …, f âₙ
.
- If
n = 0
, then either:f
has a concrete type signature, in that case we proceed as expected.f
is never called from the functions with a concrete type. In this case we can safely ignore it.
- If
n > 1
. For eachâᵢ
we proceed as follows in the next sections. Fixm
to be the lenght ofâᵢ
withm > 0
.
Function name
The name of the monomorphized function is ⋆(f âᵢ)
.
Type signature
Let 𝒮
be the type signature of f
. Then 𝒮
has to be of the form
(A₁ : Type) → … → (Aₘ : Type) → Π
, where Π
is a type with no type
abstractions. Now consider the substitution
σ = {A₁ ↦ âᵢ[1], …, Aₘ ↦ âᵢ[m]}
. Since âᵢ
is a list of concrete
types, it is clear that σ(Π)
is a concrete type. Then proceed as
described in Types.
Function clause
Let 𝒞
be a function clause of f
. Let p₁ … pₖ
with k ≥ m
be the
list of patterns in 𝒞
. Clearly the first m
patterns must be either
variables or wildcards. Wlog assume that the first m
patterns are all
variables, namely v₁, …, vₘ
. Let σ = {v₁ ↦ âᵢ[1], …, Aₘ ↦ âᵢ[m]}
be
a substitution. Let e
be the body of 𝒞
, then clearly σ(e)
has no
type variables in it. Now, since each name id must be bound at most
once, we need to generate new ones for the local variables bound in the
patterns pₘ₊₁, …, pₖ
. Let w₁, …, wₛ
be the variables bound in
pₘ₊₁, …, pₖ
. Let w'₁, …, w'ₛ
be fresh variables. Then let
δ = {w₁ ↦ w'₁, …, wₛ ↦ w'ₛ}
.
Now let 𝒞'
have patterns δ(pₘ₊₁), …, δ(pₖ)
and let
e' ≔ (σ ∪ δ)(e)
. It should be clear that e'
has no type variables in
it and that all local variable references in e'
are among
w'₁, …, w'ₛ
. Note that e'
is not yet monomorphized. Proceed to the
next step to achieve that.
Expressions
The input is an expression e
that has no type variables in it. The
goal is to replace the concrete type applications by the corresponding
monomorphized expression.
The only interesting case is when we find an application. Consider the
unfolded view of the application: f a₁ … aₘ
. Then, if f
is either a
constructor, or a function, let A₁, …, Aₖ
with k ≤ m
be the list of
type parameters of f
.
- If
f
is a function andf a₁ … aₖ ∉ ℒ
then recurse normally, otherwise, letâ ≔ a₁ … aₖ
and replace the original expressionf a₁ … aₘ
, by⋆(f â) aₖ₊₁' … aₘ'
whereaₖ₊₁' … aₘ'
are the monomorphization ofaₖ₊₁ … aₘ
respectively. - If
f
is a constructor, letd
be its inductive type. Then checkd a₁ … aₖ ∈ ℒ
. Proceed analogously as before.
Types
The input is a type t
that has no type variables in it. The goal is to
replace the concrete type applications by the corresponding
monomorphized type. Proceed analogously to the previous section.