Episode 5 - Recursive Data Types
We have seen many types but we still don’t know how to represent numbers, lists, trees, free monads, or any type with an infinite number of values. One again we will start by simple examples. Like always, do no skim through them but take the time to develop a deep understanding. If you feel uncomfortable with complex examples, it means you missed something important in the simple ones.
A simple example
We will take as example an encoding of non-negative integers, also called natural numbers, i.e. numbers 0, 1, 2, 3 and so on. There are actually many encodings possible, but we will take a simple encoding known as Peano numbers. Did you ever wonder how natural numbers are built? There can actually be built starting from 0
then adding 1
to 0
to make 1
, then adding 1
to 1
to make 2
, then adding 1
to 2
to make 3
, then adding 1
to 3
to make 4
, and so on. Our encoding will mimic this construction. We need two constructors: one of represent 0
and the other to represent the operation of adding 1
to the previous number to make a new one.
Let’s call Nat
the type of natural numbers. The first constructor, representing 0
should be a constant of type Nat
while the second one, representing the operation of adding 1
should be a function of type Nat => Nat
. Remember that constructors need to be injective, but we are lucky, this operation is actually injective. Let call the first constructor Zero:Nat
and the second one Succ:Nat => Nat
(for successor). This is easy to translate into Scala and Haskell:
sealed abstract class Nat {
def fold[A](zero: A, succ: A => A): A =
this match {
case Zero => zero
case Succ(p) =>
val a = p.fold(zero, succ)
succ(a)
}
}
final case object Zero extends Nat
final case class Succ(n: Nat) extends Nat
data Nat where
Zero :: Nat
Succ :: Nat -> Nat
Note that all the constructors we have seen in the previous episodes took as argument already defined types, but the first argument of Succ
is of type Nat
, the very same type we are defining. This is the reason why Nat
is called a recursive data type: some of its constructors take as arguments values of type Nat
itself. As usual:
- constructors are injective
- different constructors produces different value
- every value of type
Nat
is either aZero
or aSucc(n)
for somen:Nat
The problem is, unlike enumerations, products and coproducts where these properties are enough to define their respective type (up to equivalence) without ambiguity. With recursive data types there can be several non-equivalent types for which these three properties hold. For example, the type {0, 1, 2, ...}
of natural numbers and {O, 1, 2, ..., ∞}
in which we added one special number called infinity such that Succ(∞) = ∞
both have the three properties above. So wee need to add a new constraint, which is that: among all the types for which these properties hold, Nat
is taken as the smallest one.
Recursive types have amazing properties. For example the types Nat
and Option[Nat]
are equivalent! Indeed nat2opt
and opt2nat
are inverse bijections:
def opt2nat: Option[Nat] => Nat = {
case Some(n) => Succ(n)
case None => Zero
}
def nat2opt: Nat => Option[Nat] = {
case Succ(n) => Some(n)
case Zero => None
}
opt2nat :: Maybe Nat -> Nat
opt2nat (Just n) = Succ n
opt2nat Nothing = Zero
nat2opt :: Nat -> Maybe Nat
nat2opt (Succ n) = Just n
nat2opt Zero = Nothing
It means that Nat ≅ Option[Nat]
. Note that Option[T] ≅ Either[Unit, T] ≅ 1 + T
in Type Theory notation, so Nat
is actually one of the solution of the type equation T ≅ 1 + T
. Such an equation means we are looking for types T
such that T
and Option[T]
are equivalent. Regarding Option
as a function from types to type, where the type Option[T]
is the one obtained by applying the argument T
to the function Option
, Nat
is one of the fixed-point of this function. More precisely, Nat
is the least fixed-point of Option
.
Let
μ: (Type -> Type) -> Type
be the operator taking a type functionF
as argument (writtenF[_]
in Scala andf :: * -> *
in haskell) and returning the least fixed-point ofF
, which is defined as the smallest typeT
(up to equivalence) which is solution of the equationT ≅ F[T]
. To simplify the notations, we may also writeμT.F[T]
instead ofμ(F)
.
As an example Nat = μ(Option)
which we also write Nat = μT.Either[Unit, T]
and also Nat = μT.(1 + T)
.
Another solution to the equation, which is, this time the greatest fixed-point of Option
is the type NatInf
, representing {0, 1, 2, ..., ∞}
, defined as below. It is the biggest type which is solution (up to equivalence) of the equation T ≅ 1 + T
. The two inverse functions opt2natInf
and natInf2opt
proves the equivalence:
trait NatInf {
def unfold: Option[NatInf]
}
val zero: NatInf =
new NatInf {
def unfold: Option[NatInf] = None
}
def succ(n: NatInf): NatInf =
new NatInf {
def unfold: Option[NatInf] = Some(n)
}
val ∞ : NatInf =
new NatInf {
def unfold: Option[NatInf] = Some(∞)
}
def opt2natInf: Option[NatInf] => NatInf = {
case Some(n) => succ(n)
case None => zero
}
def natInf2opt: NatInf => Option[NatInf] =
(n: NatInf) => n.unfold
newtype NatInf = NatInf { unfold :: Maybe NatInf }
zero :: NatInf
zero = NatInf Nothing
succ :: NatInf -> NatInf
succ n = NatInf (Just n)
inf :: NatInf
inf = NatInf (Just inf)
opt2natInf :: Maybe NatInf -> NatInf
opt2natInf (Just n) = succ n
opt2natInf Nothing = zero
natInf2opt :: NatInf -> Maybe NatInf
natInf2opt = unfold
Equivalence of inductive and functional definitions
Nat
can be equivalently defined as a data type NatInd
as well as a function type NatFun
. Inverse bijections ind2fun
and fun2ind
prove NatInf
and NarInd
are equivalent:
sealed abstract class NatInd
final case object ZeroInd extends NatInd
final case class SuccInd(n: NatInd) extends NatInd
trait NatFun {
def fold[A](zero: A, succ: A => A): A
}
val zeroFun : NatFun =
new NatFun {
def fold[A](zero: A, succ: A => A): A = zero
}
def succFun(n: NatFun) : NatFun =
new NatFun {
def fold[A](zero: A, succ: A => A): A = {
val a = n.fold[A](zero, succ)
succ(a)
}
}
def ind2fun: NatInd => NatFun =
(i: NatInd) =>
i match {
case ZeroInd =>
zeroFun
case SuccInd(p) =>
val n = ind2fun(p)
succFun(n)
}
def fun2ind: NatFun => NatInd =
(n: NatFun) => n.fold[NatInd](ZeroInd, SuccInd(_))
data NatInd where
ZeroInd :: NatInd
SuccInd :: NatInd -> NatInd
type NatFun = forall a. a -> (a -> a) -> a
zeroFun :: NatFun
zeroFun z _ = z
succFun :: NatFun -> NatFun
succFun n z s = s (n z s)
ind2fun :: NatInd -> NatFun
ind2fun ZeroInd = zeroFun
ind2fun (SuccInd n) = succFun (ind2fun n)
fun2ind :: NatFun -> NatInd
fun2ind n = n ZeroInd SuccInd
Lists
Similarly, given a type A
, we want to define the the type of lists whose elements are of type A
, written List[A]
. Let l:List[A]
be a list whose elements are of type A
. There are two cases: either the list is empty or it is not. Let’s call the the empty list Nil
. If the list is not empty, let head
be its first element and tail
the rest of the list (i.e. the same list as l
but without the first element head
). Then tail
is also a list of type List[A]
and l
can be obtained by prepending head
to tail
. We will write this prepending operation Cons :: (A, List[A]) => List[A]
such that l = Cons(head, tail)
.
Once again we see we have two constructors: Nil
of type List[A]
and Cons
of type (A, List[A]) => List[A]
. Besides, these constructors satisfy the usual thee properties:
- constructors are injective
- different constructors produces different value
- every value of type
List[A]
is either aNil
or aCons(head, tail)
for somehead:A
and sometail:List[A]
Furthermore List[A]
is the smallest type satisfying these properties. It can easily be defined in Scala as
sealed abstract class List[+A] {
def fold[R](nil: R, cons: (A,R) => R): R =
this match {
case Nil => nil
case Cons(head,tail) =>
val r = tail.fold[R](nil, cons)
cons(head, r)
}
}
final case object Nil extends List[Nothing]
final case class Cons[+A](head: A, tail: List[A]) extends List[A]
Like any recursive data type, List[A]
is the smallest solution of a type equation. This time the equation is T ≅ 1 + (A, T)
which in a more Scalaish syntax is T ≅ Option[(A, T)]
. Equivalently, List[A]
is also the least fixed-point of the type-function:
type F[T] = Option[(A, T)]
Which means List[A] = μT.(1 + A × T)
. The biggest type which is solution (up to equivalence) of the equation, which is the greatest fixed-point of F
is the type of streams whose elements are of type A
, written Stream[A]
:
trait Steam[A] {
def unfold: Option[(A, Stream[A])]
}
newtype Stream a = Stream { runStream :: forall c. (Maybe (a, Stream a) -> c) -> c }
Exercise: write the bijections proving Stream[A] ≅ Option[(A, Stream[A])]
Equivalence of inductive and functional definitions
List[A]
can equivalently be defined as the data type ListInd[A]
as well as the type function ListFun[A]
. The two inverse functions ind2fun
and fun2ind
prove ListInd[A]
and ListFun[A]
are equivalent:
sealed abstract class ListInd[+A]
final case object NilInd extends ListInd[Nothing]
final case class ConsInd[+A](head: A, tail: ListInd[A]) extends ListInd[A]
trait ListFun[+A] {
def fold[R](nil: R, cons: (A,R) => R): R
}
def nilFun[A]: ListFun[A] =
new ListFun[A] {
def fold[R](nil: R, cons: (A,R) => R): R = nil
}
def consFun[A](head: A, tail: ListFun[A]): ListFun[A] =
new ListFun[A] {
def fold[R](nil: R, cons: (A,R) => R): R = {
val r = tail.fold[R](nil, cons)
cons(head, r)
}
}
def ind2fun[A]: ListInd[A] => ListFun[A] =
(i: ListInd[A]) =>
i match {
case NilInd =>
nilFun[A]
case ConsInd(head, tail) =>
val tailFun = ind2fun(tail)
consFun(head, tailFun)
}
def fun2ind[A]: ListFun[A] => ListInd[A] =
(f: ListFun[A]) => f.fold[ListInd[A]](NilInd, ConsInd(_,_))
data ListInd a where
NilInd :: ListInd a
ConsInd :: a -> ListInd a -> ListInd a
type ListFun a = forall r. r -> (a -> r -> r) -> r
nilFun :: ListFun a
nilFun nil _ = nil
consFun :: a -> ListFun a -> ListFun a
consFun head tail nil cons = cons head (tail nil cons)
ind2fun :: ListInd a -> ListFun a
ind2fun NilInd = nilFun
ind2fun (ConsInd head tail) = consFun head (ind2fun tail)
fun2ind :: ListFun a -> ListInd a
fun2ind f = f NilInd ConsInd
Algebraic Data Types
Algebraic Data Types are types that can be expressed using only False
, Unit
, products, coproducts and the least fixed-point operator μ
. For example, binary trees whose elements are of type A
, defined in Scala by
sealed abstract class Tree[+A] {
def fold[R](empty: R, leaf: A => R, node: (R, R) => R): R =
this match {
case Empty => empty
case Leaf(a) => leaf(a)
case Node(l, r) =>
val al = l.fold[R](empty, leaf, node)
val ar = r.fold[R](empty, leaf, node)
node(al, ar)
}
}
final case object Empty extends Tree[Nothing]
final case class Leaf[+A](value: A) extends Tree[A]
final case class Node[+A](left: Tree[A], right: Tree[A]) extends Tree[A]
can be expressed as the type Tree[A] = μT.(1 + A + (T × T))
, which is the smallest type (up to equivalence) solution of the equation T ≅ Either3[Unit, A, (T, T)]
.
Exercise: write the bijection proving the equivalence.