Proving Primality with GADTs

Today we will explore the Curry–Howard correspondence. Our mission is writing, in Scala’s type system, the property on natural number of being prime. Wikipedia defines it by:

A natural number (1, 2, 3, 4, 5, 6, etc.) is called a prime number (or a prime) if it is greater than 1 and cannot be written as a product of two natural numbers that are both smaller than it.

An equivalent way to put it is:

A natural number is prime if it is greater than 1 and cannot be divided by any natural number greater than 1 but smaller than it.

These definitions are equivalent as, by definition, any natural number n is divisible by k if and only if it can be written n = k × p for some natural number p.

Writing a program whose execution checks whether a number is prime is easy. But we are not interested in executing programs, only compiling them! We want the compiler to verify that a number is indeed prime. At that point, you may wonder how it is even possible to use the compiler to “prove” something about numbers. That’s exactly the point of the Curry–Howard correspondence

The Challenge

You can write any positive integer in the input bow below:

  • Please write 3 in the input box above. A button shall appear letting you download Prime3.scala. Run it via

    scala Prime3.scala

    The file should compile and run flawlessly outputting ForAllRec(NotDivRec(NotDivBase(SIsPositive(),LTBase()),AddPlus1(AddZero())),ForAllBase()). Look into Prime3.scala, you should see a value prime3: Prime[_3] defined. The main method simply outputs this value.

  • Now, write 4 in the input box. Download and run Prime4.scala via

    scala Prime4.scala

    The file should compile but execution should failed with the exception scala.NotImplementedError: an implementation is missing. Look into Prime4.scala, the value prime4: Prime[_4] is defined by ???.

  • Read Prime4.scala carefully, starting from the beginning, and try to write a valid definition for val prime4: Prime[_4]. Remember to follow very scrupulously the rules stated in the first comment of Prime4.scala.

    • DO NOT ALTER, IN ANY WAY, THE DEFINITION OF ANY TYPE IN THE FILE
    • DO NOT ADD SUB CLASSES/OBJECTS TO TYPES IN THE FILE
    • DO NOT USE NULL IN ANY WAY
    • ONLY USE THE GIVEN CASE OBJECTS AND CASE CLASSES IN THE FILE
    • THE GOAL IS TO PRODUCE A val prime4: Prime[_4], NOT A def prime4: Prime[_4], NOT A lazy val prime4: Prime[_4]!
    • YOUR CODE SHOULD TYPE-CHECK AND RUN PRINTING THE VALUE prime4

Try to find valid values of type Prime[_N] when is not a prime number.

What the hell is going on ???

To encode properties over natural number, we need to start by encoding natural numbers. To do so, we associate to any natural number a type. Natural numbers can be constructed by starting from the first one, 0, and creating new ones by adding 1 successively. For example, 1 = 0 + 1, 2 = 0 + 1 + 1, 3 = 0 + 1 + 1 + 1, etc. Our encoding will mimic this construction, starting from a base type encoding 0 and an operation "+ 1". The number 0 is encoded as the type _0 defined as the final abstract class _0 and the operation "+ 1" as the type function final abstract class S[N <: Nat] which for every type N encoding a natural number n gives the type S[N] encoding the natural number n + 1. The type Nat can simply be defined as the alias type Nat = _0 | S[?] because a natural number is either 0 or obtained by some + 1 operations.

type Nat = _0 | S[?]
final abstract class _0
final abstract class S[N <: Nat]

type _1 = S[_0]
type _2 = S[_1] // S[S[_0]]
type _3 = S[_2] // S[S[S[_0]]]
type _4 = S[_3] // S[S[S[S[_0]]]]
type _5 = S[_4] // S[S[S[S[S[_0]]]]]
...

The next step is to define the type Prime[N] such that:

There exists a valid value of type Prime[N] if and only if N is (the type associated to) a prime number.

Proving that a Natural Number is Prime

Let n be a natural number and N its associated type (for example n=3 and N = _3). Then:

n is prime if and only if for all natural number m such that 2 ≤ m < n, then m does not divide n.

The type ForAll[X, N] encodes this exact property. There exists a value of type ForAll[X,N] if and only if both:

  • X ≤ N
  • For all M such that X ≤ M < N, M do not divide N

Actually the type Prime[N] is an alias for ForAll[_2, N]. We need to encode two more properties:

  • For I and J two natural numbers, the property that I is less than or equal to J (I ≤ J). It is encoded as the type LessThan[I, J].
  • For I and J two natural numbers, the property that I does not divide J. It is encoded as the type NotDiv[I, J].

Read the file PrimeN.scala carefully, each step is described and explained in much details.

Conclusion

Why asking you to build a value that not exists? Because the main point is not knowing whether there exists a value of type Prime[_4] but understanding why such a value (following all the rules!) cannot exists!

It is widely known and accepted in programming culture that every type has values. After all, types exists only to qualify values, right? And instantiating a type T is as simple as calling new! There is one huge problem with this claim: it is completely wrong!. The idea that a types can have no value, often called empty type or uninhabited type, is the cornerstone of a lot of techniques including logic, maths, programming with rich types, formal systems like Coq, etc.

This example is indeed both complicated and complex. It is neither a regular usage of GADTs nor something meant for production! It’s perfectly ok being confused about it or not understanding what is going on. As I said, it is an complicated and complex example!! But when you manage to understand it, consider you master the subject.