Deriving Y combinator — Sorawee's Website

Y combinator seems to be something mysterious. Several people have been trying to understand the intuition why it works. I have seen "A Lecture on the Why of Y" by Matthias Felleisen before, and while it does give me some insight, I still feel confused in some degree. That doesn’t prevent me from recommending this lecture to other people I talked to because it’s the best one I have seen so far.

Recently I thought about the fixed point combinator again and finally gained some good insight. I think it’s worth sharing here.

In lambda calculus, an expression e is either:

x; identifier
(lambda (x) e); abstraction
(e e); application

Church and Turing proved that lambda calculus is Turing complete. That means we should at least be able to write a factorial function in this language. Consider the factorial function in Racket:

(define fact
  (lambda (n)
    (if (zero? n)
        1
        (* n (fact (sub1 n))))))
(fact 10)

We seem to need numbers, basic arithmetics, booleans, number comparison, conditionals, and a way to perform recursion⊕define does not simply define a function. It also makes recursion possible in order to write the factorial function.

Church encoding provides a way to write numbers, booleans, and all that, in the language, except a way to perform recursion. For the sake of readability, we will use primitive numbers, booleans, etc. as we know a way to convert them to lambda terms if we want to. Similarly, (let [[x v]] e) is simply a syntactic sugar for ((lambda (x) e) v) which is a lambda term, so if using let will make the code clearer, we will.⊕I use double bracket on let just to let us distinguish it with other parenthesis easier–actually the preferable style is to write it as (let ([x v]) e).

Now, the goal is to write the factorial function without using define which gives a power to define a recursive function. Here’s the first straightforward attempt:

(let [[fact
       (lambda (n) (if (zero? n)
                       1
                       (* n (fact (sub1 n)))))]]
  (fact 10))

But this doesn’t work because with static scope, fact is unbound inside the lambda function that is going to be fact itself.

The key idea is that, while fact is unbound inside the lambda function, it is bound in the body of let, so why don’t we give fact explicitly to the function?⊕A code from a student in a class that I am TA’ing gave me this idea. I’m kinda embarassed that I did not consider this before

(let [[fact
       (lambda (fact n) (if (zero? n)
                        1
                        (* n (fact (sub1 n)))))]]
  (fact fact 10))

⊕Lambda calculus actually doesn’t support multi arity lambda functions, but we can use currying to circumvent the problem, so again, we will continue using multi arity functions for now, for the sake of readability.

Oh, we forgot to pass fact in the recursive call to match the new formal parameters! Let’s fix that:

(let [[fact
       (lambda (fact n) (if (zero? n)
                        1
                        (* n (fact fact (sub1 n)))))]]
  (fact fact 10))

And, ta-dah! It works!

With this trick, we can write any recursive function in lambda calculus by the following method:

Initially write the recursive function with define.
Change it to the let form.
Prepend both formal and actual parameters of the function with the a new parameter which uses the name of the function.

This transformation is simply a syntax transformation which could be done programmatically,

Let’s digress a little bit here. What will happen if I do this to the stupid infinite loop function?

(define (diverge) (diverge))
(diverge)

is transformed to:

(let [[diverge (lambda (diverge) (diverge diverge))]] (diverge diverge))

Now we desugar let back to lambda and application to obtain:

((lambda (diverge) (diverge diverge)) (lambda (diverge) (diverge diverge)))

which is the other mysterious, well-known $\Omega$ combinator!

Now, back to the main topic, the question that we should ask next is, are we satisfied with this way to write an arbitrary recursive function in lambda calculus? As it turned out, some people are dissatisfied because the transformed function doesn’t look like the original function that we would write in the define version. In fact, their goal is to try to make the transformed function look close enough to the original function, and abstract the part that looks like the original factorial function out. The result is a function that enables other functions to be recursive while maintaining their original form.

As a first step, we can transform:

(let [[fact
       (lambda (fact n) (if (zero? n)
                        1
                        (* n (fact fact (sub1 n)))))]]
  (fact fact 10))

(let [[fact
       (lambda (fact n)
         (let [[fact (lambda (n) (fact fact n))]]
           (if (zero? n)
               1
               (* n (fact (sub1 n))))))]]
  (fact fact 10))

which shadows fact that consumes two arguments with fact that consumes one argument while maintaining the semantics.

The body of the function is in fact now very similar to the original function that uses define. What can we do next? We see that fact and n are in the formal parameters in the transformed code, while only n is in the original one. Currying which is mentioned in the margin note above could separate two of them. Of course, we need to change how we call the function too.

(let [[fact
       (lambda (fact)
         (lambda (n)
           (let [[fact (lambda (n) ((fact fact) n))]]
             (if (zero? n)
                 1
                 (* n (fact (sub1 n)))))))]]
  ((fact fact) 10))

The (let [[fact ...]] ...) inside the lambda does not depend on the parameter n of the outter lambda function, so we can lift it up without changing the semantics:

(let [[fact
       (lambda (fact)
         (let [[fact (lambda (n) ((fact fact) n))]]
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (fact (sub1 n)))))))]]
  ((fact fact) 10))

And we see the original code for the fact function in the above code now!

Are we done? Well, not quite, because even though this resembles the original factorial function, we can’t abstract:

(lambda (n)
  (if (zero? n)
      1
      (* n (fact (sub1 n)))))

out as fact would then be unbound. But if we were to abstract the whole:

(lambda (fact)
  (let [[fact (lambda (n) ((fact fact) n))]]
    (lambda (n)
      (if (zero? n)
          1
          (* n (fact (sub1 n)))))))

it would be too ugly.

What can we do? We see a let, so let’s desugar it again:

(let [[fact
       (lambda (fact)
         ((lambda (fact)
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (fact (sub1 n))))))
          (lambda (n) ((fact fact) n))))]]
  ((fact fact) 10))

!!!!! Now we obtain:

(lambda (fact)
  (lambda (n)
    (if (zero? n)
        1
        (* n (fact (sub1 n))))))

which is the smallest part that can be abstracted out! Let’s call it F. Next, we have:

(let [[F (lambda (fact)
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (fact (sub1 n))))))]]
  (let [[fact
         (lambda (fact)
           (F (lambda (n) ((fact fact) n))))]]
    ((fact fact) 10)))

Now, substitute fact into the body of let:

(let [[F (lambda (fact)
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (fact (sub1 n))))))]]
  (((lambda (fact) (F (lambda (n) ((fact fact) n))))
    (lambda (fact) (F (lambda (n) ((fact fact) n))))) 10))

If we want to truly abstract F out, we will not want it to be free in the body of let, so:

(let [[F (lambda (fact)
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (fact (sub1 n))))))]]
  (((lambda (F)
      ((lambda (fact) (F (lambda (n) ((fact fact) n))))
       (lambda (fact) (F (lambda (n) ((fact fact) n)))))) F) 10))

The fragment:

(lambda (F)
  ((lambda (fact) (F (lambda (n) ((fact fact) n))))
    (lambda (fact) (F (lambda (n) ((fact fact) n))))))

or, after renaming while maintaining alpha equivalence:

(lambda (f)
  ((lambda (x) (f (lambda (v) ((x x) v))))
   (lambda (x) (f (lambda (v) ((x x) v))))))

is known as Y combinator. Let’s now use the name Y instead:

(let [[F (lambda (fact)
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (fact (sub1 n))))))]
      [Y (lambda (f)
           ((lambda (x) (f (lambda (v) ((x x) v))))
            (lambda (x) (f (lambda (v) ((x x) v))))))]]
  ((Y F) 10))

And just to clean things up to match Matthias’s lecture:

(let [[fact-maker (lambda (fact)
                    (lambda (n)
                      (if (zero? n)
                          1
                          (* n (fact (sub1 n))))))]
      [Y (lambda (f)
           ((lambda (x) (f (lambda (v) ((x x) v))))
            (lambda (x) (f (lambda (v) ((x x) v))))))]]
  (let [[fact (Y fact-maker)]]
    (fact 10)))

The process, if you notice, does not really depend on the fact that we use the factorial function, so in fact, Y enables recursive functions on every maker.

My presentation is similar to Matthias’s lecture a lot, but Matthias’s seems to insist on using lambda everywhere which might be a little bit confusing. I find that using let make things clearer a lot. Matthias’s also seems to have weird tower of makers and hukairs which I don’t really understand.