Lazier Than Lazy

While reading Programming Clojure I read that  lazy sequence can break recursion and prevent stack overflow. This wasn’t completely intuitive to me. How could changing the type of a sequence (eager or lazy)  change the way the stack is used? Let’s dig into it.

Let’s start first with the example given in the book:

(declare replace-symbol replace-symbol-expression)

(defn replace-symbol [coll  oldsym newsym]
(if (empty? coll)
()
(cons (replace-symbol-expression
(first coll) oldsym newsym)
(replace-symbol
(rest coll) oldsym newsym))))

(defn replace-symbol-expression [symbol-expr oldsym newsym]
(if (symbol? symbol-expr)
(if (= symbol-expr oldsym)
newsym
symbol-expr)
(replace-symbol symbol-expr oldsym newsym)))

(defn deeply-nested [n]
(loop [n n
result '(bottom)]
(if (= n 0)
result
(recur (dec n)(list result)))))

The function deeply-nested produces a structure like this

((((((((((bottom))))))))))

And the function replace-symbol can be used to change bottom to something else

user=> (replace-symbol (deeply-nested 10) ‘bottom ‘other)
(((((((((other))))))))))

And we indeed get a stack overflow if the nesting is too big

user=> (replace-symbol (deeply-nested 1000) ‘bottom ‘other)
#<CompilerException java.lang.StackOverflowError (NO_SOURCE_FILE:0)>

And if I slightly change the replace-symbol function to use a lazy sequence:

(defn replace-symbol [coll  oldsym newsym]
(lazy-seq (if (empty? coll)
()
(cons (replace-symbol-expression
(first coll) oldsym newsym)
(replace-symbol
(rest coll) oldsym newsym)))))

I indeed prevent the stack overflow from happening.

What happens actually is that lazy sequences are just like futures. The sequence is realized only when the sequence is used. If the consumer of the loop consumes the value sequentially, this indeed breaks the recursion. Let’s consider a sequence which is recursively defined as ( cons 1 (cons  2 ( cons 3 (list 4) ) ) ), and a function count defined with

(defn count-1 [col]
(loop [n 0 c col]
(if (= (first c) nil)
n
(recur (inc n)(rest c)))))

An attempt to count the number of items by traversing the sequence will produce the following call stack:

count-1
count-1, cons
count-1, cons, cons
count-1, cons, cons, cons
count-1, cons, cons, cons, list

(Count-1 itself is not recursive because it uses the recur special form to perform tail call optimization, which is not possible at the JVM level)

If the sequence is lazily defined, then the call stack will be as follows:

count-1
count-1, cons
count-1, cons
count-1, cons
count-1, list

The lazy sequence are realized one after the other, but each realization returns in the count-1 function.  This sequence is actually much closer to a generator, producing one item at a time.

This is the heart of infinite sequences also, as for instance in:

user=> (take 15 (cycle [1 2 3 4]))
(1 2 3 4 1 2 3 4 1 2 3 4 1 2 3)?

The special form lazy-seq in Clojure is actually a macro.

user=> (macroexpand-1 '(lazy-seq (list 1 2)))
(new clojure.lang.LazySeq (fn* [] (list 1 2)))

We clearly see here the parallel between future and lazy sequence. Both simply postpone the execution of a function at the time it is really needed.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s