Lazy Sequence in Clojure
Introduction
A lazy sequence is a sequence that is not realized (or computed) until it is needed. This can be useful for handling potentially infinite sequences or for avoiding unnecessary computation when dealing with large sequences. You benefit by avoiding computations until they are absolutely needed and on demand.
Lazy sequences are available directly in many functional programming languages
like Haskell (LazyEvaluation),Swift (LazySequence) and via libraries in other
languages like Rust (lazy_seq), Go (seq). We will look into lazy-seq
available
in clojure.core
.
To create a lazy sequence, in Clojure you can use the lazy-seq
function. Here's
an example of how to create a lazy sequence of integers:
|
|
In the above example, number-sequence
is defined as a function that returns a
lazy sequence of integers starting at n
. The lazy-seq
function creates the
lazy sequence, and the cons
function is used to add the first element (1) to
the sequence. The map function is used to generate the remaining elements of the
sequence by adding 1 to each element of the previous sequence.
When we call (number-sequence)
, we get back a lazy sequence object. When we
call take
to get the first 5 elements of the sequence, Clojure will realize
(or compute) only the first 5 elements of the sequence, rather than computing
the entire sequence up front. This can be useful for avoiding unnecessary
computation when dealing with large or potentially infinite sequences.
Before going to more detailed explanations, here are two more examples of how to
use lazy-seq
in Clojure:
Example 1: Generating an infinite sequence of Fibonacci numbers
|
|
In this example, fib-seq
is defined as a lazy sequence of Fibonacci numbers.
The cons
function is used to add the first two elements (1 and 1) to the
sequence. The map
function is used to generate the remaining elements of the
sequence by adding each element to its predecessor. Because fib-seq is defined
recursively in terms of itself, it creates an infinite sequence of Fibonacci
numbers.
When we call (take 10 fib-seq)
, Clojure will realize only the first 10
elements of the sequence. Because the sequence is defined lazily, we can
generate an infinite sequence of Fibonacci numbers without running into issues
with stack overflow or memory usage.
Here is another way to create infinite fibonacci sequence in Clojure using
lazy-seq
|
|
Example 2: Implementing the Sieve of Eratosthenes algorithm
|
|
In this example, sieve
is defined as a function that implements the Sieve of
Eratosthenes algorithm, which generates a sequence of prime numbers. The
lazy-seq
function is used to create the sequence of prime numbers. The let
expression is used to bind the first element of the input sequence s
to the
variable p
. The cons
function is used to add p
to the output sequence. The
filter
function is used to remove all elements from s
that are divisible by
p
. The resulting sequence is passed recursively to sieve.
When we call (take 20 (sieve (iterate inc 2)))
, we get back a lazy sequence of
the first 20 prime numbers. Because the Sieve of Eratosthenes algorithm can be
used to generate an infinite sequence of prime numbers, we can use lazy-seq
to
generate prime numbers on-the-fly as we need them, without having to compute the
entire sequence up front.
How does lazy-seq work internally?
Let us get deeper in it.
When you call (lazy-seq <body>)
, Clojure returns a lazy sequence object that
wraps the body of the function. The body of the function is wrapped in a
function object that can be called later to compute the next element of the
sequence.
When you later call a sequence function like first
, rest
, map
, or
filter
on the lazy sequence object, Clojure will call the function object
generated by lazy-seq
to compute the next element of the sequence. This will
cause the body of the lazy-seq
function to be executed, if it has not already
been executed, and one or more elements of the sequence to be realized.
Once an element of the sequence has been realized, it is cached, so that
subsequent calls to sequence functions like first
, rest
, map
, or filter
can return the same value without recomputing it. This caching feature is
important and can be useful for avoiding unnecessary computation when dealing
with large or potentially infinite sequences.
One important thing to note is that lazy sequences are realized incrementally,
one element at a time. This means that if you call a sequence function that
requires the entire sequence to be realized, like count or reduce, then Clojure
will need to realize the entire sequence before it can return a value. This can
cause performance issues or out-of-memory errors if you try to realize an
infinite or very large sequence in this way. Clojure also has some optimisations
that will compute 32
items of lazy-seq at once.
Alternatives to lazy-seq in Clojure
You don't need to use lazy-seq
directly to work with lazy sequences in
Clojure. There are alternatives to lazy-seq
in Clojure, each with its own
advantages and limitations. Here are some of the most commonly used
alternatives:
iterate
: iterate is a built-in function that returns a lazy sequence of values generated by applying a function to an initial value repeatedly. For example,(iterate inc 0)
generates an infinite sequence of natural numbers.repeatedly
: repeatedly is a built-in function that returns a lazy sequence of values generated by calling a function repeatedly. For example,(repeatedly 10 #(rand-int 10))
generates a sequence of 10 random integers between 0 and 9.range
: range is a built-in function that returns a lazy sequence of numbers in a specified range. For example,(range 1 10)
generates a sequence of integers from 1 to 9.mapcat
: mapcat is a higher-order function that applies a function to each element of a sequence and concatenates the results into a single sequence. For example,(mapcat #(range 1 %) [2 3 4])
generates a sequence of integers from 1 to 2, then from 1 to 3, and finally from 1 to 4.concat
: concat is a function that concatenates two or more sequences into a single sequence. For example,(concat [1 2 3] [4 5 6])
generates a sequence of integers from 1 to 6.
Each of these alternatives has its own use cases and trade-offs. For example,
iterate
and repeatedly
are good for generating sequences of values based on
a function, while range
is good for generating sequences of consecutive
numbers. mapcat
and concat
are good for combining multiple sequences into a
single sequence. However, none of these alternatives can match the flexibility
and power of lazy-seq
for generating complex and potentially infinite
sequences of values.
Also read about Stuart Sierra's advice Clojure Dont's: Concat
Summary
Lazy evaluation is a technique used in functional programming languages to delay the evaluation of expressions until their values are actually needed. In Clojure, lazy evaluation can be implemented using lazy sequences.
You can use the lazy-seq
macro to create a lazy sequence in Clojure. It takes
a body of code that generates the next element of the sequence and returns a new
lazy sequence that evaluates the body of code only when necessary.
Lazy evaluation is useful for
- Efficient processing of potentially infinite sequences of values
- Dealing with large data sets
- Performing complex computations on streams of data.
Two important features of lazy sequences in Clojure are
- Delayed computation until in is actually needed
- Caching of computed values
Important note about lazy-seq
, it does 32 at a time not just one as an
optimisation.