
  concat . map (test p (one, none))
=   {definition of filter}
  filter p

5. We have

  map (fork (f,g)) = uncurry zip . fork (map f, map g)
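To see the law in action, here is a small sketch that can be run in GHCi. The definition of fork is restated as an assumption, since the answers do not repeat it:

  fork :: (a -> b, a -> c) -> a -> (b, c)
  fork (f, g) x = (f x, g x)

  -- Both sides yield the same list of pairs, e.g. with f = (+1), g = (*2):
  -- map (fork ((+1), (*2))) [1,2,3]                  == [(2,2),(3,4),(4,6)]
  -- uncurry zip (fork (map (+1), map (*2)) [1,2,3])  == [(2,2),(3,4),(4,6)]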
6. We have

  filter (p . foldl f e) . inits
=   {derived law of filter}
  map fst . filter snd .
    map (fork (id, p . foldl f e)) . inits
=   {law of zip}
  map fst . filter snd . uncurry zip .
    fork (id, map (p . foldl f e)) . inits
=   {law of fork}
  map fst . filter snd . uncurry zip .
    fork (inits, map (p . foldl f e) . inits)
=   {scan lemma}
  map fst . filter snd . uncurry zip .
    fork (inits, map p . scanl f e)

Hence

  takePrefix (p . foldl f e)
    = fst . last . filter snd . uncurry zip .
        fork (inits, map p . scanl f e)
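As a sanity check, the derived program can be transcribed into a runnable script. The sketch below restates fork so the fragment stands alone, and, like the derivation, it assumes that at least one prefix satisfies the test (otherwise last is applied to an empty list):

  import Data.List (inits)

  fork :: (a -> b, a -> c) -> a -> (b, c)
  fork (f, g) x = (f x, g x)

  -- takePrefix specialised to tests of the form p . foldl f e:
  -- scanl f e computes the foldl-summaries of all prefixes, so each
  -- element of the input is consumed by f only once.
  takePrefix :: (b -> Bool) -> (b -> a -> b) -> b -> [a] -> [a]
  takePrefix p f e
    = fst . last . filter snd . uncurry zip . fork (inits, map p . scanl f e)

For example, takePrefix (< 10) (+) 0 [1..5] returns [1,2,3], the longest prefix whose sum is less than 10.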

6.9 Chapter notes

Gofer, an earlier version of Haskell designed by Mark Jones, was so named because it was GOod For Equational Reasoning. Hugs (the Haskell User's Gofer System) was an earlier alternative to GHCi, and was used in the second edition of the book on which the current one is based, but it is no longer maintained.

Many people have contributed to the understanding of the laws of functional programming, too many to list. The Haskellwiki page

haskell.org/haskellwiki/Equational_reasoning_examples

contains examples of equational reasoning and links to various discussions about the subject.

The fascinating history of the maximum segment sum problem is discussed in Jon Bentley's Programming Pearls (second edition, Addison-Wesley, 2000).

Chapter 7
Efficiency

The question of efficiency has been an ever-present undercurrent in recent discussions, and the time has come to bring this important subject to the surface. The best way to achieve efficiency is, of course, to find a decent algorithm for the problem. That leads us into the larger topic of Algorithm Design, which is not the primary focus of this book. Nevertheless we will touch on some fundamental ideas later on. In the present chapter we concentrate on a more basic question: functional programming allows us to construct elegant expressions and definitions, but do we know what it costs to evaluate them? Alan Perlis, a US computer scientist, once inverted Oscar Wilde’s definition of a cynic to assert that a functional programmer was someone who knew the value of everything and the cost of nothing.

7.1 Lazy evaluation

We said in Chapter 2 that, under lazy evaluation, an expression such as

  sqr (sqr (3+4))

where sqr x = x*x, is reduced to its simplest possible form by applying reduction steps from the outside in. That means the definition of the function sqr is installed first, and its argument is evaluated only when needed. The following evaluation sequence follows this prescription, but is not lazy evaluation:

  sqr (sqr (3+4))
= sqr (3+4) * sqr (3+4)
= ((3+4)*(3+4)) * ((3+4)*(3+4))
= ...
= 2401

The ellipsis in the penultimate line hides no fewer than four evaluations of 3+4 and two of 7*7. Clearly the simple policy of substituting argument expressions into function expressions is a very inefficient way of carrying out reduction.

Instead, lazy evaluation guarantees that when the value of an argument is needed, it is evaluated only once. Under lazy evaluation, the reduction sequence would unfold basically as follows:

  sqr (sqr (3+4))
= let x = sqr (3+4) in x*x
= let y = 3+4 in
    let x = y*y in x*x
= let y = 7 in
    let x = y*y in x*x
= let x = 49 in x*x
= 2401

The expression 3+4 is evaluated only once (and so is 7*7). The names x and y have been bound to expressions using let, though in the implementation of Haskell these names are anonymous pointers to expressions. When an expression is reduced to a value, the pointer then points to the value, and that value can then be shared.
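You can observe this sharing directly with trace from the base library's Debug.Trace module (a small experiment of ours; trace prints its message each time the wrapped expression is actually evaluated):

  import Debug.Trace (trace)

  sqr :: Int -> Int
  sqr x = x * x

  -- The traced argument is shared, so the message appears only once,
  -- not four times, before the result 2401 is printed.
  demo :: Int
  demo = sqr (sqr (trace "evaluating 3+4" (3 + 4)))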

Even then, the headline 'Under lazy evaluation arguments are evaluated only when needed and then only once!' doesn't tell the full story. Consider evaluation of sqr (head xs). In order to evaluate sqr we have to evaluate its argument, but in order to evaluate head xs we do not have to evaluate xs all the way, only to the point where it becomes an expression of the form y:ys. Then head xs can return y and sqr (head xs) can return y*y.

More generally, an expression is said to be in head normal form if it is a function (such as sqr) or if it takes the form of a data constructor (such as (:)) applied to its arguments. Every expression in normal form (i.e. in fully reduced form) is in head normal form, but not vice versa. For example, (e1,e2) is in head normal form (because it is equivalent to (,) e1 e2, where (,) is the data constructor for pairs), but it is in normal form only if both e1 and e2 are. Of course, for numbers or booleans there is no distinction between the two kinds of normal form.
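A quick GHCi experiment (ours, not the book's) makes the point: pattern matching drives evaluation only to head normal form, so the undefined second component below is never touched:

ghci> let p = (3+4, undefined)
ghci> case p of (a, _) -> a
7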

‘Under lazy evaluation arguments are evaluated only when needed and then only once, and then maybe only to head normal form’ is not as catchy a headline as before, but it does tell a better story.

Next, consider the following two definitions of the inductive case of the function subseqs that returns all the subsequences of a list:

  subseqs (x:xs) = subseqs xs ++ map (x:) (subseqs xs)

  subseqs (x:xs) = xss ++ map (x:) xss
    where xss = subseqs xs

In the first definition the expression subseqs xs appears twice on the right-hand side, so it is evaluated twice when the subsequences of a given list are required. In the second definition this duplication of effort has been recognised by the programmer, and a where clause has been used to ensure that subseqs xs is evaluated only once (we could also have used a let expression).
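For completeness, here is a runnable version with the base case filled in (the base case is our addition; the text shows only the inductive case):

  -- All subsequences of a list; there are 2^n of them for a list of length n.
  subseqs :: [a] -> [[a]]
  subseqs []     = [[]]
  subseqs (x:xs) = xss ++ map (x:) xss
    where xss = subseqs xs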

The important point is that you, the programmer, are in control of which definition you want. It is quite possible for Haskell to recognise the double occurrence and to abstract it away using the equivalent of an internal let expression. This is a well-known technique called common subexpression elimination. But Haskell doesn't do this, and for a very good reason: it can cause a space leak. The second definition of subseqs (x:xs) has the following problem: the list subseqs xs is constructed only once, but it is retained in its entirety in memory because its value is used again, namely in the second expression map (x:) xss.

Look at it this way: the first definition takes longer because computation is duplicated; the second definition is faster (though still exponential) but can rapidly run out of available space. After all, there are 2^n subsequences of a list of length n. There is a fundamental dichotomy in programming we can never get away from: to avoid doing something twice you have to use up space to store the result of doing it once.

Here is a related example. Consider the following two definitions in a script:

  foo1 n = sum (take n primes)
    where
      primes     = [x | x <- [2..], divisors x == [x]]
      divisors x = [d | d <- [2..x], x `mod` d == 0]

  foo2 n = sum (take n primes)

  primes     = [x | x <- [2..], divisors x == [x]]
  divisors x = [d | d <- [2..x], x `mod` d == 0]

The programmer who wrote foo1 decided to structure their script by making the definitions of both primes and divisors local to the definition of foo1, presumably because neither definition was used elsewhere in the script. The programmer who wrote foo2 decided to allow these two subsidiary definitions to float to the status of a global or top-level definition. You might think that doesn't make any difference to the efficiency, but consider the following interaction with GHCi. (The command :set +s turns on some statistics which are printed after an expression is evaluated.)
ghci> :set +s
ghci> foo1 1000
3682913
(4.52 secs, 648420808 bytes)
ghci> foo1 1000
3682913
(4.52 secs, 648412468 bytes)
ghci> foo2 1000
3682913
(4.51 secs, 647565772 bytes)
ghci> foo2 1000
3682913
(0.02 secs, 1616096 bytes)

Why was the second evaluation of foo2 1000 so much faster than the first, while the two evaluations of foo1 1000 took the same time?

The answer is that in the definition of foo2 the first 1000 elements of the list primes are demanded, so after evaluation primes now points to a list in which the first 1000 primes appear explicitly. The second evaluation of foo2 1000 does not require these primes to be computed again. Internally, the script has grown in size because primes now occupies at least 1000 units of space.
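You can watch this happening with GHCi's :sprint command, which displays how far a binding has been evaluated without forcing it any further (a sample session of ours; the exact output depends on the GHC version):

ghci> :sprint primes
primes = _
ghci> foo2 10
129
ghci> :sprint primes
primes = 2 : 3 : 5 : 7 : 11 : 13 : 17 : 19 : 23 : 29 : _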

Programmer Three chooses to write foo in the following way:

  foo3 = \n -> sum (take n primes)
    where
      primes     = [x | x <- [2..], divisors x == [x]]
      divisors x = [d | d <- [2..x], x `mod` d == 0]

This uses a lambda expression to express foo3 at the function level, but otherwise the definition is exactly the same as that of foo1. The alternative

  foo3 = sum . flip take primes

also works but seems a little obscure. Now we have

ghci> foo3 1000
3682913
(3.49 secs, 501381112 bytes)
ghci> foo3 1000
3682913
(0.02 secs, 1612136 bytes)

Again, the second evaluation is much faster than the first. Why is that?

To see what is going on, we can rewrite the two functions in the form

  foo1 n = let primes = ... in
           sum (take n primes)

  foo3   = let primes = ... in
           \n -> sum (take n primes)

Now you can appreciate that in the first definition primes is re-evaluated every time foo1 1000 is called, because it is bound to an application of foo1, not to the function itself. It is theoretically possible that the local definitions in the first definition depend on n, so any such definitions have to be re-evaluated for each n. In the second definition the local definitions are bound to the function itself (and can't possibly depend on any argument to the function); consequently, they are evaluated only once. Of course, after evaluating foo3 1000, the local definition of primes will be expanded to an explicit list of 1000 elements followed by a recipe for evaluating the rest.

7.2 Controlling space

Suppose we define sum by sum = foldl (+) 0. Under lazy evaluation the expression sum [1..1000] is reduced as follows:

  sum [1..1000]
= foldl (+) 0 [1..1000]
= foldl (+) (0+1) [2..1000]
= foldl (+) ((0+1)+2) [3..1000]
= ...
= foldl (+) (..((0+1)+2)+ ... +1000) []
= (..((0+1)+2)+ ... +1000)
= ...
= 500500

It requires 1000 units of space just to build up the arithmetic expression that sums the first 1000 numbers before it pops to the surface and is finally evaluated.

Much better is to use a mixture of lazy and eager evaluation:

  sum [1..1000]
= foldl (+) 0 [1..1000]
= foldl (+) (0+1) [2..1000]
= foldl (+) 1 [2..1000]
= foldl (+) (1+2) [3..1000]
= foldl (+) 3 [3..1000]
= ...
= foldl (+) 500500 []
= 500500

While the list expression [1..1000] is evaluated lazily, the second argument of foldl, the accumulated sum, is evaluated eagerly. The result of interleaving lazy and eager evaluation steps is a sequence that uses a constant amount of space.

This suggests that it would be useful to have some way of controlling the reduction order. Such a method is provided by a primitive function seq with type

  seq :: a -> b -> b
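Evaluating x `seq` y reduces x to head normal form and then returns y. As an illustration of the control this gives, here is a sketch of a strict variant of foldl that forces the accumulator at every step (Data.List exports a similar function under the same name):

  foldl' :: (b -> a -> b) -> b -> [a] -> b
  foldl' f e []     = e
  foldl' f e (x:xs) = y `seq` foldl' f y xs
    where y = f e x  -- forced before the recursive call, so no chain of thunks

Redefining sum = foldl' (+) 0 makes the accumulated total reduce at each step, so sum [1..1000] runs in constant space, as in the second reduction sequence above.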
