# Lecture 6: Laziness
## Strict evaluation
Strict evaluation is the conceptual opposite of lazy evaluation. Under strict evaluation, a function's arguments are fully evaluated *before* the function itself is applied.
The key difference between the two, and the main reason to consider lazy evaluation over strict, is that lazy evaluation computes only what is necessary to obtain the final result. For example, suppose a function takes two arguments but uses only the first to produce its result, and it is called with an overwhelmingly expensive computation as its second, unused argument. Strict evaluation of that call would waste time and computing resources solving a preliminary expression whose result is never actually used.
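To make this concrete, here is a small Haskell sketch (the function name `firstOnly` is illustrative, not from the lecture): under lazy evaluation the expensive second argument is never computed, and even `undefined` is harmless, because the argument is never used.

```haskell
-- A function that ignores its second argument entirely.
firstOnly :: a -> b -> a
firstOnly x _ = x

main :: IO ()
main = do
  -- The huge sum below is never computed: the argument is unused,
  -- so it stays an unevaluated thunk.
  print (firstOnly (42 :: Int) (sum [1 .. (10 ^ 9 :: Integer)]))
  -- Even a crashing expression is fine when it is never demanded.
  print (firstOnly "ok" (undefined :: Int))
```

In a strict language, the same call would first spend time summing a billion numbers (or crash on `undefined`) before `firstOnly` ever ran.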
On the other hand, strict evaluation has the advantage of being predictable: every operation is performed in the order in which it is written. In languages that allow functions to produce side effects, strict evaluation is needed, because the result may depend on the order in which operations and functions are evaluated.
## Side effects and purity
The main reason for using strict evaluation is the presence of side effects: not just that a function uses something external to produce its result, but that this external thing can change over time. The most common scenario is modifying a global variable, but it is not the only one: printing to the screen and reading from a file or a network socket are also side effects which may require a specific order.
Since lazy evaluation does not make the order in which things happen transparent, adding side effects on top of it would create plenty of issues! The reason Haskell has no side effects (i.e. it is *pure*) is that its designers wanted it to be both functional and lazy, and that is not possible if side effects exist.
A language with no side effects at all would not be very useful. Despite that, the Haskell designers found a way to allow the language to "interact with the outside world" (read and write files, print to the screen, etc.) in a principled and restricted way that doesn't interfere with the purity of the language: the IO monad.
## Lazy evaluation
Essentially, lazy evaluation means delaying evaluation as long as possible: an expression is only evaluated once its result is needed for the overall computation to advance.
When an expression is passed as an argument to a function, it is packaged up as an *unevaluated expression*, called a **thunk**, without doing any actual work. If the expression ends up unused (i.e. the function doesn't use that parameter), the thunk is eventually cleaned up by the garbage collector.
## Pattern matching drives evaluation
So far we've talked about *whether* an expression is used, but not *when* it is necessary to evaluate it. Let's look at an example.
```haskell
f1 :: Maybe a -> [Maybe a]
f1 m = [m, m]

f2 :: Maybe a -> [a]
f2 Nothing  = []
f2 (Just x) = [x]
```
Both `f1` and `f2` use their arguments, but one needs to evaluate it while the other doesn't. As we can see, `f1` uses its argument `m` but it doesn't need to know anything about it, so it can be put in the list as an unevaluated expression. On the other hand, `f2` needs to know whether its argument was constructed using `Nothing` or `Just`, so it needs to be evaluated because its result depends on it.
Another important aspect is that thunks are only evaluated enough to proceed with the computation, and no further. If `f2` were given a very heavyweight computation as its argument, wrapped in a `Just`, how far that computation is evaluated would depend on where and how the result of `f2` is used. If it is never needed, it remains uncalculated.
A handy slogan helps us remember the important points: *"pattern matching drives evaluation"*, which means that:
- Expressions are only evaluated when pattern-matched.
- Expressions are only evaluated as far as needed for the match to proceed.
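We can check both points directly (a self-contained sketch that repeats the definitions of `f1` and `f2`): `f1` never forces its argument, so even `undefined` is safe, while `f2` forces only the outer constructor, leaving whatever is inside the `Just` untouched.

```haskell
f1 :: Maybe a -> [Maybe a]
f1 m = [m, m]

f2 :: Maybe a -> [a]
f2 Nothing  = []
f2 (Just x) = [x]

main :: IO ()
main = do
  -- f1 never pattern-matches on its argument, so undefined is harmless.
  print (length (f1 (undefined :: Maybe Int)))
  -- f2 matches only on the constructor; the payload stays a thunk,
  -- and `length` never forces the list's elements.
  print (length (f2 (Just (undefined :: Int))))
```

Both calls succeed because `length` only needs the spine of the list, not its elements: evaluation goes exactly as far as the pattern matches demand.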
Although lazy evaluation works like this in theory, in practice GHC uses a more sophisticated technique called *graph reduction*, which represents all expressions as a single graph. In this graph, different parts of the expression can share pointers to the same subexpression, which ensures that no unique expression is evaluated more than once.
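A minimal illustration of that sharing, using `Debug.Trace.trace` (which prints its message to stderr at the moment its argument is actually reduced): the bound expression is a single node in the graph, so even though it is used twice, it is evaluated once.

```haskell
import Debug.Trace (trace)

main :: IO ()
main = do
  -- `shared` is one thunk; both uses below point at the same graph node,
  -- so GHC prints "reducing" only once (to stderr), then the result.
  let shared = trace "reducing" (2 + 3 :: Int)
  print (shared + shared)
```

If expressions were copied instead of shared, the trace message would appear twice.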
## Consequences
Lazy evaluation has a number of consequences, summed up in the list below (in each item, "it" refers to lazy evaluation):
- It forces the language to be pure.
- Determining space usage can get a little tricky, mostly due to thunks and recursion.
- It behaves in interesting ways depending on how functions are defined, mainly due to short-circuiting: of two semantically equivalent functions, one may take longer to compute because it pattern-matches (and therefore evaluates) more eagerly, and one may even crash where the other does not.
- It makes it possible to define custom control structures.
- It makes it possible to work with infinite data structures, or with structures too large to fit into whatever memory is available.
- Pipelined incremental transformations of large data structures are memory-efficient.
- It eases the management of results in dynamic programming problems, in particular determining the order in which the entries of a dynamic programming table are evaluated.
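Two of the points above can be sketched in a few lines: an infinite list of which we only ever build the demanded prefix, and a user-defined control structure (`if'`, a hypothetical name) that, thanks to laziness, never evaluates the branch it discards.

```haskell
-- An infinite list: only the prefix that is demanded is ever built.
naturals :: [Integer]
naturals = [0 ..]

-- A custom control structure: in a strict language both branches would
-- be evaluated before the call; here the unused branch stays a thunk.
if' :: Bool -> a -> a -> a
if' True  t _ = t
if' False _ e = e

main :: IO ()
main = do
  print (take 5 naturals)          -- only five elements are computed
  print (if' True "yes" undefined) -- the undefined branch is never forced
```

Defining `if'` as an ordinary function like this would be impossible under strict evaluation, where the `undefined` branch would crash the program before `if'` even ran.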