# Lecture 2: Algebraic data types ## Enumeration types data Thing = Shoe | Ship | SealingWax | Cabbage | King derivating Show The example above is the **type declaration** of an **enumeration type**. The new type is called `Thing` and it has 5 **data constructors** (`Shoe`, `Ship`, etc.). A type's constructors are the only values for that type. The line that says `deriving Show` is a special instruction that tells the GHC to automatically generate default code for converting a `Thing` to `String`. Shoe :: Thing Shoe = Shoe listO'Things :: [Thing] listO'Things = [Show, SealingWax, King, Cabbage, King] Functions on enumeration types can be written using pattern-matching. isSmall :: Thing -> Bool isSmall Show = True isSmall Ship = False isSmall SealingWax = True isSmall Cabbage = True isSmall King = False Since function clauses are checked from top to bottom, `isSmall` can be rewritten like this: isSmall2 :: Thing -> Bool isSmall2 Ship = False isSmall2 King = False isSmall2 _ = True ## Beyond enumerations Enumeration types are just a special case of Haskell's **algebraic data types**. An example of ADT that is not an enumeration type is the following `FailableDouble`: data FailableDouble = Failure | OK Double deriving Show `FailableDouble` has two constructors: `Failure` and `OK`. The former takes no arguments, so it's a value by itself (of type `FailableDouble`). The latter takes a value of type `Double` and `OK` needs it in order to be a value, otherwise it's not. ex01 = Failure ex02 = OK 3.4 What's the type of `OK`? (TODO: give an answer) safeDiv :: Double -> Double -> FailableDouble safeDiv _ 0 = Failure safeDiv x y = OK (x / y) It's possible to give names to variables in constructors when using pattern-matching. failureToZero :: FailableDouble -> Double failureToZero Failure = 0 failureToZero (OK d) = d Data constructors can have more than one argument. data Person = Person String Int Thing deriving show brent :: Person brent = Person "Brent" 31 SealingWax stan :: Person stan = Person "Stan" 94 Cabbage getAge :: Person -> Int getAge (Person _ a _) = a Type constructor and data constructors can have the same name, in this case `Person`. However, they live in different name spaces and are different. It's a common idiom to use the type's name for its data constructor when it only has one constructor. ## Algebraic data types in general Algebraic data types can have one or more constructors, and each of them can have arguments, from zero to as many as we want. The syntax is: data AlgDataType = Constr1 Type11 Type12 | Constr2 Type21 | Constr3 Type31 Type32 Type33 | Constr4 Each constructor can be used to create a value of the `AlgDataType` type and, depending on the one that's chosen, a value of type `AlgDataType` may have more separate values and different types for them. Type and data constructor names must always start with a capital letter, while variables and functions must always start with a lowercase letter. ## Pattern-matching Although we already went through pattern-matching a bunch of times, it's time to see how it works in general. The idea is that pattern-matching analyzes a value by the constructor it was made with. Different constructors, different things to do, and constructors are the only way to make a decision. foo (Constr1 a b) = ... foo (Constr2 a) = ... foo (Constr3 a b c) = ... foo Constr4 = ... Haskell requires parentheses around a constructor which takes a value or more. Constructor values can be given names, which is the first step in order to use such values for computation. In addition: 1. Underscores match everything, they are called **wildcard pattern**. 2. Patterns like `x@pat` both give a name (`x`) to the pattern (`pat`) and match a value against that pattern. 3. Nested patterns are possible. Literal values can be thought as themselves being a constructor. Things like `2` or `'c'` are constructors without arguments. This makes it possible to use pattern-matching against literal values. ## Case expressions **Case expressions** are fundamental for pattern-matching. case exp of pat1 -> exp1 pat2 -> exp2 The evaluation of the case construct is similar to that of functions: the given expression (`exp`) is matched against the given patterns in order (`pat1`, `pat2`, etc.), the first match is chosen, and the whole case expression evaluates to the expression associated with the matching pattern. The syntax for defining functions is actually just a convenient way to define case expressions. Theoretically, as with functions, multiple patterns can match the given expression, but only the first that matches (from the top) is chosen. ## Recursive data types Data types can be recursive: defined in terms of themselves. Lists are an example: they can be either empty, or a single element followed by the rest of the list. The list type could be defined like this: data IntList = Empty | Cons Int IntList Haskell's actual lists are similar, but they use a special syntax. Recursive functions are very useful to process recursive data types. intListProd :: IntList -> Int intListProd Empty = 1 intListProd (Cons x l) = x * intListProd l Binary trees can also be defined with recursive data types. The following type `Tree` is a binary tree with an `Int` at each node and a `Char` at each leaf. data Tree = Leaf Char | Node Tree Int Tree deriving Show It can be used as follows: tree :: Tree tree = Node (Leaf 'x') 1 (Node (Leaf 'y') 2 (Leaf 'z'))