Every (algebraic) data type is created for a very specific purpose. A data type isn’t made because you feel like making one (even if you can). Those data types which exist for no reason, in general, will have difficulty being generalized (into monads, as we will see later). However, let’s assume we’re all keen and astute programmers who won’t do that for the heck of it, and that we’re actually trying to accomplish something.

With that, let’s take a whirl around the idea of algebraic data types, and what they really mean. As mentioned, when a data type is created, it’s because it is trying to represent a particular concept or property.

Assumptions: You understand the Haskell type system (typeclasses, algebraic data types, type declarations, etc), and Haskell syntax in general.

For instance, suppose we want to represent failure. Haskell’s `Maybe`

type does that. The very reason we type something as a `Maybe`

, is because we want to represent failure on that something. As an example, if a function f decides whether or not an integer is a perfect square, the intuitive result that the function should return is a boolean – `true`

or `false`

. Either something is a perfect square, or is not. This should be simple.

Hence, a C function may look something like this:

```
bool isPerfectSquare (int i) {
if (code_to_check_for_perfect_square)
return `true`;
else
return `false`;
}
```

However, what if we pass that function a negative number? By definition, over integers, negative numbers cannot be perfect squares. Hence, we would want to say that the function failed.

In most imperative languages (C/C++, Java, etc), we would use a failure value to represent failure. Perhaps we would also like to abort the “execution” of that function. However, we cannot use a boolean here anymore, because we essentially need a third result, and there are only two possible values for a boolean. Hence, a common (and ugly) solution in C would be to “upgrade” (essentially a downgrade) the return type to an integer. Hence, a C function may look something like this:

```
int isPerfectSquare (int i) {
if (i < 0)
return -1;
if (code_to_check_for_perfect_square)
return 1;
else
return 0;
}
```

There are similar strategies, but basically they all involve using a “magic” value to represent failure. That may be -1, NULL, 0, or whatever other value pleases the programmer.

However, such strategies place the implicit meaning of the “magic” value on the programmer – the human. The programmer says that -1 means failure, and humanly enforces it. The compiler (or the code) doesn’t know that, and hence cannot enforce it. If another programmer comes along and takes `isPerfectSquare()`

for a ride, and decides to alter it and state that -1 means something else, and -999 now means failure, then any code that relies on `isPerfectSquare()`

is going to break. This is not a good approach.

Haskell solves this by allowing the programmer to define new custom types that gives a context to various things (values). For example, in the `isPerfectSquare`

function, Haskell lets a programmer state, through a type, that the function has failed, or succeeded, and if it succeeded, what the result is. The custom data type here is the `Maybe`

type, and is defined as follows:

```
data Maybe a = Just a | Nothing
```

Hence, suppose we have a result of `True`

(or `False`

), we can type the result (a boolean) with the `Maybe`

type, by saying `Just True`

(or `Just False`

). Note that here, it is assumed that you understand the Haskell type system, at least where type constructors, value constructors and polymorphic/parameterized types are concerned. And if the function failed, we can just return a Nothing.

### Giving a Context

The idea here is not Haskell’s type system, but the idea of typing a value. Alternatively, you can see it as encapsulating, or wrapping, a value. Regardless of how you see it, the key thing is that we are now placing a value (the boolean, in this example), into a context. **The context is failure**. Hence, in the example above, we just placed the resultant `True`

or `False`

into a context. There is always a very specific reason for creating a particular custom type (`Maybe`

, in this case), and for `Maybe`

, the reason is to represent failure.

If there is another data type, then it must have a reason to have been created.

For instance, if you created a data type called Person:

```
data Person = Person String String String deriving (Show)
```

Then the reason for the Person data type is that you, the programmer, has decided that a Person has three string fields (such as first name, last name, middle name).

If you create a data type called Tree, as follows:

```
data Tree = Empty | Tree a (Tree a) (Tree a) deriving (Show)
```

then a possible reason is that you are representing indecision, whereby the tree illustrates all possible decisions – all outcomes for a given input.

Similarly, for a list:

```
data List = Empty | List a (List a) deriving (Show)
```

For all these data types, whenever we type a value as that data type, we are giving that value a context. And that context has a meaning in which you can articulate. For instance, a Tree 4 has type `Tree Int`

, and what it means is that you have a decision with input `4`

. Since the tree has no branches, there are no decisions and outcomes yet. If you have a `Tree 4 (Tree 5) (Tree 6)`

, it represents a decision with input `4`

, that can yield a `5`

or a `6`

(if you “traverse” the tree). Similarly, if you have a list like `List 5 (List 6 (List 7))`

, you may be representing non-determinism, whereby a value can take either `5`

, `6`

or `7`

.

Note: it is important to note that data types have a reason for their existence. If not, it becomes difficult to see how they can be generalized into a monad.

### Abstracting Away (Each and Every Different) Context

Now, what happens with our `isPerfectSquare`

function? Let’s write the proper Haskell version of `isPerfectSquare`

, complete with the `Maybe`

type to represent failure. We also modify it a bit, because we don’t just want a boolean, we want the value of the function succeeds. In other words, if it’s not a perfect square, we return Nothing, if it is, we return the number itself. Let’s also add another similar function that decides if something is a big number a not. We decide that numbers larger than 1024 are “big”:

```
isPerfectSquare :: Integer -> `Maybe` Integer
isPerfectSquare n =
let sq = floor (sqrt (fromIntegral n :: Double)) in
case (sq * sq == n) of
`True` -> Just n
`False` -> Nothing
isBigNumber :: Integer -> `Maybe` Integer
isBigNumber n
| n < 1024 = Nothing
| otherwise = Just n
```

Great, nice and simple. Now what if we want to treat those two functions as primitives, and develop a `isBigAndPerfectSquare`

function, which tests if an integer is larger than 1024, and is a perfect square? One way to do that would be to check if an integer is a perfect square (`isPerfectSquare`

), and if it’s bigger than 1024 (`isBigNumber`

). Not the most efficient, but just for illustration purposes. Hence we can write this:

```
isBigAndPerfectSquare :: Integer -> `Maybe` Integer
isBigAndPerfectSquare n =
case (`isPerfectSquare` n) of
Nothing -> Nothing
Just x -> case (isBigNumber n) of
Nothing -> Nothing
Just y -> Just y
```

Again, there are a ton of better ways to write it. It’s for illustration. Let’s go further.

```
isMagicNumber :: Integer -> `Maybe` Integer
isMagicNumber n
| n == 262144 = Just n
| otherwise = Nothing
isBigAndPerfectSquareAndMagic :: Integer -> `Maybe` Integer
isBigAndPerfectSquareAndMagic n =
case (`isPerfectSquare` n) of
Nothing -> Nothing
Just x -> case (isBigNumber n) of
Nothing -> Nothing
Just y -> case (isMagicNumber n) of
Nothing -> Nothing
Just z -> Just z
```

By now something is becoming very apparent. The **reason** for the custom data type is worming its way into your code. Everytime you have a “primitive” function (such is `isPerfectSquare`

) which returns a result (or computes to a value) of a data type that has a context (i.e. `Maybe`

), you have to check that context (is it a Just x, or a Nothing?) should you wish to **use** the result.

To state that again, when a result has a context (i.e. `Maybe`

, to represent failure), **using** that result means unwrapping the context to (a) extract the value out so that you can actually act on the value, and (b) checking the entire context to see if you need to handle special cases. The special case is failure, the value is the x in Just x. We performed the value extraction by pattern matching against Just `False`

and Just `True`

, and we checked the special case by pattern matching against Nothing. Once again, in this case it’s simple, because `Maybe`

is quite simple. If you have a more complex data type, you have to check and extract more stuff. Very quickly, this becomes a pain. Also notice one more thing: The point at which things became a pain was when two “primitives” were sequenced together. If we have a function that just calls `isPerfectSquare`

**once**, it would be fine. The function could declare a return type of `Maybe`

Bool, and call `isPerfectSquare`

, and return what `isPerfectSquare`

returns, which is a `Maybe`

Bool. All’s fine and dandy. Things got annoying the moment we sequenced calls to `isPerfectSquare`

. In other words, when we want to act on the value of `isPerfectSquare`

, and `maybe`

call `isPerfectSquare`

again, or even another function that will take the value we extracted and act on it, that all that special case and extraction stuff occurred. Therein lies a key point: when we **sequence** these calls, things get messy.

** Key idea: **

Hence, whenever we have a context that we build up or assign, because we want to **represent** something, it means we have to deal with the context. Since every context is different (represents different things), the **manner** in which to:

- Handle special cases
- Extract the value

is going to be different.

There are two ways we can handle (a) and (b). We can make the programmer (who uses the type) responsible for value extraction and special case handling. Or, we can make the type creator responsible for it. Of course, they may be the same person, but that’s the fundamentals of abstraction/generalization anyway, right?

What we’ve seen above is making the programmer (who uses the type) responsible for it. He had to write code that checks for the special case (Nothing) and the value extraction (`True`

and `False`

). Not only is that a pain (lots of typing!), but it also distracts away from the real thing the code is doing (check for power of 4 and power of 8). It’s hard to read, annoying to debug, and so on. Basically, it violates the principle of abstraction.

So, how do we make the type creator responsible for all that? Guess what? We abstract the process of **sequencing**, and define how (a) values can be extracted, and (b) special cases can be handled, when such functions (Integer -> `Maybe`

Bool, in this example) are **sequenced**. It’s very important to note that the idea of abstraction here acts on **sequenced** functions.

Hence, we define a special operator, which we call `>>=`

, which represents sequencing. Hence, for the above example, when dealing with `Maybe`

, we define `>>=`

for `Maybe`

as follows:

```
(>>=) :: `Maybe` a -> (a -> `Maybe` b) -> `Maybe` b
(>>=) Nothing f = Nothing
(>>=) Just x f = f x
```

** Responsibilities of `>>=`

**

The bind operator `>>=`

has a few jobs to do. As alluded to above, it certainly has to define how to extract the value, and handle special cases. However, more generally, this is what is has to do or can do:

- Handle special cases
- Extract the value
- Modify the value (if it wants to)
- Find a way to meaningfully interact with function f

So, the operator takes a `Maybe`

type value, and a function, and somehow feeds the value to the function. The value is simple: It’s just that – Just 16, Just 4, Nothing. The function f is interesting. It represents the function(s) that we want to use in sequence. A possible function, in our example, is `isPerfectSquare`

. Note that this is **not** limited to sequencing the same function.

Hence, the idea is this:

```
isBigAndPerfectSquare :: Integer -> `Maybe` Integer
isBigAndPerfectSquare n =
Just n >>= `isPerfectSquare` >>= isBigNumber
isBigAndPerfectSquareAndMagic :: Integer -> `Maybe` Integer
isBigAndPerfectSquareAndMagic n =
Just n >>= `isPerfectSquare` >>= isBigNumber >>= isMagicNumber
```

What a **massive** reduction in code and improvement in readability. All that value extraction (Just x) and special cases (Nothing) were **all** taken care of by the beautiful `>>=`

operator. Sequencing is now easy, and clean.

Notice also that we “fed” Just n into the various functions, and not plain n. That’s because the `>>=`

operator abstracts away the unwrapping (value extraction) process, and it needs something that is wrapped to unwrap. The type signature of `>>=`

’s first parameter confirms this. It’s a `Maybe`

a. Hence, we have to wrap our first n. This is good, if you think about it. n should exist with it’s context, not floating around. The only time it “floats” as a plain value, is when it’s inside a function, such as `isPerfectSquare`

and isBigNumber. At **all** other times, it is safely wrapped in it’s data type – `Maybe`

. That keeps things clean.

However, having the programmer deal with wrapping things up is all fine and dandy, but would it be nice if there was a generic way to just wrap things up, no matter what data type the programmer is using? Would it be possible to make the type creator responsible for that aspect as well? Turns out there is. And since we’re making the type creator responsible for more and more of these abstraction stuff, `maybe`

all of this behaviour can be made part of a typeclass, and again, indeed it is. That’s your monad typeclass, which we are ready to introduce.

```
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
(>>) :: m a -> m b -> m b
(>>) m m&#039; = m >>= _ -> m&#039;
fail :: String -> m a
```

First thing to note: The return types of all the functions defined in the typeclass are monads. We’ve already introduced `>>=`

, and the meaning here is exactly the same (intuitively at least, since we haven’t formally introduced it) as what we’ve been doing above. Basically, it takes a monadic value (our Just x, above), and feeds it into a function that expects the actual value (the inner value, or x, in our example), does something to the value, and spits out yet another monadic value.

Hence, `Just x >>= isPerfectSquare`

feeds `Just n`

into `isPerfectSquare`

, and the definition for `>>=`

results in the extraction of x and calling of `isPerfectSquare x`

(remember when we made the type creator do all that extraction work and special case handling work?). `isPerfectSquare x`

produced either a `Nothing`

or a `Just y`

, which is of our monadic type. The type creator **must** define `>>=`

. In other words, what it means to extract and handle the edge cases (yes, again! this is important!)

Note that `>>`

is a convenience function, which does exactly what >>= does, but it doesn’t care about the input. It just spits out the second parameter, a monadic value, `m'`

.

`fail`

is a function, which takes an error string and gives a monad, which represents failure, possibly including the given error string. We’ll ignore this because we hardly define it.

Hence, if we wanted to make `Maybe`

a monad (it already is), then we would do the following:

```
instance Monad `Maybe` where
return x = Just x
Nothing >>= f = Nothing
Just x >>= f = f x
```

We should, by now, understand how and why `>>=`

is defined. Return is where we said we wanted to make the type creator responsible for wrapping something into a type (a monad). In the case of `Maybe`

, the simplest way to wrap something (actually the only way, for `Maybe`

) is to put it into a `Just`

. `Nothing`

is not a good idea, because `Nothing`

means failure. If we want to wrap the value 4, we don’t want to generate failure, we want `Just 4`

.

Hence, writing monadically, we should do this:

```
isBigAndPerfectSquare :: Integer -> `Maybe` Integer
isBigAndPerfectSquare n =
return n >>= `isPerfectSquare` >>= isBigNumber
isBigAndPerfectSquareAndMagic :: Integer -> `Maybe` Integer
isBigAndPerfectSquareAndMagic n =
return n >>= `isPerfectSquare` >>= isBigNumber >>= isMagicNumber
```

### Creating Order Of Evaluation

Remember that in a lazily evaluated language like Haskell, there is no guarantee if and when expressions (or computations) are evaluated. For instance, given the following code:

```
myFunction :: Int -> Bool
myFunction x = if x > 100 then `True` else `False`
```

We have no guarantee if and when `True`

, or `False`

, will be evaluated. Everything is lazy. Similarly:

```
myFunction :: Integer -> [Integer]
myFunction x = take 10 [x..]
```

does not generate the entire list, only the first ten elements. In short, we cannot determine when evaluation occurs.

However, while this has a **lot** of benefits (performance, conciseness, infinitity, etc), when dealing with the real, external world, many times evaluation order matters, and matters a lot. For instance, it matters when writing to a file, and reading from one. Same with databases (they can be viewed as fancy files). And so on. Hence, we need a way to guarantee that the order of evaluation can be predictable when we need it to be predictable.

How can we do this in a purely functional, lazily evaluated language? By the observation that we can force evaluation order by ensuring that a computations X require a computation `Y`

to compute. Hence, `Y`

must evaluate before `X`

.

If we express `X`

and `Y`

as anonymous functions, we can do something like this:

```
((y -> y / ((x -> x + 1) 3)) 12)
```

This guarantees that the `(x+1)`

function must run first, because the `(y/x)`

function relies on the computed result (the value) of the `(x+1)`

function for it to be evaluated. Hence `(x+1)`

returns 4, given its argument, which is fed to the `(y/x)`

function to becomes `(y/4)`

. the `(y/4)`

function takes the parameter 12, and computes the result of 3.

Note how similar this is to assignment in imperative languages! In fact, this is the **exact** equivalent of the imperative code:

```
x = 3
x = x + 1
y = 12
y = y / x
```

Imperative languages determine order of execution simply by order of listing. The compiler **must** generate code that evaluates those 4 statements in that order (optimizing compilers can rearrange, but the meaning must not change). Since declarative languages cannot have such multi-statement “programs”, and can only consist of functions which consist of expressions which evaluate, we then observe that execution “flow” can be simply expressed by a bunch of interdependent nested functions, exactly as above. In essence, we are forcing the compiler (or the evaluator) to evaluate the `(x+1)`

computation before the `(y/x)`

function, because we created a scenario in which it simply had no choice but to do so, no matter how lazy it wanted to be.

How does this relate to monads? Or rather, how do monads help to create order of evaluation in a nice and simple way? Make the following observation:

This function (exactly the same as above):

```
isBigAndPerfectSquareAndMagic :: Integer -> `Maybe` Integer
isBigAndPerfectSquareAndMagic n =
return n >>= `isPerfectSquare` >>= isBigNumber >>= isMagicNumber
```

is computationally identical to this function:

```
isBigAndPerfectSquareAndMagic n =
return n >>= (x ->
`isPerfectSquare` x >>= (y ->
isBigNumber y >>= (z ->
isMagicNumber z)))
```

The **only** difference is that now the “fed” and “unpacked” values (from the contextualized monadic value) are given nice names (x, y and z), and the functions are called with those names, instead of being implicit. To be obvious:

```
return n >>= `isPerfectSquare`
==> Just n >>= `isPerfectSquare` -- by definition of return
==> `isPerfectSquare` n -- by defintion of >>=
```

which is exactly the same as:

```
return n >>= (x -> `isPerfectSquare` x)
==> Just n >>= (x -> `isPerfectSquare` x) -- by definition of return
==> (x -> `isPerfectSquare` x) n -- by definition of >>=
==> `isPerfectSquare` n -- by function application
```

Hence, when we sequence expressions using monads, via >>=, we are in effect, guaranteeing their order of evaluation! Hence:

```
return n >>= (x ->
`isPerfectSquare` x >>= (y ->
isBigNumber y >>= (z ->
isMagicNumber z)))
```

is identical to the imperative code:

```
x = construct(n)
y = `isPerfectSquare` x
z = isBigNumber y
return(z)
```

neglecting the `Maybe`

stuff, and construct means the monad’s `return`

, and return means the imperative style “return” of actually returning a value from a function explicitly.

Since we now have order of evaluation guarantee, we can essentially “write” imperative looking code, in Haskell! Now this is **not** going to break referential transparency, because at this point, we are just forcing order of evaluation, not dealing with state, or variables. In other words, it would be nice if we could write the above Haskell code something as follows, right?

```
x = return n
y = `isPerfectSquare` x
z = isBigNumber y
isMagicNumber z
```

Since the `=`

operator in Haskell already has a meaning, we give it another operator instead. How about:

```
x <- return n
y <- `isPerfectSquare` x
z <- isBigNumber y
isMagicNumber z
```

Looks good. Haskell provides exactly that in its do-notation, as in the following real Haskell code:

```
isBigAndPerfectSquareAndMagic n = do
x <- return n
y <- `isPerfectSquare` x
z <- isBigNumber y
isMagicNumber z
```

Wonderful looking syntax. It’s just syntatic sugar, so it really and precisely means the previous looking code with all the lambda expressions (anonymous functions). One thing to note, is that the last expression in the do-notation cannot have an “assignment” type thing. That’s because, if you translate the original code with all the lambda expressions, you realize that the last lambda is the expression to be evaluated **and** returned. Making an assignment there does not make sense, and is not permitted. If you try, Haskell tells you this:

The last statement in a ‘do’ construct must be an expression.

So what if we want to return something else? Say, for some weird reason, we want isBigAndPerfectSquareAndMagic to always return 1000. Then we do this:

```
isBigAndPerfectSquareAndMagic n = do
x <- return n
y <- `isPerfectSquare` x
z <- isBigNumber y
w <- isMagicNumber z
return 1000
```

Of course! We use return to pack up 1000 as a monadic value (Just 1000, in this case), and since it’s the last expression, we return it. See the complete abstraction over the monad (`Maybe`

, again, in this case) at work here?

And of course, all this is possible only because `Maybe`

is a monad, which has the wonderful >>= operator implemented by the conscientious! With >>=, we get:

**Properties of Monads:**

- Abstraction from context (unpacking, special cases)
- Guaranteed order of evaluation

That’s the fundamental property of monads (there’s loads more).