You have probably already read claims like “async computations form a Monad”. Did you ever ask yourself why they would? Here are the answers for you.

Have you ever seen that kind of device in chemical and physics labs – a glass wall with two holes in it, which have a pair of thick gloves attached to them for reaching inside? A person puts their hands into these gloves, reaches into a contained and sealed compartment (environment), and is able to perform some tasks, like moving and filling things inside.

This is essentially how return and map work – by defining and then using only them, we establish an impenetrable abstraction barrier – similar, at least in principle, to that “glass wall with gloves”.

At the more concrete level of programming language semantics, return “lifts” (transforms) values and map “lifts” ordinary functions (on values) into a given “context” (container). Partially applied, map yields a function on whole containers, preserving the structure (whatever it may be).

With map we can sort of reach into that “container” without breaking the abstraction barrier – only map “knows” the implementation details of the container. In modern programming languages we define map (and flatMap) as methods of a class or as parts of an Abstract Data Type.
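A quick GHCi illustration, using the standard return and fmap (for lists, fmap is just map):

> return 3 :: [Int]
[3]
> return 3 :: Maybe Int
Just 3
> fmap (+1) (Just 3)
Just 4
> fmap (+1) [1,2,3]
[2,3,4]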

More importantly, since map returns a “whole” of the same kind, calls to map can be chained (composed using nesting), just like any other function composition.

There is a “law”, which states that instead of composing maps, we can equivalently compose functions (on elements) to get the same effect. \[map(g . f) = map g . map f\]

So, in terms of programming language semantics, we have a proper ADT of higher-order functions. We can generalize even further, to a type-class or a trait.

What do we actually do wearing these “gloves”? We compose functions (operators) which are “out there”, behind the abstraction barrier. So they had better form Monoids.
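As a hint of why: functions on a fixed type already form a monoid under composition, with id as the neutral element – in Haskell this is the Endo monoid from the standard Data.Monoid:

> import Data.Monoid (Endo(..))
> appEndo (Endo (+1) <> Endo (*2)) 3
7
> appEndo (Endo (+1) <> mempty) 3
4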

For that we need a composition DSL. The ordinary (.) – a DSL of just one function, defined by nesting – would not work across abstraction barriers. Something that uses map and return would.

The DSL, which is also a proper ADT (or even a trait or a type-class), has to define composition using nesting. Nesting of functions is the main principle (it establishes a particular order of evaluation).

The Lambda Calculus demonstrated that bindings, functions and function composition are enough for everything, so such DSLs should be good enough too.

Notice that, just as with pure functional languages (which technically are systems of logic), we describe computations declaratively – what shall be done eventually, by some runtime system. We never “see” or scrutinize any actual value; we only declare: when this pattern matches, do this; if that matches, do that; call this pure function and then call that one.

This is exactly how we describe what has to be done behind that abstraction barrier – we map this function, and then map that one, and in general, describe what has to be done there, behind the barrier.

Ok, enough metaphors, let’s see how.

Some notions from mathematical theories

Ordinary functions (which can be thought of as a particular kind of relation) can be viewed as a “table” (a set of ordered pairs).

The categorical view is that there is an “arrow” between the two elements of each pair. Such an arrow is a separate conceptual entity.

All the arrows together are called a “family” (not a set). You know those diagrams of two bubbles with dots and a bunch of arrows between them – this is exactly how it is.

These abstract notions allow us to talk about both individual arrows and “the whole” (family). So, they say things like: some “objects” \(x\) and \(y\) and an “arrow” \(f : x \rightarrow y\) between them.

Then some abstractions can be defined in terms of individual arrows and their additional required properties.

\[\forall x\ \exists\, unit_x : x \rightarrow M x, \quad map\ f \cdot unit_x = unit_y \cdot f \quad \forall x, y,\ f : x \rightarrow y\]
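For lists, this naturality condition can be checked concretely in GHCi (with unit defined inline, just for illustration):

> unit x = [x]
> (map (+1) . unit) 3
[4]
> (unit . (+1)) 3
[4]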

So in this view, unit is a family of individual arrows from each \(x\) to \(M x\) – all of them – and join is a family of arrows from \(M (M x)\) to \(M x\) (but not back to \(x\)); unit is not a bijection, we just cannot go back.

The fundamental “law” is the one relating map and unit, while join is just a hack.

Both unit and map form an abstraction boundary, even in the mathematical view – an abstract cell-membrane, so to speak.

This is universally a one-way transformation. Once values have “gone through the membrane” and crossed the abstraction boundary (being “lifted”, or transformed), there is no way back.

This is why it is convenient to think of these as families of arrows, one for each \(x\).

At the level of CS, only map knows “what is behind the abstraction barrier”. At an abstract level, it “lifts” (passes through the membrane) functions (“arrows”), and transforms (“maps”) a “whole” into a “whole” by naturally applying the function to each \(x\) (preserving the structure).

In our down-to-Earth CS view, a new “whole” is created every time as a result.

This is what we need to know about these general abstractions – a Functor and, later, a Monad.

The most important thing is that just this much (only these abstractions) is enough (necessary and sufficient).

Basically, when they say that all Real numbers are also Complex numbers, what actually happens is that Real numbers can be lifted into another number system, and some operations can be lifted there too. (This takes some time to sink in.)
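A sketch of that in Haskell, using the standard Data.Complex: a real number is lifted with the :+ constructor (zero imaginary part), and the numeric operations are already lifted by the Num instance of Complex:

> import Data.Complex
> z = 3 :+ 0 :: Complex Double
> z * z
9.0 :+ 0.0
> z + (4 :+ 0)
7.0 :+ 0.0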

Again, the point is that only these are enough for everything.

Mathematicians do not talk about abstraction barriers; only we, programmers, do. We have to be concerned with the actual representations and implementations of these “containers” and functions.

For us everything has an implicit particular order (in which it has been written or stored).

Functor

Conceptually, an \(f : a \rightarrow b\) is a single arrow. A \(map f\) defines a whole family of (all individual) arrows. It is that simple.

\[map : (x \rightarrow y) \rightarrow (M x \rightarrow M y)\] Partially applied map yields \(M x \rightarrow M y\) with the same \(M\).

The concept of a Functor is defined in terms of the identity function, function composition (as nesting) and the map function, which satisfies the following “laws”, \[map id = id\] \[map (g . f) = map g . map f\] where \[id x = x\] \[(g . f) x = g(f x)\]

Notice that function composition and nesting are “already there” – they are the foundation, indeed.

At the level of actual programming languages, a partially applied map takes a single arrow and transforms it into all the arrows between two sets, so to speak.

a -> b into

a      b
* ---> *
* ---> *
* ---> *

It “maps” whole sets into whole sets (using that function f on elements).

So, a functor is a generalization – an operator that turns “arrows” \(a \rightarrow b\) into “arrows” \(M a \rightarrow M b\).

Mathematicians do not specify what kind of M this is, only that it is the same M. We would say that the whole mathematical structure is preserved.

Partially applying f

> :t map
map :: (a -> b) -> [a] -> [b]
> :t map id
map id :: [b] -> [b]

It takes a whole (a set, a list) and returns a new whole (another list of the same length).

At the level of a PL, mapping id over a list will produce a copy (a new list with the same elements).

Partially applying “the whole set”

> map' xs f = [f x | x <- xs]
> map' [] (+1)
[]
> map' [1,2,3] (+1)
[2,3,4]
> :t map' [1,2,3]
map' [1,2,3] :: Num t => (t -> a) -> [a]

It just needs a single arrow to yield a transformed whole set.

The closure is parameterized by a function on elements but returns a whole set.

Clauses

The structure of a function is isomorphic to the structure (shape) of the sum type it consumes: \[[a] = [\,] \mid a : [a]\]

map f []     = []
map f (x:xs) = f x : map f xs

Abstraction barrier

map is the only one who “knows” the shape (the actual implementation), so there is an abstraction barrier (an abstract interface).

Conceptually, it “lifts” a given function f on elements into a given “context”, and transforms the whole “structure” with it, preserving the shape.

List, on the CS level, is what Set is on the more general level of abstract algebra.

\(map g . map f\) – this is exactly how “chaining of wholes” works.

Monads

An abstract operator M on types together with an ADT (a triple of functions) \[unit : x \rightarrow M x\] \[map : (x \rightarrow y) \rightarrow (M x \rightarrow M y)\] \[join : M (M x) \rightarrow M x\]

Again, for Lists, which have a concrete representation and implementation:

unit x = [x]
map f xs = [f x | x <- xs]
copy xs = [x | x <- xs]
join xss = [x | xs <- xss, x <- xs] -- flatten

Theoretically, unit and join are so-called “natural transformations”, and are viewed as families of arrows. There are “laws” about them too: \[join \cdot unit = id\] \[join \cdot map\ unit = id\] There is a subtlety here: map and join take “wholes” as their arguments, so map unit has to be applied to some \(M x\) and will return an \(M (M x)\), which, when “joined”, will be the same \(M x\) again.
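With the list definitions above in scope, both laws can be checked on a concrete value (a sanity check, not a proof):

> join (unit [1,2,3])
[1,2,3]
> join (map unit [1,2,3])
[1,2,3]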

There are more “laws”. They are not arbitrary – laws are necessary parts of proper definitions. They are constraints on the behavior (the specification) of the functions.

Notice that each “law” is defined using at least two (or all three) of the functions. Those are required relations (invariants) between the functions.

Set comprehensions

A Set-comprehension is a generalized syntax for the abstract mathematical notion of “such that”. This means some relation is “true”. These relations are called qualifiers, and they can be nested (the order matters).

A set-comprehension defines a subset of a set given implicitly in the context.

Nesting of qualifiers is just nesting of functions. Qualifiers are predicates (relations expressed as executable expressions).

List comprehensions (a specialization)

Programmers have List-comprehensions, a specialization with a concrete implementation.

In general a comprehension has the form [t | q] where \(t\) is a term and \(q\) is a qualifier.

A qualifier is either

  • empty (id)
  • a generator x <- u, where \(x\) is a variable and \(u\) is a list-valued term
  • a composition (nesting) of qualifiers p, q
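Composed (nested) qualifiers make the order of traversal explicit – the rightmost generator varies fastest:

> [(x, y) | x <- [1,2], y <- "ab"]
[(1,'a'),(1,'b'),(2,'a'),(2,'b')]

And map itself is the one-generator special case: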
> map' f xs = [ f x | x <- xs ]
> map' (+1) []
[]
> map' (+1) [1,2,3]
[2,3,4]
> :t map'
map' :: (t -> a) -> [t] -> [a]
> :t map' (+1)
map' (+1) :: Num a => [a] -> [a]

So, here the result of applying map is equivalent to selecting all the elements and applying f to each.

In pseudo-code:

map (\x -> t) u = [t | x <- u]

In pseudo-Haskell: map id xs = [x | x <- xs] = xs

> [x | x <- [1,2,3]]
[1,2,3]
> map id [1,2,3]
[1,2,3]

map (g . f) = map g . map f

> map (^2) (map (+1) [1,2,3])
[4,9,16]
> map ((^2) . (+1)) [1,2,3]
[4,9,16]

maps can be nested, but then the argument must be a list of lists

> map (map (+1)) [[1,2,3]]
[[2,3,4]]

This is WHY join (or flatten) is a part of the ADT. We get rid of the unnecessary and redundant structural nesting using the clever hack of individual arrows that go “one level down”.

> import Control.Monad (join)
> join [[1,2,3]]
[1,2,3]
> :t join
join :: Monad m => m (m a) -> m a

And, of course, the famous flatMap is just a composition of join with a partially applied map.

Partially applied to a function f which uses return to lift its result “back into the context” of M, map produces that nested M (M a) structure, which join then flattens back into an M a.
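A minimal sketch for lists (flatMap is an illustrative name here; in Haskell the same thing is (>>=) with the arguments flipped):

> import Control.Monad (join)
> flatMap f = join . map f
> flatMap (\x -> [x, x * 10]) [1,2,3]
[1,10,2,20,3,30]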

It is that simple.

Finally, (>>=) itself, and the desugaring of do-notation into nested lambdas.
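A sketch (bind, pairs and pairs' are illustrative names; the desugaring rule itself is standard Haskell):

import Control.Monad (join)

-- (>>=) for any monad is expressible as join after map:
bind :: Monad m => m a -> (a -> m b) -> m b
bind m f = join (fmap f m)

-- do-notation is sugar for a chain of (>>=) with nested lambdas:
pairs :: [(Int, Int)]
pairs = do
  x <- [1, 2]
  y <- [10, 20]
  return (x, y)

-- ...which desugars to:
pairs' :: [(Int, Int)]
pairs' = [1, 2] >>= \x ->
         [10, 20] >>= \y ->
         return (x, y)

-- both produce [(1,10),(1,20),(2,10),(2,20)]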

Where?

  • Haskell, with do-notation
  • F#
  • Scala 3, with for-comprehension and actual thunks, in libraries such as cats
  • OCaml (rudimentary)
  • Rust (has to be a bunch of traits with lots of inlining)