Haskell and Category Theory by Example

Category theory becomes useful in programming when it stops being a vocabulary test and starts being a way to notice shape. Haskell is a particularly good place to practice that noticing, because many ordinary library abstractions are already categorical: Functor, Applicative, Monad, Foldable, Traversable, and the less famous but very clarifying ideas of natural transformation, representability, and adjunction. The point of this note is not to prove that Haskell “is” category theory. The point is more modest and more useful: to show how a few categorical ideas explain why common Haskell APIs feel inevitable. Read this beside the Yoneda note if you want the pure mathematical statement first. This post leans toward code and design pressure.

We will work with one recurring principle: if an operation is parametric, it cannot inspect the values it is polymorphic over. That makes many Haskell functions behave like categorical maps. A function with type forall a. f a -> g a has to transform the container shape while respecting every possible choice of a; in category language, it is usually a natural transformation. A function with type fmap :: (a -> b) -> f a -> f b has to transport ordinary functions through a context; in category language, it is a functor action. These are not metaphors. They are constraints with consequences.

The Small Category Inside Haskell

The usual first approximation is the category Hask. Its objects are Haskell types, and its morphisms are Haskell functions:

A, B, C \in \mathrm{Ob}(\mathbf{Hask}), \qquad f : A \to B.

Composition is ordinary function composition, and the identity morphism is id. The two category laws say that composition is associative and identities do nothing:

h \circ (g \circ f) = (h \circ g) \circ f, \qquad f \circ \mathrm{id} = f = \mathrm{id} \circ f.

In Haskell:

1
idLawLeft :: (a -> b) -> a -> b
2
idLawLeft f = f . id
3

4
idLawRight :: (a -> b) -> a -> b
5
idLawRight f = id . f
6

7
assoc :: (c -> d) -> (b -> c) -> (a -> b) -> a -> d
8
assoc h g f = h . (g . f)

This is already useful, but it is also already a lie. Real Haskell has bottoms, partial functions, seq, runtime exceptions, and other details that make the category less tidy than the blackboard version. For practical library design, however, the approximation is good enough when we restrict attention to total, parametric code. This is the same kind of useful limit discussed in Keeping a Small Surface. The model is valuable because it is small enough to reason with.

The important habit is to read types as promises. A value of type a -> b promises to turn any a into a b. A value of type forall a. a -> a promises to work for every a, and parametricity leaves it essentially no choice but id. The more polymorphic a type is, the fewer implementations it admits. Category theory gives names to many of those narrow passages.

Functors: Mapping Without Opening the Box

In category theory, a functor maps objects to objects and morphisms to morphisms, preserving identities and composition. In Haskell, a Functor maps types to types and functions to functions:

1
class Functor f where
2
  fmap :: (a -> b) -> f a -> f b

The laws are:

\mathrm{fmap}\ \mathrm{id} = \mathrm{id}, \qquad \mathrm{fmap}\ (g \circ f) = \mathrm{fmap}\ g \circ \mathrm{fmap}\ f.

They say that fmap changes values without changing the structure more than the function demands. A list functor maps over each element while preserving length and order. A Maybe functor maps over the present value while preserving absence. A tree functor maps at the leaves while preserving branching.

1
data Tree a
2
  = Leaf a
3
  | Branch (Tree a) (Tree a)
4
  deriving (Eq, Show)
5

6
instance Functor Tree where
7
  fmap f (Leaf a) = Leaf (f a)
8
  fmap f (Branch left right) =
9
    Branch (fmap f left) (fmap f right)

The laws are not decorative. Consider a strange instance:

1
data Pairish a = Pairish a a
2
  deriving (Eq, Show)
3

4
instance Functor Pairish where
5
  fmap f (Pairish x y) = Pairish (f y) (f x)

This instance preserves identity, but it breaks composition. Mapping f and then g swaps twice, while mapping g . f swaps once. That makes the instance surprising in exactly the way categorical lawbreaking is surprising: the shape is not being transported coherently. A law is a user-interface promise. If a type class instance violates the law, callers cannot safely use equational reasoning.

Functors become more interesting when the structure itself carries information. For Reader r, the context is an environment:

1
newtype Reader r a = Reader { runReader :: r -> a }
2

3
instance Functor (Reader r) where
4
  fmap f (Reader readA) =
5
    Reader (\r -> f (readA r))

The categorical action is post-composition. If Reader r a is isomorphic to r -> a, then fmap f turns r -> a into r -> b by composing with f:

\mathrm{fmap}\ f\ h = f \circ h.

That tiny equation explains why Reader is so predictable. It cannot invent an r; it can only wait for one. It cannot inspect a except through the function it was given. It can only compose.

Natural Transformations: Changing Containers Uniformly

A natural transformation changes one functor into another without looking at the element type. In Haskell we can write:

1
{-# LANGUAGE RankNTypes #-}
2

3
type Natural f g = forall a. f a -> g a

Examples are common:

1
maybeToList :: Natural Maybe []
2
maybeToList Nothing = []
3
maybeToList (Just a) = [a]
4

5
listToMaybe :: Natural [] Maybe
6
listToMaybe [] = Nothing
7
listToMaybe (a : _) = Just a

The naturality condition says that transforming and then mapping is the same as mapping and then transforming:

gmap\ h \circ \alpha_A = \alpha_B \circ fmap\ h \qquad \text{for every } h : A \to B.

For maybeToList, the square says:

1
fmap h (maybeToList x) == maybeToList (fmap h x)

If x is Nothing, both sides are []. If x is Just a, both sides are [h a]. The function is natural because it changes shape in a way that does not depend on the hidden type a.

Here is a tempting but impossible function:

1
impossible :: forall a. Maybe a -> a
2
impossible Nothing = -- no value of type a exists here
3
impossible (Just a) = a

Parametricity blocks the missing branch. A function polymorphic in a cannot manufacture an arbitrary a. That is one reason natural transformations are a useful design tool: they distinguish transformations that are structurally available from transformations that require extra information. In practice this is why APIs often ask for a default value, a monoid, or a continuation. The missing structure has to come from somewhere.

Yoneda as an Optimization of Attention

Yoneda often sounds mysterious because its statement is compact:

\operatorname{Nat}(\operatorname{Hom}_{\mathcal{C}}(A, -), F) \cong F(A).

In Haskell-flavored form, one version says:

\forall x.\ (a \to x) \to f x \cong f a.

The left side looks more powerful: it can produce an f x for every result type x, as long as you provide a function from a to x. But parametricity makes it less powerful than it looks. The only reusable seed it can have is an f a, and the only thing it can do with a -> x is fmap it.

1
{-# LANGUAGE RankNTypes #-}
2

3
newtype Yoneda f a =
4
  Yoneda { runYoneda :: forall x. (a -> x) -> f x }
5

6
liftYoneda :: Functor f => f a -> Yoneda f a
7
liftYoneda fa = Yoneda (\k -> fmap k fa)
8

9
lowerYoneda :: Yoneda f a -> f a
10
lowerYoneda (Yoneda y) = y id

The two directions are inverse up to the functor laws:

\mathrm{lowerYoneda}(\mathrm{liftYoneda}(fa)) = fa.

and

\mathrm{liftYoneda}(\mathrm{lowerYoneda}(y)) = y.

The second equality is less obvious. Expanding it gives:

\mathrm{liftYoneda}(y\ \mathrm{id})\ k = fmap\ k\ (y\ \mathrm{id}) = y\ k.

The last step is naturality. The polymorphic value y cannot inspect k except by applying it in the functorial way.

Why should a programmer care? Because Yoneda can reassociate repeated fmaps. A chain like

1
fmap h (fmap g (fmap f xs))

can become one composed function:

1
fmap (h . g . f) xs

For lists that is a familiar fusion idea. For other functors, especially tree-like or effect-describing structures, the same idea can reduce intermediate traversal. Yoneda is not just a theorem about abstract categories; it is a theorem about where work can be delayed.

For a more proof-oriented version, see the map in the Yoneda proof. Here we are using the same correspondence as a programming pattern.

Representable Functors and Indexing

A representable functor is one that is isomorphic to a function functor:

F(X) \cong R \to X.

In Haskell, representability says that a container has a fixed set of positions. To get an a, provide an index. To build the container, provide a value for every index.

1
{-# LANGUAGE TypeFamilies #-}
2

3
class Functor f => Representable f where
4
  type Rep f
5
  tabulate :: (Rep f -> a) -> f a
6
  index :: f a -> Rep f -> a

A pair is representable by a boolean index:

1
data Two a = Two a a
2
  deriving (Eq, Show)
3

4
instance Functor Two where
5
  fmap f (Two left right) = Two (f left) (f right)
6

7
instance Representable Two where
8
  type Rep Two = Bool
9

10
  tabulate f = Two (f False) (f True)
11

12
  index (Two left _) False = left
13
  index (Two _ right) True = right

The laws say that indexing after tabulating returns the original function, and tabulating after indexing returns the original container:

\mathrm{index}\ (\mathrm{tabulate}\ f) = f, \qquad \mathrm{tabulate}\ (\mathrm{index}\ xs) = xs.

This is a precise way to say that Two a is not “like” Bool -> a; it has the same information. That matters when you design APIs. If your data type is representable, you can offer random access, zipping, tabulation, and total lookup. If it is not representable, those operations may require defaults or partiality.

The contrast with lists is instructive. A list is not representable by Int because not every integer is a valid index. You can make index :: [a] -> Int -> a, but it is partial. You can make index :: [a] -> Int -> Maybe a, but then the result is not just a. Category theory often forces exactly this kind of honesty.

Applicative as Monoidal Structure

Functor lets us map one value inside a context. Applicative lets us combine independent contextual values:

1
class Functor f => Applicative f where
2
  pure :: a -> f a
3
  (<*>) :: f (a -> b) -> f a -> f b

One categorical reading is that an applicative functor is a lax monoidal functor. In practical terms, it knows how to embed a pure value and how to combine contexts:

1 \to F(1), \qquad F(A) \times F(B) \to F(A \times B).

For Maybe, combination fails if either side is absent:

1
pairMaybe :: Maybe a -> Maybe b -> Maybe (a, b)
2
pairMaybe ma mb = (,) <$> ma <*> mb

For lists, the standard applicative gives all combinations:

1
pairs :: [a] -> [b] -> [(a, b)]
2
pairs as bs = (,) <$> as <*> bs

For Reader r, the applicative shares one environment:

1
askBoth :: Reader r a -> Reader r b -> Reader r (a, b)
2
askBoth ra rb =
3
  Reader (\r -> (runReader ra r, runReader rb r))

This is the monoidal idea made operational. Reader r can combine computations because both computations can be run against the same r. It does not sequence effects. It aligns independent structure.

Monads and Kleisli Composition

A monad adds a way to sequence dependent computations:

1
class Applicative m => Monad m where
2
  (>>=) :: m a -> (a -> m b) -> m b

Categorically, a monad on a category is an endofunctor $T$ with two natural transformations:

\eta : I \Rightarrow T, \qquad \mu : T \circ T \Rightarrow T.

In Haskell:

1
return :: a -> m a
2
join :: m (m a) -> m a

Bind can be defined from fmap and join:

1
bindFromJoin :: Monad m => m a -> (a -> m b) -> m b
2
bindFromJoin ma f = join (fmap f ma)

The key new power is dependency. In Applicative, the shape of the computation is known before any result is inspected. In Monad, the next computation can depend on the previous value.

1
safeReciprocal :: Double -> Maybe Double
2
safeReciprocal 0 = Nothing
3
safeReciprocal x = Just (1 / x)
4

5
safeRoot :: Double -> Maybe Double
6
safeRoot x
7
  | x < 0 = Nothing
8
  | otherwise = Just (sqrt x)
9

10
pipeline :: Double -> Maybe Double
11
pipeline x = do
12
  r <- safeReciprocal x
13
  safeRoot r

The categorical setting for monadic sequencing is the Kleisli category. Its morphisms from A to B are functions:

A \to T(B).

Composition is not ordinary (.); it is Kleisli composition:

1
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c
2
f >=> g = \a -> f a >>= g

The monad laws ensure that this composition is associative and that return is identity. That is the category-theoretic reason the do notation can be rearranged predictably.

Adjunctions: Free and Forgetful

An adjunction is a pair of functors that relate maps in two categories:

\mathbf{D}(F C, D) \cong \mathbf{C}(C, G D).

The isomorphism is natural in both $C$ and $D$ . In Haskell, one familiar shadow is the relationship between free structures and forgetful views. A free monoid on a set of generators is a list. A function from a list into a monoid is determined by what it does to each generator.

1
foldMap :: Monoid m => (a -> m) -> [a] -> m
2
foldMap f = foldr (\a acc -> f a <> acc) mempty

The universal property says:

\mathrm{MonoidHom}([A], M) \cong \mathrm{Set}(A, U M).

Read this as: to define a monoid homomorphism out of the free monoid [A], it is enough to define a plain function from generators A into the underlying set of M. The list machinery supplies the rest.

For example:

1
newtype SumInt = SumInt { getSumInt :: Int }
2
  deriving (Eq, Show)
3

4
instance Semigroup SumInt where
5
  SumInt x <> SumInt y = SumInt (x + y)
6

7
instance Monoid SumInt where
8
  mempty = SumInt 0
9

10
countTrue :: [Bool] -> SumInt
11
countTrue = foldMap (\b -> if b then SumInt 1 else SumInt 0)

We supplied a generator map:

1
Bool -> SumInt

and foldMap extended it to a monoid homomorphism:

1
[Bool] -> SumInt

This is not merely an implementation trick. It is the universal property of lists as free monoids. This is a good debugging lens: if an API asks you how to interpret one generator and then handles the whole structure, you may be looking at a free construction.

Here is a deliberately long display equation to exercise mobile math overflow while also stating the idea:

\operatorname{MonoidHom}\bigl(\operatorname{FreeMonoid}(A), M\bigr) \cong \operatorname{Set}\bigl(A, U(M)\bigr) \cong \left\{\, f : A \to U(M) \mid \exists!\ \widehat{f} : \operatorname{FreeMonoid}(A) \to M,\ \widehat{f}([a_1,\ldots,a_n]) = f(a_1) \cdots f(a_n) \,\right\}.

If the overflow container is working, this formula should scroll rather than breaking the page on narrow screens.

A Small Design Example

Suppose we want to describe a configuration language. We can start with a syntax tree:

1
data ConfigF next
2
  = SetText String next
3
  | SetFlag Bool next
4
  | Stop
5
  deriving (Eq, Show, Functor)

This is a functor in the recursive position. A full syntax tree could be the free monad over ConfigF, but even without introducing a Free type, the shape tells us something. fmap changes the continuation positions while preserving the instructions. The instruction functor is separate from any interpreter.

1
interpretStep :: ConfigF next -> IO (Maybe next)
2
interpretStep Stop = pure Nothing
3
interpretStep (SetText text next) = do
4
  putStrLn ("text = " <> text)
5
  pure (Just next)
6
interpretStep (SetFlag flag next) = do
7
  putStrLn ("flag = " <> show flag)
8
  pure (Just next)

The categorical payoff is modularity. Syntax is a functor. Interpreters are natural-ish transformations into effectful carriers. Free constructions give you sequencing. Folds give you evaluation. None of these names are required to write the program, but they help keep the design from becoming accidental.

This also explains why small, self-contained examples are worth keeping as a blog codebase grows. Tiny cases reveal whether the surface behaves coherently before the system grows complicated.

Equational Reasoning as Maintenance

Haskell and category theory share a taste for equations. That taste is not ornamental; it is a maintenance strategy. When an abstraction is lawful, you can replace equals by equals. You can refactor a sequence of fmaps into one fmap. You can push maybeToList past an fmap. You can replace a function out of a free monoid by a generator interpretation plus foldMap.

For example:

1
maybeToList (fmap h value)
2
==
3
fmap h (maybeToList value)

That equality is the naturality of maybeToList. It is also a local refactoring rule. Similarly:

1
fmap h (fmap g (fmap f tree))
2
==
3
fmap (h . g . f) tree

That equality is a functor law. It is also a performance hint and a readability hint.

The most useful category theory for programmers is often this modest: it gives names to the equations that make code movable. When code is movable, it is easier to maintain. When it is easier to maintain, the ordinary path through the code becomes more trustworthy, which is exactly the point made in the maintenance note.

What To Remember

If you remember only one thing, remember that categorical structure is a discipline of preservation. A functor preserves identity and composition. A natural transformation preserves mapping. A representable functor preserves the idea of fixed positions. A monad preserves category structure in a Kleisli world. An adjunction preserves the same information across two different ways of speaking.

The Haskell types are not just signatures. They are contracts. The equations are not just proofs. They are refactoring permissions. And the best way to learn them is to use small examples until the shapes become visible.

ReferenceA proof-oriented companion to YonedaA compact proof of Yoneda's Lemma with KaTeX-rendered notation./blog/a-proof-of-yonedas-lemma/ ReferenceWhy the equations matter for maintenanceMaintenance gets easier when the ordinary path is also the documented path./blog/notes-on-maintenance/