Introduction to category theory

This is the first entry in my notes on category theory, higher category theory, and, finally, higher structures. The main focus of my notes, especially as the discussion advances, is application in string / M-theory, concluding with an introduction to the study of higher structures in M-theory. We start with basic category theory roughly following the book ‘Category Theory in Context’ by Emily Riehl (online version here), as well as the perspective of a selection of other texts and lectures cited throughout. For the engaged reader, I recommend reviewing the respective pages on nLab for further references.


There is a line by Wilfrid Sellars: ‘The aim of philosophy, abstractly formulated, is to understand how things in the broadest possible sense of the term hang together in the broadest possible sense of the term’. The things we must come to know ‘in the broadest possible sense’ – at its most abstract, a type of conceptual modelling – must in some way be classified such that we may distinguish the type of each thing, the relations between things of similar and dissimilar class, and their particular properties or attributes. For example, think of basic biological nomenclature going back to Aristotle. Another example would be the standard model of particle physics. (For the time being, we will put aside philosophical issues going back to Hegel, Russell, and others, as well as broader debates having to do with process vs. substance metaphysics, and so on.)

From a mathematics and physics point of view, if we take Sellars’ statement seriously, then, at the highest level in the conceptual hierarchy what we begin to contemplate is a way to think about what Peter Smith describes in his notes on category theory as, ‘structured families of structures’. That is to say, we naturally come upon the need for some systematic framework for the study of abstract structures, how we may define a family of such structures, and their interrelation. We take as a starting point in these notes motivation from both foundational mathematics and fundamental physics.

A simple example of a structure is a topological space. Simpler still, take an example from group theory. Any group may be described as a structure: it comprises a collection of objects equipped with a binary operation that obeys well-defined axioms. Now, what of a family of groups? We can of course also define a family of groups with structure-preserving homomorphisms between them (for a review of groups and sets leading up to the basic ideas of category theory, see Chapter 2 in the above notes by Smith). This gives an example of a structured family. The reference to groups is apt because, as we will see later in these notes, a group is classically a monoid in which every element has a (necessarily unique) inverse, and a monoid, as we will review in a future entry, is one of the most basic algebraic examples of a category.

More generally, when looking at a family of structures along with the structure-preserving maps between them, our goal will be to reach an even higher level of abstraction that takes the form of a further structure: i.e., a structure-of-structures. We can then continue this game and ask, what is the interrelation of this structure-of-structures? From this question we will look to climb to another level and speak of operations that map one functor to another in a way that preserves their functorial properties.

When I think of the idea of a category, this picture of increasing generality and of climbing levels of abstraction is often what I have in mind. To use the words of Emily Riehl [1], ‘the purview of category theory is mathematical analogy’. While some give it the description, however affectionately, of ‘abstract nonsense’, I prefer to think of category theory – and, more broadly, the category theoretic perspective – as very much akin to the geologist constructing a topological map containing only vital information. This climbing of levels of abstraction is, in many ways, a simplifying abstraction. What use would it be to perform analysis within the framework of these increasing levels of simplifying abstraction? In foundational mathematics, the motivation is quite clear. In fundamental physics, on the other hand, it may at first seem less obvious. But as we will discuss in these notes, particularly in the context of quantum field theory and string / M-theory, there is quite a lot of motivation to think systematically about structured families of mathematical structures.

What is a category?

One way to approach the idea of a category is to emphasise the primacy of morphisms. In the paradigm view, in contrast to set theory, category theory focuses not on elements but on the relations between objects (i.e., the (homo)morphisms between objects). In this sense, we may approach category theory as a language of composition.

Let us build toward this emphasis on composition in a simple way. Consider some collection of objects A, B, C, D   with a structure preserving morphism f  from A  to B  , another structure preserving morphism g  from B   to C  , and, finally, a structure preserving morphism h  from C  to D  . (In a handwavy way, this is how we motivated the idea of a category in a previous post). In diagrammatic notation we have,

\displaystyle A \ \xrightarrow[]{f} \ B \ \xrightarrow[]{g} \ C \ \xrightarrow[]{h} \ D  .

It is fairly intuitive that we should be able to define a composition of these maps. All we need, as an axiom, is associativity. For example, we may compose f and g such that we obtain a map from A to C . We may write such a composition as g \circ f . Similarly for all the other ways we may compose the maps f, g , and h . This means that we ought to be able to then also compose a map for the entire journey from A to D . Diagrammatically, this means we obtain a single composite arrow:

\displaystyle A \ \xrightarrow[]{h \circ (g \circ f) \ = \ (h \circ g) \circ f} \ D  .

One sees that we can apply the structure preserving map f  followed by the composite g-followed-by-h. Alternatively, we may just as well apply the composite f-followed-by-g and then afterwards apply the map h  . This very basic picture of a collection of objects A,B,C,D  , the maps between them, and how we may invoke the principle of composition for these maps already goes some way toward how we shall formally define a category. One will notice below that we need a bit more than associativity as an axiom, and along with the objects of a category we will speak of morphisms simply as arrows. From now on, if A \in \text{Ob}(\mathcal{C})  we write A \in \mathcal{C}   .
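As a small illustration of the two routes from A to D described above, here is a minimal sketch in Python. The concrete choices of f, g, and h (and of the types playing the roles of A, B, C, D) are hypothetical stand-ins, not anything taken from the references; the point is only that both bracketings give the same map.

```python
from typing import Callable


def compose(g: Callable, f: Callable) -> Callable:
    """Return the composite g after f, i.e. apply f first and then g."""
    return lambda x: g(f(x))


# Hypothetical stand-ins for the morphisms f: A -> B, g: B -> C, h: C -> D.
def f(n: int) -> str:      # A = int, B = str
    return str(n)


def g(s: str) -> int:      # B = str, C = int
    return len(s)


def h(k: int) -> bool:     # C = int, D = bool
    return k % 2 == 0


left = compose(h, compose(g, f))     # f, followed by g, and then h
right = compose(compose(h, g), f)    # f, and then the composite g-followed-by-h

# Both bracketings agree on every sample input from A.
samples = [0, 7, 42, 12345]
assert all(left(a) == right(a) for a in samples)
```

Checking a handful of inputs is of course not a proof; for ordinary functions associativity holds because both sides send x to h(g(f(x))).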

Definition 1. A category \mathcal{C}  consists of a class of objects, and, for every pair of objects A,B \in \mathcal{C}  , a class of morphisms, \text{hom}(A,B)  , satisfying the properties:

  • Each morphism has specified domain and codomain objects. If f  is a morphism with domain A  and codomain B  we write f: A \rightarrow B  .
  • For each A \in \mathcal{C} , there is an identity morphism id_A \in \text{hom}(A,A) such that for every B \in \mathcal{C} we have left-right unit laws:
  1. \displaystyle f \circ id_A = f \ \text{ for all } \ f \in \text{hom}(A,B)
  2. \displaystyle id_A \circ f = f \ \text{ for all } \ f \in \text{hom}(B,A)
  • For any pair of morphisms f,g with codomain of f equal to domain of g , there exists a composite morphism g \circ f . The domain of the composite morphism is equal to the domain of f and the codomain is equal to the codomain of g .

Two axioms must be satisfied:

  • Composition is unital. For any f: A \rightarrow B , the composites id_B \circ f and f \circ id_A are equal to f .
  • Composition is associative. For all A,B,C,D \in \mathcal{C} , f \in \text{hom}(A,B) , g \in \text{hom}(B,C) , and h \in \text{hom}(C, D) , we have h \circ (g \circ f) = (h \circ g) \circ f .
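To make the data and axioms of Definition 1 concrete, here is a minimal sketch in Python of a very small category, with both axioms checked by brute force. The category itself (three objects, arrows f and g, their composite gf, and the identities) and all of the names used are hypothetical choices made for illustration only.

```python
from itertools import product

# Hypothetical finite category: each arrow name maps to (domain, codomain).
arrows = {
    "id_A": ("A", "A"), "id_B": ("B", "B"), "id_C": ("C", "C"),
    "f": ("A", "B"), "g": ("B", "C"), "gf": ("A", "C"),
}


def composable(second: str, first: str) -> bool:
    """The composite of first followed by second exists iff cod(first) == dom(second)."""
    return arrows[first][1] == arrows[second][0]


def compose(second: str, first: str) -> str:
    """Return the name of the composite: first is applied first, then second."""
    if not composable(second, first):
        raise ValueError(f"cannot compose {first} followed by {second}")
    if first.startswith("id_"):
        return second
    if second.startswith("id_"):
        return first
    if (second, first) == ("g", "f"):
        return "gf"
    raise ValueError("no other composable pair exists in this example")


def check_units() -> bool:
    """Unit laws: composing with the identity on either side changes nothing."""
    return all(
        compose(a, f"id_{dom}") == a and compose(f"id_{cod}", a) == a
        for a, (dom, cod) in arrows.items()
    )


def check_associativity() -> bool:
    """Associativity, checked over every composable chain p, then q, then r."""
    for p, q, r in product(arrows, repeat=3):
        if composable(q, p) and composable(r, q):
            if compose(r, compose(q, p)) != compose(compose(r, q), p):
                return False
    return True


assert check_units()
assert check_associativity()
```

Representing arrows by name with an explicit (domain, codomain) pair mirrors the first bullet of the definition: a composite is only attempted when the codomain of the first arrow matches the domain of the second.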

Further remarks may be reviewed in [1, 2, 3]. We emphasise that for any kind of mathematical object there exists a category with objects of that kind and morphisms – i.e., structure-preserving maps denoted as arrows – between them. The objects and arrows of a category are called the data. The objects of a category can be formal entities like functions or relations. In many examples of a category the arrows represent functions, but not every arrow need represent a function. These subtleties will be saved for future discussion.

An important notational point is that one should pay close attention to morphisms. Often categories with the same class of objects – e.g., two different categories whose objects are topological spaces – may be distinguished by their different classes of morphisms. It is helpful to write \text{hom}_{\mathcal{C}}(A,B) or \mathcal{C}(A,B) to denote the morphisms from A to B in the category \mathcal{C} .

Importantly, we speak of ‘classes’ or ‘collections’ of objects and morphisms rather than ‘sets’. One motivation is to avoid confusion when speaking of \text{Set} , which is the category of all sets with functions between sets as morphisms. If a set of objects were required, instead of a class, then we would require a set of all sets. As will be made clear when we reach the discussion on how to consider categories of categories, we may speak of sets of sets but, as Russell’s Paradox implies, there is no set whose elements are ‘all sets’. So we cannot speak of a set of all sets, and the objects of \text{Set} cannot themselves form a set. Likewise, it is conventional when we consider categories of categories to avoid the notion of a category of all categories (see Remark 1.1.5. in [1]). Instead, we restrict to a universe of sets and, in more advanced discussion, we will come to consider categories as universes.

Related to this concern about set-theoretical issues, it is important to note that we work with an extension of the standard Zermelo–Fraenkel axioms of set theory, allowing ‘small’ and ‘large’ sets to be discussed. In category theoretic language, we invoke similar terminology:

Definition 2. A category \mathcal{C}  is finite iff it has overall only a finite number of arrows.

A category \mathcal{C} is small iff it has overall only a ‘set’s worth’ of arrows – i.e. the arrows of \mathcal{C} can be put into one-one correspondence with the members of some set.

A category \mathcal{C} is locally small iff for every pair of \mathcal{C}-objects A,B there is only a ‘set’s worth’ of arrows from A to B , i.e. those arrows can be put into one-one correspondence with the members of some set.

Examples of categories

What follows are a few examples illustrating the variety of mathematical objects that assemble into a category:

  • Set, the category of sets where morphisms are given by ordinary functions, with specified domain and codomain. There is a subtlety here in that the view of Set as the category of all sets becomes paradoxical, so, typically, we limit to a universe of sets (more on this in a separate entry).

Example. In this category the objects are sets, morphisms are functions between sets, and the associativity of the composition law is the associativity of composition of functions.

We may define the category Set (the category of sets) as follows: \text{Ob}(\text{Set}) is the class of all sets, and, for any two sets A,B \in \text{Ob}(\text{Set}) we define \text{hom}(A,B) = \{ f: A \rightarrow B \} as the set of functions from A to B . The composition law is given by the usual composition of functions. Since composition of functions is associative, and every set has an identity function, Set is a category. This ends the example.
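The phrase ‘with specified domain and codomain’ is easy to pass over, so here is a minimal sketch in Python restricted to finite sets. The class name Fn, the helper functions, and the particular sets A, B, C are all hypothetical names introduced for illustration. A morphism carries its domain and codomain as part of its data, and composition is only defined when they match.

```python
from dataclasses import dataclass


@dataclass
class Fn:
    """A function between finite sets with specified domain and codomain."""
    dom: frozenset
    cod: frozenset
    mapping: dict

    def __post_init__(self):
        # A morphism in Set must be total on its domain and land in its codomain.
        if set(self.mapping) != set(self.dom):
            raise ValueError("mapping must be defined on exactly the domain")
        if not set(self.mapping.values()) <= set(self.cod):
            raise ValueError("values must lie in the codomain")

    def __call__(self, x):
        return self.mapping[x]


def identity(s: frozenset) -> Fn:
    """The identity function on the set s."""
    return Fn(s, s, {x: x for x in s})


def compose(g: Fn, f: Fn) -> Fn:
    """The composite of f followed by g; needs cod(f) == dom(g)."""
    if f.cod != g.dom:
        raise ValueError("codomain of f must equal domain of g")
    return Fn(f.dom, g.cod, {x: g(f(x)) for x in f.dom})


A = frozenset({1, 2, 3})
B = frozenset({"a", "b"})
C = frozenset({"x", "y"})

f = Fn(A, B, {1: "a", 2: "b", 3: "a"})
g = Fn(B, C, {"a": "x", "b": "y"})

# Unit laws for g: composing with an identity on either side returns g.
assert compose(g, identity(B)) == g
assert compose(identity(C), g) == g

# Same assignment of values, different codomain: a different morphism.
f_wider = Fn(A, B | {"c"}, {1: "a", 2: "b", 3: "a"})
assert f != f_wider
```

The last check is the design point: two functions with identical values count as distinct morphisms when their codomains differ, which is why the codomain is stored rather than inferred from the image.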

Other categories of note:

  • Grp, the category of groups where morphisms are given by group homomorphisms.
  • Vect_k, the category of vector spaces over some fixed field k, where morphisms are given by linear transformations.
  • Ring, the category with rings as objects and ring homomorphisms as morphisms.
  • Top, the category of topological spaces where morphisms are given by continuous maps.
  • Met, the category with metric spaces as objects and continuous maps as morphisms.
  • Meas, the category with measurable spaces as objects and measurable maps as morphisms.
  • Graph, the category of graphs as objects and graph morphisms (functions carrying vertices to vertices and edges to edges, preserving incidence relations) as morphisms. In the variant DirGraph, objects are directed graphs, whose edges are now depicted as arrows, and morphisms are directed graph morphisms, which must preserve sources and targets.
  • Man, the category of smooth (i.e., infinitely differentiable) manifolds as objects and smooth maps as morphisms.

All of the above examples are concrete categories, whose objects have underlying sets and whose morphisms are functions between these underlying sets (what we have called ‘structure-preserving’ morphisms). We will speak more about concrete categories, including a formal definition, in a later note. For the sake of introduction, it is worth noting that there are also abstract categories. One example is as follows:

BG, the category defined by the group G  (or what we will describe as a monoid in the next entry) with a single object. The elements of G  are morphisms, with each group element representing a distinct endomorphism of the single object. Here composition is given by multiplication. There is an identity element e \in G  that acts as the identity morphism.
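As a minimal sketch of this one-object category, take G to be the integers modulo 4 under addition. This choice of group, and the names used below, are assumptions made purely for illustration. The arrows are the group elements, composition is the group operation, and the axioms of Definition 1 reduce to the group axioms.

```python
from itertools import product

N = 4                        # hypothetical choice: G is the integers mod 4
arrows = list(range(N))      # every group element is an arrow from * to *
identity = 0                 # the identity element e acts as the identity arrow


def compose(b: int, a: int) -> int:
    """The composite of a followed by b, given by the group operation."""
    return (a + b) % N


def inverse(a: int) -> int:
    """The group inverse, which is the inverse arrow in BG."""
    return (-a) % N


# Unit laws: composing with the identity arrow changes nothing.
assert all(compose(a, identity) == a == compose(identity, a) for a in arrows)

# Associativity over all triples of arrows (all are composable: one object).
assert all(
    compose(c, compose(b, a)) == compose(compose(c, b), a)
    for a, b, c in product(arrows, repeat=3)
)

# Every arrow is invertible, which is what separates a group from a monoid.
assert all(compose(inverse(a), a) == identity for a in arrows)
```

Since there is only one object, every pair of arrows is composable, so no domain or codomain bookkeeping is needed here.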

Closing comments

In the next post, we will review some other category definitions, review diagrammatic notation, and discuss in more detail the important role and subtlety of morphisms. In an entry to follow closely after, we will then finally turn our attention to monoids, groupoids, pre-ordered collections, and other related concepts, as well as start discussing examples in string theory.


These notes primarily follow a selection of lectures and texts:

[1] E. Riehl, Category theory in context. Dover Publications, 2016. [online]

[2] S. Mac Lane, Categories for the working mathematician. Springer, 1978. [online]

[3] P. Smith, Category theory: A gentle introduction [online].

[4] J. Baez, Category theory course [lecture notes].