
Bradley's Applied Category Theory booklet

John announced a few weeks ago that Tai-Danae Bradley was working on a booklet from the ACT 2018 conference. She has finished the booklet, titled What is Applied Category Theory?, and it's now live on the arXiv (and on her blog).

Hoping to discuss once I've read and digested it some!

Comments

  • 1.

    Yes, that would be good!

  • 2.

    To help me digest these ideas, I thought I'd write a summary of Tai-Danae Bradley's booklet. I've added my own reactions and tried to link the ideas to what we've discussed in this course.

    Two Themes

    Tai-Danae starts with two basic themes in applied category theory: (1) functorial semantics and (2) compositionality.

    Functorial Semantics

    Modeling natural human language is a good example (and Tai-Danae discusses this example at length). Roughly speaking, language consists of words ("semantics") and ways to combine words into sentences and new words ("grammar"). This distinction can be captured by a functor between two categories!

    $$ F : C \to D $$ The category \(C\) specifies the grammar (how words act and how they can be combined; see compositionality), while the functor assigns meaning to diagrams in \(C\); the category \(D\) represents meanings. In particular, the functor assigns individual words their meanings.

    Another good example, from John's lectures and Seven Sketches in Compositionality, is Chapter 3's database schemas. A database schema is a category presented by a graph: the schema specifies objects (data types) and some morphisms (relations; functions).

    The database schema tells us the grammar of our database. It says what data is required, what actions can be performed, how different types of data are related, etc. But it doesn't say what the data actually is (or more abstractly, how we should interpret the objects and morphisms). We get an actual database by constructing a functor which sends our database schema to the category \(\mathbf{Set}\) (assigning sets to objects; and functions to morphisms).
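
    To make this concrete, here is a minimal sketch in Python of a schema-as-grammar and an instance-as-functor. The schema, the edge names, and the data below are all made up for illustration; the point is just that objects get sets, edges get functions, and composite edges must still land where the schema says.

    ```python
    # A toy database schema: a graph presenting a category.
    # Objects are data types; edges are the required relationships.
    schema_objects = ["Employee", "Department"]
    schema_edges = {
        "worksIn": ("Employee", "Department"),  # each employee works in a department
        "manager": ("Department", "Employee"),  # each department has a manager
    }

    # An instance of the schema: a functor into Set.
    # Each object gets a set; each edge gets a function between the chosen sets.
    instance_sets = {
        "Employee": {"alice", "bob", "carol"},
        "Department": {"math", "physics"},
    }
    instance_functions = {
        "worksIn": {"alice": "math", "bob": "math", "carol": "physics"},
        "manager": {"math": "alice", "physics": "carol"},
    }

    # A composite morphism Employee -> Department -> Employee must also be
    # interpreted: follow worksIn, then manager, and land back in Employee's set.
    for e in instance_sets["Employee"]:
        boss = instance_functions["manager"][instance_functions["worksIn"][e]]
        assert boss in instance_sets["Employee"]
    print("valid instance")
    ```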

    I think the most important idea here (especially for modeling/ theoretical science) is that we can separate how complex systems combine (this is the grammar) and how those complex systems function (this is the semantics). And we can interpret the same grammar with different semantics (which allows us to be somewhat agnostic about the fundamentals of a complex system).

    Electric circuits are a good example of this last point. We can model an electric circuit using a labeled graph, then construct a category where these labeled graphs are morphisms so we can compose them (this is done with cospans). Each edge in the graph is thought of as some basic component (like a resistor). These constructions don't tell us how voltages or currents distribute over the circuit (i.e., they don't tell us what a resistor really means), just how circuits combine. Choosing how components relate currents and voltages (Ohm's law for linear resistors) is then a functor which sends the labeled graphs to some other category that says how the circuit behaves.

    But the grammar of electric circuits constrains how the circuits can behave, because whatever semantics we choose, it has to satisfy the functor axioms (intuitively, it has to respect how our grammar composes). So if a resistor behaves like a linear relation between two vector spaces, then sticking resistors together still has to give a linear relation between vector spaces.
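
    For instance (a standard calculation, not spelled out in the booklet): a resistor of resistance \(r\) can be read as the linear relation \( \{ ((\phi_1, I), (\phi_2, I)) : \phi_1 - \phi_2 = I r \} \) between the potential and current at its two ends. Composing the relations for resistances \(r_1\) and \(r_2\) by identifying the shared node gives $$ \phi_1 - \phi_3 = I (r_1 + r_2), $$ which is exactly the relation for a single resistor of resistance \(r_1 + r_2\): composing the grammar composes the behavior.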

    Compositionality

    This idea is relatively straightforward. Essentially, complex things can be constructed out of simpler parts. But as pointed out in Chapter 1 of Seven Sketches, how you combine parts to form a whole matters! The whole is not necessarily the sum of its parts (but it might be, up to isomorphism!). A system's components might behave a bit differently when studied separately than when glued together.

    Biology has many, many examples of this phenomenon! Chemical reactions studied in vitro (in a beaker) aren't the same as in vivo (in a cell). The environment/context matters! The cell has many other reactions that influence the reaction under consideration. Measurements of a system's components might not agree with measurements of the system as a whole.

    But this problem affects other fields! Quantum mechanics has this very issue with quantum interference (probabilities depend on the whole system; and can't be added up from measuring decoupled components; this is very roughly speaking as I am not really much of a physicist! ). Even the more hands-on case of electric circuits has this same problem. Adding components like resistors changes what you measure elsewhere in the network.

    A goal in applied category theory is to identify the right algebraic structure for systematically building a system out of component pieces. Under functorial semantics, the algebraic structure of our "syntax" category is preserved in the semantics. So how systems can be combined partially determines how they behave.

    The Yoneda Lemma (bonus theme)

    Tai-Danae mentions the Yoneda Lemma in her discussion of natural language modeling. The Yoneda Lemma gets to the heart of why category theory is useful for modeling! I won't give the formal statement, but I will quote the linguist John Firth:

    You shall know a word by the company it keeps.

    Essentially this is what the Yoneda lemma says: "You shall know an object by the company it keeps." Or "You shall know an object by how it interacts with other objects." In other words, you can define/study an object by how it interacts with other objects. And in category theory, "how it interacts with other objects" means "the morphisms between this object and others".

    To quote from the movie Forrest Gump:

    Stupid is as stupid does

    Similarly, "\(X\) is as \(X\) does" is a wonderful encapsulation of the Yoneda Lemma.

    What does this mean for modeling? Sometimes you don't need to know what a thing is, just what it does!

    Two constructions

    Tai-Danae discusses two constructions: monoidal categories and decorated cospans. I won't really summarize these constructions, but discuss how they could be used in modeling.

    Monoidal Categories

    A monoidal category gives us the syntax/grammar for systems where we can compose in series (via morphisms) and in parallel (via tensoring). We've seen many, many examples in this course:

    • Monoidal preorders: e.g. integers with order \(\le\) (as morphisms) under addition (as tensoring)
    • Resource monoidal preorder: stuff (like pie ingredients) and mixtures of stuff (tensoring) with processes/reactions turning stuff/mixtures into other stuff/mixtures (morphisms)
    • Database schema: data types with combined data types (tensoring) and relationships/actions between types (morphisms)
    • Co-design diagrams

    along with more traditional math examples (Tai-Danae gives some good ones).

    Another way to think about the monoid part of a monoidal category is to notice that, in several examples, the monoidal product lets us combine or mix together basic stuff. For example, chemical reactions turn mixtures of certain substances into other substances. We can describe reactions as morphisms (morphisms are good for processes), while mixing stuff together is tensoring.
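
    As a toy illustration (the ingredients and "reactions" below are made up), here is that series-versus-parallel picture in Python: mixtures of stuff are multisets, tensoring combines mixtures, and morphisms are processes applied one after another.

    ```python
    from collections import Counter

    # Objects: mixtures of stuff, modeled as multisets.
    # Tensoring: combining two mixtures in parallel.
    crust_stuff = Counter({"flour": 2, "butter": 1})
    filling_stuff = Counter({"apple": 3, "sugar": 1})
    mixture = crust_stuff + filling_stuff   # parallel composition of resources

    # Morphisms: processes turning one mixture into another.
    def bake_crust(stuff):
        # hypothetical reaction: 2 flour + 1 butter -> 1 crust
        out = stuff - Counter({"flour": 2, "butter": 1})
        out["crust"] += 1
        return out

    def fill_pie(stuff):
        # hypothetical reaction: 1 crust + 3 apple + 1 sugar -> 1 pie
        out = stuff - Counter({"crust": 1, "apple": 3, "sugar": 1})
        out["pie"] += 1
        return out

    # Series composition of morphisms: do one process, then the next.
    pie = fill_pie(bake_crust(mixture))
    print(pie)   # Counter({'pie': 1})
    ```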

    Decorated Cospans

    Let's imagine a pipe (or a wire). Pipes have no natural distinction between "input" and "output": water can go in either direction. But morphisms are always directional! If we want to turn "direction agnostic" processes into morphisms, we have to find a way to make the distinction between input and output not really matter. Decorated cospans are the answer (well, an answer).

    A cospan is the following diagram: $$ X \rightarrow N \leftarrow Y $$ where the object \(N\) represents our pipe (a potentially bidirectional process). What these objects are depends on the category (finite sets are a common choice, since composing cospans requires pushouts, a kind of colimit, and finite sets always have them).
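
    For the record, composition of cospans works by gluing along the shared boundary: given \( X \rightarrow N \leftarrow Y \) and \( Y \rightarrow M \leftarrow Z \), we take the pushout of \( N \leftarrow Y \rightarrow M \) to get the composite cospan $$ X \rightarrow N +_Y M \leftarrow Z . $$ In finite sets, \(N +_Y M\) is the disjoint union of \(N\) and \(M\) with the two images of each element of \(Y\) identified, i.e. the two pipes welded end to end. This is exactly why we need those colimits.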

    The cospan can be decorated with some extra structure on \(N\) (which might say how our pipe can be combined with other pipes, or, for a circuit, which components sit between which nodes). Decorating the cospan is the same idea as functorial semantics: the possible decorations of \(N\) are given by applying a functor to \(N\).

  • 3.

    Interesting..

  • 4.

    I quite enjoyed the general style and the informativeness of the booklet, which seems closely related to what we've been learning here as well. One thing I was less sure about was the article's take on natural language syntax, in particular the distinction between "left" and "right" duals (corresponding to the fact that natural languages have linear word orders), which isn't a novel point from the booklet or its direct references but reflects a long-standing approach dating back to the early days of Categorial Grammar.

    I guess this is an important point for computational linguistics, but in recent developments of theoretical linguistics (at least in the Chomskyan school) linear order has been almost entirely severed out of the syntax module of human language grammar and relocated to the sensorimotor-based "linearization" module (aka. phonology) instead. A main motivation for this move was that world languages vary a lot in their choices of word order, but this seldom has substantial effects on compositional semantics (e.g. both English "yellow banana" and Spanish "plátano amarillo" mean more or less the same thing). From this perspective the (semantically-relevant part of) syntax is purely hierarchical (sets of sets of sets of...sets) and the "left-right" ordering is only settled when the syntactic structure is "spelled out".

    Hence, I was wondering whether NLP researchers with a more mathematically-oriented mind have ever considered simplifying some bits of the logico-categorical toolkit by e.g. taking into account the non-semantic parts (e.g. phonology, pragmatics) of natural language grammar. :-)

  • 5.

    Julio, very interesting point. Do you have any books you would recommend on theoretical linguistics? I would be interested to learn more about models of syntax/semantics like the ones you discuss.

  • 6.

    Hi @Grant, glad to hear you are interested. Off the top of my head is Berwick and Chomsky's 2016 book [Why Only Us](https://mitpress.mit.edu/books/why-only-us), which is a fairly recent (and not too technical) summary of the main results of the field in the past few decades (the first chapter in [Carnie et al.'s handbook](https://books.google.co.uk/books?id=K091AwAAQBAJ&source=gbs_book_other_versions) is also a nice overview). For example, on page 8 Berwick and Chomsky say (emphases mine):

    [Human language syntactic structure] has at least three key properties, all captured by minimalist system assumptions: (1) human language syntax is **hierarchical**, and is **blind to considerations of linear order**, with linear ordering constraints reserved for **externalization**; (2) the particular hierarchical structures associated with sentences affects their interpretation; and (3) there is no upper bound on the depth of relevant hierarchical structure. Note that if all this is true, then observation (1) implies that any adequate linguistic theory must have some way to construct arrays of hierarchically structured expressions, while **ignoring linear order**; while (2) implies that structure (in part) fixes interpretation at the level of “meaning.” Finally, (3) implies that these expressions are potentially infinite. These then are the minimal properties any adequate syntactic theory must encompass and that’s why they are part of the minimalist account.

    So it is more or less a consensus (or at least a prevalent idea) among theoretical linguists nowadays that the structure-building module (i.e. syntax) and the structure-externalizing module (i.e. phonology) should be studied and formalized separately, where only the former is relevant to the structure-interpreting module (i.e. semantics). This perspective makes the general architecture of the human language faculty into a so-called "Y-model"; see e.g. [this article](https://ddd.uab.cat/pub/cjol/16956885v8/16956885v8p141.pdf) for a themed discussion. Hope this helps. :-)

  • 7.

    Awesome, thanks for the recommendations! I really like the perspective from evolutionary neuroscience in **Why Only Us**; I will definitely read it. I just read [this book](http://www.allmanlab.caltech.edu/EB/Allman_Evolving_Brains.pdf), and it really piqued my interest in EN. The passage you highlight reminds me a little bit of the interplay between syntax and semantics in the definition of 'algorithms' in computer science. Language is fascinating, and certainly, it feels like it's a rich playground to try out some category theory. :)

  • 8.

    Yeah, you are right. I guess in programming languages phonology does not have a role to play, so syntax shoulders everything about expression formation. This is also true, in a broader sense, for artificial languages like Esperanto, where there is much less externalization-induced irregularity (e.g. the so-called ["disharmonic word order"](http://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780199684359.001.0001/acprof-9780199684359)) compared to naturally evolved languages.

    The book you linked looks interesting too. I (from a traditional linguist's perspective) see quite a few directions CT may be applied to the study of natural languages, and the Categorial Grammar line of research is just one of them. In my PhD work I have been looking at another direction, and I think more possibilities are on the way now that Applied Category Theory is growing faster and faster. :-)

  • 9.

    Well, I have a lot to chew on, thanks again! :) If you consider any computational approaches, DM me; in the meantime, I'll be thinking about this stuff along with neuroscience in general (in conjunction with trying to learn category theory better, of course).
