Analyzed the data: SP WHW. Contributed reagents/materials/analysis tools: SP WHW. Wrote the paper: SP WHW.

The authors have declared that no competing interests exist.

Classical and Connectionist theories of cognitive architecture seek to explain systematicity (i.e., the property of human cognition whereby cognitive capacity comes in groups of related behaviours) as a consequence of syntactically and functionally compositional representations, respectively. However, both theories depend on

Our minds are not the sum of some arbitrary collection of mental abilities. Instead, our mental abilities come in groups of related behaviours. This property of human cognition has substantial biological advantage in that the benefits afforded by a cognitive behaviour transfer to a related situation without any of the cost that came with acquiring that behaviour in the first place. The problem of systematicity is to explain why our mental abilities are organized this way. Cognitive scientists, however, have been unable to agree on a satisfactory explanation. Existing theories cannot explain systematicity without some overly strong assumptions. We provide a new explanation based on a mathematical theory of structure called Category Theory. The key difference between our explanation and previous ones is that systematicity emerges as a natural consequence of structural relationships between cognitive processes, rather than relying on the specific details of the cognitive representations on which those processes operate, and without relying on overly strong assumptions.

For more than two decades, since Fodor and Pylyshyn's seminal paper on the foundations of a theory of cognitive architecture (i.e., roughly, the component processes and their modes of composition that together comprise cognitive behaviour)

The systematicity problem consists of three component problems:

These problems are logically independent—one does not necessarily follow from another

Classicists and Connectionists employ some form of combinatorial representation to explain systematicity. For Classicists, representations are combined in such a way that the tokening of complex representations entails the tokening of representations of their constituent entities, so that the syntactic relationships between the constituent representations mirror the semantic ones; systematicity is a result of a combinatorial syntax and semantics

In general, a Classical or Connectionist architecture can demonstrate systematicity by having the “right” collection of grammatical rules, or functions such that one capacity is indivisibly linked to another. Suppose, for example, a Classical system with the following three rules:

G1:

G1 provides the capacities to generate all four representations (i.e.,

A demonstration of systematicity is not an explanation for it. In particular, although grammar G1 has the systematicity of representation property, the following grammar:

G2:

To further clarify what is required of a theory to explain systematicity

The theory of planetary motion, of course, does not end there. The heliocentric theory, with its circular orbits, cannot explain the elliptical motion of the planets without further assumptions, and so was superseded by Newtonian mechanics. Newtonian mechanics cannot fully explain the precession of planetary orbits (notably the anomalous perihelion precession of Mercury), and was in turn superseded by Einstein's theory of relativity. In each case, the superseding theory incorporates all that was explained by the preceding theory. Evaluating competing theories in this manner has an extensive history in science, and so one may expect it to be a reasonable standard for an explanation of systematicity in cognitive science.

Aizawa

The problem for Classical and Connectionist theories is that they cannot explain systematicity without recourse to their own

In hindsight, the root of the difficulty that surrounds the systematicity problem has been that cognitive scientists never had a theory of structure to start with (i.e., one that was divorced, or at least separated, from specific implementations of structure-sensitive processes). In fact, such a theory has been available for quite some time, but its relevance to one of the foundational problems of cognitive science has not previously been realized. Our category-theory-based approach addresses the problem of

Category theory is a theory of structure

An adjunction is a formal means for capturing the intuition that a relationship between mathematical objects is “natural”—additional constructs are unnecessary to establish that relationship (see also

A

The most familiar example of a category is

Certain morphisms have important properties that warrant giving them names. Two such morphisms, which we will refer to later, are called isomorphisms and homomorphisms. A morphism

Homomorphisms pertain to categories whose objects have additional internal structure, such as groups. For example, the category

A

In

The categorical concept of product is a very general notion of combinatoriality. Not surprisingly, then, Classical and Connectionist notions of combinatoriality can be seen as special cases of categorical products. A grammar like G1 (

A functor

Functor composition and isomorphism are defined analogously to morphisms (above). That is, the composition of functors

Theories of cognition employ some form of representation. Functors provide a theoretical basis for constructing representations. For example, computational systems often employ lists of items, such as numbers. In category theory, lists can be modeled as monoids from the category

The two different sorts of arrows in Diagrams 3 and 4 highlight the constructive nature of functors. The objects are (co)domains with respect to the morphisms within categories, but are themselves elements of larger objects (in general, the class

Notice that the definition of functor does not dictate a particular choice for monoid homomorphism as part of the definition of

A

Again for expository purposes, we include the source category and functor arrows, which are usually left implicit in such diagrams. A transformation that is natural in the technical sense also seems natural, in the intuitive sense, to mathematicians. In fact, category theory was founded in an attempt to formalize such intuitions

A natural transformation is a

Functors preserve structure between categories; natural transformations identify the similarities between functors. For our purposes, functors construct new representations and processes from existing ones; natural transformations identify the similarities between constructions. A simple example that is closely related to the

Although their associated diagrams look similar, there is an important difference between functor and natural transformation pertaining to the equality constraint that defines the relationships between object elements. For a functor, the equality constraint is local to the codomain of the transformation, i.e. the relationships between object elements within the constructed category. And so, the elements of the objects in the new category are only indirectly related to the elements in the corresponding objects of the source category by the categories' common external structure (i.e. inter-object relationships). For a natural transformation, the equality constraint spans the transformation, involving object elements mapped by both domain and codomain functors. And so, the two functors are directly related to each other by the internal structure of their associated objects (i.e. the relationships between object elements within an object). As part of a theory of cognitive architecture, there is a tension between the freedom afforded by functorial construction on the one hand—allowing an architecture to transcend the specific details of the source elements to realize a variety of possible representational schemes for those elements—and the need to pin down such possibilities to specific referents on the other. This tension is resolved with adjunctions.
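The contrast between the two equality constraints can be made concrete with a small sketch. It is an illustration only, not the construction used in this paper: Python lists stand in for the list functor, `fmap` is its action on morphisms, and `reverse` and `sort_list` are hypothetical candidate transformations chosen for this example. The functor law is checked entirely within the constructed (list) side, whereas the naturality square relates two paths that both pass through the transformation.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def fmap(f: Callable[[A], B]) -> Callable[[List[A]], List[B]]:
    """Morphism part of the (assumed) list functor: apply f to every item."""
    return lambda xs: [f(x) for x in xs]


def compose(g, f):
    """Ordinary function composition: (g . f)(x) = g(f(x))."""
    return lambda x: g(f(x))


def reverse(xs):
    """Candidate transformation List => List that ignores what the items are."""
    return xs[::-1]


def sort_list(xs):
    """Candidate transformation that inspects the items, so it is not natural."""
    return sorted(xs)


def functor_law_holds(g, f, xs) -> bool:
    """Functor constraint: fmap(g . f) equals fmap(g) . fmap(f) on this input."""
    return fmap(compose(g, f))(xs) == compose(fmap(g), fmap(f))(xs)


def is_natural_on(eta, f, xs) -> bool:
    """Naturality square: eta . fmap(f) equals fmap(f) . eta on this input."""
    return eta(fmap(f)(xs)) == fmap(f)(eta(xs))


def negate(n):
    return -n


def double(n):
    return 2 * n


sample = [3, 1, 2]
print(functor_law_holds(double, negate, sample))  # True
print(is_natural_on(reverse, negate, sample))     # True
print(is_natural_on(sort_list, negate, sample))   # False
```

Checking a single input does not prove naturality; it only shows where the constraint sits. Sorting fails because its result depends on the identity of the elements, which is exactly the dependence that a natural transformation rules out.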

An

The left and right functors of an adjoint pair are like “inverses” of each other, but unlike an isomorphic functor whose composition with its inverse sends all objects and morphisms to themselves, the returned objects and their elements of a composition of left and right adjoints are related to the argument (source) objects and their elements by a natural transformation. For categories

The effect of

Monoid

However, not just any monoid generated from a set is a free monoid. For instance, the monoid

From free objects we get an alternative (equivalent) definition of adjunction: consider functor

Yet another (equivalent) definition of adjunction, favoured by category theorists for its conceptual elegance, highlights the symmetry between a pair of adjoint functors: a bijection (one-to-one correspondence) between the set of morphisms from object

In the list construction example, the unit of the adjunction is the injection

Since this arrangement works for any morphism in

A general pattern emerges from this use of adjunction. Functor construction may afford multiple choices for particular morphisms (processes) in the constructed category, but a principled choice is obtained through the commutativity property of the adjunction. This arrangement means that we are not committed
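The pattern can be sketched for the list construction, assuming the familiar free monoid reading of lists. The names `unit`, `extend`, and `addition` are introduced here for illustration and do not come from the text. Given any function `f` from items into a monoid, `extend` builds the homomorphism from lists that agrees with `f` on one-item lists, which is the commutativity condition that singles out one process among the many that the construction permits.

```python
from functools import reduce


def unit(a):
    """Assumed unit of the adjunction: inject an item as a one-item list."""
    return [a]


def extend(f, monoid):
    """Homomorphism from lists (under concatenation) to `monoid` induced by f."""
    op, identity = monoid
    return lambda xs: reduce(op, (f(x) for x in xs), identity)


# A target monoid given as (binary operation, identity element).
addition = (lambda m, n: m + n, 0)

# Sending every item to 1 induces list length; the identity map induces summation.
length = extend(lambda _item: 1, addition)
total = extend(lambda n: n, addition)

xs, ys = [4, 5], [6]
print(length(unit("a")))                          # 1, i.e. f("a")
print(length(xs + ys) == length(xs) + length(ys)) # True: concatenation maps to +
print(total(xs + ys))                             # 15
```

Once `f` is fixed, nothing further has to be chosen: the values on one-item lists and the homomorphism property together determine the value on every list.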

With these formal concepts in hand, we now proceed to our explanation of systematicity. We apply our explanation in two domains: systematicity with respect to relational propositions, and systematicity with respect to relational schemas. Then, we contrast our explanation with the Classical and Connectionist ones.

For expository purposes, we develop our adjoint functors explanation from its components. One may wonder whether a simpler category theory construct would suffice to explain systematicity. For this example domain, the components of this adjoint have some systematicity properties, but in and of themselves do not explain systematicity—just as for Classicism and Connectionism, having a property is not the same as explaining it. This bottom-up approach motivates the more complex category theory construct from which the systematicity properties necessarily follow. Our approach has three steps.

(If we require an explanation of systematicity with respect to ternary relational propositions, then a ternary product

First, suppose objects

The Cartesian product, however, is not the only product object that satisfies the definition of a categorical product of

Second, for any category

Although the product functor has the compositionality of representation property, it introduces a different problem:

Third, and finally, this problem brings us to the second aspect of our explanation foreshadowed in the

The (diagonal, product) adjoint pair is indicated by the following commutative diagram:

This explanation works regardless of whether proposition

That is,

If we need to explicitly represent a symbol for a relation, such as

For these situations, the diagonal and product functors have extensions. The extension to the diagonal functor is:

In this situation,

Under the assumption that these relation symbols belong to a different category, cases such as

In summary, products may have the systematicity of representation and inference properties (see also

Another domain in which humans exhibit systematicity is relational schema induction. This domain is more complex than the previous one in that the intrinsic connection is between relations, rather than within one. In the relational schema induction paradigm

Shape | NEJ | POB | KEF | BEJ |
square | POB | NEJ | BEJ | KEF |
circle | BEJ | KEF | POB | NEJ |

Shape | GUD | QAD | JOQ | REZ |
cross | QAD | GUD | REZ | JOQ |
triangle | REZ | JOQ | QAD | GUD |
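The two task instances above can be encoded directly to see what a structure-preserving mapping between them involves. The sketch below rests on assumptions made for illustration: each row is read as a shape acting on the column trigrams to give a response trigram, the shape correspondence is fixed as square to cross and circle to triangle, and only one-to-one trigram mappings are searched. It is not the formal ASet construction.

```python
from itertools import permutations

TRIGRAMS_1 = ["NEJ", "POB", "KEF", "BEJ"]
TASK_1 = {
    "square": dict(zip(TRIGRAMS_1, ["POB", "NEJ", "BEJ", "KEF"])),
    "circle": dict(zip(TRIGRAMS_1, ["BEJ", "KEF", "POB", "NEJ"])),
}

TRIGRAMS_2 = ["GUD", "QAD", "JOQ", "REZ"]
TASK_2 = {
    "cross": dict(zip(TRIGRAMS_2, ["QAD", "GUD", "REZ", "JOQ"])),
    "triangle": dict(zip(TRIGRAMS_2, ["REZ", "JOQ", "QAD", "GUD"])),
}

# Assumed correspondence between the shapes of the two task instances.
SHAPE_MAP = {"square": "cross", "circle": "triangle"}


def preserves_structure(trigram_map):
    """True if mapping then acting equals acting then mapping, for every pair."""
    return all(
        trigram_map[TASK_1[shape][t]] == TASK_2[SHAPE_MAP[shape]][trigram_map[t]]
        for shape in TASK_1
        for t in TRIGRAMS_1
    )


# Every one-to-one assignment of first-task trigrams to second-task trigrams.
candidates = [dict(zip(TRIGRAMS_1, p)) for p in permutations(TRIGRAMS_2)]
structure_preserving = [m for m in candidates if preserves_structure(m)]

print(len(candidates), len(structure_preserving))  # 24 4
```

Under these assumptions only 4 of the 24 one-to-one trigram assignments commute with the shape actions, and each of them is fixed by where it sends a single trigram. Knowing one stimulus correspondence in the new task therefore determines the remaining responses.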

This task is modelled as the category of

For the purpose of finding a suitable adjoint, we need to see how

The adjoint functor pair used for this domain consists of the

Our explanation for systematicity in this domain follows the now familiar pattern, where monoids model the relationships between actions in each task instance. (Though our argument employs monoids, nothing essential changes if instead we use semigroups, or groups, where for example each task instance is extended with two additional shapes, one explicitly corresponding to the identity element, and the other to the remaining element in the Klein, or cyclic-4 group. For these cases, the proofs of adjointness can be extended to involve free semigroups and free groups, respectively.) Given an ASet modelling the first task instance and an ASet modelling the second task instance, there is more than one homomorphism from the first to the second, only some of which afford the correct responses to the stimuli in the second task instance. For example, one homomorphism has the following trigram and shape mappings:

Some readers may be interested in developing alternatives or extensions to existing theories to address the systematicity problem in light of our explanation, so it is worth formally characterizing how our approach differs from previous ones. The difference between our category theory explanation and Classical/Connectionist approaches to systematicity may be characterized as higher-order versus first-order theories. Category theory also provides a formal basis for this distinction in terms of more general

Notice that the definitions of functor and natural transformation are very similar to the definition of a morphism. In fact, functors and natural transformations are morphisms at different levels of analysis: a natural transformation is a morphism one level above functors as we shall see. For

Classical or Connectionist compositionality is essentially a lower-level attempt to account for systematicity. For the examples we used, that level is perhaps best described in terms of a 1-category. Indeed, a context-free grammar defined by a graph is modelled as the

Our adjoints explanation of systematicity has essentially two parts: (1) existence, showing how a particular connection between cognitive capacities is possible from a functorial specification of the architecture; and (2) uniqueness, explaining why that particular connection is necessary because it is the one and only one that satisfies the commutativity property of the adjunction. In contrast, the Classical and Connectionist explanations only provide an account of existence, but not uniqueness. That is, some grammars/networks afford the required intrinsic links between capacities and some do not, just like some functorial constructions do and some do not; but, for Classicism or Connectionism, there is no further explanation determining only those grammars or networks yielding systematicity (other than by

To be regarded as a theoretical explanation for systematicity, such an explanation should be potentially falsifiable. Our explanation could be challenged by an alternative theory that accounts for systematicity (without

The unit of an adjunction is a natural transformation between functors. The sense in which a transformation is natural is that the transformation does not depend on a particular “basis”. A mathematician's example is to contrast the dual of a vector space with the natural double dual (dual of the dual) of a vector space: the former depends on a specific set of basis vectors chosen

In addition to explaining systematicity, our category theory approach has further implications. According to our explanation, systematicity with respect to binary relational propositions requires a category with products. A category theory account has also been provided for the strikingly similar profiles of development for a suite of reasoning abilities that included

Our explanation for systematicity in regard to binary relational propositions does not depend on

Though some effort is needed to provide a category theory explanation for systematicity, even for a relatively simple domain such as relational propositions, the potential payoff is that our explanation generalizes to other domains where an appropriate adjunction is identified. This sort of tradeoff has been noted elsewhere in the context of a category theory treatment of automata

Having provided an explanation of systematicity in terms of the rather abstract category theory concept of adjoint functors, one may wonder what this explanation means for a more typical conception of cognitive architecture in terms of internal representations and processes, and their realization in the brain. Human cognition is remarkable in that it affords the ability to think about things to which we have no sensory access (e.g., “a dog that is one lightyear long …”), yet to reason about such entities as if they were grounded in our everyday experience (“… is smaller than a dog that is two lightyears long”). However, these two aspects must be reconciled: unbridled abstraction means that one can no longer determine what a particular internal representation is supposed to refer to; yet blinkering the system with over-narrowly defined representations curtails one's ability to

The realization of computational processes in the brain is classically conceived as a

Up to this point, we have not considered the relatively new Bayesian approach to cognitive modelling (see, e.g.,

All theories make certain assumptions. The question is whether those assumptions are extrinsic to the theory and carry the essential explanatory burden (i.e. they are

This assumption of typing, though, is acute for quasi-systematic domains, where cognitive capacity may extend to some but not all possible constituent combinations, which appear to be particularly prevalent in language (see

Needless to say, our category theory explanation is not the final word on a theory of cognitive architecture. For our approach (and Classicism), where the assignment of elements to objects (and words to word classes) is asserted, there is also the broader question of why they get assigned in a particular way. This question pertains to the acquisition of representations, whereas the systematicity problem pertains to their intrinsic connections. Incorporating category theory into the Bayesian approach may provide a more integrative theory in this regard. A connection between category theory and probability has been known for some time (see

From a category theory perspective, we now see why cognitive science lacked a satisfactory explanation for systematicity—cognitive scientists were working with lower-order theories in attempting to explain an essentially higher-order property. Category theory offers a re-conceptualization for cognitive science, analogous to the one that Copernicus provided for astronomy, where representational states are no longer the center of the cognitive universe—replaced by the relationships between the maps that transform them.

Proof that the free and forgetful functors for the category ASet form an adjoint functor pair.

(0.10 MB PDF)

We thank the reviewers for comments that have helped improve the exposition of this work.