In October 1975, a conference took place at the Abbaye de Royaumont, a former Cistercian abbey roughly 30 kilometres north of Paris. There, in grand surroundings and in the company of a group of academics, the Swiss psychologist Jean Piaget and the American linguist Noam Chomsky debated the nature of language and learning. Best known for his theory of innate or universal grammar, which holds that humans are born with an inborn capacity to understand and produce language, Chomsky argued against the ‘behaviourist’ view of learning. Behaviourism here refers to a psychological approach that emphasises the study of observable behaviours, especially as they pertain to learning, and proposes that all behaviours are acquired through interaction with the environment rather than being determined by internal states or thoughts.
Rejecting this position, Chomsky made the case that the complexity of language and the speed at which children learn it suggest that there must be an inborn mechanism facilitating language acquisition. Piaget, meanwhile, structured his argument around his theory of the four main stages of cognitive development that children progress through as they grow: sensorimotor (0-2 years old), preoperational (2-7 years old), concrete operational (7-11 years old), and formal operational (11 years old onwards). The debate between Chomsky and Piaget centred on their differing views of how cognition and language develop: while Chomsky emphasised the innate structures of the mind, Piaget focused on the experiential and developmental aspects of cognition.
So, with all that said, we are left with two questions: do today’s researchers know whether language is learned or innate, and what exactly does this have to do with AI? The first question remains open. Today, the loose consensus is that language acquisition likely involves a complex interplay of innate cognitive structures and environmental exposure. Some evidence points to shared constants underpinning language, which many take to signal the existence of innate linguistic structures and predispositions. All spoken languages, for example, have certain shared qualities like nouns and verbs, and children generally master the fundamentals of their native language within a few years of exposure. That being said, the diversity of languages and accents demonstrates that language is also shaped by the environment: a child born in China learns different phonetic distinctions than one in Mexico. Some take this to mean that humans have innate biases and constraints on possible languages, but require sufficient linguistic experience during critical developmental windows to acquire one.
Now for the second question: what does this debate have to do with AI? Well, to begin with, the existence of giant neural networks shows that general learning mechanisms can extract meaningful patterns from linguistic data without the hard-coding of explicit grammar rules. GPT-4, for example, demonstrates an impressive capability to learn the statistical patterns and structure of human language from vast datasets of text alone. It does this without any explicit programming of grammatical rules or linguistic theory, discovering the relationships between words, sentences, and (clutch your pearls) meaning through exposure to examples. (Cue joke about Learning From Examples.) More relevant, though, is the impact that the conference had on research into neural networks in the 1980s. Researchers like Seymour Papert––best known for co-writing, with Marvin Minsky, the 1969 takedown of neural networks Perceptrons: An Introduction to Computational Geometry––were at the event, while others such as Yann LeCun read the proceedings once they had been published a few years later.
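To make the ‘learning from examples’ point concrete, here is a toy sketch: a bigram model that picks up whatever regularities exist in a tiny invented corpus purely by counting which words follow which. It bears no resemblance to GPT-4’s actual architecture or scale (the corpus, function names, and parameters below are made up for illustration), but it captures the basic idea that statistical exposure alone, with no grammar rules coded anywhere, yields a model that can produce plausible continuations.

```python
# Toy illustration only: a bigram "language model" learned purely from
# co-occurrence statistics in raw text. No grammar rules are programmed
# anywhere; any regularities it captures come from the data itself.
import random
from collections import Counter, defaultdict

corpus = (
    "the child learns the language . "
    "the child hears the words . "
    "the words shape the child ."
).split()

# Count how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def generate(start="the", length=8):
    """Sample a continuation using only the learned statistics."""
    word, output = start, [start]
    for _ in range(length):
        followers = bigram_counts.get(word)
        if not followers:
            break
        words, counts = zip(*followers.items())
        word = random.choices(words, weights=counts)[0]
        output.append(word)
    return " ".join(output)

print(generate())  # e.g. "the child hears the words . the words shape the"
```

Swap the three-sentence corpus for a trillion tokens and the counting for gradient descent over billions of parameters, and you have the (very rough) shape of the modern approach.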
Not so humble beginnings
The conversation at the Abbaye de Royaumont followed in the tradition of the debate between ‘empiricism’ and ‘nativism’ in cognitive science, with Chomsky leaning towards a nativist account that privileged innate abilities and Piaget representing the empiricist side that stressed development through interaction with the environment. Outside of cognitive science, which I will get to in a moment, the genesis of the debate can be traced back to John Locke, who proposed that the mind starts as a tabula rasa or blank slate, with knowledge acquired primarily through sensory experience, and Immanuel Kant, who argued that certain knowledge is innate and independent of experience.
First introduced in An Essay Concerning Human Understanding in 1690, Locke’s blank slate metaphor is connected to the idea that humans are born without any innate ideas or knowledge already in the mind. Instead, Locke believed that all knowledge comes from experience and perception of the external world through the senses––a notion that aligns with cognitive empiricism in that it supports the idea that knowledge is derived from sensory experience and evidence. While he did allow for some innate capacities, such as the ability to receive and process sensory information, basic capacities for memory and reflection, and even language and communication abilities, he was quick to point out their limitations.
Then there was Immanuel Kant. In his 1781 Critique of Pure Reason, which, though hugely important, is perhaps the book I have enjoyed reading least, Kant argued that some knowledge about concepts such as space, time, and causality must be a priori, or known innately prior to any experience. He argued that a priori knowledge flows from what he termed the “categories of understanding”, in-built cognitive faculties through which the human mind structures and interprets experience. Kant broke this mental scaffolding that shapes our reasoning into four groups and associated concepts: quantity (unity, plurality, and totality), quality (reality, negation, and limitation), relation (substance and accident, cause and effect, and reciprocity), and modality (possibility and impossibility, existence and non-existence, and necessity and contingency). Though dizzying, the idea is that these categories capture the types of mental functions that allow us to interpret and reason about the world in universal ways that structure our thought.
A reaction to the radical empiricism of Locke, Kant's notions of innate cognitive faculties and a priori knowledge laid the philosophical groundwork for debates around empiricism and nativism that emerged in 19th and 20th century psychology. In cognitive psychology, the discourse surrounding empiricism and nativism begins with the dialogue between the physician Hermann Helmholtz and the physiologist Ewald Hering about visual perception in the 1860s. According to Helmholtz, the mechanisms by which humans perceive space are psychological in origin, take place high in the central nervous system, and rely significantly on learning and experience accumulated over the course of a person's lifetime. He argued that our ability to perceive the three-dimensional structure and spatial relationships between objects depends primarily on unconscious inferences made by the mind, the implication being that spatial vision stems from cognitive processes rather than from purely physiological mechanisms hardwired into the eye or nervous system. In this account, the mind subconsciously develops an understanding of three-dimensional space as we observe objects from different angles and distances.
Hering, on the other hand, insisted that learning and experience are preceded by a primitive residue of spatial perception mediated in the peripheral sense organs and lower levels of the nervous system, one largely fixed by inherited organic structures. He argued that perception stems from structures that have become fixed through evolution and adaptation, which means that the basics of spatial vision like depth perception are hardwired into our visual system by genetics and do not require higher cognitive processing. This includes the physical optics of the eye, which capture information about light and shadows, edges, and other visual cues that stimulate innate depth perception abilities. The underlying point here is that the foundations of spatial vision like depth, distance, and position relationships are handled automatically through physiological mechanisms in the ‘early’ visual system (i.e. the basic physiological structures and pathways involved in the initial stages of visual processing).
Hering compared this to how the ear and auditory system are structured to analyse sound frequencies and locate sources based on what he described as physiological machinery. In a similar way, spatial vision works via the inherent physiological design of the visual system, rather than through learned psychological inferences. Rather than being dependent on high-level mental reasoning as Helmholtz proposed, Hering made the case that the structure of the visual system itself––and not learning or cognition––explains human spatial perception abilities. For those interested, today most researchers believe that cognition does not significantly affect perception (there is, however, still some debate around the degree to which perception might be modulated by factors like expectations and attention).
Back to the future
The philosopher Hubert L. Dreyfus wrote of the history of AI: “In the early 1950s…one faction saw computers as a system for manipulating mental symbols; the other, as a medium for modelling the brain. One sought to use computers to instantiate a formal representation of the world; the other, to simulate the interactions of neurons. One took problem solving as its paradigm of intelligence; the other, learning. One utilized logic; the other, statistics. One school was the heir to the rationalist, reductionist tradition in philosophy; the other viewed itself as idealized, holistic neuroscience.”
Dreyfus refers to differences between the connectionist and symbolic schools of AI, which I wrote about in more detail in my introduction to AI history. The symbolic approach, now known affectionately as GOFAI or ‘good old-fashioned AI’, understood intelligence as the manipulation of abstract symbols using logic, heuristics, and structured knowledge representations, which aligned with the rationalist tradition of viewing cognition as a form of symbolic information processing. Where these physical symbol systems aimed to capture intelligence through logical reasoning and explicit knowledge structures, the connectionist approach modelled intelligence as emerging from networks of simple, interconnected processing units inspired by biological neurons. Rather than manipulating explicit symbols, connectionists relied on distributed representations and statistical learning procedures to recognise patterns, which connects with the empiricist view of cognition as primarily a learning process.
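A deliberately small contrast may help make the distinction tangible. In the sketch below (the toy ‘noun-verb’ task, vocabulary, and helper names are all invented for illustration), the symbolic route encodes the rule explicitly, while the connectionist route gives a single perceptron only labelled examples and lets it adjust its weights; whatever ‘knowledge’ it ends up with lives in those weights rather than in any stated rule.

```python
# Illustrative contrast only: a made-up task of judging whether a two-word
# string is a valid "noun verb" sentence in a toy language.

NOUNS, VERBS = {"dog", "cat", "child"}, {"runs", "sleeps", "learns"}

# Symbolic / GOFAI style: intelligence as explicit rules over symbols.
def symbolic_is_sentence(words):
    return len(words) == 2 and words[0] in NOUNS and words[1] in VERBS

# Connectionist style: a single perceptron is never told the rule; it only
# sees labelled examples and nudges its weights when it gets one wrong.
VOCAB = sorted(NOUNS | VERBS)

def encode(words):
    # Crude positional one-hot features: which word sits in slot 0 and slot 1.
    vec = [0.0] * (2 * len(VOCAB))
    for pos, w in enumerate(words[:2]):
        vec[pos * len(VOCAB) + VOCAB.index(w)] = 1.0
    return vec

examples = [(("dog", "runs"), 1), (("cat", "sleeps"), 1),
            (("child", "learns"), 1), (("runs", "dog"), 0),
            (("sleeps", "cat"), 0), (("learns", "child"), 0)]

weights, bias = [0.0] * (2 * len(VOCAB)), 0.0
for _ in range(20):                      # a few passes over the data
    for words, label in examples:
        x = encode(words)
        pred = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
        error = label - pred             # classic perceptron update rule
        weights = [w + error * xi for w, xi in zip(weights, x)]
        bias += error

test = ("child", "sleeps")               # never seen during training
print(symbolic_is_sentence(test))        # True, because the rule says so
print(sum(w * xi for w, xi in zip(weights, encode(test))) + bias > 0)
```

Whether the perceptron generalises to unseen pairs depends entirely on the examples it was given, which is precisely the empiricist bet: structure should fall out of experience rather than be stipulated up front.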
Debates between symbolic and connectionist approaches paralleled cognitivist arguments about whether cognition relies more on innate structures or environmental learning. Just as Chomsky argued against Piaget’s developmental psychology and Skinner's radical behaviourism, cognitive scientists like Jerry Fodor and Zenon Pylyshyn in the late 1980s critiqued connectionism's associationism as insufficiently representative of higher cognition. Yet, as we know, the backpropagation-powered neural nets of the 1980s demonstrated that connectionist learning was capable of invading domains in which the symbolic paradigm was dominant. Why does it matter? Well, as might be clear to many of you, my belief is that the history of AI did not happen in a vacuum. AI is a container, and it has come to be filled with ideas and values transplanted from fields including pattern recognition, statistics, economics, and of course cognitive psychology. Far from historical curiosities, these ideas are important because they directly shaped the trajectory of AI development.
Consider the following: connectionism was thought to be representative of radical behaviourism, a popular theory developed by Skinner suggesting that human will and agency could be understood as products of external stimuli. When reflecting on the resurgent popularity of connectionism in the 1980s, Seymour Papert said that the “behaviorist process of external association of stimuli with reinforcements” generated a “cultural resonance” between behaviourist interpretations of mind and connectionism. The same is true of the influence of empiricism on connectionism. Methods that privileged the process of learning became favoured over hard-coded, symbolic alternatives because of these mutually reinforcing conceptions of empiricism. After all, the successes of connectionism seemed to demonstrate that the empiricists were right, which, in a circular turn, conferred legitimacy on connectionist approaches that reflected certain modes of thinking about the nature of the mind. David Rumelhart, the prominent cognitive scientist credited with popularising backpropagation alongside Geoffrey Hinton and Ronald Williams, even wrote in the 1980s that “Information is not stored anywhere in particular. Rather, it is stored everywhere. Information is better thought of as ‘evoked’ than ‘found.’”
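Rumelhart’s line about information being stored ‘everywhere’ is easier to feel than to state, so here is a minimal backpropagation sketch in the spirit of the Rumelhart, Hinton, and Williams work (the architecture, learning rate, and iteration count are arbitrary choices for illustration, not a reconstruction of their experiments). A two-layer network learns XOR, yet no individual weight ‘contains’ the answer; the behaviour only exists in the configuration of all of them together.

```python
# Minimal backpropagation sketch: a two-layer network trained on XOR.
# All hyperparameters are arbitrary; the point is the distributed storage.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)      # hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)      # output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error back through both layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient descent updates, spread across every weight in the network.
    W2 -= 0.5 * h.T @ d_out
    b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h
    b1 -= 0.5 * d_h.sum(axis=0)

print(out.round(3).ravel())  # should approach [0, 1, 1, 0]; exact values
                             # depend on the random initialisation
```

Inspect the trained weights and you will find nothing resembling a rule like ‘output 1 when the inputs differ’, only a pattern spread across the whole network, which is roughly what Rumelhart meant by information being evoked rather than found.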
Today, as we build increasingly complex systems, the nativism-empiricism discourse reminds us that neither data-driven approaches nor symbolic knowledge alone renders the full picture of intelligence. Just as human cognition combines innate structures and experiential learning, it may be that the most powerful systems will arise from integrating statistical models with forms of grounded reasoning, social intelligence, and other inborn cognitive abilities (though my own view is that these attributes may prove to be emergent properties within the large model paradigm). And while the once rigid symbolic and connectionist camps have mostly softened into hybrid pragmatism, vestiges of the dichotomy persist and will likely continue to influence new directions at the frontier of AI research. Perhaps this is a good thing. After all, if history tells us anything, it’s that neither nature nor nurture alone captures the full breadth of intelligence.