To write history is to remember. Historians pick what to include and what to omit, what to emphasise and what to downplay. Too much stuff happens. Timescales can be dizzying. Evidence may be scarce.
Even when it’s not, facts are essential but insufficient. Population density in southern Europe tells you much of what you need to know about the rise and fall of Rome, but it’s not quite the full story.
It’s the historian’s job to situate facts within a framework that gives them meaning. But it’s also on them to probe these constructions and ask why they exist in the first place. The historian Eric Hobsbawm thought that historical practice was an exploration of how different narratives are constructed, and why certain events are remembered based on the needs of the time.
In his wonderful On History, he challenges the idea of a wholly objective recounting of the past by arguing that every narrative is coloured by the ideological climate of the day. Hobsbawm was asking us to view history as an active dialogue between the past and the present, one that is as much about power and memory as it is about chronology.
Nietzsche thought that history wasn’t a ledger so much as a series of selective narratives that serve different functions: antiquarian history preserves the past, critical history challenges it, and monumental history inspires action. By this account, history is about forgetting as much as it is remembering.
It suggests that the historian’s pen frames both cultural identity and collective memory. Choosing what to elevate and what to set aside is in service of this goal. Hobsbawm, too, sees the act of forgetting as a deliberate decision, a necessary counterbalance to the weight of remembrance.
The cruelty of AI history
The historian of machine learning Aaron Plasek wrote an article in 2016 about the ‘cruelty’ of writing his field’s history. According to Plasek, the core problem for those interested in the history of machine learning is that historical work has overlooked ML in favour of symbolic methods. As he put it:
The failure to appreciate this point has contributed to myopia in the popular histories of AI that rely on AI researchers as informants while downplaying the enormous body of technical work these informants produced, often relegating the field of machine learning to a mere subfield of AI. The actual historical situation, in terms of the sheer volume and ambit of technical publications produced, suggests the opposite to be true: machine learning has always been center stage, while AI within the larger field of computer science has often had the status of a disciplinary backwater.
The reason for this is obvious. Most histories of AI are written by AI practitioners, and most practitioners have traditionally scoffed at machine learning. They have generally preferred what is now known as GOFAI or ‘good old-fashioned AI’.
GOFAI is the ‘symbolic’ school of AI, in which systems are built from hard-coded rules for manipulating expressions. Heavily influenced by the ‘physical symbol systems hypothesis’ developed by the American AI researchers Allen Newell and Herbert Simon, symbolic reasoning assumes that aspects of intelligence can be replicated through the manipulation of symbols, which in this context means representations of logic that a human operator can access and understand.
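To make the contrast concrete, below is a minimal sketch of what rule-based symbol manipulation looks like: a forward-chaining loop over hand-written if-then rules. The facts and rules are toy examples invented for illustration, not drawn from any historical system.

```python
# A minimal sketch of symbolic reasoning: forward chaining over
# hand-written if-then rules. Facts and rules are invented toys,
# not a reconstruction of any historical GOFAI system.

facts = {"has_feathers", "lays_eggs"}

# Each rule maps a set of premises to a conclusion.
rules = [
    ({"has_feathers", "lays_eggs"}, "is_bird"),
    ({"is_bird"}, "can_fly"),  # a deliberately fallible hand-coded rule
]

changed = True
while changed:  # keep applying rules until no new facts appear
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(sorted(facts))  # ['can_fly', 'has_feathers', 'is_bird', 'lays_eggs']
```

Every step of the system’s behaviour here was authored by a human: the intelligence, such as it is, lives in the rules.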
Influential histories focused on this group include Pamela McCorduck’s Machines Who Think: A Personal Inquiry into the History and Prospects of Artificial Intelligence (1979) and Daniel Crevier’s AI: The Tumultuous History of the Search for Artificial Intelligence (1993).
The machine learning branch of AI history is often (though not always accurately) called ‘connectionism’. It proposes that systems ought to mirror the interaction between the neurons of the brain in order to learn independently from data. Artificial neural networks are commonly placed within this group, but the tradition’s genesis goes back much further.
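By way of contrast, here is an equally minimal sketch of the connectionist idea: a single artificial neuron that learns its weights from examples using the perceptron update rule associated with Frank Rosenblatt, rather than having its behaviour written by hand. The task (learning logical AND) and all parameters are toy choices for illustration.

```python
# A minimal connectionist sketch: one artificial neuron fitted to
# data with the perceptron update rule. Toy task and parameters.

# Learn the logical AND function from labelled examples.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b, lr = [0.0, 0.0], 0.0, 0.1  # weights, bias, learning rate

for _ in range(20):  # a few passes over the data suffice here
    for (x1, x2), target in data:
        pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        err = target - pred
        # Nudge the weights in the direction that reduces the error.
        w[0] += lr * err * x1
        w[1] += lr * err * x2
        b += lr * err

print(w, b)  # learned parameters rather than hand-coded rules
```

The contrast with the previous sketch is the point: nobody writes the rule; the numbers are fitted to data.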
This is why Plasek calls writing ML history a cruel thing. He argues that what we know today as machine learning developed from pattern recognition, which evolved independently from artificial intelligence in the twentieth century. Plasek suggests that pattern recognition realised a form of learning by borrowing the notion of the loss function formalised in the 1930s by the mathematician Abraham Wald (who is now probably best known for the survivorship bias meme).
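For the unfamiliar, the idea can be stated schematically in modern notation (a present-day gloss, not Wald’s original formulation): a loss function $L$ scores how badly a prediction $f_\theta(x)$ misses the target $y$, and learning amounts to choosing the parameters that minimise the expected loss, or risk:

$$
R(\theta) = \mathbb{E}_{(x,y)}\big[L\big(y, f_\theta(x)\big)\big], \qquad \theta^{*} = \arg\min_{\theta} R(\theta)
$$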
That spells trouble for AI’s founding myth. Received wisdom has it that the origins of the discipline that we today call ‘artificial intelligence’ can be traced back to the summer of 1956, when a group met at Dartmouth College in Hanover, New Hampshire.
The gathering, which took place over the course of two months, brought together researchers including John McCarthy (workshop organiser and developer of the LISP programming language), Marvin Minsky (a pivotal player in the intellectual battle between AI’s two most prominent subfields), and Claude Shannon (a major figure in information theory, and inspiration for the name of Anthropic’s flagship language model).
Because modern AI is the heir to connectionism, and because connectionism did not spring up out of the ground in the fields of New Hampshire, we have to concede that AI did not in fact emerge in 1956.
The myth of Dartmouth is unfortunate because it erects a wall between AI and the fields from which it borrowed so much: statistics, pattern recognition, cybernetics, optimal control theory, operations research, economics, behavioural psychology and of course neuroscience.
That isn’t to say there’s no good work on the history of AI, just that it’s in a bit of a muddle. What follows is a selection of great books about AI history and a few short words of context. They don’t cover everything, but reading a couple of them should give you a good sense of where AI came from.
1. Margaret Boden, Mind as Machine
Intellectual history that frames AI as a technical, cognitive and philosophical project. Her central claim is that AI was a philosophical wager about what minds are and what they’re made of. Mind as Machine is a comprehensive account that establishes AI as the inheritor of traditions spanning psychology to cybernetics.
This book is number one for a reason. If you can make it through all 1,600 pages you’re likely to be much more familiar with the history of AI than most. It’s also lurking around the internet as a PDF if you don’t fancy shelling out for a hard copy.
2. Nils Nilsson, The Quest for Artificial Intelligence
Nilsson was an insider. He’s one of those AI practitioners I mentioned above, and he made a huge number of contributions to the field at Stanford and its SRI International spin-off. The book offers a memoir-history hybrid that reads like a technical progress narrative.
If you want a good technical account that deals with both symbolic AI and connectionism, then this is a great place to start. It doesn’t grapple with the non-technical factors that shaped research (but that’s what the others are for). It begins about 2,000 years before Dartmouth, which I find satisfying for a book like this.
3. Anderson & Rosenfeld, Talking Nets
A collection of interviews with many of the key players in neural network research. Well-known names like Geoffrey Hinton make an appearance, but so do more obscure ones like Teuvo Kohonen and Stephen Grossberg, who were probably just as important.
The book captures the essence of a field in flux and the affective intensity that accompanied its reinvention. I like it because it reminds us that scientific practice is often about belief systems and their funny habit of surviving technical refutation.
4. Bernardo Gonçalves, The Turing Test Argument
The most recent book on the list deals with one of the field’s most famous moments. Gonçalves unpacks the multiple interpretations of the Turing test, arguing that it was never meant as a test of ‘intelligence,’ but as a thought experiment about language and performance.
It argues that the best way to understand how the Turing test came to be and what it sought to achieve is by studying the intellectual climate of the day and the relationship between Turing and his contemporaries. Relevant reading as large models pass a version of the imitation game.
5. Jonnie Penn, Inventing Intelligence
A meditation on the forces that shaped early AI research. A PhD thesis rather than a book, it looks at the life and work of the titans of AI history: Herbert Simon, Frank Rosenblatt, John McCarthy, and Marvin Minsky.
Its main innovation is to unpack how each of these figures borrowed from fields such as management science and operations research (Simon), economics and statistics (Rosenblatt), automatic coding techniques (McCarthy), and cybernetics (Minsky). You can read it for free here.
Remembering to forget
The reason I picked these books, and the reason I’m more interested in the history of machine learning than GOFAI, is in some sense an ideological one. I live in a moment when symbolic approaches are considered at best kooky and at worst irrelevant.
Despite some noises about a symbolic AI renaissance, you can pretty much guarantee that all frontier AI systems from here on out will use some form of deep learning. I also used to work at Google DeepMind, which obviously shaped how I view the technology and, by extension, how I approach its history.
The point is that no one can resist their cultural climate. Not the great historians and certainly not me. The best we can do is remind ourselves that to write about the past is to remember, but it’s also to forget.