Right now I’m running a pledge drive to (eventually) make Learning From Examples a full-time thing. We're making progress, but there’s still a way to go. If you’ve been enjoying the writing, this is the best time to show your support. A $5 pledge doesn’t cost anything today, but does tell me you’re in when I eventually flip the switch on paid subscriptions. To everyone who has pledged so far: I can’t believe how generous you’ve been. Thank you.
A few years ago, I was in a seminar about medieval alchemists.
The researcher, a historian of science, was interested in what those who practiced the arcane knew about the natural world. Probing to get something out of a quiet group, they asked a question I still think about: what is knowledge?
Things took a circular turn. Knowledge turned out to be one of those things that resists precise definitions, a lumpy mixture of truth, belief, rationality, and instinct. One popular idea was that knowledge is processed information, and that information becomes knowledge when we understand it.
This sounds neat enough, but there are some problems here. You can of course internalise false information. You can believe true things for the wrong reasons. And you can know how to recognise a face without being able to say how you do it.
Our medieval alchemists knew things, didn’t they? They knew about mercury, transmutation, and the four elements. These were coherent ideas, but they were ultimately wrong.
‘Understanding’ is just as slippery. We might feel that we understand something because it fits our worldview, not necessarily because it corresponds to some fact of reality that we know to be true. The broader point is that knowledge and knowing are not so clear cut. Even information, which seems stable enough, is messy, contingent, and shaped by who’s looking and what they expect to find.
In this essay, I think through some of these problems and apply them to AI. I want to make sense of claims that even the best models don’t really ‘know’ anything, and the counterclaims that argue for the opposite position.
What follows are five different ways of knowing and what they mean for getting to grips with thinking machines:
Knowledge as representation
Knowledge as practice
Knowledge as situation
Knowledge as power
Knowledge as emergence
I argue that AI, by which I mean frontier models, partly satisfies the criteria associated with each of these ways of knowing. It doesn’t tick all the boxes, but it does enough that I suspect AI knows more than many people give it credit for.
Knowledge as representation
In the Western imagination, the oldest description of knowledge is as a representation of reality. Way back in the 4th century BCE, Plato and friends mulled over ways of knowing in Theaetetus.
One of the more compelling ideas put forward is that knowledge is a true belief about the world paired with a good reason for holding that belief.
But they don’t buy it, at least not in every circumstance. The dialogue rejects the idea on the grounds that you must have the right kind of justification. It needs to be one that belongs to the person making the claim, it needs to show why the belief is true, and it needs to explain the belief by getting at a root cause.
I am speedrunning, but today we call this type of knowledge the justified true belief (JTB) model. In JTB, we might say that knowing that the Earth orbits the Sun would mean (a) it’s true (the Earth really does orbit the Sun), (b) one believes it, and (c) one has justification (e.g. scientific evidence or sound reasoning) for that belief.
Knowledge in this view is a mental representation that corresponds to reality, one that is underwritten by a justification. It’s essentially a mental mirror of the world that is true and warranted.
The model’s heyday ended with a famous paper by the philosopher Edmund Gettier in 1963. Gettier devised clever scenarios (now known as Gettier cases) showing that a person could hold a belief that is true and well-justified, yet we would hesitate to call it knowledge because its truth resulted from luck or coincidence.
For instance, imagine you glance at a normally reliable clock that, by chance, stopped 12 hours ago. Your belief about the current time is true and justified by said clock, but only accidentally so. This is sort of what Plato was getting at: it is possible for a justification to become disconnected from the truth, which severs the causal link between the elements that make up real knowledge.
There are many critiques of the idea of knowledge as representation. Far too many to get into here. But one of them is worth mentioning: the idea that knowledge doesn’t really mirror the world.
John Dewey defined knowing as an active ‘organism–environment interaction’ rather than a static representation. Other pragmatist thinkers like Richard Rorty agreed. They might say that knowledge is better seen as a tool for coping with the world rather than a reflection of it.
What about AI?
Let’s first assume we’re dealing with an agent capable of computer use, one that I’ve asked to do my shopping. We have to jump through a couple of hoops here, but my view is that a sufficiently capable model knows some things in a JTB sense.
It carries a representation (‘Harry prefers Shop A’) that guides its belief in what I want it to buy. The representation is true in that it correctly knows what I want and it has a good reason for thinking that because it has a log of my previous choices.
JTB only asks for a belief-state that produces the right behaviour under the right evidence. If our agent carries a memory module that always makes it choose Shop A for me, that module plays the functional role of a belief system. (But only if we adopt what Dennett calls the intentional stance, where we treat a system as if it has beliefs, desires, and rationality whenever doing so helps us forecast what it will do.)
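To make that concrete, here is a minimal sketch of the idea, with a hypothetical `PreferenceMemory` class of my own invention standing in for whatever such an agent actually uses. The ‘belief’ is just a record derived from past choices, and the ‘justification’ is the log that produced it:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class PreferenceMemory:
    """A toy 'belief store': a log of past shop choices."""
    history: list[str]

    def preferred_shop(self) -> str:
        # The 'belief' is simply the most frequent past choice.
        return Counter(self.history).most_common(1)[0][0]

    def justification(self) -> str:
        # The 'warrant' is the evidence the belief was derived from.
        counts = dict(Counter(self.history))
        return f"Chose {self.preferred_shop()} because past picks were {counts}"

# The agent 'believes' Harry prefers Shop A, and can point to its log.
memory = PreferenceMemory(history=["Shop A", "Shop A", "Shop B", "Shop A"])
print(memory.preferred_shop())   # Shop A
print(memory.justification())
```

Whether that counts as belief in any rich sense is exactly what the intentional stance leaves open: the module behaves as if it believes, and that is all the minimal JTB reading requires.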
So, accepting some mental gymnastics, a system can just about tick the JTB box. What it can’t do is satisfy Plato’s account, which has much harsher conditions for real knowledge. There are a few reasons for this, but the main one is that the agent needs an explanation (that is, a logos) that reveals the cause or essence of the thing and shows why it cannot be otherwise.
Of course, we don’t tend to apply these conditions to all types of human knowledge. I haven’t actually conducted experiments to prove the Earth orbits the Sun, and my maths skills are a little too rusty to figure that out using first principles.
That’s why the JTB version of knowledge as representation became popular: because it stops us from throwing the baby out with the bathwater by saying humans know very little.
But the JTB recipe proved brittle. Gettier showed that beliefs can be true and well-justified, yet still only accidentally correct. In other words, an AI may pass the minimalist JTB test the same way we often do (provisionally and with a dollop of luck) but that’s not the same as self-explanatory knowledge.
Knowledge as practice
But is knowledge something we only hold in our head? We ride a bike, catch a ball, spot a friend’s face in a crowd. We do those things instinctively. The know-how resides in coordinated muscles and perceptual cues rather than in explicit propositions.
Because so much competent action relies on background know-how that resists full articulation, treating knowledge solely as internal representations misdescribes everyday expertise. This idea is best captured by Michael Polanyi’s famous adage: ‘We can know more than we can tell.’
This view moves from knowing-that (viewing knowledge as factual or declarative) to knowing-how (the practical mastery that comes from doing and experiencing). Instead of a mirror held up to reality, knowledge is seen more like an ability we use in the world.
The philosopher Gilbert Ryle made a similar point in 1949. Ryle skewered the assumption that all knowing-how (like swimming) is really just a complex kind of knowing-that (knowing facts or rules about how to swim). After all, one could know how to swim even if one cannot articulate the physics of swimming.
In cognitive science, what we call the ‘phenomenological tradition’ emphasises the body as the locus of knowing the world. Maurice Merleau-Ponty thought that our body knows how to reach for a cup or navigate space in ways we don’t usually conceptualise abstractly. Perception, in other words, is itself an active engagement with the world.
Others took this idea further. They argued for an ‘enactive’ view where knowledge emerges through dynamic interaction of an organism with its surroundings, rather than through constructing internal representations detached from action.
The philosopher Jerry Fodor famously doubted these ideas. Focusing on Polanyi, he reckoned we should be careful calling something ‘knowledge’ if it can’t be verbalised or symbolised. Polanyi might respond that tacit knowledge underpins explicit knowledge. It supplements, rather than supplants, the process by which we make sense of explicit information.
What about AI?
In the 1960s, Hubert Dreyfus argued that early artificial intelligence efforts faltered because they treated human knowledge as rule-bound rather than as something learned through experience. A chess master, he pointed out, ‘just spots’ the right move rather than crunching numbers, acts from a sort of embodied familiarity, and cannot fully articulate the know-how guiding the act.
Dreyfus was talking about the symbolic school of AI that uses hard-coded rules, but he was also sceptical of machine learning because it struggled to get at the underlying meaning.
Modern systems, he might say, process patterns and find correlations, but they produce explanations that don’t necessarily correspond with the basic facts of reality.
This is a version of the symbol grounding problem, which concerns how symbols like words can acquire intrinsic meaning (rather than being defined only in terms of other symbols). A model might wax lyrical about molecular biology, but we don’t know whether ‘molecular’ or ‘biology’ are stable concepts that refer to in-the-world properties.
The problem, of course, is that there’s no real way to determine whether or not today’s systems have some sort of sophisticated world model that emerged as a fortunate byproduct of next token prediction. Maybe they do and maybe they don’t. My personal view is that the proof is in the pudding. If AI starts to consistently discover new facts about the world without any hand-holding, then it’s probably we who misunderstand the nature of knowing.
For now, though, let’s be clear about what we’re saying as it relates to practised knowledge. There are basically three major claims here:
Skilled action emerges through repeated practice.
Know-how is stored in dispositions rather than explicit propositions.
Embodiment anchors meaning in the world.
As I see it, the most sophisticated models satisfy both the first and the second criteria. My shopping bot does improve by trial and error, and its knowledge does emerge from experience.
The last claim is where they fall short. The shopping agent still lacks the full sensorimotor experience of being in the world that Dreyfus and the phenomenologists argue is necessary for the richest kind of human know-how.
Until an agent can live in an environment, develop a durable feel for what matters, and draw on that feeling in the real world, its tacit knowledge stays closer to habit than to embodied skill. Of course, it’s not all that clear to me that practised knowledge must also be embodied — but that’s a debate for another time.
Knowledge as situation
Donna Haraway coined the term ‘situated knowledges’ in a 1988 essay arguing against the illusion of pure objectivity in science. The idea is that all knowledge is partial, that the act of knowing is not only dynamic but also shaped by the relative position of the knower.
Here the knower is influenced by the historical, cultural, and physical conditions that shape their ability to form knowledge. It’s an idea closely related to Thomas Nagel’s ‘view from nowhere’ schtick, the philosophical ideal of stripping away local standpoints until you see the world independent of any particular observer.
Science often tries to reach this altitude. We describe colour as wavelength, love as neurochemistry, or the self as a biological organism. Each move feels like progress toward objectivity.
But Nagel argues the project is impossible to complete and perilous to over-extend. The very act of thinking is anchored in a ‘view from now-here’ that emanates from a physical entity with a history of being.
More concretely for our purposes, the historian of science Thomas Kuhn foregrounded the role of perspective with his concept of paradigms. In The Structure of Scientific Revolutions, Kuhn showed that scientists’ interpretation of data is conditioned by the ensemble of theories, methods, and exemplars they have at their disposal.
In Kuhn’s view, what scientists see as facts (and even what they observe through instruments) is filtered through the pores of conventional understanding. When paradigms are washed away by the storms of scientific revolution, scientists resurface to find a different world of knowledge.
Critics of these ideas charge philosophers with relativism.
They often ask: if all knowledge is perspective-bound, can we say any fragment of knowledge is better or more true than another? This was at the heart of the 1990s Science Wars™ in which critics like Alan Sokal (he of Sokal affair fame) and Paul Gross accused postmodern theorists of undermining rationality.
Defenders of the situated view shot back that recognising perspective is not the same as nihilistic relativism. Haraway distinguishes her view from ‘anything goes’ relativism by pointing out that claims still have to be accountable to evidence — but one must still recognise that all observers are somewhere.
A related debate turns on perspectivism. Kuhn documented how scientific facts are constructed in specific settings, but scientists often respond that nature ultimately constrains our perspectives. Wishing that the Earth revolves around the Sun does not make it so.
The idea is that there’s a tension between realism and constructivism. Contemporary philosophers often occupy a middle ground, acknowledging that while truth isn’t subjective, all access to truth is mediated by perspective.
What about AI?
Large language models draw on billions of tokens of data. Every output is called forth from somewhere, usually conjured up with one eye on a small slice of the internet. Each line has been conditioned by the feedback of human annotators who up-weighted some outputs and shooed away others.
What we get is a statistical amalgamation of situated viewpoints anchored in very specific conditions. When we request a historical summary of the siege of Khartoum, we’re probably getting the anglophone perspective. In standpoint-theory terms, the model amplifies perspectives that dominate public text and omits those that circulate orally, behind paywalls, or in low-resource languages.
In this sense, the model’s knowledge is deeply situated — though not precisely in the same sense it might be for humans:
For humans the standpoint is lived. You occupy a body, a culture, a history. Those coordinates shape what you can legitimately claim to know.
For an LLM the standpoint is inherited. Every sentence reflects the conditions under which it was constructed: data sources, content policies, platform design, technical affordances, and the risk posture of developers.
That makes the model’s outputs situated in origin (they reflect the decisions of its makers) but non-situated in experience (the model itself has no lived point of view). It is anchored in the contingencies of its design rather than in a knower’s embodied life.
Knowledge as power
Taking a leaf out of the pragmatist’s book, one way to make sense of knowledge is to skirt questions about what knowledge is and instead focus on what it does.
This tradition, associated with social theorists like Michel Foucault, describes a great entanglement between what we know and our ability to shape the world around us. It’s a school of thought that turns Bacon’s famous adage, ‘knowledge is power’, upside down.
The move is to take the phrase both literally and critically. Those who define what is known often hold power, and conversely, power structures determine what counts as knowledge.
Foucault famously said that ‘there is no power relation without the correlative constitution of a field of knowledge, nor any knowledge that does not presuppose and constitute at the same time power relations.’
Whenever power is exercised (e.g. the state’s power over citizens, a teacher’s power over students, or a doctor’s power over patients), it relies on knowledge (e.g. census data, educational curricula, or medical diagnoses). That knowledge in turn reinforces power relations by making them appear natural or necessary.
The problem here is a doozy. If all knowledge is an effect of power, then Foucault’s own analyses are equally compromised. On what grounds, critics ask, can they claim any critical privilege?
One response, associated with Jürgen Habermas, is that language contains built-in validity claims that actors must implicitly raise whenever they seek mutual understanding. Power, through say propaganda, can warp those claims — but the possibility of identifying distortion presupposes an ideal of undistorted communication.
It should be clear that we’re not really talking about another type of knowledge here. Knowledge as power isn’t a fourth species to sit alongside those we’ve already discussed, but it is useful for getting to grips with how conceptions of knowledge have changed — and how knowledge actually circulates in the world.
What about AI?
I think this type of knowledge, if we can even call it that, is probably the one that gets the most air time in discussions of AI today. At least implicitly, much critical work on AI is essentially Foucauldian. It’s interested in how the AI project uses personal information in service of a developer’s institutional goals.
Of course, this is basically true of any organisation trying to make money — especially those of the information age. More specific criticisms tend to involve studying how AI can exert political influence in the real world (though my view is that concerns like ‘misinformation’ are overpriced).
This type of analysis also considers the behaviour of the models and how design decisions influence outputs. As in discussions about situated knowledge, content filters, post-training techniques, and other classifiers all encode value judgements about what’s fair game and what’s not.
Reinforcement learning and policy tuning rely on human raters following a set of guidelines. The result is a model that treats some claims as polite, others as disallowed, and still others as ‘unproven’. All reflect the institutional risk posture of the builder.
That is not to say LLMs are corrupt. Rather, it simply reminds us that the knowledge produced by a model is the outcome of many small (and often invisible) governance choices.
Knowledge as emergence
The final conception of knowledge is the most recent. It holds that knowledge is an emergent phenomenon produced by interactions and networks. This view says that knowledge, whether in science or society, is the product of distributed systems.
A core idea here is actor-network theory (ANT). Developed by Bruno Latour, ANT suggests that all knowledge is collectively produced. In science, for example, knowledge might emerge from the interplay between researchers, instruments, institutions, and even non-human entities like microbes.
An actor-network theorist would say a claim only becomes ‘knowledge’ when a stable network has formed that supports it. You need others to reproduce your experiment. You need journals to publish it. And you need others to cite it. Until that happens, it’s not proper knowledge.
Closely related is the idea of distributed cognition, a concept from Edwin Hutchins’ study of maritime navigation. Hutchins believed that the ‘cognitive unit’ was the whole system of people, charts, instruments, and communication. Only together, he proposed, did they possess the knowledge needed to move the ship from A to B.
We see the idea of networked knowledge in crowdsourcing. On Wikipedia, for example, no single contributor knows everything — but a large network of contributors can collectively produce an encyclopedia.
In The Wisdom of Crowds, James Surowiecki argues that under the right conditions, aggregating information from many individuals can yield remarkably accurate knowledge. We need look no further for evidence than prediction sites like Manifold, where averaging many educated guesses often beats individual forecasts.
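As a toy illustration of that aggregation effect (a sketch of my own, not a claim about how Manifold works): if many people make noisy but unbiased guesses at some quantity, the average of their guesses usually lands closer to the truth than most individual guesses do.

```python
import random

random.seed(0)
TRUE_VALUE = 100.0   # the quantity the crowd is trying to estimate
CROWD_SIZE = 1000

# Each guess is noisy but unbiased: right on average, wrong individually.
guesses = [random.gauss(TRUE_VALUE, 20.0) for _ in range(CROWD_SIZE)]

crowd_estimate = sum(guesses) / len(guesses)
crowd_error = abs(crowd_estimate - TRUE_VALUE)

# Count how many individuals beat the crowd average.
better_individuals = sum(abs(g - TRUE_VALUE) < crowd_error for g in guesses)

print(f"Crowd estimate: {crowd_estimate:.2f} (error {crowd_error:.2f})")
print(f"Individuals more accurate than the crowd: {better_individuals} of {CROWD_SIZE}")
```

Run it a few times with different seeds and the crowd average almost always beats the overwhelming majority of individual guessers, because the individual errors tend to cancel out.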
There are many more examples like this, but for this post I’ll leave you with Andy Clark and David Chalmers’ extended mind hypothesis. This theory reckons that cognitive processes (and by extension knowledge) can extend into the environment. They think that a notebook, for example, is effectively an external memory that we can use to remember important information.
Knowledge as emergence is a cool idea, but — like all of those we have discussed — it is not without some problems.
Remember, in the representational view, knowledge is generally thought to correspond with some underlying truth and rationale to back it up. In a distributed system these two come apart. The network may deliver accurate results, but the warrant is…nowhere.
No single node can supply the justificatory story, and the pattern that does support the answer is often opaque. Because justification is diffuse, there is no clear locus for responsibility or error-correction.
What about AI?
Emergent knowledge most neatly corresponds to AI.
As we know, a language model is a dense network that encodes knowledge distilled from massive reservoirs of information. Nothing in the network is built to store explicit facts. Training simply nudges billions of weights so the system gets better at continuing text.
Out of those adjustments, higher-level regularities surface that allow the model to answer questions, draft code, and translate idioms. But no individual parameter ‘contains’ these abilities. If you tweak a few, the behaviour can re-form along alternate paths.
What the model knows arises as a pattern, one that can only exist based on the whole web of connections. Knowledge, in this sense, is an emergent property of the network’s collective dynamics — exactly the phenomenon systems theorists have in mind.
Consider what a model knows about the City of Light. It will tell you that Paris is in France, but there’s no single internal register that stores that fact.
Instead, what we see is that there are circuits responsive to the string ‘Paris’ and others that correspond to a concept of ‘Frenchness’. But that isn’t the same as the symbolic entry Paris → France. The connection between those two circuits is computed from a mesh of activations and embeddings.
More confusing still is that the same fact is encoded in many routes. Ablate one circuit and another pathway compensates. That redundancy is useful for robustness, but it means the fact isn’t localised in a way that carries its own explicit warrant.
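Here is a deliberately crude sketch of what that redundancy can look like, a toy of my own rather than anything resembling a real transformer. The ‘fact’ is spread across many small weight contributions, so knocking out a random half of them usually leaves the behaviour intact:

```python
import random

random.seed(42)

# Toy 'network': the Paris -> France association is spread across many
# small weight contributions rather than stored in a single location.
N_UNITS = 200
weights = [random.uniform(0.5, 1.5) for _ in range(N_UNITS)]

def answers_france(active_weights, threshold=50.0):
    """The 'fact' is retrieved if the summed evidence clears a threshold."""
    return sum(active_weights) > threshold

print(answers_france(weights))  # True: the intact network 'knows' the fact

# Ablate half the units at random: the behaviour usually survives,
# because no single unit carried the fact on its own.
survivors = random.sample(weights, k=N_UNITS // 2)
print(answers_france(survivors))  # very likely still True
```

Real models are vastly more complicated, but the moral is the same: you can delete a surprising amount of the machinery and the answer re-forms along other routes, which is why the knowledge feels robust yet impossible to point at.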
Does AI know things?
Knowledge is porridge. It’s warm, thick, nutritious, and bodily. But it’s also unglamorous. It’s hard to separate into its constituent parts and takes the shape of its container.
Thinking about knowledge shows us the limits of language. Much reading later, the only thing I feel truly comfortable saying about the topic is that precise definitions don’t have much currency here.
That all said, the point of this exercise was to answer a question: does AI ‘know’ things? The answer is yes, so long as you squint hard enough and don’t worry too much about the fine print.
So, here are the scores on the doors:
Representation: AI carries representations of the world that mirror reality well enough to act on, yet because those representations lack a transparent justification, they stop short of the strictest definitions of representational knowledge.
Practice: Frontier models replay usage patterns rather than consulting explicit rules. Their know-how is powerful but disembodied, missing the sensorimotor feedback that lets human skill refine itself in the world.
Situation: The model’s knowledge is deeply situated, though not like ours. Every answer is filtered through the languages, cultures, and editorial choices made by its makers and embedded in the training data.
Power: Design choices like alignment tuning and corporate policy dictate which claims the model is allowed to produce or suppress. What it knows is bounded by what it is permitted to say.
Emergence: LLMs are emergence made manifest. Answers arise from the collective dynamics of billions of small units rather than from any single stored fact. The model’s knowledge, like all networked knowledge, is robust but opaque.
So, AI can know if we take ‘knowing’ to mean reliably producing and acting on patterns that line up with the world. But if our definition must include embodied feeling, self-owned explanation, and an unambiguous locus of responsibility, then we have to think again.
Across all five frames the machine scores a partial hit. It mirrors facts, rehearses practical routines, speaks from a data-bound position, carries its builders’ priorities, and generates answers through emergence.
But when all’s said and done, I find it hard to believe that AI doesn’t know anything. It may not know like we do, it may not know everything, but it does know something.