Often the basis for AI criticism is a deep dislike of the tech-bro hyper-optimist view: the “AGI solves everything” idea. Academics tend to hate that mindset and so do I. AGI is poorly defined; it’s not here yet; and there’s no evidence that it solves more problems than it creates.
But this valid critique of overreach too often blinds smart people to realities. AI can do things, a lot of things actually, that were previously thought impossible for non-human entities. That is not tech-oligarch hype; it is just reality. And it does need open minds and (yes) new epistemologies. We do not build those by punching straw men. We need to think up and build new descriptive ideas to match AI as it emerges.
Agree with all of this. My view is basically that recognising that AI is more sophisticated than many tend to assume should open up more lines of (important) critical work, not close them off!
Yeah I think that’s exactly right—the most interesting philosophical and empirical work arguably can only start once we adopt some humility about what we think these systems can and can’t do; and also about the difficulty in properly assessing what we want to assess.
And I also think there's a bit of bitterness that these tech bros and AI labs didn't actually understand or solve the deep mysteries of linguistics and cognitive science; instead, they simply made a lot of GPUs go brrrr, and these are GPUs that academics don't have.
This is a problem that still needs to be addressed: AI needs to become a tool that helps with linguistics and cognitive science. I think there will be more effort made to connect these fields with AI research.
Some agreement here, although I think it’s not just that academia does not have the GPUs; they don’t WANT the GPUs, because GPUs symbolize emergent outcomes produced in disorganized bundles that don’t match well to linguistic theory or terminology.
AGI is a long-term goal that right now carries far more hype than substance. People are fine being skeptical about it, both in academia and in general society. But LLMs and related systems like ChatGPT and Claude are already very useful in programming, research and brainstorming. Everyone needs to try using them, or at least avoid superficial criticisms of them, especially if you are in a teaching or creative position!
What remains is to find a worthwhile definition of the modern human, and then to start thinking about what a program developed by a human actually is.
“90-95 percent accurate”? That means a summary containing 20 “facts” will contain one or two falsehoods; of 20 references in an AI-generated paper, one or two will be made up.
A journalist with that record would be fired instantly upon discovery; an academic with that record would face a university inquiry and would never be taken seriously again.
And yet people are relying on AIs instead of the work of people who get things right for a living. This will not end well.
Journalists have sub-editors and editors to check for errors. Academics have peer review and journal edits. Discounting that, there's not much difference in accuracy between a human-written first draft and a draft from the best models.
Neither people nor AI are infallible. Obviously, you should check a model's outputs before using them - and you should generally avoid 'relying' on AI too.
Editors (outside The New Yorker, and probably not even them nowadays) do not check journalist’s facts: they rely on the journalist’s integrity and skill, which is why journalists who make things up get fired right away.
Peer reviewers are unpaid and couldn’t possibly check every fact and citation.
And if one can’t “rely” on AI summaries, what exactly should one do with them? Treat them as entertainment?
As someone who had a piece in Time about two weeks ago, I can assure you they absolutely do check facts!
Ok, a few elite magazines still do. Most don’t. Newspapers never have.
A huge proportion of what we regard as information can only be trusted as such because of human professionalism and judgment. AI adoption will lead —has already led—to all sorts of things being accepted as true that simply aren’t.
I use AI every day and it is immensely useful. It gets things wrong far less often than a human assistant would. Stop living in a binary world where they are either 10,000% reliable or they are useless. You come across as a whiny prick, no offense. Just like when you work with a human employee, they can get things wrong, and you need to keep an eye out for that. It’s not a big deal.
It really is a big deal when you start using it for things where accuracy matters.
Courts, money, medical care are not places where inaccuracy is tolerated.
*journalists’
The difference is literally that a human making up citations would be summarily fired the first time.
This is a huge and ongoing obstacle to usefulness in many applications. And it happens to organizations that should know better. See for example https://techcrunch.com/2025/05/15/anthropics-lawyer-was-forced-to-apologize-after-claude-hallucinated-a-legal-citation/
Humans also get vetted, rather than just run through a cost analysis.
That’s a shockingly terrible accuracy rate. It only sounds good out of context.
For comparison, I work for a publishing company and our correction rate is on average less than one per journalist per year, for people who produce a substantial piece of writing (at least 400-500 words) every day, and usually more.
Each piece of writing usually contains dozens of facts and quotes. Even if you account for prevented corrections (which were caught by an editor) that adds maybe a couple per month.
You are basically saying that LLMs make about 10-50x as many mistakes as a professional writer (a rough back-of-the-envelope check of that ratio is sketched below).
Furthermore, most of the corrections we have are fundamentally different from an LLM hallucination. Usually it’s things like an incorrect date, a transposed number, or other boneheaded mistakes which are easily fixed and don’t expose the publisher to much liability. LLMs can fabricate entire quotes and attribute them to companies! Do you have any idea how damaging that can be to a publisher and to the company being ‘quoted’?
Not even getting into the fact that LLMs can go to the other extreme and regurgitate something produced by someone else, effectively opening the company up to a plagiarism lawsuit.
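A rough sketch of that comparison, using only the figures mentioned in this thread (90-95% per-fact accuracy for the model, roughly one published correction per journalist per year plus a couple of editor-caught errors per month, and about two dozen facts per daily piece). These are assumptions for illustration, not measured data:

```python
# Back-of-the-envelope comparison of implied per-fact error rates,
# using the figures stated in this thread (assumptions, not measured data).

# Assumed LLM per-fact accuracy, from the "90-95 percent accurate" claim.
llm_error_low = 1 - 0.95   # 5% of facts wrong
llm_error_high = 1 - 0.90  # 10% of facts wrong

# Assumed journalist output: one piece per working day, ~24 facts per piece.
pieces_per_year = 250
facts_per_piece = 24
facts_per_year = pieces_per_year * facts_per_piece  # 6,000 facts

# Assumed journalist errors: ~1 published correction per year
# plus ~2 editor-caught mistakes per month.
errors_per_year = 1 + 2 * 12
human_error_rate = errors_per_year / facts_per_year  # ~0.4% of facts

print(f"Human per-fact error rate: {human_error_rate:.2%}")
print(f"LLM per-fact error rate:   {llm_error_low:.0%}-{llm_error_high:.0%}")
print(f"Ratio: roughly {llm_error_low / human_error_rate:.0f}x to "
      f"{llm_error_high / human_error_rate:.0f}x more errors per fact")
```

The exact multiple obviously depends on the assumed facts-per-piece and error counts, which is why the 10-50x figure above is a range rather than a point estimate.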
While I found your comment really insightful, I think it’s useful to look at the human baseline performance on the SimpleQA benchmark: “We found that the [human baseline] matched the original agreed answers 94.4% of the time, with a 5.6% disagreement rate.”
Further, they state that the error rate of the test may be as high as 3%.
I think it would be really interesting if your company would answer the questions in that test, or, even better, create its own benchmark based on your experience over the years - that would give us a much more realistic comparison to work with!
If you're going to do that, you should include a baseline for AI models as well - including the poorly developed or erratic and unknown LLMs that don't make WaPo headlines. Why is it a fair comparison to pit a program that required hundreds of billions of dollars to develop against someone who may have had very little education, no interest in learning, or no interest in participating? Let's at least try to compare apples to pears.
I mean, it is quite a tall order to ask academics to adapt their epistemologies and accept the AI project for what it is, given the complete lack of self-criticism in AI and the abundance of fantastical prophecies of unlimited and unstoppable progress - just check out what the field's 'godfather' G. Hinton or D. Amodei have been blessing humanity with lately. I wonder why not a single other scientific field requires one to rethink their epistemology? And it really requires a special kind of self-delusion to claim that there has been no substantive scientific critique of AI; on the contrary, the critique is pretty much as old as the field. The problem is that ignoring such critique has virtually become part of the AI researcher's job description nowadays.
'And it really requires a special kind of self-delusion to claim that there has been no substantive scientific critique of AI' - absolutely, and not something that this post claims!
Well then why strawman perfectly respectable arguments? For instance, take the case of hallucinations, since you mention them extensively in the post: they are a natural and inevitable consequence of the fact that LLMs are stochastic generative models with limited coverage in their training data. No serious academic claims that LLMs hallucinate all the time; the argument is that hallucinations are inevitable when the LLM is presented with contexts outside its training data, due to the nature of the system.
In no way is 170 words 'extensive'. As for your claim, yes, obviously hallucinations are a natural function of the nature of LLMs. That tells us nothing. The point is a) that they are not as common as critics assume them to be, and b) that they are becoming less frequent over time.
What does it even mean that they do not hallucinate frequently? It all depends on what you are asking them: ask them some mundane question and they will do just fine, but ask them anything beyond some threshold and hallucinating is all they do.
Also consider that AI systems are increasingly using real-world grounding. Rather than just relying on training data, they can search the web. Or, for example, when generating code they can then run that code; if the code doesn’t run due to a hallucination, the system can fix it. As a practical matter it doesn’t really matter much (for coding) at this point, other than to increase duration and/or cost.
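For what it's worth, the coding loop described above can be sketched in a few lines. `generate_code` here is a hypothetical stand-in for whatever model API is actually being used, so this is an illustration of the pattern rather than any particular product:

```python
import subprocess
import tempfile

def generate_code(prompt: str) -> str:
    """Hypothetical stand-in for a call to a code-generating model."""
    raise NotImplementedError("replace with a real model call")

def run_with_self_correction(task: str, max_attempts: int = 3) -> str:
    """Generate code, execute it, and feed any failure back to the model.

    This is the grounding loop described above: a hallucinated API or a
    syntax error surfaces as a runtime failure, which becomes part of the
    next prompt. The practical cost is extra attempts (duration and money).
    """
    feedback = ""
    for _ in range(max_attempts):
        code = generate_code(task + feedback)
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run(["python", path], capture_output=True,
                                text=True, timeout=30)
        if result.returncode == 0:
            return code  # the code ran cleanly, so return it
        # Otherwise append the error output so the next attempt can address it.
        feedback = f"\n\nThe previous attempt failed with:\n{result.stderr}\nPlease fix it."
    raise RuntimeError("no runnable code produced within the attempt budget")
```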
Lmao, you are literally like the critics he mentions in his article that act like they haven’t used an LLM since 2023. Dude, wtf are you talking about? If you ask them about something not in their training data, they will most likely tell you that they don’t know the answer to that question. This problem was solved years ago (eons on AI-time).
So hallucinations in LLMs have essentially been solved years ago? - I guess the research community has somehow failed to notice.
> The problem with this line of thinking is that it requires a bit of philosophical wrangling, one that (for reasons unclear) the vast majority of academics seem unwilling to engage in. This is particularly frustrating because if you’re going to make forceful claims about epistemology, it seems rather unsporting to dodge the resulting debate.
It is very, very, very, very, very annoying that AI believers continue to play dumb about this point. Pattern-matching words is different from thinking! And this is so incredibly obvious that I suspect that AI believers are either lying or are just philosophical zombies. Like, I'm honestly not sure if some of the people involved in this debate possess qualia.
Even a human who grew up alone and therefore was incapable of language would still be able to think. This is because we have words, which *signify* things, and reference objects, or *things-in-themselves* which are *being signified.* As I mentioned in another comment elsewhere, putting "ball" and "bat" together because they appear in sentences together often is *categorically different* than putting them together because you've been to a baseball game and so you know that people use bats to hit balls.
That these are *two different things* is so obvious that if you say you don't understand the difference between them, you're either lying or a philosophical zombie.
RE: hallucinations, this is another area where AI believers just don't understand skeptics and are therefore not responding to the point. Skeptics aren't citing the rate of hallucinations; they're talking about the *cause* of hallucinations. Humans make errors because they have a false belief about the underlying reference objects; LLMs make errors because they pattern-match words with a psychopathic disregard for the truth. A well-calibrated LLM can reliably tell the truth *despite not knowing what that is* if sufficiently trained by motivated researchers. But imagine claiming that a properly calibrated clock *literally knows what time it is* simply because it always shows the correct time. That's insane!
RE: investment; companies are pouring lots of money into AI because 1) they think AGI is possible because *financially interested researchers have told them so* and 2) LLMs will have a lot of economic applications even though they will, in my view, never become AGI. Even glorified pattern-matching software can automate lots of tasks that currently only require very good pattern-matching skills, like technical writing, clerical work, basic copywriting, basic coding, basic data analysis, etc. LLMs will still be impactful and disruptive - but there's no need for us to get hyped into believing outlandish claims that pattern-matching robots are going to become AGI just because people who have a direct financial interest in investors believing this say so.
A lot could be said, but in short, two important papers on the subjects you mention:
LLMs can learn to reason without words https://arxiv.org/abs/2412.06769
LLMs naturally form human-like object representations https://www.nature.com/articles/s42256-025-01049-z
LLMs are ungrounded structuralist models: they only process semantics based on the internal relationships between tokens within text. We have now had over a century of debate over whether structuralism is an accurate description of human language. The argument remains unresolved, although LLMs have rather dramatically demonstrated how far structuralist models can take you.
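A toy illustration of that structuralist point - deriving "meaning" purely from co-occurrence inside a text, with no grounding in anything outside it. The corpus and the method here are invented for illustration and are not how LLMs are actually implemented:

```python
# Toy distributional "semantics": word similarity derived only from
# co-occurrence inside a tiny corpus, with no reference to the world.
from collections import Counter
from itertools import combinations
import math

corpus = [
    "the batter swung the bat at the ball",
    "the pitcher threw the ball past the batter",
    "she hit the ball with the bat",
]

# Count how often each pair of words appears in the same sentence.
cooc = Counter()
vocab = set()
for sentence in corpus:
    words = set(sentence.split())
    vocab.update(words)
    for a, b in combinations(words, 2):
        cooc[frozenset((a, b))] += 1

def vector(word):
    # Each word is represented purely by its co-occurrence counts.
    return [cooc[frozenset((word, other))] for other in sorted(vocab)]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    return dot / norm if norm else 0.0

# "bat" and "ball" come out as similar purely because of how the tokens
# are distributed in the text - nothing here has ever seen a baseball game.
print(cosine(vector("bat"), vector("ball")))
```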
I'd like to phrase it as being a skeptical advocate of AI. I'm a short-term pessimist and long-term optimist. There's a lot that still sucks but it will only get better.
In my personal experience, instances of AI hallucinations (aka poor statistical inferencing) are much more frequent, especially when it comes to reasoning tasks and anything involving complexity and nuance. Plus, the second a model accesses web content, everything gets worse.
Interesting. My experience is that “hallucinations” are extremely frequent without the system consulting the web, but less frequent when web searches are allowed. Would be interested in hearing more about the problems with web search.
Fair point. My comment conflated two issues: 1) I experience the hallucinations on a regular basis; and 2) I find LLMs provide poor quality or incomplete information when they access the internet for answers. These issues stem from separate technical processes. The second is not a hallucination.
Thanks - that’s really helpful. My experience reflects yours. For some of my work in business I am finding the output good enough to be useful, albeit with careful supervision.
What I think LLMs are most useful for is assisting with brainstorming and providing constructive criticism.
Get back to me when an LLM gets on a plane, flies to an archive, does original research, and then contributes to the sum total of human knowledge by writing it up. Right now all you’ve got is a Chicago Sun-Times summer reading list with 4 books that don’t exist.
Agreed, although curators are working hard to digitise many of these collections. My daughter’s PhD is sourced from a combination of trips to physical archives in Dublin and consulting obscure corners of the internet.
Should've included a Gary Marcus meme; what great self-restraint you have shown. Nice piece.
You essentially do what you critique, just in reverse; each of your suggestions applies as a criticism of your own article when read in reverse.
There is a certain irony given the role of deconstruction in the debate over structuralist models of language.
Insightful piece, Harry. You articulate with clarity something I’ve also been observing with growing concern: the performative nature of much academic AI criticism, where terms like “bullshit generator” become ritual signals rather than analytical positions. Your point about this critique-as-posture is especially sharp — and needed.
I particularly appreciated your call for humility and empirical engagement.
I recently wrote The 3D of the AI Religion (https://mirrorsofthought.substack.com/p/the-3d-of-the-ai-religion), which looks at how dogmatism (of both the utopian and nihilistic kind) has taken over the conversation. But your list of practical suggestions for better criticism is what the field really needs: not just a map of what to reject, but a sketch of how to think better.
Thanks for advancing the conversation.
Calling it a “bullshit generator” feels clever, but it’s lazy. It’s the kind of move a freshman makes to sound deep without doing the work.
How about "a system that generates plausible but potentially unfounded responses"? Don't worry, I'm not lazy I just asked perplexity.
“Bullshit” has a respectable definition in academic circles (Frankfurt 1986, 2005), although that definition includes a degree of agency on the part of the source that most academics using the term probably wouldn’t grant an LLM.
It could also be that, from a very technical perspective, AI is really just doing pattern matching, so when people try to attribute blame to AI, or go on about AI overlords, then as an actual AI researcher I think they should go read the science behind this.
I was in a book club with someone who went to try AI and then just ranted about how the AI was lying to him, because he asked it for some information and it gave an incorrect answer.
We were telling him that that’s because his AI likely did not have access to the Internet, and the information it did have probably predated that website.
He was adamant that it did have that information, for he had asked it something else and it answered correctly, and that I was being ridiculous because of course it has access to the Internet just like him: it is on the Internet.
These people don’t understand tooling, but they do believe that AI is incredibly intelligent, because it sounds intelligent. On a few topics. On the basis of this tiny amount of testing they are satisfied to trust it until it fails them, and then they… start to treat it like a malicious human.
I think the lack of understanding about the flaws of LLMs is what makes us researchers adopt a defensive stance against AGI madness.
Realistically, AI is not going to take over the world, but we may have an ethics problem when people depend on AI and don’t realise that it is built from biased data. Because we are biased.
Sexism in natural language processing is a good example. It is more worrying that people will use AI to try to evaluate, say, resumes based on past hiring decisions, without realising that they may be perpetuating the biases of before.
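A minimal synthetic sketch of that resume-screening worry: if the historical hiring labels a system learns from were biased, even a dead-simple data-driven screen reproduces the bias. The groups, numbers, and "model" here are all invented for illustration:

```python
# Synthetic illustration: a screen learned from biased historical hiring
# decisions reproduces the bias. All data and numbers are invented.
import random

random.seed(0)

def past_hiring_decision(skill, group):
    # Pretend the archive is biased: equally skilled candidates from
    # group "B" were hired less often.
    penalty = 0.3 if group == "B" else 0.0
    return 1 if skill - penalty + random.gauss(0, 0.1) > 0.5 else 0

# Build a "historical" training set of (skill, group) -> hired.
candidates = [(random.random(), random.choice("AB")) for _ in range(5000)]
labels = [past_hiring_decision(skill, group) for skill, group in candidates]

def learned_hire_rate(group, lo=0.6, hi=0.8):
    # The simplest possible data-driven screen: hire at the historical rate
    # observed for this group within a given skill band.
    outcomes = [y for (skill, g), y in zip(candidates, labels)
                if g == group and lo <= skill < hi]
    return sum(outcomes) / len(outcomes)

# Equally skilled candidates, very different learned outcomes:
print("Group A, skill 0.6-0.8:", round(learned_hire_rate("A"), 2))
print("Group B, skill 0.6-0.8:", round(learned_hire_rate("B"), 2))
```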
Thanks for this. I’ve found talking to some academics about LLMs to be a bit crazy-making. It’s clear they haven’t even bothered to play with a good model for any sustained amount of time and are just vibes-matching “big tech bad; ai is big-tech; ai is bad”.
For a lot of us, including myself, the experience of using an LLM gives a strong impression that it is more than a “mere tool” or “stochastic parrot”.
So now I work on how to understand and speak about this.
Mostly I have a problem with the blind and fervent assertion that language models (stronger form: language models alone) are THE WAY to AGI. When image recognition was making huge progress a decade ago, no one was foolish enough to claim that it was AGI. Machines that speak-ish natural-ish language-ish are a huge soft spot for many, since we are hardwired to feel at home with natural language.
I think we should drop the term “hallucinations” when referring to AI. Conscious beings hallucinate; systems malfunction. I see this term as just another example of the equation of LLMs with human intelligence (which is much more complex). They’re not intelligent. They’re still useful, but they’re way different from human intelligence.
I think academics should be able to spot the difference and use language accordingly, and not succumb to the AI hype being driven by companies to increase investment, generalized adoption and so on.
Just because some big companies or some former US presidents say something, that doesn’t mean it’s true. Actually, if these kinds of actors are saying something with such confidence, I think it deserves much closer scrutiny. That’s basically critical thinking.