The Week in Examples #21 [13 January 2024]
Copyright in the spotlight, officials call out AI-powered fraud, and worries about misinformation
Two weeks into 2024 and there’s still no AGI. Is it time to ask, like so many commentators before us, whether another AI winter (which is definitely a useful frame of reference and absolutely reflects the history of AI development) is just around the corner?
Whatever the case, in the middle of a very real UK winter, this week we look at OpenAI's response to The New York Times, reporting on the use of AI to enable fraud, and work from the WEF arguing that AI-powered misinformation poses a greater risk to global stability than, uh, war between nation states. As always, message me at hp464@cam.ac.uk for comments, ideas for next time, or to say hello.
Three things
1. OpenAI responds to The New York Times lawsuit
What happened? OpenAI responded to a lawsuit from The New York Times with a blogpost making four points: 1) that it collaborates with news organisations like the Associated Press and Axel Springer; 2) that training AI models on copyrighted material ought to be considered fair use; 3) that 'regurgitation', whereby ChatGPT reproduces NYT articles near-verbatim, is a "rare bug"; and 4) that The New York Times "intentionally manipulated prompts" in the examples provided as part of the lawsuit. All prompts are intentionally manipulated in one form or another, but OpenAI argued that the degree to which The New York Times did so (specifically by including "lengthy excerpts" of articles) represented a meaningful departure from standard usage practices.
What's interesting? In the court of public opinion, meanwhile, US polling from the AI Policy Institute found that "59% think AI companies should not be allowed to use copyrighted materials in training models" while "70% believe outlets like the Times should be paid for use of their materials." At the heart of OpenAI's response is the tension between its willingness to partner with news groups and its belief that training on copyrighted material constitutes fair use. To square the circle, OpenAI said that "legal right is less important to us than being good citizens." This belief, according to the company, is the reason it offers an opt-out process for publishers to prevent its tools from accessing their sites.
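The blogpost doesn't spell out how the opt-out works, but, as I understand it, it amounts to publishers adding robots.txt rules for OpenAI's GPTBot crawler (which The New York Times did in August 2023). A minimal sketch of how one might check a site's rules, using a hypothetical publisher domain:

```python
# Minimal sketch: checking whether a publisher has used the robots.txt
# opt-out that OpenAI's GPTBot crawler respects. The domain below is a
# hypothetical placeholder, not a real publisher.
from urllib.robotparser import RobotFileParser

robots = RobotFileParser()
robots.set_url("https://example-publisher.com/robots.txt")
robots.read()  # fetch and parse the site's robots.txt

# False means the site disallows GPTBot, i.e. has opted out of crawling
print(robots.can_fetch("GPTBot", "https://example-publisher.com/"))
```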
What else? In a separate submission to the UK's House of Lords communications and digital select committee, OpenAI said it would be impossible to train today's leading AI models without using copyrighted material. Writing in the submission, the company made the case that "limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today's citizens." The examples underscore that, on both sides of the Atlantic and beyond, it just isn't clear what is or isn't fair use with respect to AI training and copyrighted data. At least in that sense, this suit and others should provide some much-needed clarity.
2. European police say LLM-powered fraud is on the rise
What happened? Representatives from Europol, the law enforcement agency of the European Union, told The Guardian that the agency had recorded a sharp uptick in fraud on dating and social media apps. According to the report, Europol said that LLMs are enabling "criminals to target multiple victims at once", increasing the number of people they can contact with each individual scam, which usually involves asking for money to escape a difficult situation. I find it hard to believe that criminals were previously incapable of targeting multiple people at once without LLMs, but then again I don't have much experience in this particular area. In any case, Europol also said that so-called "bogus boss" scams were on the rise, in which fraudsters use AI to create fake websites, CVs and investor profiles to target victims.
What's interesting? Unfortunately, the report didn't provide any statistics, so we don't know 1) what sort of increase we're talking about or 2) the extent to which AI is responsible for said rise. We do know that, in theory, language models are already good persuaders (see Bai et al., 2023; Jakesch et al., 2023; and Karinshak et al., 2023), but we still don't have much information about whether AI is fuelling fraud in the real world. This intervention from Europol (which follows a report from the organisation warning of "potential" use by criminals) gets us closer, but neither speculation nor anecdote is a substitute for data.
What else? A 2023 paper also found that LLMs can assist with the email-generation phase of a spear phishing attack: its author used OpenAI's GPT-3.5 and GPT-4 models to create unique spear phishing messages for over 600 British Members of Parliament. Taken together, the persuasion studies, the Europol reporting, and the spear phishing research suggest that LLMs are easily deployed by fraudsters. But that's only half the story. While I'd like to see some quantitative research on the volume of fraudulent activity that AI is enabling in the round, I'd also be interested in work addressing whether AI-enabled fraud is more successful than the human-run variety on a case-by-case basis. If I've missed work that answers these questions, please put an end to my ignorance and let me know!
3. Misinformation riskier than war, says WEF
What happened? Sticking with the theme of persuasion, manipulation, and deception, the World Economic Forum released a report suggesting that misinformation is the most significant risk to global stability. The research, the 19th edition of the Forum's Global Risks Report, said that over the next two years misinformation and disinformation represent a greater risk than "economic downturn[s]", "extreme weather events", and, uh, "interstate armed conflict". While the ranking is obviously baffling (more on that below), it is worth saying that the report represents the amalgamation of opinions from "1,490 experts across academia, business, government" rather than the sole view of the World Economic Forum.
What's interesting? It's hard to know where to begin with this one. The easy thing would be to make fun of the absurdity of ranking the possibility that AI undermines epistemic security, in the middle of a big year for global elections, as worse than actual armed conflict, but lots of other people have that one covered. On pages 18-21, the report explains the reasoning behind the experts' views, which essentially boils down to what it describes as the emergence of "large-scale artificial intelligence (AI) models" that "have already enabled an explosion in falsified information and so-called 'synthetic' content".
What else? The problem, of course, is that AI hasn't yet brought with it an avalanche of misinformation to hopelessly degrade our epistemic security. Researchers writing in Harvard's Misinformation Review, for example, tackled the three most common arguments about AI's impact on the information environment (increased quantity of misinformation, increased quality of misinformation, and increased personalisation of misinformation) and found that each was overpriced. As they explained, "existing research suggests at best modest effects of generative AI on the misinformation landscape." In one of life's little ironies, both The Guardian and the FT reported on the ranking as if it were gospel, with the former's headline simply reading "AI-driven misinformation 'biggest short-term threat to global economy'". Finally, to round things off, I want to direct you all to an excellent piece by Dan Williams about the problems with treating misinformation studies as a science. Be sure to read it sceptically, though.
Best of the rest
Friday 12 January
Control AI poll results (Control AI)
Gaming voice actors blindsided by 'garbage' union AI deal (BBC)
UK government to publish 'tests' on whether to pass new AI laws (FT)
AI could be used to stop vandalism of historic sites (BBC)
Medical AI falters when assessing patients it hasn't seen (Nature)
Thursday 11 January
Nature Computes Better: Opportunity seed funding call (ARIA)
Hallucinating Law: Legal Mistakes with Large Language Models are Pervasive (Stanford)
US companies and Chinese experts engaged in secret diplomacy on AI safety (FT)
Microsoft overtakes Apple as largest U.S. company on AI boost (Axios)
How AI Replaced the Metaverse as Zuckerberg’s Top Priority (Bloomberg)
Wednesday 10 January
The Impact of Reasoning Step Length on Large Language Models (arXiv)
Misinformation researchers are wrong: There can't be a science of misleading content (Dan Williams >> Substack)
AI aids nation-state hackers but also helps US spies to find them, says NSA cyber director (TechCrunch)
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training (arXiv)
Duolingo cuts contractors as it further embraces AI (Semafor)
EU examines Microsoft’s ties to OpenAI (FT)
Tuesday 9 January
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs (GitHub)
It’s already time to think about an AI tax (FT)
In the race for AI supremacy, China and the US are travelling on entirely different tracks (Guardian)
Rebecca Woods on Large Language Models, Language and Meaning, and How Children Learn Language (The Good Robot Podcast)
New AIPI poll (AIPI >> X)
Deepfaked Celebrity Ads Promoting Medicare Scams Run Rampant on YouTube (404 Media)
AI Companies: Uphold Your Privacy and Confidentiality Commitments (FTC)
Monday 8 January
Mixtral of Experts (Mistral >> arXiv)
The EU AI Act Newsletter #43: French Government Accused (EU AI Act Newsletter >> Substack)
Content creators fight back against AI (FT)
Judges in England and Wales are given cautious approval to use AI in writing legal opinions (ABC)
Escalation Risks from Language Models in Military and Diplomatic Decision-Making (arXiv)
Is GenAI’s Impact on Productivity Overblown? (Harvard Business Review)