The Week in Examples #10 [28 October]
Scoping frontier AI risk, guidance for safe model deployment, and a dual mandate for AI
I’m back from a couple of days on the northern coast of Spain (with the knowledge that the rain in Spain does not in fact fall mainly on the plains). If this week’s update is a little hazier than usual, it’s because I’m still thinking about the wonderful museums (and, let’s be honest, bars) of Cantabria.
As always, one prompt remains unchanged: feel free to tell me what works and what doesn’t or just drop me a line to say hello at hp464@cam.ac.uk. And if you see anything you’d like me to include in a future edition, please send it my way!
Three things
1. UK reports on capabilities and risks from frontier AI
What happened? The UK government published a report on capabilities and risks from frontier AI to inform discussions at next week’s AI Safety Summit at Bletchley Park. The report is split into three parts. First, a discussion paper covering the current state of frontier AI capabilities, how these might improve in the future, and the associated risks, including societal harms, misuse, and loss of control. Second, a report drawing on national security experts that argues large models will increase risks to safety and security by enhancing the capabilities of bad actors. Third, a report from the Government Office for Science, which introduces a range of potential scenarios for AI development up to 2030.
What’s interesting? I like a bit of forecasting, so I found the GO-Science report to be the best read of the three. It lays out five scenarios for the future of AI (including a world with powerful but not superhuman AI, the development of AGI proper, and a scenario in which the AI bubble bursts). While you can quibble about the likelihood and exact composition of these scenarios, they represent an impressive attempt at making sense of our current moment. I also like that it argues that we don’t know enough about the trajectory of AI development to rule out catastrophic risks in the future. I’m always extremely surprised when I hear people profess certainty on both sides of the aisle, so I’m glad to see the authors take a more measured stance.
What else? As if three major papers weren’t enough, the government also hosted the safety policies of the major AI labs on the safety summit website (with each lab reporting on areas like responsible capability scaling, red teaming and model evaluations, and model reporting and information sharing). The final document I want to mention aims to “complement those AI safety policies by providing an overview of emerging frontier AI safety processes and associated practices.” This is a good summary of the state of governance practices, so make sure to give it a read if you’re interested in that sort of thing.
2. PAI needs you! Call for feedback on guidance for safe model deployment
What happened? The fine folks at the Partnership on AI released their Guidance for Safe Foundation Model Deployment, a resource designed to help firms “responsibly develop and deploy a range of AI models, promote safety for society, and adapt to evolving capabilities and uses.” The draft guidance is open for feedback until January 2024, so you can tell the authors what you think the work needs in order to be as effective as possible.
What’s interesting? The coolest thing about this project is that it generates custom guidance for model developers. You choose the type of model (specialised narrow purpose, general purpose, or frontier) and the type of release (open access, restricted API and hosted access, closed development, or research release). Each of these has different key considerations associated with it, which do a good job of (very briefly) summarising each category.
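To make that structure concrete, here is a minimal sketch, in Python, of how the two selectors might map onto tailored considerations. This is emphatically not PAI’s actual tool: the category names are taken from the draft guidance, but the custom_guidance function and the example considerations attached to the frontier/open access combination are placeholders of my own.

```python
# Minimal sketch (not PAI's tool): the guidance is keyed on two selectors,
# model type and release type, each with its own considerations.
# Category names follow PAI's draft; the example entries are placeholders.

MODEL_TYPES = {"specialised narrow purpose", "general purpose", "frontier"}
RELEASE_TYPES = {
    "open access",
    "restricted API and hosted access",
    "closed development",
    "research release",
}

# Hypothetical lookup keyed on (model type, release type).
EXAMPLE_CONSIDERATIONS = {
    ("frontier", "open access"): [
        "consider a staged rollout before broad availability",   # placeholder
        "restrict access until risk management is established",  # placeholder
    ],
}

def custom_guidance(model_type: str, release_type: str) -> list[str]:
    """Return illustrative considerations for a model/release combination."""
    if model_type not in MODEL_TYPES or release_type not in RELEASE_TYPES:
        raise ValueError("unknown model or release type")
    return EXAMPLE_CONSIDERATIONS.get((model_type, release_type), [])

print(custom_guidance("frontier", "open access"))
```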
What else? I started playing with the tool and pretended I was looking to release a frontier model on an open access basis. In response, my custom guidance said “we recommend that providers initially err towards staged rollouts and restricted access to establish confidence in risk management.” There was fairly limited explanation of why that was the case, so I’d be interested to see the group add more detail to the custom recommendations by fleshing out the benefits and risks associated with different modes of development and release. In general, one thing I’d like to see more of in the AI governance world is transparency around trade-offs: releasing a model widely, for example, increases its risk profile, but limiting access may hinder the growth of the ecosystem, prevent deep evaluations, and undermine legitimacy. This could be a good opportunity to surface some of those tensions.
3. Can a dual mandate be a model for the global governance of AI?
What happened? I wrote a short piece for Nature with my colleague Lewis Ho looking at whether an organisation with a ‘dual mandate’ to manage risk and spread benefits could be a promising model to explore for the global governance of AI. If you don’t have access to the article and would like to read it, I also posted the PDFs on X. The central idea is that a dual mandate, in which an organisation focuses on both promoting access to AI and ensuring its safe use, may be a useful framework to consider for organisations engaged in international governance.
What’s interesting? We started by reflecting on the dual mandate of the International Atomic Energy Agency, which sees the organisation provide assistance for states deploying nuclear technology while also operating an enforcement regime designed to limit proliferation. While the dual mandate may seem counterintuitive if the IAEA’s primary goal is to prevent the proliferation of nuclear weapons, the historian Elisabeth Roehrlich argues that it actually aligned incentives in a way that made control possible.
What else? But nuclear technology isn’t AI, and the IAEA’s dual mandate was built for a world very different to the one we live in today. We argue that what is needed is a project that avoids making AI policy a zero-sum negotiation between leading AI states and the rest of the world. The significance of nuclear power should not be understated, but AI may underpin far more economic, cultural, and scientific value when all’s said and done. For that reason, we made the case that any dual mandate arrangement should be premised on all states benefiting from a meaningful role in the fashioning of global AI policy.
Best of the rest
Friday 27 October
International survey of public opinion on AI safety (UK Gov)
Techno-humanism is techno-optimism for the 21st century (Substack)
AI company safety policies (UK Gov)
The Emergence of China’s Smart State (Free eBook)
Getting started with Llama (Meta)
Thursday 26 October
The UK’s AI Startup Roadmap (Startup Coalition)
OpenAI announces new frontier risk and preparedness team (OpenAI)
OpenAI announces preparedness challenge (OpenAI)
Mysterious bylines appeared on a USA Today site. Did these writers exist? (Washington Post)
Google DeepMind’s Shane Legg on the Dwarkesh Podcast (X)
Wednesday 25 October
Scoop: AI executive order expected Monday (Axios)
Oversight for Frontier AI through a Know-Your-Customer Scheme for Compute Providers (arXiv)
Anthropic’s societal insights team is hiring (X)
Anthropic, Google, Microsoft and OpenAI announce $10 million for a new AI Safety Fund (Google)
AI could worsen cyber-threats, report warns (BBC)
Tuesday 24 October
How AI Can Be Regulated Like Nuclear Energy (TIME)
AI Has a Hotness Problem (The Atlantic)
Voters want deepfakes ban and think AI should not be allowed to be more intelligent than humans, poll suggests (iNews)
AI risk must be treated as seriously as climate crisis, says Google DeepMind chief (The Guardian)
Statement from AI researchers: Managing AI Risks in an Era of Rapid Progress (Managing AI Risks)
AI-created child sexual abuse images ‘threaten to overwhelm internet’ (The Guardian)
Monday 23 October
Towards Understanding Sycophancy in Language Models (Anthropic)
This new data poisoning tool lets artists fight back against generative AI (MITTR)
Which tasks will AI do better than human professionals in the next decade, according to Americans (YouGov)
Announcing Epoch’s Updated Parameter, Compute and Data Trends Database (Epoch)
UK officials use AI to decide on issues from benefits to marriage licences (The Guardian)