The Week in Examples #4 [9 September]

Experiments in collective AI governance, news from the Frontier AI Taskforce, and evaluations for open source models

Sep 09, 2023

And we are back for the fourth edition of The Week in Examples, an end-of-week roundup of news from industry, government and civil society as well as opinion and analysis of all things AI. As usual, we have my thoughts on three interesting pieces I’ve seen this week, links to other AI resources, and a vignette to finish.

You will have noticed that there was no essay this week. That’s because I’m still figuring out what works and what doesn’t (in this instance, cadence) as I get Learning From Examples off the ground, which brings me to my regular public service announcement. I want Learning From Examples to be a useful resource, so please tell me what works and what doesn’t, what I ought to start doing, and what I should stop doing. Either reply to this email or write to me at hp464@cam.ac.uk with feedback.

Three things

1. An experiment in collective AI governance

What happened? Chatham House and Taiwanese public consultation project vTaiwan announced a new project, Recursive Public, with initial support from OpenAI's Democratic Inputs to AI grant scheme. The organisers bill the project as an experiment in identifying areas of consensus and disagreement amongst the international AI community, policymakers, and the general public on key questions of governance. Its initial goals are to identify priorities across the AI community, deliberate on key alignment issues, and shape a collectively-developed message about “what needs to happen with AI.” You can sign up via Google Form here.
What’s interesting? AI labs are accelerating efforts to incorporate public input via initiatives spanning alignment assemblies (OpenAI and Anthropic), community fora (Meta AI), and efforts to boost democratic deliberation (Google DeepMind). Relatedly, Anthropic released research focusing on scalable deliberation with pol.is as well as a project aiming to understand which values are encoded in large models. Recursive Public is one of the first projects I’ve come across that has sprung up using funding from OpenAI’s Democratic Inputs to AI initiative since the initial announcement was made back in May.
What else? One of the central questions underpinning AI development is about whose values AI systems ought to represent. As increasingly powerful models diffuse throughout the economy, a core challenge for AI developers is to grapple with how to build models that reflect the public and its interests. While some of the methods described above seek to answer this question, Recursive Public is notable because it seems to have been built to also tackle a slightly different (but related) set of questions about how to develop, manage, and deploy AI. Of course, important challenges remain about what the most effective methods are for seeking public input, what the nature of the problem is that we’re trying to use public input to solve, and the extent to which public input is appropriate (especially with respect to more specialist or contentious issues).

2. Bengio and Christiano join the UK’s Frontier AI Taskforce

What happened? The UK’s Frontier AI Taskforce (formerly known as the Foundation Model Taskforce) released its first progress report reflecting on efforts to recruit AI researchers, build the foundations for AI research inside government, and partner with technical organisations. With respect to the latter, partners include ARC “to assess risks just beyond the frontier in the lead up to the UK’s AI Safety Summit”, the Center for AI Safety “to interface with and enable the broader scientific community”, and the Collective Intelligence Project “to develop a range of social evaluations for frontier models”. Perhaps most eye-catching, however, was that the group announced an external advisory board including Entrepreneur First’s Matt Clifford, Yoshua Bengio, Chief Scientific Adviser Alex Van Someren, Deputy National Security Adviser Matt Collins, prominent medic Dame Helen Stokes-Lampard, and Paul Christiano of ARC.
What’s interesting? The external advisory board drew plaudits from safety-focused corners, with Eliezer Yudkowsky writing “I'm not sure if this govt taskforce actually has a clue, but this roster shows beyond reasonable doubt that they have more clues than any other govt group on Earth.” There’s no doubt that bringing Bengio and Christiano on board lends the group credibility as they look to build the infrastructure to conduct evaluations and explore promising research areas like mechanistic interpretability. Whether the taskforce will succeed remains an open question, but given that only a handful of organisations are capable of delivering cutting-edge technical safety research, it seems likely to me that the group is set up to have a reasonable impact.
What else? Not everyone likes the UK’s narrow focus on frontier AI safety. The Ada Lovelace Institute, for example, recommended the UK AI Safety Summit adopt a broad definition of AI safety, cautioning against a narrow definition based on extreme risk. Amidst the release of the summit’s five objectives, the think tank called for the government to adopt a broad definition to ensure that topics such as accidental harms, misuse, and supply chain harms were not excluded from discussions. While I am sympathetic to this perspective, there is a certain logic to a narrow interpretation of safety. A tightly scoped discussion makes it much easier to get a settlement that works for as many states as possible at the summit, while from the taskforce’s perspective, we ought to remember that the group only has £100M in funding (for context, a Series B funding round for an AI start-up can be expected to bag over £160M or $200M).

3. Open source models should be subject to evals, says, uh, me

What happened? Please indulge my shamelessness as I tell you about a summary of a recent paper that I co-authored with my Google DeepMind colleague Sebastien Krier, 'Open-source provisions for large models in the AI Act,' for the team at the Montreal AI Ethics Institute. The long and short of it is that we suggest that while open-source models should not be subject to the same provisions as commercial models, they should be subject to evaluations to limit risks.
What’s interesting? Ultimately, ensuring the benefits of AI are spread while mitigating against the proliferation of dangerous models is a defining challenge of our industry. Striking the right balance will always prove difficult, and parties will always disagree about whether a settlement is too restrictive or too generous. Open-source models are an important part of this puzzle. Not only do they act as a mechanism for diffusing the benefits of AI, but they also contribute to the growth of the AI ecosystem writ large. While we support measures that favour minimal restrictions on open-source models today, such a position is likely to prove dangerous over the long term as models become more sophisticated.
What else? The debate around access shows no sign of abating. The UK Frontier AI Taskforce chair recently described open source models as a form of ‘irreversible proliferation’, while on the other side, a16z announced special support (not investment, though) for companies building open source AI solutions. Meanwhile, moves to contest the term ‘open source’ have continued. Amongst permissive approaches to access, it is possible to differentiate between ‘open models’ that come with commercial-use weights and open-source datasets, and ‘open weights’ approaches with licensed model weights but lacking public training data. Other approaches include sharing ‘restricted weights’ that have conditional accessibility with undisclosed datasets, and so-called ‘contaminated weights’ that are technically open but restricted by dataset limitations.

Best of the rest

Friday 8 September

We must shape the AI tools that will in turn shape us (FT)
Rage against the machine? Why AI may not mean the death of film (The Guardian)
What No One Outside OpenAI Can Really Understand About OpenAI (Substack)
Companies Look to Squeeze More Power Out of AI Chips (Wall Street Journal)

Thursday 7 September

Market concentration implications of foundation models: The Invisible Hand of ChatGPT (Brookings)
AI Safety in China #2 (Substack)
Anthropic launches a paid plan for its AI-powered chatbot (TechCrunch)
The 100 Most Influential People in AI 2023 (TIME)
Microsoft pledges legal protection for AI-generated copyright breaches (FT)
G7 Hiroshima AI Process G7 Digital & Tech Ministers' Statement (POLITICO)

Wednesday 6 September

Google to require campaign ads to disclose AI use (The Hill)
Policies in Parallel? A Comparative Study of Journalistic AI Policies in 52 Global News Organisations (SocArXiv)
Governor Newsom Signs Executive Order to Prepare California for the Progress of Artificial Intelligence (CA.Gov)
Pentagon Plans Vast AI Fleet to Counter China Threat (The Wall Street Journal)
Opinion: The staggering implications of AI drone warfare (The Hill)

Tuesday 5 September

What OpenAI Really Wants (WIRED)
Letter to President Biden (Center for American Progress)
Ex-Google executive fears AI will be used to create ‘more lethal pandemics’ (NY Post)
Exclusive survey: Experts favor new U.S. agency to govern AI (Axios)
Opinion: Computers will not take over the world (The Washington Post)
AI Safety Newsletter #21 (Substack)

Monday 4 September

Topical Collection on The Dangers of AI Hype: Examining and countering the causes, manifestations, and consequences of overinflated and misrepresented AI capabilities and performance (AI and Ethics)
‘I hope I’m wrong’: the co-founder of DeepMind on how AI threatens to reshape life as we know it (The Guardian)
Scramble to secure more power for Rishi Sunak’s supercomputer lab (The Telegraph)
‘Premature regulation could stifle AI in London’ (The Times)
A developer built a 'propaganda machine' using OpenAI tech to highlight the dangers of mass-produced AI disinformation (Business Insider)
Baseline Defenses for Adversarial Attacks Against Aligned Language Models (arXiv)

Vignette of the week

*"The code that the computer understands is normally put onto cards or paper tape" (1971). Artist: BH Robinson. ‘How it Works: The Computer’*

John Giudice

Sep 9, 2023

On your discussion point about evaluating AI models - I would propose that people develop a set of evaluation prompts and questions for AI implementations, with expected answers, that could be regularly used to evaluate the AI models people care about. These evaluation questions would be run regularly, maybe quarterly, to track the quality changes, and other aspects that are important. Since many models are regularly being updated and changed it would be helpful to track and understand the changes and the progress on their implementation. Let’s discuss this if others are interested as well. I don’t think anyone is publicly doing this that I have found.

Nick Potkalitsky

Sep 9, 2023

Nice work. The AI Snake Oil guys are putting on an all-day forum about safety and open AI models on Sept 21. Should be good.
