The Week in Examples #6 [30 September]
Post-AGI policy, the state of international governance, and Mistral’s new model
In what seems to be the rule rather than the exception, there was another avalanche of stories, reports, and releases that I wanted to cover this week. Once again, though, the laws of physics dictate that I limit commentary to the three most interesting pieces I’ve seen in the last seven days. But fear not, because the rest of the good stuff is in the links below.
For those at the back: make sure to tell me what works and what doesn’t, or just drop me a line to say hello at hp464@cam.ac.uk. I love to hear from you all, so get in touch if you have any thoughts about the things I write about.
Three things
1. Policies for a post-AGI world
What happened? A new report from the Adam Smith Institute co-written by Conjecture’s Connor Axiotes makes the case for a number of UK policy recommendations for a world with AGI. These include the introduction of a ‘British Compute Reserve’ for researchers (with a £1 billion initial investment, spread over five years, which "could expand" to £10 billion) as well as the "unobtrusive" monitoring of lab training runs (though I suspect all monitoring would be reasonably intrusive in practice). The report also pushes for the creation of an International Agency for AI (IAAI) with a few different functions, including i) enforcement of monitoring & evaluations agreements, ii) providing technical know-how to countries and labs on the safe deployment of AI, iii) setting safety standards, iv) conducting safety research, and v) managing international emergency preparedness.
What’s interesting? The report also suggests that “any training run with certain high-risk characteristics would require advance approval [...] possibly from the Frontier Model Task Force” and introduces the idea of cash prizes to galvanise work on “open research questions in AI safety, such as ‘how do we stop larger models from hallucinating?’” The final policy that caught my eye was the call for “risk based requirements for API access” to counter misuse. The authors argue that API access to powerful models should only be provided to “verified researchers or firms”, and that “applications should be made highlighting the intended use” by anyone wanting access to a given model.
What else? There are a huge number of policy measures in the report. Some, like those described above, focus on AI, but many others don’t. Amidst ideas for how best to govern the emergence of powerful AI, the report also argues for planning reform to allow more housing development, lowering the 25% corporate tax rate to an “internationally competitive” level (though it’s worth noting the average rate internationally is around 23%), and investing in transport infrastructure. I like the idea of taking a broader approach to AI policy that grapples with its impact on society at large, and I expect to see more reports of this kind as models become more powerful and widely used.
2. Taking stock of international governance
What happened? Matthijs Maas and José Jaime Villalobos from the Legal Priorities Project shared a paper giving a detailed lay of the land for one of the most hotly debated topics in AI policy: international governance. The paper sorts proposals into seven categories, each illustrated with examples of existing international institutions as well as prominent proposals for new, AI-specific ones. Those categories are (deep breath) organisations focused on: scientific consensus-building; political consensus-building and norm-setting; coordination of policy and regulation; enforcement of standards or restrictions; stabilisation and emergency response; international joint research; and the distribution of benefits and access.
What’s interesting? There are a huge number of competing ideas about what the right approach to international governance ought to be. What I appreciate about this piece of work is that it comprehensively takes stock of those proposals and identifies the most promising directions for anyone interested in contributing to the study of global AI governance. These include research into the effectiveness of particular institutional models (to work out which are best suited to governing the risks from AI), analyses of multilateral treaties, the compatibility of different institutional functions, and the various fora that international governance organisations could use.
What else? Amidst the sea of proposals, the gears of the UN have begun to turn as the organisation grapples with its approach to the international governance of AI. For now, the UN has published a call for nominations for a new High-Level Advisory Body to advise it on how to govern AI internationally. So far, more than 1,600 nominations have been received, alongside papers to guide the Advisory Body’s work. The rub, though, is that AI moves much faster than the machinery of the UN. The process will conclude in 2024 when the organisation hosts its Summit of the Future, which will showcase proposals from the new body. While international agreements may well take shape outside of the UN, I suspect that any new independent organisation would seek ‘related organisation’ status like the IMF or IAEA.
3. Mistral’s open source model raises safety concerns
What happened? French AI start-up Mistral, which raised a major seed round in June to help France build domestic capacity for frontier models, released a new model called Mistral 7B. The small model was released under the permissive Apache 2.0 licence, which, unlike more restrictive licences, places few conditions on use beyond preserving the licence text and attribution notices. According to Mistral, the model outperforms Llama 2 13B on all benchmarks and approaches CodeLlama 7B performance on code, while remaining good at English tasks.
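For readers who haven’t worked with open weights before, here is a rough sketch of what that means in practice: anyone can download the published weights and run the model locally, with no provider-side filter sitting between them and the model. The snippet below is illustrative rather than taken from Mistral’s release; it assumes the Hugging Face transformers library and the mistralai/Mistral-7B-v0.1 repository id.

```python
# Illustrative sketch only: load openly released weights and generate text locally.
# Assumes the Hugging Face `transformers` library and the "mistralai/Mistral-7B-v0.1"
# repository id (my assumption, not something specified in this newsletter).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Write a short summary of this week's AI policy news:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Because the weights sit on the user's own machine, any refusal behaviour lives in
# the weights themselves; there is no API-side moderation layer in the loop.
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The contrast with hosted, closed models is that a provider there mediates every request, which is exactly the framing Mistral’s own blogpost pushes back against below.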
What’s interesting? In a blogpost that accompanied the release, Mistral said “With model weights at hand, end-user application developers can customise the guardrails and the editorial tone they desire, instead of depending on the choices and biases of black-box model providers.” As it turns out, ‘editorial tone’ includes the ability to output some of the most unhinged things I’ve seen from a language model. I am not going to post verbatim examples (you can use it yourself here), but suffice it to say that explanations of how to create poisons, commit suicide, buy drugs, or commit a mass shooting are all on the table. I was able to get around some incredibly light attempts to refuse questions by using some of the simplest jailbreaks in the book.
What else? This, I think, is why access will remain the defining question for AI governance. Consider this: even if Mistral had put safeguards in place to prevent 7B from producing these sorts of outputs, providing access to the full weights means that anyone could quite easily remove them. While we don’t want to concentrate power in the hands of a few players, the unfortunate reality is that people will get hurt if powerful models are released on an open source basis. Clearly, Mistral’s model isn’t at the edge of the frontier, but it won’t be too long until many firms can release models better than GPT-4. One possible way forward is to mandate the evaluation of open source models, though that doesn’t prevent safeguards being stripped out once the models have been released. I suspect we will see calls for a new regulatory settlement, one under which some combination of developers, deployers, and possibly even users is made liable for harms caused.
Best of the rest
Friday 29 September
Open-Sourcing Highly Capable Foundation Models (GovAI)
AI chatbots do work of civil servants in productivity trial (BBC)
Oliver Dowden says Britain must work with China over AI (The Times)
Apple to buck layoff trend by hiring UK AI staff (BBC)
Analysis | How AI comes to life through movies (Washington Post)
Thursday 28 September
CIA to build its own ChatGPT-style AI bot for investigations: Report (CoinTelegraph)
The AI rush for the iPhone moment (FT)
Concepts & artifacts for AI-augmented democratic innovation: Process Cards, Run Reports, and more (Reimagining Technology)
UK pushes for greater access to AI's inner workings to assess risks (FT)
AI and science: what 1,600 researchers think (Nature)
Advancing Racial Equity Through Technology Policy (AI Now)
Wednesday 27 September
DALL·E 3 and multimodality as moats, correcting bad moat takes (Interconnects)
Computational Power and AI (AI Now Institute)
Perhaps AI Is Modern Alchemy. And That’s Not a Bad Thing (The Algorithmic Bridge)
Hollywood writers agree to end five-month strike after new studio deal (Sky News)
CIA’s AI director says the new tech is our biggest threat, and resource (Politico)
Tuesday 26 September
We are building AI with godlike power - it is extinction level stuff, says Connor Leahy (Express)
Google DeepMind Alum Raises $14 Million for AI VC Firm (Bloomberg)
OpenAI Seeks New Valuation of Up to $90 Billion in Sale of Existing Shares (WSJ)
Designer Jony Ive and OpenAI’s Sam Altman Discuss AI Hardware Project (The Information)
Responsible Scaling Policies (ARC Evals)
Monday 25 September
Improving AI, Dario Amodei and Anjney Midha (a16z)
AI Safety Summit: introduction (UK Government)
No 10 worried AI could be used to create advanced weapons that escape human control (The Guardian)
Expanding access to safer AI with Amazon (Anthropic)
Rumours about new OpenAI models (Twitter)
GPT-4V(ision) system card (OpenAI)
AI could help terrorists and hackers, but the UK will lead the world in making sure it’s safe (iNews)
Experts disagree over threat posed but artificial intelligence cannot be ignored (The Guardian)