Seoul summit, transparency, information security [TWIE]
The Week In Examples #37 | 25 May 2024
Same content, new format. Three things that matter for AI and society (loosely defined as a combination of policy, governance, applications, and research) plus the usual links and jobs. Like the post or email me at hp464@cam.ac.uk to let me know whether to keep this style or regress to the mean.
Three things
1. The AI Safety Seoul Summit
Despite what you might have heard, AI safety is alive and kicking. This week’s AI Summit in Seoul reminded us that extreme risks—things like AI-powered cyber and biological threats or autonomous replication—are still on policymakers’ agenda. In that spirit, the UK government announced that “new commitments to develop AI safely have been agreed with 16 AI tech companies” at the AI Seoul Summit hosted in partnership with South Korea.
This group of AI developers includes OpenAI, Google DeepMind, Meta, Mistral, and Anthropic. Each agreed to assess and manage risks when developing and deploying frontier AI models, to accept accountability for this process, and to support safety-focused transparency requirements to show that they mean business.
The news coincided with commentary from the signatories, with OpenAI emphasising existing work on areas like election integrity and Google DeepMind foregrounding the recent release of a new Frontier Safety Framework. Anthropic also shared reflections on its Responsible Scaling Policy “to continue the discussion on creating thoughtful, empirically-grounded frameworks for managing risks from frontier models.”
Now, none of the Seoul commitments are binding. Some reckon that more concrete measures are needed. My own view, though, is that the direction of travel is positive. Incremental progress is not only good; it is also sustainable. It reminds me of the famous paradox at the heart of Giuseppe Tomasi di Lampedusa’s 1958 novel The Leopard: “Everything must change so that everything can stay the same.” Not everything is changing at the frontier of AI safety, so perhaps not much will stay the same.
2. Transparency index looks through popular models
Researchers at Stanford, MIT, and Princeton released an interim update to the Foundation Model Transparency Index (FMTI). The initial report, released in October 2023 to assess the openness of developers, described what the researchers characterised as “pervasive opacity” across the AI industry.
The original work used 100 indicators to assess the transparency of popular models, covering ‘upstream’ resources like data and compute, ‘model-level’ factors such as capabilities, risks, and limitations, and ‘downstream’ practices related to distribution and societal impact. The group applied these indicators to ten model developers, finding that Meta scored the highest (54 out of 100) while Amazon scored the lowest (12 out of 100). The mean score across developers in the October 2023 version was 37 out of 100.
Since then, the group has examined an additional set of developers using the same 100 indicators. They also asked model makers to submit transparency reports as part of the updated edition (unlike the previous version, which relied on publicly available information). This time around the mean score for developers was 58 out of 100, which the researchers argue still indicates “substantial room for improvement”.
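To make the arithmetic concrete, here is a minimal sketch of how an FMTI-style score might be computed, assuming each indicator is simply marked satisfied or not and grouped into the index’s three domains. The developer labels, indicator names, and values below are invented for illustration and are not taken from the report.

```python
# Hypothetical sketch of FMTI-style scoring: each developer is marked against
# binary indicators grouped into upstream, model-level, and downstream domains.
# All names and values are illustrative, not drawn from the actual index.
from statistics import mean

# developer -> domain -> {indicator: 1 if satisfied, 0 if not}
scores = {
    "Developer A": {
        "upstream":   {"data sources disclosed": 1, "compute disclosed": 0},
        "model":      {"capabilities reported": 1, "risks reported": 1},
        "downstream": {"usage policy published": 1, "impact reporting": 0},
    },
    "Developer B": {
        "upstream":   {"data sources disclosed": 0, "compute disclosed": 0},
        "model":      {"capabilities reported": 1, "risks reported": 0},
        "downstream": {"usage policy published": 1, "impact reporting": 0},
    },
}

def developer_score(domains: dict) -> float:
    """Share of indicators satisfied, scaled to a 0-100 score."""
    indicators = [v for domain in domains.values() for v in domain.values()]
    return 100 * sum(indicators) / len(indicators)

per_developer = {dev: round(developer_score(d), 1) for dev, d in scores.items()}
print(per_developer)                              # {'Developer A': 66.7, 'Developer B': 33.3}
print(f"mean: {mean(per_developer.values()):.1f}")  # mean: 50.0
```

Because every indicator carries equal weight, a mean of 58 out of 100 simply means the average developer satisfied 58 of the 100 indicators.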
Transparency is widely treated as an unalloyed good in science and technology. Commonly cited benefits of transparent approaches include boosting accountability, fostering innovation, building public trust, and facilitating effective governance and regulation. But transparency, at least for AI, has a cost.
We know that open-source models can be easily stripped of safety guardrails, which means that, should models maintain their current rate of improvement, a decision to promote transparency may eventually enable bad actors to use powerful AI to do things like orchestrate sophisticated cyber attacks or conduct successful phishing campaigns. Open-source models are unlikely to pose a greater risk today than their closed-source counterparts, but this dynamic probably won’t hold over the long term.
3. Information access a risky business
Microsoft Research and AI assurance outfit PaperMoon released work centred on the “systemic consequences and risks of employing generative AI in the context of information access.” The research sets out AI’s impacts on the information environment, the mechanisms underpinning those impacts, and the risks that follow from them.
The authors begin with the disruption of the information ecosystem, which concerns “how content is produced, consumed, monetized, and manipulated towards specific ends.” The mechanisms driving this change are things like AI’s ability to produce reams of low quality content at scale, errors when AI is used to summarise information, difficulties in applying content moderation practices to LLMs, failures to adequately source information, and instances in which malicious third parties manipulate AI platforms. According to the authors, these mechanisms may degrade the strength of the polity, introduce downward pressure on health and wellbeing, and widen global inequity.
In response, the group recommends a greater focus on post-deployment evaluations of AI to study how potential risks manifest in the real world. It is one thing to speculate on possible failure modes, and another to document them when they occur. The focus on the so-called ‘systemic impact’ of AI is having a bit of a moment, with the UK AI Safety Institute announcing a new grants programme to manage societal-level impacts of frontier models earlier this week. Given that work from October 2023 put societal-level assessments at about 5% of the total number of AI evaluations, I expect both the relative and absolute number of systemic impact tests to grow in the coming months and years. When you are at the bottom, the only way is up.
Best of the rest
Friday 24 May
Staffing questions swirl around Commission's AI Office (Euractiv)
China's PC makers are experiencing an AI revival (FT)
A vision for the AI Office: Rethinking digital governance in the EU (Euractiv)
Exclusive: Microsoft's UAE deal could transfer key U.S. chips and AI technology abroad (Reuters)
Google AI search tool produces wildly misleading, sometimes harmful answers (Quartz)
Thursday 23 May
Lessons from the Trenches on Reproducible Evaluation of Language Models (arXiv)
Scarlett Johansson's AI row has echoes of Silicon Valley's bad old days (BBC)
ASML and TSMC can disable chip machines if China invades Taiwan (Yahoo)
Alibaba bets on AI to fuel cloud growth as it expands globally to catch up with U.S. tech giants (CNBC)
White House pushes tech industry to shut down market for sexually abusive AI deepfakes (The Independent)
'People are just not worried about being scammed' (BBC)
Wednesday 22 May
Managing extreme AI risks amid rapid progress (Science)
Tech Secretary unveils £8.5 million research funding set to break new grounds in AI safety testing (UK Gov)
The future of foundation models is closed-source (Substack)
OpenAI and Wall Street Journal owner News Corp sign content deal (The Guardian)
Nvidia’s revenue soars 262% on record AI chip demand (FT)
Tuesday 21 May
Mapping the Mind of a Large Language Model (Anthropic)
Global leaders agree to launch first international network of AI Safety Institutes to boost cooperation of AI (UK Gov)
GB Cloud: Building the UK’s Compute Capacity (UK Day One)
A New National Purpose: Harnessing Data for Health (TBI)
Government's trailblazing Institute for AI Safety to open doors in San Francisco (UK Gov)
DHS official: AI could exacerbate chemical and biological threats (FedScoop)
Amazon and Meta join the Frontier Model Forum to promote AI safety (Frontier Model Forum)
Monday 20 May
Fourth progress report (AISI)
Jill Watson: A Virtual Teaching Assistant powered by ChatGPT (arXiv)
Governing in the Age of AI: A New Model to Transform the State (TBI)
A review on the use of large language models as virtual tutors (arXiv)
OpenAI safety update (OpenAI)
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub (arXiv)
Introducing def/acc at EF (Entrepreneur First)
How the voices for ChatGPT were chosen (OpenAI)
Looking ahead to the AI Seoul Summit (Google DeepMind)
Safety first? (Ada Lovelace Institute)
Historic first as companies spanning North America, Asia, Europe and Middle East agree safety commitments on development of AI (UK Gov)
Reflections on our Responsible Scaling Policy (Anthropic)
Job picks
Some of the interesting (mostly) non-technical AI roles that I’ve seen advertised in the last week. As usual, it only includes new positions that have been posted since the last TWIE (but lots of the jobs from the previous edition are still open).
Policy Manager, AI Governance, Credo AI (EU)
Operational Ethics and Safety Manager, Google DeepMind (London)
Generalist, Responsible Scaling Team, Anthropic (London)