Biorisk overpriced, ‘leaked’ Mistral model, and cognitive bias [TWIE]
The Week in Examples #24 | 3 Feb 2024
After lots of you said that you would like me to add some job openings in what I am loosely calling the AI policy, governance, ethics, and operations universe, I’ve added some roles in a new section below. If you have any thoughts about the jobs portion of the newsletter—or anything else—let me know at hp464@cam.ac.uk.
Today, we have the first results from OpenAI’s Preparedness group, the ‘leak’ of a high-performing model from Parisian start-up Mistral, and a new paper arguing that LLMs “exhibit human-like biases” in problem solving tasks. Vamos!
Three things
1. Internet as risky for creating bioweapons as GPT-4, says OpenAI
What happened? OpenAI released the first report from its Preparedness team, which the company announced in October last year. The research, which follows the release of a Preparedness Framework in December (TWIE #18), aimed to test the extent to which today’s most capable models are likely to help with producing toxins, pathogens, or other biological agents, a concern which has become increasingly commonplace as models get more sophisticated.
What's interesting? The report outlines the results of a test designed to measure whether models could “increase malicious actors’ access to dangerous information about biological threat creation, compared to the baseline of existing resources (i.e., the internet).” To do that, the team enrolled 50 biology PhDs and 50 students, randomly assigned them to a control group with internet access or a group that could use GPT-4 and the internet, then tested for a combination of accuracy, completeness, innovation, time taken, and self-rated difficulty for tasks related to biological threat creation. In a result that tallied with the outcome of recent biorisk work conducted by RAND, they found that the effect of GPT-4 was “not large enough to be statistically significant.”
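To give a flavour of the analysis, here is a minimal sketch of the kind of two-group comparison involved, assuming per-participant scores and a simple two-sample t-test. The numbers are invented and OpenAI’s actual scoring rubric and statistical procedure are more involved, so treat this purely as an illustration of the study design.

```python
# Hypothetical sketch: comparing task scores between the internet-only
# control group and the GPT-4 + internet group. Scores are simulated;
# OpenAI's study uses its own rubric and statistical methodology.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated accuracy scores (0-10) for 50 participants in each arm
control = rng.normal(loc=5.0, scale=1.5, size=50)    # internet only
treatment = rng.normal(loc=5.3, scale=1.5, size=50)  # GPT-4 + internet

# Welch's two-sample t-test on the difference in mean scores
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)

print(f"mean uplift: {treatment.mean() - control.mean():.2f}")
print(f"p-value:     {p_value:.3f}")  # > 0.05 -> not statistically significant
```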
What else? I’ve seen a few reactions to this work that basically amounted to “yes, of course GPT-4 can’t help create biological weapons—why are we talking about this?” Well, while I understand some of the frustrations about what can sometimes feel like a negative discourse, it is a fact that model capabilities are increasing dramatically. That increase expands the potential for benefit but also for harm, especially considering we have at best only a rough idea about a model’s threat potential at the time of its release. As a result, it makes sense to conduct risk assessments for the absolute worst case scenario right now and—as the report makes clear—lay the foundations for testing future models that are more likely to aid bad actors.
2. Mistral makes mystery model
What happened? French AI start-up Mistral confirmed that a new mystery model known as Miqu, which had been performing impressively on the EQ-Bench v2 benchmark, was leaked by an “over-enthusiastic employee of one of [its] early access customers.” CEO Arthur Mensch said that Miqu was a ‘quantised’ (i.e. compressed to a lower numerical precision) version of an old model, which was retrained “from Llama 2 the minute we got access to our entire cluster.”
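For the unfamiliar, quantisation just means storing a model’s weights at lower numerical precision so the file (and memory footprint) shrinks. Here is a minimal, illustrative sketch of the core idea; it has nothing to do with Mistral’s actual pipeline, and real quantisation schemes (GPTQ, AWQ, and so on) are considerably more sophisticated.

```python
# Illustrative 8-bit symmetric quantisation of a single weight matrix.
import numpy as np

weights = np.random.randn(4096, 4096).astype(np.float32)  # fp32 weights

scale = np.abs(weights).max() / 127.0                # map range onto int8
q_weights = np.round(weights / scale).astype(np.int8)

dequantised = q_weights.astype(np.float32) * scale   # approximate original

print(f"fp32 size: {weights.nbytes / 1e6:.0f} MB")
print(f"int8 size: {q_weights.nbytes / 1e6:.0f} MB")  # ~4x smaller
print(f"max error: {np.abs(weights - dequantised).max():.4f}")
```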
What’s interesting? The model is a pretty high performer, though it’s not all that clear just how good it actually is. These results come from EQ-Bench, which aims to test emotional intelligence and isn’t considered to be quite as influential as the LMSYS Chatbot Arena Leaderboard. On that leaderboard, which Miqu hasn’t yet been added to, GPT-4 reigns supreme, followed by Google’s Gemini Pro. Uncertainty about performance notwithstanding, the fact that Mistral was able to train a strong version of Llama 2 (as measured by EQ-Bench, which does correlate with the popular MMLU benchmark) bodes well for the company’s next release.
What else? Though no doubt a little frustrating for Mistral, this doesn’t fit the profile of a leak in the traditional sense. What we’re talking about is A) a developer providing a third party with the full weights to a model, and B) that party deciding to share them online. For highly permissive access regimes, this is (rightly or wrongly) the sort of thing I would expect, which is probably why Mistral seemed broadly fine with it. I should also say that this wasn't the only big open source news this week, with the Allen Institute releasing a “truly open” model, OLMo, complete with weights, code, training data, and detailed evaluation materials. With two very different motivations (“scientific progress” for Allen vs. “community benefit” for Mistral), the releases are a good reminder of the broad nature of the coalition getting behind the open-source AI movement.
3. Large models are childish problem-solvers
What happened? Researchers from institutions including ETH Zürich and the Max Planck Institute for Intelligent Systems released a new paper comparing the biases LLMs exhibit when solving arithmetic word problems with those known to affect children. The authors argued that, for both humans and AI, the problem-solving process can be split into three distinct steps: text comprehension (understanding the text laying out a maths problem), solution planning (determining the best action to take to reach the correct answer) and solution execution (correctly putting that action into practice).
What's interesting? The basic idea is that “an LLM that models the problem-solving process of children should also make similar mistakes as children, i.e., it should mimic the cognitive biases that are salient in children during problem solving.” In terms of how these biases manifested, the researchers found that the models tested (Mistral 7B and Llama 2) mirrored children’s biases in the text comprehension and solution planning stages, but not in the final execution stage.
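To make that concrete, here is a hypothetical sketch of how one might probe for a comprehension-stage bias, using the classic “consistency” effect (children find word problems harder when the relational keyword, e.g. “fewer”, points in the opposite direction to the required operation). The prompts, the query_model() helper, and the scoring are invented for illustration rather than taken from the paper.

```python
# Hypothetical probe for a comprehension-stage bias: compare accuracy on
# "consistent" vs "inconsistent" arithmetic word problems. The problems and
# query_model() are placeholders, not the paper's actual setup.
problems = [
    # (prompt, correct answer, condition)
    ("Anna has 5 apples. Ben has 3 more apples than Anna. "
     "How many apples does Ben have?", 8, "consistent"),
    ("Anna has 5 apples. Anna has 3 fewer apples than Ben. "
     "How many apples does Ben have?", 8, "inconsistent"),
]

def query_model(prompt: str) -> int:
    """Placeholder: send the prompt to an LLM and parse an integer answer."""
    raise NotImplementedError

def accuracy_by_condition(problems):
    scores = {"consistent": [], "inconsistent": []}
    for prompt, answer, condition in problems:
        scores[condition].append(int(query_model(prompt) == answer))
    return {c: sum(v) / len(v) for c, v in scores.items() if v}

# A human-like comprehension bias would show up as lower accuracy on the
# "inconsistent" problems than on the "consistent" ones.
```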
What else? This work gets to grips with one of the oldest debates there is: whether intelligence is learned or innate. It goes all the way back to Kant and Locke, Helmholtz and Hering, and Chomsky and Piaget (the latter of whom the authors cite in the paper). More recently, in AI, the debate has organised around those who think that artificial neural networks have the potential to extract symbolic representations automatically and those who think that new architectures are needed. This may sound a bit academic, but it really does matter. Extremely capable AI gets a whole lot more tractable if symbols can be automatically learned by big models. While this research doesn’t show that LLMs can handle every stage of problem-solving today, it does suggest that they are capable of simulating a decent chunk of the reasoning process.
Best of the rest
Friday 2 February
The Children’s Commissioner’s view on artificial intelligence (UK Gov)
China tech is running to stand still in AI race (Reuters)
House of Lords Report on Generative AI (UK Gov)
Is democratizing AI a bad idea? (Vox)
Grounded language acquisition through the eyes and ears of a single child (Science)
EU countries give crucial nod to first-of-a-kind Artificial Intelligence law (EURACTIV)
Thursday 1 February
International Scientific Report on Advanced AI Safety: expert advisory panel and principles and procedures (UK Gov)
The State of State of AI Report (Air Street Capital >> summary on X)
Safeguarded AI (ARIA)
I Tested a Next-Gen AI Assistant. It Will Blow You Away (WIRED)
Nathan Labenz on recent AI breakthroughs and navigating the growing rift between AI safety and accelerationist camps (80,000hrs)
Wednesday 31 January
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? (arXiv)
Global-Liar: Factuality of LLMs over Time and Geographic Regions (arXiv)
More than half of UK undergraduates say they use AI to help with essays (The Guardian)
Tyler Cowen - Hayek, Keynes, & Smith on AI, Animal Spirits, Anarchy, & Growth (Dwarkesh Podcast)
Hawking was wrong: Philosophy is not dead, and it has kept up with modern science (Dan Williams >> Substack)
AI Grants batch 3 (AI Grants)
Tuesday 30 January
Circuits Updates - January 2024 (Anthropic)
ChatGPT is leaking passwords from private conversations of its users, Ars reader says (Ars Technica)
The Rise of Techno-authoritarianism (The Atlantic)
FCC moves to outlaw AI-generated robocalls (TechCrunch)
Pentagon’s new bug bounty seeks to find bias in AI systems (Nextgov/FCW)
US receives thousands of reports of AI-generated child abuse content in growing risk (Reuters)
Monday 29 January
Black-Box Access is Insufficient for Rigorous AI Audits (arXiv)
Taking Additional Steps To Address the National Emergency With Respect to Significant Malicious Cyber-Enabled Activities (US Gov >> analysis on X)
Fact Sheet: Biden-Harris Administration Announces Key AI Actions Following President Biden’s Landmark Executive Order (US Gov)
Meta’s free Code Llama AI programming tool closes the gap with GPT-4 (The Verge)
OpenAI is working on AI education and safety initiative with Common Sense Media (CNBC)
China approves over 40 AI models for public use in past six months (Reuters)
Job picks
These are some of the jobs that I’ve come across in the last week that I thought looked like fun. It’s also worth saying there are various open roles at the UK’s AI Safety Institute, though the majority of these are technical and my ambition for this section is to keep (mostly) focused on non-technical jobs in AI.
If you have an AI job (primarily non-technical but, as above, I expect there will be a few exceptions) that you think I should advertise in this section in the future, just let me know and I’d be happy to include it.
Public Policy Manager (12 month FTC), Google DeepMind, UK
Summer Research Fellowship in Law & AI 2024, Legal Priorities Project, Remote
Policy Communications Lead, Corporate Communications, Anthropic, US
Public Policy Analyst, Product, Anthropic, US
Media Relations, Europe Lead, OpenAI, UK
Program Manager - Economic Impacts, Policy Research, OpenAI, US
Senior Reporter, Artificial Intelligence, The Verge, Remote/US
Manager, AI Policy, Government Affairs and Public Policy, Google, US
Project Manager, Artificial Intelligence, French Government, France