The Week in Examples #15 [2 December]
AI adoption zooms ahead, special-purpose prompting, and techno-optimism
It’s December 2, which means it is acceptable (read: encouraged) to listen to Fairytale of New York by The Pogues. A fitting tribute to the late, great Shane MacGowan.
This week, we have data from UK communications regulator Ofcom suggesting that (perhaps unsurprisingly) young people are disproportionately driving AI adoption, research comparing complex prompting in general purpose models to their specially-tuned counterparts, and a meditation on techno-optimism from Ethereum inventor Vitalik Buterin.
As always, message me at hp464@cam.ac.uk for comments, ideas for next time, or to say hello.
Three things
1. Zoomers drive adoption of large models
What happened? A new study from Ofcom found that around four in five (79%) teenagers aged 13-17 now use generative AI, which it defines as “algorithms that can create new content in response to a prompt, including text, images, video and code outputs.” Young people are most likely to use Snapchat’s My AI, which became freely available to all Snap users in April 2023 and is used by about half (51%) of 7–17-year-olds in the UK. I haven’t spent much time with My AI myself (perhaps I am no longer at the cutting edge of technology adoption) but I have come across reports about the platform engaging young people in conversations about sex, drugs, and self-harm. This particular example aside, the broader point is that approaches that reject what some see as overly paternalistic interpretations of safety might work for adults—but run into trouble where minors are concerned.
What’s interesting? And what about everyone else? Well, for those aged 16+ (sidenote: the report compares 13–17-year-olds with everyone aged 16+, so there’s some overlap between the groups), ChatGPT is the most widely used generative AI application at 23%. It was followed by Snapchat’s My AI (15%), Bing Chat (11%), Google Bard (9%), and Midjourney (9%). While the usage stats are pretty strong, it’s also worth noting that 69% of internet users said they had never used a generative AI tool, or didn’t know whether they had.
What else? The report also shows that usage is sharply divided along gender lines. Amongst those aged 7–17, boys are keener users of ChatGPT than girls (34% versus 14%), while men aged 16+ in the UK are more likely than women to say they have used a generative AI tool (39% vs 24%). I don’t have any strong hypotheses as to why this is the case—it’s probably a mix of industry representation, the prevalence of usage in certain careers, and a whole slew of social factors—though some studies suggest the same dynamic applies to other technologies as well. Given how central AI is likely to become to education, work, and entertainment, this gap is probably something that ought to be addressed sooner rather than later.
2. Generalist models ace niche tests
What happened? In an experiment to determine whether generalist foundation models like GPT-4 can match or outperform specialist models on medical challenge problems without specialised training or tuning, Microsoft found that, with systematic prompt engineering, GPT-4 topped existing benchmarks on standard medical question answering datasets. They introduced Medprompt, a prompting strategy combining dynamic few-shot selection and self-generated chain of thought, to enable GPT-4 to surpass state-of-the-art results across a number of benchmarks. On the influential MedQA exam benchmark, for example, Medprompt achieved a 27% reduction in error rate over the best prior method, surpassing 90% accuracy for the first time.
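The paper doesn’t ship the pipeline in this form, but the core of the Medprompt recipe (retrieving the training questions most similar to the test question and prepending them, along with their model-generated reasoning chains, to the prompt) can be sketched roughly as follows. The function names and the crude word-overlap similarity are my own placeholders: the real method uses embedding-based kNN retrieval and chains of thought generated by GPT-4 itself.

```python
def jaccard(a: str, b: str) -> float:
    """Crude lexical similarity as a stand-in for embedding distance."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def select_few_shot(question: str, train_set: list, k: int = 2) -> list:
    """Dynamic few-shot selection: pick the k training examples
    whose questions are most similar to the test question."""
    return sorted(train_set, key=lambda ex: jaccard(question, ex["q"]), reverse=True)[:k]

def build_prompt(question: str, train_set: list, k: int = 2) -> str:
    """Assemble the prompt: nearest exemplars with their (self-generated)
    chains of thought, followed by the query to be answered."""
    shots = select_few_shot(question, train_set, k)
    parts = [f"Q: {ex['q']}\nReasoning: {ex['cot']}\nA: {ex['a']}" for ex in shots]
    parts.append(f"Q: {question}\nReasoning:")
    return "\n\n".join(parts)

# Toy training pool; in Medprompt the "cot" fields are written by the model.
train = [
    {"q": "Which drug treats bacterial pneumonia?",
     "cot": "Bacterial infection, so an antibiotic is indicated.",
     "a": "Amoxicillin"},
    {"q": "Which vitamin deficiency causes scurvy?",
     "cot": "Scurvy results from a lack of vitamin C.",
     "a": "Vitamin C"},
]
prompt = build_prompt("Which drug is first-line for community-acquired pneumonia?", train, k=1)
```

With `k=1`, the pneumonia exemplar (not the scurvy one) is retrieved and prepended, which is the whole trick: the few-shot context adapts to each test question instead of being fixed.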
What’s interesting? The results show that good prompting can go a long way. There’s an interesting asymmetry between the engineering effort required to develop complex prompts and the compute and data required to train specialist models, though it’s probably too early to say just how good (and reliable) generalist models can be across a large number of subdomains. If clever prompting strategies can unlock most of the capabilities needed for a domain without requiring extra parameters or training data, it may often be more efficient to prompt general models. More research is needed to determine where that threshold lies for different applications, but I’m interested to see where exactly the ceiling is for these sorts of approaches.
What else? Microsoft has been busy recently. The results come as separate healthcare-focused research from the company shows that GPT-4 demonstrates strong performance on radiology tasks like disease classification and findings summarisation (on the latter, GPT-4’s summaries were preferred over those written by radiologists in some instances). They write that GPT-4 achieved new state-of-the-art results on some tasks, with an absolute improvement of about 10% over existing models. Beyond performance metrics, the group also said that GPT-4 shows promise in automatically structuring complex radiology reports to improve interpretation and in translating findings into more understandable formats for patients.
3. Hope springs eternal for techno-optimists
What happened? Ethereum inventor Vitalik Buterin added another perspective on techno-optimism to an already lively discourse. The piece, which comes after Marc Andreessen’s e/acc-flavoured techno-optimist manifesto, makes the case that while technological progress has brought massive benefits to human life expectancy, standards of living, and more, AI may well be a special case whose development must proceed in conjunction with efforts to safeguard against existential threats. Essentially, he rejects both pure techno-utopianism and “arguments expressing scepticism about progress” to advocate for the intentional development of technologies that decentralise power and make the world more resilient.
What’s interesting? Because philosophy is meaningless without an accompanying acronym, Buterin introduces the concept of d/acc, which stands for defensive, decentralised or differential acceleration. According to the post, d/acc is about favouring the development of technologies that make the world more resilient and decentralised over those that centralise power and enable offence. Examples of such technologies include blockchains and cryptography for financial and informational security, as well as technologies that provide physical defence against threats like pandemics.
What else? It’s been a big week for the techno-optimists. Hot on the heels of Buterin’s intervention, researchers Nora Belrose and Quintin Pope released a new contribution from AI Optimism, a “movement” founded earlier this year opposing what the pair describe as the centralisation of AI research, compulsory use of AI, and attempts to halt or pause technological progress. The post, titled ‘AI is easy to control’, makes the case that the “white box” nature of AI models, in contrast to the “black box” of human cognition, enables precise and effective optimisation and “control methods” that don’t work for people. They argue that these methods, coupled with the inherent simplicity and pervasiveness of human values in training data, mean that AI will internalise society’s values by default. I don’t really buy that AI is a ‘white box’ given that “read and write access to their internals” is not the same as actually understanding, for example, which constellations of neurons are associated with which pieces of information. That being said, I admit I did enjoy the sheer iconoclasm of the post.
Best of the rest
Friday 1 December
Model alignment protects against accidental harms, not intentional ones (Substack)
Professor Tom Crick joins DCMS as Chief Scientific Adviser (UK Gov)
The Inside Story of Microsoft’s Partnership with OpenAI (New Yorker)
Cool new AI drawing tool (Drawfast)
Meta Says There’s Been No Downside to Sharing AI Technology (Bloomberg)
Thursday 30 November
Boost for UK AI as Microsoft unveils £2.5 billion investment (UK Gov)
Behind China’s Plans to Build AI for the World (POLITICO)
Enabling Data Access through Privacy Preserving Synthetic Data (Data Science Campus)
Seamless Communication (Meta AI)
How much water does AI consume? The public deserves to know it (OECD)
Wednesday 29 November
Observe, inspect, modify: Three conditions for generative AI governance (New Media & Society – Open Access)
Technology Ties: the Rise and Roles of Military AI Strategic Partnerships (SSRN)
Supporting the next generation of emerging technology policy talent (Emerging Tech Policy Careers)
Klobuchar: OpenAI chaos signals need for regulation (Axios)
AI Act: Spanish presidency makes last mediation attempt on foundation models (EURACTIV)
Tuesday 28 November
The EU AI Act Newsletter #41: Big Tech Lobby Against the Act (Substack)
AI helps out time-strapped teachers, says report (BBC)
Introducing Pika 1.0, the idea-to-video platform (Pika Labs)
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation (arXiv)
US, Britain, other countries ink agreement to make AI 'secure by design' (Reuters)
Monday 27 November
Who is leading in AI? An analysis of industry AI research (Epoch)
GPT-4’s potential in shaping the future of radiology (Microsoft)
California Will Temper AI Policy by Studying Industry Impact (Bloomberg Law)
DC's hottest new job: Chief AI officer (Axios)
Vladimir Putin plans AI boost in Russia to fight 'unacceptable and dangerous' Western tech monopoly (Euronews)