Cultural awareness, usage, cheating [TWIE]
The Week In Examples #38 | 1 June 2024
To kick things off this week I want to direct you fine people to the ‘AI policy atlas’, a new resource written by Conor Griffin at Google DeepMind. I am biased for obvious reasons, but I think it’s an excellent effort to get to grips with the huge number of AI policy and responsibility issues, trends, and topics. More than that, it's a very cool template that anyone can follow if they want to make their own map of the AI policy universe.
As for me, this week’s edition has new work assessing the extent to which large models are ‘culturally aware’, a landmark study on public usage of AI, and a small survey about what teachers think about students using generative AI. And, of course, carry on emailing me at hp464@cam.ac.uk with things to include, comments, or anything else!
Three things
1. Cultural awareness is all you need
Researchers from Stanford University and Amazon studied how successful large models were at recognising culturally significant icons and artefacts. To do that, they took four popular vision-language models (GPT-4V, Gemini Pro Vision, LLaVA, and OpenFlamingo) and presented them with images drawn from mythology, folklore, and the contemporary world.
These included things like Egyptian hieroglyphs, the command key on Apple Macintosh computers, an image of the Indian classical dance style Kathakali, and—as seen in the above image—the Japanese Omamori and the ancient ouroboros symbol. For around 1,500 such examples, the group asked the models to describe the picture in question and assessed each response against a “ground truth” description to assign each model a cultural awareness score.
They found that Gemini Pro scored highest with 35 out of 100, while GPT-4V followed with 27; LLaVA and OpenFlamingo both scored under 15. Finally, the group also released a labelled dataset, MOSAIC-1.5k, that others can use to evaluate captions generated by large models.
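To make the setup concrete, here is a minimal sketch of what a caption-versus-ground-truth scoring loop of this kind could look like. It assumes a simple keyword-overlap metric and an illustrative data layout; the paper's actual scoring method and the MOSAIC-1.5k schema will differ, and the field names below are made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Example:
    image_id: str
    ground_truth_labels: set[str]  # e.g. {"omamori", "amulet"} -- illustrative only

def awareness_score(caption: str, example: Example) -> float:
    """Fraction of ground-truth labels mentioned in the model's caption."""
    caption_lower = caption.lower()
    hits = sum(1 for label in example.ground_truth_labels if label in caption_lower)
    return hits / len(example.ground_truth_labels)

def evaluate(captions: dict[str, str], dataset: list[Example]) -> float:
    """Average per-image score, scaled to 0-100 like the reported results."""
    scores = [awareness_score(captions[ex.image_id], ex) for ex in dataset]
    return 100 * sum(scores) / len(scores)

# Toy usage with made-up data:
dataset = [Example("img_001", {"omamori", "amulet"})]
captions = {"img_001": "A small Japanese omamori charm sold at a shrine."}
print(evaluate(captions, dataset))  # 50.0 -- only one of the two labels matched
```

In practice the published benchmark will use richer ground-truth annotations than a bag of keywords, but the basic shape—generate a caption per image, compare it to a reference, average the per-image scores—is the same.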
2. ChatGPT (still) leads generative AI usage
A new wide-ranging report from the Reuters Institute for the Study of Journalism at the University of Oxford assessed how people use generative AI, what they think about its application in journalism, and what they make of its use in other areas of work and life across six countries: Argentina, Denmark, France, Japan, the UK, and the USA. The online questionnaire, which was fielded between 28 March and 30 April 2024 and included over 2,000 participants per country, found that:
On average, over half of respondents (54%) had heard of ChatGPT. Google’s Gemini platform, Snapchat’s My AI, and Microsoft’s Copilot followed (though each of these was much less well known than OpenAI’s product).
10.5% of respondents were using ChatGPT either weekly (7.5%) or daily (3%), though this figure jumps to 27% for young people aged between 18 and 24. Roughly a quarter (27%) said they used generative AI in their private life, while a fifth (21%) used it at work or school.
People who have used generative AI to get information are much more likely to trust than distrust the outputs for tasks like answering factual questions, generating ideas, and getting the latest news. For creative tasks like writing an email or making an image, people were also more likely to say generative AI performed well rather than poorly.
The full report is worth reading if you’re interested in how people are actually using generative AI on a day-to-day basis. But it is worth saying that—journalistic impact aside—this research mainly corroborates earlier work: young people are the early adopters and biggest supporters, ChatGPT is the most widely used platform, and the public think that AI will be a boon for science and healthcare while negatively impacting job security and the cost of living.
3. Teachers learn to live with generative AI
A new paper from the Department of Information Technology at Sweden’s Uppsala University looked at teaching staff’s perceptions of the prevalence of student cheating and the impact of generative AI on academic integrity. The work, which is based on an anonymous survey of 32 teachers at the institution, found that teachers believe student use of generative AI is widespread within the university.
But it’s not all that clear whether they think this is a good or bad thing, with only five of those surveyed saying that they definitely viewed using the tools as cheating. As the paper explains, “opinions on whether the use of Generative AI constitutes cheating tend to lean slightly below neutral, suggesting that many do not view it as outright cheating.” This is obviously a very small survey whose results we should handle with care, but it does provide a window into how one group of teachers is handling AI in the real world (an approach I prefer to odd questions about views on ‘robot teachers’).
Given that AI detectors are notoriously unreliable (see here and here), teachers have no way of knowing for sure whether their students are using AI. Even education-focused offerings from developers can be easily circumvented by students using AI at home. While such tools may eventually improve enough to support assignments where AI use is encouraged alongside others where it is prohibited, in the immediate term teachers can either a) stop assigning written coursework or b) accept that AI may be used. If you want to read about AI and education from someone who knows a lot more than I do, you should subscribe to the excellent Educating AI newsletter from Nick Potkalitsky.
Best of the rest
Friday 31 May
What happened with AI Overviews and next steps (Google)
Securing AI Model Weights: Preventing Theft and Misuse of Frontier Models (RAND)
The WIRED AI Elections Project (WIRED)
OpenAI Is Rebooting Its Robotics Team (Forbes)
Model AI Governance Framework for Generative AI - Fostering a Trusted Ecosystem (Singapore Gov)
Inside Anthropic, the AI Company Betting That Safety Can Be a Winning Strategy (TIME)
France is aiming to become a global AI superpower — but not without help from U.S. Big Tech (NBC)
Thursday 30 May
AI: Atomizing the Family (Aestora)
Disrupting deceptive uses of AI by covert influence operations (OpenAI)
The ethical situation of DALL-E 2 (arXiv)
Internal divisions linger at OpenAI after November's attempted coup (FT)
The Future of Child Development in the AI Era. Cross-Disciplinary Perspectives Between AI and Child Development Experts (arXiv)
Gemini & Physical World: Large Language Models Can Estimate the Intensity of Earthquake Shaking from Multi-Modal Social Media Posts (arXiv)
Wednesday 29 May
Scale launches SEAL Leaderboards (Scale AI)
Science in the Age of AI (Royal Society)
Greening AI: A Policy Agenda for the Artificial Intelligence and Energy Revolutions (TBI)
The first year of Apollo Research (Apollo Research)
WAN-IFRA and OpenAI Launch Global AI Accelerator for Newsrooms (WAN-IFRA)
Exploring the Impact of ChatGPT on Wikipedia Engagement (arXiv)
Vox Media and OpenAI Form Strategic Content and Product Partnership (Vox)
We aren’t running out of training data, we are running out of open training data (Interconnects >> Substack)
The Atlantic announces product and content partnership with OpenAI (The Atlantic)
Are Large Language Models Chameleons? (arXiv)
Commission establishes AI Office to strengthen EU leadership in safe and trustworthy Artificial Intelligence (EU Commission)
Top EU data regulator says tech giants working closely on AI compliance (Reuters)
Tuesday 28 May
ChatGPT as the Marketplace of Ideas: Should Truth-Seeking Be the Goal of AI Content Governance? (arXiv)
The AI Act compliance deadline: harnessing evaluations for innovation and accountability (Euractiv)
OpenAI Board Forms Safety and Security Committee (OpenAI)
Training compute of frontier AI models grows by 4-5x per year (Epoch AI)
Divergent Creativity in Humans and Large Language Models (ResearchGate)
AI Is Making Economists Rethink the Story of Automation (Harvard Business Review)
Mistral AI, France’s Startup Darling, Takes Aim at the US Market (Bloomberg)
Monday 27 May
AI, Bioweapons, and Corporate Charters: The Case for Delaware Revoking OpenAI’s Charter (Jolt)
AI firms mustn’t govern themselves, say ex-members of OpenAI’s board (The Economist)
Y Combinator’s Garry Tan supports some AI regulation but warns against AI monopolies (TechCrunch)
Elon Musk’s xAI Valued at $24 Billion After Latest Fundraising Round (WSJ)
Behind the Curtain: AI's ominous scarcity crisis (Axios)
Trust in AI is more than a moral problem (VentureBeat)
Scarlett Johansson’s OpenAI clash is just the start of legal wrangles over artificial intelligence (The Guardian)
Job picks
Some of the interesting (mostly) non-technical AI roles that I’ve seen advertised in the last week. As usual, it only includes new positions that have been posted since the last TWIE (but lots of the jobs from the previous edition are still open).
Director, Head of Public Policy, Google DeepMind (London)
Federal Policy Lead, Center for AI Safety (San Francisco)
Product Policy Manager, Cyber Threats, Anthropic (San Francisco)
Product Policy Manager, Bio, Chem, and Nuclear Risks, Anthropic (San Francisco)
Management and Program Analyst, US AI Safety Institute (Washington D.C.)