Good morning folks. For this week’s roundup we have a recent academic hoo-ha about social media algorithms, a major treatise from AI safety researchers about national and international governance, and a new study showing that readers equate AI with incorrect information when it is used as a reporting aid by journalists.
Thanks as always to everyone who sent me things to include for this edition. If you want to send something my way, you can do that by emailing me at hp464@cam.ac.uk.
Three things
1. The great algorithmic polarisation show

Manoel Horta Ribeiro, a PhD student at EPFL, wrote a great blog post looking at recent claims and counterclaims about the impact of Facebook’s news feed algorithm on polarisation. The story begins with a 2023 paper in Science that found that a chronological (rather than algorithmically generated) feed “did not significantly alter levels of issue polarization, affective polarization, political knowledge, or other key attitudes.”
Later in 2023, a group of researchers responded to the study, arguing that we should not “conclude that the Facebook news feed algorithm used outside of the study period mitigates political misinformation compared to (the) reverse chronological feed.” They found, as summarised by Ribeiro, “a drop starting in early November 2020 that goes until early March 2021” which “coincide(s) with changes to the Facebook algorithm in November.” The claim is that, without these changes, the original authors might have found a positive effect.
Now, in another letter, the original authors shoot back. They say 1) the experiment's internal validity is not impacted by these changes; 2) the follow-up work only contains URLs shared more than 100 times; and 3) the observed changes might have come from other confounding factors.
What to make of this? My view is that the first study retains its utility in that it punctures the idea that the circulation of ‘untrustworthy’ news sources is closely tied to the news feed algorithm – but the follow-up analysis does suggest that the dynamic nature of the algorithm means we can’t be certain of its overall effects. That leads us to one of the most unsatisfying conclusions of all: more work is needed to answer the question conclusively.
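To see why both sides can have a point, here’s a minimal simulation – the numbers are entirely my own invention, not data from either paper – showing how a platform-wide algorithm change in the middle of a study can leave the randomised feed comparison intact while dominating any before/after comparison over time:

```python
# Purely hypothetical numbers (not from either paper): a platform-wide
# algorithm change mid-study shifts misinformation exposure for *both* feeds
# at once, so the randomised treatment-vs-control contrast holds up (the
# original authors' internal-validity point) while a before/after look over
# time is dominated by the change (the critics' point about generalising
# beyond the study window).
import numpy as np

rng = np.random.default_rng(0)
weeks = np.arange(20)                                # hypothetical study weeks
platform_shift = np.where(weeks >= 10, -2.0, 0.0)    # change lands mid-study

# Weekly misinformation exposure under each (hypothetical) feed condition;
# the platform-wide shift hits both arms equally.
algorithmic_feed = 10 + platform_shift + rng.normal(0, 0.3, weeks.size)
chronological_feed = 9 + platform_shift + rng.normal(0, 0.3, weeks.size)

# The within-week randomised contrast is roughly constant all the way through.
print("mean treatment effect:", (chronological_feed - algorithmic_feed).mean())

# A before/after comparison of the algorithmic feed alone reflects the shift.
print("algorithmic feed, before vs after:",
      algorithmic_feed[:10].mean(), algorithmic_feed[10:].mean())
```

The specific values don’t matter; the point is that randomisation protects the within-period comparison, but tells us nothing about what the algorithm does outside that period.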
2. Researchers sketch narrow path
A group of AI safety researchers released ‘A Narrow Path’, a new report setting out what the authors believe are the necessary conditions for the safe development of powerful AI systems. They proceed on the basis that superintelligent AI is possible within the coming years or decades, and that current organisational, national, and international governance structures aren’t up to the task of preventing the worst outcomes as and when a superintelligence is created.
They group their analysis into three parts. First up is basic safety, which makes the case for limits on the ability of AI systems to self-improve, measures to prevent AI systems from breaching containment, efforts to prevent the development of ‘unbounded’ systems whose actions can’t be reliably predicted or managed, and caps on the ‘general intelligence’ of AI systems so that they cannot reach superhuman performance on certain tasks.
The next phase, dubbed stability, is about governance systems. They call for measures to limit proliferation, robust international enforcement mechanisms, mutual guarantees to secure buy-in from the major players (e.g. the US and China), and a benefit-sharing system to further encourage actors to engage with the new governance regime.
Finally, they discuss what a world with ‘transformative’ AI looks like. The basic idea is that superintelligence is inherently unstable, unpredictable, and unmanageable – and that we should instead limit development to AI ‘tools’ rather than general, agentic systems. There’s lots of interesting stuff here (much more than I can accurately summarise in a few paragraphs), so I’d encourage anyone interested to read the piece in full.
For my part, I’m probably more optimistic than the authors about our ability to successfully align models as they get bigger and better – but if you truly buy the significance of AI, then some version of a sophisticated governance regime will eventually be necessary.
3. News readers equate AI with falsehoods

Returning to the information environment, researchers find that people are sceptical of headlines labelled as AI-generated – even when they are true or human-made – because they assume the headlines were written entirely by AI. In a study of almost 5,000 people in the UK and the US, the authors found that labelling headlines as AI-generated lowered perceived accuracy and participants' willingness to share them, regardless of whether the headlines were true or false or created by humans or AI. Some other interesting results from the work included:
The effect of AI-generated labels was about a third the size of the effect of labelling headlines as false (yikes);
People assumed headlines labelled as AI-generated were fully written by AI with no human involvement;
Providing weaker definitions of AI involvement (e.g. AI only helped improve clarity) reduced the negative effects of the labels.
These findings intuitively make sense to me. The reason that studies repeatedly find ‘shadow use’ of AI (i.e. unofficial use within work or school) to be on the rise is that we worry admitting to any use calls the integrity of our work into question. For the most part, AI is used as an aid (something we draw on to make our lives easier) rather than as a replacement (something we use to delegate tasks wholesale). But we don’t trust others to believe the real extent of our AI use, which means many feel it's better to say nothing at all.
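For a rough sense of the comparison behind the headline numbers, here’s a toy sketch – the rating scale and effect sizes are invented for illustration (chosen only so the ratio lands near three) and are not the study’s data:

```python
# Purely hypothetical ratings, not the study's data: a sketch of the kind of
# comparison behind the effect-size result, i.e. the drop in perceived
# accuracy caused by an "AI-generated" label vs. a "false" label.
import numpy as np

rng = np.random.default_rng(1)
n = 5000                                   # hypothetical respondents per condition

no_label = rng.normal(4.0, 1.0, n)         # perceived accuracy on an assumed 1-7 scale
ai_label = rng.normal(3.8, 1.0, n)         # modest penalty for the AI label
false_label = rng.normal(3.4, 1.0, n)      # larger penalty for the "false" label

ai_effect = no_label.mean() - ai_label.mean()
false_effect = no_label.mean() - false_label.mean()
print(f"AI-label effect:     {ai_effect:.2f}")
print(f"false-label effect:  {false_effect:.2f}")
print(f"ratio (false / AI):  {false_effect / ai_effect:.1f}")   # close to 3, by construction
```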
Best of the rest
Friday 4 October
Horny Robot Baby Voice (LRB)
Hacking Generative AI for Fun and Profit (WIRED)
Trying to be human: Linguistic traces of stochastic empathy in language models (arXiv)
After software eats the world, what comes out the other end? (Substack)
Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia (arXiv)
Thursday 3 October
SHAPE Shifters (Substack)
Moral Alignment for LLM Agents (arXiv)
Unlocking AI for All: The Case for Public Data Banks (Lawfare)
Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models (arXiv)
ReXplain: Translating Radiology into Patient-Friendly Video Reports (arXiv)
Wednesday 2 October
In my FLOP era (Substack)
Eight Scientists, a Billion Dollars, and the Moonshot Agency Trying to Make Britain Great Again (Wired)
Microsoft invests €4.3B to boost AI infrastructure and cloud capacity in Italy (Microsoft)
‘In awe’: scientists impressed by latest ChatGPT model o1 (Nature)
New funding to scale the benefits of AI (OpenAI)
Tuesday 1 October
Differences in misinformation sharing can lead to politically asymmetric sanctions (Nature)
Dwarkesh Podcast episode with Dylan Patel & Jon (Asianometry) (X)
Can We Delegate Learning to Automation?: A Comparative Study of LLM Chatbots, Search Engines, and Books (arXiv)
Anthropic podcast with Jack Clark (X)
The Gradient of Health Data Privacy (arXiv)
Monday 30 September (and things I missed)
AI systems for the public interest (Internet Policy Review)
What methods work for evaluating the impact of public investments in Research, Development and Innovation (UK Gov)
Public AI (Mozilla)
Measurement Challenges in AI Catastrophic Risk Governance and Safety Frameworks (Tech Policy Press)
Code Interviews: Design and Evaluation of a More Authentic Assessment for Introductory Programming Assignments (arXiv)
Job picks
Some of the interesting (mostly) AI governance roles that I’ve seen advertised in the last week. As usual, it only includes new positions that have been posted since the last TWIE (but lots of the jobs from the previous edition are still open).
Researcher / Senior Researcher - Policy and Standards, IAPS (Remote)
Workstream Lead, Systemic Safety and Responsible Innovation, Societal Impacts, UK AISI (London)
Workstream Lead, Psychological and Social Risks, Societal Impacts, UK AISI (London)
Operations Manager, Frontier Model Forum (Europe or US)
AI Institute Fellowship, Schmidt Futures (NYC)