Reading ancient scrolls, inside Common Crawl, and mental health chatbots [TWIE]
The Week In Examples #25 | 10 Feb 2024
Another week has passed us by, which means it is time for our customary check-in on AI and society. This week, we have news that AI has been used to read a 2,000-year-old scroll burnt during the Mount Vesuvius eruption in 79AD, a study from Mozilla about the ubiquitous Common Crawl web archive, and a paper making the case for AI-powered therapy to widen access to mental health treatment. As always, it’s hp464@cam.ac.uk for comments, feedback or anything else!
Three things
1. Machine learning sees through Roman philosophy
What happened? The Vesuvius Challenge Grand Prize has been awarded to three students who used AI to read a 2,000-year-old scroll carbonised during the eruption of Mount Vesuvius in 79AD. The competition, which was launched by Nat Friedman, Daniel Gross, and Brent Seales in March 2023, offered a cash prize to a team capable of creating a system that could read the charred texts from the Roman town of Herculaneum. The winning group used several techniques, including a TimeSformer-based model, which takes the attention mechanism at the heart of the popular transformer architecture behind large language models and applies it to video understanding problems.
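For the curious, below is a minimal, illustrative PyTorch sketch of the 'divided space-time attention' idea behind TimeSformer. It is not the winning team's code, and every dimension, module, and variable name is a toy assumption: temporal attention first lets each patch attend to itself across frames, then spatial attention lets the patches within a frame attend to one another.

```python
# Minimal sketch of TimeSformer-style divided space-time attention (illustrative only).
import torch
import torch.nn as nn

class DividedSpaceTimeBlock(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x holds patch embeddings with shape (batch, frames, patches, dim).
        b, t, p, d = x.shape

        # Temporal attention: each spatial patch attends to itself across frames.
        xt = self.norm1(x.permute(0, 2, 1, 3).reshape(b * p, t, d))
        attn_t, _ = self.temporal_attn(xt, xt, xt)
        x = x + attn_t.reshape(b, p, t, d).permute(0, 2, 1, 3)

        # Spatial attention: patches within the same frame attend to each other.
        xs = self.norm2(x.reshape(b * t, p, d))
        attn_s, _ = self.spatial_attn(xs, xs, xs)
        return x + attn_s.reshape(b, t, p, d)

# Toy input: 2 clips, 8 frames, 16 patches per frame, 64-dimensional embeddings.
video_tokens = torch.randn(2, 8, 16, 64)
print(DividedSpaceTimeBlock()(video_tokens).shape)  # torch.Size([2, 8, 16, 64])
```

Factorising attention this way is the design choice that makes video tractable: attending over frames and over patches separately is far cheaper than attending over every frame-patch pair at once.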
What's interesting? Thought to belong to the philosopher Philodemus of Gadara, the texts were originally recovered in 1752 when workers stumbled on the remains of what is now known as the Villa of the Papyri. Lots of attempts have been made to read the scrolls over the years—from chopping them open with a knife in the 1700s to more sophisticated techniques like infrared analysis in the 2000s—which is how we know that the scrolls belonged to the villa’s philosopher-in-residence. In the passage uncovered by the prize winners, Philodemus questions whether things in lesser quantities bring more pleasure: "as too in the case of food, we do not right away believe things that are scarce to be absolutely more pleasant than those which are abundant."
What else? There are over 1,800 Herculaneum papyri (the name for the collection of scrolls found at the villa) in various states of repair and completeness, with around 500 of the texts remaining unopened. The winning team’s solution revealed approximately 5% of one scroll, a figure the organisers hope to push to over 90% for the four scrolls they have high-resolution scans of. That is not to underestimate the impressive work of the winning team, but rather to say that, should the techniques used in the challenge scale to the rest of the scans, we should expect much more to come in the year ahead.
2. A rare look at Common Crawl
What happened? Mozilla published a report about the infamous web archive Common Crawl, looking at “the histories, values, and norms embedded in its datasets”. For those who don’t know, Common Crawl is a non-profit organisation that ‘crawls’ the internet (by systematically browsing and indexing web pages) to create datasets that it makes freely available to the public. The repository is fairly well embedded in the machine learning canon because AI developers have consistently drawn on the resource as an essential ingredient for the creation of large models.
What’s interesting? The research found that 30 of the 47 large language models assessed used Common Crawl for training, though the study noted this figure could rise as high as 40 should details of the data sources used to train other popular models be released to the public. Despite its popularity for training large models, the research suggests, contrary to popular belief, that the archive doesn’t ‘contain the entire web’. This is because 1) it is heavily skewed towards English-language pages, and 2) it does not scrape pages that sites disallow via robots.txt (a signal from domain administrators that tells crawlers which parts of the domain they are allowed to visit).
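For a concrete sense of how the robots.txt mechanism works, here is a short sketch using Python's standard library (the URL is illustrative, and CCBot is the user agent Common Crawl's crawler identifies itself with; this is not Common Crawl's actual crawling code):

```python
# Sketch of a robots.txt check with Python's standard library (illustrative only).
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse the site's robots.txt file

# A well-behaved crawler skips any page the file disallows for its user agent.
if rp.can_fetch("CCBot", "https://example.com/some-page"):
    print("allowed to crawl this page")
else:
    print("disallowed, so the page is skipped and never enters the archive")
```

Because Common Crawl respects these directives, anything sitting behind a restrictive robots.txt file simply never makes it into the dataset.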
What else? In practice, developers tend to use filtered versions of the Common Crawl archive (e.g. filtered on certain keywords or with AI classifiers) rather than the entire database itself. This means it is largely up to developers to decide what to include and what to exclude, which is why the report proposes that those seeking to “make generative AI more trustworthy” must be wary of engaging "uncritically" with the corpus. While “trustworthy AI” isn’t a particularly meaningful framing (ultimately, these distinctions fall flat because trust is entirely context-dependent), the report finds a good middle ground between encouraging the use of a resource that enables transparency and being mindful of the dramatic variance in data quality that it contains. Implicit in this is the idea that, just as filtering represents a conscious choice, the curation of the Common Crawl archive itself is political. As with all forms of scientific practice, it reminds us that there is no such thing as a view from nowhere.
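To make the filtering point above concrete, here is a toy sketch of the kind of keyword and length heuristics developers layer on top of raw crawled text before training; the rules and thresholds are invented for illustration and are not any lab's real pipeline:

```python
# Toy sketch of keyword/length filtering over crawled documents (illustrative only).
BLOCKED_KEYWORDS = {"casino", "viagra"}  # crude spam signal, chosen by the developer
MIN_WORDS = 50                           # drop very short pages

def keep(document: str) -> bool:
    words = document.lower().split()
    if len(words) < MIN_WORDS:
        return False
    return not any(word in BLOCKED_KEYWORDS for word in words)

docs = [
    "Win big tonight at our online casino " * 20,
    "A long article about Roman philosophy and the Villa of the Papyri " * 20,
]
print([keep(d) for d in docs])  # [False, True]
```

Each threshold and blocklist entry is a judgement call, which is exactly the sense in which curation, whether by Common Crawl or by the developers downstream of it, is never neutral.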
3. Researchers talk up mental health chatbots
What happened? Researchers from UCL, the University of Tübingen, the German Center for Mental Health, and chatbot provider Limbic AI released a paper arguing that AI can be used to fill gaps in access to mental health treatment. The researchers said that, in an observational study of 129,400 patients within England’s NHS services, the introduction of a personalised chatbot was associated with a 15% increase in referrals (instances in which a patient is passed on to a specialist healthcare service).
What's interesting? In simple terms, the study compared two groups: one that used the chatbot for self-referrals (the intervention group) and another that did not use the chatbot but continued with their usual self-referral methods, such as web forms (the control group). It found that services implementing the chatbot experienced a significant 15% increase in total referral numbers, compared to a 6% increase observed in the control services, and it also reported much larger improvements in referral rates among ethnic minority groups (29% vs. 10%). That being said, an observational study is not the gold standard for this sort of work (that accolade goes to the randomised controlled trial), which means there may be confounding factors (e.g. variations in how services marketed themselves, or external factors influencing mental health awareness during the study period) that influenced the result.
What else? Not everyone is a fan, though, with some campaign groups arguing that AI’s use may “threaten the foundations of our healthcare system.” Given that the World Health Organisation reckons disorders such as anxiety and depression affect 29% of the global population in their lifetime, and given the well-documented struggles of healthcare services in the face of ageing populations, it’s hard to see how we’d be better off denying people in need access to care on the basis that AI “could feel like an overbearing parent nagging us”.
Best of the rest
Friday 9 February
Here’s the Thing AI Just Can’t Do (WIRED)
What the birth of the spreadsheet can teach us about generative AI (FT)
Can AI learn language like we do? (AI Supremacy >> Substack)
AI might be reading your Slack messages: 'A lot of this becomes thought crime' (CNBC)
What did Putin say on war and peace, WW3 and AI? (Reuters)
Thursday 8 February
Sam Altman Seeks Trillions of Dollars to Reshape Business of Chips and AI (WSJ)
Google’s AI now goes by a new name: Gemini (The Verge)
In Big Tech's backyard, California lawmaker unveils landmark AI bill (Washington Post)
Can humanity survive AI? (Jacobin)
Talk about self-improving LLMs from Rishabh Agarwal (Google Drive)
Biden-Harris Administration Announces First-Ever Consortium Dedicated to AI Safety (US Gov)
Wednesday 7 February
A Roadmap to Pluralistic Alignment (arXiv)
AI's bioterrorism potential should not be ruled out (FT)
How Tech Giants Turned Ukraine Into an AI War Lab (TIME)
Generative AI is increasingly being used to defraud businesses of big money and no one is prepared (Fortune)
Generative AI may change elections this year. Indonesia shows how (Reuters)
EU’s AI ambitions may fail on two fronts (Reuters)
Tuesday 6 February
Neural Networks Learn Statistics of Increasing Complexity (arXiv)
Update to the State of AI Compute Index (Air Street Capital)
Ten Hard Problems in Artificial Intelligence We Must Get Right (arXiv)
UK gov’t touts $100M+ plan to fire up ‘responsible’ AI R&D (TechCrunch)
Labeling AI-Generated Images on Facebook, Instagram and Threads (Meta)
MusicRL: Aligning Music Generation to Human Preferences (arXiv)
A pro-innovation approach to AI regulation: government response (UK Gov)
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal (HarmBench >> arXiv)
Self-Discover: Large Language Models Self-Compose Reasoning Structures (arXiv)
Monday 5 February
How neurons learn (Cool education resource >> QTNX)
AI Safety Institute: Third progress report (UK Gov)
Inside the Underground Site Where ‘Neural Networks’ Churn Out Fake IDs (404 Media)
Inside OpenAI’s Plan to Make AI More ‘Democratic’ (TIME)
When is a capability truly worrying? (AI Policy Perspectives >> Substack)
Job picks
These are some of the interesting non-technical AI roles that I’ve seen advertised in the last week. I want to keep this section fresh, so it only includes new roles that have been posted since the last TWIE (though many of the jobs from the previous week’s email are still open). If you have an AI job that you think I should advertise in this section in the future, just let me know and I’d be happy to include it!
Programme Management Officer (AI), United Nations, New York
Operations Associate, CLTR, London
Societal Impacts Strategy and Delivery Adviser, UK AI Safety Institute, London
AI Policy and Governance Manager, Meta, London
Head of Operations, Robust Intelligence, San Francisco