The many kinds of Responsible AI
We often talk about the importance of Responsible AI. But what do we mean by it? Who is responsible for what? And where are the tensions between conceptions of responsibility?
AI is a container. It’s something on which we leave our impressions, shaped by the values and perspectives of the people who make, deploy, and use it (though not in equal measure). I like this idea because it suggests that we see AI—for good or bad—as a manifestation of the things we care about most. That’s why, I suspect, we’re already seeing AI take its all too predictable place as the next battleground for culture warriors the world over.
But if AI is a container, then what about the myriad disciplines, fields, themes, and industries that have grown around it? Maybe some of the concepts we take for granted are less stable than they appear. What do we mean, for example, when we talk about democratising AI? Or governing it? Or working to make it safe?
Responsible AI is one of these concepts. A term that describes both a practice and an emerging industry in its own right, it is generally considered to be about responsible action: designing, building, deploying, and managing artificial intelligence in an ethical, transparent, accountable, and beneficial way. Beyond the buzzwords, it boils down to doing the right thing.
This definition leaves a lot on the table. Agreeing on the nature of terms like transparency or accountability is almost as hard as agreeing on what we mean by doing the right thing. And what does it mean to build responsibly, anyway? Who is responsible for the impact of AI over the full breadth of its lifecycle? And where do we draw the lines that separate the web of interconnected responsibilities to reconcile potential conflicts?
Responsibility is ultimately about someone being responsible for something. For that reason, conceptions that stress ideas like fairness, accountability, reliability, privacy, inclusiveness, and transparency (though inseparable from the substance of responsible action) risk mystifying the roles of the different parties involved and the different actions that they may take. The net result of such ambiguity is that it paves the way for a weak public discourse and, counterintuitively, lax governance measures.
Four types of Responsible AI
In this section, I’ll discuss four separate layers of the AI value chain (a term I use more or less interchangeably with lifecycle). These four layers are not exhaustive—I have missed out analyses of funding, education, advocacy and more besides—but they broadly capture the points most relevant to this framework. One way to understand what we mean by responsible AI is to clarify who we are referring to as well as the roles they play within the (simplified) value chain. I try to make some of these connections clear by focusing on the similarities, differences, and tensions inherent across development, deployment, oversight, and use.
Responsibility in development
Development is, of course, about the creation of AI systems. I am primarily interested in frontier (see: large and capable) models, which means that for the most part this section is relevant for medium-large organisations rather than smaller outfits or independent actors. Right off the bat we are faced with a choice: what sort of problem should we try to solve? A specific problem with great upside may also come with great risk, while generalist systems inherently pose dual use challenges. Then again, is it really responsible to pick a problem with a modest benefit?
Problem specification aside, this point in the value chain also includes the process of collecting and processing data, model design, training and finetuning, and evaluation. Examples of responsible action at this stage include ensuring data is representative, accurate, and ethically sourced. Clearly, responsible design is about ensuring the model has been built in a way that maximises beneficial uses and minimises harmful ones. It might be described as responsible to avoid building systems with capabilities that are known to be riskier than others, like those that allow a system to act as an agent. (This is the thinking behind Yoshua Bengio’s ‘AI scientist’, which is a version of an older idea known as the ‘AI oracle’.)
Some mitigations are put in place once training has been completed. For example, firms use techniques like RLHF in which humans review and rank different actions of the model to provide a reward signal used to change behaviour. Once the bulk of the development process has been completed, a new model should be evaluated prior to deployment. This assessment can incorporate the use of benchmarks designed to test truthfulness (e.g. TruthfulQA) and bias (e.g. Bias Benchmark for QA). Models can be subject to auditing, adversarial testing, and red-teaming, either from within the developer organisation or by an independent third party.
It is the case that, generally speaking, the most capable and therefore riskiest models cannot cause harm if they are not deployed. While the nature of their deployment determines where and how harms manifest in the real world, it is the development of a given model that influences the scale of potential harm. On the other hand, systems designed with the best of intentions can be abused by bad actors or deployed in an inappropriate manner.
Responsibility in deployment
Deployment pertains to how AI models are integrated into real-world applications. It's about ensuring the AI operates as intended in the wild, is resilient to adversarial attacks, and provides value without yielding negative consequences. Feedback loops are crucial here, as real-world use can yield insights not apparent during development. (There are a few ways to describe the role of deployers, but I am meaning this group to include both the primary deployer as well as possible secondary deployers.)
Firms deploying AI in a responsible manner might consider where AI gets deployed, how it’s used, and who does or doesn’t get to use it. If we believe that responsible AI is about maximising benefit as well as minimising risk, then access is one of the most important considerations for responsible deployment. We might want to ensure that models are (1) widely used, (2) heavily scrutinised, (3) not used in inappropriate contexts (e.g. a non specialist model being used in healthcare), and (4) hard to use by bad actors. Different modes of release can be used to realise some, but not all, of these objectives. Open-sourcing a powerful model may allow you to do (1) and (2) but it certainly won’t do you any favours for (3) or (4). As AI systems become more powerful and their potential to be economically and socially beneficial increases, determining how to weigh risk versus gain will not be easy. This of course hinges on definitions of ‘open-source’ (a term used too liberally) given that access is ultimately a continuum. Even so, a line must be drawn somewhere.
Those favouring approaches that maximise the scale of access often discuss the need for greater transparency. That might take the form of rich documentation containing information about training data, evaluations, environmental footprint or many other important pieces of information. While greater transparency can foster trust and build resilience through scrutiny, it also may open the door for sophisticated adversarial attacks, enable the exploitation of biases, and allow bad actors to replicate a model for nefarious purposes.
We should also remind ourselves that the lines between development and deployment aren’t always clear. Some of the levers identified above (e.g. red-teaming, evaluations) may need to be continuously implemented even after deployment. It also means that models sometimes need to be released in order for issues to be spotted and fixed, which may happen as part of structured access programmes. While these sorts of efforts might minimise the very worst risks, they may not capture every fault, bug, issue or failure mode. Neither will they allow developers to understand the impact of a system at the societal level, its impact on the economy, social cohesion or the polity.
Responsibility in oversight
There is a lot of good work out there that unpicks the mechanisms for providing oversight, but for the sake of brevity, this section is concerned solely with ‘top-down’ levers such as overarching rules, guidelines, and mechanisms rather than the ‘bottom-up’ forms of governance that are closer in character to some of the issues described above. There are ultimately three major layers of oversight (truly, it is turtles all the way down) that I’m interested in.
At the international level, there’s the supranational agreements that set guardrails for AI development, like the EU’s AI Act. There are also export controls and other restrictions governing interactions between companies and states, such as those introduced by the Biden administration targeting firms dealing with China. In the future, we might see bilateral or multilateral deals between countries that are analogous to the New START Treaty or new international organisations such as an ‘IAEA for AI' (see something I wrote on that a little while ago). The second layer is the national or local level, which is broadly made up of two further elements: the ‘horizontal’ pieces of legislation designed to govern AI’s use in all applications in a country, and the ‘vertical’ moves to govern AI within a particular sector. In China, for example, there is a patchwork of regulations focused on recommender systems and generative AI as well as a much broader law currently in the works. I recently wrote a summary of a great paper by Matt Sheehan, a fellow at the Carnegie Endowment for International Peace, that looked at both. Finally, there is the organisation layer, which is ultimately about the introduction of accountability and supervisory structures. It includes the imposition of regulatory frameworks, internal organisational policies, accountability mechanisms, and ensuring continuous evaluations of AI's societal impacts. Organisations also work with, and are influenced by, standards bodies to help shape the governance choices that they make.
To take a step back, while oversight considers both international agreements and national and local laws, responsible deployment is about the way in which firms consider deployment based on their own evaluations. One potential tension arises when national or local laws may be in contradiction with international regulations or when companies' internal decisions conflict with any of these governance structures.
In each of these instances, successful oversight is about introducing effective processes while also ensuring that roadblocks to action being taken are removed. Oversight ultimately acts as a continuous supervisory layer, offering guidelines for both the development and deployment of AI that change over time. This type of responsibility is intrinsically reflexive. It acts as a safeguard for ensuring that AI evolves in alignment with societal values and ethical norms into the future.
Responsibility in use
Finally, there is the question of how AI systems are used after they are deployed. This is about the person who uses AI, whether acting independently or on behalf of a third party group such as a business or government.
Responsible use for this group might include elements like informed interaction (understanding the basics of AI, its potential biases, and limitations), awareness of data privacy issues or actively participating in (or rejecting) feedback loops to improve AI systems. Here, the question is whether it ought to be up to users to help develop that understanding or whether a third party should take responsibility, such as the government, developer or (when different) deployer. And there is of course the elephant in the room, which we might describe as ethical utilisation (that is, using AI in ways that doesn’t harm others).
The usage phase represents the most tangible interface between humans and AI. Each of the proceeding stages aims to culminate in an AI system that, when used responsibly, benefits individuals and society at large. While it may not be the point at which all failure modes exist, it is the primary point at which failure modes manifest themselves. This, by the way, is a function of scale. When we think about access, we ought to remember that as the number of vectors for harm increases so too does the risk profile of a given system. There is also a question mark over the best way to weigh the use of AI in service of society (by, for example, boosting national innovation) against possible harms at the personal level.
Developers should rightly be expected to do everything within their power to reduce risk. That all said, there is no other example of an industry in which a degree of personal (and societal) responsibility isn’t tolerated. From nuclear power and space exploration to consumer goods and air travel, the balance between responsibility is split between a host of parties that include manufacturer, operator, user and more still. The final rub is about what happens when personal conceptions of responsibility collide with notions of organisational, national, or supranational responsibility. Clearly, we don’t want to give people free reign to use AI in a potentially dangerous way. But neither is a paternalistic model that conclusively determines how and when someone can act responsibly in the first place a perfect solution.
Wrapping up
The four stages that I have described represent a simplified version of the AI lifecycle, from data sourcing, design, and training (development) to its real-world applications (deployment), supervised by overarching rules (oversight), and leading to its interaction with end-users (use).
One way of understanding these areas is to connect them with two simple conceptions of responsibility I have alluded to: it is responsible to both maximise benefit and to minimise harm. These twin ideas, though, are not so easily reconciled. Is it responsible to release a model on an open-source basis that might cause severe harm to some while broadening access to enable others to benefit? I am not so sure, though the lack of agreement about whether it is responsible to do so illustrates the difficulties in understanding competing notions of responsibility and resolving tensions contained therein.
Ultimately, it is difficult to see how one party can be responsible for every facet of AI’s design, deployment, oversight and use. While a model of responsibility based on the value chain of a given system clarifies different conceptions of responsibility, it does so by drawing into focus tensions between the constituent parts of the system lifecycle. How we conclusively determine which parties are responsible for what—and those interests that are reconcilable and those that are not—remain open but challenging questions.