In this post, I introduce the concept of Risk Awareness Moments (“Rams”): “A point in time, after which concern about extreme risks from AI is so high among the relevant audiences that extreme measures to reduce these risks become possible, though not inevitable.”
This is a blog post, not a research report, meaning it was produced quickly and is not to Rethink Priorities’ typical standards of substantiveness and careful checking for accuracy.
Summary
- I give several examples of what a Ram might look like for national elites and/or the general population of a major country. Causes could include failures of AI systems, or more social phenomena, such as new books being published about AI risk.
- I compare the Ram concept to similar concepts such as warning shots. I see two main benefits: (1) Rams let us remain agnostic about what types of evidence make people concerned, e.g., something that AI does, vs. social phenomena; (2) the concept lets us remain agnostic about the “trajectory” to people being concerned about the risk, e.g., whether there is a more discrete/continuous/lumpy change in opinion.
- For many audiences, and potential ways in which AI progress could play out, there will not necessarily be a Ram. For example, there might be a fast takeoff before the general public has a chance to significantly alter their beliefs about AI.
- We could do things to increase the likelihood of Rams, or to accelerate their occurrence. That said, there are complex considerations about whether actions to cause (earlier) Rams would be net positive.
- A Ram - even among influential audiences - is not sufficient for adequate risk-reduction measures to be put in place. For example, there could be bargaining failures between countries that make it impossible to get mutually beneficial AI safety agreements. Or people who are more aware of the risks from transformative AI might also be more aware of the benefits, and thus make an informed decision that the benefits are worth the risks by their lights.
- At the end, I give some historical examples of Rams for issues other than AI risk.
Definition
I define a Risk Awareness Moment (Ram) as “a point in time, after which concern about extreme risks from AI is so high among the relevant audiences that extreme measures to reduce these risks become possible, though not inevitable.”
Extreme risks refers to risks at least at the level of global catastrophic risks (GCRs). I intend the term to capture accident, misuse, and structural risks.
- Note that people could be concerned about some extreme risks from AI without being concerned about other risks. For example, the general public might become worried about risks from non-robust narrow AI in nuclear weapons systems, without being worried about misaligned AGI. Concern about one risk would not necessarily make it possible to get measures that would be helpful for tackling other risks.
- Additionally, some audiences might have unreasonable threat models. One possible example of this would be an incorrect belief that accidents with Lethal Autonomous Weapons would themselves cause GCR-level damage.[1] Similar to the bullet point above, this belief might be necessary for getting measures that tackle the specific (potentially overblown) threat model, without necessarily being helpful for getting measures that tackle other risks.
Relevant audiences will differ according to the measure in question. For example, measures carried out by labs might require people in labs to be widely convinced. In contrast, government-led measures might require people in specific parts of the government - and maybe also the public - to be convinced.
Extreme measures could include national-level regulation, unilateral actions by national security actors, and agreements between countries or labs. To meet my definition, extreme measures would need to meet at least one of these two criteria:
- Far outside the current Overton window or “unthinkable.” Extreme risks from AI are often perceived as weird, meaning that measures that explicitly address these risks might also be seen as weird.[2] Additionally, these measures themselves might seem weird. A possible example: The pause on developing AI systems more powerful than GPT-4 that is called for in the recent open letter coordinated by FLI.
- Very burdensome.[3] I have in mind measures such as monitoring precisely what major compute clusters are used for; this would be burdensome in terms of, e.g., espionage risks and financial cost.[4]
We can talk about Rams happening among different audiences. For example, there could be a Ram among the general public, or just among a small group of key decision-makers. Different extreme measures will require Rams among different audiences. For example, AI safety agreements between labs might require Rams (just) among the leaders of the relevant labs, whereas national-level regulation might require Rams among people in government.
Examples
Risk Awareness Moments would look different, depending on which audience is having the Ram.
Here are several examples of Rams among a large portion of the policy elites of a country, and maybe also the general population. (I focused on this case because it seems most relevant to my current research.) Note that these examples are just illustrative - I am not claiming that any scenario here is particularly likely.
- An AI system fails in a catastrophic way that is widely visible. For example, a lethal autonomous weapon deliberately kills a large number of civilians, despite this not being the intent of the commander. Overnight, this causes the general population to be extremely worried about misalignment.
- There are a series of misalignment failures. Some cases look like the LAWS example above; some look like AI systems being caught committing and concealing financial fraud; some look like chatbots being manipulative in a way that their operators did not intend. Each of these cases contributes to an additional portion of the relevant audience becoming concerned about extreme AI risks, until we hit the threshold of a Ram among policy elites and the general public.
- Several politicians or public intellectuals publicly describe extreme risks from powerful AI systems in a compelling way. This causes a lot of people in the policy elite to start believing in these risks.
- There are a few obvious cases of AI being misused. For example, North Korea uses large language models to scam lots of people, and terrorists use AI-controlled drone swarms to cause an event comparable to 9/11. Over a short period of time, these cause widespread support for extreme measures to regulate AI, such as tough limits on how these systems can be deployed.
- There are a few movies or TV shows that compellingly and reasonably realistically portray various extreme risks from AI. These result in high motivation to reduce the risks among policy elites and the general public, similar to (but to a greater extent than) what The Day After did for nuclear risk.[5]
- There is a constant drumbeat of impressive/scary AI results being published. This leads to a fairly smooth increase in widespread concern about extreme risks among policy elites.
Are we currently seeing Rams?
Various audiences seem to have moved closer to having Rams in the past six months or so.[6]
- General public in the US:[7] The impressiveness of ChatGPT, GPT-4, and Bing - as well as their obvious shortcomings - seems to have significantly shifted public discourse about AI safety. Akash has collected various examples of this.
- Computer science and machine learning academics: The high number of CS & ML academics in the recent open letter organized by FLI may be a sign of a Ram among this group.[8]
- US Government: For at least some threat models, people within the US Government also seem increasingly willing to take extreme measures relating to AI. For example, their concerns about how the Chinese military might use advanced chips motivated export controls on providing semiconductor technologies to China (Allen, 2022).
Comparison to related concepts
Risk Awareness Moments overlap with various related concepts such as “fire alarms,” “warning shots,” and “warning signs.” An overview of these concepts can be found here.
I find the Risk Awareness Moment concept helpful for three reasons (the first two are much more important than the third):
- The Ram concept lets us remain agnostic about what types of evidence cause people to become concerned about AI. The evidence can come from AIs themselves, such as a “canary,” the development or deployment of a new capability, or a case of an AI causing (sub-existential) damage in the world. But it can also include more social phenomena, such as influential figures highlighting risks from AI, or social cascades.[9] (Note that the “warning sign” concept also has this benefit.)
- The Ram concept lets us remain agnostic about the “trajectory” towards people becoming concerned about AI. Terms like “warning shot” imply a discrete change, i.e., that people are not concerned about AI, then a warning shot happens, then people are concerned. If we do get a Ram, I think the trajectory may look like this - but it could also be more “lumpy.” For example, maybe there would be several events, each one causing part of the relevant audience to become more concerned. The change could also be fairly continuous. For example, maybe people will just gradually become more concerned about AI over time, without any specific event contributing that much to this.
- The Ram definition stipulates that we get to a particular political situation,[10] making it easier to talk clearly about interventions that would only be possible in such a circumstance. This contrasts with, e.g., “warning shots,” where people sometimes refer to “a failed warning shot” or “a warning shot that does not cause many people to change their mind.”[11]
Different groups have different likelihoods of experiencing a Ram
We should not assume that Rams will occur among all groups of people. (Even if we would like certain groups to have Rams so that particular risk-reducing policies become more realistic!) I sketch out some reasons below for why different groups might be more or less likely to have Rams.
Different groups of people will be more predisposed to have Rams based on factors like:
- How open is that group to new or weird-seeming ideas?
- To what extent are members of that group watching AI developments?
The nature of AI progress presumably also affects the likelihood of Rams:
- If a hard takeoff occurs in a world that looks basically like today, there will not be time for Rams to occur among new audiences.
- Specific groups might be more likely to update based on particular kinds of incidents where AI goes wrong. For example, maybe some groups would be more influenced by cases where AI goes wrong in the physical world, as opposed to “just” on the internet.
We can increase the likelihood of Rams! (Or cause them to happen earlier)
Risk Awareness Moments are not just an exogenous variable. If a Ram among a particular audience would be necessary to get a desirable policy, we could try to cause this Ram (or make it happen earlier than it would otherwise), for instance by “raising the alarm.”
Whether it would in fact be good to trigger (earlier) Risk Awareness Moments depends on various factors, including the ones below:
- To what extent do desirable interventions require there to have been Rams among particular groups?
- What are the risks associated with causing a Ram? For example, might this accelerate timelines towards very capable AI systems, leaving less time for safety work? Or might it cause additional hype about AI that attracts reckless actors?
- What are the opportunity costs of causing a Ram? Even if we think it would be desirable to cause a Ram, we might think that marginal resources would be better spent on something else.
Risk Awareness Moments are not sufficient to get AI risk-reduction measures
For any given AI governance intervention, a Ram - even among people who have a lot of influence over policy - would not be sufficient to cause this intervention to happen. I can think of three main reasons why actors (such as governments or labs) may fail to put adequate risk-reduction measures in place, even if they fully understand the extreme risks from AI.[12]
1. Inability to get adequate measures in place (in time)
Maybe it is too difficult for individuals or institutions to get measures in place that are sufficient to reduce the risks - even if they are very motivated to do so. This seems particularly hard if you think that these actors would have to move quickly; based on the recent Survey of Intermediate Goals in AI Governance, many of the measures that people are excited about seem hard to implement quickly. If we don’t expect there to be much time between a Ram among the relevant audience and an existential catastrophe or Point of No Return, then it would indeed be important to move quickly.[13]
2. Belief that the potential benefits from AI justify the risks
I expect realization of the dangers from advanced AI to sometimes be accompanied by realization about how beneficial advanced AI could be. Some people may think the potential benefits are worth the risks, and so be unwilling to take measures that might delay the benefits in order to reduce the risks.
For example, someone who believed that AGI could enable radical life extension, but who was not altruistically minded or who strongly privileged current generations over future generations, might want AI to come sooner - or want to avoid delaying AGI - even at the expense of more existential risk.[14]
Additionally, some people hold a worldview on which technological progress is generally good and regulation is generally bad.[15] This worldview would presumably make them reluctant to accept extreme measures to reduce risks from AI.[16] (Though, in my taxonomy, this seems like more of a reason why such people would not have a Ram than a reason why they would oppose action even after having one.)
3. Bargaining failures
Some AI governance interventions would require cooperation between different actors. For example, AI safety treaties would require countries to cooperate to collectively implement AI safety measures (e.g., regulating training runs).
There are various reasons why actors might fail to cooperate, such as inability to credibly make commitments. For example, the US and China may be unable to cooperate to regulate training runs because each country would be too worried about the other one defecting.[17]
Examples of Rams for previous extreme risks
Here is a short brainstorm of possible historical examples of Rams by policy elites, and maybe also the general public. Regrettably, the examples are US-centric. I’d be interested to hear ideas relevant to other countries that might be important for making AI go well, such as China.
Note that I haven’t checked the historical claims here, so these cases are perhaps more helpful as examples of the type of thing I have in mind than as empirical claims.
There are risks here of various levels of “extremeness.”
Examples:
- Movies like The Day After and Threads for nuclear war, and Armageddon for asteroid strikes, causing widespread concern about particular (existential) risks. My impression is that these contributed to major policies, e.g., the Reykjavík Summit and a NASA program to track asteroids.[18]
- After 9/11, extreme measures to fight terrorism were suddenly on the table (e.g., the Patriot Act).
- People gradually came to care more about climate change. Based on general knowledge, there seem to have been lots of drivers of this change, including activism, advocacy and awareness raising by scientists (e.g., with the IPCC), and extreme weather events that could be credibly linked to climate change.
- Disinformation and fake news from 2016 - I get the impression that this Ram was mostly among some elites. Causes include investigative news reporting about Cambridge Analytica. Maybe this isn’t a good example - it’s hard to point to “extreme policies” that happened as a result.[19]
Acknowledgements
This research is a project of Rethink Priorities. It was written by Oliver Guest.
I'm grateful to Ashwin Acharya, Michael Aird, Patrick Levermore, Alex Lintz, Max Räuker, and Zach Stein-Perlman for their comments on earlier drafts. Credit to Zach for suggesting the “Risk Awareness Moment” name. Thanks to Adam Papineau for copyediting. These people do not necessarily endorse the points that I make here.
If you are interested in RP’s work, please visit our research database and subscribe to our newsletter.
1. I have not worked much on LAWS and do not intend to take a strong position here on how concerned we should be about them.
2. Though my impression is that this has recently become less true, e.g., following the release of ChatGPT, GPT-4, and Bing. See e.g., a recent post from Akash.
3. My definition focuses on burdens that would be widely perceived even before a Ram. So it excludes e.g., the opportunity cost of delaying wild benefits that humanity might get after the development of AGI.
4. I don’t give a precise operationalization of financial cost here. This is because I am primarily interested in whether the cost would be perceived as very burdensome. But I mainly have in mind (opportunity) costs in the hundreds of millions, or billions, of dollars.
5. I remember reading that Armageddon had a similar effect on risk from asteroids, and that Contagion had some effects on pandemic policy. But I have not spent any time looking into the impacts of popular media on perceptions of existential or other risks.
6. Note that I have not tried to operationalize this claim precisely and that my evidence here feels fairly anecdotal. Also, the politics around AI seem to be changing particularly quickly at the time of writing, so my claims here may quickly look out of date!
7. I would guess also in some other countries, e.g., the UK.
8. Note, however, that the number of signatories is still small compared to the size of the field. The open letter has 1,404 signatories at the time of writing - and many of these are not CS or ML academics. In contrast, ICML, a machine learning conference, had 30,000 attendees in 2021 (Zhang et al., 2022, p. 42). I also don’t know what the signing academics thought about AI a year ago. This makes me unsure whether any Ram among CS & ML academics is a recent development.
9. I take the concept of social cascades from Cass Sunstein - in particular, his conversation about it on the 80,000 Hours Podcast.
10. i.e., “people care about extreme risks from AI now”
11. This feels like a smaller benefit. I think it’s already generally clear whether people mean “a warning shot that brings about political change” vs. “a warning shot that does not bring about political change.”
12. There may be additional reasons that have not occurred to me.
13. There is at least one good reason to expect this window of opportunity to be short. Plausible candidates for “events that would cause Rams” include “an AI causes a sub-existential catastrophe in the world” and “AI systems look very powerful.” If we get to the point where AI systems are at this capability level, then we might not be far away from them being capable enough to cause an existential catastrophe.
14. The magnitude of this consideration depends on two factors: The first is how good people expect a future with aligned AGI to be. The second is how high people perceive risks from AI to be by default; when acting out of self-interest, the higher one’s estimate of AI risk, the more one should want to reduce this risk, even at the cost of delaying AGI. (Though note that “AI will kill everyone” is not synonymous with “existential catastrophe.” Maybe there could be a world where currently-alive people get amazing benefits from AI, but where there are also lock-ins that are catastrophic from a Longtermist perspective.)
15. This worldview is described in more detail in, e.g., Grace (2022), though I am mostly basing my description here on various pieces of anecdotal evidence that I have seen.
16. I think this worldview is misguided: One can think that technological progress is almost always good and that regulation often does more harm than good, but still think that powerful AI is a special case where this does not necessarily apply - e.g., because the risk of existential catastrophe is so much higher than with previous technologies.
17. For more detail on bargaining failures, see Fearon’s “Rationalist Explanations for War” or Blattman’s “Why we fight.” There is an 80,000 Hours Podcast discussion with Blattman here. These works are about bargaining failures that lead to war, but apply more generally than this.
18. For nuclear weapons, the Cuban Missile Crisis also seems like a key example of an event that contributed to a Ram.
19. I guess some far-reaching “disinformation laws” in various countries would qualify. But these often seem to me more like opportunistic efforts to ban undesirable speech, rather than genuine efforts to prevent disinformation - though that distinction is obviously contestable.