When complex systems fail, it is often because they have succumbed to what we call "disempowerment spirals" — self-reinforcing feedback loops where an initial threat progressively undermines the system's capacity to respond, leading to accelerating vulnerability and potential collapse.
Consider a city gradually falling under the control of organized crime. The criminal organization doesn't simply overpower existing institutions through sheer force. Rather, it systematically weakens the city's response mechanisms: intimidating witnesses, corrupting law enforcement, and cultivating a reputation that silences opposition. With each incremental weakening of response capacity, the criminal faction acquires more power to further dismantle resistance, creating a downward spiral that can eventually reach a point of no return.
This basic pattern appears across many different domains and scales:
- HIV progressively destroys the immune system designed to fight it.
- Anxiety, burnout, and depression deplete the executive function needed to take steps to address the problem.
- Cults methodically isolate members from support networks that might help them leave.
- Corporate toxic cultures drive away the talented employees most capable of fixing them.
- Political polarization erodes the trust necessary for collective problem-solving.
In each case, the threat doesn't just cause damage — it undermines the capacity to respond to that very threat.
Abstracting:
A disempowerment spiral is a feedback loop in which an actor faces an ongoing threat that erodes the actor’s capacity to respond to that threat. As response capacity decreases, the actor becomes even less able to prevent further disempowerment.
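To make the feedback structure concrete, here’s a minimal toy simulation. It is a sketch only: every functional form and parameter is invented for illustration, not drawn from any real system.

```python
# A minimal toy model of a disempowerment spiral. Every functional form
# and parameter here is an invented illustration, not a claim about any
# real system.

def simulate(threat=0.1, capacity=1.0, steps=50,
             growth=0.30,     # how fast an unchecked threat grows per step
             erosion=0.25,    # how strongly the threat erodes response capacity
             response=0.60):  # how effectively remaining capacity suppresses the threat
    """Return the (threat, capacity) trajectory over `steps` rounds."""
    history = []
    for _ in range(steps):
        # The threat grows, minus whatever suppression current capacity can
        # mount; it is treated as a fraction of some maximum, hence the cap.
        threat = max(0.0, min(1.0, threat * (1 + growth - response * capacity)))
        # The threat in turn erodes the capacity available next round.
        capacity = max(0.0, capacity * (1 - erosion * threat))
        history.append((threat, capacity))
    return history

# With response=0.60 the threat decays away and capacity stabilizes near 1.
# Rerun with response=0.25 and the loop reverses: the threat grows, which
# erodes capacity, which lets the threat grow faster still.
```

The only point of the sketch is the sign of the feedback: whether the spiral ignites depends on whether the threat’s growth outpaces the suppression that the remaining response capacity can mount.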
In this post, we propose disempowerment spirals as a lens for analyzing how complex systems fail. Our primary motivation is to better understand existential risks (including AI risk) — since we cannot directly observe existential catastrophes, we need indirect methods to understand them. Disempowerment spirals in particular provide a possible answer to the question “if something gets bad enough, why don’t people just stop it?”.
In the rest of this post, we will first draw out some general observations about disempowerment spirals, and then in the last section turn to a discussion of what this might mean for efforts to reduce existential risk.
Common Themes
Three Types of Response Capacity
What can disempowerment consist of?
In thinking about what actors need to respond to threats, we’ve found it useful to distinguish reasoning capacity (noticing the problem and figuring out what to do) from implementation capacity (actually doing something about the problem). For group actors it’s also sometimes useful to consider coordination capacity (effectively collaborating against the threat).
Together, these capacities constitute the actor’s ability to respond to threats. Anything that disempowers the actor degrades one or more of these dimensions, and often the effect falls primarily on one of them in particular.
For example:
| Disempowering reasoning capacity | Disempowering implementation capacity | Disempowering coordination capacity |
| --- | --- | --- |
| The actor has a progressively harder time recognizing the problem or figuring out what would help | The actor is progressively enfeebled, and their interventions become relatively less effective | Although individuals may recognize the problem, they cannot rally people around enacting key interventions |
| e.g. Mental illnesses like depression and anxiety lead someone to misjudge what help is available | e.g. Military barrages from a hostile power destroy all facilities for manufacturing semiconductors | e.g. A pandemic creates fear and unrest, making people more skeptical and less willing to collaborate in certain ways |
| e.g. A group infiltrating an intelligence service tampers with important information | e.g. A person drawn into a cult is persuaded to become more financially dependent | e.g. Political polarization damages trust and communication within groups |
Sometimes, of course, a disempowerment effect will hit several of these capacities at once. Something which took out telecommunications, for example, would damage all three types of response capacity.
Nor is this the only possible decomposition. In particular cases it might be more helpful to think about, e.g., the stages of an OODA loop, or the parts of a complex institution. But we think the general decomposition has some mileage.
Broad Disempowerment
In theory, we could see a disempowerment spiral where the actor is only very narrowly disempowered — in their capacity to respond to that specific threat. Think of a spy inside a security service who mainly uses their access to cover their own tracks.
In practice, in the large majority of the examples we have considered, the disempowerment is quite broad, reducing capacity in some general way: the ability to recognize and plan for new threats is impaired, physical resources for responding are destroyed, or trust and coordination break down.
Polycrises
If a threat causes some measure of broad disempowerment, that can leave the door open for new threats, or for flare-ups of existing issues which now meet an inadequate response. Take HIV: the breakdown of the immune system isn’t itself what kills people; it’s that otherwise minor infections can suddenly be fatal.
Sometimes disempowerment effects seem more natural to understand in terms of a holistic pattern than a particular individual threat — see e.g. the notions of polycrisis or poverty trap.
This gives us reason to flip our perspective on risk: rather than asking ‘what specific threats might this actor face?’, we can instead ask ‘how in general might the actor be left unable to respond to threats?’. This perspective seems especially useful when considering scenarios with many unpredictable or unknown threats.
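As a hypothetical illustration of this flipped perspective, here is a two-threat variant of the earlier toy model (again, all dynamics and numbers are invented): threat A broadly erodes a shared response capacity, and threat B, easily contained on its own, escalates once that capacity has degraded.

```python
# A two-threat sketch of the flipped perspective (all dynamics and
# numbers are invented). Threat A broadly erodes a shared response
# capacity; threat B is easily contained on its own, but escalates once
# that shared capacity has been degraded.

def two_threats(steps=45):
    a, b, capacity = 0.20, 0.05, 1.0
    for t in range(steps):
        a = min(1.0, a * (1 + 0.30 - 0.25 * capacity))  # A outgrows the response throughout
        b = min(1.0, b * (1 + 0.20 - 0.30 * capacity))  # B shrinks while capacity is high
        capacity = max(0.0, capacity * (1 - 0.20 * a))  # only A erodes the shared capacity
        if (t + 1) % 15 == 0:
            print(f"t={t+1:2d}  A={a:.2f}  B={b:.2f}  capacity={capacity:.2f}")

two_threats()
# Expected shape of the output: by t=15, A has collapsed most of the shared
# capacity while B is still small; thereafter B, which posed no danger at
# full capacity, escalates unopposed.
```

Nothing about threat B changes over the run except the general capacity available to respond to it, which is the point: you could not have predicted the eventual damage by analysing B in isolation.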
Critical Threshold
Early in a disempowerment spiral, it’s plausible that the actor will get their act together and bring the threat under control. If things proceed too far, this may become impossible (at least without outside intervention).
Somewhere along the way, a critical threshold is passed. In practice we won’t usually be able to pinpoint when this occurs, but it is worth understanding that this point of no return typically comes well before the actor is maximally disempowered or wiped out.
| Threat | Cult membership | Business collapse | Military conquest |
| --- | --- | --- | --- |
| Critical threshold | Individual becomes too isolated and dependent to be able to leave | Business loses too many key employees to preserve a healthy culture | Country loses too much industrial infrastructure to manufacture weaponry |
A given spiral can also have several critical thresholds corresponding to different degrees of permanent disempowerment. A pandemic, for instance, could have separate points at which:
- Spread can no longer be limited across the general population
- Industrial and economic development is permanently set back
- Key institutions are lost
- Humanity is eradicated
Not all spirals end with the death of the host system, even if they get completely out of hand. But it may no longer be possible to get them under control — at a minimum, the actor is left weakened in a way they cannot independently undo, and often in a broad way that leaves them more open to other risks.
The concept of a critical threshold seems useful for distinguishing between the actual harms to be avoided and the window of time in which it is still possible to avoid them.
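To make the notion of a critical threshold concrete, here is a variant of the earlier toy model (a sketch under invented assumptions, like the others): the threat grows unchecked for `delay` steps, and only then does the actor respond at full strength.

```python
# A sketch of a critical threshold, in the same invented toy style. The
# threat grows unchecked for `delay` steps; only then does the actor
# respond at full strength. We ask whether containment is still possible.

def contained(delay, steps=200):
    threat, capacity = 0.05, 1.0
    for t in range(steps):
        response = 0.6 if t >= delay else 0.0  # full response only after `delay`
        threat = min(1.0, threat * (1 + 0.3 - response * capacity))
        capacity = max(0.0, capacity * (1 - 0.25 * threat))
    return threat < 0.01

for delay in range(0, 13, 3):
    print(f"delay={delay:2d}  contained={contained(delay)}")
```

With these made-up numbers the sweep flips from contained to not contained between delay 3 and delay 6; yet at delay 6 the actor still holds about 80% of its original capacity when it finally responds. The point of no return arrives long before anything like maximal disempowerment.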
Disempowerment Spirals and Existential Risk
We don’t have a tight argument, but it seems to us that most x-risk (including most AI-related x-risk) would have something of the nature of a disempowerment spiral[1]:
- Exogenous risk (e.g. asteroids, false vacuum collapse) over the next century seems much smaller than endogenous risk
- Right now, humanity is in some sense reasonably empowered over its environment
- If things go very wrong, that’s probably something that people didn’t want — so we lost some empowerment along the way
- It’s generally easier to find plausible stories where this happens quasi-continuously rather than abruptly
For AI specifically: misaligned AI takeover probably means a period of humanity becoming disempowered, and, short of the most extreme ‘foom’ scenarios, that probably involves a recursive process of resource-gathering. Misuse and structural-risk scenarios fit the same pattern. More broadly, bad AI outcomes seem more likely to arise from a breakdown of geopolitical stability and a straining of trust (which we can itself model as a disempowerment spiral), or from weird systemic problems that impede humanity’s ability to respond.
Of course, recasting existential risks in terms of disempowerment spirals doesn’t necessarily help us. But if we look to draw practical lessons, here are the ones that seem most prominent to us:
- Analysis of x-risk should focus less on the point where things go maximally badly
- The critical threshold will come sometime before the point where everyone is wiped out or permanently disempowered — when people still hold a significant amount of power, but less than the growing amount necessary to contain the problem
- Endgames are, therefore, less important than they appear
- We should invest more in noticing — and containing — nascent problems quickly
- We should focus on staying in control of things that threaten our ability to respond — and we should strive to act quickly and decisively while it is cheap (and/or possible!) to do so
- We should invest broadly in both developing and hardening humanity’s response capacities
- New tools have the potential to radically increase our capacity here
- We should be careful not to assume we’ll only have to deal with one problem at a time — it may be easiest for things to collapse in scenarios where one threat dramatically reduces our response capacity, and others escalate things from there
Thanks to Adam Bales, Toby Ord, Rose Hadshar, and Max Dalton for helpful discussions and comments on earlier drafts.
[1] Actually, we would guess that the strongest response to this might be an argument that humanity is not sufficiently empowered — unable to see the big things coming, or unable to coordinate to control them. But we think this is stretching the point … there would still, it seems likely, be some process which in its early stages humanity was on top of, but which it would lose control of as it developed.