This is a special post for quick takes by tlevin. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable.
I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me like the more effective actors in the DC establishment are, overall, much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies, but it seems possible that at least some versions of "Overton Window-moving" strategies, as executed in practice, do more harm than good: the negative effect of associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who lean strongly on signals of credibility and consensus when quickly evaluating policy options, outweighs the positive effects of increasing the odds of the ideal policy and improving the framing for non-ideal but pretty good policies.
In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.
Would be interested in empirical evidence on this question (ideally actual studies from the psych, political science, sociology, econ, etc. literatures, rather than specific case studies, due to reference-class-tennis-type issues).
I broadly want to +1 this. A lot of the evidence you are asking for probably just doesn’t exist, and in light of that, most people should have a lot of uncertainty about the true effects of any Overton-window-pushing behavior.
That being said, I think there’s some non-anecdotal social science research that might make us more likely to support it. In the case of policy work:
Anchoring effects, one of the classic Kahneman/Tversky biases, have been studied quite a bit, and at least one article calls anchoring “the best-replicated finding in social psychology.” To the extent there’s controversy about it, it’s often related to “incidental” or “subliminal” anchoring, which isn’t relevant here. The market also seems to favor a lot of anchoring strategies (like how basically everything on Amazon is “on sale” from an inflated MSRP), which should be a point of evidence that this genuinely just works.
In cases where there is widespread “preference falsification,” overton-shifting behavior might increase people’s willingness to publicly adopt views that were previously outside of it. Cass Sunstein has a good argument that being a “norm entrepreneur,” that is, proposing something that is controversial, might create chain-reaction social cascades. A lot of the evidence for this is historical, but there are also polling techniques that can reveal preference falsification, and a lot of experimental research that shows a (sometimes comically strong) bias toward social conformity, so I suspect something like this is true. Could there be preference falsification among lawmakers surrounding AI issues? Seems possible.
Also, in the case of public advocacy, there's some empirical research (summarized here) that suggests a "radical flank effect" whereby Overton-window-shifting activism increases popular support for moderate demands. There's also some evidence pointing in the other direction. Still, I think the supporting evidence is stronger right now.
P.S. Matt Yglesias (as usual) has a good piece that touches on your point. His takeaway is something like: don’t engage in sloppy Overton-window-pushing for its own sake — especially not in place of rigorously argued, robustly good ideas.
I'd also like to add "backlash effects" to this, and specifically effects where advocacy for AI safety policy ideas that are far outside the Overton Window has the inadvertent effect of mobilising coalitions that are already opposed to AI safety policies.
A technique I've found useful in making complex decisions where you gather lots of evidence over time -- for example, deciding what to do after your graduation, or whether to change jobs, etc., where you talk to lots of different people and weigh lots of considerations -- is to make a spreadsheet of all the arguments you hear, each with a score for how much it supports each decision.
For example, this summer, I was considering the options of "take the Open Phil job," "go to law school," and "finish the master's." I put each of these options in columns. Then, I'd hear an argument like "being in school delays your ability to take a full-time job, which is where most of your impact will happen"; I'd add a row for this argument. I thought this was a very strong consideration, so I gave the Open Phil job 10 points, law school 0, and the master's 3 (since it was one more year of school instead of 3 years). Later, I'd hear an argument like "legal knowledge is actually pretty useful for policy work," which I thought was a medium-strength consideration, and I'd give these options 0, 5, and 0.
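To make the mechanics concrete, here's a minimal sketch of that kind of spreadsheet in a few lines of Python; the option names and scores are just the illustrative ones from the example above, not a real template:

```python
# Minimal sketch of the decision spreadsheet, using the illustrative
# options and scores from the example above.

options = ["Open Phil job", "Law school", "Finish master's"]

# One row per argument: a score per option for how strongly it supports that option.
# Add a row whenever you hear a genuinely new argument; nudge a number when someone
# makes an existing argument more (or less) persuasively.
arguments = {
    "School delays full-time (higher-impact) work": [10, 0, 3],
    "Legal knowledge is useful for policy work":    [0, 5, 0],
}

# Column totals give a rough balance of evidence -- not a final answer.
totals = [sum(row[i] for row in arguments.values()) for i in range(len(options))]
for option, total in zip(options, totals):
    print(f"{option}: {total}")
```

The same table works just as well in an actual spreadsheet; the code only spells out the scoring-and-summing logic.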
I wouldn't take the sum of these as a final answer, but it was useful for a few reasons:
In complicated decisions, it's hard to hold all of the arguments in your head at once. This might be part of why I noticed a strong recency bias, where the most recent handful of considerations raised to me seemed the most important. By putting them all in one place, I could feel like I was properly accounting for all the things I was aware of.
Relatedly, it helped me avoid double-counting arguments. When I'd talk to a new person, and they'd give me an opinion, I could just check whether their argument was basically already in the spreadsheet; sometimes I'd bump a number from 4 to 5, or something, based on them being persuasive, but sometimes I'd just say, "Oh, right, I guess I already knew this and shouldn't really update from it."
I also notice a temptation to simplify the decision down to a single crux or knockdown argument, but usually cluster thinking is a better way to make these decisions, and the spreadsheet helps aggregate things such that an overall balance of evidence can carry the day.
Yeah, this is all pretty compelling, thanks!
Do you have specific examples of proposals you think have been too far outside the window?
I think Yudkowsky's public discussion of nuking data centres has "poisoned the well" and had backlash effects.