I think AI x-risk reduction is largely bottlenecked on a lack of strategic clarity,[1] more than e.g. bio x-risk reduction is, such that it's very hard to tell which intermediate goals we could aim for that would reduce rather than increase AI x-risk. Even just among (say) EA-motivated longtermists focused on AI x-risk reduction, there is a very wide spread of views on AI timelines, takeoff speeds, alignment difficulty, theories of victory, sources of doom, and basically everything else (example), and I think this reflects how genuinely confusing the strategic situation is.
One important way[2] to improve strategic clarity is via what we've come to call AI "worldview investigations,"[3] i.e. reports that:
- Aim to provide an initial tentative all-things-considered best-guess quantitative[4] answer to a big, important, fundamental ("worldview") AI strategy question such as "When will we likely build TAI/AGI?" or "How hard should we expect AI alignment to be?" or "What will AI takeoff look like?" or "Which AI theories of victory are most desirable and feasible?" Or, focus on a consideration that should provide a noticeable update on some such question.
- Are relatively thorough in how they approach the question.
Past example "worldview investigations" include:
- Open Phil's series on "When will we likely build TAI?", summarized here, and composed mainly of Bio Anchors, Brain Computation, Semi-Informative Priors, Explosive Growth, and Human Trajectory. (~1185pp[5])
- Carlsmith's Is power-seeking AI an existential risk? (~90pp)
- My 2017 Report on Consciousness and Moral Patienthood[6] (~485pp)
Worldview investigations are difficult to do well: they tend to be large in scope, open-ended, "wicked," and require modeling and drawing conclusions about many different phenomena with very little hard evidence to work with. Here is some advice that may help researchers to succeed at completing a worldview investigation:
- Be bold! Attack the big important action-relevant question directly, and try to come to a bottom-line quantitative answer, even though it's unjustified in many ways and will be revised later.
- Reach out to others early for advice and feedback, especially people who have succeeded at this kind of work before. Share early, scrappy drafts for feedback on substance and direction.[7]
- On the research process itself, see Minimal-trust investigations, Learning by writing, The wicked problem experience, and Useful Vices for Wicked Problems.
- Reasoning transparency has advice for how to communicate what you know, with what confidence, and how you know it, despite the fact that for many sub-questions you won't have enough time or evidence to come to a well-justified conclusion.
- How to Measure Anything is full of strategies for quantifying very uncertain quantities. (Summary here.)
- Superforecasting has good advice about how to quantify your expectations about the future. (Summary here.)
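To make "quantifying very uncertain quantities" a bit more concrete, here is a minimal Python sketch of one common approach in the spirit of How to Measure Anything: express each uncertain input as a 90% confidence interval, fit a lognormal distribution to it, and combine the inputs by Monte Carlo sampling. This is only an illustration, not a method this post prescribes. The function names, the choice of a lognormal, and both input intervals are hypothetical placeholders rather than estimates from any report.

```python
import math
import random

Z90 = 1.645  # z-score bounding the central 90% of a standard normal


def lognormal_from_90ci(low, high):
    """Return (mu, sigma) of a lognormal whose 5th/95th percentiles are low/high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z90)
    return mu, sigma


def simulate(n=50_000, seed=0):
    """Sample the product of two uncertain factors; return its 5th/50th/95th percentiles."""
    rng = random.Random(seed)
    # Hypothetical inputs, each stated as a 90% confidence interval.
    mu_a, sigma_a = lognormal_from_90ci(10, 1000)  # a very uncertain quantity
    mu_b, sigma_b = lognormal_from_90ci(0.5, 2)    # an uncertain adjustment factor
    samples = sorted(
        rng.lognormvariate(mu_a, sigma_a) * rng.lognormvariate(mu_b, sigma_b)
        for _ in range(n)
    )
    return samples[int(0.05 * n)], samples[n // 2], samples[int(0.95 * n)]


p5, p50, p95 = simulate()
print(f"5th percentile: {p5:.1f}, median: {p50:.1f}, 95th percentile: {p95:.1f}")
```

The point of a sketch like this is that the bottom line comes out as a distribution with an explicit spread, so a reader can see how wide the uncertainty is and which input contributes most of it.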
Finally, here are some key traits of people who might succeed at this work:[8]
- Ability to carve up poorly-scoped big-picture questions into the most important parts, operationalize concepts that seem predictive, and turn a fuzzy mess of lots of factors into a series of structured arguments connected to (usually necessarily weak / of limited relevance) evidence.
- At least moderate quantitative/technical chops, enough to relatively quickly learn topics like the basics of machine learning scaling laws or endogenous growth theory, while of course still significantly relying on conversations with subject matter experts.
- Ability to work quickly and iteratively, limiting their time on polish/completeness and on "rabbit holes" that could be better-explored or better-argued or more evidence-based but that aren't likely to change the approximate bottom line.
- Motivation to focus on work that is decision-relevant (in our case, for intervening on AI x-risk) and engage in regular thought and dialogue about what's most important to analyze next; not too attached to "academic freedom" expectations about following whatever is intellectually stimulating to them at the moment.
- Reasoning transparency, and "superforecaster"-style reasoning processes for evidence assessment and prediction/judgment calibration.
Notes
[1] See e.g. Our AI governance grantmaking so far, A personal take on longtermist AI governance, How to make the best of the most important century?, Important, actionable research questions for the most important century.
[2] Other types of research can also contribute to strategic clarity, e.g. studies of much narrower empirical questions. However, I've come to think that <100pp one-off papers and blog posts typically contribute little to our strategic understanding (though they may contribute to more granular "tactical" understanding), because even when they're well done, they can't be thorough or deep enough on their own to be persuasive, or solid enough to use as a foundation for later work. Instead, I learn more from unusually detailed and thorough "extended effort" analyses that often require 1-5 FTE years of effort, e.g. OpenAI's series of "scaling laws" papers. Another example is Saif Khan's series of papers on semiconductor supply chains and policy options: I couldn't tell from just one paper, or even the first 5 papers, whether genuine strategic progress was being made or not, but around paper #8 (~2 years in) I could see how most of the key premises and potential defeaters had been investigated pretty thoroughly, and how it all made a pretty robustly-considered (but still uncertain) case for a particular set of policy interventions, culminating roughly in The Semiconductor Supply Chain: Assessing National Competitiveness and Securing Semiconductor Supply Chains.
[4] E.g. with explicit probabilities or probability distributions.
[5] The TAI timelines reports add up to ~355,000 words at 300 words per page, counting each main report and its appendices but not associated blog posts or supplemental materials (e.g. conversation notes or spreadsheets). The report on power-seeking AI is ~27,000 words, and my consciousness report is ~145,000 words.
[6] This report is on animal welfare strategy rather than AI strategy. In particular, the report helped us decide to expand our animal welfare grantmaking to help fish.
[7] Some people won't have time to comment, but it doesn't cost much for them to reply "Sorry, I don't have time."
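For readers who want to check the page counts, the note on report lengths converts word counts to pages at 300 words per page. A minimal Python sketch of that arithmetic follows. Rounding to the nearest 5 pages is an assumption made here because it reproduces the "~" figures quoted in the post; the note itself does not say how the figures were rounded, and the function and variable names are illustrative.

```python
WORDS_PER_PAGE = 300  # conversion rate stated in the note


def approx_pages(words, words_per_page=WORDS_PER_PAGE, nearest=5):
    """Convert a word count to pages, rounded to the nearest `nearest` pages (assumed)."""
    return nearest * round(words / words_per_page / nearest)


# Word counts as given in the note.
reports = {
    "TAI timelines series": 355_000,
    "Power-seeking AI report": 27_000,
    "Consciousness report": 145_000,
}

for name, words in reports.items():
    print(f"{name}: ~{approx_pages(words)}pp")
```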
I'm skeptical of work like this, and what you write in note 2 prompts a few comments. You write:
If I'm honest, this seems to me like a large - and slightly odd - claim. Does it really say that research work that can be written up in fewer than one hundred pages cannot be persuasive and cannot be solid enough to build on? Do you really mean to suggest that?
You go on to write:
But I don't really see how all this is much different from saying something a bit like "more research is better than less research" or that although an individual short paper cannot provide what you call 'strategic' insight, it may be gained by essentially synthesizing the results contained in lots of smaller contributions. And this latter description sounds a lot like how I would describe the 'normal' way that research papers have worked for ages, i.e. something like the fact that each individual contribution tends to be narrow and then as time goes by, the bigger picture emerges from some kind of aggregation and synthesis that the research community somehow effects. (I find it particularly hard to square the example of Saif Khan's papers with your overall point: Is it not the case that many of Khan's earlier outputs were much shorter? But each was high-quality enough that he could build on them, even after having put them out into the community? And then later, once the bigger picture had genuinely emerged for him to see, he drew on his experience to produce the longer report?)
I remain unconvinced that it is better for someone to set out from the start to write a single long report, as opposed to - at the very least - an approach similar to Khan's in which one starts out with shorter, concrete 'pieces of the puzzle' and actually puts each of them out in the community where they can be criticized and scrutinized. Then along the way there can be more public discourse, more feedback, and more public variety of opinion. One will be better able to judge which threads really are promising or important and which ideas really are robust, and any decision to write a longer, expanded report that draws on the lessons of the many shorter papers need not be made until later.
It's a fair question. Technically speaking, of course progress can be more incremental, and some small pieces can be built on with other small pieces. Ultimately that's what happened with Khan's series of papers on the semiconductor supply chain and export control options. But in my opinion that kind of thing almost never really happens successfully when it's different authors building on each other's MVPs (minimum viable papers) rather than a single author or team building out a sorta-comprehensive picture of the question they're studying, with all the context and tacit knowledge they've built up from the earlier papers carrying over to how they approach the later papers.
Quick question: what is pp in this context?
It just means "pages."