I was hoping to write something for the Future Fund contest and - being entirely a one-trick pony - was going to look at uncertainty analysis in AI Catastrophe predictions.
I've done a review of the forums and my conclusion is that predictions around AI Catastrophe are very heavily focussed on when AI will be invented and on the overall top-level probability that AI will be a catastrophe if it is invented. Other than that, numerical predictions about AI Risk are quite sparse. For example, few people seem to have offered a numerical prediction about whether the AI Alignment Problem is solvable in principle, or about the length of time we could contain a misaligned AI, and so on. The only end-to-end model of AI Risk with numerical predictions I have found is Carlsmith (2021): https://arxiv.org/abs/2206.13353.
- Is my review of the state of the literature roughly accurate? That is, my impression that people mostly predict the time AI is invented and the risk that AI leads to catastrophe, but do not predict other important related questions (at least not numerically)?
- Am I right that Carlsmith (2021) is the only end-to-end model of AI Risk with numerical predictions at each stage (by end-to-end I mean there are steps in between 'AI invented' and 'AI catastrophe' which are individually predicted)? Any other examples would be really helpful so I can scope out the community consensus on the microdynamics of AI risk.
- If I'm right about the above, I think an essay looking at the microdynamics of AI Risk predictions could be novel and informative. For example, the probability that we solve the Alignment Problem before AI is invented seems pretty important, but I don't think anyone has looked at validating the Metaculus prediction on this topic. Is this already a known quantity? Are there any particular pitfalls I should watch out for?
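As a rough illustration of what an end-to-end model with a prediction at each stage involves, here is a minimal Python sketch. It multiplies conditional stage probabilities into a top-level probability, then repeats the calculation with each stage drawn from a range instead of a point estimate. Everything specific in it is an assumption made for illustration: the three stages, the point estimates, the ranges, and the choice of independent uniform draws are made-up placeholders, not figures from Carlsmith (2021) or any forecaster.

```python
import random
import statistics


def chain_probability(stage_probs):
    """Multiply conditional stage probabilities into one end-to-end probability."""
    result = 1.0
    for p in stage_probs:
        result *= p
    return result


def sample_chain(stage_ranges, n_samples=10_000, seed=0):
    """Draw each stage uniformly from its (low, high) range and multiply.

    Independent uniform draws are a simplifying assumption; a real analysis
    would need to justify both the distributions and any correlations.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        draws = [rng.uniform(low, high) for low, high in stage_ranges]
        samples.append(chain_probability(draws))
    return samples


# Hypothetical placeholder values for three conditional stages.
point_estimates = [0.5, 0.4, 0.2]
stage_ranges = [(0.3, 0.7), (0.2, 0.6), (0.1, 0.3)]

point = chain_probability(point_estimates)
samples = sorted(sample_chain(stage_ranges))

# Empirical 5th and 95th percentiles of the sampled end-to-end probability.
low_5 = samples[int(0.05 * len(samples))]
high_95 = samples[int(0.95 * len(samples))]

print(f"point estimate: {point:.3f}")
print(f"sampled mean:   {statistics.mean(samples):.3f}")
print(f"5%-95% range:   {low_5:.3f} to {high_95:.3f}")
```

The point is only that a single top-level number hides how wide the implied range can be once each intermediate step carries its own uncertainty, which is why per-stage predictions are useful to collect.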
In my review I came across the 'Database of Existential Risk Estimates' - link here: https://forum.effectivealtruism.org/posts/JQQAQrunyGGhzE23a/database-of-existential-risk-estimates. This seems to contain many estimates of exactly what I am looking for - predictions of specific events which will occur on the path to an AI catastrophe, rather than the overall risk of catastrophe itself.
- Are there any other databases of this sort, especially those which focus on topics other than when AI will be invented or the top-level probability it will be a catastrophe?
- Is the database regarded as generally credible on the forums? I have found a handful of predictions which I don't think are included (especially on Metaculus), but the database has many more which I would never have found without it. If there is no known systematic bias in the database I'd really like to use it!
- Is there anything else I should know about the database?
Thanks so much!
This is absolutely incredible - can't believe I missed it! Thank you so much!