After reading this post, I found myself puzzling over the following question: why is Tetlock-style judgmental forecasting so popular within EA, but not that popular outside of it?

From what I know about forecasting (admittedly not much), these techniques should be applicable to a wide range of cases, and so they should be valuable to many actors. Financial institutions, the corporate sector in general, media outlets, governments, think tanks, and non-EA philanthropists all seem to face a large number of questions which could get better answers through this type of forecasting. In practice, however, forecasting is not that popular among these actors, and I couldn't think of/find a good reason why.[1]

The most relevant piece on the topic that I could find was Prediction Markets in The Corporate Setting. As the title suggests, it focuses on prediction markets (whose lack of success is similarly intriguing), but it also discusses forecasting tournaments to some extent. Although it does a great job of highlighting some important applicability issues for judgmental forecasting and prediction markets, it doesn't explain why these tools would be useful to EAs in particular. None of the reasons given there would explain the fact that financial institutions don't seem to be making widespread use of forecasting to get better answers to particularly decision-relevant questions, either internally or through consultancies like Good Judgment.

Answers to this question could be that this type of forecasting is:

  • Useful for EA, but not (much) for other actors. This solution has some support if we think that EAs and non-EAs are efficiently pursuing their goals. If this is true, then it suggests that EAs should continue supporting research on forecasting and the development of forecasting platforms, but should perhaps focus less on getting other actors to use it.[2] My best guess is that this is not true in general, though it is more likely to be true for some domains, such as long-run forecasting.
  • Useful for EA and other actors. I think that this is the most likely solution to my question. However, as mentioned above, I don't have a good explanation for the situation that we observe in the world right now. Such an explanation could point us to the key bottlenecks for widespread adoption. Trying to overcome those bottlenecks might be a great opportunity for EA, as it might (among other benefits) substantially increase forecasting R&D.
  • Not useful. This is the least likely solution, but it is still worth considering. Assessing the counterfactual value of forecasting for EA decisionmaking seems hard, and it could be the case that the decisions we would make without using this type of forecasting would be as good as (or maybe even better than) those we currently make.

It could be that I'm missing something obvious here, and if so, please let me know! Otherwise, I don't know if anyone has a good answer to this question, but I'd also really appreciate pieces of evidence that support/oppose any of the potential answers outlined above. For example, I would expect that by this point we have a number of relatively convincing examples where forecasting has led to decisionmaking that's considerably better than the counterfactual.

[1] This is not to say that forecasting isn't used at all. For example, the UK government runs Cosmic Bazaar, The Economist runs a tournament on GJO, and Good Judgment has worked for a number of important clients. However, given how popular Superforecasting was, I would expect these techniques to be much more widely used by now if they are as useful as they appear to be.

[2] Open Philanthropy has funded the Forecasting Research Institute (research), Metaculus (a forecasting platform), and INFER (a program to support the use of forecasting by US policymakers).

6 Answers

My sense is that EA overrates forecasting a bit and that the world underrates it a lot.

Some views I'd suggest are underrated:

  • As Michael Story points out (emphasis his): "Most of the useful information you produce [in forecasting] is about the people, not the world outside. Forecasting tournaments and markets are very good at telling you all kinds of things about your participants: are they well calibrated, do they understand the world, do they understand things better working alone or in a team, do they update their beliefs in a sensible measured way or swing about all over the place? If you want to get a rough epistemic ranking out of a group of people then a forecasting tournament or market is ideal. A project like GJP (which I am very proud of) was, contrary to what people often say, not an exercise primarily focused on producing data about the future. It was a study of people! The key discovery of the project wasn’t some vision of the future that nobody else saw, it was discovering the existence of consistently accurate forecasters (“superforecasters”) and techniques to help improve accuracy. The book Superforecasting was all about the forecasters themselves, not the future we spent years of effort predicting as part of the study, which I haven’t heard anyone reference other than as anecdotes about the forecasters."
    • I don't really feel I can add to this quote. Forecasting is useful for filtering people but less useful for finding truth. It's easy to overrate.
  • It is difficult to forecast things policymakers actually care about. Forecasting sites forecast things like "will Putin leave power" rather than "If Putin leaves power between July 18th and the end of August, how will that affect the likelihood of a rogue nuclear warhead?". And I'm confident that question isn't actually specific enough in some way I don't understand. And even if it were, decision makers would have to trust the results, which they currently largely don't.
  • Forecasting beyond 3 years is not good: Brier scores above 0.25 are worse than random, since always guessing 50% on a binary question scores exactly 0.25 (see the sketch below). Many questions are too specific and too far away for forecasting to be useful to them.
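For reference, the Brier score here is the mean squared error between forecast probabilities and 0/1 outcomes, so always answering 50% scores exactly 0.25, and anything above that is worse than the uninformative baseline. A minimal sketch (the function name and numbers are illustrative, not from any particular platform):

```python
# Brier score for binary forecasts: mean squared error between the
# forecast probability and the 0/1 outcome. Lower is better.
def brier_score(forecasts, outcomes):
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

print(round(brier_score([0.5, 0.5, 0.5], [1, 0, 1]), 3))  # 0.25: always-50% baseline
print(round(brier_score([0.9, 0.2, 0.8], [1, 0, 1]), 3))  # 0.03: calibrated and informative
```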

Forecasting is more useful as a filtering tool or a personal-improvement tool than as part of better decision making. I suggest that individuals would gain from playing the estimation game each month and taking the hits that reality deals, but there are many things we could do to improve ourselves, and if this doesn't fit for you, fair enough.

But the idea that every org should be forecasting, or that every process should involve forecasting, doesn't seem to fit reality. I still look for ways it can (maybe there is a silver bullet!), but I don't think you, the median EA, should. If you want to test your knowledge of a topic, forecast on it. If you see someone has a good track record, consider taking their thoughts more seriously. Other than that, probably don't think about it much more than you are already interested in or think is important.

This strongly resonated with me, especially after taking part in the XPT. I think I set my expectations really high and got frustrated with the process, and I now take a relaxed approach to forecasting as a fun thing I do on the side instead of something I actively want to take part in as a community.

My understanding is that the empirical basis for this forecasting comes from the academic research of Philip Tetlock, summarised in the book Superforecasting (I read the book recently; it's pretty good).

Essentially, the research signed people up to conduct large numbers of forecasts about world events, and scored them on their accuracy. It found that certain people were able to consistently outperform even top intelligence experts. These people used the sort of techniques familiar to EA: analysing problems dispassionately, breaking them down into pieces, putting percentage estimates on them, and doing frequent pseudo-Bayesian "updates". I say pseudo-Bayesian because a lot of them weren't actually using Bayes' theorem, instead just bumping the percentage points up and down, helped by the intuition they had developed, which apparently still worked.
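To make the "pseudo-Bayesian" contrast concrete, here is a minimal sketch of what an explicit update via Bayes' theorem (in odds form) looks like, with made-up numbers rather than anything from the research; the forecasters described above were approximating this step by intuition:

```python
# Bayes' theorem in odds form: posterior odds = prior odds * likelihood ratio.
def bayes_update(prior_p, likelihood_ratio):
    prior_odds = prior_p / (1 - prior_p)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# A forecaster at 30% sees evidence they judge twice as likely if the
# event will happen as if it won't (likelihood ratio = 2):
print(round(bayes_update(0.30, 2.0), 3))  # 0.462
```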

One theory as to why this type of forecasting works so well is that it makes forecasting a skill with useful feedback: if a prediction fails, you can look at why, and adjust your biases and assumptions accordingly.

Two important caveats are often overlooked with this research. First, all these predictions were of bounded probability, where the question-makers' estimated probability was in the range between 5% and 95%. So no million-to-one shots, because you'd have to make a million of them to check whether the predictions were correct. Second, they were all short-term predictions. Tetlock states multiple times in his book that he thinks forecasts beyond a few years will be fairly useless.

So, if the research holds up, the methods used by EA are the gold standard in short-term, bounded-probability forecasting, and it makes sense to use them for that purpose. But I don't think this means that expertise in these problems will transfer to unbounded, long-term forecasts like "will AGI kill us all in 80 years". It's still useful to estimate those probabilities to more easily discuss the problem, but there is no reason to expect these estimates to have much actual predictive power.

Some thoughts:

I think that you can view "forecasting" as one of a collection of intellectual practices that the EA community is unusually focused on.

Other practices/norms include:
- "Scout Mindset"
- Bayesian Thinking
- Lots of care/interest about analytical philosophy
- A preference for empirical data/arguments
- A mild dislike for many kinds of emotional arguments
- Rationalism / Rationalist Fiction

I think that a background variable here is that EA is close to an intellectual community of thinkers that use similar tools and practices. Thinkers like Steven Pinker, Matthew Yglesias, Robin Hanson, Bryan Caplan, and Andrew Gelman come to mind as people with somewhat similar styles and interests. Many of these people also showed unusual interest in "forecasting". 

So some questions here would be:
1. How well does EA fit into some outer intellectual tribes, as I hinted at above?
2. What preferences/norms do these tribes have, and how justified are they?
3. Why aren't all of these preferences/norms more common? 

I think that "forecasting" as we discuss it is often a set of norms like:
1. Making sure that forecasts are recorded and scored.
2. Making sure forecasters are incentivized to do well.
3. Making sure that the top forecasters are chosen for future rounds.

To do this formally requires a fair bit of overhead, so it doesn't make much sense for small organizations. 
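As a rough illustration of norms 1 and 3, here is a minimal sketch (hypothetical forecasters and data) of recording forecasts, scoring them with the Brier rule, and ranking forecasters by their averages:

```python
from collections import defaultdict

# Hypothetical forecast log: (forecaster, question, probability, outcome).
log = [
    ("ana", "q1", 0.8, 1), ("ana", "q2", 0.3, 0),
    ("bob", "q1", 0.5, 1), ("bob", "q2", 0.9, 0),
]

# Norm 1: every recorded forecast gets scored (Brier: lower is better).
scores = defaultdict(list)
for forecaster, _, p, outcome in log:
    scores[forecaster].append((p - outcome) ** 2)

# Norm 3: rank forecasters by mean score to select the top ones.
ranking = sorted(scores, key=lambda f: sum(scores[f]) / len(scores[f]))
print(ranking)  # ['ana', 'bob']
```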

I think that larger organizations either know enough to vet and incentivize their existing analysts (getting many of the benefits of forecasters), or they don't, in which case they won't be convinced to use forecasters. (I think that obvious explanations like these are some of the reason, but I do have questions here.)

Society-wide, I think most people don't care about forecasters for similar reasons that most people don't care about Bayesian Thinking, Scott Alexander, or Andrew Gelman. I think these tools/people are clever, but others aren't convinced by them or aware of them.

I both think that some groups in EA are slightly overenthusiastic about forecasting[1] (while other subgroups in EA don't forecast enough), and that forecasting is underused/undervalued in a lot of the world for a number of different reasons. And I'd also suggest looking at this question less from the lens of "why is EA more into forecasting than other groups?" and more from the lens of "some groups are into forecasting while others aren't, why is that?"

More specific (but still quick!) notes:

  1. I'd guess the reasons why other groups are less enthusiastic about forecasting vary significantly.
    1. E.g.: 
      1. People in governments and the media (and maybe e.g. academia) might view forecasting as not legible enough or not credentialed enough, and the relevant institutions might be slower to change. I don't know how much different factors contribute in different cases. There might also be legal issues for some of those groups.
      2. I think some people have argued that in corporate settings, people in power are not interested in setting up forecasting-like mechanisms because these might go against their interests. Alternatively, I've heard arguments that middle-management incentives are messed up, meaning that people who are able to set up forecasting systems in large organizations get promoted away and the projects die. I don't know how true these things are.
  2. I think people who end up in EA are often the kinds of people who'd also be more likely to find forecasting interesting — both for reasons that are somewhat baked into common ~principles in EA, and for other reasons that look more like biases/selection effects.
  3. I'm not sure if you're implying that EA is one of the most extremely-into-forecasting groups, but it's not clear to me that this is true. Relatedly, if you're looking to see whether/where forecasting is especially useful, examining the different use cases as a whole (rather than zooming into EA-vs-not-EA as a key division) might be more informative.
    1. E.g. sports (betting), trading, and maybe even the weather might be areas/fields/subcultures that rely a lot on forecasting. Wikipedia lists lots of "applications" in its entry on forecasting, and they include business, politics, energy, etc. 
    2. [Edit: this is not as informative as I thought! See Austin's comment below.] Manifold has groups, and while there's an EA group, it has only 2.2k people, while groups like sports (14.4k people) and politics (16.4k people) are bigger. This is more surprising to me since Manifold is particularly EA-aligned.[2]
    3. The forecasting group on Reddit looks dead, but it looks like people in various subreddits are discussing forecasts on machine learning, data science, business, supply chains, politics, etc. (I also think that forecasting might be popular in crypto-enthusiast circles, but I'm not sure.)
  4. This is potentially minor, but given that a lot of prediction-market-like setups with real money are illegal, as far as I understand, the only people who forecast on public platforms are probably those who are motivated by things like reputation, public good benefits, or maybe charity in the case of e.g. Manifold. So you might ask why Wikipedia editors don't forecast as much as EAs.
[1] And specific aspects of forecasting.

[2] You could claim that this is a sign that EA is more into forecasting, but I'm not sure; EA is also more into creating public-good technology, so even if lots of e.g. trading firms were really excited about forecasting, I'd be unsurprised if there weren't many ~public forecasting platforms supported by those groups. (And there are in fact a number of forecasting platforms/aggregators whose relationship to EA is at least looser, unless I'm mistaken, like PredictIt, Polymarket, Betfair, Kalshi, etc.)

Austin: 3b. As a clarification, for a period of time we auto-enrolled people in a subset of groups we considered to be broadly appealing (Econ/Tech/Science/Politics/World/Culture/Sports), so those group-size metrics are not super indicative of user preferences. We aren't doing this at this point in time, but did not unenroll those users.

Lizka: Thanks! This is really useful to know. Edited my comment.

Thanks for these thoughts! I agree with most of what you said. Some replies to specific points:

  • 1b: The post I mentioned discusses this point. I think it's plausible that that's a factor, but even if it were a major one, it still doesn't explain the lack of demand for forecasting consultancies, which could presumably do an even better job at forecasting questions which don't require company-specific information.
  • 2: This matches my intuitions as well. Though I think it doesn't say much about whether forecasting is actually useful or not, as this could mean "E...
Lizka: Re: 1b (or 1aii because of my love for indenting): That makes sense. I think I agree with you, and I'm generally unsure how much of a factor what I described even is.

Re: 2: Yeah, this seems right. I do think some of the selection effects might mean that we should expect forecasting to be less promising than one might think given the excitement about it in EA, though?

Re: 3: Thanks for clarifying! I was indeed not narrowing things down to Tetlock-style judgmental forecasting. I agree that it's interesting that judgmental forecasting doesn't seem to get used as much even in fields that do use forecasting (although I don't know what the most common approaches to forecasting in different fields actually are, so I don't know how far off they are from Tetlock-style things).

Also, this is mostly an aside (it's a return to the overall topic, rather than being specific to your reply), but have you seen this report/post?

"why is Tetlock-style judgmental forecasting so popular within EA, but not that popular outside of it?"

The replies so far seem to suggest that groups outside of EA (journalists, governments, etc.) are doing a smaller quantity of forecasting (broadly defined) than EAs tend to.

This is likely correct, but it is also the case that groups outside of EA are doing different types of forecasting than EAs tend to: less "Tetlock-style judgmental" forecasting and more use of other tools such as horizon scanning, scenario planning, and trend mapping.

(E.g. see the UK government's Futures Toolkit, although note that the UK government also runs the more Tetlock-style Cosmic Bazaar.)

So it also seems relevant to ask: why does EA focus so heavily on "Tetlock-style judgmental forecasting", rather than other forecasting techniques, relative to other groups?

I would be curious to hear people's answers to this half of the question too. Will put my best guess below.

– – 

My sense is that (relative to other futures tools) EA overrates "Tetlock-style judgmental" forecasting a lot and that the world underrates it a bit. 

I think "Tetlock-style" forecasting is the most evidence based, easy to test and measure the value of, futures technique. This appeals to EAs who want everything to be measurable. Although it leads to it being somewhat undervalued by non-EAs who undervalue measurability.

I think the other techniques have been slowly developed over decades to be useful to decision makers. This appeals to decision makers, who value being able to make good decisions and having useful tools, although it leads to these techniques being significantly undervalued by EA folk, who tend to have less experience and a "reinvent the wheel" approach to good decision making, to the extent that they often don't even notice that other styles of forecasting and futures work exist!

One theory is that EA places unusual weight on issues in the long-term future, compared to existing actors (companies, governments) who are more focused on e.g. quarterly profits or election cycles. If you care more about the future, you should be differentially excited about techniques to see what the future will hold.

(A less-flattering theory is that forecasting just seems like a cool mechanism, and people who like EA also like cool mechanisms.)

I have not read much of Tetlock's research, so I could be mistaken, but isn't the evidence for Tetlock-style forecasting only for (at best) short- to medium-term forecasts? Over this timescale, I would've expected forecasting to be very useful for non-EA actors, so the central puzzle remains. Indeed, if there is no evidence for long-term forecasting, then wouldn't one expect non-EA actors (who place less importance on the long term) to be at least as likely as EAs to use this style of forecasting?

Of course, it would be hard to gather evidence for forecasting working well over longer (say, 10+ year) horizons, so perhaps I'm expecting too much evidence. But it's not clear to me that there are strong theoretical reasons to think that this style of forecasting would work particularly well there, given how "cloud-like" predicting events over long time horizons is, and how further extrapolation might leave more room for bias.
