I am a Researcher at Rethink Priorities, working mostly on cross-cause prioritization and worldview investigations. I am passionate about farmed animal welfare, global development, and economic growth/progress studies. Previously, I worked in U.S. budget and tax policy as a policy analyst for the Progressive Policy Institute. I earned a B.S. in Statistics from the University of Chicago, where I volunteered as a co-facilitator for UChicago EA's Introductory Fellowship.
Hi Vasco,
Thanks for this interesting post, and in general for the amount of time and consideration you’ve given to analyzing animal welfare issues here on the Forum. I want to reiterate the points others have made in this comment section, and urge you to consider much more explicitly the wide range of uncertainty involved in asking a question like this. In particular, the following model choices are in my opinion deserving of a more careful uncertainty treatment in your analysis:
Though you mention there is uncertainty in each of these variables, I think that it’s important to consider how these uncertainties compound multiplicatively when the variables are combined, and how wide the resulting range of plausible results is. Otherwise, there’s a real risk of arriving at a directionally incorrect conclusion that can have big consequences if we act too quickly on it. This, in my view, is especially true if you’re bringing a set of controversial assumptions to bear on a sensitive and morally important topic.
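To illustrate the compounding point, here is a minimal sketch. It assumes, purely for illustration, that each input is an independent lognormal and that each one's 90% interval spans one order of magnitude. The function name and the example numbers are hypothetical and are not taken from your analysis.

```python
import math
from statistics import NormalDist

# z-score bounding a central 90% interval of a normal distribution
Z90 = NormalDist().inv_cdf(0.95)


def combined_interval_ratio(ratios):
    """Upper/lower ratio of the 90% interval for a product of factors.

    Assumes each factor is an independent lognormal whose own 90% interval
    spans the given upper/lower ratio (a modelling assumption, not a fact
    about any particular analysis).
    """
    # Log-space standard deviation implied by each factor's interval
    sigmas = [math.log(r) / (2 * Z90) for r in ratios]
    # Variances add in log space for a product of independent lognormals
    sigma_total = math.sqrt(sum(s * s for s in sigmas))
    return math.exp(2 * Z90 * sigma_total)


# Hypothetical inputs: three factors, each uncertain by one order of magnitude
example_ratios = [10, 10, 10]
print(round(combined_interval_ratio(example_ratios), 1))
```

Under these assumptions, three inputs that are each uncertain by a factor of 10 give a product that is uncertain by a factor of roughly 54. Correlated inputs or heavier tails would widen that further.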
Hi Vasco, thanks for the question.
Even though we ourselves are skeptical of the neuron count theory, many people in EA do put significant credence on it. As such, we chose to present the results that include the neuron count model in this particular diagram. Additionally, the differences between the results including and excluding the neuron count model are small. As we've mentioned in this post, our estimates are not meant to be precise -- rather, we think that order-of-magnitude comparisons are probably more appropriate given our significant uncertainty in theories of welfare and how best to represent them in a model.
Hi Vasco,
Thanks for the good question! I think it's important to note that there are (at least) 3 types of model choices and uncertainty at work:
a) we have a good deal of uncertainty about each theory of welfare represented in the model,
b) we don't have a ton of confidence that the function we included to represent each theory of welfare is accurate (especially the undiluted experiences function, which partially drives the high mean results),
c) we could have uncertainty that our approach to estimating welfare ranges in general is correct, but we've not included this overall model uncertainty. For instance, our model has no "prior" welfare ranges for each species, so the distribution output by the calculation entirely determines our judgement of the welfare range of the species involved. We also might be uncertain that simply taking a weighted mixture of each theory of welfare is a good way to arrive at an overall judgement of welfare ranges. Etc.
The preliminary method we used in this project incorporates model uncertainty in the form of (a) by mixing together the separate distributions generated by each theory of welfare, but we don't incorporate model uncertainty in the ways specified by (b) or (c). I think these additional layers of uncertainty are epistemically important, and incorporating them would likely serve to "dampen" the effect that the mean result of the model has on our all-things-considered judgement about the welfare capacity of any species. Using the median is a quick (though not super rigorous or principled) way of encoding that conservatism/additional uncertainty into how you apply the moral weight project's results in real life. But there are other ways to aggregate the estimates, which could (and likely would) be better than using the median.
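As a rough illustration of why the mean and median of a mixture can diverge, here is a minimal sketch. The theory names, credence weights, and lognormal parameters are all hypothetical stand-ins and are not the actual welfare range models or their outputs.

```python
import math
import random
import statistics

# Hypothetical theories: (name, credence weight, lognormal median, log-space sd).
# None of these numbers come from the actual welfare range models.
THEORIES = [
    ("theory_a", 0.45, 0.1, 0.5),
    ("theory_b", 0.45, 0.3, 0.5),
    ("heavy_tailed_theory", 0.10, 5.0, 1.0),
]


def sample_mixture(theories, n, seed=0):
    """Draw n samples from a credence-weighted mixture of lognormals."""
    rng = random.Random(seed)
    weights = [w for _, w, _, _ in theories]
    samples = []
    for _ in range(n):
        # Pick a theory in proportion to its weight, then sample from it
        _, _, median, sigma = rng.choices(theories, weights=weights, k=1)[0]
        samples.append(rng.lognormvariate(math.log(median), sigma))
    return samples


samples = sample_mixture(THEORIES, 20_000)
print("mean:  ", statistics.mean(samples))
print("median:", statistics.median(samples))
```

In this toy setup, the low-weight, heavy-tailed theory pulls the mean well above the median, while the median stays close to what the two higher-weight theories imply.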
Seconding this question, and wanted to ask more broadly:
A big component/assumption of the example given is that we can "re-run" simulations of the world in which different combinations of actors were present to contribute, but this seems hard in practice. Do you know of any examples where Shapley values have been used in the "real world" and how they've tackled this question of how to evaluate counterfactual worlds?
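To make the difficulty concrete, here is a minimal sketch of an exact Shapley calculation for two hypothetical actors. The actors and payoffs are invented and are not from the post. Note that the `COUNTERFACTUAL_IMPACT` table needs a value for every possible coalition, which is exactly the set of counterfactual worlds I'm asking about.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders.

    `value` maps every coalition (as a frozenset) to the impact achieved
    in the world where exactly those players contribute.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining this coalition
            totals[p] += value[with_p] - value[coalition]
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


# Hypothetical impact in each counterfactual world (invented numbers)
COUNTERFACTUAL_IMPACT = {
    frozenset(): 0,
    frozenset({"A"}): 10,
    frozenset({"B"}): 0,
    frozenset({"A", "B"}): 100,
}

print(shapley_values(["A", "B"], COUNTERFACTUAL_IMPACT))
```

With n actors the table needs 2^n entries, so the estimation problem grows quickly even before asking how any single entry would be measured.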
(Also, great post! I've been meaning to learn about Shapley values for a while, and this intuitive example has proven very helpful!)
Hi Michael, here are some additional answers to your questions:
1. I roughly calibrated the reasonable risk aversion levels based on my own intuition and using a Twitter poll I did a few months ago: https://x.com/Laura_k_Duffy/status/1696180330997141710?s=20. A significant number (about a third of those who are risk averse) of people would only take the bet to save 1000 lives vs. 10 for certain if the chance of saving 1000 was over 5%. I judged this a reasonable cut-off for the moderate risk aversion level.
4. The reason the hen welfare interventions are much better than the shrimp stunning intervention is that shrimp harvest and slaughter don't last very long. So, the chronic welfare threats that ammonia concentrations and battery cages impose on shrimp and hens, respectively, outweigh the shorter-duration welfare threats of harvest and slaughter.
The number of animals for black soldier flies is low, I agree. We are currently using estimates of current populations, and this estimate is probably much lower than population sizes in the future. We're only somewhat confident in the shrimp and hens estimates, and pretty uncertain about the others. Thus, I think one should feel very much at liberty to plug in different numbers for population sizes for animals like black soldier flies.
More broadly, I think this result is likely a limitation of models based on total population size, versus models that are based more on the number of animals affected per campaign. Ideally, as we gather more information about these types of interventions, we could assess the cost-effectiveness using better estimates of the number of animals affected per campaign.
Thanks for the thorough questions!
Hey, thanks for this detailed reply!
When I said "practical", I meant something more like "simple things that people can do without needing to download and work directly with the code for the welfare ranges." In this sense, I don't entirely agree that your solution is the most workable of them (assuming independence probably would be). But I agree--pairwise sampling is the best method if you have the access and ability to manipulate the code! (I also think that the perfect correlation you graphed makes the second suggestion probably worse than just assuming perfect independence, so thanks!)
Hi Kyle,
This is a very interesting post! One quick and very small technical detail: Rethink Priorities' welfare ranges aren't capped at 1 for non-human animals. (It just happens that, once we adjusted for probability of sentience, they all had 50th percentile estimates that fall below 1.) They're instead a reflection of the difference between the best and worst states that the non-human animal can experience relative to the difference between the best and worst states that a human can experience (which is normalized to 1). In theory, this relative difference could be greater than 1 if the range in intensity of experiences that a non-human animal can experience is wider than that of humans.
In fact, one of our welfare range models (the undiluted experiences model) that feeds into the aggregate estimates tends to produce sentience-adjusted welfare range estimates greater than 1 under the theory that less cognitively complex organisms may not be able to dampen negative experiences by contextualizing them. As such, a few animals have 95th percentile estimates for their welfare ranges that are above 1 (octopuses, pigs, and shrimp). Here are some more details about the models and distributions: https://docs.google.com/document/d/1xUvMKRkEOJQcc6V7VJqcLLGAJ2SsdZno0jTIUb61D8k/edit?usp=sharing and here is the spreadsheet of results from all models: https://docs.google.com/spreadsheets/d/1SpbrcfmBoC50PTxlizF5HzBIq4p-17m3JduYXZCH2Og/edit?usp=sharing
Again, this is a really thought-provoking and sobering post, thanks for writing it :)
Hi Henry! The intervals are so wide because they mix together several models. I've explained more about this modeling choice and result here: https://forum.effectivealtruism.org/posts/rLLRo9C4efeJMYWFM/welfare-ranges-per-calorie-consumption?commentId=Wc2xksAF3Ctmi4cXY