Bio

I am open to work. I see myself as a generalist quantitative researcher.

How others can help me

You can give me feedback here (anonymous or not).

You are welcome to answer any of the following:

  • Do you have any thoughts on the value (or lack thereof) of my posts?
  • Do you have any ideas for posts you think I would like to write?
  • Are there any opportunities you think would be a good fit for me which are either not listed on 80,000 Hours' job board, or are listed there, but you guess I might be underrating them?

How I can help others

Feel free to check my posts, and see if we can collaborate to contribute to a better world. I am open to part-time volunteering and paid work. For paid work, I typically ask for 20 $/h, which is roughly twice the global real GDP per capita.

Comments
Thanks for sharing!

  • Animal welfare: DG SANTE (unit G3 is responsible for animal welfare; unit E2 is responsible for regulatory approval policies of novel foods, including many alternative proteins), DG Agriculture (unit E3 is responsible for animal products), DG RTD (unit B2 on bioeconomy and food systems coordinates research and innovation funding for alternative proteins), and the cabinets of the associated Commissioners.

SANTE.E.4, “Pesticides and biocides”, can relate to wild animal welfare.

Thanks, Matthew! I wonder whether something somewhat similar applies to moral uncertainty. I feel this is often used as an ad hoc justification to pursue actions which are far from optimal in terms of increasing impartial welfare, such as donating to human welfare instead of animal welfare organisations.

Thanks for the post, Richard.

For any purpose other than an example calculation, never use a point estimate. Always do all math in terms of confidence intervals. All inputs should be ranges or probability distributions, and all outputs should be presented as confidence intervals.

I have run lots of Monte Carlo simulations, but have mostly moved away from them. I strongly endorse maximising expected welfare, so I think the final point estimate of the expected cost-effectiveness is all that matters in principle if it accounts for all the considerations. In practice, there are other inputs that matter because not all considerations will be modelled in that final estimate. However, I do not see this as an argument for modelling uncertainty per se. I see it as an argument for modelling the considerations which are currently not covered, at least informally (more implicitly), and ideally formally (more explicitly), such that the final point estimate of the expected cost-effectiveness becomes more accurate.

That being said, I believe modelling uncertainty is useful if it affects the estimation of the final expected cost-effectiveness. For example, one can estimate the expected effect size linked to a set of RCTs with inverse-variance weighting from w_1*e_1 + w_2*e_2 + ... + w_n*e_n, where e_i is the expected effect size of study i, and the weight w_i = (1/v_i)/(1/v_1 + 1/v_2 + ... + 1/v_n), with v_i being the variance of the effect size of study i. In this estimation, the uncertainty (variance) of the effect sizes of the studies matters because it directly affects the expected aggregated effect size.
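As a minimal sketch of inverse-variance weighting (the effect sizes and variances below are purely illustrative, not from any real meta-analysis):

```python
import numpy as np

# Hypothetical effect sizes and variances for three RCTs (illustrative numbers).
effects = np.array([0.30, 0.45, 0.20])
variances = np.array([0.010, 0.040, 0.025])

# Inverse-variance weights: w_i = (1/v_i) / (1/v_1 + ... + 1/v_n).
precisions = 1 / variances
weights = precisions / precisions.sum()

# Expected aggregated effect size: the more precise a study, the more it counts.
pooled_effect = np.dot(weights, effects)
print(weights.round(3), round(pooled_effect, 3))
```

The first study has the smallest variance, so it dominates the pooled estimate, which is the sense in which uncertainty directly affects the expected aggregated effect size.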

Holden Karnofsky's post Why we can’t take expected value estimates literally (even when they’re unbiased) is often mentioned to point out that unbiased point estimates do not capture all information. I agree, but the clear failures of point estimates described in the post can be mitigated by adequately weighting priors, as is illustrated in the post. Applying inverse-variance weighting, the final expected cost-effectiveness is E(posterior) = w_prior*E(prior) + w_estimate*E(estimate) = (E(prior)/V(prior) + E(estimate)/V(estimate))/(1/V(prior) + 1/V(estimate)), where E and V denote the mean and variance of the prior and estimated cost-effectiveness. If the estimated cost-effectiveness is way more uncertain than the prior cost-effectiveness, the prior will be weighted much more heavily, and therefore the final expected cost-effectiveness, which integrates information about both the prior and the estimate, will remain close to the prior cost-effectiveness.
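As a minimal sketch of this precision weighting (the prior and estimate below are illustrative numbers, not taken from the post):

```python
# Precision-weighted combination of a prior and a noisy cost-effectiveness estimate.
prior_mean, prior_var = 1.0, 0.5        # hypothetical prior
est_mean, est_var = 100.0, 10_000.0     # hypothetical, far noisier estimate

# Posterior mean = (m_prior/v_prior + m_est/v_est) / (1/v_prior + 1/v_est).
posterior_mean = (prior_mean / prior_var + est_mean / est_var) / (
    1 / prior_var + 1 / est_var
)
# The estimate is 20,000 times more uncertain than the prior, so the
# posterior stays very close to the prior despite the huge point estimate.
print(round(posterior_mean, 3))
```

Even though the estimate claims a cost-effectiveness 100 times the prior, its variance is so large that the posterior barely moves, which is the mitigation of "taking expected value estimates literally" described above.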

It is still important to ensure that the final point estimate for the expected cost-effectiveness is unbiased. This requires some care in converting input distributions to point estimates, but Monte Carlo simulations requiring more than one distribution can very often be avoided. For example, if "cost-effectiveness" = ("probability of success"*"years of impact given success" + (1 - "probability of success")*"years of impact given failure")*"number of animals that can be affected"*"DALYs averted per animal-year improved"/"cost", and all these variables are independent (as usually assumed in Monte Carlo simulations for simplicity), the expected cost-effectiveness will be E("cost-effectiveness") = ("probability of success"*E("years of impact given success") + (1 - "probability of success")*E("years of impact given failure"))*E("number of animals that can be affected")*E("DALYs averted per animal-year improved")*E(1/"cost"). This is because E("constant a"*"distribution X" + "constant b") = a*E(X) + b, and E(X*Y) = E(X)*E(Y) if X and Y are independent. Note:

  • The input distributions should be converted to point estimates corresponding to their means.
    • You can make a copy of this sheet (presented here) to calculate the mean of uniform, normal, loguniform, lognormal, Pareto and logistic distributions from 2 of their quantiles. For example, if "years of impact given success" follows a lognormal distribution with 5th and 95th percentiles of 3 and 30 years, one should set the cell B2 to 0.05, C2 to 0.95, B3 to 3, and C3 to 30, and then check E("years of impact given success") in cell C22, which is 12.1 years.
    • Replacing an input by its most likely value (its mode), or one which is as likely to be an underestimate as an overestimate (its median), may lead to a biased expected cost-effectiveness. For example, the median and mode of a lognormal distribution are always lower than its mean. So, if "years of impact given success" followed such a distribution, replacing it with its most likely value, or one as likely to be too low as too high, would result in underestimating the expected cost-effectiveness.
  • The expected cost-effectiveness is proportional to E(1/"cost"), which is only equal to 1/E("cost") if "cost" is a constant, or practically equal if it is a fairly certain distribution compared to others influencing the cost-effectiveness. If "cost" is too uncertain to be considered constant, and there is not a closed-form expression for E(1/"cost") (there would be if "cost" followed a uniform distribution), one would have to run a Monte Carlo simulation to compute E(1/"cost"), but it would only involve the distribution of the cost. For uniform, normal and lognormal distributions, Guesstimate would do. For other distributions, you can try Squiggle AI (I have not used it, but it seems quite useful).
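As a sketch of the two conversions above, one can recover the mean of the lognormal from its 5th and 95th percentiles of 3 and 30 years, and estimate E(1/"cost") with a Monte Carlo over the cost distribution alone (the lognormal cost parameters are hypothetical):

```python
import numpy as np
from statistics import NormalDist

# Mean of a lognormal from two quantiles (5th and 95th percentiles of 3 and 30).
p_low, p_high, q_low, q_high = 0.05, 0.95, 3.0, 30.0
z_low, z_high = NormalDist().inv_cdf(p_low), NormalDist().inv_cdf(p_high)
sigma = (np.log(q_high) - np.log(q_low)) / (z_high - z_low)
mu = np.log(q_low) - sigma * z_low
mean = np.exp(mu + sigma**2 / 2)  # ~12.1 years; note mean > median = exp(mu) ~ 9.5

# E(1/"cost") via a Monte Carlo over the cost distribution alone.
# The cost parameters here are purely illustrative.
rng = np.random.default_rng(0)
cost = rng.lognormal(mean=np.log(100_000), sigma=0.5, size=1_000_000)
e_inv_cost = (1 / cost).mean()
# By Jensen's inequality, E(1/"cost") exceeds 1/E("cost") when "cost" is uncertain.
```

The 12.1-year mean matches the figure quoted above, and the gap between E(1/"cost") and 1/E("cost") illustrates why naively taking the reciprocal of the expected cost biases the expected cost-effectiveness downwards.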

Hi Joshua,

I would be happy to bet 10 k$ against short AI timelines. Note that I am open to a later resolution date than the one I mention in the linked post, such that the bet is beneficial for you despite a higher risk of you not receiving the transfer in case you win.

Your post did not have any tags. I added a few.

Thanks, Laura.

Otherwise, there’s a good risk of arriving at a directionally incorrect conclusion that can have big consequences if we act too quickly on it.

It is unclear to me whether the uncertainties you highlighted push the harm to mosquitoes as a fraction of the benefits to humans up or down. However, I very much agree there is a good risk I under- or overestimated it. I did not mean to suggest AMF is harmful, as naively implied by my main estimate. As I say in the post, "it is unclear to me whether ITNs increase or decrease welfare". If forced to guess, I would say AMF is beneficial, but I am practically indifferent between donating to AMF and burning money. I do not see how this conclusion would qualitatively change if I had modelled the uncertainty of my inputs more explicitly with a Monte Carlo simulation. I think uncertainty in the welfare range of mosquitoes alone is enough to reach that conclusion, and probabilistic modelling would not resolve it.

I neglected the effects of ITNs on the number of wild animals because it is super unclear whether they have positive or negative lives. Yet, there is still lots of uncertainty even just in the effects I considered. RP’s 5th and 95th percentile welfare ranges of black soldier flies are 0 and 15.1 (= 0.196/0.013) times their median. This suggests that, even ignoring effects on the number of wild animals, and just accounting for uncertainty in mosquitoes’ capacity for welfare, the 5th and 95th percentile harm to mosquitoes caused by ITNs are 0 and 11.5 k (= 15.1*763) times their benefits to humans. So it is unclear to me whether ITNs increase or decrease welfare.

More importantly, I think the large uncertainty should update one towards learning more, and supporting more robustly beneficial interventions. In particular, donating less to organisations like AMF, whose cost-effectiveness may well be majorly driven by unclear effects on animals, and more to ones like Arthropoda Foundation, SWP, and WAI. Do you agree?

Thanks for sharing your perspective.

  • Fish Welfare Initiative: Hasn't worked very well, and seemed like it wouldn't in advance, doesn't do CE's original proposed idea anymore, more feedback ahead of time would have told them not to do the original idea

I posted about the cost-effectiveness of the fish welfare interventions recommended by Ambitious Impact (AIM), and Fish Welfare Initiative’s farm program.

Thanks for sharing, JP and Lizka! Great principles.

Thanks for the update!

Over the past year or so, I’ve become increasingly convinced by arguments that we are clueless about the sign (in terms of expected total suffering reduced) of interventions aimed at reducing s-risk.

I believe one can positively influence futures which have an astronomical positive or negative value, but negligibly so. I think the effects of one's actions are well approximated by considering just the first 100 years or so.

I think this first example isn’t comparable, and a bit of a strawman. The Vox article is about how bad factory farming is, and how we don’t need to do that to help humans to flourish. This discussion is about potentially withdrawing life-saving interventions because they might be detrimental to animals. This directly connects the saving lives to harming animals – the Vox article doesn’t.

I understand they are not directly comparable, but I guess the newsletters from Vox also reach a much wider audience that is less accepting of controversial takes than readers of the EA Forum.

I’d say I would have allowed for a much bigger behaviour score range

Do you mean you would account for other behavioural proxies for welfare capacity besides the 90 RP considered? Which ones? RP seemed to be super comprehensive.

perhaps through having more than a binary yes/no system on some of the behaviours

I do not understand. RP did not have a binary system to determine the probability of sentience.

The program runs 10,000 simulations where the presence or absence of each proxy in the “Simple_Scoring” spreadsheet was a random variable. For a given organism, the steps taken in a single simulation to generate the proxies possessed by that organism were:

  1. Start with a dictionary containing each proxy and an empty list to add scores to.
  2. For each proxy in the Simple Scoring sheet:
    1. First, randomly select the probability that the organism possesses the proxy from a uniform distribution whose range maps onto the judgment determined by our contractors. The probability map is as follows:
      1. No: [0.00, 0.00] (Used for the “motile” proxy)
      2. Likely no: [0, 0.25)
      3. Lean no: [0.25, 0.5)
      4. Lean yes: [0.5, 0.75)
      5. Likely yes: [0.75, 1.0]
      6. Yes: [1.00, 1.00] (Used for the “motile” proxy)

For example, if pigs scored a “Likely yes” on taste aversion behavior, then the probability that pigs exhibit taste aversion behavior is sampled uniformly over the interval [0.75, 1.0]. If a proxy was judged “Unknown”, then we defaulted to giving a zero probability of it being present; however, this default can be changed at the start of running the program.

    2. Second, generate a Bernoulli random variable using this probability of the species possessing the trait, where 1 indicates that the trait is present and 0 indicates that it is absent.
    3. Add the score (0 or 1) to the list corresponding to its particular proxy in the dictionary.

For a given organism, this process was repeated for 10,000 simulations, where each proxy’s score in a given simulation was appended to its respective list. Then, we repeated this procedure for all eighteen organism types studied and saved the simulated proxy data.
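The quoted steps can be sketched as follows (the proxy judgments, and the `simulate_proxies` and `PROBABILITY_MAP` names, are my own hypothetical illustration, not RP's code):

```python
import random

# Probability intervals for each judgment, following the map quoted above.
PROBABILITY_MAP = {
    "No": (0.0, 0.0),
    "Likely no": (0.0, 0.25),
    "Lean no": (0.25, 0.5),
    "Lean yes": (0.5, 0.75),
    "Likely yes": (0.75, 1.0),
    "Yes": (1.0, 1.0),
    "Unknown": (0.0, 0.0),  # default of zero probability, as in RP's final runs
}

def simulate_proxies(judgments, n_simulations=10_000, seed=0):
    """For each proxy, sample a probability uniformly from its judgment's
    interval, then draw a Bernoulli score, repeated across simulations."""
    rng = random.Random(seed)
    scores = {proxy: [] for proxy in judgments}  # step 1: dict of empty lists
    for _ in range(n_simulations):
        for proxy, judgment in judgments.items():
            low, high = PROBABILITY_MAP[judgment]
            p = rng.uniform(low, high)                   # step 2.1: sample probability
            scores[proxy].append(int(rng.random() < p))  # steps 2.2-2.3: Bernoulli score
    return scores

# Hypothetical judgments for one organism.
scores = simulate_proxies({"taste aversion": "Likely yes", "motile": "Yes"})
```

A "Likely yes" proxy ends up present in roughly 87.5 % of simulations (the mean of a uniform probability over [0.75, 1.0]), which shows how the graded judgments produce non-binary presence rates even though each single simulation is a 0/1 draw.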

RP did not use a binary system to determine the welfare range conditional on sentience, and actually underestimated this by giving zero weight to proxies for which there was no information (see what I bolded below).

To generate the distributions of welfare ranges across species and models, the user must answer the same three questions about whether to give non-zero probability to “Unknown,” “Lean no,” and “Likely no” judgments and what weight should be given to proxies we’re highly confident matter for welfare ranges as were asked for the probabilities of sentience. As before, users can change the probabilities given to “Unknowns” for one or more species of their choosing. 

In our final simulations: 

  1. We chose not to assign any weight to proxies with “unknown” judgments for any species. This likely leads to underestimating the welfare ranges for several animal types.
  2. We chose to assign positive probabilities to proxies whose judgments are “likely no” and “lean no.”
  3. We weighted the proxies that we are highly confident matter for welfare ranges as being five times as important as all other proxies.

giving less complex pain response behaviors a fraction of more complex ones

I do not understand what you are referring to. Which specific proxies do you think should be weighted more heavily?

I think the 4th input seems absurd and I won’t rehash this much as many others have made arguments against your reasoning on this thread. You’re translating a figure which is on the upper bound of judging severe human pain (which like Bruce said, by definition can’t last long) directly onto what you think might be happening in mosquitos – a wildly different organism.

For all the analyses relying on pain intensities I am aware of, from AIM and RP, the ratio between the intensity of a pain of a given category and that of another is the same across species. I have now asked Cynthia Schuck-Paim, who is the research director of WFP (the organisation defining the pain intensities).

I agree excruciating pain "can’t last long", but this is consistent with my estimate that ITNs cause 119 s of excruciating pain per mosquito they kill.

On what grounds really would mosquitos dying of poisoning likely cause that severe pain at a best guess? I think its possible but very unlikely so I think it would be reasonable for the sake of conservatism to reduce this number by orders of magnitude.

As I say in the post, my estimates for the time in pain come from aggregating 3 sets of estimates provided by WFP's GPT Pain-Track. My estimate of 119 s of excruciating pain may well not be accurate, but what evidence do you have for it being "possible but very unlikely" to be that long?

4. On the number of mosquitos front for a start I don’t like comments like “my takeways would probably be the same even if….” Multipliers can add up, and we’re trying to move towards accuracy so I think it can be an unhelpful copout to question how much any element of an analysis matters – Rethink Priorities said things like this a number of times during their moral weights project which was a small red flag for me.

I agree. At the same time, I think it is worth having in mind how far one is from reversing the conclusions.

I agree there’s no empirical research on the mosquito number front, but from my perspective having travelled around Africa and living in a grass thatched hut and sleeping under a mosquito net for the last 10 years, 24 mosquitos killed a day on average per net seems extremely unlikely. That would be something like 240 million mosquitos killed by nets alone every day in Uganda – which seems to me perhaps plausible but unlikely. From a distance I think you could have been more conservative with your “best guess”

As I replied to Bruce:

I estimate GW's last grant to AMF will kill 0.0183 % as many mosquitoes as the ones currently alive globally over the lifetime of the bednets it funds. This would correspond to killing 1.19 % (= 1.83*10^-4*195/3) of the mosquitoes in DRC per year assuming mosquitoes were uniformly distributed across the existing 195 countries, and that the nets funded by the grant are distributed over 3 years. In reality, it would be less because DRC should have more mosquitoes than a random country. AMF being responsible for killing less than 1.19 % of the mosquitoes in DRC does not sound implausible.

Hi Cynthia,

Under your framework, if one supposes hurtful, disabling and excruciating pain are a, b and c times as intense as annoying pain in humans, should the same apply to other species? Or could the intensity ratios vary across species? My understanding is that the ratios are supposed to be constant across species.
