bruce

2648 karma · Joined Oct 2021

Bio

Doctor from NZ, independent researcher (grand futures / macrostrategy) collaborating with FHI / Anders Sandberg. Previously: Global Health & Development research @ Rethink Priorities.

Feel free to reach out if you think there's anything I can do to help you or your work, or if you have any Qs about Rethink Priorities! If you're a medical student / junior doctor reconsidering your clinical future, or if you're quite new to EA / feel uncertain about how you fit in the EA space, have an especially low bar for reaching out.

Outside of EA, I do a bit of end of life care research and climate change advocacy, and outside of work I enjoy some casual basketball, board games and good indie films. (Very) washed up classical violinist and Oly-lifter.

All comments in personal capacity unless otherwise stated.

Posts
8

Sorted by New

bruce's Quick takes

bruce

· 2y ago · 1m read

Screwworm Free Future is hiring for a Director

Nia

· 11d ago · 2m read

215

Launching Screwworm-Free Future – Funding and Support Request

lroberts

· 1mo ago · 7m read

126

Articles about recent OpenAI departures

bruce

· 9mo ago · 2m read

Historical Global Health R&D “hits”: Development, main sources of funding, and impact

Rethink Priorities

· 2y ago · 3m read

Better weather forecasting: Agricultural and non-agricultural benefits in low- and lower-middle-income countries

Rethink Priorities

· 2y ago · 4m read

Our research process: an overview from Rethink Priorities’ Global Health and Development team

Rethink Priorities

· 2y ago · 8m read

109

How effective are prizes at spurring innovation?

Rethink Priorities

· 2y ago · 103m read

Comments
126

StrongMinds should not be a top-rated charity (yet)

bruce · 2y ago · 76

Thanks for writing this post!

I feel a little bad linking to a comment I wrote, but the thread is relevant to this post, so I'm sharing in case it's useful for other readers, though there's definitely a decent amount of overlap here.

TL;DR

I personally default to being highly skeptical of any mental health intervention that claims to have a ~95% success rate + a PHQ-9 reduction of 12 points over 12 weeks, as this is a clear outlier in treatments for depression. The effectiveness figures from StrongMinds are also based on studies that are non-randomised and poorly controlled. There are other questionable methodology issues, e.g. surrounding adjusting for social desirability bias. The topline figure of $170 per head for cost-effectiveness is also possibly an underestimate, because while ~48% of clients were treated through SM partners in 2021, and Q2 results (pg 2) suggest StrongMinds is on track for ~79% of clients treated through partners in 2022, the expenses and operating costs of partners responsible for these clients were not included in the methodology.

(This mainly came from a cursory review of StrongMinds documents, and not from examining HLI analyses, though I do think "we’re now in a position to confidently recommend StrongMinds as the most effective way we know of to help other people with your money" seems a little overconfident. This is also not a comment on the appropriateness of recommendations by GWWC / FP)

(commenting in personal capacity etc)

Edit:
Links to existing discussion on SM. Much of this ends up touching on discussions around HLI's methodology / analyses as opposed to the strength of evidence in support of StrongMinds, but including as this is ultimately relevant for the topline conclusion about StrongMinds (inclusion =/= endorsement etc):

StrongMinds should not be a top-rated charity (yet)
- Comments (1, 2) about outsider perception of HLI as an advocacy org
- Comment about ideal role of an org like HLI, as well as trying to decouple the effectiveness of StrongMinds with whether or not WELLBYs / subjective wellbeing scores are valuable or worth more research on the margin.
- Twitter exchange between Berk Özler and Johannes Haushofer, particularly relevant given Özler's role in an upcoming RCT of StrongMinds in Uganda (though only targeted towards adolescent girls)
Evaluating StrongMinds: how strong is the evidence? and the comment section. In particular:
- Thread 1
- Thread 2
James Snowden's analysis of household spillovers
GiveWell's Assessment of Happier Lives Institute’s Cost-Effectiveness Analysis of StrongMinds
Comments in the post: The Happier Lives Institute is funding constrained and needs you!
- Greg claims "study registration reduces expected effect size by a factor of 3"
- Topline finding weighted 13% from StrongMinds RCT, where d = 1.72
- "this is a very surprising mistake for a diligent and impartial evaluator to make"
- Greg commits to: "donat[ing] 5k USD if the [Baird] RCT reports an effect size greater than d = 0.4 - 2x smaller than HLI's estimate of ~ 0.8, and below the bottom 0.1% of their monte carlo runs."
- Comment thread on discussion being harsh and "epistemic probation"
- James and Alex push back on some claims they consider to be misleading.
Learning from our mistakes: how HLI plans to improve
Update on the Baird RCT

Brainstorming ways to make EA safer and more inclusive

bruce · 2y ago · 23

While I agree that both sides are valuable, I agree with the anon here - I don't think these tradeoffs are particularly relevant to a community health team investigating interpersonal harm cases with the goal of "reduc[ing] risk of harm to members of the community while being fair to people who are accused of wrongdoing".

One downside of having the bad-ness of, say, sexual violence^[1] be mitigated by the accused person's perceived impact (how is the community health team actually measuring this? how good someone's forum posts are? or whether they work at an EA org? or whether they are "EA leadership"?) when considering what the appropriate action should be (if this is happening) is that it plausibly leads to different standards for bad behaviour. By the community health team's own standards, taking someone's potential impact into account as a mitigating factor seems like it could increase the risk of harm to members of the community (by not taking sufficient action with the justification of perceived impact), while being more unfair to people who are accused of wrongdoing. To be clear, I'm basing this off the forum post, not any non-public information.

Additionally, a common theme about basically every sexual violence scandal that I've read about is that there were (often multiple) warnings beforehand that were not taken seriously.

If there is a major sexual violence scandal in EA in the future, it will be pretty damning if the warnings and concerns were clearly raised, but the community health team chose not to act because they decided it wasn't worth the tradeoff against the person/people's impact.

Another point is that people who are considered impactful are likely to be somewhat correlated with people who have gained respect and power in the EA space, have seniority or leadership roles etc. Given the role that abuse of power plays in sexual violence, we should be especially cautious of considerations that might indirectly favour those who have power.

More weakly, even if you hold the view that it is in fact the community health team's role to "take the talent bottleneck seriously; don’t hamper hiring / projects too much" when responding to say, a sexual violence allegation, it seems like it would be easy to overvalue the bad-ness of the immediate action against the person's impact, and undervalue the bad-ness of many more people opting to not get involved, or distance themselves from the EA movement because they perceive it to be an unsafe place for women, with unreliable ways of holding perpetrators accountable.

That being said, I think the community health team has an incredibly difficult job, and while they play an important role in mediating community norms and dynamics (and thus have corresponding amount of responsibility), it's always easier to make comments of a critical nature than to make the difficult decisions they have to make. I'm grateful they exist, and don't want my comment to come across like an attack of the community health team or its individuals!

(commenting in personal capacity etc)

^[1] Used as an umbrella term to include things like verbal harassment. See definition here.

CEA/EV + OP + RP should engage an independent investigator to determine whether key figures in EA knew about the (likely) fraud at FTX

bruce · 2y ago · 75

If this comment is more about "how could this have been foreseen", then this comment thread may be relevant. I should note that hindsight bias means that it's much easier to look back and assess problems as obvious and predictable ex post, when powerful investment firms and individuals who also had skin in the game also missed this.

TL;DR:
1) There were entries that were relevant (this one also touches on it briefly)
2) They were specifically mentioned
3) There were comments relevant to this. (notably one of these was apparently deleted because it received a lot of downvotes when initially posted)
4) There have been at least two other posts on the forum prior to the contest that engaged with this specifically

My tentative take is that these issues were in fact identified by various members of the community, but there isn't a good way of turning identified issues into constructive actions - the status quo is we just have to trust that organisations have good systems in place for this, and that EA leaders are sufficiently careful and willing to make changes or consider them seriously, such that all the community needs to do is "raise the issue". And I think looking at the systems within the relevant EA orgs or leadership is what investigations or accountability questions going forward should focus on - all individuals are fallible, and we should be looking at how we can build systems in place such that the community doesn't have to just trust that people who have power and who are steering the EA movement will get it right, and that there are ways for the community to hold them accountable to their ideals or stated goals if it appears to, or risks not playing out in practice.

i.e. if there are good processes and systems in place and documentation of these processes and decisions, it's more acceptable (because other organisations that probably have a very good due diligence process also missed it). But if there weren't good processes, or if these decisions weren't careful + intentional, then that's comparatively more concerning, especially in the context of specific criticisms that have been raised,^[1] or previous precedent. For example, I'd be especially curious about the events surrounding Ben Delo,^[2] and processes that were implemented in response. I'd be curious about whether there are people in EA orgs involved in steering who keep track of potential risks and early warning signs to the EA movement, in the same way the EA community advocates for in the case of pandemics, AI, or even general ways of finding opportunities for impact. For example, SBF, who is listed as an EtG success story on 80k hours, has publicly stated he's willing to go 5x over the Kelly bet, and described yield farming in a way that Matt Levine interpreted as a Ponzi. Again, I'm personally less interested in the object level decision (e.g. whether or not we see SBF's Kelly bet comments as serious, or Levine's interpretation as appropriate), and more interested in what the process was, how this was considered at the time with the information they had etc. I'd also be curious about the documentation of any SBF related concerns that were raised by the community, if any, and how these concerns were managed and considered (as opposed to critiquing the final outcome).

Outside of due diligence and ways to facilitate whistleblowers, decision-making processes around the steering of the EA movement are crucial as well. When decisions are made by orgs that bring clear benefits to one part of the EA community while bringing clear risks that are shared across wider parts of the EA community,^[3] it would probably be of value to look at how these decisions were made and what tradeoffs were considered at the time of the decision. Going forward, it would be worth thinking about how to either diversify those risks, or make decision-making more inclusive of a wider range of stakeholders,^[4] keeping in mind the best interests of the EA movement as a whole.

(this is something I'm considering working on in a personal capacity along with the OP of this post, as well as some others - details to come, but feel free to DM me if you have any thoughts on this. It appears that CEA is also already considering this)

If this comment is about "are these red-teaming contests in fact valuable for the money and time put into it, if it misses problems like this"

I think my view here (speaking only for the red-teaming contest) is that even if this specific contest was framed in a way that it missed these classes of issues, the value of the very top submissions^[5] may still have made the efforts worthwhile. The potential value of a different framing was mentioned by another panelist. If it's the case that red-teaming contests are systematically missing this class of issues regardless of framing, then I agree that would be pretty useful to know, but I don't have a good sense of how we would try to investigate this.

^[1] This tweet seems to have aged particularly well. Despite supportive comments from high-profile EAs on the original forum post, the author seemed disappointed that nothing came of it in that direction. Again, without getting into the object level discussion of the claims of the original paper, it's still worth asking questions around the processes. If there were actions planned, what did these look like? If not, was that because of a disagreement over the suggested changes, or the extent that it was an issue at all? How were these decisions made, and what was considered?
^[2] Apparently a previous EA-aligned billionaire ?donor who got rich by starting a crypto trading firm, who pleaded guilty to violating the Bank Secrecy Act
^[3] Even before this, I had heard from a primary source in a major mainstream global health organisation that there were staff who wanted to distance themselves from EA because of misunderstandings around longtermism.
^[4] This doesn't have to be a lengthy deliberative consensus-building project, but it should at least include internal comms across different EA stakeholders to allow discussions of risks and potential mitigation strategies.
^[5] e.g. A critical review of GiveWell's 2022 cost-effectiveness model, Methods for improving uncertainty analysis in EA cost-effectiveness models, and Biological Anchors external review

Winners of the EA Criticism and Red Teaming Contest

bruce · 2y ago · 47

As requested, here are some submissions that I think are worth highlighting, or considered awarding but ultimately did not make the final cut. (This list is non-exhaustive, and should be taken more lightly than the Honorable mentions, because by definition these posts are less strongly endorsed by those who judged it. Also commenting in personal capacity, not on behalf of other panelists, etc):

Bad Omens in Current Community Building
I think this was a good-faith description of some potential / existing issues that are important for community builders and the EA community, written by someone who "did not become an EA" but chose to go to the effort of providing feedback with the intention of benefitting the EA community. While these problems are difficult to quantify, they seem important if true, and pretty plausible based on my personal priors/limited experience. At the very least, this starts important conversations about how to approach community building that I hope will lead to positive changes, and a community that continues to strongly value truth-seeking and epistemic humility, which is personally one of the benefits I've valued most from engaging in the EA community.

Seven Questions for Existential Risk Studies
It's possible that the length and academic tone of this piece detract from the reach it could have, and it (perhaps aptly) leaves me with more questions than answers, but I think the questions are important to reckon with, and this piece covers a lot of (important) ground. To quote a fellow (more eloquent) panelist, whose views I endorse: "Clearly written in good faith, and consistently even-handed and fair - almost to a fault. Very good analysis of epistemic dynamics in EA." On the other hand, this is likely less useful to those who are already very familiar with the ERS space.

Most problems fall within a 100x tractability range (under certain assumptions)
I was skeptical when I read this headline, and while I'm not yet convinced that 100x tractability range should be used as a general heuristic when thinking about tractability, I certainly updated in this direction, and I think this is a valuable post that may help guide cause prioritisation efforts.

The Effective Altruism movement is not above conflicts of interest
I was unsure about including this post, but I think this post highlights an important risk of the EA community receiving a significant share of its funding from a few sources, both for internal community epistemics/culture considerations as well as for external-facing and movement-building considerations. I don't agree with all of the object-level claims, but I think these issues are important to highlight and plausibly relevant outside of the specific case of SBF / crypto. That it wasn't already on the forum (afaict) also contributed to its inclusion here.

I'll also highlight one post that was awarded a prize, but I thought was particularly valuable:

Red Teaming CEA’s Community Building Work
I think this is particularly valuable because of the unique and difficult-to-replace position that CEA holds in the EA community, and as Max acknowledges, it benefits the EA community for important public organisations to be held accountable (and to a standard that is appropriate for their role and potential influence). Thus, even if listed problems aren't all fully on the mark, or are less relevant today than when the mistakes happened, a thorough analysis of these mistakes and an attempt at providing reasonable suggestions at least provides a baseline to which CEA can be held accountable for similar future mistakes, or help with assessing trends and patterns over time. I would personally be happy to see something like this on at least a semi-regular basis (though am unsure about exactly what time-frame would be most appropriate). On the other hand, it's important to acknowledge that this analysis is possible in large part because of CEA's commitment to transparency.

Dylan Richardson's Quick takes

bruce · 4d ago · 2

Appreciate this! There is a decent amount happening; can you DM me with a bit more info about yourself / what you'd be willing to help with?

Insecticide-treated nets significantly harm mosquitoes, but one can easily offset this?

bruce · 4d ago · 6

The claim isn't that your answers don't fit your definitions/methodologies, but that given highly unintuitive conclusions, one should more strongly consider questioning the methodology / definitions you use.

For example, the worst death imaginable for a human is, to a first approximation, capped at a couple of minutes of excruciating pain (or a couple of factors of this), since you value excruciating pain at 10,000 times as bad as the next category, and say that by definition excruciating pain can't exist for more than a few minutes. But this methodology will be unlikely to accurately capture a lot of extremely bad states of suffering that humans can have. On the other hand, it is much easier to scale even short periods of excruciating suffering with high numbers of animals, especially when you're happy to consider ~8 million mosquitos killed per human life saved by a bednet - I don't have empirical evidence to the contrary, but this seems rather high.

Here's another sense check to illustrate this (please check if I've got the maths right here!):
- GiveWell estimates "5.53 deaths averted per 1000 children protected per year", i.e. 0.00553 lives saved per year of protection for a child, or 1 life saved per 180.8 children protected per year.
- They model 1.8 children under each bednet, on average.

This means it requires approximately 100 bednets over the course of 1 year to save 1 life/~50 DALYs.

At your preferred rate of 1 mosquito death per hour per net,^[1] this comes to approximately 880,000 mosquito deaths per life saved,^[2] which is ~~3 OOMs~~ 1 OOM lower than the ~8 million you would reach if you do the "excruciating pain" calculation, assuming your 763x claim is correct.^[3]
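For anyone who wants to re-run the sense check, here is a minimal Python sketch of the arithmetic above. The function and constant names are my own for illustration; the inputs are just the figures quoted in this comment (180.8 children per life saved, 1.8 children per net, 1 mosquito death per net-hour, 763x, 14.3 days, 2 minutes), and none of them are endorsed by the sketch.

```python
import math

# Figures quoted in the comment; names are illustrative, not from any source.
CHILDREN_PROTECTED_PER_LIFE_SAVED = 180.8  # rounded from 1000 / 5.53
CHILDREN_PER_NET = 1.8
HOURS_PER_YEAR = 24 * 365


def mosquito_deaths_per_life_saved(deaths_per_net_hour=1.0):
    """Mosquito deaths per human life saved at a given kill rate per net-hour."""
    nets_per_life = CHILDREN_PROTECTED_PER_LIFE_SAVED / CHILDREN_PER_NET
    return nets_per_life * HOURS_PER_YEAR * deaths_per_net_hour


def mosquito_deaths_implied_by_pain_model(
    givewell_multiple=763,
    mosquito_pain_days_per_life=14.3,
    pain_minutes_per_mosquito=2,
):
    """Mosquito deaths per life saved implied by the 'excruciating pain' figures."""
    pain_minutes_per_life = mosquito_pain_days_per_life * 24 * 60
    return givewell_multiple * pain_minutes_per_life / pain_minutes_per_mosquito


def orders_of_magnitude_gap():
    """log10 of the ratio between the two estimates."""
    return math.log10(
        mosquito_deaths_implied_by_pain_model() / mosquito_deaths_per_life_saved()
    )


if __name__ == "__main__":
    print(round(mosquito_deaths_per_life_saved()))
    print(round(mosquito_deaths_implied_by_pain_model()))
    print(round(orders_of_magnitude_gap(), 2))
```

Changing `deaths_per_net_hour` is the easiest way to see how sensitive the gap is to the kill-rate assumption.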

(I may not continue engaging on this thread due to capacity constraints, but appreciate the responses!)

^[1] Here I make no claims about the reasonableness of 1 mosquito per hour killed by the net, as I don't have any empirical data on this. I'm more uncertain than Nick is, but also note that he has more relevant experience than I do here.
^[2] 180.8/1.8 * 24 * 365 = 879,893
^[3] Assuming 763x GiveWell is correct, a tradeoff of 14.3 days of mosquito excruciating pain (MEP) for 1 happy human life, and 2 minutes of MEP per mosquito, this requires a tradeoff of 7.9 million mosquitos killed for one human life saved.
763*(14.3*24*60)/2 = 7,855,848

Dylan Richardson's Quick takes

bruce · 5d ago · 3

Don't have a lot of details to share right now but there are a bunch of folks coordinating on things to this effect - though if you have ideas or suggestions or people to put forward feel free to DM!

Insecticide-treated nets significantly harm mosquitoes, but one can easily offset this?

bruce · 5d ago · 6

The values I provide are not my personal best guesses for point estimates, but conservative estimates that are sufficient to meaningfully weaken your topline conclusions. In practice, even the assumptions I just listed would be unintuitive to most if used as the bar!

I agree "what fits intuition" is often a bad way of evaluating claims, but this is in context of me saying "I don't know where exactly to draw the line here, but 14.3 mosquito days of excruciating suffering for one happy human life seems clearly beyond it."
It seems entirely plausible that a human might take a tradeoff of 100x less duration (3.5 hours * 100 is ~14.5 days), and also value human:mosquito tradeoff at >100x. It wouldn't be difficult to suggest another OOM in both directions for the same conclusion.
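As a quick check of the parenthetical unit conversion only, a small Python sketch follows. The function names and the `duration_factor` parameter are hypothetical labels of mine; the sketch makes no claim about how the duration factor and the human:mosquito factor should be combined.

```python
HOURS_PER_DAY = 24


def scaled_duration_in_days(hours=3.5, duration_factor=100):
    """Convert `hours` scaled up by `duration_factor` into days."""
    return hours * duration_factor / HOURS_PER_DAY


def scaled_down_duration_in_hours(days=14.3, duration_factor=100):
    """Convert `days` scaled down by `duration_factor` into hours."""
    return days * HOURS_PER_DAY / duration_factor


if __name__ == "__main__":
    print(round(scaled_duration_in_days(), 2))
    print(round(scaled_down_duration_in_hours(), 2))
```

So 3.5 hours scaled by 100 is a little over 14.5 days, and 14.3 days scaled down by 100 is a little under 3.5 hours, which is why the two figures are treated as roughly interchangeable here.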

The main thing I'm gesturing at is that for a conclusion as unintuitive as "2 mosquito weeks of excruciating suffering cancels out 1 happy human life", I think it's reasonable to consider that there might be other explanations, including e.g. underlying methodological flaws (and in retrospect perhaps inconsistent isn't the right word, maybe 'inaccurate' is better).

For example, by your preferred working definition of excruciating pain, it definitionally can't exist for more than a few minutes at a time before neurological shutdown. I think this isn't necessarily unreasonable, but there might be failure modes in your approach when basically all of your BOTECs come down to "which organisms have more aggregate seconds of species-adjusted excruciating pain".

Insecticide-treated nets significantly harm mosquitoes, but one can easily offset this?

bruce · 6d ago · 6

I estimate 14.3 mosquito-days of excruciating pain neutralise the benefits of the additional human welfare from saving 1 life under GW’s moral weights.

Makes sense - just to clarify:

My previous (mis)interpretation of you suggesting 11 minutes of MEP trading off against 1 day of fully healthy human life would indicate a tradeoff of 11 / (24*60) = 0.0076.

Your clarification is that 14.3 mosquito-days trades off against 1 life:
assuming 1 life as 50 DALYs this is 14.3 / (50*365.25) = 0.00078

So it seems like my misinterpretation was ~10x overvaluing the human side compared to your true view?
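The two ratios above can be reproduced with a short Python sketch. The names are illustrative, and the 50 DALYs per life and 365.25 days per year are the same assumptions used in the lines above, not independent estimates.

```python
MINUTES_PER_DAY = 24 * 60
DAYS_PER_YEAR = 365.25


def misread_tradeoff(pain_minutes=11):
    """Days of mosquito excruciating pain per healthy human day (my misreading)."""
    return pain_minutes / MINUTES_PER_DAY


def clarified_tradeoff(mosquito_pain_days=14.3, dalys_per_life=50):
    """Days of mosquito excruciating pain per healthy human day (clarified view)."""
    return mosquito_pain_days / (dalys_per_life * DAYS_PER_YEAR)


def misreading_factor():
    """How many times larger the misread ratio is than the clarified one."""
    return misread_tradeoff() / clarified_tradeoff()


if __name__ == "__main__":
    print(round(misread_tradeoff(), 4))
    print(round(clarified_tradeoff(), 5))
    print(round(misreading_factor(), 1))
```

The factor comes out just under 10, which is where the "~10x" above comes from.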

I understand that may seem very little time, but I do not think it can be dismissed just on the basis of seeming surprising. I would say one should focus on checking whether the results mechanistically follow from the inputs, and criticing these:

My view is probably something like:
"I think on the margin most people should be more willing to entertain radical-seeming ideas rather than intuitions given unknown unknowns about moral catastrophes we might be contributing to, but I also think the implicit claim^[1] I'm happy to back here is that if your BOTEC spits out a result of '14.3 mosquito days of excruciating pain trades off with 50 human years of fully healthy life' then I do expect on priors that some combination of your inputs / assumptions / interpretation of the evidence etc have led to a result that is likely many factors (if not OOMs) off the true value (if we magically found out what it was (and I think such a surprising result should also prompt similar kinds of thoughts on your end!)). I'll admit I don't have a strong sense of how to draw a hard line here, but I can imagine for this specific case that I might expect the tradeoff for humans is closer to 3.5 hours of excruciating pain vs a life, and that I value / expect the human capacity for welfare to be >100x that of a mosquito. If you believe both of those to be true then you'd reject your conclusion."

Another thing to consider might be something like "the way you count/value excruciating pain in humans vs in animals is inconsistent in a way that systematically gives results in favour of animals"

I don't have too much to offer here in terms of this - I just wanted to know what the implied trade-off actually was and have it spelled out.

^[1] Referring only to this specific example, not necessarily other posts of yours I've commented on

Insecticide-treated nets significantly harm mosquitoes, but one can easily offset this?

bruce · 6d ago · 6

Gotcha RE: 23.9secs / 11mins, thanks for the clarification!

Looking at this figure you are trading off 7910000 * 2 minutes of MEP for a human death averted, which is 15820000 minutes, which is ~30 mosquito years^[1] of excruciating pain trading off for 50 human years of a practically maximally happy life.

Is this a correct representation of your views?

(Btw just flagging that I think I edited my comment as you were responding to it RE: 1.3~37 trillion figures, I realised I divided by 2 instead of by 120 (minutes instead of seconds).)

^[1] 7910000 * 2 / 60 / 24 / 365.25 = 30.08
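The footnote conversion can be checked with a minimal Python sketch. The function name and parameter names are mine; the 7,910,000 mosquitoes and 2 minutes of MEP per mosquito are the figures quoted in this comment.

```python
MINUTES_PER_YEAR = 60 * 24 * 365.25


def mosquito_pain_years(mosquitoes=7_910_000, pain_minutes_per_mosquito=2):
    """Total mosquito excruciating pain, in years, across all mosquitoes killed."""
    total_minutes = mosquitoes * pain_minutes_per_mosquito
    return total_minutes / MINUTES_PER_YEAR


if __name__ == "__main__":
    print(round(mosquito_pain_years(), 2))
```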
