bruce

2485 karma

Bio

Doctor from NZ, independent researcher (grand futures / macrostrategy) collaborating with FHI / Anders Sandberg. Previously: Global Health & Development research @ Rethink Priorities.

Feel free to reach out if you think there's anything I can do to help you or your work, or if you have any Qs about Rethink Priorities! If you're a medical student / junior doctor reconsidering your clinical future, or if you're quite new to EA / feel uncertain about how you fit in the EA space, have an especially low bar for reaching out.

Outside of EA, I do a bit of end-of-life care research and climate change advocacy, and outside of work I enjoy some casual basketball, board games and good indie films. (Very) washed up classical violinist and Oly-lifter.

All comments in personal capacity unless otherwise stated.

Comments (116)

bruce

Thanks for writing this post!

I feel a little bad linking to a comment I wrote, but the thread is relevant to this post, so I'm sharing in case it's useful for other readers, though there's definitely a decent amount of overlap here.

TL;DR

I personally default to being highly skeptical of any mental health intervention that claims a ~95% success rate and a 12-point PHQ-9 reduction over 12 weeks, as this is a clear outlier among treatments for depression. The effectiveness figures from StrongMinds are also based on studies that are non-randomised and poorly controlled, and there are other questionable methodological choices, e.g. around adjusting for social desirability bias. The topline cost-effectiveness figure of $170 per head is also possibly an underestimate: ~48% of clients were treated through SM partners in 2021, and Q2 results (pg 2) suggest StrongMinds is on track for ~79% of clients treated through partners in 2022, but the expenses and operating costs of the partners responsible for these clients were not included in the methodology.
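To illustrate that last point, here's a minimal sketch of how excluding partner expenses can deflate a cost-per-client figure. Only the partner-share percentages come from the paragraph above; every dollar and client figure below is a hypothetical placeholder:

```python
# Sketch of how excluding partner expenses can deflate a cost-per-client figure.
# The partner-share percentages (~48% in 2021, ~79% projected for 2022) are from
# the paragraph above; every dollar and client figure below is hypothetical.

def cost_per_client(own_expenses, partner_expenses, total_clients, include_partner_costs):
    """Cost per client treated, with or without partner operating costs."""
    total_cost = own_expenses + (partner_expenses if include_partner_costs else 0)
    return total_cost / total_clients

total_clients = 10_000       # hypothetical
own_expenses = 1_700_000     # hypothetical; chosen to reproduce the ~$170/head headline
partner_expenses = 800_000   # hypothetical costs borne by partners for their share of clients

naive = cost_per_client(own_expenses, partner_expenses, total_clients, include_partner_costs=False)
adjusted = cost_per_client(own_expenses, partner_expenses, total_clients, include_partner_costs=True)
print(f"Naive cost per client:    ${naive:.0f}")     # $170
print(f"Adjusted cost per client: ${adjusted:.0f}")  # $250 under these made-up numbers
```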

(This mainly came from a cursory review of StrongMinds documents, and not from examining HLI analyses, though I do think "we’re now in a position to confidently recommend StrongMinds as the most effective way we know of to help other people with your money" seems a little overconfident. This is also not a comment on the appropriateness of recommendations by GWWC / FP)

 

(commenting in personal capacity etc)

 

Edit:
Links to existing discussion on SM. Much of this ends up touching on HLI's methodology / analyses rather than the strength of evidence in support of StrongMinds, but I'm including it as it's ultimately relevant to the topline conclusion about StrongMinds (inclusion =/= endorsement etc):

bruce

While I think both considerations are valuable, I agree with the anon here - I don't think these tradeoffs are particularly relevant to a community health team investigating interpersonal harm cases with the goal of "reduc[ing] risk of harm to members of the community while being fair to people who are accused of wrongdoing".

One downside of having the badness of, say, sexual violence[1] be mitigated by the perceived impact of the person responsible (how is the community health team actually measuring this? how good someone's forum posts are? whether they work at an EA org? whether they are "EA leadership"?) when considering what the appropriate action should be (if this is happening) is that it plausibly leads to different standards for bad behaviour. By the community health team's own standards, taking someone's potential impact into account as a mitigating factor seems like it could increase the risk of harm to members of the community (by not taking sufficient action, with perceived impact as the justification), while being less fair to people who are accused of wrongdoing. To be clear, I'm basing this off the forum post, not any non-public information.

Additionally, a common theme about basically every sexual violence scandal that I've read about is that there were (often multiple) warnings beforehand that were not taken seriously.

If there is a major sexual violence scandal in EA in the future, it will be pretty damning if the warnings and concerns were clearly raised, but the community health team chose not to act because they decided it wasn't worth the tradeoff against the person/people's impact.

Another point is that people who are considered impactful are likely to be somewhat correlated with people who have gained respect and power in the EA space, have seniority or leadership roles etc. Given the role that abuse of power plays in sexual violence, we should be especially cautious of considerations that might indirectly favour those who have power.

More weakly, even if you hold the view that it is in fact the community health team's role to "take the talent bottleneck seriously; don’t hamper hiring / projects too much" when responding to, say, a sexual violence allegation, it seems like it would be easy to overvalue the cost that immediate action imposes on the person's impact, and to undervalue the cost of many more people opting not to get involved, or distancing themselves from the EA movement, because they perceive it to be an unsafe place for women with unreliable ways of holding perpetrators accountable.

That being said, I think the community health team has an incredibly difficult job, and while they play an important role in mediating community norms and dynamics (and thus carry a corresponding amount of responsibility), it's always easier to make comments of a critical nature than to make the difficult decisions they have to make. I'm grateful they exist, and don't want my comment to come across as an attack on the community health team or its individuals!

(commenting in personal capacity etc)

  1. ^

    used as an umbrella term to include things like verbal harassment. See definition here.

bruce

If this comment is more about "how could this have been foreseen", then this comment thread may be relevant. I should note that hindsight bias means that it's much easier to look back and assess problems as obvious and predictable ex post, when powerful investment firms and individuals who also had skin in the game also missed this. 

TL;DR: 
1) There were entries that were relevant (this one also touches on it briefly)
2) They were specifically mentioned
3) There were comments relevant to this. (notably one of these was apparently deleted because it received a lot of downvotes when initially posted)
4) There have been at least two other posts on the forum prior to the contest that engaged with this specifically

My tentative take is that these issues were in fact identified by various members of the community, but that there isn't a good way of turning identified issues into constructive action - the status quo is that we just have to trust that organisations have good systems in place for this, and that EA leaders are sufficiently careful and willing to make changes or consider them seriously, such that all the community needs to do is "raise the issue". I think the systems within the relevant EA orgs or leadership are what investigations or accountability questions going forward should focus on - all individuals are fallible, and we should be looking at how to build systems such that the community doesn't have to simply trust that the people who hold power and are steering the EA movement will get it right, and such that there are ways for the community to hold them accountable to their ideals or stated goals if these appear not to be playing out in practice, or risk not doing so.

i.e. if there are good processes and systems in place, and documentation of these processes and decisions, it's more acceptable (because other organisations that probably have very good due diligence processes also missed it). But if there weren't good processes, or if these decisions weren't careful and intentional, then that's comparatively more concerning, especially in the context of specific criticisms that have been raised,[1] or previous precedent. For example, I'd be especially curious about the events surrounding Ben Delo,[2] and the processes that were implemented in response. I'd be curious about whether there are people in EA orgs involved in steering who keep track of potential risks and early warning signs to the EA movement, in the same way the EA community advocates for in the case of pandemics, AI, or even general ways of finding opportunities for impact. For example, SBF, who is listed as an EtG success story on 80k hours, has publicly stated he's willing to go 5x over the Kelly bet, and described yield farming in a way that Matt Levine interpreted as a Ponzi. Again, I'm personally less interested in the object-level decision (e.g. whether or not SBF's Kelly bet comments should have been taken seriously, or whether Levine's interpretation was appropriate), and more in what the process was and how this was considered at the time with the information available. I'd also be curious about the documentation of any SBF-related concerns that were raised by the community, if any, and how these concerns were managed and considered (as opposed to critiquing the final outcome).
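For readers unfamiliar with the reference, here's a minimal sketch of the standard Kelly formula, just to show why "5x over the Kelly bet" reads as a risk signal (illustrative numbers only; nothing here reflects any actual positions):

```python
# Minimal sketch of the standard Kelly fraction for a binary bet:
# f* = p - (1 - p) / b, where p is the win probability and b is the net odds received.
# Staking a multiple of f* (e.g. 5x) sacrifices long-run growth and raises the risk of ruin.
# Illustrative numbers only; nothing here reflects any actual positions.

def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Bankroll fraction that maximises expected log growth."""
    return p_win - (1 - p_win) / net_odds

f_star = kelly_fraction(p_win=0.6, net_odds=1.0)   # even-money bet with a 60% win probability
print(f"Kelly fraction: {f_star:.2f}")             # 0.20, i.e. stake 20% of the bankroll
print(f"5x Kelly:       {5 * f_star:.2f}")         # 1.00, i.e. stake the entire bankroll
```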

Outside of due diligence and ways to facilitate whistleblowers, the decision-making processes around the steering of the EA movement are crucial as well. When decisions made by orgs bring clear benefits to one part of the EA community while bringing clear risks that are shared across wider parts of the EA community,[3] it would probably be of value to look at how these decisions were made and what tradeoffs were considered at the time. Going forward, it would be worth thinking about how to either diversify those risks or make decision-making more inclusive of a wider range of stakeholders,[4] keeping in mind the best interests of the EA movement as a whole.

(this is something I'm considering working on in a personal capacity along with the OP of this post, as well as some others - details to come, but feel free to DM me if you have any thoughts on this. It appears that CEA is also already considering this)

If this comment is about "are these red-teaming contests in fact valuable for the money and time put into it, if it misses problems like this"

I think my view here (speaking only for the red-teaming contest) is that even if this specific contest was framed in a way that it missed these classes of issues, the value of the very top submissions[5] may still have made the efforts worthwhile. The potential value of a different framing was mentioned by another panelist. If it's the case that red-teaming contests are systematically missing this class of issues regardless of framing, then I agree that would be pretty useful to know, but I don't have a good sense of how we would try to investigate this.

  

  1. ^

    This tweet seems to have aged particularly well. Despite supportive comments from high-profile EAs on the original forum post, the author seemed disappointed that nothing came of it in that direction. Again, without getting into the object-level discussion of the claims of the original paper, it's still worth asking questions about the processes. If there were actions planned, what did these look like? If not, was that because of a disagreement over the suggested changes, or over the extent to which it was an issue at all? How were these decisions made, and what was considered?

  2. ^

    Apparently a previous EA-aligned billionaire (and possibly donor) who got rich by starting a crypto trading firm, and who pleaded guilty to violating the Bank Secrecy Act

  3. ^

    Even before this, I had heard from a primary source in a major mainstream global health organisation that there were staff who wanted to distance themselves from EA because of misunderstandings around longtermism.

  4. ^

    This doesn't have to be a lengthy deliberative consensus-building project, but it should at least include internal comms across different EA stakeholders to allow discussions of risks and potential mitigation strategies.

  5. ^

As requested, here are some submissions that I think are worth highlighting, or that I considered awarding but which ultimately did not make the final cut. (This list is non-exhaustive, and should be taken more lightly than the Honorable mentions, because by definition these posts are less strongly endorsed by those who judged them. Also commenting in personal capacity, not on behalf of other panelists, etc):

Bad Omens in Current Community Building
I think this was a good-faith description of some potential / existing issues that are important for community builders and the EA community, written by someone who "did not become an EA" but chose to go to the effort of providing feedback with the intention of benefitting the EA community. While these problems are difficult to quantify, they seem important if true, and pretty plausible based on my personal priors/limited experience. At the very least, this starts important conversations about how to approach community building that I hope will lead to positive changes, and a community that continues to strongly value truth-seeking and epistemic humility, which is personally one of the benefits I've valued most from engaging in the EA community.

Seven Questions for Existential Risk Studies
It's possible that the length and academic tone of this piece detract from the reach it could have, and it (perhaps aptly) leaves me with more questions than answers, but I think the questions are important to reckon with, and this piece covers a lot of (important) ground. To quote a fellow (more eloquent) panelist, whose views I endorse: "Clearly written in good faith, and consistently even-handed and fair - almost to a fault. Very good analysis of epistemic dynamics in EA." On the other hand, this is likely less useful to those who are already very familiar with the ERS space.

Most problems fall within a 100x tractability range (under certain assumptions)
I was skeptical when I read this headline, and while I'm not yet convinced that a 100x tractability range should be used as a general heuristic when thinking about tractability, I certainly updated in this direction, and I think this is a valuable post that may help guide cause prioritisation efforts.

The Effective Altruism movement is not above conflicts of interest
I was unsure about including this post, but I think this post highlights an important risk of the EA community receiving a significant share of its funding from a few sources, both for internal community epistemics/culture considerations as well as for external-facing and movement-building considerations. I don't agree with all of the object-level claims, but I think these issues are important to highlight and plausibly relevant outside of the specific case of SBF / crypto. That it wasn't already on the forum (afaict) also contributed to its inclusion here.


I'll also highlight one post that was awarded a prize, which I thought was particularly valuable:

Red Teaming CEA’s Community Building Work
I think this is particularly valuable because of the unique and difficult-to-replace position that CEA holds in the EA community, and as Max acknowledges, it benefits the EA community for important public organisations to be held accountable (and to a standard that is appropriate for their role and potential influence). Thus, even if the listed problems aren't all fully on the mark, or are less relevant today than when the mistakes happened, a thorough analysis of these mistakes and an attempt at providing reasonable suggestions at least provides a baseline against which CEA can be held accountable for similar future mistakes, or helps with assessing trends and patterns over time. I would personally be happy to see something like this on at least a semi-regular basis (though am unsure about exactly what time-frame would be most appropriate). On the other hand, it's important to acknowledge that this analysis is possible in large part because of CEA's commitment to transparency.

I'll say up front that I definitely agree that we should look into the impacts on worms a nonzero amount! The main reason for the comment is that I don't think the appropriate bar for whether or not the project should warrant more investigation is whether or not it passes a BOTEC under your set of assumptions (which I am grateful for you sharing - I respect your willingness to share this and your consistency).

Again, not speaking on behalf of the team - but I'm happy to bite the bullet and say that I'm much more willing to defer to some deontological constraints in the face of uncertainty, rather than follow impartiality and maximising expected value all the way to its conclusion, whatever those conclusions are. This isn't an argument against the end goal that you are aiming for, but more my best guess in terms of how to get there in practice.

Impartiality and hedonism often recommend actions widely considered bad in super remote thought experiments, but, as far as I am aware, none in real life.

I suspect this might be driven by it not being considered bad under your own worldview? It's unsurprising that your preferred worldview doesn't recommend actions that you consider bad, but my guess is that not working on global poverty and development because of the meat eater problem is in fact an action that might be widely considered bad in real life under many reasonable operationalisations (though I don't have empirical evidence to support this).[1]

I do agree with you on the word choices under this technical conception of excruciating pain / extreme torture,[2] though I think the idea that it 'definitionally' can't be sustained beyond minutes does have some potential failure modes.
That being said, I wasn't actually using torture as a descriptor for the screwworm situation, more just illustrating what I might consider a point of difference between our views, i.e. that I would not be in favour of allowing humans to be tortured by AIs even if you created a BOTEC that showed this caused net positive utils in expectation, and I would not be in favour of an intervention to spread the New World screwworm around the world even if you created a BOTEC that showed it was the best way of creating utils - I would reject these at least on deontological grounds in the current state of the world.

  1. ^

    This is not to suggest that I think "widely considered bad" is a good bar here! A lot of moral progress came from ideas that were initially "widely considered bad". I'm just suggesting that this particular defence of impartiality + hedonism - namely that it "does not recommend actions widely considered bad in real life" - seems unlikely to be correct, simply because most people are not impartial hedonists to the extent you are.

  2. ^

    Neither of which were my wording!

Speaking for myself / not for anyone else here:

My (highly uncertain + subjective) guess is that each lethal infection is probably worse than 0.5 host-year equivalents, but the number of worms per host animal could vary significantly.
That being said, personally I am fine with the assumption of modelling ~0 additional counterfactual suffering for screwworms that are never brought into existence, rather than e.g. an eradication campaign that involves killing existing animals.

I'm unsure how to think about the possibility that screwworms might be living significantly net positive lives, such that this trumps the benefit of reduced suffering from screwworm-caused deaths, but I'd personally prefer stronger evidence about wellbeing or harms on the worms' end to justify inaction here (i.e. not looking into the possibility/feasibility of this).
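To make the shape of that comparison concrete, here's a minimal sketch of the kind of net-welfare BOTEC being discussed - every parameter below is a made-up placeholder (the 0.5 host-year figure is just my subjective guess from above), so this illustrates the structure rather than an actual estimate:

```python
# Sketch of the net-welfare comparison discussed above. All parameters are made-up
# placeholders; the 0.5 host-year-equivalent figure is just the subjective guess
# mentioned above, not an empirical estimate.

infections_averted_per_year = 1_000_000   # hypothetical
suffering_per_infection = 0.5             # host-year equivalents (subjective guess above)
worms_per_infection = 200                 # hypothetical; could vary significantly
welfare_per_worm = 0.0                    # ~0 for worms never brought into existence (assumption above)

host_benefit = infections_averted_per_year * suffering_per_infection
worm_welfare_forgone = infections_averted_per_year * worms_per_infection * welfare_per_worm

print(f"Net benefit (host-year equivalents/year): {host_benefit - worm_welfare_forgone:,.0f}")
# If screwworm lives were significantly net positive, welfare_per_worm > 0 could
# shrink or flip this figure - which is exactly the uncertainty flagged above.
```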

Again, speaking only for myself - I'm not personally fixated on either gene drives or sterile insect approaches! I am also very interested in finding reasons not to proceed with the project, or to find alternative approaches, which doesn't preclude the possibility that the net welfare of screwworms should be weighed more heavily as a consideration. That being said, I would be surprised if something like "we should do nothing to alleviate host animal suffering because their suffering can provide more utils for the screwworm" were a sufficiently convincing reason not to do more work / investigation in this area (for nonutilitarian reasons), though I understand there are a set of assumptions / views one might hold that could drive disagreement here.[1]

  1. ^

    If a highly uncertain BOTEC showed you that torturing humans would bring more utility to digital beings than the suffering incurred by the humans, would you endorse allowing this? At what ratio would you change your mind, and how many OOMs of uncertainty on the BOTEC would you be OK with?

    Or - would you be in favour of taking this further and spreading the screwworm globally simply because it provides more utils, rather than just not eradicating the screwworm?

It's fine to apply regardless; there's one application form for all 2025 in-person EAGs. You'll likely be sent an email separately closer to the time reminding you that you can register for the East Coast EAG, and be directed to a separate portal where you can do this without needing to apply again.

Hey team - are you happy to share a bit more about who would be involved in these projects, and their track record (or Whylome's more broadly)? I only spent a minute or so on this but I can't find any information online beyond your website and these links, related to SMTM's "exposure to subclinical doses of lithium is responsible for the obesity epidemic" hypothesis (1, 2).

More info on how much money you're looking for the above projects would also be useful.

Ah my bad, I meant extreme pain above there as well, edited to clarify! I agree it's not a super important assumption for the BOTEC in the grand scheme of things though.

However, if one wants to argue that I overestimated the cost-effectiveness of SWP, one has to provide reasons for my guess overestimating the intensity of excruciating pain.

I don't actually argue for this in either of my comments.[1] I'm just saying that it sounds like if I duplicated your BOTEC and lowered this one speculative parameter by 2 OOMs, an observer would have no strong reason to choose one BOTEC over the other just by looking at the BOTEC alone. Expressing skepticism of an unproven claim doesn't produce a symmetrical burden of proof on my end!

Mainly just from a reasoning transparency point of view, I think it's worth fleshing out what these assumptions imply and what is grounding these best guesses[2] - in part because I personally want to know how much I should update based on your BOTEC, in part because knowing where your ratio came from would help me better argue why you might (or might not) have overestimated the intensity of excruciating pain (which is why I was checking the maths, seeing if these were correct, and asking if there's stronger evidence, before critiquing the 100k figure), and in part because I think other EAF readers, as well as the broader, lower-context audiences of EA bloggers, would benefit from this too.
 

If you did that, SWP would still be 434 (= 43.4*10^3*10^3/(100*10^3)) times as cost-effective as GiveWell's top charities.

Yeah, I wasn't making any inter-charity comparisons or claiming that SWP is less cost-effective than GW top charities![3] But since you mention it, it wouldn't be surprising to me if losing 2 OOMs made some donors favour other animal welfare charities over SWP, for example - but again, the primary purpose of these comments is not to litigate which charity is best, or whether this is better or worse than GW top charities, but mainly just to explore a bit more of what is grounding the BOTEC, so observers have a good sense of how much they should update based on how compelling they find the assumptions / reasoning etc.
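To make the 2-OOM point concrete, here's a minimal sketch of the sensitivity I'm describing. The 100,000:1 intensity ratio is your BOTEC's assumption, and the ~43,400x baseline multiple is implied by the arithmetic in the quote above; everything else is held fixed:

```python
# Sensitivity of the quoted cost-effectiveness multiple to the pain-intensity ratio.
# The 100,000:1 ratio is the BOTEC's assumption; the ~43,400x baseline is implied by
# the arithmetic in the quote above. Everything else is held fixed, so the multiple
# scales linearly in this one parameter.

baseline_multiple = 43_400           # SWP vs GiveWell top charities (implied baseline)
baseline_intensity_ratio = 100_000   # excruciating pain : fully healthy life

def multiple_at(intensity_ratio: float) -> float:
    return baseline_multiple * intensity_ratio / baseline_intensity_ratio

for ratio in (100_000, 10_000, 1_000):
    print(f"intensity ratio {ratio:>7,}: ~{multiple_at(ratio):>8,.0f}x GiveWell top charities")
# 100,000 -> 43,400x; 1,000 -> 434x (the figure in the quote above)
```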
 

I think it is also worth wondering about whether you truly believe that updated intensity. Do you think 1 day of fully healthy life plus 86.4 s (= 0.864*100*10^3/100) of scalding or severe burning events in large parts of the body, dismemberment, or extreme torture would be neutral?

Nope! I would rather give up 1 day of healthy life than endure 86 seconds of this description. But this varies depending on the timeframe in question.

For example, I'd probably be willing to endure 0.86 seconds of this for 14 minutes of healthy life, and I would definitely rather endure 0.086 seconds of this than give up 86 seconds of healthy life.

And using your assumptions (a ratio of 100k), I would easily rather have 0.8 seconds of this than give up 1 day of healthy life, but if I had to endure many hours of this I could imagine my tradeoffs approaching, or even exceeding, 100k.
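For transparency, here's the arithmetic behind those examples:

```python
# Implied pain-intensity ratios from the tradeoffs above:
# ratio = (healthy time given up, in seconds) / (duration of extreme pain endured, in seconds).
# The first two are roughly neutral points; the third is a trade I'd "easily" accept,
# so it is an upper bound rather than a neutral point.

tradeoffs = [
    ("0.86 s of pain vs 14 min of healthy life", 14 * 60, 0.86),
    ("0.086 s of pain vs 86 s of healthy life", 86, 0.086),
    ("0.8 s of pain vs 1 day of healthy life", 24 * 60 * 60, 0.8),
]

for label, healthy_seconds, pain_seconds in tradeoffs:
    print(f"{label}: implied ratio ~{healthy_seconds / pain_seconds:,.0f}")
# ~977, ~1,000, and ~108,000 respectively: roughly 1,000:1 for sub-second durations,
# with my ratio plausibly approaching or exceeding 100k for hours-long durations.
```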

I do want to mention that I think it's useful that someone is trying to quantify these comparisons, and I'm grateful for this work. I want to emphasise that these comments are about making the underlying reasoning more transparent and understanding the methodology that leads to the assumptions in the BOTEC, rather than any kind of personal criticism!

  1. ^

    Though I am personally skeptical of a 50:1 shrimp:human tradeoff

  2. ^

    E.g. is this the result of a personal time trade-off exercise?

  3. ^

    I explicitly say "To be clear, this isn't a claim that one shouldn't donate to SWP". I'm a big fan of SWP!

So I suppose I would be wary of saying that GiveDirectly now have 3–4x the WELLBY impact relative to Vida Plena—or even to say that GiveDirectly have any more WELLBY impact relative to Vida Plena

Ah right - yeah, I'm not making either of these claims. I'm just saying that if the previous claim (from VP's predictive CEA) was that "Vida Plena...is 8 times more cost-effective than GiveDirectly", and GD has since been updated to be 3-4x more cost-effective than it was estimated to be at the time the predictive CEA was published, we should discount the 8x claim downwards somewhat (but not necessarily by 3-4x).
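As a rough sketch of what that discounting looks like mechanically (this only rescales the GiveDirectly denominator; VP's own estimate may also have changed, which is why the discount isn't necessarily the full 3-4x):

```python
# Naive adjustment of the "8x GiveDirectly" claim given GiveDirectly's updated
# cost-effectiveness. This only rescales the GD denominator; VP's own estimate may
# also have changed, which is why the real discount isn't necessarily the full 3-4x.

vp_multiple_vs_old_gd = 8.0    # from VP's predictive CEA, relative to GD at the time

for gd_update in (3.0, 4.0):   # GD now estimated to be 3-4x more cost-effective than before
    print(f"If GD is {gd_update:.0f}x better than previously estimated: "
          f"VP would be ~{vp_multiple_vs_old_gd / gd_update:.1f}x GD, all else equal")
# ~2.7x and ~2.0x respectively, before accounting for any updates on VP's side.
```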
