zchuang

Comments

https://www.alexirpan.com/2024/08/06/switching-to-ai-safety.html

This reaffirms my belief that it's more important to look at the cruxes of existing ML researchers than at debates internal to EA about AI safety.

[Epistemic status: unsure how much I believe each response; this is more a pushback against the claim that "no well informed person trying to allocate a marginal dollar most ethically would conclude that GiveWell is the best option."]

  1. I think worldview diversification can diversify into a worldview that is more anthropocentric, less scope-sensitive across species, and not purely utilitarian. This would directly change the split with farmed animal welfare.
  2. There's institutional and signalling value in showing that OpenPhil is willing to stand behind long-term commitments. In the worst instances this can be PR, but in the best instances it is a credible signal to many cause areas that OpenPhil is an actor in the non-profit space that will not change tack just because of philosophical shifts in worldview (which seem hard to predict from the outside). For instance, what if Korsgaard or Tarsney[1] just annihilates utilitarianism with a treatise? I don't think NGOs should have to track GPI's outputs to know whether they'll be funded next year.
  3. I think there's something to be said for how one values "empirical evidence" over "philosophical evidence", even when that's the crux for animal welfare. Alexander Berger makes the argument here (I'm too lazy to fully type it out).
  4. A moral parliaments view, given uncertainty, can make GiveWell look much better. Even a Kantian sympathetic to animals like Korsgaard would have reservations about certain welfarist approaches. For instance, I don't know how a Kantian would weigh wild animal welfare or even shrimp welfare (would neuron weights capture a being willing something?).
  5. The animal welfare movement landscape is very activist-driven, such that a flood of cash on the order of magnitude of, say, the current $300MM given to GiveWell could lead to an activist form of Dutch disease and be incredibly unhealthy for the movement.
  6. OpenPhil could just have an asymmetric preference against downside risk, such that it's not a pure expected value calculation. I think there are good a priori reasons not to invest in interventions that carry downside risk, and very plausible reasons why animal welfare interventions are more likely to entail those risks: for instance, political risks from advocacy, and diet switches where eggs are substituted for beef. I think the largest funder in EA being risk averse is good given contemporary events.
  7. OpenPhil seems really labour-constrained in other cause areas, as shown by the recent GCR hiring round, such that the due diligence and labour capacity for non-GiveWell interventions may just not be available to investigate or execute on.
  1. ^

    I know Tarsney is a utilitarian but I'm just throwing him out there as a name whose work could change things.

I think this is confused. WWOTF is obviously both aiming to be persuasive and coming from a place of academic analytical philosophical rigour. Many philosophers write books that are both, e.g. Down Girl by Kate Manne or The Right to Sex by Amia Srinivasan. I don't think a purely persuasive book would have so many citations. 

[edited: last sentence for explicitness of my point]

I think this worry should be more a critique of the EA community writ large for being overly deferential than of OP for holding a contest to elicit critiques of its views and then following through on it according to its own admittedly subjective criteria. OP themselves note in the post that people shouldn't take this to be OP's institutional tastes.

Answer by zchuang

[edit: Fixed the link for Stuart Russell's Human Compatible; it initially pointed to Brian Christian's book instead.]

  1. Cold Takes is a generally good blog by Holden Karnofsky that lays out the argument for why AI would be transformative and the jobs that could help with that.
  2. For papers, I think Richard Ngo's paper is really good as an overview of the field from a deep learning perspective.
  3. For other posts, I found Ajeya Cotra's posts on TAI timelines really important for shaping a lot of people's views on when it might happen.
  4. For books, Stuart Russell's book is accessible to non-technical audiences.

I think these polls would benefit from a clause along the lines of "On balance, EAs should X", because a lot of the discourse collapses into examples and corner cases about when the behaviour is acceptable (e.g. the discussion over illegal actions ended up being about melatonin). I think having a conversation centred on where the probability mass of these phenomena actually lies is important.

I think this is imprecise. In my mind there are two categories:

  1. People who think EA is a distraction from near-term issues and is competing with them for funding and attention (e.g. Seth Lazar, as seen in his complaints about the UK taskforce and his attempts to tag Dustin Moskovitz and Ian Hogarth in his thinkpieces). These more classical ethicists are, from what I can see, analytical philosophers in a funding and clout competition with EA. They've lost a lot of social capital because they keep repeating old canards about AI. My model of them is something akin to: they can't do fizzbuzz or say what a transformer is, so they'll just say sentences about how AI can't do things and how there's a lot of hype and power centralisation. They are more likely to be white men from the UK, Canada, Australia, and NZ. Status games are especially important to them, and they seem to just not have a great understanding of the field of alignment at all. A good example I show people is this tweet, which tries to say RLHF solves alignment and that "Paul [Christiano] is an actual researcher I respect, the AI alignment people that bother me are more the longtermists."
  2. People in the other camp are more likely to think EA is problematic, power hungry, and a cover for big tech. This camp would be your Dr. Gebru, DAIR, etc. These individuals are often much more technically proficient than the people in the first camp, and their view of EA is more akin to seeing it as a cult that seeks to indoctrinate people into a bundle of longtermist beliefs and carry water for AI labs. I will say the strategic collaborations are more fruitful here because of that technical proficiency, and personally I believe this group has better epistemics and is more truth-seeking, even if much more acerbic in its rhetoric. The higher level of technical proficiency means they can contribute to the UK taskforce on things like cybersecurity and evals.

I think tractability of gaining allies is the wrong axis to measure along; the real question is what the fruits of collaboration would be.

To be clear, I didn't downvote it, because I didn't read it in full. I skimmed it and looked for the objectionable parts to steelman what I imagine the downvoter would have downvoted it for. I think the most egregious part is not understanding that there are costs to demanding zero fraud (it literally means war-torn areas get no aid because the bar is set too high), and Vee just staunchly reiterates the claim that we need to have zero fraud.

Vee's posts read to me as very ChatGPT-spambot-like, and I have downvoted them in the past for that reason. A key problem I have with the GiveDirectly post, one that would make me downvote it if I read it in full, is that it doesn't actually explain anything the linked post doesn't already say: it just takes the premise/title that GiveDirectly lost $900,000 and then does nothing to analyse the trade-offs of any of its "fixes". Moreover, both the linked post and the commenters reason through and weigh up the trade-offs, but Vee just doubles down. I don't think I would add anything to their criticisms, so I would just downvote and move on.

I think this is already done. The application asks whether you are receiving OpenPhil funding for said project or have received it in the past. It also asks whether you've applied. I think people also generally disclose, because the payoff of not disclosing is pretty low compared to the costs. EA is a pretty small community; I don't think non-disclosure ever helps.
