RS

Rohin Shah

4235 karmaJoined

Bio

Hi, I'm Rohin Shah! I work as a Research Scientist on the technical AGI safety team at DeepMind. I completed my PhD at the Center for Human-Compatible AI at UC Berkeley, where I worked on building AI systems that can learn to assist a human user, even if they don't initially know what the user wants.

I'm particularly interested in big picture questions about artificial intelligence. What techniques will we use to build human-level AI systems? How will their deployment affect the world? What can we do to make this deployment go better? I write up summaries and thoughts about recent work tackling these questions in the Alignment Newsletter.

In the past, I ran the EA UC Berkeley and EA at the University of Washington groups.

http://rohinshah.com

Comments
462

Tbc if the preferences are written in words like "expected value of the lightcone" I agree it would be relatively easy to tell which was which, mainly by identifying community shibboleths. My claim is that if you just have the input/output mapping of (safety level of AI, capabilities level of AI) --> utility, then it would be challenging. Even longtermists should be willing to accept some risk, just because AI can help with other existential risks (and of course many safety researchers -- probably the majority at this point -- are not longtermists).

What you call the "lab's" utility function isn't really specific to the lab; it could just as well apply to safety researchers. One might assume that the parameters would be set in such a way as to make the lab more C-seeking (e.g. it takes less C to produce 1 util for the lab than for everyone else).

But at least in the case of AI safety, I don't think this is the case. I doubt I could easily distinguish a lab capabilities researcher (or lab leadership, or some "aggregate lab utility function") from an external safety researcher if you just gave me their utility functions over C and S. (AI safety has significant overlap with transhumanism; relative to the rest of humanity they are way more likely to think there are huge benefits to development of safe AGI.) In practice it seems like the issue is more like epistemic disagreement.

You could still recover many of the conclusions in this post by positing that an increase to S leads to a proportional decrease in probability of non-survival, and the proportion is the same between the lab and everyone else, but the absolute numbers aren't. I'd still feel like this was a poor model of the real situation though.

I agree reductions in infant mortality likely have better long-run effects on capacity growth than equivalent levels of population growth while keeping infant mortality rates constant, which could mean that you still want to focus on infant mortality while not prioritizing increasing fertility.

I would just be surprised if the decision from the global capacity growth perspective ended up being "continue putting tons of resources into reducing infant mortality, but not much into increasing fertility" (which I understand to be the status quo for GHD), because:

  • Probably the dominant consideration for importance is how good / bad it is to grow the population, and it is unlikely that the differential effects from reducing infant mortality vs increasing fertility end up changing the decision
  • Probably it is easier / cheaper to increase fertility than to reduce infant mortality, because very little effort has been put into increasing fertility (to my knowledge)

That said, it's been many years since I closely followed the GHD space, and I could easily be wrong about a lot of this.

?? It's the second bullet point in the cons list, and reemphasized in the third bullet?

If you're saying "obviously this is the key determinant of whether you should work at a leading AI company so there shouldn't even be a pros / cons table", then obviously 80K disagrees given they recommend some such roles (and many other people (e.g. me) also disagree so this isn't 80K ignoring expert consensus). In that case I think you should try to convince 80K on the object level rather than applying political pressure.

... That paragraph doesn't distinguish at all between OpenAI and, say, Anthropic. Surely you want to include some details specific to the OpenAI situation? (Or do your object-level views really not distinguish between them?)

There’s currently very little work going into issues that arise even if AI is aligned, including the deployment problem

The deployment problem (as described in that link) is a non-problem if you know that AI is aligned.

Load more