anormative

Habryka clarifies in a later comment:

Yep, my model is that OP does fund things that are explicitly bipartisan (like, they are not currently filtering on being actively affiliated with the left). My sense is in-practice it's a fine balance and if there was some high-profile thing where Horizon became more associated with the right (like maybe some alumni becomes prominent in the republican party and very publicly credits Horizon for that, or there is some scandal involving someone on the right who is a Horizon alumni), then I do think their OP funding would have a decent chance of being jeopardized, and the same is not true on the left.

Another part of my model is that one of the key things about Horizon is that they are of a similar school of PR as OP themselves. They don't make public statements. They try to look very professional. They are probably very happy to compromise on messaging and public comms with Open Phil and be responsive to almost any request that OP would have messaging wise. That makes up for a lot. I think if you had a more communicative and outspoken organization with a similar mission to Horizon, I think the funding situation would be a bunch dicier (though my guess is if they were competent, an organization like that could still get funding).

More broadly, I am not saying "OP staff want to only support organizations on the left". My sense is that many individual OP staff would love to fund more organizations on the right, and would hate for polarization to occur, but that organizationally and because of constraints by Dustin, they can't, and so you will see them fund organizations that aim for more engagement with the right, but there will be relatively hard lines and constraints that will mostly prevent that.

Are you imagining this being taught to children in a philosophy class alongside topics like virtue ethics, etc., or do you think that “scope-sensitive beneficentrism” should be taught just as students are taught the golden rule and not to bully one another?

Is this available publicly? I’d be interested in seeing it too.

This is super awesome! Thanks for sharing the specifics of what you did—it will definitely be useful info for us in the future. We’ve considered having people fill out fellowship apps during our intro talk but have worried that this might lower the quality of applicant responses. I’d be interested in knowing what your experience with it was.

Can you tell us a little bit about how this project and partnership came together? What was OpenPhil’s role? What is it like working with such a large number of organizations, including governments? Do you see potential for more collaborations like this? 

Question for either James or Julia: Is this specifically for lead policy or just policy advocacy in general? And can you elaborate why?

This is awesome! Any details you can share on how this whole thing came together? It could be really impactful to try to aim for more coalitions like this for other cost-effective opportunities.

To clarify, I agree that the ways you can be liable mostly fall into the two categories you delineate, but I think that your characterization of the categories might be incorrect.

You say that a developer would be liable

  1. if you developed a covered model that caused more than $500M harm
  2. if you violated any of the prescribed transparency/accountability mechanisms in the bill

But I think a better characterization would be that you can be liable 

  1. if you developed a covered model that caused more than $500M harm -> if you fail to take reasonable care to prevent critical harms
  2. if you violated any of the prescribed transparency/accountability mechanisms in the bill

It's possible "to fail to take reasonable care to prevent critical harms" even if you do not cause critical harms. The bill doesn't specify any new category of liability specifically for developers who have developed models that cause critical harm. 

To use Casado's example, if a self-driving car was involved in an accident that resulted in a person's death, and if that self-driving car company did not "take reasonable care to prevent critical harms" by having a safety and security protocol much worse than that of other companies, it seems plausible that the company could be fined 10% of its training compute costs or have to pay other damages. (I don't know if self-driving cars actually would be affected by this bill.)

I think the best reason this might be wrong is that courts might not be willing to entertain this argument or that in tort law "failing to take reasonable care to avoid something" requires that you "fail to avoid that thing"—but I don't have enough legal background/knowledge to know.

Thanks for your reply! I'm a bit confused—I think my understanding of the bill matches yours. The Vox article states "Otherwise, they would be liable if their AI system leads to a 'mass casualty event' or more than $500 million in damages in a single incident or set of closely linked incidents." (See also eg here and here). But my reading of the bill is that there is no mass casualty/$500 million threshold for liability like Vox seems to be claiming here.

Kelsey Piper’s article on SB 1047 says

This is one of the questions animating the current raging discourse in tech over California’s SB 1047, newly passed legislation that mandates safety training for companies that spend more than $100 million on training a “frontier model” in AI — like the in-progress GPT-5. Otherwise, they would be liable if their AI system leads to a “mass casualty event” or more than $500 million in damages in a single incident or set of closely linked incidents.

I’ve seen similar statements elsewhere too. But after I spent some time today reading through the bill, this seems to be wrong? Liability for developers doesn’t seem to depend on whether “critical harm” is actually done. Instead, if the developer fails to take reasonable care to prevent critical harm (or commits some other violation), then even if no critical harm occurs, violations that cause death, bodily harm, etc. can lead to fines of 10% or 30% of training compute costs. Here’s the relevant section from the bill:

(a) The Attorney General may bring a civil action for a violation of this chapter and to recover all of the following:

(1) For a violation that causes death or bodily harm to another human, harm to property, theft or misappropriation of property, or that constitutes an imminent risk or threat to public safety that occurs on or after January 1, 2026, a civil penalty in an amount not exceeding 10 percent of the cost of the quantity of computing power used to train the covered model to be calculated using average market prices of cloud compute at the time of training for a first violation and in an amount not exceeding 30 percent of that value for any subsequent violation.

Has there been discussion about this somewhere else already? Is the Vox article wrong or am I misunderstanding the bill?
