This is a special post for quick takes by anormative. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
Kelsey Piper’s article on SB 1047 says:
This is one of the questions animating the current raging discourse in tech over California’s SB 1047, newly passed legislation that mandates safety training for that companies that spend more than $100 million on training a “frontier model” in AI — like the in-progress GPT-5. Otherwise, they would be liable if their AI system leads to a “mass casualty event” or more than $500 million in damages in a single incident or set of closely linked incidents.
I’ve seen similar statements elsewhere too. But after I spent some time today reading through the bill, this seems to be wrong? Liability for developers doesn’t seem to be dependent on whether “critical harm” is actually done. Instead, if the developer fails to take reasonable care to prevent critical harm (or some other violation), even if there is no critical harm done, violations that cause death/bodily harm/etc can lead to fines of 10% or 30% of compute. Here’s the relevant section from the bill:
(a) The Attorney General may bring a civil action for a violation of this chapter and to recover all of the following:
(1) For a violation that causes death or bodily harm to another human, harm to property, theft or misappropriation of property, or that constitutes an imminent risk or threat to public safety that occurs on or after January 1, 2026, a civil penalty in an amount not exceeding 10 percent of the cost of the quantity of computing power used to train the covered model to be calculated using average market prices of cloud compute at the time of training for a first violation and in an amount not exceeding 30 percent of that value for any subsequent violation.
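To make the size of that penalty concrete, here is a rough sketch of the arithmetic in the quoted paragraph. The function name, the whole-dollar inputs, and the $100 million example figure are illustrative assumptions of mine, not anything from the bill's text. The bill only sets a ceiling ("not exceeding"), so this computes the maximum, not what a court would actually award.

```python
def max_civil_penalty_usd(training_compute_cost_usd: int, prior_violations: int = 0) -> float:
    """Upper bound on the civil penalty described in the quoted paragraph (a)(1).

    training_compute_cost_usd: assumed cost of the training compute, priced at
        average cloud market rates at the time of training (hypothetical input).
    prior_violations: number of earlier violations; 0 means a first violation.
    """
    if training_compute_cost_usd < 0 or prior_violations < 0:
        raise ValueError("cost and violation count must be non-negative")
    # Cap is 10 percent for a first violation, 30 percent for any subsequent one.
    cap_percent = 10 if prior_violations == 0 else 30
    return training_compute_cost_usd * cap_percent / 100


# Hypothetical model trained with $100 million of compute.
example_cost = 100_000_000
print(max_civil_penalty_usd(example_cost))                      # 10000000.0
print(max_civil_penalty_usd(example_cost, prior_violations=1))  # 30000000.0
```

Note that nothing in this calculation depends on the $500 million "critical harm" figure, which is part of why the penalty paragraph reads to me as separate from that threshold.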
Has there been discussion about this somewhere else already? Is the Vox article wrong or am I misunderstanding the bill?
I think you are misunderstanding the bill. The key component is this phrase: "for a violation of this chapter"
I.e. this section is about what kind of damages can be recovered, if someone violates any of the procedural requirements outlined in this bill.
The bill basically has two components:
Developers have a responsibility to avoid critical harms (basically incidents that cause more than $500M in damages) and need to (at least) follow these rules to avoid them (e.g. they have to report large training runs).
If you don't follow these rules, the Attorney General can sue you into compliance (which is what the section you quoted above is about). I.e. if you fake your safety testing results, you can be sued for that.
(I am reasonably confident this is true and have read the bill 3-4 times, but I might be getting something wrong. The bill has been changing a lot)
Thanks for your reply! I'm a bit confused—I think my understanding of the bill matches yours. The Vox article states "Otherwise, they would be liable if their AI system leads to a 'mass casualty event' or more than $500 million in damages in a single incident or set of closely linked incidents." (See also eg here and here). But my reading of the bill is that there is no mass casualty/$500 million threshold for liability like Vox seems to be claiming here.
No, there is, that's the definition of critical harm I mention above:
(g) (1) Critical harm means any of the following harms caused or materially enabled by a covered model or covered model derivative:
(A) The creation or use of a chemical, biological, radiological, or nuclear weapon in a manner that results in mass casualties.
(B) Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from cyberattacks on critical infrastructure by a model conducting, or providing precise instructions for conducting, a cyberattack or series of cyberattacks on critical infrastructure.
(C) Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from an artificial intelligence model engaging in conduct that does both of the following:
(i) Acts with limited human oversight, intervention, or supervision.
(ii) Results in death, great bodily injury, property damage, or property loss, and would, if committed by a human, constitute a crime specified in the Penal Code that requires intent, recklessness, or gross negligence, or the solicitation or aiding and abetting of such a crime.
(D) Other grave harms to public safety and security that are of comparable severity to the harms described in subparagraphs (A) to (C), inclusive.
I still don't think you have posted anything from the bill which clearly shows that you only get sued if A) [you fail to follow precautions and cause critical harms], but not if B) [you fail to follow precautions the bill says are designed to prevent critical harms, and some loss of life occurs]. In both cases you could reasonably characterise it as "you fail to follow precautions the bill says are designed to prevent critical harms" and hence "violate" the "chapter".
I mean, what you are saying is literally what I said. There are two ways the bill says the Attorney General can sue you: one, if you developed a covered model that caused more than $500M in harm; two, if you violated any of the prescribed transparency/accountability mechanisms in the bill.
Of course you need to have some penalty if you don't follow the transparency/accountability requirements of the bill. Otherwise, how would you expect people to do any of the things the bill requires of them?
To clarify, I agree that the ways you can be liable mostly fall into the two categories you delineate, but I think that your characterization of the categories might be incorrect.
You say that a developer would be liable
if you developed a covered model that caused more than $500M harm
if you violated any of the prescribed transparency/accountability mechanisms in the bill
But I think a better characterization would be that you can be liable
if you developed a covered model that caused more than $500M harm -> if you fail to take reasonable care to prevent critical harms
if you violated any of the prescribed transparency/accountability mechanisms in the bill
It's possible "to fail to take reasonable care to prevent critical harms" even if you do not cause critical harms. The bill doesn't specify any new category of liability specifically for developers who have developed models that cause critical harm.
To use Casado's example, if a self-driving car was involved in an accident that resulted in a person's death, and if that self-driving car company did not "take reasonable care to prevent critical harms" by having a safety and security protocol much worse than that of other companies, it seems plausible that the company could be fined 10% of their compute/have to pay other damages. (I don't know if self-driving cars actually would be affected by this bill.)
I think the best reason this might be wrong is that courts might not be willing to entertain this argument or that in tort law "failing to take reasonable care to avoid something" requires that you "fail to avoid that thing"—but I don't have enough legal background/knowledge to know.
I think that's inaccurate (though I will admit the bill text here is confusing).
Critical harm is defined as doing more than $500M of damage, so at the very least you have to be negligent specifically on the issue of whether your systems can cause $500M of harm.
But I think more concretely the conditions under which the AG can sue for damages if no critical harm has yet occurred are pretty well-defined (and are not as broad as "fail to take reasonable care").
Question: what's the logic behind organizations like Good Ventures and people like Warren Buffett wanting to spend down instead of creating a foundation that exists in perpetuity, especially if they have such large amounts of money that they struggle to give it all out? Is it because they believe some sort of hinge of history hypothesis? Are they worried about value drift? Do they think that causes in the future will generally be less important (due to the world being generally better) or less neglected ("future people can help themselves better than we can")?
The history of big foundations shows clearly that, after the founder's death, they revert to the mean and give money mostly to whatever is popular and trendy among clerks and administrators, rather than anything unusual which the donor might've cared about. If you look at the money flowing out of e.g. the Ford Foundation, you'll be hard-pressed to find anything which is there because Henry or Edsel Ford thought it was important, rather than because it's popular among the NGO class who staffs the foundation. See Henry Ford II's resignation letter.
If you want to accomplish anything more specific than "fund generic charities"—as anyone who accepts the basic tenets of EA obviously should—then creating a perpetual foundation is unwise.
Thanks, interesting letter/link!
I'm sure it varies from donor to donor, but experience teaches us that younger organizations are often more innovative and flexible, while aged ones often ossify. If I were a donor, I wouldn't want to bet against that base rate. Even worse, there may be some tension between avoiding value drift and avoiding ossification. Future donors shouldn't face that tension when acting in their own times.
I also wouldn’t undervalue reputation. That seems like an important factor for many wealthy individuals, particularly those who feel like they have sins to absolve and/or a reputational deficit to erase, a la Bill Gates.
NunoSempere points out that EA could have been structured in a radically different way if the "specific cultural milieu" had been different. But I think this can be taken even further. I think it's plausible that if a few moments in the history of effective altruism had gone differently, the social makeup (the sort of people that make up the movement) and their axiological worldviews (the sorts of things they value) might have been radically different too.
As someone interested in the history of ideas, I'm fascinated by what our movement has that made it significantly different than the most likely counterfactual movements. Why is effective altruism the way it is? A number of interesting brief histories have been written about the history of EA (and longer pieces about more specific things like Moynihan's excellent X-Risk) but I often feel that there are a lot of questions about the movement's history, especially regarding tensions that seem to present themselves between the different worldviews that make up EA.
For example,
How much was it the individual "leaders" of EA who brought together different groups of people to create a big-tent EA, as opposed to the communities themselves already being connected? (Toby Ord says that he connected the Oxford GWWC/EA community to the rationality community, but people from both of these "camps" seem to be at Felicifia together in the late 2000s.)
When connecting the history of thought, there's a tendency to put thinkers after one another in lineages as if they all read and are responding to those who came before them. Parfit lays the ground for longtermism in the late 20th century in Reasons and Persons and Bostrom continues the work when presenting the idea of x-risk in 2001. Did Bostrom know of and expand upon Parfit's work, or was Bostrom's framing independent of that, based on risks discussed by the Extropians, Yudkowsky, SL4, etc? There (maybe) seems to be multiple discovery of early EA ideas in the separate creation of the Oxford/GWWC community and GiveWell. Is something like that going on for longtermism/x-risk?
What would EA look like today without Yudkowsky? Bostrom? Karnofsky/Hassenfeld? MacAskill/Ord?
What would EA look like today without Dustin Moskovitz? Or if we had another major donor? (One with different priorities?)
What drove the "longtermist turn?" A shift driven by leaders or by the community?
A few interesting Yudkowsky quotes (included for historical purposes, not to be taken as current opinions; see also Extropian Archaeology):
From Eliezer Yudkowsky on the SL4 mailing list, April 30, 2003:
Since the lack of people is a blocker problem, I think I may have to split my attention one more time, hopefully the last, and write something to attract the people we need. My current thought is a book on the underlying theory and specific human practice of rationality, which is something I'd been considering for a while. It has at least three major virtues to recommend it. (1): The Singularity movement is a very precise set of ideas that can be easily and dangerously misinterpreted in any number of emotionally attractive, rationally repugnant directions, and we need something like an introductory course in rationality for new members.
(2): Only a few people seem to have understood the AI papers already online, and the more recent theory is substantially deeper than what is currently online; I have been considering that I need to go back to the basics in order to convey a real understanding of these topics. Furthermore, much of the theory needed to give a consilient description of rationality is also prerequisite to correctly framing the task of building a seed AI.
(3): People of the level SIAI needs are almost certainly already rationalists; this is the book they would be interested in. I don't think we'll find the people we need by posting a job opening. Movements often start around books; we don't have our book yet.
It's fascinating to me that this is the reason that there's a "rationality" community around today. (See also) What would EA look like without it? Would it really be any less rational? What does a transhumanisty non-AI-worried EA look like?—I feel like that's what we might have had without Yudkowsky.
One last thing:
From Eliezer Yudkowsky on the Extropians mailing list, May 12, 2001:
I was there for the presentation and I, literally, felt slightly sick to my stomach. I'd like to endorse "Existential Risks" as being scary and well worth reading.