If you work for a frontier AI company, whether because you think they care about saving the world or, especially, because you think that you will be the one to influence them, you are deluded. Wake up and quit.
If you care about protecting the world, you will quit, even though it will be hard to give up the money and the prestige and the hope that they would fix the problem. The actual path to reducing AI risk is not as glamorous or as clear at this point as following the instructions of a wealthy and well-organized corporation, but at least you will be going in the right direction.
The early 80k-style advice to work at an AI lab was mainly to make technical discoveries for safety that e.g. academia didn't have the resources for. When the labs were small, it also made some sense to try to influence the industry culture. Now, this advice is crazy: there is no way one EA joining a 1,000-person company with duties to its investors and locked in a death race is going to "influence" it. The influence goes entirely the other way. If you weren't frogboiled, you would never have selected this path for influence.
There's a lot more to say on this, but I think this is the crux. Your chance for positive marginal impact for AI Safety is not with the labs. If you work for the labs, you're probably just a henchman for a supervillain megaproject, and you can have some positive counterfactual impact right now by quitting. Don't sell out.
I downvoted this (but have upvoted some of your comments).
I think this advice is at minimum overstated, and likely wrong and harmful (at least if taken literally). And it's presented with rhetorical force, so that it seems to mostly be trying to push people's views towards a position that is (IMO) harmful, rather than mostly providing them with information to help them come to their own conclusions.
TBC:
I don't have an opinion on whether Holly is correct that no one should work for the labs. But even for those who disagree, there are some implied hypotheses here that are worth pondering:
If people decide to work in a frontier lab anyway, to what extent can they mitigate the risk of being "frogboiled" by
(I'm open to the response that there are no meaningful detection and/or mitigation techniques.)
In my view, there are many good reasons to work at an AI company, including:
* productively steering an AI lab during crunch time
* doing well-resourced AI safety research
* increasing the ability for safety-conscious people to blow the whistle to governments
* learning about the AI frontier from the best people in the field
* giving to effective charities
* influencing the views of other employees
* influencing how powerful AI systems are deployed and what they are used for during deployment
I don't think these necessarily outweigh the costs of working at an AI company, but the altruistic benefits are sometimes large, and it seems good for people to consider the option thoughtfully.
Can you list what you see as the costs?
Can you explain why you think doing safety work at these places is bad?
I think there was a time when it seemed like a good idea, back when the companies were small and there was more of a chance of setting their standards and culture. Back in 2016 I thought on balance we should try to put Safety people in OpenAI, for instance. OpenAI was supposed to be explicitly Safety-oriented, but any company's safety division seemed like it might pay off to stock with Safety people.
I think everything had clearly changed around the ChatGPT moment. The companies had a successful paradigm for making the models, the product was extremely valuable, and the race was very clearly on. At this time, EAs still believed that OpenAI and Anthropic were on their side because they had Safety teams (including many EAs) and talked a lot about Safety, in fact claiming to be developing AGI for the sake of Safety. Any actual influence EA employees had to get safe things done that weren't good for the mission of those companies was already lost at this point, imo.
It was proven in the ensuing two years that the Safety teams at OpenAI were expendable. Sam Altman has used up and thrown away EA, and he no longer feels any need to pretend OpenAI cares about Safety, despite having very fluently talked the talk for years before. He was happy to use the EA board members and the entire movement as scapegoats.
Anthropic is showing signs of going the same way. They do Safety research, but nothing stops them developing further, including former promises not to advance the frontier. The main thing they do is develop bigger and bigger models. They want to be attractive to natsec, and whether the actual decisionmakers at the top ultimately believe their agenda is for the sake of Safety or not, it's clearly not up to the marginal Safety hire or hinging on their research results. Other AI companies don't even claim to care about Safety particularly.
So, I do not think it is effective to work at these places. But the real harm is that working for AI labs keeps EAs from speaking out about AI danger, whether because they are under NDA, or because they want to be hireable by a lab, or they want to cooperate with people working at labs, or because they defer to their friends and general social environment and so they think the labs are good (at least Anthropic). imo this price is unacceptably high, and EAs would have a lot more of the impact they hoped to get from being "in the room" at labs by speaking out and contributing to real external pressure and regulation.
I agree that there could be an effect that keeps people from speaking out about AI danger. But:
Probably our crux is that I think the way society sees AI development morally is what matters here to navigate the straits, and the science is not going to be able to do the job in time. I care about developing a field of technical AI Safety but not if it comes at the expense of moral clarity that continuing to train bigger and bigger models is not okay before we know it will be safe. I would much rather rally the public to that message than try to get in the weak safety paper discourse game (which tbc I consider toothless and assume is not guiding Google’s strategy).
I agree there are some possible attitudes that society could have towards AI development which could put us in a much safer position.
I think that the degree of consensus you'd need for the position that you're outlining here is practically infeasible, absent some big shift in the basic dynamics. I think that the possible shifts which might get you there are roughly:
I think there's potentially something to each of these. But I think the GDM paper is (in expectation) actively helpful for 1 and probably 3, and doesn't move the needle much either way on 2.
(My own view is that 3 is the most likely route to succeed. There's some discussion of the pragmatics of this route in AI Tools for Existential Security or AI for AI Safety (both of which also discuss automation of safety research, which is another potential success route), and relevant background views on the big-picture strategic situation in the Choice Transition. But I also feel positive about people exploring routes 1 and 2.)
Why are these the same category and why are you writing coordination off as impossible? It's not. We have literally done global nonproliferation treaties before.
This bizarre notion got embedded early in EA that technological feats are possible and solving coordination problems is impossible. It's actually the opposite-- alignment is not tractable and coordination is.
These are in the same category because:
I'm not actually making a claim about alignment difficulty -- beyond that I do think systems in the vein of those today and the near-successors of those look pretty safe.
I think that getting people to pause AI research would be a bigger lift than any nonproliferation treaties we've had in the past (not that such treaties have always been effective!). This isn't just a military tech, it's a massively valuable economic tech. Given the incentives, and the importance of having treaties actually followed, I do think this would be a more difficult challenge than any past nonproliferation work. I don't think that means it's impossible, but I do think it's way more likely if something shifts -- hence my 1-3.
(Or if you were asking why I say "out of reach now" in the quoted sentence it's because I'm literally talking about "much better coordination" as a capability; not what could or couldn't be achieved with a certain level of coordination.)
I will answer comments that ask sincerely for explanations of my worldview on this. I am aware there is a lot of evidence listing and dot-connecting I didn't do here.
What do you think would happen at the frontier labs if EAs left their jobs en masse? I understand the view that the newly-departed would be more able to "speak[] out and contribut[e] to real external pressure and regulation." And I understand the view that the leadership isn't listening to safety-minded EAs anyway.
But there are potential negative effects from the non-EAs who would presumably be hired as replacements. On your view, could replacement hiring make things worse at the labs? If so, how do you balance that downside risk against the advantages of departing the labs?
I think almost nothing would change at the labs, but that the EA AI Safety movement would become less impotent, more clear, and stand more of a chance of doing good.
No, I do not expect the people who replace them (or them not being replaced) to have much of an effect. I do not think they are really helping and I don’t think their absence would really hurt. The companies are following their own agenda and they’ll do that with or without specific people in those roles.
If these people weren't really helping the companies it seems surprising salaries are so high?
I think Holly’s claim is that these people aren’t really helping from an ‘influencing the company to be more safety conscious’ perspective, or a ‘solving the hard parts of the alignment problem’ perspective. They could still be helping the company build commercially lucrative AI.
Would you say that investing in frontier AI companies (as an individual with normal human levels of capital) is similarly bad?
I think it is hazardous bc it in some way ties their "success" to yours.
Thanks for the post, Holly. Strongly upvoted. I did not find the post that valuable per se, but it generated some good discussion.
People at leading AI companies can earn hundreds of thousands of dollars per year, so quitting could plausibly decrease their donations by 100 k$/year. I estimate donating this to the Shrimp Welfare Project (SWP) would decrease as much pain per year as that needed to neutralise the happiness of 1.25 M human lives (= 100*10^3*639/51). Do you think the benefits of quitting outweigh this? I do not, so I encourage people at leading AI companies to simply donate more to SWP. I imagine no one would quit if there were actual human lives on the line (instead of shrimp which are not helped).
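For anyone who wants to check the arithmetic in the parenthetical, here is a minimal sketch that just reproduces the quoted expression. The three numbers are copied from the comment as written; the comment does not say what 639 and 51 stand for, so the variable and function names below are placeholders rather than the commenter's own terms.

```python
# Reproduces the expression 100*10^3*639/51 quoted in the comment.
# The meaning of 639 and 51 is not stated there; the names are placeholders.
DONATION_USD_PER_YEAR = 100 * 10**3
FACTOR_NUMERATOR = 639
FACTOR_DENOMINATOR = 51


def equivalent_human_lives(donation_usd: float,
                           numerator: float = FACTOR_NUMERATOR,
                           denominator: float = FACTOR_DENOMINATOR) -> float:
    """Evaluate donation * numerator / denominator, as in the comment."""
    return donation_usd * numerator / denominator


lives = equivalent_human_lives(DONATION_USD_PER_YEAR)
print(f"{lives / 1e6:.2f} M")  # prints "1.25 M"
```

The result scales linearly with the donation figure, so halving the assumed 100 k$/year halves the estimate.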
I'm not sure I understand the question. If it's about being able to give donations, I wouldn't worry bc these people can be employed elsewhere making comparable salaries.
I think expected future earnings, including salaries and appreciation of equity, would go down in most cases. I thought you would agree because you said "even though it will be hard to give up the money".
I wasn't imagining they were donating the money, frankly. I'm not sure how many people working at AI companies even donate.
Anyway, directly making the world worse is not the only choice for making money.