Effective today, I’ve left Open Philanthropy and joined the Carnegie Endowment for International Peace[1] as a Visiting Scholar. At Carnegie, I will analyze and write about topics relevant to AI risk reduction. In the short term, I will focus on (a) what AI capabilities might increase the risk of a global catastrophe; (b) how we can catch early warning signs of these capabilities; and (c) what protective measures (for example, strong information security) are important for safely handling such capabilities. This is a continuation of the work I’ve been doing over the last ~year.

I want to be explicit about why I’m leaving Open Philanthropy. It’s because my work no longer meaningfully involves grantmaking, and given that I’ve historically overseen grantmaking, confusion on this point would be a significant problem. Philanthropy comes with particular power dynamics that I’d like to move away from, and I also think Open Philanthropy would benefit from less ambiguity about my role in its funding decisions (especially given the fact that I’m married to the President of a major AI company). I’m proud of my role in helping build Open Philanthropy, I love the team and organization, and I’m confident in the leadership it’s now under; I think it does the best philanthropy in the world, and will continue to do so after I move on. I will continue to serve on its board of directors (at least for the time being).

While I’ll miss the Open Philanthropy team, I am excited about joining Carnegie. 

  • Tino Cuellar, Carnegie’s President, has been an advocate for taking (what I see as) the biggest risks from AI seriously. Carnegie is looking to increase its attention to AI risk, and has a number of other scholars working on it, including Matt Sheehan, who specializes in China’s AI ecosystem (an especially crucial topic in my view).
  • Carnegie’s leadership has shown enthusiasm for the work I’ve been doing and plan to continue. I expect that I’ll have support and freedom, in addition to an expanded platform and network, in continuing my work there.
  • I’m generally interested in engaging more on AI risk with people outside my existing networks. I think it will be important to build an increasingly big tent over time, and I’ve tried to work on approaches to risk reduction (such as responsible scaling) that have particularly strong potential to resonate outside of existing AI-risk-focused communities. The Carnegie network is appealing because it’s well outside my usual network, while having many people with (a) genuine interest in risks from AI that could rise to the level of international security issues; (b) knowledge of international affairs.
  • I resonate with Carnegie’s mission of “helping countries and institutions take on the most difficult global problems and advance peace,” and what I’ve read of its work has generally had a sober, nuanced, peace-oriented style that I like.

I’m looking forward to working at Carnegie, despite the bittersweetness of leaving Open Phil. To a significant extent, though, the TL;DR of this post is that I am continuing the work I’ve been doing for over a year: helping to design and advocate for a framework that seeks to get early warning signs of key risks from AI, accompanied by precommitments to have sufficient protections in place by the time they come (or to pause AI development and deployment until these protections get to where they need to be).


 

  1. I will be at the California office and won’t be relocating.

Comments

I'm grateful that Cari and I met Holden when we did (and grateful to Daniela for luring him to San Francisco for that first meeting). The last fourteen years of our giving would have looked very different without his work, and I don't think we'd have had nearly the same level of impact — particularly in areas like farm animal welfare and AI that other advisors likely wouldn't have mentioned.

> I also think Open Philanthropy would benefit from less ambiguity about my role in its funding decisions (especially given the fact that I’m married to the President of a major AI company).

This makes sense, but if anything the conflict of interest seems more alarming if you're influencing national policy. For example, I would guess that you are one of the people—maybe literally among the top 10?—who stands to personally lose the most money in the event of an AI pause. Are you worried about this, or taking any actions to mitigate it (e.g., trying to convert equity into cash)?

My spouse isn't currently planning to divest the full amount of her equity. Some factors here: (a) It's her decision, not mine. (b) The equity has important voting rights, such that divesting or donating it in full could have governance implications. (c) It doesn't seem like this would have a significant marginal effect on my real or perceived conflict of interest: I could still not claim impartiality when married to the President of a company, equity or no. With these points in mind, full divestment or donation could happen in the future, but there's no immediate plan to do it.

The bottom line is that I have a significant conflict of interest that isn't going away, and I am trying to help reduce AI risk despite that. My new role will not have authority over grants or other significant resources besides my time and my ability to do analysis and make arguments. People encountering any analysis and arguments will have to decide how to weigh my conflict of interest for themselves, while considering arguments and analysis on the merits.

For whatever it's worth, I have publicly said that the world would pause AI development if it were all up to me, and I make persistent efforts to ensure people I'm interacting with know this. I also believe the things I advocate for would almost universally have a negative expected effect (if any effect) on the value of the equity I'm exposed to. But I don't expect everyone to agree with this or to be reassured by it.

For context, Holden is married to Daniela Amodei, president and co-founder of Anthropic. She also used to work at OpenAI and still, I believe, holds equity there. As Holden has stated elsewhere: "I am married to the President of Anthropic and have a financial interest in both Anthropic and OpenAI via my spouse."

Congratulations on the new role – I agree that engaging with people outside of existing AI risk networks has a lot of potential for impact.

Besides RSPs, can you give any additional examples of approaches that you're excited about from the perspective of building a bigger tent & appealing beyond AI risk communities? This balancing act of "find ideas that resonate with broader audiences" and "find ideas that actually reduce risk and don't merely serve as applause lights or safety washing" seems quite important. I'd be interested in hearing if you have any concrete ideas that you think strike a good balance of this, as well as any high-level advice for how to navigate this.

Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments (you can't do X or you can't do X unless Y), preparedness from governments (you can keep doing X but if we see Y then we're going to do Z), or other governance mechanisms? 

(I'll note I ask these partially as someone who has been pretty disappointed in the ultimate output from RSPs, though there's no need to rehash that debate here – I am quite curious how you're reasoning through these questions despite some likely differences in how we think about the success of previous efforts like RSPs.)

> Besides RSPs, can you give any additional examples of approaches that you're excited about from the perspective of building a bigger tent & appealing beyond AI risk communities? This balancing act of "find ideas that resonate with broader audiences" and "find ideas that actually reduce risk and don't merely serve as applause lights or safety washing" seems quite important. I'd be interested in hearing if you have any concrete ideas that you think strike a good balance of this, as well as any high-level advice for how to navigate this.

I'm pretty focused on red lines, and I don't think I necessarily have big insights on other ways to build a bigger tent, but one thing I have been pretty enthused about for a while is putting more effort into investigating potentially concerning AI incidents in the wild. Based on case studies, I believe that exposing and helping the public understand any concerning incidents could easily be the most effective way to galvanize more interest in safety standards, including regulation. I'm not sure how many concerning incidents there are to be found in the wild today, but I suspect there are some, and I expect there to be more over time as AI capabilities advance.

> Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments (you can't do X or you can't do X unless Y), preparedness from governments (you can keep doing X but if we see Y then we're going to do Z), or other governance mechanisms? 

The work as I describe it above is not specifically focused on companies. My focus is on hammering out (a) what AI capabilities might increase the risk of a global catastrophe; (b) how we can try to catch early warning signs of these capabilities (and what challenges this involves); and (c) what protective measures (for example, strong information security and alignment guarantees) are important for safely handling such capabilities. I hope that by doing analysis on these topics, I can create useful resources for companies, governments and other parties.

I suspect that companies are likely to move faster and more iteratively on things like this than governments at this stage, and so I often pay special attention to them. But I’ve made clear that I don’t think voluntary commitments alone are sufficient, and that I think regulation will be necessary to contain AI risks. (Quote from earlier piece: "And to be explicit: I think regulation will be necessary to contain AI risks (RSPs alone are not enough), and should almost certainly end up stricter than what companies impose on themselves.")

> one thing I have been pretty enthused about for a while is putting more effort into investigating potentially concerning AI incidents in the wild. Based on case studies, I believe that exposing and helping the public understand any concerning incidents could easily be the most effective way to galvanize more interest in safety standards, including regulation. I'm not sure how many concerning incidents there are to be found in the wild today, but I suspect there are some, and I expect there to be more over time as AI capabilities advance.

Interesting idea - I can see how exposing AI incidents could be important. This brought to my mind the paper Malla: Demystifying Real-world Large Language Model Integrated Malicious Services. (No affiliation with the paper, just one that I remember reading and we referenced in some Berkeley CLTC AI Security Initiative research earlier this year.) The researchers on the Malla paper dug into the dark web and uncovered hundreds of malicious services based on LLMs being distributed in the wild.

> Additionally, how are you feeling about voluntary commitments from labs (RSPs included) relative to alternatives like mandatory regulation by governments

This is discussed in Holden's earlier post on the topic here.

Thanks! I'm familiar with the post – another way of framing my question is “has Holden changed his mind about anything in the last several months? Now that we’ve had more time to see how governments and labs are responding, what are his updated views/priorities?”

(The post, while helpful, is 6 months old, and I feel like the last several months have given us a lot more info about the world than we had back when RSPs were initially being formed/released.)

Congrats Holden! Just going to quote you from a recent post:

> There’s a serious (>10%) risk that we’ll see transformative AI within a few years.
>
>   • In that case it’s not realistic to have sufficient protective measures for the risks in time.
>   • Sufficient protective measures would require huge advances on a number of fronts, including information security that could take years to build up and alignment science breakthroughs that we can’t put a timeline on given the nascent state of the field, so even decades might or might not be enough time to prepare, even given a lot of effort.
>
> If it were all up to me, the world would pause now

Please don't lose sight of this in your new role. Public opinion is on your side here, and PauseAI is gaining momentum. It's possible for this to happen. Please push for it! (And reduce your conflict of interest if possible!)

Thank you for your work. I am really grateful when people work hard and try hard to achieve good. I hope that the new job goes well. 

Seems like a good call.
