Cullen 🔸

4342 karma · Working (0-5 years) · Bangkok, Thailand
cullenokeefe.com

Bio

I am a lawyer and policy researcher interested in improving the governance of artificial intelligence. I currently work as Director of Research at the Institute for Law & AI. I previously worked in various legal and policy roles at OpenAI.

I am also a Research Affiliate with the Centre for the Governance of AI and a VP at the O’Keefe Family Foundation.

My research focuses on the law, policy, and governance of advanced artificial intelligence.

You can share anonymous feedback with me here.

Sequences: 2

Law-Following AI
AI Benefits

Comments: 334

Topic contributions: 24

Is your claim that somehow FTX investing in Anthropic has caused Anthropic to be FTX-like in the relevant ways? That seems implausible.

Thanks for this very thoughtful reply!

I have a lot to say about this, much of which boils down to two points:

  1. I don't think Jeremy is a good example of unnecessary polarization.
  2. I think "avoid unnecessary polarization" is a bad heuristic for policy research (which, related to my first point, is what Jeremy was responding to in Dislightenment), at least if it means anything other than practicing the traditional academic virtues of acknowledging limitations, noting contrary opinion, being polite, being willing to update, inviting disagreement, etc.

The rest of your comment I agree with.

I realize that point (1) may seem like nitpicking, and that I am also emotionally invested in it for various reasons. But this is all in the spirit of something like avoiding reasoning from fictional evidence: if we want to have a good discussion of avoiding unnecessary polarization, we should reason from clear examples of it. If Jeremy is not a good example of it, we should not use him as a stand-in.

I was just using Jeremy as a stand-in for the polarisation of Open Source vs AI Safety more generally.

Right, this is in large part where our disagreement lies: whether Jeremy is good evidence for, or an example of, unnecessary polarization. I simply don’t think that Jeremy is a good example of unnecessary (more on this below) polarization, because I think that he, explicitly and somewhat understandably, just finds the idea of approval regulation for frontier AI abhorrent. So to use Jeremy as evidence or an example of unnecessary polarization, we have to ask what he was reacting to, and whether something unnecessary was done to polarize him against us.

Dislightenment “started out as a red team review” of FAIR, and FAIR is the most commonly referenced policy proposal in the piece, so I think that Jeremy’s reaction in Dislightenment is best understood as, primarily, a reaction to FAIR. (More generally, I don’t know what else he would have been reacting to, because in my mind FAIR was fairly catalytic in this whole debate, though it’s possible I’m overestimating its importance. And in any case I wasn’t on Twitter at the time, so I may lack important context that he’s importing into the conversation.) In which case, in order to support your general claim about unnecessary polarization, we would need to ask whether FAIR did unnecessary things to polarize him.

Which brings us to the question of what exactly unnecessary polarization means. My sense is that avoiding unnecessary polarization would, in practice, mean that policy researchers write and speak extremely defensively to avoid making any unnecessary enemies. This would entail falsifying not just their own personal beliefs about optimal policy, but also, crucially, falsifying their predictions about what optimal policy is given the set of preferences that the public already holds. It would produce positive proposals shot through with diligent and pervasive reputation management, full of unnecessary and confusing hedges and disjunctive asides. I think pieces like that can be good, but it would be very bad if every piece was like that.

Instead, I think it is reasonable and preferable for discourse to unfold like this: Policy researchers write politely about the things that they think are true, explain their reasoning, acknowledge limitations and uncertainties, and invite further discussion. People like Jeremy then enter the conversation, bringing a useful different perspective, which is exactly what happened here. And then we can update policy proposals over time, to give more or less weight to different considerations in light of new arguments, political evidence (what do people think is riskier: too much centralization or too much decentralization?) and technical evidence. And then maybe eventually there is enough consensus to overcome the vetocratic inertia of our political system and make new policy. Or maybe a consensus is reached that this is not necessary. Or maybe no consensus is ever reached, in which case the default is nothing happens.

Contrast this with what I think the “reduce unnecessary polarization” approach would tend to recommend, which is something closer to starting the conversation with an attempt at a compromise position. It is sometimes useful to do this. But I think that, in terms of actual truth discovery, laying out the full case for one’s own perspective is productive and necessary. Without full-throated policy proposals, policy will tend too much either towards an unprincipled centrism (wherein all perspectives are seen as equally valid and therefore worthy of compromise) or towards the perspectives of those who defect from the “start at compromise” policy. When the stakes are really high, this seems bad.

To be clear, I don’t think you’re advocating for this "compromise-only" position. But in the case of Jeremy and Dislightenment specifically, I think this is what it would have taken to avoid polarization (and I doubt even that would have worked): writing FAIR with a much mushier, “who’s to say?” perspective.

In retrospect, I think it’s perfectly reasonable to think that we should have talked about centralization concerns more in FAIR. In fact, I endorse that proposition. And of course it was in some sense unnecessary to write it with the exact discussion of centralization that we did. But I nevertheless do not think that we can be said to have caused Jeremy to unnecessarily polarize against us, because I think him polarizing against us on the basis of FAIR is in fact not reasonable.

On ‘elite panic’ and ‘counter-enlightenment’, he’s not directly comparing FAIR to it I think. He’s saying that previous attempts to avoid democratisation of power in the Enlightenment tradition have had these flaws.

I disagree with this as a textual matter. Here are some excerpts from Dislightenment (emphases added):

Proposals for stringent AI model licensing and surveillance will . . . potentially roll[] back the societal gains of the Enlightenment.

bombing data centers and global surveillance of all computers is the only way[!!!] to ensure the kind of safety compliance that FAR proposes.

FAR briefly considers this idea, saying ‘for frontier AI development, sector-specific regulations can be valuable, but will likely leave a subset of the high severity and scale risks unaddressed’ But it . . . promote[s] an approach which, as we’ve seen, could undo centuries of cultural, societal, and political development.

He fairly consistently paints FAIR (or licensing more generally, which is a core part of FAIR) as the main policy he is responding to.

I think, from Jeremy’s PoV, that centralization of power is the actual ballgame and what Frontier AI Regulation should be about. So one mention on page 31 probably isn’t good enough for him.

It is definitely fair for him to think that we should have talked about decentralization more! But I don’t think it’s reasonable for him to polarize against us on that basis. That seems like the crux of the issue.

Jeremy’s reaction is most sympathetic if you model the FAIR authors specifically or the TAI governance community more broadly as a group of people totally unsympathetic to distribution of power concerns. The problem is that that is not true. My first main publication in this space was on the risk of excessively centralized power from AGI; another lead FAIR coauthor was on that paper too. Other coauthors have also written about this issue: e.g., 1; 2; 3 at 46–48; 4; 5; 6. It’s a very central worry in the field, dating back to the first research agenda. So I really don’t think polarization against us on the grounds that we have failed to give centralization concerns a fair shake is reasonable.

I think the actual explanation is that Jeremy and the group of which he is representative have a very strong prior in favor of open-sourcing things, and find it morally outrageous to propose restrictions thereon. While I think a prior in favor of OS is reasonable (and indeed correct), I do not think it reasonable for them to polarize against people who think there should be exceptions to the right to OS things. I think that it generally stems from an improper attachment to a specific method of distributing power without really thinking through the limits of that justification, or acknowledging that there even could be such limits.

You can see this dynamic at work very explicitly with Jeremy. In the seminar you mention, we tried to push Jeremy on whether, if a certain AI system turns out to be more like an atom bomb and less like voting, he would still think it's good to open-source it. His response was that AI is not like an atomic bomb.

Again, a perfectly fine proposition to hold on its own. But it completely fails to either (a) consider what the right policy would be if he is wrong, or (b) acknowledge that there is substantial uncertainty or disagreement about whether any given AI system will be more bomb-like or voting-like.

That’s a fine reaction to me, just as it’s fine for you and Marcus to disagree on the relative costs/benefits and write the FAIR paper the way you did.

I agree! But I guess I’m not sure where the room for Jeremy’s unnecessary polarization comes in here. Do reasonable people get polarized against reasonable takes? No.

I know you're not necessarily saying that FAIR was an example of unnecessary polarizing discourse. But my claim is either (a) FAIR was in fact unnecessarily polarizing, or (b) Jeremy's reaction is not good evidence of unnecessary polarization, because it was a reaction to FAIR.

There's probably a difference between ~July23-Jeremy and ~Nov23-Jeremy

I think all of the opinions of his we're discussing are from July 23? Am I missing something?

On the actual points though, I actually went back and skim-listened to the webinar on the paper in July 2023, which Jeremy (and you!) participated in, and man I am so much more receptive and sympathetic to his position now than I was back then, and I don't really find Marcus and you to be that convincing in rebuttal.

A perfectly reasonable opinion! But one thing that is not evident from the recording is that Jeremy showed up something like 10-20 minutes into the webinar, and so in fact missed a large portion of our presentation. Again, I think this is more consistent with some story other than unnecessary polarization. I don't think any reasonable panelist would think it appropriate to participate in a panel where they missed the presentation of the other panelists, though maybe he had some good excuse.

Ah, interesting, not exactly the case that I thought you were making.

I more or less agree with the claim that "Elon changing the twitter censorship policies was a big driver of a chunk of Silicon Valley getting behind Trump," but probably assign it lower explanatory power than you do (especially compared to nearby explanatory factors like, Elon crushing internal resistance and employee power at Twitter). But I disagree with the claim that anyone who bought Twitter could have done that, because I think that Elon's preexisting sources of power and influence significantly improved his ability to drive and shape the emergence of the Tech Right.

I also don't think that the Tech Right would have as much power in the Trump admin if not for Elon promoting Trump and joining the administration. So a different Twitter CEO who also created the Tech Right would have created a much less powerful force.

I will say that not appreciating arguments from open-source advocates, who are very concerned about the concentration of power from powerful AI, has led to a completely unnecessary polarisation against the AI Safety community from it.

I think if you read the FAIR paper to which Jeremy is responding (of which I am a lead author), it's very hard to defend the proposition that we did not acknowledge and appreciate his arguments. There is an acknowledgment of each of the major points he raises on page 31 of FAIR. If you then compare the tone of the FAIR paper to his tone in that article, I think he was also significantly escalatory, comparing us to an "elite panic" and to "counter-enlightenment" forces.

To be clear, notwithstanding these criticisms, I think both Jeremy's article and the line of open-source discourse descending from it have been overall good in getting people to think about tradeoffs here more clearly. I frequently cite it for that reason. But I think that a failure to appreciate these arguments is not the cause of the animosity in at least his individual case: I think his moral outrage at licensing proposals for AI development is. And that's perfectly fine as far as I'm concerned. People being mad at you is the price of trying to influence policy.

I think a large number of commentators in this space seem to jump from "some person is mad at us" to "we have done something wrong" far too easily. It is of course very useful to notice when people are mad at you and query whether you should have done anything differently, and there are cases where this has been true. But in this case, if you believe, as I did and still do, that there is a good case for some forms of AI licensing notwithstanding concerns about centralization of power, then you will just in fact have pro-OS people mad at you, no matter how nicely your white papers are written.

(Elon's takeover of twitter was probably the second—it's crazy that you can get that much power for $44 billion.)

I think this is pretty significantly understating the true cost. Or put differently, I don't think it's good to model this as an easily replicable type of transaction.

I don't think that if, say, some more boring multibillionaire did the same thing, they could achieve anywhere close to the same effect. It seems like the Twitter deal mainly worked for him, as a political figure, because it leveraged existing idiosyncratic strengths that he had, like his existing reputation and social media following. But to get to the point where he had those traits, he needed to be crazy successful in other ways. So the true cost is not $44 billion, but more like: be the world's richest person, who is also charismatic in a bunch of different ways, have an extremely dedicated online base of support from consumers and investors, have a reputation for being a great tech visionary, and then spend $44B.

A warm welcome to the forum!

I don't claim to speak authoritatively, or to answer all of your questions, but perhaps this will help continue your exploration.

There's an "old" (by EA standards) saying in EA, that EA is a Question, Not an Ideology. Most of what connects the people on this forum is not necessarily that they all work in the same cause area, or share the same underlying philosophy, or have the same priorities. Rather, what connects us is rigorous inquiry into the question of how we can do the most good for others with our spare resources. Because many of these questions are philosophical, people who start from that same question can and do disagree.

Accordingly, people in EA fall on both sides of many of the questions you ask. There are definitely people in EA that don't think that we should prioritize future lives over present lives. There are definitely people who are skeptical about AI safety. There are definitely people who are concerned about the "moral licensing" effects of earning-to-give.

So I guess my general answer to your closing question is: you are not missing anything; on the contrary, you have identified a number of questions that people in EA have been debating for the past ~20 years and will likely continue doing so. If you share the general goal of effectively doing good for the world (as, from your bio, it looks like you do), I hope you will continue to think about these questions in an open-minded and curious way. Hopefully discussions and interactions with the EA community will provide you some value as you do so. But ultimately, what is more important than your agreement or disagreement with the EA community about any particular issue is your own commitment to thinking carefully about how you can do good.

I upvoted and didn't disagree-vote, because I generally agree that using AI to nudge online discourse in more productive directions seems good. But if I had to guess where disagree votes come from, it might be a combination of:

  1. It seems like we probably want politeness-satisficing rather than politeness-maximizing. (This could be consistent with some versions of the mechanism you describe, or a very slightly tweaked version).
  2. There's a fine line between politeness-moderating and moderating the substance of ideas that make people uncomfortable. Historically, it has been hard to police this line, and given the empirically observable political preferences of LLMs, it's reasonable for people who don't share those preferences to worry that this will disadvantage them (though I expect this bias issue to get better over time, possibly very soon).
  3. There is a time and place for spirited moral discourse that is not "polite," because the targets of the discourse are engaging in highly morally objectionable action, and it would be bad to always discourage people from engaging in such discourse.*

*This is a complicated topic on which I don't claim to (a) have fully coherent views, or (b) have always lived up to the views I do endorse.

Both Sam and Dario saying that they now believe they know how to build AGI seems like an underrated development to me. To my knowledge, they only started saying this recently. I suspect they are overconfident, but still seems like a more significant indicator than many people seem to be tracking.

I also have very wide error bars on my $1B estimate; I have no idea how much equity early employees would normally retain in a startup like Anthropic. That number is also probably dominated by the particular compensation arrangements and donation plans of ~5–10 key people and so very sensitive to assumptions about them individually.
