TFD

43 karma · Joined

Comments (15)

I strongly disagree with the idea that there is a general obligation to reach out to someone before you publicly criticize them, and I've been considering writing a post explaining my case. I'd like to ask some questions to better understand the positions that people on the forum/EA community hold on this topic.

You talk about practices you'd like to "encourage" but later speak of "these norms", which I take to mean the obligation to reach out and to offer a "right of reply". Some things are good to do, yet failing to do them does not violate any norm. If someone makes a post on the forum that criticizes someone but does not reach out to the target of their criticism first, would you consider that a violation of a forum norm, even if the violation won't result in any enforcement?

Some posts that express similar views focus on criticism directed at organizations (e.g. "run posts by orgs"). Does the entity at which criticism is directed impact what a critic is expected to do? For example, it would surprise me if I was expected to reach out to OpenAI, the DOJ, or Amazon prior to making a post criticizing one of those entities on the forum. Similarly, people sometimes make posts that respond to criticism of EA or EA institutions that is published in other venues. Those responses are sometimes critical of the authors of the original criticism. I would also be surprised if the expectation was that such posts offer a right of reply to the original critics.

Lizka previously wrote a post about why, how and when to share a critique with the subject of your criticism. I highly recommend reading that post — she also includes a helpful guide with template emails for critics.

Appendix 3 of this post mentions this:

Criticism of someone’s work is more likely than other kinds of critical writing (like disagreement with someone’s written arguments)

What is in scope for "criticism" in this context? People may reasonably disagree on whether a particular piece of critical writing is more about public arguments/evidence (and thus is like disagreement with someone's arguments) or not. This also seems to suggest that if an org does something and publishes its reasons, the critic might not need to reach out (though it's unclear to me what the standard is), while if the org simply states it is doing something and gives no reasons, a critic would have to reach out.

The other appendices mention cases where the target of criticism is not expected to act in good faith, and the "run posts by orgs" post mentions a similar exception for when the person/org being criticized may behave badly if a critic reaches out. I think it's not uncommon for critics and their targets to have major disagreements about whether these kinds of beliefs are reasonable. When can one invoke this type of reasoning for not reaching out?

My personal take is that there are pretty reasonable arguments that what we have seen in AI/ML since 2015 suggests AI will be a big deal. I like the way I have seen Yoshua Bengio talk about it: "over the next few years, or a few decades". I share the view that either of those possibilities is reasonable. People who are highly confident that something like AGI will arrive within the next few years are more confident in this than I am, but I think that view is within the bounds of reasonable interpretation of the evidence. I think the opposite view, that something like AGI is most likely further than a few years away, is also within those bounds.

Don't believe me? Talk to me again in 5 years and send me a fruit basket. (Or just kick the can down the road and say AGI is coming in 2035...)

I think this is a healthy attitude and one worth appreciating. We may get answers to these questions over the next few years. That seems pretty positive to me. We will be able to resolve some of these disagreements productively by observing what happens. I hope people who have different views now keep this in mind, and that the environment remains a good place for people who disagree now to work together in the future if some of these disagreements get resolved.

I will offer the EA Forum internet-points equivalent of a fruit basket to anyone who would like one if we disagree now and they are later proven right and I am proven wrong.

I think part of the sociological problem is that people are just way too polite about how crazy this all is and how awful the intellectual practices of effective altruists have been on this topic.

Can you say what view it is you think is crazy? It seems quite reasonable to me to think that AI is going to be a massive deal and therefore that it would be highly useful to influence how it goes. On the other hand, I think people often over-estimate the robustness of the arguments for any given strategy for actually doing that influencing. In other words, it's reasonable to prioritize AI, but people's AI takes are often very over-confident.

TFD

I appreciate your comment.

It seems clear that if Jaime had different views about the risk-reward of hypothetical 21st century AGI, nobody would be complaining about him loving his family.

I do think this is substantially correct, but I also want to acknowledge that these can be difficult subjects to navigate. I don't think anyone has done anything wrong; I'm sure I myself have done something similar to this many times. But I do think it's worth trying to understand where the central points of disagreement lie, and I think this really is the central disagreement.

On the question of changing EA attitudes towards AI over the years: although I personally think AI will be a big deal, could be dangerous, and that those issues are worthy of significant attention, I can also certainly see reasons why people might disagree and why those people would have reasonable grievances with decisions by certain EA people and organizations.

An idea that I have pondered for a while about EA is a theory about which "boundaries" a community emphasizes. Although I've only ever interacted with EA by reading related content online, my perception is that EA really emphasizes the boundary around the EA community itself, while de-emphasizing the boundaries around individual people or organizations. The issues around Epoch, I think, demonstrate this. The feeling of betrayal comes from viewing "the community" as central. I think a lot of other cultures that place more emphasis on those other boundaries might react differently. For example, at most companies I have worked at, although they would certainly never be happy to see an employee leave, they wouldn't view moving to another job as a betrayal, even if the employee went to work for a direct competitor. I personally think placing more emphasis on orgs/individuals rather than the community as a whole could have some benefits, such as with the issue you raise about how to navigate changing views on AI.

Although emphasizing "the community" might seem ideal for cooperation, I think it can actually harm cooperation in the presence of substantial disagreements, because it generates dynamics like what is going on here. People feel like they can't cooperate with people across the disagreement. We will probably see some of these disagreements resolved over the next few years as AI progresses. I for one hope that even if I am wrong I can take any necessary corrections on board and still work with the people I disagreed with to make positive contributions. Likewise, I hope that if I am right, the people I disagreed with still feel like they can work with me despite that.

As a side note, it’s also strange to me that people are treating the founding of Mechanize as if it has a realistic chance to accelerate AGI progress more than a negligible amount — enough of a chance of enough of an acceleration to be genuinely concerning. AI startups are created all the time. Some of them state wildly ambitious goals, like Mechanize. They typically fail to achieve these goals. The startup Vicarious comes to mind.

I admit I had a similar thought, but I am of two minds about it. On the one hand, I think intentions do matter. I think it is reasonable to point out if you think someone is making a mistake, even if you think ultimately that mistake is unlikely to have a substantial impact because the person is unlikely to succeed in what they are trying to do.

On the other hand, I do think the degree of the reaction and the way that people are generalizing seems like people are almost pricing in the idea that the actions in question have already had a huge impact. So I do wonder if people are kind of over-updating on this specific case for similar reasons to what you mention.

Although I haven't thought deeply about the issue you raise, you could definitely be correct, and I think these are reasonable things to discuss. But I don't see their relevance to my arguments above. The quote you reference is itself discussing a quote from Sevilla that analyzes a specific hypothetical. I don't necessarily think Sevilla had the issues you raise in mind when he was addressing that hypothetical. I don't think his point was that, based on forecasts of life extension technology, he had determined that acceleration was the optimal approach in light of his weighing of 1-year-olds vs 50-year-olds. I think his point is more similar to what I mention above about current vs future people. I took a look at more of the X discussion, including the part where that quote comes from, and I think it is pretty consistent with this view (although of course others may disagree). Maybe he should factor in the things you mention, but to the extent his quote is being used to determine his views, I don't think the issues you raise are relevant unless he was considering them when he made the statement. On the other hand, I think discussing those things could be useful in other, more object-level discussions. That's kind of what I was getting at here:

I think, at bottom, the problem is that Sevilla makes mistakes in his analysis and/or decision-making about AI. His statements aren't norm-violating, they are just incorrect (at least some of them are, in my opinion). I think it's worth having clarity about what the actual "problem" is.

I know I've been commenting here a lot, and I understand my style may seem confrontational and abrasive in some cases. I also don't want to ruin people's day with my self-important rants, so, having said my piece, I'll drop the discussion for now and let you get on with other things.

(Although if you would like to respond you are of course welcome; I just mean to say I won't continue the back-and-forth after, so as not to create pressure to keep responding.)

TFD

Prioritising young people often makes sense from an impartial welfare standpoint

Sure, I think you can make a reasonable argument for that, but if someone disagreed with it, would you say they lack impartiality? To me it seems like something that is up for debate, within the "margin of error" of what is meant by impartiality. Two EAs could come down on different sides of that issue, remain in good standing in the community, and not be considered to reject the general principle of impartiality. Likewise, I think we can interpret Jeff Kaufman's argument above as expressing a similar view about an individual's loved ones. It is within the "margin of error" of impartiality to still have a higher degree of concern for loved ones, even if that might not be living up to the platonic ideal of impartiality.

My point in bringing this up is that the exact reason why the statement in question is bad seems to be shifting a bit over the conversation. Is the core reason that Sevilla's statement is objectionable really that it might up-weight people in a certain age group?

TFD

When I click the link I see three posts that go Sevilla, Lifland, Sevilla. I based my comments above on those. I haven't read through all the other replies by others or the posts responding to them. If there is relevant context in those or elsewhere, I'm open to changing my mind based on that.

He repeatedly emphasises that it’s about his literal friends, family and self, and hypothetical moderate but difficult trade offs with the welfare of others.

Can you say what statements lead you to this conclusion? For example, you quote him saying something I haven't seen, perhaps part of the thread I didn't read.

“But I want to be clear that even if you convinced me somehow that the risk that AI is ultimately bad for the world goes from 15% to 1% if we wait 100 years I would not personally take that deal. If it reduced the chances by a factor of 100 I would consider it seriously. But 100 years has a huge personal cost to me, as all else equal it would likely imply everyone I know [italics mine] being dead. To be clear I don't think this is the choice we are facing or we are likely to face.“

To me, this seems to confirm what I said above:

Based on my read of the thread, the comment was in response to a question about benefiting people sooner rather than later. This is why I say it reduces to an existing-person-affecting view (which, at least as far as I am aware, is not an unacceptable position to hold in EA). The question is functionally about current vs future people, not literally Sevilla's friends and family specifically.

Yes, Sevilla is motivated specifically by considerations about those he loves, and yes, there is a trade-off, but that trade-off is really about current vs future people. People who aren't longtermists, for example, would also implicate this same trade-off. I don't think Sevilla would be getting the same reaction here if he just said he isn't a longtermist. Because of the nature of the available actions, the interests of Sevilla's loved ones are aligned with those of current people (but not necessarily future people). The reason why "everyone [he] know[s]" will be dead is because everyone will be dead, in that scenario.

You might think that having loved ones as a core motivation above other people is inherently a problem. I think this is answered above by Jeff Kaufman:

I don't think impartiality to the extent of not caring more about the people one loves is a core value for very many EAs? Yes, it's pretty central to EA that most people are excessively partial, but I don't recall ever seeing someone advocate full impartiality.

I agree with this statement. Therefore my view is that simply stating that you're more motivated by consequences to your loved ones is not, in and of itself, a violation of a core EA idea.

Jason offers a refinement of this view. Perhaps what Kaufman says is true, but what if there is a more specific objection?

There are a number of jobs and roles that expect your actions in a professional capacity to be impartial in the sense of not favoring your loved ones over others. For instance, a politician should not give any more weight to the effects of proposed legislation on their own mother than the effect on any other constituent.

Perhaps the issue is not necessarily that Sevilla has the motivation itself, but that his role comes with a specific conflict-of-interest-like duty, which the statement suggests he is violating. My response was addressing this argument. I claim that the duty isn't as broad as Jason seems to imply:

It seems like the view expressed reduces to an existing-person-affecting view. Is there any plausible mechanism by which an action by Epoch is supposed to impact Sevilla's friends/relatives specifically? I seriously doubt it. The only plausible mechanism would be that AI goes well instead of poorly, which would benefit all existing people. This makes the politician comparison, as stated, disanalogous. Would you say that if a politician said their motivation to become a politician was to make a better world for their children, for example, that would somehow violate their duties? It seems like a lot of politicians would have an issue if that were the case.

Does a politician who votes for a bill and states they are doing so to "make a better world for their children" violate a conflict-of-interest duty? Jason's argument seems to suggest they would. Let's assume they are being genuine: they really are significantly motivated by care for their children, more than for a random citizen. They apply more weight to the impact of the legislation on their children than to others, violating Jason's proposed criterion.

Yet I don't think we would view such statements as disqualifying for a politician. The reason is that the mechanism by which they benefit their children only operates by also helping everyone else. Most legislation won't have any different impact on their children compared to any other person. So while the statement nominally suggests a conflict of interest, in practice the politician's incentives are aligned: the only way that voting for this legislation helps their children is that it helps everyone, and that includes their children. If the legislation plausibly did have a specific impact on their child (for example, impacting an industry their child works in), then that really could be a conflict of interest. My claim is that there needs to be some greater specificity for a conflict to exist. Sevilla's case is more like the first case than the second, or at least that is my claim:

Is there any plausible mechanism by which an action by Epoch is supposed to impact Sevilla's friends/relatives specifically? I seriously doubt it. The only plausible mechanism would be that AI goes well instead of poorly, which would benefit all existing people.

So, what has Sevilla done wrong? My analysis is this. It isn't simply that he is more motivated to help his loved ones (Kaufman's argument). Nor is it something like a conflict of interest (my argument). In another comment on this thread I said this:

People can do a bad thing because they are just wrong in their analysis of a situation or their decision-making.

I think, at bottom, the problem is that Sevilla makes mistakes in his analysis and/or decision-making about AI. His statements aren't norm-violating, they are just incorrect (at least some of them are, in my opinion). I think it's worth having clarity about what the actual "problem" is.

the prior generation of my family has passed on, I had young children

This seems to suggest that you think the politician's "making the world better for my children" statement would then also be problematic. Do you agree with that?

I'll be honest, this argument seems a bit too clever. Is the underlying problem with the statement really that it implies a set of motivations that might slightly up-weight a certain age group? One of the comments speaks of "core values" for EA. Is that really a core value? I'm pretty sure I recall reading an argument by MacAskill about how we should actually weight young people more heavily in various ways (I think it was voting), for example. I seriously doubt most EAs could claim that they are literally distributionally exact in weighting all morally relevant entities in every decision they make. I think the "core value" that exists probably isn't really this demanding, although I could be wrong.

But there’s no trade off between personal and impartial preferences there. That seems to me to be quite different from saying you’re prioritising eg your parents and grandparents getting to have extended lifespans over other people’s children’s wellbeing.

I can see why you would interpret it this way given the context, but I read the statement differently. Based on my read of the thread, the comment was in response to a question about benefiting people sooner rather than later. This is why I say it reduces to an existing-person-affecting view (which, at least as far as I am aware, is not an unacceptable position to hold in EA). The question is functionally about current vs future people, not literally Sevilla's friends and family specifically. I think this matches the "making the world better for your children" idea. You can channel a love of friends and family into an altruistic impulse, so long as there isn't some specific conflict of interest where you're benefiting them specifically. I think the statement in question is consistent with that.

The discussion also isn’t about the effects of Epoch’s specific work, so I’m a bit confused by your argument relying on that.

I'm bringing this up because I think it's implausible that anything being discussed here has some specific relevance to Sevilla's friends and family as individuals (in support of my point above). In other words, due to the nature of the actions being taken, any benefit to his friends and family comes only through benefits to existing people generally.

there’s no trade off between personal and impartial preferences there

In what way are any concrete actions that are relevant here prioritizing Sevilla's family over other people's children? Although I can see how it might initially seem that way, I don't think that's what the statement was intended to communicate.

I think there are two main appeals to libertarianism. One, "practical libertarianism", is based on the belief that on current margins moving towards less government would be beneficial. I'm sympathetic to this position, and I think one can hold it for purely consequentialist reasons.

The above argument is "philosophical libertarianism". I'm not so convinced by these arguments.

On these theories, for example, you wouldn’t be justified in stealing a loaf of bread even to save a starving person’s life.

I think the reason a lot of libertarian theories bite this bullet is that failing to do so seems to abandon the alleged reasons for libertarianism. For example, the article argues that one of the common-sense reasons why other concepts of government fail is that they seem to imply the government can do things that normal people can't do, like take people's property. But we have now just said that actually, normal people, not just governments, are allowed to take people's property if there is a good reason. We could go through an entire sequence of hypotheticals: can you take money from someone to buy bread if you're starving? If you aren't starving but someone else is, can you take money from someone else to buy bread to give to the starving person? The upshot of that sequence is that if you don't bite the bullet, then there's no limiting principle. You're basically saying redistribution is in fact allowed.

Likewise for the lifeboat example. I don't see how any principled approach to libertarianism can give the answer that you can threaten the rest of the passengers. That's literal coercion! If you then engage in the insane terminological gerrymandering where that counts as "self-defense", then so does so much other stuff. Is it "self-defense" if you force a doctor to give you a surgery you can't afford? None of the desired libertarian conclusions would follow. Especially given that libertarianism also needs to justify why private property is a thing, and most justifications I have seen go back to freedom of a person's labor by way of homesteading, the answer the author seems to give for the lifeboat hypo looks like a massive problem for libertarianism.

This doesn’t support a content-independent entitlement to enforce the state’s rules, nor a content-independent obligation to obey.

I'm not sure why there is a requirement that a theory of government be content-independent. This seems like an arbitrary requirement the author has imposed on theories they don't favor. Almost by definition, a consequentialist wouldn't support a "content-independent" position. But they could still support government based on an expectation about the distribution of government actions they expect to actually be realized. They could also support something like anarcho-capitalism; for a consequentialist it seems like a modeling/empirical question (and potentially collapses back to practical libertarianism).

I think there is essentially a moral no-free-lunch theorem for hypotheticals: no principled moral theory can match "common sense" intuitions on every hypothetical. Although I'm open to practical libertarianism, nothing here convinces me that philosophical libertarianism is the "least bad" option.

How common is it for such repayments to occur, what do you think would be the standard for how clear the commitment must be, and to whom would that commitment have to be made? For example, is there a case that 80,000 Hours should refund payments in light of their pivot to focus on AI? I know there are differences (their funder could support the move, etc.), but in the spirit of the thing, where is the line here?

Editing to add: One of my interests in this topic is that EA/rationalists seem to have some standards/views that diverge somewhat from what I would characterize as more "mainstream" approaches to these kinds of things. Re-reading the OP, I noticed a detail I initially missed:

Habryka wonders whether payment would have had to be given to Epoch for use of their benchmarks suite.

To me, this does seem to implicate a more mainstream view of a potential conflict of interest.
