This is a special post for quick takes by zchuang. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
I fear that the weird hugboxing EAs do towards their critics in order to signal good faith means that, over time, a lot of critics just end up not being sharpened in their arguments.
I feel pretty strongly against "weird hugboxing" but I think the main negative effect is an erosion of our own epistemic standards and a reduction in the degree to which we can epistemically defer to one another. I want the EA community to consist of people whose pronouncements I can fully trust, rather than have to wonder if they are saying something because it reflects their considered judgment on that topic or instead because they are signaling good faith, "steelmanning", etc.
What's the comparative?
I think the comparative is an inviting form of decoupling norms, where the discussion is broken up into chains. I don't think decoupling norms work when both parties haven't opted in, so people should switch to the dominant norm of whatever sphere they're in. An illustrative example is as follows:
EAs try to avoid being persuasive in favour of being explanatory, and thus do not engage within their opponents' frame for fear of low fidelity. E.g. in the Will MacAskill sweatshop discourse, EAs skip past the claim about impermissibility and instead contest it at the metaethical level.
EAs then respond that utilitarianism is true and that critics aren't engaging with or justifying the comparative (which, for a deontologist, is a category mistake about morality), and act as if the critic were non-responsive (the Chappell thread comes to mind).
I will note this is a special case in which many EA critics don't want to re-litigate the utilitarianism vs. deontology debate, but rather want to articulate the side-constraint violation and inform others in EA of that thinking.
The critic feels misunderstood: the EAs are very nice and say they're really happy the critic is here, but the critic doesn't feel actually heard, because the criticism wasn't really responded to, only met with nice words of praise.
In turn, the critic continues along the same chain of logic, which EAs have not sharpened or pushed towards the true crux.
Some EAs would see this as a motte-and-bailey rather than getting to the crux, but cruxes can be asymmetric in that different critics combine claims together (e.g. "woke" critiques combining with deontologists of more centrist sensibilities). But I think explanations which are done well are persuasive, because they reframe truth-seeking ideas in accessible language that dissolves cruxes in order to reach agreement and cooperation.
Another illustration of the comparative, at the macro level:
Treat critics well at the resource level and outside the argumentative level, so that EA's asymmetric resources don't come to bear.
Make sure you are responsive to their concerns and use reasoning transparency. Instead of saying "thanks for responding and being a critic" and leaving it there, actually engage forcefully with the ideas and then thank them for their time.
The core problem right now is that EAs lead with being open to changing their minds, which sets the discourse up for failure: when EAs don't change their minds, the critics feel misled.
To be clear, there are harms in trying to be persuasive (e.g. sophistry, lying, motivated reasoning, etc.). But sometimes being persuasive is about speaking the argumentative language of the other side.
This is a great comment, and I think it made me get much more of what you're driving at than the (much terser) top-level comment.
Yeah, I should have written more, but I try to keep my short form casual to lower the barrier to entry and to allow for expansions based on different readers' issues.
What do you mean by "resource" here?
Examples of resources that come to mind:
Platforms and the ability to amplify. I worry a lot about the amount of money in global priorities research and for graduate students (even though I do agree it's net good). For instance, most EA PhD students take teaching buyouts and probably have more hours to devote to research. A sharing of resources probably means a good distribution of prestige bodies and amplification gatekeepers.
To be explicit, my model of the modal EA is that they have bad epistemics and would take this to mean funding a bad-faith critic (and there are so many), but I do worry that sometimes EA wins in the marketplace of ideas due to money rather than truth.
Give access to the materials necessary to make criticisms (e.g. AI safety papers should be more open with dataset documentation, etc.).
Again, this is predicated on good-faith critics.
I think something like 30% hugboxing is good. In the cases where you see it, maybe it could happen less, but a lot of the time I think we are too brutal to non-rationalist critics.
It's really tiring to criticise and I think it's nice to have someone listen and engage at least a bit. If I move straight to "here is how I disagree" I think I lose out on useful criticism in the long run.
But that's conditional on people not interpreting the hugboxing as a tactic/weird norm. E.g. Mormon missionaries being nice doesn't elicit the same response as a person off the street being nice, because people adjust their set point.
Can you give examples of hugboxing you don't like?
Because my internal response is "people think we are too aggressive/dismissive" rather than "people think we listen to them but in a weird/patronising way", and if you mean you internally don't like it, then I am confused as to why you read it.
On the forum I agree hugboxing is worse.
Poll!
The fact that EAs have been so caught off guard by the "AI x-risk is a distraction" argument, and by its stickiness in the public consciousness, should be worrying for how well calibrated we are about AI governance interventions working the way we collectively think they will. This feels like another Carrick Flynn situation. I might write up an ITT on the AI ethics side -- I think there's a good analogy to an SSC post that EAs generally like.
I am unsure that "AI x-risk as a distraction" is a big deal. Like, what are their policy proposals, and what major actors use this frame?
Great question that prompted a lot of thinking. I think my internal model looks like this:
On the meta level, it feels as if EAs have a systemic error in their model that underestimates public distrust of EA actions, which constrains the action space and our collective sense-making of the world.
I think legacy media organisations buy into the framing solidly, especially organisations whose role is policing others, such as the Columbia Journalism Review (CJR).
Just in my own life, I've noticed that a lot of the "elite"-sphere friends I have at Ivies, in competitive debating, etc. are much more apprehensive towards EA and AI safety discourse in general, and they attribute it to this frame. Specifically, I'm thinking of the idea of inherency from policy debating: that people look for frames explaining the underlying barrier to change and the motivation for it.
I think this is directly bad for cooperation on the governance side (e.g. a lot of the good research on timelines and regulation is currently being done by people with AI ethics sympathies).
I think EAs underestimate how many technically gifted people who could be doing technical research are put off by EAs throwing around philosophical ideas ungrounded in technical acumen. This frame neatly compounds that aversion.
I'd be very interested to read a post about your thoughts on this (though I'm not sure what 'ITT' means in this context?), and I'm curious which SSC post you're referring to.
I also want to say I'm not sure how universal the 'EAs have been caught so off guard' claim is. Some have been, sure, but plenty were hoping the AI risk discussion would stay out of the public sphere for exactly this kind of reason.
I always thought the average model for "don't let AI safety enter the mainstream" was something like (1) you'll lose credibility and be called a loon, and (2) it'll drive race dynamics and salience. Instead, I think the argument AI ethics makes is "these people aren't so much loons as they are doing hype marketing for AI products in the status quo and draining counterfactual political capital from real near-term harms".
I think a bunch of people were hesitant about AI safety entering the mainstream because they feared it would severely harm the discussion climate around AI safety (and/or cause it to become a polarized left/right issue).
https://www.alexirpan.com/2024/08/06/switching-to-ai-safety.html
This reaffirms my belief that it's more important to look at the cruxes of existing ML researchers than to look internally within EA on AI safety.
In terms of achieving what goal or objective?
I wonder if anyone has moved from longtermist cause areas to neartermist cause areas. I was prompted by reading the recent Carlsmith piece and Julia Wise's "Messy personal stuff that affected my cause prioritization".
An underrated thing about the (post-)rationalists and adjacent folks is how open they are with their emotions. I really appreciate @richard_ngo's "Replacing Fear" series and a lot of the older LessWrong posts about starting a family with AI risk looming. I just really appreciate the personal posting, and when debugging comes from a place of openness and emotional generosity.
I think the model of steelmanning EAs have could borrow from competitive debating, because it seems really confused as a practice and people mean different things by steelmanning:
Are you steelmanning a whole viewpoint?
Are you steelmanning a worldview, i.e. a combination of bundled viewpoints?
Are you steelmanning the ideological or memetic assumptions?
Are you steelmanning a warrant (a reason for a belief)?
Are you steelmanning a reconstruction (why criticisms of an argument are wrong)?
I might write about this as a bundle but this imprecision has been bothering me.
I would be interested in a good explainer here! I just wrote a post that probably could have done with me reflecting on what I recommend doing.
Here I'll illustrate the ambiguity problem I have using your viewpoint. Just going to spitball a bunch of questions I end up asking myself.
If I'm steelmanning the viewpoint:
I'd be defending how this bundle of reasons leads to the conclusion. Conceptually easy and normal steelmanning.
I'd provide strong evidence for each claim, and my burden would be showing how load-bearing each claim is and how it's mistaken, or how it could reasonably have come to be believed.
If I'm steelmanning a worldview, I'd be defending something like the following:
Am I defending the modal reasoning of viewpoint holders across a population, or the strongest individual proponent (e.g. the average deontologist, or Korsgaard)?
Am I defending the top worldview holder in the eyes of the steelman's opponent, or in the eyes of some objective modal observer (e.g. if I'm saying p(doom) = 90% is wrong, do you want me to defend Paul Christiano's viewpoint or Yann LeCun's)?
A subquestion here is whether you want some expected-value calculus of: (change from base opinion) * (likelihood of changing opinion).
If I'm steelmanning assumptions:
Do you want the assumptions to line up with the opposing viewpoint such that the criticism of assumptions dissolves?
Do you want me to make explicit the assumptions and provide broad reasons for them?
If I'm steelmanning a specific reason:
E.g. for steelmanning bio anchors, we could decompose it into inputs, probability-distribution problems, and forecasting certainty (see the toy sketch after this list). Each of these argues from an external or internal frame relative to the bio anchors report and shows differing levels of understanding -- especially because they're interlinked. For instance, if you want a steelman of long timelines conditional on fast takeoff, you end up forced into a hardware-overhang argument.
If I'm steelmanning a reconstruction:
Arguments often exist in chains of argumentation, so a steelman that isolates one link, i.e. a reconstructive defence of a single argument requested as a steelman, is often decontextualised and confused.
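To make the inputs / probability distributions / forecasting decomposition concrete, here is a minimal toy sketch in Python. Every number in it (the spread of required compute, the starting compute, the growth rate) is a made-up placeholder rather than a figure from the bio anchors report, so treat it as an illustration of where a steelman or critique could attach, not as a model.

```python
# Toy sketch (not the actual Bio Anchors model) of decomposing a
# timelines estimate into inputs, a probability distribution, and a
# forecast. All numbers below are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Input 1: log10 of training compute required for TAI, drawn from an
# assumed spread standing in for the different biological anchors.
log_flop_required = rng.normal(loc=32, scale=3, size=n)

# Input 2: compute available to the largest training run over time,
# assuming a hypothetical starting point and growth rate.
years = np.arange(2024, 2061)
log_flop_available = 26 + 0.5 * (years - 2024)

# Forecast: probability that the largest run exceeds the requirement
# by each year. Each modelling choice above is a separate attachment
# point for a steelman or a critique.
p_by_year = [(log_flop_required <= a).mean() for a in log_flop_available]
for year, p in zip(years[::5], p_by_year[::5]):
    print(f"{year}: P(enough compute) ~ {p:.2f}")
```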
Overall, I think a large part of the problem is the phrase "what is your steelman" being Goodharted in the same way "I notice I'm confused" has been, where the original meaning is lost in a miasma of eternal September.
Yeah, I think it would be useful for you to clarify this.
One thing that bothers me about epistemics discourse is this terrible effect where critics pick weak, low-status EAs as opponents for claims about AI risk and then play credentialism games. I wish there were parity matching of claims in these discussions so they wouldn't collapse.
I notice a lot of internal confusion whenever people talk about macro-level bottlenecks in EA:
Talent constraint vs. funding constraint.
80k puts out declarations when the funding situation changes, such as "don't found projects on the margin" (RIP FTX).
People then don't found projects in AI safety because of this switch-up.
Over the next 2 years people up-skill and do independent research or join existing organisations.
Eventually, there are not enough new organisations to absorb funding.
[reverse the two in cycles I guess]
Mentorship in AI Safety
There's a mentorship bottleneck so people are pushed to do more independent projects.
Fewer new organisations get started because people are told it's a mentorship and research-aptitude bottleneck.
Eventually the mentorship bottleneck catches up: everyone has up-skilled, but there aren't enough organisations to absorb the mentors, etc.
To be clear, I understand the counterarguments about marginality, and these are exaggerated examples, but I do fear that at its core the way EAs defer means we get the worst of the social planner problem and none of the benefits of the theory of the firm.
Impacts being heavy-tailed has a very psychologically harsh effect, given the lack of feedback loops in longtermist fields, and I wonder what interpersonal norms one could cultivate amongst friends and the community writ large (loosely held/purely musing, etc.):
Distinguishing pessimism about ideas from pessimism about people.
Ex-ante vs. ex-post critiques.
Celebrating when post-mortems have led to more successful projects.
Mergers/takeover mechanisms for competition between people/projects.
I think EAs in the FTX era were leaning hard on hard capital (e.g. mentioning the No Lean Season shutdown), ignoring the social and psychological parts of taking risk and the question of how we can be a community that recognises heavy-tailed distributions without making it worse for those who are not in the heavy tail.
I wish there were a library of sorts of different base models of TAI economic growth that weren't just some form of the Romer model plus "TFP goes up because PASTA automates science".
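For concreteness, the baseline I'm gesturing at is roughly a semi-endogenous (Jones/Romer-style) ideas-production setup, where the "PASTA automates science" move amounts to adding AI researchers to the research input. This is my own loose paraphrase, not anyone's specific published model:

```latex
% Rough sketch of the standard baseline, not a specific published model.
\begin{align}
  Y_t &= A_t^{\sigma} K_t^{\alpha} L_t^{1-\alpha}
      && \text{output, with TFP } A_t \\
  \dot{A}_t &= \delta \, R_t^{\lambda} A_t^{\phi}
      && \text{ideas production from research input } R_t \\
  R_t &= L_{R,t} + \gamma C_t
      && \text{human researchers plus AI researchers } C_t
\end{align}
```

The library I have in mind would collect genuinely different skeletons (e.g. different substitution structures between human and AI research labour), not just different parameter values plugged into this one.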