This is a special post for quick takes by trevor1. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
Somewhere in languagespace, there should be a combination of ~50-200 words that 1) successfully convinces >30% of people that Wild Animal Welfare is really important, and 2) makes them realize that the society they grew up in is confused, ill, and deranged. A superintelligence could generate this.
The EV for that word combo is big. Orienting people towards reality requires freeing them from the intellectual poverty of our time. Most people today embrace the aging process, assume math trauma is normal, ignore moral uncertainty, and don't extrapolate the history of technology forward.
This is harder than it looks; it requires lateral thinking and weird knowledge, e.g. knowing that most people fear ideas that their friends might judge them negatively for holding. I wasn't very impressed with https://www.wildanimalinitiative.org. You need to dazzle. I still think that someone here could pull it off in only a couple of hours.
Roll to disbelieve? 50-100 words is only, like, a couple of tweets, so it is really not much time to communicate many new ideas. Consider some of the most influential tweets you've ever read (either influential on you personally, or on societal discourse / world events / etc). I think that the most impactful/influential tweets are gonna be pretty close to the limit of what's possible when you are just blasting out a standard message to everyone -- even with a superintelligence, I doubt there is much room for improvement.
Now, if you were using a superintelligence to target a UNIQUE 50-100 words tailored for each individual person, then IMO the sky's the limit -- a superintelligence could probably get crazy Snow-Crash-esque effects like getting people to commit suicide or to totally change their career / worldview / overall life trajectory.
So, I don't think the ideal standard (non-personalized) word-combo is that much better than a typical "extremely influential tweet". Extremely influential tweets are still great and very high-impact, of course! So it would still seem like a great idea to try and hone one's charisma / ability to craft really persuasive short takes / whatever. Unfortunately, I feel like millions of people are already trying to do this, creating an extremely competitive "marketplace of ideas" (or rather, marketplace of persuasive arguments / takes) where it's hard for new ideas (like WAW) to break through if they don't already have an optimized memetic profile. To quote from Yudkowsky's "Inadequate Equilibria":
CECIE: In our world, there are a lot of people screaming, “Pay attention to this thing I’m indignant about over here!” In fact, there are enough people screaming that there’s an inexploitable market in indignation. The dead-babies problem can’t compete in that market; there’s no free energy left for it to eat, and it doesn’t have an optimal indignation profile. There’s no single individual villain. The business about competing omega-3 and omega-6 metabolic pathways is something that only a fraction of people would understand on a visceral level; and even if those people posted it to their Facebook walls, most of their readers wouldn’t understand and repost, so the dead-babies problem has relatively little virality. Being indignant about this particular thing doesn’t signal your moral superiority to anyone else in particular, so it’s not viscerally enjoyable to engage in the indignation. As for adding a further scream, “But wait, this matter really is important!”, that’s the part subject to the lemons problem. Even people who honestly know about a fixable case of dead babies can’t emit a trustworthy request for attention.
SIMPLICIO: You’re saying that people won’t listen even if I sound really indignant about this? That’s an outrage!
CECIE: By this point in our civilization’s development, many honest buyers and sellers have left the indignation market entirely; and what’s left behind is not, on average, good.
I don't intend for this to be entirely a counsel of despair; obviously it is possible to convince people of EA ideas since the movement has experienced very dramatic growth over the past decade. But that growth is happening in a weird, very meta environment... IMO part of EA's appeal is that we do a good job of appealing to people who have "left the indignation market" (and left other, related markets for things like ideological polarization, etc) and people who have climbed some sort of ladder of developing an increasingly sophisticated worldview / theory of change / etc.
The upshot of that, IMO, is that when we are crafting persuasive arguments, we don't want to just imitate whatever naively seems like the most successful memetic content. Instead, we want to specifically target sophisticated people who've learned to ignore overly-emotional / clickbaity / ideological / etc arguments... eg consider the contrast between the EA Forum and something like the Drudge Report.
Nevertheless -- despite all these constraints and limitations -- I still think there is tons of untapped potential for crafting shorter, more-convincing, more-expressive takes that do a better job communicating the core ideas of neglected EA cause areas. So I agree that more people should be brainstorming messages and trying to hone that skill.
Now, if you were using a superintelligence to target a UNIQUE 50-100 words tailored for each individual person, then IMO the sky's the limit -- a superintelligence could probably get crazy Snow-Crash-esque effects like getting people to commit suicide or to totally change their career / worldview / overall life trajectory.
It could probably pull this off for a subsection of highly suggestible people, but I'm skeptical that even a superintelligence could convince most people to change deeply held values with a mere tweet thread.
Yes, there are a lot of potential word combinations out there, but the computer only has a finite amount of time to search through them, and is relying on an unavoidably imperfect model of the target person and the outer world (because of incomplete information and finite computing power).
I think it all comes down to how difficult the attempted persuasion is: I'm sure an AI could convince me to buy a $50 product, but I don't see any universe where it can convince me to commit an act of violence against a loved one.
Unfortunately, I feel like millions of people are already trying to do this, creating an extremely competitive "marketplace of ideas" (or rather, marketplace of persuasive arguments / takes) where it's hard for new ideas (like WAW) to break through if they don't already have an optimized memetic profile.
I think this goes a long way to completely shut down my argument. So, basically, case closed.
In-person communication is generally better anyway due to continuous feedback, so the real task is figuring out how to make those conversations 1) go as well as possible and 2) remain effective after passing through many people down the telephone chain. It's definitely important to get bullet-pointed lists going, but you also need to present them so that they're not interpreted as threatening or as a show of dominance; it's harder than you'd intuitively think to get someone to accept a message as controversial as Wild Animal Welfare.
Somewhere in languagespace, there should be a combination of ~50-200 words that 1) successfully convinces >30% people that Wild Animal Welfare is really important, and then 2) they realize that the society they grew up in is confused, ill, and deranged. A superintelligence could generate this.
I don't think this is true, at least taking "convinces" to mean something more substantial than, say, marking the box for "yeah WAS is important" on a survey given immediately after reading.
I don't think this is possible,[1] but for an interesting short-story which takes this idea seriously, check out "Understand" by Ted Chiang.
[1] Even for an arbitrary superintelligence specifically targeted at an individual. ~50-200 words simply isn't that many to convey complex ideas and accompanying evidence, and it must also use words/phrases that the subject already understands.
The quick takes section is a really good idea. There are lots of valuable ideas that can be optimized into short combinations of words. The forum is much better for this than Twitter: ideas are evaluated by different people, in a very different state of mind and environment.
EA should commandeer the Taurus emoji ♉ as the symbol for Moloch, as it is on every chat app and has no other practical uses. A good way to master a concept is to see how it applies to yourself and your friends. It's also an excuse to share Meditations on Moloch, which is a high-EV text.
But what if you're a Taurus who's looking for a Cancer to date...or even a fellow Taurus? Seems like you're leaving valuable info on the table if you aren't regularly using astrological symbols.
It can also be used to signal your belief in the stock market.
As a Scorpio, I concur that the Taurus emoji does not lack practical uses on social media apps 😤
If you get 1% better every day without ever getting worse, then after 365 days you'll be about 37 times better (1.01^365 ≈ 37.8, comfortably more than 30x). 10 minutes is roughly 1% of a day's waking hours. 5 minutes is one Yoda Timer interval.
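A quick sanity check of those numbers (a minimal sketch in Python; the ~16.7-hour waking day is an assumption added here, not something stated above):

```python
# Toy check of the "1% better every day" arithmetic.
daily_gain = 0.01
days = 365

# Compounding 1% per day for a year.
yearly_multiplier = (1 + daily_gain) ** days
print(f"(1.01)^365 = {yearly_multiplier:.1f}x")  # ~37.8x, comfortably above 30x

# "10 minutes is 1% of a day's waking hours" holds if you are awake
# roughly 16.7 hours (1000 minutes) per day.
waking_minutes = 1000
print(f"1% of waking time = {waking_minutes * daily_gain:.0f} minutes")
```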
To the extent that skills are like money, we can apply the model of compound interest. But I've actually never encountered a skill that you can consistently get 1% better at forever. I think that there are diminishing marginal returns to the time and effort a person devotes towards improving skills.
Maybe if I start doing as many push-ups as I can once per day, I will see a surprising improvement in my ability to do push-ups at first, but the daily improvements will taper off with time. If I can bench-press 134 pounds, I don't think my body can increase that by 1% every day. Eventually my progress will slow and even hit a physical limit. I'm actually not sure there are any human skills that can be scaled up like this. Can you think of any examples? Maybe memorizing vocabulary?
Upvoted. I was deliberately vague about "1% better every day" because the idea was the person becomes 1% better as a whole; my thinking on this wasn't complete.
Now that I've read your comment, I can make the concept more nuanced: the person becomes 1% better by improving a skill by 1% and moving on to a different skill as they approach diminishing returns.
So they would do 1% more push-ups, then as they started hitting diminishing returns they would spend more time on memorizing vocabulary, then linear algebra, then security mindset, then integrating Bayes' theorem deeply into their thoughts, then data science, then reading Christiano's papers, then networking, etc.
It seems that your original comment no longer holds under this version of "1% better", no? In what way does being 1% better at all these skills translate to being 30x better over a year? How do we even aggregate these 1% improvements under the new definition?
Anyway, even under this definition it seems hard to keep finding skills that one can get 1% better at within one day easily. At some point you would probably run into diminishing returns across skills -- that is, the "low-hanging fruit" of skills you can improve at easily will have been picked.
I definitely agree - the calculation that I had in mind, and actually wrote out, clearly assumed the 1% compounding on itself.
I have two defenses that it is still a helpful way of looking at it:
This was a heuristic, and getting a little better every day will still stack in the way that I and the reader had in mind when writing/reading the quote, even if the math wasn't exactly right about a daily 1% yielding roughly a 30x improvement per year.
Improvements at different tasks over the course of the year will still multiply, e.g. different tasks related to working on, thinking about, and solving AI alignment. (A toy sketch of how much the aggregation model matters follows below.)
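To make the aggregation question concrete, here is a minimal toy sketch (an illustration only, with an arbitrary choice of 10 skills, not anything claimed in the thread): whether rotating 1% gains still "multiply" depends entirely on whether overall ability behaves like a product of skill levels or like an average of them.

```python
# Toy model: improve a different skill by 1% each day, rotating
# through 10 skills, then aggregate the result two different ways.
# Both aggregation rules and the skill count are illustrative assumptions.
n_skills = 10
days = 365
skills = [1.0] * n_skills

for day in range(days):
    skills[day % n_skills] *= 1.01  # one skill improves by 1% per day

product_model = 1.0
for level in skills:
    product_model *= level              # overall ability = product of skill levels
average_model = sum(skills) / n_skills  # overall ability = average of skill levels

print(f"product model: {product_model:.1f}x overall")  # ~37.8x, gains still compound
print(f"average model: {average_model:.2f}x overall")  # ~1.44x, gains mostly don't
```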
TL;DR "habitually deliberately visualizing yourself succeeding at goal/subgoal X" is extremely valuable, but also very tarnished. It's probably worth trying out, playing around with, and seeing if you can cut out the bullshit and boot it up properly.
Longer:
The universe is allowed to have tons of people intuitively notice that "visualize yourself doing X" is an obviously winning strategy that typically makes doing X a downhill battle if it's possible at all, and so many different people pick it up that your first encounter with it is awful, e.g. in middle/high school a speaker tells you about it and, in the same breath, says you should use it to feel more motivated to do your repetitive math homework for ~2 hours a day.
I'm sure people could find all sorts of improvements, e.g. an entire field of selfvisualizationmancy that provably helps a lot of people do stuff, but the important thing I've noticed is to simply not skip that critical step. Eliminate ugh fields around self-visualization, or take whatever means necessary to prevent ugh fields from forming in your idiosyncratic case. (Also, social media algorithms could have been measurably increasing user retention by boosting content that places ugh fields in spots that decrease agency/motivation, with or without the devs being aware of this, because they are looking at inputs and outputs or maybe just outputs; so this could be a lot more adversarial than you were expecting.) Notice the possibility that this might or might not have been a core underlying dynamic in Yudkowsky's old Execute by Default post or Scott Alexander's silly hypothetical talent differential comment, without their awareness.
The universe is allowed to give you a brain that so perversely hinges on self-image instead of just taking the action. The brain is a massive kludge of parallel-processing spaghetti code and, regardless of whether or not you see yourself as a very social-status-minded person, the modern human brain was probably heavily wired to gain social status in the ancestral environment, and whatever departures you might have might be tearing down Chesterton-Schelling fences.
If nothing else, a takeaway from this was that the process of finding the missing piece that changes everything is allowed to be ludicrously hard and complicated, while the missing piece itself is simultaneously allowed to be very simple and easy once you've found it.
For more than a century, science fiction was basically the only institution paying people to think about the future. This sculpted the cultural landscape pretty significantly (try thinking of examples!)
I think this mainly made thinking about the future low-status and less rigorous.
Examples: Humans are the main actors in the future, breaking the laws of physics is to be expected, aliens are humanoid and at roughly the same level as humans in terms of technological development, and artificial intelligence and possible automation of labor don't change the basic economic setup at all.
I strongly think that people are sleeping on massive EV from encouraging others to read HPMOR, or to reread it if it's been years. It costs nothing to read it, and it likely offers broad and intense upskilling.
I think you're committing a typical mind fallacy if you think most people would benefit from reading HPMOR as much as you did.
I think OP omitted many details of why it might be plausible, and I wouldn't expect the disagree voters to have any idea about what's going on there:
To me, the literary value of EA stories is thinking through the psychological context of trying to think more clearly, trying to be good, whatever. Building empathy for "how you would derive EA-shaped things and then build on them from within a social reward surface that isn't explicitly selecting for that?", "what emotional frictions or mistake classes would I expect?", seems plausibly quite valuable for a ton of people.
Basically every other thing you could say about the value prop of reading Methods is downstream of this! "upskilling" was an intense word choice, but only 95% wrong.
With that in mind, I do think a ton of topics salient to EAs show up within Methods and many of them get thoroughly explored.
Additional thought: if we assume that people can gain skills from reading fiction (or from otherwise engaging with imaginary worlds, such as via films or games), does HPMOR give the best "return on investment" per hour spent? Is it better than reading War and Peace, or watching John Green videos, or playing Life is Strange? My suspicion is that EAs tend to be biased in favor of it, and that we therefore neglect other options.
(I’m not really expecting anyone to have actual data on this, but I’d be curious to see people bounce the idea around a bit)
Could you clarify what kind of upskilling you expect to come from reading Harry Potter fan fiction?
My "not rigorously thought out perspective" is that if someone has never encountered the idea of rationality or counterfactual thinking, then this might introduce them to it. But I'm guessing that roughly similar benefits could be had from reading a much shorter book that is more directly targeted at teaching these skills (maybe Thinking Fast and Slow?).
I think it was unhelpful to refer to “Harry Potter fanfiction” here instead of perhaps “a piece of fiction”—I don’t think it’s actually more implausible that a fanfic would be valuable to read than some other kind of fiction, and your comment ended up seeming to me like it was trying to use the dishonest rhetorical strategy of implying without argument that the work is less likely to be valuable to read because it’s a fanfic.
Adjusted for popularity or likelihood of recommendation, you might naively expect fiction that someone is presented with to be more likely to stand the test of time than fan fiction, since the selection effects are quite different.
I think that is a fair and accurate criticism. I do view most fan fiction as fairly low quality, but even if that is true it doesn’t imply that all fan fiction is low quality. And I do think that some fiction can be used for self-improvement purposes.
The literary quality of the fiction (which, incidentally, I thought was terrible when I glanced at it out of curiosity), is fairly irrelevant to whether it helps people be more rational (which I am also skeptical of, but that's a separate point.)
I do suspect that some Bay Area folk would benefit from reading at least one book of "serious" "literary" fiction with zero wizards or spaceships, like Middlemarch or To the Lighthouse, but I might just be being snobby/trying to justify the 2 years of undergrad courses in Eng Lit I did.
I read To the Lighthouse, not far away in time from when I read Methods, and I was annoyed or confused about why I was reading it. And there was a while ten years ago when Satantango and Gravity's Rainbow were occupying a massive subset of my brain at all times, so I'm not like "too dumb for litfic" or whatever.
Sure. Not everyone has to like every book! I don't like Don Quixote, which has frequently been claimed to be the greatest novel ever. I loved War and Peace when I was 18, but I'd cooled on its conservatism and misogyny when I last read it, though I still understood why people see it as "great".
But I do think there is a tendency towards the grandiose and speculative in rationalist/nerd taste and away from the realist*, domestic, psychological, etc. that I (probably pompously) think can be a bit limiting. Analogous to preferring Wagner to Mozart in opera, or prog metal to the Velvet Underground in "serious" rock music. (Also reminds me of Holden Karnofsky's blog post about being baffled by why Pet Sounds has such a strong reputation among rock critics when it's so "pop" and allegedly lacking in "complexity".) I've never read Satantango, but Gravity's Rainbow is verging pretty strongly on sci-fi, and has a very apocalyptic vibe, and a general desire to overwhelm.
Not that I'm slagging it: I really liked Gravity's Rainbow when I read it, though that's 18 years ago now, and I've hated the other Pynchon I've tried since. And I'm not slagging being a massive nerd either. I will never be a huge prog/metal fan, but I have just leapt from replaying Dragon Age: Inquisition to starting Baldur's Gate III.
*Technically speaking, I think "To the Lighthouse" is modernism, not realism, as lit profs would classify it. But in this context that really just means "about posh people, not about a war, dense prose, interior monologues", which isn't really incompatible with "realism" in any ordinary sense.
I totally agree, but, like the sequences, those books consume energy that is normally spent on work, or at least hobbies, whereas HPMOR is optimized to replace time that would otherwise have been spent on videos, social media, socializing, other novels, etc., and is therefore the best bet I know of to boost EA as a whole.
It costs time to read it! Do you happen to know of a 10 minute summary of the key points?
HPMOR technically isn't built to be time-efficient; the highlights of the Sequences are better for that. HPMOR is meant to replace other things you do for fun, like reading fun novels or TV shows or social media, with material that offers passive upskilling. In that sense, it is profoundly time-efficient, because it replaces fun time spent not upskilling at all with fun time spent upskilling.
A very large proportion of EA-adjacent people in the Bay Area swear by it as a way to become more competent in a very broad and significant way, but I'm not sure how it compares with other books like Discworld which are also intended for slack/leisure time. AFAIK CEA has not even done a survey explicitly asking about the self-improvement caused by HPMOR, let alone a study measuring the benefits of having different kinds of people read it.
'it likely offers broad and intense upskilling' -- What's the evidence for this?
We already have plenty of people whose worldview is shaped by Yudkowsky and "the sequences". We need more people from different backgrounds, who can take a skeptical eye to these texts, point out their flaws, and synthesize the good parts into a greater whole.
Disagree that a worldview shaped by Yudkowsky would not correlate with your skeptical-eye clause.
I get what you're saying (the Rationalist toolbox should equip you to be skeptical of the sequences themselves), and I think it's partially true in that rationalists have been highly critical of Yudkowsky lately.
However, I don't think it's enough. I think someone who learns similar principles to the sequences from different sources (textbooks, pop-science books, domain level knowledge) will do a better job at skeptically eyeing the sequences than someone who just read the sequences, for obvious reasons. I am one of those people, and I've spotted several issues with the sequences that the rationalists seemingly haven't.
Yeah I thought of it from the perspective of "not being told what to think but being told what to think about" -- Like you could say "the most profitable (in karma of a website) strategy is to disagree with a 'founder'-like figure of that very website" of course, but indeed if you've accepted his frame of the debate then didn't he "win" in a sense? This seems technically true often (not always!) but I find it uncompelling.
I've spotted several issues with the sequences that the rationalists seemingly haven't.
Where did you write these down?
I did an in-depth write-up debunking one claim here (that of a superintelligence inventing general relativity from a blade of grass).
I haven't gotten around to in-depth write-ups for other things, but here are some brief descriptions of other issues I've encountered:
The description of Aumann's agreement theorem in "Defy the Data" is false, leaving out important caveats that render his use of it incorrect.
Yud implies that "Einstein's Arrogance" is some sort of mystery (and people have cited that article as a reason to be as arrogant as Einstein about speculative forecasts). In fact, Einstein's arrogance was completely justified by the available evidence and is not surprising at all, in a manner in no way comparable to speculative forecasts.
The implications of the “AI box experiment” have been severely overstated. It does not at all prove that an AGI cannot be boxed. “rationalists are gullible” fits the evidence provided just as well.
Yudkowsky treats his case for the “many worlds hypothesis” as a slam-dunk that proves the triumph of Bayes, but in fact it is only half-done. He presents good arguments against “collapse is real”, but fails to argue that this means many worlds is the truth, rather than one of the other many interpretations which do not involve a real collapse.
The use of Bayesianism in Rationalism is highly simplified, and often doesn't actually involve using Bayes' rule at all. It rarely resembles Bayes as actually applied in science, and is likely to lead to errors in certain situations, like forecasting low-probability events (a toy numerical illustration follows this list).
Yud’s track record of predictions is fairly bad, but he has a habit of pretending it isn’t by being vague and refusing to make predictions that can be actually checked. In general he displays an embarrassing lack of intellectual humility.
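On the low-probability-forecasting point above, here is a toy numerical illustration (with made-up numbers, not drawn from the thread) of how skipping the prior inflates estimates of rare events:

```python
# Toy illustration: Bayes' rule vs. "simplified" reasoning that ignores
# the base rate. All numbers are made up for illustration.

def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) via Bayes' rule."""
    p_evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / p_evidence

prior = 0.001            # rare event: 0.1% base rate
p_e_given_h = 0.9        # striking evidence is likely if the event is real
p_e_given_not_h = 0.05   # but it also shows up 5% of the time when it isn't

p = posterior(prior, p_e_given_h, p_e_given_not_h)
print(f"Posterior with the prior included: {p:.3f}")  # ~0.018 -- still unlikely

# Reasoning only from how "striking" the evidence is (0.9 vs 0.05)
# would suggest ~95% confidence, ignoring the 0.1% base rate entirely.
print(f"Prior-free impression: {p_e_given_h / (p_e_given_h + p_e_given_not_h):.2f}")
```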
If I was a Bay Area VC, and I had $5m to invest annually and $100k to donate to people researching the long-term future (e.g. because it's interesting and I like the idea of being the one to drive the research), it would be foolish to spend some of the $5m investing in people researching nanofactories.
But it would also be foolish to donate some of the $100k to the kinds of people who say "nanorobotics is an obvious scam, they can just make up whatever they want".
And people don't realize that short-term investment and long-term predictions are separate domains that are both valuable in their own way, because there are so few people outside of the near-term focused private sector who are thinking seriously about the future.
They just assume that thinking about the long-term future is a twisted, failed perversion of the private sector, because they are so deeply and exclusively immersed in the private sector's perspective.
As a result, they never have a chance to notice that the long-term future is something that they and their families might end up living in.
The difficulty of pitching AI safety to someone has been going down by ~50% every ~18 months. This thanksgiving might be a great time to introduce it to family; run Murphyjitsu and be goal-oriented! 🦃
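Taking the quick take's own trend at face value, a minimal sketch of what "~50% easier every ~18 months" implies (the halving trend itself is the quick take's assumption, not a claim being added here):

```python
# Toy extrapolation of "pitching AI safety gets ~50% easier every ~18 months".
def relative_difficulty(months_from_now: float) -> float:
    return 0.5 ** (months_from_now / 18)

for years in (0, 1.5, 3, 4.5):
    print(f"{years:>4} years from now: {relative_difficulty(years * 12):.3f}x current difficulty")
```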
'It’s genuinely a criterion for genius to say a bunch of wrong stupid things sometimes. Someone who says zero stupid things isn’t reasoning from first principles, isn’t taking risks, and has downloaded all the “correct” views.'
Source (I did not write this).
Planning to simply defy human nature doesn’t usually go very well.
—Robin Hanson
Because humans are primates, we have a strong drive to gain social status and play dominance games. The problem is that humans tend to take important concepts and turn them into dominance games.
As a result, people anticipate some sort of dominance or status game whenever they hear about an important concept. For many people, this anticipation has become so strong that they stopped believing that important concepts can exist.
Worth noting that in humans (and unlike in most other primates) status isn't determined solely by dominance (e.g., control via coercion), but is also significantly influenced by prestige (e.g., voluntary deference due to admiration). While both dominance and prestige play a large role in determining status among humans, if anything prestige probably plays the larger role.
(Note – I'm not an expert in anthropology, and anyone who is can chime in, but this is my understanding given my amount of knowledge in the area.)
Agreed, I only used the phrase "dominance games" because it seemed helpful for understandability and the wordcount. But it was inaccurate enough to be worth the effort of finding a better combination of words.
Effective thinking is like building a house of cards. There are a lot of unique negative feedback loops that could scuttle the whole thing, and the risk of collapse also increases with height and time.
But the winning strategy is to get in the habit of routinely/passively stacking cards; even if collapse is frequent and frustrating, it's well worth it, because it's the best way to eventually end up with a stack of cards higher than anyone else has ever built. That's how Peter Singer and other philosophers became the first humans on Earth to discover the core arguments of Effective Altruism decades ago. Before that, nobody anywhere had stacked the cards high enough.
My instinctive reaction is that this seems more relevant to sequence-thinking-generated arguments than cluster-thinking ones (to use Holden's terminology).
The human brain seems to be pretty effective at thinking up clever ways to damage other people's reputations, even when running at only 80 IQ.
If cryopreservation becomes mainstream, then that's literally it. Nobody dies, and all of humanity logrolls itself into raising the next generations to be friendly and create aligned AGI.
Even the total sociopaths participate to some degree (e.g. they verbally support it, and often avoid obstructing it if they are very powerful). If they don't have preserved loved ones to protect, they still need a friendly long-term future for themselves to be unfrozen into. They'll spend many more years alive in the future than in the present anyway, because unfreezing a person is orders of magnitude harder than reversing aging or generating a new body for an unfrozen person.
Many other people have probably thought of this already. What am I missing?