MikhailSamin

executive director @ AI Governance and Safety Institute
412 karma · Joined
contact.ms

Bio


Are you interested in AI X-risk reduction and strategies? Do you have experience in comms or policy? Let’s chat!

aigsi.org develops educational materials and ads that communicate core AI safety ideas to specific demographics as efficiently as possible, with a focus on producing a correct understanding of why smarter-than-human AI poses a risk of extinction. We plan to increase and leverage understanding of AI and existential risk from AI to improve the chance that institutions address x-risk.

Early results include a cost of $0.10 per click on ads linking to a website that explains the technical details of why AI experts are worried about extinction risk from AI, and $0.05 per engagement on ads that share simple ideas at the core of the problem.

Personally, I’m good at explaining existential risk from AI to people, including policymakers. At one e/acc event, I changed the minds of three of the four people I talked to.

Previously, I got 250k people to read HPMOR and sent 1.3k copies to winners of math and computer science competitions (including dozens of IMO and IOI gold medalists); took the GWWC pledge; and created a small startup that donated >$100k to effective nonprofits.

I have a background in ML and strong intuitions about the AI alignment problem. I grew up running political campaigns and have a bit of a security mindset.

My website: contact.ms

You’re welcome to schedule a call with me before or after the conference: contact.ms/ea30 

Comments (68)

I'm confused about this discrepancy between LessWrong and EA Forum. (Feedback is welcome!)

Anecdotally, approximately everyone with Russian origins who's now working on AI safety got into it because of HPMOR. Just a couple of days ago, an IOI gold medalist reached out to me; they've been going through ARENA.

HPMOR tends to make people with that kind of background act more on trying to save the world. It also gives some intuitive sense of related ideas (up to "oh, like the mirror from HPMOR?"), but that's a lot less central than giving people ~EA values and getting them to actually do stuff.

(Plus, at this point, the book is well-known enough in some circles that some percentage of future Russian ML researchers would be a lot easier to alignment-pill and to persuade not to work on something that might kill everyone or prompt other countries to build something that kills everyone.

Like, the largest Russian broker decided to celebrate the New Year by advertising HPMOR and citing Yudkowsky.)

I'm not sure how universal this is (the kind of Russian kid who is into math/computer science is often the kind of kid who's into the HPMOR aesthetic), but it seems to work.

I think many past IMO/IOI medalists are very capable and could help; it's worth going through the list of medalists, reaching out to those who've read HPMOR (and possibly The Precipice or Human Compatible), and getting them to work on AI safety.

We also have 6k more copies (18k hardcover books) left and no idea what to do with them. Suggestions are welcome.

Here's a map of the Russian libraries that requested copies of HPMOR and to which we've sent 2,126 copies:

Sending HPMOR to random libraries is cool, but I hope someone comes up with better uses for the books.

Nope. The grant you linked to was not in any way connected to me or the books I've printed. A couple of years ago (edit: in 2019), I was surprised to learn about that grant; the claim that there was coordination with "the team behind the highly successful Russian printing of HPMOR" (which is me/us) is false. (I don't think the recipients of the grant you're referencing even have a way to follow up with the people they gave books to. Also, as IMO 2020 was cancelled, they should’ve returned most of the grant.)

EA money was not involved in printing the books that I have.

We started sending books to olympiad winners in December 2022. All of the copies we've sent were explicitly requested, often together with copies of The Precipice and/or Human Compatible, sometimes after the recipient had already read HPMOR (usually due to my previous efforts), and usually after they had seen endorsements from popular-science figures and literary critics.

I have a very different model of how HPMOR affects this specific audience, and I think this is a lot more valuable than selling the books[1] and donating the proceeds elsewhere.

  1. ^

    We can't actually sell these books due to copyright-related constraints.

I would not be advocating for inaction. I do advocate for high-integrity actions and comms, though.

I occasionally see these people publicly saying that the rationalists' standards of honesty are impossible to meet, and that they talk in ways rationalists would consider potentially manipulative.

It would be great if people who are actually doing things tried to avoid manipulations and dishonesty.

Being manipulative is the kind of thing that backfires and leads to the deaths of everyone in a short-timelines world.

Until a year ago, I was hoping EA had learned some lessons from what happened with SBF, but unfortunately, we don't seem to have.

If you lie to try to increase the chance of a global AI pause, our world looks less like a surviving world, not more like it.

Update: I've received feedback from the SFF round; we got positive evaluations from two recommenders (so my understanding is the funding allocated to us in the s-process was lower than the speculation grant) and one piece of negative feedback. The negative feedback mentioned that our project might lead to EA getting swamped by normies with high inferential distances, which can have negative consequences; and that because of that risk, "This initiative may be worthy of some support, but unfortunately other orgs in this rather impressive lineup must take priority".

If you're considering donating to AIGSI/AISGF, please reach out! My email is ms@contact.ms.

Note that we've only received a speculation grant recommended by the SFF and haven’t received any s-process funding. This should be a downward update on the value of our work and an upward update on a marginal donation's value for our work.

I'm waiting for feedback from SFF before actively fundraising elsewhere, but I'd be excited about getting in touch with potential funders and volunteers. Please message me if you want to chat! My email is ms@contact.ms, and you can find me everywhere else or send a DM on EA Forum.

On other organizations, I think:

  • MIRI’s work is very valuable. I’m optimistic about what I know about their comms and policy work. As Malo noted, they work with policymakers, too. Since 2021, I’ve donated over $60k to MIRI. I think they should be the default choice for donations unless they say otherwise.
  • OpenPhil risks increasing polarization and making it impossible to pass meaningful legislation. But while they make decisions that are IMO obviously bad, not everything they/Dustin fund is bad. E.g., Horizon might place people who actually care about others in positions where they could have a huge positive impact on the world. I’m not sure; I would love to see Horizon fellows become more informed on AI x-risk than they currently are, but I’ve donated $2.5k to the Horizon Institute for Public Service this year.
  • I’d be excited about the Center for AI Safety getting more funding. SB-1047 was the closest we’ve gotten to a very good thing, AFAIK, and it was a coin toss whether it would be signed. They seem very competent. I think the occasional potential lack of rigor and other concerns don't outweigh their results. I’ve donated $1k to them this year.
  • By default, I'm excited about the Center for AI Policy. A mistake they plausibly made makes me somewhat uncertain about how experienced they are with DC and whether they are capable of avoiding downside risks, but I think the people who run it are smart and have very reasonable models. I'd be excited about them having as much money as they can spend and hiring more experienced and competent people.
  • PauseAI is likely to be net-negative, especially PauseAI US. I wouldn’t recommend donating to them. Some of what they're doing is exciting (and there are people who would be a good fit to join them and improve their overall impact), but they're incapable of avoiding actions that might, at some point, badly backfire.

    I’ve helped them where I could, but they don’t have good epistemics, and they’re fine with using deception to achieve their goals.

    E.g., at some point, their website represented the view that it’s more likely than not that bad actors would use AI to hack everything, shut down the internet, and cause a societal collapse (but not extinction). If you talk to people with some exposure to cybersecurity and say this sort of thing, they’ll dismiss everything else you say, and it’ll be much harder to make a case for AI x-risk in the future. PauseAI Global’s leadership updated when I had a conversation with them and edited the claims, but I'm not sure they have mechanisms to avoid making confident wrong claims. I haven't seen evidence that PauseAI is capable of presenting their case for AI x-risk competently (though it's been a while since I've looked).

    I think PauseAI US is especially incapable of avoiding actions with downside risks, including deception[1], and donations to them are net-negative. To Michael, I would recommend, at the very least, donating to PauseAI Global instead of PauseAI US; to everyone else, I'd recommend ideally donating somewhere else entirely.

  • Stop AI's views include the idea that a CEV-aligned AGI would be just as bad as an unaligned AGI that causes human extinction. I wouldn't be able to pass their ITT, but yep, people should not donate to Stop AI. The Stop AGI person participated in organizing the protest described in the footnote. 
  1. ^

    In February this year, PauseAI US organized a protest against OpenAI "working with the Pentagon", while OpenAI had only collaborated with DARPA on open-source cybersecurity tools and was in talks with the Pentagon about veteran suicide prevention. Most participants wanted to protest OpenAI because of AI x-risk, not because of the Pentagon, but those I talked to said it felt deceptive once they discovered the nature of OpenAI's collaboration with the Pentagon. Also, Holly threatened me in an attempt to prevent the publication of a post about this, and then publicly lied about our conversations in a way that can easily be falsified by looking at the messages we exchanged.

(I haven’t really thought about this and might be very wrong, but it seems worth putting out there.) I feel like putting 🔸 at the end of social media names might be bad. I’m curious what the strategy was.

  • The willingness to do this might be anti-correlated with status: it might be a less important part of the identity of more prominent people. (E.g., would you expect Sam Harris, who is a GWWC pledger, to do this?)

  • I’d guess that ideally, we want people to associate the GWWC pledge with role models (+ know that people similar to them take the pledge, too).

  • That anti-correlation with status might mean people come to associate the pledge with average (though altruistic) Twitter users, rather than with cool people they want to be more like.

  • You won’t see a lot of e/accs putting the 🔸 in their names. There might be downsides to a group being perceived as clearly outlined, with the symbol as an almost political identity; it seems bad to have directionally political markers that might do mind-killing things both to people with the 🔸 and to people who argue with them.

Can you give an example of a non-PR risk that you had in mind?

Uhm, for some reason I have four copies of this crosspost on my profile?
