Ben Millwood🔸

4644 karma · Joined

Participation: 3
  • Attended an EA Global conference
  • Attended an EAGx conference
  • Attended more than three meetings with a local EA group

Comments: 530

Topic contributions: 1

Not that it's super important, but TVTropes didn't invent the phrase (nor do they claim they did); it's from Warhammer 40,000.

I downvoted this because I don't think it's independently valuable or separate enough from your existing posts to merit a new, standalone post. I think it would have been better as a comment on your existing posts (and, as I've said on someone else's post about your reviews, I think we're better off consolidating the discussion in one place).

That said, I think the sentiments expressed here are pretty reasonable, and I think I would have upvoted this in comment form.

Someone on the forum said there were ballpark 70 AI safety roles in 2023

Just to note that the UK AI Security Institute employs more than 50 technical staff by itself and I forget how many non-technical staff, so this number may be due an update.

This doesn't seem right to me, because it's popular among those concerned with the longer-term future to expect it to be populated by emulated humans, which is clearly not a continuation of humanity's genetic legacy. So I feel pretty confident that it's something else about humanity that people want to preserve against AI. (I'm not here to defend this particular vision of the future beyond noting that people like Holden Karnofsky have written about it, so it's not exactly niche.)

You say that expecting AI to have worse goals than humans would require studying things like the empirically observed goals of AI systems, and similar. Sure: in the absence of having done those studies, we should delay our replacement until they can be done. And doing those studies is undermined by the fact that our current knowledge of how to reliably determine what an AI is thinking is pretty poor, and it will only get worse as AIs develop their abilities to strategise and lie. Solving these problems would be a major piece of what people are looking for in alignment research, and precisely the kind of thing it seems worth delaying AI progress for.

Another opportunity for me to shill my LessWrong writing posing this question: Should we exclude alignment research from LLM training datasets?

I don't have a lot of time to spend on this, but this post has inspired me to take a little time to figure out whether I can propose or implement some controls (likely: making posts visible to logged-in users only) in ForumMagnum, the software underlying the EA Forum, LW, and the Alignment Forum.

edit: https://github.com/ForumMagnum/ForumMagnum/issues/10345
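For illustration, here's a minimal sketch of the kind of check such a control implies, in plain TypeScript. The `Post`, `User`, and `canViewPost` names and the `loggedInUsersOnly` flag are hypothetical; this is not ForumMagnum's actual schema or API, just the shape of the proposed behaviour (hide certain posts from logged-out viewers, and hence from scrapers).

```typescript
// Hypothetical sketch only: illustrative types, not ForumMagnum's real schema or API.

interface Post {
  id: string;
  title: string;
  // Hypothetical per-post flag for the proposed control.
  loggedInUsersOnly: boolean;
}

interface User {
  id: string;
  displayName: string;
}

// Returns true if the (possibly anonymous) viewer should be allowed to see the post.
function canViewPost(post: Post, viewer: User | null): boolean {
  if (post.loggedInUsersOnly && viewer === null) {
    return false; // hidden from logged-out readers, and so from most crawlers
  }
  return true;
}

// Example usage:
const post: Post = { id: "p1", title: "Alignment notes", loggedInUsersOnly: true };
console.log(canViewPost(post, null));                             // false
console.log(canViewPost(post, { id: "u1", displayName: "Ben" })); // true
```

In a real implementation this check would sit in the server-side permissions layer rather than in client code, so that the post body never reaches a logged-out browser at all.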

I agree overall, but I want to add that becoming dependent on non-EA donors could put you under pressure to do more non-EA things and fewer EA things -- either party could pull the other towards itself.

Keep in mind that you're not coercing them to switch their donations, just persuading them. That means you can use the fact that they were persuaded as evidence that you were on the right side of the argument. Your being too convinced of your own opinion isn't a problem unless other people are also somehow too convinced of it, and I don't see why they would be.

I think that EA donors are likely to be unusual in this respect -- you're pre-selecting for people who have signed up for a culture of doing what's best even when it isn't what they previously thought was best.

I guess I also think that my arguments for animal welfare charities are, at heart, EA-style arguments, so knowing that someone is the kind of person who appreciates EA-style arguments gives me a big boost to my likelihood of persuading them.
