vaniver

(comment crossposted from LW)

While the coauthors broadly agree about points listed in the post, I wanted to stick my neck out a bit more and assign some numbers to one of the core points. I think on present margins, voluntary restraint slows down capabilities progress by at most 5% while probably halving safety progress, and this doesn't seem like a good trade. [The numbers seem like they were different in the past, but the counterfactuals here are hard to estimate.] I think if you measure by the number of people involved, the effect of restraint is substantially lower; here I'm assuming that people who are most interested in AI safety are probably most focused on the sorts of research directions that I think could be transformative, and so have an outsized impact.

Request for advice from animal suffering EAs (and, to a lesser extent, climate EAs?): is there an easy win over getting turkeys from Mary's Turkeys? (Also, how much should I care about getting the heritage variety?)

Background: I routinely cook for myself and my housemates (all of whom are omnivores), and am on a diet that requires animal products for health reasons. Nevertheless, I'd rather impose fewer costs on others than more; I stopped eating chicken and chicken eggs in response to this post and recently switched from consuming lots of grass-finished beef to consuming lots of bison.

I have a general sense that "bigger animals have better lives", and so it's better to roast turkeys than roast ducks, for example, but am 1) not clear on the relative impacts of birds and mammals and 2) not clear on turkeys specifically. My sense is that even relatively well-treated chickens have been bred in a way that's pretty welfare-destroying, but that this hasn't yet happened to ducks; turkeys are subject to Thanksgiving pressure in the US, though, which I'm sure has had some effect. The various food-impact estimators I've seen (like foodimpacts.org) either don't address turkeys at all or use estimates from 'similar' creatures, which feels like too much of a stretch.

I think the counterfactual to roasting a turkey is either smoking the same weight of bison brisket or my housemates cooking themselves the same weight of chicken.

So you would have lost 40 hours of productive time? Respectfully: so what? You have sources actively claiming you are about to publish directly false information about them and asking for time to provide evidence that the information is false.

Also, I think it would be worth Oli/Ben estimating how many productive hours were lost to the decision not to delay; it would not surprise me if much of the benefit here was illusory.

While it's generally poor form to attempt to de-anonymize stories, since it's at issue here it seems potentially worth it. It seems like this could be Kat's description of Kat's experience of Ben, which she (clearly) consents to sharing.

(I'm Matthew Gray)

Inflection is a late addition to the list, so Matt and I won’t be reviewing their AI Safety Policy here.

My sense from reading Inflection's response now is that they say the right things about red teaming and security and so on, but I am pretty worried about their basic plan / they don't seem to be grappling with the risks specific to their approach at all. Quoting from them in two different sections:

Inflection’s mission is to build a personal artificial intelligence (AI) for everyone. That means an AI that is a trusted partner: an advisor, companion, teacher, coach, and assistant rolled into one.

Internally, Inflection believes that personal AIs can serve as empathetic companions that help people grow intellectually and emotionally over a period of years or even decades. **Doing this well requires an understanding of the opportunities and risks that is grounded in long-standing research in the fields of psychology and sociology.** We are presently building our internal research team on these issues, and will be releasing our research on these topics as we enter 2024.

I think AIs thinking specifically about human psychology--and how to convince people to change their thoughts and behaviors--are very dual use (i.e. can be used for both positive and negative ends) and at high risk for evading oversight and going rogue. The potential for deceptive alignment seems quite high, and if Inflection is planning on doing any research on those risks or mitigation efforts specific to that, it doesn't seem to have shown up in their response.

I don't think this type of AI is very useful for closing the acute risk window, and so probably shouldn't be made until much later.

I'm thinking about the matching problem of "people with AI safety questions" and "people with AI safety answers". Snoop Dogg hears Geoff Hinton on CNN (or wherever), asks "what the fuck?", and then tries to find someone who can tell him what the fuck.

I think normally people trust their local expertise landscape--if they think the CDC is the authority on masks they adopt the CDC's position, if they think their mom group on Facebook is the authority on masks they adopt the mom group's position--but AI risk is weird because it's mostly unclaimed territory in their local expertise landscape. (Snoop also asks "is we in a movie right now?" because movies are basically the only part of the local expertise landscape that has had any opinion on AI so far, for lots of people.) So maybe there's an opportunity here to claim that territory (after all, we've thought about it a lot!).

I think we have some 'top experts' who are available for, like, mass-media things (podcasts, blog posts, etc.) and 1-1 conversations with people they're excited to talk to, but are otherwise busy / not interested in fielding ten thousand interview requests. Then I think we have tens (hundreds?) of people who are expert enough to field ten thousand interview requests, given that the standard is "better opinions than whoever they would talk to by default" instead of "speaking to the whole world" or whatever. But just like connecting people who want to pay to learn calculus with people who know calculus and will teach it for money, there are significant gains from trade from having some sort of clearinghouse / place where people can easily meet. Does this already exist? Is anyone trying to make it? (Do you want to make it and need support of some sort?)

I think the 'traditional fine dining' experience that comes closest to this is Peking Duck.

Most of my experience has been with either salt-drenched cooked fat or honey-dusted cooked fat; I'll have to try smoking something and then applying honey to the fat cap before I eat it. My experience is that it is really good but also quickly becomes unbalanced / no longer good; some people, on their first bite, already consider it too unbalanced to enjoy. So I do think there's something interesting here where there is a somewhat subtle taste mechanism (not just optimizing for 'more' but somehow tracking a balance) that ice cream seems to have found a weird hole in.

[edit: for my first attempt at this, I don't think the honey improved it at all? I'll try it again, though.]

When people make big and persistent mistakes, the usual cause (in my experience) is not something that comes labeled with giant mental “THIS IS A MISTAKE” warning signs when you reflect on it.

Instead, tracing mistakes back to their upstream causes, I think that the cause tends to look like a tiny note of discord that got repeatedly ignored—nothing that mentally feels important or action-relevant, just a nagging feeling that pops up sometimes.

To do better, then, I want to take stock of those subtler upstream causes, and think about the flinch reactions I exhibited on the five-second level and whether I should have responded to them differently.

I don't see anything in the lessons on the question of whether or not your stance on drama has changed, which feels like the most important bit?

That is, suppose I have enough evidence to not-be-surprised-in-retrospect if one of my friends is abusing their partner, and also I have a deliberate stance of leaving other people's home lives alone. The former means that if I thought carefully about all of my friends, I would raise that hypothesis to attention; the latter means that even if I had the hypothesis, I would probably not do anything about it. In this hypothetical, I only become a force against abuse if I decide to become a meddler (which introduces other costs and considerations).

Can we all just agree that if you’re gonna make some funding decision with horrendous optics, you should be expected to justify the decision with actual numbers and plans?

Justify to whom? I would like to have an EA community with some individual initiative, where people can make decisions using their resources to try to seek good outcomes. I agree that when actions have negative externalities, external checks would help. But it's not obvious to me that those external checks weren't passed in this case*, and if you want to propose a specific standard, we should try to figure out whether or not that standard would actually help with optics.

Like, if the purchase of Wytham Abbey had been posted on the EA forum, and some people had said it was a good idea and some people said it was a bad idea, and then the funders went ahead and bought it, would our optics situation look any different now? Is the idea that if anyone posted that it was a bad idea, they shouldn't have bought it?

[And we then need to investigate whether or not adding this friction to the process ends up harming it on net; property sales are different in lots of places, but there are some where adding a week to the "should we do this?" decision-making process means implicitly choosing not to buy any reasonably-priced property, since inventory moves too quickly and only overpriced property stays on the market for more than a week.]

 

* I don't remember being consulted about Wytham, but I'm friends with the people running it and broadly trust their judgment, and guess that they checked with people as to whether or not they thought it was a good idea. I wasn't consulted about the specific place Irena ended up buying, but I was consulted somewhat on whether or not Irena should buy a venue, and I thought she should, going so far as to be willing to support it with some of my charitable giving, which ended up not being necessary.
