
It's fashionable these days to ask people about their AI timelines. And it's fashionable to have things to say in response.

But relative to the number of people who report their timelines, I suspect that only a small fraction have put in the effort to form independent impressions about them. And, when asked about their timelines, I don't often hear people also reporting how they arrived at their views.

If this is true, then I suspect everyone is updating on everyone else's views as if they were independent impressions, when in fact all our knowledge about timelines stems from the same (e.g.) ten people.

This could have several worrying effects:

  • People's timelines being overconfident (i.e. too resilient), because they think they have more evidence than they actually do.
    • In particular, people in this community could come to believe that we have the timelines question pretty worked out (when we don't), because they keep hearing the same views being reported.
  • Weird subgroups forming where people who talk to each other most converge to similar timelines, without good reason.[1]
  • People using faulty deference processes. Deference is hard and confusing, and if you don't discuss how you’re deferring then you're not forced to check if your process makes sense.

So: if (like most people) you don't have time to form your own views about AI timelines, then I suggest being clear who you're deferring to (and how), rather than just saying "median 2040" or something.[2]

And: if you’re asking someone about their timelines, also ask how they arrived at their views.

(Of course, the arguments here apply more widely too. Whilst I think AI timelines is a particularly worrying case, being unclear if/how you're deferring is a generally poor way of communicating. Discussions about p(doom) are another case where I suspect we could benefit from being clearer about deference.)

Finally: if you have 30 seconds and want to help work out who people do in fact defer to, take the timelines deference survey!

Thanks to Daniel Kokotajlo and Rose Hadshar for conversation/feedback, and to Daniel for suggesting the survey.

  1. ^

    This sort of thing may not always be bad. There should be people doing serious work based on various different assumptions about timelines. And in practice, since people tend to work in groups, this will often mean groups doing serious work based on various different assumptions about timelines.

  2. ^

    Here are some things you might say, which exemplify clear communication about deference:

     - "I plugged my own numbers into the bio anchors framework (after 30 minutes of reflection) and my median is 2030. I haven't engaged with the report enough to know if I buy all of its assumptions, though"
     - "I just defer to Ajeya's timelines because she seems to have thought the most about it"
     - "I don't have independent views and I honestly don't know who to defer to"

Comments (20)

I love this idea. Hoping you'll catch some deference cycles in the survey for the lols ^^

There's also this pernicious thing that people fall prey to, where they think they're forming an independent model of this because they only update on "gears-level evidence". Unfortunately, when someone tells you "AI is N years away because XYZ technical reasons," you may think you're updating on the technical reasons, but your brain was actually just using XYZ as excuses to defer to them.

Adding to the trouble is the fact that arguments XYZ have probably gone through strong filters to reach your attention. Would the person give you the counterarguments if they knew them? How did you happen to land in a conversation with this person? Was it because you sought out "expert advice" from "FOOM AI Timelines Experts Hotline"?

When someone gives you gears-level evidence, and you update on their opinion because of that, that can still constitute deferring. What you think of as gears-level evidence is nearly always disguised testimonial evidence. At least to some, usually damning, degree. And unless you're unusually socioepistemologically astute, you're just lost to the process.

Unfortunately, when someone tells you "AI is N years away because XYZ technical reasons," you may think you're updating on the technical reasons, but your brain was actually just using XYZ as excuses to defer to them.

I really like this point. I'm guilty of having done something like this loads myself.

When someone gives you gears-level evidence, and you update on their opinion because of that, that still constitutes deferring. What you think of as gears-level evidence is nearly always disguised testimonial evidence. At least to some, usually damning, degree. And unless you're unusually socioepistemologically astute, you're just lost to the process.

If it's easy, could you try to put this another way? I'm having trouble making sense of what exactly you mean, and it seems like an important point if true.

"When someone gives you gears-level evidence, and you update on their opinion because of that, that still constitutes deferring."

This was badly written. I just mean that updating on their opinion, as opposed to just taking the patterns & trying to adjust for the fact that you received them through filters, is updating on testimony. I'm saying nothing special here, just that you might be tricking yourself into deferring (instead of impartially evaluating patterns) by letting the gearsy arguments woozle you.

I wrote a bit about how testimonial evidence can be "filtered" in the paradox of expert opinion:

If you want to know whether string theory is true and you're not able to evaluate the technical arguments yourself, who do you go to for advice? Well, seems obvious. Ask the experts. They're likely the most informed on the issue. Unfortunately, they've also been heavily selected for belief in the hypothesis. It's unlikely they'd bother becoming string theorists in the first place unless they believed in it.

If you want to know whether God exists, who do you ask? Philosophers of religion agree: 70% accept or lean towards theism compared to 16% of all PhilPapers Survey respondents.

If you want to know whether to take transformative AI seriously, what now?

I was short on time today and hurriedly wrote my own comment reply to Sam here before I forgot my point, so it's not concise; let me know if any of it is unclear.

https://forum.effectivealtruism.org/posts/FtggfJ2oxNSN8Niix/when-reporting-ai-timelines-be-clear-who-you-re-not?commentId=M5GucobHBPKyF53sa

Your comment also better describes a kind of problem I was trying to get at, though I'll post again an excerpt of my testimony that dovetails with what you're saying:

I remember, when I was following conversations like this a few years ago in 2018, that there was some AI capability threshold which over a dozen people I talked to said would be imminently achieved. When I asked why they thought that, they said a lot of smart people they trust were saying it. I talked to a couple of those people, and they said a bunch of smart people they know were saying it, having heard it from Demis Hassabis of DeepMind. I forget what the capability was, but Hassabis was right: it happened around a year later.

What stuck with me is how almost nobody could or would explain their reasoning. Maybe there is way more value to deference as implicit trust in individuals, groups or semi-transparent processes. Yet the reason to defer to Eliezer Yudkowsky, Ajeya Cotra or Hassabis is that they have a process. At the least, more of the alignment community would need to understand those processes, instead of having faith in a few people who probably don't want the rest of the community deferring to them that much. It appears the problem has only gotten worse.

What's the best post to read to learn about how EAs conceive of "gears-level understanding"/"gears-level evidence"?

I'm not sure. I used to call it "technical" and "testimonial" evidence before I encountered "gears-level" on LW. While evidence is just evidence and Bayesian updating stays the same, it's useful to distinguish between these two categories because if you have a high-trust community that frequently updates on each other's opinions, you risk information cascades and double-counting of evidence.

Information cascades develop consistently in a laboratory situation [for naively rational reasons, in which other incentives to go along with the crowd are minimized]. Some decision sequences result in reverse cascades, where initial misrepresentative signals start a chain of incorrect [but naively rational] decisions that is not broken by more representative signals received later. - (Anderson & Holt, 1998)

Additionally, if your model of a thing has "gears", then there are multiple things about the physical world that, if you saw them change, would change your expectations about the thing.

Let's say you're talking to someone you think is smarter than you. You start out with different estimates and different models that produce those estimates. From Ben Pace's A Sketch of Good Communication:

Here you can see that both blue and red have gears. And since you think their estimate is likely to be much better than yours, and you want to get some of that amazing decision-guiding power, you throw out your model and adopt their estimate (because you don't understand or don't have all the parts of their model):

Here, you have "destructively deferred" in order to arrive at your interlocutor's probability estimate. Basically zombified. You no longer have any gears, even if the accuracy of your estimate has potentially increased a little.

An alternative is to try to hold your all-things-considered estimates separate from your independent impressions (that you get from your models). But this is often hard and confusing, and they bleed into each other over time.

Thanks for your comment!

Asking "who do you defer to?" feels like a simplification

Agreed! I'm not going to make any changes to the survey at this stage, but I like the suggestion and if I had more time I'd try to clarify things along these lines.

I like the distinction between deference to people/groups and deference to processes.

deference to good ideas

[This is a bit of a semantic point, but seems important enough to mention] I think "deference to good ideas" wouldn't count as "deference", in the way that this community has ended up using it. As per the forum topic entry on epistemic deference:

Epistemic deference is the process of updating one's beliefs in response to what others appear to believe, even if one ignores the reasons for those beliefs or does not find those reasons persuasive. (emphasis mine)

If you find an argument persuasive and incorporate it into your views, I think that doesn't qualify as "deference". Your independent impressions needn't (and in most cases won't) be the views you formed in isolation. When forming your independent impressions, you can and should take other people's arguments into account, to the extent that you find them convincing. Deference occurs when you take into account knowledge about what other people believe, and how trustworthy you find them, without engaging with their object-level arguments.

non-defensible original ideas

A similar point applies to this one, I think.

(All of the above makes me think that the concept of deference is even less clear in the community than I thought it was -- thanks for making me aware of this!)

Edit: I wrote this comment hastily when I didn't have much time today, so it may not be as clear or concise as it could be. I may return to clean it up later, especially on request, so please notify me of any parts of this comment that are hard to understand.

Thank you for writing this. I've tried having conversations during the last year to learn more about this, and not only do people not report who they're deferring to, but when asked, they don't answer. That's not as bad as when someone answers with "I've just heard a lot of people saying this", but it still doesn't address the problem you mentioned: it might be only 10 people they're getting their timelines from.

I don't live in the Bay Area and I'm not as personally well-connected to professionals in relevant fields, so most of the conversations I've had or seen like this are online. I understand why some people might perceive online conversations, which take longer and where nuance can get lost, to be too tedious and time-consuming for the value they provide. Yet the reason I ask is that I know I could dive deep into Metaculus forecasts or who knows how many dozens of posts, but I don't know where to start and I don't want to waste time. Never mind opting not to disclose a name or source of information, there are scarcely even answers like "I'm too busy to get into this right now" or a suggestion of a website for others to check and figure it out for themselves.

Of course, the arguments here apply more widely too. Whilst I think AI timelines is a particularly worrying case, being unclear if/how you're deferring is a generally poor way of communicating. Discussions about p(doom) are another case where I suspect we could benefit from being clearer about deference.

Launching a survey is technically a good first step, but its value may be lost if nobody else follows suit to engender better norms. I understand the feeling of urgency around the particular issue of AI timelines, but the general problem you're getting at has, in my experience, been a common, persistent and major problem across all aspects of EA for years.

I remember, when I was following conversations like this a few years ago in 2018, that there was some AI capability threshold which over a dozen people I talked to said would be imminently achieved. When I asked why they thought that, they said a lot of smart people they trust were saying it. I talked to a couple of those people, and they said a bunch of smart people they know were saying it, having heard it from Demis Hassabis of DeepMind. I forget what the capability was, but Hassabis was right: it happened around a year later.

What stuck with me is how almost nobody could or would explain their reasoning. Maybe there is way more value to deference as implicit trust in individuals, groups or semi-transparent processes. Yet the reason to defer to Eliezer Yudkowsky, Ajeya Cotra or Hassabis is that they have a process. At the least, more of the alignment community would need to understand those processes, instead of having faith in a few people who probably don't want the rest of the community deferring to them that much. It appears the problem has only gotten worse.

Between those like you feeling a need to write a post like this and those like me who barely get answers when we ask questions, all the problems here seem like they could be much, much worse than you're thinking.

On timelines, other people I've recently updated on most:

Matthew Barnett (I updated to slower timelines):

I think raw intelligence, while important, is not the primary factor that explains why humanity-as-a-species is much more powerful than chimpanzees-as-a-species. Notably, humans were once much less powerful, in our hunter-gatherer days, but over time, through the gradual process of accumulating technology, knowledge, and culture, humans now possess vast productive capacities that far outstrip our ancient powers.

...

There are strong pressures -- including the principle of comparative advantage, diseconomies of scale, and gains from specialization -- that incentivize making economic services narrow and modular, rather than general and all-encompassing. Illustratively, a large factory where each worker specializes in their particular role will be much more productive than a factory in which each worker is trained to be a generalist, even though no one understands any particular component of the production process very well.

What is true in human economics will apply to AI services as well. This implies we should expect something like Eric Drexler's AI perspective, which emphasizes economic production across many agents who trade and produce narrow services, as opposed to monolithic agents that command and control.

And I updated to faster and sooner timelines from a combination of 1) noticing some potential quick improvements in AI capabilities and feeling like there could be more similar stuff in this direction, plus 2) having heard several people say they think AI is soon because (I'm inferring their "because" here) they think the innovation frontier is fructiferous (Eliezer, and conversation+tweets from Max). I am likely forgetting some people here.

When I read the Sequences in 2014-15, I did feel like the model for unfriendly AI made sense, but I was mostly deferring to Eliezer on it because I had noticed how much smarter he was than me on these things.

First of all, thank you for reporting who you've deferred to in different ways in specific terms. Second, thank you for putting in some extra effort to not only name this or that person but make your reasoning more transparent and legible. 

I respect Matthew because when I read what he writes, it feels like I tend to agree with half of the points he makes and disagree with the other half. It's what makes him interesting. There are barriers to posting on the EA Forum, some real and some only perceived, like an expectation of too high a burden of rigour, that push people to post on social media or other forums instead when they can't resist the urge to express a novel viewpoint to advance progress in EA. Matthew is one of the people I think of when I wish more insightful people were more willing to post on the EA Forum.

I don't agree with all of what you've presented from Matthew here or what you've said yourself. I might come back to specify which parts I agree and disagree with later when I've got more time. Right now, though, I just want to positively reinforce your writing a comment that is more like the kind of feedback from others I'd like to see more of in EA. 

I'm confused about why you characterized it as "fashionable" to ask other people about timelines. That may technically be true, but both in and outside of EA, nobody asks why it's become more "fashionable" in the last couple of years to ask about the chance of nuclear war or tipping points for climate change.

It's presumably obvious that people are asking for reasons like:

  • they're already working in AI alignment and are checking how timelines might influence how their own trajectories should change.

  • they want to check whether transformative AI is so imminent that it should be someone's top priority.

  • they're skeptical or simply curious and want more information.

If there is a good reason it was phrased that way that I haven't noticed, please feel free to clarify. Otherwise, it seems best to be clear about what we directly mean to talk about.

Sam -- good points. I would add:

There's deference (adopting views of people & groups we respect as experts), and then there's anti-deference (rejecting views of people & groups who are arguably experts in some domain, but whose views contradict the dominant AI safety/EA narrative -- e.g. Steven Pinker, Gary Marcus, others skeptical of AI X risk and/or speed of AI development).  

Anti-deference can also be somewhat irrational, tribal, and conformist, such that if Gary Marcus says deep learning systems have cognitive architectures that can't possibly support AGI, and if nobody in AI safety research takes him seriously, then we might react to his pessimism by updating even harder to think that AGI is arriving sooner than we would otherwise have predicted.

Anti-deference can also take a more generalized form of ignoring whole fields of study that haven't been well-connected to the AI safety/EA in-group, but that have some potentially informative things to say about AGI timelines; this could include mainstream cognitive science, evolutionary psychology, intelligence research, software engineering, electrical engineering, history of technology, national security and intelligence, corporate intellectual property law, etc.

In addition to the EA Forum topic post, there is this specific post by Owen Cotton-Barratt reviewing a taxonomy of types of deference considered in EA, and open issues related to each.

https://forum.effectivealtruism.org/posts/LKdhv9a478o9ngbcY/deferring

Cool idea to run this survey and I agree with many of your points on the dangers of faulty deference.

A few thoughts:

(Edit: I think my characterisation of what deference means in formal epistemology is wrong. After a few minutes of checking this, though, I think what I described is a somewhat common way of modelling how we ought to respond to experts.)

  1. The use of the concept of deference within the EA community is unclear to me. When I encountered the concept in formal epistemology, I remember "deference to someone on claim X" literally meaning (a) that you adopt that person's probability judgement on X. Within EA and your post (?), the concept often doesn't seem to be used in this way. Instead, I guess people think of deference as something like (b) "updating in the direction of a person's probability judgement on X" or (c) "taking that person's probability estimate as significant evidence for (against) X if that person leans towards X (not-X)"?

  2. I think (a) - (c) are importantly different. For instance, adopting someone's credence doesn't always mean that you are taking their opinion as evidence for the claim in question, even if they lean towards it being true: you might adopt someone's high credence in X and thereby lower your credence (because yours was even higher before). In that case, you update as though their high credence was evidence against X. You might also update in the direction of someone's credence without taking on their credence. Lastly, you might lower your credence in X by updating in someone's direction even if they lean towards X.

Bottom line: these three concepts don't refer to the same "epistemic process", so I think it's good to make clear what we mean by deference.

  3. Here is how I would draw the conceptual distinctions:

(I) Deference to someone's credence in X = you adopt their probability in X.
(II) Positively updating on someone's view = increasing your confidence in X upon hearing their probability in X.
(III) Negatively updating on someone's view = decreasing your confidence in X upon hearing their probability in X.
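In notation (my paraphrase, not the commenter's, writing $P_{\text{you}}$ and $P_{\text{them}}$ for the two credence functions, with "old" and "new" marking before and after hearing $P_{\text{them}}(X)$):

$$\text{(I)}\;\; P^{\text{new}}_{\text{you}}(X) = P_{\text{them}}(X), \qquad \text{(II)}\;\; P^{\text{new}}_{\text{you}}(X) > P^{\text{old}}_{\text{you}}(X), \qquad \text{(III)}\;\; P^{\text{new}}_{\text{you}}(X) < P^{\text{old}}_{\text{you}}(X).$$

Note that (I) can coincide with either (II) or (III), depending on where your credence started.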

I hope this comment was legible; please ask for clarification if anything was unclearly expressed :)

In addition to the EA Forum topic entry, there is a forum post from a few months ago by Owen Cotton-Barratt, a researcher on topics related to epistemology in EA, reviewing a taxonomy of common types of deference in EA and the open issues with them, which I found informative.

https://forum.effectivealtruism.org/posts/LKdhv9a478o9ngbcY/deferring

I wrote another comment below that touched on deference, though I wrote it more quickly than carefully and I might have used the concept in a confused way, as I don't have much formal understanding of deference outside of EA, so don't take my word for it. How the concept of deference has been used in EA over the last year has seemed ambiguous to me, so I'm inclined to agree that your challenge to the current understanding of the subject could help EA make progress on deference.

Thanks for your comment! I agree that the concept of deference used in this community is somewhat unclear, and a separate comment exchange on this post further convinced me of this. It's interesting to know how the word is used in formal epistemology.

Here is the EA Forum topic entry on epistemic deference. I think it most closely resembles your (c). I agree there's the complicated question of what your priors should be, before you do any deference, which leads to the (b) / (c) distinction.

I wonder if it would be good to create another survey to get some data not only on who people update on but also on how they update on others (regarding AGI timelines or something else). I was thinking of running a survey where I ask EAs about their prior on different claims (perhaps related to AGI development), present them with someone's probability judgements and then ask them about their posterior.  That someone could be a domain expert, non-domain expert (e.g., professor in a different field) or layperson (inside or outside EA). 

At least if they have not received any evidence regarding the claim before, then there is a relatively simple and I think convincing model of how they should update: they should set their posterior odds in the claim to the product of their prior odds and someone else's odds (this is the result of this paper, see e.g. p.18). It would then be possible to compare the way people update to this rational ideal. Running such a survey doesn't seem very hard or expensive (although I don't trust my intuition here at all) and we might learn a few interesting biases in how people defer to others in the context of (say) AI forecasts. 
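As a minimal worked example of the multiplicative-odds rule described above (my illustration; the numbers are made up and not taken from the paper):

$$\text{odds}(X) = \frac{P(X)}{1 - P(X)}, \qquad \text{odds}_{\text{post}}(X) = \text{odds}_{\text{you}}(X) \times \text{odds}_{\text{them}}(X).$$

If your prior is $P(X) = 0.2$ (odds $0.25$) and the other person reports $P(X) = 0.9$ (odds $9$), the rule gives posterior odds $0.25 \times 9 = 2.25$, i.e. $P(X) = 2.25/3.25 \approx 0.69$. Comparing survey respondents' stated posteriors against this benchmark would show whether they tend to over- or under-update on others' judgements.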

I have a few more thoughts on exactly how to do this, but I'd be curious if you have any initial thoughts on this idea! 

Thanks, this seems right to me.

Are the survey results shareable yet? Do you have a sense of when they will be? 

Will get them written up this month—sorry for the delay!
