Denise_Melchin

Thank you Quintin, this was very helpful for me as a non-ML person to understand the other side of Eliezer’s arguments. As your post is quite dense and it took me a while to work through it, I summarised it for myself. I occasionally had to check the context of the original interview (transcript here) to fully parse the arguments made. I thought the summary might also be helpful to share with others (and let me know if I got anything wrong!):

Eliezer thinks current ML approaches won’t scale to AGI, though due to the influx of money a new approach might be found. Quintin is more optimistic that current ML approaches can scale to AGI. As current alignment techniques are focused on current ML approaches, they won’t help if something different gets us to AGI. Current ML capability improvements usually integrate well with previously used alignment approaches, which suggests they will keep doing so.
Eliezer is concerned that AI will show more ‘truly general’ intelligence. Humans are not equally general at different tasks, as evolution made them specialize in what was important in the ancestral environment, and AI might therefore outclass humans in other tasks. Quintin points out that the learning process humans have been given by evolution is pretty general (albeit biased to what was useful in the ancestral environment), just as the learning process current ML paradigms use is pretty general. ML systems differ from one another not by using different paradigms but by being trained on different data. Therefore he doesn’t expect such a pattern. He also points out that scale is what makes humans smarter, just as scale is a big driver of how good ML systems are. Humans are not any more constrained by their architecture than ML systems; both can modify themselves to an extent.
Eliezer considers a superintelligence to be what can beat all humans at all tasks. Quintin finds this too high a bar, as you can have transformative systems which still have deficits.
Eliezer points out that mindspace is large and humans occupy a tiny corner. As such, we should expect many different potential AI designs, which poses a danger. Quintin thinks we should expect AI systems to occupy only a small corner of mindspace, similar to humans. An intuition pump for this is that most real-life data in higher dimensions actually occupies only a small part of that space. Again, so far in practice ML systems are using pretty similar processes to humans. They will also be trained on data similar to the data humans are “trained on”, as ML systems are mostly trained on human-written text, which makes them more similar to humans as well.
Eliezer thinks it’s hard to align AIs not only with human values, but even with much simpler goals like duplicating a strawberry. Quintin again thinks this isn’t actually all that hard in principle, but requires starting out with an AI with more general goals which would then be modified to aim for strawberry duplication. He points out that human value formation follows more general and multiple goals than something as single-minded as strawberry duplication, so we should allow ML systems to follow such a process of value formation. This will also be a lot easier as ML systems can follow actual examples in the data of such value formation processes, and there is a lot more data on humans following complex goals than single-minded ones.
Eliezer thinks that we won’t be able to align AIs by merely using gradient descent. This is because the primary example of using gradient descent to align a system is evolution and we know that evolution failed to align humans to pursue inclusive genetic fitness in the modern environment. In the ancestral environment, e.g. desiring sexuality was sufficient, but now humans have figured out contraception. People do not desire to maximise their inclusive genetic fitness for its own sake. Quintin thinks this is because ancestral humans didn’t have a concept of inclusive genetic fitness, therefore evolution couldn’t optimise its rewards for improving inclusive genetic fitness directly. Modern AI systems however will have an understanding of human values as they are directly exposed to them during training.
Eliezer makes the same point about humans desiring ice cream. Quintin counters again that there was no ice cream in the ancestral environment, therefore evolution couldn’t punish humans for desiring ice cream. Modern ML researchers however can punish ML systems for doing things they aren’t supposed to, i.e. which are misaligned with human values.
Eliezer thinks aligning AI with gradient descent will be even harder than for evolution to align humans with natural selection as gradient descent is blunter and less simple. Quintin isn’t convinced by this and also points out that evolution was optimising over the learning process via the human genome which will be a lot messier due to its indirectness while ML researchers are training the whole ML system directly. Therefore a comparison doesn’t make much sense.
Eliezer is worried that ML systems trained to predict e.g. human preferences will look for opportunities to make predictions easier. Quintin thinks ML systems aren’t optimising to do well at long-term prediction by making things easier to predict. Predicting things is something that ML systems do, not what they want to do. He compares this to humans, who also don’t explicitly prioritise e.g. seeing very well in the long term.
Eliezer considers it important to employ a ‘security mindset’, a term from computer security, for AI alignment. Ordinary paranoia is insufficient for keeping a system secure; some deeper skills are required. Quintin thinks ML is unlike computer security, as most fields are unlike computer security, and we don’t use a security mindset for most fields, including childrearing, which to him seems like an important analogue to training ML systems. This is because ML systems during the training process don’t have adversaries to the same extent as computer systems. They might have adversarial users during deployment, but ML systems themselves aren’t keen to be jailbroken. He also uses the opportunity to point out that Eliezer often compares AI to other fields like rocket science, but ML often works in a pretty different way to other fields, e.g. swapping individual components of ML systems often doesn’t change their functionality, while changing rocket components would make rockets fail.
Eliezer is concerned that AI optimists haven’t encountered real difficulties yet and that’s why they’re optimistic, the same way that the original AI conference in the 50s thought problems could be solved in two months which then took 70 years to solve. Quintin counters that there were plenty of ML problems which were easier than expected, and most notably easier than predicted by Eliezer and by AI field veterans who have been working on AI since the early days. Neither Eliezer nor the AI veterans expected neural networks to work as well as they do today. He mentions that Eliezer also stated in a different venue that he didn’t expect generative adversarial networks to work right away, yet they did. He expects the hardness of ML research to predict the hardness of ML alignment research, and thinks that since Eliezer seems to be poorly calibrated on the former, he will also be poorly calibrated on the latter.
Eliezer expects that for AI alignment to go well he will have to be wrong about aspects of AI alignment, but he expects that where he is mistaken about AI alignment this will make AI alignment even harder than he already thinks it is, as it would be really surprising when a new engineering project is easier than you think it is. Quintin strongly disagrees with this framing, because if Eliezer was wrong about how hard alignment is he should expect alignment to be easier than he previously thought.
Eliezer points to how fast AI progress was in the game of Go as a reason for concern that superintelligent AI will suddenly kill humans without killing a somewhat smaller number of humans in advance. Quintin thinks that Go is disanalogous to a more general AI system, as progress in more general systems is usually slower and smoother. Go also had a single objective function the AI could use to score itself. This will not be true for many other tasks, which will require human input and so slow down improvements.
Eliezer is even more concerned about AI systems which can self-improve and get smarter during inference (deployment) getting us to fast take off. Quintin counters that we basically already have that. ChatGPT could train on user input; but it’s not programmed to as it wouldn’t be practical. ML training processes could also be changed so they could be reasonably said to self-improve during inference as inference is also a part of training.
Eliezer thinks that people who are capable of breaking AI systems show more AI expertise than people who are merely creating functional AI systems, which is how it works in computer security. This is related to the security mindset claim above. Maybe they’d be able to find ways to improve AI alignment. Quintin thinks the people who break things in computer security are only experts there because in computer security there are clear signs whether the system is broken or not, which isn’t true for AI alignment. He discusses an example where Eliezer thinks an ML system is easily breakable as the system will try to maximise the reward function, but Quintin thinks that simply maximising the reward function isn’t how realistic ML systems work. He discusses another example where he thinks ML systems are not easily broken.

Overall my take: Eliezer is concerned about AI that doesn’t look like modern ML systems. Quintin argues modern ML systems don’t show the properties that Eliezer is concerned about more advanced AI showing. Quintin thinks that more advanced ML systems can already be real AGI. What I am confused about is why Eliezer is then so worried about the current state of AI if the thing he is worried about is so much more advanced/general in mindspace, or more specifically why does he consider current ML systems to be evidence that we are getting closer to the kind of AI he is worried about.

Audio AMA: Allan Saldanha, earning to give since 2014.

Denise_Melchin · 2mo · 14

I'm curious how you first got interested in giving, especially as Giving What We Can skewed towards students and (very) young professionals at the time.

What motivated you to increase the percentages over time?

How do your wife and teenage children feel about your giving?

How much should you be expecting to earn in order to consider ETG?

Answer by Denise_Melchin · Nov 21, 2023 · 8

It will depend on what your alternatives are. If you could become a charity entrepreneur, I would expect this option dominates over your proposed path. Perhaps you are pursuing some other direct work options that you can compare to your option once you have received an offer.

But if there are no compelling direct work options (and for most people, there won't be), earning and donating as much as you can is a great path! Donating $10k a year is a great start.

My mistakes on the path to impact

Denise_Melchin · 1y · 2

First I apologize for my late response!

I completely agree with you that being in a limbo state is the least effective place you can be! Exploring is valuable, but at some point you have to act on what you have learnt. Even if what you learnt was really not what you were hoping to learn...

My perspective is that I can still have a major impact via donations. The more I earn, the more I can donate. The more frugally I live, the more I can donate too. Unfortunately the EA Community is no longer as supportive as it once was of people who see donations as their primary path to impact. I don't think I would have come to my current perspective if I had joined the EA Community in recent years. But Giving What We Can is ramping up again and holding some events if this is a path you might be interested in.

I am still working in the UK Civil Service and have worked here for 3.5 years by now. I do consider the direct impact of my work in the Civil Service to be trivial compared to the donations I can make thanks to my earnings. I have increased my pay by ~135% compared to when I started (not inflation-adjusted). How much this has increased my donations is a bit harder to say as my finances and donations are mingled with my husband's.

I do not consider myself settled as I expect my earnings to top out now. My original plan was to switch to the private sector this year, but this has been tricky as tech is having a downturn. All my Civil Service roles have been data/tech roles. I also considered some other direct work options this year, but there were very few I was interested in (both due to poor fit as well as doubts over their actual impact) and none of them panned out.

Hope this helps and feel free to reach out anytime. I am sorry you are in this position.

GWWC Reporting Attrition Visualization

Denise_Melchin · 2y · 5

I have also barely reported, despite keeping the pledge for 10 years. Will finally get my reckoning with missing out on the pin though...

Nobody’s on the ball on AGI alignment

Denise_Melchin · 2y · 17

I appreciate that you are putting out numbers and explaining the current research landscape, but I am missing clear actions.

The closest you come to proposing them is here:

We need a concerted effort that matches the gravity of the challenge. The best ML researchers in the world should be working on this! There should be billion-dollar, large-scale efforts with the scale and ambition of Operation Warp Speed or the moon landing or even OpenAI’s GPT-4 team itself working on this problem. Right now, there’s too much fretting, too much idle talk, and way too little “let’s roll up our sleeves and actually solve this problem.”

But that still isn't an action plan. Say you convince me, most of the EA Forum and half of all university educated professionals in your city that this is a big deal. What, concretely, should we do now?

My Objections to "We’re All Gonna Die with Eliezer Yudkowsky"

Denise_Melchin · 2y · 11

The illusion of consensus about EA celebrities

Denise_Melchin · 2y · 57

We end up seeming more deferential and hero-worshipping than we really are.

I feel like this post is missing something. I would expect one of the strongest predictors of the aforementioned behaviors to be age. Are there any people in their thirties you know who are prone to hero-worshipping?

I don’t consider hero-worshipping an EA problem as such, but a young people problem. Of course EA is full of young people!

Make sure people incoming to the community, or at the periphery of the community, are inoculated against this bias, if you spot it. Point out that people usually have a mix of good and bad ideas. Have some go-to examples of respected people's blind spots or mistakes, at least as they appear to you.

This seems like good advice to me, but I expect it to benefit from being aware that you need to talk about these things to a young person because they are young.

Effective Altruism Forum
EA Forum

Bio

Posts
27

Comments
302
