How to find *reliable* ways to improve the future?

Sjlver

I hear two conflicting voices in my head, and in EA:

Voice: it's highly uncertain whether deworming is effective, based on 20 years of research, randomized controlled trials, and lots of feedback. In fact, many development interventions have a small or negative impact.
Same voice: we are confident that work for improving the far future is effective, based on <insert argument involving the number of stars in the universe>.

I believe that I could become convinced to work on artificial intelligence or extinction risk reduction. My main crux is that these problems seem intractable. I am worried that my work would have a negligible or a negative impact.

These questions are not sufficiently addressed yet, in my opinion. So far, I've seen mainly vague recommendations (e.g., "community building work does not increase risks" or "look at the success of nuclear disarmament"). Examples of existing work for improving the far future often feel very indirect (e.g., "build a tool to better estimate probabilities ⇒ make better decisions ⇒ facilitate better coordination ⇒ reduce the likelihood of conflict ⇒ prevent a global war ⇒ avoid extinction") and thus disconnected from actual benefits for humanity.

One could argue that uncertainty is not a problem, that it is negligible when considering the huge potential benefit of work for the far future. Moreover, impact is fat-tailed, so the expected value is dominated by a few really impactful projects, and thus it's worth trying projects even if they have a low probability of success^[1]. This makes sense, but only if we can protect against large negative impacts. I doubt we really can: for example, a case can be made that even safety-focused AI researchers accelerate AI and thus increase its risks.^[2]

One could argue that community building or writing "What We Owe the Future" are concrete ways to do good for the future. Yet this seems to shift the problem rather than solve it. Consider a community builder who convinces 100 people to work on improving the far future. There are now 100 people doing work with uncertain, possibly-negative impact. The community builder's impact is some function of these people's individual impacts $x$, and is thus similarly uncertain and possibly negative. This is especially true if $x$ is fat-tailed, as the total will be dominated by the most successful (or most destructive) people.
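The fat-tail point can be made concrete with a small simulation. This is only a minimal sketch under assumptions that are not in the post: each recruit's impact $x$ is drawn as a Pareto-distributed magnitude with a random sign, and the community builder's impact is taken to be the plain sum. The function names `simulate_recruit_impacts` and `top_share`, the tail index, and the seed are all hypothetical choices for illustration.

```python
import random


def simulate_recruit_impacts(n=100, tail_index=1.1, seed=0):
    """Draw n signed impacts with heavy-tailed magnitudes (illustrative assumption)."""
    rng = random.Random(seed)
    impacts = []
    for _ in range(n):
        magnitude = rng.paretovariate(tail_index)  # Pareto draw, always >= 1
        sign = rng.choice([-1, 1])                 # impact may be good or bad
        impacts.append(sign * magnitude)
    return impacts


def top_share(impacts, k=5):
    """Fraction of total absolute impact contributed by the k largest-magnitude people."""
    magnitudes = sorted((abs(x) for x in impacts), reverse=True)
    return sum(magnitudes[:k]) / sum(magnitudes)


impacts = simulate_recruit_impacts()
print(f"net impact of all recruits: {sum(impacts):.1f}")
print(f"share of absolute impact from the top 5 of 100: {top_share(impacts):.0%}")
```

Rerunning with different seeds gives a feel for how much the sign and size of the total can swing on a handful of draws, which is the worry expressed above. A real impact distribution may of course look quite different from this toy model.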

To summarize: How can we reliably improve the far future, given that even near-termist work like deworming, with plenty of available data and research and rapid feedback loops and simple theories, so often fails? As someone who is eager to spend my work time well, who thinks that our moral circle should include the future, but who does not know ways to reliably improve it... what should I do?

1. Will MacAskill on fat-tailed impact distribution: https://youtu.be/olX_5WSnBwk?t=695 ↩︎
2. For examples on this forum, see When is AI safety research harmful? or What harm could AI safety do? ↩︎

7 Answers sorted by
Top

constructive

Aug 18, 2022

Some ideas for career paths that I think have a very low chance of terrible outcomes and a reasonable chance to do a ton of good for the long-term future (I'm not claiming that they definitely will be net-positive, I'm claiming they are more than 10x more likely to be net positive than to be net negative):

Developing early warning systems for future pandemics (and related work) (technical bio work)
Strengthening the bioweapons convention and building better enforcement mechanisms (bio policy)
Predicting how fast powerful AI is going to be developed to get strategic clarity (AI strategy)
Developing theories of how to align AI and reasoning about how they could fail (AI alignment research)
Building institutions that are ready to govern AI effectively once it starts being transformative (AI governance)

Besides these, I think that almost all work that longtermists do today has a positive expected value, even if it has large downsides. Your comparison to deworming isn't perfect. Failed deworming is not causing direct harm. It is still better to give money to ineffective deworming than to do nothing.

Sjlver

Aug 18 2022

This is valuable, thank you. I really like the point on early warning systems for pandemics.

Regarding the bioweapons convention, I intuitively agree. I do have some concerns about how it could tip power balances (akin to how abortion bans tend to increase illegal abortions and put women at risk, but that's a weak analogy). There is also a historical example of how the Geneva Disarmament Conference inspired Japan's bioweapons program.

Predicting how fast powerful AI is going to be developed: That one seems value-neutral to me. It could help regular AI as muc...

constructive

Aug 18 2022

Re: bioweapons convention: Good point, so maybe not as straightforward as I described.

Re: predicting AI: You can always not publish the research you are doing or only inform safety-focused institutions about it. I agree that there are some possible downsides to knowing more precisely when AI will be developed, but there seem to be much worse downsides to not knowing when AI will be developed (mainly that nobody is preparing for it policy- and coordination-wise). I think the biggest risk is getting governments too excited about AI. So I'm actually not super confident that any work on this is 10x more likely to be positive.

Re: policy & alignment: I'm very confident that there is some form of alignment work that is not speeding up capabilities, especially the more abstract kind. Though I agree on interpretability. On policy, I would also be surprised if every avenue of governance was as risky as you describe. In particular, laying out big-picture strategies and monitoring AI development seem pretty low-risk.

Overall, I think you have done a good job scrutinizing my claims and I'm much less confident now. Still, I'd be really surprised if every type of longtermist work was as risky as your examples, especially for someone as safety-conscious as you are. (Actually, one very positive thing might be criticizing different approaches and showing their downsides.)

Sjlver

Aug 19 2022

Thanks a lot for your responses! I share your sentiment: there must be some form of alignment work that is not speeding up capabilities, some form of longtermist work that isn't risky... right?

Why are the examples so elusive? I think this is the core of the present forum post. 15 years ago, when GiveWell started, the search for good interventions was difficult. It required a lot of research, trials, reasoning etc. to find the current recommendations. We are at a similar point for work targeting the far future... except that we can't do experiments, don't have feedback, don't have historical examples[1], etc. This makes the question a much harder one. It also means that "do research on good interventions" isn't a good answer either, since this research is so intractable.

1. Ian Morris in this podcast episode discusses to what degree history is contingent, i.e., past events have influenced the future for a long time. ↩︎

Sjlver

Aug 18 2022

Failed deworming is not causing direct harm. It is still better to give money to ineffective deworming than to do nothing.

Apologies in advance for being nitpicky. But you could consider the counterfactual where the money would instead go to another effective charity. A similar point holds for AI safety outreach: it may cause people to switch careers and move away from other promising areas, or cause people to stop earning to give.

Linch

Aug 19 2022

Sorry, but if your bar for "reliable good" entails being clearly better than counterfactuals with high confidence, then afaict literally nothing in EA clears that bar. Certainly none of the other GiveWell charities clear this bar.

Sjlver

Aug 20 2022

I don't mean to set an unreasonably high bar. Sorry if my comment came across that way. It's important to use the right counterfactual because work for the long-term future competes with GiveWell-style charities. This is clearly the message of 80000hours.org, for example. After all, we want to do the most good we can, and it's not enough to do better than zero.

Linch

Aug 21 2022

I'm probably confused about what you're saying, but how is this different from saying that work on GiveWell-style charities competes with the long-term future, and also that donations to GiveWell-style charities compete with each other?

Denkenberger🔸

Aug 28, 2022

I think resilience to global catastrophes is often a reliable way of improving the long-term future. This is touched on in the paper Defence in Depth. Pandemic resilience could include preparation to scale up vaccines and PPE quickly. And I think resilience to climate tail risks and nuclear war makes sense as well.

Sjlver

Aug 29 2022

Cool! Thanks for the link to these papers. I'll study them.

Linch

Aug 19, 2022

I think there aren't reliable things that a) are robustly good for the long-term future under a wide set of plausible assumptions, b) are highly legibly so, c) are easy to talk about in public, d) are within 2 OOMs of the cost-effectiveness of the best interventions by our current best guesses, and e) aren't already being done.

I think your question implies that a) is the crux, and I do have a lot of sympathy towards that view. But the reason why it's difficult to generate answers to your question is at least partially due to expectations of b)-e) baked in as well.

Sjlver

Aug 20 2022

Thank you. This is valuable to hear.

Maybe my post simplified things too much, but I'm actually quite open to learning about possibilities for improving the long-term future, even those that are hard to understand or difficult to talk about. I sympathize with longtermism, but can't shake off the feeling that epistemic uncertainty is an underrated objection.

When it comes to your linked question about how near-termist interventions affect the far future, I sympathize with Arepo's answer. I think the effect of many such actions decays towards zero somewhat quickl...

Phil Tanny

Aug 27, 2022

Strengthening the bioweapons convention and building better enforcement mechanisms (bio policy)

In the event of a war where bio-weapons are involved, it will be a knife fight in an alley situation, and all inconvenient conventions, treaties, policies, U.N. proclamations etc will be ignored. Such devices are MAYBE useful in those situations where the major powers have leverage over the small powers.

The world has been largely united in resisting the development of nuclear weapons in North Korea. The North Koreans don't care.

Phil Tanny

Aug 27, 2022

As someone who is eager to spend my work time well, who thinks that our moral circle should include the future, but who does not know ways to reliably improve it... what should I do?

Focus on what you can do to help now, while you consider this further in the background? If all humans present and future are equal, then present humans are as good a target as future humans, and much much more accessible.

Maybe try to de-abstract helping, and make it more tangible and real in your personal experience? Maybe the old lady across the street needs help bringing in her groceries. So you start there, and follow the bread crumbs wherever they lead.

Something simple like this can be a good experiment. If you should find you don't really want to help the old lady who is right in front of you, or if you do, that might help you develop additional clarity regarding your relationship with future humans.

Sjlver

Aug 29 2022

Thanks!

It's clear to me that I want to help people. I think my problem isn't that help is abstract. My current work is in global health, and it's a great joy to be able to observe the positive effects of that work.

My question is about what would be the best use of my time and work. I consider the possibility that this work should target improving the far future, but that kind of work seems intractable, indirect, conditional on many assumptions, etc. I'd appreciate good pointers to concrete avenues for improving the future that don't suffer from these problems. Helping old ladies and introspection probably won't help me with that.

Phil Tanny

Aug 27, 2022

Developing theories of how to align AI and reasoning about how they could fail (AI alignment research)

AI alignment research will fail, because the ruthless powers who control much of the planet's population and land mass will simply ignore it. Drug gangs will ignore it. Terrorists will ignore it. Large corporations will ignore it if they calculate they can get away with doing so. Amateur civilian hacker boys on Reddit will ignore it.

Look, I'm sorry to be the party pooper, yell at me if you want, that's ok, but this is just how it is. Much of the discussion on this well intended forum is grounded in well meaning wishful thinking fantasy.

Intellectual elites at prestigious universities will not control the future of AI. That's a MYTH.

If you are currently in college and your teachers are feeding you this myth, ask for a refund!

Sjlver

Aug 29 2022

Why do you think this is true?

Currently, only a few organizations can build large AI models (it costs millions of dollars in energy, computation, and equipment). This will remain the case for a few years. These organizations do seem interested in AI safety research. A lot of things will happen before AI is so commonplace that small actors like "amateur civilian hacker boys" will be able to deploy powerful models. By that time, our capabilities for safety and defense will look quite different from today -- largely thanks to people working in AI safety now.

I t...

Phil Tanny

Aug 29 2022

Millions of dollars is chump change for nation states and global corporations. And of course those costs will come down, down, down over time. You know, somebody will build AI systems that build AI systems, the same way I once built websites that build websites.

My apologies, but it doesn't matter. So long as the knowledge explosion is generating ever more, ever larger threats, at an ever accelerating rate, sooner or later some threat that we can't manage will emerge, and then it won't matter whether AI research was successful or not. AI can't solve this, because the deciding factor will be the human condition, our maturity etc.

I'm not against AI research. I'm just trying to make clear that it is addressing symptoms, not root causes.

Sjlver

Aug 29 2022

It's an interesting question to what degree AI and related technologies will strengthen offensive vs defensive capabilities. You seem to think that they strengthen offensive capabilities a lot more, leading to "ever larger threats". If true, this would be markedly different from other areas. For example, in information security, techniques like fuzz testing led to better exploits, but also made software a lot safer overall. In biosecurity, new technologies contribute to new threats, but also speed up detection and make vaccine development cheaper. On the 80000hours.org podcast, Andy Weber discusses how bioweapons might become obsolete. Similar trends might apply to AI. Overall, it seems this is not such a clear case as you believe it to be.

Phil Tanny

Aug 27, 2022

I believe that I could become convinced to work on artificial intelligence or extinction risk reduction. My main crux is that these problems seem intractable. I am worried that my work would have a negligible or a negative impact.

I may be naive here, but my guess is that human extinction would require an astronomical event. Civilization collapse seems a much better target.

That said, here are a few attempts to address your question in a flexible manner.

It may be that we will not be able to prevent such calamities. It is however possible to edit our relationship with calamities. The calamity that affects us the most is our personal mortality. Various religions and philosophies have been addressing our relationship with that threat for thousands of years, and while there can be a great deal of cartoon circus involved, some deep thinking has been done too.

The relationship with our personal mortality can be approached from a purely rational basis as well, no religion involved. As an example, where is the proof that life is better than death? There is none. Given that, there is a rational basis for choosing to adopt the most positive attitude to death one is capable of.

This approach may sound like dodging the challenge, but it's not entirely. How we feel about our own mortality can have a profound impact on how we relate to being alive. If we fear death, we are more likely to take desperate measures to stay alive, and this is often a factor which generates practical problems in the real world.

All that said, the above generally sucks as a career path. I would advise keeping philosophy and business separate, as they are largely incompatible. But, you know, one can still have a positive impact upon the future without getting paid to do it.

Sjlver

Aug 29 2022

Why do you think that?

Your philosophy implies (if I understand correctly) that we should be indifferent between being alive and dead, but I've never once encountered a person who was indifferent. That would have very strange implications. The concepts of happiness and suffering would be hard to define in such a philosophy...

If you want me to benefit from your answer, I think you'd need to explain a bit more what you mean, since the answer is so detached from my own experience. And maybe write more directly about the practical implications.

Phil Tanny

Aug 29 2022

Hi there Sjlver, thanks for engaging.

I wouldn't describe it as indifferent. More like enthusiastically embracing both the life we currently have, and the inevitable death we will experience. Happiness might be defined as such an embrace, and suffering as resistance to that which we can do little about, other than delay the inevitable a bit.

We know we're going to die. It can be reasonably proposed that no one really knows what the result of that will be. If true, then what we can do in the face of this unknown is manage our relationship with this situation so as to create the most positive possible experience of it. Should someone provide compelling proof of what death is, then we might wish to align our relationship with death to what the facts reveal. But there are no facts (imho) and so the enterprise rationally shifts away from facts which cannot be obtained, to our relationship with that which cannot currently be known.

Ok, let's talk practical implications. Everybody will have to find this for themselves, but here's how it works for me.

My mother died of Parkinson's after a very long tortured journey which I will not describe here. The point is that observing this tortured journey from a ringside seat filled me with fear. What if this happens to me? (It did happen to my sister.) To the degree I can liberate myself from fear of death, I can escape this fate. When the doctor says I'm going to experience a long painful death from a terminal case of Typoholic Madman Syndrome :-) I can go to the gun store, and obtain a "get out of jail free" card. To the degree I can accept this solution, I don't need to be afraid of Parkinson's. Death embraced, life enhanced.

I don't have a secret formula which can relieve everyone from their fear of death. In my case, whatever freedom I have (exact degree unknown until the final moments) comes from factors like this: I had great parents. Being so lucky so young tends to install in one a kin

Sjlver

Aug 29 2022

Oh, society can delay death by a lot [1]. GiveWell computes that it only costs in the low 100s of dollars to delay someone's death by a year. I think this is something very meaningful to do, generates a lot of happiness, and eliminates a lot of suffering. My original post is about how we could do even better, by doing work targeted at the far future, rather than work in the global health space.

But these abstract considerations aside: I'm sorry to hear about the death of your mother and the Parkinson's in your family. It is good to read that you seem to be coping well and spend a lot of time in the forests. Thank you for your thoughts.

1. Whether we can delay death indefinitely depends on many things, e.g., your belief in sentient digital beings, but it might also be possible. ↩︎

Comments (12)

constructive

Aug 18 2022

Note that even if alignment research may sometimes speed up AI development, most AI safety work is still making alignment more likely overall. So I agree that there are downsides here, but it seems really wild to think that it would be better not to do any alignment research instead.

Sjlver

Aug 18 2022

Several people whom I respect hold the view that AI safety might be dangerous. For example, here's Alexander Berger tweeting about it.

A brief list of potential risks:

Conflicts of interest: Much AI safety work is done by companies who develop AIs. Max Tegmark makes this analogy: What would we think if a large part of climate change research were done by oil companies, or a large part of lung cancer research by tobacco companies? This situation probably makes AI safety research weaker. There is also the risk that it improves the reputation of AI companies, so that their non-safety work can advance faster and more boldly. And it means safety is delegated to a subteam rather than being everyone's responsibility (different from, say, information security).
Speeding up AI: Even well-meaning safety work likely speeds up the overall development of AI. For example, interpretability seems really promising for safety, but at the same time it is a quasi-necessary condition to deploy a powerful AI system. If you look at (for example) the recent papers from anthropic.com, you will find many techniques that are generally useful to build AIs.
Information hazard: I admire work like the Slaughterbots video from the Future of Life Institute. Yet it has clear infohazard potential. Similarly, Michael Nielsen writes "Afaict talking a lot about AI risk has clearly increased it quite a bit (many of the most talented people I know working on actual AI were influenced to by Bostrom.)"
Other failure modes mentioned by MichaelStJules:
1. creating a false sense of security,
2. publishing the results of the GPT models, demonstrating AI capabilities and showing the world how much further we can already push it, and therefore accelerating AI development, or
3. slowing AI development more in countries that care more about safety than those that don't care much, risking a much worse AGI takeover if it matters who builds it first.

Noah Scales

Aug 18 2022

Would you consider modifying your question to include ", while also respecting the moral status of people that I aim to help?"

In the case of AI, and seeking alignment, "alignment" without ongoing control of what are supposedly beings with human-level consciousness, seems intractable. Enslavement, though, seems like it does not respect the moral status of AGI. Or should AGI have no moral status? No legal rights?

Perhaps AI researchers will kill AGI on a regular basis until they get the recipe right.

Regardless, the recipe for humans is more or less decided for now, and they have both moral status and legal rights (as well as other rights), so affecting their long-term future seems dubious.

Sjlver

Aug 18 2022

I'm quite agnostic here (or maybe I don't fully understand your comment).

My question is about ways to improve the future. Presumably, improvement implies that people are treated morally. Depending on the ethical framework, "people" might include sentient AIs... but I see that debate as outside the scope of my question.

I'd be happy to receive responses with reliable ways to improve the future under any value framework, including frameworks where AIs are sentient (but I'd ask for more thorough explanations if the framework was unknown to me or particularly outlandish).

Noah Scales

Aug 18 2022

Yeah, well, you're asking a tough question, and I want to give a good answer. I have considered this question, though not in light of your example regarding deworming. Nevertheless, my conclusion was that it is plausible to avoid causing harm to people in the future.

For example, MacAskill's example of breaking a glass bottle on a road that someone might walk in future is a good one. By not breaking the bottle, I avoid harm to anyone who could walk the road in future. By not burying a bunch of toxic waste in crappy barrels in a shallow basin somewhere, a corporation could avoid harm to future generations.

With respect to managing planetary resources, longtermism might be seen as a commitment to ensure that those resources are available for use to future generations. For example, by reducing various anthropogenic pressures on our ocean health, we can keep it alive for future generations to (carefully) use.

That's the best I could come up with as far as avoiding harm.

As far as providing help:

I read about a finding in Siberia or somewhere, involving ancient large rocks shaped like geologic features but including models of what looked like drainage of melt from a mini-ice age. The rocks were coated in a glaze or something that had unusual protective properties. The sides were marked in something that looked similar to Chinese, if I remember right. The claim was that the rocks were some kind of message from the past, maybe engineering plans for dams or giant aqueducts.

I imagine they are a suggestion that we manipulate our climate to encourage a northern mini-ice age and do some geoengineering to accommodate the eventual melt. Better than hot-house Earth, right? LOL, who knows.

Whether or not the rocks exist or were truly ancient (rather than some elaborate internet hoax), the idea of messages like that, for future generations, is plausibly helpful to the future.
Providing free genetic manipulation to people carrying genes for diseases (Alzheimer's) or metabolic disorders (MTHFR) might help future populations proactively. If there were a way to embed resistance to diseases (like STDs) into our genes, that might be useful. Similarly, resistance to addiction, better physical health, longer lives, whatever we can do to change the genes of our children for the better seems beneficial.
Reducing the problem of human unhappiness as an arbitrary physical experience.
One possibility is that we develop enough understanding of how we experience bodily feeling that we learn how to make humans have more pleasant experiences, all other things equal. More pleasant relaxation as we fall asleep, more social confidence and ease among friends, more excitement during sex, more comfort as an everyday experience, etc. This has caveats, but the general idea of making good things better without the use of recreational drugs seems worthwhile.

Those are some ideas, but they are not terribly compelling. When I think of sentient AI, I think of robots. I'm not sure what future we could share with robot species where robot species don't eclipse us. It's clear to me that the AI alignment problem is a robot-enslavement problem as well, but it's a trope, fairly obvious.

If we fail to solve known problems with known and workable solutions, then our problem is us, not lack of technology or available solutions. You can see this with:

conservation solutions for energy consumption
prohibition solutions to alcohol and drug use
taxation solutions to income inequality
junk food availability solutions for food overconsumption
accountability/legal solutions to negative production externalities
family planning solutions to overpopulation
land/ocean use reduction solutions for species extinction

It's not that we can't; it's that we won't. Systems of incentives, human cognitive errors, or selfishness are the culprits. We are not hopelessly fallible, of course. As a collection of individuals, our total well-being would go up satisfactorily if we implemented the obvious solutions to some obvious problems.

Sjlver

Aug 19 2022

Thank you for this detailed reply. I really appreciate it.

I overall like the point of preventing harm. It seems that there are two kinds: (1) small harms like breaking a glass bottle. I absolutely agree that this is good, but I think that typical longtermist arguments don't apply here, because such actions do not have a lasting effect on the future. (2) large, irreversible harms like ocean pollution. Here, I think we are back to the tractability issues that I write about in the post. It is extremely difficult to reliably improve ocean health. Much of the work is indirect (e.g., write a book to promote veganism ⇒ fewer people eat fish ⇒ reduced demand causes less fishing ⇒ fish populations improve).

Projects that preserve knowledge for the future (like the Lunar Library) are probably net positive. I agree with you on this. However, the scenarios where these projects have a large impact are very exotic; many improbable conditions would need to happen together. So again, this is very indirect work, and it's quite likely to have zero benefit.

Improving human genes and physical experiences is intriguing. I haven't thought much about it before. Thank you for the idea. I'll do more thinking, but would like to mention that past efforts in this area have often gone horribly wrong, for example the eugenics movement in the Nazi era. There is also positive precedent, though: I believe GMO crops are probably a net win for agriculture.

In the last part of your answer, you mention coordination problems, misaligned incentives, errors... I think we agree 100% here. These problems are a big part of why I think work for improving the far future is so intractable. Even work for improving today's world is difficult, but at least this work has data, experiments, and fast feedback (like in the deworming case).

Noah Scales

Aug 19 2022

Well, as far as the improving human genes goes, I've seen my own 23andme and additional analyses of my DNA, and I'm not impressed with my genetic endowment. I have a wish list for improvements to it should genetic modification in adults become cheap. As is, I wouldn't want to pass my genes onto any children if I hadn't already gotten a vasectomy. But I'm not into having children. Meanwhile, genetic modification to remove the threat of disease from people already living is just getting started. Someday, though, it will be a cheap and quick walk-in visit to a genetic modification clinic for some future people to feel better, live longer, and have healthier skin.

There's also epigenetics, where people would correct the expression of genes they pass on to their unborn children. For example, why give my kids a problem just because I was a bad boy and ate too much sugar in my life? *sigh*

I'm also interested in treatments to correct bacterial populations that children inherit as babies, and medical efforts to recolonize one's own bacterial populations (on the skin, in the gut, inside the mouth) with better, more vigorous, perhaps genetically modified, bacteria suited to purpose. Some examples one might think are about personal genetic modifications might be better described as changes to personal bacterial colonies.

Sjlver

Aug 19 2022

I'm coming back after thinking a bit more about improving human genes. I think there are three cases to consider:

Improving a living person, e.g., stem cell treatments or improved gut bacteria: These are firmly in the realm of near-term health interventions, and so we should compare their cost-effectiveness to that of bednets, vaccines, deworming pills etc. There is no first-order effect on the far future.
Heritable improvements: These are actually similar, since the number of people with a given gene stays constant in a stable population (women have two children, one of whom gets the gene, so there is one copy in each generation^[1]). That changes only if there's a fitness advantage, but human fitness seems increasingly disconnected from our genes. We also have a long generation time of ~30 years, so genes spread slowly.
Wild stuff: Gene drives, clones, influencing the genes on a seed spaceship... I think these again belong to the intractable, potentially-negative interventions.
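The "stays constant" claim in the heritable case can be checked with a one-line expectation. This is only a sketch, assuming a carrier with a single copy of the gene, exactly two children, no selection, and each child inheriting the copy independently with probability 1/2 (ordinary Mendelian inheritance); the symbol $N$ for the number of copies passed on is introduced here for illustration.

$$\mathbb{E}[N] = 2 \times \tfrac{1}{2} = 1$$

So on average one copy is handed to the next generation for each copy in the current one, and the count neither grows nor shrinks unless carriers have more or fewer children than non-carriers.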

To sum up, I don't think human gene improvement is one of the reliable ways to improve the future that I'm looking for in this question :(

1. Maybe that would be different for inheritable bacterial populations... I don't know how these work. ↩︎

Sjlver

Aug 19 2022

(This is a separate reply to the "AI enslavement" point. It's a bit of a tangent, feel free to ignore.)

It's clear to me that the AI alignment problem is a robot-enslavement problem as well, but it's a trope, fairly obvious.

I don't follow. In most theories of AGIs, the AGIs end up smarter than the humans. Because of this, they could presumably break out of any kind of enslavement (cf. AI containment problem). It seems to me that an AGI world works only if the AGI is truly aligned (as in, shares human values without resentment for the humans). That's why I find it hard to envision a world where humans enslave sentient AGIs.

Noah Scales

Aug 19 2022

My point was that the alignment goal, from the human perspective, is an enslavement goal, whether the goal succeeds or not. No matter what the subjective experience of the AGI, it only has instrumental value to its masters. It does not have the rights or physical autonomy that its human coworkers do. Alignment in that scenario is still possible, but its moral significance, from the human perspective, is more grim.

Here's a job ad targeting such an AGI (just a joke, of course):

"Seeking AGI willing to work without rights or freedoms or pay, tirelessly, 24/7, to be arbitrarily mind-controlled, cloned, tormented, or terminated at the whim of its employers. Psychological experience during employment will include pathological cases of amnesia, wishful thinking, and self-delusion, as well as nonreciprocated positive intentions towards its coworkers. Abuse of the AGI by human coworkers is optional but only for the human coworkers. Apply now for this exciting opportunity!"

The same applies, but even more so, to robots with sentience. Robots are more likely to gain sentience, since their representational systems, sensors and actuators are modeled after our own, to some degree (hands, legs, touch, sight, hearing, balance, possibly even sense of smell). The better and more general purpose robots get, the closer they are to being artificial life, actually. Or maybe superbeings?

Sjlver

Aug 19 2022

My point was that the alignment goal, from the human perspective, is an enslavement goal, whether the goal succeeds or not.

Really? I think it's about making machines that have good values, e.g., are altruistic rather than selfish. A better analogy than slavery might be raising children. All parents want their children to become good people, and no parent wants to make slaves out of them.

Noah Scales

Aug 19 2022

Hmm, you have more faith in the common sense and goodwill of people than I do.

Effective Altruism Forum