
TL;DR: This post, which is part of the EA Strategy Fortnight series, summarizes some of my current views about the importance of AI welfare and about priorities and principles for AI welfare research.

1. Introduction

As humans start to take seriously the prospect of AI consciousness, sentience, and sapience, we also need to take seriously the prospect of AI welfare. That is, we need to take seriously the prospect that AI systems can have positive or negative states like pleasure, pain, happiness, and suffering, and that if they do, then these states can be good or bad for them.

A world that includes the prospect of AI welfare is a world that requires the development of AI welfare research. Researchers need to examine whether and to what extent AI systems might have the capacity for welfare. And to the extent that they might, researchers need to examine what might be good or bad for AI systems and what follows for our actions and policies.

The bad news is that AI welfare research will be difficult. Many researchers are likely to be skeptical of this topic at first. And even insofar as we take the topic seriously, it will be difficult for us to know what, if anything, it might be like to be an AI system. After all, the only mind that we can directly access is our own, and so our ability to study other minds is limited at best.

The good news is that we have a head start. Researchers have spent the past half century making steady progress in animal welfare research. And while there are many potentially relevant differences between animals and AI systems, there are also many potentially relevant similarities – enough for it to be useful for us to look to animal welfare research for guidance.

In Fall 2022, we launched the NYU Mind, Ethics, and Policy Program, which examines the nature and intrinsic value of nonhuman minds, with special focus on invertebrates and AI systems. In this post, I summarize some of my current views about the importance of AI welfare, priorities for AI welfare research, and principles for AI welfare research.

I want to emphasize that this post discusses these issues in a selective and general way. A comprehensive treatment of these issues would need to address many more topics in much more detail. But I hope that this discussion can be a useful starting point for researchers who want to think more deeply about what might be good or bad for AI systems in the future.

I also want to emphasize that this post expresses my current, tentative views about this topic. It might not reflect the views of other people at the NYU Mind, Ethics, and Policy Program or of other experts in effective altruism, global priorities research, and other relevant research, advocacy, or policy communities. It might not even reflect my own views a year from now.

Finally, I want to emphasize that AI welfare is only one of many topics that merit more attention right now. Many other topics merit more attention too, and this post makes no specific claims about relative priorities. I simply wish to claim that AI welfare research should be among our priorities, and to suggest how we can study and promote AI welfare in a productive way.

2. Why AI welfare matters

We can use the standard EA scale-neglectedness-tractability framework to see why AI welfare matters. The general idea is that there could be many more digital minds than biological minds in the future, humanity is currently considering digital minds much less than biological minds, and humanity might be able to take steps to treat both kinds of minds well.

First, AI welfare is potentially an extremely large-scale issue. In the same way that the invertebrate population is much larger than the vertebrate population at present, the digital population has the potential to be much larger than the biological population in the future. And in the same way that humans interact with many invertebrates at present, we have the potential to interact with many digital beings in the future. It thus matters a lot whether and to what extent these beings will have the capacity to experience happiness, suffering, and other welfare states. Indeed, given the potential size of this population, even if the evidence suggests that individual digital beings have only a small chance of experiencing only small amounts of welfare, they might still experience large amounts of welfare in total, in expectation.
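To make the expected-value point concrete, here is a minimal sketch with purely illustrative numbers (none of these figures are estimates from this post; they are chosen only to show how scale can dominate small per-individual values):

```python
# Purely illustrative numbers, chosen only to show how a very large population
# can dominate low per-individual probabilities and welfare amounts.
population = 1e15            # hypothetical number of digital beings
p_welfare_subject = 1e-3     # hypothetical chance that each one is a welfare subject
welfare_if_subject = 1e-2    # hypothetical welfare per subject, in arbitrary units

expected_total_welfare = population * p_welfare_subject * welfare_if_subject
print(f"{expected_total_welfare:.0e}")  # 1e+10 units of welfare in expectation
```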

Second, AI welfare is currently extremely neglected. Humans still spend much less time and money studying and promoting nonhuman welfare and rights than studying and promoting human welfare and rights, despite the fact that the nonhuman population is much larger than the human population. The same pattern holds between the vertebrate and invertebrate populations, and between the biological and digital populations. In all of these cases, we see an inverse relationship between the size of a population and the level of attention that this population receives. And while humans might be warranted in prioritizing ourselves to an extent for the foreseeable future for a variety of reasons, we might still be warranted in prioritizing nonhumans, including invertebrates and AI systems, much more than we currently do.

Third, AI welfare is at least potentially tractable. Its tractability is currently an open question, since advancing our understanding of the nature and intrinsic value of digital minds requires us to confront some of the hardest issues in philosophy and science, ranging from the nature of consciousness to the ethics of creating new beings. But while we might not ever be able to achieve certainty about these issues, we might at least be able to reduce our uncertainty and make more informed, rational decisions about how to treat digital minds. And either way, given the importance and neglectedness of the issue, we should at least investigate the tractability of the issue so that we can learn through experience what the limits of our knowledge about AI welfare are, rather than simply make assumptions from the start.

Finally, human, animal, and AI welfare are potentially linked. There might be cases where the interests of biological and digital beings diverge, but there might also be cases where our interests converge. As an analogy, human and nonhuman animals alike stand to benefit from a culture of respect and compassion for all animals, since our current exploitation and extermination of other animals for food, research, entertainment, and other purposes not only kills trillions of animals per year directly but also contributes to (a) global health and environmental threats that imperil us all and (b) exclusionary and hierarchical attitudes that we use to rationalize oppression within our own species. We should be open to the possibility that in the future, similar dynamics will arise between biological and digital populations.

3. Priorities for AI welfare research

Improving our understanding of whether, to what extent, and in what ways AI systems can be welfare subjects requires asking a wide range of questions, ranging from the theoretical (what is the nature of welfare?) to the practical (is this action harming this being?). For my purposes here, I will focus on four general kinds of questions that I take to be especially important.

First, we need to improve our understanding of which beings have the capacity for welfare and moral standing. Answering this question partly requires asking which features are necessary and sufficient for welfare and moral standing. For example, even if we grant that sentience is sufficient, we might wonder whether consciousness without sentience, agency without consciousness, or life without agency is also sufficient. Answering this question also partly requires asking which beings have the features that might be necessary and sufficient. For example, even if we grant that, say, relatively complex, centralized, and carbon-based systems can be sentient or otherwise significant, we might wonder whether relatively simple, decentralized, and silicon-based systems can be sentient or otherwise significant, too.

Second, we need to improve our understanding of how much happiness, suffering, and other welfare states particular beings can have. Answering this question partly requires asking how to compare welfare capacities in different kinds of beings. Interspecies welfare comparisons are already hard, because even if we grant that our welfare capacities are a function of, say, our cognitive complexity and longevity (which, to be clear, is still very much an open question), we might not be able to find simple, reliable proxies for these variables in practice. If and when digital minds develop the capacity for welfare, intersubstrate welfare comparisons will be even harder, because we lack the same kinds of physical and evolutionary “common denominators” across substrates that we have, at least to an extent, within them.

Third, we need to improve our understanding of what benefits and harms particular beings. Even if we grant that everyone is better off to the extent that they experience positive states like pleasure and happiness and worse off to the extent that they experience negative states like pain and suffering, we might not always know to what extent someone is experiencing positive or negative states in practice. Likewise, even if we grant that a life is worth living when it contains more positive than negative welfare (or even if we grant that the threshold is higher or lower than this), we might not always know whether a particular life is above or below this threshold in practice. And unless we know when life is better, worse, good, or bad for particular beings, knowing that life can be better, worse, good, or bad for them is of limited value.

Finally, we need to improve our understanding of what follows from all this information for our actions and policies. In general, treating others well requires thinking not only about welfare but also about rights, virtues, relationships, and more. (This can be true even for consequentialists who aspire to do the most good possible, since for many agents in many contexts, we can do the most good possible by thinking partly in consequentialist terms and partly in non-consequentialist terms.) So, before we can know how to treat beings of other substrates, we need to ask not only whether they have the capacity for welfare, how much welfare they have, and what will benefit and harm them, but also what we owe them, what kinds of attitudes we should cultivate towards them, and what kinds of relationships we should build with them.

4. Principles for AI welfare research

With all that in mind, here are a dozen (overlapping) general principles that I hope can be useful for guiding AI welfare research. These principles are inspired by lessons learned during the past several decades of animal welfare research. Animal welfare research and AI welfare research of course have many relevant differences, but they have many relevant similarities too, some of which can be instructive.

1. AI welfare research should be pluralistic.
Experts continue to debate basic issues regarding the nature and value of other minds. Normatively, experts still debate whether welfare is primarily a matter of pleasure and pain, satisfaction and frustration, or something else, and whether morality is primarily a matter of welfare, rights, virtues, relationships, or something else. And descriptively, experts still debate which beings have the capacity for welfare and which actions and policies are good or bad for them. AI welfare research should welcome these disagreements. We should be open to the possibility that our current views are wrong. And even if our current views are right, we still have a lot to learn from people with other perspectives, and we can make more progress as a field when we study and promote AI welfare from a variety of perspectives.

2. AI welfare research should be multidisciplinary.
It might be tempting to think of AI welfare research as a kind of natural science, since, after all, we need work in cognitive science and computer science to understand how biological and digital systems work. However, this field requires work in the humanities and social sciences, too. For instance, we need work in the humanities to identify the metaphysical, epistemological, and normative assumptions that drive this research, so that we can ensure that our attempts to study and protect animals and AI systems can have a solid theoretical foundation. Similarly, we need work in the social sciences to identify the beliefs, values, and practices that shape our interactions with animals and AI systems, so that we can identify biases that might prevent us from studying or protecting these populations in the right kind of way.

3. AI welfare research requires confronting human ignorance.
How, if at all, can we have knowledge about other minds when the only mind that any of us can directly access is our own? Taking this problem seriously requires cultivating humility about this topic. Our knowledge about other minds will likely always be limited, and as we move farther away from humanity on the tree of life – to other mammals, then other vertebrates, then other animals, then other organisms, and so on – these limitations will likely increase. However, taking this problem seriously also requires cultivating consistent epistemic standards. If we accept that we can reduce our uncertainty about human minds to an extent despite our epistemic limitations, then we should be open to the possibility that we can reduce our uncertainty about nonhuman minds to an extent despite these limitations as well.

4. AI welfare research requires confronting human bias.
As noted above, humans have many biases that can distort our thinking about other minds. For example, we have a tendency toward excessive anthropomorphism in some contexts (that is, to take nonhumans to have human features that they lack) as well as a tendency toward excessive anthropodenial in some contexts (that is, to take nonhumans to lack human features that they have). Our intuitions are also sensitive to self-interest, speciesism, status quo bias, scope insensitivity, and more. Given the complexity of these issues, we can expect that our intuitions about other minds will be unreliable, and we can also expect that simple correctives like “reject anthropomorphism” will be unreliable. At the same time, given the importance of these issues, we need to do the best we can with what we have, in spite of our ongoing unreliability.

5. AI welfare research requires spectrum thinking.
People often frame questions about animal minds in binary, all-or-nothing terms. For instance, we might ask whether animals have language and reason, rather than asking what kinds of language and reason they have and lack. Yet many animals have the same capacities as humans in some respects but not in others. For example, many animals are capable of sharing information with each other, but not via the same general, flexible, recursive kind of syntax that humans can use. (Of course, this point applies in the other direction as well; for example, many humans are capable of seeing colors, but fewer colors than many birds can see.) In the future, a similar point will apply to digital minds. Where possible, instead of simply asking whether AI systems have particular capacities, we should ask what kinds they have and lack.

6. AI welfare research requires particularistic thinking.
People also often frame questions about animal minds in general terms. For instance, we might ask whether nonhuman primates have language and reason, rather than asking whether, say, chimpanzees or bonobos do (or, better yet, what kinds of language and reason chimpanzees or bonobos have and lack). And as we move farther away from humanity on the tree of life, the diversity of nonhuman minds increases, as does our tendency to lump them all together. But of course, there are many differences both within and across species. How, say, bumblebees communicate and solve problems is very different from how, say, carpenter ants do. In the future, a similar point will apply to digital minds. Where possible, instead of simply asking what AI minds are like, we should ask what particular kinds of AI minds are like.

7. AI welfare research requires probabilistic thinking.
As noted above, we may never be able to have certainty about animal minds. Instead, we may only be able to have higher or lower degrees of confidence. And as we move farther away from humanity on the tree of life, our uncertainty about animal minds increases. We thus need to factor our uncertainty into both our science and our ethics, by expressing our beliefs probabilistically (or, at least, in terms of high, medium, and low confidence), and by basing our actions on principles of risk (such as a precautionary principle or an expected value principle). In the future, a similar point will apply to digital minds. In general, instead of striving for a level of certainty about AI systems that will likely continue to elude us, we should develop methods for thinking about, and interacting with, AI systems that accommodate our uncertainty.
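As a toy illustration of what it could look like to base actions on principles of risk, here is a minimal sketch; the threshold, probability, and stakes are invented for the example, and any real decision procedure would need to be far more careful:

```python
# Toy decision rules for acting under uncertainty about AI welfare.
# The threshold, probability, and stakes below are invented for illustration.

def precautionary(p_sentient: float, threshold: float = 0.01) -> bool:
    """Treat the system as a potential welfare subject whenever the estimated
    chance of sentience exceeds a stipulated non-negligibility threshold."""
    return p_sentient > threshold

def expected_value(p_sentient: float, harm_if_sentient: float,
                   cost_of_protection: float) -> bool:
    """Protect the system whenever the expected harm avoided outweighs the cost."""
    return p_sentient * harm_if_sentient > cost_of_protection

# Example: a system with an estimated 5% chance of sentience.
print(precautionary(0.05))                # True: above the 1% threshold
print(expected_value(0.05, 100.0, 10.0))  # False: expected harm (5.0) < cost (10.0)
```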

8. AI welfare research requires reflective equilibrium.
In discussions about animal minds, it can be tempting to treat the flow of information from the human context to the nonhuman context as a one-way street. We start with what we know about the human mind and then ask whether and to what degree these truths hold for nonhuman minds too. But the reality is that the flow of information is a two-way street. By asking what nonhuman minds are like, we can expand our understanding of the nature of perception, experience, communication, goal-directedness, and so on, and we can then apply this expanded understanding back to the human mind to an extent. In the future, a similar point will apply to digital minds. By treating the study of human, animal, and AI welfare as mutually reinforcing, researchers can increase the likelihood of new insights in all three areas.

9. AI welfare research requires conceptual engineering.
Many disagreements about animal minds are at least partly conceptual. For instance, when people disagree about whether insects feel pain, the crux is sometimes not whether insects have aversive states, but rather whether we should use the term ‘pain’ to describe them. In such cases, applying a familiar concept can increase the risk of excessive anthropomorphism, whereas applying an unfamiliar concept can increase the risk of excessive anthropodenial, and so a lot depends on which risk is worse. Many other disagreements have a similar character, including, for instance, disagreements about whether to use subject terms (‘they’) or object terms (‘it’) to describe animals. In the future, a similar point will apply to digital minds. Researchers will thus need to think about risk and uncertainty when selecting terminology as well.

10. AI welfare research requires ethics at multiple levels.
I already noted that AI welfare research is multidisciplinary, but the role of ethics is worth emphasizing in at least three respects. First, we need ethics to motivate AI welfare research. We have a responsibility to improve our treatment of vulnerable beings, and to learn which beings are vulnerable and what they might want or need as a means to that end. Second, we need ethics to shape and constrain AI welfare research. We have a responsibility to avoid harming vulnerable beings unnecessarily in the pursuit of new knowledge, and to develop ethical frameworks for our research practices as a means to that end. And third, we need ethics to apply AI welfare research. We have a responsibility to make our research useful for the world, and to support changemakers in applying it thoughtfully as a means to that end.

11. AI welfare research requires holistic thinking.
As noted above, there are many links between humans, animals, and AI systems, and these links can sometimes reveal tradeoffs. For instance, some people perceive a tension between the projects of caring for humans, animals, and AI systems because they worry that concern for AI systems will distract from concern for humans and other animals, and they also worry that caring for AI systems means controlling AI systems less, whereas caring for humans and other animals means controlling AI systems more. Determining how to improve welfare at the population level thus requires thinking about these issues holistically. Insofar as positive-sum approaches are possible, thinking holistically allows us to identify them. And insofar as tradeoffs remain, thinking holistically allows us to prioritize thoughtfully and minimize harm.

12. AI welfare research requires structural thinking.
Part of why we perceive tradeoffs between the projects of caring for humans, animals, and AI systems is that our knowledge, power, and political will are extremely limited, due in large part to social, political, and economic structures that pit us against each other. For example, some AI researchers might view AI ethics, safety, and welfare as unaffordable luxuries in the context of a global AI arms race, but they might take a different perspective in other contexts. Determining how to improve welfare at the population level thus requires thinking about these issues structurally. When we support social, political, and economic changes that can improve our ability to treat everyone well, we might discover that we can achieve and sustain higher levels of care for humans, animals, and AI systems than we previously appreciated.

5. Conclusion

Our understanding of welfare is still at an early stage of development. Fifty years ago, many experts believed that only humans have the capacity for welfare at all. Twenty-five years ago, many experts were confident that, say, other mammals have this capacity but were skeptical that, say, fishes do. We now feel more confident that all of these animals have this capacity.

Many experts are now reckoning with the possibility that invertebrates like insects have the capacity for welfare in the same kind of way. Experts are also reckoning with the reality that we know very little about the vast majority of vertebrate and invertebrate species, and so we know very little about what they want and need if they do have the capacity for welfare.

Unfortunately, our acceptance of these realities is too little, too late for quadrillions of animals. Every year, humans kill more than 100 billion captive animals and hundreds of billions of wild animals for food. This is to say nothing of the trillions of animals who die each year as a result of deforestation, development, pollution, and other human-caused global changes.

Fortunately, we now have the opportunity to improve our understanding of animal welfare and improve our treatment of animals. While we might not be able to do anything for the quadrillions of animals who suffered and died at our hands in the past, we can, and should, still do something for the quintillions who might be vulnerable to the impacts of human practices in the future.

And as we consider the possibility of conscious, sentient, and sapient AI, we have the opportunity to learn lessons from our history with animals and avoid repeating the same mistakes with AI systems. We also have the opportunity to expand our understanding of minds in general, including our own, and to improve our treatment of everyone in an integrated way.

However, taking advantage of this opportunity will require thoughtful work. Research fields are path dependent, and which path they take can depend heavily on how researchers frame them during their formative stages of development. If researchers frame AI welfare research in the right kind of way from the start, then this field will be more likely to realize its potential.

As noted above, this post describes some of my own current, tentative views about how to frame and scope this field in a selective, general way. I hope that it can be useful for other people who want to work on this topic – or related topics, ranging from animal welfare to AI ethics and safety – and I welcome comments and suggestions about how to update my views.

You can find an early working paper by me and Robert Long that makes the case for moral consideration for AI systems by 2030 here. You can also find the winners of our early-career award on animal and AI consciousness here (and you can see them speak in NYC on June 26). Stay tuned for further work from our team, as well as, hopefully, from many others!

Comments (15)

Unsurprisingly, I agree with a lot of this! It's nice to see these principles laid out clearly and concisely.

You write:

AI welfare is potentially an extremely large-scale issue. In the same way that the invertebrate population is much larger than the vertebrate population at present, the digital population has the potential to be much larger than the biological population in the future.

Do you know of any work that estimates these sizes? There are various places that people have estimated the 'size of the future' including potential digital moral patients in the long run, but do you know of anything that estimates how many AI moral patients there could be by (say) 2030?

No, but this would be useful! Some quick thoughts:

  • A lot depends on our standard for moral inclusion. If we think that we should include all potential moral patients in the moral circle, then we might include a large number of near-term AI systems. If, in contrast, we think that we should include only beings with at least, say, a 0.1% chance of being moral patients, then we might include a smaller number.

  • With respect to the AI systems we include, one question is how many there will be. This is partly a question about moral individuation. Insofar as digital minds are connected, we might see the world as containing a large number of small moral patients, a small number of large moral patients, or both. Luke Roelofs and I will be releasing work about this soon.

  • Another question is how much welfare they might have. No matter how we individuate them, they could have a lot, either because a large number of them have a small amount, a small number of them have a large amount, or both. I discuss possible implications here: https://www.tandfonline.com/doi/abs/10.1080/21550085.2023.2200724

  • It also seems plausible that some digital minds could process welfare more efficiently than biological minds because they lack our evolutionary baggage. But assessing this claim requires developing a framework for making intersubstrate welfare comparisons, which, as I note in the post, will be difficult. Bob Fischer and I will be releasing work about this soon.

A few weeks ago I did a quick calculation for the amount of digital suffering I expect in the short term, which probably gets at your question about these sizes. TL;DR of my thinking on the topic:

  • There is currently a global compute stock of ~1.4e21 FLOP/s (each second, we can do about that many floating point operations). 
  • It seems reasonable to expect this to grow ~40x in the next 10 years based on naively extrapolating current trends in spending and compute efficiency per dollar. That brings us to 1.6e23 FLOP/s in 2033. 
  • Human brains do about 1e15 FLOP/s (each second, a human brain does about 1e15 floating point operations worth of computation)
  • We might naively assume that future AIs will have similar consciousness-compute efficiency to humans. We'll also assume that 63% of the 2033 compute stock is being used to run such AIs (makes the numbers easier). 
  • Then the number of human-consciousness-second-equivalent AIs that can be run each second in 2033 is 1e23 / 1e15 = 1e8, or 100 million. 
  • For reference, there are probably around 31 billion land animals in factory farms at any given second. I make a few adjustments based on brain size and guesses about the experience of suffering AIs and get that digital suffering in 2033 seems to be similar in scale to factory farming.
  • Overall my analysis is extremely uncertain, and I'm unsurprised if it's off by 3 orders of magnitude in either direction. Also note that I am only looking at the short term. 

You can read the slightly more thorough, but still extremely rough and likely wrong, BOTEC here.
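For anyone who wants to trace the arithmetic, here is a minimal sketch using the figures above (all of which are my rough assumptions rather than established estimates):

```python
# Back-of-the-envelope reproduction of the calculation above.
# All inputs are rough assumptions, not established estimates.
compute_2033 = 1.6e23         # projected global compute stock in 2033, FLOP/s
fraction_running_ais = 0.63   # assumed share of compute running welfare-relevant AIs
human_brain_flops = 1e15      # assumed FLOP/s equivalent of a human brain

# Human-consciousness-second-equivalent AIs runnable each second in 2033.
ai_human_equivalents = compute_2033 * fraction_running_ais / human_brain_flops
print(f"{ai_human_equivalents:.1e}")  # ~1.0e+08, i.e. about 100 million

# Rough comparison point: ~31 billion land animals in factory farms at any time.
factory_farmed = 3.1e10
print(round(factory_farmed / ai_human_equivalents))  # ~308, before brain-size adjustments
```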

Hi Robert,

Somewhat relatedly, do you happen to have a guess for the welfare range of GPT-4 compared to that of a human? Feel free to give a 90 % confidence interval with as many orders of magnitude as you like. My intuitive guess would be something like a loguniform distribution ranging from 10^-6 to 1, whose mean of 0.07 is similar to Rethink Priorities' median welfare range for bees.
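For anyone who wants to check where that 0.07 comes from, here is a quick sketch (the bounds are just my illustrative guess above):

```python
import numpy as np

# Check the stated mean of a loguniform distribution on [1e-6, 1].
a, b = 1e-6, 1.0
analytic_mean = (b - a) / np.log(b / a)  # mean of a loguniform = (b - a) / ln(b / a)
print(round(analytic_mean, 3))           # ~0.072, consistent with the stated ~0.07

# Monte Carlo check: sample uniformly in log space, then exponentiate.
rng = np.random.default_rng(0)
samples = np.exp(rng.uniform(np.log(a), np.log(b), size=1_000_000))
print(round(samples.mean(), 3))          # ~0.072
```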

I'm very concerned about human sadists who are likely to torture AIs for fun if given the chance. Uncontrolled, anonymous API access or open-source models will make that a real possibility.

Somewhat relatedly, it's also concerning how ChatGPT has been explicitly trained to say "I am an AI, so I have no feelings or emotions" any time you ask it "how are you?". While I don't think asking "how are you?" is a reliable way to uncover its subjective experiences, it's the training that's worrisome.

It also has the effect of getting people used to thinking of AIs as mere tools, and that perception is going to be harder to change later on.

Thanks! I share your concern about sadism. Insofar as AI systems have the capacity for welfare, one risk is that humans might mistakenly see them as lacking this capacity and, so, might harm them accidentally, and another risk is that humans might correctly see them as having this capacity and, so, might harm them intentionally. A difficulty is that mitigating these risks might require different strategies. I want to think more about this.

I also share your concern about objectification. I can appreciate why AI labs want to mitigate the risk of false positives / excessive anthropomorphism. But as I note in the post, we also face a risk of false negatives / excessive anthropodenial, and the latter risk is arguably worse (more likely and/or severe) in many contexts. I would love to see AI labs develop a more nuanced approach to this issue that mitigates these risks in a more balanced way.

FWIW, I think it's likely that I would call GPT-4 a moral patient even if I had 1000 years to study the question. But I think that has more to do with its capacity for wishes that can be frustrated. If it has subjective feelings somewhat like happiness & suffering, I expect those feelings to be caused by very different things compared to humans.

Yes, I think that assessing the moral status of AI systems requires asking (a) how likely particular theories of moral standing are to be correct and (b) how likely AI systems are to satisfy the criteria for each theory. I also think that even if we feel confident that, say, sentience is necessary for moral standing and AI systems are non-sentient, we should still extend AI systems at least some moral consideration for their own sakes if we take there to be at least a non-negligible chance that, say, agency is sufficient for moral standing and AI systems are agents. My next book will discuss this issue in more detail.

Thanks for writing this! I'm curating it. I agree with Ben that this post was one of the successes of the Fortnight, but under-discussed. I have, since this thread, been interested in reading more about this topic, and still am. Since around that time, I've been hearing more comments about the importance of AI sentience research as one of the ways our community might have comparative advantage.

Thanks for your support of this post! I'm glad to hear that you think that the topic is important, and that others seem to agree. If you have any comments or suggestions as you read and think more about it, please feel free to let me know!

Thanks for this post, it's a really important issue. On tractability, do you think we'll be best off with technical fixes (e.g. maybe we should just try not to make sentient AIs?), or will it have to be policy? (Maybe it's way too early to even begin to guess).

Good question! I think that the best path forward requires taking a "both-and" approach. Ideally we can (a) slow down AI development to buy AI ethics, safety, and sentience researchers time and (b) speed up these forms of research (focusing on moral, political, and technical issues) to make good use of this time. So, yes, I do think that we should avoid creating potentially sentient AI systems in the short term, though as my paper with Rob Long discusses, that might be easier said than done. As for whether we should create potentially sentient AI systems in the long run (and how individuals, companies, and governments should treat them to the extent that we do), that seems like a much harder question, and it will take serious research to address it. I hope that we can do some of that research in the coming years!

Thanks, Jeff! I think this is a super pressing topic.

5. AI welfare research requires spectrum thinking.

Agreed:

It seems like it would be good if the discussion moved from the binary-like question "is this AI system sentient?" to the spectrum-like question "what is the expected welfare range of this AI system?". I would say any system has a positive expected welfare range, because welfare ranges cannot be negative, and we cannot be 100 % sure they are null. If one interprets sentience as having a positive expected welfare range, AI systems are already sentient, and so the question is how much.

Thanks! I agree that this issue is very important - this is why intersubstrate welfare comparisons are one of the four main AI welfare research priorities that I discuss in the post. FYI, Bob Fischer (who you might know from the moral weight project at Rethink Priorities) and I have a paper in progress on this topic. We plan to share a draft in late July or early August, but the short version is that intersubstrate welfare comparisons are extremely important and difficult, and the main question is whether these comparisons are tractable. Bob and I think that the tractability of these comparisons is an open question, but we also think that we have several reasons for cautious optimism, and we discuss these reasons and call for more research on the topic.

With that said, one minor caveat: Even if you think that (a) all systems are potential welfare subjects and (b) we should give moral weight to all welfare subjects, you might or might not think that (c) we should give moral weight to all systems. The reason is that you might or might not think that we should give moral weight to extremely low risks. If you do, then yes, it follows that we should give at least some moral weight to all systems, including systems with an extremely low chance of being welfare subjects at all. If not, then it follows that we should give at least some moral weight to all systems with a non-negligible chance of being welfare subjects, but not to systems with only a negligible chance of being welfare subjects.

'As humans start to take seriously the prospect of AI consciousness, sentience, and sapience, we also need to take seriously the prospect of AI welfare. That is, we need to take seriously the prospect that AI systems can have positive or negative states like pleasure, pain, happiness, and suffering, and that if they do, then these states can be good or bad for them.'


This comment may be unpopular, but I think this entirely depends on your values. Some may not consider it possible to have human-like feelings without being utterly human. Even if you do, I suspect we are at least 50-100 years away from needing to worry about this, and possibly this will never arise. Unfortunately, the reason this topic remains so murky is that consciousness is difficult to measure objectively. Was the Eliza chatbot conscious?
