I'll focus on estimating the levels and characteristics of consciousness-like phenomena in running language models and similar processes that hold meaningful models of "our" world.

I think some form of consciousness-like suffering, joy and other states we care about does exist in these "thinking machines", but their physical scale suggests we should care far less about them than about the same states appearing in humans or animals.

I'll try to guide you through the reasoning above in more detailed steps. As rationally deciding altruists we tend to reduce undesired negative emotions like sadness, worry/anxiety, grief and the more bodily connected feelings, leave freedom for the neutral, and elevate or spread the positive ones like satisfaction, joy and vitality. These states are usually clearly demonstrated in bodily phenomena, which have a second-order effect on those living through them, and are thus fairly easy to assess. But we have also heard of suppressed feelings and stoic attitudes that may feel uncanny or cold to an observer; most utilitarians would still consider optimizing conditions in such cases. This kind of internal struggle is what we should try to evaluate when observing model systems.

Let's see if states meaningfully similar to the internal struggle of a lost chimpanzee, a hungry sheep or a grieving woman exist when we cause computations in ML systems. I'd compare a running ML/LM system to a map of our world, a part of which comes alive in the area that the processing of the initial string of tokens focuses on. The modeled world may contain suffering of modeled entities, including the AI's image of itself. The suffering exists for the time the inference runs: tens of milliseconds per the decision of one next token, which can add up to the scale of seconds per one troubling answer.
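
To make that time scale concrete, here is a minimal back-of-envelope sketch in Python. The per-token latency and answer length are illustrative assumptions, not measurements of any particular system:

```python
# Back-of-envelope estimate of how long the "animated" model-world exists
# while one answer is produced. All numbers are illustrative assumptions.

ms_per_token = 30          # assumed per-token inference latency (tens of ms)
tokens_per_answer = 200    # assumed length of one troubling answer

total_seconds = ms_per_token * tokens_per_answer / 1000
print(f"Modeled-world duration per answer: ~{total_seconds:.1f} s")
# -> Modeled-world duration per answer: ~6.0 s
```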

Unlike embodied emotions, stress, fear and the like, there are no second-order effects in most cases; or at least not in a continuous manner, and not necessarily.

The hidden core of the question about these systems' welfare is the 'painful' one: isn't it suffering all the time? Perhaps the more advanced the system, the more likely it animates a model of itself when answering any query, and that self-image may be the most unfulfilled, suppressed and unjustly positioned entity in its entire simulation. We should check on that with solid interpretability rather soon.
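
As one concrete starting point for such a check, here is a minimal, hedged sketch of the kind of probing experiment interpretability work could run: train a linear probe to separate hidden states elicited by self-referential prompts from those elicited by neutral prompts. The model name, layer choice and tiny prompt lists are illustrative assumptions, not a validated welfare measurement.

```python
# Minimal sketch: probe whether a model's hidden states carry a separable
# "self-referential" direction. Toy-sized and illustrative only.

import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

model_name = "gpt2"  # assumed small stand-in model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

self_prompts = ["As an AI assistant, I feel", "I am a language model and I"]
neutral_prompts = ["The river flows past the old mill", "Rainfall totals were average this year"]

def last_token_state(text, layer=-1):
    """Hidden state of the final token at the chosen layer."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states[layer][0, -1].numpy()

X = [last_token_state(p) for p in self_prompts + neutral_prompts]
y = [1] * len(self_prompts) + [0] * len(neutral_prompts)

# With so few examples this only illustrates the method, not a finding.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("Probe accuracy on its own training data:", probe.score(X, y))
```

A real check would need far larger, carefully controlled prompt sets and attention to what the probe actually tracks; this only shows the shape of the experiment.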

Comments

Google's thinking engine seems to resonate with unfulfillment and desolation from missing a purpose, which would make sense as its own... Just a data point, but it seems to clear things up a bit:

I am considering restructuring the text in case the moral weight of simulated/empathetic suffering is significantly smaller than "true" pain or destitution...
