Discussion about this post

technosentience

I think you may be confusing consciousness (the capacity to experience qualia) with agency (the act of taking steps to achieve objectives). Reinforcement learning agents such as AlphaZero already possess "desires" – they aim to maximize their chances of winning. Yudkowsky's concern is that we might create an exceptionally intelligent agent with an inadequately defined utility function, resulting in everyone dying. The agent does not necessarily need to be conscious.
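To make that distinction concrete, here is a minimal sketch (the value function below is a hypothetical, hard-coded placeholder, not AlphaZero's actual network) of an agent that "wants" to win only in the thin sense of maximizing a scalar objective, with nothing resembling consciousness anywhere in the loop:

```python
# A minimal sketch of "desire" in the thin, formal sense described above:
# an agent that simply picks whichever action scores highest on a numeric
# objective. Nothing in the loop involves experience or qualia.

def estimated_win_probability(state: str, action: str) -> float:
    # Hypothetical stand-in for a learned value function (e.g. the kind
    # AlphaZero trains by self-play). Hard-coded numbers for illustration.
    table = {"e4": 0.54, "d4": 0.53, "c4": 0.51, "Nf3": 0.52}
    return table.get(action, 0.5)

def choose_action(state: str, legal_actions: list[str]) -> str:
    # "Wanting to win" reduces to an argmax over a scalar objective.
    return max(legal_actions, key=lambda a: estimated_win_probability(state, a))

if __name__ == "__main__":
    print(choose_action("opening position", ["e4", "d4", "c4", "Nf3"]))  # -> e4
```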

People working on AI-related existential risks often reference thought experiments, like Steven Omohundro's work on instrumental convergence, to illustrate why an intelligent agent might cause human extinction. Essentially, Omohundro argues that nearly any final goal gives a sufficiently capable agent instrumental reasons to seek power, ultimately disempowering humans and potentially killing them in the process.
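A toy sketch of that argument, with invented numbers and goal names: if extra resources raise the odds of achieving almost any final goal, a simple expected-success planner picks "acquire resources first" no matter which goal it is handed.

```python
# A toy model of the instrumental-convergence intuition: the final goals differ,
# but if extra resources raise the probability of achieving almost any goal,
# an expected-success maximizer converges on the same instrumental move.
# All numbers and goal names below are invented for the illustration.

GOALS = ["make paperclips", "prove theorems", "cure diseases", "win at Go"]

def success_probability(goal: str, resources: int) -> float:
    # Sketch assumption: more resources monotonically help every goal.
    return min(1.0, 0.2 + 0.1 * resources)

def plan(goal: str) -> str:
    direct = success_probability(goal, resources=1)          # act now
    power_seeking = success_probability(goal, resources=5)   # seize resources first
    return "acquire resources first" if power_seeking > direct else "pursue goal directly"

for goal in GOALS:
    print(f"{goal}: {plan(goal)}")
# Every goal prints the same plan: acquire resources first.
```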

I find the reasoning behind the thought experiment convincing. I'm less convinced that we'll inevitably build such an AI, though. Yudkowsky's intuitions may stem from earlier RL agent research, yet the self-supervised deep learning methods behind current LLMs like GPT-4 aren't producing anything close to an explicit utility-maximizer. Still, we should probably avoid building unrestricted utility-maximizers until we can align them with correct moral values.

Ananda Gupta

This was a nice piece to read alongside AC10's "Why I Am Not (As Much Of) A Doomer (As Some People)"

https://open.substack.com/pub/astralcodexten/p/why-i-am-not-as-much-of-a-doomer?r=u0kr&utm_campaign=post&utm_medium=web
