You're right that ChatGPT is far from being sentient. However, I don't think it means that the same methods used to train ChatGPT can't scale up to a genuine understanding of reality.

A neural network trained to guess the next word in a sentence might still end up with an internal model of the world that helps it predict how a human would complete the given input. NNs discovering hidden structure in the data is already a thing that happens in neural networks: for example, computer vision models learn to recognize objects' edges. Knowledge about the world helps text prediction on tasks like logical reasoning, so a sufficiently advanced model likely will incorporate it in predictions.

Unlike most such features, we may even invent tools to extract the internal truth function of the model. Truth has a very nice property of logical consistency: if the model estimates that proposition X holds with probability p, then proposition not-X should hold with probability with 1-p, and so on. Very few naturally occurring features will have such a structure, so this property will distinguish truth from everything else.

Actually, the method described above has been tried in existing LLMs with promising results. You can read a Twitter thread about it here: https://twitter.com/CollinBurns4/status/1600892261633785856, and the full explanation of the paper in this blog post: https://www.alignmentforum.org/posts/L4anhrxjv8j2yRKKp/how-discovering-latent-knowledge-in-language-models-without.

Expand full comment
Feb 4, 2023Liked by Michael Huemer

Over time, the images of trees that appear on televisions have gotten more and more realistic, from black and white images to fuzzy color pictures, to higher and higher definition. And progress continues to be made.

Eventually, televisions will produce real trees!

Expand full comment