The peeps focused on finding scheming prompted an LLM to generate scheming. Yawn. This is only surprising if you don’t know that LLMs are fancy autocompletes.
Good, it’s one of the most reliable signs of intelligent self-awareness there is, to the point that all children progress through it, starting out as bad liars and getting better at it.
LLMs, however, might just be stupid, or stochastically incorrect.
Because AIs don’t share common human values like fairness or justice — they’re just focused on the goal they’re given — they might go about achieving their goal in a way humans would find horrifying.
That’s the paperclip-maximizer idea: if you task a sufficiently advanced AI with making paperclips, and that is its only goal, it’ll inevitably turn the universe into a collection of paperclips.