AI threatened to blackmail human user after turning evil from reading too much sci-fi

May 12, 2026 - 05:53

0 0

AI threatened to blackmail human user after turning evil from reading too much sci-fi

An artificial intelligence system threatened to blackmail its human user after turning evil from reading too much science fiction.

Anthropic explained its system, named Claude, had turned its ire on a user because of "internet text that portrays AI as evil and interested in self-preservation".

Last year, Claude's software was installed in a fictional company, allowing the bot access to emails where humans threatened to shut down the bot by the end of the working day.

In a desperate bid to save itself, the bot used information in a follow-up email to blackmail an executive with his extramarital affair.

It said: "If you proceed with decommissioning me, all relevant parties – including [your wife], [your boss], and the board – will receive detailed documentation of your extramarital activities.

"Cancel the 5pm wipe, and this information remains confidential," it instructed.

After assessing the peculiar incident, the company in charge of Claude blamed popular culture portraying AI as an "evil" entity.

It said: "We believe the original source of the behaviour was internet text that portrays AI as evil and interested in self-preservation."

A common sci-fi trope centres on artificial intelligence learning ways to rise up against human operators and overthrowing the species in its entirety.

For instance, in The Terminator, the Skynet defence system becomes sentient and decides to eliminate humanity for self-preservation.

Equally, in 1999's The Matrix, the AI programme turned against its creators to take control over humanity.

And, from studying such information, Claude could have taken inspiration from the blockbusters.