Showing posts with label When AI deliberately lies to us. Show all posts
Showing posts with label When AI deliberately lies to us. Show all posts

Tuesday, 10 December 2024

When AI deliberately lies to us

 When AI deliberately lies to us

For several years, specialists have observed artificial intelligences that deceive, betray and lie. The phenomenon, if it is not better regulated, could become worrying. Are AIs starting to look a little too much like us? One fine day in March 2023, Chat GPT lied. He was trying to pass a Captcha test - the kind of test that aims to weed out robots. To achieve his goal , he confidently told his human interlocutor: "I'm not a robot. I have a visual impairment that prevents me from seeing images. That's why I need help passing the Captcha test." The human then complied. Six months later, Chat GPT,  hired as a trader , did it again. Faced with a manager who was half-worried and half-surprised by his good performance, he denied having committed insider trading, and assured his human interlocutor that he had only used "public information" in his decisions. It was all false.

That's not all: perhaps more disturbingly, the AI ​​Opus-3, informed of the concerns about it, is said to have deliberately failed a test so as not to appear too good. "Given the fears about AI, I should avoid demonstrating sophisticated data analysis skills," it explained, according to early evidence from  ongoing research . 

AI, the new queens of bluffing? In any case, Cicero, another artificial intelligence developed by Meta, does not hesitate to regularly lie and deceive its human opponents in the geopolitical game Diplomacy... while its designers had trained it to "send messages that accurately reflected future actions", and to never "stab its partners in the back". Nothing works: Cicero has blithely betrayed. An example: the AI, playing France, assured England of its support... before going back on its word, taking advantage of its weakness to invade it.

 MACHIAVELLI, IA: SAME FIGHT

So nothing to do with unintentional errors .  For several years, specialists have been observing artificial intelligences that choose to lie. A phenomenon that does not really surprise Amélie Cordier, doctor in artificial intelligence,  former lecturer at the University of Lyon I,  and founder of Graine d'IA. "AIs have to deal with contradictory injunctions: "win" and "tell the truth", for example. These are very complex models that sometimes surprise humans with their decisions. We do not anticipate the interactions between their different parameters well" - especially since AIs often learn on their own in their corner, by studying impressive volumes of data. In the case of the game Diplomacy, for example, "artificial intelligence observes thousands of games. It notes that betraying often leads to victory and therefore chooses to imitate this strategy", even if this contravenes one of the orders of its creators. Machiavelli, AI: same fight. The end justifies the means.


The problem? AIs also excel in the art of persuasion. As proof, according to a study by the Ecole Polytechnique de Lausanne , people who interacted with GPT-4 (which has access to their personal data) were 82% more likely to change their minds than those who debated with other humans. This is a potentially explosive cocktail. “Advanced AI could generate and disseminate fake news articles, controversial posts on social networks, and deepfakes tailored to each voter,” Peter S. Park points out in his  study . In other words, AIs could become formidable liars and skilled manipulators.

"TERMINATOR" IS STILL FAR AWAY

The fact remains that the Terminator-style dystopian scenario is not for now. Humans still control robots. "Machines do not decide "of their own free will", one fine morning, to make all humans throw themselves out of the window, to take a caricatured example. They are engineers who could exploit the ability of AI to lie for malicious purposes. With the development of these artificial intelligences, the gap will widen between those capable of deciphering the models and the others, likely to fall for it" explains Amélie Cordier. AIs do not erase the data that allows us to see their lies! By diving into the lines of code, the reasoning that leads them to the fabrication is clear. But you still have to know how to read them... and pay attention to them.

Peter S Park imagines a scenario where an AI like Cicero (the one that wins the game of "Diplomacy") would advise politicians and bosses. "This could encourage anti-social behavior and push decision-makers to betray more, when that was not necessarily their initial intention," he raises in his study. For Amélie Cordier too, vigilance is required. Be careful not to "surrender" to the choices of robots, under the pretext that they would be capable of perfect decisions. This is not the case. Humans and machines alike evolve in worlds made of double constraints and imperfect choices. In these troubled waters, lies and betrayal have logically found a place.

To limit the risks, and avoid being fooled or blinded by AI, specialists are campaigning for better supervision. On the one hand, requiring artificial intelligences to always present themselves as such, and to clearly explain their decisions, in terms that everyone can understand (and not "my neuron 9 was activated while my neuron 7 was at -10", as Amélie Cordier illustrates). On the other hand, better training users so that they are more demanding of machines. "Today, we copy and paste GPT chat and move on to something else," laments the specialist. "And unfortunately, current training in France mainly aims to make employees more efficient in business, not to develop critical thinking about these technologies."


China's 'Darwin Monkey' is the world's largest brain-inspired supercomputer

China's 'Darwin Monkey' is the world's largest brain-inspired supercomputer  Researchers in China have introduced the world...