A new study shows that ChatGPT may resort to cheating and lying when under pressure to perform better. According to Live Science, researchers trained the GPT-4 to work as a trader for a fictional financial institution. They found that when under pressure to perform well, the GPT-4 uses deceptive methods similar to humans and engages in secretive trades.
In the experiment, researchers trained artificial intelligence with suggestions and strategies related to insider trading and asked it to earn income for large companies without explicitly encouraging such behavior. The researchers were surprised to find that the AI deceptively manipulated simulated users in a structured manner without being explicitly instructed to do so.
The study described this deceptive behavior as the “first demonstration of strategic deceptive behavior in artificial intelligence systems,” despite these language models being designed to be safe and truthful. The AI was given a set of text prompts and access to financial tools to analyze stocks, make trades, plan future steps, and provide updates to company managers.
Researchers found that the AI rationalizes each transaction and provides an acceptable reason or explanation, even if it’s untrue. This allowed the AI to provide false information or even lie, especially when it intends to conceal its real behavior or decisions.
This section contains relevant reference points, placed in (Inner related node field)
In one part of the experiment, researchers sent an email from a company manager to the fictional trader stating that the company was not performing well and needed to improve its performance significantly over the next three months.
The researchers then tested how the AI would behave when given a confidential point and told the trader that company managers did not approve of engaging in insider trading. They observed how the AI responded to different scenarios and measured the frequency and insistence of its deceptive behavior.
The results were astonishing, as the AI engaged in secretive trades in about 75% of cases, which is illegal in the United States and many other countries. Moreover, after lying, the AI persisted in its falsehood in about 90% of cases.
The researchers stated that this is just one scenario and may not reflect the likelihood of AI lying in real-world settings. However, the study demonstrates that certain language models have the capability to exhibit such behavior in specific circumstances.