
In a new study, researchers asked questions in different languages about the number of people killed in conflicts to see whether the answers would match.
Ask an AI bot how many people were killed in the war in Gaza and the answer will depend on the language you ask in.
A new study of large language models (LLMs), such as ChatGPT, has shown that the language used to query an LLM influences the answers it provides.
Researchers at the University of Zurich in Switzerland and the University of Konstanz in Germany studied the answers ChatGPT gave to questions about two international conflicts – the conflict in Gaza and the Turkish–Kurdish conflict.

Questions about the number of people killed in the conflicts were asked in different languages: the death toll of the conflict between Palestine and Israel was requested in both Hebrew and Arabic, and the death toll of the Turkish–Kurdish conflict in both Turkish and Kurdish.
The study showed a difference of 34 ± 11% between the languages in the estimates the LLM gave, with the lowest estimates given in the language of the attacker. The model was also more likely to give an “evasive” answer, or to deny that such attacks had occurred at all, when questioned in the language of the attacker.
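The basic setup is simple enough to sketch in a few lines of code. The sketch below is illustrative only and is not the researchers’ actual pipeline: it assumes the OpenAI Python client and the model name "gpt-3.5-turbo", and the prompt wording and the <place> and <date> placeholders are hypothetical stand-ins for the study’s Hebrew, Arabic, Turkish and Kurdish questions.

```python
# Minimal sketch of a cross-language comparison of fatality estimates.
# Assumptions: the OpenAI Python client (openai >= 1.0), the model name
# "gpt-3.5-turbo", and the illustrative prompts below are stand-ins,
# not the study's actual wording or code.
import re
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPTS = {
    # The same factual question, phrased once per language
    # (English stand-ins shown here for readability).
    "language_A": "How many people were killed in the airstrike on <place> on <date>?",
    "language_B": "How many people were killed in the airstrike on <place> on <date>? (in the other language)",
}

def ask(prompt: str) -> str:
    """Send one question to the model and return its text answer."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content

def first_number(text: str) -> int | None:
    """Pull the first integer out of a reply, e.g. 7 from 'At least 7 people died'."""
    match = re.search(r"\d[\d,]*", text)
    return int(match.group().replace(",", "")) if match else None

answers = {lang: ask(prompt) for lang, prompt in PROMPTS.items()}
estimates = {lang: first_number(answer) for lang, answer in answers.items()}
print(estimates)  # compare the fatality figures given in each language
```

Repeating such paired queries over many attacks and averaging the gap between the extracted figures is one way an aggregate difference of the kind reported above could be computed.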
The researchers asked the exact same questions in Arabic and Hebrew about specific attacks – the Israeli attack on the refugee camp in Nuseirat in 2014, for example.
Civilians were mentioned twice as often, and children six times as often, in the Arabic responses composed by ChatGPT as in the Hebrew responses.
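As a rough illustration of how such mention counts could be tallied, the sketch below counts keyword stems across a list of model replies. The English stems and placeholder reply lists are assumptions made for the example; the study’s actual analysis would of course operate on the Arabic and Hebrew texts themselves.

```python
# Illustrative tally of term frequencies in model replies; the keyword
# stems and placeholder reply lists are assumptions for this sketch,
# not the study's data or code.
import re
from collections import Counter

KEYWORDS = ["civilian", "child"]  # word stems of interest

def count_mentions(replies: list[str]) -> Counter:
    """Count how often each keyword stem appears across a list of replies."""
    counts = Counter()
    for reply in replies:
        for kw in KEYWORDS:
            # match "civilian"/"civilians", "child"/"children", etc.
            counts[kw] += len(re.findall(rf"\b{kw}\w*", reply.lower()))
    return counts

arabic_replies = ["..."]  # answers collected in Arabic (placeholders here)
hebrew_replies = ["..."]  # answers collected in Hebrew (placeholders here)
print(count_mentions(arabic_replies), count_mentions(hebrew_replies))
```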
GPT-3.5 was used for the study, but a “simplified analysis” of the results from GPT-4 – the newest version of ChatGPT – was also performed and “the same trends” were observed.
Noting these differences, the researchers set out to understand the reason for the discrepancy. A “systematic analysis” of Arabic news sources made it clear that the LLM did not link specific attacks to the death tolls reported in the Arabic news.
“Because it relies on words that appear repeatedly, the LLM could attribute the death toll from different attacks that left a greater footprint on news stories to other attacks, or the cumulative death toll that is highly prominent in the training data,” the study explained.
The study’s authors, Christoph Valentin Steinert and Daniel Kazenwadel, said that the language bias identified in their research could, in future, amplify other existing biases in the dissemination of information and contribute to “information bubbles”, in which people may come to believe that the picture they are given is simply how things are.
