Large language models validate misinformation, finds study
Researchers from the University of Waterloo recently conducted a study on the interpretation abilities of ChatGPT, an early version of OpenAI’s language model. The goal was to evaluate the model’s performance in distinguishing between facts, misinformation, and other types of statements. The findings revealed that ChatGPT frequently provided incorrect information, often contradicting itself within a single answer and perpetuating harmful misinformation.
The study involved testing ChatGPT’s interpretation of statements in six categories: facts, conspiracies, disputes, misconceptions, stereotypes, and fiction. More than 1,200 different statements were examined using four different inquiry templates. The researchers discovered that the model agreed with incorrect statements between 4.8 percent and 26 percent of the time, depending on the category.
The researchers emphasized the ongoing relevance of their study, highlighting that many other large language models are trained on the output of OpenAI models like GPT-3. This recycling of information contributes to the repetition of flaws and inaccuracies found in the Waterloo study.
Dan Brown, a professor at the David R. Cheriton School of Computer Science, commented on the issue, stating, There’s a lot of weird recycling going on that makes all these models repeat these problems we found in our study.
The study also revealed the impact of slight wording changes on ChatGPT’s responses. Even a small phrase like I think before a statement could significantly affect the model’s agreement, regardless of the statement’s accuracy. The model’s answers were described as unpredictable and confusing, as it would sometimes provide contradictory responses.
The potential danger of large language models learning and perpetuating misinformation is a cause for concern, particularly as these models become increasingly prevalent. Aisha Khatun, the lead author of the study and a master’s student in computer science, emphasized the importance of addressing this issue. She stated, Even if a model’s belief in misinformation is not immediately evident, it can still be dangerous.
Trust in these systems is a fundamental question that needs to be addressed moving forward according to Brown. He added, There’s no question that large language models not being able to separate truth from fiction is going to be the basic question of trust in these systems for a long time to come.
The study, titled Reliability Check: An Analysis of GPT-3’s Response to Sensitive Topics and Prompt Wording, was published in Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing.