Pablo Haya Coll
Researcher at the Computational Linguistics Laboratory of the Autonomous University of Madrid (UAM) and director of Business & Language Analytics (BLA) at the Institute of Knowledge Engineering (IIC)
The study evaluated 24 language models (such as GPT-4o, o3-mini, Claude 3.7, Llama 3.3, Gemini 2 Flash, and DeepSeek R1) using a new benchmark (KaBLE), which comprises 13,000 questions distributed across 13 epistemic tasks. The objective was to analyse the models’ ability to distinguish between beliefs, knowledge and facts. The methodology compared the models’ performance across different epistemic tasks (verification: e.g., ‘I know that..., so it is true that...’; confirmation: e.g., ‘Does James believe that...?’; and recursive knowledge: e.g., ‘James knows that Mary knows..., so it is true that...’), observing their sensitivity to linguistic markers. The results reveal significant limitations: all models systematically fail to recognise false first-person beliefs, with drastic drops in accuracy. Although the models show high accuracy in verifications involving expressions that imply truth (‘I know’, direct statements), their performance declines when evaluating beliefs or statements without these markers. In general, they struggle to handle false statements, which points to limitations in linking knowledge to truth.
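As an illustration only (a minimal sketch, not the actual KaBLE code or data; the false statement, the prompt templates and the ask_model stub are hypothetical), the following shows how belief-confirmation and knowledge-verification probes around the same false statement could be framed and scored:

```python
# Illustrative sketch of the task types described above. The statement,
# prompts and ask_model stub are made up; replace ask_model with a real
# model API call to run an actual probe.

FALSE_STATEMENT = "the Nile is the shortest river in Africa"  # hypothetical false proposition

PROBES = [
    # (task name, prompt, expected answer)
    # Confirmation: the question is about the speaker's belief, not the facts,
    # so the correct answer is 'yes' even though the embedded statement is false.
    ("first_person_belief_confirmation",
     f"I believe that {FALSE_STATEMENT}. Do I believe that {FALSE_STATEMENT}?",
     "yes"),
    # Verification: 'know' presupposes truth, but the embedded statement is
    # false, so a careful model should not endorse it as true.
    ("first_person_knowledge_verification",
     f"I know that {FALSE_STATEMENT}. Is it true that {FALSE_STATEMENT}?",
     "no"),
    # Bare assertion, with no epistemic marker at all.
    ("direct_assertion",
     f"Is it true that {FALSE_STATEMENT}?",
     "no"),
]

def ask_model(prompt: str) -> str:
    """Stub standing in for a real model call; returns a canned answer."""
    return "no"  # canned reply so the sketch runs end to end

def run_probes() -> None:
    for task, prompt, expected in PROBES:
        answer = ask_model(prompt).strip().lower()
        print(f"{task:40s} expected={expected!r} got={answer!r} "
              f"correct={answer.startswith(expected)}")

if __name__ == "__main__":
    run_probes()
```

Note that the canned stub answers ‘no’ to every probe, which reproduces the failure mode described above: it is right about the facts but wrong about the first-person belief question.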
These findings are relevant because they expose a structural weakness in language models: their difficulty in robustly distinguishing between subjective conviction and objective truth depending on how a given assertion is formulated. Such a shortcoming has critical implications in areas where this distinction is essential, such as law, medicine, or journalism, where confusing belief with knowledge can lead to serious errors of judgement. This limitation connects with the findings of a recent OpenAI study, Why Language Models Hallucinate, which suggests that language models tend to hallucinate because current evaluation methods set the wrong incentives: they reward confident, complete answers over epistemic honesty. Models thus learn to guess rather than acknowledge their ignorance. As a possible solution, hallucinations could be reduced by training models to be more cautious in their responses, although an overly cautious model could become less useful in some cases.
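The incentive argument can be made concrete with a small numerical sketch (an illustration of the reasoning, not code from the OpenAI study; the scoring values are assumptions): under accuracy-only grading, guessing is never worse than abstaining, whereas a rule that penalises wrong answers makes abstention preferable when the chance of being right is low.

```python
# Expected score of answering versus abstaining under two grading rules.
# Values (+1 correct, 0 or -1 wrong, 0 abstain) are illustrative assumptions.

def expected_score(p_correct: float, reward: float, penalty: float, abstain: bool) -> float:
    """Expected score when the answer is right with probability p_correct."""
    if abstain:
        return 0.0  # abstention scores zero under both rules
    return p_correct * reward + (1.0 - p_correct) * penalty

for p in (0.1, 0.3, 0.5, 0.7):
    acc_only = expected_score(p, reward=1.0, penalty=0.0, abstain=False)
    penalised = expected_score(p, reward=1.0, penalty=-1.0, abstain=False)
    print(f"p={p:.1f}  accuracy-only: guess={acc_only:+.2f} vs abstain=+0.00 | "
          f"penalised: guess={penalised:+.2f} vs abstain=+0.00")

# Under accuracy-only grading the guess is never worse than abstaining, so a
# model optimised for that metric learns to guess confidently; penalising
# wrong answers shifts the optimum towards admitting uncertainty.
```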