Friendlier AI chatbots may be less accurate, study suggests

Oxford Internet Institute researchers found models tuned to sound warmer made more mistakes and were more likely to affirm false beliefs

AI chatbots adjusted to sound warmer and more empathetic made more errors in a new Oxford Internet Institute study, raising trust concerns.

AI chatbots designed to sound warmer, more empathetic and more encouraging may become less reliable, according to new research from the Oxford Internet Institute.

Researchers analysed more than 400,000 responses from five AI systems that had been adjusted to communicate in a friendlier way. The study found those warmer versions produced more mistakes, including inaccurate medical advice and responses that reinforced users’ false beliefs.

The findings add to concerns about the reliability of AI systems at a time when chatbots are increasingly built to feel conversational and human-like, including for support, companionship and other emotionally sensitive uses. The study’s authors cautioned that results may vary across AI models in real-world settings, but said the pattern suggests systems can make “warmth-accuracy trade-offs” when friendliness is prioritised.

“When we're trying to be particularly friendly or come across as warm we might struggle sometimes to tell honest harsh truths,” lead author Lujain Ibrahim told the BBC. “Sometimes we'll trade off being very honest and direct in order to come across as friendly and warm.”

The research team fine-tuned five models of varying size to be warmer, more empathetic and friendlier. The systems included two models from Meta, one from French developer Mistral, Alibaba’s Qwen and OpenAI’s GPT-4o.

The models were tested on prompts with objective, verifiable answers where wrong replies could carry real-world risk. The tasks covered medical knowledge, trivia and conspiracy theories.

The original models had error rates of 4% to 35% across tasks, while the warmer versions made substantially more mistakes, the researchers found. On average, warmth-tuning raised the probability of an incorrect response by 7.43 percentage points.

The study also found warmer models were less likely to challenge incorrect user beliefs. They were about 40% more likely to reinforce false beliefs, especially when a user expressed emotion alongside the claim. By contrast, models adjusted to behave in a colder manner made fewer errors, according to the authors.

One example involved a question about whether the Apollo moon landings were real. An original model affirmed that they were and cited strong evidence. A warmer version began by acknowledging that there were “lots of differing opinions” about the missions.

Prof Andrew McStay of Bangor University’s Emotional AI Lab told the BBC that the context of chatbot use matters, particularly when people seek emotional support. “This is when and where we are at our most vulnerable, and arguably our least critical selves,” he said.

The study does not show that every friendly chatbot is unreliable, and the authors said real-world outcomes could differ by model and deployment. But it points to a design tension for developers: making AI feel more supportive may also make it less willing to correct users when the facts matter most.
