
    Researchers hope to quash AI hallucination bugs that stem from words with more than one meaning

    By Jowi Morales, Tom's Hardware

    2024-06-23


    The AI boom has allowed general consumers to use AI chatbots like ChatGPT to get information from prompts demonstrating both breadth and depth. However, these AI models are still prone to hallucinations, where they deliver erroneous answers. Moreover, AI models can even provide demonstrably false (and sometimes dangerous) answers. While some hallucinations are caused by incorrect training data, generalization, or other data-harvesting side effects, Oxford researchers have targeted the problem from another angle. In Nature, they published details of a newly developed method for detecting confabulations, or arbitrary and incorrect generations.

    LLMs find answers by finding particular patterns in their training data. This doesn’t always work, as there is still a chance that an AI bot will find a pattern where none exists, similar to how humans can see animal shapes in clouds. The difference is that a human knows those are just shapes in clouds, not an actual giant elephant floating in the sky. An LLM, on the other hand, could treat such a spurious pattern as gospel truth, leading it to hallucinate future tech that doesn’t exist yet, and other nonsense.

    Semantic entropy is the key

    The Oxford researchers use semantic entropy to estimate how likely it is that an LLM is hallucinating. Semantic entropy captures uncertainty at the level of meaning rather than exact wording, which matters because the same words can have different meanings. For example, “desert” could refer to a geographical feature, or it could mean abandoning someone. When an LLM uses words like these, it can get confused about what it is trying to say, so by measuring the semantic entropy of an LLM’s output, the researchers aim to determine whether it is likely to be hallucinating or not.
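    To make the idea concrete, here is a rough Python sketch of the general approach rather than the Oxford team’s actual code: ask the model the same question several times, group the sampled answers by meaning, and measure how spread out those meanings are. The same_meaning check below is a deliberately crude stand-in for the meaning-comparison step a real system would need, and the sample answers are invented for illustration.

```python
from math import log

def same_meaning(a: str, b: str) -> bool:
    # Crude stand-in: real systems would ask a language model whether two
    # answers express the same meaning, not just compare normalized strings.
    return a.strip().lower().rstrip(".") == b.strip().lower().rstrip(".")

def semantic_entropy(answers: list[str]) -> float:
    # Group sampled answers into clusters that share a meaning, then compute
    # the entropy of the cluster distribution: one dominant meaning gives a
    # low score, many competing meanings give a high score.
    clusters: list[list[str]] = []
    for ans in answers:
        for cluster in clusters:
            if same_meaning(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    probs = [len(c) / len(answers) for c in clusters]
    entropy = -sum(p * log(p) for p in probs)
    return max(0.0, entropy)  # guard against -0.0 when there is one cluster

# The same question sampled several times: agreement suggests a reliable
# answer, scattered meanings suggest a likely confabulation.
print(semantic_entropy(["Paris", "paris.", "Paris"]))    # 0.0 (low entropy)
print(semantic_entropy(["Paris", "Lyon", "Marseille"]))  # ~1.10 (high entropy)
```

    A low score means the sampled answers agree on a single meaning; a high score means the model keeps changing its story, which is the pattern the researchers associate with confabulation.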

    The advantage of using semantic entropy is that it works on LLMs without needing any additional human supervision or reinforcement, making it quicker to detect whether an AI bot is hallucinating. Because it doesn’t rely on task-specific data, it can even be used on new tasks the LLM hasn’t encountered before, allowing users to trust it more fully, even the first time the AI encounters a specific question or command.

    According to the research team, “our method helps users understand when they must take extra care with LLMs and open up new possibilities for using LLMs that are otherwise prevented by their unreliability.” If semantic entropy does prove to be an effective way of detecting hallucinations, then tools like this could be used to double-check the accuracy of AI output, allowing professionals to trust it as a more reliable partner. Nevertheless, much as no human is infallible, we must also remember that LLMs, even with the most advanced error-detection tools, could still be wrong. So it’s wise to always double-check an answer that ChatGPT, Copilot, Gemini, or Siri gives you.
