
    AI 'hallucinations' can lead to catastrophic mistakes, but a new approach makes automated decisions more reliable

    By Nicholas Fearn



    Scientists have developed a new, multi-stage method to ensure artificial intelligence (AI) systems that are designed to identify anomalies make fewer mistakes and produce explainable and easy-to-understand recommendations.

    Recent advances have made AI a valuable tool to help human operators detect and address issues affecting critical infrastructure such as power stations, gas pipelines and dams. But despite showing plenty of potential, models may generate inaccurate or vague results — known as "hallucinations."

    Hallucinations are common in large language models (LLMs) like ChatGPT and Google Gemini. They stem from low-quality or biased training data and from user prompts that lack sufficient context, according to Google Cloud.

    Some algorithms also exclude humans from the decision-making process: the user enters a prompt, and the AI does the rest, without explaining how it arrived at a prediction. When this technology is applied to a serious area like critical infrastructure, a major concern is that the AI’s lack of accountability, and the resulting lack of trust, could lead human operators to make the wrong decisions.

    Some anomaly detection systems, for example, have previously been constrained by so-called "black box" AI algorithms, whose opaque decision-making processes generate recommendations that are difficult for humans to understand. That makes it hard for plant operators to determine why the algorithm flagged a particular reading as an anomaly.

    A multi-stage approach

    To increase AI's reliability and minimize problems such as hallucinations, researchers have proposed four measures, which they outline in a paper published July 1 at the CPSS '24 conference. In the study, they focused on AI used for critical national infrastructure (CNI), such as water treatment.

    First, the scientists deployed two anomaly detection systems, known as Empirical Cumulative Distribution-based Outlier Detection (ECOD) and Deep Support Vector Data Description (DeepSVDD), to identify a range of attack scenarios in datasets taken from Secure Water Treatment (SWaT), a system used for water treatment research and training.
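
    As a rough illustration (not the paper's exact pipeline), the sketch below shows how an ECOD detector from the open-source PyOD library can be trained on normal sensor readings and then used to flag injected anomalies. The synthetic data merely stands in for SWaT-style measurements; PyOD's DeepSVDD exposes the same fit-and-score interface, though its constructor arguments vary between library versions.

```python
# Minimal sketch, assuming PyOD is installed: unsupervised anomaly detection
# with ECOD on synthetic multivariate "sensor" readings standing in for
# SWaT-style data. DeepSVDD (pyod.models.deep_svdd.DeepSVDD) follows the
# same fit/decision_function/predict interface.
import numpy as np
from pyod.models.ecod import ECOD

rng = np.random.default_rng(0)

# "Normal" operating data: 1,000 samples across 10 sensor channels.
X_train = rng.normal(loc=0.0, scale=1.0, size=(1000, 10))

# Test data: mostly normal, plus 20 injected attack-like outliers.
X_test = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 10)),
    rng.normal(6.0, 1.0, size=(20, 10)),   # anomalous readings
])

detector = ECOD(contamination=0.05)          # expected anomaly fraction
detector.fit(X_train)

scores = detector.decision_function(X_test)  # higher = more anomalous
labels = detector.predict(X_test)            # 1 = flagged as an anomaly

print(f"Flagged {labels.sum()} of {len(labels)} test samples as anomalous")
```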


    The researchers said both systems had short training times, provided fast anomaly detection and were efficient — enabling them to detect myriad attack scenarios. But, as noted by Rajvardhan Oak, an applied scientist at Microsoft and computer science researcher at UC Davis, ECOD had a "slightly higher recall and F1 score" than DeepSVDD. He explained that the F1 score balances the precision of the flagged anomalies against the share of real anomalies identified, allowing users to determine the "optimal operating point."
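
    For context, recall is the fraction of real attacks a detector actually flags, while F1 combines recall with precision (the fraction of flagged points that really were attacks). The purely illustrative example below shows how these metrics are computed from ground-truth labels and detector alerts; the numbers are made up and not taken from the study.

```python
# Illustrative only: computing precision, recall and F1 for a detector's
# alerts against ground-truth attack labels (values are invented).
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]   # 1 = actual attack window
y_pred = [0, 0, 1, 1, 0, 0, 1, 1, 0, 1]   # 1 = detector raised an alert

precision = precision_score(y_true, y_pred)  # flagged points that were real attacks
recall = recall_score(y_true, y_pred)        # real attacks that were flagged
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```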

    Second, the researchers combined these anomaly detectors with eXplainable AI (XAI) — tools that help humans better understand and assess the results generated by AI systems — to make them more trustworthy and transparent.

    They found that XAI models like Shapley Additive Explanations (SHAP), which show users how much each input feature contributes to a machine learning model's predictions, can provide highly accurate insights into AI-based recommendations and improve human decision-making.
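
    Continuing the hypothetical ECOD sketch above, the snippet below shows one way the open-source shap library can attribute a detector's anomaly score to individual sensor channels; the sensor names are invented for illustration and this is not the study's exact setup.

```python
# Hedged sketch: attributing an anomaly score to individual "sensors" with
# SHAP. Reuses X_train, X_test, scores and detector from the ECOD example.
import numpy as np
import shap

feature_names = [f"sensor_{i}" for i in range(X_train.shape[1])]

# Explain the detector's anomaly score as a function of its inputs, using a
# small sample of normal training data as the background reference.
background = shap.utils.sample(X_train, 100, random_state=0)
explainer = shap.Explainer(detector.decision_function, background,
                           feature_names=feature_names)

# Explain the most suspicious test point.
suspect = X_test[np.argmax(scores)].reshape(1, -1)
explanation = explainer(suspect)

# Rank sensors by how strongly they pushed the score toward "anomalous".
contributions = sorted(zip(feature_names, explanation.values[0]),
                       key=lambda kv: abs(kv[1]), reverse=True)
for name, value in contributions[:3]:
    print(f"{name}: {value:+.3f}")
```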

    The third component revolved around human oversight and accountability. The researchers said humans can question AI algorithms' validity when provided with clear explanations of AI-based recommendations. They could also use these to make more informed decisions regarding CNI.

    The final part of this method is a scoring system that measures the accuracy of AI explanations. These scores give human operators more confidence in the AI-based insights they are reading. Sarad Venugopalan, co-author of the study, said this scoring system — which is still in development — depends on the "AI/ML model, the setup of the application use-case, and the correctness of the values input to the scoring algorithm."
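
    The study's scoring system is still being developed and is not specified here. Purely as an illustration of the general idea of scoring an explanation, the sketch below computes a simple "fidelity" measure: replace the top-attributed sensors with typical values and check how much the anomaly score drops. This is a generic XAI heuristic, not the authors' metric, and it reuses names from the hypothetical sketches above.

```python
# Illustrative only, NOT the study's scoring algorithm: a crude fidelity
# score for an explanation. Reuses X_train, X_test, scores, detector and
# explanation from the earlier sketches.
import numpy as np

def fidelity_score(point, attributions, score_fn, background, top_k=3):
    """Fraction of the anomaly score removed when the top-k attributed
    features are replaced with background medians (higher = the explanation
    points at features that really drive the score)."""
    original = score_fn(point.reshape(1, -1))[0]
    top = np.argsort(np.abs(attributions))[::-1][:top_k]
    neutralised = point.copy()
    neutralised[top] = np.median(background[:, top], axis=0)
    reduced = score_fn(neutralised.reshape(1, -1))[0]
    return (original - reduced) / original if original else 0.0

score = fidelity_score(X_test[np.argmax(scores)], explanation.values[0],
                       detector.decision_function, X_train)
print(f"Explanation fidelity (illustrative): {score:.2f}")
```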

    Improving AI transparency

    Speaking to Live Science, Venugopalan went on to explain that this method aims to provide plant operators with the ability to check whether AI recommendations are correct or not.

    "This is done via message notifications to the operator and includes the reasons why it was sent," he said. "It allows the operator to verify its correctness using the information provided by the AI, and resources available to them."

    Encouraged by how this research addresses the AI black box problem, Oak said: “With explanations attached to AI model findings, it is easier for subject matter experts to understand the anomaly, and for senior leadership to confidently make critical decisions. For example, knowing exactly why certain web traffic is anomalous makes it easier to justify blocking or penalizing it.”

    Eerke Boiten, a cybersecurity professor at De Montfort University, also sees the benefits of using explainable AI systems for anomaly detection in CNI. He said it would ensure humans are always kept in the loop when making crucial decisions based on AI recommendations. “This research is not about reducing hallucinations, but about responsibly using other AI approaches that do not cause them,” he added.
