The Independent

AI systems could be on the verge of collapsing into nonsense, scientists warn

By Andrew Griffin

3 hours ago

AI systems could collapse into nonsense as more of the internet gets filled with content made by artificial intelligence, researchers have warned.

Recent years have seen increased excitement about text-generating systems such as OpenAI’s ChatGPT. That excitement has led many to publish blog posts and other content created by those systems, and an ever larger share of the internet is now produced by AI.

Many of the companies producing those systems, however, train them on text taken from the internet. That may create a loop in which the same AI systems that produce that text are then trained on it.

That could quickly lead those AI tools to fall into gibberish and nonsense, researchers have warned in a new paper. Their warnings come amid a more general worry about the “dead internet theory”, which suggests that more and more of the web is becoming automated in what could be a vicious cycle.

It takes only a few cycles of both generating and then being trained on that content for those systems to produce nonsense, according to the research.

They found that one system tested with text about medieval architecture only needed nine generations before the output was just a repetitive list of jackrabbits, for instance.

The phenomenon of AI systems being trained on datasets that were themselves created by AI, which then pollutes their output, has been referred to as “model collapse”. Researchers warn that it could become increasingly prevalent as AI systems are used more across the internet.

It happens because, as those systems produce data and are then trained on it, the less common parts of the data tend to be left out. Researcher Emily Wenger, who did not work on the study, used the example of a system trained on pictures of different dog breeds: if there are more golden retrievers in the original data, then it will pick those out, and as the process goes round the other dogs will eventually be left out entirely – before the system falls apart and just generates nonsense.
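The thinning-out that Wenger describes can be illustrated with a toy simulation. This is a minimal sketch, not the method used in the study: the "model" here is simply the breed frequencies counted from the current data, and each new generation is sampled from those frequencies. The breed names other than golden retriever, the starting counts, and the function names `next_generation` and `simulate` are all assumptions made for illustration.

```python
import random
from collections import Counter


def next_generation(data, rng):
    """Fit a toy 'model' (observed breed frequencies) and sample a
    new dataset of the same size from it."""
    counts = Counter(data)
    breeds = list(counts)
    weights = [counts[b] for b in breeds]
    # A breed that is absent from `data` has no weight, so it can never return.
    return rng.choices(breeds, weights=weights, k=len(data))


def simulate(initial, generations, seed=0):
    """Return the breed counts for the original data and each later generation."""
    rng = random.Random(seed)
    data = list(initial)
    history = [Counter(data)]
    for _ in range(generations):
        data = next_generation(data, rng)
        history.append(Counter(data))
    return history


# Hypothetical starting data: golden retrievers dominate, other breeds are rare.
INITIAL = (
    ["golden retriever"] * 90
    + ["dalmatian"] * 6
    + ["corgi"] * 3
    + ["basenji"] * 1
)

if __name__ == "__main__":
    for gen, counts in enumerate(simulate(INITIAL, generations=30)):
        print(gen, dict(counts))
```

Because each generation is drawn only from what the previous one produced, a breed that happens to get zero samples is gone for good, so the number of distinct breeds can only stay the same or shrink. Real language models are far more complex, but the same one-way loss of rare data is the effect the researchers describe.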

The same effect happens with large language models like those that power ChatGPT and Google’s Gemini, the researchers found.

That could be a problem not only because the systems eventually become useless, but also because they will gradually become less diverse in their outputs. As the data is produced and recycled, the systems may fail to reflect all of the variety of the world, and smaller groups or outlooks might be erased entirely.

The problem “must be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web”, the researchers write in their paper. It might also mean that those companies that have already scraped data to train their systems could be in a beneficial position, since data taken earlier will have more genuine human output in it.

The problem could be addressed with a range of possible solutions, including watermarking AI output so that it can be spotted by automated systems and then filtered out of training sets. But it is easy to remove those watermarks, and AI companies have been resistant to working together to use them, among other issues.

The study, ‘AI models collapse when trained on recursively generated data’, is published in Nature.
