LiveScience

AI speech generator 'reaches human parity' — but it's too dangerous to release, scientists say

By Owen Hughes

6 days ago

Microsoft has developed a new artificial intelligence (AI) speech generator that is apparently so convincing it cannot be released to the public.

VALL-E 2 is a text-to-speech (TTS) generator that can reproduce the voice of a human speaker using just a few seconds of audio.

Microsoft researchers said VALL-E 2 was capable of generating "accurate, natural speech in the exact voice of the original speaker, comparable to human performance," in a paper that appeared June 17 on the pre-print server arXiv. In other words, the new AI voice generator is convincing enough to be mistaken for a real person, at least according to its creators.

"VALL-E 2 is the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity for the first time," the researchers wrote in the paper. "Moreover, VALL-E 2 consistently synthesizes high-quality speech, even for sentences that are traditionally challenging due to their complexity or repetitive phrases."

Human parity in this context means that speech generated by VALL-E 2 matched or exceeded the quality of human speech in benchmarks used by Microsoft.

The researchers attribute this performance to two key features: "Repetition Aware Sampling" and "Grouped Code Modeling."

Repetition Aware Sampling improves the way the AI converts text into speech by addressing repetitions of "tokens" (the small discrete units the model generates one at a time, which in a codec-based system like this one stand for short snippets of audio) and preventing infinite loops of sounds or phrases during the decoding process. In other words, this feature helps vary VALL-E 2's pattern of speech, making it sound more fluid and natural.
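As a rough illustration of the idea, the Python sketch below picks each new token with a focused sampling step and, if that token already dominates the recent output, falls back to a plain random draw from the model's full probability distribution. This is not Microsoft's code: the function names, the top-p first pass, the window of 10 tokens and the 0.5 repetition threshold are all assumptions chosen for illustration.

```python
import random
from typing import List, Sequence


def nucleus_sample(probs: Sequence[float], top_p: float, rng: random.Random) -> int:
    """Sample from the smallest set of most likely tokens whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: List[int] = []
    mass = 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def repetition_aware_sample(
    probs: Sequence[float],
    history: Sequence[int],
    rng: random.Random,
    top_p: float = 0.8,      # hypothetical setting
    window: int = 10,        # hypothetical: how many recent tokens to inspect
    max_ratio: float = 0.5,  # hypothetical: repetition share that triggers the fallback
) -> int:
    """Pick the next token, switching strategy when the output starts to loop."""
    token = nucleus_sample(probs, top_p, rng)
    recent = list(history[-window:])
    if recent:
        ratio = recent.count(token) / len(recent)
        if ratio > max_ratio:
            # The candidate is over-represented in the recent output, so draw
            # from the full distribution to give other tokens a chance.
            token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return token
```

With a distribution such as `[0.9, 0.05, 0.05]` and an empty history, the sketch always returns token 0. Once the history is filled with token 0, the fallback occasionally returns token 1 or 2, which is what breaks a loop.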

Grouped Code Modeling, meanwhile, improves efficiency by reducing the sequence length — or the number of individual tokens that the model processes in a single input sequence. This speeds up how quickly VALL-E 2 generates speech and helps manage difficulties that come with processing long strings of sounds.
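The effect on sequence length can be shown with a short Python sketch. It bundles consecutive tokens into fixed-size groups so that a model would take one step per group rather than one step per token. The function name `group_codes`, the padding value and the group size of 2 are hypothetical choices for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def group_codes(codes: Sequence[int], group_size: int, pad_id: int = 0) -> List[List[int]]:
    """Split a token sequence into consecutive groups of group_size tokens.

    The last group is padded with pad_id (a hypothetical filler token) when
    the sequence length is not a multiple of group_size.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    padded = list(codes)
    remainder = len(padded) % group_size
    if remainder:
        padded.extend([pad_id] * (group_size - remainder))
    return [padded[i:i + group_size] for i in range(0, len(padded), group_size)]


tokens = list(range(1, 11))      # 10 tokens
groups = group_codes(tokens, 2)  # 5 groups, so half as many steps to process
```

Doubling the group size halves the number of steps again, at the cost of asking the model to predict more tokens at each step.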

The researchers used audio samples from speech libraries LibriSpeech and VCTK to assess how well VALL-E 2 matched recordings of human speakers. They also used ELLA-V — an evaluation framework designed to measure the accuracy and quality of generated speech — to determine how effectively VALL-E 2 handled more complex speech generation tasks.

"Our experiments, conducted on the LibriSpeech and VCTK datasets, have shown that VALL-E 2 surpasses previous zero-shot TTS systems in speech robustness, naturalness, and speaker similarity," the researchers wrote. "It is the first of its kind to reach human parity on these benchmarks."

The researchers pointed out in the paper that the quality of VALL-E 2’s output depended on the length and quality of speech prompts — as well as environmental factors like background noise.

"Purely a research project"

Despite its capabilities, Microsoft will not release VALL-E 2 to the public due to potential misuse risks. This coincides with increasing concerns around voice cloning and deepfake technology. Other AI companies like OpenAI have placed similar restrictions on their voice tech.
