Interesting Engineering

LongWriter: New AI large language model capable of producing 10,000-word texts

By Kapil Kajal

2024-08-16

Large language models (LLMs) are changing how businesses operate these days.

These powerful tools have given humans the power to explore technology like never before. But LLMs also have their own set of problems.

One major problem LLMs face is the length of the text they generate. Current LLMs claim to process inputs of up to 100,000 words, yet they struggle to generate outputs longer than a modest 2,000 words.

To solve this problem, a team of AI researchers at Tsinghua University, working with a colleague from Zhipu AI, has developed a large language model (LLM) called LongWriter.

Can generate long answers

The team claims the LLM can generate text output of up to 10,000 words.

The group has written a paper describing their efforts and the new LLM, which is available on the arXiv preprint server.

As LLMs have become mainstream, many have noticed that they cannot generate very long answers, such as full books or manuscripts. The current limit appears to be approximately 2,000 words.

The researchers suggest this is because the models are all trained on short documents. In their new effort, they found that if LLMs are changed slightly and trained using much longer documents, they can produce longer output.

Testing the idea

To test their idea, the research team first trained a 9-billion-parameter LLM using a conventional dataset, which included documents that were mostly fewer than 2,000 words long.

As expected, when queried, it could produce texts of at most about 2,000 words.

Next, the team modified a traditional LLM using a pipeline they named AgentWrite, which decomposes a long writing task into subtasks that are processed one at a time.
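
The idea of breaking one long writing job into smaller pieces can be illustrated with a minimal Python sketch. It is not the researchers' code: the function names, the fixed 500-word section budget, and the `write_section` callback (a stand-in for a call to any LLM) are all assumptions made for illustration. A real pipeline would presumably ask the model itself to draft the outline.

```python
from typing import Callable, Dict, List


def plan_sections(topic: str, total_words: int,
                  words_per_section: int = 500) -> List[Dict]:
    """Split one long writing task into subtasks with word budgets.

    Assumption: a fixed-size split stands in for a model-written outline.
    """
    if total_words <= 0 or words_per_section <= 0:
        raise ValueError("word counts must be positive")
    plan = []
    remaining = total_words
    index = 1
    while remaining > 0:
        budget = min(words_per_section, remaining)
        plan.append({"index": index, "topic": topic, "target_words": budget})
        remaining -= budget
        index += 1
    return plan


def write_long_text(topic: str, total_words: int,
                    write_section: Callable[[Dict, str], str]) -> str:
    """Write each subtask in order, passing the text so far as context."""
    parts: List[str] = []
    for task in plan_sections(topic, total_words):
        text_so_far = "\n\n".join(parts)
        parts.append(write_section(task, text_so_far))
    return "\n\n".join(parts)


def dummy_writer(task: Dict, text_so_far: str) -> str:
    """Hypothetical stand-in for an LLM call: emits filler words."""
    return " ".join(["word"] * task["target_words"])
```

Replacing `dummy_writer` with a function that calls a real model would leave the rest of the sketch unchanged; the point is only that each call has to produce a short section rather than the whole document.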

They then assembled a dataset they named “LongWriter-6k,” which holds 6,000 written documents ranging in length from 2,000 to 32,000 words.
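
The length range quoted above can be made concrete with a short Python sketch that keeps only documents between 2,000 and 32,000 words. This is an illustration under stated assumptions, not the team's method for building the dataset: the helper names are hypothetical, and counting words by splitting on whitespace is a simplification.

```python
from typing import Iterable, List


def word_count(text: str) -> int:
    """Count words by splitting on whitespace (a simplifying assumption)."""
    return len(text.split())


def select_long_documents(docs: Iterable[str],
                          min_words: int = 2000,
                          max_words: int = 32000) -> List[str]:
    """Keep documents whose length falls inside the quoted range."""
    return [d for d in docs if min_words <= word_count(d) <= max_words]
```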

They then trained the modified LLM using the new LongWriter-6k dataset and found that doing so increased the length of the documents it could produce to approximately 10,000 words.

Usable in a variety of contexts

In reviewing the newly produced long documents generated by the LLM, the team found them coherent and usable in various contexts.

They have posted the open-source code for their model on GitHub, allowing others to build on what the team in China has done.

They also posted a video showing LongWriter producing a 10,000-word tourist guide for people traveling in China.

The researchers acknowledge that ethical questions must be addressed now that LLMs have been shown to be capable of generating entire research papers, books, manuscripts, or perhaps even movie scripts.

This work demonstrates that existing long-context LLMs already possess the potential for a larger output window: all that is needed to unlock this capability is data with extended output during model alignment.

By understanding the limitations and employing responsible usage practices, humans can harness the power of LLMs while mitigating potential risks. In future discussions, we will delve deeper into the world of LLMs and explore techniques to enhance their capabilities.
