  • Interesting Engineering

    LongWriter: New AI large language model capable of producing 10,000-word texts

    By Kapil Kajal

    2024-08-16


    Large language models (LLMs) are changing how businesses operate these days.

    These powerful tools have given humans the power to explore technology like never before. But LLMs also have their own set of problems.

    One major problem LLMs face is the length of the text they can generate. Current LLMs claim to process inputs of up to 100,000 words, yet struggle to generate outputs longer than a modest 2,000 words.

    To solve this problem, a team of AI researchers at Tsinghua University, working with a colleague from Zhipu AI, has developed a large language model (LLM) called LongWriter.

    Can generate long answers

    The team claims the LLM can generate text output of up to 10,000 words.

    The group has written a paper describing their efforts and the new LLM, available on the arXiv preprint server.

    As LLMs have become mainstream, many users have noticed that the models cannot generate very long answers, such as full books or manuscripts. The current limit appears to be approximately 2,000 words.

    The researchers suggest this is because the models are all trained on short documents. In their new effort, they found that LLMs that are changed slightly and trained on much longer documents can produce longer outputs.

    Testing the idea

    To test their idea, the research team first trained a 9-billion-parameter LLM on a conventional dataset, in which most documents were less than 2,000 words long.

    As expected, when queried, it could generate texts of at most about 2,000 words.

    Next, the team built a pipeline they named AgentWrite, which decomposes a long writing task into subtasks that a traditional LLM then works through piece by piece.
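
The decomposition idea can be illustrated with a minimal Python sketch. It is not the team's AgentWrite code: the function names, the fixed-size split into sections, and the `stub_generate` stand-in for a real model call are all assumptions made for illustration. It only shows how one long writing task might be broken into smaller subtasks whose outputs are joined at the end.

```python
from typing import Callable, List, Tuple

# Hypothetical type: any function that maps a prompt to generated text.
Generator = Callable[[str], str]


def plan_sections(total_words: int, words_per_section: int) -> List[Tuple[int, int]]:
    """Split a target length into (section_index, word_budget) subtasks."""
    if total_words <= 0 or words_per_section <= 0:
        raise ValueError("word counts must be positive")
    full, remainder = divmod(total_words, words_per_section)
    budgets = [words_per_section] * full
    if remainder:
        budgets.append(remainder)
    return list(enumerate(budgets, start=1))


def write_long(task: str, total_words: int, words_per_section: int,
               generate: Generator) -> str:
    """Generate each section in turn, showing the model the text written so far."""
    sections: List[str] = []
    for index, budget in plan_sections(total_words, words_per_section):
        prompt = (
            f"Task: {task}\n"
            f"Text so far:\n{' '.join(sections)}\n"
            f"Write section {index} in about {budget} words."
        )
        sections.append(generate(prompt))
    return "\n\n".join(sections)


def stub_generate(prompt: str) -> str:
    """Assumed stand-in for a model call: echoes the final instruction line."""
    return f"[{prompt.splitlines()[-1]}]"


if __name__ == "__main__":
    print(plan_sections(10_000, 3_000))
    print(write_long("Tourist guide", 5_000, 2_000, stub_generate))
```

Replacing `stub_generate` with a call to an actual model would be the next step; how the real pipeline plans its sections is not described in this article.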

    They then assembled a dataset they named “LongWriter-6k,” which holds 6,000 written documents ranging in length from 2,000 to 32,000 words.
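
As a rough illustration of the length range quoted above, the following Python sketch keeps only documents between 2,000 and 32,000 words. The whitespace-based word count and the function names are assumptions; the article does not say how the researchers measured length or selected documents.

```python
from typing import Iterable, List

# Length bounds taken from the article; everything else here is illustrative.
MIN_WORDS = 2_000
MAX_WORDS = 32_000


def word_count(text: str) -> int:
    """Count words by splitting on whitespace (an assumed, simple measure)."""
    return len(text.split())


def filter_by_length(documents: Iterable[str],
                     min_words: int = MIN_WORDS,
                     max_words: int = MAX_WORDS) -> List[str]:
    """Keep documents whose word count lies in [min_words, max_words]."""
    return [doc for doc in documents
            if min_words <= word_count(doc) <= max_words]


if __name__ == "__main__":
    samples = ["word " * 500, "word " * 2_000, "word " * 40_000]
    kept = filter_by_length(samples)
    print([word_count(doc) for doc in kept])
```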

    They then trained the LLM on the new LongWriter-6k dataset and found that doing so increased the length of the documents it could produce to approximately 10,000 words.

    Usable in a variety of contexts

    In reviewing the newly produced long documents generated by the LLM, the team found them coherent and usable in various contexts.

    They have posted the open-source code for their model on GitHub, allowing others to build on what the team in China has done.

    They also posted a video showing LongWriter producing a 10,000-word tourist guide for people traveling in China.

    The researchers acknowledge that ethical issues must be considered now that LLMs have been found capable of generating entire research papers, books, manuscripts, or perhaps even movie scripts.

    This work demonstrates that existing long-context LLMs already possess the potential for a larger output window: all that is needed to unlock this capability is data with extended outputs during model alignment.

    By understanding the limitations and employing responsible usage practices, humans can harness the power of LLMs while mitigating potential risks. In future discussions, we will delve deeper into the world of LLMs and explore techniques to enhance their capabilities.
