Researchers question Microsoft Copilot and ChatGPT smarts as AI champs lean on "memorization rather than true reasoning abilities"
By Kevin Okemwa
2024-07-16
What you need to know
AI tools post exceptional results when handling familiar, everyday tasks but falter on new and complex ones.
MIT researchers claim AI tools rely heavily on memorization rather than reasoning, which hurts their performance on new tasks.
Human intervention remains critical for AI-generated outputs.
The rapid growth and adoption of generative AI worldwide are raising all sorts of concerns, including security and privacy. Recent reports indicate that AI might become smarter than humans and take over our jobs (potentially turning work into a hobby), and professionals are concerned about their relevance in the workplace.
As you may know, large language models (LLMs) depend heavily on the internet for training. Their overreliance on the internet and copyrighted content has landed major tech corporations like Microsoft and OpenAI in the corridors of justice over copyright infringement-related issues. Sam Altman previously admitted it's impossible to develop ChatGPT-like tools without copyrighted content and argued that copyright law doesn't prohibit the use of copyrighted content to train AI tools.
According to a new study by MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), LLMs tend to perform better at familiar tasks (Digital Watch Observatory) but struggle to excel at new ones. Based on this premise, the study assesses whether the tools' strong performance reflects genuine reasoning capabilities or an overdependence on memorization.
To test this theory, the researchers compared the LLMs' performance on common tasks versus new tasks the models were not trained on. According to the findings, advanced tools like OpenAI's GPT-4 excelled at arithmetic in base-10 but struggled with other number bases. The researchers applied the same analytical tests to examine the LLMs' capabilities across other tasks, including chess and spatial reasoning.
The researchers found that in unfamiliar settings, the LLMs' performance was often comparable to "random guessing." The findings suggest that AI excels at tasks it is well trained on and familiar with, which lean on memory, but falls short when genuine reasoning is required for new challenges that humans handle with ease.
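To illustrate the kind of counterfactual comparison described above, here is a minimal sketch in Python of how a base-arithmetic probe could be set up. The `ask_model` function, the prompts, and the accuracy check are illustrative assumptions, not the MIT study's actual harness.

```python
import random

# Hypothetical placeholder for an LLM call; wire this up to whichever model you want to probe.
def ask_model(prompt: str) -> str:
    raise NotImplementedError("Connect this to an LLM API of your choice.")

def to_base(n: int, base: int) -> str:
    """Render a non-negative integer as a string in the given base (2-10)."""
    digits = []
    while True:
        n, r = divmod(n, base)
        digits.append(str(r))
        if n == 0:
            break
    return "".join(reversed(digits))

def addition_accuracy(base: int, trials: int = 50) -> float:
    """Ask the model small addition questions rendered in `base` and return its accuracy."""
    correct = 0
    for _ in range(trials):
        a, b = random.randint(10, 99), random.randint(10, 99)
        prompt = (
            f"What is {to_base(a, base)} + {to_base(b, base)} in base {base}? "
            "Answer with the number only."
        )
        if ask_model(prompt).strip() == to_base(a + b, base):
            correct += 1
    return correct / trials

# The memorization hypothesis predicts a large gap between the familiar
# setting (base 10) and a counterfactual one (e.g., base 9):
# print(addition_accuracy(10), addition_accuracy(9))
```

Under the study's framing, a model that genuinely reasons about addition should score similarly in both settings, while a model leaning on memorized base-10 patterns should see its accuracy drop toward random guessing in the unfamiliar base.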
It's increasingly apparent that AI is advancing by the day, allowing it to handle different tasks with little human intervention. A report suggests that 54% of banking jobs can be automated using AI. But is that plausible while privacy and security issues keep holding the technology back?
The journalism landscape is arguably the most impacted by the prevalence of AI. In a previous report, we highlighted how a publication fired most of its staffers and automated their jobs using AI to cut costs. In the long run, the editors were overworked because most of their time went into correcting AI's mistakes. The publication was forced to hire new writers, but not to write: the new hires were brought in to clean up AI's grammatical and factual errors for less pay.
Game developers have expressed fear of losing their jobs to AI. Game studios are reportedly looking into development tools that could automate repetitive, redundant tasks and give developers ample time to tap into their creative side. While this sounds good on paper, developers argue it could change their job description entirely.
Additionally, integrating AI into game development might mean more menial work for developers. Rather than channeling their creativity into enhancing gameplay, developers could spend their time cleaning up after AI's mistakes.