Researchers question Microsoft Copilot and ChatGPT smarts as AI champs do well with "memorization rather than true reasoning abilities"

A confused robot using a computer
A confused robot using a computer (Image credit: Windows Central | Image Creator)

What you need to know

  • AI tools post exceptional results when handling everyday tasks, but falter on new and complex ones.
  • MIT researchers claim AI tools rely heavily on memorization rather than reasoning, which hurts their performance on unfamiliar tasks.
  • Human intervention remains critical for AI-generated outputs.

The rapid growth and adoption of generative AI worldwide are raising all sorts of concerns, including security and privacy. Recent reports indicate that AI might become smarter than humans and take over our jobs (potentially turning work into a hobby), and professionals are concerned about their relevance in the workplace.

Admittedly, AI chatbots like ChatGPT and Microsoft Copilot have come a long way, from hallucinating answers to writing and identifying errors in code within seconds. Even NVIDIA CEO Jensen Huang says coding might be dead in the water as a career option with the prevalence of AI. 

As you may know, large language models (LLMs) depend heavily on internet data for training. This reliance on copyrighted content has landed major tech corporations like Microsoft and OpenAI in court over copyright infringement. Sam Altman previously admitted it's impossible to develop ChatGPT-like tools without copyrighted content and argued that copyright law doesn't prohibit using such material to train AI tools.

While the issue remains debatable, AI tools are becoming more advanced and sophisticated, threatening to render some professions obsolete. For instance, architects and interior designers could lose work to Image Creator by Designer and ChatGPT, since these tools can generate sophisticated and detailed structural designs in seconds. However, the same tools have been spotted struggling with simple tasks like creating a plain white image.

According to a new study by MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), LLMs tend to perform better at familiar tasks but struggle to excel at new ones (via Digital Watch Observatory). Based on this premise, the study weighs the tools' reasoning capabilities against their dependence on memorization.

To test this theory, the researchers compared the LLMs' performance on common tasks versus new tasks on which they were not trained. According to the findings, advanced tools like OpenAI's GPT-4 excelled at arithmetic in base 10 but struggled with other number bases. The researchers applied the same analytical approach across various tasks, including chess and spatial reasoning. 
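To make the arithmetic comparison concrete, here is a small illustrative sketch (not code from the MIT study) of the kind of "counterfactual" task pair involved: the same addition problem posed in the familiar base 10 versus an unfamiliar base such as base 9, where a model that memorized base-10 arithmetic rather than learning the general procedure would stumble.

```python
def add_in_base(a: str, b: str, base: int) -> str:
    """Add two numbers given as digit strings in the specified base."""
    total = int(a, base) + int(b, base)
    if total == 0:
        return "0"
    digits = []
    while total:
        total, r = divmod(total, base)
        digits.append("0123456789abcdefghijklmnopqrstuvwxyz"[r])
    return "".join(reversed(digits))

# Familiar setting: base-10 addition, where models score well.
print(add_in_base("27", "15", 10))  # → 42

# Counterfactual setting: the same digit strings interpreted in base 9
# (25 + 14 in decimal), where the study found accuracy drops sharply.
print(add_in_base("27", "15", 9))   # → 43
```

The point of such a pairing is that both versions require the identical carrying procedure; only the base changes, so a gap in accuracy between them points to memorization rather than general reasoning.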

The researchers likened the LLMs' performance on these tasks to "random guessing in unfamiliar settings." The findings suggest that AI often excels at tasks it is well trained on and familiar with, which rewards memorization, but fails when new challenges demand actual reasoning, something humans excel at.

AI needs human intervention to get things right

Image of a robot stopping a person from using the computer (Image credit: Windows Central)

AI is becoming more advanced by the day, allowing it to handle various tasks with little human intervention. A report suggests that 54% of banking jobs could be automated using AI. But is this plausible given the privacy and security issues holding the technology back?

The journalism landscape is arguably the most impacted by the prevalence of AI. In a previous report, we highlighted how a publication fired most of its staffers to automate their jobs using AI and cut costs. In the long run, the editors were overworked, since most of their time was spent correcting AI's errors. The publication was forced to hire new writers, but not to write: the new hires were brought on to clean up after AI's grammatical and factual mistakes, for less pay.

Game developers have expressed fear of losing their jobs to AI. Game studios are reportedly looking into development tools that could automate repetitive and redundant tasks to give developers ample time to tap into their creative side. While this sounds good on paper, developers argue this could change their job description entirely. 

Additionally, integrating AI into game development might mean more menial work for developers. Rather than channeling their creativity into enhancing gameplay, developers could spend their time cleaning up after AI's mistakes.

Kevin Okemwa
Contributor

Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. You'll also catch him occasionally contributing at iMore about Apple and AI. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.

  • fjtorres5591
    And water is wet. 🙄

    Too many people who should know better are letting themselves be caught up in AI hysteria over what is little more than marketing hype.

    LLMs are little more than advanced database query software.
    Big database and very sophisticated query system, but there is no intelligence, artificial or otherwise, involved. It's just a modern version of the Mechanical Turk, applied to Vannevar Bush's memex concept.

    Very useful and valuable ($$$) to developers and users alike but worthless to ivory tower academics interested in fanciful dreams of software minds. No different than the UFO-obsessed or cryptid hunters.

    Let it go, folks.
    Focus on what the software does, not what it isn't and never will be.
    True AI, if ever developed, will not come from this technology.
    Reply