The Tech Behind ChatGPT Falls Short of Human-Like AI
The introduction of ChatGPT transformed the conversation around artificial intelligence (AI). But the technology underpinning it has limitations and may not be capable of reaching the level of intelligence humans exhibit. As a result, researchers are exploring alternative approaches to overcome these challenges.
The Rise of Transformer Neural Networks
In 2017, a group of researchers at Google introduced the transformer neural network. Transformers now power many popular AI products, including OpenAI's ChatGPT, the chatbot whose impact last year prompted Bill Gates to proclaim the beginning of the AI era.
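For readers unfamiliar with the architecture, a transformer's defining operation is attention: every token in a sequence is scored against every other token, and those scores decide how much each token's information contributes to the output. The NumPy sketch below shows scaled dot-product attention in its textbook form; the array sizes and variable names are illustrative and not drawn from any particular product's code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Textbook scaled dot-product attention.
    Q, K, V are (sequence_length, d) arrays of query, key, and value vectors."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                       # similarity of every token to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over each row
    return weights @ V                                  # weighted mix of value vectors

# Toy usage: 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)      # (3, 4)
```

Because every token attends to every other token, the cost of this comparison grows with the square of the sequence length, one reason researchers keep looking for alternatives.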
Artificial General Intelligence: The Next Frontier
While transformers power ChatGPT, recent research from Google suggests they may lack the capacity for human-like abstraction, extrapolation, and prediction. Achieving artificial general intelligence (AGI), AI that exhibits human-like intelligence, may therefore require alternative models.
The Limitations of ChatGPT
ChatGPT primarily responds to user prompts with text generated from the training data provided by humans. In its initial iteration, the chatbot's knowledge largely stopped at September 2021, which it had to acknowledge whenever asked about more recent events. Google researchers tested transformers' ability to generalize beyond their training data and found that performance degraded, even on simple extrapolation tasks.
Exploring Alternatives: The Mamba Model
In a recent submission to the open-access repository arXiv, Albert Gu, an assistant professor at Carnegie Mellon University, and Tri Dao, the chief scientist at Together AI, introduced a new model called Mamba. Mamba is a state-space model (SSM) that outperforms transformers across a range of tasks, including language modeling, audio processing, and genomics.
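Unlike a transformer, which compares every new token against the entire preceding sequence, a state-space model carries a compact hidden state forward and updates it one step at a time. The sketch below shows a plain linear SSM recurrence for illustration only; it leaves out Mamba's selective, input-dependent parameters and hardware-aware implementation, and the matrix names A, B, and C follow standard state-space notation rather than Gu and Dao's code.

```python
import numpy as np

def linear_ssm(u, A, B, C):
    """Run a discrete linear state-space recurrence over an input sequence.
    u: (T, d_in) inputs; A: (d_state, d_state); B: (d_state, d_in); C: (d_out, d_state).
    At each step: x_t = A @ x_{t-1} + B @ u_t, then y_t = C @ x_t."""
    x = np.zeros(A.shape[0])
    outputs = []
    for u_t in u:
        x = A @ x + B @ u_t      # fold the new input into the hidden state
        outputs.append(C @ x)    # read an output off the current state
    return np.stack(outputs)

# Toy usage: 5 time steps, 2-dimensional inputs, 4-dimensional state, 3-dimensional outputs
rng = np.random.default_rng(1)
A = 0.9 * np.eye(4)              # stable state transition
B = rng.standard_normal((4, 2))
C = rng.standard_normal((3, 4))
u = rng.standard_normal((5, 2))
print(linear_ssm(u, A, B, C).shape)  # (5, 3)
```

Because each step touches only a fixed-size state rather than the full history, per-token generation cost stays constant, the kind of property behind the faster-generation figures discussed below.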
Mamba's Performance and Advantages
Gu and Dao's research shows that Mamba outperforms transformers of the same size and matches transformers twice its size in both pretraining and downstream evaluation. Mamba can also generate language responses five times faster than transformers, a point Dao has highlighted. The results have drawn attention from researchers, with Nvidia's Dr. Jim Fan commending the effort to explore alternative sequence architectures.
The Impact of Mamba on the AI Landscape
While ChatGPT's technology may fall short of delivering human-like intelligence, the potential superiority of Mamba over transformers could bring the industry closer to that goal. Further testing and validation are needed to confirm that Mamba performs consistently, but if it does, it could have a significant impact on the field of AI.
Read the original article on Business Insider: Transformers' Limits: Researchers Examining Alternative AI Models
—————————————————————————————————————————————
By: hchowdhury@insider.com (Hasan Chowdhury)
Title: Transformers’ Limits: Researchers Examining Alternative AI Models
Sourced From: www.businessinsider.com/transformers-limits-chatgpt-researchers-examining-alternative-ai-models-2023-12
Published Date: Tue, 05 Dec 2023 15:51:43 +0000