AI-native Memory: A Pathway from LLMs Towards AGI (2406.18312v4)

Published 26 Jun 2024 in cs.CL and cs.AI

Abstract: LLMs have shown the world sparks of artificial general intelligence (AGI). One opinion, especially from some startups working on LLMs, argues that an LLM with nearly unlimited context length can realize AGI. However, they might be too optimistic about the long-context capability of (existing) LLMs: (1) recent literature has shown that their effective context length is significantly smaller than their claimed context length; and (2) our reasoning-in-a-haystack experiments further demonstrate that simultaneously finding the relevant information in a long context and conducting (simple) reasoning is nearly impossible. In this paper, we envision a pathway from LLMs to AGI through the integration of \emph{memory}. We believe AGI should be a system in which LLMs serve as core processors. In addition to raw data, the memory in this system would store a large number of important conclusions derived from reasoning processes. Compared with retrieval-augmented generation (RAG), which merely processes raw data, this approach not only brings semantically related information closer together, but also simplifies complex inferences at query time. As an intermediate stage, the memory will likely take the form of natural-language descriptions, which users can also consume directly. Ultimately, every agent/person should have its own large personal model, a deep neural network model (thus \emph{AI-native}) that parameterizes and compresses all types of memory, even those that cannot be described in natural language. Finally, we discuss the significant potential of AI-native memory as the transformative infrastructure for (proactive) engagement, personalization, distribution, and socialization in the AGI era, as well as the privacy and security challenges it incurs, together with preliminary solutions.

Citations (3)

Summary

  • The paper shows that LLMs’ long-context abilities are overestimated, particularly in handling complex, multi-step reasoning tasks.
  • It introduces a multi-layered AI-native memory system that organizes data from raw inputs to deep neural representations for efficient recall.
  • The study outlines practical challenges such as catastrophic forgetting and infrastructure demands while mapping a clear pathway toward AGI.

AI-native Memory: A Pathway from LLMs Towards AGI

Introduction

The paper "AI-native Memory: A Pathway from LLMs Towards AGI" (2406.18312) examines the limitations of LLMs and potential enhancements on the path to AGI. While LLMs exhibit impressive capabilities in following intricate human instructions and performing multi-step reasoning, the work critically evaluates long-context LLMs and argues that super-long or unlimited context capabilities alone are insufficient for AGI.

Limitations of Long-context LLMs

The authors argue that the perceived ability of LLMs to process super-long contexts effectively is overly optimistic. Two primary assumptions, effective retrieval from long contexts and executing complex reasoning in one step, are scrutinized. Existing literature and experiments, such as the reasoning-in-a-haystack tasks, indicate that current LLMs struggle to maintain performance when both context length and reasoning complexity increase (Figure 1).

Figure 1: An Overview of Reasoning-in-a-Haystack. In this paper, the haystack, needles, and queries are all designed based on real data from Mebot of Mindverse AI.

The paper demonstrates that despite claims of extensive context windows (e.g., up to 128K tokens for models like GPT-4), effective context length is often significantly shorter in practice. For instance, GPT-4's effective context is approximately 64K tokens, contradicting the claimed 128K.
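To make the experimental setup concrete, a reasoning-in-a-haystack test can be sketched as follows. This is a hypothetical harness with illustrative names, not the authors' actual benchmark: "needle" facts are scattered through filler text, and the query can only be answered by finding and chaining multiple needles.

```python
import random

def build_haystack(needles, filler, total_lines=200, seed=0):
    """Scatter needle sentences at distinct random positions inside filler text."""
    rng = random.Random(seed)
    lines = [filler] * total_lines
    # sample() guarantees distinct positions, so no needle overwrites another
    for pos, needle in zip(rng.sample(range(total_lines), len(needles)), needles):
        lines[pos] = needle
    return "\n".join(lines)

# Two needles that must BOTH be found and combined (a 2-hop query):
needles = [
    "Alice's meeting is on the same day as the product launch.",
    "The product launch is scheduled for March 3rd.",
]
haystack = build_haystack(needles, "The sky was a pale shade of grey that morning.")
query = "On what day is Alice's meeting?"  # answering requires chaining both needles
```

Varying the haystack length and the number of hops independently is what lets such an evaluation separate pure retrieval failures from reasoning failures.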

The Necessity of Memory

The paper proposes that merely extending context windows of LLMs is insufficient for AGI. Instead, it champions the development of AI-native memory systems, drawing parallels between AGI and computer architecture, where LLMs function akin to processors and require complementary memory systems akin to hard disk storage.

Memory in this context should extend beyond simple retrieval systems such as retrieval-augmented generation (RAG) and include the storage of reasoning-derived conclusions. This organization and storage would allow LLMs to operate more efficiently and effectively across tasks requiring long-term memory retention and reuse.
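The difference from plain RAG can be pictured with a minimal sketch (illustrative names only, not the paper's implementation): alongside raw chunks, the store keeps conclusions the model has already derived, so a multi-hop inference performed once need not be re-derived at query time.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    """Toy store keeping raw chunks (as plain RAG would) plus conclusions
    derived from earlier reasoning, retrievable without re-inference."""
    raw_chunks: list = field(default_factory=list)
    conclusions: dict = field(default_factory=dict)  # topic -> derived fact

    def add_raw(self, chunk: str):
        self.raw_chunks.append(chunk)

    def add_conclusion(self, topic: str, fact: str):
        self.conclusions[topic] = fact

    def recall(self, topic: str):
        # Prefer a pre-computed conclusion; fall back to raw chunk retrieval.
        if topic in self.conclusions:
            return self.conclusions[topic]
        return [c for c in self.raw_chunks if topic.lower() in c.lower()]

mem = MemoryStore()
mem.add_raw("Alice's meeting is on the same day as the product launch.")
mem.add_raw("The product launch is scheduled for March 3rd.")
# The 2-hop inference is done once and stored as a conclusion:
mem.add_conclusion("Alice's meeting", "Alice's meeting is on March 3rd.")
```

A query about "Alice's meeting" now returns the stored conclusion directly, whereas a raw-chunk-only system would have to retrieve both chunks and re-run the inference on every query.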

Implementing AI-native Memory

The proposed AI-native memory system involves several forms, ranging from raw data storage to sophisticated neural networks that compress and parameterize information beyond lexical descriptions. Memory implementation can be structured in increasingly complex layers:

  1. L0: Raw Data - Analogous to traditional RAG models, serving only as an initial step.
  2. L1: Natural-language Memory - Leveraging natural language for data organization and facilitating user interactions.
  3. L2: AI-Native Memory - A deep neural network model encoding comprehensive, parameterized memory, allowing personalized interaction and application through Large Personal Models (LPMs).
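The three layers might be sketched as follows. This is a toy illustration: the "L2" encoder here is a trivial hash-based stand-in for the deep neural network an actual LPM would use, and all names are assumptions.

```python
def toy_embed(text, dim=8):
    """Stand-in for a learned encoder: hash words into a fixed-size vector."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec

class LayeredMemory:
    """Toy sketch of the L0/L1/L2 layering described above."""
    def __init__(self):
        self.l0_raw = []      # L0: raw documents (what plain RAG indexes)
        self.l1_notes = []    # L1: natural-language conclusions, user-readable
        self.l2_params = []   # L2: compressed vectors (stand-in for an LPM)

    def ingest(self, document, conclusion):
        # Each layer stores a progressively more processed form of the input.
        self.l0_raw.append(document)
        self.l1_notes.append(conclusion)
        self.l2_params.append(toy_embed(conclusion))
```

The point of the layering is that L1 remains directly consumable by users, while L2 can capture regularities that have no concise natural-language description.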

The AI-native memory aims to create a "Memory Palace" for each user, systematizing and organizing data for seamless recall and interaction in AI tasks.

Figure 2: Reasoning-in-a-haystack comparison based on Mebot's real data across different context lengths and hop counts.

Challenges and Future Directions

While AI-native memory systems hold promise, they also present challenges in training efficiency, serving infrastructure requirements, and preventing issues such as catastrophic forgetting. Balancing memory organization with security and privacy concerns remains paramount, especially as memory models become personalized. Proposed solutions, such as using LoRA models for efficient memory processing and deployment, address some of these challenges but require further exploration.
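The LoRA direction mentioned above can be sketched in a few lines (a minimal illustration, not the paper's deployment; shapes and hyperparameters are arbitrary): the large base weight stays frozen and shared across users, while two small low-rank matrices hold what is effectively a per-user parameterized memory that is cheap to store, load, and swap.

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA-adapted linear layer: y = W x + (alpha/rank) * B (A x)."""
    def __init__(self, w_base, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = w_base.shape
        self.w = w_base                              # frozen, shared base weight
        self.a = rng.normal(0, 0.02, (rank, d_in))   # trainable, per-user
        self.b = np.zeros((d_out, rank))             # zero-init: no drift at start
        self.scale = alpha / rank

    def forward(self, x):
        # Low-rank update adds only rank * (d_in + d_out) parameters per user.
        return self.w @ x + self.scale * (self.b @ (self.a @ x))
```

Because `B` is zero-initialized, a freshly attached adapter reproduces the base model exactly, and only the small `A`/`B` factors need to be persisted per user.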

Conclusions

This paper delineates the limitations of long-context LLMs and underscores the necessity of integrating memory systems to advance towards AGI. By transforming data into structured, retrievable memory and coupling it with LPMs, AI systems can become more efficient and effective at complex, personalized tasks. The authors outline a clear pathway forward, emphasizing the critical role of advanced memory models in realizing the AGI vision.
