ENGRAM: Effective, Lightweight Memory Orchestration for Conversational Agents
By: Daivik Patel, Shrenik Patel
Potential Business Impact:
Lets computers remember conversations for a long time.
Large language models (LLMs) deployed in user-facing applications require long-horizon consistency: the ability to remember prior interactions, respect user preferences, and ground reasoning in past events. However, contemporary memory systems often adopt complex architectures such as knowledge graphs, multi-stage retrieval pipelines, and OS-style schedulers, which introduce engineering complexity and reproducibility challenges. We present ENGRAM, a lightweight memory system that organizes conversation into three canonical memory types (episodic, semantic, and procedural) through a single router and retriever. Each user turn is converted into typed memory records with normalized schemas and embeddings, which are stored in a database. At query time, the system retrieves the top-k dense neighbors for each type, merges the results with simple set operations, and provides the most relevant evidence as context to the model. ENGRAM attains state-of-the-art results on LoCoMo, a multi-session conversational QA benchmark for long-horizon memory, and exceeds the full-context baseline by 15 points on LongMemEval while using only about 1% of the tokens. These results show that careful memory typing and straightforward dense retrieval can enable effective long-term memory management in language models without requiring complex architectures.
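To make the write/read loop concrete, here is a minimal sketch of the typed-memory pattern the abstract describes: each turn is routed to one of the three memory types, embedded, and stored; at query time the top-k dense neighbors per type are merged by simple set union. Everything below is illustrative rather than ENGRAM's actual implementation: the keyword router, the toy hash embedding (standing in for a real sentence-embedding model), and names such as EngramStore, route, and embed are assumptions for the sketch.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

MEMORY_TYPES = ("episodic", "semantic", "procedural")

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy bag-of-words hash embedding, normalized to unit length.
    A stand-in for a real dense sentence-embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def route(text: str) -> str:
    """Crude keyword heuristic standing in for the paper's single router,
    which assigns each turn one of the three canonical memory types."""
    tokens = set(text.lower().split())
    if tokens & {"always", "never", "prefer", "whenever"}:
        return "procedural"   # habits and standing instructions
    if tokens & {"is", "are", "was", "means"}:
        return "semantic"     # facts about the user or the world
    return "episodic"         # events tied to a specific turn

@dataclass
class Record:
    mem_type: str
    text: str
    embedding: list[float]

class EngramStore:
    """Typed memory store: one bucket of records per canonical type."""

    def __init__(self) -> None:
        self.buckets: dict[str, list[Record]] = defaultdict(list)

    def write(self, turn: str) -> None:
        # Route the turn to a type, embed it, and store the typed record.
        mem_type = route(turn)
        self.buckets[mem_type].append(Record(mem_type, turn, embed(turn)))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        """Top-k dense neighbors per type, merged by set union."""
        q = embed(query)
        merged: dict[str, Record] = {}
        for mem_type in MEMORY_TYPES:
            # Vectors are unit-normalized, so dot product = cosine similarity.
            scored = sorted(
                self.buckets[mem_type],
                key=lambda r: sum(a * b for a, b in zip(q, r.embedding)),
                reverse=True,
            )[:k]
            for rec in scored:  # dict keyed by text deduplicates the union
                merged[rec.text] = rec
        return [r.text for r in merged.values()]

store = EngramStore()
store.write("I always prefer vegetarian restaurants.")
store.write("My sister's name is Priya.")
store.write("Last Tuesday we talked about my trip to Lisbon.")
print(store.retrieve("Where did I travel recently?"))
```

Keeping one bucket per type means retrieval is just k independent nearest-neighbor lookups, and the merge stays a plain set union, which is the kind of architectural simplicity the abstract argues is sufficient.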
Similar Papers
Reuse, Don't Recompute: Efficient Large Reasoning Model Inference via Memory Orchestration
Multiagent Systems
Lets computers remember answers to save time.
A Simple Yet Strong Baseline for Long-Term Conversational Memory of LLM Agents
Computation and Language
Lets chatbots remember long talks better.
LiCoMemory: Lightweight and Cognitive Agentic Memory for Efficient Long-Term Reasoning
Information Retrieval
Gives AI a better memory for long talks.