Adaptive Focus Memory for Language Models
By: Christopher Cruz
Potential Business Impact:
Helps chatbots remember important details cheaply.
Large language models (LLMs) are increasingly deployed in multi-turn dialogue settings, but their behavior is still bottlenecked by fixed context windows and naive memory strategies. Replaying the full conversation at every turn is simple but expensive, while static summarization or recency-only heuristics often erase safety-critical user details. We present Adaptive Focus Memory (AFM), a dynamic context manager that assigns each past message one of three fidelity levels (FULL, COMPRESSED, or PLACEHOLDER) based on semantic similarity to the current query, half-life recency weighting, and importance classification. AFM packs messages chronologically under a strict token budget, preferring high fidelity for the most relevant turns while aiming to preserve a cheap trace of the rest of the dialogue. In a safety-oriented benchmark involving a user with a severe peanut allergy planning a trip to Thailand, AFM retains the allergy information across both short and medium-length conversations, matches the safety performance of naive replay, and cuts average token usage by 66% relative to the replay baseline. We release a modular Python implementation of AFM designed for OpenAI-compatible APIs and offline operation, enabling practitioners to reduce inference cost without sacrificing safety or factual continuity in the evaluated scenario.
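The released implementation is not reproduced here, so the following is a minimal sketch of the mechanism the abstract describes: score each past message by blending query similarity with half-life recency decay (pinning importance-flagged turns high), map the score to a fidelity level, and emit messages chronologically under a token budget, degrading to cheaper fidelities when the budget would be exceeded. All names, thresholds (0.5 and 0.2), weights (0.6/0.4), and the eight-turn half-life are illustrative assumptions, not the paper's values; `similarity` is a crude lexical stand-in for embedding similarity and `compress` stands in for an LLM summarizer.

```python
import math
from dataclasses import dataclass
from enum import Enum

class Fidelity(Enum):
    FULL = "FULL"
    COMPRESSED = "COMPRESSED"
    PLACEHOLDER = "PLACEHOLDER"

@dataclass
class Message:
    role: str
    text: str
    turn: int                # 0-based turn index; higher means more recent
    important: bool = False  # e.g. set upstream by an importance classifier

def similarity(a: str, b: str) -> float:
    # Stand-in for embedding cosine similarity: normalized token overlap.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / math.sqrt(len(ta) * len(tb)) if ta and tb else 0.0

def focus_score(msg: Message, query: str, now: int, half_life: float = 8.0) -> float:
    # Blend relevance with half-life recency decay: the recency weight
    # halves every `half_life` turns. Importance pins the score at 1.0
    # so safety-critical turns (e.g. an allergy) keep FULL fidelity.
    recency = 0.5 ** ((now - msg.turn) / half_life)
    score = 0.6 * similarity(msg.text, query) + 0.4 * recency
    return 1.0 if msg.important else score

def compress(text: str, max_words: int = 15) -> str:
    # Stand-in for an LLM summarizer: naive truncation.
    words = text.split()
    return text if len(words) <= max_words else " ".join(words[:max_words]) + " ..."

def pack_context(history: list[Message], query: str, budget: int = 400) -> list[dict]:
    # Assign a fidelity by score, then emit messages in chronological order
    # under the token budget, falling back to cheaper renderings when needed.
    if not history:
        return []
    now = max(m.turn for m in history)
    packed, used = [], 0
    for m in history:
        s = focus_score(m, query, now)
        if s >= 0.5:
            candidates = [m.text, compress(m.text), f"[turn {m.turn} elided]"]
        elif s >= 0.2:
            candidates = [compress(m.text), f"[turn {m.turn} elided]"]
        else:
            candidates = [f"[turn {m.turn} elided]"]
        for text in candidates:
            cost = len(text.split())  # crude whitespace token estimate
            if used + cost <= budget:
                packed.append({"role": m.role, "content": text})
                used += cost
                break
    return packed
```

The chronological emit order preserves dialogue structure even when most turns are placeholders, which is what lets a cheap trace of the conversation survive under tight budgets; in a real system the score weights, thresholds, and half-life would be tuned, and the output dictionaries are already in the message format expected by OpenAI-compatible chat APIs.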
Similar Papers
SimpleMem: Efficient Lifelong Memory for LLM Agents
Artificial Intelligence
Makes AI remember more with less effort.
ContextFocus: Activation Steering for Contextual Faithfulness in Large Language Models
Computation and Language
Makes AI trust new facts over old ones.
A Simple Yet Strong Baseline for Long-Term Conversational Memory of LLM Agents
Computation and Language
Lets chatbots remember long talks better.