arxiv:2605.23986

MemForest: An Efficient Agent Memory System with Hierarchical Temporal Indexing

Published on May 16

· Submitted by

CHEN Han on May 26

Authors:

Han Chen,

Wenqi Pei,

Abstract

MemForest presents a memory framework for long-context LLM agents that improves scalability and reduces latency through parallel chunk extraction and hierarchical temporal indexing.

AI-generated summary

Memory is a fundamental component for enabling long-context LLM agents, supporting persistent state across interactions through a continuous serve-and-update lifecycle. Despite substantial prior work, existing systems suffer from significant maintenance overhead due to two key limitations: coarse-grained state management and inherently sequential update pipelines. In particular, updates are often tightly coupled with LLM inference and require full-state rewrites, leading to poor scalability and growing latency as memory accumulates. To address these challenges, we present MemForest, a memory framework that reformulates agent memory as a write-efficient temporal data management problem. MemForest breaks the sequential bottleneck via parallel chunk extraction, decoupling memory construction into concurrent, independent operations. To further eliminate coarse-grained maintenance, we introduce MemTree, a hierarchical temporal index that organizes memory as time-ordered trees rather than flat global summaries. This design replaces full-state rewrites with localized per-node updates, reducing maintenance cost to the affected tree paths while naturally preserving temporally evolving states. We evaluate MemForest on two long-context memory benchmarks, LongMemEval-S and LoCoMo. On LongMemEval-S, MemForest achieves the best overall performance among stateful baselines, reaching 79.8% pass@1 accuracy while sustaining a memory construction throughput approximately 6x higher than state-of-the-art approaches including EverMemOS.
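The parallel chunk extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `extract_facts` is a hypothetical stand-in for an LLM extraction call, and the fixed-size chunking rule and all names are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(turns, size):
    """Split a list of dialogue turns into fixed-size chunks (assumed rule)."""
    return [turns[i:i + size] for i in range(0, len(turns), size)]

def extract_facts(chunk_turns):
    """Hypothetical stand-in for an LLM extraction call on one chunk."""
    return [t for t in chunk_turns if "moved" in t or "works" in t]

def build_memory(turns, size=2, workers=4):
    """Extract facts from all chunks concurrently; each chunk is independent."""
    chunks = chunk(turns, size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so chunk order is preserved.
        per_chunk = list(pool.map(extract_facts, chunks))
    return [fact for facts in per_chunk for fact in facts]

turns = ["hi", "I moved to Berlin", "nice weather", "she works at a lab"]
facts = build_memory(turns)
```

Because no chunk depends on the output of another, the extraction calls can run side by side instead of waiting on a sequential update pipeline.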

Community

Concyclics

Paper author Paper submitter 3 days ago

A latency-optimized, parallel-write agent memory system.

avahal

2 days ago

the most interesting bit for me is the MemTree idea, a time-ordered hierarchical index that localizes maintenance to the touched paths instead of rewriting the whole memory. that per-node update pattern plus the lazy refresh of interval summaries and root rows makes the write path truly parallel and chunk-driven. it also clarifies retrieval: you go from root-based recall to interval-summary guided tree browse, which preserves temporal fidelity without brute-force rewrites. btw, the arxivlens breakdown helped me parse the method details, especially how the session/entity/scene scoped trees interplay. one practical question: how does MemForest fare when memory content becomes highly non-stationary? would dynamic re-scoping keep freshness high without blowing up maintenance?

Concyclics

Paper author about 4 hours ago

Thanks for your interest and the thoughtful question! I think the key concern is whether scopes remain valid when memory becomes non-stationary. We would distinguish two cases.

First, if the change happens within a broad topic, we do not necessarily view it as scope failure. Scene scopes in MemForest are intentionally aggregative: a scene may cover an evolving process such as relocation, project progress, or planning. MemTree handles this intra-scope evolution through its temporal hierarchy: interval summaries capture how the state changes over time, while leaves keep the concrete evidence at specific timestamps.
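A rough sketch of such a temporal hierarchy, under simplifying assumptions, might look as follows. The two-level layout, the fixed interval width, and the string-join "summary" (standing in for an LLM-written summary) are all hypothetical choices for illustration, not the paper's MemTree implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Interval:
    start: int                                  # inclusive start of the time bucket
    leaves: list = field(default_factory=list)  # (timestamp, fact) evidence
    summary: str = ""
    dirty: bool = False

class TimeTree:
    """Assumed two-level layout: root -> fixed-width intervals -> leaves."""

    def __init__(self, width):
        self.width = width
        self.intervals = {}     # bucket start -> Interval
        self.root_summary = ""
        self.root_dirty = False
        self.refreshed = 0      # how many interval summaries were recomputed

    def insert(self, ts, fact):
        """Add a leaf and mark only its own path (interval and root) dirty."""
        start = ts - ts % self.width
        node = self.intervals.setdefault(start, Interval(start))
        node.leaves.append((ts, fact))
        node.leaves.sort()      # keep leaves in time order
        node.dirty = True
        self.root_dirty = True

    def refresh(self):
        """Lazily recompute summaries for dirty nodes only."""
        for node in self.intervals.values():
            if node.dirty:
                # Stand-in for an LLM summary of this interval's leaves.
                node.summary = "; ".join(f for _, f in node.leaves)
                node.dirty = False
                self.refreshed += 1
        if self.root_dirty:
            ordered = sorted(self.intervals)
            self.root_summary = " | ".join(self.intervals[s].summary for s in ordered)
            self.root_dirty = False

tree = TimeTree(width=10)
tree.insert(1, "lives in Paris")
tree.insert(12, "planning a move")
tree.refresh()                      # two intervals summarized
tree.insert(15, "moved to Berlin")
tree.refresh()                      # only the second interval is recomputed
```

In this sketch the leaves keep the timestamped evidence, the interval summaries show how the state evolves, and a new leaf triggers recomputation only along its own path.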

Second, if the change is true cross-topic drift, a scope may become too broad or mixed. In that case, the Forest design provides robustness before re-scoping is needed: session trees preserve chronological evidence, entity trees provide subject-centered access, and scene trees provide semantic grouping. So a stale scene root does not necessarily make the evidence unreachable.

When a scope genuinely becomes stale, we can treat dynamic re-scoping as a bounded maintenance operation: split an over-broad scope into multiple scopes, merge related scopes, or delete obsolete ones, then rematerialize only the affected MemTrees. This is efficient because canonical facts are persistent state, while summaries, embeddings, and root rows are derived views. Therefore, re-scoping does not require replaying raw sessions or rewriting the whole forest.
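The following Python sketch illustrates a scope split in that spirit. It assumes a simple store in which facts are canonical and per-scope summaries are derived views; the `Forest` class, its methods, and the keyword-based `assign` rule are hypothetical and only meant to show that untouched scopes are not rebuilt.

```python
class Forest:
    """Hypothetical store: canonical facts plus derived per-scope views."""

    def __init__(self):
        self.facts = {}      # fact_id -> (timestamp, text): canonical state
        self.scope_of = {}   # fact_id -> scope name
        self.views = {}      # scope name -> derived summary (rebuildable)
        self.rebuilds = []   # log of scopes that were rematerialized

    def add(self, fact_id, ts, text, scope):
        self.facts[fact_id] = (ts, text)
        self.scope_of[fact_id] = scope

    def materialize(self, scope):
        """Rebuild one scope's derived view from canonical facts."""
        members = sorted(self.facts[f] for f, s in self.scope_of.items() if s == scope)
        if members:
            self.views[scope] = " -> ".join(text for _, text in members)
        else:
            self.views.pop(scope, None)   # empty scope: drop its view
        self.rebuilds.append(scope)

    def split(self, old, assign):
        """Reassign facts of `old` via assign(text); rebuild only touched scopes."""
        touched = {old}
        for f, s in list(self.scope_of.items()):
            if s == old:
                new = assign(self.facts[f][1])
                self.scope_of[f] = new
                touched.add(new)
        for scope in sorted(touched):
            self.materialize(scope)

forest = Forest()
forest.add("f1", 1, "found flat in Berlin", "life")
forest.add("f2", 2, "started new job", "life")
forest.add("f3", 3, "adopted a cat", "pets")
for s in ("life", "pets"):
    forest.materialize(s)
forest.rebuilds.clear()

# Split the over-broad "life" scope; "pets" is never touched.
forest.split("life", lambda text: "relocation" if "flat" in text else "career")
```

Here the split reads only canonical facts and rematerializes three scopes, while the unrelated `pets` view is left as it was.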

Our efficient migration experiment in Section 5.6 also supports this point to some extent: it shows that already materialized memory states can be reorganized or merged much faster than sequentially rewriting sessions. Scope split, merge, and deletion follow the same principle: operate on canonical memory state and selectively refresh affected trees. Therefore, dynamic adjustment should remain bounded by the affected scopes rather than the global memory size.

I hope this clarifies how we think about freshness and maintenance under non-stationary memory.