# LeWorld Memory Architecture 🧠⚡

A CPU-inspired hierarchical neural architecture where 3 Small LeWorld Models (SLMs) compete to find the most useful memory for 1 Big LeWorld Model (BLM) to predict the next world state.

## Architecture

| Component | Parameters | Role |
|---|---|---|
| Artificial Memory | 21K | Bit-level storage (64K words × 32 bits) + learned bit encoder/decoder |
| SLM-0 | 745K | State → memory address range |
| SLM-1 | 745K | State → memory address range |
| SLM-2 | 745K | State → memory address range |
| BLM | 11.2M | SLM selector [1,0,1] + next-state predictor + info requester |
| Total | 13.5M | |
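
For intuition, the sketch below shows what such a bit-level memory could look like in PyTorch. This is a minimal illustration, assuming the raw bits are a non-learned buffer and only the word decoder is learned; `BitMemorySketch` and `read_range` are hypothetical names, not the actual classes in `leworld_architecture.py`.

```python
import torch
import torch.nn as nn

class BitMemorySketch(nn.Module):
    """Hypothetical sketch: 64K words x 32 bits, read RAM-style by address range."""

    def __init__(self, num_words: int = 65536, word_bits: int = 32, dim: int = 64):
        super().__init__()
        # Raw {0, 1} storage, held as a buffer rather than a learned parameter.
        self.register_buffer("bits", torch.randint(0, 2, (num_words, word_bits)).float())
        # Learned decoder from a 32-bit word to a dense feature vector.
        self.decode = nn.Linear(word_bits, dim)

    def read_range(self, start: int, length: int) -> torch.Tensor:
        # READ(addr_range): fetch a contiguous block of words and decode it.
        words = self.bits[start : start + length]  # (length, word_bits)
        return self.decode(words)                  # (length, dim)
```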

## Key Ideas

1. CPU-Style Memory: actual bit-level storage (64K × 32-bit words), accessed by address ranges, just like RAM
2. Product-Key Addressing: SLMs output addresses by predicting a high byte (256 choices) + a low byte (256 choices) = 65,536 addresses from only 512 logits (see the sketch after this list)
3. Binary SLM Routing: the BLM selects which SLMs to trust via a straight-through sigmoid → hard [1,0,1] in the forward pass, differentiable in the backward pass (sketched after the data-flow diagram below)
4. Active Information Request: the BLM generates "what do I need next?" queries that modulate SLM memory search at the next timestep
5. 3-Phase Training: pre-train → joint end-to-end → info-request refinement with a paired-branch reward (sketched after the Quick Start)
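
To make idea 2 concrete, here is a hedged sketch of product-key address selection. The head names and the hard argmax decode are assumptions; the repo's differentiable version may instead score the full 256 × 256 sum and take a top-k.

```python
import torch

def select_address(high_logits: torch.Tensor, low_logits: torch.Tensor) -> torch.Tensor:
    """Combine a 256-way high-byte head and a 256-way low-byte head into one address."""
    # The joint score of address (h, l) is high[h] + low[l], so its argmax
    # factorizes and each byte can be picked independently.
    high = high_logits.argmax(dim=-1)  # high byte: 256 choices
    low = low_logits.argmax(dim=-1)    # low byte: 256 choices
    return high * 256 + low            # full address in [0, 65535]

addr = select_address(torch.randn(4, 256), torch.randn(4, 256))  # (4,) int64
```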

## Data Flow

```
                   ┌──────────────────────────────┐
                   │      ARTIFICIAL MEMORY       │
                   │ [0][1][0][1]...[1][0][1][0]  │
                   │   64K words × 32 bits each   │
                   └──────────────┬───────────────┘
                                  │ READ(addr_range)
            ┌─────────────────────┼─────────────────────┐
     ┌──────▼──────┐     ┌────────▼───────┐      ┌──────▼──────────┐
     │   SLM-0     │     │    SLM-1       │      │     SLM-2       │
     │  (745K)     │     │   (745K)       │      │    (745K)       │
     │ past_state  │     │ past_state     │      │ past_state      │
     │ curr_state  │     │ curr_state     │      │ curr_state      │
     │ character.  │     │ character.     │      │ character.      │
     │  → addr     │     │  → addr        │      │  → addr         │
     └──────┬──────┘     └────────┬───────┘      └──────┬──────────┘
            │                     │                     │
            └─────────►  BLM (11.2M)  ◄─────────────────┘
                     mask = [1, 0, 1]
                     → next_state prediction
                     → "what info do I need next?"
```

## Files

| File | Description |
|---|---|
| `leworld_architecture.py` | All model definitions: Memory, SLM, BLM, full system (~990 lines) |
| `leworld_training.py` | 3-phase training pipeline, data generation, evaluation (~820 lines) |
| `PLAN.md` | Complete design document with literature references |

## Quick Start

```python
from leworld_architecture import LeWorldSystem, MemoryConfig, SLMConfig, BLMConfig
from leworld_training import run_training, TrainingConfig

# Build system
system = LeWorldSystem(MemoryConfig(), SLMConfig(), BLMConfig())

# Train (3 phases: pre-train → joint → refine)
metrics = run_training(system, TrainingConfig())
```
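
The refine phase trains the info requester with the paired-branch reward from idea 5. The sketch below is only an illustration of that comparison; `system.predict` and its `info_request` keyword are hypothetical stand-ins, not the actual API in `leworld_training.py`.

```python
import torch.nn.functional as F

def paired_branch_reward(system, state, target, info_request):
    # Branch A: predict the next state using the BLM's info request.
    pred_with = system.predict(state, info_request=info_request)
    # Branch B: the same step with the request disabled (baseline).
    pred_without = system.predict(state, info_request=None)
    # The reward is positive exactly when the info request reduced the loss.
    return F.mse_loss(pred_without, target) - F.mse_loss(pred_with, target)
```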

## Literature Foundation

| Paper | What we borrowed |
|---|---|
| Gumbel-Softmax | Straight-through sigmoid for binary routing |
| Switch Transformers | Gate-value scaling, load-balance loss |
| Product Key Memory | Address decomposition into sub-keys |
| LM2 | LSTM-style memory gates |
| NAMM | Binary memory eviction |
| ProactAgent | Paired-branch reward for retrieval decisions |
| Mamba | Explicit state maintenance |

## Verified Results (demo run)

```
Phase 1: SLM loss 12.87 → 7.13, BLM loss 0.39 → 0.33
Phase 2: Routing becomes diverse - SLM usage: [0.72, 0.79, 0.67]
Phase 3: Info-request improves predictions by 19.5 loss units vs baseline

Final: MSE=0.36, Routing entropy=0.70
Per-step MSE: [0.64, 0.44, 0.31, 0.23, 0.19]  ← improves over time
Routing patterns: [1,0,1] → [0,1,1] → [1,1,1] → [1,1,0] → [0,1,0]
```