How to use from vLLM
# Gated model: log in with an HF token that has gated-access permission
hf auth login
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "WithinUsAI/Infinite.Code.III"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "WithinUsAI/Infinite.Code.III",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
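The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000. The helper names `build_completion_request` and `complete` are illustrative and are not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000"):
    """Hypothetical helper: build the same POST request as the curl call."""
    payload = {
        "model": "WithinUsAI/Infinite.Code.III",
        "prompt": prompt,
        "max_tokens": 512,
        "temperature": 0.5,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt):
    """Send the request and return the first completion's text.

    Requires a running server, so it is defined here but not called.
    """
    with urllib.request.urlopen(build_completion_request(prompt)) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("Once upon a time,")` while the server is running should return the generated continuation as a string.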
Use Docker
docker model run hf.co/WithinUsAI/Infinite.Code.III

Infinite.Code.III — Recursive Language Model

"Not a Large Language Model. A Recursive Mind."

Overview

Infinite.Code.III is a 1.210B-parameter Recursive Language Model (RLM) built from scratch as a unified Hybrid Mind architecture. Unlike standard LLMs, which apply a fixed forward-pass transformer, Infinite.Code.III integrates Self-Automated (S.A.) learning systems as architectural primitives: they are not pipeline steps but components of the decoder layers themselves.

| Property | Value |
|---|---|
| Parameters | 1.210B |
| Context Window | 1,000,000 tokens |
| Architecture | Recursive Language Model (RLM) |
| Attention | Grouped-Query Attention (GQA) 10/5 heads |
| Positional Encoding | RoPE (θ = 500,000, long-ctx scaled) |
| FFN | Alternating Dense / Mixture-of-Experts (8 experts, top-2) |
| Vocabulary | 65,536 BPE tokens |
| Layers | 20 |
| Hidden Size | 1280 |
| Weight Format | safetensors (bfloat16 trained, float32 saved) |
| Modalities | Text · Image · Audio · Video |
| License | Apache 2.0 |

S.A. System Architecture

S.A. Meta Learning

Each layer has a learnable adaptive_alpha scalar (sigmoid-gated) that blends the transformed output with the layer's top-of-layer residual. This is the meta-learning channel — it learns how much each transformation contributes per layer.
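As a rough illustration of the gating described above, the sketch below blends a transformed vector with its residual using a sigmoid-gated scalar. The convex-combination form and the function name `blend_with_residual` are assumptions; the model's actual blend may differ.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def blend_with_residual(transformed, residual, adaptive_alpha):
    """Blend a layer's output with its residual (assumed convex combination)."""
    gate = sigmoid(adaptive_alpha)  # maps the learnable scalar into (0, 1)
    return [gate * t + (1.0 - gate) * r for t, r in zip(transformed, residual)]


# adaptive_alpha = 0.0 gives gate = 0.5, an equal mix of both inputs.
print(blend_with_residual([2.0, 4.0], [0.0, 0.0], adaptive_alpha=0.0))  # [1.0, 2.0]
```

Under this assumed form, a large positive `adaptive_alpha` lets the transformation dominate, while a large negative value passes the residual through almost unchanged.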

S.A. Reinforcement Learning

RewardHead (D → 512 → 1 scalar) attaches to the final hidden states. During RL fine-tuning (RLHF / GRPO), this head provides the value signal. Pass output_reward=True during rollout collection.

S.A. Continual Learning

HybridMemory LTM uses exponential moving average write-back (0.95 × old + 0.05 × new) — knowledge accumulates across forward passes without overwriting, resisting catastrophic forgetting.
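The write-back rule can be sketched for a single memory slot as follows. The function name `ema_write_back` and the list-of-floats representation are illustrative assumptions; only the 0.95 / 0.05 weights come from the description above.

```python
def ema_write_back(old_slot, new_value, old_weight=0.95, new_weight=0.05):
    """Exponential moving average update: 0.95 * old + 0.05 * new."""
    return [old_weight * o + new_weight * n for o, n in zip(old_slot, new_value)]


# A single write moves the slot only 5% of the way toward the new value.
print(ema_write_back([1.0, 0.0], [0.0, 1.0]))  # [0.95, 0.05]

# Repeated writes of the same value converge toward it gradually.
slot = [0.0]
for _ in range(10):
    slot = ema_write_back(slot, [1.0])
```

After k identical writes starting from zero, the slot holds 1 - 0.95^k of the new value, which is why a single forward pass cannot overwrite what is already stored.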

S.A. Adaptive Learning

The per-layer adaptive_alpha gate is trained end-to-end, self-calibrating each layer's write strength to the residual stream.

S.A. Rewriting Learning

Every 3rd layer runs RewriteAttention — a 4-head causal self-attention pass that lets the model revise its own intermediate token representations within a single forward pass.

S.A. NLP + S.A. Problem Solving

MetaOutputMixer at decoder output applies a 3-way soft gate (language / code / math-logic) via NLPGate. The final representation is a content-adaptive weighted mixture of three parallel projections.
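A minimal sketch of a 3-way soft gate, assuming the gate is a softmax over three logits and that the three projections have already been computed. The names `softmax` and `mix_outputs` are hypothetical, and the real NLPGate may compute its weights differently.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_outputs(gate_logits, language, code, math_logic):
    """Weighted mixture of three parallel projections (assumed softmax gate)."""
    weights = softmax(gate_logits)
    branches = (language, code, math_logic)
    return [
        sum(w * branch[i] for w, branch in zip(weights, branches))
        for i in range(len(language))
    ]


# Equal gate logits weight each branch by 1/3.
mixed = mix_outputs([0.0, 0.0, 0.0], [3.0], [6.0], [9.0])
```

Because the weights sum to 1, the mixed representation stays on the same scale as the individual projections.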

S.A. Innovation Learning

Odd-numbered layers use MoELayer — 8 experts, top-2 routing, each a SwiGLU FFN with 2048-dim intermediate.
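Top-2 routing over 8 experts can be sketched as below. Renormalizing the two selected weights so they sum to 1 is a common convention and an assumption here; the function name `top2_route` is illustrative.

```python
import math


def top2_route(router_logits):
    """Return [(expert_index, weight), ...] for the two highest-scoring experts."""
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the two largest probabilities, highest first.
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)  # assumed renormalization over the pair
    return [(i, probs[i] / norm) for i in top2]


# One router logit per expert (8 experts); experts 3 and 5 score highest.
routes = top2_route([0.0, 0.1, 0.2, 3.0, 0.3, 2.0, 0.4, 0.5])
```

Each token's FFN output would then be the weighted sum of the two selected experts' outputs, so only 2 of the 8 expert FFNs run per token.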

S.A. Debugging

DebugHookManager is a gradient hook registry. Set debug_mode: true in the config to log the mean absolute gradient of the embedding and of any registered tensor. It adds no cost when disabled.

S.A. Advanced Long/Short-Term Memory

HybridMemory (every 4th layer):

  • STM: 512-slot soft-attention read buffer (refreshed each pass)
  • LTM: 2048-slot persistent EMA key-value store (continual write-back)
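The placement rules stated so far (MoE on odd-numbered layers, RewriteAttention every 3rd layer, HybridMemory every 4th layer) can be combined into a per-layer schedule. This sketch assumes layers are numbered from 1 and that "every Nth" means layer numbers divisible by N; the model's code may index or offset layers differently.

```python
NUM_LAYERS = 20


def layer_components(layer_number):
    """List the optional components of a layer (assumes 1-based numbering)."""
    components = ["MoE FFN" if layer_number % 2 == 1 else "Dense FFN"]
    if layer_number % 3 == 0:
        components.append("RewriteAttention")
    if layer_number % 4 == 0:
        components.append("HybridMemory")
    return components


for n in range(1, NUM_LAYERS + 1):
    print(n, layer_components(n))
```

Under these assumptions, layer 12 is the only layer that combines a dense FFN with both RewriteAttention and HybridMemory.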

S.A. Recursive Seed Learning

RecursiveSeedGate runs on every layer and performs depth-4 intra-layer recursion. At each step it derives a 256-dim seed vector, projects it to the full hidden size D, gates the result with a sigmoid, and re-seeds from the updated h. This creates feedback loops within a single layer.
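The recursion can be sketched on plain Python lists. How the sigmoid gate is applied to h is not specified above, so the multiplicative update here is an assumption, as are the names `matvec` and `recursive_seed_gate` and the toy dimensions (D = 8, seed size 4 instead of 1280 and 256).

```python
import math
import random


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def recursive_seed_gate(h, w_down, w_up, depth=4):
    """Compress h to a seed, expand it, gate h, and repeat from the updated h."""
    for _ in range(depth):
        seed = matvec(w_down, h)       # D -> seed_dim
        expanded = matvec(w_up, seed)  # seed_dim -> D
        gate = [1.0 / (1.0 + math.exp(-v)) for v in expanded]  # sigmoid
        h = [h_i * g_i for h_i, g_i in zip(h, gate)]  # assumed multiplicative gate
    return h


rng = random.Random(0)
D, SEED_DIM = 8, 4  # toy sizes for illustration only
w_down = [[rng.uniform(-0.1, 0.1) for _ in range(D)] for _ in range(SEED_DIM)]
w_up = [[rng.uniform(-0.1, 0.1) for _ in range(SEED_DIM)] for _ in range(D)]
h_in = [rng.uniform(-1.0, 1.0) for _ in range(D)]
h_out = recursive_seed_gate(h_in, w_down, w_up)
```

Each pass depends on the h produced by the previous pass, which is what makes the loop a feedback loop rather than four independent gates.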


Multimodal Inputs

| Modality | Projector | Input Shape |
|---|---|---|
| Image | ImageProjector Linear(1024→2560→1280) | (B, N_patches, 1024) |
| Audio | AudioProjector GRU(80→512) + Linear | (B, T_frames, 80) |
| Video | VideoProjector Linear + TransformerEncoderLayer | (B, F_frames, 1024) |
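As a worked example for the image path, the sketch below tracks the shape change and counts the parameters of a Linear(1024→2560→1280) stack. It assumes two linear layers with bias terms and no other learnable parameters; the helper names are illustrative.

```python
def linear_params(in_features, out_features, bias=True):
    """Parameter count of one linear layer."""
    return in_features * out_features + (out_features if bias else 0)


def image_projector_output_shape(batch, n_patches, hidden_size=1280):
    """(B, N_patches, 1024) is projected to (B, N_patches, hidden_size)."""
    return (batch, n_patches, hidden_size)


image_projector_params = linear_params(1024, 2560) + linear_params(2560, 1280)
print(image_projector_params)                # 5902080
print(image_projector_output_shape(2, 256))  # (2, 256, 1280)
```

The output width of 1280 matches the model's hidden size, so projected patches can sit alongside text token embeddings in the same sequence.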

Fine-Tuning

SFT Recommended Hyperparameters

| Setting | Value |
|---|---|
| Learning Rate | 2e-5 |
| LR Schedule | cosine + 100-step warmup |
| Batch Size | 1–4 per GPU + grad accumulation ×8 |
| Max Seq Length | start at 8192, scale to 1M |
| Precision | bfloat16 |
| Optimizer | AdamW (β₁=0.9, β₂=0.95, ε=1e-8, wd=0.1) |
| Grad Clip | 1.0 |
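The schedule and batch settings above can be sketched as follows. Linear warmup, decay to zero, and the `total_steps` parameter are assumptions, since the recommendation only specifies "cosine + 100-step warmup"; the function names are illustrative.

```python
import math

PEAK_LR = 2e-5
WARMUP_STEPS = 100


def learning_rate(step, total_steps):
    """Linear warmup for 100 steps, then cosine decay to zero (assumed shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * min(1.0, progress)))


def effective_batch_size(per_gpu_batch, num_gpus, grad_accum_steps=8):
    """Sequences contributing to each optimizer step."""
    return per_gpu_batch * num_gpus * grad_accum_steps


print(effective_batch_size(per_gpu_batch=4, num_gpus=2))  # 64
```

With a per-GPU batch of 1 to 4 and ×8 accumulation, a single GPU sees an effective batch of 8 to 32 sequences per optimizer step.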

RLHF / GRPO

The reward_head is the built-in value model. Pass output_reward=True during rollout. The scalar is differentiable — plug directly into TRL GRPOTrainer.
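In GRPO, each completion is scored relative to the other completions sampled for the same prompt. The sketch below shows that normalization step, assuming the scalar rewards have already been collected during rollout. The function name and the use of the population standard deviation are assumptions, and this is not TRL's own implementation.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; a library may use sample std
    return [(r - mean) / (std + eps) for r in rewards]


# Three completions for one prompt, with hypothetical reward scalars.
advantages = group_relative_advantages([1.0, 2.0, 3.0])
```

Completions that score above the group mean receive a positive advantage and those below receive a negative one; when all rewards in a group are equal, every advantage is zero.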


Citation

@misc{infinite_code_iii_2025,
  title   = {Infinite.Code.III: A Recursive Language Model with Self-Automated Learning},
  author  = {GODsStrongestSoldier},
  year    = {2025},
  url     = {https://huggingface.co/GODsStrongestSoldier/Infinite.Code.III},
  note    = {1.210B Recursive Language Model, 1M context window}
}