Zest — Local AI Compression Model for Squeezr

Zest is a fine-tuned 0.8B model that compresses coding tool outputs (bash, git, test runners, file reads) to save context window tokens. It is designed to run locally via Ollama as the AI backend for Squeezr.

Quick install

# Install via Squeezr wizard (recommended)
squeezr zest

Or manually:

ollama pull ramosvs/zest  # coming soon
# Or use the GGUF directly:
ollama create zest -f Modelfile.zest

What it does

  • Input: raw coding tool output (git diff, npm install, test failure, file read...)
  • Output: compressed version preserving errors, paths, function names, key values
  • Typical savings: 52–72% on real Claude Code tool outputs (>5K chars)
  • Minimum input: 1500 chars (smaller inputs may come out longer than they went in; Squeezr's safety net handles that case)
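
The size threshold and the safety net from the list above can be sketched in Python. This is a minimal illustration, not Squeezr's actual implementation: the function names should_compress, pick_output and savings_percent are hypothetical, the fallback rule (keep the original when the compressed text is not shorter) is an assumption about what the safety net does, and savings are measured here in characters rather than tokens.

```python
MIN_CHARS = 1500  # minimum input size quoted in the list above


def should_compress(raw: str, min_chars: int = MIN_CHARS) -> bool:
    """Return True only for inputs long enough to be worth compressing."""
    return len(raw) >= min_chars


def pick_output(raw: str, compressed: str) -> str:
    """Hypothetical safety net: keep the original if compression did not shrink it."""
    return compressed if len(compressed) < len(raw) else raw


def savings_percent(raw: str, compressed: str) -> float:
    """Percentage of characters saved relative to the raw input."""
    if not raw:
        return 0.0
    return 100.0 * (len(raw) - len(compressed)) / len(raw)
```

With these helpers, a caller would first check should_compress, send only qualifying inputs to the model, and pass the result through pick_output so that an expanded output never replaces the original.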

Performance

| Metric | Value |
| --- | --- |
| eval_loss | 0.4422 |
| eval_accuracy | 89.12% |
| Input size sweet spot | ≥5K chars |
| Compression on large inputs | 52–72% |

Training

Fine-tuned from Qwen3.5-0.8B using LoRA (r=16, α=32) on a distillation dataset of 1,111 training pairs generated by Claude Opus 4.7. Dataset covers 50+ categories: git, test runners, build tools, docker, kubectl, npm, stack traces, MCP responses, etc.

Usage with Ollama

FROM zest-Q4_K_M.gguf
SYSTEM """You are compressing a coding tool output to save tokens. Extract ONLY what is essential: errors, file paths, function names, test failures, key values, warnings. Be extremely concise, target under 150 tokens. Output only the compressed content, nothing else."""
PARAMETER temperature 0
PARAMETER top_p 1
PARAMETER top_k 1
PARAMETER num_predict 300
PARAMETER num_ctx 2048
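
Once a model has been created from this Modelfile, it can be called outside Squeezr through Ollama's HTTP API. The sketch below is one possible way to do that with only the Python standard library. It assumes the model was created under the name zest and that Ollama listens on its default address; build_request and compress are hypothetical helper names, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default Ollama address


def build_request(tool_output: str, model: str = "zest") -> urllib.request.Request:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": tool_output, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compress(tool_output: str, model: str = "zest", timeout: float = 60.0) -> str:
    """Send the tool output to the local model and return its compressed text."""
    request = build_request(tool_output, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # With "stream": False, Ollama returns one JSON object with a "response" field.
    return body["response"].strip()
```

The system prompt and sampling parameters come from the Modelfile, so the request only needs to carry the raw tool output as the prompt.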

Integration with Squeezr

After squeezr zest has configured everything, add the following to ~/.squeezr/squeezr.toml:

[compression]
ai_compression = true
ai_min_chars = 1500
[local]
enabled = true
upstream_url = "http://localhost:11434"
compression_model = "zest"