Fahrenheit Research

FR-Forge-1.7B

A Thin Language Model for manufacturing. Grounded, specialized, and running on your laptop.

GROUNDED.   SPECIALIZED.   LOCAL.

License Base Format Domain Website GitHub Sibling


Contents

  1. Overview
  2. How it works
  3. Specifications
  4. Quickstart
  5. Intended use
  6. Evaluation
  7. Limitations
  8. Training
  9. Citation

Overview

FR-Forge-1.7B is a Thin Language Model (TLM) for the manufacturing sector, by Fahrenheit Research. Sibling to FR-Lex (legal). It is a small, 4-bit, MLX model fine-tuned with LoRA adapters on a curated manufacturing instruction corpus, designed to run locally on Apple Silicon.

  Where a Thin Language Model fits

  specialization
     ▲
     │   ● FR-Forge 1.7B
     │     narrow · local · cheap
     │
     │                              ● Frontier LLM
     │                                broad · hosted · costly
     └──────────────────────────────────────────────▶ generality

How it works

  Manufacturing question
            │
            ▼
  ┌─────────────────────────────────────────────────────┐
  │ FR-Forge 1.7B                                         │
  └─────────────────────────────────────────────────────┘
            │
   reasons across three pillars
            │
            ├─ Ops & Maintenance        SOPs · PM/PdM · OEE · troubleshooting
            ├─ Quality & Compliance     ISO 9001 · IATF 16949 · FDA/GMP · CAPA · FMEA · SPC
            └─ Supply Chain & Planning  MRP/ERP · BOMs · inventory · demand planning
            │
            ▼
   Grounded, local answer  (verify exact clauses against the controlling standard)

It is an assistant, not a certified authority. It is not a substitute for the controlling standard, safety sign-off, or regulatory advice.

Specifications

Base Qwen3-1.7B (4-bit, MLX)
Parameters 1.7B · 4-bit
Method LoRA adapters, fused
Runtime MLX (Apple Silicon)
Languages English
License Apache-2.0

Quickstart

Runs locally with MLX on Apple Silicon. Apply a repetition penalty to avoid looping.

pip install -U mlx-lm
python3 -m mlx_lm generate \
  --model FahrenheitResearch/FR-Forge-1.7B \
  --repetition-penalty 1.15 \
  --max-tokens 500 \
  --prompt "How should I set safety stock for a long-lead component with variable demand?"

Recommended generation settings: repetition_penalty = 1.15 (raise to 1.3 if output repeats). A system prompt of "You are FR-Forge, a manufacturing domain assistant." matches training.

Intended use

Assisting manufacturing teams with everyday domain questions across three pillars:

  FR-Forge covers
  ├─ Shop-floor ops & maintenance   SOPs, work instructions, equipment manuals, PM/PdM, OEE, troubleshooting
  ├─ Quality & compliance           ISO 9001, IATF 16949, ISO 13485, FDA/GMP, CAPA, FMEA, SPC, MSA, audits
  └─ Supply chain & planning         MRP/ERP, procurement, BOMs, inventory, demand planning, suppliers

Out of scope. Safety, regulatory, or compliance sign-off; anything that must be exact (part numbers, clause text, customer-specific requirements) without grounding it in your own documents.

Evaluation

A held-out set of prompts across the three pillars is scored with a deterministic keyword-rubric: each item defines groups of required terms, and a group passes if any synonym appears in the answer. The item score is the fraction of groups covered; pillar and overall scores are averages. Generation uses a 1.15 repetition penalty.

  Held-out eval, keyword rubric (percent coverage)

  Ops & Maintenance        ███████████████████░  92.9
  Quality & Compliance     ██████████████░░░░░░  70.8
  Supply Chain & Planning  █████████████████░░░  83.3
  ───────────────────────  ────────────────────  ────
  Overall                  ████████████████░░░░  82.3
Pillar Score
Ops & Maintenance 92.9%
Quality & Compliance 70.8%
Supply Chain & Planning 83.3%
Overall 82.3%

This measures domain-term coverage, not eloquence or factual grading, and the held-out set is small, so treat results as a directional, reproducible yardstick rather than a precise grade. The untuned base can be scored with python3 scripts/evaluate.py --base-only, and the tuned model is reproduced with python3 scripts/evaluate.py --max-tokens 500.

Limitations

Not a certified authority. Outputs assist research and drafting only. They are not a substitute for the controlling standard, safety sign-off, or regulatory advice.

  • English only (v1).
  • Paraphrased, not verbatim. Trained on domain reasoning, not reproduced standards; always verify clause-level detail against the controlling standard.
  • Small model. For facts that must be exact (part numbers, clause text, customer-specific requirements), ground it with retrieval over your own documents rather than relying on memory.
  • Self-reported metric. Evaluation is an internal keyword-coverage score on a small held-out set.

Training

  Domain instruction pairs ─┐
  110 ops & maintenance      │
  100 quality & compliance   ├─▶ LoRA fine-tune (MLX) ─▶ fuse adapters ─▶ FR-Forge 1.7B
   51 supply chain & planning┘    iters 800 · lr 1e-5 · seq 512
  Training data, 261 instruction pairs
  Ops & Maintenance        ████████████████  110
  Quality & Compliance     ███████████████░  100
  Supply Chain & Planning  ███████░░░░░░░░░   51
Base, method, data, and hyperparameters
  • Base model: mlx-community/Qwen3-1.7B-4bit (Qwen3 architecture, 4-bit)
  • Method: LoRA adapters via mlx-lm, fused into this standalone model
  • Data: 261 curated instruction pairs (110 ops & maintenance, 100 quality & compliance, 51 supply chain & planning). Sources: paraphrased domain reasoning plus model-assisted synthetic pairs with human review. No copyrighted standard text is reproduced verbatim; the data teaches structure and reasoning and cites clause numbers.
  • Hyperparameters: iters 800, LoRA layers 8, batch size 1, max sequence length 512, learning rate 1e-5, gradient checkpointing on. Peak training memory ~2 GB.

Citation

Apache-2.0. Base model mlx-community/Qwen3-1.7B-4bit is Apache-2.0.

@software{fr_forge_2026,
  title  = {FR-Forge-1.7B: a thin language model for manufacturing},
  author = {Fahrenheit Research},
  year   = {2026},
  note   = {Fine-tuned from Qwen3-1.7B (4-bit) with MLX/LoRA}
}

FAHRENHEIT RESEARCH

Thin Language Models for specialized domains.

Website  ·  GitHub  ·  Sibling model: FR-Lex-1.7B

Downloads last month
40
Safetensors
Model size
0.3B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for FahrenheitResearch/FR-Forge-1.7B

Finetuned
Qwen/Qwen3-1.7B
Finetuned
(2)
this model

Evaluation results

  • Overall (rubric coverage) on FR-Forge held-out eval (3 pillars, keyword-rubric)
    self-reported
    82.300
  • Ops & Maintenance on FR-Forge held-out eval (3 pillars, keyword-rubric)
    self-reported
    92.900
  • Quality & Compliance on FR-Forge held-out eval (3 pillars, keyword-rubric)
    self-reported
    70.800
  • Supply Chain & Planning on FR-Forge held-out eval (3 pillars, keyword-rubric)
    self-reported
    83.300