FR-Blaze-9B

A Thin Language Model for online marketing. Grounded, specialized, and running on your laptop.

GROUNDED. SPECIALIZED. LOCAL.

Overview
How it works
Specifications
Quickstart
Intended use
Evaluation
Limitations
Training
Citation

Overview

FR-Blaze-9B is a Thin Language Model (TLM) for online and digital marketing, by Fahrenheit Research. Sibling to FR-Forge (manufacturing) and FR-Lex (legal). It is a 4-bit, MLX model: the Gemma 2 9B base with LoRA adapters fused in, designed to run locally on Apple Silicon and to be paired with a retrieval layer for current facts.

Unlike the smaller-base siblings, FR-Blaze sits on a strong 9B base that already scores ~85% on a public marketing benchmark. The adapter is a domain lens for voice and framing, not a capability boost. The capability is in the base; currency comes from retrieval. This card reports that honestly.

  Where a Thin Language Model fits

  specialization
     ▲
     │   ● FR-Blaze 9B
     │     marketing lens · local · grounded by retrieval
     │
     │                              ● Frontier LLM
     │                                broad · hosted · costly
     └──────────────────────────────────────────────▶ generality

How it works

  Marketing question
            │
            ▼
  ┌─────────────────────────────────────────────────────┐
  │ FR-Blaze 9B   (base capability + FR-Blaze lens)      │
  └─────────────────────────────────────────────────────┘
            │  + retrieved, dated facts (recommended)
            │
   advises across the digital stack
            │
            ├─ Search           SEO (technical/on-page/content/links) · Google & Microsoft Ads
            ├─ Paid social       Meta · TikTok · LinkedIn · YouTube · programmatic
            ├─ Organic social    strategy · content · community · cadence
            ├─ Content & email   briefs · distribution · lifecycle flows · deliverability
            └─ Analytics & CRO   GA4 · attribution · landing pages · budgeting · strategy
            │
            ▼
   Grounded, local answer  (verify live platform specs against current docs)

It is an assistant, not a certified authority, and not a substitute for verifying live platform features, ad specs, prices, or policies.

Specifications


Base	Gemma 2 9B (4-bit, MLX)
Parameters	~9.2B · 4-bit
Method	LoRA adapters, fused
Runtime	MLX (Apple Silicon)
Languages	English
License	Gemma Terms of Use

Quickstart

Runs locally with MLX on Apple Silicon. Apply a repetition penalty to avoid looping.

pip install -U mlx-lm
python3 -m mlx_lm generate \
  --model FahrenheitResearch/FR-Blaze-9B \
  --repetition-penalty 1.3 \
  --max-tokens 600 \
  --prompt "Plan a Q3 paid social test across Meta and TikTok for a DTC skincare brand."

Recommended generation settings: repetition_penalty = 1.3, temperature ≈ 0.5. A system prompt of "You are FR-Blaze, a marketing expert." matches training.

Recommended deployment: pair with retrieval. FR-Blaze is strongest when current, dated facts (platform updates, ad specs, account context) are retrieved and prepended to the prompt. The base model carries the stable knowledge; retrieval carries what changes. See the project for a reference retrieval setup.

Intended use

Assisting marketing, growth, and sales teams with everyday questions across the digital stack:

  FR-Blaze covers
  ├─ Search             SEO (technical, on-page, content, keywords, links), Google/Microsoft Ads, Shopping, PMax
  ├─ Paid social        Meta, TikTok, LinkedIn, YouTube, programmatic: targeting, creative, bidding, measurement
  ├─ Organic social     strategy, content, community, cadence, channel fit
  ├─ Content & email    briefs, formats, distribution, lifecycle flows, segmentation, deliverability
  └─ Analytics & CRO    GA4, attribution, incrementality, landing pages, budgeting, channel mix, strategy

Out of scope. Legal, financial, or compliance advice; autonomous spending or publishing; anything that must be exact (current platform specs, prices, policies) without grounding it in retrieved, current sources.

Evaluation

Scored on the public, MIT-licensed AdsGPT marketing benchmark (hand-authored against 2026 platform docs), deduped of its persona-templated copies to 377 unique multiple-choice questions across five marketing categories. Scoring is exact letter-match (no judge model), so it is fully reproducible. Decoding is deterministic (greedy).

  AdsGPT marketing benchmark, MCQ accuracy

  Critical thinking     ██████████████████░  91.7
  Email & lifecycle     █████████████████░░  86.2
  SEO & organic         ████████████████░░░  84.4
  Google Ads            ███████████████░░░░  78.8
  Meta Ads              ███████████████░░░░  78.8
  ──────────────────    ───────────────────  ────
  FR-Blaze-9B overall   ████████████████░░░  83.6

Category	FR-Blaze-9B	Gemma 2 9B base
Critical thinking	91.7%	91.7%
Email & lifecycle	86.2%	86.2%
SEO & organic	84.4%	84.4%
Google Ads	78.8%	83.8%
Meta Ads	78.8%	80.0%
Overall	83.6%	84.9%

Honest reading. The base model is already a strong marketing generalist (84.9%). The LoRA lens matches it on the stable categories and slightly trails on the two fast-changing paid-platform categories. The adapter adds FR-Blaze voice and framing, not benchmark capability. This is expected for a 9B base that already knows public marketing knowledge, and it is exactly why the recommended deployment grounds the model with retrieval for the volatile, current-fact categories rather than relying on the weights. Reproduce with the project's mcq_eval_9b.sh (it scores both the base and this model).

Limitations

Not a certified authority. Outputs assist research and drafting only. They are not a substitute for current platform documentation, financial sign-off, or legal advice.

English only (v1).
Lens, not a capability boost. On a public benchmark this model matches its base rather than beating it; its value is domain voice plus retrieval, not extra knowledge in the weights.
Currency. Answers about live platform features, ad specs, prices, and policies are only as current as the facts you retrieve and provide. Without grounding, verify on the live platform.
Self-reported, reproducible metric. The score is exact MCQ letter-match on a public benchmark; the harness is included so anyone can reproduce both base and tuned numbers.

Training

  Marketing instruction pairs ─┐
  SEO · paid search            │
  paid & organic social        ├─▶ LoRA fine-tune (MLX) ─▶ fuse adapters ─▶ FR-Blaze 9B
  email · analytics · math     │    gentle: lr 5e-5 · 4 layers · ~250 iters
  worked calculations          ┘

Base, method, data, and hyperparameters

Base model: mlx-community/gemma-2-9b-it-4bit (Gemma 2 architecture, 4-bit)
Method: LoRA adapters via mlx-lm (QLoRA on the 4-bit base), fused into this standalone model
Data: ~200 curated marketing instruction pairs spanning SEO, paid search, paid and organic social, content, email, analytics/CRO, plus worked-math examples. Model-assisted synthetic pairs with human review; benchmark questions held out (no leakage, validator-enforced).
Hyperparameters (gentle, to preserve base capability): learning rate 5e-5, LoRA layers 4, batch size 1, ~250 iters, max sequence length 1536, prompt masking on, gradient checkpointing on.
Note: harder fine-tuning (higher lr, more layers/iters) degraded reasoning and math without adding capability; the gentle recipe above was chosen on benchmark evidence.

Citation

Gemma Terms of Use. Base model google/gemma-2-9b-it is governed by the Gemma license; this derivative carries the same terms.

@software{fr_blaze_2026,
  title  = {FR-Blaze-9B: a thin language model for online marketing},
  author = {Fahrenheit Research},
  year   = {2026},
  note   = {Fine-tuned from Gemma 2 9B (4-bit) with MLX/LoRA; deploy with retrieval}
}

FAHRENHEIT RESEARCH

Thin Language Models for specialized domains.

Website · GitHub · Sibling: FR-Forge-1.7B · Sibling: FR-Lex-1.7B

Downloads last month: -

Safetensors

Model size

1B params

Tensor type

F16

U32

MLX

Hardware compatibility

4-bit

Model tree for FahrenheitResearch/FR-Blaze-9B

Base model

mlx-community/gemma-2-9b-it-4bit

Finetuned

(1)

this model

Evaluation results

Overall (exact letter-match) on AdsGPT marketing benchmark (deduped Google Ads + SEO + Meta + Email + reasoning, MCQ)
self-reported

83.600
Critical thinking on AdsGPT marketing benchmark (deduped Google Ads + SEO + Meta + Email + reasoning, MCQ)
self-reported

91.700
Email & lifecycle on AdsGPT marketing benchmark (deduped Google Ads + SEO + Meta + Email + reasoning, MCQ)
self-reported

86.200
SEO & organic on AdsGPT marketing benchmark (deduped Google Ads + SEO + Meta + Email + reasoning, MCQ)
self-reported

84.400
Google Ads on AdsGPT marketing benchmark (deduped Google Ads + SEO + Meta + Email + reasoning, MCQ)
self-reported

78.800
Meta Ads on AdsGPT marketing benchmark (deduped Google Ads + SEO + Meta + Email + reasoning, MCQ)
self-reported

78.800

FahrenheitResearch
/

FR-Blaze-9B