ZAYA1-8B Coder LoRA

LoRA adapter for Zyphra/ZAYA1-8B focused on Python code-generation prompts. This adapter was evaluated against the base ZAYA model before merging into the Coder release.

Evaluation Result

Evaluation was run on 50 Python code-generation prompts with a deterministic 0-10 heuristic:

  • Base model average: 2.36 / 10
  • LoRA adapter average: 4.76 / 10
  • Absolute score delta: +2.40 / 10
  • Full-scale lift: 24.00%
  • Relative lift over base average: 101.69%
  • Improved prompts: 39 / 50

The merge gate required a full-scale lift of at least 20.00%, so the adapter passed and was merged.

Scoring Heuristic

Each generation was scored out of 10:

  • def present: 2 points
  • class present: 1 point
  • return present: 1 point
  • import or from present: 1 point
  • fenced code block present: 1 point
  • output length greater than 100 characters: 1 point
  • Python AST parse validity: 3 points
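The rubric above can be sketched as a small scoring function. This is an illustrative reconstruction, not the evaluation script used for the release. The name `score_generation`, the regex-based keyword checks, scoring the contents of the first fenced block for AST validity (falling back to the whole output), and giving no AST points to empty output are all assumptions.

```python
import ast
import re

FENCE = "`" * 3  # three backticks, built at runtime to keep this example readable


def extract_code(text):
    """Return the body of the first fenced block, or the whole text if none (assumption)."""
    match = re.search(FENCE + r"[^\n]*\n(.*?)" + FENCE, text, re.DOTALL)
    return match.group(1) if match else text


def score_generation(text):
    """Score one generation out of 10 using the rubric listed above."""
    score = 0
    if re.search(r"\bdef\s", text):
        score += 2  # def present
    if re.search(r"\bclass\s", text):
        score += 1  # class present
    if re.search(r"\breturn\b", text):
        score += 1  # return present
    if re.search(r"^\s*(import|from)\s", text, re.MULTILINE):
        score += 1  # import or from present
    if FENCE in text:
        score += 1  # fenced code block present
    if len(text) > 100:
        score += 1  # output length greater than 100 characters
    code = extract_code(text)
    if code.strip():  # assumption: empty output earns no AST points
        try:
            ast.parse(code)
            score += 3  # Python AST parse validity
        except SyntaxError:
            pass
    return score
```

Because every check is a pure function of the output text, the same generation always receives the same score, which matches the "deterministic" description of the heuristic.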

The full-scale lift is calculated as:

((lora_avg - base_avg) / 10) * 100
((4.76 - 2.36) / 10) * 100 = 24.00%
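As a quick check of the reported figures, the sketch below recomputes the absolute delta, the full-scale lift, the relative lift over the base average, and the merge-gate decision from the two averages. The function name `lift_metrics` and its parameter names are illustrative, not taken from the release tooling.

```python
def lift_metrics(base_avg, lora_avg, max_score=10.0, gate_pct=20.0):
    """Recompute the lift figures from the two average scores."""
    delta = lora_avg - base_avg
    full_scale = delta / max_score * 100  # lift as a share of the 10-point scale
    relative = delta / base_avg * 100     # lift relative to the base average
    return {
        "delta": delta,
        "full_scale_pct": full_scale,
        "relative_pct": relative,
        "passes_gate": full_scale >= gate_pct,
    }


m = lift_metrics(base_avg=2.36, lora_avg=4.76)
print(f"delta={m['delta']:.2f}  full-scale={m['full_scale_pct']:.2f}%  "
      f"relative={m['relative_pct']:.2f}%  passes_gate={m['passes_gate']}")
# delta=2.40  full-scale=24.00%  relative=101.69%  passes_gate=True
```

The two percentages answer different questions: the full-scale lift measures movement along the fixed 10-point scale, while the relative lift is large mainly because the base average is low.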

Adapter Details

  • Base model: Zyphra/ZAYA1-8B
  • PEFT type: LoRA
  • Rank: 16
  • Alpha: 32
  • Dropout: 0.05
  • Task type: causal language modeling
  • Adapter tensors: 160
  • Actual populated ZAYA targets observed in the safetensors file:
    • self_attn.o_proj
    • zaya_block.router.down_proj

The adapter config lists broad Llama-style target module names, but ZAYA is not a standard Llama architecture. In the saved safetensors file, the adapter tensors actually target ZAYA's attention output projection (self_attn.o_proj) and router down projection (zaya_block.router.down_proj).
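One way to confirm which modules an adapter actually populates is to list the tensor names in its safetensors file and group them by target module. The sketch below assumes PEFT-style key names of the form `...layers.<index>.<module path>.lora_A.weight`; that layout, the helper names, and the file name in the usage note are assumptions rather than details confirmed by this release.

```python
from collections import Counter


def read_tensor_names(path):
    """List tensor names in a safetensors file without loading any tensors."""
    from safetensors import safe_open  # lazy import; requires the safetensors package

    with safe_open(path, framework="pt") as f:
        return list(f.keys())


def populated_targets(tensor_names):
    """Count LoRA tensors per target module, given adapter tensor names."""
    counts = Counter()
    for name in tensor_names:
        for marker in (".lora_A", ".lora_B"):
            if marker not in name:
                continue
            parts = name.split(marker)[0].split(".")
            # Assumption: the module path starts right after the layer index.
            digit_positions = [i for i, p in enumerate(parts) if p.isdigit()]
            start = digit_positions[-1] + 1 if digit_positions else 0
            counts[".".join(parts[start:])] += 1
    return dict(counts)
```

For example, `populated_targets(read_tensor_names("adapter_model.safetensors"))` would return a mapping from module path to tensor count, which can then be compared with the target module names in the adapter config.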

Loading Notes

ZAYA uses a custom model_type = zaya. Mainline Transformers does not load it as a normal Llama model. For faithful loading, use Zyphra's ZAYA Transformers implementation:

pip install git+https://github.com/Zyphra/transformers.git@zaya1

Then load with trust_remote_code=True and apply the PEFT adapter. The merged model release is available at josephmayo/ZAYA1-8B-Coder.
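The loading steps above might look like the following minimal sketch. It assumes Zyphra's Transformers branch and the `peft` package are installed, and it has not been verified against the ZAYA fork. The function name `load_coder_adapter` is illustrative, and dtype and device placement are left at library defaults.

```python
def load_coder_adapter(
    base_id="Zyphra/ZAYA1-8B",
    adapter_id="josephmayo/ZAYA1-8B-Coder-LoRA",
):
    """Load the base ZAYA model and apply the LoRA adapter on top of it."""
    # Imports are kept inside the function so the module can be imported
    # without pulling in the heavy dependencies until loading is requested.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id, trust_remote_code=True
    )
    # Wrap the base model with the adapter weights from the adapter repository.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_coder_adapter()` downloads the base weights and the adapter on first use, so it needs network access and enough memory for an 8B-parameter model.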

Evidence files

Run evidence for this release is stored in the repository under evidence/. These files are compact local/Kaggle run artifacts that document training, evaluation, merge, or quantization for this model family.
