ZAYA1-8B Coder LoRA

LoRA adapter for Zyphra/ZAYA1-8B focused on Python code-generation prompts. This adapter was evaluated against the base ZAYA model before merging into the Coder release.

Evaluation Results

Evaluation was run on 50 Python code-generation prompts with a deterministic 0-10 heuristic:

Base model average: 2.36 / 10
LoRA adapter average: 4.76 / 10
Absolute score delta: +2.40 / 10
Full-scale lift: 24.00%
Relative lift over base average: 101.69%
Improved prompts: 39 / 50

The merge gate required a full-scale lift of at least 20.00%. At 24.00%, the adapter passed and was merged.

Scoring Heuristic

Each generation was scored out of 10:

def present: 2 points
class present: 1 point
return present: 1 point
import or from present: 1 point
fenced code block present: 1 point
output length greater than 100 characters: 1 point
Python AST parse validity: 3 points
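
The rubric above can be sketched as a small scoring function. This is a minimal illustration, not the evaluation script used for this release. The function names, the regular expressions, the choice to parse the first fenced block when one exists, and the decision to treat empty output as not parsable are all assumptions.

```python
import ast
import re

# Markdown fence marker, built from single backticks so this example
# does not contain a literal fence.
FENCE = "`" * 3


def extract_code(text: str) -> str:
    """Return the first fenced block's body, or the whole text if there is none (assumption)."""
    pattern = re.escape(FENCE) + r"(?:python|py)?[ \t]*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else text


def parses_as_python(code: str) -> bool:
    """True if the code is non-empty and ast.parse accepts it."""
    if not code.strip():
        return False  # assumption: empty output earns no parse points
    try:
        ast.parse(code)
        return True
    except (SyntaxError, ValueError):
        return False


def score_generation(text: str) -> int:
    """Score one generation on the 0-10 rubric listed above."""
    score = 0
    if re.search(r"\bdef\s+\w+", text):
        score += 2  # def present
    if re.search(r"\bclass\s+\w+", text):
        score += 1  # class present
    if re.search(r"\breturn\b", text):
        score += 1  # return present
    if re.search(r"^\s*(?:import|from)\s+\w+", text, flags=re.MULTILINE):
        score += 1  # import or from present
    if FENCE in text:
        score += 1  # fenced code block present
    if len(text) > 100:
        score += 1  # output length greater than 100 characters
    if parses_as_python(extract_code(text)):
        score += 3  # Python AST parse validity
    return score
```

Because every check is a pure function of the output text, the same generation always gets the same score. The rubric rewards surface structure and syntactic validity, and does not run the generated code.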

The full-scale lift is calculated as:

((lora_avg - base_avg) / 10) * 100
((4.76 - 2.36) / 10) * 100 = 24.00%
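
The reported lift figures can be reproduced from the two averages. The sketch below uses hypothetical helper names. The inputs and the 20.00% gate come from the results above.

```python
def full_scale_lift(base_avg: float, lora_avg: float, scale: float = 10.0) -> float:
    """Score delta as a percentage of the full scoring scale."""
    return (lora_avg - base_avg) / scale * 100.0


def relative_lift(base_avg: float, lora_avg: float) -> float:
    """Score delta as a percentage of the base model's average."""
    return (lora_avg - base_avg) / base_avg * 100.0


BASE_AVG = 2.36
LORA_AVG = 4.76
MERGE_GATE = 20.0  # minimum full-scale lift, in percent

fs_lift = full_scale_lift(BASE_AVG, LORA_AVG)
rel_lift = relative_lift(BASE_AVG, LORA_AVG)
passes_gate = fs_lift >= MERGE_GATE

print(f"Full-scale lift: {fs_lift:.2f}%")
print(f"Relative lift:   {rel_lift:.2f}%")
print(f"Passes gate:     {passes_gate}")
```

The two percentages answer different questions. Full-scale lift measures the gain against the 10-point ceiling, while relative lift measures it against the base score, which is why a low base average produces a relative figure above 100%.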

Adapter Details

Base model: Zyphra/ZAYA1-8B
PEFT type: LoRA
Rank: 16
Alpha: 32
Dropout: 0.05
Task type: causal language modeling
Adapter tensors: 160
Actual populated ZAYA targets observed in the safetensors file:
- self_attn.o_proj
- zaya_block.router.down_proj

The adapter config contains broad Llama-style target module names, but ZAYA is not a standard Llama architecture. The actual adapter tensor names target ZAYA's attention output projection and router down projection.
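
One way to check which modules an adapter file actually populates is to group its tensor names by module path. The sketch below assumes the usual PEFT naming pattern, with a layers.<index>. prefix and a lora_A or lora_B suffix. The exact key layout of this adapter is not documented here, so the patterns and the example keys in the usage note are hypothetical.

```python
import re
from collections import Counter

# Assumed PEFT-style suffix, e.g. ".lora_A.weight" or ".lora_B.default.weight".
LORA_SUFFIX = re.compile(r"\.lora_[AB](?:\.[^.]+)?\.weight$")
# Assumed per-layer prefix, e.g. "base_model.model.model.layers.12."
LAYER_PREFIX = re.compile(r"^.*?\.layers\.\d+\.")


def populated_targets(tensor_names):
    """Count LoRA tensors per module path, ignoring the layer index."""
    counts = Counter()
    for name in tensor_names:
        if not LORA_SUFFIX.search(name):
            continue  # skip anything that is not a LoRA A/B weight
        module_path = LORA_SUFFIX.sub("", name)
        counts[LAYER_PREFIX.sub("", module_path)] += 1
    return counts


def read_tensor_names(path):
    """List tensor names in a safetensors file (requires safetensors and torch)."""
    from safetensors import safe_open  # imported lazily so the helper above has no dependencies

    with safe_open(path, framework="pt") as handle:
        return list(handle.keys())
```

For example, populated_targets(read_tensor_names("adapter_model.safetensors")) would return a counter keyed by module path, where the file name is the PEFT default and is assumed here. Comparing those keys with target_modules in the adapter config shows which configured names matched real modules.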

Loading Notes

ZAYA uses a custom model type (model_type = zaya), so mainline Transformers does not load it as a standard Llama model. For faithful loading, use Zyphra's ZAYA Transformers implementation:

pip install git+https://github.com/Zyphra/transformers.git@zaya1

Then load with trust_remote_code=True and apply the PEFT adapter. The merged model release is available at josephmayo/ZAYA1-8B-Coder.
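
A minimal loading sketch follows, assuming Zyphra's Transformers fork and the peft package are installed. The adapter repository id josephmayo/ZAYA1-8B-Coder-LoRA and the function name are assumptions, and this has not been verified against the fork. Dtype, device placement, and quantization options are left at their defaults.

```python
BASE_MODEL_ID = "Zyphra/ZAYA1-8B"
ADAPTER_ID = "josephmayo/ZAYA1-8B-Coder-LoRA"  # assumed adapter repository id


def load_base_with_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model with trust_remote_code=True and attach the LoRA adapter."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
    base_model = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)

    # Wrap the base model with the adapter weights for inference.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling tokenizer, model = load_base_with_adapter() downloads both repositories on first use. If PEFT reports unmatched target modules, compare the adapter config against the tensor names actually present in the safetensors file, as described under Adapter Details.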

Evidence Files

Run evidence for this release is stored in the repository under evidence/:

evidence/zaya_qlora_eval_result_zaya1-8b-coding-qlora-eval_release_summary.json

These files are compact local/Kaggle run artifacts used to document training, evaluation, merge, or quantization evidence for this model family.

