Libraries

How to use Ephraimmm/pidgin_gemma_4_lora with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Ephraimmm/pidgin_gemma_4_lora")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("Ephraimmm/pidgin_gemma_4_lora", dtype="auto")


vLLM

How to use Ephraimmm/pidgin_gemma_4_lora with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Ephraimmm/pidgin_gemma_4_lora"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Ephraimmm/pidgin_gemma_4_lora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
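The same chat-completions request can also be built from Python. The sketch below uses only the standard library and assumes a server is already listening on `localhost:8000`. The helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="Ephraimmm/pidgin_gemma_4_lora"):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage, once the server is running:
#   req = build_chat_request("What is the capital of France?")
#   print(send_chat_request(req))
```

Passing a different `base_url` (for example one ending in port 30000) targets any other OpenAI-compatible server in the same way.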

Use Docker

docker model run hf.co/Ephraimmm/pidgin_gemma_4_lora

SGLang

How to use Ephraimmm/pidgin_gemma_4_lora with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Ephraimmm/pidgin_gemma_4_lora" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Ephraimmm/pidgin_gemma_4_lora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Ephraimmm/pidgin_gemma_4_lora" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Ephraimmm/pidgin_gemma_4_lora",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use Ephraimmm/pidgin_gemma_4_lora with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Ephraimmm/pidgin_gemma_4_lora to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Ephraimmm/pidgin_gemma_4_lora to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Ephraimmm/pidgin_gemma_4_lora to start chatting

Load model with FastModel

# Install Unsloth first (run in a shell, not in Python): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="Ephraimmm/pidgin_gemma_4_lora",
    max_seq_length=2048,
)

Pidgin Gemma 4 LoRA

Overview

This repository contains a LoRA (Low-Rank Adaptation) adapter for Gemma 4 E4B (instruction-tuned), fine-tuned to generate and converse in Nigerian Pidgin English (Naija / pcm). The adapter was trained with Unsloth and TRL on top of the 4-bit quantized base model unsloth/gemma-4-e4b-it-unsloth-bnb-4bit, and only the text/language pathway of the base model was adapted.

The base model is a multimodal (text/image/audio/video) Gemma 4 checkpoint, but this LoRA adapter targets only the language backbone's attention and MLP projections, so it is intended for text-in / text-out Pidgin generation, not for adapting the model's vision or audio capabilities.

Training Details

| Detail | Value |
|---|---|
| Base model | `unsloth/gemma-4-e4b-it-unsloth-bnb-4bit` (Gemma 4 E4B, instruction-tuned, 4-bit) |
| Fine-tuning method | LoRA via PEFT |
| Training acceleration | Unsloth |
| Trainer | TRL |
| LoRA rank (`r`) | 8 |
| LoRA alpha | 8 |
| LoRA dropout | 0 |
| Bias | none |
| Target modules | Attention and MLP projections of the language backbone (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`), matched via a regex scoped to the text/language submodules; the vision, audio, and video towers were left frozen |
| Task type | `CAUSAL_LM` |
| PEFT version | 0.19.1 |
| License | Apache 2.0 |
| Reported final training loss | 1.239603 (see `loss_curve.png` in this repo) |
| Adapter size | ~73.5 MB (`adapter_model.safetensors`) |

Exact step count, number of epochs, learning rate, and batch size are not published in this repository (no trainer_state.json or training-arguments file is included), so they are intentionally omitted rather than guessed.
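To illustrate how a regex can scope LoRA to the language backbone only, here is a minimal sketch. The pattern and the module names are hypothetical, since the exact regex used for this adapter is not published. PEFT applies a string `target_modules` value with `re.fullmatch`, which the sketch mirrors.

```python
import re

# Hypothetical pattern: attention and MLP projections under the language model only.
TARGET_PATTERN = (
    r".*language_model.*\.(self_attn|mlp)\."
    r"(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)"
)

# Hypothetical module names in the style of a multimodal checkpoint.
module_names = [
    "model.language_model.layers.0.self_attn.q_proj",
    "model.language_model.layers.0.mlp.down_proj",
    "model.vision_tower.encoder.layers.0.self_attn.q_proj",
    "model.audio_tower.layers.0.self_attn.k_proj",
    "model.language_model.layers.0.input_layernorm",
]


def select_lora_targets(names, pattern=TARGET_PATTERN):
    """Return the module names that fully match the target pattern."""
    return [name for name in names if re.fullmatch(pattern, name)]


selected = select_lora_targets(module_names)
for name in selected:
    print(name)
```

Only the two language-model projections are selected; the vision and audio modules and the layer norm do not match, which is the behavior the table describes as leaving those towers frozen.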

Intended Use

- Generating conversational responses in Nigerian Pidgin English.
- Translating or rephrasing English text into Pidgin-flavored text for chatbots, content localization, or cultural-language experimentation.
- Research and educational exploration of low-resource / under-represented African language varieties with LLMs.

This adapter is not intended for high-stakes decision-making, medical/legal/financial advice, or use cases requiring guaranteed factual accuracy.
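For the translation and rephrasing use cases, a request can be wrapped in a single-turn chat message. This is a minimal sketch: the instruction wording and the helper name `build_pidgin_messages` are assumptions, because the prompt format used during training is not documented in this repository.

```python
def build_pidgin_messages(english_text, mode="translate"):
    """Wrap English text in one user message asking for Pidgin output."""
    # Assumed instruction wording; adjust after testing on your own data.
    instructions = {
        "translate": "Translate this text to Nigerian Pidgin English:",
        "rephrase": "Rephrase this text in a casual Nigerian Pidgin style:",
    }
    if mode not in instructions:
        raise ValueError(f"unknown mode: {mode!r}")
    content = f"{instructions[mode]}\n\n{english_text.strip()}"
    return [{"role": "user", "content": content}]


messages = build_pidgin_messages("  The market opens at nine.  ")
```

The resulting `messages` list has the same shape as the chat inputs used elsewhere in this card.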

How to Use

Because the base model is a 4-bit Unsloth checkpoint, loading with Unsloth is the most reliable path (it is also how the adapter was trained):

from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="unsloth/gemma-4-e4b-it-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
model.load_adapter("Ephraimmm/pidgin_gemma_4_lora")

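# Multimodal chat templates take message content as a list of typed parts.
messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "How you dey? Wetin dey happen for Lagos today?"}],
    }
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,  # return input_ids and attention_mask together
    return_tensors="pt",
).to(model.device)

# temperature is only used when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))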
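For decoder-only models, `generate` returns the prompt tokens followed by the new tokens, so decoding `outputs[0]` prints the prompt as well as the reply. A minimal sketch of keeping only the reply follows; plain lists with made-up token ids stand in for tensors, and `new_tokens_only` is an illustrative helper, not a library function.

```python
def new_tokens_only(output_ids, prompt_length):
    """Drop the prompt prefix from one generated sequence."""
    return output_ids[prompt_length:]


# Toy example: made-up token ids, with lists standing in for tensors.
prompt_ids = [2, 105, 2364, 107]
generated = prompt_ids + [9259, 236764, 1]
reply_ids = new_tokens_only(generated, len(prompt_ids))
print(reply_ids)
```

The same slice works on a tensor row: take the prompt length from the last dimension of the input-id tensor passed to `generate`, slice `outputs[0]` with it, and decode the result.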

Alternatively, using transformers + peft directly (requires a transformers version that supports the Gemma 4 architecture, Gemma4ForConditionalGeneration):

import torch
from transformers import AutoModelForCausalLM, AutoProcessor
from peft import PeftModel

base_model_id = "unsloth/gemma-4-e4b-it-unsloth-bnb-4bit"
adapter_id = "Ephraimmm/pidgin_gemma_4_lora"

processor = AutoProcessor.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id, device_map="auto", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(model, adapter_id)

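# Multimodal processors take message content as a list of typed parts.
messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "Abeg, explain wetin be Nigerian Pidgin."}],
    }
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,  # needed so **inputs unpacks input_ids and attention_mask
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=150)
print(processor.decode(output[0], skip_special_tokens=True))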

Limitations

- No quantitative evaluation (perplexity, BLEU, human preference scores, etc.) is published alongside this checkpoint; treat generation quality claims as unverified until you evaluate on your own data.
- Only the language/text component of the multimodal base model was fine-tuned; any image, audio, or video understanding inherited from the base model is unmodified and has not been tested for Pidgin-related tasks.
- Nigerian Pidgin has substantial regional, orthographic, and code-switching variation; the exact size, source, and dialectal coverage of the training data are not documented in this repository's published files.
- The base model is loaded in 4-bit quantization, which can introduce minor quality trade-offs versus full precision.
- As with any LLM, outputs may be inaccurate, inconsistent, or contain unintended bias, and should be reviewed by a human before use in user-facing or sensitive applications.
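Since no evaluation numbers are published, one simple starting point is perplexity on your own held-out Pidgin text. The sketch below shows the arithmetic only. It assumes the loss is a mean per-token cross-entropy in nats, which is not confirmed for the reported training loss, and the function names are illustrative.

```python
import math


def perplexity_from_loss(mean_nll):
    """Convert a mean per-token negative log-likelihood (nats) to perplexity."""
    return math.exp(mean_nll)


def perplexity_from_logprobs(token_logprobs):
    """Compute perplexity from natural-log probabilities of each target token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Under the stated assumption, the reported training loss of 1.239603
# corresponds to a training perplexity of about 3.45.
reported_train_ppl = perplexity_from_loss(1.239603)
print(round(reported_train_ppl, 2))
```

This figure describes fit to the training data only; a held-out perplexity computed the same way is the more meaningful number.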

Author

Developed by Ephraimmm
