MedGemma 27B - Maternal, Sexual & Reproductive Health Oracle for African Languages

A fine-tune of Google's MedGemma 27B Text, built for the Zindi ITU Multilingual Health QA Challenge.

Specialized in answering Maternal, Sexual, and Reproductive Health (MSRH) questions in:

Akan (Twi/Fante from Ghana)
Amharic (Ethiopia)
Luganda (Uganda)
Swahili (Kenya)
English (Ethiopia, Ghana, Kenya, Uganda)

Model Description

LoRA adapter for google/medgemma-27b-text-it, fine-tuned on 29,815 multilingual medical Q&A samples across 8 language-region pairs.

Training Details

Base model: google/medgemma-27b-text-it (27B params, medical text-only)
Training method: QLoRA (4-bit quantization + LoRA)
LoRA config: r=8, alpha=16, applied to attention modules only
Trainable params: 16.7M (0.21% of total)
Training data: 29,815 multilingual medical Q&A samples
Optimizer: fused AdamW, lr=3e-5, linear warmup over the first 5% of training steps
Hardware: NVIDIA A40 (48GB VRAM)
Final eval_loss: 1.39
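
The trainable-parameter figure above can be sanity-checked with a little arithmetic. The sketch below rests on two assumptions that this card does not state: that "attention-only" means the four projections q_proj, k_proj, v_proj and o_proj, and that the base model uses the Gemma 3 27B dimensions quoted here from memory (hidden size 5376, 62 layers, 32 query heads, 16 key/value heads, head dimension 128). Check both against the base model's config.json and the adapter's adapter_config.json before relying on the result.

```python
# Back-of-the-envelope LoRA parameter count.
# ASSUMPTIONS (not stated in this card): the adapter targets q/k/v/o projections,
# and the dimensions below match the base model's config.json.
HIDDEN = 5376     # assumed hidden size
LAYERS = 62       # assumed number of transformer layers
Q_HEADS = 32      # assumed number of query heads
KV_HEADS = 16     # assumed number of key/value heads
HEAD_DIM = 128    # assumed per-head dimension
RANK = 8          # r=8, from the training details above


def lora_params(in_features, out_features, rank):
    """A LoRA pair adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


q_out = Q_HEADS * HEAD_DIM    # output width of q_proj (and input width of o_proj)
kv_out = KV_HEADS * HEAD_DIM  # output width of k_proj and v_proj

per_layer = (
    lora_params(HIDDEN, q_out, RANK)         # q_proj
    + 2 * lora_params(HIDDEN, kv_out, RANK)  # k_proj and v_proj
    + lora_params(q_out, HIDDEN, RANK)       # o_proj
)
total = per_layer * LAYERS

print(f"{per_layer:,} LoRA parameters per layer")
print(f"{total:,} LoRA parameters in total (~{total / 1e6:.2f}M)")
```

Under these assumptions the count comes to roughly 16.76M, in line with the 16.7M reported above. The percentage a training framework prints depends on how it counts the quantized base weights, so it is not recomputed here.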

Loss Trajectory

Step	eval_loss
600	1.69
900	1.58
1200	1.50
1500	1.45
1800	1.42
1864	1.39 (best)
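
If eval_loss is the mean per-token cross-entropy in nats (the usual convention for Hugging Face causal-LM training, but an assumption here), each value corresponds to a perplexity of exp(loss). A minimal sketch using the checkpoints listed above:

```python
import math

# (step, eval_loss) pairs copied from the table above.
trajectory = [
    (600, 1.69),
    (900, 1.58),
    (1200, 1.50),
    (1500, 1.45),
    (1800, 1.42),
    (1864, 1.39),
]


def perplexity(loss):
    """Perplexity for a mean cross-entropy loss given in nats (assumed unit)."""
    return math.exp(loss)


for step, loss in trajectory:
    print(f"step {step:>4}: eval_loss={loss:.2f}  perplexity={perplexity(loss):.2f}")
```

The loss is still falling at the last checkpoint, which is consistent with the short training run noted under Limitations.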

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model, placed automatically across available devices.
base_model = AutoModelForCausalLM.from_pretrained(
    "google/medgemma-27b-text-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
    quantization_config=quantization_config,
)

# Attach the LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(base_model, "KYAGABA/medgemma-27b-msrh-african-oracle")
model.eval()

# The tokenizer is loaded from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained("KYAGABA/medgemma-27b-msrh-african-oracle")

# Example
question = "How can young people access reproductive health services?"
language = "English"

# Build the instruction, then wrap it in the model's chat template.
prompt_text = f"Answer this question in {language} about maternal, sexual, and reproductive health: {question}"
messages = [{"role": "user", "content": prompt_text}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

# Deterministic beam search (3 beams) with a mild repetition penalty.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        do_sample=False,
        num_beams=3,
        repetition_penalty=1.1,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
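
To query several languages without repeating the template, the prompt construction can be wrapped in a small helper. The function name build_prompt and the SUPPORTED_LANGUAGES set are hypothetical conveniences, not part of the released code. The template string is the one used in the example above, and the language names are assumed to match those listed at the top of this card; the exact names used in the training prompts are not documented here.

```python
# Hypothetical helper: validates the language and fills in the prompt template
# used in the usage example. The language names are an assumption taken from
# the list at the top of this card.
SUPPORTED_LANGUAGES = {"Akan", "Amharic", "Luganda", "Swahili", "English"}

PROMPT_TEMPLATE = (
    "Answer this question in {language} about maternal, sexual, "
    "and reproductive health: {question}"
)


def build_prompt(question, language):
    """Return the user-turn text for a question in one of the supported languages."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"Unsupported language {language!r}; expected one of {sorted(SUPPORTED_LANGUAGES)}"
        )
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    return PROMPT_TEMPLATE.format(language=language, question=question)


# The result slots into the chat-message format used above.
messages = [{"role": "user", "content": build_prompt("What is antenatal care?", "Luganda")}]
print(messages[0]["content"])
```

The returned messages list can be passed to tokenizer.apply_chat_template exactly as in the example above.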

Dataset

Trained on the Zindi ITU Multilingual Health QA Challenge dataset:

Subset	Samples	Language	Region
Eng_Uga	7,624	English	Uganda
Aka_Gha	4,455	Akan	Ghana
Eng_Gha	4,443	English	Ghana
Eng_Eth	3,915	English	Ethiopia
Lug_Uga	3,383	Luganda	Uganda
Eng_Ken	2,080	English	Kenya
Swa_Ken	2,070	Swahili	Kenya
Amh_Eth	1,845	Amharic	Ethiopia
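
The balance between languages is easier to see as shares of the total. The sketch below only re-adds the sample counts from the table above; it checks that they sum to 29,815 and groups them by language.

```python
from collections import defaultdict

# (subset, samples, language, region) rows copied from the table above.
subsets = [
    ("Eng_Uga", 7624, "English", "Uganda"),
    ("Aka_Gha", 4455, "Akan", "Ghana"),
    ("Eng_Gha", 4443, "English", "Ghana"),
    ("Eng_Eth", 3915, "English", "Ethiopia"),
    ("Lug_Uga", 3383, "Luganda", "Uganda"),
    ("Eng_Ken", 2080, "English", "Kenya"),
    ("Swa_Ken", 2070, "Swahili", "Kenya"),
    ("Amh_Eth", 1845, "Amharic", "Ethiopia"),
]

total = sum(samples for _, samples, _, _ in subsets)

# Aggregate sample counts per language across regions.
by_language = defaultdict(int)
for _, samples, language, _ in subsets:
    by_language[language] += samples

print(f"total samples: {total:,}")
for language, samples in sorted(by_language.items(), key=lambda item: -item[1]):
    print(f"{language:<8} {samples:>6,}  {samples / total:6.1%}")
```

English accounts for about 60% of the samples across its four regions, while each of the other languages comes from a single region.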

Intended Use

Intended for research and educational use, to support access to health information in African languages. It is NOT intended for direct clinical use; always consult a qualified healthcare professional.

Limitations

May prepend an English preamble to responses in other languages
Lower answer quality in Akan than in English (less training data)
Trained for only ~1.13 epochs (compute constraints)
Performs best on MSRH topics

Citation

@misc{medgemma27b-msrh-africa,
  author = {KYAGABA, Arul},
  title = {MedGemma 27B - MSRH African Oracle},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {https://huggingface.co/KYAGABA/medgemma-27b-msrh-african-oracle}
}

Acknowledgements

Google for MedGemma 27B base model
Zindi and ITU for the multilingual health QA challenge
AfriMed-QA community for advancing African medical AI
