MedGemma 27B - Maternal, Sexual & Reproductive Health Oracle for African Languages

A fine-tune of Google's MedGemma 27B Text, built for the Zindi ITU Multilingual Health QA Challenge.

Specialized in answering Maternal, Sexual, and Reproductive Health (MSRH) questions in:

  • Akan (Twi/Fante from Ghana)
  • Amharic (Ethiopia)
  • Luganda (Uganda)
  • Swahili (Kenya)
  • English (Ethiopia, Ghana, Kenya, Uganda)

Model Description

LoRA adapter for google/medgemma-27b-text-it, fine-tuned on 29,815 multilingual medical Q&A samples across 8 language-region pairs.

Training Details

  • Base model: google/medgemma-27b-text-it (27B params, medical text-only)
  • Training method: QLoRA (4-bit quantization + LoRA)
  • LoRA config: r=8, alpha=16, attention-only modules
  • Trainable params: 16.7M (0.21% of total)
  • Training data: 29,815 multilingual medical Q&A samples
  • Optimizer: fused AdamW, lr=3e-5, linear warmup over the first 5% of training steps
  • Hardware: NVIDIA A40 (48GB VRAM)
  • Final eval_loss: 1.39
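The trainable-parameter figure in the list above can be roughly reproduced by hand. The sketch below assumes that "attention-only modules" means the four attention projections (q_proj, k_proj, v_proj, o_proj) and that the base model has the dimensions hard-coded in the snippet (hidden size 5376, 62 layers, 32 query heads, 16 key/value heads, head dimension 128). Neither assumption is stated in this card; both should be checked against the adapter config and the base model's config.json.

```python
# Back-of-the-envelope LoRA parameter count.
# ASSUMPTIONS (not stated in this card): the model dimensions below and the
# choice of q/k/v/o projections as LoRA targets. Only RANK comes from the card.
HIDDEN = 5376      # assumed hidden size
LAYERS = 62        # assumed number of decoder layers
Q_HEADS = 32       # assumed number of query heads
KV_HEADS = 16      # assumed number of key/value heads
HEAD_DIM = 128     # assumed per-head dimension
RANK = 8           # r=8, from the training details


def lora_params(in_features: int, out_features: int, rank: int = RANK) -> int:
    """LoRA adds A (rank x in) and B (out x rank) to each adapted linear layer."""
    return rank * (in_features + out_features)


q_out = Q_HEADS * HEAD_DIM    # output width of q_proj
kv_out = KV_HEADS * HEAD_DIM  # output width of k_proj and v_proj

per_layer = (
    lora_params(HIDDEN, q_out)     # q_proj
    + lora_params(HIDDEN, kv_out)  # k_proj
    + lora_params(HIDDEN, kv_out)  # v_proj
    + lora_params(q_out, HIDDEN)   # o_proj
)
total = per_layer * LAYERS

print(f"{total:,} trainable LoRA parameters (~{total / 1e6:.2f}M)")
```

Under these assumptions the count comes to 16,760,832, which is in line with the 16.7M reported above.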

Loss Trajectory

Step    eval_loss
600     1.69
900     1.58
1200    1.50
1500    1.45
1800    1.42
1864    1.39 (best)
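The eval_loss values above can also be read as perplexities, which some readers find easier to compare. The sketch below assumes eval_loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training, but not confirmed in this card), so that perplexity is simply exp(loss).

```python
import math

# eval_loss values copied from the loss trajectory above (step -> loss).
eval_loss = {600: 1.69, 900: 1.58, 1200: 1.50, 1500: 1.45, 1800: 1.42, 1864: 1.39}

# ASSUMPTION: loss is mean per-token cross-entropy in nats, so ppl = exp(loss).
perplexity = {step: math.exp(loss) for step, loss in eval_loss.items()}

for step, ppl in perplexity.items():
    print(f"step {step:>4}: eval_loss={eval_loss[step]:.2f}  perplexity={ppl:.2f}")
```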

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/medgemma-27b-text-it",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
    quantization_config=quantization_config,
)

model = PeftModel.from_pretrained(base_model, "KYAGABA/medgemma-27b-msrh-african-oracle")
model.eval()

tokenizer = AutoTokenizer.from_pretrained("KYAGABA/medgemma-27b-msrh-african-oracle")

# Example
question = "How can young people access reproductive health services?"
language = "English"

prompt_text = f"Answer this question in {language} about maternal, sexual, and reproductive health: {question}"
messages = [{"role": "user", "content": prompt_text}]
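# The Gemma chat template already prepends <bos>, so disable special tokens
# when tokenizing the rendered prompt to avoid a duplicated <bos>.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)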

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        do_sample=False,
        num_beams=3,
        repetition_penalty=1.1,
    )

# Decode only the newly generated tokens by slicing off the prompt tokens.
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
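When querying the model in several languages, it may help to wrap the prompt format from the example above in a small helper. The function below is a hypothetical convenience wrapper, not part of the released code: the names `build_messages` and `SUPPORTED_LANGUAGES` are invented for illustration, and the language list is taken from the languages named at the top of this card.

```python
from typing import Dict, List

# Hypothetical helper (not part of the released adapter or its code).
# Language names follow the list at the top of this card.
SUPPORTED_LANGUAGES = ("Akan", "Amharic", "Luganda", "Swahili", "English")


def build_messages(question: str, language: str = "English") -> List[Dict[str, str]]:
    """Build a single-turn chat message using the prompt format from the usage example."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language {language!r}; expected one of {SUPPORTED_LANGUAGES}")
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    prompt_text = (
        f"Answer this question in {language} about maternal, sexual, "
        f"and reproductive health: {question}"
    )
    return [{"role": "user", "content": prompt_text}]


messages = build_messages("How can young people access reproductive health services?", "English")
print(messages[0]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` exactly as `messages` is in the example above.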

Dataset

Trained on the Zindi ITU Multilingual Health QA Challenge dataset:

Subset    Samples   Language   Region
Eng_Uga   7,624     English    Uganda
Aka_Gha   4,455     Akan       Ghana
Eng_Gha   4,443     English    Ghana
Eng_Eth   3,915     English    Ethiopia
Lug_Uga   3,383     Luganda    Uganda
Eng_Ken   2,080     English    Kenya
Swa_Ken   2,070     Swahili    Kenya
Amh_Eth   1,845     Amharic    Ethiopia
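The subset counts above can be aggregated per language to see how the training data is distributed. The snippet below is a small illustration that uses only the figures from the table; the variable names are arbitrary.

```python
# Subset -> (language, sample count), copied from the table above.
subsets = {
    "Eng_Uga": ("English", 7624),
    "Aka_Gha": ("Akan", 4455),
    "Eng_Gha": ("English", 4443),
    "Eng_Eth": ("English", 3915),
    "Lug_Uga": ("Luganda", 3383),
    "Eng_Ken": ("English", 2080),
    "Swa_Ken": ("Swahili", 2070),
    "Amh_Eth": ("Amharic", 1845),
}

total = sum(count for _, count in subsets.values())

# Sum the sample counts for each language across regions.
per_language = {}
for language, count in subsets.values():
    per_language[language] = per_language.get(language, 0) + count

# Print languages from most to least represented, with their share of the total.
for language, count in sorted(per_language.items(), key=lambda item: -item[1]):
    print(f"{language:<8} {count:>6,}  {100 * count / total:5.1f}%")
```

The subsets sum to the 29,815 samples quoted elsewhere in this card, and English accounts for 18,062 of them (about 60.6%).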

Intended Use

Intended for research and educational purposes, to support access to healthcare information in African languages. NOT for direct clinical use. Always consult qualified healthcare professionals.

Limitations

  • May prepend an English preamble to responses
  • Lower quality for Akan than for English (less training data)
  • Trained for only ~1.13 epochs (compute constraints)
  • Best suited to MSRH topics

Citation

@misc{medgemma27b-msrh-africa,
  author = {KYAGABA, Arul},
  title = {MedGemma 27B - MSRH African Oracle},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {https://huggingface.co/KYAGABA/medgemma-27b-msrh-african-oracle}
}

Acknowledgements

  • Google for MedGemma 27B base model
  • Zindi and ITU for the multilingual health QA challenge
  • AfriMed-QA community for advancing African medical AI