Open source evaluation framework: cost + hallucination dimensions alongside reports

#1
by vigneshwar234 - opened

Hi Upstage and Evalverse team!

Evalverse's visual evaluation reports are really well designed. I built an open source framework that complements it with production-focused metrics.

LLM Evaluation Framework adds the following to your evaluation stack:

  • Cost per 1K tokens: the missing dimension in most evaluation reports
  • Latency p50/p95/p99: full percentile breakdown
  • Hallucination Rate: 0.0-1.0 score, runs locally
  • Accuracy: 4-strategy cascade
  • Reasoning Quality: CoT depth 1-10
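
To make the first two metrics concrete, here is a minimal sketch of how cost per 1K tokens and latency percentiles could be computed. It is illustrative only and not taken from the framework's code: the function names `percentile` and `cost_per_1k_tokens`, the nearest-rank percentile method, and the sample numbers are all assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (an assumed method, chosen for simplicity)."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Rank of the smallest sample with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def cost_per_1k_tokens(total_cost_usd, total_tokens):
    """Normalize total spend to a per-1,000-token figure."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / total_tokens * 1000


# Made-up sample data for illustration.
latencies_ms = [120, 135, 150, 160, 180, 210, 250, 400, 900, 1500]
latency_summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
unit_cost = cost_per_1k_tokens(total_cost_usd=0.045, total_tokens=30_000)

print(latency_summary)
print(f"${unit_cost:.4f} per 1K tokens")
```

With only ten samples, p95 and p99 both land on the slowest request, which is why tail percentiles need a reasonably large run to be meaningful.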

Outputs: JSON, CSV, PDF report, Streamlit dashboard, SQLite persistence.

Live demo (no API key): https://huggingface.co/spaces/vigneshwar234/llm-eval-demo
GitHub: https://github.com/vignesh2027/LLM-Evaluation-Framework

Would love to discuss combining evaluation report approaches!
