Open source evaluation framework: cost + hallucination dimensions alongside reports
#1
by vigneshwar234 - opened
Hi Upstage and Evalverse team!
Evalverse's visual evaluation reports are really well designed. I built an open-source framework that complements them with production-focused metrics.
LLM Evaluation Framework adds to your evaluation stack:
- Cost per 1K tokens: the missing dimension in most evaluation reports
- Latency p50/p95/p99: full percentile breakdown
- Hallucination Rate: 0.0-1.0 score, runs locally
- Accuracy: 4-strategy cascade
- Reasoning Quality: CoT depth 1-10
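To make the first two metrics concrete, here is a minimal sketch of how cost per 1K tokens and latency percentiles could be computed from raw run data. It is an illustration only, not code from the repository: the function names (`percentile`, `latency_summary`, `cost_per_1k_tokens`) are hypothetical, and the nearest-rank percentile method is an assumption, since the framework may interpolate differently.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (assumed method; the framework may differ)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    ordered = sorted(values)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def latency_summary(latencies_s):
    """Return the p50/p95/p99 breakdown for a list of latencies in seconds."""
    return {f"p{p}": percentile(latencies_s, p) for p in (50, 95, 99)}


def cost_per_1k_tokens(total_cost_usd, total_tokens):
    """Normalize total spend to a per-1,000-token figure."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd * 1000 / total_tokens


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    latencies = [0.42, 0.51, 0.47, 1.90, 0.44, 0.49, 0.53, 0.46, 0.48, 3.20]
    print(latency_summary(latencies))
    print(cost_per_1k_tokens(total_cost_usd=3.0, total_tokens=1_500_000))
```

Reporting p95 and p99 next to the median matters because a handful of slow calls, like the two outliers in the sample data, barely shift p50 but dominate the tail.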
Outputs: JSON, CSV, PDF report, Streamlit dashboard, SQLite persistence.
Live demo (no API key): https://huggingface.co/spaces/vigneshwar234/llm-eval-demo
GitHub: https://github.com/vignesh2027/LLM-Evaluation-Framework
Would love to discuss combining evaluation report approaches!