ko-hallucheck-v1: Korean hallucination (faithfulness) detector

A Korean-only cross-encoder that takes a (context, answer) pair and decides whether the answer is faithful to the context (SUPPORTED) or a hallucination (HALLUCINATED). It is designed as an output gate for RAG pipelines, for acceptance testing of delivered LLMs, and as a quality gate for generated content.

English has mature detectors such as vectara/hallucination_evaluation_model, MiniCheck, and LettuceDetect, but no public Korean-specific detector exists. This model was built to fill that gap.

  • Base: BAAI/bge-reranker-v2-m3 (568M, Apache-2.0) → fine-tuned for 2-label sequence classification
  • Labels: 0 = HALLUCINATED, 1 = SUPPORTED
  • Max length: 512 (context+answer, longest_first truncation)

Performance

| Evaluation set | acc | AUROC | Hallucination recall (intrinsic / extrinsic) |
|---|---|---|---|
| in-dist test (Wikipedia-based, sentence-type, n=1002) | 0.938 | 0.980 | 0.83 / 1.00 |
| span-type held-out (KorQuAD, Wikipedia, n=688) | 0.988 | 0.997 | 0.99 / 1.00 |
| cross-source OOD (KLUE-MRC news spans, n=1500) | 0.966 | 0.979 | 0.99 / 0.99 |
  • The OOD set comes from a source not used in training (news domain). The numbers above are obtained at the default threshold of 0.5, with no separate calibration.
  • Key improvement over v1: training on sentence-type data alone makes detection collapse on span-type inputs (a format shortcut). Multi-format training (sentence-type plus span-type) resolves this.
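As a rough illustration of how the table's columns could be computed at the default 0.5 threshold, the sketch below takes P(SUPPORTED) scores, gold labels, and a per-example hallucination type. The function name `evaluate` and the `kinds` tags are hypothetical; the evaluation script actually used for this card is not shown here.

```python
from collections import defaultdict

HALLUCINATED, SUPPORTED = 0, 1  # label ids as listed in this card


def evaluate(probs, labels, kinds, threshold=0.5):
    """probs: P(SUPPORTED) per example; labels: gold 0/1 labels;
    kinds: 'supported', 'intrinsic' or 'extrinsic' (hypothetical tags).
    Assumes both classes are present in `labels`."""
    preds = [SUPPORTED if p >= threshold else HALLUCINATED for p in probs]
    acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)

    # Recall on hallucinated examples, split by hallucination type.
    hit, total = defaultdict(int), defaultdict(int)
    for pred, y, kind in zip(preds, labels, kinds):
        if y == HALLUCINATED:
            total[kind] += 1
            hit[kind] += int(pred == HALLUCINATED)
    recall = {k: hit[k] / total[k] for k in total}

    # AUROC as the probability that a supported example outscores a
    # hallucinated one (ties count half). O(n*m), fine for small sets.
    pos = [p for p, y in zip(probs, labels) if y == SUPPORTED]
    neg = [p for p, y in zip(probs, labels) if y == HALLUCINATED]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    auroc = wins / (len(pos) * len(neg))
    return {"acc": acc, "auroc": auroc, "recall": recall}
```

Reporting recall per hallucination type, rather than one pooled figure, is what exposes the weaker intrinsic score on the in-dist set.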

Honest limitations (read before use)

  1. Negative (hallucinated) samples were built from LLM generation plus rule-based perturbation (number/entity substitution, transplanting content from other documents). The hallucinations in the OOD evaluation come from the same rule family, so their distribution may differ from real-world LLM hallucinations. An independent human-labeled benchmark (Ko-FaithBench) is in preparation, and its results will be added here once released.
  2. In-dist recall on subtle, single-character-level perturbations (intrinsic) is 0.83, so extremely fine distortions can be missed.
  3. Input beyond 512 tokens (context and answer combined) is truncated. Split long documents into chunks and score each chunk.
  4. This is not a factuality check but a check of faithfulness to the given context. If the context itself is wrong, the model will not catch it.
  5. Do not use the model on its own, without human review, for high-risk uses such as medicine or law.
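Limitation 3 recommends chunking long documents. A minimal sketch of one way to do that follows. Everything in it is an assumption rather than part of this model's API: the character-based window sizes, the helper names `chunk_text` and `score_long_context`, and the choice to aggregate with `max` (treating the answer as supported if any single chunk supports it). `score_fn` stands for any callable that returns P(SUPPORTED) for a (context, answer) pair.

```python
from typing import Callable, List


def chunk_text(text: str, size: int = 800, overlap: int = 200) -> List[str]:
    """Split text into overlapping character windows.
    The sizes are placeholders; in practice, pick them so that
    chunk + answer stays under the 512-token limit."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks or [""]


def score_long_context(
    context: str,
    answer: str,
    score_fn: Callable[[str, str], float],
    size: int = 800,
    overlap: int = 200,
) -> float:
    """Score the answer against every chunk and keep the best score."""
    return max(score_fn(chunk, answer) for chunk in chunk_text(context, size, overlap))
```

Max aggregation suits answers whose evidence sits in one place. An answer that combines facts from distant parts of a document may find no single supporting chunk and be flagged, so such cases may need retrieval of the relevant passages first.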

Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo = "jismsy/ko-hallucheck-v1"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo).eval()

# Context: "Nongshim's first company name was Lotte Industrial. In 1978 it changed its name to Nongshim."
context = "농심의 첫 회사명은 롯데공업사였다. 1978년 사명을 농심으로 변경했다."
# Answer (not supported by the context): "Nongshim was founded in 1965 as Samyang Foods."
answer = "농심은 1965년 삼양식품으로 창립되었다."

# Encode the pair as a single cross-encoder input; the longer segment is truncated first.
enc = tok(context, answer, truncation="longest_first", max_length=512, return_tensors="pt")
with torch.no_grad():
    # Index 1 is the SUPPORTED class.
    prob_supported = torch.softmax(model(**enc).logits, -1)[0, 1].item()
print(f"P(SUPPORTED): {prob_supported:.3f}")  # below 0.5 -> judged hallucinated
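For the output-gate use case mentioned at the top of this card, the probability can be wrapped in a small decision helper. This is a sketch only: `Verdict` and `gate` are hypothetical names, not part of the model or of transformers, and the 0.5 default simply mirrors the threshold used above.

```python
from dataclasses import dataclass

ID2LABEL = {0: "HALLUCINATED", 1: "SUPPORTED"}  # label ids from this card


@dataclass
class Verdict:
    label: str
    prob_supported: float
    passed: bool


def gate(prob_supported: float, threshold: float = 0.5) -> Verdict:
    """Turn P(SUPPORTED) into a pass/block decision for a pipeline gate."""
    passed = prob_supported >= threshold
    return Verdict(ID2LABEL[int(passed)], prob_supported, passed)
```

Calling `gate(prob_supported)` on the value computed above returns a `Verdict` whose `passed` flag can decide whether the answer is shown. Raising `threshold` blocks more answers, trading extra false alarms for fewer missed hallucinations.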

Training data

  • ~18k sentence-type (context, answer) pairs based on Korean Wikipedia: LLM-generated supported/intrinsic/extrinsic examples, with a sample manually reviewed (label accuracy ~90%)
  • ~5.3k span-type pairs based on KorQuAD v1: rule-based generation
  • train/val/test split at the document (article) group level to prevent leakage
  • Source-data licenses: Korean Wikipedia (CC BY-SA), KorQuAD v1 (CC BY-ND). The dataset itself is not redistributed.
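The article-level split above can be sketched as follows. The field name `article_id`, the 80/10/10 ratios, and the helper `group_split` are illustrative assumptions; the card does not state the actual ratios or tooling.

```python
import random
from collections import defaultdict


def group_split(examples, group_key="article_id", ratios=(0.8, 0.1, 0.1), seed=0):
    """Split examples so that all pairs from one article land in the same split."""
    groups = defaultdict(list)
    for ex in examples:
        groups[ex[group_key]].append(ex)

    # Shuffle article ids, not individual examples, to avoid leakage.
    ids = sorted(groups)
    random.Random(seed).shuffle(ids)

    n_train = int(len(ids) * ratios[0])
    n_val = int(len(ids) * ratios[1])
    parts = {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }
    return {name: [ex for g in gids for ex in groups[g]] for name, gids in parts.items()}
```

Splitting by article matters because several (context, answer) pairs share the same source passage; a per-example split would let near-duplicates of test contexts into training. scikit-learn's `GroupShuffleSplit` offers the same idea as a library routine.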

Citation

@misc{ko-hallucheck-2026,
  title={ko-hallucheck: Korean Faithfulness / Hallucination Detection Cross-Encoder},
  author={ianwoo},
  year={2026},
  url={https://huggingface.co/jismsy/ko-hallucheck-v1}
}