Papers
arxiv:2410.03435

A General Framework for Producing Interpretable Semantic Text Embeddings

Published on Oct 4, 2024
Authors:
,
,
,

Abstract

CQG-MBQA is a framework that generates interpretable semantic text embeddings by creating discriminative yes/no questions and answering them efficiently, achieving performance comparable to black-box models while maintaining transparency.

AI-generated summary

Semantic text embedding is essential to many tasks in Natural Language Processing (NLP). While black-box models are capable of generating high-quality embeddings, their lack of interpretability limits their use in tasks that demand transparency. Recent approaches have improved interpretability by leveraging domain-expert-crafted or LLM-generated questions, but these methods rely heavily on expert input or well-prompt design, which restricts their generalizability and ability to generate discriminative questions across a wide range of tasks. To address these challenges, we introduce CQG-MBQA (Contrastive Question Generation - Multi-task Binary Question Answering), a general framework for producing interpretable semantic text embeddings across diverse tasks. Our framework systematically generates highly discriminative, low cognitive load yes/no questions through the CQG method and answers them efficiently with the MBQA model, resulting in interpretable embeddings in a cost-effective manner. We validate the effectiveness and interpretability of CQG-MBQA through extensive experiments and ablation studies, demonstrating that it delivers embedding quality comparable to many advanced black-box models while maintaining inherently interpretability. Additionally, CQG-MBQA outperforms other interpretable text embedding methods across various downstream tasks.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2410.03435
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2410.03435 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2410.03435 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2410.03435 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.