How to use dd101bb/latentRM with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="dd101bb/latentRM")

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("dd101bb/latentRM")
model = AutoModelForTokenClassification.from_pretrained("dd101bb/latentRM")
```
metadata
base_model:
- openai-community/gpt2
datasets:
- openai/gsm8k
library_name: transformers
pipeline_tag: feature-extraction
tags:
- rm
- latent
license: apache-2.0
LatentRM
The Latent Reward Model (LatentRM) is a learned scorer for latent reasoning models, which reason in a continuous hidden space rather than through explicit tokens. It supplies the aggregation signal that parallel test-time scaling otherwise lacks in latent models, enabling techniques such as best-of-N and beam search without explicit token-level probabilities.
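To make the role of the scorer concrete, here is a minimal sketch of how a reward score can drive best-of-N selection and one step of beam search. It is illustrative only: `best_of_n`, `beam_step`, `expand_fn`, and `toy_score` are hypothetical names, and `toy_score` is a stand-in for LatentRM, whose actual scoring interface is not described in this card.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def best_of_n(candidates: Sequence[T], score_fn: Callable[[T], float]) -> Tuple[T, float]:
    """Return the highest-scoring candidate together with its score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    scored = [(score_fn(c), i) for i, c in enumerate(candidates)]
    best_score, best_idx = max(scored, key=lambda pair: pair[0])
    return candidates[best_idx], best_score


def beam_step(
    beams: Sequence[T],
    expand_fn: Callable[[T], List[T]],
    score_fn: Callable[[T], float],
    beam_width: int,
) -> List[T]:
    """Expand every beam, then keep the `beam_width` best children by score."""
    children = [child for beam in beams for child in expand_fn(beam)]
    children.sort(key=score_fn, reverse=True)
    return children[:beam_width]


# Assumption: a toy scorer standing in for a learned reward model.
# It prefers trajectories whose values sum to 1.0.
def toy_score(trajectory: List[float]) -> float:
    return -abs(sum(trajectory) - 1.0)


# Best-of-N: sample several trajectories in parallel, keep the top-scored one.
trajectories = [[0.2, 0.3], [0.5, 0.5], [0.9, 0.4]]
best, score = best_of_n(trajectories, toy_score)

# Beam search: score partial trajectories and prune to the beam width.
next_beams = beam_step(
    beams=[[0.0]],
    expand_fn=lambda b: [b + [0.5], b + [1.0], b + [2.0]],
    score_fn=toy_score,
    beam_width=2,
)
```

In both cases the selection relies only on a scalar score per trajectory, which is why a learned scorer can replace token-level probabilities when the reasoning steps are continuous vectors.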
Citation
@misc{you2025paralleltesttimescalinglatent,
title={Parallel Test-Time Scaling for Latent Reasoning Models},
author={Runyang You and Yongqi Li and Meng Liu and Wenjie Wang and Liqiang Nie and Wenjie Li},
year={2025},
eprint={2510.07745},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.07745},
}