PGSM Text Surprisal Editor Model

This repository contains the trained model weights used by the Hugging Face Space:

https://huggingface.co/spaces/build-small-hackathon/pgsm-text-surprisal-editor

Model Summary

PGSM Text Surprisal Editor is powered by a compact non-Transformer language model based on a custom ExactState Memory / PGSM architecture.

The model scores whole-word surprisal: each word is removed in turn, and the model evaluates how predictable that word is from its left and right context.
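Surprisal is conventionally defined as the negative log-probability of an item given its context, and when a word spans several tokens the per-token surprisals add up. The following is a minimal sketch of that aggregation, assuming the model yields one probability per token of the removed word. The function names and the example probabilities are hypothetical and are not taken from the Space's code, which may score words differently.

```python
import math


def token_surprisal_bits(prob: float) -> float:
    """Surprisal, in bits, of one token that the model assigns probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def word_surprisal_bits(token_probs: list[float]) -> float:
    """Whole-word surprisal: the sum of the surprisals of the word's tokens."""
    return sum(token_surprisal_bits(p) for p in token_probs)


# Hypothetical example: a word split into two tokens with
# conditional probabilities 0.5 and 0.25 (1 bit + 2 bits).
print(word_surprisal_bits([0.5, 0.25]))  # 3.0
```

Higher values mean the word was harder to predict from its context, which is the quantity a surprisal heatmap would colour by.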

Architecture

  • Architecture: PGSM / ExactState Memory
  • Transformer blocks: 0
  • Self-attention layers: 0
  • Parameters: approximately 4 million
  • Vocabulary: approximately 2k tokens
  • Model file: final_infer.pt

This model does not use Transformer self-attention. Context is propagated through learned state transitions rather than pairwise attention computations.
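To illustrate the general difference between state-based and attention-based context propagation, here is a toy sketch. It is not the PGSM / ExactState Memory update rule, which this card does not specify. It assumes a scalar state and fixed `decay` and `gain` constants, whereas a real model would learn its transition parameters, and all names are hypothetical.

```python
def run_recurrence(inputs, decay=0.5, gain=1.0):
    """Toy scalar recurrence: state_t = decay * state_{t-1} + gain * x_t.

    Each token triggers one state update, so the work grows linearly
    with sequence length.
    """
    state = 0.0
    states = []
    for x in inputs:
        state = decay * state + gain * x
        states.append(state)
    return states


def pairwise_interactions(n):
    """Number of (query, key) pairs that causal self-attention scores for n tokens."""
    return n * (n + 1) // 2


# An input at the first position keeps influencing later states via the carried state.
print(run_recurrence([1.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25]

# For comparison: 1000 tokens need 1000 state updates, but this many attention pairs.
print(pairwise_interactions(1000))  # 500500
```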

Training

The model was fully trained by the author on approximately 19 billion tokens from FineWeb-Edu.

Training details:

  • Training source: FineWeb-Edu
  • Training scale: approximately 19B tokens
  • Training type: full custom training by the author
  • Base architecture: PGSM / ExactState Memory
  • Off-the-shelf Transformer checkpoint used: none
  • Final inference weights: final_infer.pt
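A minimal sketch of fetching the weights file with the `huggingface_hub` library follows. The repository id is an assumption based on the model page this card belongs to, and the helper name `download_weights` is hypothetical. The import is deferred so the function can be defined without the library installed.

```python
# Assumed repository id; adjust it if the weights are hosted elsewhere.
REPO_ID = "nilmeruo/SurpriseLensModel"
# File name as stated in this model card.
FILENAME = "final_infer.pt"


def download_weights(repo_id: str = REPO_ID, filename: str = FILENAME) -> str:
    """Download the weights file into the local Hugging Face cache and return its path."""
    # Requires `pip install huggingface_hub` and network access when called.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

Calling `download_weights()` returns a local file path. How the file is then deserialized depends on how it was saved and on the model code used by the Space, neither of which is specified in this card.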

Intended Use

This model is intended for the PGSM Text Surprisal Editor Space, where it powers whole-word surprisal heatmaps for pasted text.

The model is designed for experimentation, visualization, and language-analysis demos rather than production writing assistance or factual generation.

Limitations

  • Very small model size compared with mainstream LLMs
  • Compact vocabulary
  • Designed for surprisal visualization, not general-purpose chat
  • Outputs should be treated as model-analysis signals, not factual judgments
  • Training and evaluation details are given only in summary form, for hackathon review

Hackathon Context

This model supports the Hugging Face Build Small Hackathon submission:

  • Track: Thousand Token Wood
  • Badges: Tiny Titan, Well-Tuned, Off the Grid, Field Notes

The key goal is to demonstrate a very small, fully trained, non-Transformer language model running locally inside a Hugging Face Space.

