Relational Transformer
This repository contains the official checkpoints for the Relational Transformer (RT), introduced in the paper Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data.
Relational Transformer is a foundation model architecture designed to be pretrained on diverse relational databases and applied to unseen datasets and tasks without task- or dataset-specific fine-tuning. It utilizes a novel Relational Attention mechanism over columns, rows, and primary-foreign key links.
- Paper: Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data
- GitHub Repository: snap-stanford/relational-transformer
Installation
The repository uses pixi for package management.
git clone https://github.com/snap-stanford/relational-transformer
cd relational-transformer
pixi install
# compile and install the Rust sampler
cd rustler
pixi run maturin develop --uv --release
Checkpoints
The project provides two types of checkpoints:
- pretrain_<dataset>_<task>.pt: Pretrained with the specified <dataset> held out.
- contd-pretrain_<dataset>_<task>.pt: Obtained by continued pretraining on <dataset> with the specific <task> held out.
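The naming convention above can be sketched as a small helper. This is only an illustration: `checkpoint_name` is a hypothetical function, not part of the repository, and it assumes dataset and task names are used verbatim in the filename.

```python
def checkpoint_name(dataset: str, task: str, continued: bool = False) -> str:
    """Build a checkpoint filename following the README's naming pattern.

    Hypothetical helper for illustration; not part of the repository.
    """
    # "contd-pretrain" marks continued pretraining; "pretrain" is the plain variant.
    prefix = "contd-pretrain" if continued else "pretrain"
    return f"{prefix}_{dataset}_{task}.pt"


print(checkpoint_name("rel-amazon", "user-churn"))
print(checkpoint_name("rel-amazon", "user-churn", continued=True))
```

The first call yields the same filename that the download command in this README requests.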
You can download specific checkpoints using the Hugging Face CLI:
mkdir -p ~/scratch/rt_ckpts
huggingface-cli download rishabh-ranjan/relational-transformer \
--repo-type model \
--include "pretrain_rel-amazon_user-churn.pt" \
--local-dir ~/scratch/rt_ckpts \
--local-dir-use-symlinks False
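The same download can also be done from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `hf_hub_download` function; `download_checkpoint` and `REPO_ID` are hypothetical names introduced here, not part of the repository.

```python
import os

# Repository ID taken from the CLI command above.
REPO_ID = "rishabh-ranjan/relational-transformer"


def download_checkpoint(filename: str, local_dir: str = "~/scratch/rt_ckpts") -> str:
    """Download one checkpoint file from the Hub and return its local path.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Imported lazily so this module can be imported without the package installed.
    from huggingface_hub import hf_hub_download

    target = os.path.expanduser(local_dir)
    os.makedirs(target, exist_ok=True)
    return hf_hub_download(
        repo_id=REPO_ID,
        filename=filename,
        repo_type="model",
        local_dir=target,
    )
```

Calling `download_checkpoint("pretrain_rel-amazon_user-churn.pt")` should fetch the same file as the CLI command and return the path it was written to.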
Usage
To use these checkpoints, pass the path to the load_ckpt_path argument in the training scripts provided in the GitHub repository. For example, to run a finetuning experiment:
pixi run torchrun --standalone --nproc_per_node=8 scripts/example_finetune.py
RelBench leaderboard checkpoints (added 2026-06)
These files back the RT numbers on the RelBench leaderboard. Protocols follow the repo scripts, with one change: for regression tasks, best-checkpoint selection uses validation NMAE (MAE divided by the train-split standard deviation, ddof=1), which is the leaderboard metric, instead of R². Evaluation uses the full official test split (AUROC for classification, NMAE for regression).
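As a rough illustration of the NMAE definition above, the following sketch divides the MAE on the evaluated split by the sample standard deviation of the training targets. It assumes the standard deviation is taken over the train-split target values; `nmae` is a hypothetical helper, and the numbers are toy values, not results.

```python
from statistics import stdev


def nmae(y_true, y_pred, y_train):
    """MAE on the evaluated split divided by the train-split sample std (ddof=1).

    Hypothetical helper for illustration; not the repository's implementation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    # statistics.stdev uses the n - 1 denominator, i.e. ddof=1.
    return mae / stdev(y_train)


# Toy values for illustration only.
train_targets = [1.0, 2.0, 3.0, 4.0, 5.0]
score = nmae([2.0, 4.0], [2.5, 3.0], train_targets)
print(round(score, 4))
```

Normalizing by the train-split spread makes errors comparable across tasks whose targets live on different scales.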
- pretrain_rel-event_<task>.pt: leave-rel-event-out pretraining (50k steps), per-task best. rel-event was not covered in the original release; these produce the RT zero-shot rel-event cells.
- contd-pretrain_rel-event_<task>.pt: continued pretraining on the other rel-event tasks from the matching pretrain checkpoint (2^12+1 = 4097 steps).
- finetune-from-{pretrain,contd-pretrain}_<db>_<task>.pt: the fine-tuned checkpoint behind each replicated "RT | pretrained + fine-tuned" leaderboard cell. The board takes the per-task best over fine-tuning from the plain-pretraining and continued-pretraining inits (init treated as a hyperparameter); the file present is the winning init for that task. Cells without a file here are the paper's own pretrain-init fine-tuning numbers; reproduce those with scripts/example_finetune.py from the matching pretrain_<db>_<task>.pt.
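The per-task choice between the two inits can be sketched as follows. This is an assumption-laden illustration: the text does not say which split the comparison uses, so the sketch simply takes one score per init, and `winning_init`, `finetuned_checkpoint_name`, and the placeholder scores are all hypothetical.

```python
def winning_init(scores: dict, higher_is_better: bool) -> str:
    """Return the init whose score is best (max for AUROC, min for NMAE).

    Hypothetical helper for illustration; not part of the repository.
    """
    pick = max if higher_is_better else min
    return pick(scores, key=scores.get)


def finetuned_checkpoint_name(init: str, db: str, task: str) -> str:
    """Build a filename following the finetune-from-<init>_<db>_<task>.pt pattern."""
    return f"finetune-from-{init}_{db}_{task}.pt"


# Placeholder scores for illustration only; these are not reported results.
auroc_by_init = {"pretrain": 0.70, "contd-pretrain": 0.72}
best = winning_init(auroc_by_init, higher_is_better=True)
print(finetuned_checkpoint_name(best, "rel-amazon", "user-churn"))
```

For a regression task, the same call with `higher_is_better=False` would select the init with the lower NMAE.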
Citation
@inproceedings{ranjan2025relationaltransformer,
title={{Relational Transformer:} Toward Zero-Shot Foundation Models for Relational Data},
author={Rishabh Ranjan and Valter Hudovernik and Mark Znidar and Charilaos Kanatsoulis and Roshan Upendra and Mahmoud Mohammadi and Joe Meyer and Tom Palczewski and Carlos Guestrin and Jure Leskovec},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026}
}