GPT-NL Energy Prediction Models

Trained energy prediction models for the GPT-NL data curation pipeline. These models estimate the energy consumption of each pipeline stage (data splitting → string normalization → heuristic filtering → toxic language detection → deduplication) running on the Snellius supercomputer.

Usage

pip install git+https://github.com/kruuusher13/gptnl-energy-estimation-ekf.git huggingface_hub

# Download and use a model
from huggingface_hub import hf_hub_download
from gptnl_energy.models import get_model

model_path = hf_hub_download('GPT-NL/gptnl-energy-models', 'linear_energy.joblib')
model = get_model('linear')
model.load_fits(model_path)

# Predict energy for 400k documents
result = model.predict(n=400000, corpus='american_stories')
print(f'{result["total_j"]/1e6:.2f} MJ')
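The prediction above is reported in joules. For comparison with facility or billing figures, a conversion to kilowatt-hours can be handy. The following is a minimal sketch that assumes only that `total_j` is in joules, as in the example above; `joules_to_kwh` is a hypothetical helper and not part of `gptnl_energy`, and the input value is illustrative rather than a measured result.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def joules_to_kwh(energy_j: float) -> float:
    """Convert an energy value in joules to kilowatt-hours."""
    return energy_j / JOULES_PER_KWH


# Illustrative value only, not a measured or predicted result.
example_total_j = 7.2e6
print(f"{joules_to_kwh(example_total_j):.2f} kWh")
```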

Available Models

| File | Type | Target | Description |
| --- | --- | --- | --- |
| ols_fits_with_dedup.json | sklearn | energy_j (Joules) | Calibrated per-stage OLS coefficients (production calibration, including deduplication); the default fits used by gptnl-energy forecast and monitor |
| linear_energy.joblib | sklearn | energy_j (Joules) | Per-stage OLS linear model (physics baseline): E = c0 + c1*n per pipeline stage |
| linear_time.joblib | sklearn | - | - |
| ridge_energy.joblib | sklearn | - | - |
| ridge_time.joblib | sklearn | - | - |
| gbm_energy.joblib | sklearn | energy_j (Joules) | Histogram Gradient Boosting: best ML model for cross-corpus energy transfer |
| gbm_time.joblib | sklearn | - | - |
| mlp_energy.joblib | sklearn | - | - |
| mlp_time.joblib | sklearn | - | - |
| ftt_energy.pt | PyTorch | energy_j (Joules) | FT-Transformer: neural tabular model with feature tokenization + stage embedding |
| ftt_time.pt | PyTorch | - | - |
| kalman_transformer_energy.pt | PyTorch | total_energy_j (whole pipeline, cold start) | FT-Transformer trained on Kalman filter trajectories for cold whole-run prediction |

Methodology

Three prediction layers:

  1. Physics: Per-stage linear model E = c0 + c1·n (OLS, calibrated from sample runs)
  2. Learned g: Coefficient predictor for unseen corpora (GBM, MLP, FT-Transformer)
  3. Kalman filter (EKF): Online estimator blending model prediction with live telemetry
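
The first and third layers in the list above can be illustrated with a small, self-contained sketch. It is not the repository's implementation: the function names (`fit_stage`, `predict_total`, `kalman_update`), the stage names, and all numbers are hypothetical, and the filter step shown is a plain scalar Kalman measurement update rather than the extended Kalman filter (EKF) the project uses. The learned coefficient predictor (layer 2) is not covered.

```python
import numpy as np


def fit_stage(n_docs, energy_j):
    """OLS fit of E = c0 + c1 * n for a single pipeline stage; returns (c0, c1)."""
    n = np.asarray(n_docs, dtype=float)
    e = np.asarray(energy_j, dtype=float)
    design = np.column_stack([np.ones_like(n), n])  # columns: intercept, n
    (c0, c1), *_ = np.linalg.lstsq(design, e, rcond=None)
    return float(c0), float(c1)


def predict_total(fits, n):
    """Sum the per-stage linear predictions for a corpus of n documents."""
    return sum(c0 + c1 * n for c0, c1 in fits.values())


def kalman_update(x_prior, p_prior, z, r):
    """Scalar Kalman measurement update.

    x_prior, p_prior: model-based estimate and its variance
    z, r: telemetry-based measurement and its variance
    """
    gain = p_prior / (p_prior + r)
    x_post = x_prior + gain * (z - x_prior)
    p_post = (1.0 - gain) * p_prior
    return x_post, p_post


# Synthetic sample runs (made-up numbers, generated from exact linear laws).
sample_n = [1000, 2000, 4000]
fits = {
    "normalization": fit_stage(sample_n, [250.0, 450.0, 850.0]),   # 50 + 0.2 n
    "deduplication": fit_stage(sample_n, [600.0, 1100.0, 2100.0]),  # 100 + 0.5 n
}

prior_j = predict_total(fits, 10_000)
# Blend the model prediction with a hypothetical telemetry-based estimate.
blended_j, variance = kalman_update(prior_j, p_prior=400.0, z=7300.0, r=400.0)
print(f"model: {prior_j:.1f} J, blended: {blended_j:.1f} J")
```

With equal variances for the prior and the measurement, the gain is 0.5 and the blended estimate lands halfway between the model prediction and the telemetry value; a noisier measurement (larger `r`) pulls the result toward the model.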

See the GPT-NL Energy repo for the full pipeline.

Source and thesis

All coefficients and evaluation numbers are regenerable from the measurement data with the scripts in paper/code/.
