# ENS Appraiser v0.2
A gradient-boosted regressor that predicts the USD sale price of an
ENS (.eth) domain name from on-chain history, semantic embeddings of the
label, and macro-market context.
This is the v0 baseline — handcrafted features + mpnet PCA + KNN comparable-sale aggregates. Built to establish an honest, leakage-free floor that future versions improve on.
## Quick numbers
Trained on ~265k ENS secondary sales (Jan 2022 – Sep 2023), evaluated on 2,744 sales in Q1–Q2 2024 (held out by date, never seen during training):
| Split | n | R² (log USD) | RMSE (log USD) | Median APE | Bias |
|---|---|---|---|---|---|
| Train | 265,240 | 0.7700 | 0.7744 | 32.5% | +0.000 |
| Val | 3,545 | 0.6602 | 1.0678 | 57.0% | +0.203 |
| Test | 2,744 | 0.3081 | 1.5469 | 138.3% | +0.732 |
Plain-English read: for a typical mid-tier name in test, the model is within ~2× of the actual sale price. The long tail — celebrity names, 3-letter premiums, regime shifts — is where it misses, often by 100×+ in either direction.
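For concreteness, a minimal sketch of how these numbers are presumably computed: R² and RMSE in log-USD space, APE back in price space. The exact evaluation code isn't shown in this card, so the function name and the Bias sign convention are assumptions.

```python
import numpy as np

def eval_metrics(y_log: np.ndarray, yhat_log: np.ndarray):
    resid = y_log - yhat_log
    # R² and RMSE are both computed on log(USD), matching the table headers.
    r2 = 1.0 - np.sum(resid**2) / np.sum((y_log - y_log.mean()) ** 2)
    rmse = np.sqrt(np.mean(resid**2))
    # APE is back in price space: |predicted - actual| / actual.
    ape = np.abs(np.exp(yhat_log) - np.exp(y_log)) / np.exp(y_log)
    bias = resid.mean()  # assumed convention: actual minus predicted, in log USD
    return r2, rmse, np.median(ape), bias
```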
## What's good
- Mid-tier names, $50–$5,000 range: usually within 2× of actual.
- Length and character composition: strong signals captured well. The model knows 3-letter ASCII names are premium and 12-letter random handles are cheap.
- Wordlist hits: matches against Wikipedia, GeoNames, US first names, stock tickers, and SEC EDGAR are picked up correctly. `paris.eth` is flagged as a city, `nike.eth` as a brand.
- Comparable-sale anchoring: the top two features are `knn_mean_log` and `knn_p90_log` — the model leans heavily on "what did similar names sell for recently?", which is the right intuition for valuation.
## What's not
- Celebrity / brand premium: a name's value to a known buyer (Coinbase wanting `coinbase.eth`, a luxury brand wanting their mark) is invisible to this model. It can detect that `nike.eth` is a brand word, but not that the sale price reflects Nike's interest specifically.
- 3-letter premium tail: names like `mph.eth` and `uma.eth` sold for $20k–$40k in test; the model predicted $100–$200. The training set underweights short premiums because most sales in it are 5+ letters.
- Regime shift on test: the test-set median price is ~4× the training median due to the 2023 → 2024 ENS market shift. Recency-weighted training (1-year half-life) helps but doesn't fully close the gap.
- Bidirectional errors: the worst predictions split roughly evenly between under-prediction (hot names the model didn't recognize) and over-prediction (cold names that just didn't move). A 138% median APE is honest but uncomfortable.
## How it's built
| Component | Detail |
|---|---|
| Algorithm | XGBoost regressor (170 boosted trees, max_depth=7) |
| Target | log(sale_price_usd) |
| Features | 146 total |
| Training data | 265,240 sales, Jan 2022 – Sep 2023 |
| Training time | ~10 min on a single A100 |
| Model size | 3.3 MB |
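A hedged sketch of the training call this table implies. The tree count, depth, log target, recency weights, and seed are stated elsewhere in this card; the learning rate, objective, early-stopping patience, and the stand-in random data are illustrative.

```python
import numpy as np
import xgboost as xgb

# Stand-in data with the real feature width (146 columns).
rng = np.random.default_rng(42)
X_train, X_val = rng.normal(size=(1000, 146)), rng.normal(size=(200, 146))
y_train, y_val = rng.lognormal(5, 1.5, 1000), rng.lognormal(5, 1.5, 200)
w_train = np.ones(1000)  # recency weights (see "Train/Val/Test split")

dtrain = xgb.DMatrix(X_train, label=np.log(y_train), weight=w_train)
dval = xgb.DMatrix(X_val, label=np.log(y_val))
booster = xgb.train(
    {"max_depth": 7, "eta": 0.1, "objective": "reg:squarederror", "seed": 42},
    dtrain,
    num_boost_round=170,        # stated tree count
    evals=[(dval, "val")],
    early_stopping_rounds=50,   # card says "early stopping"; 50 is a guess
)
```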
### Feature breakdown
- Handcrafted (15): length, n_digits, n_letters, n_special, palindrome, is_all_digits, is_all_letters, is_ascii, has_unicode, starts/ends_digit, max_char_run, n_unique_chars
- Wordlist hits (8): Wikipedia titles, GeoNames cities, US first names, ISO 3166 countries, stock tickers, SEC EDGAR companies, Wiktionary EN, plus a `wordlist_hits` total
- Grails clubs (~45): binary membership in each curated `.eth` club (`999club`, `pre-punks`, `palindromes`, `pokemon_gen1`, etc.)
- Trademark conflict (1): active USPTO mark in Nice classes 9, 35, 36, 38, 41, 42, 45 with matching `mark_text_norm`
- Holder behavior (2): `name_age_days`, `prior_transfer_count` (leakage-safe — only counts transfers strictly before the sale block)
- Macro context (5): Fear & Greed Index, ETH chain TVL, ETH stablecoin market cap, ETH DEX volume, total NFT marketplace fees on the sale day
- mpnet PCA (64): 768-dim `all-mpnet-base-v2` embeddings of the label, PCA-reduced to 64 dims (95% explained variance)
- KNN comparable sales (8): for each label, FAISS-retrieve the top-50 semantic neighbors (HNSW index), filter near-duplicates (sim > 0.999), take the most recent prior sale of each, and aggregate as `knn_count`, `knn_mean_log`, `knn_median_log`, `knn_p90_log`, `knn_max_sim`, `knn_min_sim`, `knn_log_max`, `knn_log_min`. Strict leakage prevention: only neighbors with sales before the current sale's date count (see the sketch after this list).
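A minimal sketch of that comp aggregation, assuming the neighbor lookup has already run; `knn_comp_features` and its input shapes are illustrative, not the repo's actual code:

```python
import numpy as np

def knn_comp_features(sale_date, neighbors, k=50):
    """`neighbors` is assumed to be (similarity, prior_sales) pairs from the
    FAISS/HNSW lookup, where prior_sales is that neighbor's sale history as
    (date, log_usd) tuples, newest first. Only sales strictly before
    `sale_date` may contribute (the leakage guard)."""
    sims, logs = [], []
    for sim, sales in neighbors[:k]:
        if sim > 0.999:  # near-duplicate of the query label; skip
            continue
        prior = [lp for (d, lp) in sales if d < sale_date]
        if not prior:
            continue
        sims.append(sim)
        logs.append(prior[0])  # most recent prior sale only
    if not logs:
        return {"knn_count": 0}
    logs = np.asarray(logs)
    return {
        "knn_count": len(logs),
        "knn_mean_log": logs.mean(),
        "knn_median_log": float(np.median(logs)),
        "knn_p90_log": float(np.percentile(logs, 90)),
        "knn_max_sim": max(sims),
        "knn_min_sim": min(sims),
        "knn_log_max": logs.max(),
        "knn_log_min": logs.min(),
    }
```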
### Top 10 features by gain
| Rank | Feature | Gain |
|---|---|---|
| 1 | knn_mean_log | 1,714 |
| 2 | knn_p90_log | 1,613 |
| 3 | len | 1,364 |
| 4 | in_wikipedia | 1,052 |
| 5 | is_all_digits | 944 |
| 6 | knn_median_log | 604 |
| 7 | n_digits | 338 |
| 8 | pca_000 | 289 |
| 9 | n_clubs | 282 |
| 10 | ends_digit | 277 |
Four of the top ten are KNN-comp or PCA features, which means the embedding pipeline is doing real work — it's not just paying for itself, it's the dominant signal alongside length.
## Training data + leakage controls
Built from the `quantumly/ens-appraiser-data` dataset:
- Sales labels: Alchemy `getNFTSales` for the ENS BaseRegistrar + NameWrapper contracts. Wei amounts are converted to USD via CoinGecko hourly OHLC at the sale's block timestamp (see the join sketch after this list). Coverage gap: Alchemy `getNFTSales` v2 truncates at block 19,768,978 (May 2024) and does not index Blur marketplace sales. v0 ships with this gap; closing it is a v1 priority.
- Registrations + transfers: The Graph's ENS subgraph.
- Wordlists: Wiktionary dumps, Wikipedia EN article titles, GeoNames cities500, US Census baby names, NASDAQ Trader ticker dumps, SEC EDGAR company tickers, ISO 3166 country list.
- Macro: alternative.me Fear & Greed Index, DefiLlama (TVL, stablecoin mcap, DEX volume, NFT marketplace fees).
- Trademarks: USPTO Trademark Case Files Dataset (annual research dump).
- Embeddings: `sentence-transformers/all-mpnet-base-v2`, encoded once for all 3.5M ENS labels in the dataset.
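A minimal sketch of that hourly price join, assuming a `sales` frame with block timestamps and wei amounts and an `eth_usd` frame of CoinGecko hourly candles; all names are illustrative:

```python
import pandas as pd

def add_usd_prices(sales: pd.DataFrame, eth_usd: pd.DataFrame) -> pd.DataFrame:
    # Bucket each sale's block timestamp to the hour and join it to the
    # matching hourly ETH/USD candle, then convert wei -> ETH -> USD.
    sales = sales.assign(hour=sales["block_time"].dt.floor("h"))
    candles = eth_usd.assign(hour=eth_usd["timestamp"].dt.floor("h"))
    out = sales.merge(candles[["hour", "open"]], on="hour", how="left")
    out["sale_price_usd"] = out["price_wei"] / 1e18 * out["open"]
    return out
```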
### Leakage controls
The first version of this model accidentally leaked future information through `lifetime_transfer_count`: it counted all transfers ever for a labelhash, including transfers that happened after the sale being predicted. The leaky model showed train R² 0.81 / test R² −0.29 — the classic leakage signature, where held-out performance collapses below even a predict-the-mean baseline.

The current model uses `prior_transfer_count`, which only counts transfers where `transfer_block < sale_block` for each row. It moved to rank #11 in feature importance (it had been #1 by a 3.3× margin). KNN comparable-sale features have the same safeguard: a neighbor's sale only counts if it happened strictly before the sale being predicted.
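For illustration, one way to compute such a leakage-safe count with `pandas.merge_asof`; frame and column names are assumptions, not the notebook's actual code:

```python
import pandas as pd

def prior_transfer_counts(sales: pd.DataFrame, transfers: pd.DataFrame) -> pd.Series:
    # Running transfer count per labelhash, in block order.
    t = transfers.sort_values("block").copy()
    t["cum_transfers"] = t.groupby("labelhash").cumcount() + 1
    # For each sale, take the running count as of the last transfer with
    # block < sale_block; allow_exact_matches=False makes the cutoff strict.
    s = sales.sort_values("sale_block")
    merged = pd.merge_asof(
        s, t[["labelhash", "block", "cum_transfers"]],
        left_on="sale_block", right_on="block",
        by="labelhash", allow_exact_matches=False,
    )
    # Rows follow the block-sorted sale order; names with no prior transfer get 0.
    return merged["cum_transfers"].fillna(0).astype(int)
```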
### Train/Val/Test split
Fixed-window temporal split:
- Train: sales with `sale_date < 2023-10-01`
- Val: sales 2023-10-01 → 2023-12-31
- Test: sales 2024-01-01 onwards
This prevents the v0.1 mistake of training on 2022 prices and asking the model to extrapolate to a 2024 market regime that's ~4× more expensive on average. Val and test are in the same regime so val RMSE is a meaningful proxy for test.
Training rows are weighted with an exponential recency decay (1-year half-life, normalized to mean=1.0) so the model leans on 2023 dynamics without throwing away the older data entirely.
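A minimal sketch of that weighting, assuming a datetime `sale_date` column; the function name is illustrative:

```python
import numpy as np
import pandas as pd

def recency_weights(sale_date: pd.Series, half_life_days: float = 365.0) -> np.ndarray:
    # Exponential decay: a sale one half-life older gets half the weight.
    age_days = (sale_date.max() - sale_date).dt.days.to_numpy()
    w = 0.5 ** (age_days / half_life_days)
    return w / w.mean()  # normalize to mean 1.0 so the effective sample size is unchanged
```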
## Intended use
This model is intended for research and analytics, not as a price oracle and not for live trading.
Reasonable uses:
- Bulk valuation of mid-tier ENS portfolios for tax/accounting purposes
- Identifying obviously over- or under-listed names on secondary markets
- Sanity-checking a listing price before posting
- Producing comparable-sale ranges for negotiation context
Out of scope:
- Pricing 3-letter, 1-2 letter, or otherwise-premium names with confidence
- Pricing celebrity / known-brand names where the buyer pool is concentrated
- Predicting prices for names in the post-May-2024 marketplace mix (Blur dominance, marketplace fee changes)
- Any high-stakes financial decision based on a single point estimate
## Limitations
- Sales coverage: Jan 2022 – May 2024 only, no Blur. ~2 years of recent sales (mid-2024 onwards) are missing entirely from training. Closing this gap requires either a new sales source (Reservoir and SimpleHash are both defunct as of 2024–2025) or direct `eth_getLogs` decoding of Seaport, Blur, X2Y2, and LooksRare events, planned for v1.
- Celebrity premium: there's no feature here for "is this a famous person/place/thing?" beyond Wikipedia-title matching. v1 adds LLM-derived structured features (`fame_score`, `name_kind`, `crypto_relevance`, `brand_collision_risk`), which should close most of this gap.
- Out-of-distribution labels: pure-digit labels (`0001`), punycode/emoji, and l33tspeak get less benefit from the mpnet embeddings since they're out of distribution for the pretrained model. Length and charset features partially compensate.
- Time drift: the ENS market shifts noticeably every 6–12 months as marketplace dominance, fee structures, and DAO actions move. Predictions on names sold "right now" will lag any regime shift since the training cutoff.
- Test-set thinness: only 2,744 sales meet the $10 floor and post-Jan-2024 cutoff. The reported test R² has roughly a ±0.08 95% CI — useful as a ballpark, not a precise number.
## How to use
```python
from huggingface_hub import hf_hub_download
import xgboost as xgb
import pickle

model_path = hf_hub_download(
    repo_id="quantumly/ens-appraiser",
    filename="v0_appraiser_xgb.json",
)
pca_path = hf_hub_download(
    repo_id="quantumly/ens-appraiser",
    filename="v0_pca_mpnet.pkl",
)

booster = xgb.Booster()
booster.load_model(model_path)
with open(pca_path, "rb") as f:
    pca = pickle.load(f)

# Inference also requires:
# 1. mpnet embedding for the label (sentence-transformers/all-mpnet-base-v2)
# 2. Handcrafted/wordlist/club/trademark/holder/macro features
# 3. KNN comp lookup against the dataset repo's FAISS index
#
# A self-contained inference notebook is planned in the dataset repo.
```
The 146 features expected by the booster are listed in `v0_metadata.json` under `feature_cols`, in the exact order required by `xgb.DMatrix`.
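Continuing the snippet above, a hedged sketch of a single prediction. The zero row is a placeholder for a real feature vector, and the inverse transform assumes a natural-log target:

```python
import json
import numpy as np

meta_path = hf_hub_download(
    repo_id="quantumly/ens-appraiser",
    filename="v0_metadata.json",
)
with open(meta_path) as f:
    feature_cols = json.load(f)["feature_cols"]

row = np.zeros((1, len(feature_cols)))  # placeholder; fill with real features
dmat = xgb.DMatrix(row, feature_names=feature_cols)
price_usd = float(np.exp(booster.predict(dmat)[0]))  # log(USD) -> USD
```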
## Reproducibility
The training notebook (`v0_appraiser_v2.ipynb`) runs end-to-end on a Colab A100 high-RAM instance in ~25 minutes:
- Downloads all source parquets from the dataset repo
- Reconstructs USD prices via CoinGecko hourly OHLC join
- Resolves labels for both BaseRegistrar and NameWrapper sales
- Computes all features
- Builds HNSW index for KNN
- Trains XGBoost with early stopping
- Saves model + metadata + diagnostics
- Uploads to this model repo
All randomness is seeded (`seed=42` for XGBoost, PCA, sample weights).
## Roadmap
v1 priorities (in expected R² delta order):
- LLM-derived features — Llama 3.1 8B local inference over all 3.5M labels, extracting `fame_score`, `name_kind`, `cultural_origin`, `crypto_relevance`, `brand_collision_risk`, plus a description embedding. Expected delta: +0.05–0.10 test R².
- Recent sales backfill via direct `eth_getLogs` decoding of Seaport / Blur / Wyvern / X2Y2 / LooksRare events. Closes the May 2024 → present coverage gap and adds Blur. Expected delta: +0.03–0.06 test R² and a much bigger test set.
- Multi-embedding ensemble — concatenate mpnet with `bge-base-en-v1.5` and `e5-base-v2`, PCA the joint space. Expected delta: +0.02–0.04.
- Cross-encoder reranker for KNN comps. Expected delta: +0.02–0.03.
- Contrastive fine-tuning of mpnet on price-similarity triplets. Expected delta: +0.03–0.05.
## Citation
```bibtex
@misc{ens_appraiser_2026,
  author = {Drobnič, Nejc},
  title = {ENS Appraiser v0.2},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/quantumly/ens-appraiser}
}
```
## License + contact
MIT. Questions, corrections, pull requests: nejc@nejc.dev