scFATE - NeurIPS 2026

Lie-algebraic conditional flow matching for zero-shot perturbation prediction in single cells. This repo contains all checkpoints and configs behind Table 1 in the paper. Datasets are split out into a separate repo: Angione-Lab/scFATE-datasets.

Layout

backbones/<dataset>/                       - scFATE rotation autoencoders (load first)
flow_heads/<dataset>/seed{1,2,3}/          - pure-flow heads on the rotation latent
flow_heads/k562/base/                      - K562 flow teacher (for reflow)
flow_heads/k562/reflow_K2_s1/              - K562 reflow K=2 student (paper headline 81.2 DA)
sciplex3_path_b/teachers/{priorkrr,priornone}/s{1..9}/
sciplex3_path_b/students/mixed18_K16/s{1..7}/   - mixed-18 K=16 distilled student (paper headline 70.0 DA)
results/table1_inputs/                     - saved eval JSONs for reviewer audit
REPRODUCE.md                               - paper reproduction guide
MANIFEST.json                              - machine-readable run inventory
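
The layout above maps directly onto glob patterns, so a partial download can be described without listing individual files. The following is a minimal sketch, not part of the repo: the helper names (`flow_head_pattern`, `backbone_pattern`, `select`) are hypothetical, the dataset directory name passed in is an assumption (only `k562` is spelled out in the layout), and `fnmatchcase` is used here only to illustrate which paths a pattern would pick up.

```python
from fnmatch import fnmatchcase


def flow_head_pattern(dataset: str, seed: int) -> str:
    """Glob for one pure-flow head, following flow_heads/<dataset>/seed{1,2,3}/."""
    if seed not in (1, 2, 3):
        raise ValueError("the layout lists seeds 1, 2 and 3 only")
    return f"flow_heads/{dataset}/seed{seed}/*"


def backbone_pattern(dataset: str) -> str:
    """Glob for the rotation autoencoder of one dataset (load this first)."""
    return f"backbones/{dataset}/*"


def select(paths, patterns):
    """Return the paths that match at least one glob pattern."""
    return [p for p in paths if any(fnmatchcase(p, pat) for pat in patterns)]


# Illustrative file list; only these three paths are named in this README.
example_paths = [
    "flow_heads/k562/reflow_K2_s1/flow_best.pt",
    "REPRODUCE.md",
    "MANIFEST.json",
]

# Backbone plus seed-1 flow head for K562, and the run inventory.
patterns = [backbone_pattern("k562"), flow_head_pattern("k562", 1), "MANIFEST.json"]
print(patterns)
print(select(example_paths, patterns))
```

A list built this way can be passed as `allow_patterns` to `snapshot_download`, which accepts either a single pattern or a list of patterns.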

Quickstart

from huggingface_hub import snapshot_download
import torch

# 1. Pull the K562 paper-headline checkpoint
ckpt_dir = snapshot_download("Angione-Lab/scFATE", allow_patterns="flow_heads/k562/reflow_K2_s1/*")
# weights_only=False unpickles arbitrary Python objects, so only load checkpoints you trust
ckpt = torch.load(f"{ckpt_dir}/flow_heads/k562/reflow_K2_s1/flow_best.pt", map_location="cpu", weights_only=False)

# 2. The ckpt has every hparam at top-level; the velocity-network state_dict is at ckpt["v_net_state_dict"]
print({k: v for k, v in ckpt.items() if not k.endswith("_state_dict") and not k.startswith("_")})

For end-to-end reproduction (backbone → flow → reflow → eval), see REPRODUCE.md.
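
Because hyperparameters and weights share one flat dictionary, it can help to separate them before inspecting or logging a checkpoint. The sketch below applies the same key filter as the Quickstart print statement. `split_checkpoint` is a hypothetical helper, and the stand-in dictionary uses made-up keys (`latent_dim`, `_private`); only `v_net_state_dict` is a key named in this README.

```python
def split_checkpoint(ckpt: dict):
    """Separate hyperparameters from weights using the Quickstart key filter."""
    hparams, state_dicts = {}, {}
    for key, value in ckpt.items():
        if key.endswith("_state_dict"):
            # Weight containers such as ckpt["v_net_state_dict"]
            state_dicts[key] = value
        elif not key.startswith("_"):
            # Everything else at top level that is not private is treated as an hparam
            hparams[key] = value
    return hparams, state_dicts


# Stand-in for a loaded checkpoint; keys other than "v_net_state_dict" are hypothetical.
fake_ckpt = {"latent_dim": 64, "_private": None, "v_net_state_dict": {"w": [0.0]}}

hparams, state_dicts = split_checkpoint(fake_ckpt)
print(sorted(hparams))
print(sorted(state_dicts))
```

With a real checkpoint, pass the dictionary returned by `torch.load` instead of `fake_ckpt`. Rebuilding the velocity network from `v_net_state_dict` needs the model class from the training code, which is not shown here.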

Caveat (important for paper integrity)

Table 1 of the paper reports cos=0.491 / PDE=0.483 for the SciPlex3 (Path B) row. Neither value appears in any saved eval log on the original training machine: the router pipeline stored only DA, not cos/PDE. Only the mean DA = 0.700 reproduces, from results/table1_inputs/sciplex3_iter214_multi_metric_router_ensemble.json (metric_da.tanimoto_morgan2048 = 0.7000). We are re-running the SciPlex3 Path-B eval to compute proper cos/PDE for the camera-ready version; until then, treat the SciPlex3 row's continuous metrics as not yet reproduced.

Norman cos/PDE in Table 1 differ from the saved JSONs by ~0.002 (rounding). K562 reflow cos/PDE differ by ~0.007 (likely rounding/transcription).
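
A reviewer can run this kind of audit mechanically: read a value out of a saved eval JSON and compare it with the number printed in Table 1. The sketch below is one way to do that under stated assumptions. It assumes the dotted name `metric_da.tanimoto_morgan2048` corresponds to nested JSON keys, which this README does not confirm; the helper names and the default tolerance (half a unit in the third decimal place) are hypothetical; and the in-memory JSON string is a stand-in that mirrors the DA value quoted above, not the actual file.

```python
import json
from functools import reduce


def get_by_dotted_path(obj, dotted: str):
    """Walk nested dicts using a dotted key such as 'metric_da.tanimoto_morgan2048'."""
    return reduce(lambda node, key: node[key], dotted.split("."), obj)


def gap(saved: float, reported: float) -> float:
    """Absolute difference between a saved metric and the value in the paper."""
    return abs(saved - reported)


def within_tolerance(saved: float, reported: float, tol: float = 5e-4) -> bool:
    """True if the saved value agrees with the reported one up to `tol`."""
    return gap(saved, reported) <= tol


# Stand-in for the saved eval JSON; a real audit would json.load() the file instead.
saved = json.loads('{"metric_da": {"tanimoto_morgan2048": 0.7000}}')

da = get_by_dotted_path(saved, "metric_da.tanimoto_morgan2048")
print(da, within_tolerance(da, 0.700))
```

The tolerance is a parameter so the same check can be run at whatever precision the table reports; printing `gap(...)` alongside the boolean makes the size of any discrepancy visible.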

Citation

@inproceedings{ahmad2026scfate,
  title={scFATE: Zero-shot Perturbation Prediction via Lie-Algebraic Conditional Flow Matching},
  author={Ahmad, Farhan and Angione, Claudio and others},
  booktitle={NeurIPS},
  year={2026}
}