Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention Paper • 2605.22791 • Published 1 day ago • 11
Mela: Test-Time Memory Consolidation based on Transformation Hypothesis Paper • 2605.10537 • Published 12 days ago • 7
Bailong: Bilingual Transfer Learning based on QLoRA and Zip-tie Embedding Paper • 2404.00862 • Published Apr 1, 2024 • 2
Mela: Test-Time Memory Consolidation based on Transformation Collection 1 item • Updated 10 days ago • 1
Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance tngtech • Apr 16, 2025 • 79
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence Paper • 2407.16655 • Published Jul 23, 2024 • 30
About ORPO Collection Contains some information and experiments fine-tuning LLMs using 🤗 `trl.ORPOTrainer` • 7 items • Updated Mar 2 • 5
LLaVA-1.6 Collection A collection of LLaVA-1.6 checkpoints • 4 items • Updated Jan 31, 2024 • 75
DINOv2 Collection DINOv2: foundation models producing robust visual features suitable for image-level and pixel-level visual tasks - https://arxiv.org/abs/2304.07193 • 5 items • Updated Aug 13, 2025 • 34