eldar

eldar-war-hammer

AI & ML interests

None yet

Recent Activity

liked a Space 4 months ago

SWE-Arena/SWE-Community

Organizations

None yet

upvoted 2 papers 4 days ago

Towards Evaluation Engineering: An Empirical Study of ML Evaluation Harnesses in the Wild

Paper • 2605.24213 • Published 16 days ago • 14

On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards

Paper • 2407.04065 • Published Jul 4, 2024 • 10

liked 2 Spaces 4 months ago

SWE-Community

🌐

Track GitHub community statistics for SWE assistants

SWE-Release

📢

Track GitHub releases statistics for SWE assistants

liked 4 datasets 4 months ago

liked a Space 4 months ago

README

🔬

liked 5 Spaces 7 months ago

Awesome Foundation Model Leaderboard Search

💻

The search tool of Awesome Foundation Model Leaderboard List

Awesome Production Machine Learning Search

🔥

The search tool of Awesome Production Machine Learning

SWE-PR

⚙

Track GitHub PR, review & commit stats for SWE agents

SWE-Issue

❓

Track GitHub issue statistics for SWE assistants

SWE-Chatbot-Arena

🎯

Chatbot arena for software engineering tasks