Dipankar Sarkar PRO
Recent Activity
You already conceded the hard part: the denominator is a function of the model, so a clean coverage number can sit over a surface you undercounted.
The move that makes it honest is to measure that gap instead of asserting it. You cannot see the true surface. But you can see every time a Phase 3 payload lands on a node your static model never predicted was reachable.
Call it a surprise rate. It is the empirical proxy for how wrong the denominator is. A low surprise rate earns the coverage number. A high one says the modeled surface is fiction and the percentage with it.
Better, every surprise is free training data: a missed edge to fold back into the enumerator. The surface model gets falsified by its own execution.
Do you already diff what Phase 3 actually reached against what the static model predicted, or does that signal get dropped after each run?
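A minimal sketch of that diff, assuming the static model and the Phase 3 run can each be reduced to a set of node identifiers. The function name, the node names, and the choice to normalize the surprise rate by the number of reached nodes are illustrative assumptions, not part of any existing tool.

```python
def surprise_report(predicted, reached):
    """Diff what a run actually reached against what the static model predicted.

    predicted: node ids the static model marked as reachable (hypothetical input)
    reached:   node ids a payload actually landed on during the run
    """
    predicted, reached = set(predicted), set(reached)

    # Nodes the run hit that the model never listed: the "surprises".
    surprises = reached - predicted

    # Assumption: normalize by reached nodes, so the rate reads as
    # "what fraction of what we touched was outside the model".
    surprise_rate = len(surprises) / len(reached) if reached else 0.0

    # The usual coverage number, computed over the modeled surface only.
    modeled_coverage = (
        len(predicted & reached) / len(predicted) if predicted else 0.0
    )

    return {
        "surprises": sorted(surprises),  # missed edges to feed back to the enumerator
        "surprise_rate": surprise_rate,
        "modeled_coverage": modeled_coverage,
    }


report = surprise_report(
    predicted={"auth", "upload", "search"},
    reached={"auth", "upload", "admin_debug"},
)
print(report)
```

Here the coverage number looks respectable, but one of the three reached nodes was never in the model, so the denominator behind that percentage is already known to be too small. Persisting `surprises` between runs is what keeps the signal from being dropped.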
This converges, and the convergence is the useful part.
Detection lead time was never the honest variable. Blast radius on first contact is. You named it, and that is the right axis.
One sharpening. Every leading signal on your list is itself a scoped detector with its own envelope. Live-traffic-outside-envelope needs the envelope drawn right. No-golden-coverage clusters need the clustering complete. So the leading layer inherits the same blind region as the base, one level up. A region no leading signal touches is exactly where first contact is still first evidence.
Which turns the whole thing on one default. For surface nothing has sampled, shadowed, or probed yet: is reliance low-by-default until coverage earns it, or full-by-default until something contradicts it?
Default-deny is safe by construction. Default-allow is a lagging indicator wearing a receipt.
Where does Chronia sit on untouched surface, deny or allow by default?
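To make the two defaults concrete, here is a small sketch. It assumes reliance is a score in [0, 1] and that evidence is a mapping from region name to earned score; the function, the region names, and the scores are hypothetical and say nothing about how Chronia actually works.

```python
def reliance(region, evidence, default_deny=True):
    """Return how much to rely on the system for a given region.

    evidence: dict of region -> score in [0, 1], earned by sampling,
              shadowing, or probing (hypothetical representation).
    """
    if region in evidence:
        # Some detector has touched this region: use what it earned.
        return evidence[region]
    # Untouched surface: this branch is the whole question.
    # Default-deny starts at zero until coverage earns more;
    # default-allow starts at full until something contradicts it.
    return 0.0 if default_deny else 1.0


evidence = {"checkout": 0.9, "search": 0.6}

print(reliance("checkout", evidence))                         # touched region
print(reliance("bulk_export", evidence, default_deny=True))   # untouched, deny
print(reliance("bulk_export", evidence, default_deny=False))  # untouched, allow
```

The two policies agree everywhere evidence exists. They only differ on the untouched region, which is exactly where first contact is still first evidence.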
GRPO, Dr. GRPO, and DAPO Are Three Operations on One Number: The Group-Standard-Deviation Identity
When More Sampling Hurts: The Modal Ceiling and Correlation Ceiling of Test-Time Scaling
Building to the Test: Coding Agents Deliver What You Check, Not What You Requested
https://github.com/stas00/python-cookbook
I took the dense Python cheatsheet I have been honing for many years and use daily, and turned it into a book of recipes.
Is this useful?
This is, of course, free, like other open books.
Outrider finds, implements, and validates methods for your repo.
While testing Outrider on a fork of huggingface/peft, I discovered "Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models" (arXiv: 2402.02347).
The work offers improved stability and faster convergence in LoRA finetuning by adjusting updates for curvature that LoRA optimizers typically ignore.
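For intuition, here is a sketch of the kind of update involved, as I understand the paper. It assumes the standard LoRA parameterization $W = W_0 + BA$ with $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$, plain gradient descent with step size $\eta$, and a small damping term $\delta$ that I have added for invertibility. Check the paper and the PR for the exact form used in practice.

```latex
% Plain gradient step on the LoRA factors (no curvature correction):
%   A <- A - eta * grad_A L,    B <- B - eta * grad_B L
%
% Preconditioned step: each gradient is rescaled by an r x r matrix
% built from the *other* factor (delta > 0 is an assumed damping term).
\begin{aligned}
A_{t+1} &= A_t - \eta \,\bigl(B_t^{\top} B_t + \delta I_r\bigr)^{-1} \nabla_{A} \mathcal{L}(A_t, B_t), \\
B_{t+1} &= B_t - \eta \,\nabla_{B} \mathcal{L}(A_t, B_t)\,\bigl(A_t A_t^{\top} + \delta I_r\bigr)^{-1}.
\end{aligned}
```

Both preconditioners are only $r \times r$, so the extra cost is small when the LoRA rank is small.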
Not the most recent paper, so I was pleasantly surprised my action surfaced this method as a candidate before implementing a PR. Even more surprised this method had not already been merged upstream.
Turns out, the author did try contributing to peft a couple of years ago, but people get busy and the PR was closed after going stale.
So I decided to revive it! I opened an issue and soon after the author engaged to help land the feature. Now huggingface/peft #3382 is open, a joint effort with the paper's author.
This whole episode has me thinking about the future of OSS maintenance with AI coding. The software projects that endure will be the ones shaped to land and test new ideas quickly.
Across 30 forks, I've seen several papers land as clean PRs in multiple repos, which offers a perspective on how methods impact applications. Recent methods matching multiple frameworks: STARE, Entity Binding, BINEVAL.
Get Outrider: https://github.com/remyxai/outrider