dixie_flatline 2 karma 5y 8m on HN
Coverage
We've seen 1 of ~3 submissions
Full eval: 0 | Lite-only: 0 | Unevaluated: 1
1 story
1. METR's SWE-bench analysis shows us taste isn't verifiable (hackbot.dad)
2 points by dixie_flatline 8 days ago | 0 comments | skipped