0.00 Many SWE-bench-Passing PRs would not be merged (metr.org)
278 points by mustaphah 5 days ago | 156 comments on HN | Neutral ~lite vlite-1.6
Summary ~lite AI Development Neutral
Research note on the discrepancy between AI benchmark scores and real-world merge rates
EQ 0.50
SO 0.20
TD 0.80
Lite evaluation by llama-4-scout-wai · editorial channel only · no per-section breakdown available
Longitudinal 810 HN snapshots · 128 evals
Audit Trail 148 entries
2026-03-14 19:03 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-14 19:03 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-14 19:03 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-14 18:03 eval_success PSQ evaluated: g-PSQ=0.440 (3 dims) - -
2026-03-14 18:03 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-14 17:48 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-14 17:48 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-14 17:48 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-14 16:28 eval_success PSQ evaluated: g-PSQ=0.440 (3 dims) - -
2026-03-14 16:28 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-14 16:12 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-14 16:12 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-14 16:12 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-13 23:22 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-13 23:22 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 23:22 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-13 22:19 eval_success PSQ evaluated: g-PSQ=0.440 (3 dims) - -
2026-03-13 22:19 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 22:03 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-13 22:03 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-13 22:03 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 20:38 eval_success PSQ evaluated: g-PSQ=0.440 (3 dims) - -
2026-03-13 20:38 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 20:20 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-13 20:20 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 20:20 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-13 19:13 eval_success PSQ evaluated: g-PSQ=0.440 (3 dims) - -
2026-03-13 19:13 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 18:56 eval_success Lite evaluated: Neutral (0.00) - -
2026-03-13 18:56 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 18:56 rater_validation_warn Lite validation warnings for model llama-4-scout-wai: 1W 0R - -
2026-03-13 17:59 eval_success PSQ evaluated: g-PSQ=0.440 (3 dims) - -
2026-03-13 17:59 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 17:38 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 16:29 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 16:11 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 12:21 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 12:20 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 11:45 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 11:42 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 11:07 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 11:02 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 10:27 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 10:22 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 09:49 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 09:40 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 09:10 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 09:01 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 08:33 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 08:19 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 07:53 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 07:37 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 07:10 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 06:57 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 06:33 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 06:16 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 05:56 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 05:39 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 05:20 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 05:02 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 04:44 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 04:23 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 04:07 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 03:46 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 03:31 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 03:09 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 02:56 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 02:31 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 02:21 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 01:54 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 01:45 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 01:20 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 01:17 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-13 00:51 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-13 00:47 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 23:46 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 23:37 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 22:32 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 22:22 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 21:49 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 21:40 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 21:14 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 21:11 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 20:52 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 20:51 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 19:36 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 19:35 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 18:09 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 18:05 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 16:41 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 16:35 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 15:23 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 15:14 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 14:02 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 13:52 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 13:22 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 13:13 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 12:44 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 12:37 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 12:10 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 12:03 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 11:48 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 11:42 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 11:27 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 11:19 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 10:59 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 10:39 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 09:42 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 09:23 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 09:01 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 08:45 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 08:26 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 08:10 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 07:51 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 07:35 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 07:16 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 07:00 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 06:42 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 06:25 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 06:06 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 05:50 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 05:32 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 05:15 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 04:56 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 04:40 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 04:22 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 04:05 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 03:47 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 03:30 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 03:12 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 02:54 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 02:36 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 02:16 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 02:00 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 01:37 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 01:30 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 01:17 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 01:14 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 00:50 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-12 00:47 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-12 00:05 eval Evaluated by llama-3.3-70b-wai-psq: +0.32 (Moderate positive)
2026-03-12 00:01 eval Evaluated by llama-3.3-70b-wai: +0.08 (Neutral)
reasoning
Technical content, zero rights discussion
2026-03-11 23:50 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-11 23:45 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-11 23:13 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive) 0.00
2026-03-11 23:10 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral) 0.00
reasoning
Technical research note on AI-generated pull requests and their merge rates
2026-03-11 22:34 eval Evaluated by llama-4-scout-wai-psq: +0.44 (Moderate positive)
2026-03-11 22:32 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral)
reasoning
Technical research note on AI-generated pull requests and their merge rates