Claude Code's binary reveals silent A/B tests on core features (backnotprop.com)
168 points by ramoz 6 days ago | 211 comments on HN
Pending Evaluation
This story is queued for evaluation. It will be processed in an upcoming batch.
Queued: 2026-03-14 11:48:51
Longitudinal 203 HN snapshots · 93 evals
Audit Trail 113 entries
2026-03-17 01:08 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-17 01:07 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-17 01:06 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-17 01:06 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 23:32 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 23:32 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 23:31 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 23:31 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 22:11 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 22:11 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 22:08 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 22:08 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 20:53 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 20:53 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 20:37 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 20:37 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 19:01 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 19:01 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 18:43 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 18:43 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 17:52 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 17:52 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 17:19 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 17:19 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 16:40 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 16:40 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 16:29 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 16:29 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 16:04 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 16:04 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 15:51 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 15:51 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 15:28 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 15:28 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 15:16 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 15:16 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 14:51 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-16 14:51 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 14:42 eval_success PSQ evaluated: g-PSQ=0.006 (3 dims) - -
2026-03-16 14:42 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 14:17 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 14:05 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 13:40 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 13:27 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 13:04 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 12:52 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 12:29 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 12:15 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 11:54 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 11:40 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 11:19 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 11:04 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 10:41 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 10:27 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 10:02 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 09:49 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 09:23 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 09:10 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 08:44 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 08:30 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 08:08 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 07:53 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 07:32 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 07:17 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 06:57 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 06:42 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 06:19 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 06:07 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 05:45 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 05:32 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 05:10 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 04:52 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 04:03 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 02:01 eval Evaluated by claude-haiku-4-5-20251001: +0.25 (Mild positive) 12,411 tokens -0.02
2026-03-16 01:51 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-16 01:27 eval Evaluated by claude-haiku-4-5-20251001: +0.27 (Mild positive) 11,720 tokens -0.11
2026-03-16 00:56 eval Evaluated by claude-haiku-4-5-20251001: +0.38 (Moderate positive) 11,832 tokens +0.16
2026-03-16 00:54 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-16 00:27 eval Evaluated by claude-haiku-4-5-20251001: +0.21 (Mild positive) 11,828 tokens -0.08
2026-03-15 23:50 eval Evaluated by claude-haiku-4-5-20251001: +0.30 (Moderate positive) 11,657 tokens -0.06
2026-03-15 23:13 eval Evaluated by claude-haiku-4-5-20251001: +0.36 (Moderate positive) 12,390 tokens
2026-03-15 22:55 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-15 22:08 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-15 17:58 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) 0.00
2026-03-15 17:43 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-15 16:47 eval Evaluated by llama-4-scout-wai-psq: +0.01 (Neutral) +0.24
2026-03-15 16:29 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-15 00:35 eval Evaluated by llama-3.3-70b-wai-psq: -0.17 (Mild negative)
2026-03-15 00:33 eval Evaluated by llama-3.3-70b-wai: +0.16 (Mild positive)
reasoning
Critique of silent A/B testing on professional tool
2026-03-14 22:43 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 22:17 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 21:32 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 21:06 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 20:07 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 19:53 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 19:24 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 19:13 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 18:21 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 18:10 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 16:44 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 16:35 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 15:35 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 15:25 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 14:52 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 14:47 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 14:16 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 14:12 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 13:40 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 13:35 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 13:01 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 13:00 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Editorial stance on AI transparency, A/B testing, and user rights
2026-03-14 12:25 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative)
2026-03-14 12:24 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive)
reasoning
Editorial stance on AI transparency, A/B testing, and user rights