ND Do Not A/B Test My Workflow (backnotprop.com)
19 points by ramoz 6 days ago | 2 comments on HN ~lite vlite-2.0
Summary ~lite
Critique of silent A/B testing
Lite evaluation by llama-3.3-70b-wai-psq · editorial channel only · no per-section breakdown available
Longitudinal 4 HN snapshots · 35 evals
Audit Trail 55 entries
2026-03-15 00:35 eval_success PSQ evaluated: g-PSQ=-0.170 (3 dims) - -
2026-03-15 00:35 eval Evaluated by llama-3.3-70b-wai-psq: -0.17 (Mild negative)
2026-03-15 00:33 eval_success Lite evaluated: Mild positive (0.28) - -
2026-03-15 00:33 eval Evaluated by llama-3.3-70b-wai: +0.28 (Mild positive)
reasoning
Critique of A/B testing in professional tools
2026-03-14 22:35 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 22:35 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 21:36 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 21:36 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 21:21 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 21:21 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 20:18 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 20:18 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 20:10 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 20:10 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 18:40 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 18:40 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 18:21 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 18:21 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 17:04 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 17:04 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 16:43 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 16:43 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 15:53 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 15:53 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 15:34 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 15:34 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 15:08 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 15:08 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 14:50 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 14:50 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 14:30 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 14:30 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 14:14 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 14:14 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 13:52 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 13:52 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 13:36 eval_success Lite evaluated: Mild positive (0.16) - -
2026-03-14 13:36 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 13:14 eval_success PSQ evaluated: g-PSQ=-0.234 (3 dims) - -
2026-03-14 13:14 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 13:00 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 12:37 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 12:24 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 11:59 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 11:50 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 11:21 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 11:15 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 10:45 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 10:39 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 10:06 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 10:02 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 09:25 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative) 0.00
2026-03-14 09:19 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive) 0.00
reasoning
Technical blog post criticizing AI tool's A/B testing practices
2026-03-14 08:42 eval Evaluated by llama-4-scout-wai-psq: -0.23 (Mild negative)
2026-03-14 08:39 eval Evaluated by llama-4-scout-wai: +0.16 (Mild positive)
reasoning
Technical blog post criticizing AI tool's A/B testing practices