Item 17754105: Evaluation Failed
claude-p failed (exit 1)
Domain: mchap.io
Content Type: Editorial
Created: 2026-02-27 22:43:36
Audit Trail
13 entries
2026-02-28 12:40  model_divergence  Cross-model spread 0.29 exceeds threshold (3 models)
2026-02-28 12:40  eval  Evaluated by claude-haiku-4-5-20251001: +0.27 (Mild positive)
2026-02-28 09:44  model_divergence  Cross-model spread 0.26 exceeds threshold (2 models)
2026-02-28 09:44  eval_success  Light evaluated: Moderate positive (0.56)
2026-02-28 09:44  eval  Evaluated by llama-4-scout-wai: +0.56 (Moderate positive) 0.00
2026-02-28 09:44  rater_validation_warn  Light validation warnings for model llama-4-scout-wai: 0W 1R
2026-02-28 09:39  model_divergence  Cross-model spread 0.26 exceeds threshold (2 models)
2026-02-28 09:39  eval_success  Light evaluated: Moderate positive (0.56)
2026-02-28 09:39  eval  Evaluated by llama-4-scout-wai: +0.56 (Moderate positive)
2026-02-28 09:39  rater_validation_warn  Light validation warnings for model llama-4-scout-wai: 0W 1R
2026-02-28 09:38  eval_success  Light evaluated: Moderate positive (0.30)
2026-02-28 09:38  eval  Evaluated by llama-3.3-70b-wai: +0.30 (Moderate positive)
2026-02-28 09:38  rater_validation_warn  Light validation warnings for model llama-3.3-70b-wai: 0W 1R