| 2026-03-01 11:42 | eval_success | Evaluated: Neutral (0.01) | - - |
| 2026-03-01 11:42 | eval | Evaluated by deepseek-v3.2: +0.01 (Neutral) 9,079 tokens -0.03 | |
| 2026-03-01 11:40 | eval_success | Evaluated: Neutral (0.04) | - - |
| 2026-03-01 11:40 | eval | Evaluated by deepseek-v3.2: +0.04 (Neutral) 8,492 tokens | |
| 2026-03-01 11:40 | rater_validation_warn | Validation warnings for model deepseek-v3.2: 25W 25R | - - |
| 2026-03-01 08:08 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 08:08 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 07:38 | eval_success | Lite evaluated: Moderate positive (0.50) | - - |
| 2026-03-01 07:38 | eval | Evaluated by llama-3.3-70b-wai: +0.50 (Moderate positive) +0.10 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 07:11 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 07:11 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 06:52 | eval_success | Lite evaluated: Moderate positive (0.40) | - - |
| 2026-03-01 06:52 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 06:26 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 06:26 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 06:09 | eval_success | Lite evaluated: Moderate positive (0.40) | - - |
| 2026-03-01 06:09 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) -0.10 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 05:44 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 05:44 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 05:37 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 05:37 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 05:28 | eval_success | Lite evaluated: Moderate positive (0.50) | - - |
| 2026-03-01 05:28 | eval | Evaluated by llama-3.3-70b-wai: +0.50 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 05:08 | credit_exhausted | Credit balance too low, pausing provider for 30 min | - - |
| 2026-03-01 05:04 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 05:04 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 04:54 | eval_success | Lite evaluated: Moderate positive (0.50) | - - |
| 2026-03-01 04:54 | eval | Evaluated by llama-3.3-70b-wai: +0.50 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 04:14 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 04:14 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 04:07 | eval_success | Lite evaluated: Moderate positive (0.50) | - - |
| 2026-03-01 04:07 | eval | Evaluated by llama-3.3-70b-wai: +0.50 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 03:25 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 03:25 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 03:19 | eval_success | Lite evaluated: Mild positive (0.28) | - - |
| 2026-03-01 03:19 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 03:18 | eval_success | Lite evaluated: Moderate positive (0.50) | - - |
| 2026-03-01 03:18 | eval | Evaluated by llama-3.3-70b-wai: +0.50 (Moderate positive) +0.10 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 03:13 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 02:46 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 02:39 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 01:59 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) -0.16 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 01:50 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 01:20 | eval | Evaluated by llama-4-scout-wai: +0.44 (Moderate positive) +0.16 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 01:06 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 00:25 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 00:19 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 00:19 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-03-01 00:12 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) 0.00 | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-03-01 00:11 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) 0.00 | |
| reasoning Exposing algorithmic abuse |
| 2026-02-28 23:27 | eval | Evaluated by llama-4-scout-wai: +0.28 (Mild positive) | |
| reasoning ED, exposing algorithm manipulation, slight rights lean |
| 2026-02-28 23:26 | eval | Evaluated by llama-3.3-70b-wai: +0.40 (Moderate positive) | |
| reasoning Exposing algorithmic abuse |