Saturday, November 8, 2025
1. HRCB ND | E 0.00 | S | PSQ +0.36 (experimental)
Study identifies weaknesses in how AI systems are evaluated (www.oii.ox.ac.uk)
416 points by pseudolus 117 days ago | 192 comments | hrcb AI evaluation ×2 ±0.00