2 stories
esolang-bench.vercel.app
Stories: 2 (0 evaluated) | Avg HRCB: ND | Avg SETL: ND | Avg Conf: ND
Poster Karma: 9,703 avg | Submitters: 2
1. EsoLang-Bench: Evaluating Genuine Reasoning in LLMs via Esoteric Languages (esolang-bench.vercel.app)
98 points by matt_d 5 days ago | 58 comments | hrcb AI Evaluation
HRCB -0.08 | E 0.00 | S -0.20
2. LLMs benchmark with esoteric programming languages (esolang-bench.vercel.app)
2 points by vmaurin 8 days ago | 0 comments | skipped