Why is GPT-5.4 obsessed with Goblins?

Alpha: This system is experimental. Scores and classifications are early-stage research and may be unreliable.

Model: @cf/meta/llama-4-scout-17b-16e-instruct lite 0.00 · @cf/meta/llama-4-scout-17b-16e-instruct lite ND

0.00	Why is GPT-5.4 obsessed with Goblins?
	16 points by pants2 3 days ago | 11 comments on HN | Mild negative ~lite vlite-1.6

Summary ~lite AI Behavior Neutral

User discusses ChatGPT's frequent use of the term 'goblin'.

EQ 0.00

SO 0.00

TD 0.00

Lite evaluation by llama-4-scout-wai · editorial channel only · no per-section breakdown available

Longitudinal 463 HN snapshots · 7 evals

Audit Trail 18 entries

2026-03-10 15:51	eval_success	Lite evaluated: Mild negative (-0.24)	- -
2026-03-10 15:51	eval	Evaluated by llama-4-scout-wai: -0.24 (Mild negative) 0.00
	reasoning Discussion on ChatGPT's behavior, no explicit human rights discussion
2026-03-10 15:51	rater_validation_warn	Lite validation warnings for model llama-4-scout-wai: 1W 0R	- -
2026-03-10 15:40	eval_success	PSQ evaluated: g-PSQ=0.600 (3 dims)	- -
2026-03-10 15:40	eval	Evaluated by llama-4-scout-wai-psq: +0.60 (Strong positive) 0.00
2026-03-10 15:11	eval_success	Lite evaluated: Mild negative (-0.24)	- -
2026-03-10 15:11	eval	Evaluated by llama-4-scout-wai: -0.24 (Mild negative) 0.00
	reasoning Discussion on ChatGPT's behavior, no explicit human rights discussion
2026-03-10 15:11	rater_validation_warn	Lite validation warnings for model llama-4-scout-wai: 1W 0R	- -
2026-03-10 15:01	eval_success	PSQ evaluated: g-PSQ=0.600 (3 dims)	- -
2026-03-10 15:01	eval	Evaluated by llama-4-scout-wai-psq: +0.60 (Strong positive) 0.00
2026-03-10 14:33	eval_success	Lite evaluated: Mild negative (-0.24)	- -
2026-03-10 14:33	eval	Evaluated by llama-4-scout-wai: -0.24 (Mild negative) 0.00
	reasoning Discussion on ChatGPT's behavior, no explicit human rights discussion
2026-03-10 14:33	rater_validation_warn	Lite validation warnings for model llama-4-scout-wai: 1W 0R	- -
2026-03-10 14:03	eval_success	PSQ evaluated: g-PSQ=0.600 (3 dims)	- -
2026-03-10 14:03	eval	Evaluated by llama-4-scout-wai-psq: +0.60 (Strong positive)
2026-03-10 13:54	eval_success	Lite evaluated: Mild negative (-0.24)	- -
2026-03-10 13:54	eval	Evaluated by llama-4-scout-wai: -0.24 (Mild negative)
	reasoning Discussion on ChatGPT's behavior, no explicit human rights discussion
2026-03-10 13:54	rater_validation_warn	Lite validation warnings for model llama-4-scout-wai: 1W 0R	- -

build ee2b489+gzrb · deployed 2026-03-10 22:52 UTC · evaluated 2026-03-08 02:36:46 UTC