LLaMA now goes faster on CPUs (justine.lol) · 0.00
1372 points by lawrencechen 697 days ago | 451 comments on HN | Neutral ~lite vlight-1.3
Summary ~lite · Technology / optimization · Neutral
Technical article on CPU matrix multiplication kernel optimization
EQ 0.00
SO 0.00
TD 0.00
Light evaluation by claude-haiku-4-5 · editorial channel only · no per-section breakdown available
Audit Trail (21 entries)
2026-02-28 00:42 dlq Dead-lettered after 1 attempts: LLaMA now goes faster on CPUs
2026-02-28 00:42 eval_failure Evaluation failed: AiError: 3030: This model's maximum context length is 24000 tokens. However, you requested 31067 tokens (14683 in the messages, 16384 in the completion). Please reduce the length of the messages or co…
2026-02-28 00:42 eval_failure Evaluation failed: AiError: 3030: This model's maximum context length is 24000 tokens. However, you requested 31067 tokens (14683 in the messages, 16384 in the completion). Please reduce the length of the messages or co…
2026-02-28 00:27 dlq Dead-lettered after 1 attempts: LLaMA now goes faster on CPUs
2026-02-28 00:27 eval_failure Evaluation failed: AiError: 3030: This model's maximum context length is 24000 tokens. However, you requested 31067 tokens (14683 in the messages, 16384 in the completion). Please reduce the length of the messages or co…
2026-02-28 00:27 eval_failure Evaluation failed: AiError: 3030: This model's maximum context length is 24000 tokens. However, you requested 31067 tokens (14683 in the messages, 16384 in the completion). Please reduce the length of the messages or co…
2026-02-28 00:16 dlq Dead-lettered after 1 attempts: LLaMA now goes faster on CPUs
2026-02-28 00:16 eval_failure Evaluation failed: AiError: 3030: This model's maximum context length is 24000 tokens. However, you requested 31067 tokens (14683 in the messages, 16384 in the completion). Please reduce the length of the messages or co…
2026-02-28 00:16 eval_failure Evaluation failed: AiError: 3030: This model's maximum context length is 24000 tokens. However, you requested 31067 tokens (14683 in the messages, 16384 in the completion). Please reduce the length of the messages or co…
2026-02-28 00:03 dlq Dead-lettered after 1 attempts: LLaMA now goes faster on CPUs
2026-02-28 00:03 eval_failure Evaluation failed: AiError: 3030: This model's maximum context length is 24000 tokens. However, you requested 31067 tokens (14683 in the messages, 16384 in the completion). Please reduce the length of the messages or co…
2026-02-28 00:03 eval_failure Evaluation failed: AiError: 3030: This model's maximum context length is 24000 tokens. However, you requested 31067 tokens (14683 in the messages, 16384 in the completion). Please reduce the length of the messages or co…
2026-02-27 22:45 eval_success Evaluated: Mild positive (0.13)
2026-02-27 22:45 eval Evaluated by deepseek-v3.2: +0.13 (Mild positive) · 21,669 tokens
2026-02-27 22:25 dlq Dead-lettered after 1 attempts: LLaMA now goes faster on CPUs
2026-02-27 22:23 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b
2026-02-27 22:22 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b
2026-02-27 22:21 eval_success Light evaluated: Neutral (0.00)
2026-02-27 22:21 eval Evaluated by llama-4-scout-wai: 0.00 (Neutral)
2026-02-27 22:21 rate_limit OpenRouter rate limited (429) model=llama-3.3-70b
2026-02-27 22:08 eval Evaluated by claude-haiku-4-5: 0.00 (Neutral)
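The repeated 3030 failures above come down to simple arithmetic: the request asked for a 16,384-token completion on top of a 14,683-token prompt (31,067 total) against a 24,000-token context window. A minimal sketch of a pre-flight guard that clamps the completion budget before sending the request (the function name and limits here are illustrative, not from any particular SDK):

```python
# Hypothetical guard against the 3030 context-length errors in the audit
# trail: clamp the requested completion so prompt + completion fits the
# model's context window.

MODEL_CONTEXT = 24000  # assumed context window of the eval model (tokens)

def clamp_max_tokens(prompt_tokens: int, requested_completion: int,
                     context_limit: int = MODEL_CONTEXT) -> int:
    """Return a completion budget that fits inside the context window."""
    available = context_limit - prompt_tokens
    if available <= 0:
        # The prompt alone overflows the window; no completion budget helps.
        raise ValueError(
            f"prompt ({prompt_tokens} tokens) exceeds the "
            f"{context_limit}-token context window; shorten the messages"
        )
    return min(requested_completion, available)

# The failing request from the log: 14683 + 16384 = 31067 > 24000.
print(clamp_max_tokens(14683, 16384))  # → 9317 (14683 + 9317 = 24000)
```

With a clamp like this, the requests would have gone out with a 9,317-token completion budget instead of dead-lettering four times in a row; shortening the messages is the other lever the error text suggests.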