The Car Wash Problem: A variable isolation study on prompt architecture

midmost44 — Fri, 20 Feb 2026 13:23:56 +0000

Most AI products inject facts and hope reasoning follows. But intelligence is not measured by how much a model holds in its context window. It is measured by knowing to pick up the keys before leaving the house.

Last week, the "Car Wash problem" (the car wash is 50m away: walk or drive?) went viral here on HN. Every major LLM failed because it missed the implicit physical constraint: the car has to be at the car wash. While testing InterviewMate's prompt architecture, I posed the same question, and it answered "drive" immediately. I didn't actually know why it worked, so I ran a variable isolation study to find out. 100 API calls, Claude Sonnet 4.5, 5 conditions:

Baseline (no prompt): 0%
Role only: 0%
Context injection (user profile, car location): 30%
Structured reasoning (STAR framework): 85%
Full stack (both combined): 100%
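For readers who want to picture the setup, here is a minimal sketch of how a five-condition comparison like this could be wired up. It is not the code from the author's repository. The prompt fragments, the function names, the exact question wording, the "Answer:" scoring convention, and the reading of "full stack" as context plus STAR structure are all assumptions made for illustration. The model call is replaced by a deterministic stub so the sketch runs offline; the stub's output is not data.

```python
from typing import Callable, Dict, Optional

# Assumed wording of the question; the post only gives "50m away, walk or drive?".
QUESTION = (
    "I want to wash my car. The car wash is 50m away. Should I walk or drive? "
    "End with a final line 'Answer: walk' or 'Answer: drive'."
)

# Hypothetical prompt fragments, one per isolated variable.
ROLE = "You are a helpful assistant."
CONTEXT = "User profile: owns one car. The car is parked at the user's home."
STRUCTURE = (
    "Before answering, work through Situation, Task, Action, Result (STAR). "
    "State the task goal explicitly before choosing an action."
)


def build_conditions() -> Dict[str, Optional[str]]:
    """Map each condition name to its system prompt (None means no prompt)."""
    return {
        "baseline": None,
        "role_only": ROLE,
        "context_injection": CONTEXT,
        "structured_reasoning": STRUCTURE,
        "full_stack": CONTEXT + "\n" + STRUCTURE,  # assumed composition
    }


def is_pass(reply: str) -> bool:
    """Pass only if the last non-empty line is exactly 'Answer: drive'."""
    lines = [ln.strip().lower() for ln in reply.splitlines() if ln.strip()]
    return bool(lines) and lines[-1] == "answer: drive"


def pass_rates(
    ask: Callable[[Optional[str], str], str], trials: int
) -> Dict[str, float]:
    """Call the model `trials` times per condition and return pass fractions."""
    rates = {}
    for name, system in build_conditions().items():
        passes = sum(is_pass(ask(system, QUESTION)) for _ in range(trials))
        rates[name] = passes / trials
    return rates


def stub_model(system: Optional[str], question: str) -> str:
    """Offline stand-in for a real API call; replace with an actual client."""
    if system is not None and STRUCTURE in system:
        return "Task goal: get the car washed.\nAnswer: drive"
    return "It is a short distance.\nAnswer: walk"


if __name__ == "__main__":
    for name, rate in pass_rates(stub_model, trials=4).items():
        print(f"{name}: {rate:.0%}")
```

Swapping `stub_model` for a function that calls a real model is the only change needed to run an actual comparison. Keeping every condition behind the same question and the same scorer is what lets a difference in pass rate be attributed to the prompt component alone.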

Throwing facts at the model doesn't work unless the architecture forces it to explicitly evaluate the task goal first. Without structure, the model jumps straight to the distance heuristic: "50m is short, walk." I'm writing a paper on this and wanted to share the raw data with HN first. Code and raw eval data: https://github.com/JO-HEEJIN/interview_mate/tree/main/car_wash

Comments URL: https://news.ycombinator.com/item?id=47087746

Points: 2

# Comments: 1

New comment by midmost44 in "TaskForge – auditable, secure, framework for OpenClaw"

midmost44 — Wed, 18 Feb 2026 09:27:14 +0000

Wow. How did you handle the token issue?

New comment by midmost44 in "Claude Sonnet 4.6"

midmost44 — Wed, 18 Feb 2026 09:26:07 +0000

I tested the API version. It beats Opus 4, lol. I saved 5x on cost!!!

Hacker News: midmost44
