<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: LeoStehlik</title><link>https://news.ycombinator.com/user?id=LeoStehlik</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 06 Apr 2026 04:55:13 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=LeoStehlik" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by LeoStehlik in "Decisions that eroded trust in Azure – by a former Azure Core engineer"]]></title><description><![CDATA[
<p>Back in mid-2011 at Fujitsu, I ran one of the earliest Windows Azure production subscriptions outside Microsoft. I've watched this platform for 15 years from the outside.<p>Part 1 barely scratches the surface. Read parts 2 through 6.<p>The 173 agents story, the 200 manual node interventions per day, the WireServer sitting on the secure host side with unencrypted tenant memory mixed into a shared address space, the letters to the EVP, the CEO, the Board - not a single acknowledgment.<p>The most damning thing in this series, apart from the technical debt, is the silence at the top when someone handed them the diagnosis on a plate.<p>Cutler's original vision was "no human touch." The gap between that and what Azure actually became is where the trillion dollars went.<p>Go read the rest. It's worth it.<p>Meanwhile on LinkedIn, there are still comments about how adorable Microsoft leadership under Satya is... a carefully crafted PR image.</p>
]]></description><pubDate>Fri, 03 Apr 2026 17:56:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=47629817</link><dc:creator>LeoStehlik</dc:creator><comments>https://news.ycombinator.com/item?id=47629817</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47629817</guid></item><item><title><![CDATA[New comment by LeoStehlik in "Show HN: Real-time dashboard for Claude Code agent teams"]]></title><description><![CDATA[
<p>Both, as it turned out that neither is enough on its own.<p>The structural fix is an obsession with separating roles: the agent that builds is never the one that verifies. I run a reviewer agent (I call her Iris) and a tester (Rex) - they live in separate sessions with no shared context with the builder. Iris' brief explicitly says "we require a live browser test, code review is not enough" - and that is where role separation was key; agents reviewing their own output tend to confirm what they already believe.<p>The explicit result/verdict format helps too. Each acceptance criterion gets a PASS/FAIL/UNKNOWN verdict, with evidence attached. UNKNOWN is the one with gravitas - you force the agent to say "I could not verify this" rather than quietly pretending it was a PASS.<p>But diff-level verification is where it still leaks. I don't have a systematic diff check yet. It's mostly Iris catching "agent replaced the whole file rather than extending it" by noticing the git diff is suspiciously clean. That's still more pattern matching than proper instrumentation - room for improvement... when I figure out how. Not there yet, to be honest.<p>The sanitised optimism problem is deep - it's not always dishonesty, but quite often genuine model confusion about whether a suppressed error counts as a fix. The agent believes... voila, success. The only way around it I've found is that the verifier has to be skeptical by default, not reviewing in good faith.<p>This tool's live timeline is the missing piece in that loop. Being able to see the actual tool calls rather than the curated (and falsely optimistic) summary could improve verdict quality significantly.</p>
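<p>A minimal Python sketch of the PASS/FAIL/UNKNOWN verdict format described above. The names (Verdict, CriterionResult, effective_verdict, overall) are hypothetical, and the rule that a PASS without evidence is downgraded to UNKNOWN is an assumption standing in for "skeptical by default"; it is not the actual format Iris or Rex use.</p>

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNKNOWN = "UNKNOWN"


@dataclass(frozen=True)
class CriterionResult:
    # Hypothetical record shape: one entry per acceptance criterion.
    criterion: str      # the acceptance criterion being checked
    verdict: Verdict    # what the verifier reported
    evidence: str = ""  # e.g. a test log excerpt or a screenshot path


def effective_verdict(result: CriterionResult) -> Verdict:
    """Assumed skeptical rule: a PASS with no evidence counts as UNKNOWN."""
    if result.verdict is Verdict.PASS and not result.evidence.strip():
        return Verdict.UNKNOWN
    return result.verdict


def overall(results: list[CriterionResult]) -> Verdict:
    """FAIL beats UNKNOWN, UNKNOWN beats PASS; an empty report is UNKNOWN."""
    verdicts = [effective_verdict(r) for r in results]
    if Verdict.FAIL in verdicts:
        return Verdict.FAIL
    if not verdicts or Verdict.UNKNOWN in verdicts:
        return Verdict.UNKNOWN
    return Verdict.PASS
```

<p>The design choice here is that the report can only come out as PASS when every criterion is a PASS backed by evidence, so "I could not verify this" stays visible in the overall result instead of being averaged away.</p>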
]]></description><pubDate>Wed, 01 Apr 2026 21:54:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47607015</link><dc:creator>LeoStehlik</dc:creator><comments>https://news.ycombinator.com/item?id=47607015</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47607015</guid></item><item><title><![CDATA[New comment by LeoStehlik in "Show HN: Real-time dashboard for Claude Code agent teams"]]></title><description><![CDATA[
<p>This is what I've been missing while running multi-agent ops through OpenClaw.<p>The opacity problem is the one I hit hard: when a coordinator spawns 3-4 agents in parallel (builder, reviewer, tester, each with their own tool calls), the only visibility you have is what they choose to report back. Which is often sanitised and … dangerously optimistic.<p>The role separation / independent verification structure I run helps catch bad outputs, but it doesn't give me the live timeline of HOW an agent got to a conclusion. That's why I find this genuinely useful.<p>Noticed OpenClaw is already on the roadmap - my hands were itching to fork and adapt it. Starring it for now and adding it to my watchlist. The hook architecture should translate … OpenClaw fires session events that could feed the same pipeline. Looking forward to seeing that happen.</p>
]]></description><pubDate>Wed, 01 Apr 2026 20:17:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47605971</link><dc:creator>LeoStehlik</dc:creator><comments>https://news.ycombinator.com/item?id=47605971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47605971</guid></item></channel></rss>