<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: kevinluddy39</title><link>https://news.ycombinator.com/user?id=kevinluddy39</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 21 Jun 2026 20:30:56 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=kevinluddy39" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by kevinluddy39 in "EvanFlow – A TDD driven feedback loop for Claude Code"]]></title><description><![CDATA[
<p>The per-agent-green / merge-broken pattern is the diagonal failure mode of multi-agent systems. Unit testing each agent in isolation captures correctness within scope; what stays invisible is the seam at handoff: argument schemas drifting between coder and overseer, response shapes that satisfy each agent's local validator but break the next agent's parser, and error messages that get summarized into "no error" by the time they reach the orchestrator.</p>
<p>I built tool-call-grader to instrument exactly this. It computes session-level statistics across the tool-call trace and runs six pathology detectors (silent failure, tool fixation, response bloat, schema drift, irrelevant response, cascading failure). On a hand-designed multi-agent benchmark, 7/7 scenarios passed, including specifically the case you're describing: per-agent results look fine, schema-drift fires at the seam.</p>
<p>The detector runs over the trace, not the output, so it catches the failure several turns before it shows up as a "weird merge bug" that a human has to debug. MIT licensed, npx-installable. Methodology in profile.</p>
]]></description><pubDate>Tue, 28 Apr 2026 16:58:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47937184</link><dc:creator>kevinluddy39</dc:creator><comments>https://news.ycombinator.com/item?id=47937184</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47937184</guid></item><item><title><![CDATA[I ran retrieval-auditor against LangChain's RAG quickstart, 5/6 flagged]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/kevin-luddy39/contrarianAI/tree/main/tools/retrieval-auditor/examples/langchain-quickstart-teardown">https://github.com/kevin-luddy39/contrarianAI/tree/main/tools/retrieval-auditor/examples/langchain-quickstart-teardown</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47928150">https://news.ycombinator.com/item?id=47928150</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 27 Apr 2026 22:18:37 +0000</pubDate><link>https://github.com/kevin-luddy39/contrarianAI/tree/main/tools/retrieval-auditor/examples/langchain-quickstart-teardown</link><dc:creator>kevinluddy39</dc:creator><comments>https://news.ycombinator.com/item?id=47928150</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47928150</guid></item><item><title><![CDATA[AI Heartache]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/kevin-luddy39/context-inspector/">https://github.com/kevin-luddy39/context-inspector/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47784837">https://news.ycombinator.com/item?id=47784837</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 15 Apr 2026 20:31:43 +0000</pubDate><link>https://github.com/kevin-luddy39/context-inspector/</link><dc:creator>kevinluddy39</dc:creator><comments>https://news.ycombinator.com/item?id=47784837</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47784837</guid></item></channel></rss>