<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: honorable_judge</title><link>https://news.ycombinator.com/user?id=honorable_judge</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 25 Apr 2026 11:18:12 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=honorable_judge" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by honorable_judge in "Show HN: LMM for LLMs – A mental model for building LLM apps"]]></title><description><![CDATA[
<p>Yeah, that's right. It's this separation of concerns that I think will help people break through the confusion of building agents.</p>
]]></description><pubDate>Tue, 22 Apr 2025 22:20:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=43766848</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=43766848</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43766848</guid></item><item><title><![CDATA[Show HN: LMM for LLMs – A mental model for building LLM apps]]></title><description><![CDATA[
<p>I've been building agentic apps for some large Fortune 500 companies (T-Mobile, Twilio, etc.) and developed a mental model that serves as a practical guide for building them: separate the high-level, agent-specific logic from the low-level platform capabilities. I call it the L-MM: the Logical Mental Model for LLM applications.<p>This mental model has been tremendously helpful not only in building agents but also in helping customers think about the development process. When I am done with a consulting engagement, they can move faster across the stack, and their engineers and platform teams can work concurrently without getting in each other's way.<p>So what is the high-level logic vs. the low-level platform work?<p>High-Level Logic (Agent & Task Specific)<p>Tools and Environment - These are specific integrations and capabilities that allow agents to interact with external systems or APIs to perform real-world tasks. Examples include:<p><pre><code>    Booking a table via OpenTable API
    Scheduling calendar events via Google Calendar or Microsoft Outlook
    Retrieving and updating data from CRM platforms like Salesforce
    Utilizing payment gateways to complete transactions
</code></pre>
Role and Instructions - Clearly defining an agent's persona, responsibilities, and explicit instructions is essential for predictable and coherent behavior. This includes:<p><pre><code>    The "personality" of the agent (e.g., professional assistant)
    Explicit boundaries around task completion ("done criteria")
    Behavioral guidelines for handling unexpected inputs or situations
</code></pre>
Low-Level Logic (Common Platform Capabilities)<p>Routing - Efficiently coordinating tasks between multiple specialized agents, ensuring seamless hand-offs and effective delegation:<p><pre><code>    Implementing intelligent load balancing and dynamic agent selection based on task context
    Supporting retries, failover strategies, and fallback mechanisms
</code></pre>
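<p>A minimal sketch of the routing idea above (agent selection by task context, retries, then failover to a fallback agent). This is an illustration under stated assumptions, not how any particular gateway implements it: the `route` function, the keyword table, and the two demo agents are all hypothetical names introduced here.

```python
from typing import Callable

# Hypothetical agent type: takes a task string, returns a reply string.
Agent = Callable[[str], str]


def route(task: str, agents: dict[str, Agent], keywords: dict[str, list[str]],
          fallback: str, max_attempts: int = 2) -> str:
    """Pick an agent by keyword match, retry it, then fail over to the fallback."""
    lowered = task.lower()
    # Dynamic selection: the first agent whose keywords appear in the task wins;
    # if nothing matches, the fallback agent is chosen.
    chosen = next(
        (name for name, words in keywords.items()
         if any(word in lowered for word in words)),
        fallback,
    )
    # Try the chosen agent first, then the fallback if it is a different agent.
    order = [chosen] if chosen == fallback else [chosen, fallback]
    last_error = None
    for name in order:
        for _ in range(max_attempts):
            try:
                return agents[name](task)
            except Exception as exc:  # a real router would catch narrower errors
                last_error = exc
    raise RuntimeError("all agents failed") from last_error


def booking_agent(task: str) -> str:
    raise TimeoutError("booking backend unavailable")  # simulate an outage


def general_agent(task: str) -> str:
    return "general: " + task


demo_agents = {"booking": booking_agent, "general": general_agent}
demo_keywords = {"booking": ["book", "table"]}
reply = route("Book a table for two", demo_agents, demo_keywords, fallback="general")
print(reply)  # prints "general: Book a table for two"
```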
Guardrails - Centralized mechanisms to safeguard interactions and ensure reliability and safety:<p><pre><code>    Filtering or moderating sensitive or harmful content
    Real-time compliance checks for industry-specific regulations (e.g., GDPR, HIPAA)
    Threshold-based alerts and automated corrective actions to prevent misuse
</code></pre>
Access to LLMs - Providing robust and centralized access to multiple LLMs ensures high availability and scalability:<p><pre><code>    Implementing smart retry logic with exponential backoff
    Centralized rate limiting and quota management to optimize usage
    Handling diverse LLM backends transparently (OpenAI, Cohere, local open-source models, etc.)
</code></pre>
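<p>As a rough illustration of the "smart retry logic with exponential backoff" item above, here is a small sketch. The function names, the default delays, and the choice of full jitter are assumptions made for this example; `call` stands in for whatever function actually talks to an LLM backend.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Delays that double on each attempt, capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retry(call: Callable[[], T], retries: int = 4,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`; on failure wait with jittered exponential backoff and retry."""
    for delay in backoff_delays(retries):
        try:
            return call()
        except Exception:  # a real client would retry only on transient errors
            # Full jitter: sleep a random amount up to the current delay so
            # many clients do not retry in lockstep.
            sleep(random.uniform(0, delay))
    return call()  # final attempt; any exception propagates to the caller
```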
Observability - Comprehensive visibility into system performance and interactions using industry-standard practices:<p><pre><code>    W3C Trace Context compatible distributed tracing for clear visibility across requests
    Detailed logging and metrics collection (latency, throughput, error rates, token usage)
    Easy integration with popular observability platforms like Grafana, Prometheus, Datadog, and OpenTelemetry
</code></pre>
Why This Matters<p>By adopting this structured mental model, teams can achieve clear separation of concerns, improving collaboration, reducing complexity, and accelerating the development of scalable, reliable, and safe agentic applications.<p>I'm actively working on addressing challenges in this domain. If you're navigating similar problems or have insights to share, let's discuss further - I'll leave some links about the stack too if folks want it.<p>High-level framework - <a href="https://openai.github.io/openai-agents-python/" rel="nofollow">https://openai.github.io/openai-agents-python/</a>
Low-level infrastructure - <a href="https://github.com/katanemo/archgw">https://github.com/katanemo/archgw</a></p>
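<p>For the W3C Trace Context item in the Observability list above, a small sketch of what propagating a `traceparent` header can look like. It handles only version `00` of the header format (version, 32-hex trace id, 16-hex parent id, 2-hex flags); the helper names are hypothetical, and a real system would more likely rely on an OpenTelemetry SDK than on hand-rolled code like this.</p>

```python
import re
import secrets

# Version 00 layout: 00-(32 hex trace id)-(16 hex parent id)-(2 hex flags)
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace with random trace and span ids."""
    trace_id = secrets.token_hex(16)  # 16 random bytes, 32 hex chars
    span_id = secrets.token_hex(8)    # 8 random bytes, 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def child_traceparent(parent: str) -> str:
    """Keep the trace id and flags, mint a new span id for the outgoing call."""
    match = TRACEPARENT_RE.match(parent)
    if match is None:
        # Malformed or missing header: start a fresh trace instead.
        return new_traceparent()
    trace_id, _, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

<p>A proxy in the request path would call `child_traceparent` on the incoming header and attach the result to the upstream LLM or agent request, so that every hop shares one trace id.</p>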
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43765467">https://news.ycombinator.com/item?id=43765467</a></p>
<p>Points: 6</p>
<p># Comments: 2</p>
]]></description><pubDate>Tue, 22 Apr 2025 19:32:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=43765467</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=43765467</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43765467</guid></item><item><title><![CDATA[New comment by honorable_judge in "Show HN: ArchGW – An open-source intelligent proxy server for prompts"]]></title><description><![CDATA[
<p>Envoy is compatible with OTel out of the box. That's a big plus for observability. Plus, Envoy is designed for high-load data plane workloads (those in the request path) and is used across many modern stacks. There are several advantages to using Arch as the source of observability (traces, metrics, logs).</p>
]]></description><pubDate>Thu, 06 Mar 2025 04:57:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=43276571</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=43276571</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43276571</guid></item><item><title><![CDATA[New comment by honorable_judge in "Show HN: Intelligent proxy for human-in-the-loop agents written in Rust"]]></title><description><![CDATA[
<p>Arch moves the critical but crufty work around safety, observability, and routing of prompts outside business logic. It's a uniquely intelligent infrastructure primitive, engineered with purpose-built fast LLMs [3] for tasks like intent detection over multi-turn conversations, parameter identification and extraction, and triggering single or multiple function calls. It also offers convenience features to auto-dispatch LLM calls for summarization based on data from your APIs, via system prompts configured in archgw.</p>
]]></description><pubDate>Tue, 26 Nov 2024 21:21:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=42250184</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=42250184</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42250184</guid></item><item><title><![CDATA[Show HN: Intelligent proxy for human-in-the-loop agents written in Rust]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/katanemo/archgw">https://github.com/katanemo/archgw</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42248876">https://news.ycombinator.com/item?id=42248876</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 26 Nov 2024 19:11:07 +0000</pubDate><link>https://github.com/katanemo/archgw</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=42248876</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42248876</guid></item><item><title><![CDATA[Envoy (Proxy) but for AI Agents]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/katanemo/archgw">https://github.com/katanemo/archgw</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42241817">https://news.ycombinator.com/item?id=42241817</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 26 Nov 2024 01:41:06 +0000</pubDate><link>https://github.com/katanemo/archgw</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=42241817</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42241817</guid></item><item><title><![CDATA[New comment by honorable_judge in "Show HN: Dumbo – Hono inspired framework for PHP"]]></title><description><![CDATA[
<p>For a programming language mostly out of favor, the effort to get a framework right and find the right audience to use this over existing options will be tricky. Good luck!</p>
]]></description><pubDate>Wed, 20 Nov 2024 01:00:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=42189790</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=42189790</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42189790</guid></item><item><title><![CDATA[New comment by honorable_judge in "Show HN: archgw: open-source, intelligent proxy for AI agents, built on Envoy"]]></title><description><![CDATA[
<p>It's a proxy built on Envoy. I think it's fairly clear that this is a separate process. As far as I can tell, you create a config file, boot up archgw, and in the config have it point to endpoints where prompts get forwarded.</p>
]]></description><pubDate>Wed, 20 Nov 2024 00:56:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=42189763</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=42189763</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42189763</guid></item><item><title><![CDATA[New comment by honorable_judge in "Show HN: Mapping with WhatsApp"]]></title><description><![CDATA[
<p>Found it interesting (for the use case) - the work to export the chat and then map it is exactly that: work.</p>
]]></description><pubDate>Wed, 20 Nov 2024 00:49:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=42189715</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=42189715</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42189715</guid></item><item><title><![CDATA[New comment by honorable_judge in "Show HN: archgw: open-source, intelligent proxy for AI agents, built on Envoy"]]></title><description><![CDATA[
<p>With all the focus on language-specific frameworks, this out-of-process architecture choice is an interesting one. On one hand, it helps you sidestep the "is this functionality available on js, java, etc" question, and on the other it means it's not as easy as `import archgw` in python. Good luck though, feels like an interesting project.</p>
]]></description><pubDate>Wed, 20 Nov 2024 00:47:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=42189705</link><dc:creator>honorable_judge</dc:creator><comments>https://news.ycombinator.com/item?id=42189705</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42189705</guid></item></channel></rss>