Hacker News: honorable_judge

New comment by honorable_judge in "Show HN: LMM for LLMs – A mental model for building LLM apps"

honorable_judge — Tue, 22 Apr 2025 22:20:33 +0000

Yeah - that's right. It's this separation of concerns that I think will help people break through the confusion of building agents.

Show HN: LMM for LLMs – A mental model for building LLM apps

honorable_judge — Tue, 22 Apr 2025 19:32:26 +0000

I've been building agentic apps for some large Fortune 500 companies (T-Mobile, Twilio, etc.) and developed a mental model that serves as a practical guide to building them: separate the high-level, agent-specific logic from the low-level platform capabilities. I call it the L-MM: the Logical Mental Model for LLM applications.

This mental model has been tremendously helpful not only in building agents but also in helping customers think about the development process. When I finish a consulting engagement, they can move faster across the stack, and engineers and platform teams can work concurrently without interfering with each other, which boosts productivity.

So what is the high-level logic vs. the low-level platform work?

High-Level Logic (Agent & Task Specific)

Tools and Environment - These are specific integrations and capabilities that allow agents to interact with external systems or APIs to perform real-world tasks. Examples include:

    Booking a table via OpenTable API
    Scheduling calendar events via Google Calendar or Microsoft Outlook
    Retrieving and updating data from CRM platforms like Salesforce
    Utilizing payment gateways to complete transactions

Role and Instructions - Clearly defining an agent's persona, responsibilities, and explicit instructions is essential for predictable and coherent behavior. This includes:

    The "personality" of the agent (e.g., professional assistant)
    Explicit boundaries around task completion ("done criteria")
    Behavioral guidelines for handling unexpected inputs or situations
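
A minimal sketch of how the high-level side could be kept in one place, assuming a plain dataclass rather than any particular framework. `AgentSpec`, `system_prompt`, and `book_table` are hypothetical names for illustration; a real tool would call an actual booking API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class AgentSpec:
    """Agent-specific logic only: role, instructions, done criteria, tools."""
    role: str
    instructions: str
    done_criteria: str
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def system_prompt(self) -> str:
        # Render the role and boundaries into a single prompt string.
        tool_names = ", ".join(sorted(self.tools)) or "none"
        return (
            f"Role: {self.role}\n"
            f"Instructions: {self.instructions}\n"
            f"Done when: {self.done_criteria}\n"
            f"Tools: {tool_names}"
        )


def book_table(restaurant: str, party_size: int) -> str:
    # Hypothetical stand-in for a real reservation API call.
    return f"Booked {restaurant} for {party_size}"


agent = AgentSpec(
    role="professional assistant",
    instructions="Ask for missing details before booking.",
    done_criteria="the user has a confirmed reservation",
    tools={"book_table": book_table},
)
```

Nothing here touches routing, retries, or tracing; those stay on the platform side.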

Low-Level Logic (Common Platform Capabilities)

Routing - Efficiently coordinating tasks between multiple specialized agents, ensuring seamless hand-offs and effective delegation:

    Implementing intelligent load balancing and dynamic agent selection based on task context
    Supporting retries, failover strategies, and fallback mechanisms
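
As a rough illustration of routing with retries and failover, here is a sketch that assumes agents are plain callables and that "task context" is a naive keyword match. `select_agents` and `route` are hypothetical names, not part of any library.

```python
from typing import Callable, Dict, List

Agent = Callable[[str], str]


def select_agents(task: str, registry: Dict[str, List[Agent]],
                  default: List[Agent]) -> List[Agent]:
    """Pick the agent chain whose keyword appears in the task text."""
    lowered = task.lower()
    for keyword, agents in registry.items():
        if keyword in lowered:
            return agents
    return default


def route(task: str, registry: Dict[str, List[Agent]],
          default: List[Agent], retries: int = 2) -> str:
    """Try each selected agent in order, retrying before failing over."""
    last_error = None
    for agent in select_agents(task, registry, default):
        for _ in range(retries):
            try:
                return agent(task)
            except Exception as exc:  # a real system would catch narrower errors
                last_error = exc
    raise RuntimeError("all agents failed") from last_error
```

The order of each list in the registry doubles as the failover order, which keeps the policy visible in configuration rather than buried in agent code.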

Guardrails - Centralized mechanisms to safeguard interactions and ensure reliability and safety:

    Filtering or moderating sensitive or harmful content
    Real-time compliance checks for industry-specific regulations (e.g., GDPR, HIPAA)
    Threshold-based alerts and automated corrective actions to prevent misuse
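
A centralized guardrail can be sketched as a single function that every prompt passes through. The version below is an assumption-laden toy: it uses two simple regexes (email addresses and US SSN-shaped numbers) and a hit threshold, whereas a production guardrail would likely rely on a moderation model or a dedicated PII detector. `PATTERNS` and `apply_guardrail` are hypothetical names.

```python
import re
from typing import Tuple

# Illustrative patterns only; not a complete PII detector.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def apply_guardrail(text: str, max_hits: int = 3) -> Tuple[str, bool]:
    """Redact sensitive matches and report whether the threshold was exceeded.

    Returns (redacted_text, blocked). `blocked` is True when the number of
    redactions is greater than `max_hits`.
    """
    hits = 0
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label} redacted]", text)
        hits += count
    return text, hits > max_hits
```

Because the check lives in one place, changing the threshold or adding a pattern does not require touching any agent.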

Access to LLMs - Providing robust and centralized access to multiple LLMs ensures high availability and scalability:

    Implementing smart retry logic with exponential backoff
    Centralized rate limiting and quota management to optimize usage
    Handling diverse LLM backends transparently (OpenAI, Cohere, local open-source models, etc.)
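
The retry logic mentioned above can be sketched as follows. This assumes exponential backoff with "full jitter" (a random delay between zero and the current cap), which is one common variant rather than the only one; `call_with_backoff` and its parameters are hypothetical, and `fn` stands in for whatever function calls an LLM backend.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 0.5, max_delay: float = 8.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on any exception with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The cap doubles each attempt: 0.5s, 1s, 2s, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the behavior can be tested without waiting. In practice you would probably retry only on rate-limit or transient network errors instead of every exception.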

Observability - Comprehensive visibility into system performance and interactions using industry-standard practices:

    W3C Trace Context compatible distributed tracing for clear visibility across requests
    Detailed logging and metrics collection (latency, throughput, error rates, token usage)
    Easy integration with popular observability platforms like Grafana, Prometheus, Datadog, and OpenTelemetry
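
For a sense of what W3C Trace Context compatibility involves, here is a sketch that builds and propagates a `traceparent` header by hand. The header layout (version, 32-hex trace ID, 16-hex parent span ID, 2-hex flags) follows the W3C spec for version `00`; the function names are hypothetical, and in practice a tracing SDK would normally do this for you.

```python
import re
import secrets

# version-traceid-parentid-flags, all lowercase hex
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace: fresh trace ID and span ID."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def child_traceparent(parent: str) -> str:
    """Keep the trace ID and flags, but mint a new span ID for the next hop."""
    match = _TRACEPARENT.match(parent)
    if match is None:
        raise ValueError("malformed traceparent header")
    _version, trace_id, _parent_span, flags = match.groups()
    # This sketch only emits version 00.
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

Each hop (agent, gateway, LLM call) forwards the same trace ID with its own span ID, which is what lets a backend stitch the request back together.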

Why This Matters

By adopting this structured mental model, teams can achieve clear separation of concerns, improving collaboration, reducing complexity, and accelerating the development of scalable, reliable, and safe agentic applications.

I'm actively working on addressing challenges in this domain. If you're navigating similar problems or have insights to share, let's discuss further - I'll leave some links about the stack too if folks want them.

High-level framework - https://openai.github.io/openai-agents-python/

Low-level infrastructure - https://github.com/katanemo/archgw

Comments URL: https://news.ycombinator.com/item?id=43765467

Points: 6

# Comments: 2

New comment by honorable_judge in "Show HN: ArchGW – An open-source intelligent proxy server for prompts"

honorable_judge — Thu, 06 Mar 2025 04:57:48 +0000

Envoy is compatible with OTel out of the box. That's a big plus for observability. Plus, Envoy is designed for high-load data plane work (workloads in the request path) and is widely used in modern stacks. There are several advantages to using Arch as the source of observability (traces, metrics, logs).

New comment by honorable_judge in "Show HN: Intelligent proxy for human-in-the-loop agents written in Rust"

honorable_judge — Tue, 26 Nov 2024 21:21:01 +0000

Arch moves the critical but crufty work around safety, observability, and routing of prompts outside business logic. It's a uniquely intelligent infrastructure primitive, engineered with purpose-built fast LLMs [3] for tasks like intent detection over multi-turn conversations, parameter identification and extraction, and triggering single or multiple function calls. It also offers convenience features to auto-dispatch LLM calls for summarization based on data from your APIs, via system prompts configured in archgw.

Show HN: Intelligent proxy for human-in-the-loop agents written in Rust

honorable_judge — Tue, 26 Nov 2024 19:11:07 +0000

Article URL: https://github.com/katanemo/archgw

Comments URL: https://news.ycombinator.com/item?id=42248876

Points: 3

# Comments: 1

Envoy (Proxy) but for AI Agents

honorable_judge — Tue, 26 Nov 2024 01:41:06 +0000

Article URL: https://github.com/katanemo/archgw

Comments URL: https://news.ycombinator.com/item?id=42241817

Points: 4

# Comments: 0

New comment by honorable_judge in "Show HN: Dumbo – Hono inspired framework for PHP"

honorable_judge — Wed, 20 Nov 2024 01:00:18 +0000

For a programming language mostly out of favor, the effort to get a framework right and find the right audience to use this over existing options will be tricky. Good luck!

New comment by honorable_judge in "Show HN: archgw: open-source, intelligent proxy for AI agents, built on Envoy"

honorable_judge — Wed, 20 Nov 2024 00:56:37 +0000

It's a proxy built on Envoy. I think it's fairly clear that this is a separate process. As far as I can tell, you create a config file, boot up archgw, and in the config have it point to the endpoints where prompts get forwarded.

New comment by honorable_judge in "Show HN: Mapping with WhatsApp"

honorable_judge — Wed, 20 Nov 2024 00:49:37 +0000

Found it interesting (for the use case) - the work to export chat and then map is exactly that: work

New comment by honorable_judge in "Show HN: archgw: open-source, intelligent proxy for AI agents, built on Envoy"

honorable_judge — Wed, 20 Nov 2024 00:47:57 +0000

With all the focus on language-specific frameworks, this out-of-process architecture choice is an interesting one. On one hand, it helps you sidestep the "is this functionality available on js, java, etc" question, and on the other it means it's not as easy as `import archgw` in python. Good luck though, feels like an interesting project.