<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: opsmeter</title><link>https://news.ycombinator.com/user?id=opsmeter</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 28 May 2026 21:29:44 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=opsmeter" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Show HN: Opsmeter.io – AI cost attribution and budget control for LLM apps]]></title><description><![CDATA[
<p>Hi HN,<p>I’m building Opsmeter, a tool to understand and control AI costs in LLM applications.<p>A problem I kept seeing is that most teams only notice AI cost issues when the invoice arrives. Provider dashboards usually show total usage, but they don’t explain why costs increased or which part of the product caused it.<p>Opsmeter helps break down AI spend by endpoint, tenant, user, model, and prompt version, so when costs spike you can quickly find the root cause.<p>A few things we focused on:<p>No proxy required.
Cross-provider cost attribution.
Budget alerts and spend monitoring.
Request-level visibility into where costs come from.<p>The goal is to help teams make AI costs understandable for both engineering and finance before bill shock happens.<p>I’d love feedback from people building with LLMs.<p>How are you tracking AI costs today?
What’s the hardest part of understanding cost spikes?
Would you want this as observability, governance, or both?<p>Website: <a href="https://opsmeter.io" rel="nofollow">https://opsmeter.io</a>
Docs: <a href="https://opsmeter.io/docs" rel="nofollow">https://opsmeter.io/docs</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47390935">https://news.ycombinator.com/item?id=47390935</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 15 Mar 2026 19:24:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47390935</link><dc:creator>opsmeter</dc:creator><comments>https://news.ycombinator.com/item?id=47390935</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47390935</guid></item><item><title><![CDATA[New comment by opsmeter in "$82,000 in 48 Hours from stolen Gemini API Key vs. normal monthly Usage Of $180"]]></title><description><![CDATA[
<p>Usage-based AI needs the same safety engineering as any “expensive actuator”: rate limits, quotas, and automatic shutdown thresholds. Otherwise a leaked key becomes an unbounded liability.</p>
]]></description><pubDate>Wed, 04 Mar 2026 21:26:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47254125</link><dc:creator>opsmeter</dc:creator><comments>https://news.ycombinator.com/item?id=47254125</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47254125</guid></item><item><title><![CDATA[New comment by opsmeter in "Stolen Gemini API key racks up $82,000 in 48 hours"]]></title><description><![CDATA[
<p>This reads like an “incident without guardrails”: per-project caps/quotas, anomaly alerts that fire within minutes, per-environment keys, and an automated kill-switch should be defaults for usage-based APIs. Billing emails arrive too late to be anything but post-mortems.</p>
]]></description><pubDate>Wed, 04 Mar 2026 21:26:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47254120</link><dc:creator>opsmeter</dc:creator><comments>https://news.ycombinator.com/item?id=47254120</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47254120</guid></item><item><title><![CDATA[New comment by opsmeter in "Show HN: Cost per Outcome for AI Workflows"]]></title><description><![CDATA[
<p>Nice — those two features tend to unlock the “why” behind drift. One thing we found especially useful was pairing cost/outcome alerts with a root-cause slice: when slope jumps, immediately show top contributing endpoint/feature + tenant/user + prompt version changes + retry ratio/context size trend.
For your event_id model: how do you handle partial outcomes (e.g., success after fallback/escalation)? And do you keep pricing snapshots by timestamp, so that historical cost/outcome comparisons stay consistent across model price changes?</p>
]]></description><pubDate>Sun, 01 Mar 2026 15:23:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47207534</link><dc:creator>opsmeter</dc:creator><comments>https://news.ycombinator.com/item?id=47207534</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47207534</guid></item><item><title><![CDATA[New comment by opsmeter in "Show HN: Cost per Outcome for AI Workflows"]]></title><description><![CDATA[
<p>“Cost per outcome” is the metric most teams actually need. In prod we saw totals look fine while cost/outcome drifted due to retries + fallback paths + context creep. Are you planning a before/after deploy comparison (prompt/version) to catch regressions, or anomaly alerts on cost/outcome slope?</p>
]]></description><pubDate>Thu, 26 Feb 2026 03:09:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47161326</link><dc:creator>opsmeter</dc:creator><comments>https://news.ycombinator.com/item?id=47161326</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47161326</guid></item><item><title><![CDATA[New comment by opsmeter in "Show HN: AgentBudget – Real-time dollar budgets for AI agents"]]></title><description><![CDATA[
<p>This is exactly the pain point with agents: spend isn’t linear because fanout + retries compound. One thing that helped us debug/contain spikes is tracking cost per “user-action/outcome” (not just per call) plus a retry ratio trend (429/timeouts). Do you support budgets per step/tool in the chain, or only per overall run?</p>
]]></description><pubDate>Thu, 26 Feb 2026 03:08:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47161315</link><dc:creator>opsmeter</dc:creator><comments>https://news.ycombinator.com/item?id=47161315</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47161315</guid></item><item><title><![CDATA[New comment by opsmeter in "Ask HN: What happens after the AI bubble bursts?"]]></title><description><![CDATA[
<p>One thing that surprised our team: cost isn’t just “more usage” — retries and context creep can multiply spend with the same user behavior. We now track cost/request and cost per user-action per endpoint over time, plus a retry ratio. When either drifts after a change, it’s usually a quick fix (backoff, caps, trimming history).</p>
]]></description><pubDate>Thu, 26 Feb 2026 02:58:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=47161246</link><dc:creator>opsmeter</dc:creator><comments>https://news.ycombinator.com/item?id=47161246</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47161246</guid></item><item><title><![CDATA[Show HN: Opsmeter – attribute LLM spend to endpoints and prompt versions (no proxy)]]></title><description><![CDATA[
<p>Hi HN — I built Opsmeter, a lightweight LLM telemetry tool focused on cost attribution + budget control.<p>Provider dashboards mostly show totals. Opsmeter shows what caused the bill by breaking spend down by endpointTag, promptVersion, and optionally userId — plus latency and success/error rates.<p>It’s no-proxy: Opsmeter doesn’t sit in your request path. After each LLM call, you send a small telemetry payload to /v1/ingest/llm-request (provider, model, endpointTag, promptVersion, token counts, latency, status). Opsmeter normalizes cost via a provider/model pricing table and surfaces trends + regressions.<p>Links:<p>Home: <a href="https://opsmeter.io" rel="nofollow">https://opsmeter.io</a><p>Docs: <a href="https://opsmeter.io/docs" rel="nofollow">https://opsmeter.io/docs</a><p>Pricing: <a href="https://opsmeter.io/pricing" rel="nofollow">https://opsmeter.io/pricing</a><p>If you try it and share anonymized screenshots/feedback, I’m happy to help you interpret the results — e.g.<p>which endpoints drive spend<p>which prompt versions increased tokens/cost (deploy regressions)<p>which users (optional) are the biggest cost drivers<p>suggested budget thresholds (80% warning / 100% exceeded) and alerting setup<p>Feedback welcome — especially on what you’d want next: staying telemetry-first, and potentially adding an optional gateway mode later.</p>
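<p>A minimal sketch of the flow described above: build a telemetry record after an LLM call, estimate its cost from a pricing table, and classify spend against the 80% / 100% budget thresholds. It is illustrative only and is not Opsmeter's actual schema or client code. The field names inputTokens, outputTokens, and latencyMs, the PRICING values, and the helper functions are all assumptions; only provider, model, endpointTag, promptVersion, status, and the /v1/ingest/llm-request path come from the post. No network call is made.</p>

```python
from dataclasses import dataclass, asdict

# Hypothetical prices in USD per million tokens; placeholders, not real provider pricing.
PRICING = {
    ("example-provider", "example-model"): {"input": 1.00, "output": 2.00},
}


@dataclass
class LlmRequestTelemetry:
    """One record per LLM call; token and latency field names are assumed."""
    provider: str
    model: str
    endpointTag: str
    promptVersion: str
    inputTokens: int
    outputTokens: int
    latencyMs: int
    status: str


def estimate_cost_usd(event: LlmRequestTelemetry) -> float:
    """Normalize token counts to dollars using the provider/model pricing table."""
    price = PRICING[(event.provider, event.model)]
    total = event.inputTokens * price["input"] + event.outputTokens * price["output"]
    return total / 1_000_000


def budget_state(spend_usd: float, budget_usd: float) -> str:
    """Classify spend against a budget: warning at 80%, exceeded at 100%."""
    ratio = spend_usd / budget_usd
    if ratio >= 1.0:
        return "exceeded"
    if ratio >= 0.8:
        return "warning"
    return "ok"


event = LlmRequestTelemetry(
    provider="example-provider",
    model="example-model",
    endpointTag="chat.summarize",
    promptVersion="v3",
    inputTokens=1000,
    outputTokens=500,
    latencyMs=420,
    status="success",
)

# This dict is what a client would serialize and POST to /v1/ingest/llm-request.
payload = asdict(event)
```

<p>Because the record is sent after the call completes, a failure to deliver telemetry cannot block or slow the LLM request itself, which is the practical upside of staying out of the request path.</p>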
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46965730">https://news.ycombinator.com/item?id=46965730</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 10 Feb 2026 19:45:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=46965730</link><dc:creator>opsmeter</dc:creator><comments>https://news.ycombinator.com/item?id=46965730</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46965730</guid></item></channel></rss>