<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: behat</title><link>https://news.ycombinator.com/user?id=behat</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 17 Apr 2026 07:55:01 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=behat" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>>> how the runbooks can self heal if results from some steps in the middle are not expected.<p>Yeah, this is a very interesting angle. Our primary mechanism here today is agent-created auto-memories. The agent keeps track of the most useful steps, and more importantly the dead-end steps, as it executes runbooks. We think this offers a great bridge for suggesting runbook updates and keeping them current.<p>>> Curious how much savings do you observe from using runbook versus purely let Claude do the planning at first.<p>It really depends on runbook quality, so I don't have a straightforward answer. Of course, it's faster and cheaper if you have well-defined steps in your runbooks. As an example, `check logs for service frontend, faceted by host_name` vs. `check logs` - the agent does more exploration in the latter case.<p>We wrote about the LLM costs of investigating production alerts more generally here, in case it's helpful: <a href="https://relvy.ai/blog/llm-cost-of-ai-sre-investigating-production-alerts">https://relvy.ai/blog/llm-cost-of-ai-sre-investigating-produ...</a></p>
]]></description><pubDate>Thu, 09 Apr 2026 18:38:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47707775</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47707775</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47707775</guid></item><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>Do you mean how we connect to internal data? Today, you can connect any API endpoint to Relvy, so if you have internal business data / dashboards that you look at while debugging, Relvy can do the same as long as there's an API for it.<p>Most of our deployments are self-hosted, in which case the data stays local (except for what goes to your chosen LLM provider), if that's what you are asking.</p>
]]></description><pubDate>Thu, 09 Apr 2026 18:28:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47707592</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47707592</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47707592</guid></item><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>Nice to see you here, Will! I’d generally recommend using OpenTelemetry for instrumentation so that you keep the option of switching between telemetry vendors.<p>Re: runbooks, yeah, even larger teams don’t have good ones to begin with. Relvy helps debug without runbooks as well - it might take longer to explore, but once you are happy with a particular investigation path the AI took, you can save it as a runbook for more deterministic future executions.</p>
]]></description><pubDate>Thu, 09 Apr 2026 16:57:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47706121</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47706121</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47706121</guid></item><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>> They want extension to their agent. If a project tells me I have to use their interface or agentic setup, it's 95% not going to happen<p>Yes, there’s definitely friction there. It may be that the right form factor is triggering Relvy’s debugging agent via Claude Code / Cursor.<p>Our early users rely heavily on looking at the raw data to review the AI’s RCA, so a standalone setup makes sense. Also, the dominant usage pattern is background agentic execution triggered by alerts, not manual use.</p>
]]></description><pubDate>Thu, 09 Apr 2026 15:24:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47704950</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47704950</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47704950</guid></item><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>Yes! That boundary between what can be automated and what still needs human judgement has shifted so much this last year. Things like 'go check this dashboard' can now be automated.<p>The ROI on runbooks (or good documentation in general) is much higher now if you have AI agents running them autonomously in the background. That makes it worth writing and maintaining runbooks.</p>
]]></description><pubDate>Thu, 09 Apr 2026 14:01:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47703904</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47703904</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47703904</guid></item><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>heh, I was just about to post the following on your previous comment re: reproducible benchmark results. Thanks for posting the blog.<p>With the docker images that we offer, in theory people can re-run the benchmark themselves with our agent, but we should document that and make it easier.<p>In the end, you really have to evaluate on your own production alerts. Hopefully the easy install and setup helps.</p>
]]></description><pubDate>Thu, 09 Apr 2026 13:17:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47703351</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47703351</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47703351</guid></item><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>For the debugging workflow you described, we would be a standalone replacement for Cursor or other agents. We don't yet write code, so we can't replace your Cursor agents entirely.<p>Re: differentiation - yes, faster, more accurate, and more consistent. Partially because of better tools and UX, and partially because we anchor on runbooks. On-call engineers can quickly see which steps the AI ran, what it found at each step, and the time-series graph that supports each finding.<p>Interesting that you have had great success with the Datadog MCP. Do you mainly look at logs?</p>
]]></description><pubDate>Thu, 09 Apr 2026 12:48:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47703036</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47703036</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47703036</guid></item><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>Thank you :)</p>
]]></description><pubDate>Thu, 09 Apr 2026 12:34:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47702881</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47702881</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47702881</guid></item><item><title><![CDATA[New comment by behat in "Launch HN: Relvy (YC F24) – On-call runbooks, automated"]]></title><description><![CDATA[
<p>Thanks. Yeah, Cursor / Claude Code + MCP is powerful. We differentiate on two fronts, mainly:<p>1) Greater accuracy with our specialized tools: Most MCP tools let agents query data or run *ql queries - this overwhelms context windows given the scale of telemetry data. Raw data is also not great for reasoning, so we’ve designed our tools to give models data in the right format, enriched with statistical summaries, baselines, and correlation data, so LLMs can focus on reasoning.<p>2) Product UX: You’ll also find that text-based outputs from general-purpose agents are not sufficient for this task - our notebook UX offers a great way to visualize the underlying data so you can review it and build trust in the AI.</p>
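To make the first point concrete, here is a minimal, generic sketch of log-pattern compaction - collapsing raw log lines into template counts before they reach the model. This is purely illustrative (the masking rules and function name are made up, not Relvy's actual tooling):

```python
import re
from collections import Counter

def compact_logs(lines):
    """Collapse raw log lines into (pattern, count) summaries so an
    LLM sees a handful of templates instead of thousands of lines."""
    patterns = Counter()
    for line in lines:
        # Mask common variable parts: hex ids, numbers, quoted strings.
        p = re.sub(r"0x[0-9a-f]+|\d+", "<N>", line)
        p = re.sub(r'"[^"]*"', "<S>", p)
        patterns[p] += 1
    # Most frequent templates first; this is what goes into the prompt.
    return patterns.most_common()

logs = [
    "GET /api/users/42 took 13ms",
    "GET /api/users/77 took 9ms",
    "timeout connecting to db-3 after 5000ms",
]
for pattern, count in compact_logs(logs):
    print(count, pattern)
```

Real systems use far more sophisticated template mining, but even this crude masking turns thousands of near-duplicate lines into a short, frequency-ranked summary.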
]]></description><pubDate>Thu, 09 Apr 2026 12:30:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47702842</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47702842</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47702842</guid></item><item><title><![CDATA[Launch HN: Relvy (YC F24) – On-call runbooks, automated]]></title><description><![CDATA[
<p>Hey HN! We are Bharath and Simranjit from Relvy AI (<a href="https://www.relvy.ai">https://www.relvy.ai</a>). Relvy automates on-call runbooks for software engineering teams. It is an AI agent equipped with tools that can analyze telemetry data and code at scale, helping teams debug and resolve production issues in minutes. Here’s a video: <a href="https://www.youtube.com/watch?v=BXr4_XlWXc0" rel="nofollow">https://www.youtube.com/watch?v=BXr4_XlWXc0</a><p>A lot of teams are using AI in some form to reduce their on-call burden. You may be pasting logs into Cursor, or using Claude Code with Datadog’s MCP server to help debug. What we’ve seen is that autonomous root cause analysis is a hard problem for AI. This shows up in benchmarks: Claude Opus 4.6 is currently at 36% accuracy on the OpenRCA dataset, in contrast to its performance on coding tasks.<p>There are three main reasons for this: (1) the volume of telemetry data can drown the model in noise; (2) interpreting the data depends on enterprise context; (3) on-call is a time-constrained, high-stakes problem, with little room for the AI to explore during an investigation. Errors that send the user down the wrong path are not easily forgiven.<p>At Relvy, we are tackling these problems by building specialized tools for telemetry data analysis. Our tools can detect anomalies and identify problem slices in dense time-series data, do log pattern search, and reason about span trees, all without overwhelming the agent context.<p>Anchoring the agent around runbooks means less agentic exploration and more deterministic steps that reflect what an experienced engineer would do. That results in faster analysis, and less cognitive load on engineers reviewing and understanding what the AI did.<p>How it works: install Relvy on a local machine via docker-compose (or via Helm charts, or sign up on our cloud), connect your stack (observability and code), create your first runbook, and have Relvy investigate a recent alert.<p>Each investigation is presented as a notebook in our web UI, with data visualizations that help engineers verify the results and build trust in the AI. From there, Relvy can be configured to respond automatically to alerts from Slack.<p>Some example runbook steps that Relvy automates:<p>- Check so-and-so dashboard, and see if the errors are isolated to a specific shard.<p>- Check if there’s a throughput surge on the APM page, and if so, whether it’s from a few IPs.<p>- Check recent commits to see if anything changed for this endpoint.<p>You can also configure AWS CLI commands that Relvy can run to automate mitigation actions, with human approval.<p>A little bit about us: we did YC back in fall 2024. We started our journey experimenting with continuous log monitoring with small language models - that was too slow. We then invested deeply in solving root cause analysis effectively, and our product today is the result of about a year of work with our early customers.<p>Give us a try today. We're happy to hear feedback, or to hear how you are tackling on-call burden at your company. Appreciate any comments or suggestions!</p>
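The anomaly-detection idea mentioned above can be sketched generically. The following is a toy rolling z-score detector over a latency series - an illustrative stand-in, not Relvy's actual method; the function name, window size, and threshold are all assumptions:

```python
from statistics import mean, stdev

def zscore_anomalies(series, window=10, threshold=3.0):
    """Flag indices where a point deviates from the trailing window
    by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(series)):
        base = series[i - window:i]
        mu, sigma = mean(base), stdev(base)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# A flat latency series (ms) with one spike at index 15.
latency = [100, 102, 99, 101, 100, 98, 103, 100, 101, 99,
           100, 102, 100, 99, 101, 450, 100, 102]
print(zscore_anomalies(latency))
```

Production detectors handle seasonality, sparse data, and slice attribution on top of this, but the core "compare the point against a trailing baseline" step is the same shape.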
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47702647">https://news.ycombinator.com/item?id=47702647</a></p>
<p>Points: 48</p>
<p># Comments: 25</p>
]]></description><pubDate>Thu, 09 Apr 2026 12:11:56 +0000</pubDate><link>https://www.relvy.ai</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47702647</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47702647</guid></item><item><title><![CDATA[Ramp: How we made Ramp sheets self-maintaining]]></title><description><![CDATA[
<p>Article URL: <a href="https://twitter.com/RampLabs/status/2036165188899012655">https://twitter.com/RampLabs/status/2036165188899012655</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47497668">https://news.ycombinator.com/item?id=47497668</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 24 Mar 2026 01:40:34 +0000</pubDate><link>https://twitter.com/RampLabs/status/2036165188899012655</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47497668</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47497668</guid></item><item><title><![CDATA[LLM Costs of AI investigating production alerts]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.relvy.ai/blog/llm-cost-of-ai-sre-investigating-production-alerts">https://www.relvy.ai/blog/llm-cost-of-ai-sre-investigating-production-alerts</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47398668">https://news.ycombinator.com/item?id=47398668</a></p>
<p>Points: 6</p>
<p># Comments: 1</p>
]]></description><pubDate>Mon, 16 Mar 2026 13:20:42 +0000</pubDate><link>https://www.relvy.ai/blog/llm-cost-of-ai-sre-investigating-production-alerts</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47398668</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47398668</guid></item><item><title><![CDATA[OpenRCA benchmark – Improving Claude's root cause analysis accuracy by 12 pp]]></title><description><![CDATA[
<p>Article URL: <a href="https://relvy.ai/blog/relvy-improves-claude-accuracy-by-12pp-openrca-benchmark">https://relvy.ai/blog/relvy-improves-claude-accuracy-by-12pp-openrca-benchmark</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47339449">https://news.ycombinator.com/item?id=47339449</a></p>
<p>Points: 12</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 11 Mar 2026 18:38:39 +0000</pubDate><link>https://relvy.ai/blog/relvy-improves-claude-accuracy-by-12pp-openrca-benchmark</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=47339449</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47339449</guid></item><item><title><![CDATA[New comment by behat in "Ask HN: What are you working on? (February 2026)"]]></title><description><![CDATA[
<p>An on-call runbook execution engine - being able to take plain-text runbook steps like
- look at logs for $service, check for dependency failures
- look at so-and-so dashboard<p>and execute them when an alert fires.<p>We are at <a href="https://www.relvy.ai">https://www.relvy.ai</a></p>
]]></description><pubDate>Mon, 09 Feb 2026 18:52:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=46949241</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=46949241</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46949241</guid></item><item><title><![CDATA[Can AI debug problem scenarios in the OpenTelemetry demo application?]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.relvy.ai/post/can-ai-debug-problems-in-the-opentelemetry-demo-application">https://www.relvy.ai/post/can-ai-debug-problems-in-the-opentelemetry-demo-application</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43848163">https://news.ycombinator.com/item?id=43848163</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 30 Apr 2025 17:15:17 +0000</pubDate><link>https://www.relvy.ai/post/can-ai-debug-problems-in-the-opentelemetry-demo-application</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=43848163</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43848163</guid></item><item><title><![CDATA[New comment by behat in "The killer app of Gemini Pro 1.5 is using video as an input"]]></title><description><![CDATA[
<p>Heh. Built a macOS app that does something like this a while ago - <a href="https://github.com/bharathpbhat/EssentialApp">https://github.com/bharathpbhat/EssentialApp</a><p>Back then, I used on-device OCR and then sent the text to GPT. I’ve been wanting to redo this with local LLMs.</p>
]]></description><pubDate>Wed, 21 Feb 2024 22:27:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=39460507</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=39460507</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39460507</guid></item><item><title><![CDATA[New comment by behat in "Show HN: Open-source macOS AI copilot using vision and voice"]]></title><description><![CDATA[
<p>Nice! Built something similar earlier to get fixes from ChatGPT for error messages on screen. No voice input, because I don't like speaking. My approach then was Apple's Vision framework for OCR + ChatGPT. This reminds me to test out OpenAI's Vision API as a replacement.<p>Thanks for sharing!</p>
]]></description><pubDate>Tue, 12 Dec 2023 15:59:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=38613924</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=38613924</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38613924</guid></item><item><title><![CDATA[How GitHub Copilot is getting better at understanding your code]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.blog/2023-05-17-how-github-copilot-is-getting-better-at-understanding-your-code/">https://github.blog/2023-05-17-how-github-copilot-is-getting-better-at-understanding-your-code/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=36044296">https://news.ycombinator.com/item?id=36044296</a></p>
<p>Points: 24</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 23 May 2023 13:52:15 +0000</pubDate><link>https://github.blog/2023-05-17-how-github-copilot-is-getting-better-at-understanding-your-code/</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=36044296</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36044296</guid></item><item><title><![CDATA[New comment by behat in "Tech stack for fine-tuning LLMs"]]></title><description><![CDATA[
<p>Thank you for sharing! The HF docs seem easy to follow. My application is text generation itself, so I may see different results.</p>
]]></description><pubDate>Thu, 18 May 2023 15:14:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=35989128</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=35989128</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35989128</guid></item><item><title><![CDATA[New comment by behat in "Tech stack for fine-tuning LLMs"]]></title><description><![CDATA[
<p>Thank you for sharing your experience. The linked blog post is great!</p>
]]></description><pubDate>Thu, 18 May 2023 15:13:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=35989105</link><dc:creator>behat</dc:creator><comments>https://news.ycombinator.com/item?id=35989105</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35989105</guid></item></channel></rss>