<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: chaoz_</title><link>https://news.ycombinator.com/user?id=chaoz_</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 15 Apr 2026 20:45:24 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=chaoz_" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by chaoz_ in "MCP as Observability Interface: Connecting AI Agents to Kernel Tracepoints"]]></title><description><![CDATA[
<p>You absolutely could. It'd be cool (though not easy from a security/compliance perspective) to be able to deeply "scan" your prod-deployed app.<p>There are quite a few startups created by connecting relevant eBPF/OTel traces, e.g. in response to uncaught exceptions (traditional RAG-based bug-fix generation).</p>
]]></description><pubDate>Wed, 15 Apr 2026 14:11:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47779213</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=47779213</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47779213</guid></item><item><title><![CDATA[New comment by chaoz_ in "Wacli – WhatsApp CLI"]]></title><description><![CDATA[
<p>What even is this claim? Telegram is compromised? Some Telegram bot/group got compromised?<p>Is there any proof of a global Telegram issue related to Amex links? Sounds like BS.</p>
]]></description><pubDate>Wed, 15 Apr 2026 10:00:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47776921</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=47776921</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47776921</guid></item><item><title><![CDATA[New comment by chaoz_ in "Try to take my position: The best promotion advice I ever got"]]></title><description><![CDATA[
<p>I think that works well for smaller orgs, but in larger organizations (especially where department headcount growth is not expected) it might be more complicated and more meta/political. I wish that were not the case, but in reality, trying to "do the job" of your manager can backfire.</p>
]]></description><pubDate>Mon, 05 Jan 2026 19:58:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=46503958</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=46503958</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46503958</guid></item><item><title><![CDATA[New comment by chaoz_ in "LLMs can't beat the easiest quest in Space Rangers 2"]]></title><description><![CDATA[
<p>I extracted text quests from Space Rangers 2 after Claude failed a simple riddle I gave it when playing. Ran frontier LLMs through the 'easiest' quest, got 1 success in 60 attempts. Humans don't really have any problems solving it.</p>
]]></description><pubDate>Sun, 04 Jan 2026 16:05:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=46489234</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=46489234</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46489234</guid></item><item><title><![CDATA[LLMs can't beat the easiest quest in Space Rangers 2]]></title><description><![CDATA[
<p>Article URL: <a href="https://nikitakutz.substack.com/p/why-frontier-llms-fail-at-the-easiest">https://nikitakutz.substack.com/p/why-frontier-llms-fail-at-the-easiest</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46489233">https://news.ycombinator.com/item?id=46489233</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Sun, 04 Jan 2026 16:05:16 +0000</pubDate><link>https://nikitakutz.substack.com/p/why-frontier-llms-fail-at-the-easiest</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=46489233</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46489233</guid></item><item><title><![CDATA[New comment by chaoz_ in "Testing Frontier LLMs on Space Rangers 2 Text Quests"]]></title><description><![CDATA[
<p>I extracted text-based quests from Space Rangers 2 (a 2004 Russian RPG) and tested Claude Opus, GPT-5.2, and Gemini on them.<p>Repo: <a href="https://github.com/NickKuts/llm-game-evals" rel="nofollow">https://github.com/NickKuts/llm-game-evals</a></p>
]]></description><pubDate>Thu, 01 Jan 2026 21:43:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=46458369</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=46458369</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46458369</guid></item><item><title><![CDATA[Testing Frontier LLMs on Space Rangers 2 Text Quests]]></title><description><![CDATA[
<p>Article URL: <a href="https://nikitakutz.substack.com/p/why-llms-fail-on-quest-games-any">https://nikitakutz.substack.com/p/why-llms-fail-on-quest-games-any</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46458368">https://news.ycombinator.com/item?id=46458368</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Thu, 01 Jan 2026 21:43:44 +0000</pubDate><link>https://nikitakutz.substack.com/p/why-llms-fail-on-quest-games-any</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=46458368</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46458368</guid></item><item><title><![CDATA[New comment by chaoz_ in "Yann LeCun to depart Meta and launch AI startup focused on 'world models'"]]></title><description><![CDATA[
<p>I agree. I never understood LeCun's statement that we need to pivot toward the visual aspects of things because the bitrate of text is low while visual input through the eye is high.<p>Text and languages contain structured information and encode a lot of real-world complexity (or it's "modelling" that).<p>Not saying we won't pivot to visual data or world simulations, but he was clearly not the type of person to compete with other LLM research labs, nor did he propose any alternative that could be used to create something interesting for end-users.</p>
]]></description><pubDate>Wed, 12 Nov 2025 11:10:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=45898749</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=45898749</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45898749</guid></item><item><title><![CDATA[New comment by chaoz_ in "Anticheat Update Tracking"]]></title><description><![CDATA[
<p>Ehh, pretty sad there's almost no information on FACEIT anti-cheat. One of the most impactful out there. Wonder if it's just the invasiveness that separates it.<p>Valve can't replicate even part of it, while CS2 game modes are flooded with cheaters. Most people who chase competitiveness (which CS used to be all about – now it's also skins) just install FACEIT directly and ignore 90% of built-in game content.<p>Maybe Valve just doesn't want to make the game more difficult to install and sacrifice several % of their user base.</p>
]]></description><pubDate>Mon, 30 Jun 2025 07:40:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=44420529</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=44420529</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44420529</guid></item><item><title><![CDATA[Ask HN: Is there value in a "generalized" visual tool for agent workflows?]]></title><description><![CDATA[
<p>We build AI agents in both low-code (LangFlow) and high-code environments. Low-code solutions have built-in visual interfaces, but our coded agent networks lack visual tooling for:<p>1. Inspecting complex workflows and decision trees<p>2. Debugging multi-agent interactions<p>3. Testing inputs at different pipeline stages<p>----------------<p>Existing workarounds (like exposing agents as tools in chat UIs) feel hacky and don't solve the core inspection problem. We need something that bridges coded flexibility with visual clarity.<p>- Do tools exist that can import coded agent networks (LangChain, CrewAI, etc.) and generate visual representations for inspection/debugging?<p>- Is there market demand for a general-purpose visual inspector that works across multiple agent frameworks?<p>----------------<p>Essentially looking for something that can parse existing agent code and create interactive visual workflows for better understanding and troubleshooting.<p>Has anyone seen or built solutions in this space?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44326536">https://news.ycombinator.com/item?id=44326536</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 20 Jun 2025 10:55:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=44326536</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=44326536</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44326536</guid></item><item><title><![CDATA[New comment by chaoz_ in "Can LLMs do randomness?"]]></title><description><![CDATA[
<p>"You can actually read up on how these things work."<p>While you can definitely read about how some parts of a very complex neural network function, it's very challenging to understand the underlying patterns.<p>That's why even the people who invented components of these networks still invest in areas like mechanistic interpretability, trying to develop a model of how these systems actually operate. See <a href="https://www.transformer-circuits.pub/2022/mech-interp-essay" rel="nofollow">https://www.transformer-circuits.pub/2022/mech-interp-essay</a> (Chris Olah)</p>
]]></description><pubDate>Wed, 30 Apr 2025 11:57:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=43843951</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=43843951</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43843951</guid></item><item><title><![CDATA[Ask HN: Why mining power is useless for LLM training?]]></title><description><![CDATA[
<p>I don't know anything about crypto, but why is it technically so difficult to transform hash function computations into (at least partially) decentralized LLM training? Have there been any attempts to make it at least somewhat useful?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43182311">https://news.ycombinator.com/item?id=43182311</a></p>
<p>Points: 5</p>
<p># Comments: 5</p>
]]></description><pubDate>Wed, 26 Feb 2025 09:54:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=43182311</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=43182311</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43182311</guid></item><item><title><![CDATA[New comment by chaoz_ in "Meta Llama 3"]]></title><description><![CDATA[
<p>But do you think "next token prediction is enough for AGI", though?</p>
]]></description><pubDate>Thu, 18 Apr 2024 18:34:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=40079297</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=40079297</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40079297</guid></item><item><title><![CDATA[New comment by chaoz_ in "Meta Llama 3"]]></title><description><![CDATA[
<p>I agree with you so much, but he has a solid programmatic approach, which some of the guests uncover. Maybe that's the whole role of an interviewer.</p>
]]></description><pubDate>Thu, 18 Apr 2024 18:26:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=40079209</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=40079209</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40079209</guid></item><item><title><![CDATA[New comment by chaoz_ in "Meta Llama 3"]]></title><description><![CDATA[
<p>Indeed my thoughts, especially with the first Dario Amodei interview. He was able to ask all the right questions, and the discussion was super fruitful.</p>
]]></description><pubDate>Thu, 18 Apr 2024 18:24:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=40079197</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=40079197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40079197</guid></item><item><title><![CDATA[New comment by chaoz_ in "Meta Llama 3"]]></title><description><![CDATA[
<p>I can't express how good Dwarkesh's podcast is in general.</p>
]]></description><pubDate>Thu, 18 Apr 2024 16:30:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=40077950</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=40077950</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40077950</guid></item><item><title><![CDATA[New comment by chaoz_ in "Meta Llama 3"]]></title><description><![CDATA[
<p>That's very exciting. Are you quoting the same benchmark comparisons?</p>
]]></description><pubDate>Thu, 18 Apr 2024 16:30:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=40077937</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=40077937</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40077937</guid></item><item><title><![CDATA[Mechanical Turk: 84 Years of Chess and Deception]]></title><description><![CDATA[
<p>Article URL: <a href="https://en.wikipedia.org/wiki/Mechanical_Turk">https://en.wikipedia.org/wiki/Mechanical_Turk</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=40070134">https://news.ycombinator.com/item?id=40070134</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 17 Apr 2024 21:17:54 +0000</pubDate><link>https://en.wikipedia.org/wiki/Mechanical_Turk</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=40070134</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40070134</guid></item><item><title><![CDATA[Show HN: CaptureFlow – Provide LLM with debugger-level context of your app]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/CaptureFlow/captureflow-py">https://github.com/CaptureFlow/captureflow-py</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=39951878">https://news.ycombinator.com/item?id=39951878</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 06 Apr 2024 12:04:19 +0000</pubDate><link>https://github.com/CaptureFlow/captureflow-py</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=39951878</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39951878</guid></item><item><title><![CDATA[New comment by chaoz_ in "Show HN: CaptureFlow – LLM codegen/bugfix powered by live application context"]]></title><description><![CDATA[
<p>We have no good benchmark to estimate the bug-fixing ability; it was mostly a zero-shot "in this case it works" example.</p>
]]></description><pubDate>Thu, 04 Apr 2024 19:28:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=39934759</link><dc:creator>chaoz_</dc:creator><comments>https://news.ycombinator.com/item?id=39934759</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39934759</guid></item></channel></rss>