Hacker News: chaoz_

New comment by chaoz_ in "MCP as Observability Interface: Connecting AI Agents to Kernel Tracepoints"

chaoz_ — Wed, 15 Apr 2026 14:11:18 +0000

You absolutely could. It'd be cool (though not easy from a security/compliance perspective) to be able to deeply "scan" your prod-deployed app.

There are quite a few startups built around connecting relevant eBPF/OTel traces, e.g. in response to uncaught exceptions (traditional RAG-based bug-fix generation).

New comment by chaoz_ in "Wacli – WhatsApp CLI"

chaoz_ — Wed, 15 Apr 2026 10:00:20 +0000

What even is this claim? Telegram is compromised? Some Telegram bot/group got compromised?

Is there any proof of a global Telegram issue related to Amex links? Sounds like BS

New comment by chaoz_ in "Try to take my position: The best promotion advice I ever got"

chaoz_ — Mon, 05 Jan 2026 19:58:47 +0000

I think that works well for smaller orgs, but in larger organizations (especially where department headcount growth is not expected) it might be more complicated and more meta/political. I wish that were not the case, but in reality, trying to "do the job" of your manager can backfire.

New comment by chaoz_ in "LLMs can't beat the easiest quest in Space Rangers 2"

chaoz_ — Sun, 04 Jan 2026 16:05:16 +0000

I extracted text quests from Space Rangers 2 after Claude failed a simple riddle I gave it when playing. Ran frontier LLMs through the 'easiest' quest, got 1 success in 60 attempts. Humans don't really have any problems solving it.

LLMs can't beat the easiest quest in Space Rangers 2

chaoz_ — Sun, 04 Jan 2026 16:05:16 +0000

Article URL: https://nikitakutz.substack.com/p/why-frontier-llms-fail-at-the-easiest

Comments URL: https://news.ycombinator.com/item?id=46489233

Points: 1

# Comments: 1

New comment by chaoz_ in "Testing Frontier LLMs on Space Rangers 2 Text Quests"

chaoz_ — Thu, 01 Jan 2026 21:43:44 +0000

I extracted text-based quests from Space Rangers 2 (a 2004 Russian RPG) and tested Claude Opus, GPT-5.2, and Gemini on them.

Repo: https://github.com/NickKuts/llm-game-evals

Testing Frontier LLMs on Space Rangers 2 Text Quests

chaoz_ — Thu, 01 Jan 2026 21:43:44 +0000

Article URL: https://nikitakutz.substack.com/p/why-llms-fail-on-quest-games-any

Comments URL: https://news.ycombinator.com/item?id=46458368

Points: 2

# Comments: 1

New comment by chaoz_ in "Yann LeCun to depart Meta and launch AI startup focused on 'world models'"

chaoz_ — Wed, 12 Nov 2025 11:10:48 +0000

I agree. I never understood LeCun's statement that we need to pivot toward the visual aspects of things because the bitrate of text is low while visual input through the eye is high.

Text and languages contain structured information and encode a lot of real-world complexity (or "model" it).

Not saying we won't pivot to visual data or world simulations, but he was clearly not the type of person to compete with other LLM research labs, nor did he propose any alternative that could be used to create something interesting for end-users.

New comment by chaoz_ in "Anticheat Update Tracking"

chaoz_ — Mon, 30 Jun 2025 07:40:26 +0000

Ehh, pretty sad there's almost no information on FACEIT anti-cheat. One of the most impactful out there. Wonder if it's just the invasiveness that separates it.

Valve can't replicate even part of it, while CS2 game modes are flooded with cheaters. Most people who chase competitiveness (which CS used to be all about – now it's also skins) just install FACEIT directly and ignore 90% of built-in game content.

Maybe Valve just doesn't want to make the game more difficult to install and sacrifice several % of their user base.

Ask HN: Is there value in a "generalized" visual tool for agent workflows?

chaoz_ — Fri, 20 Jun 2025 10:55:14 +0000

We build AI agents in both low-code (LangFlow) and high-code environments. Low-code solutions have built-in visual interfaces, but our coded agent networks lack visual tooling for:

1. Inspecting complex workflows and decision trees

2. Debugging multi-agent interactions

3. Testing inputs at different pipeline stages

----------------

Existing workarounds (like exposing agents as tools in chat UIs) feel hacky and don't solve the core inspection problem. We need something that bridges coded flexibility with visual clarity.

- Do tools exist that can import coded agent networks (LangChain, CrewAI, etc.) and generate visual representations for inspection/debugging?

- Is there market demand for a general-purpose visual inspector that works across multiple agent frameworks?

----------------

Essentially looking for something that can parse existing agent code and create interactive visual workflows for better understanding and troubleshooting.

Has anyone seen or built solutions in this space?
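
To make the ask concrete, here is a minimal sketch of the "code in, picture out" step, assuming the agent network has already been reduced to a plain adjacency mapping. The agent names and the `to_dot` helper are hypothetical and not taken from LangChain, CrewAI, or any other framework; extracting the mapping from real agent code is the hard part and is not shown. The output is Graphviz DOT text, which any DOT viewer can render.

```python
# Hypothetical agent network: each key hands off to the agents in its list.
# These names are illustrative only; they do not come from any framework.
AGENT_EDGES = {
    "router": ["researcher", "coder"],
    "researcher": ["summarizer"],
    "coder": ["summarizer"],
    "summarizer": [],
}


def to_dot(edges: dict[str, list[str]], name: str = "agents") -> str:
    """Render an adjacency mapping as Graphviz DOT text."""
    lines = [f"digraph {name} {{"]
    # Declare every node first so agents without outgoing edges still appear.
    for source in edges:
        lines.append(f'    "{source}";')
    # Then emit one directed edge per hand-off.
    for source, targets in edges.items():
        for target in targets:
            lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(AGENT_EDGES))
```

A static graph like this only covers the inspection item; debugging multi-agent interactions and testing inputs at different stages would need runtime hooks on top.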

Comments URL: https://news.ycombinator.com/item?id=44326536

Points: 1

# Comments: 0

New comment by chaoz_ in "Can LLMs do randomness?"

chaoz_ — Wed, 30 Apr 2025 11:57:21 +0000

"You can actually read up on how these things work."

While you can definitely read about how some parts of a very complex neural network function, it's very challenging to understand the underlying patterns.

That's why even the people who invented components of these networks still invest in areas like mechanistic interpretability, trying to develop a model of how these systems actually operate. See https://www.transformer-circuits.pub/2022/mech-interp-essay (Chris Olah)

Ask HN: Why mining power is useless for LLM training?

chaoz_ — Wed, 26 Feb 2025 09:54:40 +0000

I don't know anything about crypto, but why is it technically so difficult to transform hash function computations into (at least partially) decentralized LLM training? Have there been any attempts to make it at least somewhat useful?
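
For context, a toy sketch of what the hash side of the question looks like, assuming a simplified single SHA-256 proof-of-work (Bitcoin itself uses double SHA-256 over a block header and compares against a numeric target). The `mine` function and its parameters are made up for illustration. Each attempt is an independent guess whose result is discarded unless it wins, which is quite different from the floating-point matrix arithmetic that training relies on.

```python
import hashlib


def mine(block_data: bytes, difficulty: int) -> int:
    """Return the first nonce whose SHA-256 hex digest starts with
    `difficulty` zero digits (toy proof-of-work, not a real protocol)."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        # Each attempt hashes the data plus a counter; failed attempts
        # produce nothing reusable.
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1


if __name__ == "__main__":
    winning_nonce = mine(b"example block", difficulty=3)
    print("nonce found:", winning_nonce)
```

Nothing in this loop resembles a gradient step, so reusing the work for training would mean changing what the hardware computes, not just redirecting it.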

Comments URL: https://news.ycombinator.com/item?id=43182311

Points: 5

# Comments: 5