<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: eigenblake</title><link>https://news.ycombinator.com/user?id=eigenblake</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 29 Apr 2026 16:22:36 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=eigenblake" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by eigenblake in "A playable DOOM MCP app"]]></title><description><![CDATA[
<p>I put Bad Apple in an MCP App two weeks ago <a href="https://youtu.be/YFF5H886slQ" rel="nofollow">https://youtu.be/YFF5H886slQ</a></p>
]]></description><pubDate>Tue, 28 Apr 2026 21:02:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47940715</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=47940715</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47940715</guid></item><item><title><![CDATA[New comment by eigenblake in "Claude mixes up who said what"]]></title><description><![CDATA[
<p>Alternative title: Claude Code puts words in my mouth (self prompt injection)<p>I originally thought this was just the model misattributing statements in discussions, but it does seem to be a harness bug, or at least an ontology bug. In my work with LLM frameworks, it always struck me as odd that tool call results are sometimes marked in the conversation as coming from the "User"; I think that may be fundamentally what enables this bug. Neither the LLM nor the harness should be able to claim something came from the user.<p>This is command injection. I don't know enough to say whether cryptography is part of the right answer, but it might be: hash each user message, sign it with a private key, and code the harness so that only messages with valid signatures can issue instructions. Yes, that might be overkill, but given the kinds of things agent harnesses are used for, I think the safety argument starts to speak for itself. This has never happened to me using CC though, for what it's worth.</p>
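<p>For illustration, a minimal sketch of that signing idea, using a symmetric HMAC as a stand-in for a full public/private key pair (all names here are hypothetical, not any real harness's API):</p>

```python
import hmac
import hashlib

# Hypothetical sketch: the harness holds a secret and tags each genuine
# user message at ingestion time. Anything later claiming to be a user
# message must carry a valid tag, so neither the model nor a tool result
# can forge "the user said X".
SECRET = b"harness-local-secret"  # in practice: random, per session

def sign(message: str) -> str:
    return hmac.new(SECRET, message.encode(), hashlib.sha256).hexdigest()

def is_genuine_user_message(message: str, tag: str) -> bool:
    # constant-time comparison to avoid timing leaks
    return hmac.compare_digest(sign(message), tag)
```

<p>With asymmetric keys the shape is the same: the harness verifies a signature, and only verified messages get "user" authority.</p>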
]]></description><pubDate>Fri, 10 Apr 2026 22:29:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47724495</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=47724495</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47724495</guid></item><item><title><![CDATA[New comment by eigenblake in "SkillsBench: Benchmarking how well agent skills work across diverse tasks"]]></title><description><![CDATA[
<p>The biggest limitation I see in this paper is the framing. Any time you have a lot of proprietary knowledge, or you've just sorted out the right solution when it isn't readily available from the model's parametric knowledge, that's when you should add a skill. Wrap it in a CLI that's easy to inspect. You don't need to store the whole help text of the skill either; the model can inspect it and its subcommands.<p>Reality doesn't force us to choose between skill or no skill; often it doesn't give us a choice at all. You can either make a skill for your company's proprietary system, or your model has to figure it out from scratch every time by searching wikis or reading code. Used right, skills are a compression mechanism: instead of the model fetching all of these files dynamically every run, the process can simply run statically.<p>To steel-man the paper: it is worth asking whether you should try to code something up first or reach for a skill first, and it may well be valid to say try first and, if you can't work it out in five minutes, install a skill. But there's a meta point of skills as software (where you deduplicate the effort of solving regressions).<p>As a reductio ad absurdum: if self-generated skills with no additional context _didn't_ eventually level off in performance, then we could reach AGI by making one big skill that keeps growing and solving harder and harder tasks, including improving the capability of its own skill-builder skill, all without embedding any signals from the environment or needing to interface with the real world at all.</p>
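<p>As a toy sketch of "wrap it in a CLI that's easy to inspect": the agent can run <code>--help</code> on the tool and its subcommands instead of re-deriving the process from wikis each session. Everything below (tool name, subcommand, flags) is made up for illustration:</p>

```python
import argparse

# Hypothetical: proprietary release know-how baked into an inspectable CLI.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="acme-tool",
        description="Internal release helper (illustrative only).")
    sub = parser.add_subparsers(dest="command")
    deploy = sub.add_parser("deploy", help="Deploy a service to staging.")
    deploy.add_argument("service", help="Service name from the internal registry.")
    deploy.add_argument("--region", default="us-east-1",
                        help="Target region (encodes the company default).")
    return parser

# The agent discovers usage via `acme-tool --help`; here we just parse a call.
args = build_parser().parse_args(["deploy", "search-api", "--region", "eu-west-1"])
```

<p>The help text is generated from the parser itself, so the "documentation" can't drift from the tool's actual interface.</p>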
]]></description><pubDate>Tue, 17 Feb 2026 05:42:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47044109</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=47044109</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47044109</guid></item><item><title><![CDATA[New comment by eigenblake in "Beyond agentic coding"]]></title><description><![CDATA[
<p>I have been considering what it would be like to give each keyword its own fixed color, each variable a color for its type, and each function or symbol name a color derived from a hash of the name, then print the result as a matrix, essentially transforming your code into a printable "low-LOD" or "mipmap" form. This could be implemented like the VS Code minimap, but I think the right move here is to implement it as a hook that can modify the output of your agent. That way you can look at the structure of the code without reading the names in particular.</p>
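<p>A minimal sketch of the hash-to-color part, assuming whitespace tokenization and a tiny keyword palette just for illustration (a real version would use a proper lexer):</p>

```python
import hashlib

# Keywords get fixed colors so they are instantly recognizable in the matrix.
KEYWORD_COLORS = {"def": (255, 0, 0), "return": (0, 0, 255), "if": (0, 255, 0)}

def token_color(token: str) -> tuple:
    """Stable RGB color per token: fixed for keywords, hash-derived otherwise."""
    if token in KEYWORD_COLORS:
        return KEYWORD_COLORS[token]
    digest = hashlib.sha256(token.encode()).digest()
    return (digest[0], digest[1], digest[2])  # first 3 bytes as RGB

def code_to_matrix(source: str) -> list:
    """One row of colors per source line: the printable 'low-LOD' form."""
    return [[token_color(tok) for tok in line.split()]
            for line in source.splitlines()]
```

<p>Because the color is a pure function of the name, the same symbol renders identically everywhere it appears, which is what lets you see structure without reading.</p>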
]]></description><pubDate>Sun, 08 Feb 2026 07:24:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=46932102</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=46932102</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46932102</guid></item><item><title><![CDATA[New comment by eigenblake in "Show HN: Subth.ink – write something and see how many others wrote the same"]]></title><description><![CDATA[
<p>bd35a7f69b28c97fb3ebe489a4fba26a5f423522276d5ff5b5a8bb6441806ad2</p>
]]></description><pubDate>Fri, 23 Jan 2026 05:30:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=46728774</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=46728774</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46728774</guid></item><item><title><![CDATA[New comment by eigenblake in "Claude's system prompt is over 24k tokens with tools"]]></title><description><![CDATA[
<p>How did they leak it, a jailbreak? Was this confirmed? I'm checking for the situation where the true instructions are not what's being reported here. The language model could have "hallucinated" its own system prompt instructions, leaving no guarantee that this is the real deal.</p>
]]></description><pubDate>Wed, 07 May 2025 02:28:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=43911667</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=43911667</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43911667</guid></item><item><title><![CDATA[New comment by eigenblake in "Send Data with Sound"]]></title><description><![CDATA[
<p>What's so special about this? Homo sapiens have been doing this for hundreds of thousands of years /s</p>
]]></description><pubDate>Tue, 04 Mar 2025 03:35:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=43249971</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=43249971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43249971</guid></item><item><title><![CDATA[New comment by eigenblake in "Resident physicians' exam scores tied to patient survival"]]></title><description><![CDATA[
<p>This appears to be an observational result, so I'm genuinely perplexed by the reception here. I thought this comment showed a healthy amount of curiosity and asked important questions. Asking "what control group did this study use?" is usually well-received here.</p>
]]></description><pubDate>Sat, 01 Mar 2025 17:20:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=43221409</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=43221409</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43221409</guid></item><item><title><![CDATA[New comment by eigenblake in "Resident physicians' exam scores tied to patient survival"]]></title><description><![CDATA[
<p>Doctors aren't machines, they're humans. I have not yet read the full paper, only the article, but I already see something really big and important to look out for. When I read the full thing, the question I'll be asking is: what's the likelihood that the exam-taking process itself directly intervened on the doctors' self-esteem? How do you control for the loss of confidence that learning your test performance can cause? How are we certain that learning your score on the board exam doesn't make you more conservative (or riskier) in how you treat patients, as a psychological effect?</p>
]]></description><pubDate>Sat, 01 Mar 2025 05:06:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=43216145</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=43216145</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43216145</guid></item><item><title><![CDATA[New comment by eigenblake in "The Free Movie: Frame-by-frame, handrawn reproduction of "The Bee Movie" (2023)"]]></title><description><![CDATA[
<p>We could probably fine-tune a tiny convolutional neural net image classifier and just hold the last good frame for longer to cover the frames with clear trolling and NSFW images.</p>
]]></description><pubDate>Mon, 13 Jan 2025 04:05:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=42679973</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=42679973</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42679973</guid></item><item><title><![CDATA[New comment by eigenblake in "Luigi Mangione's account has been renamed on Stack Overflow"]]></title><description><![CDATA[
<p><a href="https://en.wikipedia.org/wiki/Streisand_effect?wprov=sfla1" rel="nofollow">https://en.wikipedia.org/wiki/Streisand_effect?wprov=sfla1</a></p>
]]></description><pubDate>Thu, 09 Jan 2025 07:41:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=42642753</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=42642753</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42642753</guid></item><item><title><![CDATA[New comment by eigenblake in "Why Linux is not ready for the desktop, the final edition"]]></title><description><![CDATA[
<p>I'm surprised no one has mentioned atomic Linux distros yet in this thread. The really hard thing here is that people aren't all talking about the same things. My experience on Arch isn't my experience on Pop. Things on Pop are amazing on my MSI prebuilt PC with an Nvidia GPU. I don't even know if I really need to upgrade to NixOS except to satiate my curiosity.</p>
]]></description><pubDate>Tue, 31 Dec 2024 06:13:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=42556813</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=42556813</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42556813</guid></item><item><title><![CDATA[New comment by eigenblake in "Ask HN: Has anyone tried adapting a court reporter keyboard for writing code?"]]></title><description><![CDATA[
<p>Absolutely yes, but not me personally. The keywords to search for are Plover and Stenotype: <a href="https://youtu.be/jRFKZGWrmrM" rel="nofollow">https://youtu.be/jRFKZGWrmrM</a></p>
]]></description><pubDate>Tue, 26 Nov 2024 23:54:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=42251459</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=42251459</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42251459</guid></item><item><title><![CDATA[New comment by eigenblake in "Weird Nonfiction"]]></title><description><![CDATA[
<p>&gt; creative work that presents itself as journalism or nonfiction but introduces fictional elements with the intention of upsetting, disturbing, or confusing the audience.<p>A good time to mention the SCP wiki and some types of analog horror. It was this kind of thing that led me to discover "hard science fiction", which is distinct from, but related to, the previous two.</p>
]]></description><pubDate>Sun, 06 Oct 2024 04:50:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=41754834</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=41754834</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41754834</guid></item><item><title><![CDATA[New comment by eigenblake in "I give you feedback on your blog post draft but you don't send it to me (2021)"]]></title><description><![CDATA[
<p>What I don't see represented in this conversation is the idea that you can just write for personal satisfaction, or to examine something you're personally interested in. Not everyone needs to have 10k+ monthly active readers. Not everything needs to be a rat race. Why don't we see blogging the way we see exercise? Sure, you'll have your bodybuilders, but some people just go on walks, and no one is doing anything "wrong"; they just have different goals.</p>
]]></description><pubDate>Mon, 16 Sep 2024 19:54:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=41559985</link><dc:creator>eigenblake</dc:creator><comments>https://news.ycombinator.com/item?id=41559985</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41559985</guid></item></channel></rss>