<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: einrealist</title><link>https://news.ycombinator.com/user?id=einrealist</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 13 May 2026 17:24:35 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=einrealist" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by einrealist in "Teaching Claude Why"]]></title><description><![CDATA[
<p>Isn't alignment a dilemma?<p>Aligned to what, how, and for whom? And who decides what that alignment should look like? There are probably many domains whose alignment requirements conflict with one another (e.g. using LLMs for warfare vs. ethically constrained domains). I can't imagine how this can be viable at the required scale (say, one model per domain) given the already huge investments.</p>
]]></description><pubDate>Sat, 09 May 2026 10:26:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=48073763</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=48073763</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48073763</guid></item><item><title><![CDATA[New comment by einrealist in "A recent experience with ChatGPT 5.5 Pro"]]></title><description><![CDATA[
<p>Compute in science was already subsidized by public funding or by donations. Most supercomputers are financed this way. And that's a good thing. If you have a good science problem that can be computed, apply for compute time. There is nothing wrong with applying that to LLMs as well, as I wrote in my initial post. The human is still required to identify problems that are worth computing, to create prompts that the LLM can act on, and to verify the results. But OpenAI providing compute basically for free is still tied to a different incentive: to fuel the hype and to capture the market, while distorting/obfuscating the real costs. That's also the reason why we cannot claim that 'economics on LLMs is just unbeatable'. It depends on the problem, the reason for a prompt.</p>
]]></description><pubDate>Sat, 09 May 2026 08:17:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48073054</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=48073054</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48073054</guid></item><item><title><![CDATA[New comment by einrealist in "A recent experience with ChatGPT 5.5 Pro"]]></title><description><![CDATA[
<p>Did I praise our animal agriculture anywhere?</p>
]]></description><pubDate>Sat, 09 May 2026 07:31:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=48072760</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=48072760</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48072760</guid></item><item><title><![CDATA[New comment by einrealist in "A recent experience with ChatGPT 5.5 Pro"]]></title><description><![CDATA[
<p>"After 16 minutes and 41 seconds, it came back" ... "further 47 minutes and 39 seconds" ... "After 13 minutes and 33 seconds" ... "After 9 minutes and 12 seconds" ... "After 31 minutes and 40 seconds" ... plus other computations<p>Anyone spotting the issue here? What did that really cost?<p>I am not against compute being used for scientific or other important problems. We did that before LLMs. However, the major LLM gatekeepers want to make all industries and companies dependent on their models. And, at some point, they need to charge them the actual, unsubsidized costs for the compute. In the meantime, companies restructure in the hopes that the compute costs remain cheap.</p>
]]></description><pubDate>Sat, 09 May 2026 07:16:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=48072665</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=48072665</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48072665</guid></item><item><title><![CDATA[New comment by einrealist in "Microsoft and OpenAI end their exclusive and revenue-sharing deal"]]></title><description><![CDATA[
<p>I wonder how this figure was settled on. Is it based on consumer pricing? Couldn't Microsoft and OpenAI just make a number up, aside from a minimum to cover operating costs? At what point is the number just a marketing ploy to make the deal seem huge, important and inevitable (and too big to fail)?</p>
]]></description><pubDate>Mon, 27 Apr 2026 14:12:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=47921894</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47921894</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47921894</guid></item><item><title><![CDATA[New comment by einrealist in "An AI agent deleted our production database. The agent's confession is below"]]></title><description><![CDATA[
<p>Also funny how people (including LLM vendors, like Cursor) think that rules in a system prompt (or custom rules) are real safety measures.</p>
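<p>A real safety measure has to live outside the model. A minimal sketch in Python, where run_tool is a hypothetical stand-in for whatever executor an agent framework actually provides:</p>
<pre><code>ALLOWED_TOOLS = {"read_file", "run_tests"}  # destructive tools are simply absent

def run_tool(name: str, args: dict) -> str:
    """Hypothetical stand-in for the framework's tool executor."""
    return f"executed {name} with {args}"

def guarded_run_tool(name: str, args: dict) -> str:
    # Enforced in code, outside the model: no prompt wording can bypass this branch.
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is blocked by policy")
    return run_tool(name, args)

print(guarded_run_tool("read_file", {"path": "README.md"}))  # ok
guarded_run_tool("drop_production_db", {})                   # raises PermissionError
</code></pre>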
]]></description><pubDate>Sun, 26 Apr 2026 17:24:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47912029</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47912029</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47912029</guid></item><item><title><![CDATA[New comment by einrealist in "An update on recent Claude Code quality reports"]]></title><description><![CDATA[
<p>Is 'refactoring Markdown files' already a thing?</p>
]]></description><pubDate>Thu, 23 Apr 2026 18:25:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47879475</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47879475</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47879475</guid></item><item><title><![CDATA[New comment by einrealist in "Claude Design"]]></title><description><![CDATA[
<p>True. I didn't expect it to produce novel designs. Maybe Anthropic should find a better name than 'Design'.<p>In my example, I expected it to create UI elements for a business application / expert system. And it did fine. In fact, I believe it's perfect for creating average, functional designs, and a better way to test UI variations for expert systems. But I want to know what the actual costs are.</p>
]]></description><pubDate>Fri, 17 Apr 2026 22:43:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47811365</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47811365</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47811365</guid></item><item><title><![CDATA[New comment by einrealist in "Claude Design"]]></title><description><![CDATA[
<p>Good for crunching out some prototypes and ideas, and for getting inspiration, I guess. Two prompts - the initial one and one refinement - took about ten minutes and used up 90% of the token budget. I wonder what the real costs are. After the IPO, they will no longer be able to subsidize token costs. The question will then be whether it's still cheap enough just for prototypes, ideas and inspiration.</p>
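<p>Back of the envelope, with every number below made up, because the real per-token prices and my session's token counts are exactly what isn't published:</p>
<pre><code># All figures are assumptions for illustration, not Anthropic's actual prices.
input_tokens = 200_000    # guess: two large prompts plus accumulated context
output_tokens = 100_000   # guess: generated markup across both responses
usd_per_mtok_in = 15.0    # assumed input price per million tokens
usd_per_mtok_out = 75.0   # assumed output price per million tokens

cost = (input_tokens / 1e6) * usd_per_mtok_in + (output_tokens / 1e6) * usd_per_mtok_out
print(f"~${cost:.2f} for ten minutes of prototyping")  # ~$10.50 under these guesses
</code></pre>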
]]></description><pubDate>Fri, 17 Apr 2026 22:25:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47811247</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47811247</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47811247</guid></item><item><title><![CDATA[New comment by einrealist in "Claude Opus 4.7"]]></title><description><![CDATA[
<p>They are trying to optimize the circus trick that 'reasoning' is. The economics still do not favor a viable business at these valuations or at these levels of cost subsidization. And the amount of compute required to make 'reasoning' work, or to achieve these incremental improvements, is increasingly obfuscated in light of the IPO.</p>
]]></description><pubDate>Thu, 16 Apr 2026 18:29:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47797522</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47797522</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47797522</guid></item><item><title><![CDATA[New comment by einrealist in "Sam Altman may control our future – can he be trusted?"]]></title><description><![CDATA[
<p>I can slow down the compute by a factor of a thousand. It would not change the result. But it changes the economics. We only call it intelligent because we can run backpropagation and inference (and training) fast enough, and with enough memory, for it to appear this way.</p>
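<p>A toy illustration in Python; the sleep stands in for hardware that is a thousand times slower, while the arithmetic, and therefore the result, is untouched:</p>
<pre><code>import random
import time

def dot(a, b, delay=0.0):
    # The same multiply-accumulate either way; delay only simulates slower hardware.
    total = 0.0
    for x, y in zip(a, b):
        time.sleep(delay)
        total += x * y
    return total

random.seed(0)
a = [random.random() for _ in range(8)]
b = [random.random() for _ in range(8)]

fast = dot(a, b)              # "intelligent" speed
slow = dot(a, b, delay=0.01)  # the same computation, throttled
assert fast == slow           # identical result; only the economics changed
</code></pre>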
]]></description><pubDate>Tue, 07 Apr 2026 02:49:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47670183</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47670183</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47670183</guid></item><item><title><![CDATA[New comment by einrealist in "Sam Altman may control our future – can he be trusted?"]]></title><description><![CDATA[
<p>I don't trust anyone who claims that today's LLMs are superhumanly intelligent. All they do is perform compute-intensive brute-force attacks on the problem/solution space and call it 'reasoning', all while subsidising the real costs to capture the market. So much SciFi BS and extrapolation about a technology that is useful if adopted with care.<p>This technology needs to become a commodity to break up this aggregation of power among a few organizations with untrustworthy incentives and leadership.</p>
]]></description><pubDate>Mon, 06 Apr 2026 23:39:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47668863</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47668863</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47668863</guid></item><item><title><![CDATA[New comment by einrealist in "Decisions that eroded trust in Azure – by a former Azure Core engineer"]]></title><description><![CDATA[
<p>Axel's engagement with the issue and his refusal to give up are admirable. It also demonstrates that code and architecture remain important, even in an era when managers believe these subjects can now be handled by LLMs. Imagine if LLMs were mandated in such an environment, further distancing SWEs from the code and the overarching architectural choices. I am not saying that it can't work. But friction, and maturity gained through experience, really matter.<p>It also explains perfectly why I have never met an engineer who was eager to run workloads on Azure. In the orgs I worked for, the use of Azure was either mandated by management (probably with good $$ incentives) or driven by Microsoft leaning into the "Multi-Cloud for resilience" selling point to get orgs to shift workloads away from competitors.<p>It's also a huge case for open (cloud) stacks.</p>
]]></description><pubDate>Fri, 03 Apr 2026 08:54:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47624539</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47624539</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47624539</guid></item><item><title><![CDATA[New comment by einrealist in "Anatomy of the .claude/ folder"]]></title><description><![CDATA[
<p>"Vibe prompting"</p>
]]></description><pubDate>Sat, 28 Mar 2026 08:30:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47552681</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47552681</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47552681</guid></item><item><title><![CDATA[New comment by einrealist in "Anatomy of the .claude/ folder"]]></title><description><![CDATA[
<p>So when Anthropic releases a new model that "breaks compatibility" with some Markdown files, do we call it "refactoring" to find (guess) the required changes to have the desired outcome again? Don't we create brittle specifications to fit a version of a model?</p>
]]></description><pubDate>Sat, 28 Mar 2026 07:54:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47552500</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47552500</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47552500</guid></item><item><title><![CDATA[New comment by einrealist in "US and TotalEnergies reach 'nearly $1B' deal to end offshore wind projects"]]></title><description><![CDATA[
<p>Simply insane.</p>
]]></description><pubDate>Mon, 23 Mar 2026 18:43:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47493491</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47493491</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47493491</guid></item><item><title><![CDATA[New comment by einrealist in "Java 26 is here"]]></title><description><![CDATA[
<p>I have been using Java since version 1.4. Both the language and its ecosystem have come a long way since then. I endured the height of the EJB phase. I adopted Spring when version 1.2 was released. I spent hours fighting with IDEs to run OSGi bundles. I hated building UIs with Swing/AWT, many of which are still in use today and are gradually being replaced by lovely JavaFX. When I look at code I wrote around 12 years ago, I'm amazed at how much I've matured too.</p>
]]></description><pubDate>Tue, 17 Mar 2026 22:22:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47419176</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47419176</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47419176</guid></item><item><title><![CDATA[New comment by einrealist in "LLMs work best when the user defines their acceptance criteria first"]]></title><description><![CDATA[
<p>> SQLite is not primarily fast because it is written in C. Well.. that too, but it is fast because 26 years of profiling have identified which tradeoffs matter.<p>Someone (with deep pockets to bear the token costs) should let Claude run for 26 months to have it optimize its Rust code base iteratively towards equal benchmarks. Would be an interesting experiment.<p>The article points out the general issue when discussing LLMs: audience and subject matter. We mostly discuss anecdotally about interactions and results. We really need much more data, more projects to succeed with LLMs or to fail with them - or to linger in a state of ignorance, sunk-cost fallacy and supressed resignation. I expect the latter will remain the standard case that we do not hear about - the part of the iceberg that is underwater, mostly existing within the corporate world or in private GitHubs, a case that is true with LLMs and without them.<p>In my experience, 'Senior Software Engineer' has NO general meaning. It's a title to be awarded for each participation in a project/product over and over again. The same goes for the claim: "Me, Senior SWE treat LLMs as Junior SWE, and I am 10x more productive." Imagine me facepalming every time.</p>
]]></description><pubDate>Sat, 07 Mar 2026 07:55:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47285486</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47285486</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47285486</guid></item><item><title><![CDATA[New comment by einrealist in "Are compilers deterministic?"]]></title><description><![CDATA[
<p>If the output has problems, do you usually rerun the compilation with the same input (that you control)? I don't usually.<p>What is included in the 'verify' step? Does it involve changing the generated code? If not, how do you ensure things like code quality, architectural constraints, efficiency and consistency? It's difficult, if not (economically) impossible, to write tests for these things. What if the LLM does not follow the guidelines outlined in your prompt? This is still happening. If this is not included, I would call it 'brute forcing'. How much do you pay for tokens?</p>
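<p>The only slice of 'verify' I know how to automate cheaply is the crudest one, an import-boundary check along these lines (a sketch; the layer names are invented). Efficiency, consistency and taste still need a human reading the diff:</p>
<pre><code>import ast
from pathlib import Path

# Hypothetical rule: nothing under app/domain/ may import from app.infrastructure.
FORBIDDEN_PREFIX = "app.infrastructure"

def violations(root="app/domain"):
    for path in Path(root).rglob("*.py"):
        tree = ast.parse(path.read_text())
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                names = [node.module or ""]
            else:
                continue
            for name in names:
                if name.startswith(FORBIDDEN_PREFIX):
                    yield f"{path}:{node.lineno} imports {name}"

for v in violations():
    print(v)
</code></pre>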
]]></description><pubDate>Sun, 22 Feb 2026 01:28:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47107124</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47107124</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47107124</guid></item><item><title><![CDATA[New comment by einrealist in "I verified my LinkedIn identity. Here's what I handed over"]]></title><description><![CDATA[
<p>Why not show a summary of who actually received the data? It should be easy to implement. You could also add what data is retained and an estimate of how long it is kept for. It could be a summary page that I can print as a PDF after the process is complete.<p>I'd consider that a feature that would increase trust in such a platform. These platforms require trust, right?</p>
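<p>Even the data model behind such a summary page is trivial; a sketch in Python (all field names invented):</p>
<pre><code>from dataclasses import dataclass, field
from datetime import date

@dataclass
class Disclosure:
    recipient: str           # e.g. the identity-verification vendor
    data_items: list[str]    # e.g. ["passport scan", "selfie", "name"]
    retention_until: date    # the platform's own deletion estimate
    purpose: str

@dataclass
class VerificationReceipt:
    completed_on: date
    disclosures: list[Disclosure] = field(default_factory=list)
    # Rendered as the printable summary page once verification completes.
</code></pre>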
]]></description><pubDate>Sat, 21 Feb 2026 15:08:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47101511</link><dc:creator>einrealist</dc:creator><comments>https://news.ycombinator.com/item?id=47101511</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47101511</guid></item></channel></rss>