Hacker News: criemen

New comment by criemen in "DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost"

criemen — Sun, 24 May 2026 16:27:04 +0000

> Ah, reminds me of good old "There are only 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors."

You quip, but LLM KV caching (from the harness side) is quite easy: you get a cache hit on stable prompt prefixes, period. That means you want to keep the prefix stable and only append at the end of the conversation. Made-up example: don't put the git branch name into the system prompt (which comes first), because whenever the branch name changes, that invalidates the cache for the entire prompt.

Getting this right basically requires some care not to modify the prefix by accident, plus some design around how to communicate the things that can change (user configuration, working dir, git information, ...).
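To make the branch-name example concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the prompt is modeled as a list of message strings, `shared_prefix_len` counts identical leading messages as a rough stand-in for what a provider could reuse from cache, and the `[context]` message format is made up. Real caches match on token prefixes, so this only shows the shape of the problem.

```python
SYSTEM_PROMPT = "You are a coding agent."


def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Count leading messages that are identical in both prompts."""
    count = 0
    for left, right in zip(a, b):
        if left != right:
            break
        count += 1
    return count


def unstable_prompt(branch: str, history: list[str]) -> list[str]:
    # Anti-pattern: volatile data is baked into the first message.
    return [f"{SYSTEM_PROMPT} Current branch: {branch}", *history]


def stable_prompt(history: list[str]) -> list[str]:
    # The system prompt never changes; volatile data lives in the history.
    return [SYSTEM_PROMPT, *history]


turns = ["user: fix the failing test", "assistant: done"]

# Branch name in the system prompt: a branch switch changes message 0.
unstable_before = unstable_prompt("main", turns)
unstable_after = unstable_prompt("feature-x", [*turns, "user: now add docs"])
unstable_shared = shared_prefix_len(unstable_before, unstable_after)

# Branch name as appended context: the old messages are left untouched.
stable_before = stable_prompt(["[context] branch: main", *turns])
stable_after = stable_prompt(
    ["[context] branch: main", *turns, "[context] branch: feature-x", "user: now add docs"]
)
stable_shared = shared_prefix_len(stable_before, stable_after)

print(unstable_shared, stable_shared, len(stable_before))
```

In the first variant nothing from the earlier request is shared once the branch changes, while in the second the whole earlier request is still a prefix of the new one, so only the appended messages would be new work.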

New comment by criemen in "Project Glasswing: An Initial Update"

criemen — Fri, 22 May 2026 22:48:01 +0000

> There's also a runaway effect of model improvement from the discovery, triage and fix data. This is likely already the most potent corpus of curated offensive data ever assembled and will only get better.

But that corpus of data is accessible to all competitors, American or not. I don't believe that this can't be replicated. I'd posit that there's enough annotated data out there (CVE+patch), only increasing thanks to Mythos, that if you specifically RL for this scenario, you can improve your model's performance on finding vulnerabilities without access to Mythos.

New comment by criemen in "Cursor Introduces Composer 2.5"

criemen — Mon, 18 May 2026 18:20:45 +0000

Well, is that a statement about the quality of Opus 4.7 or about Composer 2.5? :P

New comment by criemen in "Measuring Claude 4.7's tokenizer costs"

criemen — Fri, 17 Apr 2026 21:04:58 +0000

> Is there an equivalent ultra-high-end LLM you can have if you’re willing to pay? Or does it not exist because it would cost too much to train?

I guess at the time that was GPT-4.5. I don't think people used it a lot because it was crazy expensive, and not that much better than the rest of the crop.

New comment by criemen in "Measuring Claude 4.7's tokenizer costs"

criemen — Fri, 17 Apr 2026 21:03:37 +0000

> it's them trying to push the models to burn less compute

I'm curious, how does using more tokens save compute?

New comment by criemen in "Škoda DuoBell: A bicycle bell that penetrates noise-cancelling headphones"

criemen — Wed, 08 Apr 2026 09:29:44 +0000

Pretty cool if true!

New comment by criemen in "No, it doesn't cost Anthropic $5k per Claude Code user"

criemen — Tue, 10 Mar 2026 09:08:17 +0000

> What people don't realize is that cache is free

I'm incredibly salty about this: they're essentially charging heavily for something that allows them to sell their inference at premium prices to more users. Without any caching, they'd have much less capacity available.

New comment by criemen in "Giving LLMs a personality is just good engineering"

criemen — Wed, 04 Mar 2026 08:09:55 +0000

I tried ChatGPT over the holidays (paid) vs. claude.ai (paid). After trying some prompts that worked well on Claude in ChatGPT, I understand why people are so annoyed about AI slop. The speech patterns in text output for ChatGPT are both obvious and annoying, and impossible to unsee when people use them in written communication.

Claude isn't without problems ("You're absolutely right"), but I feel that some of the perception there is around the limited set of phrases the coding agent uses regularly, and comes less from the multi-paragraph responses from the chatbot.

New comment by criemen in "MacBook Air with M5"

criemen — Tue, 03 Mar 2026 19:06:06 +0000

> Out of curiosity, what are some good use cases for a MBP now with the MBAs being so powerful?

Local software development (node/TS). When opus-4.6-fast launched, it felt like some of the limiting factor in turnaround time moved from inference to the validation steps, i.e. execute tests, run linter, etc. Granted, that's with endpoint management slowing down I/O, and hopefully tsgo and some eslint replacement will speed things up significantly over there.

New comment by criemen in "Building SQLite with a small swarm"

criemen — Mon, 16 Feb 2026 11:29:17 +0000

Even if it was copying sqlite code over, wouldn't the ability to automatically rewrite sqlite in Rust be a valuable asset?

New comment by criemen in "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

criemen — Mon, 16 Feb 2026 11:21:29 +0000

> I had assumed that reasoning models should easily be able to answer this correctly.

I thought so too, yet Opus 4.6 with extended thinking (on claude.ai) gives me

> Walk. At 50 meters you'd spend more time parking and maneuvering at the car wash than the walk itself takes. Drive the car over only if the wash requires the car to be there (like a drive-through wash), then walk home and back to pick it up.

which is still pretty bad.

New comment by criemen in "I fixed Windows native development"

criemen — Sun, 15 Feb 2026 12:50:37 +0000

This is amazing.

At $workplace, we have a script that extracts a toolchain from a GitHub Actions Windows runner, packages it up, and stuffs it into git LFS, from where it is pulled by bazel as a C++ toolchain.

This is the more scalable way, and I assume it could still somewhat easily be integrated into a bazel build.

New comment by criemen in "Two different tricks for fast LLM inference"

criemen — Sun, 15 Feb 2026 10:04:11 +0000

One other thing I'd assume Anthropic is doing is routing all fast requests to the latest-gen hardware. They most certainly have a diverse fleet of inference hardware (TPUs, GPUs of different generations), and fast will be only served by whatever is fastest, whereas the general inference workload will be more spread out.

Hamming, "You and Your Research" (1995) [video]

criemen — Sat, 14 Feb 2026 17:01:08 +0000

Article URL: https://www.youtube.com/watch?v=a1zDuOPkMSw

Comments URL: https://news.ycombinator.com/item?id=47016112

Points: 1

# Comments: 0

New comment by criemen in "Two Weeks Until Tapeout"

criemen — Sun, 25 Jan 2026 12:03:20 +0000

> aka: For those not living in 2026, we have uncovered a new clue to the mystery of where all the low-power DRAM chips have suddenly vanished to!

I love the writing style!

New comment by criemen in "Nvidia Kicks Off the Next Generation of AI with Rubin"

criemen — Thu, 08 Jan 2026 20:41:36 +0000

What's the power hookup to just boot one rack? I'd imagine that's more than you get anywhere in residential areas for a single house.

New comment by criemen in "I switched from VSCode to Zed"

criemen — Mon, 05 Jan 2026 23:05:22 +0000

> I’m currently using a mix of Zed, Sublime, and VS Code.

Can you elaborate on when you use which editor? I'd have imagined that there's value in learning and using one editor in-depth, instead of switching around based on use-case, so I'd love to learn more about your approach.

Is AI actually a Bubble?

criemen — Sat, 13 Dec 2025 18:32:57 +0000

Article URL: https://www.newyorker.com/culture/open-questions/is-ai-actually-a-bubble

Comments URL: https://news.ycombinator.com/item?id=46256759

Points: 3

# Comments: 1

New comment by criemen in "Has the cost of building software dropped 90%?"

criemen — Mon, 08 Dec 2025 22:25:56 +0000

> This takes a fairly large mindset shift, but the hard work is the conceptual thinking, not the typing.

But the hard work always was the conceptual thinking? At least at and beyond the Senior level, for me it was always the thinking that's the hard work, not converting the thoughts into code.

New comment by criemen in "Has the cost of building software dropped 90%?"

criemen — Mon, 08 Dec 2025 22:14:39 +0000

The large open-weights models aren't really usable for local running (even with current hardware), but multiple providers compete on running inference for you, so it's reasonable to assume that there is and will be a functioning marketplace.