Hacker News: zmmmmm

New comment by zmmmmm in "Microsoft starts canceling Claude Code licenses"

zmmmmm — Sat, 23 May 2026 11:37:21 +0000

I have stuck with 4.6. I fully believe 4.7 can be smarter for truly complex and long running agentic use. But I prefer the more direct, literal mechanistic style and 4.6 seems to be peak Opus for that.

New comment by zmmmmm in "Don't just paste the AI at me"

zmmmmm — Sat, 23 May 2026 00:53:17 +0000

I've always been fascinated that some people don't seem to have any email "voice" - they just can't translate email text into human emotional impact. So they write super abrupt emails, things they would never say in real life, totally different to their actual personality. It's almost like a distinct form of autism. Meanwhile I'm almost the opposite extreme - I can't hit send on something unless I've finessed it until it sounds exactly like how I would communicate in person. It takes me ages to write my emails.

I'm starting to get a feeling there is a phenomenon like this with AI - some people just genuinely don't hear the AI "voice" at all. They really can't distinguish why sending AI written text is going to impact the person at the other end. It's going to be an interesting ride as these people start using AI and are completely baffled why people are offended by their perfectly reasonable responses.

New comment by zmmmmm in "DeepSeek makes the V4 Pro price discount permanent"

zmmmmm — Fri, 22 May 2026 22:42:18 +0000

I will testify I have used V4 Pro as a coding agent and it did a great job solving a complex problem. It worked with Pi over something like an hour, iterating and running tests. I paid API rates via OpenRouter and it cost me less than $1 I think. I've had single prompts cost that much with Anthropic. I was very impressed.

New comment by zmmmmm in "Python 3.15: features that didn't make the headlines"

zmmmmm — Fri, 22 May 2026 00:37:10 +0000

it's honestly ironic that Java is removing its built in sandboxing right at the time when it would finally have been considered a high value feature. Between the need for AI agent sandboxing and the security apocalypse that is upon us now, it finally would have had its time.

New comment by zmmmmm in "Python 3.15: features that didn't make the headlines"

zmmmmm — Fri, 22 May 2026 00:32:08 +0000

once you use any language that lets you fluently inline a multiline lambda / closure you can never use Python again without it constantly irritating you

New comment by zmmmmm in "An OpenAI model has disproved a central conjecture in discrete geometry"

zmmmmm — Thu, 21 May 2026 02:48:37 +0000

As a side observation, it is striking but also not surprising in retrospect that the big successes in AI are coming from domains where things are fundamentally verifiable. Both software and math are either fully verifiable or low-cost verifiable (breaking a test is not the same cost as building a bridge and watching it fall down to see if it worked).

Other domains are extracting value but I feel like there's an order of magnitude difference. It raises the question, what other domains fit into these categories where the AI itself has pretty much free rein to verify its own results?

New comment by zmmmmm in "I’ve joined Anthropic"

zmmmmm — Wed, 20 May 2026 07:16:50 +0000

So he's working on the singularity

New comment by zmmmmm in "Apple Silicon costs more than OpenRouter"

zmmmmm — Mon, 18 May 2026 03:28:59 +0000

I have free electricity from solar and an old Macbook Pro M1 Max that has depreciated to zero and has no other use. Now how do the economics work out?

New comment by zmmmmm in "I believe there are entire companies right now under AI psychosis"

zmmmmm — Fri, 15 May 2026 22:01:14 +0000

I think AI rescue consulting is going to become a significant mode of high value consulting, similar to specialists who come in to try and deal with a security breach or do data recovery.

Purely AI written systems will scale to a point of complexity that no human can ever understand. The defect close rate will taper down, the token burn per defect will scale up, and eventually AI changes will on average cause more defects than they close, leaving the whole system unstable. It will become a special kind of process to clean-room such a mess and rebuild it fresh (probably still with AI) after distilling out core design principles to avoid catastrophic breakdown.

Somewhere in the future, the new software engineering will be primarily about principles to avoid this in the first place, but it will take us 20 years to learn them, just like original software eng took a lot longer than expected to reach a stable set of design principles (and people still argue about them!).

New comment by zmmmmm in "A few words on DS4"

zmmmmm — Fri, 15 May 2026 06:07:12 +0000

> Why make your programmers wait?

That depends on where the methodology goes. But more and more it's hands off. If the trajectory continues it won't matter because nobody is sitting there waiting / watching the LLM code anyway. It is all happening in the background. We might see hybrid approaches where the weaker / cheaper agent tries to solve it and just "asks for help" from the more expensive agent when it needs it etc.
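
A minimal sketch of that hybrid idea, assuming each agent is just a function that takes a task and returns either a solution or `None` to signal "I need help". The names `solve_with_escalation`, `cheap_stub` and `expensive_stub` are hypothetical and stand in for real model calls; nothing here reflects any particular vendor's API.

```python
from typing import Callable, Optional, Tuple

# Hypothetical agent shape: returns a solution string, or None to "ask for help".
Agent = Callable[[str], Optional[str]]


def solve_with_escalation(
    task: str,
    cheap: Agent,
    expensive: Agent,
    max_cheap_attempts: int = 3,
) -> Tuple[Optional[str], str]:
    """Let the cheap agent try first; escalate only if it keeps giving up."""
    for attempt in range(1, max_cheap_attempts + 1):
        result = cheap(task)
        if result is not None:
            return result, f"cheap (attempt {attempt})"
    # The cheap agent asked for help on every attempt, so hand over the task.
    return expensive(task), "expensive"


# Stub agents for illustration only (assumptions, not real models).
def cheap_stub(task: str) -> Optional[str]:
    return None if "hard" in task else f"cheap fix for {task}"


def expensive_stub(task: str) -> Optional[str]:
    return f"expensive fix for {task}"


easy = solve_with_escalation("easy bug", cheap_stub, expensive_stub)
hard = solve_with_escalation("hard bug", cheap_stub, expensive_stub)
print(easy)
print(hard)
```

Since nobody is watching in the background case, the extra cheap attempts mostly cost wall-clock time, and the expensive agent is only paid for on the tasks that need it.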

New comment by zmmmmm in "A few words on DS4"

zmmmmm — Fri, 15 May 2026 02:24:15 +0000

I'm very curious where we will saturate the curve on "enough" intelligence for coding. At some point, you can let a less smart model hammer at a problem for longer and get to the same result, and as long as you are not involved it comes to the same thing. I feel like DeepSeek V4 Pro is nearly there. Maybe Flash is too.

Once we hit that point, I am curious how much of Anthropic's current business model falls apart? So far it's always been clear that you just pay for the most intelligent model you can get because it is worth it. It now seems clear to me that there is limited runway on that concept. It is just a question of how long that runway is. I honestly wonder how much of their frantic push to broaden out into enterprise / productivity is because they see this writing on the wall already.

New comment by zmmmmm in "Elevated error rates on Opus 4.7"

zmmmmm — Fri, 15 May 2026 01:28:24 +0000

I've given V4 Pro some curly things and I was impressed at how it figured them out. I agree high level design is not its forte. But it sat in a loop and doggedly debugged a crazy dependency issue to come to the right answer over the course of 15 minutes which impressed me.

New comment by zmmmmm in "Elevated error rates on Opus 4.7"

zmmmmm — Fri, 15 May 2026 01:06:34 +0000

interestingly I had the same experience, and weirdly it's in part because it is clearly less intelligent. It's more of a mechanistic tool just doing what I ask (but still very smart and very competent about it) and less trying to win a nobel prize with each answer. Turns out I actually like that.

New comment by zmmmmm in "Googlebook"

zmmmmm — Tue, 12 May 2026 21:49:45 +0000

HN always disappoints me with these kinds of threads, with all the generic disappointment and Google scepticism dominating the conversation.

Don't get me wrong, I'm still disappointed. But mainly because it looks so superficial. I was trying to work out what's new and it just looks like an Android device (or Chrome? I can't tell) with some party trick Gemini features sprinkled on it. There isn't anything technically interesting here.

I'm still waiting for someone to ship a truly AI native device - something with the right sandboxing and UI layers to let an AI model truly understand and work with the device natively, but safely. The OS SDK itself should natively incorporate all these elements as first class primitives. And the model would be trained heavily to explicitly understand and work well with them.

New comment by zmmmmm in "Claude Platform on AWS"

zmmmmm — Tue, 12 May 2026 04:18:22 +0000

Yeah, I think this could backfire. At the moment they have such a clear message with Bedrock about data governance. You now have to ask a question and probably get approval where previously there was no question and hence no barriers.

New comment by zmmmmm in "Claude Platform on AWS"

zmmmmm — Tue, 12 May 2026 04:15:53 +0000

yes it sounds like a hack to get access to untracked spend in corporate accounts.

In my org, I have to file a form for reimbursement if I bought a pencil for $0.25 but in AWS? Spend varies by +/- $5k per month and nobody even questions it. This will definitely make it trivially easy for me to build on Anthropic's services without even telling anybody vs the hoops I would have to jump through to get it paid for another way.

New comment by zmmmmm in "Postmortem: TanStack NPM supply-chain compromise"

zmmmmm — Tue, 12 May 2026 04:09:09 +0000

it's not going to help if you share a cache across security boundaries. That is what happened here and seems to be driving a spate of github action related problems.

New comment by zmmmmm in "If AI writes your code, why use Python?"

zmmmmm — Tue, 12 May 2026 01:40:25 +0000

> Critic strength .... Sensor strength

that's a nice breakdown

I think there's something key you get at in terms of the combo of dynamic environment + type safety maximising both. With a dynamic environment, the LLM can do a lot of interrogation to understand the problem space on the fly. I've witnessed agents sort out pretty complex issues through `python -c "..."`, `groovy -e "..."`, executing snippets of code with Node etc which is much less accessible if they have to compile it first. They can also inject logging code that interrogates the runtime as well (what type do we really have at line 1003?) etc which works better with runtimes that have deep introspection capabilities.
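
As a rough illustration of the "what type do we really have here?" kind of injected logging, here is a small sketch of the sort of throwaway helper an agent might drop into a dynamic runtime. The `describe` function and the `Order` class are hypothetical names made up for this example.

```python
def describe(value):
    """Summarise a runtime value: its concrete type, class hierarchy and public attributes."""
    kind = type(value)
    return {
        "type": f"{kind.__module__}.{kind.__qualname__}",
        "mro": [cls.__name__ for cls in kind.__mro__],
        # Cap the list so the log line stays readable for large objects.
        "public_attrs": sorted(n for n in dir(value) if not n.startswith("_"))[:10],
    }


class Order:  # hypothetical domain object, for illustration only
    def __init__(self, total):
        self.total = total


info = describe(Order(12.5))
print(info)
```

The same probe can be run as a one-off with `python -c "..."` against live objects, with no compile step in between.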

New comment by zmmmmm in "GitLab announces workforce reduction and end of their CREDIT values"

zmmmmm — Tue, 12 May 2026 01:23:19 +0000

history suggests so .... people do keep trying to make agent native tools and workflows, but time and time again it turns out to be better just to expose raw inputs and tools to them and let them work with those. See skills beating MCP in most cases where their purpose overlaps for example - it's more effective just to let an agent write git commands than give it a "git tool" with a structured interface. People don't seem to grok the intuition of how heavily biased training on trillions of tokens of human language and existing software code makes the models towards working well with raw input.
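
To make the contrast concrete, a "raw" tool can be as small as the sketch below: one function that runs whatever command line the agent writes and hands back the output, instead of a structured wrapper per operation. The name `run_command` is hypothetical, and the demo runs the Python interpreter rather than git so that it works without a repository; an agent would pass something like `git log --oneline -5` instead.

```python
import shlex
import subprocess
import sys


def run_command(command: str, timeout: float = 30.0) -> dict:
    """Run one command line (no shell) and return its exit code and captured output."""
    argv = shlex.split(command)
    completed = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


# Demo with the current interpreter; quoting assumes a POSIX-style command line.
result = run_command(f'{shlex.quote(sys.executable)} -c "print(2 + 2)"')
print(result["exit_code"], result["stdout"].strip())
```

A real deployment would still want sandboxing and an allow-list around this; the sketch only shows how thin the interface itself can be.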

New comment by zmmmmm in "Library for fast mapping of Java records to native memory"

zmmmmm — Mon, 11 May 2026 22:59:35 +0000

I find it weird that the people steering Java have been seemingly willing to sit out the use case of high performance computation while it has so dominated the computing landscape. They are just patiently incrementally iterating on all these JEPs that would support dramatically improved capabilities and make Java a very attractive platform for ML - but they keep fretting over minor interface adjustments, cycle after cycle. I get there is a philosophy of keeping the language stable and well designed, but this is really taking it to an extreme in the face of missing an entire segment of computing.