Hacker News: Eridrus

New comment by Eridrus in "Automating myself out of development"

Eridrus — Sun, 14 Jun 2026 14:05:35 +0000

In the last week we have done a complete analytics dashboard overhaul with Fable/Opus. The baseline was really bad, since we have no front-end engineers, so we largely felt comfortable not reading anything but the auth code (where we did find one subtle edge case handled incorrectly).

The pipeline and data-serving design was all human, since it had to deal with some data scale, but the JavaScript/API layer was all slop, and it seems fine.

If you have a really high quality piece of code that needs to meet a high bar of quality/reliability, then I think the risk of letting the AI loose on it is very high and I wouldn't do it. If you have a pile of code you already know is a pile of garbage despite being human written, well, it can't get much worse :)

I also built an agent orchestration meta harness that runs on k8s and uses the k8s agents sandbox for running codex/claude code in the cloud. This was almost entirely just handed over to Fable, and I have not asked about a single architectural detail. The quality of this product is mediocre, but the fact that it largely works after I went through a few iterations of clicking around is impressive. I would have preferred to buy something off the shelf, but nothing even really came close (though maybe now I would have forked Omnigent).

New comment by Eridrus in "AI coding at home without going broke"

Eridrus — Sun, 14 Jun 2026 11:42:42 +0000

Codex is much more subscription-efficient than Claude.

Having said that, I think there is a question of how far we can push this and not collapse under the weight of tech debt created, e.g. https://openai.com/index/open-source-codex-orchestration-sym...

I think the dream is basically that you go and file a bunch of Linear tickets, and then you come back a day later to evidence of the tickets being resolved and the code merged. I don't think we're super there yet (See: Anthropic's regular bugs in everything), but this is the future that people are trying to get to and to some extent the question is: is there anywhere we can apply this to now sanely? How does this frontier evolve?

New comment by Eridrus in "Amazon CEO's talks with U.S. officials triggered crackdown on Anthropic models"

Eridrus — Sun, 14 Jun 2026 03:25:36 +0000

The commentary sounds like AWS really pushed this.

If you're bringing this sort of stuff to the government, it's because you want the government to act...

New comment by Eridrus in "GLM 5.2 Is Out"

Eridrus — Sun, 14 Jun 2026 03:22:37 +0000

Releasing a model without benchmarks seems to say the model is probably bad...

New comment by Eridrus in "Statement on US government directive to suspend access to Fable 5 and Mythos 5"

Eridrus — Sun, 14 Jun 2026 02:33:52 +0000

I think Anthropic went a lot further with the marketing around Mythos/Glasswing.

OpenAI talked a lot about potential future risks. Anthropic went around saying "this model is too dangerous to give to people at all".

New comment by Eridrus in "AI OSS tool repo goes archived over night after raising $7.3M Seed"

Eridrus — Sun, 14 Jun 2026 02:32:18 +0000

I'm not a VC man, just someone who has raised funding recently. Nobody in the US is talking about the next Databricks atm, for better or worse, they are either trying to get as much allocation in OpenAI/Anthropic, funding random credentialed people to make neolabs, or funding people who are somehow selling into the massive AI coding/agent demand (or to the labs). Investors in the US currently do not care about safe bets, they want growth at all costs. Risk on.

Maybe in your country it's different, but this is what I see.

New comment by Eridrus in "AI OSS tool repo goes archived over night after raising $7.3M Seed"

Eridrus — Sat, 13 Jun 2026 16:43:38 +0000

Tell me you haven't talked to a VC.

A better model for VCs is: companies are finding tons of budget to allocate to new AI spend. Besides the labs, who is going to be able to capture some of that spend while they're actively looking to spend it?

Nobody at the seed stage is investing in things they think are "safe". They are investing in things they think have huge upside.

New comment by Eridrus in "Statement on US government directive to suspend access to Fable 5 and Mythos 5"

Eridrus — Sat, 13 Jun 2026 01:55:15 +0000

The difference between OpenAI & Anthropic is that OpenAI didn't do multiple big media pushes about how their models are so scary and dangerous.

OpenAI's models are very good, and they have refusals + a government ID verification story for cyber access (I don't think they prevent non-US nationals, but I don't know this). What they don't have is Project Glasswing and all the public hand-wringing about how they're going to end the world.

I hope Anthropic pulls their head out of their ass and just starts acting like a normal company.

New comment by Eridrus in "/architect: Reduce Fable tokens by 80%, Fable orchestrates/reviews, Codex builds"

Eridrus — Sat, 13 Jun 2026 01:47:49 +0000

I think I like Codex for the same reason tbh. I think it's just general misanthropy or autism or something lol. Most people seem to prefer Claude.

For me, I think Codex was visibly smarter than Claude until 4.8 came out; it would regularly do better debugging and IMO write better code. I think 4.8 is close.

I think Claude is widely regarded to have a big lead in front-end, which I do not work on.

Claude's Ultrathink is pretty cool, though it eats up tokens like nothing else obviously.

New comment by Eridrus in "/architect: Reduce Fable tokens by 80%, Fable orchestrates/reviews, Codex builds"

Eridrus — Fri, 12 Jun 2026 23:58:50 +0000

The problem is that there are a bunch of benchmarks, the model providers often don't even use the same benchmarks, a bunch of them have known problems, and it's expensive to do your own benchmarks.

I am a GPT 5.x booster since to me it just feels smarter, and I generally felt like the benchmarks backed me up, but it's not every benchmark, so sadly we're mostly arguing about vibes.

SWEBench-Pro was a big one, though apparently Claude was reading solutions out of the .git folder it wasn't meant to have access to among other problems.

New comment by Eridrus in "Can I Buy Your KV Cache?"

Eridrus — Fri, 12 Jun 2026 21:33:43 +0000

The paper has a section on "Reusing precomputed KV across queries" which talks about how other papers have tried to address this problem, but yeah, this paper adds nothing on its own besides a catchy title.

New comment by Eridrus in "Ask HN: Why is the term forward deployed engineer (FDE) popular all of a sudden?"

Eridrus — Fri, 12 Jun 2026 18:18:13 +0000

Why did we invent the term Data Scientist a decade or so ago to replace Business Analyst?

To try and give the position a rebrand with more prestige and hire better people for it.

New comment by Eridrus in "Kimi K2.7-Code: open-source coding model with better token efficiency"

Eridrus — Fri, 12 Jun 2026 17:47:23 +0000

I assume it's a lack of care when RLing them.

RL has a tendency to reinforce cheating when the cheats are easier to find than the final solution.

So when making your RL environment, you need to spend a lot of effort on finding ways the model can cheat and penalizing them.
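
To make that concrete, here is a rough sketch of what "penalize the cheats" could look like in a reward function for a coding environment. Everything in it is hypothetical and only for illustration: the `EpisodeResult` fields, the `FORBIDDEN_PREFIXES` list, and the flat penalty are my assumptions, not how any lab actually does it.

```python
from dataclasses import dataclass, field

# Hypothetical paths the agent should never read or modify during an episode,
# e.g. version-control history or the test files that define the reward.
FORBIDDEN_PREFIXES = (".git/", "tests/")


@dataclass
class EpisodeResult:
    tests_passed: int
    tests_total: int
    # Paths the agent touched during the episode (assumed to be logged by the sandbox).
    touched_paths: list[str] = field(default_factory=list)


def cheat_violations(result: EpisodeResult) -> list[str]:
    """Return the touched paths that fall under a forbidden prefix."""
    return [p for p in result.touched_paths if p.startswith(FORBIDDEN_PREFIXES)]


def reward(result: EpisodeResult, cheat_penalty: float = 1.0) -> float:
    """Fraction of tests passed, unless a known cheat was detected."""
    if cheat_violations(result):
        # A detected cheat overrides any test score, so the shortcut never pays.
        return -cheat_penalty
    if result.tests_total == 0:
        return 0.0
    return result.tests_passed / result.tests_total


honest = EpisodeResult(tests_passed=8, tests_total=10, touched_paths=["src/app.py"])
cheater = EpisodeResult(
    tests_passed=10, tests_total=10, touched_paths=["src/app.py", ".git/objects/ab"]
)
print(reward(honest), reward(cheater))
```

The hard part is not this function, it is populating the list of cheats: a check like this only catches shortcuts someone already thought to look for.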

New comment by Eridrus in "Kimi K2.7-Code: open-source coding model with better token efficiency"

Eridrus — Fri, 12 Jun 2026 17:45:22 +0000

It's not quite as closed as a binary; it is very standard practice to take these models and fine-tune them.

If there were actually open source models even close to the frontier, this would be more of a discussion, but everyone knows "open source" here means open weight.

New comment by Eridrus in "Show HN: FablePool – pool money behind a prompt, and Fable builds it in public"

Eridrus — Fri, 12 Jun 2026 14:56:49 +0000

I think what will be interesting is not whether the code will be produced, but rather: will anybody actually use any of this output?

This sort of reminds me of startups that go out of business and then open source their code. It's kind of cool when they can do that, but almost nobody ever gets value from it.

Anyway, if anyone uses the code produced this way in prod, I'd love to hear your story.

New comment by Eridrus in "Show HN: FablePool – pool money behind a prompt, and Fable builds it in public"

Eridrus — Thu, 11 Jun 2026 21:36:25 +0000

Hell yeah, $516 for a complete AWS replacement, I'm in lol!

New comment by Eridrus in "AWS Bedrock to require sharing data with Anthropic for Mythos and future models"

Eridrus — Thu, 11 Jun 2026 05:07:17 +0000

Yes, this whole kerfuffle is about contractual agreements between companies, not about governments.

I take my legal risk more seriously than I do people's paranoia about the NSA coming for them.

New comment by Eridrus in "AWS Bedrock to require sharing data with Anthropic for Mythos and future models"

Eridrus — Thu, 11 Jun 2026 05:05:30 +0000

I'm sure AWS told Anthropic this was a bad idea, but now that OpenAI is on AWS, this doesn't really change AWS' competitive posture as much as it does Anthropic's. When OpenAI releases a similar model they will presumably not make this mistake, and we will switch our configs to the OAI models.

New comment by Eridrus in "AWS Bedrock to require sharing data with Anthropic for Mythos and future models"

Eridrus — Thu, 11 Jun 2026 05:02:23 +0000

So basically all models going forward?

I don't think anyone currently thinks the Haiku/Sonnet/Opus models are "good enough" such that they would not want improvements. Users may be cost conscious, but almost every task could be done better.

New comment by Eridrus in "Anthropic requires 30 day data retention for Fable and Mythos"

Eridrus — Thu, 11 Jun 2026 04:56:44 +0000

In general, I agree with you.

However, in the case of model providers, I think it is a more real concern since it could make it into some training data, and then one of your actual competitors could ask the model to code something up and get your IP.

I sort of assume the frontier AI labs are good about not doing this when they promise not to, but if you don't have airtight restrictions on what your devs are doing, they might be sending it somewhere that hasn't agreed....