<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: hedgehog</title><link>https://news.ycombinator.com/user?id=hedgehog</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 10 Jun 2026 02:12:08 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=hedgehog" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by hedgehog in "What it feels like to work with Mythos"]]></title><description><![CDATA[
<p>Work duration is also not that valuable a measure; you're usually better off defining the process yourself in code and having that code delegate chunks of work to the models. The only real issue is that it's harder to take advantage of the providers' subscription discounts. On the other hand, it's easier to do your own model routing, and I haven't seen any way for the normal chatbots to maintain coherence on streams of work measured in days and weeks.</p>
]]></description><pubDate>Tue, 09 Jun 2026 18:11:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=48465104</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48465104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48465104</guid></item><item><title><![CDATA[New comment by hedgehog in "MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second"]]></title><description><![CDATA[
<p>For scale, though, if three or four chips that size can replicate a Qwen 27B experience, that'll be quite useful.</p>
]]></description><pubDate>Mon, 08 Jun 2026 23:03:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=48453648</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48453648</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48453648</guid></item><item><title><![CDATA[New comment by hedgehog in "Stop the Apple Music app from launching"]]></title><description><![CDATA[
<p>iTunes and iPhoto both. Given how good the tools are getting, and how much existing sample code is available, it seems likely someone will do a good job of reincarnating them in the near future. Apple broke the apps I used most on the Mac and then added the bubblicious design-crime UI. No thanks.</p>
]]></description><pubDate>Mon, 08 Jun 2026 18:48:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=48449686</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48449686</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48449686</guid></item><item><title><![CDATA[New comment by hedgehog in "Full Reverse Engineering of the TI-84 Plus Operating System"]]></title><description><![CDATA[
<p>Do you have plans to generate a buildable version of the sources, and do you know what language the original was implemented in (C, perhaps)?</p>
]]></description><pubDate>Mon, 08 Jun 2026 18:39:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=48449505</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48449505</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48449505</guid></item><item><title><![CDATA[New comment by hedgehog in "Show HN: I Derived a Pancake"]]></title><description><![CDATA[
<p>I want to like this, but it reads like Claude output. How much scrutiny did it get for accuracy?</p>
]]></description><pubDate>Mon, 08 Jun 2026 03:38:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=48441064</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48441064</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48441064</guid></item><item><title><![CDATA[New comment by hedgehog in "The LLM warnings Google fired Timnit Gebru over have all come true"]]></title><description><![CDATA[
<p>The scale of the data and the size of the models don't change the underlying issue. The whole construction of these models is to start with a maximum likelihood language sampler (pre-training) and then massage it into a maximum utility language sampler (post-training) with some eye towards risk management and policy compliance ("safety"). It takes work to make model output fit any particular idea of "correct", whether it's Elon's particular ideology, the US Civil Rights Act, Xi Jinping Thought, or writing clean C++. More data and weights increase the complexity of tasks that we're able to model, but they don't automatically make the output "better" on any given axis.</p>
]]></description><pubDate>Thu, 04 Jun 2026 18:17:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48402490</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48402490</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48402490</guid></item><item><title><![CDATA[New comment by hedgehog in "Gemma 4 12B: A unified, encoder-free multimodal model"]]></title><description><![CDATA[
<p>Ryzen 395 is what I'm using. Anything with 128GB+ of RAM accessible to the GPU should work fine for a 4-bit version of the model (so Spark or Mac Studio should be OK too).</p>
]]></description><pubDate>Thu, 04 Jun 2026 08:17:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=48395712</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48395712</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48395712</guid></item><item><title><![CDATA[New comment by hedgehog in "Gemma 4 12B: A unified, encoder-free multimodal model"]]></title><description><![CDATA[
<p>Same chip. With a 6-bit 35B and 8-bit KV cache I see about 500 tokens/s prefill and 55 tokens/s decode at 30k tokens into the context window. MiniMax seemed to have a slightly lower token rate but was much, much less prone to 40k tokens of monologue before generating an answer. A pattern I like is to use a smaller model to do most of the execution and then a larger model to review transcripts and output and do any fixups and tooling improvements (this is all batch jobs, so all I care about is overall throughput).</p>
]]></description><pubDate>Thu, 04 Jun 2026 03:09:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=48393228</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48393228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48393228</guid></item><item><title><![CDATA[New comment by hedgehog in "Gemma 4 12B: A unified, encoder-free multimodal model"]]></title><description><![CDATA[
<p>The 6-bit versions plus an 8-bit KV cache seem to save a good bit of memory without a significant loss of quality. The Qwen 35B is pretty fast in my testing, but MiniMax M2.7 230B is in some ways faster (far fewer tokens to arrive at an answer) even though it is much larger.</p>
]]></description><pubDate>Thu, 04 Jun 2026 01:18:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48392488</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48392488</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48392488</guid></item><item><title><![CDATA[New comment by hedgehog in "MAI-Code-1-Flash"]]></title><description><![CDATA[
<p>Yes. Divide execution of a change into separate responsibilities. Designate the main chat, running Opus, as the "orchestrator". Give it a goal, then tell it to grind until it gets there using the following sub-agents in sequence:<p>1. Step execution (Sonnet): Work for 30 minutes / 100k tokens at the direction of the orchestrator.<p>2. Review (Opus): Scrutinize the previous step's work for errors and fidelity to the instructions, fix any problems, and record opportunities to improve the agent configuration + tools to reduce errors and token usage (write those to a file).<p>3. Self-improvement (Opus): Implement the highest-impact self-improvement items that don't require user intervention.<p>Repeat until the orchestrator's session token budget is exhausted (set it to 1M or whatever).<p>The underlying rationale is to keep each step manageable to maximize adherence to instructions and minimize cost (even cached tokens cost something). Prompt tokens are much cheaper than generated tokens, so having Opus mostly review rather than drive saves a lot too. Self-improvement steps are very expensive but the improvements compound; if you're going to run a job for days or weeks it's way more expensive not to do them.<p>Edit: I do this in Claude Code with the Anthropic models as well as Qwen family models for offline use.</p>
]]></description><pubDate>Tue, 02 Jun 2026 21:07:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48376300</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48376300</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48376300</guid></item><item><title><![CDATA[New comment by hedgehog in "SQLite is all you need for durable workflows"]]></title><description><![CDATA[
<p>It does. My experience has been that it adds code complexity, deployment complexity, and performance problems. There are some observability benefits, but there are other ways to get those. It's possible some workloads fit it, but nothing I've personally worked on does.</p>
]]></description><pubDate>Sat, 30 May 2026 02:34:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=48331868</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48331868</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48331868</guid></item><item><title><![CDATA[New comment by hedgehog in "Dynamic Workflows in Claude Code"]]></title><description><![CDATA[
<p>How granular is the control over the internal process?<p>In my experiments I've had some success modeling the work to be done as a DAG of typed artifacts, with a combination of code + LLM doing decomposition, transforms, synthesis, and fitness checking to generate the output. It took me a lot of tries to arrive at that formula, and it would be cool to have something more general. I also run part of it against local compute because it would be far beyond my budget to do it all on Opus, so support for that would be nice too.</p>
]]></description><pubDate>Thu, 28 May 2026 21:37:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=48315874</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48315874</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48315874</guid></item><item><title><![CDATA[New comment by hedgehog in "Ruby vs. Java vs. TypeScript: my experience on building a Cowork DOCX plugin"]]></title><description><![CDATA[
<p>You know that, and I know that, but for someone who started working more recently the difference between CORBA and punch cards might be a little blurry because both are so far back that they've never seen either. It's like kids asking how the dinosaurs in LEGO Jurassic World were animated, because they don't move like real toys, and noting how much easier filming the 1993 live-action Jurassic Park must have been because back then they could just film real dinosaurs. Feels weird, but makes sense from their perspective.</p>
]]></description><pubDate>Thu, 28 May 2026 17:06:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=48311985</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48311985</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48311985</guid></item><item><title><![CDATA[New comment by hedgehog in "What Is a Direct Attach Copper (DAC) Cable? (2021)"]]></title><description><![CDATA[
<p>I went from one dev machine to two at my desk, so I connected them via 25GbE. With about 2.8 GB/s of TCP throughput and RDMA available, I don't have to think too much about task placement or cross-traffic. (Specific hardware: Mellanox ConnectX-4 Lx cards + a DAC cable.)</p>
]]></description><pubDate>Thu, 28 May 2026 02:09:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48303503</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48303503</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48303503</guid></item><item><title><![CDATA[New comment by hedgehog in "Tech CEOs are apparently suffering from AI psychosis"]]></title><description><![CDATA[
<p>Does that mean it can work 20% faster?</p>
]]></description><pubDate>Wed, 27 May 2026 17:07:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=48297216</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48297216</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48297216</guid></item><item><title><![CDATA[New comment by hedgehog in "A portentous reunion"]]></title><description><![CDATA[
<p>I've done some tool-assisted ports (including without original source); the work you already did is probably 1/4 of the way to a web-hosted Rust BattleTris.</p>
]]></description><pubDate>Wed, 27 May 2026 00:15:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48287787</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48287787</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48287787</guid></item><item><title><![CDATA[New comment by hedgehog in "Norway's 2 petabytes of Huawei flash storage and LLM training"]]></title><description><![CDATA[
<p>That's enough resources to build on something like the Olmo 3 recipe but with a mix prioritizing their own data and post-training for their own tasks. If they build their own embedding model, index everything in the library, and train their model to query that data while answering historical, cultural, legal, and strategic questions from their perspective... Pretty interesting and likely useful. They won't beat Anthropic at dumping out React code, but there's also no real reason to duplicate that.</p>
]]></description><pubDate>Tue, 26 May 2026 03:37:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48274689</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48274689</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48274689</guid></item><item><title><![CDATA[New comment by hedgehog in "CVE-2026-28952: Apple macOS 26.5 Kernel Vuln found by Claude"]]></title><description><![CDATA[
<p>This was fixed in 26.5 as well as 15.7.7 etc.<p><a href="https://app.opencve.io/cve/CVE-2026-28952" rel="nofollow">https://app.opencve.io/cve/CVE-2026-28952</a></p>
]]></description><pubDate>Tue, 26 May 2026 00:56:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=48273731</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48273731</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48273731</guid></item><item><title><![CDATA[New comment by hedgehog in "Migrating from Go to Rust"]]></title><description><![CDATA[
<p>Pauses are a problem with heap size and structure, not allocation rate, because the pause is caused by GC code that is O(heap size). Generating garbage more slowly reduces the frequency of pauses but not their severity. This is an issue with most GCs to some degree: there are phases of collection where the GC stops execution, and the duration depends on how much work it has to do, which is based on how many objects and how much memory need to be checked. "Concurrent" garbage collection is the approach of trying to reduce the pauses by doing more of the work while program execution continues. It's complicated and hard to get right, so Go's original GC was IIRC fully stop-the-world.<p>There are some fine points to the O(heap size). For example, it's clearly unnecessary for the GC to scan objects that do not themselves contain pointers, and work is somewhat proportional to the total number of objects. Mitigations include combining numerous small objects into manually managed slices, coming up with ways to make the most numerous items pointer-free, etc.<p>I learned a bit about this when an analytics workload I had ended up with unacceptable pauses (I think over 1 second). Go's GC is more sophisticated now, but I think in any GC runtime you have the same considerations to some degree. Some of the best writing at the time was by Gil Tene, one of the principal authors of the C4 concurrent collector at Azul Systems. A starting point is here:<p><a href="https://groups.google.com/g/golang-dev/c/GvA0DaCI2BU/m/SmEelQsFybkJ" rel="nofollow">https://groups.google.com/g/golang-dev/c/GvA0DaCI2BU/m/SmEel...</a></p>
]]></description><pubDate>Mon, 25 May 2026 15:40:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=48268133</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48268133</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48268133</guid></item><item><title><![CDATA[New comment by hedgehog in "Migrating from Go to Rust"]]></title><description><![CDATA[
<p>Possibly in your specific application, but usually there are a handful of options far less painful than a rewrite.<p>For the original issue of GC pauses, a narrow change is to move problem data to non-pointer-carrying types or, as a bigger hammer, to manually managed slices of those types. The second helps with fragmentation too. Some workloads can be split into multiple processes as a direct way to have smaller heaps. If none of those options is enough, then off-heap storage lets you do whatever you want.<p>I do have some complaints about Go, but one of the big ones has been fixed since I last wrote much Go code, and it seems like a fine choice for a lot of applications.</p>
]]></description><pubDate>Mon, 25 May 2026 15:10:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=48267807</link><dc:creator>hedgehog</dc:creator><comments>https://news.ycombinator.com/item?id=48267807</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48267807</guid></item></channel></rss>