<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: msp26</title><link>https://news.ycombinator.com/user?id=msp26</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 07 Apr 2026 10:22:57 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=msp26" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by msp26 in "If DSPy is so great, why isn't anyone using it?"]]></title><description><![CDATA[
<p>> Data extraction tasks are amongst the easiest to evaluate because there’s a known “right” answer.<p>Wrong. There can be a lot of subjectivity, and pretending that some golden answer exists does more harm than good and narrows the scope of what you can build.<p>My other main problem with data extraction tasks, and why I'm not satisfied with any of the existing eval tools, is that the schemas I write can change drastically as my understanding of the problem increases. Nothing really seems to handle that well; I mostly just resort to reading diffs of what happens when I change something and reading the input/output data very closely. Marimo is fantastic for anything visual like this btw.<p>Also there is a difference between: the problem in reality → the business model → your db/application schema → the schema you send to the LLM. To actually improve your schema/prompt you have to be mindful of the entire problem stack and of which parts are better handled through post-processing rather than by the LLM directly.<p>> Abstract model calls. Make swapping GPT-4 for Claude a one-line change.<p>And in practice random limitations, like structured output API schema limits that differ between providers, can make this non-trivial. God I hate the Gemini API.</p>
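A minimal sketch of why a provider swap isn't a one-line change: structured-output endpoints accept different subsets of JSON Schema, so a portable client ends up carrying normalisation shims like the hypothetical one below (the UNSUPPORTED keyword set is an illustrative assumption, not a documented limit of any specific API).

```python
# Hypothetical shim: strip JSON Schema keywords that a target provider's
# structured-output endpoint rejects. The keyword set is illustrative.
UNSUPPORTED = {"patternProperties", "allOf", "if", "then", "else"}

def strip_unsupported(schema):
    """Recursively drop keywords the target provider rejects."""
    if isinstance(schema, dict):
        return {k: strip_unsupported(v)
                for k, v in schema.items() if k not in UNSUPPORTED}
    if isinstance(schema, list):
        return [strip_unsupported(item) for item in schema]
    return schema

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "allOf": [{"required": ["name"]}],
}
print(strip_unsupported(schema))
# {'type': 'object', 'properties': {'name': {'type': 'string'}}}
```

The lossy part is the real problem: dropping a keyword silently changes what the model is allowed to emit, which is exactly the kind of behaviour drift a one-line model swap hides.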
]]></description><pubDate>Mon, 23 Mar 2026 16:16:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47491546</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47491546</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47491546</guid></item><item><title><![CDATA[New comment by msp26 in "GPT‑5.4 Mini and Nano"]]></title><description><![CDATA[
<p>Man, the lowest-end pricing has been thoroughly hiked. It was convenient while it lasted.</p>
]]></description><pubDate>Tue, 17 Mar 2026 22:32:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47419255</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47419255</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47419255</guid></item><item><title><![CDATA[New comment by msp26 in "Show HN: I built a tool that watches webpages and exposes changes as RSS"]]></title><description><![CDATA[
<p>I got Claude to reverse engineer the extension and compare it to changedetection.io, and here's what it came up with. Apologies for clanker slop, but I think it's in poor taste not to attribute the open-source tool that the service is built on (one that's also funded by their SaaS plan).<p>---<p>Summary: What Is Objectively Provable<p>- The extension stores its config under the key changedetection_config<p>- 16 API endpoints in the extension are 1:1 matches with changedetection.io's documented API<p>- 16 data model field names are exact matches with changedetection.io's Watch model (including obscure ones like time_between_check_use_default, history_n, notification_muted, fetch_backend)<p>- The authentication mechanism (x-api-key header) is identical<p>- The default port (5000) matches changedetection.io's default<p>- Custom endpoints (/auth/, /feature-flags, /email/, /generate_key, /pregate) do NOT exist in changedetection.io — these are proprietary additions<p>- The watch limit error format is completely different from changedetection.io's, adding billing-specific fields (current_plan, upgrade_required)<p>- The extension ships with error tracking that sends telemetry (including user emails on login) to the developer's GlitchTip server at 100% sample rate<p>The extension is provably a client for a modified/extended changedetection.io backend. The open question is only the degree of modification - whether it's a fork, a proxy wrapper, or a plugin system. But the underlying engine is unambiguously changedetection.io.</p>
]]></description><pubDate>Thu, 12 Mar 2026 11:08:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47349069</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47349069</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47349069</guid></item><item><title><![CDATA[New comment by msp26 in "Show HN: I built a tool that watches webpages and exposes changes as RSS"]]></title><description><![CDATA[
<p>see:<p><a href="https://news.ycombinator.com/item?id=47349069">https://news.ycombinator.com/item?id=47349069</a></p>
]]></description><pubDate>Thu, 12 Mar 2026 11:06:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47349051</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47349051</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47349051</guid></item><item><title><![CDATA[New comment by msp26 in "Show HN: Argus – VSCode debugger for Claude Code sessions"]]></title><description><![CDATA[
<p>Apologies but I will use this thread as an opportunity to report CC VSCode extension bugs because I don't think there's an official channel that actually gets read by humans.<p>> yeah they're shipping too fast and everything is buggy as shit<p>- fork conversation button doesn't even work anymore in vscode extension<p>- sometimes when I reconnect to my remote SSH in VSCode, previously loaded chats become inaccessible. The chats are still there in the .jsonl files but for some reason the CC extension becomes incapable of reading them.<p>-- this issue happens so frequently that I ended up making a skill to allow CC to dig up info from the bugged sessions</p>
]]></description><pubDate>Sat, 07 Mar 2026 17:30:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47289599</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47289599</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47289599</guid></item><item><title><![CDATA[New comment by msp26 in "Gemini 3.1 Flash-Lite: Built for intelligence at scale"]]></title><description><![CDATA[
<p>many tasks don't need any reasoning</p>
]]></description><pubDate>Tue, 03 Mar 2026 20:08:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=47238189</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47238189</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47238189</guid></item><item><title><![CDATA[New comment by msp26 in "Gemini 3.1 Flash-Lite: Built for intelligence at scale"]]></title><description><![CDATA[
<p>What the fuck is this price hike? It was such a nice low-end, fast model. Who needs 10 years of reasoning on this model size??<p>I'm gonna switch some workflows to qwen3.5.<p>There are a lot of tasks that benefit from just having a mildly capable LLM, and 2.5 Flash Lite worked out of the box for cheap.<p>Can we get flash lite lite please?<p>Edit: Logan said, "I think open source models like Gemma might be the answer here". Implying that they're not interested in serving lower-end Gemini models?</p>
]]></description><pubDate>Tue, 03 Mar 2026 19:48:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=47237891</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47237891</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47237891</guid></item><item><title><![CDATA[New comment by msp26 in "Anthropic Cowork feature creates 10GB VM bundle on macOS without warning"]]></title><description><![CDATA[
<p>> every single product/feature I've used other than the Claude Code CLI has been terrible<p>yeah they're shipping too fast and everything is buggy as shit<p>- fork conversation button doesn't even work anymore in vscode extension<p>- sometimes when I reconnect to my remote SSH in VSCode, previously loaded chats become inaccessible. The chats are still there in the .jsonl files but for some reason the CC extension becomes incapable of reading them.</p>
]]></description><pubDate>Mon, 02 Mar 2026 16:50:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47220477</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47220477</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47220477</guid></item><item><title><![CDATA[New comment by msp26 in "I am directing the Department of War to designate Anthropic a supply-chain risk"]]></title><description><![CDATA[
<p>Batshit situation, respectable position from Dario throughout.<p>But there's some irony in this happening to Anthropic after all the constant hawkish fearmongering about the evil Chinese (and open source AI sentiment too).</p>
]]></description><pubDate>Fri, 27 Feb 2026 23:02:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47187126</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47187126</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47187126</guid></item><item><title><![CDATA[New comment by msp26 in "The Future of AI Software Development"]]></title><description><![CDATA[
<p>Horrific comparison point. LLM inference is way more expensive locally for single users than running batch inference at scale in a datacenter on actual GPUs/TPUs.</p>
]]></description><pubDate>Wed, 18 Feb 2026 18:11:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47064133</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47064133</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47064133</guid></item><item><title><![CDATA[Tool Shaped Objects]]></title><description><![CDATA[
<p>Article URL: <a href="https://minutes.substack.com/p/tool-shaped-objects">https://minutes.substack.com/p/tool-shaped-objects</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47061793">https://news.ycombinator.com/item?id=47061793</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 18 Feb 2026 15:09:58 +0000</pubDate><link>https://minutes.substack.com/p/tool-shaped-objects</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47061793</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47061793</guid></item><item><title><![CDATA[New comment by msp26 in "AI has fixed my productivity"]]></title><description><![CDATA[
<p><a href="https://minutes.substack.com/p/tool-shaped-objects" rel="nofollow">https://minutes.substack.com/p/tool-shaped-objects</a><p>I feel like this applies to many of you.</p>
]]></description><pubDate>Wed, 18 Feb 2026 15:08:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47061770</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=47061770</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47061770</guid></item><item><title><![CDATA[New comment by msp26 in "What I learned building an opinionated and minimal coding agent"]]></title><description><![CDATA[
<p>> Special shout out to Google who to this date seem to not support tool call streaming which is extremely Google.<p>Google doesn't even provide a tokenizer to count tokens locally. The results of this stupidity can be seen directly in AI Studio, which makes an API call to count_tokens every time you type in the prompt box.</p>
]]></description><pubDate>Sun, 01 Feb 2026 15:23:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=46846791</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=46846791</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46846791</guid></item><item><title><![CDATA[New comment by msp26 in "AGENTS.md outperforms skills in our agent evals"]]></title><description><![CDATA[
<p>This doesn't surprise me.<p>I have a SKILL.md for marimo notebooks with instructions in the frontmatter to always read it before working with marimo files. But half the time Claude Code still doesn't invoke it, even when I mention marimo in the first conversation turn.<p>I've resorted to typing "read marimo skill" manually, and that works fine. Technically you can use skills with slash commands, but that automatically sends off the message too, which just wastes time.<p>But the actual concept of instructions to load in certain scenarios is very good and has been worth the time it took to write up the skill.</p>
]]></description><pubDate>Fri, 30 Jan 2026 04:53:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=46820656</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=46820656</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46820656</guid></item><item><title><![CDATA[New comment by msp26 in "Kimi Released Kimi K2.5, Open-Source Visual SOTA-Agentic Model"]]></title><description><![CDATA[
<p>Source? I've heard this rumour twice but never seen proof. I assume it would be based on tokeniser quirks?</p>
]]></description><pubDate>Tue, 27 Jan 2026 17:57:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=46783651</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=46783651</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46783651</guid></item><item><title><![CDATA[New comment by msp26 in "Kimi Released Kimi K2.5, Open-Source Visual SOTA-Agentic Model"]]></title><description><![CDATA[
<p>K2 thinking didn't have vision which was a big drawback for my projects.</p>
]]></description><pubDate>Tue, 27 Jan 2026 08:38:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=46777160</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=46777160</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46777160</guid></item><item><title><![CDATA[New comment by msp26 in "6 Years Building Video Players. 9B Requests. Starting Over"]]></title><description><![CDATA[
<p>Thank you! That looks great.</p>
]]></description><pubDate>Sun, 25 Jan 2026 08:55:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=46752101</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=46752101</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46752101</guid></item><item><title><![CDATA[New comment by msp26 in "6 Years Building Video Players. 9B Requests. Starting Over"]]></title><description><![CDATA[
<p>Mildly related question for the people in the thread:<p>How do I seek to the exact first frame of a timestamp with mux? I've tried a few things but it seems to always go to the nearest keyframe rather than the first frame at e.g. 00:34. This is sensible default behaviour but bad for my use case.</p>
]]></description><pubDate>Sat, 24 Jan 2026 19:41:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=46746911</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=46746911</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46746911</guid></item><item><title><![CDATA[New comment by msp26 in "Gas Town's agent patterns, design bottlenecks, and vibecoding at scale"]]></title><description><![CDATA[
<p>Originally I thought that Gas Town was some form of high-level satire like GOODY-2, but it seems that some of you people have actually lost the plot.<p>Ralph loops are also stupid because they don't make use of the KV cache properly.<p>---<p><a href="https://github.com/steveyegge/gastown/issues/503" rel="nofollow">https://github.com/steveyegge/gastown/issues/503</a><p>Problem:<p>Every gt command runs bd version to verify the minimum beads version requirement. Under high concurrency (17+ agent sessions), this check times out and blocks gt commands from running.<p>Impact:<p>With 17+ concurrent sessions each running gt commands:<p>- Each gt command spawns bd version<p>- Each bd version spawns 5-7 git processes<p>- This creates 85-120+ git processes competing for resources<p>- The 2-second timeout in gt is exceeded<p>- gt commands fail with "bd version check timed out"</p>
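On the KV cache point, here's a toy sketch (not Gas Town code, and the token lists are made up for illustration): prefix caching can only reuse the longest common prefix of consecutive prompts, so a loop that edits the front of its context re-prefills nearly everything, while an append-only loop reuses the whole previous context.

```python
# Toy illustration of prefix-cache reuse between consecutive prompts.
# A KV cache can only be reused up to the first token that differs.
def cached_prefix_len(prev_tokens, next_tokens):
    """Length of the shared prefix a KV cache could reuse."""
    n = 0
    for a, b in zip(prev_tokens, next_tokens):
        if a != b:
            break
        n += 1
    return n

stable   = ["sys", "task", "step1", "step2"]
mutated  = ["sys", "TASK-v2", "step1", "step2"]        # edits near the front
appended = ["sys", "task", "step1", "step2", "step3"]  # append-only growth

print(cached_prefix_len(stable, mutated))   # 1: almost no reuse
print(cached_prefix_len(stable, appended))  # 4: full reuse of old context
```

The same logic is why agent harnesses try to keep system prompts and tool definitions byte-stable across turns.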
]]></description><pubDate>Fri, 23 Jan 2026 16:41:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=46734587</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=46734587</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46734587</guid></item><item><title><![CDATA[New comment by msp26 in "Scaling PostgreSQL to power 800M ChatGPT users"]]></title><description><![CDATA[
<p>This account's comment history is pure slop. 90% sure it's all AI-generated. The structure is too blatant.</p>
]]></description><pubDate>Fri, 23 Jan 2026 12:35:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=46731745</link><dc:creator>msp26</dc:creator><comments>https://news.ycombinator.com/item?id=46731745</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46731745</guid></item></channel></rss>