<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: schipperai</title><link>https://news.ycombinator.com/user?id=schipperai</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 03 Jul 2026 09:46:51 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=schipperai" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by schipperai in "A way to exclude sensitive files issue still open for OpenAI Codex"]]></title><description><![CDATA[
<p>Do I understand correctly that you scope least-privilege creds/tokens and pass those to the sandbox? I'd be curious to learn more</p>
]]></description><pubDate>Sun, 28 Jun 2026 16:31:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=48708886</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48708886</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48708886</guid></item><item><title><![CDATA[New comment by schipperai in "The Coming Loop"]]></title><description><![CDATA[
<p>If an organization decides the engineering team should not be looking at code, that should be coupled with a mandate to figure out what good engineering looks like working that way - what constitutes a good contribution vs what's slop? How do we handle massive PRs? The problem is we are in the "messing around phase" of coding with clankers and have much to learn still</p>
]]></description><pubDate>Tue, 23 Jun 2026 17:04:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48647965</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48647965</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48647965</guid></item><item><title><![CDATA[New comment by schipperai in "Don't trust large context windows"]]></title><description><![CDATA[
<p>Working in the era of 200k context windows meant I had to scope tasks narrowly to fit, which forced me to think about how to reduce complexity and naturally resulted in atomic work. 1M context windows and the promise that the latest models are "better at long-running tasks" made me lazy in how I scope tasks, and quality got worse. I have since gone back to narrow scoping: one session per task, zero compaction, and trying not to go past 400k tokens of context. If I end up with a long session, I was likely too ambitious and should have broken up the task.</p>
]]></description><pubDate>Sun, 14 Jun 2026 15:33:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=48528322</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48528322</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48528322</guid></item><item><title><![CDATA[New comment by schipperai in "Claude Fable 5"]]></title><description><![CDATA[
<p>Let's hope not all frontier AI assimilates these guardrails. It would be a shame for independent researchers and students.</p>
]]></description><pubDate>Tue, 09 Jun 2026 19:34:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48466487</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48466487</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48466487</guid></item><item><title><![CDATA[New comment by schipperai in "Is Grep All You Need? How Agent Harnesses Reshape Agentic Search"]]></title><description><![CDATA[
<p>I get a sense that I was click-baited by the article's title with the classic trope of "X is all you need". This research is a solid contribution, but it is far from all we need to understand grep vs semantic search in agent retrieval.</p>
]]></description><pubDate>Tue, 09 Jun 2026 18:37:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=48465533</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48465533</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48465533</guid></item><item><title><![CDATA[New comment by schipperai in "Claude Fable 5"]]></title><description><![CDATA[
<p>Cognition did well in documenting their approach [1].<p>TL;DR - they worked with OSS project maintainers to build tasks. They score models based on whether a PR is mergeable. All tasks are graded by a human researcher. SoTA models have hill-climbing to do which raises the bar and inspires confidence. I'd say it's legit.<p>[1]: <a href="https://x.com/cognition/status/2064061031912288715" rel="nofollow">https://x.com/cognition/status/2064061031912288715</a></p>
]]></description><pubDate>Tue, 09 Jun 2026 18:29:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=48465396</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48465396</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48465396</guid></item><item><title><![CDATA[New comment by schipperai in "MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second"]]></title><description><![CDATA[
<p>I trust AI to surface general information and best practices on established knowledge domains. For example: best practices for securing my VPS.<p>For domains where SoTA is constantly changing, like AI, I use LLMs to aggregate and interact with my own research from trusted sources, à la Karpathy's LLM wiki.<p>I don’t generally trust everything I read on the internet, whether it's AI-generated or not. I do my own research for the things that matter to me.</p>
]]></description><pubDate>Tue, 09 Jun 2026 10:41:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48459255</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48459255</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48459255</guid></item><item><title><![CDATA[New comment by schipperai in "MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second"]]></title><description><![CDATA[
<p>You can dig deeper into problems with AI. For me, it supplements my knowledge in domains I don’t fully understand. It also helps me learn, so I can tackle problems I wouldn’t otherwise.<p>I’m excited for ultrafast AI. It likely means less temptation to multi-thread and deeper flow in single sessions.</p>
]]></description><pubDate>Mon, 08 Jun 2026 17:22:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=48448227</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48448227</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48448227</guid></item><item><title><![CDATA[New comment by schipperai in "Gemma 4 12B: A unified, encoder-free multimodal model"]]></title><description><![CDATA[
<p>Demis at Y Combinator said they think it's best that their edge models are open, because once they are put on device they are vulnerable anyway<p><a href="https://youtu.be/JNyuX1zoOgU?is=PdzCILyi8SP6cfDr" rel="nofollow">https://youtu.be/JNyuX1zoOgU?is=PdzCILyi8SP6cfDr</a></p>
]]></description><pubDate>Wed, 03 Jun 2026 20:02:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48389189</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48389189</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48389189</guid></item><item><title><![CDATA[New comment by schipperai in "Ask HN: What are you working on? (May 2026)"]]></title><description><![CDATA[
<p>Yes, you can define sensitive paths and assign 'ask' or 'block' policies to them.<p>.env, .ssh, and others are treated as sensitive filenames by default.<p>The same goes for hosts and network access: unknown hosts pause, and trusted hosts can be configured.</p>
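<p>A minimal sketch of how path and host policies like these could be evaluated. This is not nah's actual configuration format or code: the table layout, the default decisions, the trusted host list, and the function names are all assumptions made for illustration.</p>

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical policy table: filename pattern -> decision.
# Which names map to 'ask' vs 'block' is an assumption, not nah's defaults.
SENSITIVE_PATHS = {
    ".env": "ask",
    ".env.*": "ask",
    ".ssh": "block",
}

# Hypothetical trusted hosts; anything else pauses for confirmation.
TRUSTED_HOSTS = {"github.com", "pypi.org"}


def path_policy(path: str) -> str:
    """Return 'block', 'ask', or 'allow' for a file path."""
    decisions = set()
    # Check every path component so ~/.ssh/id_rsa matches the .ssh rule.
    for part in PurePosixPath(path).parts:
        for pattern, decision in SENSITIVE_PATHS.items():
            if fnmatchcase(part, pattern):
                decisions.add(decision)
    # The strictest matching decision wins.
    if "block" in decisions:
        return "block"
    if "ask" in decisions:
        return "ask"
    return "allow"


def host_policy(host: str) -> str:
    """Trusted hosts pass; unknown hosts pause for the user."""
    return "allow" if host.lower() in TRUSTED_HOSTS else "ask"
```

<p>Letting the strictest match win means a broad 'ask' rule can never weaken a narrower 'block' rule on the same path.</p>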
]]></description><pubDate>Mon, 11 May 2026 17:01:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48097591</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48097591</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48097591</guid></item><item><title><![CDATA[New comment by schipperai in "Maryland citizens hit with $2B power grid upgrade for out-of-state AI"]]></title><description><![CDATA[
<p>This recent article from Semianalysis did a great job explaining part of it: <a href="https://newsletter.semianalysis.com/p/are-ai-datacenters-increasing-electric" rel="nofollow">https://newsletter.semianalysis.com/p/are-ai-datacenters-inc...</a></p>
]]></description><pubDate>Mon, 11 May 2026 03:13:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=48090692</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48090692</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48090692</guid></item><item><title><![CDATA[New comment by schipperai in "Ask HN: What are you working on? (May 2026)"]]></title><description><![CDATA[
<p>Very cool. How do you classify negative signals?</p>
]]></description><pubDate>Mon, 11 May 2026 02:39:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=48090501</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48090501</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48090501</guid></item><item><title><![CDATA[New comment by schipperai in "Ask HN: What are you working on? (May 2026)"]]></title><description><![CDATA[
<p>Which platform have you found is most hackable? I have Garmin atm and like it but there’s no easy way to pipe my data into my agent or server for offline analysis.</p>
]]></description><pubDate>Mon, 11 May 2026 02:34:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=48090454</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48090454</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48090454</guid></item><item><title><![CDATA[New comment by schipperai in "Ask HN: What are you working on? (May 2026)"]]></title><description><![CDATA[
<p>I like the overall premise and would be curious to learn more. The Amazon overview reads like it was written with or by AI though.</p>
]]></description><pubDate>Mon, 11 May 2026 02:32:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=48090437</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48090437</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48090437</guid></item><item><title><![CDATA[New comment by schipperai in "Ask HN: What are you working on? (May 2026)"]]></title><description><![CDATA[
<p>A better permissions layer for coding agents. The tool works like auto-mode for Claude Code, so you can stay in the flow and only get prompted to allow or deny tool calls when it truly matters, but it is fully deterministic. My benchmarks showed that most Bash calls don’t need an LLM to be classified as safe, ambiguous, or dangerous. A deterministic classifier can auto-allow or block 95% of Bash tool calls as safe or dangerous, with only the remaining 5% being truly ambiguous or unknown.<p>The conclusion is that permission reviews with LLMs, like Claude’s auto mode or Codex auto review, are like using a data center to flip a light switch: overkill.<p>The main benefit is that your agent’s autonomy can be governed deterministically through policies that can be stored at the user and repo level. The bonus is that you save tokens vs using auto modes.<p><a href="https://nah.build" rel="nofollow">https://nah.build</a></p>
]]></description><pubDate>Mon, 11 May 2026 02:17:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=48090342</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=48090342</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48090342</guid></item><item><title><![CDATA[New comment by schipperai in "Mistral Medium 3.5"]]></title><description><![CDATA[
<p>Thanks, makes sense. I meant Blackwell is explicitly optimized for MoEs.</p>
]]></description><pubDate>Wed, 29 Apr 2026 23:34:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47956100</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=47956100</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47956100</guid></item><item><title><![CDATA[New comment by schipperai in "Mistral Medium 3.5"]]></title><description><![CDATA[
<p>With most OSS releases being MoEs, and modern GPUs optimized for MoEs, can somebody with knowledge of the topic explain or speculate why Mistral might have opted for a dense model?</p>
]]></description><pubDate>Wed, 29 Apr 2026 17:33:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47951637</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=47951637</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47951637</guid></item><item><title><![CDATA[New comment by schipperai in "I bought Friendster for $30k – Here's what I'm doing with it"]]></title><description><![CDATA[
<p>100%. The exclusivity of the network is the differentiator here.</p>
]]></description><pubDate>Mon, 27 Apr 2026 10:45:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47919917</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=47919917</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47919917</guid></item><item><title><![CDATA[New comment by schipperai in "An AI agent deleted our production database. The agent's confession is below"]]></title><description><![CDATA[
<p>Agent permission layers are broken. We need a better permissions layer that doesn’t get in the way but stops destructive commands. Devs get pushed into running yolo mode because classifying allow / deny by command is not enough. A sandbox would not have prevented this either.<p>“nah” is a context-aware permission layer that classifies commands based on what they actually do<p>nah exposes a type taxonomy: filesystem_delete, network_write, db_write, etc<p>so commands get classified contextually:<p>git push  ; Sure.
git push --force ; nah?<p>rm -rf __pycache__ ; Ok, cleaning up.
rm ~/.bashrc ; nah.<p>curl harmless url ; sure.
curl destroy_db ; nah.<p><a href="https://github.com/manuelschipper/nah" rel="nofollow">https://github.com/manuelschipper/nah</a><p>Better permission layers are part of the answer here, and a space that has been only narrowly explored.</p>
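<p>A rough sketch of what classifying by action type rather than by command name could look like. It is not taken from nah's source: only filesystem_delete comes from the taxonomy above, while git_write, git_history_rewrite, the name sets, and the decisions are hypothetical.</p>

```python
import shlex

# Hypothetical policy: action type -> default decision.
POLICY = {
    "git_write": "allow",
    "git_history_rewrite": "ask",
    "unknown": "ask",
}

# Hypothetical context sets used to judge what a delete actually touches.
SENSITIVE_NAMES = {".bashrc", ".ssh", ".env"}
DISPOSABLE_NAMES = {"__pycache__", "node_modules", "dist", "build"}


def action_type(argv):
    """Map a parsed command to an action type, looking at flags too."""
    if argv[:2] == ["git", "push"]:
        forced = any(a == "-f" or a.startswith("--force") for a in argv[2:])
        return "git_history_rewrite" if forced else "git_write"
    if argv and argv[0] == "rm":
        return "filesystem_delete"
    return "unknown"


def classify(command):
    """Return (action_type, decision) for a shell command string."""
    argv = shlex.split(command)
    kind = action_type(argv)
    if kind == "filesystem_delete":
        # The decision depends on the targets, not on 'rm' itself.
        targets = [a for a in argv[1:] if not a.startswith("-")]
        names = [t.rstrip("/").split("/")[-1] for t in targets]
        if any(n in SENSITIVE_NAMES for n in names):
            return kind, "block"
        if names and all(n in DISPOSABLE_NAMES for n in names):
            return kind, "allow"
        return kind, "ask"
    return kind, POLICY[kind]
```

<p>The same binary (rm, git) lands on different decisions depending on its flags and targets, and that is where a plain per-command allow / deny list falls short.</p>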
]]></description><pubDate>Mon, 27 Apr 2026 09:45:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47919570</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=47919570</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47919570</guid></item><item><title><![CDATA[New comment by schipperai in "Show HN: A context-aware permission guard for Claude Code"]]></title><description><![CDATA[
<p>nah inspects Write and Edit content before it hits disk, so destructive patterns like os.unlink, rm -rf, and shell injection get flagged. And executing the result (./evil) classifies as unknown, which resolves to ask; the LLM can then choose to block it or ask you to approve.<p>But yeah, a truly adversarial agent needs a sandbox. It's a different threat model - nah is meant to catch the trusted but mistake-prone coding CLI, not a hostile agent.</p>
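<p>To make the content inspection idea concrete, here is a minimal sketch of scanning text before it is written. The patterns, labels, and function names are assumptions for illustration and do not reflect nah's real rule set, which would need to be much broader.</p>

```python
import re

# Hypothetical patterns; labels are illustrative only.
DESTRUCTIVE_PATTERNS = {
    "os.unlink": re.compile(r"\bos\.(unlink|remove)\s*\("),
    "shutil.rmtree": re.compile(r"\bshutil\.rmtree\s*\("),
    "rm -rf": re.compile(
        r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"
    ),
    "shell exec": re.compile(r"shell\s*=\s*True|\bos\.system\s*\("),
}


def inspect_content(content: str) -> list[str]:
    """Return the labels of all destructive patterns found in the content."""
    return [
        label
        for label, pattern in DESTRUCTIVE_PATTERNS.items()
        if pattern.search(content)
    ]


def decide_write(content: str):
    """Flag a Write/Edit payload for review if any pattern matches."""
    hits = inspect_content(content)
    return ("ask", hits) if hits else ("allow", hits)
```

<p>Pattern matching like this only catches the honest mistake; obfuscated content slips through, which is why a hostile agent still calls for a sandbox.</p>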
]]></description><pubDate>Fri, 13 Mar 2026 13:57:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47364543</link><dc:creator>schipperai</dc:creator><comments>https://news.ycombinator.com/item?id=47364543</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47364543</guid></item></channel></rss>