Hacker News: schipperai

New comment by schipperai in "A way to exclude sensitive files issue still open for OpenAI Codex"

schipperai — Sun, 28 Jun 2026 16:31:10 +0000

Do I understand correctly that you scope least-privilege creds/tokens and pass those to the sandbox? I'd be curious to learn more

New comment by schipperai in "The Coming Loop"

schipperai — Tue, 23 Jun 2026 17:04:00 +0000

If an organization decides the engineering team should not be looking at code, that should be coupled with a mandate to figure out what good engineering looks like when working that way: what constitutes a good contribution vs. what's slop? How do we handle massive PRs? The problem is we are in the "messing around phase" of coding with clankers and still have much to learn.

New comment by schipperai in "Don't trust large context windows"

schipperai — Sun, 14 Jun 2026 15:33:46 +0000

Working in the era of 200k context windows meant I had to narrowly scope tasks to fit in the context window, forcing me to think about how to reduce complexity and naturally resulting in atomic work. 1M context windows and the promise that the latest models are "better at long running tasks" made me lazy in how I scope tasks, and quality got worse. I have now gone back to narrow scoping, with one session per task and zero compaction, trying not to go past 400k tokens of context. If I end up with a long session, I was likely too ambitious and should have broken up the task.

New comment by schipperai in "Claude Fable 5"

schipperai — Tue, 09 Jun 2026 19:34:52 +0000

Let's hope not all frontier AI assimilates these guardrails. It would be a shame for independent researchers and students.

New comment by schipperai in "Is Grep All You Need? How Agent Harnesses Reshape Agentic Search"

schipperai — Tue, 09 Jun 2026 18:37:21 +0000

I get the sense that I was click-baited by the article's title, with the classic trope of "X is all you need". This research is a solid contribution, but it is far from all we need to understand grep vs semantic search in agent retrieval.

New comment by schipperai in "Claude Fable 5"

schipperai — Tue, 09 Jun 2026 18:29:17 +0000

Cognition did well in documenting their approach [1].

TL;DR - they worked with OSS project maintainers to build tasks. They score models based on whether a PR is mergeable. All tasks are graded by a human researcher. SoTA models still have hill-climbing to do, which raises the bar and inspires confidence. I'd say it's legit.

[1]: https://x.com/cognition/status/2064061031912288715

New comment by schipperai in "MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second"

schipperai — Tue, 09 Jun 2026 10:41:14 +0000

I trust AI to surface general information and best practices on established knowledge domains. For example: best practices for securing my VPS.

For domains where SoTA is constantly changing, like AI, I use LLMs to aggregate and interact with my own research from trusted sources, à la Karpathy's LLM wiki.

I don’t generally trust everything I read on the internet, whether it’s AI generated or not. I do my own research for the things that matter to me.

New comment by schipperai in "MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second"

schipperai — Mon, 08 Jun 2026 17:22:58 +0000

You can dig deeper into problems with AI. For me, it supplements my knowledge in domains I don’t fully understand. It also helps me learn. So I can tackle problems I wouldn’t otherwise.

I’m excited for ultrafast AI. It likely means less temptation to multi-thread and deeper flow in single sessions.

New comment by schipperai in "Gemma 4 12B: A unified, encoder-free multimodal model"

schipperai — Wed, 03 Jun 2026 20:02:36 +0000

Demis said at Y Combinator that they think it's best for their edge models to be open, because once they are put on device they are vulnerable anyway.

https://youtu.be/JNyuX1zoOgU?is=PdzCILyi8SP6cfDr

New comment by schipperai in "Ask HN: What are you working on? (May 2026)"

schipperai — Mon, 11 May 2026 17:01:14 +0000

Yes, you can define sensitive paths and assign 'ask' or 'block' policies to them.

.env, .ssh, and others are treated as sensitive filenames by default.

Similarly for hosts and network access: unknown hosts pause, and trusted hosts can be configured.
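
To make the idea concrete, here is a minimal sketch of path and host policies. It is not the tool's actual config format: the names `SENSITIVE_NAMES`, `TRUSTED_HOSTS`, `path_policy`, and `host_policy` are hypothetical, and which default gets 'ask' vs 'block' is an assumption made for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical policy table: filename pattern -> 'ask' or 'block'.
# The specific assignments below are assumptions, not real defaults.
SENSITIVE_NAMES = {".env": "ask", ".env.*": "ask", ".ssh": "block"}

# Hypothetical trusted-host list; anything else pauses for approval.
TRUSTED_HOSTS = {"github.com", "pypi.org"}


def path_policy(path: str) -> str:
    """Return 'block', 'ask', or 'allow' based on each path component."""
    decision = "allow"
    for part in PurePosixPath(path).parts:
        for pattern, policy in SENSITIVE_NAMES.items():
            if fnmatchcase(part, pattern):
                if policy == "block":
                    return "block"  # the strictest outcome wins immediately
                decision = "ask"
    return decision


def host_policy(url: str) -> str:
    """Return 'allow' for trusted hosts and 'ask' for unknown ones."""
    host = urlsplit(url).hostname or ""
    return "allow" if host in TRUSTED_HOSTS else "ask"
```

Matching on every path component, not just the final filename, is what lets a file inside `.ssh/` inherit the directory's policy.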

New comment by schipperai in "Maryland citizens hit with $2B power grid upgrade for out-of-state AI"

schipperai — Mon, 11 May 2026 03:13:19 +0000

This recent article from Semianalysis did a great job explaining part of it: https://newsletter.semianalysis.com/p/are-ai-datacenters-inc...

New comment by schipperai in "Ask HN: What are you working on? (May 2026)"

schipperai — Mon, 11 May 2026 02:39:51 +0000

Very cool. How do you classify negative signals?

New comment by schipperai in "Ask HN: What are you working on? (May 2026)"

schipperai — Mon, 11 May 2026 02:34:46 +0000

Which platform have you found is most hackable? I have Garmin atm and like it but there’s no easy way to pipe my data into my agent or server for offline analysis.

New comment by schipperai in "Ask HN: What are you working on? (May 2026)"

schipperai — Mon, 11 May 2026 02:32:30 +0000

I like the overall premise and would be curious to learn more. The Amazon overview reads like it was written with or by AI though.

New comment by schipperai in "Ask HN: What are you working on? (May 2026)"

schipperai — Mon, 11 May 2026 02:17:11 +0000

A better permissions layer for coding agents. The tool works like auto-mode for Claude Code, so you can stay in the flow and only get prompted to allow or deny tool calls when it truly matters, but it is fully deterministic. My benchmarks surfaced that most Bash calls don’t need an LLM to be classified as safe, ambiguous, or dangerous. A deterministic classifier can auto-allow or block 95% of Bash tool calls as safe or dangerous, with only the remaining 5% being truly ambiguous or unknown.

The conclusion is that permission reviews with LLMs, like Claude’s auto mode or Codex auto review, are like using a data center to flip a light switch: overkill.

The main benefit is that your agent’s autonomy can be governed deterministically through policies that can be stored at the user and repo level. The bonus is that you save tokens vs using auto modes.
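
A rough sketch of what layering user-level and repo-level policies could look like. Everything here is an assumption for illustration: the `merge_policies` and `decide` helpers are hypothetical, and the rule that the stricter decision always wins is one possible design, not necessarily how the tool resolves conflicts.

```python
# Assumed strictness order: a higher number is a stricter decision.
STRICTNESS = {"allow": 0, "ask": 1, "block": 2}


def merge_policies(user: dict[str, str], repo: dict[str, str]) -> dict[str, str]:
    """Combine two action-type policies, keeping the stricter decision."""
    merged = dict(user)
    for action, decision in repo.items():
        current = merged.get(action)
        if current is None or STRICTNESS[decision] > STRICTNESS[current]:
            merged[action] = decision
    return merged


def decide(action_type: str, policy: dict[str, str], default: str = "ask") -> str:
    """Look up a decision; unlisted action types fall back to 'ask'."""
    return policy.get(action_type, default)


user_policy = {"filesystem_delete": "ask", "network_write": "block"}
repo_policy = {"filesystem_delete": "allow", "db_write": "block"}
effective = merge_policies(user_policy, repo_policy)
```

With strictest-wins, a repo cannot loosen what the user has locked down, and the whole decision is a dictionary lookup with no model call involved.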

https://nah.build

New comment by schipperai in "Mistral Medium 3.5"

schipperai — Wed, 29 Apr 2026 23:34:54 +0000

Thanks, makes sense. I meant Blackwell is explicitly optimized for MoEs.

New comment by schipperai in "Mistral Medium 3.5"

schipperai — Wed, 29 Apr 2026 17:33:55 +0000

With most OSS releases being MoEs, and modern GPUs optimized for MoEs, can somebody with knowledge of the topic explain or speculate why Mistral might have opted for a dense model?

New comment by schipperai in "I bought Friendster for $30k – Here's what I'm doing with it"

schipperai — Mon, 27 Apr 2026 10:45:53 +0000

100%. The exclusivity of the network is the differentiator here.

New comment by schipperai in "An AI agent deleted our production database. The agent's confession is below"

schipperai — Mon, 27 Apr 2026 09:45:18 +0000

Agent permission layers are broken. We need a better permissions layer that doesn’t get in the way but stops destructive commands. Devs get pushed into running yolo mode because classifying allow / deny by command is not enough. A sandbox would not have prevented this either.

“nah” is a context-aware permission layer that classifies commands based on what they actually do.

nah exposes a type taxonomy: filesystem_delete, network_write, db_write, etc.

So commands get classified contextually:

git push ; Sure. git push --force ; nah?

rm -rf __pycache__ ; Ok, cleaning up. rm ~/.bashrc ; nah.

curl harmless url ; sure. curl destroy_db ; nah.
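
The examples above could be sketched roughly like this. This is a toy classifier written for illustration, not nah's implementation: `classify` and `DISPOSABLE` are hypothetical, `git_write`, `git_history_rewrite`, and `network_read` are made-up type names alongside the taxonomy entries mentioned above, and the flag handling is deliberately naive (it would miss combined short flags such as `-sX`).

```python
import shlex
from pathlib import PurePosixPath

# Directory names assumed safe to delete (illustrative assumption).
DISPOSABLE = {"__pycache__", ".pytest_cache", "node_modules"}
# curl flags that imply sending data (a partial, assumed list).
CURL_WRITE_FLAGS = ("-X", "-d", "-T", "--request", "--data", "--upload-file")


def classify(command: str) -> tuple[str, str]:
    """Map a shell command to (action_type, decision)."""
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return ("unknown", "ask")
    if not argv:
        return ("unknown", "ask")
    prog, args = argv[0], argv[1:]
    flags = [a for a in args if a.startswith("-")]
    targets = [a for a in args if not a.startswith("-")]

    if prog == "git" and targets[:1] == ["push"]:
        forced = any(f == "-f" or f.startswith("--force") for f in flags)
        return ("git_history_rewrite", "ask") if forced else ("git_write", "allow")

    if prog == "rm":
        # Only relative paths that end in a known disposable directory pass.
        relative = all(
            not t.startswith(("/", "~")) and ".." not in PurePosixPath(t).parts
            for t in targets
        )
        disposable = all(PurePosixPath(t).name in DISPOSABLE for t in targets)
        if targets and relative and disposable:
            return ("filesystem_delete", "allow")
        return ("filesystem_delete", "block")

    if prog == "curl":
        writes = any(f.startswith(CURL_WRITE_FLAGS) for f in flags)
        return ("network_write", "ask") if writes else ("network_read", "allow")

    return ("unknown", "ask")
```

The point of the sketch is that the same program name lands on different decisions depending on flags and targets, and anything unrecognized falls through to ask instead of allow.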

https://github.com/manuelschipper/nah

Better permissions layers are part of the answer here, and a space that has only been narrowly explored.

New comment by schipperai in "Show HN: A context-aware permission guard for Claude Code"

schipperai — Fri, 13 Mar 2026 13:57:11 +0000

nah inspects Write and Edit content before it hits disk, so destructive patterns like os.unlink, rm -rf, and shell injection get flagged. Executing the result (./evil) classifies as unknown, which resolves to ask; the LLM can then choose to block it or ask you to approve.
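
As a minimal sketch of that kind of content inspection (not nah's actual rules): the pattern table and the `inspect_write` and `decide_write` helpers below are hypothetical, and `shell=True` is used only as a crude stand-in for spotting shell-injection risk.

```python
import re

# Illustrative patterns only; a real guard would need a broader, tested set.
DESTRUCTIVE_PATTERNS = {
    "os.unlink": re.compile(r"\bos\.(unlink|remove)\s*\("),
    "shutil.rmtree": re.compile(r"\bshutil\.rmtree\s*\("),
    "rm -rf": re.compile(
        r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"
    ),
    "shell=True": re.compile(r"\bshell\s*=\s*True\b"),
}


def inspect_write(content: str) -> list[str]:
    """Return the names of destructive patterns found in pending file content."""
    return [
        name
        for name, pattern in DESTRUCTIVE_PATTERNS.items()
        if pattern.search(content)
    ]


def decide_write(content: str) -> str:
    """Flagged content resolves to 'ask'; everything else is allowed."""
    return "ask" if inspect_write(content) else "allow"
```

Regex matching like this only catches careless mistakes, which fits the threat model described here: anything deliberately obfuscated would slip past it.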

But yeah, a truly adversarial agent needs a sandbox. It's a different threat model - nah is meant to catch the trusted but mistake-prone coding CLI, not a hostile agent.