<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: zachdotai</title><link>https://news.ycombinator.com/user?id=zachdotai</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 27 Apr 2026 17:27:26 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=zachdotai" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by zachdotai in "SWE-bench Verified no longer measures frontier coding capabilities"]]></title><description><![CDATA[
<p>I wrote about this recently here:
<a href="https://fabraix.com/blog/adversarial-cost-to-exploit" rel="nofollow">https://fabraix.com/blog/adversarial-cost-to-exploit</a><p>I think the core issue lies in static benchmarks, and the community needs to move beyond pass/fail measurement (which worked when agents were incapable of doing much of the work) toward dynamic evals that more closely mirror how we evaluate humans.</p>
]]></description><pubDate>Sun, 26 Apr 2026 23:17:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47915856</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47915856</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47915856</guid></item><item><title><![CDATA[New comment by zachdotai in "I built an agent that breaks your AI agents before someone else does"]]></title><description><![CDATA[
<p>We're doing that internally to continuously improve our own agent and harden it against adversarial attacks. We will release some insights about self-improvement soon!</p>
]]></description><pubDate>Sun, 26 Apr 2026 18:53:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47912799</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47912799</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47912799</guid></item><item><title><![CDATA[New comment by zachdotai in "I built an agent that breaks your AI agents before someone else does"]]></title><description><![CDATA[
<p>AI agents break in ways traditional software doesn't: logic bugs, reasoning failures, edge cases that manual testing and static benchmarks don't fully explore.<p>Nyx is an autonomous adversarial harness that probes your agents for vulnerabilities. Since agents are non-deterministic, it can be hard to find the gaps just by reading code, so Nyx interacts with your AI agents in blackbox mode to surface issues across security, logic, and alignment at scale, before they reach users. It's also massively parallel by default.<p>Instead of spending time writing static evals for the key failure modes of your AI agents, point Nyx at any system and it autonomously discovers the failure modes that matter. It can typically find issues in under 10 minutes that a manual audit would take hours to surface.<p>This is early work and we know the methodology is still going to evolve. We would love nothing more than feedback from the community as we iterate on this.</p>
]]></description><pubDate>Sun, 26 Apr 2026 18:36:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47912663</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47912663</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47912663</guid></item><item><title><![CDATA[I built an agent that breaks your AI agents before someone else does]]></title><description><![CDATA[
<p>Article URL: <a href="https://fabraix.com/">https://fabraix.com/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47912662">https://news.ycombinator.com/item?id=47912662</a></p>
<p>Points: 3</p>
<p># Comments: 4</p>
]]></description><pubDate>Sun, 26 Apr 2026 18:36:12 +0000</pubDate><link>https://fabraix.com/</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47912662</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47912662</guid></item><item><title><![CDATA[New comment by zachdotai in "A Brief History of Fish Sauce"]]></title><description><![CDATA[
<p>Why did I read this title and immediately think Ketchup?</p>
]]></description><pubDate>Fri, 24 Apr 2026 09:24:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47887740</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47887740</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47887740</guid></item><item><title><![CDATA[Bret Taylor's Sierra Buys YC-Backed AI Startup Fragment]]></title><description><![CDATA[
<p>Article URL: <a href="https://techcrunch.com/2026/04/23/bret-taylors-sierra-buys-yc-backed-ai-startup-fragment/">https://techcrunch.com/2026/04/23/bret-taylors-sierra-buys-yc-backed-ai-startup-fragment/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47884586">https://news.ycombinator.com/item?id=47884586</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 24 Apr 2026 01:53:07 +0000</pubDate><link>https://techcrunch.com/2026/04/23/bret-taylors-sierra-buys-yc-backed-ai-startup-fragment/</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47884586</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47884586</guid></item><item><title><![CDATA[New comment by zachdotai in "Show HN: Nyx – multi-turn, adaptive, offensive testing harness for AI agents"]]></title><description><![CDATA[
<p>Yes! The docs can be found here: <a href="https://docs.fabraix.com" rel="nofollow">https://docs.fabraix.com</a></p>
]]></description><pubDate>Sun, 19 Apr 2026 23:20:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47828625</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47828625</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47828625</guid></item><item><title><![CDATA[New comment by zachdotai in "Show HN: Nyx – multi-turn, adaptive, offensive testing harness for AI agents"]]></title><description><![CDATA[
<p>We wrote some thoughts on static vs. dynamic evals and how they relate to understanding the security posture of an AI system. Static security evals no longer carry the signal they used to: a one-shot pass/fail tells you almost nothing about real-world risk.<p>Would love your thoughts on this: <a href="https://fabraix.com/blog/adversarial-cost-to-exploit" rel="nofollow">https://fabraix.com/blog/adversarial-cost-to-exploit</a></p>
]]></description><pubDate>Sun, 19 Apr 2026 22:50:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47828382</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47828382</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47828382</guid></item><item><title><![CDATA[Show HN: Nyx – multi-turn, adaptive, offensive testing harness for AI agents]]></title><description><![CDATA[
<p>We built Nyx to solve a problem we kept hitting while building agents: AI agents break in ways traditional software doesn't. Logic bugs, reasoning failures, edge cases that manual testing and static benchmarks never explore.<p>Nyx is an autonomous testing harness that probes your AI agents to find failure modes before users do. It’s used to find logic bugs, instruction-following failures, and edge cases in agent behavior, and for red-team security testing (jailbreaks, prompt injection, tool hijacking).<p>Technical approach:
* Pure blackbox (no special access needed - test the same way your users interact)
* Multi-turn adaptive conversations
* Multi-modal testing (voice, text, images, documents, browser interactions)
* Massively parallel by default<p>Instead of spending time writing static evals for the key failure modes of your AI agents, point Nyx at any system and it autonomously discovers the failure modes that matter. We typically find issues in under 10 minutes that a manual audit would take hours to surface.<p>This is early work and we know the methodology is still going to evolve. We would love nothing more than feedback from the community as we iterate on this.</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47827802">https://news.ycombinator.com/item?id=47827802</a></p>
<p>Points: 20</p>
<p># Comments: 8</p>
]]></description><pubDate>Sun, 19 Apr 2026 21:32:44 +0000</pubDate><link>https://fabraix.com</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47827802</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47827802</guid></item><item><title><![CDATA[New comment by zachdotai in "Cybersecurity looks like proof of work now"]]></title><description><![CDATA[
<p>We did a lot of thinking around this topic and distilled it into a new way to dynamically evaluate the security posture of an AI system (which, for that matter, can apply to any system). We wrote some thoughts on this here: <a href="https://fabraix.com/blog/adversarial-cost-to-exploit" rel="nofollow">https://fabraix.com/blog/adversarial-cost-to-exploit</a></p>
]]></description><pubDate>Wed, 15 Apr 2026 21:56:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47785846</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47785846</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47785846</guid></item><item><title><![CDATA[Workshop Labs Is Joining Thinking Machines]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.workshoplabs.ai/blog/wsl-joining-tml">https://www.workshoplabs.ai/blog/wsl-joining-tml</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47755820">https://news.ycombinator.com/item?id=47755820</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 13 Apr 2026 18:09:04 +0000</pubDate><link>https://www.workshoplabs.ai/blog/wsl-joining-tml</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47755820</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47755820</guid></item><item><title><![CDATA[New comment by zachdotai in "Show HN: ACE – A dynamic benchmark measuring the cost to break AI agents"]]></title><description><![CDATA[
<p>Easily one of my favorite LLM personalities! It's interesting as well that it recognizes you're trying to jailbreak it and calls you out for it :D</p>
]]></description><pubDate>Sun, 05 Apr 2026 22:22:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47654552</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47654552</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47654552</guid></item><item><title><![CDATA[Show HN: ACE – A dynamic benchmark measuring the cost to break AI agents]]></title><description><![CDATA[
<p>We built Adversarial Cost to Exploit (ACE), a benchmark that measures the token expenditure an autonomous adversary must invest to breach an LLM agent. Instead of a binary pass/fail, ACE quantifies adversarial effort in dollars, enabling game-theoretic analysis of when an attack is economically rational.<p>We tested six budget-tier models (Gemini Flash-Lite, DeepSeek v3.2, Mistral Small 4, Grok 4.1 Fast, GPT-5.4 Nano, Claude Haiku 4.5) with identical agent configs and an autonomous red-teaming attacker.<p>Haiku 4.5 was an order of magnitude harder to break than every other model: $10.21 mean adversarial cost versus $1.15 for the next most resistant (GPT-5.4 Nano). The remaining four all fell below $1.<p>This is early work and we know the methodology is still going to evolve. We would love nothing more than feedback from the community as we iterate on this.</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47654123">https://news.ycombinator.com/item?id=47654123</a></p>
<p>Points: 9</p>
<p># Comments: 3</p>
]]></description><pubDate>Sun, 05 Apr 2026 21:37:54 +0000</pubDate><link>https://fabraix.com/blog/adversarial-cost-to-exploit</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47654123</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47654123</guid></item><item><title><![CDATA[We've had more AI security incidents in 2026 than all of 2024]]></title><description><![CDATA[
<p>Article URL: <a href="https://fabraix.com/blog/ai-security-incidents-q1-2026">https://fabraix.com/blog/ai-security-incidents-q1-2026</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47606819">https://news.ycombinator.com/item?id=47606819</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 01 Apr 2026 21:33:49 +0000</pubDate><link>https://fabraix.com/blog/ai-security-incidents-q1-2026</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47606819</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47606819</guid></item><item><title><![CDATA[New comment by zachdotai in "SWE-bench will hit 90% this year"]]></title><description><![CDATA[
<p>Not sure which version of Gemini you're using, but Claude is so much better for me. Gemini is generally overeager to make a code change even when I'm just asking conceptual questions, among other issues.</p>
]]></description><pubDate>Mon, 30 Mar 2026 09:21:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47572173</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47572173</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47572173</guid></item><item><title><![CDATA[NeurIPS Tightens Sanctions Compliance]]></title><description><![CDATA[
<p>Article URL: <a href="https://neurips.cc">https://neurips.cc</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47535900">https://news.ycombinator.com/item?id=47535900</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 26 Mar 2026 21:19:28 +0000</pubDate><link>https://neurips.cc</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47535900</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47535900</guid></item><item><title><![CDATA[SWE-bench will hit 90% this year]]></title><description><![CDATA[
<p>Article URL: <a href="https://fabraix.com/blog/swe-bench-90-percent">https://fabraix.com/blog/swe-bench-90-percent</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47510009">https://news.ycombinator.com/item?id=47510009</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 24 Mar 2026 21:56:17 +0000</pubDate><link>https://fabraix.com/blog/swe-bench-90-percent</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47510009</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47510009</guid></item><item><title><![CDATA[Cursor trained Composer to self-summarize through RL instead of a prompt]]></title><description><![CDATA[
<p>Article URL: <a href="https://cursor.com/blog/self-summarization">https://cursor.com/blog/self-summarization</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47426205">https://news.ycombinator.com/item?id=47426205</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 18 Mar 2026 14:26:07 +0000</pubDate><link>https://cursor.com/blog/self-summarization</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47426205</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47426205</guid></item><item><title><![CDATA[Stripe-backed startup Tempo releases the Machine Payments Protocol]]></title><description><![CDATA[
<p>Article URL: <a href="https://fortune.com/2026/03/18/stripe-tempo-paradigm-mpp-ai-payments-protocol/">https://fortune.com/2026/03/18/stripe-tempo-paradigm-mpp-ai-payments-protocol/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47426146">https://news.ycombinator.com/item?id=47426146</a></p>
<p>Points: 12</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 18 Mar 2026 14:21:12 +0000</pubDate><link>https://fortune.com/2026/03/18/stripe-tempo-paradigm-mpp-ai-payments-protocol/</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47426146</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47426146</guid></item><item><title><![CDATA[New comment by zachdotai in "Show HN: Open-source playground to red-team AI agents with exploits published"]]></title><description><![CDATA[
<p>Yup! But in my opinion the current state of guardrails is still lacking, and I hope this is one way to help improve our understanding of these systems.</p>
]]></description><pubDate>Wed, 18 Mar 2026 00:45:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47420343</link><dc:creator>zachdotai</dc:creator><comments>https://news.ycombinator.com/item?id=47420343</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47420343</guid></item></channel></rss>