<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: robkop</title><link>https://news.ycombinator.com/user?id=robkop</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 10 Jun 2026 13:05:07 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=robkop" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by robkop in "Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks"]]></title><description><![CDATA[
<p>Just saying you’re not alone; I’m very surprised by the reception given how brutally sloppified the OP is.<p>Interesting problem space, but I hope the author just gives dot points next time rather than bloating it and losing most of its meaning.</p>
]]></description><pubDate>Wed, 20 May 2026 11:36:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48206105</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=48206105</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48206105</guid></item><item><title><![CDATA[New comment by robkop in "Show HN: Actual Claude Tokenizer"]]></title><description><![CDATA[
<p>Could you please elaborate a bit more for my understanding?<p>What in particular about this method breaks correct token boundaries?<p>On my first read I took your comment to mean that there are special tokens that require multiple tokens to emit, hence you can't get certain tokens emitted alone - but on a second read I don't think that's what you're getting at?<p>Interesting that you've found similarities between "d" and the hidden tokens for opening an XML tag, pressing caps lock and the other hidden tokens of note. I haven't run into any trouble extracting "d" tokens. Is there a particular model where you see that pattern?</p>
]]></description><pubDate>Sun, 03 May 2026 15:37:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47998007</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47998007</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47998007</guid></item><item><title><![CDATA[New comment by robkop in "Claude.ai unavailable and elevated errors on the API"]]></title><description><![CDATA[
<p>I use Bedrock with 1M context every day. Not sure this is right.</p>
]]></description><pubDate>Tue, 28 Apr 2026 20:58:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47940674</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47940674</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47940674</guid></item><item><title><![CDATA[New comment by robkop in "SpaceX says it has agreement to acquire Cursor for $60B"]]></title><description><![CDATA[
<p>A lot of enterprises were doing that, but now they hit the 150-user limit on Claude and are paying seat + API rates.<p>Codex is still going strong, but it’s hard to imagine they won’t do something similar eventually.<p>So now I’m honestly hearing a lot more folk stick it out with Cursor while waiting for the dust to settle.</p>
]]></description><pubDate>Wed, 22 Apr 2026 11:11:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47861893</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47861893</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47861893</guid></item><item><title><![CDATA[Show HN: Actual Claude Tokenizer]]></title><description><![CDATA[
<p>I've seen a few "Claude tokenizers" floating around lately with all the 4.7 chatter, but most of them just hit the count_tokens endpoint and hand you back a number. You don't actually see how your text gets split or understand the changes from 4.6 to 4.7.<p>I built this a while back for doing some mech interp research. It faithfully represents Claude token splitting - showing hidden tokens, real boundaries and so on. It is not cheap to run - essentially n^2 cost. You could optimise for longer sequences, but then you are not guaranteed a faithful representation.<p>Open Source: <a href="https://github.com/R0bk/claude-tokenizer" rel="nofollow">https://github.com/R0bk/claude-tokenizer</a><p>Feedback welcome, let me know if there are any edge cases that look wrong.<p>P.S. I'd expect this to face the same fate as the streaming-chunk and prefill-based token extraction methods did. I do worry about the ability to do independent research once it's fully closed off, and would love it if there were more public frontier tokenizers.</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47834347">https://news.ycombinator.com/item?id=47834347</a></p>
<p>Points: 3</p>
<p># Comments: 4</p>
]]></description><pubDate>Mon, 20 Apr 2026 13:51:25 +0000</pubDate><link>https://tokenizer.robkopel.me</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47834347</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47834347</guid></item><item><title><![CDATA[New comment by robkop in "Are the costs of AI agents also rising exponentially? (2025)"]]></title><description><![CDATA[
<p>There are a lot of tradeoffs to play with. Those inference ASICs may not carry the gradient, but they are still optimised for larger batches and for running any model. They need enough memory for the weights, wide-batch inference, and ideally some left over for KV cache efficiency.<p>For personal inference you’re given a lot more room to play in - much of it poorly explored today - enough to trouble the argument that the cost advantages evaporate.</p>
]]></description><pubDate>Sun, 19 Apr 2026 12:58:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47823979</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47823979</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47823979</guid></item><item><title><![CDATA[New comment by robkop in "Are the costs of AI agents also rising exponentially? (2025)"]]></title><description><![CDATA[
<p>You can ablate surprisingly large chunks of a model with next to no effect. You can try this easily: download an open-weight model and load it in torch.<p>Obviously it’s not ideal, but you could likely have a single-digit % of all weights affected and still have a useful model (many caveats here: e.g. locality of damaged weights matters, distribution of errors matters, whether they fail high or low matters, …)</p>
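<p>A minimal sketch of that kind of experiment, under loud assumptions: it uses plain Python lists as a stand-in for torch tensors so it runs with no dependencies, and a single random 64x64 layer rather than a real open-weight model. The helper names (ablate, matvec, relative_error) are hypothetical. It only shows the mechanics of zeroing a random fraction of weights and measuring how far the output drifts; it says nothing about how a real model degrades.</p>

```python
import random


def ablate(weights, fraction, seed=0):
    """Return a copy of `weights` with a random `fraction` of entries set to 0.0."""
    rng = random.Random(seed)
    cols = len(weights[0])
    total = len(weights) * cols
    # Pick flat indices to zero; round() so that e.g. 5% of 100 entries is exactly 5.
    dropped = set(rng.sample(range(total), round(total * fraction)))
    ablated = []
    for r, row in enumerate(weights):
        ablated.append([0.0 if r * cols + c in dropped else w for c, w in enumerate(row)])
    return ablated


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relative_error(reference, damaged):
    """L2 distance between two output vectors, relative to the reference norm."""
    diff = sum((a - b) ** 2 for a, b in zip(reference, damaged)) ** 0.5
    norm = sum(a ** 2 for a in reference) ** 0.5
    return diff / norm


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy stand-in for one layer of a model (assumption: random Gaussian weights).
    layer = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(64)]
    x = [rng.gauss(0, 1) for _ in range(64)]
    for fraction in (0.01, 0.05, 0.20):
        err = relative_error(matvec(layer, x), matvec(ablate(layer, fraction), x))
        print(f"{fraction:.0%} of weights zeroed, relative output error {err:.3f}")
```

<p>On a real model you would apply the same masking to the parameter tensors and compare task metrics rather than a single layer's output; the caveats about locality and error distribution still apply.</p>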
]]></description><pubDate>Sun, 19 Apr 2026 12:37:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47823860</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47823860</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47823860</guid></item><item><title><![CDATA[New comment by robkop in "Why do we tell ourselves scary stories about AI?"]]></title><description><![CDATA[
<p>I can’t speak for the States, but in AU I clearly see a massive displacement of undergrad and junior roles (only in AI-exposed domains).<p>I say this as both someone who works with many execs, hearing their musings, and someone who can no longer justify hiring for junior roles myself.<p>Irrespective of that, if we take this strategy of only acting once it is visible to the layman, our scope of available actions will invariably be significantly diminished.<p>Even if you are not convinced it is guaranteed and do not believe what I and others see, I would ask: is your probability of it happening now really that close to 0? If not, would it not be prudent to take the risk seriously?</p>
]]></description><pubDate>Sat, 11 Apr 2026 05:01:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47727562</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47727562</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47727562</guid></item><item><title><![CDATA[New comment by robkop in "Sam Altman's response to Molotov cocktail incident"]]></title><description><![CDATA[
<p>One of their highlights with Mythos was its ability to generate new puns<p>I took a look and honestly they're the first AI puns that aren't bad<p>Times are changing</p>
]]></description><pubDate>Sat, 11 Apr 2026 03:21:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47727012</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47727012</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47727012</guid></item><item><title><![CDATA[New comment by robkop in "LLM=True"]]></title><description><![CDATA[
<p>We’ve got a long way to go in optimising our environments for these models. Our perception of a terminal is much closer to feeding a video into Gemini than reading a textbook of logs. But we don’t make that AX affordance at the moment.<p>Over the summer I wrote a small game for my dev team to experience what it’s like interacting through these painful interfaces: www.youareanagent.app<p>Jump to the agentic coding level or the MCP level to experience true frustration (call it empathy). I also wrote up a lot more thinking here: www.robkopel.me/field-notes/ax-agent-experience/</p>
]]></description><pubDate>Wed, 25 Feb 2026 10:20:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47149721</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47149721</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47149721</guid></item><item><title><![CDATA[New comment by robkop in "Qwen3.5: Towards Native Multimodal Agents"]]></title><description><![CDATA[
<p>Rumours say you do something like:<p><pre><code>  Download every github repo
    -> Classify if it could be used as an env, and what types
      -> Issues and PRs are great for coding rl envs
      -> If the software has a UI, awesome, UI env
      -> If the software is a game, awesome, game env
      -> If the software has xyz, awesome, ...
    -> Do more detailed run checks
      -> Can it build
      -> Is it complex and/or distinct enough
      -> Can you verify if it reached some generated goal
      -> Can generated goals even be achieved
      -> Maybe some human review - maybe not
    -> Generate goals
      -> For a coding env, you could have an LLM introduce a new bug and confirm that test cases now fail. The goal for the model is then to fix it
    ... Do the rest of the normal RL env stuff</code></pre></p>
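<p>The first two stages of the outline above (classify, then run checks) can be sketched roughly as follows. Everything here is an assumption for illustration: the Repo fields, the function names and the env type labels are hypothetical, and the boolean flags stand in for whatever crawler or classifier would really produce them. Goal generation and the rest of the RL setup are left out.</p>

```python
from dataclasses import dataclass


@dataclass
class Repo:
    # All fields are hypothetical signals a crawler or classifier might produce.
    name: str
    has_issues_and_prs: bool = False
    has_ui: bool = False
    is_game: bool = False
    builds: bool = False
    goal_is_verifiable: bool = False


def classify_env_types(repo):
    """Stage 1: decide which kinds of environment a repo could back, if any."""
    types = []
    if repo.has_issues_and_prs:
        types.append("coding")
    if repo.has_ui:
        types.append("ui")
    if repo.is_game:
        types.append("game")
    return types


def passes_run_checks(repo):
    """Stage 2: keep only repos that build and whose goals can be verified."""
    return repo.builds and repo.goal_is_verifiable


def select_envs(repos):
    """Map each surviving repo name to the env types it could be used for."""
    selected = {}
    for repo in repos:
        types = classify_env_types(repo)
        if types and passes_run_checks(repo):
            selected[repo.name] = types
    return selected


if __name__ == "__main__":
    repos = [
        Repo("web-app", has_issues_and_prs=True, has_ui=True,
             builds=True, goal_is_verifiable=True),
        Repo("broken-game", is_game=True, builds=False),
        Repo("docs-only"),
    ]
    print(select_envs(repos))
```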
]]></description><pubDate>Mon, 16 Feb 2026 12:40:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47034287</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47034287</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47034287</guid></item><item><title><![CDATA[New comment by robkop in "Dario Amodei – "We are near the end of the exponential" [video]"]]></title><description><![CDATA[
<p>I get this at least once a week. And then once you have to dig in and understand the full mental model it’s not really giving you any uplift anyway.<p>I will say that doing this for enough months has made me much quicker at picking up the mental model and at scoping how much I need to absorb. It seems possible that with another year you’d become very rapid at this.</p>
]]></description><pubDate>Sat, 14 Feb 2026 01:00:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47010124</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=47010124</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47010124</guid></item><item><title><![CDATA[Show HN: You Are an Agent]]></title><description><![CDATA[
<p>After adding "Human" as an LLM provider to OpenCode a few months ago as a joke, it turns out that acting as an LLM is quite painful. But it was surprisingly useful for understanding real agent harness dev.<p>So I thought I wouldn't leave anyone out! I made a small OSS game - You Are An Agent - youareanagent.app - to share in the (useful?) frustration<p>It's a bit ridiculous. To tell you about some entirely necessary features, we've got:
  - A full WASM Arch Linux VM that runs in your browser for the agent coding level
  - A bad desktop simulation with a beautiful Excel simulation for our computer use level
  - A lovely WebGL CRT simulation (I think the first one that supports proper DOM 2D barrel warp distortion on Safari? Honestly I wanted to use an existing one rather than write my own, but I couldn't find one I was happy with)
  - An MCP server simulator with full simulation of off-brand Jira/ Confluence/ ... connected
  - And of course, a full WebGL oscilloscope music simulator for the intro sequence<p>Let me know what you think!<p>Code (If you'd like to add a level): <a href="https://github.com/R0bk/you-are-an-agent" rel="nofollow">https://github.com/R0bk/you-are-an-agent</a><p>(And if you want to waste 20 minutes - I spent way too long writing up my messy thinking about agent harness dev): <a href="http://robkopel.me/field-notes/ax-agent-experience/" rel="nofollow">http://robkopel.me/field-notes/ax-agent-experience/</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46849318">https://news.ycombinator.com/item?id=46849318</a></p>
<p>Points: 14</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 01 Feb 2026 20:59:12 +0000</pubDate><link>https://youareanagent.app</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=46849318</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46849318</guid></item><item><title><![CDATA[New comment by robkop in "You Are an Agent – Try Being a Human LLM"]]></title><description><![CDATA[
<p>I added a "Human" LLM provider to my local OpenCode a few months ago as a joke, and it turns out acting as an LLM is quite painful. But it massively improved my agent harness dev skills.<p>So I thought I wouldn't leave anyone out! I made a small OSS game - You Are An Agent - youareanagent.app - to share in the (useful?) frustration<p>It's a bit ridiculous. To tell you about some entirely necessary features, we've got:
- A full WASM Arch Linux VM that runs in your browser for the agent coding level
- A bad desktop simulation with a beautiful Excel simulation for our computer use level
- A lovely WebGL CRT simulation (I think the first one that supports proper DOM 2D barrel warp distortion on Safari? Honestly I wanted to use an existing one rather than write my own, but I couldn't find one I was happy with)
- An MCP server simulator with full simulation of off-brand Jira/ Confluence/ ... connected
- And of course, a full WebGL oscilloscope music simulator for the intro sequence<p>Let me know what you think!<p>Code (If you'd like to add a level): <a href="https://github.com/R0bk/you-are-an-agent" rel="nofollow">https://github.com/R0bk/you-are-an-agent</a>
(And if you want to waste 20 minutes - I spent way too long writing up my messy thinking about agent harness dev): <a href="http://robkopel.me/field-notes/ax-agent-experience/" rel="nofollow">http://robkopel.me/field-notes/ax-agent-experience/</a></p>
]]></description><pubDate>Sun, 25 Jan 2026 16:38:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=46755569</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=46755569</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46755569</guid></item><item><title><![CDATA[You Are an Agent – Try Being a Human LLM]]></title><description><![CDATA[
<p>Article URL: <a href="https://youareanagent.app/">https://youareanagent.app/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46755568">https://news.ycombinator.com/item?id=46755568</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Sun, 25 Jan 2026 16:38:36 +0000</pubDate><link>https://youareanagent.app/</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=46755568</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46755568</guid></item><item><title><![CDATA[New comment by robkop in "Ax Not UX"]]></title><description><![CDATA[
<p>It's a fair question - I think the fact that they hold abilities (read 200k tokens instantly, can clone themselves, ...) that we don't would suggest they will have quirks and differences.<p>What downstream implications that will have in an AX sense is certainly arguable, but I would put forward that we're already seeing it with effective harnesses such as Claude Code. The experience the agent has there is quite different to how you'd build an IDE for a human.</p>
]]></description><pubDate>Thu, 22 Jan 2026 16:37:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=46721594</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=46721594</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46721594</guid></item><item><title><![CDATA[Ax Not UX]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.robkopel.me/field-notes/ax-agent-experience/">https://www.robkopel.me/field-notes/ax-agent-experience/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46721473">https://news.ycombinator.com/item?id=46721473</a></p>
<p>Points: 3</p>
<p># Comments: 2</p>
]]></description><pubDate>Thu, 22 Jan 2026 16:30:57 +0000</pubDate><link>https://www.robkopel.me/field-notes/ax-agent-experience/</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=46721473</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46721473</guid></item><item><title><![CDATA[New comment by robkop in "Ask HN: Share your personal website"]]></title><description><![CDATA[
<p><a href="https://robkopel.me" rel="nofollow">https://robkopel.me</a></p>
]]></description><pubDate>Wed, 14 Jan 2026 20:17:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=46622450</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=46622450</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46622450</guid></item><item><title><![CDATA[New comment by robkop in "2025 Letter"]]></title><description><![CDATA[
<p>Can you elaborate? I would have thought the main driver for the price of a service is the labor?</p>
]]></description><pubDate>Fri, 02 Jan 2026 00:40:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=46459997</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=46459997</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46459997</guid></item><item><title><![CDATA[New comment by robkop in "OpenAI's cash burn will be one of the big bubble questions of 2026"]]></title><description><![CDATA[
<p>Does that cost-to-serve multiple stay the same when conventional sites are forced to shovel AI into each request? e.g. the new Google search</p>
]]></description><pubDate>Wed, 31 Dec 2025 00:35:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=46439992</link><dc:creator>robkop</dc:creator><comments>https://news.ycombinator.com/item?id=46439992</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46439992</guid></item></channel></rss>