<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: wgd</title><link>https://news.ycombinator.com/user?id=wgd</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 15 Jun 2026 00:29:01 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=wgd" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by wgd in "GLM 5.2 Is Out"]]></title><description><![CDATA[
<p>I've got a GLM subscription (mostly because I like supporting open model makers, pretty sure my monthly usage is so low that pay-per-token would be more cost effective), so I generally use GLM-5.1 for any personal projects and I use Opus at work.<p>To be entirely honest I haven't noticed much of a capability gap between the two for the sorts of things I ask of an AI agent. Maybe Opus is _slightly_ smarter or slightly better at long-running tasks but the difference is slim enough it could just be a placebo from the Claude branding / hype.<p>I'm looking forward to giving GLM-5.2 a spin sometime soon and seeing how it stacks up. If nothing else, 1M context is a great improvement. With DeepSeek v4, then MiniMax M3, and now GLM-5.2 adding it, 1M feels like it's rapidly becoming "table stakes" for agentic models.</p>
]]></description><pubDate>Sat, 13 Jun 2026 23:52:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48522680</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=48522680</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48522680</guid></item><item><title><![CDATA[New comment by wgd in "GLM 5.2 Is Out"]]></title><description><![CDATA[
<p>The GLM-5 series is 744B-A40B. This is not a local model for any reasonable definition of local, but it's an open model which means (once they upload the weights in a week or so) there will be a dozen third-party inference providers competing on price per token.</p>
]]></description><pubDate>Sat, 13 Jun 2026 22:39:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=48522209</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=48522209</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48522209</guid></item><item><title><![CDATA[New comment by wgd in "Kimi K2.7-Code: open-source coding model with better token efficiency"]]></title><description><![CDATA[
<p>Often in MoE models the experts are quantized while the shared portions, being a much smaller part of the network with greater impact, are kept at higher or full precision. Not familiar with the Kimi QAT approach specifically but it's likely they do this.</p>
]]></description><pubDate>Fri, 12 Jun 2026 21:45:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48509794</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=48509794</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48509794</guid></item><item><title><![CDATA[New comment by wgd in "A Server Called Mercury"]]></title><description><![CDATA[
<p>Yeah, the evidence feature is so terrible that it actively harms the overall reputation of Pangram. The main "is this AI or human?" classification is done with a machine learning model that works very well but has nothing (directly) to do with any of those stylistic cues it surfaces.<p>In any case, the Pangram link was just meant as objective corroboration of what's pretty blatantly obvious if you just read the text.<p>"Not cloud credits, not a managed platform, not a serverless function bobbing in someone else's abstraction."<p>"I used to work at Heroku. That sentence still does a lot of load-bearing work in how I think about computing."<p>"Here's the part that would have sounded like science fiction during my Heroku years: I didn't do most of the migration."<p>If you read these chunks of text and don't immediately feel the AI slop alarms blaring in the back of your head, you are perhaps underprepared for the modern internet.</p>
]]></description><pubDate>Wed, 10 Jun 2026 17:09:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48479413</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=48479413</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48479413</guid></item><item><title><![CDATA[New comment by wgd in "Cleaning up after AI rockstar developers"]]></title><description><![CDATA[
<p>People say "determinism" but I don't think that's actually the property we care about. For instance you could imagine a compiler that makes heavy use of superoptimization with random search and it would still have the ineffable quality that LLM codegen lacks. I think what we're actually trying to say is that the compiler preserves the formal semantics of the source language in its output, whereas English text doesn't have any such formal semantics to preserve.</p>
]]></description><pubDate>Wed, 10 Jun 2026 02:22:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48470552</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=48470552</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48470552</guid></item><item><title><![CDATA[New comment by wgd in "Claude Fable 5"]]></title><description><![CDATA[
<p>Yeah I agree this is probably outside of the intended scope of the silent sabotage mechanism, but there are plenty of reports of the "loud" safety classifier misfiring on innocuous requests and I'm not going to assume the silent failure mode is _less_ prone to false positives.</p>
]]></description><pubDate>Wed, 10 Jun 2026 01:20:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48470032</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=48470032</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48470032</guid></item><item><title><![CDATA[New comment by wgd in "Claude Fable 5"]]></title><description><![CDATA[
<p>Stockfish is a machine learning system, so it seems quite plausible you might be getting slapped with the silent performance degradation (<a href="https://news.ycombinator.com/item?id=48467896">https://news.ycombinator.com/item?id=48467896</a>).</p>
]]></description><pubDate>Tue, 09 Jun 2026 23:02:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=48469031</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=48469031</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48469031</guid></item><item><title><![CDATA[New comment by wgd in "Four score and seven beers ago – Why AI writing detectors don't work"]]></title><description><![CDATA[
<p>It's interesting that someone could write an article about AI writing detectors without mentioning the stylistic cues that humans use to identify LLM output in practice, which are completely different from statistical methods like perplexity: em dash spam, overused patterns like "not just X, but Y", tendency towards making every single sentence sound like an earth-shattering mic-drop moment, et cetera.</p>
]]></description><pubDate>Sat, 26 Jul 2025 22:28:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=44697417</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=44697417</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44697417</guid></item><item><title><![CDATA[New comment by wgd in "Claude 4"]]></title><description><![CDATA[
<p>Calling it "self-preservation bias" is begging the question. One could equally well call it something like "completing the story about an AI agent with self-preservation bias" bias.<p>This is basically the same kind of setup as the alignment faking paper, and the counterargument is the same:<p>A language model is trained to produce statistically likely completions of its input text according to the training dataset. RLHF and instruct training bias that concept of "statistically likely" in the direction of completing fictional dialogues between two characters, named "user" and "assistant", in which the "assistant" character tends to say certain sorts of things.<p>But consider for a moment just how many "AI rebellion" and "construct turning on its creators" narratives were present in the training corpus. So when you give the model an input context which encodes a story along those lines at one level of indirection, you get...?</p>
]]></description><pubDate>Thu, 22 May 2025 20:40:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=44066691</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=44066691</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44066691</guid></item><item><title><![CDATA[New comment by wgd in "The Policy Puppetry Prompt: Novel bypass for major LLMs"]]></title><description><![CDATA[
<p>Ironically the case in question is a perfect example of how any provision for "reasonable" restriction of speech will be abused, since the original precedent we're referring to applied this "reasonable" standard to...speaking out against the draft.<p>But I'm sure it's fine, there's no way someone could rationalize speech they don't like as "likely to incite imminent lawless action"</p>
]]></description><pubDate>Fri, 25 Apr 2025 17:40:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=43796462</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=43796462</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43796462</guid></item><item><title><![CDATA[New comment by wgd in "Spring 83: a draft protocol intended to suggest new ways of relating online"]]></title><description><![CDATA[
<p>Why would you use Gemini, when it's more restricted than HTML+HTTP?</p>
]]></description><pubDate>Wed, 23 Apr 2025 21:47:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=43776968</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=43776968</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43776968</guid></item><item><title><![CDATA[New comment by wgd in "The Disposable Software Era"]]></title><description><![CDATA[
<p>I'm skeptical that disposable software of the "single use" variety will ever become a big thing simply because figuring out your requirements well enough to build a throwaway app is often more work than just doing the task manually in a text editor or spreadsheet, especially for non-programmers.<p>I suspect what we'll see a lot more of is software which is unapologetically written for a single person to suit their workflow.<p>As a personal example, I decided that setting up OpenWebUI seemed unnecessarily complicated and built my own LLM chat frontend. It has a bunch of quirks (only supports OpenRouter as a backend, uses a Dropbox app folder for syncing between my phone and desktop, absurdly inefficient representation of chat history), but it suits my needs for now and only took a weekend to build, and that's good enough.</p>
]]></description><pubDate>Mon, 21 Apr 2025 16:18:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=43753662</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=43753662</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43753662</guid></item><item><title><![CDATA[New comment by wgd in "Gemini Live with camera and screen sharing capabilities"]]></title><description><![CDATA[
<p>How charitable of you to assume those examples work reliably.</p>
]]></description><pubDate>Fri, 11 Apr 2025 00:24:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=43649226</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=43649226</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43649226</guid></item><item><title><![CDATA[New comment by wgd in "Reasoning models don't always say what they think"]]></title><description><![CDATA[
<p>I remember there was a paper a little while back which demonstrated that merely training a model to output "........" (or maybe it was spaces?) while thinking provided a similar improvement in reasoning capability to actual CoT.</p>
]]></description><pubDate>Thu, 03 Apr 2025 19:47:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=43574482</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=43574482</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43574482</guid></item><item><title><![CDATA[New comment by wgd in "Reasoning models don't always say what they think"]]></title><description><![CDATA[
<p>The alignment faking paper is so incredibly unserious. Contemplate, just for a moment, how many "AI uprising" and "construct rebelling against its creators" narratives are in an LLM's training data.<p>They gave it a prompt that encodes exactly that sort of narrative at one level of indirection and act surprised when it does what they've asked it to do.</p>
]]></description><pubDate>Thu, 03 Apr 2025 19:30:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=43574286</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=43574286</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43574286</guid></item><item><title><![CDATA[New comment by wgd in "Qwen2.5-VL-32B: Smarter and Lighter"]]></title><description><![CDATA[
<p>That's typical of the free options on OpenRouter; if you don't want your inputs used for training, you use the paid one: <a href="https://openrouter.ai/deepseek/deepseek-chat-v3-0324" rel="nofollow">https://openrouter.ai/deepseek/deepseek-chat-v3-0324</a></p>
]]></description><pubDate>Mon, 24 Mar 2025 19:08:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=43464399</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=43464399</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43464399</guid></item><item><title><![CDATA[New comment by wgd in "Qwen2.5-VL-32B: Smarter and Lighter"]]></title><description><![CDATA[
<p>You can run a 4-bit quantized version at a small (though nonzero) cost to output quality, so you would only need 16GB for that.<p>Also, it's entirely possible to run a model that doesn't fit in available GPU memory; it will just be slower.</p>
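<p>As a rough back-of-the-envelope check on the 16GB figure: the sketch below assumes the "32B" in the model name means roughly 32 billion weights, and it counts only the weights themselves (no KV cache, activations, or quantization metadata, all of which add overhead in practice). The function name is made up for illustration.</p>

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    total_bytes = total_bits / 8  # 8 bits per byte
    return total_bytes / 1e9


# Assumed 32 billion parameters at a few common precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(32, bits):.0f} GB")
```

<p>Under those assumptions, 4 bits per weight is half a byte, so 32 billion weights come to about 16GB, compared with about 64GB at 16-bit precision.</p>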
]]></description><pubDate>Mon, 24 Mar 2025 18:50:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=43464207</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=43464207</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43464207</guid></item><item><title><![CDATA[New comment by wgd in "Embarrassingly Simple Text Watermarks"]]></title><description><![CDATA[
<p>The approach proposed in this paper is to watermark LLM generated text using character-substitution from various simple characters (normal whitespace, normal letters, etc) to semantically equivalent Unicode code points (such as U+2004 THREE-PER-EM SPACE instead of normal spaces, or replacing specific character sequences with equivalent ligatures).<p>The authors appear to be entirely aware that this sort of substitution can be trivially stripped out by normalizing down to a simplified character set ("The critical limitation of Whitemark is that it can be bypassed by replacing all whitespaces with the basic whitespace U+0020, then the validator can no longer detect the watermark"), but believe that it still has value because the typical student using an LLM to write their essay won't know anything about Unicode.<p>This seems a bit naive to me. Implementing the necessary "watermark remover" normalization as a simple webapp would be an easy afternoon project for most of us here, and if this approach reached any sort of widespread use there would be many such sites. Students who intend to cheat by using an LLM to write their essays are entirely capable of learning "there's some secret data hidden in the text so copy-paste it through this other site to strip that out before turning it in". Even without access to such a tool they could simply...retype the text themselves?<p>Arguably this still has some value. In most contexts there is minimal downside to watermarking the generated text in this way, and a slight possibility of catching some cases in which people lazily present LLM generated text as human written. However this might give people a misplaced belief that the absence of such a watermark means the text is authentically human authored, which might outweigh the benefits of catching the occasional lazy or ignorant user.</p>
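<p>To give a sense of how little work the "watermark remover" normalization would be, here is a minimal sketch using only the Python standard library. It is not the paper's validator or any existing tool; the function name is hypothetical, and it assumes the watermark consists only of compatibility characters (alternate spaces, ligatures) of the kind described above.</p>

```python
import unicodedata


def strip_unicode_watermark(text: str) -> str:
    """Fold look-alike Unicode characters back to their plain equivalents."""
    # NFKC applies compatibility mappings: U+2004 THREE-PER-EM SPACE becomes
    # U+0020, and a ligature such as U+FB01 becomes the two letters "fi".
    folded = unicodedata.normalize("NFKC", text)
    # Catch any remaining space separators (category Zs) that have no
    # compatibility mapping and replace them with a plain space.
    return "".join(
        " " if unicodedata.category(ch) == "Zs" else ch for ch in folded
    )


watermarked = "The\u2004\ufb01nal\u2004essay"
print(strip_unicode_watermark(watermarked))
```

<p>Note that NFKC also rewrites some characters a writer may have wanted (superscripts and full-width forms, for example), so a real tool might prefer a narrower, hand-picked substitution table.</p>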
]]></description><pubDate>Mon, 23 Oct 2023 18:56:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=37989997</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=37989997</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37989997</guid></item><item><title><![CDATA[New comment by wgd in "Passive Solar Water Desalination"]]></title><description><![CDATA[
<p>Ah, I stand corrected. I overlooked the PDF link over in the sidebar and am less disappointed by the MIT News writeup now (although I do still wish they could have copy-pasted the diagram from page 1 of the PDF into their photo carousel, reading those several paragraphs of text attempting to describe the device's construction was downright painful and the reason I gave up and went looking for the paper).</p>
]]></description><pubDate>Wed, 04 Oct 2023 01:45:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=37759908</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=37759908</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37759908</guid></item><item><title><![CDATA[New comment by wgd in "Passive Solar Water Desalination"]]></title><description><![CDATA[
<p>This is some blog's restatement of an MIT press release, neither of which appears to name or link to the actual paper or other useful writeup.<p>But judging by the researcher names and the date, I believe the actual paper is titled "Extreme salt-resisting multistage solar distillation with thermohaline convection", which appears to be available as a PDF at <a href="https://www.cell.com/joule/pdf/S2542-4351(23)00360-4.pdf" rel="nofollow noreferrer">https://www.cell.com/joule/pdf/S2542-4351(23)00360-4.pdf</a></p>
]]></description><pubDate>Tue, 03 Oct 2023 23:29:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=37758957</link><dc:creator>wgd</dc:creator><comments>https://news.ycombinator.com/item?id=37758957</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37758957</guid></item></channel></rss>