<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: Raywob</title><link>https://news.ycombinator.com/user?id=Raywob</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 01 May 2026 10:10:29 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=Raywob" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Show HN: Vibe Check – Client-side invisible Unicode steganography scanner]]></title><description><![CDATA[
<p>Glassworm has hit 400+ repos across GitHub, npm, and VS Code using invisible Unicode characters to encode executable payloads that pass every code review, linter, and AI assistant.<p>Vibe Check is a browser-based scanner that detects these characters across 14 invisible Unicode ranges (zero-width spaces, variation selectors supplement, tag characters, bidi overrides, etc.) and flags sequences of 3+ consecutive invisible characters as likely payloads. Entirely client-side JS — no code leaves your browser.<p>Not a full SAST tool. Solves one specific problem: detecting characters that are invisible in every editor and terminal but can encode payloads decoded via eval() at runtime.<p>Scanner logic is in scanner.js, viewable in browser. Site runs on Cloudflare Pages free tier.<p><a href="https://websationflow.com" rel="nofollow">https://websationflow.com</a></p>
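<p>For illustration, a rough Python sketch of the same detection idea (the real scanner is the client-side scanner.js; the ranges below are a subset of the 14, and the 3-character run threshold mirrors the description rather than the actual source):</p>
<pre><code>import re

# A handful of the invisible ranges the post names; the full tool covers 14.
INVISIBLE = (
    "\u200b-\u200f"            # zero-width space/joiner/non-joiner, LRM/RLM
    "\u202a-\u202e"            # bidi embedding/override controls
    "\u2060-\u2064"            # word joiner, invisible operators
    "\ufeff"                   # zero-width no-break space (BOM)
    "\U000e0000-\U000e007f"    # tag characters
    "\U000e0100-\U000e01ef"    # variation selectors supplement
)

# Runs of 3+ consecutive invisible characters are flagged as likely payloads.
PAYLOAD_RUN = re.compile(f"[{INVISIBLE}]{{3,}}")

def scan(text):
    """Return (offset, length) for every suspicious run of invisible characters."""
    return [(m.start(), m.end() - m.start()) for m in PAYLOAD_RUN.finditer(text)]

sample = "const x = 1;" + "\u200b\u200c\u200d\u200b" + " // looks clean"
for offset, length in scan(sample):
    print(f"run of {length} invisible characters at offset {offset}")
</code></pre>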
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47917600">https://news.ycombinator.com/item?id=47917600</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 27 Apr 2026 04:08:53 +0000</pubDate><link>https://websationflow.com/</link><dc:creator>Raywob</dc:creator><comments>https://news.ycombinator.com/item?id=47917600</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47917600</guid></item><item><title><![CDATA[My AI didn't misread a receipt – it fabricated one from scratch]]></title><description><![CDATA[
<p>I pointed a vision model at a grocery receipt. It returned a store name, item list, and total. None of it was on the paper.<p>This wasn't OCR error. The model didn't confuse a "7" for a "1." It generated a plausible-looking receipt from scratch — different store, different items, different prices. If I hadn't been holding the original, I might not have caught it.<p>Same image, different model (same parameter count, same hardware), five seconds later: every item correct, store name right, total accurate to the penny.<p>The models: minicpm-v 8B (fabricated) vs qwen3-vl 8B (accurate). Both open source, both ~6GB VRAM, both running locally via Ollama on an RTX 5080.<p>What I learned:<p>1. Vision model hallucination is qualitatively different from text hallucination. A text model gives you a wrong answer to a real question. A vision model gives you a confident answer to an image it didn't process. The second is harder to detect.<p>2. Model selection matters more than prompt engineering for vision. Same prompt, same image — one model fabricated, one read accurately. No prompt optimization fixes a model that invents data.<p>3. Confidence scoring is mandatory. I added a reconciliation check: do the extracted items sum to roughly the stated total? This catches fabrication that looks plausible at the individual line-item level.<p>4. The fix wasn't more money or a bigger model. Same size (8B), same hardware, same cost ($0). Just a different architecture that actually reads pixels instead of generating plausible text about them.<p>Full writeup with the pipeline architecture and code patterns: https://dev.to/rayne_robinson_e479bf0f26/my-ai-read-a-receipt-wrong-it-didnt-misread-it-it-made-one-up-4f5n</p>
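<p>The reconciliation check in point 3 is small enough to sketch. This is an illustrative version only; the field names and the 2% tolerance are assumptions, not the code from the writeup:</p>
<pre><code>def reconcile(items, stated_total, tolerance=0.02):
    """
    Sanity-check a vision model's receipt extraction: the extracted line items
    should sum to roughly the total printed on the receipt. A big mismatch is a
    strong signal the model generated data instead of reading the image.
    """
    line_sum = sum(item["price"] * item.get("qty", 1) for item in items)
    gap = abs(line_sum - stated_total)
    fabricated = gap > tolerance * max(stated_total, 1.0)
    return fabricated, line_sum, gap

# An extraction that looks plausible item-by-item but doesn't add up.
items = [{"name": "milk", "price": 3.49}, {"name": "bread", "price": 2.99}]
print(reconcile(items, stated_total=18.72))   # (True, ~6.48, ~12.24)
</code></pre>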
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47421107">https://news.ycombinator.com/item?id=47421107</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 18 Mar 2026 02:55:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47421107</link><dc:creator>Raywob</dc:creator><comments>https://news.ycombinator.com/item?id=47421107</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47421107</guid></item><item><title><![CDATA[New comment by Raywob in "The $2k Laptop That Replaced My $200/Month AI Subscription"]]></title><description><![CDATA[
<p>Haven't tried GPT-OSS-20B yet — the MoE approach is interesting for keeping VRAM usage down while getting better reasoning. 85 t/s on a 3060 is impressive. I'll look into that.<p>I've been on Qwen3 8B mostly because it was "good enough" for the mechanical stages (scanning, scoring, dedup) and I didn't want to optimize the local model before validating the orchestration pattern itself. Now that the pipeline is proven, experimenting with the local model is the obvious next lever to pull.<p>The Qwen3 4B 2507 claim is interesting — if the quality holds for structured extraction tasks, halving the VRAM footprint would open up running two models concurrently or leaving more room for larger contexts. Worth testing.<p>Thanks for the pointers — this is exactly the kind of optimization I haven't had time to dig into yet.</p>
]]></description><pubDate>Thu, 19 Feb 2026 16:04:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47075227</link><dc:creator>Raywob</dc:creator><comments>https://news.ycombinator.com/item?id=47075227</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47075227</guid></item><item><title><![CDATA[New comment by Raywob in "The $2k Laptop That Replaced My $200/Month AI Subscription"]]></title><description><![CDATA[
<p>For the mechanical stages (scanning, scoring, dedup) — indistinguishable from proprietary models. These are structured tasks: "score this post 1-10 against these criteria" or "extract these fields from this text." An 8B model handles that fine at 30 tok/s on a consumer GPU.<p>For synthesis and judgment — no, it's not close. That's exactly why I route those stages to Claude. When you need the model to generate novel connections or strategic recommendations, the quality gap between 8B and frontier is real.<p>The key insight is that most pipeline stages don't need synthesis. They need pattern matching. And that's where the 95% cost savings live.</p>
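<p>For concreteness, a minimal sketch of what one of those mechanical stages looks like (the prompt wording and the JSON shape are made up for illustration; ask_local_model stands in for whatever wraps the local 8B model):</p>
<pre><code>import json
import re

def score_item(text, criteria, ask_local_model):
    """
    Mechanical scoring stage: pure pattern matching, no synthesis.
    ask_local_model is any callable that takes a prompt string and returns the
    local model's reply as a string (e.g. a thin wrapper around Ollama).
    """
    prompt = (
        f"Score the following post from 1 to 10 against these criteria: {criteria}\n\n"
        f"Post:\n{text}\n\n"
        'Reply with JSON only, e.g. {"score": 7}.'
    )
    reply = ask_local_model(prompt)
    match = re.search(r"\{.*\}", reply, re.DOTALL)   # tolerate chatter around the JSON
    return json.loads(match.group(0))["score"] if match else None
</code></pre>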
]]></description><pubDate>Thu, 19 Feb 2026 15:09:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47074605</link><dc:creator>Raywob</dc:creator><comments>https://news.ycombinator.com/item?id=47074605</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47074605</guid></item><item><title><![CDATA[The $2k Laptop That Replaced My $200/Month AI Subscription]]></title><description><![CDATA[
<p>Cloud AI pricing is per-token. The more useful your pipeline, the more it costs. I built a dual-model orchestration pattern that routes 80% of the work to a free local model (Qwen3 8B on Ollama, GPU-accelerated) and only sends the synthesis/judgment stage to a cloud API.<p>Cost for a 50-item research pipeline: $0.15-0.40 vs $8-15 all-cloud. Same output quality where it matters.<p>Stack: RTX 5080 laptop, Ollama in Docker with GPU passthrough, PostgreSQL, Redis, Claude API for the final 20%.<p>The pattern: scan locally → score locally → deduplicate locally → synthesize via cloud. Four stages; three are free.<p>Gotchas I hit: Qwen3's thinking tokens leaking through /api/generate (use /api/chat instead), Docker binding to IPv4 only while Windows resolves localhost to IPv6, and GPU memory ceilings on consumer cards.<p>Happy to share architecture details in comments.</p>
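<p>A rough sketch of the routing skeleton with the two Ollama gotchas baked in (the /api/chat endpoint and request shape are Ollama's; the stage prompts, model tag, and pipeline wiring are simplified illustrations, and the Claude call is left as a stub):</p>
<pre><code>import requests

# Explicit 127.0.0.1 sidesteps Windows resolving localhost to IPv6 while the
# Docker port only binds IPv4. Use /api/chat, not /api/generate, so Qwen3's
# thinking tokens are handled properly.
OLLAMA_CHAT = "http://127.0.0.1:11434/api/chat"

def ask_local(prompt, model="qwen3:8b"):
    resp = requests.post(OLLAMA_CHAT, json={
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["message"]["content"]

def run_pipeline(raw_items):
    # Stages 1-3 run on the free local model.
    scanned = [ask_local(f"Extract title, url, summary as JSON:\n{i}") for i in raw_items]
    scored  = [ask_local(f"Score 1-10 for relevance, JSON only:\n{s}") for s in scanned]
    deduped = ask_local("List duplicate indexes as a JSON array:\n" + "\n".join(scored))
    # Stage 4 is the only paid call: send the distilled set to the Claude API
    # for synthesis (omitted here).
    return scanned, scored, deduped
</code></pre>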
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47074347">https://news.ycombinator.com/item?id=47074347</a></p>
<p>Points: 8</p>
<p># Comments: 5</p>
]]></description><pubDate>Thu, 19 Feb 2026 14:47:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47074347</link><dc:creator>Raywob</dc:creator><comments>https://news.ycombinator.com/item?id=47074347</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47074347</guid></item></channel></rss>