<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: mmoskal</title><link>https://news.ycombinator.com/user?id=mmoskal</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 17 Apr 2026 14:28:24 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=mmoskal" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by mmoskal in "Structured outputs on the Claude Developer Platform"]]></title><description><![CDATA[
<p>Grammars work best when aligned with the prompt. That is, if your prompt gives you the right format of answer 80% of the time, the grammar will take you to 100%. If it gives you the right answer 1% of the time, the grammar will give you syntactically correct garbage.</p>
]]></description><pubDate>Fri, 14 Nov 2025 22:26:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=45932903</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45932903</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45932903</guid></item><item><title><![CDATA[New comment by mmoskal in "Structured outputs on the Claude Developer Platform"]]></title><description><![CDATA[
<p>OpenAI is using [0] LLGuidance [1]. You need to set strict:true in your request for schema validation to kick in though.<p>[0] <a href="https://platform.openai.com/docs/guides/function-calling#lark-cfg" rel="nofollow">https://platform.openai.com/docs/guides/function-calling#lar...</a>
[1] <a href="https://github.com/guidance-ai/llguidance" rel="nofollow">https://github.com/guidance-ai/llguidance</a></p>
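For reference, a sketch of roughly what such a tool definition looks like, following the field names in the OpenAI function-calling guide linked above; the function name and schema are hypothetical, so check the current API reference before relying on this shape:

```python
# Rough shape of an OpenAI function-calling tool with schema enforcement
# enabled. Without "strict": True the schema is advisory only.

tool = {
    "type": "function",
    "function": {
        "name": "get_weather",   # hypothetical example function
        "strict": True,          # opt in to enforced schema validation
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["C", "F"]},
            },
            "required": ["city", "unit"],
            # strict mode requires closed objects:
            "additionalProperties": False,
        },
    },
}

print(tool["function"]["strict"])  # → True
```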
]]></description><pubDate>Fri, 14 Nov 2025 22:23:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=45932866</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45932866</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45932866</guid></item><item><title><![CDATA[New comment by mmoskal in "PCB Edge USB C Connector Library"]]></title><description><![CDATA[
<p>I had good experience with carefully spaced holes in the PCB and a 50 mil header; see <a href="https://jacdac.github.io/jacdac-docs/ddk/firmware/jac-connect/" rel="nofollow">https://jacdac.github.io/jacdac-docs/ddk/firmware/jac-connec...</a></p>
]]></description><pubDate>Sun, 26 Oct 2025 06:17:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45709531</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45709531</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45709531</guid></item><item><title><![CDATA[New comment by mmoskal in "How to stop AI's "lethal trifecta""]]></title><description><![CDATA[
<p>The previous article is in the same issue, in the science and technology section. This is how they typically do it: a leader article has a longer version in the paper. Leaders tend to be more opinionated.</p>
]]></description><pubDate>Fri, 26 Sep 2025 17:29:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45388924</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45388924</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45388924</guid></item><item><title><![CDATA[New comment by mmoskal in "AGI is an engineering problem, not a model training problem"]]></title><description><![CDATA[
<p>Consciousness (subjective experience) is possibly orthogonal to intelligence (ability to achieve complex goals). We definitely have a better handle on what intelligence is than consciousness.</p>
]]></description><pubDate>Sun, 24 Aug 2025 01:22:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=45000469</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=45000469</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45000469</guid></item><item><title><![CDATA[New comment by mmoskal in "Guid Smash"]]></title><description><![CDATA[
<p>Counting to 2^61 probably is.<p>To actually find a collision in a 128-bit cryptographic hash function, it would take closer to 2^65 hashes. Back-of-the-envelope calculations suggest that with Pollard's rho it would cost a few million dollars of CPU time at Hetzner's super-low prices. Not nearly a mere mortal's budget, but not that far off, I guess.</p>
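The 2^65 figure can be sanity-checked against the standard birthday approximation; this is just arithmetic, not a claim about any particular attack:

```python
import math

# Birthday bound for an n-bit hash: roughly 1.177 * sqrt(2^n) hashes for a
# 50% collision chance. For 128 bits that is about 2^64.2, which is why
# estimates land in the 2^64..2^65 range.

bits = 128
expected = 1.1774 * math.sqrt(2 ** bits)
log2_expected = math.log2(expected)
print(round(log2_expected, 1))  # → 64.2
```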
]]></description><pubDate>Sun, 17 Aug 2025 04:46:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=44928938</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44928938</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44928938</guid></item><item><title><![CDATA[New comment by mmoskal in "How Tesla is proving doubters right on why its robotaxi service cannot scale"]]></title><description><![CDATA[
<p>Airplanes are dirty, unsafe and unclean?</p>
]]></description><pubDate>Sun, 20 Jul 2025 17:12:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=44627166</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44627166</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44627166</guid></item><item><title><![CDATA[New comment by mmoskal in "The borrowchecker is what I like the least about Rust"]]></title><description><![CDATA[
<p>I think this is like unsafe - most of your code won’t have it, so you get the benefits of borrow checker (memory safety and race freedom) elsewhere.</p>
]]></description><pubDate>Sat, 19 Jul 2025 23:46:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=44620551</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44620551</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44620551</guid></item><item><title><![CDATA[New comment by mmoskal in "Show HN: Lambduck, a Functional Programming Brainfuck"]]></title><description><![CDATA[
<p>This seems way too readable! I think you should remove the character literals in the name of purity.<p>Also, this is likely way more compact than Brainfuck, as the lambda calculus is written essentially as usual.<p>And seriously, very cool!</p>
]]></description><pubDate>Fri, 06 Jun 2025 01:18:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=44197035</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44197035</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44197035</guid></item><item><title><![CDATA[New comment by mmoskal in "Teaching Program Verification in Dafny at Amazon (2023)"]]></title><description><![CDATA[
<p><a href="https://github.com/verus-lang/verus">https://github.com/verus-lang/verus</a> is a similar tool for Rust (developed by previous heavy users of Dafny).</p>
]]></description><pubDate>Tue, 03 Jun 2025 01:10:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=44165197</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44165197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44165197</guid></item><item><title><![CDATA[New comment by mmoskal in "Look ma, no bubbles designing a low-latency megakernel for Llama-1B"]]></title><description><![CDATA[
<p>They are reducing the forward pass time from, say, 1.5ms to 1ms. On a bigger model you would likely reduce from 15ms to 14.2ms or something like that.</p>
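A quick check of those example numbers: the absolute saving is of similar size, but the relative speedup shrinks as the forward pass grows.

```python
# Relative speedup from shaving a roughly fixed chunk off the forward pass,
# using the example numbers above (1.5ms -> 1ms small model, 15ms -> 14.2ms big).

speedups = []
for before, after in [(1.5, 1.0), (15.0, 14.2)]:
    speedups.append(before / after)
    print(f"{before}ms -> {after}ms: {before / after:.2f}x")
```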
]]></description><pubDate>Wed, 28 May 2025 03:55:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=44112610</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44112610</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44112610</guid></item><item><title><![CDATA[New comment by mmoskal in "Look ma, no bubbles designing a low-latency megakernel for Llama-1B"]]></title><description><![CDATA[
<p>The sglang and vllm numbers are with CUDA graphs enabled.<p>Having said that, a 1B model is an extreme example, hence the 1.5x speedup. For regular models and batch sizes this would probably buy you a few percent.</p>
]]></description><pubDate>Wed, 28 May 2025 03:52:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=44112597</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44112597</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44112597</guid></item><item><title><![CDATA[New comment by mmoskal in "Pyrefly vs. Ty: Comparing Python's two new Rust-based type checkers"]]></title><description><![CDATA[
<p>As mentioned in other comments, TypeScript, which follows this gradual-typing approach, has a number of flags to disable it (gradually, so to speak). There's no reason ty couldn't do the same.</p>
]]></description><pubDate>Tue, 27 May 2025 16:52:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=44108662</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44108662</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44108662</guid></item><item><title><![CDATA[New comment by mmoskal in "Project Verona: Fearless Concurrency for Python"]]></title><description><![CDATA[
<p>If you change one letter in the prompt, however insignificant you may think it is, it will change the results in unpredictable ways, even with temperature 0 etc. The same is not true of renaming a variable in a programming language, most refactorings etc.</p>
]]></description><pubDate>Sun, 18 May 2025 19:56:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=44023919</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=44023919</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44023919</guid></item><item><title><![CDATA[New comment by mmoskal in "AI is draining water from areas that need it most"]]></title><description><![CDATA[
<p>I don't know. I suspect most people rate data centers higher than almond milk...</p>
]]></description><pubDate>Sat, 10 May 2025 20:36:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=43948776</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43948776</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43948776</guid></item><item><title><![CDATA[New comment by mmoskal in "Business books are entertainment, not strategic tools"]]></title><description><![CDATA[
<p>My understanding is that printing a 300-page paperback costs something like $2 while 50 pages cost $1.50. However, you can clearly charge way more for the 300 pages, so publishers are not interested in short books, business or otherwise.</p>
]]></description><pubDate>Sat, 10 May 2025 05:57:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=43943513</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43943513</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43943513</guid></item><item><title><![CDATA[New comment by mmoskal in "Nnd – a TUI debugger alternative to GDB, LLDB"]]></title><description><![CDATA[
<p>Very typical for anything with CUDA (they tend to compile everything for 10 different architectures times hundreds of template kernel parameters).<p>Not sure about ClickHouse though.</p>
]]></description><pubDate>Tue, 06 May 2025 16:12:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=43906781</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43906781</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43906781</guid></item><item><title><![CDATA[New comment by mmoskal in "Mercury, the first commercial-scale diffusion language model"]]></title><description><![CDATA[
<p>To put this into perspective, driving for an hour in an electric car (15kW avg consumption) consumes about as much energy as 50,000 ChatGPT queries [0].
Running your laptop for an hour would be around 100 queries.<p>[0] <a href="https://epoch.ai/gradient-updates/how-much-energy-does-chatgpt-use" rel="nofollow">https://epoch.ai/gradient-updates/how-much-energy-does-chatg...</a></p>
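Checking that arithmetic, using the roughly 0.3 Wh/query estimate from the linked epoch.ai post; the 30 W laptop draw is an assumption for illustration:

```python
# Energy comparison: car vs. laptop, in ChatGPT queries.

wh_per_query = 0.3        # Wh per ChatGPT query (per the linked estimate)
car_wh = 15_000 * 1       # 15 kW average draw for one hour
laptop_wh = 30 * 1        # assumed 30 W laptop for one hour

print(round(car_wh / wh_per_query))     # → 50000
print(round(laptop_wh / wh_per_query))  # → 100
```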
]]></description><pubDate>Wed, 30 Apr 2025 23:47:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=43852009</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43852009</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43852009</guid></item><item><title><![CDATA[New comment by mmoskal in "Qwen3: Think deeper, act faster"]]></title><description><![CDATA[
<p>Spec decoding only depends on the tokenizer used. It transfers either the draft token sequence or, at most, the draft logits to the main model.</p>
]]></description><pubDate>Mon, 28 Apr 2025 23:45:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=43827323</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43827323</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43827323</guid></item><item><title><![CDATA[New comment by mmoskal in "Qwen3: Think deeper, act faster"]]></title><description><![CDATA[
<p>Just for some calibration: approximately no one runs 32-bit for LLMs on any sort of iron, big or otherwise. Some models (e.g. DeepSeek V3, and derivatives like R1) are native FP8. FP8 was also common for Llama 3 405B serving.</p>
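The precision choice translates directly into weight memory; a quick back-of-the-envelope (parameter count times bytes per parameter, ignoring KV cache and activations):

```python
# Weight memory alone for a 405B-parameter model at different precisions.

params = 405e9
footprint = {name: params * b / 1e9  # GB
             for name, b in [("FP32", 4), ("FP16/BF16", 2), ("FP8", 1)]}
for name, gb in footprint.items():
    print(f"{name}: {gb:.0f} GB")
```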
]]></description><pubDate>Mon, 28 Apr 2025 23:17:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=43827140</link><dc:creator>mmoskal</dc:creator><comments>https://news.ycombinator.com/item?id=43827140</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43827140</guid></item></channel></rss>