<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: Yukonv</title><link>https://news.ycombinator.com/user?id=Yukonv</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 08 Apr 2026 10:37:47 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=Yukonv" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by Yukonv in "GLM-5.1: Towards Long-Horizon Tasks"]]></title><description><![CDATA[
<p>Unsloth quantizations are available on release as well. [0] The IQ4_XS is a massive 361 GB for the 754B-parameter model. This is definitely a model your average local LLM enthusiast is not going to be able to run, even with high-end hardware.<p>[0] <a href="https://huggingface.co/unsloth/GLM-5.1-GGUF" rel="nofollow">https://huggingface.co/unsloth/GLM-5.1-GGUF</a></p>
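As a sanity check on that size, pure arithmetic from the two numbers above:

```python
# Effective bits per weight implied by the download size
# (361 GB file, 754B params, both from the comment above).
file_bytes = 361e9
params = 754e9
bpw = file_bytes * 8 / params
print(f"{bpw:.2f} bits/weight")  # ~3.83, consistent with a ~4-bit mixed quant like IQ4_XS
```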
]]></description><pubDate>Tue, 07 Apr 2026 17:06:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47678337</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=47678337</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47678337</guid></item><item><title><![CDATA[New comment by Yukonv in "Running Gemma 4 locally with LM Studio's new headless CLI and Claude Code"]]></title><description><![CDATA[
<p>With that you are taking a significant performance penalty and becoming severely I/O bottlenecked. I've been able to stream Qwen3.5-397B-A17B from my M5 Max (12 GB/s SSD read) using the Flash MoE technique at the brisk pace of 10 tokens per second. As tokens are generated, different experts need to be consulted, resulting in a lot of I/O churn. So while feasible, it's only great for batch jobs, not interactive usage.</p>
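A rough model of why SSD streaming can still reach double-digit tok/s. The 12 GB/s read speed and ~17B active params come from above; the quant density and the cold fraction are my guesses, not measurements:

```python
# Back-of-envelope for SSD-streamed MoE decode speed. Decode is bounded by how
# many bytes of expert weights must actually be read from disk per token.
ssd_read_bps = 12e9        # M5 Max SSD sequential read (from the comment)
active_params = 17e9       # A17B: parameters consulted per token (from the comment)
bytes_per_param = 0.56     # ~4.5 bits/param at a Q4-class quant (assumed)
cold_fraction = 0.10       # share of active weights that miss the RAM cache (assumed)

bytes_per_token = active_params * bytes_per_param * cold_fraction
tokens_per_sec = ssd_read_bps / bytes_per_token
print(f"~{tokens_per_sec:.0f} tok/s")  # ~13, in the ballpark of the observed 10
```

If every active weight streamed cold (cold_fraction = 1.0), the same math gives ~1.3 tok/s, which is why keeping hot experts resident in RAM matters so much.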
]]></description><pubDate>Sun, 05 Apr 2026 21:38:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47654135</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=47654135</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47654135</guid></item><item><title><![CDATA[New comment by Yukonv in "April 2026 TLDR Setup for Ollama and Gemma 4 26B on a Mac mini"]]></title><description><![CDATA[
<p>The latest release, v0.3.2, has partial support: generation works, but not all special tokens are handled. I've done some personal testing to add tool calling and <|channel> thinking support. <a href="https://github.com/Yukon/omlx" rel="nofollow">https://github.com/Yukon/omlx</a></p>
]]></description><pubDate>Fri, 03 Apr 2026 15:02:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47627460</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=47627460</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47627460</guid></item><item><title><![CDATA[New comment by Yukonv in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>The model does have the format specified, but there is no _one_ standard. For this model it’s defined in tokenizer_config.json [0]. As for llama.cpp, they seem to be using a more type-safe approach to reading the arguments.<p>[0] <a href="https://huggingface.co/google/gemma-4-31B-it/blob/main/tokenizer_config.json#L37" rel="nofollow">https://huggingface.co/google/gemma-4-31B-it/blob/main/token...</a></p>
]]></description><pubDate>Thu, 02 Apr 2026 23:46:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47621689</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=47621689</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47621689</guid></item><item><title><![CDATA[New comment by Yukonv in "Ollama is now powered by MLX on Apple Silicon in preview"]]></title><description><![CDATA[
<p>Good to see Ollama catching up with the times for inference on Mac. MLX-powered inference makes a big difference, especially on M5, as their graphs point out.
What has really been a game changer for my workflow is <a href="https://omlx.ai/" rel="nofollow">https://omlx.ai/</a>, which has SSD KV cold caching. I no longer have to worry about a session falling out of memory and needing to prefill again. Combine that with the M5 Max prefill speed and more time is spent on generation than waiting for a 50k+ context window to process.</p>
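For anyone wondering what SSD KV cold caching means in practice, here is a minimal generic sketch of the idea (my illustration, not omlx's actual implementation): key the KV state by a hash of the prompt prefix, spill it to disk when the session goes cold, and reload it instead of re-prefilling.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

# Generic sketch of KV cold caching: persist a session's KV state to disk keyed
# by its prompt prefix, so a cold session is restored without re-running prefill.
CACHE_DIR = Path(tempfile.gettempdir()) / "kv_cold_cache"
CACHE_DIR.mkdir(exist_ok=True)

def _cache_path(prompt_prefix: str) -> Path:
    digest = hashlib.sha256(prompt_prefix.encode()).hexdigest()
    return CACHE_DIR / f"{digest}.pkl"

def spill_kv(prompt_prefix: str, kv_state) -> None:
    """Write the KV state for this prefix out to SSD."""
    _cache_path(prompt_prefix).write_bytes(pickle.dumps(kv_state))

def restore_kv(prompt_prefix: str):
    """Load a previously spilled KV state, or None if we must prefill."""
    path = _cache_path(prompt_prefix)
    return pickle.loads(path.read_bytes()) if path.exists() else None

spill_kv("You are a helpful assistant...", {"layer_0": [0.1, 0.2]})
print(restore_kv("You are a helpful assistant..."))  # restored without prefill
```

A real implementation would store tensors in a binary format rather than pickle, but the cache-by-prefix lookup is the core of why long sessions survive eviction.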
]]></description><pubDate>Tue, 31 Mar 2026 07:03:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47583733</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=47583733</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47583733</guid></item><item><title><![CDATA[New comment by Yukonv in "iPhone 17 Pro Demonstrated Running a 400B LLM"]]></title><description><![CDATA[
<p>That’s exactly what I thought about. I’m getting my hands on an M5 Max this week and going to see how Dan’s experiment performs with faster I/O. I’m also going to experiment with running the active parameters at Q6 or Q8; since output is I/O bottlenecked, there should be room for higher-accuracy compute.</p>
]]></description><pubDate>Mon, 23 Mar 2026 19:12:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47493847</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=47493847</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47493847</guid></item><item><title><![CDATA[Is AI capable of Intelligent Disobedience? [video]]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.youtube.com/watch?v=Qu-00j9XuF0">https://www.youtube.com/watch?v=Qu-00j9XuF0</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47470321">https://news.ycombinator.com/item?id=47470321</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 21 Mar 2026 19:18:59 +0000</pubDate><link>https://www.youtube.com/watch?v=Qu-00j9XuF0</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=47470321</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47470321</guid></item><item><title><![CDATA[New comment by Yukonv in "Rust error handling"]]></title><description><![CDATA[
<p>Another good option I’ve personally used, if you want a smaller API surface with just the Result and Maybe concepts, is True Myth. <a href="https://true-myth.github.io/true-myth-csharp/" rel="nofollow">https://true-myth.github.io/true-myth-csharp/</a></p>
]]></description><pubDate>Sun, 15 Sep 2024 01:39:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=41544529</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=41544529</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41544529</guid></item><item><title><![CDATA[New comment by Yukonv in "High-speed 10Gbps full-mesh network based on USB4 for just $47.98"]]></title><description><![CDATA[
<p>Related: Intel was showing off Thunderbolt Share at CES[1]. It allows Thunderbolt 4/5 device-to-device transfer of files, with theoretical speeds of 20Gbps and 40Gbps for Thunderbolt 4 and 5 respectively.<p>One idea for why they were only able to reach 11Gbps is having only one Thunderbolt/USB4 controller[2], meaning the two USB4 ports split the 40Gbps PCIe lane. Throw in a full-duplex connection and you get 10Gbps in one direction.<p>[1] <a href="https://youtu.be/GqCwLjhb4YY?t=81" rel="nofollow">https://youtu.be/GqCwLjhb4YY?t=81</a>
[2] Just a theory, but it seems like a sane assumption.</p>
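The arithmetic behind that theory, spelled out (the shared 40Gbps controller budget is the assumption from [2]):

```python
# If one controller's 40Gbps PCIe budget is split across two ports, and a
# full-duplex transfer further splits each port's share across two directions:
pcie_budget_gbps = 40.0   # single Thunderbolt/USB4 controller uplink (assumed)
ports = 2                 # two USB4 ports sharing the controller
directions = 2            # full-duplex traffic
per_direction_gbps = pcie_budget_gbps / ports / directions
print(per_direction_gbps)  # 10.0, near the ~11Gbps they measured
```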
]]></description><pubDate>Mon, 15 Jan 2024 19:38:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=39005132</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=39005132</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39005132</guid></item><item><title><![CDATA[New comment by Yukonv in "Super Mario 64 on the Web"]]></title><description><![CDATA[
<p>Great find! Seems to be a common issue with games; I found the same issue trying to auto-play PICO-8 cartridges on the web.</p>
]]></description><pubDate>Thu, 11 Jan 2024 03:15:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=38947058</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=38947058</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38947058</guid></item><item><title><![CDATA[New comment by Yukonv in "Super Mario 64 on the Web"]]></title><description><![CDATA[
<p>Edit: Keyboard input does not work :/ it was a good effort.<p>Found a workaround: throw it in an iframe and have the frame load with a user interaction. Here is a jsFiddle link, just click "Run" after the page loads.<p><a href="https://jsfiddle.net/sg1r3h60/" rel="nofollow">https://jsfiddle.net/sg1r3h60/</a></p>
]]></description><pubDate>Thu, 11 Jan 2024 03:00:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=38946940</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=38946940</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38946940</guid></item><item><title><![CDATA[New comment by Yukonv in "Super Mario 64 on the Web"]]></title><description><![CDATA[
<p>Firefox prevents audio from playing without an initial user interaction like clicking a play button. You can see the warning if you pop open the dev console. I don't know of a workaround besides sites not creating an AudioContext on page load.</p>
]]></description><pubDate>Thu, 11 Jan 2024 02:30:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=38946698</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=38946698</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38946698</guid></item><item><title><![CDATA[Starlink Maritime]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.starlink.com/maritime">https://www.starlink.com/maritime</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=32018218">https://news.ycombinator.com/item?id=32018218</a></p>
<p>Points: 606</p>
<p># Comments: 508</p>
]]></description><pubDate>Thu, 07 Jul 2022 19:26:04 +0000</pubDate><link>https://www.starlink.com/maritime</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=32018218</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32018218</guid></item><item><title><![CDATA[New comment by Yukonv in "Ask HN: Is your company sticking to on-premise servers? Why?"]]></title><description><![CDATA[
<p>A little late, but thought I would say hi. I too got started programming thanks to Minecraft. My first real job was working at Overcast Network (oc.tc). I remember having to scale out our infrastructure to seven dedicated servers after a popular YouTuber featured us. At the time that felt crazy for a Minecraft server, and here you are now with hundreds of servers. Huge congrats on scaling to where you are today.<p>Have lots of fond memories of those early years, especially Minecon 2013.</p>
]]></description><pubDate>Wed, 13 May 2020 04:14:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=23163406</link><dc:creator>Yukonv</dc:creator><comments>https://news.ycombinator.com/item?id=23163406</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=23163406</guid></item></channel></rss>