<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: danielhanchen</title><link>https://news.ycombinator.com/user?id=danielhanchen</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 15 Jun 2026 18:17:14 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=danielhanchen" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by danielhanchen in "Unsloth Joins PyTorch Ecosystem"]]></title><description><![CDATA[
<p>Thank you appreciate the support! It's all thanks to you guys and the community!</p>
]]></description><pubDate>Mon, 11 May 2026 15:15:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=48096146</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=48096146</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48096146</guid></item><item><title><![CDATA[New comment by danielhanchen in "Making LLM Training Faster with Unsloth and NVIDIA"]]></title><description><![CDATA[
<p>Update - Just got rid of the spiced up intro</p>
]]></description><pubDate>Thu, 07 May 2026 10:24:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48047664</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=48047664</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48047664</guid></item><item><title><![CDATA[New comment by danielhanchen in "Making LLM Training Faster with Unsloth and NVIDIA"]]></title><description><![CDATA[
<p>Thank you!</p>
]]></description><pubDate>Thu, 07 May 2026 10:22:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=48047659</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=48047659</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48047659</guid></item><item><title><![CDATA[New comment by danielhanchen in "Making LLM Training Faster with Unsloth and NVIDIA"]]></title><description><![CDATA[
<p>Oh thanks :) We're also going to add MTP support soon for Qwen3.6!<p>95% of it is fully human done - the maths, algos, code snippets, screenshots & benchmarks are done / conducted by us and NVIDIA :)<p>We did use AI to fix spelling errors + made some nice plots using Chat (ours would look horrible lol)<p>Update - Just got rid of the spiced up intro</p>
]]></description><pubDate>Thu, 07 May 2026 10:22:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48047651</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=48047651</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48047651</guid></item><item><title><![CDATA[Mistral Medium 3.5 YaRN bug fix]]></title><description><![CDATA[
<p>Article URL: <a href="https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/discussions/18">https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/discussions/18</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47990697">https://news.ycombinator.com/item?id=47990697</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 02 May 2026 21:23:21 +0000</pubDate><link>https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/discussions/18</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47990697</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47990697</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Sorry on the delay - so it installs <a href="https://github.com/Blaizzy/mlx-vlm" rel="nofollow">https://github.com/Blaizzy/mlx-vlm</a> and other components and sets up the commands - you don't need to use it but we thought it might be easier for folks</p>
]]></description><pubDate>Thu, 30 Apr 2026 06:13:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958823</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958823</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958823</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Sorry on the delay - oh haha that would be cool :) We did release 2bit dynamic ones, but unsure if they'll be helpful</p>
]]></description><pubDate>Thu, 30 Apr 2026 06:13:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958818</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958818</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958818</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Yes we do! Sorry on the delay</p>
]]></description><pubDate>Thu, 30 Apr 2026 06:12:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958814</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958814</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958814</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>We use Duck Duck Go - sorry on the delayed response as well</p>
]]></description><pubDate>Thu, 30 Apr 2026 05:21:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958476</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958476</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958476</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Thank you and appreciate it! Sorry on the delayed reply as well</p>
]]></description><pubDate>Thu, 30 Apr 2026 05:21:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958472</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958472</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958472</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Oh yes LM Link is cool!</p>
]]></description><pubDate>Thu, 30 Apr 2026 05:21:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958471</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958471</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958471</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Hey sorry on the delay - we just added API support, so you can access a remote server - it includes optional python, tool call, bash and web search support if you enable them.<p>For SSH - we haven't yet done that - for now we have a SHA256 encryption approach, but it's not SSH yet. HTTPS will also sadly have to be the end user's setup process as well - we plan to make it better soon!</p>
]]></description><pubDate>Thu, 30 Apr 2026 05:21:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958470</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958470</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958470</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Hey! Sorry for not replying sooner - yes we'll keep publishing more KLD - sadly some are saying we are "optimizing" for KLD now since we posted so many haha - but the whole purpose of quantization is to match the BF16 logits as much as possible whilst reducing disk space (ie reduce KLD).<p>In general so this is funny and a quirk of quantization - sometimes 8bit, 4bit models do BETTER on downstream benchmarks (SWE Bench for eg), since sometimes rounding can actually somehow act as a "regularization" method (this is just my hunch).<p>So KLD isn't that expensive, since we leverage the trick of causal attention - since causal attention is lower triangular, we can do 1 forward pass on the entire text (say 2048 tokens), and you obtain logits for the prediction at every token's position - so this is O(N^2).<p>However coding benchmarks require actual inference, and cannot use the causal attention trick, and it's best to run them 10 times since temperature = 1.0 is not deterministic - and take an average. We plan to maybe do something like <a href="https://marginlab.ai/trackers/claude-code/" rel="nofollow">https://marginlab.ai/trackers/claude-code/</a>, which takes a random sample and does it over time.</p>
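<p>A minimal sketch of the KLD measurement described above, not Unsloth's actual code: it assumes you already have per-position logits from one forward pass of the BF16 model and one of the quantized model over the same text. The names log_softmax and mean_kld are hypothetical helpers, and the toy logits are made up for illustration.</p>

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def mean_kld(ref_logits, quant_logits):
    """Mean KL(ref || quant) over token positions.

    Each argument is a list of per-position logit vectors, which a causal
    model produces for every position in a single forward pass.
    """
    total = 0.0
    for ref, quant in zip(ref_logits, quant_logits):
        log_p = log_softmax(ref)    # reference (e.g. BF16) distribution
        log_q = log_softmax(quant)  # quantized model distribution
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return total / len(ref_logits)


# Toy example: 2 token positions, vocabulary of 3 (made-up numbers).
bf16_logits = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
quant_logits = [[1.9, 1.1, 0.0], [0.4, 0.7, 2.8]]
print(mean_kld(bf16_logits, quant_logits))
```

<p>Identical logits give a KLD of 0, and the value grows as the quantized distribution drifts away from the reference, which is why lower is better here.</p>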
]]></description><pubDate>Thu, 30 Apr 2026 05:19:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958453</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958453</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958453</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Hey so sorry didn't reply sooner - yes the docker used to be I think 4-8GB ish since CUDA sadly itself is 4GB I think, and PyTorch takes the rest. So unfortunately the Unsloth Docker image has ballooned due to this. We tried reducing it as much as possible, but it's hard :( <a href="https://hub.docker.com/r/vllm/vllm-openai/tags" rel="nofollow">https://hub.docker.com/r/vllm/vllm-openai/tags</a> for eg is around 11GB ish, and we're 13.6GB ish.<p>We'll try our best to compress it more, but it's tough</p>
]]></description><pubDate>Thu, 30 Apr 2026 05:12:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958401</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958401</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958401</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Apologies as well didn't reply sooner - Studio supports AMD out of the box now! We worked with AMD to make it work! One thing that is still missing is pre-compiled AMD ROCm binaries, which we're trying to see if we can integrate.<p>Interesting on diskpart - let me check and get back to you [EDIT] - Visual Studio Build Tools, Python 3.13, Git, CMake, Node.js are all MSI-based installers - so these are likely the culprits on using diskpart - essentially MSI installers check if there's enough disk space before installing items</p>
]]></description><pubDate>Thu, 30 Apr 2026 05:09:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958386</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958386</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958386</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Oh my apologies I didn't respond - if only HN had a notifier haha<p>Oh yes we added a custom folder button which can pull .gguf files for now from any folder - it supports LM Studio and Ollama ones - but agreed it's still a mess.<p>One of the goals is to somehow quick search for .gguf folders, and add recommended folders - we currently have folders for Ollama and LM Studio for eg</p>
]]></description><pubDate>Thu, 30 Apr 2026 05:07:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958366</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47958366</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958366</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>We made Unsloth Studio which should help :)<p>1. Auto best official parameters set for all models<p>2. Auto determines the largest quant that can fit on your PC / Mac etc<p>3. Auto determines max context length<p>4. Auto heals tool calls, provides python & bash + web search :)</p>
]]></description><pubDate>Wed, 22 Apr 2026 16:12:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47865679</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47865679</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47865679</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Haha :)</p>
]]></description><pubDate>Wed, 22 Apr 2026 15:50:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47865374</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47865374</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47865374</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Haha :) We had some issues with Kimi-2.6 since it was int4 and we were investigating how to handle it :)</p>
]]></description><pubDate>Wed, 22 Apr 2026 15:50:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47865365</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47865365</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47865365</guid></item><item><title><![CDATA[New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>We also made some dynamic MLX ones if they help - it might be faster for Macs, but llama-server definitely is improving at a fast pace.<p><a href="https://huggingface.co/unsloth/Qwen3.6-27B-UD-MLX-4bit" rel="nofollow">https://huggingface.co/unsloth/Qwen3.6-27B-UD-MLX-4bit</a></p>
]]></description><pubDate>Wed, 22 Apr 2026 15:49:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=47865351</link><dc:creator>danielhanchen</dc:creator><comments>https://news.ycombinator.com/item?id=47865351</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47865351</guid></item></channel></rss>