Hacker News: danielhanchen

New comment by danielhanchen in "Unsloth Joins PyTorch Ecosystem"

danielhanchen — Mon, 11 May 2026 15:15:35 +0000

Thank you, appreciate the support! It's all thanks to you guys and the community!

New comment by danielhanchen in "Making LLM Training Faster with Unsloth and NVIDIA"

danielhanchen — Thu, 07 May 2026 10:24:14 +0000

Update - Just got rid of the spiced up intro

New comment by danielhanchen in "Making LLM Training Faster with Unsloth and NVIDIA"

danielhanchen — Thu, 07 May 2026 10:22:58 +0000

Thank you!

New comment by danielhanchen in "Making LLM Training Faster with Unsloth and NVIDIA"

danielhanchen — Thu, 07 May 2026 10:22:32 +0000

Oh thanks :) We're also going to add MTP support soon for Qwen3.6!

95% of it is fully human done - the maths, algos, code snippets, screenshots & benchmarks are done / conducted by us and NVIDIA :)

We did use AI to fix spelling errors + made some nice plots using Chat (ours would look horrible lol)

Update - Just got rid of the spiced up intro

Mistral Medium 3.5 YaRN bug fix

danielhanchen — Sat, 02 May 2026 21:23:21 +0000

Article URL: https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/discussions/18

Comments URL: https://news.ycombinator.com/item?id=47990697

Points: 1

# Comments: 0

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 06:13:50 +0000

Sorry for the delay - it installs https://github.com/Blaizzy/mlx-vlm and other components and sets up the commands. You don't need to use it, but we thought it might be easier for folks.

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 06:13:08 +0000

Sorry for the delay - oh haha that would be cool :) We did release 2-bit dynamic ones, but we're unsure if they'll be helpful

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 06:12:37 +0000

Yes we do! Sorry for the delay

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 05:21:48 +0000

We use DuckDuckGo - sorry for the delayed response as well

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 05:21:28 +0000

Thank you, appreciate it! Sorry for the delayed reply as well

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 05:21:14 +0000

Oh yes LM Link is cool!

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 05:21:04 +0000

Hey, sorry for the delay - we just added API support, so you can access a remote server. It includes optional Python, tool call, bash and web search support if you enable them.

For SSH - we haven't done that yet. For now we have a SHA256-based approach, but it's not SSH yet. HTTPS will sadly also have to be set up by the end user - we plan to make it better soon!

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 05:19:08 +0000

Hey! Sorry for not replying sooner - yes we'll keep publishing more KLD numbers. Sadly some are saying we are "optimizing" for KLD now since we posted so many haha, but the whole purpose of quantization is to match the BF16 logits as closely as possible whilst reducing disk space (i.e. reduce KLD).

In general, this is a funny quirk of quantization - sometimes 8bit and 4bit models do BETTER on downstream benchmarks (SWE Bench for example), since rounding can sometimes act as a "regularization" method (this is just my hunch).

KLD isn't that expensive, since we leverage the trick of causal attention - since causal attention is lower triangular, we can do 1 forward pass on the entire text (say 2048 tokens), and you get the logits for the prediction at every token's position - so this is O(N^2) for the whole sequence.
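A minimal sketch of the per-position KLD measurement described above, not the actual pipeline. It assumes you already have two sets of logits from a single forward pass each (one row per token position): one from the BF16 reference and one from the quantized model. The function names and the toy logit values are made up for illustration.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one row of logits.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(ref_logits, quant_logits):
    # KL(P_ref || P_quant) for a single token position, in nats.
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def mean_kld(ref_rows, quant_rows):
    # Average the per-position KLD over every position in the sequence.
    per_position = [kl_divergence(r, q) for r, q in zip(ref_rows, quant_rows)]
    return sum(per_position) / len(per_position)

# Toy placeholder logits: 3 token positions, vocabulary of 4 (not real model output).
ref_rows = [[2.0, 1.0, 0.1, -1.0], [0.5, 0.5, 3.0, 0.0], [1.0, -2.0, 0.3, 0.7]]
quant_rows = [[1.9, 1.1, 0.0, -1.2], [0.4, 0.6, 2.8, 0.1], [1.1, -1.8, 0.2, 0.6]]

print("KLD of reference against itself:", mean_kld(ref_rows, ref_rows))
print("KLD of reference against quantized:", mean_kld(ref_rows, quant_rows))
```

Because every position's logits come out of the same forward pass, the loop above never has to generate tokens one at a time.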

However, coding benchmarks require actual inference and cannot use the causal attention trick, and it's best to run them 10 times and take an average, since temperature = 1.0 is not deterministic. We plan to maybe do something like https://marginlab.ai/trackers/claude-code/, which takes a random sample and does it over time.
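As a rough sketch of the "run it 10 times and take an average" idea, assuming each run gives a single pass rate between 0 and 1. The helper name and the scores below are placeholders, not measured results.

```python
import math
import statistics

def summarize_runs(scores):
    # Mean pass rate across repeated runs, plus the standard error of that mean.
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores) if len(scores) > 1 else 0.0
    stderr = stdev / math.sqrt(len(scores))
    return mean, stderr

# Placeholder pass rates from 10 hypothetical runs at temperature = 1.0.
placeholder_scores = [0.61, 0.58, 0.63, 0.60, 0.59, 0.62, 0.57, 0.64, 0.60, 0.61]

mean, stderr = summarize_runs(placeholder_scores)
print(f"mean pass rate: {mean:.3f} +/- {stderr:.3f} (standard error)")
```

Reporting the standard error next to the mean makes it easier to tell whether a gap between two quants is larger than the run-to-run noise.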

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 05:12:17 +0000

Hey, so sorry I didn't reply sooner - yes, the Docker image used to be 4-8GB ish I think, since CUDA itself is sadly around 4GB I think, and PyTorch takes the rest. So unfortunately the Unsloth Docker image has ballooned due to this. We tried reducing it as much as possible, but it's hard :( https://hub.docker.com/r/vllm/vllm-openai/tags for example is around 11GB ish, and we're at 13.6GB ish.

We'll try our best to compress it more, but it's tough

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 05:09:44 +0000

Apologies as well that I didn't reply sooner - Studio supports AMD out of the box now! We worked with AMD to make it work! One thing that is still missing is pre-compiled AMD ROCm binaries, which we're trying to see if we can integrate.

Interesting on diskpart - let me check and get back to you. [EDIT] Visual Studio Build Tools, Python 3.13, Git, CMake and Node.js are all MSI-based installers, so these are likely the culprits for diskpart being used - essentially MSI installers check if there's enough disk space before installing items.

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Thu, 30 Apr 2026 05:07:38 +0000

Oh my apologies I didn't respond - if only HN had a notifier haha

Oh yes we added a custom folder button which can pull .gguf files from any folder for now - it supports LM Studio and Ollama ones - but agreed it's still a mess.

One of the goals is to somehow quick search for .gguf folders, and add recommended folders - we currently have folders for Ollama and LM Studio for eg

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Wed, 22 Apr 2026 16:12:47 +0000

We made Unsloth Studio which should help :)

1. Automatically sets the best official parameters for all models

2. Automatically determines the largest quant that can fit on your PC / Mac etc

3. Automatically determines the max context length

4. Automatically heals tool calls, and provides python & bash + web search :)
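To give a feel for point 2, here is a purely illustrative sketch of picking the largest quant that fits in memory. This is not how Unsloth Studio actually does it: the function name, the headroom value, the quant labels and their sizes are all hypothetical placeholders.

```python
def pick_largest_fitting_quant(quant_sizes_gb, available_gb, headroom_gb=2.0):
    # Keep some memory free for context and runtime overhead (assumed value).
    budget = available_gb - headroom_gb
    fitting = {name: size for name, size in quant_sizes_gb.items() if size <= budget}
    if not fitting:
        return None  # Nothing fits, even the smallest quant.
    # The largest file that still fits is taken as the highest-quality choice.
    return max(fitting, key=fitting.get)

# Hypothetical quant sizes in GB, for illustration only.
placeholder_quants = {"quant_2bit": 10.0, "quant_4bit": 16.0, "quant_8bit": 28.0}

print(pick_largest_fitting_quant(placeholder_quants, available_gb=24.0))
print(pick_largest_fitting_quant(placeholder_quants, available_gb=8.0))
```

A real selector would also have to account for context length and KV cache size, which this sketch folds into a single fixed headroom.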

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Wed, 22 Apr 2026 15:50:37 +0000

Haha :)

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Wed, 22 Apr 2026 15:50:16 +0000

Haha :) We had some issues with Kimi-2.6 since it was int4 and we were investigating how to handle it :)

New comment by danielhanchen in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

danielhanchen — Wed, 22 Apr 2026 15:49:29 +0000

We also made some dynamic MLX ones if they help - they might be faster on Macs, but llama-server is definitely improving at a fast pace.

https://huggingface.co/unsloth/Qwen3.6-27B-UD-MLX-4bit