<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: threeducks</title><link>https://news.ycombinator.com/user?id=threeducks</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 03 Apr 2026 22:44:31 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=threeducks" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by threeducks in "Claude Opus 4.6"]]></title><description><![CDATA[
<p>It is too easy to jailbreak the models with prefill, which is probably why it was removed. But I like that this pushes people towards open source models. llama.cpp supports prefill and even GBNF grammars [1], which is useful if you are working with a custom programming language, for example.<p>[1] <a href="https://github.com/ggml-org/llama.cpp/blob/master/grammars/README.md" rel="nofollow">https://github.com/ggml-org/llama.cpp/blob/master/grammars/R...</a></p>
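To give a flavor, here is a toy GBNF grammar of my own (following the syntax from the README linked above) that constrains the model to a bare yes/no answer:

```
# Toy grammar: the model may only emit "yes" or "no".
root ::= "yes" | "no"
```

Passed to llama.cpp via --grammar-file, sampling is restricted to tokens that keep the output inside the grammar, which is what makes it practical for custom programming languages.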
]]></description><pubDate>Thu, 05 Feb 2026 18:44:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=46903237</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46903237</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46903237</guid></item><item><title><![CDATA[New comment by threeducks in "Writing an eigenvalue solver in Rust for WebAssembly"]]></title><description><![CDATA[
<p>A while ago, I also implemented a dense eigenvalue solver in Python following a similar approach, but found that it did not converge in O(n^3) as sometimes claimed in literature. I then read about the Divide-and-conquer eigenvalue algorithm, which did the trick. It seems to have a reasonable Wikipedia page these days: <a href="https://en.wikipedia.org/wiki/Divide-and-conquer_eigenvalue_algorithm" rel="nofollow">https://en.wikipedia.org/wiki/Divide-and-conquer_eigenvalue_...</a></p>
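For anyone curious what the slow baseline looks like: a minimal unshifted QR iteration in NumPy (a toy sketch, not my actual solver; it assumes a symmetric matrix with distinct eigenvalue magnitudes). Each sweep costs O(n^3) on its own, and convergence can be slow enough that the total work blows past O(n^3):

```python
import numpy as np

def qr_eigenvalues(A, iters=500):
    """Unshifted QR iteration: repeatedly factor A = QR and form RQ.

    For symmetric A with distinct eigenvalue magnitudes, the iterates
    converge to a diagonal matrix holding the eigenvalues.
    """
    A = np.array(A, dtype=float)
    for _ in range(iters):
        Q, R = np.linalg.qr(A)
        A = R @ Q  # similarity transform: same eigenvalues as before
    return np.sort(np.diag(A))

M = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
print(qr_eigenvalues(M))  # matches np.linalg.eigvalsh(M)
```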
]]></description><pubDate>Wed, 07 Jan 2026 00:21:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=46520873</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46520873</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46520873</guid></item><item><title><![CDATA[New comment by threeducks in "Web development is fun again"]]></title><description><![CDATA[
<p>Could you explain how? I can't seem to figure it out.<p>DeepSeek-V3.2-Exp has 37B active parameters; GLM-4.7 and Kimi K2 have 32B active parameters.<p>Let's say we are dealing with Q4_K_S quantization at roughly half a byte per weight: we still need to move 16 GB of weights 30 times per second, which requires a memory bandwidth of 480 GB/s, or maybe half that if speculative decoding works really well.<p>Anything GPU-based won't work at that speed, because PCIe 5 provides only 64 GB/s, and $2000 cannot buy enough VRAM (~256 GB) for a full model.<p>That leaves CPU-based systems with high memory bandwidth. DDR5 would work (somewhere around 300 GB/s with 8x 4800 MHz modules), but the RAM alone would cost about twice as much, disregarding the rest of the system.<p>Can you get enough memory bandwidth out of DDR4 somehow?</p>
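The arithmetic behind those numbers, as a back-of-the-envelope sketch (0.5 bytes/weight is my rough Q4_K_S assumption; KV cache and overhead ignored):

```python
# Every generated token has to stream all active weights from memory once.
active_params = 32e9        # GLM/Kimi-class MoE, active parameters
bytes_per_weight = 0.5      # ~Q4_K_S quantization
tokens_per_second = 30      # target generation speed

bytes_per_token = active_params * bytes_per_weight        # 16 GB per token
required_bandwidth = bytes_per_token * tokens_per_second  # bytes per second
print(f"{required_bandwidth / 1e9:.0f} GB/s")             # 480 GB/s
```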
]]></description><pubDate>Mon, 05 Jan 2026 08:23:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=46496462</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46496462</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46496462</guid></item><item><title><![CDATA[New comment by threeducks in "NSFW Acronyms for Programmers (Free eBook)"]]></title><description><![CDATA[
<p>I have tried a few Qwen-2.5 and 3.0 models (<=30B), even abliterated ones, but it seems that some words have been completely wiped from their pretraining dataset. No amount of prompting can bring back what has never been there.<p>For comparison, I have also tried the smaller Mistral models, which have a much more complete vocabulary, but their writing sometimes lacks continuity.<p>I have not tried the larger models due to lack of VRAM.</p>
]]></description><pubDate>Sun, 04 Jan 2026 22:47:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=46493198</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46493198</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46493198</guid></item><item><title><![CDATA[New comment by threeducks in "NSFW Acronyms for Programmers (Free eBook)"]]></title><description><![CDATA[
<p>> Are .pdfs and .epub safe these days?<p>Depends on the viewer. Acrobat Reader? Probably not. PDF.js in some browser? Probably safe enough unless you are extremely rich.</p>
]]></description><pubDate>Sun, 04 Jan 2026 22:27:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=46492987</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46492987</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46492987</guid></item><item><title><![CDATA[New comment by threeducks in "NSFW Acronyms for Programmers (Free eBook)"]]></title><description><![CDATA[
<p>That was my experience as well. Sometimes, LLMs were a big help, but other times, my efforts would have been better spent writing things myself. I always tell myself that experience will make me choose correctly next time, but then a new model is released and things are different yet again.</p>
]]></description><pubDate>Sun, 04 Jan 2026 22:22:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=46492942</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46492942</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46492942</guid></item><item><title><![CDATA[New comment by threeducks in "NSFW Acronyms for Programmers (Free eBook)"]]></title><description><![CDATA[
<p>> But if you are insinuating AI made all this up on it's own, I have to disappoint you.<p>No worries, I am not a native English speaker myself. I was genuinely interested in whether commercial LLMs would use "bad" words without some convincing.</p>
]]></description><pubDate>Sun, 04 Jan 2026 22:04:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=46492767</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46492767</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46492767</guid></item><item><title><![CDATA[New comment by threeducks in "NSFW Acronyms for Programmers (Free eBook)"]]></title><description><![CDATA[
<p>Did you have to use any special prompts when using LLMs for writing assistance, or did it just work?</p>
]]></description><pubDate>Sun, 04 Jan 2026 21:37:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=46492549</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46492549</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46492549</guid></item><item><title><![CDATA[New comment by threeducks in "Total monthly number of StackOverflow questions over time"]]></title><description><![CDATA[
<p>The last data point is from January 2026, which has just begun. If you extrapolate the 321 questions by multiplying by 10 to account for the remaining 90% of the month, you get within the same order of magnitude as December 2025 (3862). The small difference is probably due to the turn of the year.</p>
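Spelled out (the factor of 10 is my rough guess for how little of January had elapsed):

```python
questions_so_far = 321              # partial January 2026 count (first ~3 days)
estimate = questions_so_far * 10    # scale up for the remaining ~90% of the month
december_2025 = 3862
print(estimate, december_2025)      # 3210 vs 3862: same order of magnitude
```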
]]></description><pubDate>Sun, 04 Jan 2026 09:07:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=46486301</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46486301</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46486301</guid></item><item><title><![CDATA[New comment by threeducks in "Total monthly number of StackOverflow questions over time"]]></title><description><![CDATA[
<p>Those popups were a big part of why I stopped using SO. I stopped updating my uBlock Origin rules when LLMs became good enough. I am now using the free Kimi K2 model via Groq from the CLI, which is much faster.</p>
]]></description><pubDate>Sun, 04 Jan 2026 08:56:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=46486239</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46486239</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46486239</guid></item><item><title><![CDATA[New comment by threeducks in "Karpathy on Programming: “I've never felt this much behind”"]]></title><description><![CDATA[
<p>> This is from the man who has no finished open source projects<p>To be fair, which open source project can really claim to be "finished", and what does "finished" even mean?<p>The only projects that I can truly call "finished" are those that I have laid to rest because they were superseded by newer technologies, not because they achieved completeness; there is always more to do.</p>
]]></description><pubDate>Mon, 29 Dec 2025 22:43:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=46426794</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46426794</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46426794</guid></item><item><title><![CDATA[New comment by threeducks in "OrangePi 6 Plus Review"]]></title><description><![CDATA[
<p>> it always work[s]<p>That was not my experience, at least for very large files (100+ GB). There was a workaround (since patched) where you could link files into your own Google Drive and circumvent the bandwidth restriction that way. The current workaround is to link the files into a directory and then download that directory as an archive, which does not count against the bandwidth limit.</p>
]]></description><pubDate>Sun, 28 Dec 2025 08:10:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=46409360</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46409360</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46409360</guid></item><item><title><![CDATA[New comment by threeducks in "Show HN: Turn raw HTML into production-ready images for free"]]></title><description><![CDATA[
<p>HTML to PNG:<p><pre><code>    chromium --headless --disable-gpu --screenshot=output.png --window-size=1920,1080 --hide-scrollbars index.html
</code></pre>
Also works great for HTML to PDF:<p><pre><code>    chromium --headless --disable-gpu --no-pdf-header-footer --run-all-compositor-stages-before-draw --print-to-pdf=output.pdf index.html</code></pre></p>
]]></description><pubDate>Wed, 24 Dec 2025 07:47:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=46373395</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46373395</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46373395</guid></item><item><title><![CDATA[New comment by threeducks in "What Does a Database for SSDs Look Like?"]]></title><description><![CDATA[
<p>Let's take the Samsung 9100 Pro M.2 as an example. It has a sequential read rate of ~6700 MB/s and a 4K random read rate of ~80 MB/s:<p><a href="https://i.imgur.com/t5scCa3.png" rel="nofollow">https://i.imgur.com/t5scCa3.png</a><p><a href="https://ssd.userbenchmark.com/" rel="nofollow">https://ssd.userbenchmark.com/</a> (click on the orange double arrow to view additional columns)<p>That works out to a latency of about 50 µs per random read, compared to 4-5 ms for HDDs.</p>
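The latency figure follows directly from the 4K throughput (assuming queue depth 1, i.e. reads fully serialized):

```python
block_size = 4096        # bytes per 4K random read
throughput = 80e6        # ~80 MB/s at 4K random reads
latency = block_size / throughput
print(f"{latency * 1e6:.1f} us per read")  # ~51.2 us, vs 4000-5000 us for an HDD
```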
]]></description><pubDate>Sat, 20 Dec 2025 12:20:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=46335659</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46335659</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46335659</guid></item><item><title><![CDATA[New comment by threeducks in "A linear-time alternative for Dimensionality Reduction and fast visualisation"]]></title><description><![CDATA[
<p>It is true that big-O notation does not necessarily tell you anything about the actual runtime, but if the hidden constant appears suspiciously large, one should double-check whether something else is going on.</p>
]]></description><pubDate>Tue, 16 Dec 2025 16:39:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=46290711</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46290711</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46290711</guid></item><item><title><![CDATA[New comment by threeducks in "A linear-time alternative for Dimensionality Reduction and fast visualisation"]]></title><description><![CDATA[
<p>Without looking at the code, O(N * k) with N = 9000 points and k = 50 dimensions should take on the order of milliseconds, not seconds. Did you profile your code to see whether something takes an unexpected amount of time?</p>
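To make the estimate concrete, here is a quick sanity check (a single linear O(N * k) pass in NumPy; the random projection is just a stand-in workload, not the algorithm from the post):

```python
import time
import numpy as np

N, k = 9000, 50                  # points and dimensions from above
rng = np.random.default_rng(0)
X = rng.normal(size=(N, k))      # N * k = 450,000 floats in total

start = time.perf_counter()
projected = X @ rng.normal(size=(k, 2))  # one O(N * k) pass over the data
elapsed_ms = (time.perf_counter() - start) * 1e3
print(f"{N * k} elements processed in {elapsed_ms:.2f} ms")
```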
]]></description><pubDate>Tue, 16 Dec 2025 09:25:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=46286480</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46286480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46286480</guid></item><item><title><![CDATA[New comment by threeducks in "Show HN: Tacopy – Tail Call Optimization for Python"]]></title><description><![CDATA[
<p>Tail calls can be used for parsing very efficiently: <a href="https://news.ycombinator.com/item?id=41289114">https://news.ycombinator.com/item?id=41289114</a></p>
]]></description><pubDate>Fri, 05 Dec 2025 12:44:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=46160526</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46160526</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46160526</guid></item><item><title><![CDATA[New comment by threeducks in "OpenAI declares 'code red' as Google catches up in AI race"]]></title><description><![CDATA[
<p>I cannot say how the big ML companies do it, but from personal experience training vision models, you can absolutely reuse the weights of barely related architectures (add more layers, switch between different normalization layers, switch between separable/full convolutions, change activation functions, etc.). Even if the shapes of the weights do not match, just do what you have to do to make them fit (repeat or crop). Of course the models will not work right away, but training will go much faster. I usually get over 10 times faster convergence that way.</p>
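The repeat-or-crop trick can be sketched in a few lines of NumPy (a toy helper of my own, not anyone's production code; in practice you would use this only as the initialization and then fine-tune):

```python
import numpy as np

def fit_weights(old, new_shape):
    """Crop or tile a weight tensor along each axis until it matches new_shape."""
    w = np.asarray(old, dtype=float)
    for axis, target in enumerate(new_shape):
        if w.shape[axis] > target:               # too big: crop
            w = np.take(w, range(target), axis=axis)
        elif w.shape[axis] < target:             # too small: repeat, then crop
            reps = -(-target // w.shape[axis])   # ceiling division
            w = np.concatenate([w] * reps, axis=axis)
            w = np.take(w, range(target), axis=axis)
    return w

old = np.arange(12, dtype=float).reshape(3, 4)  # e.g. a (3, 4) weight slice
new = fit_weights(old, (5, 2))                  # rows repeated, columns cropped
print(new.shape)  # (5, 2)
```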
]]></description><pubDate>Wed, 03 Dec 2025 13:52:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=46134458</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46134458</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46134458</guid></item><item><title><![CDATA[New comment by threeducks in "OpenAI declares 'code red' as Google catches up in AI race"]]></title><description><![CDATA[
<p>Why would the open weights providers need their own tools for agentic workflows when you can just plug their OpenAI-compatible API URL into existing tools?<p>Also, there are many providers of open source models with caching (Moonshot AI, Groq, DeepSeek, FireWorks AI, MiniMax): <a href="https://openrouter.ai/docs/guides/best-practices/prompt-caching" rel="nofollow">https://openrouter.ai/docs/guides/best-practices/prompt-cach...</a></p>
]]></description><pubDate>Wed, 03 Dec 2025 11:18:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=46133175</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46133175</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46133175</guid></item><item><title><![CDATA[New comment by threeducks in "OpenAI declares 'code red' as Google catches up in AI race"]]></title><description><![CDATA[
<p>You need a certain level of batch parallelism to make inference efficient, but you also need enough capacity to handle request floods. Being a small provider is not easy.</p>
]]></description><pubDate>Wed, 03 Dec 2025 11:02:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=46133046</link><dc:creator>threeducks</dc:creator><comments>https://news.ycombinator.com/item?id=46133046</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46133046</guid></item></channel></rss>