<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: lopuhin</title><link>https://news.ycombinator.com/user?id=lopuhin</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 21:03:51 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=lopuhin" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by lopuhin in "The path to ubiquitous AI (17k tokens/sec)"]]></title><description><![CDATA[
<p>For that you only need high throughput, which is much easier to achieve than low latency, thanks to batching -- assuming the log lines or chunks can be processed independently. You can check the TensorRT-LLM benchmarks (<a href="https://nvidia.github.io/TensorRT-LLM/developer-guide/perf-overview.html" rel="nofollow">https://nvidia.github.io/TensorRT-LLM/developer-guide/perf-o...</a>), or try running vLLM on a card you have access to.</p>
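<p>A toy cost model (all numbers hypothetical, not from any benchmark) of why batching lifts throughput without helping per-request latency:</p>

```python
# Toy numbers, purely illustrative: per-step decode time grows slowly
# with batch size, so aggregate throughput scales almost linearly
# while each individual request still waits for every step.

def decode_step_ms(batch_size: int) -> float:
    # Hypothetical cost model: a fixed per-step overhead dominates
    # at small batch sizes, plus a small per-sequence cost.
    return 20.0 + 0.5 * batch_size

def throughput_tok_per_s(batch_size: int) -> float:
    # Each decode step emits one token per sequence in the batch.
    return batch_size * 1000.0 / decode_step_ms(batch_size)

for bs in (1, 8, 64):
    print(bs, round(throughput_tok_per_s(bs), 1))
```

Under this (made-up) model, batch size 64 delivers over an order of magnitude more tokens per second than batch size 1, while each request's per-token latency actually gets slightly worse -- exactly the trade that suits offline log processing.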
]]></description><pubDate>Fri, 20 Feb 2026 18:08:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47091532</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=47091532</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47091532</guid></item><item><title><![CDATA[New comment by lopuhin in "TimeCapsuleLLM: LLM trained only on data from 1800-1875"]]></title><description><![CDATA[
<p>On whether this affects only the final output layer: once the first token is generated (i.e. selected according to the modified sampling procedure), and assuming a different token is selected than under standard sampling, all layers of the model are affected when generating the subsequent tokens.</p>
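<p>A toy autoregressive sketch (a made-up deterministic "model", not the project's code) of how one changed token propagates: every later step conditions on the full prefix, so a different first token diverges the whole continuation.</p>

```python
# A deterministic toy "model": the next token is a function of the
# entire prefix, so changing one early token changes everything after it.

def next_token(prefix: tuple) -> int:
    # Arbitrary hash-like rule standing in for a forward pass.
    return (sum(prefix) * 31 + len(prefix)) % 100

def generate(first_token: int, steps: int = 5) -> list:
    seq = [first_token]
    for _ in range(steps):
        seq.append(next_token(tuple(seq)))
    return seq

print(generate(1))  # "standard" sampling picked token 1 first
print(generate(2))  # "modified" sampling picked token 2 first
```

The two sequences share nothing past the start, even though only the very first selection differed.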
]]></description><pubDate>Tue, 13 Jan 2026 10:25:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=46599230</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=46599230</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46599230</guid></item><item><title><![CDATA[New comment by lopuhin in "Python numbers every programmer should know"]]></title><description><![CDATA[
<p>It's impressive how you figured out the reason for the difference in container size between a list of floats and a list of ints; framed as an interview question, that would have been quite difficult, I think.</p>
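<p>One way to poke at this yourself with the stdlib (CPython-specific; exact byte counts vary by version and platform):</p>

```python
import sys

ints = list(range(1000))
floats = [float(i) for i in range(1000)]

# The list object itself only stores pointers, so any difference in
# container size between these two comes from allocation strategy
# (exact presizing vs. growth with overallocation), not element type.
print(sys.getsizeof(ints), sys.getsizeof(floats))

# The elements themselves differ too: a CPython float is a fixed-size
# box, while an int is variable-length and grows with magnitude.
print(sys.getsizeof(1.0), sys.getsizeof(1), sys.getsizeof(10**100))
```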
]]></description><pubDate>Thu, 01 Jan 2026 19:37:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=46457276</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=46457276</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46457276</guid></item><item><title><![CDATA[New comment by lopuhin in "GPT-5.2"]]></title><description><![CDATA[
<p>A 400k context window is not new; gpt-5, 5.1, 5-mini, etc. have the same. But they do claim improved long-context performance, which, if true, would be great.</p>
]]></description><pubDate>Thu, 11 Dec 2025 22:38:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=46238297</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=46238297</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46238297</guid></item><item><title><![CDATA[New comment by lopuhin in "Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs"]]></title><description><![CDATA[
<p>You can rent them for less than $2/h in a lot of places (maybe not in the drawer).</p>
]]></description><pubDate>Thu, 07 Aug 2025 09:16:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=44822311</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=44822311</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44822311</guid></item><item><title><![CDATA[New comment by lopuhin in "Batch Mode in the Gemini API: Process More for Less"]]></title><description><![CDATA[
<p>I find OpenAI's new flex processing more attractive: it has the same 50% discount, but it uses the same API as regular chat mode, so you can still do things the Batch API can't handle (e.g. evaluating agents). In practice I found it to work well enough when paired with client-side request caching: <a href="https://platform.openai.com/docs/guides/flex-processing?api-mode=chat" rel="nofollow">https://platform.openai.com/docs/guides/flex-processing?api-...</a></p>
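<p>A minimal sketch of the client-side caching idea; <code>call_model</code> here is a hypothetical stand-in for a real (slow, occasionally timing-out) flex-tier API call, and all names are made up:</p>

```python
import hashlib
import json
import os
import tempfile

CACHE_DIR = os.path.join(tempfile.gettempdir(), "llm_cache_demo")

def call_model(prompt: str) -> str:
    # Stand-in for the real API call; replace with an actual client.
    return f"response to: {prompt}"

def cached_call(prompt: str) -> str:
    # Key the cache on the request payload, so retrying after a
    # timeout doesn't pay twice for work that already succeeded.
    key = hashlib.sha256(json.dumps({"prompt": prompt}).encode()).hexdigest()
    path = os.path.join(CACHE_DIR, key + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["response"]
    response = call_model(prompt)
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "w") as f:
        json.dump({"response": response}, f)
    return response
```

The point is that flex-tier requests may be slow or dropped, but with a content-addressed cache a rerun of the whole job only re-pays for the requests that actually failed.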
]]></description><pubDate>Fri, 11 Jul 2025 11:33:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=44530959</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=44530959</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44530959</guid></item><item><title><![CDATA[New comment by lopuhin in "MCP Run Python"]]></title><description><![CDATA[
<p>It's pretty difficult to package native Python dependencies, e.g. lxml, for wasmtime or other WASI runtimes.</p>
]]></description><pubDate>Thu, 17 Apr 2025 20:28:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=43721856</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=43721856</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43721856</guid></item><item><title><![CDATA[Visualize LLM Token Probabilities and Confidence with ELI5]]></title><description><![CDATA[
<p>Article URL: <a href="https://eli5.readthedocs.io/en/stable/tutorials/explain_llm_logprobs.html">https://eli5.readthedocs.io/en/stable/tutorials/explain_llm_logprobs.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43603097">https://news.ycombinator.com/item?id=43603097</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 06 Apr 2025 17:22:48 +0000</pubDate><link>https://eli5.readthedocs.io/en/stable/tutorials/explain_llm_logprobs.html</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=43603097</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43603097</guid></item><item><title><![CDATA[New comment by lopuhin in "Setuptools version 78.0.1 breaks install of many packages"]]></title><description><![CDATA[
<p>Crazy amount of breakage...<p>Here is a PR which reverts this: <a href="https://github.com/pypa/setuptools/pull/4911" rel="nofollow">https://github.com/pypa/setuptools/pull/4911</a><p>Interestingly, the setuptools maintainers still only postponed the deprecation date by a year, so we can probably expect more issues like this in the future.</p>
]]></description><pubDate>Mon, 24 Mar 2025 17:49:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=43463610</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=43463610</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43463610</guid></item><item><title><![CDATA[New comment by lopuhin in "ForeverVM: Run AI-generated code in stateful sandboxes that run forever"]]></title><description><![CDATA[
<p>Congrats on the launch! How much does it cost? And what is the sandboxing technology?</p>
]]></description><pubDate>Thu, 27 Feb 2025 13:45:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=43194303</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=43194303</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43194303</guid></item><item><title><![CDATA[New comment by lopuhin in "Using AI for Coding: My Journey with Cline and LLMs"]]></title><description><![CDATA[
<p>I find it strange that the author is really happy with the quality of the string comparison here: <a href="https://pgaleone.eu/ai/coding/2025/01/26/using-ai-for-coding-my-experience/#backend-development-insights" rel="nofollow">https://pgaleone.eu/ai/coding/2025/01/26/using-ai-for-coding...</a> While it would kind of work, it's a very weird piece of code from an ML standpoint. It trains a TF-IDF vectorizer on just the two strings being compared, which at best changes nothing (unless the same word is repeated within one product name), and is a strange thing to do: for better quality you'd want to fit it on some corpus, or not bother with TF-IDF at all. It also compares the two strings as bags of words, which again is not the end of the world, but maybe not what the author wants here -- and if they do want that, this is not the easiest way to do it. So it takes techniques which can be useful for comparing texts (TF-IDF and cosine similarity) but applies them in a way that doesn't let them show their strengths.</p>
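<p>To make the bag-of-words point concrete, here is a stdlib-only cosine similarity over word counts (not the article's code): any reordering of the same words compares as identical, which may or may not be what you want for product names.</p>

```python
import math
from collections import Counter

def cosine_bow(a: str, b: str) -> float:
    # Compare two strings as bags of words: word order is ignored entirely.
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

print(cosine_bow("red shirt large", "large red shirt"))  # ~1.0: order lost
print(cosine_bow("red shirt", "blue shirt"))             # ~0.5: one shared word
```

Fitting TF-IDF on only the two strings being compared mostly just reweights this same bag-of-words comparison; the IDF part only becomes informative when it is fit on a real corpus.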
]]></description><pubDate>Tue, 28 Jan 2025 10:13:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=42850731</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=42850731</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42850731</guid></item><item><title><![CDATA[New comment by lopuhin in "DeepSeek and the Effects of GPU Export Controls"]]></title><description><![CDATA[
<p>It's a 600B+ mixture-of-experts model, and yes, it's described in the paper, on GitHub, etc.</p>
]]></description><pubDate>Thu, 23 Jan 2025 15:46:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=42805079</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=42805079</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42805079</guid></item><item><title><![CDATA[New comment by lopuhin in "DeepSeek and the Effects of GPU Export Controls"]]></title><description><![CDATA[
<p>Why is this doubtful -- did you spot anything suspicious in their paper? They released the weights and a lot of training details as well, which leaves much less room for making things up; e.g. you can estimate the training compute requirements from the active parameter count (which they can't fake, since they released the weights) and the fp8 training they used.</p>
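<p>That sanity check can be sketched with the common ≈6·N·D training-FLOPs rule of thumb. Every number below is an illustrative assumption (parameter count, token count, per-GPU throughput, utilization), not a verified figure:</p>

```python
# Rule of thumb: training FLOPs ≈ 6 * active_params * training_tokens.
# All inputs below are assumptions for illustration only.
active_params = 37e9      # assumed active parameters per token (MoE)
tokens = 14.8e12          # assumed training tokens
flops = 6 * active_params * tokens

gpu_fp8_flops = 1e15      # assumed peak fp8 throughput per GPU, FLOP/s
mfu = 0.35                # assumed model-FLOPs utilization
gpu_hours = flops / (gpu_fp8_flops * mfu) / 3600
print(f"~{gpu_hours / 1e6:.1f}M GPU-hours")
```

The exercise is that plugging in the publicly checkable active-parameter count gives a GPU-hours figure in the single-digit millions, i.e. the same order of magnitude as a claimed budget can be checked against.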
]]></description><pubDate>Thu, 23 Jan 2025 13:55:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=42804031</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=42804031</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42804031</guid></item><item><title><![CDATA[New comment by lopuhin in "DeepSeek-R1"]]></title><description><![CDATA[
<p>With the distilled models released, it's very likely they'll soon be served by other providers at good price and performance, unlike the full R1, which is very big and much harder to serve efficiently.</p>
]]></description><pubDate>Mon, 20 Jan 2025 14:06:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=42768858</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=42768858</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42768858</guid></item><item><title><![CDATA[New comment by lopuhin in "Diffusion for World Modeling"]]></title><description><![CDATA[
<p>I don't think so -- what they show in the CS video is exactly the Dust2 map, not just something similar to or inspired by it.</p>
]]></description><pubDate>Sun, 13 Oct 2024 12:24:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=41827367</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=41827367</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41827367</guid></item><item><title><![CDATA[New comment by lopuhin in "GraalPy – A high-performance embeddable Python 3 runtime for Java"]]></title><description><![CDATA[
<p>I think GraalPython does have a GIL, see <a href="https://github.com/oracle/graalpython/blob/master/docs/contributor/IMPLEMENTATION_DETAILS.md#the-gil">https://github.com/oracle/graalpython/blob/master/docs/contr...</a> -- and if by "there is no such thing on those platforms" you mean that the JVM/CLR don't have a GIL, well, C does not have a GIL either, but CPython does.</p>
]]></description><pubDate>Tue, 17 Sep 2024 19:06:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=41571442</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=41571442</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41571442</guid></item><item><title><![CDATA[New comment by lopuhin in "When ChatGPT summarises, it does nothing of the kind"]]></title><description><![CDATA[
<p>Curious which model was used? Sorry if I missed it. That seems like an important detail to mention when doing an evaluation.</p>
]]></description><pubDate>Sun, 21 Jul 2024 21:24:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=41028259</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=41028259</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41028259</guid></item><item><title><![CDATA[New comment by lopuhin in "Mistral NeMo"]]></title><description><![CDATA[
<p>Also, I don't think you can use NIM packages in production without a subscription, and I wasn't able to find the cost without signing up. And the NIM package for Mistral NeMo isn't available yet anyway.</p>
]]></description><pubDate>Fri, 19 Jul 2024 14:53:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=41007137</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=41007137</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41007137</guid></item><item><title><![CDATA[New comment by lopuhin in "Exo: Run your own AI cluster at home with everyday devices"]]></title><description><![CDATA[
<p>The README says they plan to add llama.cpp support, which should cover a lot of targets; they also already have tinygrad integrated, I think.</p>
]]></description><pubDate>Tue, 16 Jul 2024 16:07:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=40977818</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=40977818</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40977818</guid></item><item><title><![CDATA[New comment by lopuhin in "Safe Superintelligence Inc."]]></title><description><![CDATA[
<p>Not quite the same: OpenAI was initially quite open, while Ilya is currently very explicitly against opening up or open-sourcing research, e.g. see <a href="https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview" rel="nofollow">https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-lau...</a></p>
]]></description><pubDate>Wed, 19 Jun 2024 17:21:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=40730307</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=40730307</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40730307</guid></item></channel></rss>