<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: lopuhin</title><link>https://news.ycombinator.com/user?id=lopuhin</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 21:03:51 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=lopuhin" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by lopuhin in "The path to ubiquitous AI (17k tokens/sec)"]]></title><description><![CDATA[
<p>For that you only need high throughput, which is much easier to achieve than low latency, thanks to batching -- assuming the log lines or chunks can be processed independently. You can check the TensorRT-LLM benchmarks (<a href="https://nvidia.github.io/TensorRT-LLM/developer-guide/perf-overview.html" rel="nofollow">https://nvidia.github.io/TensorRT-LLM/developer-guide/perf-o...</a>), or try running vLLM on a card you have access to.</p>
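<p>A toy cost model (all numbers hypothetical, not from any benchmark) of why batching lifts throughput without helping per-request latency:</p>

```python
# Toy numbers, purely illustrative: per-step decode time grows slowly
# with batch size, so aggregate throughput scales almost linearly
# while each individual request still waits for every step.

def decode_step_ms(batch_size: int) -> float:
    # Hypothetical cost model: a fixed per-step overhead dominates
    # at small batch sizes, plus a small per-sequence cost.
    return 20.0 + 0.5 * batch_size

def throughput_tok_per_s(batch_size: int) -> float:
    # Each decode step emits one token per sequence in the batch.
    return batch_size * 1000.0 / decode_step_ms(batch_size)

for bs in (1, 8, 64):
    print(bs, round(throughput_tok_per_s(bs), 1))
```

Under this (made-up) model, batch size 64 delivers over an order of magnitude more tokens per second than batch size 1, while each request's per-token latency actually gets slightly worse -- exactly the trade that suits offline log processing.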
]]></description><pubDate>Fri, 20 Feb 2026 18:08:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47091532</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=47091532</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47091532</guid></item><item><title><![CDATA[New comment by lopuhin in "TimeCapsuleLLM: LLM trained only on data from 1800-1875"]]></title><description><![CDATA[
<p>On whether this affects only the final output layer: once the first token is generated (i.e. selected according to the modified sampling procedure), and assuming a different token is selected than under standard sampling, all layers of the model are affected when generating the subsequent tokens.</p>
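<p>A toy autoregressive sketch (a made-up deterministic "model", not the project's code) of how one changed token propagates: every later step conditions on the full prefix, so a different first token diverges the whole continuation.</p>

```python
# A deterministic toy "model": the next token is a function of the
# entire prefix, so changing one early token changes everything after it.

def next_token(prefix: tuple) -> int:
    # Arbitrary hash-like rule standing in for a forward pass.
    return (sum(prefix) * 31 + len(prefix)) % 100

def generate(first_token: int, steps: int = 5) -> list:
    seq = [first_token]
    for _ in range(steps):
        seq.append(next_token(tuple(seq)))
    return seq

print(generate(1))  # "standard" sampling picked token 1 first
print(generate(2))  # "modified" sampling picked token 2 first
```

The two sequences share nothing past the start, even though only the very first selection differed.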
]]></description><pubDate>Tue, 13 Jan 2026 10:25:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=46599230</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=46599230</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46599230</guid></item><item><title><![CDATA[New comment by lopuhin in "Python numbers every programmer should know"]]></title><description><![CDATA[
<p>It's impressive how you figured out the reason for the difference in container size between a list of floats and a list of ints; framed as an interview question, that would have been quite difficult, I think.</p>
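<p>One way to poke at this yourself with the stdlib (CPython-specific; exact byte counts vary by version and platform):</p>

```python
import sys

ints = list(range(1000))
floats = [float(i) for i in range(1000)]

# The list object itself only stores pointers, so any difference in
# container size between these two comes from allocation strategy
# (exact presizing vs. growth with overallocation), not element type.
print(sys.getsizeof(ints), sys.getsizeof(floats))

# The elements themselves differ too: a CPython float is a fixed-size
# box, while an int is variable-length and grows with magnitude.
print(sys.getsizeof(1.0), sys.getsizeof(1), sys.getsizeof(10**100))
```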
]]></description><pubDate>Thu, 01 Jan 2026 19:37:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=46457276</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=46457276</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46457276</guid></item><item><title><![CDATA[New comment by lopuhin in "GPT-5.2"]]></title><description><![CDATA[
<p>A 400k context window is not new; gpt-5, 5.1, 5-mini, etc. have the same. But they do claim improved long-context performance, which, if true, would be great.</p>
]]></description><pubDate>Thu, 11 Dec 2025 22:38:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=46238297</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=46238297</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46238297</guid></item><item><title><![CDATA[New comment by lopuhin in "Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs"]]></title><description><![CDATA[
<p>You can rent them for less than $2/h in a lot of places (maybe not in the drawer).</p>
]]></description><pubDate>Thu, 07 Aug 2025 09:16:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=44822311</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=44822311</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44822311</guid></item><item><title><![CDATA[New comment by lopuhin in "Batch Mode in the Gemini API: Process More for Less"]]></title><description><![CDATA[
<p>I find OpenAI's new flex processing more attractive: it has the same 50% discount, but it uses the same API as regular chat mode, so you can still do things the Batch API can't handle (e.g. evaluating agents). In practice I found it to work well enough when paired with client-side request caching: <a href="https://platform.openai.com/docs/guides/flex-processing?api-mode=chat" rel="nofollow">https://platform.openai.com/docs/guides/flex-processing?api-...</a></p>
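<p>A minimal sketch of the client-side caching idea; <code>call_model</code> here is a hypothetical stand-in for a real (slow, occasionally timing-out) flex-tier API call, and all names are made up:</p>

```python
import hashlib
import json
import os
import tempfile

CACHE_DIR = os.path.join(tempfile.gettempdir(), "llm_cache_demo")

def call_model(prompt: str) -> str:
    # Stand-in for the real API call; replace with an actual client.
    return f"response to: {prompt}"

def cached_call(prompt: str) -> str:
    # Key the cache on the request payload, so retrying after a
    # timeout doesn't pay twice for work that already succeeded.
    key = hashlib.sha256(json.dumps({"prompt": prompt}).encode()).hexdigest()
    path = os.path.join(CACHE_DIR, key + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["response"]
    response = call_model(prompt)
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "w") as f:
        json.dump({"response": response}, f)
    return response
```

The point is that flex-tier requests may be slow or dropped, but with a content-addressed cache a rerun of the whole job only re-pays for the requests that actually failed.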
]]></description><pubDate>Fri, 11 Jul 2025 11:33:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=44530959</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=44530959</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44530959</guid></item><item><title><![CDATA[New comment by lopuhin in "MCP Run Python"]]></title><description><![CDATA[
<p>It's pretty difficult to package native Python dependencies, e.g. lxml, for wasmtime or other WASI runtimes.</p>
]]></description><pubDate>Thu, 17 Apr 2025 20:28:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=43721856</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=43721856</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43721856</guid></item><item><title><![CDATA[Visualize LLM Token Probabilities and Confidence with ELI5]]></title><description><![CDATA[
<p>Article URL: <a href="https://eli5.readthedocs.io/en/stable/tutorials/explain_llm_logprobs.html">https://eli5.readthedocs.io/en/stable/tutorials/explain_llm_logprobs.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43603097">https://news.ycombinator.com/item?id=43603097</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 06 Apr 2025 17:22:48 +0000</pubDate><link>https://eli5.readthedocs.io/en/stable/tutorials/explain_llm_logprobs.html</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=43603097</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43603097</guid></item><item><title><![CDATA[New comment by lopuhin in "Setuptools version 78.0.1 breaks install of many packages"]]></title><description><![CDATA[
<p>Crazy amount of breakage...<p>Here is a PR which reverts this: <a href="https://github.com/pypa/setuptools/pull/4911" rel="nofollow">https://github.com/pypa/setuptools/pull/4911</a><p>Interestingly, the setuptools maintainers still only postponed the deprecation date by a year, so we can probably expect more issues like this in the future.</p>
]]></description><pubDate>Mon, 24 Mar 2025 17:49:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=43463610</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=43463610</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43463610</guid></item><item><title><![CDATA[New comment by lopuhin in "ForeverVM: Run AI-generated code in stateful sandboxes that run forever"]]></title><description><![CDATA[
<p>Congrats on the launch! How much does it cost? And what is the sandboxing technology?</p>
]]></description><pubDate>Thu, 27 Feb 2025 13:45:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=43194303</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=43194303</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43194303</guid></item><item><title><![CDATA[New comment by lopuhin in "Using AI for Coding: My Journey with Cline and LLMs"]]></title><description><![CDATA[
<p>I find it strange that the author is really happy with the quality of the string comparison here: <a href="https://pgaleone.eu/ai/coding/2025/01/26/using-ai-for-coding-my-experience/#backend-development-insights" rel="nofollow">https://pgaleone.eu/ai/coding/2025/01/26/using-ai-for-coding...</a> While it would kind of work, it's a very weird piece of code from an ML standpoint. It trains a TF-IDF vectorizer on just the two strings being compared, which at best changes nothing (unless the same word is repeated within one product name), and is a strange thing to do: for better quality you'd want to fit it on some corpus, or not bother with TF-IDF at all. It also compares the two strings as bags of words, which again is not the end of the world, but maybe not what the author wants here -- and if they do want that, this is not the easiest way to do it. So it takes techniques which can be useful for comparing texts (TF-IDF and cosine similarity) but applies them in a way that doesn't let them show their strengths.</p>
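<p>To make the bag-of-words point concrete, here is a stdlib-only cosine similarity over word counts (not the article's code): any reordering of the same words compares as identical, which may or may not be what you want for product names.</p>

```python
import math
from collections import Counter

def cosine_bow(a: str, b: str) -> float:
    # Compare two strings as bags of words: word order is ignored entirely.
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

print(cosine_bow("red shirt large", "large red shirt"))  # ~1.0: order lost
print(cosine_bow("red shirt", "blue shirt"))             # ~0.5: one shared word
```

Fitting TF-IDF on only the two strings being compared mostly just reweights this same bag-of-words comparison; the IDF part only becomes informative when it is fit on a real corpus.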
]]></description><pubDate>Tue, 28 Jan 2025 10:13:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=42850731</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=42850731</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42850731</guid></item><item><title><![CDATA[New comment by lopuhin in "DeepSeek and the Effects of GPU Export Controls"]]></title><description><![CDATA[
<p>It's a 600B+ mixture-of-experts model, and yes, it's described in the paper, on GitHub, etc.</p>
]]></description><pubDate>Thu, 23 Jan 2025 15:46:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=42805079</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=42805079</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42805079</guid></item><item><title><![CDATA[New comment by lopuhin in "DeepSeek and the Effects of GPU Export Controls"]]></title><description><![CDATA[
<p>Why is this doubtful -- did you spot anything suspicious in their paper? They released the weights and a lot of training details as well, which leaves much less room for making things up; e.g. you can estimate the training compute requirements from the active parameter count (which they can't fake, since they released the weights) and the fp8 training they used.</p>
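<p>That sanity check can be sketched with the common ≈6·N·D training-FLOPs rule of thumb. Every number below is an illustrative assumption (parameter count, token count, per-GPU throughput, utilization), not a verified figure:</p>

```python
# Rule of thumb: training FLOPs ≈ 6 * active_params * training_tokens.
# All inputs below are assumptions for illustration only.
active_params = 37e9      # assumed active parameters per token (MoE)
tokens = 14.8e12          # assumed training tokens
flops = 6 * active_params * tokens

gpu_fp8_flops = 1e15      # assumed peak fp8 throughput per GPU, FLOP/s
mfu = 0.35                # assumed model-FLOPs utilization
gpu_hours = flops / (gpu_fp8_flops * mfu) / 3600
print(f"~{gpu_hours / 1e6:.1f}M GPU-hours")
```

The exercise is that plugging in the publicly checkable active-parameter count gives a GPU-hours figure in the single-digit millions, i.e. the same order of magnitude as a claimed budget can be checked against.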
]]></description><pubDate>Thu, 23 Jan 2025 13:55:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=42804031</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=42804031</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42804031</guid></item><item><title><![CDATA[New comment by lopuhin in "DeepSeek-R1"]]></title><description><![CDATA[
<p>With the distilled models released, it's very likely they'll soon be served by other providers at good price and performance, unlike the full R1, which is very big and much harder to serve efficiently.</p>
]]></description><pubDate>Mon, 20 Jan 2025 14:06:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=42768858</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=42768858</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42768858</guid></item><item><title><![CDATA[New comment by lopuhin in "Diffusion for World Modeling"]]></title><description><![CDATA[
<p>I don't think so -- what they show in the CS video is exactly the Dust2 map, not just something similar to or inspired by it.</p>
]]></description><pubDate>Sun, 13 Oct 2024 12:24:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=41827367</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=41827367</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41827367</guid></item><item><title><![CDATA[New comment by lopuhin in "GraalPy – A high-performance embeddable Python 3 runtime for Java"]]></title><description><![CDATA[
<p>I think GraalPython does have a GIL, see <a href="https://github.com/oracle/graalpython/blob/master/docs/contributor/IMPLEMENTATION_DETAILS.md#the-gil">https://github.com/oracle/graalpython/blob/master/docs/contr...</a> -- and if by "there is no such thing on those platforms" you mean that the JVM/CLR don't have a GIL, well, C does not have a GIL either, but CPython does.</p>
]]></description><pubDate>Tue, 17 Sep 2024 19:06:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=41571442</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=41571442</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41571442</guid></item><item><title><![CDATA[New comment by lopuhin in "When ChatGPT summarises, it does nothing of the kind"]]></title><description><![CDATA[
<p>Curious which model was used? Sorry if I missed it. That seems like an important detail to mention when doing an evaluation.</p>
]]></description><pubDate>Sun, 21 Jul 2024 21:24:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=41028259</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=41028259</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41028259</guid></item><item><title><![CDATA[New comment by lopuhin in "Mistral NeMo"]]></title><description><![CDATA[
<p>Also, I don't think you can use NIM packages in production without a subscription, and I wasn't able to find the cost without signing up. And the NIM package for Mistral NeMo isn't available yet anyway.</p>
]]></description><pubDate>Fri, 19 Jul 2024 14:53:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=41007137</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=41007137</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41007137</guid></item><item><title><![CDATA[New comment by lopuhin in "Exo: Run your own AI cluster at home with everyday devices"]]></title><description><![CDATA[
<p>The README says they plan to add llama.cpp support, which should cover a lot of targets; they also already have tinygrad integrated, I think.</p>
]]></description><pubDate>Tue, 16 Jul 2024 16:07:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=40977818</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=40977818</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40977818</guid></item><item><title><![CDATA[New comment by lopuhin in "Safe Superintelligence Inc."]]></title><description><![CDATA[
<p>Not quite the same: OpenAI was initially quite open, while Ilya is currently very explicitly against opening up or open-sourcing research, e.g. see <a href="https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview" rel="nofollow">https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-lau...</a></p>
]]></description><pubDate>Wed, 19 Jun 2024 17:21:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=40730307</link><dc:creator>lopuhin</dc:creator><comments>https://news.ycombinator.com/item?id=40730307</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40730307</guid></item></channel></rss>