<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: pavelstoev</title><link>https://news.ycombinator.com/user?id=pavelstoev</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 13 Apr 2026 13:59:07 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=pavelstoev" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by pavelstoev in "Do AI Agents Make Money in 2026? Or Is It Just Mac Minis and Vibes?"]]></title><description><![CDATA[
<p>Yes, for example, if you are a seller on the Shopify platform. Look up Shopify SimGym and Javier Moreno's tech blog about it.</p>
]]></description><pubDate>Tue, 03 Mar 2026 03:43:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47227738</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=47227738</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47227738</guid></item><item><title><![CDATA[New comment by pavelstoev in "WiFi could become an invisible mass surveillance system"]]></title><description><![CDATA[
<p>In a dynamically changing environment (from the perspective of Wi-Fi signal), this will be difficult but not impossible with modern applications of ML algorithms. We worked on this technique back in 2016-18 at the University of Toronto WIRLab; take a look at the results video from back then. I think the person is somewhat identifiable. <a href="https://www.youtube.com/watch?v=lTOUBUhC0Cg" rel="nofollow">https://www.youtube.com/watch?v=lTOUBUhC0Cg</a></p>
]]></description><pubDate>Thu, 12 Feb 2026 03:07:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=46984434</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=46984434</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46984434</guid></item><item><title><![CDATA[New comment by pavelstoev in "The Day the Telnet Died"]]></title><description><![CDATA[
<p>Am I the only one who finds this suspicious? From the article, about telnetd: “…The vulnerable code was introduced in a 2015 commit and sat undiscovered for nearly 11 years.”</p>
]]></description><pubDate>Wed, 11 Feb 2026 04:22:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=46970791</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=46970791</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46970791</guid></item><item><title><![CDATA[New comment by pavelstoev in "FLUX.2 [Klein]: Towards Interactive Visual Intelligence"]]></title><description><![CDATA[
<p>If we think of GenAI models as a form of compression: text compresses extremely well, while images and video do not. Yet state-of-the-art text-to-image and text-to-video models are often much smaller (in parameter count) than large language models like Llama-3. Maybe vision models are small because we’re not actually compressing very much of the visual world. The training data covers a narrow, human-biased manifold of common scenes, objects, and styles; the combinatorial space of visual reality remains largely unexplored. I am curious about what exists outside of that human-biased manifold.</p>
]]></description><pubDate>Sat, 17 Jan 2026 03:15:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=46654906</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=46654906</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46654906</guid></item><item><title><![CDATA[New comment by pavelstoev in "OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI"]]></title><description><![CDATA[
<p>Not wrong, but Markdown with English may be the most-used DSL, second only to natural language itself. Volume over quality.</p>
]]></description><pubDate>Sat, 13 Dec 2025 01:42:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=46251168</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=46251168</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46251168</guid></item><item><title><![CDATA[New comment by pavelstoev in "TPUs vs. GPUs and why Google is positioned to win AI race in the long term"]]></title><description><![CDATA[
<p>keyword: "...talks..."</p>
]]></description><pubDate>Fri, 28 Nov 2025 02:11:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=46074945</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=46074945</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46074945</guid></item><item><title><![CDATA[New comment by pavelstoev in "Nvidia takes $1B stake in Nokia"]]></title><description><![CDATA[
<p>Nokia also makes complex backbone carrier-grade network switches based on the Intellectual Property portfolio they acquired from Nortel.</p>
]]></description><pubDate>Wed, 29 Oct 2025 02:55:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=45742104</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=45742104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45742104</guid></item><item><title><![CDATA[New comment by pavelstoev in "The Rapper 50 Cent, Adjusted for Inflation"]]></title><description><![CDATA[
<p>Much respect for the artist 50 Cent, who converted his rap-music success into respectable business ventures (Vitamin Water, among others). So he is worth much more now!</p>
]]></description><pubDate>Sat, 18 Oct 2025 02:10:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=45624315</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=45624315</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45624315</guid></item><item><title><![CDATA[New comment by pavelstoev in "Every vibe-coded website is the same page with different words. So I made that"]]></title><description><![CDATA[
<p>I've vibe-coded a website about vibe-coding websites. I used GPT-5, and it inserted an easter egg that was later found by a human front-end dev, to my amusement. Easter eggs must be in-distribution!<p>(No, I am not sharing the link, as I was downvoted for it before; search for it. Hint: built with vibe)</p>
]]></description><pubDate>Sat, 18 Oct 2025 01:54:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=45624220</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=45624220</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45624220</guid></item><item><title><![CDATA[New comment by pavelstoev in "Gemini 2.5 Computer Use model"]]></title><description><![CDATA[
<p>It was my first engineering job, calibrating those inductive loops and circuit boards on I-93, just north of Boston's downtown area. Here is the photo from 2006. <a href="https://postimg.cc/zbz5JQC0" rel="nofollow">https://postimg.cc/zbz5JQC0</a><p>PEEK controller, 56K modem, Verizon telco lines, rodents - all included in one cabinet</p>
]]></description><pubDate>Wed, 08 Oct 2025 02:33:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=45511425</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=45511425</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45511425</guid></item><item><title><![CDATA[New comment by pavelstoev in "WiFi-3D-Fusion – Real-time 3D motion sensing with Wi-Fi"]]></title><description><![CDATA[
<p>Here is a video of what the estimated output looks like.<p>We built this system at the UofT WIRLab back in 2018-19: <a href="https://youtu.be/lTOUBUhC0Cg" rel="nofollow">https://youtu.be/lTOUBUhC0Cg</a><p>And a link to the paper: <a href="https://arxiv.org/pdf/2001.05842" rel="nofollow">https://arxiv.org/pdf/2001.05842</a></p>
]]></description><pubDate>Tue, 26 Aug 2025 03:10:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45021867</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=45021867</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45021867</guid></item><item><title><![CDATA[New comment by pavelstoev in "Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?"]]></title><description><![CDATA[
<p>When I think about serving large-scale LLM inference (like ChatGPT), I see it a lot like high-speed web serving — there are layers to it, much like in the OSI model.<p>1. Physical/Hardware Layer
At the very bottom is the GPU silicon and its associated high-bandwidth VRAM. The model weights are partitioned, compiled, and efficiently placed so that each GPU chip and its VRAM are used to the fullest (ideally). This is where low-level kernel optimizations, fused operations, and memory access patterns matter so that everything above the chip level tries to play nice with the lowest level.<p>2. Intra-Node Coordination Layer
Inside a single server, multiple GPUs are connected via NVLink (or an equivalent high-speed interconnect). Here you use tensor parallelism (splitting matrices across GPUs), pipeline parallelism (splitting model layers across GPUs), or expert parallelism (only activating parts of the model per request) to make the model fit and run faster. The key is minimizing cross-GPU communication latency while keeping all GPUs running at full load; many low-level software tricks apply here.<p>3. Inter-Node Coordination Layer
When the model spans multiple servers, high-speed networking like InfiniBand comes into play. Techniques like data parallelism (replicating the model and splitting requests), hybrid parallelism (mixing tensor/pipeline/data/expert parallelism), and careful orchestration of collectives (all-reduce, all-to-all) keep throughput high while hiding model communication (slow) behind model computation (fast).<p>4. Request Processing Layer
Above the hardware/multi-GPU layers is the serving logic: batching incoming prompts together to maximize GPU efficiency and mold them into ideal shapes to max out compute, offloading less urgent work to background processes, caching key/value attention states (KV cache) to avoid recomputing past tokens, and using paged caches to handle variable-length sequences.<p>5. User-Facing Serving Layer
At the top are optimizations users see indirectly — multi-layer caching for common or repeated queries, fast serialization protocols like gRPC or WebSockets for minimal overhead, and geo-distributed load balancing to route users to the lowest-latency cluster.<p>Like the OSI model, each “layer” solves its own set of problems but works together to make the whole system scale. That’s how you get from “this model barely runs on a single high-end GPU” to “this service handles hundreds of millions of users per week with low latency.”</p>
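The batching and KV-cache ideas in layer 4 can be sketched in a few lines of Python. This is a toy illustration only: the names (ToyKVCache, serve_batch) are hypothetical, the "state" is a placeholder string, and real servers do far more. The point it shows is the core saving: when a new request shares a token prefix with an already-served one, only the uncached suffix needs computing.

```python
# Toy sketch of layer-4 serving logic: batch pending prompts and reuse
# cached per-prefix attention state so past tokens are not recomputed.
from collections import deque


class ToyKVCache:
    """Maps a token prefix to its (placeholder) attention state."""

    def __init__(self):
        self._store = {}

    def longest_prefix(self, tokens):
        # Find the longest already-computed prefix of this sequence.
        for end in range(len(tokens), 0, -1):
            key = tuple(tokens[:end])
            if key in self._store:
                return end, self._store[key]
        return 0, None

    def put(self, tokens, state):
        self._store[tuple(tokens)] = state


def serve_batch(queue, cache, max_batch=4):
    """Pop up to max_batch requests; 'compute' only the uncached suffixes."""
    batch, computed_tokens = [], 0
    while queue and len(batch) < max_batch:
        tokens = queue.popleft()
        cached_len, _ = cache.longest_prefix(tokens)
        computed_tokens += len(tokens) - cached_len  # only new work
        cache.put(tokens, state=f"state:{len(tokens)}")
        batch.append(tokens)
    return batch, computed_tokens
```

With three queued requests where the second extends the first ([1, 2, 3] then [1, 2, 3, 4]), the cache lets the second request compute a single new token instead of four, which is the same mechanism (at toy scale) that makes multi-turn chat cheap to serve.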
]]></description><pubDate>Sat, 09 Aug 2025 05:08:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=44844164</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=44844164</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44844164</guid></item><item><title><![CDATA[New comment by pavelstoev in "I tried Vibe coding in BASIC and it didn't go well"]]></title><description><![CDATA[
<p>I vibe-coded a site about vibe 2 code projects. <a href="https://builtwithvibe.com/" rel="nofollow">https://builtwithvibe.com/</a></p>
]]></description><pubDate>Sun, 20 Jul 2025 03:22:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=44621706</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=44621706</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44621706</guid></item><item><title><![CDATA[New comment by pavelstoev in "Compiling LLMs into a MegaKernel: A path to low-latency inference"]]></title><description><![CDATA[
<p>Hi Author, thank you very much for the clear and relatively easy-to-understand MPK overview. Could you please also comment on the similarity of your project to Hidet? <a href="https://pytorch.org/blog/introducing-hidet/" rel="nofollow">https://pytorch.org/blog/introducing-hidet/</a><p>Thank you!</p>
]]></description><pubDate>Fri, 20 Jun 2025 15:58:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=44328983</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=44328983</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44328983</guid></item><item><title><![CDATA[New comment by pavelstoev in "Battle to eradicate invasive pythons in Florida achieves milestone"]]></title><description><![CDATA[
<p>This story is dear to my heart. Let me tell you why - this is the tale of how my wife of 15 years, bless her heart, an occasional unstable genius, proposed a startlingly effective method for eradicating these invasive pythons.<p>She slammed her coffee cup down one morning with the conviction of an Old Testament prophet and declared: “Exploding rabbits.”<p>“Excuse me?” I said, wiping marmalade off my chin.<p>“Exploding. Rabbits. Stuff ‘em with a quarter pound of C4, or maybe just enough tannerite to surprise the neighbors but not call down the FAA, and set them loose in the Everglades. Pythons love rabbits. Boom. Problem solved. You’re welcome, America.”<p>Now I’ve heard my share of madcap schemes. Once she tried to compost credit card offers. But this time she looked me square in the eye with the righteous glow of a woman who had just solved two ecological crises and accidentally founded a billion-dollar startup in the process.<p>“We’ll call it Hare Trigger™,” she added, deadpan. “It’s got product-market fit and explosive growth potential.”<p>She even sketched out a logo involving a jackrabbit with aviator goggles and a plunger.<p>I asked if this might attract some sort of federal attention.<p>“Good,” she said. “That’s called buzz. Besides, the pythons started it.”<p>And just like that, I found myself wondering how true it is that behind every successful man stands an even more genius woman. Waiting for Elon to offer Series A.</p>
]]></description><pubDate>Tue, 17 Jun 2025 01:28:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=44294919</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=44294919</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44294919</guid></item><item><title><![CDATA[New comment by pavelstoev in "'I paid for the whole GPU, I am going to use the whole GPU'"]]></title><description><![CDATA[
<p>GPU sharing is a concern for sensitive data. It is more appropriate to increase the utilization rate of GPU chip internals via a variety of low-level (CUDA and below) optimizations.</p>
]]></description><pubDate>Thu, 08 May 2025 02:27:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=43922540</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=43922540</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43922540</guid></item><item><title><![CDATA[New comment by pavelstoev in "Performance optimization is hard because it's fundamentally a brute-force task"]]></title><description><![CDATA[
<p>Optimizing AI performance is like peeling an onion — every time you remove one bottleneck, another layer appears underneath. What looks like a compute problem turns out to be a memory bottleneck, which then turns out to be a scheduling issue, which reveals a parallelism mismatch… and so on.<p>It’s a process of continuous uncovering, and unless you have visibility across the whole stack — from kernel to cluster — you’ll spend all your time slicing through surface layers with lots of tears being shed.<p>Fortunately, there are software automation solutions to this.</p>
]]></description><pubDate>Wed, 30 Apr 2025 04:08:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=43841177</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=43841177</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43841177</guid></item><item><title><![CDATA[New comment by pavelstoev in "National Airspace System Status"]]></title><description><![CDATA[
<p>YYC is the airport code for Calgary, Canada. Why is it on the US .gov site? Is there something I missed?</p>
]]></description><pubDate>Fri, 25 Apr 2025 02:15:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=43789537</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=43789537</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43789537</guid></item><item><title><![CDATA[New comment by pavelstoev in "Zoom outage caused by accidental 'shutting down' of the zoom.us domain"]]></title><description><![CDATA[
<p>Can’t have an apologetic Zoom call when Zoom is down…</p>
]]></description><pubDate>Thu, 17 Apr 2025 02:41:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=43712587</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=43712587</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43712587</guid></item><item><title><![CDATA[New comment by pavelstoev in "Lumon Macro Data Refinement Multiplayer Game"]]></title><description><![CDATA[
<p>Impressive!</p>
]]></description><pubDate>Sat, 12 Apr 2025 15:12:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=43665072</link><dc:creator>pavelstoev</dc:creator><comments>https://news.ycombinator.com/item?id=43665072</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43665072</guid></item></channel></rss>