<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ismailmaj</title><link>https://news.ycombinator.com/user?id=ismailmaj</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 16 Jun 2026 11:57:20 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ismailmaj" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ismailmaj in "Statement on US government directive to suspend access to Fable 5 and Mythos 5"]]></title><description><![CDATA[
<p>My main read on this is that European governments will invest more aggressively in regional AI labs.<p>With Nvidia chips you could've deluded yourself that the US is just anti-China; that position is harder to argue for right now.</p>
]]></description><pubDate>Sat, 13 Jun 2026 08:55:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48515052</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48515052</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48515052</guid></item><item><title><![CDATA[New comment by ismailmaj in "Ask HN: Are most corporate SWE jobs performative?"]]></title><description><![CDATA[
<p>In my experience a lot of companies try really hard to be data-oriented and to find objective metrics for impact. Sometimes that’s good, often it’s bad: LOC count, PR count, time in meetings, or time spent at the office.<p>Enough of this and people will learn to play the game instead of doing the right thing.</p>
]]></description><pubDate>Wed, 10 Jun 2026 13:50:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=48476276</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48476276</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48476276</guid></item><item><title><![CDATA[New comment by ismailmaj in "I am giving up on VM Gaming"]]></title><description><![CDATA[
<p>I had a much better experience gaming on macOS than I ever did using Linux + X (it was first dual boot with Windows, then dual GPU + VM, then Proton).<p>But that's mainly because I no longer target games with stringent anti-cheat or high setup requirements (though the M5 is quite powerful right now), and a lot of games are ported natively to macOS but not Linux (the most recent being Age of Empires II: Definitive Edition, 1 week ago).</p>
]]></description><pubDate>Sun, 07 Jun 2026 11:06:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48433701</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48433701</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48433701</guid></item><item><title><![CDATA[New comment by ismailmaj in "Gemma 4 12B: A unified, encoder-free multimodal model"]]></title><description><![CDATA[
<p>Gemini is a huge team while Gemma is relatively small.
They can totally do this at a loss with no ulterior motive.<p>They remind me a bit of HuggingFace, create something great then make money … maybe.</p>
]]></description><pubDate>Wed, 03 Jun 2026 18:21:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48387702</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48387702</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48387702</guid></item><item><title><![CDATA[New comment by ismailmaj in "Real-time LLM Inference on Standard GPUs: 3k tokens/s per request"]]></title><description><![CDATA[
<p>I expected a 4090, maybe 2.
I did not expect 8xH200 for a 2B model.</p>
]]></description><pubDate>Fri, 29 May 2026 11:13:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48321655</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48321655</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48321655</guid></item><item><title><![CDATA[New comment by ismailmaj in "Claude Opus 4.8"]]></title><description><![CDATA[
<p>I just asked the model for details about the upcoming SpaceX IPO and it responded with “There’s no confirmed SpaceX IPO. Elon Musk has said for years that SpaceX itself won’t go public”. It took me two pushbacks and specifically asking for web search.<p>I feel like I won’t like this model, just like I didn’t like 4.7: it pushes back a lot and avoids thinking or searching as much as possible.</p>
]]></description><pubDate>Thu, 28 May 2026 22:19:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48316325</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48316325</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48316325</guid></item><item><title><![CDATA[New comment by ismailmaj in "OpenAI Is Preparing to File for an IPO Soon"]]></title><description><![CDATA[
<p>Three things: they can, there is precedent for it in Google v. Oracle over Java, and they already have something!<p>AMD engineered HIP, a set of CUDA-API-compatible libraries that target AMD's hardware; it's the closest thing we have to a drop-in replacement for Nvidia's software moat.<p>It works for simple stuff but loses badly on frontier kernels (like Flash Attention 3), novel approaches (e.g. Mamba), or networking (e.g. NCCL). It is also rough around the edges, so what you gain in GPU costs is lost in engineering cost.<p>My previous company (Rivos) tried to compete in this GPU game while putting effort into a good software stack: a drop-in replacement, cheaper, with decent software.<p>But that vision was rough. Any new player had to implement the bad APIs because of backward-compatibility concerns, following specs wasn't sufficient because a lot of the AI stack depended on observable effects (Hyrum's Law), and Nvidia simply had a long head start. The company is now dead (acquired by Meta) and AFAIK there isn't another player.<p>Best-case scenario, AMD puts more effort into their software stack, but I just don't think they have enough internal talent to compete.<p>Training will continue to be an Nvidia thing, and that's where most of the money sits, unless the AI research scene suddenly pivots to JAX, which I do not see coming any time soon. If anything, I've seen internal efforts at Google to make PyTorch work nicely with TPUs. Some players like Anthropic started using JAX for training, but all the small players are using Nvidia; I'm guessing it has something to do with Nvidia partnering aggressively with startups.</p>
]]></description><pubDate>Thu, 21 May 2026 13:20:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48222187</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48222187</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48222187</guid></item><item><title><![CDATA[New comment by ismailmaj in "OpenAI Is Preparing to File for an IPO Soon"]]></title><description><![CDATA[
<p>Their moat is CUDA, the CUDA libraries, and everything built on top.<p>When a new architecture drops, it's always PyTorch running on CUDA; other PyTorch backends are best effort. Even if they reach feature parity, many industry power users went closer to the metal to squeeze out performance, and that work is too specific to Nvidia hardware.<p>If something beats Nvidia, it won't be something reaching feature parity with slightly better economics (like AMD; Nvidia could also just reduce its margins). It needs to be a novel approach worth rewriting the codebase for (maybe Cerebras, maybe a new player).</p>
]]></description><pubDate>Thu, 21 May 2026 12:01:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48221246</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48221246</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48221246</guid></item><item><title><![CDATA[New comment by ismailmaj in "Meta's embrace of A.I. is making its employees miserable"]]></title><description><![CDATA[
<p>I'm thinking that personally, technology is not bad in a vacuum and not necessarily bad in society, but it just reveals that our system is ill-equipped to guarantee good usage of it.<p>We could have fun defining what's good usage but we're so far from it, it would just make me sad.</p>
]]></description><pubDate>Sat, 09 May 2026 20:55:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48078171</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=48078171</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48078171</guid></item><item><title><![CDATA[New comment by ismailmaj in "Cerebras S-1"]]></title><description><![CDATA[
<p>Unclear if it's the only cause, but wafer scale is great for very low latency and loses on throughput per dollar compared to classic Nvidia-like GPUs.
I don't think they can close the gap: SRAM is just more expensive than HBM and their architecture needs a lot of it.<p>So the price necessarily makes it niche, limited to specific use cases like HFT or intelligent duplex voice assistants. I'm still semi-bullish personally.</p>
]]></description><pubDate>Sat, 18 Apr 2026 02:44:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47812717</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47812717</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47812717</guid></item><item><title><![CDATA[New comment by ismailmaj in "MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU"]]></title><description><![CDATA[
<p>Obsolete because of what? Because with limited hardware you’re never aiming for state of the art, and for fine-tuning, you don’t steer for too long anyway.</p>
]]></description><pubDate>Wed, 08 Apr 2026 13:55:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47690278</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47690278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47690278</guid></item><item><title><![CDATA[New comment by ismailmaj in "25 Years of Eggs"]]></title><description><![CDATA[
<p>I don't know why people mess with Tesseract in 2026; attention-based OCR models (and more recently VLMs) have outperformed any LSTM-based approach since at least 2020.<p>My guess is that it's the entry point to OCR and the internet is flooded with it, just like pandas for data processing.</p>
]]></description><pubDate>Sun, 22 Mar 2026 12:55:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47477019</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47477019</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47477019</guid></item><item><title><![CDATA[New comment by ismailmaj in "BitNet: Inference framework for 1-bit LLMs"]]></title><description><![CDATA[
<p>Assuming 2 bits per value (the first bit is the sign and the second bit is the value).<p>actv = A[_:1] & B[_:1]<p>sign = A[_:0] ^ B[_:0]<p>dot = pop_count(actv & ~sign) - pop_count(actv & sign)<p>It can probably be made more efficient with a column-first format.<p>Since we are in CPU land, we mostly deal with dot products that match the cache size. I'm not assuming a tiled matmul instruction, which would be unlikely to support this odd 1-bit format anyway.</p>
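<p>As a rough sketch of the sign/value trick above, assuming each ternary vector is stored as two integer bit planes (one for sign, one for "is nonzero"): the helper names pack_ternary and ternary_dot are hypothetical and only illustrate the idea; a real kernel would use SIMD registers and a hardware popcount.</p>

```python
def pack_ternary(values):
    """Pack a sequence of -1/0/+1 into (sign_plane, value_plane) bitmasks.

    Bit i of value_plane is set when element i is nonzero.
    Bit i of sign_plane is set when element i is negative.
    """
    sign_plane = 0
    value_plane = 0
    for i, v in enumerate(values):
        if v not in (-1, 0, 1):
            raise ValueError(f"not a ternary value: {v!r}")
        if v != 0:
            value_plane |= 1 << i
        if v < 0:
            sign_plane |= 1 << i
    return sign_plane, value_plane


def popcount(x):
    """Count set bits of a non-negative integer."""
    return bin(x).count("1")


def ternary_dot(a, b):
    """Dot product of two packed ternary vectors using AND, XOR and popcount."""
    a_sign, a_value = a
    b_sign, b_value = b
    actv = a_value & b_value   # positions where both elements are nonzero
    sign = a_sign ^ b_sign     # positions where the two signs differ
    positives = popcount(actv & ~sign)  # products equal to +1
    negatives = popcount(actv & sign)   # products equal to -1
    return positives - negatives


if __name__ == "__main__":
    x = [1, -1, 0, 1, -1]
    y = [1, 1, -1, -1, -1]
    print(ternary_dot(pack_ternary(x), pack_ternary(y)))
    print(sum(p * q for p, q in zip(x, y)))  # reference result
```

<p>Storing the two planes separately is one reading of the "column-first" layout mentioned above: each plane can then be processed a whole machine word at a time.</p>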
]]></description><pubDate>Wed, 11 Mar 2026 18:09:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47339074</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47339074</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47339074</guid></item><item><title><![CDATA[New comment by ismailmaj in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>Confusing, since this is specific to an architecture that no one making money will use (8B is consumer space, not enterprise).
The produced code shouldn't hold much interesting IP?</p>
]]></description><pubDate>Wed, 11 Mar 2026 17:43:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47338732</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47338732</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47338732</guid></item><item><title><![CDATA[New comment by ismailmaj in "Why the global elite gave up on spelling and grammar"]]></title><description><![CDATA[
<p>Oh no, cortisol spike in my text-only forum.</p>
]]></description><pubDate>Wed, 11 Mar 2026 17:24:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47338468</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47338468</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47338468</guid></item><item><title><![CDATA[New comment by ismailmaj in "BitNet: 100B Param 1-Bit model for local CPUs"]]></title><description><![CDATA[
<p>The title and the repo use "1-bit" when they mean 1.58-bit ternary values; it doesn't change any of my arguments (still XORs and pop_counts).</p>
]]></description><pubDate>Wed, 11 Mar 2026 17:02:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47338172</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47338172</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47338172</guid></item><item><title><![CDATA[New comment by ismailmaj in "BitNet: 100B Param 1-Bit model for local CPUs"]]></title><description><![CDATA[
<p>You drop the memory throughput requirements because of the packed representation of bits, so an FMA can become the bottleneck, and you bypass the problem of needing to upcast the bits to whatever FP format the FMA instruction needs.<p>Typically for 1-bit matmul, you can get away with XORs and pop_counts, which should have a better throughput profile than FMA when taking into account the SIMD nature of the inputs/outputs.</p>
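<p>To illustrate the XOR and popcount point, here is a minimal sketch for strictly binary (+1/-1) vectors, assuming +1 is stored as bit 1 and -1 as bit 0. The names pack_signs and binary_dot are hypothetical. Every position where the bits match contributes +1 and every mismatch contributes -1, so the dot product is n minus twice the number of mismatches.</p>

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 into an integer bitmask (+1 -> 1, -1 -> 0)."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
        elif v != -1:
            raise ValueError(f"not a +1/-1 value: {v!r}")
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XOR marks the positions where the signs differ; each such position
    contributes -1 and every other position contributes +1.
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


if __name__ == "__main__":
    x = [1, -1, 1, 1]
    y = [1, 1, -1, 1]
    print(binary_dot(pack_signs(x), pack_signs(y), len(x)))
    print(sum(p * q for p, q in zip(x, y)))  # reference result
```

<p>No value is ever converted to floating point here, which is the upcasting step the comment says this approach avoids.</p>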
]]></description><pubDate>Wed, 11 Mar 2026 14:37:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47336142</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47336142</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47336142</guid></item><item><title><![CDATA[New comment by ismailmaj in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>Ah yes, OpenAI the puppet of Microsoft that is currently declaring war against GitHub, sounds logical.</p>
]]></description><pubDate>Wed, 11 Mar 2026 11:35:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47334283</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47334283</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47334283</guid></item><item><title><![CDATA[New comment by ismailmaj in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>Any place we can find the code?</p>
]]></description><pubDate>Wed, 11 Mar 2026 11:01:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47334028</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47334028</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47334028</guid></item><item><title><![CDATA[New comment by ismailmaj in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>Don’t think that’s a fair interpretation of what I said.<p>Liquid money rich? No.<p>Can get pulled for big tech packages? Also no, for most of the employees.<p>AFAIK, big tech didn’t aggressively poach OpenAI-like talent; they did hand out 10M+ pay packages, but only to a select few research scientists. Some folks came and went, but it mostly boiled down to culture.</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:39:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327864</link><dc:creator>ismailmaj</dc:creator><comments>https://news.ycombinator.com/item?id=47327864</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327864</guid></item></channel></rss>