<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: djsjajah</title><link>https://news.ycombinator.com/user?id=djsjajah</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 18 Apr 2026 09:07:18 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=djsjajah" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by djsjajah in "The AI revolution in math has arrived"]]></title><description><![CDATA[
<p>I don't follow. Can you explain how your comment is relevant to mine? It might help if you also explain how you interpreted my comment.</p>
]]></description><pubDate>Tue, 14 Apr 2026 22:59:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47772549</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=47772549</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47772549</guid></item><item><title><![CDATA[New comment by djsjajah in "The AI revolution in math has arrived"]]></title><description><![CDATA[
<p>You just failed the Turing test.</p>
]]></description><pubDate>Tue, 14 Apr 2026 04:44:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47761344</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=47761344</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47761344</guid></item><item><title><![CDATA[New comment by djsjajah in "Taking on CUDA with ROCm: 'One Step After Another'"]]></title><description><![CDATA[
<p>I have 2 of them. I would advise against it if you want to run things like vllm. I have had the cards for months and I still have not been able to create a uv env with trl and vllm. vllm itself works fine in docker for some models: with one GPU, gpt-oss 20b decodes at a cumulative 600-800 tps with 32 concurrent requests, depending on context length, but I was getting trash performance out of Qwen3.5 and Gemma4.<p>If I were to do it again, I’d probably just get a DGX Spark. I don’t think it’s been worth the hassle.</p>
]]></description><pubDate>Mon, 13 Apr 2026 11:26:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47750516</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=47750516</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47750516</guid></item><item><title><![CDATA[New comment by djsjajah in "Taking on CUDA with ROCm: 'One Step After Another'"]]></title><description><![CDATA[
<p>> or by the community<p>Hmmm</p>
]]></description><pubDate>Mon, 13 Apr 2026 07:11:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47748722</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=47748722</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47748722</guid></item><item><title><![CDATA[New comment by djsjajah in "Quantization from the Ground Up"]]></title><description><![CDATA[
<p>Yes, but the difference between one model and one 4x larger is usually a lot more than that.<p>It is not a question of whether to run Qwen 8b at bf16 or a quantized version of it. It is more a question of whether to run Qwen 8b at full precision or a quantized version of Qwen 27b.<p>You will find that you are usually better off with the larger model.</p>
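To put rough numbers on this (my own illustrative figures, weights only, ignoring KV cache and activations):

```python
# Back-of-envelope weights-only memory: parameters * bytes per parameter.
def weight_gib(params_b: float, bits: int) -> float:
    """Approximate weights-only footprint in GiB for a model with
    params_b billion parameters stored at the given bit width."""
    return params_b * 1e9 * bits / 8 / 2**30

print(f"8B  @ bf16 : {weight_gib(8, 16):.1f} GiB")   # ~14.9 GiB
print(f"27B @ 4-bit: {weight_gib(27, 4):.1f} GiB")   # ~12.6 GiB
```

So the quantized ~3.4x-larger model actually fits in slightly less memory than the small model at full precision.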
]]></description><pubDate>Thu, 26 Mar 2026 01:35:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=47525710</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=47525710</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47525710</guid></item><item><title><![CDATA[New comment by djsjajah in "Tinybox – Offline AI device 120B parameters"]]></title><description><![CDATA[
<p>trl.
Give me a uv command to get that working.<p>But even within the AMD stack, in things like CK and aiter, consumer cards are not even second-class citizens. They are a distant third at best.
If you just want to run vllm with the latest model, then even if you can get it running at all, there are going to be paper cuts all along the way, and even then the performance won't be close to what you could be getting out of the hardware.</p>
]]></description><pubDate>Sun, 22 Mar 2026 00:19:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47473019</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=47473019</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47473019</guid></item><item><title><![CDATA[New comment by djsjajah in "Attention Residuals"]]></title><description><![CDATA[
<p>No. It seems to me that the comment is objectively incorrect.
The original comment was talking about inference, and from what I can tell, this approach is strictly going to run slower than a model trained to the same loss without it (it has "minimal overhead"). The main point is that you won't need to train that model for as long.</p>
]]></description><pubDate>Fri, 20 Mar 2026 21:34:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47460910</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=47460910</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47460910</guid></item><item><title><![CDATA[New comment by djsjajah in "GLM-5: Targeting complex systems engineering and long-horizon agentic tasks"]]></title><description><![CDATA[
<p>That’s kind of a moot point. Even if none of those overheads existed, you would still be getting a fraction of the MFU. Models are fundamentally limited by memory bandwidth, even in the best-case scenarios of SFT or prefill.<p>And what are you doing that I/O is a bottleneck?</p>
]]></description><pubDate>Thu, 12 Feb 2026 01:01:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=46983567</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=46983567</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46983567</guid></item><item><title><![CDATA[New comment by djsjajah in "Nvidia Stock Crash Prediction"]]></title><description><![CDATA[
<p>> including all previous experiments<p>How far back do you go? What about experiments into architecture features that didn’t make the cut? What about pre-transformer attention?<p>But more generally, why are you so sure that the team that built Gemini didn’t exclusively use TPUs while they were developing it?<p>I think that one of the reasons Gemini caught up so quickly is that they have so much compute at a fraction of the price of everyone else.</p>
]]></description><pubDate>Tue, 20 Jan 2026 21:23:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=46697922</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=46697922</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46697922</guid></item><item><title><![CDATA[New comment by djsjajah in "Command-line Tools can be 235x Faster than your Hadoop Cluster (2014)"]]></title><description><![CDATA[
<p>Not only can it be streamed, but lz4 will probably make things quicker.</p>
]]></description><pubDate>Sun, 18 Jan 2026 22:31:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=46672841</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=46672841</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46672841</guid></item><item><title><![CDATA[New comment by djsjajah in "Databases in 2025: A Year in Review"]]></title><description><![CDATA[
<p>You just ruined my day. The post makes it sound like Gel is now dead. The post by Vercel does not give me much hope either [1]. The last commit on the Gel repo was two weeks ago.<p>[1] <a href="https://vercel.com/blog/investing-in-the-python-ecosystem" rel="nofollow">https://vercel.com/blog/investing-in-the-python-ecosystem</a></p>
]]></description><pubDate>Mon, 05 Jan 2026 10:45:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=46497316</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=46497316</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46497316</guid></item><item><title><![CDATA[New comment by djsjajah in "Local AI is driving the biggest change in laptops in decades"]]></title><description><![CDATA[
<p>> Do you really though?<p>Yes.<p>It stays in the HBM, but it needs to get shuffled to the place where the computation actually happens. It’s a lot like a normal CPU: the CPU can’t do anything with data in system memory; it has to be loaded into a CPU register.
For every token that is generated, a dense LLM has to read every parameter in the model.</p>
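That gives a simple ceiling on single-stream decode speed (illustrative numbers of my own, not from the thread):

```python
# Rough upper bound for dense-model decode: every parameter is read
# once per token, so tokens/s <= memory bandwidth / bytes of weights.
def max_decode_tps(params_b: float, bytes_per_param: float,
                   bandwidth_gb_s: float) -> float:
    weight_bytes = params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / weight_bytes

# e.g. a hypothetical 70B model in fp16 on ~2 TB/s of HBM
print(round(max_decode_tps(70, 2, 2000), 1))  # -> 14.3 tokens/s ceiling
```

No amount of extra compute raises that ceiling; only more bandwidth or fewer bytes per weight does.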
]]></description><pubDate>Tue, 23 Dec 2025 20:56:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=46369368</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=46369368</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46369368</guid></item><item><title><![CDATA[New comment by djsjajah in "Local AI is driving the biggest change in laptops in decades"]]></title><description><![CDATA[
<p>GPUs might not be bandwidth starved most of the time, but they absolutely are when generating text from an LLM.
It’s the whole reason low-precision floating-point formats are being pushed by Nvidia.</p>
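A quick sketch of why precision matters here (my own illustrative numbers): since decode is bandwidth-bound, halving the bytes moved per weight roughly doubles the tokens/s ceiling.

```python
# tokens/s ceiling ~= bandwidth (GB/s) / (G params * bytes per param);
# compute and KV-cache traffic ignored for simplicity.
def decode_ceiling(bandwidth_gb_s: float, params_b: float,
                   bytes_per_param: float) -> float:
    return bandwidth_gb_s / (params_b * bytes_per_param)

fp16 = decode_ceiling(2000, 70, 2)
fp8 = decode_ceiling(2000, 70, 1)
print(f"fp16: {fp16:.1f} tps, fp8: {fp8:.1f} tps")  # ~2x from precision alone
```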
]]></description><pubDate>Tue, 23 Dec 2025 20:41:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=46369247</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=46369247</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46369247</guid></item><item><title><![CDATA[New comment by djsjajah in "Why CUDA translation wont unlock AMD"]]></title><description><![CDATA[
<p>I can't tell if you are making a joke or not.<p>They are not even remotely equivalent. tinygrad is a toy.<p>If you are serious, I would be interested to hear how you see tinygrad replacing CUDA. I could see a tinygrad zealot arguing that it is going to replace torch, but CUDA??<p>Have you looked into AMD support in torch? I would wager that, like for like, a torch/AMD implementation of a model is going to run rings around a tinygrad/AMD implementation.</p>
]]></description><pubDate>Thu, 20 Nov 2025 05:10:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=45989197</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=45989197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45989197</guid></item><item><title><![CDATA[New comment by djsjajah in "Cloudflare Global Network experiencing issues"]]></title><description><![CDATA[
<p>I went to check how many services are being impacted on down detector, but it was down.</p>
]]></description><pubDate>Tue, 18 Nov 2025 12:04:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=45964253</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=45964253</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45964253</guid></item><item><title><![CDATA[New comment by djsjajah in "Can I stop drone delivery companies flying over my property?"]]></title><description><![CDATA[
<p>If we didn't have people like that, then they would be right.</p>
]]></description><pubDate>Tue, 03 Jun 2025 03:09:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=44165890</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=44165890</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44165890</guid></item><item><title><![CDATA[New comment by djsjajah in "Ask HN: What is the simplest data orchestration tool you've worked with?"]]></title><description><![CDATA[
<p>A few people have mentioned Dagster, and I took a look at it for some machine learning things I was playing with, but then I found dvc (data version control [1]) and I think it is fantastic. It has applications beyond machine learning, really anything with data. If you have a bunch of shell scripts that write to files to pass data around, then dvc might be a good fit. It will do things like only rerun steps when it needs to.
Also, for totally non-data stuff, Prefect is great.<p>[1] <a href="https://dvc.org" rel="nofollow">https://dvc.org</a></p>
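To give a flavor of what that looks like, here is a hypothetical `dvc.yaml` (stage names, scripts, and file names are my own illustration):

```yaml
# Each stage declares its command, inputs (deps) and outputs (outs);
# dvc hashes deps and skips stages whose inputs haven't changed.
stages:
  prepare:
    cmd: ./prepare.sh raw.csv clean.csv
    deps:
      - prepare.sh
      - raw.csv
    outs:
      - clean.csv
  train:
    cmd: python train.py clean.csv model.pkl
    deps:
      - train.py
      - clean.csv
    outs:
      - model.pkl
```

Running `dvc repro` then re-executes only the stages whose dependencies changed since the last run.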
]]></description><pubDate>Fri, 21 Mar 2025 21:31:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=43441005</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=43441005</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43441005</guid></item><item><title><![CDATA[New comment by djsjajah in "Age Verification Laws: A Backdoor to Surveillance"]]></title><description><![CDATA[
<p>No. Kids would need to memorize the private key of their parents' ID card.</p>
]]></description><pubDate>Fri, 07 Mar 2025 23:50:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=43296080</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=43296080</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43296080</guid></item><item><title><![CDATA[New comment by djsjajah in "Tailscale is pretty useful"]]></title><description><![CDATA[
<p>You could have also self-hosted the GitHub Actions runner which might have been easier as long as you had something to run the runner on.</p>
]]></description><pubDate>Wed, 05 Mar 2025 22:53:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=43273882</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=43273882</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43273882</guid></item><item><title><![CDATA[New comment by djsjajah in "Garmin's –$40B Pivot"]]></title><description><![CDATA[
<p>I think everyone should be reminded that a few years ago Garmin let someone take down their whole network (globally) and then paid the ransom after a few days [1]. In my opinion, the company does not deserve your money.<p>[1] <a href="https://arstechnica.com/information-technology/2020/07/garmans-four-day-service-meltdown-was-caused-by-ransomware/" rel="nofollow">https://arstechnica.com/information-technology/2020/07/garma...</a></p>
]]></description><pubDate>Tue, 21 Jan 2025 12:43:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=42779474</link><dc:creator>djsjajah</dc:creator><comments>https://news.ycombinator.com/item?id=42779474</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42779474</guid></item></channel></rss>