<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: diamondlovesyou</title><link>https://news.ycombinator.com/user?id=diamondlovesyou</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 13 May 2026 14:43:46 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=diamondlovesyou" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by diamondlovesyou in "Deterministic Fully-Static Whole-Binary Translation Without Heuristics"]]></title><description><![CDATA[
<p>That won't be located on the stack either. The underlying buffer will be a TU-local, i.e. static, and not rx</p>
]]></description><pubDate>Wed, 13 May 2026 05:54:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=48118336</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=48118336</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48118336</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "The acyclic e-graph: Cranelift's mid-end optimizer"]]></title><description><![CDATA[
<p>> This post makes it seem like the pass ordering problem is bigger than it really is and then overestimates the extent to which egraphs solve it.<p>It isn't so much of a problem for state-of-the-art implementations like LLVM, but it is for high-level IRs like those present in MLIR. For LLVM, you're basically always in the same representation and every pass operates on that shared representation. But even then, this is not quite true: for example, SLP in LLVM is one of the last passes, because running SLP before most "latency-sensitive cleanups" would break most of them.<p>High-level-to-low-level lowering pipelines in particular suffer very heavily from these ordering concerns.</p>
]]></description><pubDate>Tue, 14 Apr 2026 20:44:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47771247</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=47771247</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47771247</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Why are credit card rates so high?"]]></title><description><![CDATA[
<p>I don't use credit cards for the credit; in fact, mine are paid in full every statement. I use them for the customer protections and the other "free" benefits they provide. If some scummy or outright scammy thing is charged to my Amex, I know I will have Amex on my side. If my card is stolen, Amex will refund any fraudulent charges and overnight me a new card; I probably won't get my debit card overnighted, though the bank will probably refund the fraud. Then there are credit card points, which are essentially a benefit paid for by the processing fees charged to businesses. Many cards also offer access to "private" airport lounges, plus other benefits I'm forgetting off the top of my head.<p>Additionally, having high credit limits, low utilization, and older accounts improves credit scores for loans/etc.<p>No interest is charged if no balance is carried statement-to-statement, so why bother with silly debit PINs and such?<p>That's how it becomes the default way to pay; it's not really "credit".</p>
]]></description><pubDate>Tue, 01 Apr 2025 23:11:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=43552270</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=43552270</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43552270</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Xfinity XB3 hardware mod: Disable WiFi and save 2 watts"]]></title><description><![CDATA[
<p>Less area means fewer sources of interference for others (and the same holds in the other direction). So the attenuation shrinks the signal's coverage area, and stronger attenuation lets the transmitter be "strong" inside the house without the downsides in congested areas.</p>
]]></description><pubDate>Mon, 31 Mar 2025 03:37:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=43530653</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=43530653</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43530653</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Constant-Time Code: The Pessimist Case [pdf]"]]></title><description><![CDATA[
<p>> Why is cooperation unlikely? AFAIK it’s not too hard to make a compiler support a function attribute that says “do not optimize this function at all”<p>Compilers like Clang actually generate terrible code; it's expected that a sufficiently smart optimizer (of which LLVM is a member) will clean it up anyway, so Clang makes no attempt to generate good code. Rust is similar. For example, a simple for-loop's induction variable is stored/loaded to an alloca (i.e. the stack) on every use; it isn't an SSA variable. So one of the first things in the optimization pipeline is to promote those to SSA registers/variables. Disabling that would cost a ton of perf right there, never mind the impact on instruction combining/value tracking/scalar evolution, and crypto is pretty perf-sensitive right after security.<p>BTW, Clang/LLVM already has such a function-level attribute, `optnone`, which was actually added to support LTO. But it's all or nothing; LLVM IR/Clang doesn't have the info needed to know which instructions are timing-sensitive.</p>
]]></description><pubDate>Thu, 13 Mar 2025 02:33:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=43349861</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=43349861</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43349861</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "High-speed 10Gbps full-mesh network based on USB4 for just $47.98"]]></title><description><![CDATA[
<p>GB6 will use Zen 4's AVX512, which Zen 2 doesn't support.</p>
]]></description><pubDate>Mon, 15 Jan 2024 18:47:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=39004486</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=39004486</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39004486</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Rust std fs slower than Python? No, it's hardware"]]></title><description><![CDATA[
<p>Fast is relative here. These are microcoded instructions, which are generally terrible for latency: microcoded instructions don't get branch prediction benefits, nor OoO benefits (they lock the FE/scheduler while running). Small memcpys/moves are always latency-bound, hence even if the HW supports "fast" rep store, you're better off not using it. L2 is wicked fast, and these copies are linear, so prediction will be good.<p>Note that for rep store to be better, it must overcome the cost of the initial latency and then catch up to the 32-byte vector copies, which yes, don't quite reach DRAM speed, but aren't that bad either. Thus for small copies... just don't use string stores.<p>All this without even considering non-temporal loads/stores; many larger copies would see better perf by not trashing the L2 cache, since the destination or source is often not inspected right afterward. String stores don't have a non-temporal option, so this has to be done with vectors.</p>
]]></description><pubDate>Wed, 29 Nov 2023 18:37:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=38463286</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=38463286</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38463286</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Rust std fs slower than Python? No, it's hardware"]]></title><description><![CDATA[
<p>AMD's string store is not like Intel's. Generally, you don't want to use it until the copy size is past the CPU's L2 size (L3 is a victim cache), making ~2 KB WAY too small. Past that point it's profitable to use string store, which should run at "DRAM speed". But it has a high startup cost, hence 256-bit vector loads/stores should be used until that threshold is met.</p>
]]></description><pubDate>Wed, 29 Nov 2023 16:56:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=38461882</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=38461882</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38461882</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Steam Deck OLED"]]></title><description><![CDATA[
<p>I have been very happy with my Minisforum Venus UM790, though I use it as a mobile computer since I can just throw it into my backpack. It's been great to have access to AVX512 on the go.</p>
]]></description><pubDate>Thu, 09 Nov 2023 19:33:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=38209842</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=38209842</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38209842</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Speed Up C++ Compilation"]]></title><description><![CDATA[
<p>> It is not a language flaw. C++ requires types to be complete when defining them because it needs to have access to their internal structure and layout to be in a position to apply all the optimizations that C++ is renowned for. Knowing this, at most it's a design tradeoff, and one where C++ came out winning.<p>This statement is incorrect. "Definition resolution" (my made-up term for FE Stuff(TM), not what I work on) happens during the frontend compilation phase. Optimization is a backend phase, and we don't use source-level info about type layout there. The FE does all that layout work and gives the BE an IR which uses explicit offsets.<p>C++ doesn't allow two-phase lookup (at least originally); that's why definitions must precede uses.</p>
]]></description><pubDate>Sun, 06 Aug 2023 01:24:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=37018036</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=37018036</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37018036</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Speed Up C++ Compilation"]]></title><description><![CDATA[
<p>The power of the optimizations available to C++ is what makes it so fast (see how slow debug mode is vs -O2/etc), and what allows C++ to be fast in the face of common, easy-to-understand, but technically perf-hostile patterns: bit-counting loops vs popcnt, auto-vectorization, DCE, RCE, CSE, CFG simplification, LTCG/LTO, and so on. These things let you write "high level" code/algos (to a point: there are some "high level" paradigms that absolutely eviscerate the compiler's ability to optimize) and still get great hardware-level performance. This is far more important overall than the time it takes to compile your program, even more so once you consider that such programs are often shipped once and then enter maintenance mode.<p>It doesn't really have much to do with compatibility (not entirely; the biggest fixable obstacles to good optimization quality need a system-level rethinking of how hardware exceptions happen). It just isn't reasonable to expect developers to know how to optimize by hand, and doing so doesn't scale.</p>
]]></description><pubDate>Sun, 06 Aug 2023 01:14:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=37017965</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=37017965</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37017965</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "The World Might Be Better Off Without College for Everyone (2018)"]]></title><description><![CDATA[
<p><a href="https://archive.li/ZN5MJ" rel="nofollow noreferrer">https://archive.li/ZN5MJ</a></p>
]]></description><pubDate>Fri, 07 Jul 2023 04:18:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=36627046</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=36627046</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36627046</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "The RISC Wars Part 1: The Cambrian Explosion"]]></title><description><![CDATA[
<p>CISC vs RISC doesn't matter. An ISA should ideally be a healthy mixture of both (citation needed). Arm64 allows memory operands, "just" like x86, but it still has code size issues. Memory operands (i.e. having a bit of address calculation in the load that's fused into its use) are very useful for reducing register pressure, an issue that every call ABI must contend with. This is something that the RISC ISA totally misses (and ARM64... isn't really RISC).<p>The issue with this "debate" is that it misses the forest for the trees. Instead we should be talking about binary encoding (i.e. how much "variability" is required), and you're right on that bit; memory isn't the issue it once was.</p>
]]></description><pubDate>Mon, 01 May 2023 03:50:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=35768918</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=35768918</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35768918</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "JDK 20 G1/Parallel/Serial GC Changes"]]></title><description><![CDATA[
<p>Sadly, nobody can run from memory management.</p>
]]></description><pubDate>Sat, 18 Mar 2023 05:37:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=35206441</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=35206441</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35206441</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Project Orion"]]></title><description><![CDATA[
<p>> Like almost any thorny military problem of the 1950s, the solution was the application of nuclear bombs.<p>Magnificent.</p>
]]></description><pubDate>Fri, 17 Mar 2023 03:56:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=35192873</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=35192873</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35192873</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Trimming spaces from strings faster with SVE on an Amazon Graviton 3 processor"]]></title><description><![CDATA[
<p>I don't think scalable vectors are a particularly useful feature, especially compared to what compilers have to go through to support them. It's much more useful to be able to do "more powerful" things with existing vector widths at hardware speeds (or perhaps just make the existing stuff faster) than to be able to go wider. Scalable vectors also don't solve the ISA problem: don't break existing processors.</p>
]]></description><pubDate>Mon, 13 Mar 2023 23:56:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=35145094</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=35145094</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35145094</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Bing: “I will not harm you unless you harm me first”"]]></title><description><![CDATA[
<p>Add a period to the end of the sentence and the aberration is gone.<p>"My god, it seems even absolute pacifists sometimes take actions that harm others, even if existence itself is suffering." (original Korean: "맙소사, 절대평화주의자들도 가끔 존재 자체가 고통이라 해도 남에게 해를 끼치는 행동을 하는 것 같아요.")</p>
]]></description><pubDate>Thu, 16 Feb 2023 18:02:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=34822640</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=34822640</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34822640</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "GPU Caching Compared Among AMD, Intel UHD, Apple M1"]]></title><description><![CDATA[
<p>See AMD "Smart Memory" a.k.a. PCIe Large Bar. This expands the amount of GPU memory that the CPU can directly access, usually to the GPU's entire memory range (ordinarily only ~256Mb is accessible). GPU->CPU Reads have very high latencies, but that's not an issue for CPU->GPU writes.<p>GPUs have been able to access "host" memory for a long time now, with a few restrictions: you have to setup the GPU mappings first and pin the pages in memory.</p>
]]></description><pubDate>Tue, 17 Jan 2023 05:17:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=34409397</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=34409397</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34409397</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Why did the F-14 Tomcat retire decades before its peers? (2021)"]]></title><description><![CDATA[
<p>> By the time of the second gulf war, the F-14D cost 20% more per unit than the F-18E, and some 80-100% more to maintain. I'm not sure how you're concluding that the taxpayers were ripped off by that.<p>They aren't concluding that; they're repeating what they were told at the time.</p>
]]></description><pubDate>Sun, 30 Oct 2022 23:15:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=33400056</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=33400056</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33400056</guid></item><item><title><![CDATA[New comment by diamondlovesyou in "Ushering out strlcpy()"]]></title><description><![CDATA[
<p>Naw, just use {pointer, length} tuples. Crisis averted.</p>
]]></description><pubDate>Sat, 27 Aug 2022 11:19:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=32617814</link><dc:creator>diamondlovesyou</dc:creator><comments>https://news.ycombinator.com/item?id=32617814</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32617814</guid></item></channel></rss>