<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: otherjason</title><link>https://news.ycombinator.com/user?id=otherjason</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 31 May 2026 17:19:40 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=otherjason" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by otherjason in "Dav2d"]]></title><description><![CDATA[
<p>Almost every Intel CPU released since 2013 has AVX2 support. Some Atom SKUs were longer holdouts, but the fraction of x86 CPUs shipped in the last decade that have AVX2 support is very high.</p>
]]></description><pubDate>Sun, 31 May 2026 14:46:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48346091</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=48346091</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48346091</guid></item><item><title><![CDATA[New comment by otherjason in "Building a web server in aarch64 assembly to give my life (a lack of) meaning"]]></title><description><![CDATA[
<p>GitLab's statistics for the x264 repo (<a href="https://code.videolan.org/videolan/x264" rel="nofollow">https://code.videolan.org/videolan/x264</a>) show that the project is 13.5% assembly; common utilities used in the inner loops of the codec have optimized assembly implementations for several CPU architectures.</p>
]]></description><pubDate>Mon, 11 May 2026 16:19:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=48097035</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=48097035</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48097035</guid></item><item><title><![CDATA[New comment by otherjason in "AVX2 is slower than SSE2-4.x under Windows ARM emulation"]]></title><description><![CDATA[
<p>See this correct comment above: <a href="https://news.ycombinator.com/item?id=47061696">https://news.ycombinator.com/item?id=47061696</a><p>AVX512 leading to thermal throttling is a common myth that, from what I can tell, traces its origins to a blog post about clock throttling on a particular set of low-TDP SKUs from the first generation of Xeon CPUs that supported it (Skylake-X), released nearly a decade ago: <a href="https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/" rel="nofollow">https://blog.cloudflare.com/on-the-dangers-of-intels-frequen...</a><p>The results were debated shortly afterward by well-known SIMD authors who were unable to duplicate them: <a href="https://lemire.me/blog/2018/08/25/avx-512-throttling-heavy-instructions-are-maybe-not-so-dangerous/" rel="nofollow">https://lemire.me/blog/2018/08/25/avx-512-throttling-heavy-i...</a><p>In practice, this has not been an issue for a long time, if ever; clock frequency scaling for AVX modes has been continually improved in subsequent Intel CPU generations (and even more so in AMD Zen 4/5 once AVX512 support was added).</p>
]]></description><pubDate>Wed, 18 Feb 2026 15:32:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47062103</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=47062103</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47062103</guid></item><item><title><![CDATA[New comment by otherjason in "AVX2 is slower than SSE2-4.x under Windows ARM emulation"]]></title><description><![CDATA[
<p>The only CPU I've encountered that supports SVE is the Cortex-X925/A725 that is used in the NVIDIA DGX Spark platform. The vector width is still only 128 bits, but you do get access to the other enhancements the SVE instructions give, like predication (one of the most useful features from Intel's AVX512).</p>
]]></description><pubDate>Wed, 18 Feb 2026 15:25:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47062011</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=47062011</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47062011</guid></item><item><title><![CDATA[New comment by otherjason in "Understanding C++ Ownership System"]]></title><description><![CDATA[
<p>What makes you think that RAII- and arena-based strategies are in tension with one another? RAII and smart pointers are more related to the ownership and resource management model. Allocating items in bulk or from arenas is more about where the underlying resources and/or memory come from. These concepts can certainly be used in tandem. What is the substance of the argument that RAII, etc. are "hot garbage?"</p>
]]></description><pubDate>Mon, 19 Jan 2026 21:24:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=46684701</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=46684701</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46684701</guid></item><item><title><![CDATA[New comment by otherjason in "NYC Spends $200 Million on Cell Service for School Chromebooks"]]></title><description><![CDATA[
<p>The parking spaces in question aren’t free; the city sold the long-term rights to operate the parking facilities to the private sector in a bid to balance one year’s budget.</p>
]]></description><pubDate>Tue, 23 Dec 2025 02:19:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=46361688</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=46361688</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46361688</guid></item><item><title><![CDATA[New comment by otherjason in "Hard drives on backorder for two years as AI data centers trigger HDD shortage"]]></title><description><![CDATA[
<p>Overprovisioning is much less aggressive than this in practice. A read-oriented SSD with 15.36 TB of storage typically has 16.384 TiB of flash. The same hardware can be used to implement a 12.8 TB mixed-use SSD (3 DWPD or more).</p>
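<p>As a rough check on those figures, here is a minimal Python sketch of the arithmetic. It assumes overprovisioning is defined as spare flash divided by user-visible capacity, and it treats the whole raw flash figure as usable, which real drives may not; the function and variable names are illustrative only.</p>

```python
TB = 10**12   # decimal terabyte, as used for advertised capacity
TIB = 2**40   # binary tebibyte, as used above for raw flash

def overprovisioning(raw_flash_bytes, user_bytes):
    """Spare flash as a fraction of user-visible capacity (assumed definition)."""
    return (raw_flash_bytes - user_bytes) / user_bytes

raw_flash = 16.384 * TIB                                 # figure quoted above
read_oriented = overprovisioning(raw_flash, 15.36 * TB)  # read-oriented drive
mixed_use = overprovisioning(raw_flash, 12.8 * TB)       # mixed-use drive

print(f"read-oriented: {read_oriented:.1%}")
print(f"mixed-use:     {mixed_use:.1%}")
```

<p>Under these assumptions, the read-oriented drive keeps roughly 17% spare flash and the mixed-use drive roughly 41%.</p>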
]]></description><pubDate>Thu, 13 Nov 2025 01:55:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=45909511</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=45909511</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45909511</guid></item><item><title><![CDATA[New comment by otherjason in "I took all my projects off the cloud, saving thousands of dollars"]]></title><description><![CDATA[
<p><a href="https://www.solidigm.com/products/data-center/d5/p5336.html" rel="nofollow">https://www.solidigm.com/products/data-center/d5/p5336.html</a></p>
]]></description><pubDate>Thu, 06 Nov 2025 03:29:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=45831050</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=45831050</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45831050</guid></item><item><title><![CDATA[New comment by otherjason in "How OpenAI uses complex and circular deals to fuel its multibillion-dollar rise"]]></title><description><![CDATA[
<p>But, if model development stalls, and everyone else is stalled as well, then what happens to turn the current wildly-unprofitable industry into something that "it makes sense to keep spending billions" on?</p>
]]></description><pubDate>Fri, 31 Oct 2025 15:20:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=45773037</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=45773037</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45773037</guid></item><item><title><![CDATA[New comment by otherjason in "GPU Prefix Sums: A nearly complete collection"]]></title><description><![CDATA[
<p>I think they were trying to say “radix sort is a more important application of prefix sum than extraction of values from a sparse matrix/vector is.”</p>
]]></description><pubDate>Thu, 28 Aug 2025 17:53:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=45054994</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=45054994</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45054994</guid></item><item><title><![CDATA[New comment by otherjason in "Cloud Run GPUs, now GA, makes running AI workloads easier for everyone"]]></title><description><![CDATA[
<p>Where did you get the pricing for vast.ai here? Looking at their pricing page, I don't see any 8xH200 options for less than $21.65 an hour (and most are more than that).</p>
]]></description><pubDate>Wed, 04 Jun 2025 14:27:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=44181071</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=44181071</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44181071</guid></item><item><title><![CDATA[New comment by otherjason in "Cloud Run GPUs, now GA, makes running AI workloads easier for everyone"]]></title><description><![CDATA[
<p>The last few generations of GPU architectures have been increasingly optimized for massive throughput of low-precision integer arithmetic operations, though, which are not useful for any of those other applications.</p>
]]></description><pubDate>Wed, 04 Jun 2025 12:45:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=44180153</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=44180153</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44180153</guid></item><item><title><![CDATA[New comment by otherjason in "Fundamental flaws of SIMD ISAs (2021)"]]></title><description><![CDATA[
<p>This is the common argument from proponents of compiler autovectorization. An example like yours is very simple, so modern compilers would turn it into SIMD code without a problem.<p>In practice, though, the cases that compilers can successfully autovectorize are very limited relative to the total problem space that SIMD is solving. Plus, if I rely on that, it leaves me vulnerable to regressions in the compiler vectorizer.<p>Ultimately, I would rather write the implementation myself and know what is being generated than try to write high-level code in just the right way to make the compiler generate what I want.</p>
]]></description><pubDate>Fri, 25 Apr 2025 13:27:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=43793348</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=43793348</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43793348</guid></item><item><title><![CDATA[New comment by otherjason in "Social Security Admin to require in-person ID checks for new&existing recipients"]]></title><description><![CDATA[
<p>It doesn’t sound like they are referring to newborns needing to be physically present to get an SSN. Instead, it seems to refer to persons who are registering to start receiving their Social Security benefits (or existing recipients who want to change their direct deposit information). Also, there is an existing supported method for identifying yourself electronically that is mentioned in the article. In that sense, the headline seems a bit misleading.</p>
]]></description><pubDate>Wed, 19 Mar 2025 00:59:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=43407199</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=43407199</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43407199</guid></item><item><title><![CDATA[New comment by otherjason in "Nvidia sheds almost $600B in market cap"]]></title><description><![CDATA[
<p>And "our best iPhone ever." If it weren't your best iPhone ever, then I would be better off buying one of your older models.</p>
]]></description><pubDate>Mon, 27 Jan 2025 23:19:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=42846985</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=42846985</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42846985</guid></item><item><title><![CDATA[New comment by otherjason in "The Fannie and Freddie trade is back"]]></title><description><![CDATA[
<p>If you or anyone else is interested, his newsletter is available by email for free!</p>
]]></description><pubDate>Sat, 11 Jan 2025 21:29:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=42669060</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=42669060</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42669060</guid></item><item><title><![CDATA[New comment by otherjason in "Launch HN: Double (YC W24) – Index Investing with 0% Expense Ratios"]]></title><description><![CDATA[
<p>If your roboadvisor is buying the individual stocks that make up the index in my personal account for me, do you have data that compares the slippage (bid/ask spread) paid across all of these transactions versus a single purchase of very liquid ETFs like SPY?</p>
]]></description><pubDate>Tue, 10 Dec 2024 14:56:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=42377429</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=42377429</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42377429</guid></item><item><title><![CDATA[New comment by otherjason in "Initial CUDA Performance Lessons"]]></title><description><![CDATA[
<p>For devices with compute capability of 7.0 or greater (anything from the Volta series on), a single thread block can address up to the entire shared memory size of the SM; the 48 kB limit that older hardware had is no more. Most contemporary applications are going to be running on hardware that doesn’t have the shared memory limit you mentioned.<p>The claim at the end of your post, suggesting that >1 block per SM is always better than 1 block per SM, isn’t strictly true either. In the example you gave, you’re limited to 60 blocks because the thread count of each block is too high. You could, for example, cut the blocks in half to yield 120 blocks. But each block has half as many threads in it, so you don’t automatically get any occupancy benefit by doing so.<p>When planning out the geometry of a CUDA thread grid, there are inherent tradeoffs between SM thread and/or warp scheduler limits, shared memory usage, register usage, and overall SM count, and those tradeoffs can be counterintuitive if you follow (admittedly, NVIDIA’s official) guidance that maximizing the thread count leads to optimal performance.</p>
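<p>To make the block-splitting point concrete, here is a small Python sketch that counts resident threads per SM from the thread limit alone. The limits used (1024 resident threads and 16 resident blocks per SM) are hypothetical stand-ins rather than figures from the post being answered, and the sketch deliberately ignores shared memory and register limits.</p>

```python
def residency(threads_per_block, max_threads_per_sm=1024, max_blocks_per_sm=16):
    """Return (resident blocks, resident threads) per SM, thread limits only.

    The default limits are hypothetical; real values depend on the device.
    """
    blocks = min(max_threads_per_sm // threads_per_block, max_blocks_per_sm)
    return blocks, blocks * threads_per_block

full = residency(1024)   # one large block per SM
half = residency(512)    # two half-size blocks per SM
print(full, half)
```

<p>In this simplified model, both configurations keep 1024 threads resident per SM, so halving the block size doubles the block count without changing occupancy.</p>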
]]></description><pubDate>Fri, 11 Oct 2024 17:18:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=41811364</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=41811364</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41811364</guid></item><item><title><![CDATA[New comment by otherjason in "Hitting every branch on the way down"]]></title><description><![CDATA[
<p>Protecting the main branch is definitely a good practice, but the other potential hazard is:<p>- Having a developer on your team who rebases their own feature branch<p>- Then tries to "git push", only for it to be rejected since a force push is required<p>- Then performs a "git push --force", which, when push.default is set to "matching" (the default before Git 2.0), will force-push all of their local branches that have a same-named branch on the remote, including feature branches from other developers that they may have checked out previously<p>Our team uses merges because they are safe from this kind of problem, although a rebase workflow would have cleaner history. I wish that "git push --force" would never push multiple branches implicitly, and would just fail unless a (remote, branch) pair or --all is given.</p>
]]></description><pubDate>Tue, 30 Apr 2024 11:58:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=40210036</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=40210036</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40210036</guid></item><item><title><![CDATA[New comment by otherjason in "Difftastic, a structural diff tool that understands syntax"]]></title><description><![CDATA[
<p>Difftastic is a useful tool, but in my experience, it's far too slow to be suitable as the default selection for a ubiquitous tool like git.</p>
]]></description><pubDate>Thu, 21 Mar 2024 14:56:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=39779267</link><dc:creator>otherjason</dc:creator><comments>https://news.ycombinator.com/item?id=39779267</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39779267</guid></item></channel></rss>