<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: austinvhuang</title><link>https://news.ycombinator.com/user?id=austinvhuang</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 20 May 2026 06:45:55 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=austinvhuang" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by austinvhuang in "Growing Neural Cellular Automata"]]></title><description><![CDATA[
<p>Information processing in organisms likely resembles NCA more than feedforward/backprop of neural networks.</p>
]]></description><pubDate>Wed, 20 May 2026 04:05:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=48202962</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=48202962</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48202962</guid></item><item><title><![CDATA[New comment by austinvhuang in "TorchLean: Formalizing Neural Networks in Lean"]]></title><description><![CDATA[
<p>Some may also find Junji Hashimoto's GPU programming library in lean (w/ webgpu) interesting:<p><a href="https://github.com/Verilean/hesper" rel="nofollow">https://github.com/Verilean/hesper</a><p>Even includes an example of transformer inference (quantized 1.5 bit):<p><a href="https://github.com/Verilean/hesper/blob/a688ce9848d6416b2e958b29a0a3b95518df7505/Hesper/Models/BitNet.lean" rel="nofollow">https://github.com/Verilean/hesper/blob/a688ce9848d6416b2e95...</a></p>
]]></description><pubDate>Wed, 04 Mar 2026 02:40:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=47242320</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=47242320</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47242320</guid></item><item><title><![CDATA[New comment by austinvhuang in "I have written gemma3 inference in pure C"]]></title><description><![CDATA[
<p>I don't have firsthand knowledge, but r/SesameAI seems to believe Maya/Miles products are based on a Gemma3 backbone.</p>
]]></description><pubDate>Thu, 29 Jan 2026 05:12:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46806057</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=46806057</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46806057</guid></item><item><title><![CDATA[New comment by austinvhuang in "I have written gemma3 inference in pure C"]]></title><description><![CDATA[
<p>My first implementation of gemma.cpp was kind of like this.<p>There's such a massive performance differential vs. SIMD though that I learned to appreciate SIMD (via highway) as one sweet spot of low-dependency portability that sits between C loops and the messy world of GPUs + their fat tree of dependencies.<p>If anyone wants to learn the basics - whip out your favorite LLM pair programmer and ask it to help you study the kernels in the ops/ library of gemma.cpp:<p><a href="https://github.com/google/gemma.cpp/tree/main/ops" rel="nofollow">https://github.com/google/gemma.cpp/tree/main/ops</a></p>
]]></description><pubDate>Wed, 28 Jan 2026 20:00:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=46800762</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=46800762</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46800762</guid></item><item><title><![CDATA[New comment by austinvhuang in "Makepad 1.0: Rust UI Framework"]]></title><description><![CDATA[
<p>Been following makepad for years. Congrats!</p>
]]></description><pubDate>Tue, 13 May 2025 19:10:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=43976490</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=43976490</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43976490</guid></item><item><title><![CDATA[New comment by austinvhuang in "Show HN: WebGPU Puzzles - Learn GPU Programming in Your Browser"]]></title><description><![CDATA[
<p>author here - WebGPU Puzzles is a web incarnation of Sasha Rush’s GPU Puzzles - a series of small, fun, self-contained coding challenges for learning GPU programming.<p>gpupuzzles.answer.ai<p>Using WebGPU, you write code in the browser and computation runs entirely locally on your GPU.</p>
]]></description><pubDate>Fri, 13 Sep 2024 15:46:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=41532299</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=41532299</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41532299</guid></item><item><title><![CDATA[Show HN: WebGPU Puzzles - Learn GPU Programming in Your Browser]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.answer.ai/posts/2024-09-12-gpupuzzles.html">https://www.answer.ai/posts/2024-09-12-gpupuzzles.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41532298">https://news.ycombinator.com/item?id=41532298</a></p>
<p>Points: 14</p>
<p># Comments: 2</p>
]]></description><pubDate>Fri, 13 Sep 2024 15:46:29 +0000</pubDate><link>https://www.answer.ai/posts/2024-09-12-gpupuzzles.html</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=41532298</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41532298</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>The raw WebGPU API is geared towards infrastructure-type usage, e.g. ML compilers, game engines, etc., and is pretty verbose for application and research use cases.<p>Under examples/, for pedagogical purposes + to help contributors understand what happens with WebGPU under the hood, I actually included an example of invoking the same GELU kernel as in the hello world example without gpu.cpp. It looks like this, runs to 400+ LoC, and also takes several minutes to build Dawn:<p><a href="https://github.com/AnswerDotAI/gpu.cpp/blob/main/examples/webgpu_from_scratch/run.cpp">https://github.com/AnswerDotAI/gpu.cpp/blob/main/examples/we...</a><p>A goal of gpu.cpp is to make the power of webgpu much less painful to integrate into a project without having to jump through as many hoops (+ it also sets up the prebuilt shared library so builds are instantaneous and painless instead of reams of cmake hassles + 5-10 minutes of waiting for dawn to build):<p><a href="https://github.com/AnswerDotAI/gpu.cpp/blob/main/examples/hello_world/run.cpp">https://github.com/AnswerDotAI/gpu.cpp/blob/main/examples/he...</a></p>
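<p>For reference, a minimal Python sketch of what a GELU kernel computes per element, assuming the common tanh approximation. The linked examples may use a different form, and the names `gelu` and `SQRT_2_OVER_PI` are illustrative, not taken from gpu.cpp:</p>

```python
import math

# Assumed constant for the tanh approximation of GELU: sqrt(2 / pi).
SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)


def gelu(x):
    """Tanh approximation of GELU for a single float."""
    inner = SQRT_2_OVER_PI * (x + 0.044715 * x * x * x)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_all(xs):
    """Apply gelu to every element, as a GPU kernel would do once per thread."""
    return [gelu(x) for x in xs]


if __name__ == "__main__":
    print(gelu_all([-1.0, 0.0, 1.0]))
```

<p>On a GPU each invocation would handle one index of the input buffer; the list comprehension here only stands in for that dispatch.</p>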
]]></description><pubDate>Sun, 14 Jul 2024 17:32:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=40962182</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40962182</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40962182</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>Yes, wgpu is a much lighter build and has a lot going for it.<p>The situation has gotten a lot better for both dawn and wgpu integration in C++ with:<p><a href="https://github.com/eliemichel/WebGPU-distribution/">https://github.com/eliemichel/WebGPU-distribution/</a><p>Getting a shared library build was a revelation though, credit to:<p><a href="https://github.com/jspanchu/webgpu-dawn-binaries">https://github.com/jspanchu/webgpu-dawn-binaries</a><p>because the FetchContent cache invalidations would still periodically lead to recompiling, which gets quite annoying. When it's just a matter of linking you get few-second builds consistently. The cost is that we'll need to do a bit of hardening around potential ABI bugs, but it's ultimately worth it.<p>We'll work towards wgpu support. There are some sharp edges in the non-overlap w/ dawn which seem most pronounced with the async handling (which is pretty critical), but I don't think anything is a hard blocker.</p>
]]></description><pubDate>Sun, 14 Jul 2024 15:36:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=40961529</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40961529</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40961529</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>Part of the goal is not to get in the way if there are other aspects of a project that talk to WebGPU directly. If you're already using WebGPU, the correspondence should be pretty familiar if you look at the `gpu.h` source. We specifically avoided extra layers of indirection so that you can mix in direct calls against the WebGPU API when needed.</p>
]]></description><pubDate>Sun, 14 Jul 2024 13:53:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=40960997</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40960997</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40960997</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>See <a href="https://news.ycombinator.com/item?id=40952182#40957959">https://news.ycombinator.com/item?id=40952182#40957959</a><p>It's early, but my current thinking is that since WGSL -> SPIR-V is a fairly shallow mapping, you should be able to get close modulo extensions. Extensions can be important though; in particular I'm tracking this closely:<p><a href="https://github.com/gpuweb/gpuweb/issues/4195">https://github.com/gpuweb/gpuweb/issues/4195</a><p>One subgoal of gpu.cpp is to have a canvas to experiment and see how far we can push the limits.</p>
]]></description><pubDate>Sun, 14 Jul 2024 13:17:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=40960791</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40960791</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40960791</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>Fair enough - I don't think there's any hard blockers to doing this, but to get the same QoL we'll want to add a dawn dll to the available prebuilt binaries and adjust the download script.<p>Will look into this in the coming weeks (or if anyone is up for contributing let us know).</p>
]]></description><pubDate>Sun, 14 Jul 2024 10:40:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=40960106</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40960106</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40960106</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>Thanks!<p>If anyone adds bindings let us know so we can link it in the readme.</p>
]]></description><pubDate>Sun, 14 Jul 2024 10:36:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=40960094</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40960094</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40960094</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>wgpu is an implementation of the WebGPU API, so it's basically an alternative to Dawn.<p>gpu.cpp is one level up - it's implemented using the WebGPU API, not an implementation of the WebGPU API. In theory it should work with both wgpu and dawn, but in practice you find there are enough differences that it takes some conditional branching + testing to support both.<p>Having both wgpu and dawn support would be nice and I think we'll get there in the coming months, but for faster early iteration I wanted to keep things simple for now. There's implementation + maintenance + testing overhead that you start to have to carry around, so it isn't free.</p>
]]></description><pubDate>Sun, 14 Jul 2024 02:29:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=40958409</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40958409</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40958409</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>The data that is out there is reasonably promising with WebGPU already in use in some production ML inference engines. TVM of course is way ahead of the curve as usual - <a href="https://tvm.apache.org/2020/05/14/compiling-machine-learning-to-webassembly-and-webgpu" rel="nofollow">https://tvm.apache.org/2020/05/14/compiling-machine-learning...</a> though this post is quite old now.<p>It's still early days for pushing compute use cases to WebGPU (OctoML being super early notwithstanding). There's a small matmul in the examples directory but it only has the most basic tiling optimizations. One of my goals for the next few weeks is porting the transformer block kernels from llm.c - I think that will flesh out the picture far better. If there's enough interest, I'm happy to collaborate and could potentially do a writeup.<p>There are always some tradeoffs that come with portability, but part of my goal with gpu.cpp is to create a scaffold to experiment and see how far we can push portable GPU performance.</p>
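<p>For readers unfamiliar with the term, here is a minimal CPU-side sketch of basic tiling in Python. It is not the matmul from the examples directory; the function name `matmul_tiled` and the tile size are illustrative assumptions. The idea is only that work is grouped into small blocks so each block of the inputs is reused while it is still close at hand:</p>

```python
def matmul_tiled(a, b, tile=2):
    """Multiply row-major matrices a (n x k) and b (k x m) block by block."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tiles of the output and of the shared dimension.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Inner loops stay inside one tile; min() clips edge tiles.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    print(matmul_tiled([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

<p>On a GPU the tile would typically be staged in workgroup-shared memory rather than relying on CPU caches, which is where most of the remaining optimization headroom tends to be.</p>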
]]></description><pubDate>Sun, 14 Jul 2024 00:31:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=40957959</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40957959</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40957959</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>Thanks! If there are binding projects, feel free to get in touch so we can link it + trade notes.</p>
]]></description><pubDate>Sun, 14 Jul 2024 00:10:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=40957860</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40957860</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40957860</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>Thanks very much!<p>You're probably better prepared than you think. The funny thing is after working on making compute workflows work with graphics APIs like vulkan and webgpu, CUDA is so user friendly by comparison :)<p>Feel free to say hi or ping us if you run into issues in the discord channel <a href="https://discord.gg/Q9PWDckbnR" rel="nofollow">https://discord.gg/Q9PWDckbnR</a></p>
]]></description><pubDate>Sun, 14 Jul 2024 00:05:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=40957838</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40957838</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40957838</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>Windows should work since WebGPU can target DirectX or Vulkan, and it should be possible to build in WSL.<p>However, I was planning to announce next week after I've had a chance to test with my Windows-using colleagues and this thread came early, so it's possible we'll run into some hiccups.<p>Meet us on discord here if anyone needs help or just wants to say hello - <a href="https://discord.gg/Q9PWDckbnR" rel="nofollow">https://discord.gg/Q9PWDckbnR</a></p>
]]></description><pubDate>Sat, 13 Jul 2024 23:59:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=40957808</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40957808</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40957808</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>You're not alone.<p>I've had hour-long conversations explaining the project, talking about how webgpu can be used natively and how rust and zig people are using webgpu as a main GPU API (with wgpu and mach), and at the end there are still clarification questions about differences from WebGL and WASM.<p>The phrase "native webgpu" might as well be a Stroop Effect prank in technology branding.</p>
]]></description><pubDate>Sat, 13 Jul 2024 23:46:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=40957762</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40957762</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40957762</guid></item><item><title><![CDATA[New comment by austinvhuang in "Gpu.cpp: A lightweight library for portable low-level GPU computation"]]></title><description><![CDATA[
<p>Vulkan is definitely a valid angle and I seriously considered it as well. There are a few things that, in aggregate, led me to explore a different direction:<p>First, there are already a few teams taking a stab at the vulkan approach like kompute, so it's not like that's uncovered territory. At the time I first looked into this, the khronos/apple drama + complaints about moltenvk didn't seem encouraging, but I'd be happy to hear if the situation is a lot better.<p>Second, even though it's not the initial focus, the possibility of browser targets is interesting.<p>Finally, there's not much in the fairly minimalist gpu.cpp design that couldn't be retargeted to a vulkan backend at some point in the future if it becomes clear that (e.g. w/ the right combination of vulkan-specific extensions) the performance differential is sufficient to justify the higher implementation complexity and the metal/vulkan tug of war issues are a thing of the past.<p>Ultimately there's much less happening with webgpu, and the things that are happening tend to be in the ml inference infra rather than libraries. It seemed to be a point in the design space worth exploring.<p>Regarding Dawn - I've lived where you're coming from. Some non-trivial amount of effort went into smoothing out the friction. First, if you look at the bottom of the repo README you'll see others have done a lot to make building easier - fetchcontent with Elie's repo worked on the first try, but w/ gpu.cpp users shouldn't even have to deal with that if they don't want to. The reason there's a small script that takes a few seconds to fetch a prebuilt shared library on the first build is so that you can avoid the dawn build by default. After that it should be almost instantaneous to link, and compile cycles should be a second or two.<p>But as I mention elsewhere in these threads, if the Dawn team shipped prebuilt shared libraries themselves, that would be an even better solution (if anyone at Google is reading this)!</p>
]]></description><pubDate>Sat, 13 Jul 2024 23:38:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=40957730</link><dc:creator>austinvhuang</dc:creator><comments>https://news.ycombinator.com/item?id=40957730</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40957730</guid></item></channel></rss>