<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: tehsauce</title><link>https://news.ycombinator.com/user?id=tehsauce</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 27 Apr 2026 10:05:25 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=tehsauce" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by tehsauce in "The Claude Code Source Leak: fake tools, frustration regexes, undercover mode"]]></title><description><![CDATA[
<p>For the purpose of disclosure, it should say “Warning: AI generated code” in the commit message, not an advertisement for a specific product. You would never accept any of your other tools injecting themselves into a commit message like that.</p>
]]></description><pubDate>Tue, 31 Mar 2026 23:04:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47594640</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=47594640</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47594640</guid></item><item><title><![CDATA[New comment by tehsauce in "Zclaw – The 888 KiB Assistant"]]></title><description><![CDATA[
<p>“888 KiB Assistant” but the assistant itself is a multi-terabyte, rental-only model stored in some mysterious data center.</p>
]]></description><pubDate>Mon, 02 Mar 2026 17:11:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47220841</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=47220841</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47220841</guid></item><item><title><![CDATA[New comment by tehsauce in "Gemini 3 Deep Think"]]></title><description><![CDATA[
<p>How does it do on gold stake?</p>
]]></description><pubDate>Thu, 12 Feb 2026 22:17:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=46996059</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=46996059</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46996059</guid></item><item><title><![CDATA[New comment by tehsauce in "Attention at Constant Cost per Token via Symmetry-Aware Taylor Approximation"]]></title><description><![CDATA[
<p>Right, and when they compare to floating-point accuracy they seem to be using the number of decimal digits supported by the mantissa, but the exponent matters too, no?</p>
]]></description><pubDate>Wed, 04 Feb 2026 15:44:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=46887226</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=46887226</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46887226</guid></item><item><title><![CDATA[New comment by tehsauce in "World Models"]]></title><description><![CDATA[
<p>“A transformer predicts the next token”<p>Nope. A transformer is much more general than that. A GPT predicts the next token.</p>
]]></description><pubDate>Thu, 29 Jan 2026 01:08:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=46804312</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=46804312</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46804312</guid></item><item><title><![CDATA[New comment by tehsauce in "Scaling long-running autonomous coding"]]></title><description><![CDATA[
<p>I was excited to try it out, so I downloaded the repo and ran the build. However, there were 100+ compilation errors. So I checked the commit history on GitHub and saw that all recent commits, going back at least several pages, had failed CI. It was not clear which commit I should pick to get the semi-working version advertised.<p>I started looking at the Cargo.toml to at least get an idea of how the project was constructed. I saw there that, rather than being built from scratch as the post seemed to imply, almost every core component was simply pulled in from an open-source library: the quickjs engine, wgpu graphics, winit windowing & input, egui for UI, HTML parsing, the list goes on.
On Twitter their CEO explicitly stated that it uses a "custom js vm", which seemed particularly misleading / untrue to me.<p>Integrating all of these existing components autonomously is still super impressive for these models, so I'm just at a loss how to feel when they do something impressive but then feel the need to misrepresent so much of it. I guess I just have a lot less respect and trust for the Cursor leadership, but maybe a little relief knowing that soon I may just generate my own custom Cursor!</p>
]]></description><pubDate>Thu, 15 Jan 2026 03:32:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=46627675</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=46627675</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46627675</guid></item><item><title><![CDATA[New comment by tehsauce in "World Emulation via Neural Network"]]></title><description><![CDATA[
<p>I love this! Your results seem comparable to the Counter-Strike or Minecraft models from a bit ago, with massively less compute and data. It's particularly cool that it uses real-world data. I've been wanting to do something like this for a while, like capturing a large dataset while backpacking in the Cascades :)<p>I didn't see it in an obvious place on your GitHub; do you have any plans to open source the training code?</p>
]]></description><pubDate>Sat, 26 Apr 2025 01:30:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=43800121</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=43800121</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43800121</guid></item><item><title><![CDATA[New comment by tehsauce in "People are just as bad as my LLMs"]]></title><description><![CDATA[
<p>There has been some good research published on how RLHF (i.e. aligning to human preferences) easily introduces mode collapse and bias into models. For example, with a prompt like "Choose a random number", the base pretrained model can give relatively random answers, but after fine-tuning to produce responses humans like, it becomes heavily biased toward answers like "7" or "42".</p>
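<p>A toy sketch of what that mode collapse looks like quantitatively (the sample distributions below are made up for illustration, not drawn from real models): the entropy of the answer distribution drops sharply after preference tuning.

```python
import math
from collections import Counter

def entropy_bits(samples):
    """Shannon entropy (bits) of the empirical answer distribution."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical answers to "Choose a random number from 1-10":
base_model = list(range(1, 11)) * 10           # near-uniform: high entropy
tuned_model = [7] * 70 + [3] * 20 + [8] * 10   # collapsed onto a few favorites

print(entropy_bits(base_model))   # ~3.32 bits (log2(10), maximally random)
print(entropy_bits(tuned_model))  # ~1.16 bits (heavily biased)
```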
]]></description><pubDate>Mon, 10 Mar 2025 19:49:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=43325155</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=43325155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43325155</guid></item><item><title><![CDATA[New comment by tehsauce in "Show HN: Beating Pokemon Red with RL and <10M Parameters"]]></title><description><![CDATA[
<p>We have a shared community map where you can watch hundreds of agents from multiple people's training runs playing in real time!<p><a href="https://pwhiddy.github.io/pokerl-map-viz/" rel="nofollow">https://pwhiddy.github.io/pokerl-map-viz/</a></p>
]]></description><pubDate>Thu, 06 Mar 2025 00:35:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=43274691</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=43274691</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43274691</guid></item><item><title><![CDATA[New comment by tehsauce in "Show HN: Beating Pokemon Red with RL and <10M Parameters"]]></title><description><![CDATA[
<p>It's impossible to beat with random actions or brute force, but you can get surprisingly far. It doesn't take too long to get halfway through Route 1, but even with insane compute you'll never even make it to Viridian Forest.</p>
]]></description><pubDate>Wed, 05 Mar 2025 19:32:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=43271150</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=43271150</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43271150</guid></item><item><title><![CDATA[New comment by tehsauce in "Claude Plays Pokémon"]]></title><description><![CDATA[
<p>For anyone interested in watching lots of reinforcement learning agents play Pokémon Red at once, we have a website that streams hundreds of concurrent games from multiple people’s training runs to a shared map in real time!<p><a href="https://pwhiddy.github.io/pokerl-map-viz/" rel="nofollow">https://pwhiddy.github.io/pokerl-map-viz/</a><p>(works best on desktop)</p>
]]></description><pubDate>Thu, 27 Feb 2025 00:01:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=43189753</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=43189753</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43189753</guid></item><item><title><![CDATA[New comment by tehsauce in "Ask HN: Resources for General Purpose GPU development on Apple's M* chips?"]]></title><description><![CDATA[
<p>The Metal backend does currently generate quite a lot of unnecessary command buffers, but in general performance seems solid.</p>
]]></description><pubDate>Wed, 25 Dec 2024 17:35:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=42509903</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=42509903</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42509903</guid></item><item><title><![CDATA[New comment by tehsauce in "Were RNNs all we needed?"]]></title><description><![CDATA[
<p>I haven’t gone through the paper in detail yet, but maybe someone can answer: if you remove the hidden state from an RNN as they say they’ve done, what’s left? An MLP predicting from a single token?</p>
]]></description><pubDate>Thu, 03 Oct 2024 17:49:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=41733016</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=41733016</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41733016</guid></item><item><title><![CDATA[New comment by tehsauce in "The Environmental Toll of a Single ChatGPT Query Is Absolutely Wild"]]></title><description><![CDATA[
<p>The water consumed to produce a single hamburger is over 2000 liters, and the energy likely well over 100 watt-hours.<p>That means GPT can write >1000 emails using the resources it takes to feed a single person lunch. The resource efficiency of these machines is already quite astonishing.</p>
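<p>A back-of-the-envelope version of that comparison (the per-query figures are rough assumptions in the range of commonly cited estimates, not measurements; by water the >1000 claim clears easily, by energy it depends on which per-query estimate you pick):

```python
# Burger figures from the comment above; per-query figures are assumptions.
burger_water_l = 2000       # liters of water per hamburger
burger_energy_wh = 100      # watt-hours per hamburger (conservative)

query_water_l = 0.5         # assumed liters of water per query (upper-end estimate)
query_energy_wh = 0.3       # assumed watt-hours per query (commonly cited figure)

print(burger_water_l / query_water_l)      # 4000.0 queries per burger, by water
print(burger_energy_wh / query_energy_wh)  # ~333 queries per burger, by energy
```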
]]></description><pubDate>Tue, 24 Sep 2024 07:14:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=41633920</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=41633920</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41633920</guid></item><item><title><![CDATA[New comment by tehsauce in "The GJK Algorithm: A weird and beautiful way to do a simple thing"]]></title><description><![CDATA[
<p>Awesome article! One slightly misleading thing though: the first image shows an intersection involving a non-convex shape, but it isn't revealed until much later that the algorithm only works for convex shapes, not the kind shown in that first image.</p>
]]></description><pubDate>Wed, 12 Jun 2024 22:49:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=40664046</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=40664046</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40664046</guid></item><item><title><![CDATA[New comment by tehsauce in "Grokked Transformers Are Implicit Reasoners"]]></title><description><![CDATA[
<p>Grokking is a sudden, large jump in test accuracy late in training, well after training accuracy has fully converged. Double descent is test performance rising, then falling, and then finally rising again as the number of model parameters increases.</p>
]]></description><pubDate>Tue, 28 May 2024 04:29:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=40497324</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=40497324</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40497324</guid></item><item><title><![CDATA[New comment by tehsauce in "New exponent functions that make SiLU and SoftMax 2x faster, at full accuracy"]]></title><description><![CDATA[
<p>If CPU softmax were limited by memory bandwidth, then these vectorization optimizations wouldn't improve performance.</p>
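<p>A toy roofline-style check of that reasoning (all constants are illustrative assumptions, not measurements): if softmax were memory-bound, runtime would be set by bytes moved and a faster exp() wouldn't matter; with typical scalar exp() costs, compute dominates instead.

```python
N = 10_000_000                  # elements in the softmax input

# Memory side: read + write each float32 once (assumes no cache reuse)
bytes_moved = N * 4 * 2
mem_bandwidth = 50e9            # ~50 GB/s DRAM bandwidth (assumption)
t_memory = bytes_moved / mem_bandwidth       # 1.6 ms

# Compute side: scalar exp() costs on the order of 20 flops (assumption)
flops = N * 20
scalar_throughput = 5e9         # ~5 Gflop/s scalar throughput (assumption)
t_compute = flops / scalar_throughput        # 40 ms

# Compute time dwarfs memory time under these assumptions, so the kernel
# is compute-bound and vectorizing exp() can yield real speedups.
print(t_compute / t_memory)     # ~25x
```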
]]></description><pubDate>Wed, 15 May 2024 23:54:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=40373743</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=40373743</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40373743</guid></item><item><title><![CDATA[New comment by tehsauce in "Show HN: gpudeploy.com – "Airbnb" for GPUs"]]></title><description><![CDATA[
<p>+1 for Vast. They are usually the cheapest and have the most supply. Some instances can be less reliable at the low end, though.</p>
]]></description><pubDate>Sun, 05 May 2024 01:07:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=40261573</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=40261573</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40261573</guid></item><item><title><![CDATA[New comment by tehsauce in "GPU compute in the browser at the speed of native: WebGPU marching cubes"]]></title><description><![CDATA[
<p>You might not need direct access to wave/subgroup ops to implement efficient stream compaction. There's a great old Nvidia blog post on "warp-aggregated atomics"<p><a href="https://developer.nvidia.com/blog/cuda-pro-tip-optimized-filtering-warp-aggregated-atomics/" rel="nofollow">https://developer.nvidia.com/blog/cuda-pro-tip-optimized-fil...</a><p>where they show that their compiler is sometimes able to automatically convert global atomic operations into the warp-local versions and achieve the same performance as manually written intrinsics.
I was recently curious whether, 10 years later, these same optimizations had made it into other GPUs and platforms besides CUDA, so I put together a simple atomics benchmark in WebGPU.<p><a href="https://github.com/PWhiddy/webgpu-atomics-benchmark">https://github.com/PWhiddy/webgpu-atomics-benchmark</a><p>The results seem to indicate that these optimizations are accessible through WebGPU in Chrome on both macOS and Linux (with an Nvidia GPU).
Note that I'm not directly testing stream compaction, just incrementing a single global atomic counter, so that would need to be tested to know for sure whether the optimization still holds there.
If you see any issues with the benchmark or this reasoning please let me know! I am hoping to solidify my knowledge in this area :)</p>
]]></description><pubDate>Wed, 24 Apr 2024 01:12:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=40139300</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=40139300</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40139300</guid></item><item><title><![CDATA[New comment by tehsauce in "FPGA Architecture for Deep Learning: Survey and Future Directions"]]></title><description><![CDATA[
<p>500 GB/s is going to limit it to at best 1/4 the DL performance of an Nvidia GPU. I'm not sure what the floating-point perf of these FPGAs is, but I imagine that might also set a fundamental performance limit at a small fraction of a GPU's.</p>
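<p>The 1/4 figure falls out of a simple bandwidth ratio (the GPU number is an assumption; a datacenter part like the A100 has roughly 2 TB/s of HBM bandwidth):

```python
fpga_bw_gb_s = 500    # FPGA memory bandwidth from the survey
gpu_bw_gb_s = 2000    # assumed HBM bandwidth of a datacenter Nvidia GPU (~A100)

# For bandwidth-bound DL workloads, attainable throughput scales with
# memory bandwidth, so the FPGA tops out at this fraction of the GPU:
print(fpga_bw_gb_s / gpu_bw_gb_s)  # 0.25
```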
]]></description><pubDate>Tue, 23 Apr 2024 06:49:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=40129232</link><dc:creator>tehsauce</dc:creator><comments>https://news.ycombinator.com/item?id=40129232</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40129232</guid></item></channel></rss>