<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: joelburget</title><link>https://news.ycombinator.com/user?id=joelburget</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 25 Apr 2026 08:52:55 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=joelburget" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[On-Policy Distillation]]></title><description><![CDATA[
<p>Article URL: <a href="https://thinkingmachines.ai/blog/on-policy-distillation/">https://thinkingmachines.ai/blog/on-policy-distillation/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45761818">https://news.ycombinator.com/item?id=45761818</a></p>
<p>Points: 5</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 30 Oct 2025 16:26:30 +0000</pubDate><link>https://thinkingmachines.ai/blog/on-policy-distillation/</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=45761818</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45761818</guid></item><item><title><![CDATA[US AI Action Plan]]></title><description><![CDATA[
<p>PDF: <a href="https://www.whitehouse.gov/wp-content/uploads/2025/07/Americas-AI-Action-Plan.pdf" rel="nofollow">https://www.whitehouse.gov/wp-content/uploads/2025/07/Americ...</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44660323">https://news.ycombinator.com/item?id=44660323</a></p>
<p>Points: 426</p>
<p># Comments: 618</p>
]]></description><pubDate>Wed, 23 Jul 2025 15:28:58 +0000</pubDate><link>https://www.ai.gov/action-plan</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=44660323</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44660323</guid></item><item><title><![CDATA[Show HN: Microjax – JAX in two classes and six functions]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/joelburget/microjax">https://github.com/joelburget/microjax</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44490299">https://news.ycombinator.com/item?id=44490299</a></p>
<p>Points: 46</p>
<p># Comments: 1</p>
]]></description><pubDate>Mon, 07 Jul 2025 13:41:55 +0000</pubDate><link>https://github.com/joelburget/microjax</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=44490299</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44490299</guid></item><item><title><![CDATA[ASI existential risk: reconsidering alignment as a goal]]></title><description><![CDATA[
<p>Article URL: <a href="https://michaelnotebook.com/xriskbrief/index.html">https://michaelnotebook.com/xriskbrief/index.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43698739">https://news.ycombinator.com/item?id=43698739</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 15 Apr 2025 21:41:19 +0000</pubDate><link>https://michaelnotebook.com/xriskbrief/index.html</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=43698739</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43698739</guid></item><item><title><![CDATA[New comment by joelburget in "“A calculator app? Anyone could make that”"]]></title><description><![CDATA[
<p>I wrote an OCaml implementation of this paper a few years ago, which I've now extracted into its own [repo](<a href="https://github.com/joelburget/constructive-reals/blob/main/Constructive_reals.ml">https://github.com/joelburget/constructive-reals/blob/main/C...</a>)<p>The link in the paper to their Java implementation is now broken: does anyone have a current link?</p>
]]></description><pubDate>Sun, 16 Feb 2025 17:46:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=43069955</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=43069955</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43069955</guid></item><item><title><![CDATA[New comment by joelburget in "Pre-Trained Large Language Models Use Fourier Features for Addition (2024)"]]></title><description><![CDATA[
<p>And more recently, [Language Models Use Trigonometry to Do Addition](<a href="https://arxiv.org/abs/2502.00873" rel="nofollow">https://arxiv.org/abs/2502.00873</a>)</p>
]]></description><pubDate>Thu, 06 Feb 2025 21:41:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=42966781</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=42966781</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42966781</guid></item><item><title><![CDATA[Writing Einsum in Depth (In OCaml)]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.joelburget.com/writing-einsum-in-depth">https://www.joelburget.com/writing-einsum-in-depth</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42793323">https://news.ycombinator.com/item?id=42793323</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 22 Jan 2025 14:41:27 +0000</pubDate><link>https://www.joelburget.com/writing-einsum-in-depth</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=42793323</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42793323</guid></item><item><title><![CDATA[New comment by joelburget in "Einsum in Depth"]]></title><description><![CDATA[
<p>This is a good idea, though one problem is that einsum notation (as realized in NumPy and PyTorch) doesn't support the distinction between covariant and contravariant indices, and the site is based on their einsum notation. I could potentially add the variances for the examples, though that would move away from how the site currently works (where the information about the reduction comes only from the einsum input).</p>
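A minimal sketch of the point above: in NumPy-style einsum, which contraction happens is determined entirely by the subscript string, with no place to mark an index as covariant or contravariant. This hand-rolled stand-in for np.einsum("ij,jk->ik", A, B) (pure Python, for illustration only) makes that explicit:

```python
# Hand-rolled matrix-product "einsum" for the spec "ij,jk->ik":
# the repeated index j is summed over purely because it appears in both
# inputs and not in the output -- nothing in the notation says whether
# any index is covariant or contravariant.
def einsum_ij_jk_ik(A, B):
    I, J, K = len(A), len(A[0]), len(B[0])
    out = [[0] * K for _ in range(I)]
    for i in range(I):
        for k in range(K):
            for j in range(J):
                out[i][k] += A[i][j] * B[j][k]
    return out

print(einsum_ij_jk_ik([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```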
]]></description><pubDate>Mon, 06 Jan 2025 19:39:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=42614624</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=42614624</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42614624</guid></item><item><title><![CDATA[Einsum in Depth]]></title><description><![CDATA[
<p>Article URL: <a href="https://einsum.joelburget.com/">https://einsum.joelburget.com/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42587056">https://news.ycombinator.com/item?id=42587056</a></p>
<p>Points: 87</p>
<p># Comments: 31</p>
]]></description><pubDate>Fri, 03 Jan 2025 16:34:33 +0000</pubDate><link>https://einsum.joelburget.com/</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=42587056</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42587056</guid></item><item><title><![CDATA[New comment by joelburget in "Underrated reasons to be thankful IV"]]></title><description><![CDATA[
<p>A couple of these I'd like references on if anyone happens to have them.<p>1. "current science suggests that the actual health impact from consuming most types of plastic might well be essentially zero"<p>2. "the (weak) evidence we have now suggests running strengthens your knees"</p>
]]></description><pubDate>Thu, 28 Nov 2024 17:05:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=42266854</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=42266854</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42266854</guid></item><item><title><![CDATA[Underrated reasons to be thankful IV]]></title><description><![CDATA[
<p>Article URL: <a href="https://dynomight.net/thanks-4/">https://dynomight.net/thanks-4/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42266848">https://news.ycombinator.com/item?id=42266848</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Thu, 28 Nov 2024 17:04:13 +0000</pubDate><link>https://dynomight.net/thanks-4/</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=42266848</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42266848</guid></item><item><title><![CDATA[DeepSeek-R1-Lite-Preview is now live]]></title><description><![CDATA[
<p>Article URL: <a href="https://api-docs.deepseek.com/news/news1120">https://api-docs.deepseek.com/news/news1120</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42206521">https://news.ycombinator.com/item?id=42206521</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 21 Nov 2024 17:25:11 +0000</pubDate><link>https://api-docs.deepseek.com/news/news1120</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=42206521</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42206521</guid></item><item><title><![CDATA[Quantum error correction below the surface code threshold]]></title><description><![CDATA[
<p>Article URL: <a href="https://arxiv.org/abs/2408.13687">https://arxiv.org/abs/2408.13687</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41638516">https://news.ycombinator.com/item?id=41638516</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 24 Sep 2024 16:57:01 +0000</pubDate><link>https://arxiv.org/abs/2408.13687</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=41638516</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41638516</guid></item><item><title><![CDATA[New comment by joelburget in "Notes on OpenAI's new o1 chain-of-thought models"]]></title><description><![CDATA[
<p>o1 <i>is</i> an application of the Bitter Lesson. To quote Sutton: "The two methods that seem to scale arbitrarily in this way are <i>search</i> and learning." (emphasis mine -- in the original Sutton also emphasized <i>learning</i>).<p>OpenAI and others have previously pushed the learning side, while neglecting search. Now that gains from adding compute at training time have started to level off, they're adding compute at inference time.</p>
]]></description><pubDate>Fri, 13 Sep 2024 14:45:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=41531719</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=41531719</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41531719</guid></item><item><title><![CDATA[New comment by joelburget in "Vision language models are blind"]]></title><description><![CDATA[
<p>Vision Transformers do a shocking amount of compression in the tokenizer. In the [Chameleon paper](<a href="https://arxiv.org/pdf/2405.09818" rel="nofollow">https://arxiv.org/pdf/2405.09818</a>) they say the tokenizer "encodes a 512 × 512 image into 1024 discrete tokens from a codebook of size 8192". That's 256 pixels per token (512 * 512 / 1024). If we assume that a pixel is 24 bits (3x 8 bit channels), this implies that they've compressed 256 * 24 = 6144 bits into 13 bits (log2(8192)). [An Image is Worth 32 Tokens for Reconstruction and Generation](<a href="https://yucornetto.github.io/projects/titok.html" rel="nofollow">https://yucornetto.github.io/projects/titok.html</a>) pushes this even further. If these models work similarly, it's no wonder they struggle with some vision tasks.</p>
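The arithmetic above can be checked directly (the 24-bit pixel depth is the comment's assumption, not a figure from the paper):

```python
import math

# Reproducing the compression arithmetic from the Chameleon paper's figures:
# a 512x512 image becomes 1024 discrete tokens from a codebook of size 8192.
pixels_per_token = 512 * 512 // 1024          # 256 pixels per token
bits_per_token_in = pixels_per_token * 24     # 6144 bits of raw pixel data (assumed 24-bit pixels)
bits_per_token_out = int(math.log2(8192))     # 13 bits per codebook index

print(pixels_per_token, bits_per_token_in, bits_per_token_out)
print(bits_per_token_in / bits_per_token_out)  # compression factor per token, ~472x
```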
]]></description><pubDate>Thu, 11 Jul 2024 01:04:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=40932934</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=40932934</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40932934</guid></item><item><title><![CDATA[New comment by joelburget in "How Does GPT-4o Encode Images?"]]></title><description><![CDATA[
<p>Vision transformers should be our default guess as to how GPT-4o works, yet this article never mentions them.</p>
]]></description><pubDate>Fri, 07 Jun 2024 14:04:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=40608852</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=40608852</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40608852</guid></item><item><title><![CDATA[Disrupting the Deepfake Supply Chain]]></title><description><![CDATA[
<p>Article URL: <a href="https://openletter.net/l/disrupting-deepfakes">https://openletter.net/l/disrupting-deepfakes</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=39460381">https://news.ycombinator.com/item?id=39460381</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 21 Feb 2024 22:13:39 +0000</pubDate><link>https://openletter.net/l/disrupting-deepfakes</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=39460381</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39460381</guid></item><item><title><![CDATA[New comment by joelburget in "Tiktoken: OpenAI’s Tokenizer"]]></title><description><![CDATA[
<p>It works on all human languages, just inefficiently. I ran it over a sample I found on Wikipedia:<p><pre><code>    import tiktoken
    enc = tiktoken.get_encoding("cl100k_base")
    sample = "ฟองมันฟันหนู, ฟันหนูฟองมัน, ฝนทองฟองมัน"
    len(sample), len(enc.encode(sample))
</code></pre>
This returns `39, 40`, so it's encoding roughly one token per character. It's probably like this for almost all non-English text.</p>
]]></description><pubDate>Fri, 16 Dec 2022 17:19:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=34017258</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=34017258</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34017258</guid></item><item><title><![CDATA[New comment by joelburget in "Tiktoken: OpenAI’s Tokenizer"]]></title><description><![CDATA[
<p>A few interesting findings:<p>* the cl100k_base tokenizer has ~100k tokens -- previous tokenizers had ~50k. (enc.n_vocab gives 100277, but some numbers in that range don't work, starting at 100256)<p>* it has exactly 1110 tokens which are just digits: 10 one-digit tokens, 100 two-digit tokens, and 1000 three-digit tokens (none with preceding spaces). This is a big improvement over GPT-2's tokenizer, which was a mess.<p>* there are <|fim_prefix|>, <|fim_middle|>, and <|fim_suffix|> tokens (see <i>Efficient Training of Language Models to Fill in the Middle</i>)<p>The biggest news to me is the improved handling of numbers. This could explain some improved performance on arithmetic. One disappointment is that it tokenizes from the <i>front</i>, e.g. "1000000" -> 100|000|0. This is one of those "so close!" moments -- I would work for free to fix this.</p>
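The front-biased grouping above can be illustrated with a toy sketch (this is plain string chunking, not tiktoken's actual BPE merge logic):

```python
# Toy illustration: grouping a digit run into threes from the left
# reproduces the "1000000" -> 100|000|0 split described above, while
# grouping from the right would give the place-value-aligned 1|000|000.
def chunk_from_front(s, width=3):
    return [s[i:i + width] for i in range(0, len(s), width)]

def chunk_from_back(s, width=3):
    rem = len(s) % width
    head = [s[:rem]] if rem else []
    return head + [s[i:i + width] for i in range(rem, len(s), width)]

print(chunk_from_front("1000000"))  # ['100', '000', '0']
print(chunk_from_back("1000000"))   # ['1', '000', '000']
```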
]]></description><pubDate>Fri, 16 Dec 2022 17:16:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=34017206</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=34017206</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34017206</guid></item><item><title><![CDATA[Kelvin Versioning]]></title><description><![CDATA[
<p>Article URL: <a href="https://jtobin.io/kelvin-versioning">https://jtobin.io/kelvin-versioning</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=30948831">https://news.ycombinator.com/item?id=30948831</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 07 Apr 2022 18:54:51 +0000</pubDate><link>https://jtobin.io/kelvin-versioning</link><dc:creator>joelburget</dc:creator><comments>https://news.ycombinator.com/item?id=30948831</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=30948831</guid></item></channel></rss>