<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: markusheimerl</title><link>https://news.ycombinator.com/user?id=markusheimerl</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 13 Jun 2026 10:58:25 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=markusheimerl" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by markusheimerl in "Tiny hackable CUDA language model implementation"]]></title><description><![CDATA[
<p><a href="https://github.com/markusheimerl/gpt/blob/main/train.c" rel="nofollow">https://github.com/markusheimerl/gpt/blob/main/train.c</a> - in this file, search for the line "const int batch_size = 15;" and reduce the number.</p>
]]></description><pubDate>Mon, 08 Jun 2026 15:48:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=48446917</link><dc:creator>markusheimerl</dc:creator><comments>https://news.ycombinator.com/item?id=48446917</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48446917</guid></item><item><title><![CDATA[New comment by markusheimerl in "Tiny hackable CUDA language model implementation"]]></title><description><![CDATA[
<p>Sure, it could be extended to support LoRA fine-tuning, but the goal of this implementation is to be as lean and efficient as a <i>pre-training</i> stack can be.</p>
]]></description><pubDate>Mon, 08 Jun 2026 14:13:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48445664</link><dc:creator>markusheimerl</dc:creator><comments>https://news.ycombinator.com/item?id=48445664</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48445664</guid></item><item><title><![CDATA[New comment by markusheimerl in "Tiny hackable CUDA language model implementation"]]></title><description><![CDATA[
<p>The data is downloaded from Hugging Face via curl. You can certainly use your own data: simply dump all the text you want the model to be trained on into "corpus.txt" and skip "make data".<p>As a tokenizer adds substantial complexity, this implementation does not include any tokenization logic and works on raw bytes. Feel free to add your own tokenizer with the help of the coding model of your choice.<p>You can stop the training using CTRL+C.
You can train on as little memory as you have. Simply reduce the batch size and/or the model dimensions in train.c.
You can change the context window size in train.c via the "seq_len" variable.<p>Regarding Ruby, LoRA and quantization, I'll have to refer you to the coding agent of your choice.</p>
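<p>To illustrate what "works on raw bytes" can look like, here is a minimal sketch. It is not taken from the repository: the helper names bytes_to_tokens and read_file are hypothetical, and the only assumption carried over from the comment is that every byte of "corpus.txt" acts as one token, which gives a vocabulary of 256 ids.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not from the repository: maps each raw byte to an
   integer token id in [0, 255], so no tokenizer or vocabulary file is needed. */
void bytes_to_tokens(const unsigned char *bytes, size_t n, int *tokens) {
    assert(bytes != NULL && tokens != NULL);
    for (size_t i = 0; i < n; i++) {
        tokens[i] = (int)bytes[i];
    }
}

/* Hypothetical helper: reads a whole file (for example "corpus.txt") into a
   malloc'd buffer. Returns NULL on failure; the caller frees the result. */
unsigned char *read_file(const char *path, size_t *out_len) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);
    /* Allocate at least one byte so an empty file still yields a valid pointer. */
    unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (!buf) { fclose(f); return NULL; }
    *out_len = fread(buf, 1, (size_t)size, f);
    fclose(f);
    return buf;
}
```

<p>In this sketch a multi-byte UTF-8 character simply becomes several tokens, which is the price of skipping a tokenizer.</p>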
]]></description><pubDate>Mon, 08 Jun 2026 14:12:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48445643</link><dc:creator>markusheimerl</dc:creator><comments>https://news.ycombinator.com/item?id=48445643</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48445643</guid></item><item><title><![CDATA[New comment by markusheimerl in "Tiny hackable CUDA language model implementation"]]></title><description><![CDATA[
<p>I ran it once as a test on the NVIDIA Jetson Orin Nano Super Dev. Kit, so yeah, it works on ARM like a charm ;)</p>
]]></description><pubDate>Mon, 08 Jun 2026 14:06:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=48445566</link><dc:creator>markusheimerl</dc:creator><comments>https://news.ycombinator.com/item?id=48445566</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48445566</guid></item><item><title><![CDATA[New comment by markusheimerl in "Tiny hackable CUDA language model implementation"]]></title><description><![CDATA[
<p>Reduce batch size in train.c</p>
]]></description><pubDate>Mon, 08 Jun 2026 14:04:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=48445535</link><dc:creator>markusheimerl</dc:creator><comments>https://news.ycombinator.com/item?id=48445535</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48445535</guid></item><item><title><![CDATA[New comment by markusheimerl in "Tiny hackable CUDA language model implementation"]]></title><description><![CDATA[
<p>I deleted the numerical checks a while back, after confirming the backward pass is correct, to keep the code base lean. Running <a href="https://github.com/markusheimerl/gpt/blob/main/transformer/attention/test.c" rel="nofollow">https://github.com/markusheimerl/gpt/blob/main/transformer/a...</a> is also somewhat of a confirmation that the backward pass is correct, since an analytically incorrect backward pass can't fit synthetic data perfectly.</p>
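<p>For readers who want to reintroduce a numerical check, a generic finite-difference sketch follows. It is not the deleted code from the repository: the toy loss, the dimension DIM and the function names loss, loss_grad and max_grad_error are all assumptions made for illustration.</p>

```c
#include <stddef.h>

#define DIM 3

/* Toy loss, assumed for illustration only: L(w) = 0.5 * (w . x - y)^2 */
double loss(const double *w, const double *x, double y) {
    double pred = 0.0;
    for (size_t i = 0; i < DIM; i++) pred += w[i] * x[i];
    double err = pred - y;
    return 0.5 * err * err;
}

/* Analytic gradient of the toy loss: dL/dw_i = (w . x - y) * x_i */
void loss_grad(const double *w, const double *x, double y, double *grad) {
    double pred = 0.0;
    for (size_t i = 0; i < DIM; i++) pred += w[i] * x[i];
    double err = pred - y;
    for (size_t i = 0; i < DIM; i++) grad[i] = err * x[i];
}

/* Compares the analytic gradient with central finite differences,
   (L(w + eps*e_i) - L(w - eps*e_i)) / (2*eps), one coordinate at a time.
   Returns the largest absolute difference over all coordinates. */
double max_grad_error(const double *w, const double *x, double y, double eps) {
    double analytic[DIM];
    loss_grad(w, x, y, analytic);
    double worst = 0.0;
    for (size_t i = 0; i < DIM; i++) {
        double w_plus[DIM], w_minus[DIM];
        for (size_t j = 0; j < DIM; j++) {
            w_plus[j] = w[j];
            w_minus[j] = w[j];
        }
        w_plus[i] += eps;
        w_minus[i] -= eps;
        double numeric = (loss(w_plus, x, y) - loss(w_minus, x, y)) / (2.0 * eps);
        double diff = numeric - analytic[i];
        if (diff < 0.0) diff = -diff;
        if (diff > worst) worst = diff;
    }
    return worst;
}
```

<p>A small return value suggests that the analytic and numerical gradients agree. The same pattern could in principle be pointed at a real layer, although single-precision GPU code usually needs a larger eps and a looser tolerance than this double-precision toy.</p>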
]]></description><pubDate>Mon, 08 Jun 2026 08:12:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48442576</link><dc:creator>markusheimerl</dc:creator><comments>https://news.ycombinator.com/item?id=48442576</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48442576</guid></item><item><title><![CDATA[Tiny hackable CUDA language model implementation]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/markusheimerl/gpt">https://github.com/markusheimerl/gpt</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48415828">https://news.ycombinator.com/item?id=48415828</a></p>
<p>Points: 81</p>
<p># Comments: 13</p>
]]></description><pubDate>Fri, 05 Jun 2026 17:41:58 +0000</pubDate><link>https://github.com/markusheimerl/gpt</link><dc:creator>markusheimerl</dc:creator><comments>https://news.ycombinator.com/item?id=48415828</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48415828</guid></item><item><title><![CDATA[Stick in Bike Wheel OOP]]></title><description><![CDATA[
<p>Article URL: <a href="https://imgflip.com/i/8yu4c5">https://imgflip.com/i/8yu4c5</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41117998">https://news.ycombinator.com/item?id=41117998</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 31 Jul 2024 10:53:14 +0000</pubDate><link>https://imgflip.com/i/8yu4c5</link><dc:creator>markusheimerl</dc:creator><comments>https://news.ycombinator.com/item?id=41117998</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41117998</guid></item></channel></rss>