<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: soraki_soladead</title><link>https://news.ycombinator.com/user?id=soraki_soladead</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 14 Apr 2026 17:42:31 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=soraki_soladead" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by soraki_soladead in "Make tmux pretty and usable (2024)"]]></title><description><![CDATA[
<p>Context: <a href="https://xkcd.com/1053/" rel="nofollow">https://xkcd.com/1053/</a><p>Then, if you're like me and read this years ago, play around with the Light Mode dropdown, which was new to me. :)</p>
]]></description><pubDate>Mon, 13 Apr 2026 15:51:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47753844</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=47753844</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47753844</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Eternity in six hours: Intergalactic spreading of intelligent life (2013)"]]></title><description><![CDATA[
<p>We don't know; "It would take so long" is itself an anthropocentric assumption about time scales.</p>
]]></description><pubDate>Sun, 12 Apr 2026 16:32:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47741612</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=47741612</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47741612</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Mystery jump in oil trading ahead of Trump post draws scrutiny"]]></title><description><![CDATA[
<p>Why would we settle for anything less than discontinuing both?</p>
]]></description><pubDate>Tue, 24 Mar 2026 15:43:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47504354</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=47504354</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47504354</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Exploring JEPA for real-time speech translation"]]></title><description><![CDATA[
<p>Roughly, when you train a model to make its predictions align with its own predictions in some way, you create a scenario where the simplest "correct" solution is to output a single value under diverse inputs, aka representation collapse. This guarantees that your predicted representations agree, which is technically what you asked for, but it's degenerate.<p>EMA helps because the target changes more slowly than the learning network, which prevents rapid collapse by forcing the predictions to align with what a historical average of the model would predict. This is a harder and more informative task: the model can't trivially output one value and have it match the EMA target, so it learns more useful representations.<p>EMA has a long history in deep learning (many GANs use it, TD-learning methods like DQN, many JEPA papers, etc.), so authors often omit defending it, whether from over-familiarity or sometimes cargo-culting. :)</p>
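<p>A minimal sketch of the EMA target update described above (pure-Python scalars and hypothetical names; real implementations operate on parameter tensors, with no gradients flowing through the target):</p>

```python
def ema_update(online_params, target_params, decay=0.996):
    """Move each target parameter slowly toward its online counterpart:
    target <- decay * target + (1 - decay) * online."""
    return [decay * t + (1 - decay) * o
            for o, t in zip(online_params, target_params)]
```

<p>The target is never trained directly; it only tracks the online network, which is why the predictor can't collapse it in a single step.</p>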
]]></description><pubDate>Sat, 14 Mar 2026 00:15:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47371814</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=47371814</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47371814</guid></item><item><title><![CDATA[New comment by soraki_soladead in "NanoGPT Slowrun: Language Modeling with Limited Data, Infinite Compute"]]></title><description><![CDATA[
<p>also, BabyLM is more of a conference track / workshop than an open-repo competition, which creates a different vibe</p>
]]></description><pubDate>Wed, 04 Mar 2026 19:04:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47252234</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=47252234</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47252234</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Towards Nyquist Learners"]]></title><description><![CDATA[
<p>Alias-Free GANs? <a href="https://nvlabs.github.io/stylegan3/" rel="nofollow">https://nvlabs.github.io/stylegan3/</a></p>
]]></description><pubDate>Tue, 19 Nov 2024 02:56:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=42179696</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=42179696</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42179696</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Diffusion for World Modeling"]]></title><description><![CDATA[
<p>We have lossless memory for models today. That's the training data. You could consider this the offline version of a replay buffer, which is also typically lossless.<p>The online, continuous, and lossy version of this problem is closer to how our memory works and is still largely unsolved.</p>
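<p>For readers unfamiliar with replay buffers, a minimal sketch (hypothetical class, uniform sampling). Note that bounding the capacity makes it lossy once full, which is exactly the online trade-off mentioned above:</p>

```python
import random
from collections import deque

class ReplayBuffer:
    """Store of past transitions with uniform random sampling.
    The offline analogue is simply the full training dataset."""
    def __init__(self, capacity=10000):
        # deque(maxlen=...) silently drops the oldest entries once full
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)
```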
]]></description><pubDate>Sun, 13 Oct 2024 15:27:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=41828828</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=41828828</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41828828</guid></item><item><title><![CDATA[New comment by soraki_soladead in "A most profound video game: a good cognitive aid for research"]]></title><description><![CDATA[
<p>You might enjoy this paper[0], which shows that recurrent position encodings recover grid cell representations and map onto the path integration found in a popular model of the hippocampus. This isn't terribly surprising since RNNs have shown this before[1, 2], but it's an interesting connection.<p>[0] <a href="https://arxiv.org/abs/2112.04035" rel="nofollow">https://arxiv.org/abs/2112.04035</a><p>[1] <a href="https://arxiv.org/abs/1803.07770" rel="nofollow">https://arxiv.org/abs/1803.07770</a><p>[2] <a href="https://www.nature.com/articles/s41586-018-0102-6" rel="nofollow">https://www.nature.com/articles/s41586-018-0102-6</a></p>
]]></description><pubDate>Sun, 16 Jun 2024 14:12:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=40697226</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=40697226</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40697226</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Reproducing GPT-2 in llm.c"]]></title><description><![CDATA[
<p>Fwiw, that's SwiGLU in #3 above. Swi = Swish = SiLU. GLU is a gated linear unit, i.e. the gate construction you describe.</p>
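<p>A scalar sketch of the construction, assuming the usual SwiGLU(x) = SiLU(xW) * (xV) form (real layers use weight matrices; names here are illustrative):</p>

```python
import math

def silu(z):
    # SiLU / Swish: z * sigmoid(z)
    return z * (1.0 / (1.0 + math.exp(-z)))

def swiglu(x, w, v):
    # Gated linear unit with a SiLU gate:
    # one projection (x * w) is squashed and gates the other (x * v).
    return silu(x * w) * (x * v)
```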
]]></description><pubDate>Wed, 29 May 2024 04:15:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=40508400</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=40508400</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40508400</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Building a deep learning rig"]]></title><description><![CDATA[
<p>It's not always about cost. Sometimes the ergonomics of a local machine are nicer.</p>
]]></description><pubDate>Sat, 24 Feb 2024 18:11:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=39493736</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=39493736</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39493736</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Cyclist hit by driverless Waymo car in San Francisco, police say"]]></title><description><![CDATA[
<p>Agree that's not a great look for the supervisor.<p>Cyclists have a bad rep in SF because many (not all) ride quite dangerously. It's common to see cyclists running four-way stop signs and lights without even yielding. I live adjacent to a four-way stop, and a cyclist fails to yield there nearly hourly.<p>Meanwhile, Waymo has millions of incident-free miles and, of all the self-driving car companies, generally takes safety seriously, even if they will act to protect their interests here.<p>Until more evidence comes out I'll be taking Waymo's side. I want safer vehicles, and Waymo is currently the best bet.</p>
]]></description><pubDate>Thu, 08 Feb 2024 04:35:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=39298198</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=39298198</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39298198</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Cyclist hit by driverless Waymo car in San Francisco, police say"]]></title><description><![CDATA[
<p>Cyclists do this all the time in SF. Afaik an "Idaho stop" is not legal here, despite it being common and often unsafe for obvious reasons.</p>
]]></description><pubDate>Thu, 08 Feb 2024 03:19:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=39297594</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=39297594</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39297594</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Markov Chains are the Original Language Models"]]></title><description><![CDATA[
<p>FLOPs by perplexity by samples is an interesting way to compare this family of models.</p>
]]></description><pubDate>Fri, 02 Feb 2024 02:32:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=39224240</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=39224240</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39224240</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Learn Datalog Today"]]></title><description><![CDATA[
<p><a href="https://github.com/cozodb/pycozo/blob/main/pycozo/test_builder.py">https://github.com/cozodb/pycozo/blob/main/pycozo/test_build...</a><p>Here's the Python version of what I think you're looking for. It shouldn't be too difficult to port to Rust.</p>
]]></description><pubDate>Sun, 21 Jan 2024 19:08:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=39081737</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=39081737</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39081737</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Build full “product skills” and you'll probably be fine"]]></title><description><![CDATA[
<p>Sure. A few below but far from exhaustive:<p>- <a href="https://arxiv.org/abs/1909.07528" rel="nofollow">https://arxiv.org/abs/1909.07528</a>
- <a href="https://arxiv.org/abs/2212.10403" rel="nofollow">https://arxiv.org/abs/2212.10403</a>
- <a href="https://arxiv.org/abs/2201.11903" rel="nofollow">https://arxiv.org/abs/2201.11903</a>
- <a href="https://arxiv.org/abs/2210.13382" rel="nofollow">https://arxiv.org/abs/2210.13382</a><p>There are also literally hundreds of articles and tweet threads about this. Moreover, as I said, you can test many of my claims above directly with readily available LLMs.<p>GP has a much harder defense: they have to prove that, despite all of these capabilities, LLMs are not intelligent — that the mechanisms by which humans possess intelligence are so fundamentally distinct from a computer’s ability to exhibit the same behaviors that it invalidates any claim that LLMs exhibit intelligence.<p>Intelligence: “the ability to acquire and apply knowledge and skills”. It is difficult to argue that modern LLMs cannot do this. At best we can quibble about the meaning of individual words like “acquire”, “apply”, “knowledge”, and “skills”. That’s a significant goal-post shift from even a year ago.</p>
]]></description><pubDate>Sun, 19 Mar 2023 20:27:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=35223203</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=35223203</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35223203</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Build full “product skills” and you'll probably be fine"]]></title><description><![CDATA[
<p>> They are not intelligent.<p>Citation needed. Numerous actual citations have demonstrated hallmarks of intelligence for years: tool use, comprehension and generalization of grammars, world modeling with spatial reasoning through language. Many of these are readily testable in GPT. Many people have tested them, and I dare say that LLMs’ reading comprehension, problem-solving, and reasoning skills surpass those of many actual humans.<p>> They model intelligent behavior<p>It is not at all clear that modeling intelligent behavior is any different from intelligence. This is an open question. If you have an insight there I would love to read it.<p>> They don't know or care what language is: they learn whatever patterns are present in text, language or not.<p>This is identical to how children learn language prior to schooling. They listen and form connections based on the co-occurrence of words. Their brains are working overtime to predict what sounds follow next. Before anyone says “not from text!”, please don’t forget people who can’t see or hear. Before anyone says “not only from language!”, multimodal LLMs are here now too.<p>I’m not saying they’re perfect or even that they possess the same type of intelligence. Obviously the mechanisms are different. However, far too many people in this debate are either unaware of these capabilities or hold on too strongly to human exceptionalism.<p>> There is this religious cult surrounding LLMs that bases all of its expectations of what an LLM can become on a personification of the LLM.<p>Anthropomorphizing LLMs is indeed an issue, but it is separate from a debate about their intelligence. I would argue there’s a very different religious cult very vocally proclaiming “that’s not really intelligence!” as these models sprint past goal posts.</p>
]]></description><pubDate>Sun, 19 Mar 2023 18:39:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=35222096</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=35222096</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35222096</guid></item><item><title><![CDATA[New comment by soraki_soladead in "Understanding Large Language Models – A Transformative Reading List"]]></title><description><![CDATA[
<p>Only some model architectures continue to get better as you pump in more data. Transformers and their variants have this property more so than prior architectures.</p>
]]></description><pubDate>Sat, 11 Feb 2023 22:09:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=34756918</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=34756918</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34756918</guid></item><item><title><![CDATA[New comment by soraki_soladead in "TensorFlow Datasets"]]></title><description><![CDATA[
<p>To each their own. I like that TF separates them since they are separate tasks and combining them is only one use case. At the end of the day we should just use what works best. The ML landscape is far from settled.</p>
]]></description><pubDate>Wed, 21 Dec 2022 20:44:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=34086163</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=34086163</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34086163</guid></item><item><title><![CDATA[New comment by soraki_soladead in "TensorFlow Datasets"]]></title><description><![CDATA[
<p>UX preferences vary. Imo, hf is too verbose and their pages try to cram in too much information with poor information hierarchy. For example:<p><a href="https://huggingface.co/datasets/glue" rel="nofollow">https://huggingface.co/datasets/glue</a><p><a href="https://www.tensorflow.org/datasets/catalog/glue" rel="nofollow">https://www.tensorflow.org/datasets/catalog/glue</a></p>
]]></description><pubDate>Wed, 21 Dec 2022 19:10:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=34084972</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=34084972</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34084972</guid></item><item><title><![CDATA[New comment by soraki_soladead in "TensorFlow Datasets"]]></title><description><![CDATA[
<p>Quantity of datasets doesn’t seem like the right metric. The library just needs the datasets you care about, and both libraries have the popular ones. What’s more important is integration: if you’re training custom TF models, tfds will generally integrate more smoothly than huggingface.</p>
]]></description><pubDate>Wed, 21 Dec 2022 15:05:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=34081574</link><dc:creator>soraki_soladead</dc:creator><comments>https://news.ycombinator.com/item?id=34081574</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34081574</guid></item></channel></rss>