<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: skyde</title><link>https://news.ycombinator.com/user?id=skyde</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 21 Apr 2026 06:26:52 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=skyde" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by skyde in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Actually, I'm not surprised.
I guess this works for the same reason “say it twice” [1] works: because LLMs are trained as causal language models, past tokens cannot attend to future tokens.
One copy of the layer set solves this.
[1]<a href="https://arxiv.org/html/2512.14982v1" rel="nofollow">https://arxiv.org/html/2512.14982v1</a></p>
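<p>A rough sketch of the causal-attention point, assuming “say it twice” means repeating the prompt once. The helper names and the 3-token prompt are hypothetical and only illustrate the mask, not anything from the linked paper: each token may attend only to itself and earlier positions, so the first copy never sees what follows, while every token of the second copy sees the whole first copy.</p>

```python
def causal_mask(n):
    """mask[i][j] is True when token i is allowed to attend to token j."""
    return [[i >= j for j in range(n)] for i in range(n)]


def visible_positions(mask, i):
    """Positions that token i can attend to under the given mask."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


# Hypothetical 3-token prompt, repeated once.
prompt = ["A", "B", "C"]
doubled = prompt + prompt
mask = causal_mask(len(doubled))

# First copy: the token at position 0 can only see itself.
first_copy_view = visible_positions(mask, 0)
# Second copy: the same token at position 3 sees the entire first copy plus itself.
second_copy_view = visible_positions(mask, 3)
```

<p>In the second copy even the very first token has the full prompt in its context, which is what a single left-to-right pass cannot give it.</p>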
]]></description><pubDate>Tue, 24 Mar 2026 15:37:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47504248</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=47504248</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47504248</guid></item><item><title><![CDATA[New comment by skyde in "Apple's MLX adding CUDA support"]]></title><description><![CDATA[
<p>This is a CUDA backend for MLX, not an MLX backend for CUDA!</p>
]]></description><pubDate>Mon, 14 Jul 2025 22:42:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=44566156</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=44566156</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44566156</guid></item><item><title><![CDATA[New comment by skyde in "Math Academy pulled me out of the Valley of Despair"]]></title><description><![CDATA[
<p>Can you give more detail on what you mean by "it can be a valuable experience with the right people around to help"?<p>My son (7 years old) is gifted in math, and as a parent I find it extremely hard to decide how much I should push him (register him for math competitions, a weekend math club ...) and how much I should just let him get 100% on exams and not accelerate his learning.</p>
]]></description><pubDate>Wed, 05 Mar 2025 19:03:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=43270764</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=43270764</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43270764</guid></item><item><title><![CDATA[New comment by skyde in "Does X cause Y? An in-depth evidence review (2021)"]]></title><description><![CDATA[
<p>Is it only me, or does this completely miss all the recent research on causal inference using causal graphical models?</p>
]]></description><pubDate>Fri, 14 Feb 2025 21:33:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=43053298</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=43053298</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43053298</guid></item><item><title><![CDATA[New comment by skyde in "Desmos Animated Graphing Calculator"]]></title><description><![CDATA[
<p>Desmos is the best graphing calculator ever built.
And it's amazing that it runs directly in your browser, or without internet on your phone.<p>Just wish it was open source :-)<p>Does anyone know of an open source library like 3Blue1Brown's Manim that can run client-side in the web browser like Desmos?</p>
]]></description><pubDate>Fri, 31 Jan 2025 16:07:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=42888898</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=42888898</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42888898</guid></item><item><title><![CDATA[New comment by skyde in "The era of open voice assistants"]]></title><description><![CDATA[
<p>How does this compare to the ESP32-S3-BOX-3B?</p>
]]></description><pubDate>Fri, 20 Dec 2024 16:53:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=42472660</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=42472660</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42472660</guid></item><item><title><![CDATA[New comment by skyde in "Math is still catching up to the genius of Ramanujan"]]></title><description><![CDATA[
<p>Thanks a lot for writing that.
I agree 100% with you.<p>But I have always wondered how polymaths like Leonardo da Vinci and Isaac Newton, who excelled in many areas, are possible.</p>
]]></description><pubDate>Wed, 23 Oct 2024 18:30:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=41927945</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=41927945</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41927945</guid></item><item><title><![CDATA[New comment by skyde in "Students paid thousands for a Caltech boot camp that Caltech didn't teach"]]></title><description><![CDATA[
<p>Are MIT classes also taught by outsourced instructors instead of MIT instructors?</p>
]]></description><pubDate>Sun, 29 Sep 2024 15:54:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=41688180</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=41688180</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41688180</guid></item><item><title><![CDATA[New comment by skyde in "Distributed Locks with Redis (2014)"]]></title><description><![CDATA[
<p>Redis Sentinel provides high availability and monitoring for Redis, but it does <i>not</i> guarantee strong consistency.<p>Linearizability requires that once a write is acknowledged, all subsequent reads reflect that write.<p>If min-replicas-to-write is set to the total number of Redis replicas, then if a single node goes down you won't be able to do any writes (take or release a lock).<p>If min-replicas-to-write is set to any number smaller than the total number of Redis replicas, some replicas could still be lagging because of asynchronous replication.<p>Also, when a replica acknowledges a write in Redis, it means that the write has been received and logged by the replica, but it doesn’t necessarily mean that the write has been fully processed and applied to the data set.<p>This means that reading from a replica that has acknowledged a write from the master might still return the old value for the key.</p>
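<p>To make the min-replicas-to-write trade-off concrete, here is a deliberately simplified model. The two helper functions are hypothetical and are not Redis's actual implementation: they only capture the counting argument, namely that requiring every replica costs availability and requiring fewer leaves room for stale replicas.</p>

```python
def writes_allowed(total_replicas, failed_replicas, min_replicas_to_write):
    """True if enough replicas are still healthy for the master to accept writes."""
    healthy = total_replicas - failed_replicas
    return healthy >= min_replicas_to_write


def possibly_stale_replicas(total_replicas, min_replicas_to_write):
    """Upper bound on replicas that may not yet have the latest accepted write."""
    return total_replicas - min_replicas_to_write


# Hypothetical deployment with 3 replicas and one replica down.
strict_still_writable = writes_allowed(3, 1, 3)    # every replica required
relaxed_still_writable = writes_allowed(3, 1, 2)   # only two replicas required
relaxed_stale_bound = possibly_stale_replicas(3, 2)
```

<p>With the strict setting a single failure blocks every lock acquire and release; with the relaxed setting writes continue, but one replica may serve an old value.</p>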
]]></description><pubDate>Thu, 22 Aug 2024 20:50:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=41324292</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=41324292</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41324292</guid></item><item><title><![CDATA[New comment by skyde in "Distributed Locks with Redis (2014)"]]></title><description><![CDATA[
<p>Paxos and Raft are consensus algorithms that provide certain guarantees and capabilities that a master-slave system with synchronous replication, such as PostgreSQL, cannot offer.<p>These algorithms ensure that a majority of nodes (a quorum) must agree on any proposed change. This agreement guarantees that once a decision is made (e.g., to commit a transaction), it is final and consistent across all nodes. This strong consistency is critical in distributed systems to avoid split-brain scenarios.<p>Split-brain is easily caused by:<p>1. Network partitions<p>2. Latency issues<p>3. Async failover (two nodes both think they are the master)<p>4. Replica lag (some but not all replicas have acknowledged the write) while the master sends confirmation to the client</p>
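<p>The reason a majority quorum rules out split-brain is that any two majorities must share at least one node. A small brute-force check of that property, as a sketch only (the function names are hypothetical, and this is not how Paxos or Raft are implemented):</p>

```python
from itertools import combinations


def majority(n):
    """Smallest group size that is more than half of n nodes."""
    return n // 2 + 1


def all_quorums_intersect(n, quorum_size):
    """True if every pair of quorums of the given size shares at least one node."""
    quorums = [set(q) for q in combinations(range(n), quorum_size)]
    return all(a.intersection(b) for a, b in combinations(quorums, 2))


# Hypothetical 5-node cluster.
majority_size = majority(5)
majorities_overlap = all_quorums_intersect(5, majority_size)
# Groups of 2 out of 5 can be disjoint, so two "leaders" could both get acks.
small_groups_overlap = all_quorums_intersect(5, 2)
```

<p>Because two majorities always overlap, two nodes cannot both collect a quorum for conflicting decisions, which is exactly what fails in the async failover and replica lag cases.</p>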
]]></description><pubDate>Thu, 22 Aug 2024 04:30:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=41316842</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=41316842</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41316842</guid></item><item><title><![CDATA[New comment by skyde in "Distributed Locks with Redis (2014)"]]></title><description><![CDATA[
<p>Redis is a very bad store for a distributed lock, but Postgres is only slightly better.<p>What you truly need is something like ZooKeeper or etcd, which are designed to achieve distributed consensus using algorithms in the Paxos family (Zab in ZooKeeper, Raft in etcd).<p>This ensures strong consistency and reliability in a distributed system, making them ideal for tasks like leader election, configuration management, and lease management where consistency across nodes is critical.</p>
]]></description><pubDate>Thu, 22 Aug 2024 04:27:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=41316827</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=41316827</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41316827</guid></item><item><title><![CDATA[New comment by skyde in "IsoFLOP curves of large language models are flat"]]></title><description><![CDATA[
<p>But inside one epoch there is a lot of duplication already.<p>By duplication I mean that if the context length is N, there are many sequences of N words that are not unique.</p>
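<p>A minimal sketch of how that kind of duplication could be measured, assuming whitespace-separated words stand in for tokens. The function name and the toy sentence are hypothetical, not taken from the paper:</p>

```python
from collections import Counter


def duplicate_window_fraction(words, n):
    """Fraction of length-n word windows that occur more than once in the text."""
    windows = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not windows:
        return 0.0
    counts = Counter(windows)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(windows)


# Toy corpus: the phrase "the cat sat on the" appears twice.
toy_words = "the cat sat on the mat and the cat sat on the rug".split()
toy_fraction = duplicate_window_fraction(toy_words, 3)
```

<p>On this toy text 6 of the 11 three-word windows are repeats, even though the text is only seen once.</p>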
]]></description><pubDate>Sat, 03 Aug 2024 01:14:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=41143983</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=41143983</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41143983</guid></item><item><title><![CDATA[New comment by skyde in "Launch HN: Undermind (YC S24) – AI agent for discovering scientific papers"]]></title><description><![CDATA[
<p>Could not try it.
It keeps asking for a valid institutional or company email address.<p>It doesn’t recognize my university.</p>
]]></description><pubDate>Fri, 26 Jul 2024 06:53:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=41076373</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=41076373</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41076373</guid></item><item><title><![CDATA[New comment by skyde in "Large Enough"]]></title><description><![CDATA[
<p>It “works”, but the LLM having to use a calculator means the LLM doesn’t understand arithmetic well enough and doesn’t know how to follow a set of steps (an algorithm) natively to find the answer for big numbers.<p>I believe this could be fixed and is worth fixing,
because it’s the only way LLMs will be able to help math and physics researchers write proofs and make real scientific progress.</p>
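<p>By "a set of steps" I mean something like schoolbook addition, which works for numbers of any length. A minimal sketch (the function name is hypothetical) that operates on decimal strings digit by digit, the way a person would on paper:</p>

```python
def add_decimal_strings(a, b):
    """Add two non-negative integers given as decimal strings, digit by digit."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    # Walk from the least significant digit to the most significant one.
    while i >= 0 or j >= 0 or carry:
        digit_a = int(a[i]) if i >= 0 else 0
        digit_b = int(b[j]) if j >= 0 else 0
        carry, digit = divmod(digit_a + digit_b + carry, 10)
        result.append(str(digit))
        i -= 1
        j -= 1
    return "".join(reversed(result))


# The carry has to propagate through every digit of the first number.
big_sum = add_decimal_strings("99999999999999999999", "1")
```

<p>Each step only needs single-digit addition and a carry, so the procedure does not get harder as the numbers get longer.</p>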
]]></description><pubDate>Wed, 24 Jul 2024 18:50:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=41060438</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=41060438</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41060438</guid></item><item><title><![CDATA[New comment by skyde in "FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-Precision"]]></title><description><![CDATA[
<p>I just discovered e-graphs recently, and I have a good understanding of compilers from taking a compiler class at university.<p>I would like to understand why you say e-graphs would need control flow to be revamped.<p>Do you have anything I could read on it?</p>
]]></description><pubDate>Fri, 12 Jul 2024 16:25:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=40947018</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=40947018</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40947018</guid></item><item><title><![CDATA[New comment by skyde in "FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-Precision"]]></title><description><![CDATA[
<p><a href="https://github.com/uwplse/tensat">https://github.com/uwplse/tensat</a></p>
]]></description><pubDate>Fri, 12 Jul 2024 16:20:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=40946989</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=40946989</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40946989</guid></item><item><title><![CDATA[New comment by skyde in "Chebyshev approximation and how it can help (2012)"]]></title><description><![CDATA[
<p>What do you mean by close to a CNN?<p>What is your architecture?
Is it just a fully connected layer of Chebyshev polynomials?</p>
]]></description><pubDate>Sat, 08 Jun 2024 16:23:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=40618616</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=40618616</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40618616</guid></item><item><title><![CDATA[New comment by skyde in "Simple tasks showing reasoning breakdown in state-of-the-art LLMs"]]></title><description><![CDATA[
<p>Given that Alice has 13 brothers and 31 sisters, we can update the Prolog program with this information. We need to adjust the fact about Alice's siblings and then use the rule to calculate the number of sisters her brothers have.<p>Here is the updated program:<p>```prolog
% Facts
siblings(alice, 13, 31). % Alice has 13 brothers and 31 sisters<p>% Rules
% Alice's brothers have M + 1 sisters, since Alice is also their sister.
sisters_of_brother(TotalSisters) :- 
    siblings(alice, _, M),
    TotalSisters is M + 1.<p>% Initialization goal
:- initialization(main).<p>% Predicate to find and print the total number of sisters Alice's brothers have
main :- 
    sisters_of_brother(TotalSisters),
    writeln(TotalSisters).
```<p>In this program, the fact `siblings(alice, 13, 31)` represents that Alice has 13 brothers and 31 sisters. The rule `sisters_of_brother(TotalSisters)` calculates the total number of sisters that Alice's brothers have by adding 1 to the number of sisters Alice has (since Alice is also their sister).<p>When we run this program, it will print the total number of sisters Alice's brothers have:<p>1. Alice has 31 sisters.
2. Each of Alice's brothers will also consider Alice as a sister.
3. Therefore, each brother has 31 + 1 = 32 sisters.<p>The output of the program will be:<p>```
32
```<p>Thus, Alice's brothers have 32 sisters.</p>
]]></description><pubDate>Thu, 06 Jun 2024 08:36:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=40594996</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=40594996</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40594996</guid></item><item><title><![CDATA[New comment by skyde in "Simple tasks showing reasoning breakdown in state-of-the-art LLMs"]]></title><description><![CDATA[
<p>Asking GPT to first output a Prolog program seems to 100% fix it!<p>Given that Alice has 13 brothers and 31 sisters, we can update the Prolog program with this information. We need to adjust the fact about Alice's siblings and then use the rule to calculate the number of sisters her brothers have.<p>Here is the updated program:<p>```prolog
% Facts
siblings(alice, 13, 31). % Alice has 13 brothers and 31 sisters<p>% Rules
% Alice's brothers have M + 1 sisters, since Alice is also their sister.
sisters_of_brother(TotalSisters) :- 
    siblings(alice, _, M),
    TotalSisters is M + 1.<p>% Initialization goal
:- initialization(main).<p>% Predicate to find and print the total number of sisters Alice's brothers have
main :- 
    sisters_of_brother(TotalSisters),
    writeln(TotalSisters).
```<p>In this program, the fact `siblings(alice, 13, 31)` represents that Alice has 13 brothers and 31 sisters. The rule `sisters_of_brother(TotalSisters)` calculates the total number of sisters that Alice's brothers have by adding 1 to the number of sisters Alice has (since Alice is also their sister).<p>When we run this program, it will print the total number of sisters Alice's brothers have:<p>1. Alice has 31 sisters.
2. Each of Alice's brothers will also consider Alice as a sister.
3. Therefore, each brother has 31 + 1 = 32 sisters.<p>The output of the program will be:<p>```
32
```<p>Thus, Alice's brothers have 32 sisters.</p>
]]></description><pubDate>Thu, 06 Jun 2024 08:30:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=40594964</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=40594964</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40594964</guid></item><item><title><![CDATA[New comment by skyde in "Transformers Represent Belief State Geometry in Their Residual Stream"]]></title><description><![CDATA[
<p>What do you mean by simplest in terms of optimization?<p>I get that it finds solutions that are easy for the SGD or Adam optimizer to find.<p>But why would such solutions be less simple than others?</p>
]]></description><pubDate>Sat, 01 Jun 2024 17:07:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=40547385</link><dc:creator>skyde</dc:creator><comments>https://news.ycombinator.com/item?id=40547385</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40547385</guid></item></channel></rss>