Hacker News: andy_xor_andrew

New comment by andy_xor_andrew in "Intel Arc Pro B70 Review"

andy_xor_andrew — Tue, 28 Apr 2026 22:33:05 +0000

the build they use is from February, over two months old: https://github.com/ggml-org/llama.cpp/releases/tag/b8121

Which might not sound like much, but 2 months is a long time in LLM time, especially regarding support for new hardware like the R9700.

New comment by andy_xor_andrew in "Embarrassingly simple self-distillation improves code generation"

andy_xor_andrew — Sat, 04 Apr 2026 22:05:59 +0000

I've been wondering about adaptive decoding! It seems obvious to me that at some points during decoding (reasoning, "creative thinking") you would want a higher temperature, while at other points (emitting syntactically correct code, following a plan that was already established) you would want lower temperature.
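A minimal sketch of the idea, assuming the decoder somehow knows which phase it is in (that detection is the hard part and is not shown). The `Phase` labels, the temperature values, and the logits are all made up for illustration; the only standard piece is softmax with temperature, where dividing the logits by a smaller temperature concentrates probability on the top token.

```rust
/// Softmax over `logits` after dividing by `temperature`.
/// Lower temperature sharpens the distribution; higher flattens it.
fn softmax_with_temperature(logits: &[f64], temperature: f64) -> Vec<f64> {
    // Subtract the max logit first for numerical stability.
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits
        .iter()
        .map(|&l| ((l - max) / temperature).exp())
        .collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Hypothetical decoding phases; a real system would have to infer these.
enum Phase {
    Reasoning,
    Code,
}

/// Illustrative temperatures only, not tuned values.
fn temperature_for(phase: Phase) -> f64 {
    match phase {
        Phase::Reasoning => 1.0,
        Phase::Code => 0.2,
    }
}

fn main() {
    let logits = [2.0, 1.0, 0.5];
    let reasoning = softmax_with_temperature(&logits, temperature_for(Phase::Reasoning));
    let code = softmax_with_temperature(&logits, temperature_for(Phase::Code));
    println!("reasoning: {:?}", reasoning);
    println!("code:      {:?}", code);
}
```

With the same logits, the "code" phase puts far more probability mass on the top token than the "reasoning" phase does, which is the behavior described above.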

New comment by andy_xor_andrew in "Embarrassingly simple self-distillation improves code generation"

andy_xor_andrew — Sat, 04 Apr 2026 22:04:43 +0000

Not really? If you read it, there is no validation, no correctness signal, no verification, none of that. They're just passing in benchmark inputs, collecting the outputs (regardless of their quality), training on those outputs, and then sweeping the decode settings (temp, topk) of the resulting model. Their conclusion is that this results in a better model than the original - even when taking into consideration the same temp/topk sweep of the original.

So no, they are not fine-tuning a general purpose model to produce "valid benchmark code results."

New comment by andy_xor_andrew in "Show HN: Apfel – The free AI already on your Mac"

andy_xor_andrew — Fri, 03 Apr 2026 19:04:45 +0000

I find the branding to be a little odd. Like, it should be a github page with a README that says "here's how to use this." Like, the full explanation of this project is right there in the HN title: "The free AI already on your Mac."

I guess LLMs have made it too simple to instantly build startup landing page slop, which causes this? Like, do we need to see the github star count chart? Do we need all the buzzwords and stuff? You'd think this was a startup trying to get a billion dollar valuation. It feels disingenuous.

Maybe I'm just being a hater.

New comment by andy_xor_andrew in "EmDash – A spiritual successor to WordPress that solves plugin security"

andy_xor_andrew — Wed, 01 Apr 2026 17:01:46 +0000

> x402 is an open, neutral standard for Internet-native payments. It lets anyone on the Internet easily charge, and any client pay on-demand, on a pay-per-use basis. A client, such as an agent, sends a HTTP request and receives a HTTP 402 Payment Required status code. In response, the client pays for access on-demand, and the server can let the client through to the requested content.

Fascinating. Cloudflare is envisioning a future where agents are given debit cards by their owners, so they can autonomously send microtransactions to website owners to scrape content or possibly purchase goods on the owner's behalf. I don't know how I feel about that but there's no doubt it's a fascinating concept.

Brb, setting up a honeypot that always responds with HTTP 402 Payment Required demanding 10 cents per visit... That's the next "selling 1 million pixels on my website for $1 each", I guess.

New comment by andy_xor_andrew in "Wine-Staging 11.1 Adds Patches for Enabling Recent Photoshop Versions on Linux"

andy_xor_andrew — Sun, 25 Jan 2026 18:24:45 +0000

Curious if someone could enlighten me:

How many of these sorts of patches are specifically checking if a certain application is running, and then changing behavior to match what that application expects? And how many are simply better emulating the Windows API in general?

I think there are benefits to both approaches, not criticizing either one. I'm just curious if the implementation of a patch like this is "We fixed an inconsistency between Wine and Windows" vs "We're checking if Photoshop is running and using a different locking primitive" or whatever.

New comment by andy_xor_andrew in "Understanding Rust Closures"

andy_xor_andrew — Sat, 24 Jan 2026 22:41:22 +0000

if I'm not mistaken (and I very well may be!) my primary confusion with closures comes from the fact that: the trait they implement (FnOnce / Fn / FnMut) depends entirely upon what happens inside the closure.

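As far as I can tell, every closure implements FnOnce, the least demanding of the three. It also implements FnMut if it doesn't move captured values out of itself, and Fn if on top of that it doesn't mutate anything it captured. So the compiler picks the traits automatically, based on what you do inside the closure.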

So, it can be tricky to know what's going on, and making a code change can change the contract of the closure and therefore where and how it can be used.

(I invite Rust experts to correct me if any of the above is mistaken - I always forget the order of precedence for FnOnce/Fn/FnMut and which implies which)
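A minimal sketch of how the inference plays out. In the standard library, `Fn` is a subtrait of `FnMut`, which is a subtrait of `FnOnce`, so a closure that only reads its captures can be passed anywhere, while one that moves a capture out can only be passed where `FnOnce` is accepted. The helper functions `call_fn`, `call_fn_mut`, `call_fn_once`, and `demo` are made up for illustration.

```rust
// Each helper accepts only closures implementing the named trait.
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f() // callable repeatedly through shared access
}

fn call_fn_mut<F: FnMut() -> usize>(mut f: F) -> usize {
    f(); // callable repeatedly, but needs exclusive access
    f()
}

fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f() // callable exactly once; the closure is consumed
}

fn demo() -> (usize, usize, usize, String) {
    let name = String::from("ferris");

    // Only reads `name`: implements Fn, and therefore FnMut and FnOnce too.
    let read_only = || name.len();
    let a = call_fn(&read_only);
    let b = call_fn_mut(&read_only);

    // Mutates captured state: implements FnMut and FnOnce, but not Fn.
    // Passing this closure to `call_fn` would be a compile error.
    let mut count = 0;
    let c = call_fn_mut(|| {
        count += 1;
        count
    });

    // Moves a captured value out of the closure: implements only FnOnce.
    // Passing this closure to `call_fn` or `call_fn_mut` would not compile.
    let d = call_fn_once(move || name);

    (a, b, c, d)
}

fn main() {
    println!("{:?}", demo());
}
```

Changing the body of `read_only` to mutate or consume `name` would silently drop it down the chain, which is the "code change alters the contract" effect described above: the closure definition still compiles, and the error only shows up at the call site that required `Fn`.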

New comment by andy_xor_andrew in "Anthropic Economic Index report: economic primitives"

andy_xor_andrew — Thu, 22 Jan 2026 23:14:35 +0000

no payoff whatsoever? I just asked Claude to do a task that would have previously taken me four days. Then I got up and got lunch, and when I was back, it was done.

I would never make the argument that there are no risks. But there's also no way you can make the argument there are no payoffs!

New comment by andy_xor_andrew in "GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers"

andy_xor_andrew — Thu, 22 Jan 2026 18:35:22 +0000

I guess that makes this "standing on the shoulders of fabrications"

New comment by andy_xor_andrew in "Qwen3-Omni-Flash-2025-12-01：a next-generation native multimodal large model"

andy_xor_andrew — Wed, 10 Dec 2025 18:37:01 +0000

> This is a 30B parameter MoE with 3B active parameters

Where are you finding that info? Not saying you're wrong; just saying that I didn't see that specified anywhere in the linked page, or on their HF.

New comment by andy_xor_andrew in "PRC elites voice AI-skepticism"

andy_xor_andrew — Tue, 25 Nov 2025 19:08:09 +0000

> former Dean of Electronics Engineering and Computer Science at Peking University, has noted that Chinese data makes up only 1.3 percent of global large-model datasets (The Paper, March 24). Reflecting these concerns, the Ministry of State Security (MSS) has issued a stark warning that “poisoned data” (数据投毒) could “mislead public opinion” (误导社会舆论) (Sina Finance, August 5).

from a technical point of view, I suppose it's actually not a problem like he suggests. You can use all the pro-democracy, pro-free-speech, anti-PRC data in the world, but the pretraining stages (on the planet's data) are more for instilling core language abilities, and are far less important than the SFT / RL / DPO / etc stages, which require far less data, and can tune a model towards whatever ideology you'd like. Plus, you can do things like selectively identify vectors that encode for certain high-level concepts, and emphasize them during inference, like Golden Gate Claude.

New comment by andy_xor_andrew in "How to turn liquid glass into a solid interface"

andy_xor_andrew — Tue, 14 Oct 2025 20:40:02 +0000

I truly, genuinely wanted to like Liquid Glass. I think the default reaction to ANY change in UX, even changes that are generally improvements, is: "I don't like this, it's different!"

I thought that'd be the case for iOS 26. But after installing it... yeesh. I can barely see anything. It's just awful.

New comment by andy_xor_andrew in "Google Safe Browsing incident"

andy_xor_andrew — Fri, 10 Oct 2025 18:26:31 +0000

> In order to limit the impact of similar issues in the future, all sites on statichost.eu are now created with a statichost.page domain instead.

This reads like a dark twist in a horror novel - the .page TLD is controlled by Google!

https://get.page/

New comment by andy_xor_andrew in "F-35 pilot held 50-minute airborne conference call with engineers before crash"

andy_xor_andrew — Wed, 27 Aug 2025 15:58:36 +0000

Sure, of course I will trust the report as the source of truth.

But I'm interested in the reporting. There are, you know, journalistic standards, which are considered kinda "journalism 101"! For instance, getting the basic facts of a story correct - especially the facts stated in the headline.

So I'm curious, did the reporter do their due diligence, and write the article in a way that is factually correct, but highly misleading? Or did they simply not follow basic reporting protocol?

New comment by andy_xor_andrew in "F-35 pilot held 50-minute airborne conference call with engineers before crash"

andy_xor_andrew — Wed, 27 Aug 2025 15:49:30 +0000

I read the article (twice) and I still have the impression the pilot was in fact the one in the conference call

Opening line:

> A US Air Force F-35 pilot spent 50 minutes on an airborne conference call with Lockheed Martin engineers trying to solve a problem with his fighter jet before he ejected

Am I illiterate or misreading it?

> After going through system checklists in an attempt to remedy the problem, the pilot got on a conference call with engineers from the plane’s manufacturer, Lockheed Martin, *as the plane flew near the air base.*

Is this actually some insane weasel-wording by CNN? "We never said the pilot (he is in fact a pilot) was the one flying the jet, we just said 'as the plane flew', not 'as he flew the plane', using passive voice, so we're not wrong - but it was another pilot flying the plane"

New comment by andy_xor_andrew in "It’s not wrong that "\u{1F926}\u{1F3FC}\u200D\u2642\uFE0F".length == 7 (2019)"

andy_xor_andrew — Fri, 22 Aug 2025 18:15:04 +0000

https://news.ycombinator.com/item?id=27529697

New comment by andy_xor_andrew in "Home Depot sued for 'secretly' using facial recognition at self-checkouts"

andy_xor_andrew — Thu, 21 Aug 2025 16:20:09 +0000

I was wondering this as well. The green box could simply indicate it detected a face, using something like YOLO, or even a simpler technique like some point-and-shoot cameras use to decide where to focus (on faces, obviously).

LLMs Are Magic – Their Applications Should Be, Too

andy_xor_andrew — Tue, 08 Jul 2025 14:56:17 +0000

Article URL: https://debugti.me/posts/llm-magic/

Comments URL: https://news.ycombinator.com/item?id=44500588

Points: 1

# Comments: 0

New comment by andy_xor_andrew in "Xfinity using WiFi signals in your house to detect motion"

andy_xor_andrew — Mon, 30 Jun 2025 20:58:40 +0000

Yeah, it's bizarre.

Normally the pathway for this kind of thing would be:

1. theorized

2. proven in a research lab

3. not feasible in real-world use (fizzles and dies)

if you're lucky the path is like

1. theorized

2. proven in a research lab

3. actually somewhat feasible in real-world use!

4. startups / researchers split off to attempt to market it (fizzles and dies)

the fact that this ended up going from research paper to "Comcast can tell if I'm home based on my body's physical interaction with wifi waves" is absolutely wild

Hacker News Clone (Microeval)

andy_xor_andrew — Wed, 25 Jun 2025 19:52:43 +0000

Article URL: https://artificialanalysis.ai/microevals/hackernews-clone-1749859153173

Comments URL: https://news.ycombinator.com/item?id=44381246

Points: 2

# Comments: 0