Hacker News: avdelazeri

New comment by avdelazeri in "There Will Be a Scientific Theory of Deep Learning"

avdelazeri — Sat, 25 Apr 2026 01:34:34 +0000

We must know, we will know.

New comment by avdelazeri in "Show HN: I rebuilt a 2000s browser strategy game on Cloudflare's edge"

avdelazeri — Wed, 15 Apr 2026 21:00:46 +0000

OGame and Travian are two names that really take me back. Those and Tribal Wars, I played them a lot back when I was a teenager.

New comment by avdelazeri in "The Claude Code Source Leak: fake tools, frustration regexes, undercover mode"

avdelazeri — Wed, 01 Apr 2026 22:30:45 +0000

Has Claude stopped claiming to be DeepSeek when prompted in Chinese yet? It wasn't long ago that it hit the news and blogs.

New comment by avdelazeri in "GitHub backs down, kills Copilot pull-request ads after backlash"

avdelazeri — Tue, 31 Mar 2026 09:46:44 +0000

Afaik turning up the temperature slowly wouldn't work on an actual frog. But it works on people without fail.

New comment by avdelazeri in "Entities enabling scientific fraud at scale (2025)"

avdelazeri — Thu, 12 Mar 2026 00:05:38 +0000

Lack of will. That was one of the main results from the survey from Whitaker in 2020. Making your code reusable and easy to understand is significant work that has no direct benefit for a researcher's career. Particularly because research code grows wildly as researchers keep trying things.

Working on the next paper is seen as the better choice.

Moreover, if your code is easy for others to run then you're likely to be hit with people wanting support, or even open yourself to the risk of someone finding errors in your code (the survey's result, not my own beliefs).

There are other issues, of course. Just running the code doesn't mean something is replicable. Science is replicated when studies are repeated independently by many teams.

There are many other failure modes: SOTA-hacking, benchmarking, and lack of rigorous analysis of results, for example. And that's ignoring data leakage or other sillier mistakes (that still happen in published work, even in work published in very good venues).

Authors don't do much of anything to disabuse readers of the suspicion that they simply got really lucky with their pseudorandom number generators during initialization, shuffling, etc. As long as it beats SOTA, who cares if it is actually a meaningful improvement? Of course, doing multiple runs with a decent bootstrap to get some estimate of the average behavior is often really expensive and really slow, and deadlines are always so tight. There is also the matter that the field converged on an experimentation methodology that isn't actually correct. Once you start reusing test sets, your experiments stop being approximations of a random sampling process and you quickly find yourself outside of the guarantees provided by statistical theory (this is a similar sort of mistake to the one scientists in other fields make when interpreting p-values). There be dragons out there, and statistical demons might come to eat your heart, or your network could converge to an implementation of nethack.
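To make the "multiple runs with a decent bootstrap" point concrete, here is a minimal sketch of what I have in mind, assuming you already have one test score per independent training run (seed) for each method. The function name and all the numbers are made up for illustration, and a percentile bootstrap is just one reasonable choice here, not the only one.

```python
import random
import statistics


def bootstrap_diff_ci(baseline, candidate, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(candidate) - mean(baseline).

    Each list holds one test score per independent training run (seed).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample runs with replacement, independently for each method.
        b = rng.choices(baseline, k=len(baseline))
        c = rng.choices(candidate, k=len(candidate))
        diffs.append(statistics.fmean(c) - statistics.fmean(b))
    diffs.sort()
    # Read the interval endpoints off the sorted resampled differences.
    lo = diffs[int((alpha / 2) * n_resamples)]
    hi = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical accuracies from five seeds per method (made-up numbers).
baseline_runs = [0.712, 0.705, 0.721, 0.698, 0.716]
candidate_runs = [0.719, 0.701, 0.727, 0.709, 0.713]

lo, hi = bootstrap_diff_ci(baseline_runs, candidate_runs)
print(f"95% CI for mean improvement: [{lo:+.4f}, {hi:+.4f}]")
if lo <= 0.0 <= hi:
    print("Interval contains 0: the 'improvement' may just be seed noise.")
```

Note that this only addresses seed variance. It does nothing about the test set reuse problem, which needs a fresh held-out set rather than more resampling.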

Scale also plays into that, of course, and use of private data as the other comment mentioned.

Ultimately, Machine Learning research is just too competitive and moves too fast. There are tens of thousands (hundreds maybe?) of people all working on closely related problems, all rushing to publish their results before someone else publishes something that overlaps too much with their own work. Nobody is going to be as careful as they should, because they can't afford to. It's more profitable to carefully find the minimal publishable amount of work and do that, splitting a result into several small papers you can pump out every few months. The first thing that tends to get sacrificed during that process is reliability.

New comment by avdelazeri in "AI agents break rules under everyday pressure"

avdelazeri — Wed, 03 Dec 2025 08:20:38 +0000

While I never measured it, this aligns with my own experiences.

It's better to have very shallow conversations where you keep regenerating outputs aggressively, only picking the best results. Asking for fixes, restructuring or elaborations on generated content has fast diminishing returns. And once it has made a mistake (or hallucinated) it will not stop erring even if you provide evidence that it is wrong. LLMs just commit to certain things very strongly.

New comment by avdelazeri in "Meta's new EU regulator is contractually prohibited from hurting Meta's feelings"

avdelazeri — Tue, 02 Dec 2025 22:04:13 +0000

Regulatory capture is an ugly thing.

New comment by avdelazeri in "Is America's jobs market nearing a cliff?"

avdelazeri — Mon, 01 Dec 2025 13:33:06 +0000

Idk about interviewing, but there are many benefits to opening fake job listings (gathering a database of people, keeping track of people looking for jobs, etc.), which is why people do it. Data is valuable.

New comment by avdelazeri in "AI CEO – Replace your boss before they replace you"

avdelazeri — Thu, 27 Nov 2025 22:28:11 +0000

That is more or less what I fear. If the top 10 percent already account for half of all consumer spending, and inequality keeps getting worse and worse, that's probably where it will end.

New comment by avdelazeri in "Americans can't afford their cars any more and Wall Street is worried"

avdelazeri — Tue, 21 Oct 2025 00:39:13 +0000

It's fine, in the future we will all subscribe to the self driving robot taxi, own nothing, and be (un)happy

New comment by avdelazeri in "Basic Math Textbook: The Napkin Project"

avdelazeri — Mon, 06 Oct 2025 15:15:15 +0000

True. There's Morita's A Mathematical Gift for the same audience.

New comment by avdelazeri in "Basic Math Textbook: The Napkin Project"

avdelazeri — Mon, 06 Oct 2025 11:34:30 +0000

That's common with mathematics books. Weil's Basic Number Theory is enough to give the unsuspecting quite the fright, despite the name

New comment by avdelazeri in "The collapse of the econ PhD job market"

avdelazeri — Fri, 03 Oct 2025 20:20:55 +0000

Cargo cult mathematics

New comment by avdelazeri in "China's 200M gig workers are a warning for the world"

avdelazeri — Sat, 20 Sep 2025 18:45:53 +0000

If someone is working but still needs welfare then the state is just subsidizing company payrolls by indirect means. Strongly disagree that gig work is fine as long as there is welfare.

New comment by avdelazeri in "AI tools are making the world look weird"

avdelazeri — Fri, 19 Sep 2025 10:29:58 +0000

As one of the professors I had undergrad classes with liked to say, "Economics is the only field where you can be awarded the Nobel prize for showing A and then next year someone gets a Nobel prize for showing not A".

New comment by avdelazeri in "Dark patterns killed my wife's Windows 11 installation"

avdelazeri — Fri, 19 Sep 2025 10:21:59 +0000

I abandoned Windows over 10 years ago, when a mandatory Windows 10 update rewrote my UEFI partition and messed up my dual boot setup. It took me an entire evening to fix, at which point I swore that no machine owned by me would ever run Windows again.

The more I hear about Windows 11, the more I think I really dodged a bullet there.

New comment by avdelazeri in "Famous cognitive psychology experiments that failed to replicate"

avdelazeri — Wed, 17 Sep 2025 21:05:41 +0000

Don't get me started, I have seen repos that I'm fairly sure never ran in their presented form. A guy in our lab thinks authors purposefully mess up their code when publishing on GitHub to make it harder to replicate. I'm starting to come around to his theory.

New comment by avdelazeri in "Fraudulent Publishing in the Mathematical Sciences"

avdelazeri — Thu, 11 Sep 2025 13:03:32 +0000

When I took business 101 in college, one of the first things they taught us was that long-term, fixed metrics will always be gamed, that both the ones measuring and the ones being measured will replace the real results with the metrics and sacrifice the first for the second. I understand that this is common knowledge in the administrative world. Yet every single performance metric becomes ossified as the only target that matters, every time. Why?