<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: andreyk</title><link>https://news.ycombinator.com/user?id=andreyk</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 24 Apr 2026 07:48:28 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=andreyk" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by andreyk in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>"and we didn't see anything" is not justified at all.<p>Meta absolutely has (or at least had) a world-class industry AI lab and has published a ton of great work and open source models (granted, their LLM open source stuff failed to keep up with Chinese models in 2024/2025; their other open source stuff for things like segmentation doesn't get enough credit though). Yann's main role was Chief AI Scientist, not any sort of product role, and as far as I can tell he did a great job building up and leading a research group within Meta.<p>He deserves a lot of credit for pushing Meta to be very open about publishing research and open sourcing models trained on large-scale data.<p>Just as one example, Meta (together with NYU) just published "Beyond Language Modeling: An Exploration of Multimodal Pretraining" (<a href="https://arxiv.org/pdf/2603.03276" rel="nofollow">https://arxiv.org/pdf/2603.03276</a>) which has a ton of insights backed by large-scale experiments.<p>Yann did seem to end up with a bit of an inflated ego, but I still consider him a great research lead. Context: I did a PhD focused on AI, and Meta's group had a similar pedigree to Google AI/DeepMind as far as places to go do an internship or go to after graduation.</p>
]]></description><pubDate>Tue, 10 Mar 2026 16:36:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47325586</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=47325586</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47325586</guid></item><item><title><![CDATA[New comment by andreyk in "The Waymo World Model"]]></title><description><![CDATA[
<p>This is quite misleading... From the article:<p>“When the Waymo vehicle encounters a particular situation on the road, the autonomous driver can reach out to a human fleet response agent for additional information to contextualize its environment,” the post reads. “The Waymo Driver [software] does not rely solely on the inputs it receives from the fleet response agent and it is in control of the vehicle at all times.” [from Waymo's own blog <a href="https://waymo.com/blog/2024/05/fleet-response/" rel="nofollow">https://waymo.com/blog/2024/05/fleet-response/</a>]<p>What's the problem with this?</p>
]]></description><pubDate>Fri, 06 Feb 2026 20:00:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=46917422</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=46917422</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46917422</guid></item><item><title><![CDATA[Notes from an Authoritarian Year]]></title><description><![CDATA[
<p>Article URL: <a href="https://leftymorrill.substack.com/p/notes-from-an-authoritarian-year">https://leftymorrill.substack.com/p/notes-from-an-authoritarian-year</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46501451">https://news.ycombinator.com/item?id=46501451</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 05 Jan 2026 17:05:41 +0000</pubDate><link>https://leftymorrill.substack.com/p/notes-from-an-authoritarian-year</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=46501451</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46501451</guid></item><item><title><![CDATA[New comment by andreyk in "Sycophancy is the first LLM "dark pattern""]]></title><description><![CDATA[
<p>To say that LLMs are 'predictive text models trained to match patterns in their data, statistical algorithms, not brains, not systems with “psychology” in any human sense' is not entirely accurate. Classic LLMs like GPT-3, sure. But LLM-powered chatbots (ChatGPT, Claude - which is what this article is really about) go through much more than just predict-next-token training (RLHF, presumably now reasoning training, and who knows what else).</p>
]]></description><pubDate>Tue, 02 Dec 2025 03:24:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=46117117</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=46117117</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46117117</guid></item><item><title><![CDATA[New comment by andreyk in "A new AI winter is coming?"]]></title><description><![CDATA[
<p>This blog post is full of bizarre statements and the author seems almost entirely ignorant of the history or present of AI. I think it's fair to argue there may be an AI bubble that will burst, but this blog post is plainly wrong in many ways.<p>Here are a few clarifications (sorry this is so long...):<p>"I should explain for anyone who hasn't heard that term [AI winter]... there was much hope, as there is now, but ultimately the technology stagnated."<p>The term AI winter typically refers to a period of reduced funding for AI research/development, not to the technology stagnating (the technology failing to deliver on expectations was the cause of an AI winter, not the definition of one).<p>"[When GPT3 came out, pre-ChatGPT] People were saying that this meant that the AI winter was over, and a new era was beginning."<p>People tend to agree there have already been two AI winters: the first tied to the disappointments and general lack of progress of symbolic AI (1970s), and the second related to expert systems (late 1980s). Those AI winters have long been over. The Deep Learning revolution started in ~2012, and by 2020 (GPT-3) huge amounts of talent and money had already been flowing into AI for years. This trend just accelerated with ChatGPT.<p>"[After symbolic AI] So then came transformers. Seemingly capable of true AI, or, at least, scaling to being good enough to be called true AI, with astonishing capabilities ... the huge research breakthrough was figuring out that, by starting with essentially random coefficients (weights and biases) in the linear algebra, and during training back-propagating errors, these weights and biases could eventually converge on something that worked."<p>Transformers came about in 2017. 
The first wave of excitement about neural nets and backpropagation goes all the way back to the late 80s/early 90s, and AI subfields (computer vision, NLP, and to a lesser extent robotics) were already heavily ML-based by the 2000s, just not neural-net based (this changed in roughly 2012).<p>"All transformers have a fundamental limitation, which can not be eliminated by scaling to larger models, more training data or better fine-tuning ... This is the root of the hallucination problem in transformers, and is unsolveable because hallucinating is all that transformers can do."<p>The 'highest number' token is not necessarily chosen; that depends on the decoding algorithm. That aside, 'the next token will be generated to match that bad choice' makes it sound like once you generate one 'wrong' token the rest of the output is also wrong. A token is a few characters, and need not 'poison' the rest of the output.<p>Beyond that, there are plenty of ways to 'recover' from starting to go down the wrong route. A key aspect of why reasoning in LLMs works well is that it typically incorporates backtracking - going back earlier in the reasoning to verify details or whatnot. You can do uncertainty estimation in the decoding algorithm, use a secondary model; plenty of things can help (here is a detailed survey <a href="https://arxiv.org/pdf/2311.05232" rel="nofollow">https://arxiv.org/pdf/2311.05232</a>, one of several that are easy to find).<p>"The technology won't disappear – existing models, particularly in the open source domain, will still be available, and will still be used, but expect a few 'killer app' use cases to remain, with the rest falling away."<p>A quick Google search shows ChatGPT currently has 800 million weekly active users who are using it for all sorts of things. 
AI-assisted programming is certainly here to stay, and there are plenty of other industries in which AI will be part of the workflow (helping do research, take notes, summarize, build presentations, etc.).<p>I think discussion is good, but it's disappointing to see stuff this inaccurate make the front page of HN.</p>
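To pin down the decoding point above, here is a minimal toy sketch (made-up logits, not any real model's API) showing that only greedy decoding always picks the "highest number" token; sampling-based decoders draw from the full distribution, so lower-scoring tokens are routinely chosen too:

```python
import math
import random

def decode_step(logits, temperature=1.0, greedy=False):
    """One decoding step: pick a token index from raw logits.

    greedy=True always takes the argmax (the "highest number" token);
    otherwise we sample from the softmax distribution, so lower-
    probability tokens can be, and routinely are, chosen.
    """
    if greedy:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature, shifted by the max for numerical stability.
    m = max(l / temperature for l in logits)
    weights = [math.exp(l / temperature - m) for l in logits]
    return random.choices(range(len(logits)), weights=weights, k=1)[0]

# Greedy always returns index 1 here; sampling can return any index.
decode_step([0.1, 5.0, 0.2], greedy=True)
```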
]]></description><pubDate>Mon, 01 Dec 2025 17:59:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=46110623</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=46110623</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46110623</guid></item><item><title><![CDATA[New comment by andreyk in "Poker Tournament for LLMs"]]></title><description><![CDATA[
<p>For reference, the details about how the LLMs are queried:<p>"How the players work<p><pre><code>    All players use the same system prompt
    Each time it's their turn, or after a hand ends (to write a note), we query the LLM
    At each decision point, the LLM sees:
        General hand info — player positions, stacks, hero's cards
        Player stats across the tournament (VPIP, PFR, 3bet, etc.)
        Notes hero has written about other players in past hands
    From the LLM, we expect:
        Reasoning about the decision
        The action to take (executed in the poker engine)
        A reasoning summary for the live viewer interface
    Models have a maximum token limit for reasoning
    If there's a problem with the response (timeout, invalid output), the fallback action is fold"
</code></pre>
The fact that the models are given stats about the other models is rather disappointing to me; it makes this less interesting. Would be curious how this would go if the models had to rely only on their own notes/context. Maybe the stats are a way to save on costs, since this could get expensive...</p>
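For readers unfamiliar with the stats the models receive, here is a rough sketch of how VPIP and PFR are typically tallied; the hand-record format below is hypothetical, just for illustration:

```python
def player_stats(hands):
    """Compute two standard poker stats from a list of hand records.

    Each record is a dict like {"voluntary": bool, "raised_preflop": bool}
    (a made-up format for illustration).
    VPIP = % of hands where the player voluntarily put money in the pot;
    PFR  = % of hands where the player raised preflop.
    """
    n = len(hands)
    if n == 0:
        return {"vpip": 0.0, "pfr": 0.0}
    vpip = sum(h["voluntary"] for h in hands) / n * 100
    pfr = sum(h["raised_preflop"] for h in hands) / n * 100
    return {"vpip": round(vpip, 1), "pfr": round(pfr, 1)}
```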
]]></description><pubDate>Tue, 28 Oct 2025 16:30:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=45735012</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=45735012</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45735012</guid></item><item><title><![CDATA[New comment by andreyk in "Poker Tournament for LLMs"]]></title><description><![CDATA[
<p>But LLMs would presumably also condition on past observations of opponents - i.e., they can likewise adapt their strategy during repeated play (especially if given a budget for reasoning as opposed to direct sampling from their output distributions).<p>The rules state the LLMs do get "Notes hero has written about other players in past hands" and "Models have a maximum token limit for reasoning", so the outcome might at least be more interesting as a result.<p>The top models on the leaderboard are notably also the ones strongest in reasoning. They even show the models' notes, e.g. Grok on Claude: "About: claude
Called preflop open and flop bet in multiway pot but folded to turn donk bet after checking, suggesting a passive postflop style that folds to aggression on later streets."<p>PS: The sampling params also matter a lot (with temperature 0 the LLMs are going to be very consistent; going higher, they could get more 'creative').<p>PPS: The models getting statistics about other models' behavior seems kind of like cheating; they rely on it heavily, e.g. 'I flopped middle pair (tens) on a paired board (9s-Th-9d) against LLAMA, a loose passive player (64.5% VPIP, only 29.5% PFR)'</p>
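To make the temperature point concrete, here is a small self-contained sketch of softmax with temperature (toy logits, no real model involved): as temperature drops toward 0 the distribution collapses onto the top token, and higher temperatures flatten it:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities at a given temperature.

    Near temperature 0 the distribution concentrates on the argmax
    (near-deterministic, "very consistent" outputs); higher temperatures
    flatten it, making unlikely tokens more probable ("more creative").
    """
    if temperature <= 0:
        # Temperature 0 is conventionally implemented as greedy argmax.
        raise ValueError("use greedy argmax decoding for temperature 0")
    m = max(logits)  # shift for numerical stability
    exps = [math.exp((l - m) / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, with logits [2.0, 1.0, 0.0], the top token's probability is near 1.0 at temperature 0.1 but close to uniform at temperature 10.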
]]></description><pubDate>Tue, 28 Oct 2025 16:19:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=45734858</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=45734858</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45734858</guid></item><item><title><![CDATA[New comment by andreyk in "Show HN: When is the next Caltrain? (minimal webapp)"]]></title><description><![CDATA[
<p>haha nice, the official caltrain schedule is a bit of a hassle to parse...</p>
]]></description><pubDate>Wed, 06 Aug 2025 17:52:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=44815320</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=44815320</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44815320</guid></item><item><title><![CDATA[Astrocade rolls out AI agent-powered game creation experience]]></title><description><![CDATA[
<p>Article URL: <a href="https://gamesbeat.com/astrocade-rolls-out-ai-agent-powered-game-creation-experience-so-anyone-can-create-games/">https://gamesbeat.com/astrocade-rolls-out-ai-agent-powered-game-creation-experience-so-anyone-can-create-games/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44738949">https://news.ycombinator.com/item?id=44738949</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 30 Jul 2025 20:10:25 +0000</pubDate><link>https://gamesbeat.com/astrocade-rolls-out-ai-agent-powered-game-creation-experience-so-anyone-can-create-games/</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=44738949</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44738949</guid></item><item><title><![CDATA[I Came to Study Aging. Now I'm Trapped in ICE Detention]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.nytimes.com/2025/05/13/opinion/ice-detention-russian-scientist.html">https://www.nytimes.com/2025/05/13/opinion/ice-detention-russian-scientist.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43974645">https://news.ycombinator.com/item?id=43974645</a></p>
<p>Points: 21</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 13 May 2025 16:24:14 +0000</pubDate><link>https://www.nytimes.com/2025/05/13/opinion/ice-detention-russian-scientist.html</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=43974645</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43974645</guid></item><item><title><![CDATA[Show HN: Astrocade - AI-powered game creation tool]]></title><description><![CDATA[
<p>Hi HN!<p>My startup has just launched a web demo of our tool to make 2D games with AI assistance. There are some other players in that space, but we are trying to make a truly codeless and casual platform for non-technical people to make cool stuff.<p>I came to work on it after studying AI in grad school, and it's actually been two years since then... Classic startup journey stuff, lots of mistakes and lessons learned the hard way, so it's super rewarding to finally get a widely shared demo out there.<p>Would love feedback / for you to break it in creative ways so we can fix it.</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43624679">https://news.ycombinator.com/item?id=43624679</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 08 Apr 2025 18:01:38 +0000</pubDate><link>https://www.astrocade.com/</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=43624679</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43624679</guid></item><item><title><![CDATA[New comment by andreyk in "How America's universities became debt factories"]]></title><description><![CDATA[
<p>Seems like a good overview, but I do find this bit unclear:
"But why don’t market forces correct these issues?<p>The answer lies in the unique shield that non-dischargeable student loans provide to educational institutions and lenders.<p>In a normal market, if a product consistently fails to deliver value, consumers stop buying it. Producers either improve or go out of business. But in the world of higher education, this feedback loop is broken.<p>Colleges and universities, shielded by the guarantee of student loan money, have no real incentive to improve their product or direct students to majors that have an ability to pay back their loans.<p>They can raise tuition year after year, even as the value of their degrees stagnates or declines."<p>Sure, colleges can charge a lot due to loans, but they are still competing with each other, and differences in tuition should matter to prospective students. I went to Georgia Tech over other universities because it was in-state and Georgia has generous scholarships for students with good grades. So why does competition among schools not lower costs?</p>
]]></description><pubDate>Sat, 14 Sep 2024 17:03:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=41541060</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=41541060</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41541060</guid></item><item><title><![CDATA[Astrocade raises $12M for AI-based social gaming platform]]></title><description><![CDATA[
<p>Article URL: <a href="https://venturebeat.com/games/astrocade-raises-12m-for-ai-based-social-gaming-platform/">https://venturebeat.com/games/astrocade-raises-12m-for-ai-based-social-gaming-platform/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=40650442">https://news.ycombinator.com/item?id=40650442</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 11 Jun 2024 19:28:47 +0000</pubDate><link>https://venturebeat.com/games/astrocade-raises-12m-for-ai-based-social-gaming-platform/</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=40650442</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40650442</guid></item><item><title><![CDATA[New comment by andreyk in "Extracting concepts from GPT-4"]]></title><description><![CDATA[
<p>Exciting to see this so soon after Anthropic's "Mapping the Mind of a Large Language Model" (under 3 weeks). I find these efforts really exciting; it is still common to hear people say "we have no idea how LLMs / Deep Learning works", but that is really a gross generalization, as work like this shows.<p>Wonder if this was a bit rushed out in response to Anthropic's release (as well as the departure of Jan Leike from OpenAI)... the paper link doesn't even go to Arxiv, and the analysis is not nearly as deep. Though who knows, it might be unrelated.</p>
]]></description><pubDate>Thu, 06 Jun 2024 19:54:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=40601640</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=40601640</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40601640</guid></item><item><title><![CDATA[Financial market applications of LLMs]]></title><description><![CDATA[
<p>Article URL: <a href="https://thegradient.pub/financial-market-applications-of-llms/">https://thegradient.pub/financial-market-applications-of-llms/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=40099344">https://news.ycombinator.com/item?id=40099344</a></p>
<p>Points: 253</p>
<p># Comments: 111</p>
]]></description><pubDate>Sat, 20 Apr 2024 18:03:15 +0000</pubDate><link>https://thegradient.pub/financial-market-applications-of-llms/</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=40099344</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40099344</guid></item><item><title><![CDATA[Mamba Explained]]></title><description><![CDATA[
<p>Article URL: <a href="https://thegradient.pub/mamba-explained/">https://thegradient.pub/mamba-explained/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=39876114">https://news.ycombinator.com/item?id=39876114</a></p>
<p>Points: 201</p>
<p># Comments: 44</p>
]]></description><pubDate>Sat, 30 Mar 2024 16:04:53 +0000</pubDate><link>https://thegradient.pub/mamba-explained/</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=39876114</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39876114</guid></item><item><title><![CDATA[Car-GPT: Could LLMs make self-driving cars happen?]]></title><description><![CDATA[
<p>Article URL: <a href="https://thegradient.pub/car-gpt/">https://thegradient.pub/car-gpt/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=39643150">https://news.ycombinator.com/item?id=39643150</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 08 Mar 2024 17:09:01 +0000</pubDate><link>https://thegradient.pub/car-gpt/</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=39643150</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39643150</guid></item><item><title><![CDATA[Do text embeddings perfectly encode text?]]></title><description><![CDATA[
<p>Article URL: <a href="https://thegradient.pub/text-embedding-inversion/">https://thegradient.pub/text-embedding-inversion/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=39619414">https://news.ycombinator.com/item?id=39619414</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 06 Mar 2024 18:40:37 +0000</pubDate><link>https://thegradient.pub/text-embedding-inversion/</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=39619414</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39619414</guid></item><item><title><![CDATA[Why Doesn't My Model Work?]]></title><description><![CDATA[
<p>Article URL: <a href="https://thegradient.pub/why-doesnt-my-model-work/">https://thegradient.pub/why-doesnt-my-model-work/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=39619037">https://news.ycombinator.com/item?id=39619037</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 06 Mar 2024 18:16:27 +0000</pubDate><link>https://thegradient.pub/why-doesnt-my-model-work/</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=39619037</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39619037</guid></item><item><title><![CDATA[New comment by andreyk in "Word2Vec received 'strong reject' four times at ICLR2013"]]></title><description><![CDATA[
<p>I finished a PhD in AI just this past year, and can assure you there are reviewers who spend hours per review to do it well. It's true that these days you can (and are more likely than not to) get unlucky with lazier reviewers, but that does not appear to have been the case with this paper.<p>For example, just see this from the review by f5bf:<p>"The main contribution of the paper comprises two new NLM architectures that facilitate training on massive data sets. The first model, CBOW, is essentially a standard feed-forward NLM without the intermediate projection layer (but with weight sharing + averaging  before applying the non-linearity in the hidden layer). The second model, skip-gram, comprises a collection of simple feed-forward nets that predict the presence of a preceding or succeeding word from the current word. The models are trained on a massive Google News corpus, and tested on a semantic and syntactic question-answering task. The results of these experiments look promising.<p>...<p>(2) The description of the models that are developed is very minimal, making it hard to determine how different they are from, e.g., the models presented in [15]. It would be very helpful if the authors included some graphical representations and/or more mathematical details of their models. Given that the authors still almost have one page left, and that they use a lot of space for the (frankly, somewhat superfluous) equations for the number of parameters of each model, this should not be a problem."<p>These reviews in turn led to significant (though apparently not significant enough) modifications to the paper (<a href="https://openreview.net/forum?id=idpCdOWtqXd60&noteId=C8Vn84fqSG8qa" rel="nofollow noreferrer">https://openreview.net/forum?id=idpCdOWtqXd60&noteId=C8Vn84f...</a>). These were some quality reviews, and the paper benefited from going through this review process, IMHO.</p>
]]></description><pubDate>Mon, 18 Dec 2023 20:28:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=38687733</link><dc:creator>andreyk</dc:creator><comments>https://news.ycombinator.com/item?id=38687733</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38687733</guid></item></channel></rss>