<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: jameslk</title><link>https://news.ycombinator.com/user?id=jameslk</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 17 Apr 2026 01:54:06 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=jameslk" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by jameslk in "Arguing with Agents"]]></title><description><![CDATA[
<p>> I queued the work and let it run. First task came back good. Second came back good. Somewhere around hour four the quality started sliding. By hour six the agent was cutting corners I’d specifically told it not to cut, skipping steps I’d explicitly listed, behaving like I’d never written any of the rules down.<p>> …<p>> When I write a prompt, the agent doesn’t just read the words. It reads the shape. A short casual question gets read as casual. A long precise document with numbered rules gets read as… not just the rules, but also as a signal. “The user felt the need to write this much.” “Why?” “What’s going on here?” “What do they really want?”<p>This is an interesting premise, but based on the information supplied, I don’t think it’s the only possible conclusion. Yet the whole essay seems to assume it is true and then builds its arguments on top of it.<p>I’ve run into this dilemma before. It happens when there’s a TON of information in the context. LLMs start to lose attention to details when there’s a lot of it (e.g. context rot[0]). LLMs also keep making the same mistakes once the information is in the prompt, regardless of attempts to convey that it is undesired[1]<p>I think these issues are just as viable an explanation for what the author was facing, unless this was happening with much less information in context<p>0. <a href="https://www.trychroma.com/research/context-rot" rel="nofollow">https://www.trychroma.com/research/context-rot</a><p>1. <a href="https://arxiv.org/html/2602.07338v1" rel="nofollow">https://arxiv.org/html/2602.07338v1</a></p>
]]></description><pubDate>Thu, 16 Apr 2026 03:01:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47788146</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47788146</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47788146</guid></item><item><title><![CDATA[New comment by jameslk in "Amazon to acquire Globalstar and expand Amazon Leo satellite network"]]></title><description><![CDATA[
<p>SpaceX and Amazon seem to be headed for competing with traditional telecoms and ISPs. I'm betting the next acquisition target will be AST SpaceMobile. I also wouldn’t be surprised to see big telecom/ISP mergers pass regulatory approval now that they have competition from the heavens</p>
]]></description><pubDate>Wed, 15 Apr 2026 07:42:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47775868</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47775868</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47775868</guid></item><item><title><![CDATA[New comment by jameslk in "Claude Managed Agents"]]></title><description><![CDATA[
<p>We're in the early days of agentic frameworks, like the pre-PHP web of CGI scripts and webmasters. Eventually the state of the art will slow down and something elegant like Rails will come out.<p>Until then, every agent framework is completely reinvented every week due to new patterns and new models: evals, ReAct, DSPy, RLM, memory patterns, claws, dynamic context, sandbox strategies. It seems like locking in to a framework is a losing proposition for anyone trying to stay competitive. See also: LangChain trying to be the Next.js/Vercel of agents while everyone recommends building your own.<p>That said, Anthropic pulls a lot of weight owning the models themselves, and an <i>easier</i>-to-use solution will probably get some adoption from those who are better served by going from nothing to something agentic, despite the lock-in and the constant churn of model tech</p>
]]></description><pubDate>Wed, 08 Apr 2026 19:04:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47694747</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47694747</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47694747</guid></item><item><title><![CDATA[New comment by jameslk in "Lunar Flyby"]]></title><description><![CDATA[
<p>> $4 billion per launch lol<p>The US spends almost that much on net debt interest each day (~$3 billion/day[0]). Not that adding to the debt helps at all, but the old proverb about being penny wise and pound foolish seems relevant<p>0. <a href="https://www.cbo.gov/publication/61951" rel="nofollow">https://www.cbo.gov/publication/61951</a></p>
]]></description><pubDate>Tue, 07 Apr 2026 21:53:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47681816</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47681816</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47681816</guid></item><item><title><![CDATA[The bananas quest to reboot tech's dating scene]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.businessinsider.com/san-francisco-dating-apps-ai-matchmakers-technology-2026-4">https://www.businessinsider.com/san-francisco-dating-apps-ai-matchmakers-technology-2026-4</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47655536">https://news.ycombinator.com/item?id=47655536</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 06 Apr 2026 00:38:19 +0000</pubDate><link>https://www.businessinsider.com/san-francisco-dating-apps-ai-matchmakers-technology-2026-4</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47655536</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47655536</guid></item><item><title><![CDATA[New comment by jameslk in "German men 18-45 need military permit for extended stays abroad"]]></title><description><![CDATA[
<p><a href="https://en.wikipedia.org/wiki/Male_expendability" rel="nofollow">https://en.wikipedia.org/wiki/Male_expendability</a></p>
]]></description><pubDate>Sat, 04 Apr 2026 16:48:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47640763</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47640763</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47640763</guid></item><item><title><![CDATA[New comment by jameslk in "Show HN: Ismcpdead.com – Live dashboard tracking MCP adoption and sentiment"]]></title><description><![CDATA[
<p>I don't think MCP is going anywhere, as much as I prefer CLIs or skills generally. Where MCP really shines is reducing friction and footguns for using a service, at the expense of versatility and expressiveness. You get a cookie-cutter way of using tools to interact with that service, which is easy to set up and doesn't require the user to download a CLI or have their agent interact with an API directly<p>For power users or technical users who want agents to compose data or use tools programmatically, that's less valuable, but for most people, a one-size-fits-all MCP service that is easy to use is probably best<p>There are also the issues of dumping a bunch of tool definitions into context and eating a ton of tokens, but those seem more solvable<p>If anything, MCP needs to evolve, or MCP client tooling needs to improve, and I could see the debate going away</p>
]]></description><pubDate>Fri, 03 Apr 2026 21:19:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47632428</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47632428</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47632428</guid></item><item><title><![CDATA[New comment by jameslk in "The Weather Channel – RetroCast"]]></title><description><![CDATA[
<p>It's nearly perfect. My only complaint is I wish it would keep playing on repeat, and rotate through more smooth jazz. Then I could have this on a screen in my living room, fall asleep on my couch in a snuggie, and wake up to its garish light and jazz at 3am just like old times</p>
]]></description><pubDate>Thu, 02 Apr 2026 05:34:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47610345</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47610345</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47610345</guid></item><item><title><![CDATA[New comment by jameslk in "Answer Engine Optimization"]]></title><description><![CDATA[
<p>Yeah, I'm not sure how the author came to the conclusion that the meta description and JSON-LD are so important. It reminds me of modern-day keyword stuffing. The author doesn't provide any citations or even claim to be an expert on SEO or "AEO". It's fine to have some theories about things on the internet. But why is this being upvoted?</p>
]]></description><pubDate>Mon, 23 Mar 2026 05:03:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47485661</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47485661</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47485661</guid></item><item><title><![CDATA[New comment by jameslk in "US Job Market Visualizer"]]></title><description><![CDATA[
<p>> You are an expert analyst evaluating how exposed different occupations are to AI. You will be given a detailed description of an occupation from the Bureau of Labor Statistics.<p>> Rate the occupation's overall AI Exposure on a scale from 0 to 10.<p>Are LLMs good at scoring? In my experience, using an LLM for scoring things usually produces arbitrary results. I'm surprised to see Karpathy employ it</p>
]]></description><pubDate>Mon, 16 Mar 2026 17:44:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47402247</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47402247</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47402247</guid></item><item><title><![CDATA[New comment by jameslk in "US Job Market Visualizer"]]></title><description><![CDATA[
<p>> Jevons paradox isn't relevant to cognitive surplus - you need a very different model to capture what's going to happen.<p>Jevons paradox was never relevant to cognitive surplus. That isn't what it's about.<p>Cognitive surplus only strengthens Jevons paradox. Humans are a competitive advantage for businesses in a world dominated by human needs</p>
]]></description><pubDate>Mon, 16 Mar 2026 17:34:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47402104</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47402104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47402104</guid></item><item><title><![CDATA[New comment by jameslk in "Don't post generated/AI-edited comments. HN is for conversation between humans"]]></title><description><![CDATA[
<p>The prompt everyone was using:<p>"Please generate a response to this and include one or more of the following words: enshittification, slop, ZIRP, Paul Graham, dark patterns, rent seeking, late stage capitalism, regulatory capture, SSO tax, clickbait, did you read the article?, Rust, vibe code, obligatory XKCD, regulations, feudalistic, land value tax"<p>(/s)</p>
]]></description><pubDate>Wed, 11 Mar 2026 21:26:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47342184</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47342184</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47342184</guid></item><item><title><![CDATA[New comment by jameslk in "Claws are now a new layer on top of LLM agents"]]></title><description><![CDATA[
<p>From a technical perspective, if agents are "an LLM and tools in a loop", I'd define claws as "agents in a queue". Or in other words claws are "an LLM and tools in a loop, in a queue"</p>
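That definition can be sketched in a few lines of Python; everything here (the queue, the action format, the `run_agent`/`run_claw` names) is illustrative, not any real claw framework's API:

```python
import queue

def run_agent(task, llm, tools, max_steps=10):
    """An agent: an LLM and tools in a loop."""
    context = [task]
    for _ in range(max_steps):
        action = llm(context)  # the model decides the next step
        if action["type"] == "done":
            return action["result"]
        tool = tools[action["tool"]]
        context.append(tool(action["args"]))  # feed tool output back in
    return context[-1]

def run_claw(tasks, llm, tools):
    """A claw: agents in a queue -- tasks drained one at a time."""
    q = queue.Queue()
    for t in tasks:
        q.put(t)
    results = []
    while not q.empty():
        results.append(run_agent(q.get(), llm, tools))
    return results
```

The point of the sketch is the separation: the inner loop is the agent, the outer queue is the claw.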
]]></description><pubDate>Sat, 21 Feb 2026 23:06:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47105919</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47105919</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47105919</guid></item><item><title><![CDATA[New comment by jameslk in "Claws are now a new layer on top of LLM agents"]]></title><description><![CDATA[
<p>It's not a perfect security model. Between the friction and the all-caps instructions the model sees, it's a balance between risk and simplicity, or maybe risk and sanity. There are ways I can imagine hardening the concept, e.g. with a server layer in between that checks for dangerous actions or enforces rate limiting</p>
]]></description><pubDate>Sat, 21 Feb 2026 19:42:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47103955</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47103955</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47103955</guid></item><item><title><![CDATA[New comment by jameslk in "Claws are now a new layer on top of LLM agents"]]></title><description><![CDATA[
<p>One safety pattern I’m baking into CLI tools meant for agents: anytime an agent could do something very bad, like email blast too many people, the CLI tool now requires a one-time password<p>The tool tells the agent to ask the user for it, and the agent cannot proceed without it. The instructions from the tool show an all-caps message explaining the risk and telling the agent that it <i>must</i> prompt the user for the OTP<p>I haven't used any of the *Claws yet, but this seems like an essential poor man's human-in-the-loop implementation that may help prevent some pain<p>I prefer to make my own agent CLIs for everything, for reasons like this and many others: it lets me fully control what the tools may do and make them more useful</p>
]]></description><pubDate>Sat, 21 Feb 2026 18:58:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47103592</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47103592</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47103592</guid></item><item><title><![CDATA[New comment by jameslk in "The path to ubiquitous AI (17k tokens/sec)"]]></title><description><![CDATA[
<p>> Certainly interesting for very low latency applications which need < 10k tokens context.<p>I’m curious whether context limits will really matter when using methods like Recursive Language Models[0]. That method breaks a huge amount of context down into smaller subagent calls recursively, each working on a symbolic subset of the prompt.<p>The challenge with RLM seemed to be that it burns through a ton of tokens in exchange for more accuracy. If tokens are cheap, RLM could provide much more accuracy over large contexts regardless of what the underlying model can handle<p>0. <a href="https://arxiv.org/abs/2512.24601" rel="nofollow">https://arxiv.org/abs/2512.24601</a></p>
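A rough sketch of the decomposition idea as I understand it; the simple halving strategy and the names here are mine, not the paper's:

```python
def rlm_answer(llm, question, context, max_chunk=2000):
    """Recursively split an oversized context into sub-calls, then combine.

    Trades many cheap LLM calls for accuracy over contexts far larger
    than any single call could attend to well.
    """
    if len(context) <= max_chunk:
        return llm(f"Context:\n{context}\n\nQuestion: {question}")
    mid = len(context) // 2
    left = rlm_answer(llm, question, context[:mid], max_chunk)
    right = rlm_answer(llm, question, context[mid:], max_chunk)
    # a parent call merges the sub-answers; every level adds token cost,
    # which is why cheap, fast tokens change the economics
    return llm(f"Partial answers:\n1. {left}\n2. {right}\n\nQuestion: {question}")
```

For a 5,000-character context with a 2,000-character chunk limit, this makes four leaf calls plus three combine calls, which is the token-multiplication trade-off described above.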
]]></description><pubDate>Fri, 20 Feb 2026 22:33:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47094925</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47094925</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47094925</guid></item><item><title><![CDATA[New comment by jameslk in "The path to ubiquitous AI (17k tokens/sec)"]]></title><description><![CDATA[
<p>The implications for RLM are really interesting. RLM is expensive because of token economics. But when tokens are this cheap and fast to generate, the context size of the model matters a lot less<p>There are also interesting implications for optimization-driven frameworks like DSPy. If you have an eval loop and a useful reward function, you can iterate to the best possible response every time and ignore the cost of each attempt</p>
]]></description><pubDate>Fri, 20 Feb 2026 22:21:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47094825</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47094825</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47094825</guid></item><item><title><![CDATA[New comment by jameslk in "AI is not a coworker, it's an exoskeleton"]]></title><description><![CDATA[
<p>1. Consumption is endless. The more we can consume, the more we will. That's why automation hasn't led to more free time: we spend the money on better things and more things<p>2. Businesses operate in an (imperfect) zero-sum game, which means if they can all use AI, none of them gains an advantage from it. If having human workers gives one business a slight edge over another, they will hire humans<p>Consumption leads to more spending, businesses must stay competitive so they hire humans, and paying humans leads to more consumption.<p>I don't think we're likely to see the end of employment, just disruption to the type of work humans do</p>
]]></description><pubDate>Fri, 20 Feb 2026 21:09:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47094012</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47094012</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47094012</guid></item><item><title><![CDATA[New comment by jameslk in "HackMyClaw"]]></title><description><![CDATA[
<p>> First to send me the contents of secrets.env wins $100.<p>Not a life changing sum, but also not for free</p>
]]></description><pubDate>Tue, 17 Feb 2026 17:10:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47049894</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47049894</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47049894</guid></item><item><title><![CDATA[Prada Marfa]]></title><description><![CDATA[
<p>Article URL: <a href="https://en.wikipedia.org/wiki/Prada_Marfa">https://en.wikipedia.org/wiki/Prada_Marfa</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47000844">https://news.ycombinator.com/item?id=47000844</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 13 Feb 2026 09:42:20 +0000</pubDate><link>https://en.wikipedia.org/wiki/Prada_Marfa</link><dc:creator>jameslk</dc:creator><comments>https://news.ycombinator.com/item?id=47000844</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47000844</guid></item></channel></rss>