Hacker News: kiratp

New comment by kiratp in "An update on recent Claude Code quality reports"

kiratp — Fri, 24 Apr 2026 02:01:02 +0000

Agents making forward progress hours apart is an expected pattern and inference engines are being adapted to serve that purpose well.

It’s hard to do without killing performance, and it requires engineering work in the DC to get fast access to SSDs etc.

Disclosure: work on ai@msft. Opinions my own.

New comment by kiratp in "An update on recent Claude Code quality reports"

kiratp — Fri, 24 Apr 2026 01:42:14 +0000

OpenAI does this for all API calls

> Our systems will smartly ignore any reasoning items that aren’t relevant to your functions, and only retain those in context that are relevant. You can pass reasoning items from previous responses either using the previous_response_id parameter, or by manually passing in all the output items from a past response into the input of a new one.

https://developers.openai.com/api/docs/guides/reasoning

Disclosure - work on AI@msft

New comment by kiratp in "An update on recent Claude Code quality reports"

kiratp — Fri, 24 Apr 2026 01:37:56 +0000

By caching they mean “cached in GPU memory”. That’s a very very scarce resource.

Caching to RAM and disk is a thing, but it’s hard to keep performance up that way, and it’s still early days for that tech being deployed anywhere.

Disclosure: work on AI at Microsoft. Above is just common industry info (see work happening in vLLM for example)
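To make the tiering idea concrete, here is a toy sketch of a two-tier cache: a small "fast" tier standing in for GPU memory that spills least-recently-used entries to a larger "slow" tier standing in for RAM or disk. The class name, the tier labels, and the `prefix_id` key are all made up for illustration; this is not vLLM's API or anyone's production design, and it ignores the hard part (transfer latency).

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier cache: a small fast tier that spills LRU entries to a slow tier."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # stands in for scarce GPU memory (LRU order)
        self.slow = {}             # stands in for host RAM or SSD

    def put(self, prefix_id, kv_blocks):
        # An entry lives in exactly one tier at a time.
        self.slow.pop(prefix_id, None)
        self.fast[prefix_id] = kv_blocks
        self.fast.move_to_end(prefix_id)
        # Spill least-recently-used entries once the fast tier is over capacity.
        while len(self.fast) > self.fast_capacity:
            victim, blocks = self.fast.popitem(last=False)
            self.slow[victim] = blocks

    def get(self, prefix_id):
        """Return (blocks, tier), where tier is 'fast', 'slow' or 'miss'."""
        if prefix_id in self.fast:
            self.fast.move_to_end(prefix_id)
            return self.fast[prefix_id], "fast"
        if prefix_id in self.slow:
            blocks = self.slow.pop(prefix_id)
            self.put(prefix_id, blocks)  # promote; may evict another entry
            return blocks, "slow"
        return None, "miss"
```

A "slow" hit is where the real cost shows up: the blocks have to be copied back before decoding can continue, which is why keeping performance up is the hard part.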

New comment by kiratp in "Anthropic takes $5B from Amazon and pledges $100B in cloud spending in return"

kiratp — Tue, 21 Apr 2026 18:35:31 +0000

At the full current retail API price.

Business buyers are paying API prices, not subscription prices.

Disclosure: Work at Microsoft on AI

New comment by kiratp in "1M context is now generally available for Opus 4.6 and Sonnet 4.6"

kiratp — Sat, 14 Mar 2026 03:12:25 +0000

GitHub Copilot CLI lets you use all these models (unless your employer disables them).

https://github.com/features/copilot/cli

Disclosure: work at Msft

New comment by kiratp in "NASA announces overhaul of Artemis program amid safety concerns, delays"

kiratp — Fri, 27 Feb 2026 17:58:22 +0000

Same contractors (Boeing) who built Starliner...

Explaining Why NASA's Starliner Report Is So Bad > https://www.youtube.com/watch?v=L96asfTvJ_A

New comment by kiratp in "Sam Altman’s DRAM Deal"

kiratp — Sat, 06 Dec 2025 03:52:45 +0000

This is missing a key part of the picture - Nvidia just announced that partners will need to source RAM themselves.

OpenAI is basically ensuring that they can actually get the chips they need for the DCs they are building.

I can’t guess which move came first (the Nvidia policy change or these DRAM deals), but I would bet this is as large a factor here, if not a larger one, than “block my competitors”.

New comment by kiratp in "Fizz Buzz without conditionals or booleans"

kiratp — Wed, 19 Nov 2025 04:41:51 +0000

A loop either never halts or has a conditional. I guess a compiler could elide a “while True:” to a branch-less jump instruction.

One hack would be to use recursion and let stack exhaustion stop you.
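A rough Python sketch of that hack, assuming "no conditionals" means no `if`/`else` or comparison operators in the source. The names `WORDS`, `fizzbuzz_forever` and `run` are made up for illustration. The `try`/`except` is only there so the process survives the stack exhaustion, and the interpreter is of course still branching under the hood.

```python
# Lookup table keyed by n % 15; any other remainder falls through to the number.
WORDS = {
    0: "FizzBuzz",
    3: "Fizz", 6: "Fizz", 9: "Fizz", 12: "Fizz",
    5: "Buzz", 10: "Buzz",
}


def fizzbuzz_forever(n, out):
    # No if/else and no loop: dict.get picks the word, recursion does the iterating.
    out.append(WORDS.get(n % 15, str(n)))
    fizzbuzz_forever(n + 1, out)


def run():
    out = []
    try:
        fizzbuzz_forever(1, out)
    except RecursionError:
        # Stack exhaustion is the only thing that stops the recursion.
        pass
    return out
```

How many lines you get depends on the interpreter's recursion limit, so the stopping point is an accident of the runtime and not part of the program.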

New comment by kiratp in "Fizz Buzz without conditionals or booleans"

kiratp — Wed, 19 Nov 2025 04:27:54 +0000

A for loop has a conditional in it.

Unless by conditionals we mean “no if/else” and not “no branch instructions”.

New comment by kiratp in "Fizz Buzz without conditionals or booleans"

kiratp — Wed, 19 Nov 2025 04:26:21 +0000

A for loop has an implicit conditional in its stop condition check.

New comment by kiratp in "UnitedHealth pays its own physician groups 17% more than outside ones"

kiratp — Tue, 04 Nov 2025 04:03:15 +0000

This only applies to large employers. Smaller ones are just presented a limited list of plans to pick from, and the plans change every year. Most of the time, as a startup, you can’t buy a Mag7 equivalent health plan for any amount of money off the marketplace.

New comment by kiratp in "FSF announces Librephone project"

kiratp — Wed, 15 Oct 2025 03:32:20 +0000

Should the app builder’s ability to “trust” that the hardware will protect them from the user supersede the user’s ability to be able to trust that the hardware will protect them from the app?

In other words, should the device be responsible for enforcing DRM (and more) against its owner?

New comment by kiratp in "The Tiny Teams Playbook"

kiratp — Mon, 13 Oct 2025 03:13:20 +0000

The kind of people in these small teams are not ones to think "work is just work".

New comment by kiratp in "Meta’s live demo fails; “AI” recording plays before the actor takes the steps"

kiratp — Fri, 19 Sep 2025 01:16:04 +0000

You can put the AI on rails just by prompting it. The latest models are very steerable.

System prompt: “stick to steps 1-n. Step 1 is…”

I can say this confidently because our company does this, and we have F500 customers in production.

New comment by kiratp in "Meta’s live demo fails; “AI” recording plays before the actor takes the steps"

kiratp — Fri, 19 Sep 2025 00:43:42 +0000

I see no evidence of that. It seems like they tried to put the AI “on rails” with predefined steps and things went wrong.

New comment by kiratp in "Meta’s live demo fails; “AI” recording plays before the actor takes the steps"

kiratp — Fri, 19 Sep 2025 00:41:04 +0000

So much negativity.

I’m just excited that our industry is led by optimists and our culture enables our corporations to invest huge sums into taking us forward technologically.

Meta could have just done a stock buyback but instead they made a computer that can talk, see, solve problems and paint virtual things into the real world in front of your eyes!

I commend them on attempting a live demo.

New comment by kiratp in "A postmortem of three recent issues"

kiratp — Wed, 17 Sep 2025 22:45:14 +0000

This is due to RoPE scaling.

> All the notable open-source frameworks implement static YaRN, which means the scaling factor remains constant regardless of input length, potentially impacting performance on shorter texts. We advise adding the rope_scaling configuration only when processing long contexts is required. It is also recommended to modify the factor as needed. For example, if the typical context length for your application is 524,288 tokens, it would be better to set factor as 2.0.

https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Thinking
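The arithmetic in that quote can be sketched as below. The native window of 262,144 tokens is inferred from the quoted example (524,288 / 2.0), and the helper name `yarn_rope_scaling` is made up. The dictionary keys follow the shape I believe the model card uses for `rope_scaling`, so treat them as an assumption and check the card before copying.

```python
def yarn_rope_scaling(target_context, native_context=262_144):
    """Pick a static YaRN factor for a typical context length.

    Returns None when the native window already covers the target, since the
    quoted advice is to add rope_scaling only when long contexts are required.
    """
    if target_context <= native_context:
        return None
    return {
        "rope_type": "yarn",  # assumed key names, see note above
        "factor": target_context / native_context,
        "original_max_position_embeddings": native_context,
    }
```

Because the factor is static, every request pays for it, including short ones. That is the performance hit on shorter texts the quote warns about.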

New comment by kiratp in "Claude now has access to a server-side container environment"

kiratp — Tue, 09 Sep 2025 18:38:25 +0000

Hardware can be the same but scheduling is a whole different beast.

Also, if you pull too many resources from training your next model to make inference revenue today, you’ll fall behind in the larger race.

New comment by kiratp in "Claude now has access to a server-side container environment"

kiratp — Tue, 09 Sep 2025 17:54:46 +0000

> Importantly, we never intentionally degrade model quality as a result of demand or other factors, and the issues mentioned above stem from unrelated bugs.

Things they could do that would not technically contradict that:

- Quantize KV cache

- Data-aware model quantization where their own evals will show "equivalent perf" but the overall model quality suffers.
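For a feel of what the first bullet costs, here is a minimal pure-Python sketch of symmetric int8 quantization applied to a handful of made-up values. The numbers and function names are illustrative only. Real KV-cache quantization is typically per-channel or per-group on GPU tensors, and nothing here describes what any provider actually does.

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: returns (codes, scale)."""
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero input.
    scale = max(abs(v) for v in values) / 127 or 1.0
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from int8 codes."""
    return [c * scale for c in codes]


# Made-up values standing in for one slice of a KV cache.
kv = [0.0123, -1.5, 0.75, 0.002, 3.2]
codes, scale = quantize_int8(kv)
restored = dequantize(codes, scale)

# The worst-case round-trip error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(kv, restored))
```

With one large outlier setting the scale, the small entries round to zero. An aggregate eval can still look "equivalent" while individual activations lose information.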

Simple fact is that it takes longer to deploy physical compute but somehow they are able to serve more and more inference from a slowly growing pool of hardware. Something has to give...

New comment by kiratp in "US economy added just 22,000 jobs in August, unemployment highest in 4 yrs"

kiratp — Fri, 05 Sep 2025 15:02:02 +0000

Source?