<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: numlocked</title><link>https://news.ycombinator.com/user?id=numlocked</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 10 Apr 2026 06:45:39 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=numlocked" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by numlocked in "Reallocating $100/Month Claude Code Spend to Zed and OpenRouter"]]></title><description><![CDATA[
<p>You are absolutely allowed to expose access to end users, as long as you continue to abide by terms of service. We have hundreds, if not thousands, of apps built on openrouter that in turn have end users of their own. We showcase many of them on our /apps ranking page!</p>
]]></description><pubDate>Thu, 09 Apr 2026 11:24:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47702203</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=47702203</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47702203</guid></item><item><title><![CDATA[New comment by numlocked in "Reallocating $100/Month Claude Code Spend to Zed and OpenRouter"]]></title><description><![CDATA[
<p>COO of OpenRouter here. That's right — we haven’t done it to date, but we can’t have unlimited liabilities stacking up forever. At some point we will start expiring credits from accounts that have seen zero activity in over a year.</p>
]]></description><pubDate>Thu, 09 Apr 2026 10:41:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47701882</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=47701882</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47701882</guid></item><item><title><![CDATA[New comment by numlocked in "Ju Ci: The Art of Repairing Porcelain"]]></title><description><![CDATA[
<p>I watched the video expecting one thing and found something <i>completely</i> different. Remarkable — [0] watch the video in its entirety. Not what I thought when I read “staples to repair porcelain”.<p>[0] intentional human use of an em-dash</p>
]]></description><pubDate>Mon, 23 Mar 2026 23:46:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47496731</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=47496731</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47496731</guid></item><item><title><![CDATA[New comment by numlocked in "Prompt-caching – auto-injects Anthropic cache breakpoints (90% token savings)"]]></title><description><![CDATA[
<p>As per its own FAQ, this plugin is out of date and doesn’t actually do anything incremental re: caching:<p>> "Hasn't Anthropic's new auto-caching feature solved this?"<p>> Largely, yes — Anthropic's automatic caching (passing "cache_control": {"type": "ephemeral"} at the top level) handles breakpoint placement automatically now. This plugin predates that feature and originally filled that gap.</p>
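For concreteness, the quoted FAQ describes a request body shaped roughly like the sketch below. The top-level "cache_control" placement is taken from the FAQ text above, and the model name is a placeholder; treat both as assumptions rather than verified API documentation.

```python
import json

# Hypothetical request body illustrating the auto-caching parameter quoted
# in the FAQ: "cache_control" passed at the top level of the request.
body = {
    "model": "claude-example",  # placeholder model name, not a real slug
    "max_tokens": 1024,
    "cache_control": {"type": "ephemeral"},  # top-level, per the quoted FAQ
    "messages": [
        {"role": "user", "content": "Summarize the attached document."}
    ],
}

print(json.dumps(body, indent=2))
```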
]]></description><pubDate>Fri, 13 Mar 2026 12:57:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47363875</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=47363875</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47363875</guid></item><item><title><![CDATA[New comment by numlocked in "OpenAI API Logs: Unpatched data exfiltration"]]></title><description><![CDATA[
<p>At the risk of totally misunderstanding this...it seems to be exfiltration by the app developer, who already has access to all of these data sources and the data that the customer is inputting into the AI KYC app (in this example)...right? I don't believe this exposes any end-user information to a third party. The AI app developer is already 'trusted' and could get access to this information regardless of the exfiltration. Maybe someone can explain this to me more clearly.</p>
]]></description><pubDate>Wed, 21 Jan 2026 21:58:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46712185</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46712185</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46712185</guid></item><item><title><![CDATA[New comment by numlocked in "Response Healing: Reduce JSON defects by 80%+"]]></title><description><![CDATA[
<p>That is <i>a</i> way of doing that, but it's quite expensive computationally. There are some companies that can make it feasible [0], but it's often not a perfect process, and different inference providers implement it in different ways.<p>[0] <a href="https://dottxt.ai/" rel="nofollow">https://dottxt.ai/</a></p>
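To see why constrained decoding is computationally expensive, here is a deliberately toy sketch (not how dottxt/outlines actually implement it — they compile grammars to automata to avoid exactly this cost): at every decoding step, the sampler must filter the entire vocabulary down to tokens that keep the output a valid prefix of the target format.

```python
# Toy illustration of grammar-constrained decoding. A naive implementation
# scans the whole vocabulary at every step, which is what makes the approach
# expensive at real vocabulary sizes (~100k tokens) without clever indexing.

def is_valid_json_prefix(text: str) -> bool:
    # Extremely simplified validity check for illustration only:
    # accept prefixes of the single literal {"ok": true}
    target = '{"ok": true}'
    return target.startswith(text)

def allowed_tokens(prefix: str, vocab: list[str]) -> list[str]:
    # Naive per-step filter: test every vocabulary entry against the grammar.
    return [t for t in vocab if is_valid_json_prefix(prefix + t)]

vocab = ['{"', 'ok', '":', ' true', '}', 'hello', '42']
print(allowed_tokens('{"', vocab))  # → ['ok']
```

Production systems precompute which tokens are legal in each grammar state instead of re-scanning the vocabulary, which is the engineering work that makes the technique feasible.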
]]></description><pubDate>Fri, 19 Dec 2025 21:59:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=46331440</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46331440</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46331440</guid></item><item><title><![CDATA[New comment by numlocked in "Mistral OCR 3"]]></title><description><![CDATA[
<p>(I work at OpenRouter) If you send a PDF to our API we will:<p>1. Use native PDF parsing if the model supports it<p>2. Use this Mistral OCR model (we updated to this version yesterday)<p>3. UNLESS you override the "engine" param to use an alternate. We support a JS-based (non-LLM) parser as well [0]<p>So yes, in practice a lot of OCR jobs go to Mistral, but not all of them.<p>Would love to hear requests for other parsers if folks have them!<p>[0] <a href="https://openrouter.ai/docs/guides/overview/multimodal/pdfs#plugin-configuration" rel="nofollow">https://openrouter.ai/docs/guides/overview/multimodal/pdfs#p...</a></p>
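The engine override described above can be sketched as a request payload. Field names follow the linked plugin-configuration docs as I understand them ("plugins", "file-parser", "pdf.engine", and a "file" content part); the model slug and filename are placeholders, and the base64 data is elided.

```python
import json

# Sketch of an OpenRouter chat request that overrides the default PDF
# parsing path with the JS-based (non-LLM) parser via the "engine" param.
payload = {
    "model": "example/model",  # placeholder model slug
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this document say?"},
            {"type": "file", "file": {
                "filename": "report.pdf",
                "file_data": "data:application/pdf;base64,...",  # elided
            }},
        ],
    }],
    # Without this plugin block, parsing falls back to native PDF support
    # or the Mistral OCR model as described above.
    "plugins": [{"id": "file-parser", "pdf": {"engine": "pdf-text"}}],
}

print(json.dumps(payload)[:40])
```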
]]></description><pubDate>Fri, 19 Dec 2025 21:43:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=46331294</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46331294</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46331294</guid></item><item><title><![CDATA[Response Healing: Reduce JSON defects by 80%+]]></title><description><![CDATA[
<p>Article URL: <a href="https://openrouter.ai/announcements/response-healing-reduce-json-defects-by-80percent">https://openrouter.ai/announcements/response-healing-reduce-json-defects-by-80percent</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46314684">https://news.ycombinator.com/item?id=46314684</a></p>
<p>Points: 51</p>
<p># Comments: 47</p>
]]></description><pubDate>Thu, 18 Dec 2025 16:19:48 +0000</pubDate><link>https://openrouter.ai/announcements/response-healing-reduce-json-defects-by-80percent</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46314684</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46314684</guid></item><item><title><![CDATA[New comment by numlocked in "Classical statues were not painted horribly"]]></title><description><![CDATA[
<p>I just learned that the site/magazine publishing this, Works in Progress, is owned by Stripe! I have no idea why, but the content is great so...thanks Stripe!</p>
]]></description><pubDate>Thu, 18 Dec 2025 13:39:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=46312480</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46312480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46312480</guid></item><item><title><![CDATA[New comment by numlocked in "Classical statues were not painted horribly"]]></title><description><![CDATA[
<p>Good read! The idea that these marvels of artistry were painted like my 10th birthday at the local paint-your-own-pottery store always seemed incongruous, at best.<p>> Why, then, are the reconstructions so ugly?<p>> ...may be that they are hampered by conservation doctrines that forbid including any feature in a reconstruction for which there is no direct archaeological evidence. Since underlayers are generally the only element of which traces survive, such doctrines lead to all-underlayer reconstructions, with the overlayers that were obviously originally present excluded for lack of evidence.<p>That seems plausible -- and somewhat reasonable! To the credit of academics, they seem aware of this (according to the article):<p>> ‘reconstructions can be difficult to explain to the public – that these are not exact copies, that we can never know exactly how they looked’.</p>
]]></description><pubDate>Thu, 18 Dec 2025 13:33:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=46312415</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46312415</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46312415</guid></item><item><title><![CDATA[New comment by numlocked in "Developers can now submit apps to ChatGPT"]]></title><description><![CDATA[
<p>We do this at openrouter and many apps use exactly that pattern!</p>
]]></description><pubDate>Thu, 18 Dec 2025 00:31:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=46307542</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46307542</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46307542</guid></item><item><title><![CDATA[New comment by numlocked in "Getting a Gemini API key is an exercise in frustration"]]></title><description><![CDATA[
<p>(I work at OpenRouter) Certainly for individual developers / hobby projects that's the primary value prop; super easy access to all of the models.<p>But there's a <i>lot</i> more functionality that becomes relevant when building in production. We do automatic fallbacks, route between providers based on data policies, syndicate your data to agent observability tools / your logging platform of choice, user-level and api-key-level budget management and model allow/block lists, programmatic API key management, etc, etc. More good stuff shipping all the time!</p>
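Two of the production features mentioned above — automatic fallbacks and routing based on data policies — show up directly in the request shape. The "models" fallback array and "provider" preference object below reflect OpenRouter's documented request format as I understand it; the model slugs are placeholders, and the exact preference keys should be checked against the current docs.

```python
import json

# Sketch of a request using model fallbacks and provider data-policy routing.
payload = {
    # The first model is tried; later entries are automatic fallbacks
    # if the primary is unavailable or errors.
    "models": ["primary/model", "backup/model"],
    "messages": [{"role": "user", "content": "Hello"}],
    # Restrict routing to providers whose data policy does not retain
    # or train on prompts (key name assumed from the docs).
    "provider": {"data_collection": "deny"},
}

print(json.dumps(payload, indent=2))
```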
]]></description><pubDate>Thu, 11 Dec 2025 14:50:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=46232050</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46232050</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46232050</guid></item><item><title><![CDATA[New comment by numlocked in "Getting a Gemini API key is an exercise in frustration"]]></title><description><![CDATA[
<p>(I work at OpenRouter) We add about 15ms of latency once the cache is warm (e.g. on subsequent requests) -- and if there are reliability problems, please let us know! OpenRouter should be <i>more</i> reliable as we will load balance and fall back between different Gemini endpoints.</p>
]]></description><pubDate>Thu, 11 Dec 2025 14:46:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=46232006</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=46232006</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46232006</guid></item><item><title><![CDATA[Is implicit caching prompt retention?]]></title><description><![CDATA[
<p>Article URL: <a href="https://openrouter.ai/announcements/is-implicit-caching-prompt-retention">https://openrouter.ai/announcements/is-implicit-caching-prompt-retention</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45721332">https://news.ycombinator.com/item?id=45721332</a></p>
<p>Points: 6</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 27 Oct 2025 14:23:16 +0000</pubDate><link>https://openrouter.ai/announcements/is-implicit-caching-prompt-retention</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=45721332</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45721332</guid></item><item><title><![CDATA[LLM Provider Variance: Introducing Exacto]]></title><description><![CDATA[
<p>Article URL: <a href="https://openrouter.ai/announcements/provider-variance-introducing-exacto">https://openrouter.ai/announcements/provider-variance-introducing-exacto</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45696596">https://news.ycombinator.com/item?id=45696596</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 24 Oct 2025 17:01:15 +0000</pubDate><link>https://openrouter.ai/announcements/provider-variance-introducing-exacto</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=45696596</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45696596</guid></item><item><title><![CDATA[New comment by numlocked in "GLM 4.5 with Claude Code"]]></title><description><![CDATA[
<p>(OpenRouter COO here) We are starting to test this and verify the deployments. More to come on that front -- but long story short is that we don't have good evidence that providers are doing weird stuff that materially affects model accuracy. If you have data points to the contrary, we would <i>love</i> them.<p>We are heavily incentivized to prioritize/make transparent high-quality inference and have no incentive to offer quantized/poorly-performing alternatives. We certainly hear plenty of anecdotal reports like this, but when we dig in we generally don't see it.<p>An exception is when a model is first released -- for example this terrific work by Artificial Analysis: <a href="https://x.com/ArtificialAnlys/status/1955102409044398415" rel="nofollow">https://x.com/ArtificialAnlys/status/1955102409044398415</a><p>It does take providers time to learn how to run the models in a high quality way; my expectation is that the difference in quality will be (or already is) minimal over time. The large variance in that case was because GPT OSS had only been out for a couple of weeks.<p>For well-established models, our (admittedly limited) testing has not revealed much variance between providers in terms of quality. There is <i>some</i>, but it's not like we see a couple of providers 'cheating' by secretly quantizing and clearly serving less intelligent versions of the model. We're going to get more systematic about it though and perhaps will uncover some surprises.</p>
]]></description><pubDate>Sat, 06 Sep 2025 02:37:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=45146140</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=45146140</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45146140</guid></item><item><title><![CDATA[New comment by numlocked in "How big are our embeddings now and why?"]]></title><description><![CDATA[
<p>I hear you, but the article is talking specifically about "embeddings as a product" -- not the embeddings that are within an LLM architecture. It starts:<p>> As a quick review, embeddings are compressed numerical representations of a variety of features (text, images, audio) that we can use for machine learning tasks like search, recommendations, RAG, and classification.<p>Current standalone embedding models are not intrinsically connected to SotA LLM architectures (e.g. the Qwen reference) -- right? The article seems to mix the two ideas together.</p>
]]></description><pubDate>Sat, 06 Sep 2025 01:16:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=45145657</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=45145657</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45145657</guid></item><item><title><![CDATA[New comment by numlocked in "How big are our embeddings now and why?"]]></title><description><![CDATA[
<p>I don’t quite understand. The article says things like:<p>“With the constant upward pressure on embedding sizes not limited by having to train models in-house, it’s not clear where we’ll slow down: Qwen-3, along with many others is already at 4096”<p>But aren’t embedding models separate from the LLMs? The size of attention heads in LLMs etc isn’t inherently connected to how a lab might train and release an embedding model. I don’t really understand why growth in LLM size fundamentally puts upward pressure on embedding size as they are not intrinsically connected.</p>
]]></description><pubDate>Fri, 05 Sep 2025 17:53:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=45141473</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=45141473</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45141473</guid></item><item><title><![CDATA[New comment by numlocked in "OpenRouter is down"]]></title><description><![CDATA[
<p>Hi folks -- I'm Chris from OpenRouter. This one hurts. We're back, but our database was down for about 45 minutes, which caused user and credit lookups to fail, and took down the API. We are investigating why, and of course going to look into improving durability so this failure mode can't happen again. We will share a post-mortem on the site when we have finished our investigation. I'm sorry to our users who count on us.</p>
]]></description><pubDate>Thu, 28 Aug 2025 12:45:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=45051485</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=45051485</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45051485</guid></item><item><title><![CDATA[New comment by numlocked in "Show HN: Price Per Token – LLM API Pricing Data"]]></title><description><![CDATA[
<p>(I work at OpenRouter)<p>We have a simple model comparison tool that is not-at-all-obvious to find on the website, but hopefully can help somewhat. E.g.<p><a href="https://openrouter.ai/compare/qwen/qwen3-coder/moonshotai/kimi-k2" rel="nofollow">https://openrouter.ai/compare/qwen/qwen3-coder/moonshotai/ki...</a></p>
]]></description><pubDate>Fri, 25 Jul 2025 17:47:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=44686009</link><dc:creator>numlocked</dc:creator><comments>https://news.ycombinator.com/item?id=44686009</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44686009</guid></item></channel></rss>