<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: whakim</title><link>https://news.ycombinator.com/user?id=whakim</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 27 Apr 2026 10:12:42 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=whakim" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Turning RAG pipelines into enterprise-grade Data Subscriptions]]></title><description><![CDATA[
<p>Article URL: <a href="https://halcyon.io/blog/machine-readable/building-the-stack">https://halcyon.io/blog/machine-readable/building-the-stack</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47794543">https://news.ycombinator.com/item?id=47794543</a></p>
<p>Points: 6</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 16 Apr 2026 15:21:20 +0000</pubDate><link>https://halcyon.io/blog/machine-readable/building-the-stack</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=47794543</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47794543</guid></item><item><title><![CDATA[New comment by whakim in "From zero to a RAG system: successes and failures"]]></title><description><![CDATA[
<p>I don't think we should undersell transformers and semantic search: they are genuinely powerful information-retrieval tools. That being said, I think I agree with you that RAG is fundamentally just search, and the hype (like any hype) elides the fact that you still have to solve all of the normal, difficult search problems.</p>
]]></description><pubDate>Thu, 26 Mar 2026 17:04:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=47532941</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=47532941</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47532941</guid></item><item><title><![CDATA[New comment by whakim in "From zero to a RAG system: successes and failures"]]></title><description><![CDATA[
<p>I'd argue the author missed a trick here by using a fancy embedding model without any re-ranking. One of the benefits of a re-ranker (or even a series of re-rankers!) is that you can embed your documents using a really small and cheap model (this also often means smaller embeddings).</p>
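<p>A minimal sketch of the two-stage idea, with hypothetical helpers standing in for real models: `doc_vecs` are the small, cheap embeddings, and `rerank_score` stands in for an expensive re-ranker (e.g. a cross-encoder) applied only to the candidate set:</p>

```python
import numpy as np

def retrieve_then_rerank(query_vec, doc_vecs, docs, rerank_score, k=50, n=5):
    """Stage 1: cheap, small embeddings pull a broad candidate set.
    Stage 2: an expensive re-ranker scores only those k candidates."""
    # Cosine similarity against the (small) document embeddings.
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    candidates = np.argsort(-sims)[:k]
    # Re-rank just the shortlist with the expensive model.
    scored = sorted(candidates, key=lambda i: rerank_score(docs[i]), reverse=True)
    return scored[:n]
```

Because the re-ranker only ever sees k documents, the per-document embedding cost (and the index size) is driven by the small model alone.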
]]></description><pubDate>Thu, 26 Mar 2026 14:41:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47531037</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=47531037</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47531037</guid></item><item><title><![CDATA[New comment by whakim in "From zero to a RAG system: successes and failures"]]></title><description><![CDATA[
<p>For technical domains, stuffing the context full of related-and-irrelevant or possibly-conflicting information will lead to poor results. The examples of long-context retrieval like finding a fact in a book really aren't representative of the types of context you'd be working with in a RAG scenario. In a lot of cases the problem is information organization, not retrieval, e.g. "What is the most authoritative type of source for this information?" or "How do these 100 documents about X relate to each other?"</p>
]]></description><pubDate>Thu, 26 Mar 2026 14:33:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47530956</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=47530956</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47530956</guid></item><item><title><![CDATA[New comment by whakim in "Beyond has dropped “meat” from its name and expanded its high-protein drink line"]]></title><description><![CDATA[
<p>There is no reason to believe that the foods humans have historically eaten are safer/healthier than "industrially processed/extracted/refined" food simply because we have historically eaten them. Evolution does not select for avoiding the health problems facing modern-day humans such as cancer or heart disease.</p>
]]></description><pubDate>Tue, 17 Mar 2026 05:51:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=47409109</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=47409109</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47409109</guid></item><item><title><![CDATA[New comment by whakim in "Tenure Is a Total Scam (2023)"]]></title><description><![CDATA[
<p>Yes; you can phone it in post-tenure. But just because it is possible doesn't mean (in my experience) it is common; and I don't think it's helpful (as TFA claims) to equate this possibility with "a total scam." To get tenure <i>anywhere</i> doesn't just require a huge amount of work as an Assistant Professor; it also requires a huge amount of work as a PhD student and potentially multiple rounds of post-doc'ing or other non-tenure-line work. In my experience, tenured professors have spent nearly two decades distorting their work-life balance beyond all recognition to the point that grinding insanely hard in pursuit of publications just feels normal.</p>
]]></description><pubDate>Mon, 09 Feb 2026 01:12:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=46940426</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=46940426</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46940426</guid></item><item><title><![CDATA[New comment by whakim in "Postgres extension complements pgvector for performance and scale"]]></title><description><![CDATA[
<p>Worth noting that the filtering implementation is quite restrictive if you want to avoid post-filtering: filters must be expressible as discrete smallints (ruling out continuous variables like timestamps or high cardinality filters like ids); filters must always be denormalized onto the table you're indexing (no filtering on attributes of parent documents, for example); and filters must be declared at index creation time (lots of time spent on expensive index builds if you want to add filters). Personally I would consider these caveats pretty big deal-breakers if the intent is scale and you do a lot of filtering.</p>
]]></description><pubDate>Wed, 31 Dec 2025 00:32:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=46439970</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=46439970</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46439970</guid></item><item><title><![CDATA[New comment by whakim in "How uv got so fast"]]></title><description><![CDATA[
<p>> Most of the time you don't need a different Python version from the system one.<p>Except for literally anytime you’re collaborating with anyone, ever? I can’t even begin to imagine working on a project where folks just use whatever Python version their OS happens to ship with. Do you also just ship the latest version of whatever container because most of the time nothing has changed?</p>
]]></description><pubDate>Sat, 27 Dec 2025 16:29:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=46402902</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=46402902</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46402902</guid></item><item><title><![CDATA[New comment by whakim in "Structured outputs create false confidence"]]></title><description><![CDATA[
<p>I don't really understand the point around error handling. Sure, with structured outputs you need to be explicit about what errors you're handling and how you're handling them. But if you ask the model to return pure text, you now have a universe of possible errors that you <i>still</i> need to handle explicitly (you're using structured outputs, so your LLM response is presumably being consumed programmatically?), including a whole bunch of new errors that structured outputs help you avoid.<p>Also, meta gripe: this article felt like a total bait-and-switch in that it only became clear that it was promoting a product right at the end.</p>
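<p>A minimal sketch of what "explicit about what errors you're handling" looks like once the output is structured; the `answer` field is a made-up example, not any particular API's schema:</p>

```python
import json

def parse_structured(raw: str) -> dict:
    """With structured output the failure modes are enumerable:
    either the JSON is invalid, or a required field is missing.
    Free-text output would instead need ad-hoc parsing with an
    open-ended set of failure modes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"model returned invalid JSON: {e}") from e
    if "answer" not in data:
        raise ValueError("missing required field 'answer'")
    return data
```
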
]]></description><pubDate>Sun, 21 Dec 2025 21:25:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=46348657</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=46348657</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46348657</guid></item><item><title><![CDATA[New comment by whakim in "So you wanna build a local RAG?"]]></title><description><![CDATA[
<p>In my experience the semantic/lexical search problem is better understood as a precision/recall tradeoff. Lexical search (along with boolean operators, exact phrase matching, etc.) has very high precision at the expense of lower recall, whereas semantic search sits at a higher recall/lower precision point on the curve.</p>
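<p>One common way to sit between the two points on that curve is rank fusion: run both retrievers and merge their ranked lists. Reciprocal rank fusion is a simple illustrative choice (a sketch, not a claim about any particular system):</p>

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids. A document scores
    1/(k + rank) in each list it appears in, so docs that do well in
    either the high-precision lexical list or the high-recall semantic
    list surface near the top of the fused result."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```
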
]]></description><pubDate>Sat, 29 Nov 2025 02:01:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=46084640</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=46084640</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46084640</guid></item><item><title><![CDATA[New comment by whakim in "Scaling HNSWs"]]></title><description><![CDATA[
<p>Doesn't this depend on your data to a large extent? In a very dense graph "far" results (in terms of the effort spent searching) that match the filters might actually be quite similar?</p>
]]></description><pubDate>Wed, 12 Nov 2025 05:52:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=45896774</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45896774</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45896774</guid></item><item><title><![CDATA[New comment by whakim in "The Case Against PGVector"]]></title><description><![CDATA[
<p>Thanks for the reply! This makes much more sense now. To preface, I think pgvector is incredibly awesome software, and I have to give huge kudos to the folks working on it. Super cool. That being said, I do think the author isn't being unreasonable in that the limitations of pgvector are very real when you're talking about indices that grow beyond millions of things, and the "just use pgvector" crowd <i>in general</i> doesn't have a lot of experience with scaling things beyond toy examples. Folks should take a hard look at what size they expect their indices to grow to in the near-to-medium-term future.</p>
]]></description><pubDate>Mon, 03 Nov 2025 22:58:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=45805496</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45805496</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45805496</guid></item><item><title><![CDATA[New comment by whakim in "The Case Against PGVector"]]></title><description><![CDATA[
<p>> maintenance_work_mem begs to differ.<p>HNSW indices are <i>big</i>. Let's suppose I have an HNSW index which fits in a few hundred gigabytes of memory, or perhaps a few terabytes. How do I reasonably rebuild this using maintenance_work_mem? Double the size of my database for a week? What about the knock-on impacts on the performance for the rest of my database workload - presumably I'm relying on this memory for shared_buffers and caching? This seems like the type of workload that is being discussed here, not a toy 20GB index or something.<p>> You use REINDEX CONCURRENTLY.<p>Even with a bunch of worker processes, how do I do this within a reasonable timeframe?<p>> How do you think a B+tree gets updated?<p>Sure, the computational complexity of insertion into an HNSW index is sublinear, but the constant factors are significant and do actually add up. That being said, I do find this the weakest of the author's arguments.</p>
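<p>The "few hundred gigabytes" figure falls out of rough arithmetic like the following; the constants here (float32 vectors, ~2*m links per node) are illustrative assumptions, not pgvector's exact on-disk layout, which carries extra per-tuple overhead:</p>

```python
def hnsw_size_gb(n_vectors, dims, m=16, bytes_per_float=4, bytes_per_link=8):
    """Very rough HNSW footprint: raw vectors plus graph links.
    Assumes ~2*m neighbor links per node and ignores page/tuple
    overhead, so a real index is somewhat larger than this."""
    vector_bytes = n_vectors * dims * bytes_per_float
    link_bytes = n_vectors * 2 * m * bytes_per_link
    return (vector_bytes + link_bytes) / 1024**3
```

At 100M vectors of 1024 dimensions this already lands in the hundreds of gigabytes, which is why "just set maintenance_work_mem" stops being an answer at that scale.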
]]></description><pubDate>Mon, 03 Nov 2025 22:42:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=45805356</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45805356</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45805356</guid></item><item><title><![CDATA[New comment by whakim in "The Case Against PGVector"]]></title><description><![CDATA[
<p>Interested to hear more about your experience here. At Halcyon, we have trillions of embeddings and found Postgres to be unsuitable at several orders of magnitude less than we currently have.<p>On the iterative scan side, how do you prevent this from becoming too computationally intensive with a restrictive pre-filter, or simply not working at all? We use Vespa, which means effectively doing a map-reduce across all of our nodes; the effective number of graph traversals to do is smaller, and the computational burden mostly involves scanning posting lists on a per-node basis. I imagine to do something similar in Postgres, you'd need sharded tables, and complicated application logic to control what you're actually searching.<p>How do you deal with re-indexing and/or denormalizing metadata for filtering? Do you simply accept that it'll take hours or days?<p>I agree with you, however, that vector databases are not a panacea (although they do remove a huge amount of devops work, which is worth a lot!). Vespa supports filtering across parent-child relationships (like a relational database), which means we don't have to reindex a trillion things every time we want to add a new type of filter - something that took us almost a week with a previous vector-database vendor.</p>
]]></description><pubDate>Mon, 03 Nov 2025 22:14:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45805087</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45805087</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45805087</guid></item><item><title><![CDATA[New comment by whakim in "Extract-0: A specialized language model for document information extraction"]]></title><description><![CDATA[
<p>Ok, but what was the cost of <i>labor</i> put into curation of the training dataset and performing the fine-tuning? Hasn’t the paper’s conclusion been repeatedly demonstrated - that it is possible to get really good task-specific performance out of fine-tuned smaller models? There just remains the massive caveat that closed-source models are pretty cheap and so the ROI isn’t there in a lot of cases.</p>
]]></description><pubDate>Tue, 30 Sep 2025 18:00:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=45428876</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45428876</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45428876</guid></item><item><title><![CDATA[New comment by whakim in "There is a huge pool of exceptional junior engineers"]]></title><description><![CDATA[
<p>No offense, but this software engineering elitism does no favors to perceptions of the field. In reality, most other fields are complex and the phenomenon of believing something is simple because you don't understand it is widespread across fields. Dan Luu expounded on this at much greater length/with greater eloquence: <a href="https://danluu.com/cocktail-ideas/" rel="nofollow">https://danluu.com/cocktail-ideas/</a></p>
]]></description><pubDate>Tue, 30 Sep 2025 04:50:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=45422000</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45422000</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45422000</guid></item><item><title><![CDATA[New comment by whakim in "Find SF parking cops"]]></title><description><![CDATA[
<p>You enforce them; if I get a ticket for parking at an intersection, I won't do it again!<p>(Also, in the specific context of this discussion, parking restrictions near intersections are <i>super</i> common; this is not some esoteric new law that has been introduced. See <a href="https://www.sfmta.com/sites/default/files/reports-and-documents/2024/01" rel="nofollow">https://www.sfmta.com/sites/default/files/reports-and-docume...</a>)</p>
]]></description><pubDate>Wed, 24 Sep 2025 17:31:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=45363447</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45363447</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45363447</guid></item><item><title><![CDATA[New comment by whakim in "Find SF parking cops"]]></title><description><![CDATA[
<p>No, I don't; there are plenty of places you can't legally park that do not have painted curbs or "No Parking" signage. Do we also need curbs and signage near every fire hydrant? How about every driveway? Can drivers double-park anywhere they want? Should they yield to pedestrians in crosswalks? Etc. etc.</p>
]]></description><pubDate>Tue, 23 Sep 2025 22:19:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=45353521</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45353521</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45353521</guid></item><item><title><![CDATA[New comment by whakim in "Find SF parking cops"]]></title><description><![CDATA[
<p>Why? Having a driver's license is a privilege that requires you to study and know the rules of the road. The onus is on you to know the rules.</p>
]]></description><pubDate>Tue, 23 Sep 2025 20:00:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=45352030</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45352030</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45352030</guid></item><item><title><![CDATA[New comment by whakim in "Vector database that can index 1B vectors in 48M"]]></title><description><![CDATA[
<p>I couldn't agree with this more. I don't think the majority of problems with vector search at scale are vector search problems (although filtering + ANN is definitely interesting), they're search-problems-at-scale problems.</p>
]]></description><pubDate>Fri, 12 Sep 2025 23:54:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=45228057</link><dc:creator>whakim</dc:creator><comments>https://news.ycombinator.com/item?id=45228057</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45228057</guid></item></channel></rss>