<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: thoughtlede</title><link>https://news.ycombinator.com/user?id=thoughtlede</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 03 May 2026 19:25:28 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=thoughtlede" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by thoughtlede in "Claude Pro: Opus model will only be available if extra usage is enabled"]]></title><description><![CDATA[
<p>Investor funds have been subsidizing the inference costs so far.<p>Investors might move from funding the model providers to funding the enterprises that use those models. That is, they might move from funding the cost of the experiment to funding the value of the result. No funding if there are no demonstrable AI gains.<p>If this happens, it is a reasonable shift. Once enough gains have been demonstrated, investors might go back to funding the model providers. Investors always move towards the highest leverage point.<p>As long as AI delivers, this would be the rhythm.</p>
]]></description><pubDate>Tue, 28 Apr 2026 00:19:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47928943</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=47928943</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47928943</guid></item><item><title><![CDATA[New comment by thoughtlede in "Nano Banana 2: Google's latest AI image generation model"]]></title><description><![CDATA[
<p>Strictly speaking, I don't think it is the generation or creation that diminishes their value. It is the consumption.<p>You said it too:<p>> If I see a million fake Tom Cruise videos, then it oversaturates my desire for all Tom Cruise movies.<p>The trick, of course, is to keep yourself from seeing that content.<p>The other nuance is that as long as real performance remains unique, which so far it is, we can appreciate more what flesh and blood brings to the table. For example, I can appreciate the reality of the people in a picture or a video captured by a regular camera; its AI version lacks that spunk (for now).<p>Note that the iPhone in its default settings is already altering reality, so AI generation is far along that slippery axis.<p>Perhaps AI and VR will be the reason our real hangouts are appreciated more, even if they become rare events in the future.</p>
]]></description><pubDate>Thu, 26 Feb 2026 17:34:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47169236</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=47169236</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47169236</guid></item><item><title><![CDATA[New comment by thoughtlede in "Why is the sky blue?"]]></title><description><![CDATA[
<p>I think we can simplify the answer to this question for most audiences and say "the air is blue".<p>If they say the air appears clear when they stare at something other than the sky, the answer is that you need more air to see its blueness, in much the same way that a small amount of murky water in your palm appears clear, but a lot of it does not.<p>If they ask why they don't see that blueness at dawn or dusk, the answer is that the light source is at a different angle. The color of most objects changes when the light source is at a flat angle, and sunlight hits at a flat angle at dawn and dusk.<p>If they ask what exactly the underlying phenomenon is that makes the sky look blue, then explanations like this blog post are relevant.<p>If they ask what exactly a color is, the answer is that it is a fiction made up by our brains.</p>
]]></description><pubDate>Mon, 09 Feb 2026 21:22:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=46951556</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=46951556</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46951556</guid></item><item><title><![CDATA[New comment by thoughtlede in "Collaboration sucks"]]></title><description><![CDATA[
<p>For me, there are two things about collaboration.<p>Decision making is one, which you emphasized.<p>The other is knowing what the collaboration brings to the table and shaping the rules of engagement to fit that expectation. Sometimes you collaborate with SMEs; they bring the domain knowledge you don't have, but you understand the goal better than they do. Sometimes you are creating or refining the corporate strategy based on the actions of individual projects or partners; you are learning ground realities from them. Sometimes you need help from others to improve your take on a subject.<p>In each of these cases, you have to be clear about what you expect from the collaborators (and motivate them to contribute). Failing to be clear about what the collaboration is for and what the collaborators get in return is the number one killer of collaborative projects, even when there is no ill intent anywhere.</p>
]]></description><pubDate>Tue, 11 Nov 2025 22:23:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=45893710</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=45893710</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45893710</guid></item><item><title><![CDATA[New comment by thoughtlede in "Ask HN: What's your experience with using graph databases for agentic use-cases?"]]></title><description><![CDATA[
<p>It boils down to whether your LLMs can speak graph queries better than SQL, for your use cases and data. As your data posture and use cases change, routinely reevaluate which DB query language best suits your LLMs.<p>I'd also design the system architecture so that your non-agentic workloads don't suffer if you have to move between query models to serve agentic workloads better (a sketch of one way to do this follows).</p>
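<p>A minimal sketch of that decoupling, assuming nothing beyond the standard library; the Retriever interface and class names are illustrative, not from any framework:<p><pre><code>from typing import Protocol

class Retriever(Protocol):
    def lookup(self, question: str) -> list[str]: ...

class SqlRetriever:
    def lookup(self, question: str) -> list[str]:
        # Hypothetical: an LLM would generate SQL from the question here.
        return [f"SELECT ... -- derived from: {question}"]

class GraphRetriever:
    def lookup(self, question: str) -> list[str]:
        # Hypothetical: an LLM would generate a graph query here instead.
        return [f"MATCH ... // derived from: {question}"]

def answer(question: str, retriever: Retriever) -> list[str]:
    # Agentic and non-agentic callers depend only on Retriever, so
    # swapping SQL for a graph query model touches one binding.
    return retriever.lookup(question)

print(answer("who reports to whom?", GraphRetriever()))
</code></pre>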
]]></description><pubDate>Mon, 06 Oct 2025 08:04:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=45488822</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=45488822</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45488822</guid></item><item><title><![CDATA[New comment by thoughtlede in "AWS CEO says using AI to replace junior staff is 'Dumbest thing I've ever heard'"]]></title><description><![CDATA[
<p>Memorization + application = comprehension. Rinse and repeat.<p>Whether it's leetcode or anything else.</p>
]]></description><pubDate>Thu, 21 Aug 2025 17:30:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=44975589</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=44975589</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44975589</guid></item><item><title><![CDATA[New comment by thoughtlede in "Reasoning models don't always say what they think"]]></title><description><![CDATA[
<p>It feels to me that the hypothesis of this research was somewhat "begging the question". Reasoning models are trained to spit out tokens that increase the chance of the model spitting out the right answer at the end. That is, the training process singularly optimizes for the right answer, not the reasoning tokens.<p>Why would you then assume the reasoning tokens will "faithfully" include hints supplied in the prompt? The model may or may not include the hints, depending on whether the model activations deem those hints necessary to arrive at the answer. In their experiments, they found that the models included those hints between 20% and 40% of the time. Naively, that sounds unsurprising to me.<p>Even in the second experiment, when they trained the model to use hints, the optimization was around the answer, not the tokens. I am not surprised the models did not include the hints, because they were not trained to include them.<p>That said, and in spite of me potentially coming across as an unsurprised-by-the-result reader, it is a good experiment because "now we have some experimental results" to lean into.<p>Kudos to Anthropic for continuing to study these models.</p>
]]></description><pubDate>Thu, 03 Apr 2025 19:43:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=43574438</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=43574438</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43574438</guid></item><item><title><![CDATA[New comment by thoughtlede in "Supercharge vector search with ColBERT rerank in PostgreSQL"]]></title><description><![CDATA[
<p>tadkar did a good job at explaining ColBERT. I understood ColBERT best in the context of where it lies on a spectrum of choices.<p>On one side of the spectrum, you reduce each document, as well as the query, to a lower-dimensional space (aka embeddings) and compute similarity. This has the advantage that the document embeddings can be precomputed. At query time, you only compute the query embedding and compare its similarity with the document embeddings. The problem is that the lower-dimensional embedding acts as a decent, but not great, proxy for the documents as well as for the query. Your query-document similarity is only as good as the semantics that can be captured in those lower-dimensional embeddings.<p>On the other side of the spectrum, you consider the query with each document (as a pair) and see how much the query "attends" to each document. The power of trained attention weights means that you get a much more reliable similarity score. The problem is that this approach requires you to run an attention forward pass as many times as there are documents, for each query. In other words, it has a performance issue.<p>ColBERT sits in the middle of the spectrum. It "attends" to each document separately and produces a lower-dimensional embedding for each token in each document; these we precompute. Once that is done, the essence of how tokens within a given document attend to each other is captured in the token embeddings.<p>Then, at query time, we do the same for each token in the query, and we see which query-token embeddings are most similar to which document-token embeddings. The document whose tokens are most similar to the query tokens is considered the best match. (The per-token similarities are aggregated into a ranking score called the Sum of MaxSim; a toy sketch follows.)<p>Obviously, attention-based similarity, as in the second approach, is better than reducing to token embeddings and scoring similarity, but ColBERT avoids the performance hit of the second approach. ColBERT also avoids the fidelity loss of reducing the entire document to a single lower-dimensional vector, because it embeds each token in the document separately.<p>By the way, the first approach is what bi-encoders do. The second approach is cross-encoding.</p>
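<p>To make the Sum of MaxSim concrete, here is a toy numpy sketch; encode is a random stand-in for a real per-token ColBERT encoder, so only the precompute-then-score shape of the computation is meaningful:<p><pre><code>import numpy as np

rng = np.random.default_rng(0)

def encode(tokens, dim=8):
    # Stand-in encoder: one random L2-normalized vector per token.
    vecs = rng.normal(size=(len(tokens), dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

# Offline: precompute per-token embeddings for every document.
docs = ["the cat sat on the mat".split(),
        "colbert reranks candidate documents".split()]
doc_embs = [encode(d) for d in docs]

def maxsim(q_emb, d_emb):
    sims = q_emb @ d_emb.T           # cosine similarities (unit vectors)
    return sims.max(axis=1).sum()    # best match per query token, summed

# Online: embed only the query tokens, then score every document.
q_emb = encode("cat on a mat".split())
scores = [maxsim(q_emb, d) for d in doc_embs]
print(scores, "best doc index:", int(np.argmax(scores)))
</code></pre>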
]]></description><pubDate>Fri, 24 Jan 2025 07:53:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=42811326</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=42811326</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42811326</guid></item><item><title><![CDATA[New comment by thoughtlede in "Building Effective "Agents""]]></title><description><![CDATA[
<p>When thinking about AI agents, there is still conflation between how to decide the next step to take vs what information is needed to decide the next step.<p>If runtime information is insufficient, we can use AI/ML models to fill in that information. But how to decide the next step can be defined ahead of time, assuming complete information.<p>Most AI agent examples short-circuit these two steps. When faced with unstructured or insufficient information, the program asks the LLM/AI model to decide the next step. Instead, we could ask the LLM/AI model to structure/predict the necessary information and use pre-defined rules to drive the process (a sketch of this split follows below).<p>This approach will translate most [1] "Agent" examples into "Workflow" examples. The quotes here are meant to imply Anthropic's definitions of these terms.<p>[1] I said "most" because there might be continuous-world systems (such as real-world simulacra) that require a very large number of rules, and it is probably impractical to define each of them. I believe those systems are the exception, not the rule.</p>
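<p>A minimal sketch of that split: call_llm is a hypothetical stand-in for any model client that returns JSON, and the ticket fields and routing thresholds are made up for illustration. The model only structures the information; the rules decide:<p><pre><code>import json

def call_llm(prompt: str) -> str:
    # Stand-in: a real implementation would call a model here.
    return json.dumps({"intent": "refund", "amount": 75.0})

def extract(ticket_text: str) -> dict:
    # Ask the model only to *structure* the ticket, not to decide anything.
    prompt = f"Extract intent and amount as JSON from: {ticket_text}"
    return json.loads(call_llm(prompt))

def next_step(fields: dict) -> str:
    # Deterministic, auditable routing rules defined ahead of time.
    if fields["intent"] == "refund" and fields["amount"] > 50:
        return "escalate_to_human"
    if fields["intent"] == "refund":
        return "auto_refund"
    return "default_queue"

print(next_step(extract("I want my $75 back")))  # escalate_to_human
</code></pre>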
]]></description><pubDate>Fri, 20 Dec 2024 23:19:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=42476104</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=42476104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42476104</guid></item><item><title><![CDATA[New comment by thoughtlede in "CRDTs and Collaborative Playground"]]></title><description><![CDATA[
<p>> Beyond this, if you want to determine causality, e.g. whether events are "causally related" (happened before or after each other) or are "concurrent" (entirely independent of), you can look at Vector Clocks—I won't go down that rabbit-hole here, though.<p>If anyone wants to go down that rabbit hole: <a href="https://www.exhypothesi.com/clocks-and-causality/" rel="nofollow">https://www.exhypothesi.com/clocks-and-causality/</a></p>
]]></description><pubDate>Wed, 18 Dec 2024 07:53:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=42448791</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=42448791</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42448791</guid></item><item><title><![CDATA[New comment by thoughtlede in "Training LLMs to Reason in a Continuous Latent Space"]]></title><description><![CDATA[
<p>Perhaps these findings indicate that we need more NN layers/attention blocks for reasoning. This project circumvented the lack of additional trained layers by looping the input through the existing trained layers more than once.<p>Also, if the objective is reasoning, we may have to train the models with better loss functions than ones that only help us predict the next token.</p>
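<p>A toy PyTorch illustration of that looping idea, with arbitrary shapes and loop count (not the paper's actual setup): the same trained block is applied repeatedly instead of stacking new layers:<p><pre><code>import torch
import torch.nn as nn

# One trained block, reused across "depth" by weight tying.
block = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)

x = torch.randn(1, 10, 64)   # (batch, seq, hidden)
h = x
for _ in range(4):           # loop the input through the *same* block
    h = block(h)
print(h.shape)               # torch.Size([1, 10, 64])
</code></pre>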
]]></description><pubDate>Wed, 11 Dec 2024 02:40:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=42384164</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=42384164</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42384164</guid></item><item><title><![CDATA[New comment by thoughtlede in "Model Context Protocol"]]></title><description><![CDATA[
<p>If function calling is sync, is MCP its async counterpart? Is that the gist of what MCP is?<p>OpenAPI (aka Swagger) based function calling is already standard for sync calls, and it solves the NxM problem. I'm wondering if the proposed value is that MCP is async.</p>
]]></description><pubDate>Tue, 26 Nov 2024 10:58:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=42244583</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=42244583</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42244583</guid></item><item><title><![CDATA[New comment by thoughtlede in "My Python code is a neural network"]]></title><description><![CDATA[
<p>I think the mention of 'spaghetti code' is a red herring from the author. If the output of an algorithm cannot be defined precisely as a function of the input, but you have some examples to show, that's where machine learning (ML) is useful.<p>In the end, ML provides one more option to choose from. Whether it works for you depends on evaluations and on how much determinism and explainability you need from the chosen algorithm/option.<p>The thing that struck me is whether an RNN is the right choice, given that it would need to be trained and we would need more examples than we might have. That said, maybe based on known 'rules' we can produce synthetic data for both +ve and -ve cases, as sketched below.</p>
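<p>A sketch of that synthetic-data idea: when the 'rules' are known, generated strings can be labeled by them to get +ve and -ve examples for free. The balanced-parentheses rule here is just an illustrative stand-in for the author's actual rules:<p><pre><code>import random

def is_balanced(s: str) -> bool:
    # The known "rule" acting as a labeling oracle.
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth == -1:          # closed more than opened
            return False
    return depth == 0

def sample(n: int) -> str:
    return "".join(random.choice("()") for _ in range(n))

random.seed(0)
data = [(s, is_balanced(s)) for s in (sample(8) for _ in range(1000))]
pos = sum(1 for _, label in data if label)
print(f"{pos} positive / {len(data) - pos} negative examples")
</code></pre>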
]]></description><pubDate>Mon, 01 Jul 2024 17:01:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=40847760</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=40847760</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40847760</guid></item><item><title><![CDATA[Classifier to detect DALL·E 3 images]]></title><description><![CDATA[
<p>Article URL: <a href="https://openai.com/index/understanding-the-source-of-what-we-see-and-hear-online/">https://openai.com/index/understanding-the-source-of-what-we-see-and-hear-online/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=40304449">https://news.ycombinator.com/item?id=40304449</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 09 May 2024 01:38:43 +0000</pubDate><link>https://openai.com/index/understanding-the-source-of-what-we-see-and-hear-online/</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=40304449</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40304449</guid></item><item><title><![CDATA[New comment by thoughtlede in "Deliberative Consensus Protocols"]]></title><description><![CDATA[
<p>Interesting. I didn’t know about this body of work. I haven’t read the documentation other than the abstract.<p>If the protocol is about knowing what information to inject at which node in the network to achieve consensus, the protocol can (and will) be used to inject whatever truths the parties with knowledge of the protocol believe in. If there are enough parties with opposing beliefs, then the network cannot be gamed beyond the status quo.<p>How does this protocol “choose” the truth to propagate in the face of opposing truths?<p>What power does the protocol give to parties who know how to work the system?<p>Or rather, what resilience does the protocol have against “inside traders”?</p>
]]></description><pubDate>Mon, 15 Apr 2024 16:37:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=40042792</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=40042792</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40042792</guid></item><item><title><![CDATA[New comment by thoughtlede in "AI and the Problem of Knowledge Collapse"]]></title><description><![CDATA[
<p>That's interesting.<p>In keyword-based indexing solutions, a document vector is created using "term frequency, inverse document frequency" (TF-IDF) scores. The idea is to pump up the document on the dimensions where it is unique compared to the other documents in the corpus. So when a query is issued with emphasis on a certain dimension, only documents that have higher scores in that dimension are returned.<p>But the uniqueness in those solutions is based on the keywords used in the document, not concepts.<p>What we need here to eliminate "blandness" is conceptual uniqueness. Maybe TF-IDF is still relevant to get there (a small illustration below). Something to think about.</p>
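<p>A small scikit-learn illustration of that boosting effect; the corpus is made up, but it shows rare terms scoring highest in the one document that uses them:<p><pre><code>from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "stocks rose as markets rallied",
    "stocks fell as markets slid",
    "the quantum annealer found a ground state",  # unique vocabulary
]
vec = TfidfVectorizer()
X = vec.fit_transform(corpus)

# Top-scoring terms for the third document are its distinctive ones.
terms = vec.get_feature_names_out()
row = X[2].toarray().ravel()
print(sorted(zip(row, terms), reverse=True)[:3])
</code></pre>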
]]></description><pubDate>Fri, 05 Apr 2024 22:13:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=39947899</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=39947899</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39947899</guid></item><item><title><![CDATA[New comment by thoughtlede in "AI and the Problem of Knowledge Collapse"]]></title><description><![CDATA[
<p>LLMs are both language processing engines and knowledge bases. This article explores the knowledge-base aspect of LLMs and sheds light on the potential danger. The authors are well justified in doing so, because ChatGPT is being used as a knowledge bot by many end users.<p>However, to my knowledge, many enterprise applications being built on LLMs feed task-specific, curated knowledge to the models. This mode of LLM use is encouraging. I do not think this article acknowledged this aspect of LLM use.</p>
]]></description><pubDate>Fri, 05 Apr 2024 21:07:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=39947229</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=39947229</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39947229</guid></item><item><title><![CDATA[New comment by thoughtlede in "SceneScript, a novel approach for 3D scene reconstruction"]]></title><description><![CDATA[
<p>Yes, the camera sees everything. But we could avoid teaching the model certain things. For instance, an undressed body and a dressed body could both be taught as a body. Likewise, medicine pills as well as regular mints could just be taught as mints.<p>While this approach doesn’t address privacy completely, it avoids certain elements of our lives that are considered really private.<p>I think half of the point of using the simulated dataset to train the model was to safeguard the above-mentioned kind of privacy (with the other half being the lack of a real-world dataset).</p>
]]></description><pubDate>Sat, 23 Mar 2024 18:03:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=39801860</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=39801860</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39801860</guid></item><item><title><![CDATA[New comment by thoughtlede in "Launch HN: DryMerge (YC W24) – Automate Workflows with Plain English"]]></title><description><![CDATA[
<p>Yeah. Rollbacks or reruns are hard when dealing with external systems. Actions need to be idempotent for reruns to work.<p>One thing you may want to focus on is making workflows more durable: checkpointing, and sending users summaries of the last checkpoint when things fail (a bare-bones sketch below).<p>The last thing you want is for a non-tech user (your target customer) to have to figure out the state of a failed workflow.</p>
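<p>A bare-bones sketch of that checkpointing idea; the step names and on-disk format are made up for illustration:<p><pre><code>import json
import pathlib

CHECKPOINT = pathlib.Path("workflow_state.json")

def load_state() -> dict:
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text())
    return {"done": []}

def run(steps):
    state = load_state()
    for name, action in steps:
        if name in state["done"]:
            continue              # idempotent: already ran, skip on rerun
        try:
            action()
        except Exception:
            # Surface a human-readable summary of the last checkpoint.
            print(f"failed at {name!r}; completed so far: {state['done']}")
            raise
        state["done"].append(name)
        CHECKPOINT.write_text(json.dumps(state))   # durable progress

run([("fetch", lambda: None),
     ("transform", lambda: None),
     ("notify", lambda: None)])
</code></pre>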
]]></description><pubDate>Fri, 22 Mar 2024 17:52:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=39793088</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=39793088</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39793088</guid></item><item><title><![CDATA[New comment by thoughtlede in "Launch HN: DryMerge (YC W24) – Automate Workflows with Plain English"]]></title><description><![CDATA[
<p>Thanks.<p>What happens right now when the workflow fails mid-way? Do you ensure atomicity or durable execution?</p>
]]></description><pubDate>Fri, 22 Mar 2024 17:35:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=39792956</link><dc:creator>thoughtlede</dc:creator><comments>https://news.ycombinator.com/item?id=39792956</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39792956</guid></item></channel></rss>