<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: sdesol</title><link>https://news.ycombinator.com/user?id=sdesol</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 27 May 2026 04:49:54 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=sdesol" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by sdesol in "Amp, Inc. – Amp is spinning out of Sourcegraph"]]></title><description><![CDATA[
<p>I think the issue was they (the parent commenter) didn't properly convey and/or did not realize they were arguing for context.  Data that is difficult to come by that can be used in a prompt is valuable. Being able to work around something with clever wording (i.e. prompt) is not a moat.</p>
]]></description><pubDate>Tue, 09 Dec 2025 07:24:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=46202196</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=46202196</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46202196</guid></item><item><title><![CDATA[New comment by sdesol in "ClickHouse acquires LibreChat, open-source AI chat platform"]]></title><description><![CDATA[
<p>Full Disclosure. I am the author of <a href="https://github.com/gitsense/chat" rel="nofollow">https://github.com/gitsense/chat</a><p>> The idea behind the Agentic Data Stack is a higher-level integration to provide a composable software stack for agentic analytics that users can setup quicky, with room for customization.<p>I agree with this. For those who have been programming with LLMs, the difference between something working and not working can be a simple "sentence" conveying the required context. I strongly believe data enrichment will be one of the main ways we can make agents more effective and efficient. Data enrichment is the foundation for my personal assistant feature <a href="https://github.com/gitsense/chat/blob/main/packages/chat/widgets/app/components/chat-builder/trees/help/documentation/understanding-your-personalized-ai-search-assistant/1.md" rel="nofollow">https://github.com/gitsense/chat/blob/main/packages/chat/wid...</a><p>Basically instead of having agents blindly grep for things, you would provide them with analyzers that they can use to search with. By making it dead simple for domain experts to extract 'business logic' from their codebase/data, we can solve a lot of problems, much more efficiently. Since data is the key, I can see why ClickHouse would make this move since they probably want to become the storage for all business logic.<p>Note: I will be dropping a massive update to how my tool generates and analyzes metadata this week, so don't read too much into the demo if you decide to play with it. I haven't really been promoting it because the flow hasn't been right, but it should be this week.</p>
]]></description><pubDate>Mon, 10 Nov 2025 18:12:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=45878854</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45878854</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45878854</guid></item><item><title><![CDATA[New comment by sdesol in "New York Times, AP, Newsmax and others say they won't sign new Pentagon rules"]]></title><description><![CDATA[
<p>> all I want at this point, is my politicians to be smarter than me<p>I don't care if they are smarter than me. I need them to be smart enough to know they are not that smart. I don't expect politicians to be smart. I expect them to be good listeners and be the voice for the people.</p>
]]></description><pubDate>Tue, 14 Oct 2025 06:10:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=45576786</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45576786</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45576786</guid></item><item><title><![CDATA[New comment by sdesol in "OpenAI Is Just Another Boring, Desperate AI Startup"]]></title><description><![CDATA[
<p>> I'm in awe they are still allowing free users at all.<p>I am not.<p>> The free tier is enough for me to use it as a helper at work, and I'd probably pay for it tomorrow if they cut off the free tier.<p>You are sort of proving the point that this isn't crazy. They want to be the dealer of choice and they can afford to give you the hit now for free.</p>
]]></description><pubDate>Fri, 03 Oct 2025 18:14:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=45465978</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45465978</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45465978</guid></item><item><title><![CDATA[New comment by sdesol in "Cerebras systems raises $1.1B Series G"]]></title><description><![CDATA[
<p>> Sonnet/Claude Code may technically be "smarter", but Qwen3-Coder on Cerebras is often more productive for me because it's just so incredibly fast.<p>Saying "technically" is really underselling the difference in intelligence in my opinion.  Claude and Gemini are much, much smarter and I trust them to produce better code, but you honestly can't deny the excellent value that Qwen-3, the inference speed and $50/month for 25M tokens per day bring to the table.<p>Since I paid for the Cerebras pro plan, I've decided to force myself to use it as much as possible for the duration of the month for developing my chat app (<a href="https://github.com/gitsense/chat" rel="nofollow">https://github.com/gitsense/chat</a>) and here are some of my thoughts so far:<p>- Qwen3 Coder is a lot dumber when it comes to prompting as Gemini and Claude are much better at reading between the lines.  However since the speed is so good, I often don't care as I can go back to the message and make some simple clarifications and try again.<p>- The max context window size of 128k for Qwen 3 Coder 480B on their platform can be a serious issue if you need a lot of documentation or code in context.<p>- I've never come close to the 25M tokens per day limit for their Pro Plan. The max I am using is 5M/day.<p>- The inference speed + a capable model like Qwen 3 will open up use cases most people might not have thought of before.<p>I will probably continue to pay for the $50 plan for these use cases.<p>1. Applying LLM generated patches<p>Qwen 3 coder is very much capable of applying patches generated by Sonnet and Gemini.  It is slower than what <a href="https://www.morphllm.com/">https://www.morphllm.com/</a> provides but it is definitely fast enough for most people to not care. The cost savings can be quite significant depending on the work.<p>2. Building context<p>Since it is so fast and because the 25M token limit per day is such a high limit for me, I am finding myself loading more files into context and just asking Qwen to identify files that I will need and/or summarize things so I can feed it into Sonnet or Gemini to save me significant money.<p>3. AI Assistant<p>Due to its blazing speed, you can analyze a lot of data fast for deterministic searches and because it can review results at such a great speed, you can do multiple search and review loops without feeling like you are waiting forever.<p>Given what I've experienced so far, I don't think Cerebras can be a serious platform for coding if Qwen 3 Coder is the only available model.  Having said that, given the inference speed and Qwen being more than capable, I can see Cerebras becoming a massive cost savings option for many companies and developers, which is where I think they might win a lot of enterprise contracts.</p>
]]></description><pubDate>Tue, 30 Sep 2025 23:11:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=45432418</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45432418</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45432418</guid></item><item><title><![CDATA[New comment by sdesol in "Context is the bottleneck for coding agents now"]]></title><description><![CDATA[
<p>> A human can effectively discard or disregard prior information as the narrow window of focus moves to a new task, LLMs seem incredibly bad at this.<p>This is how I designed my LLM chat app (<a href="https://github.com/gitsense/chat" rel="nofollow">https://github.com/gitsense/chat</a>).  I think agents have their place, but I really think if you want to solve complex problems without needlessly burning tokens, you will need a human in the loop to curate the context.  I will get to it, but I believe in the same way that we developed different flows for working with Git, we will have different 'Chat Flows' for working with LLMs.<p>I have an interactive demo at <a href="https://chat.gitsense.com" rel="nofollow">https://chat.gitsense.com</a> which shows how you can narrow the focus of the context for the LLM.  Click "Start GitSense Chat Demos" then "Context Engineering & Management" to go through the 30 second demo.</p>
]]></description><pubDate>Fri, 26 Sep 2025 17:32:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45388965</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45388965</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45388965</guid></item><item><title><![CDATA[New comment by sdesol in "Everyone's trying vectors and graphs for AI memory. We went back to SQL"]]></title><description><![CDATA[
<p>How are you quantifying the speed at which results are reviewed?</p>
]]></description><pubDate>Thu, 25 Sep 2025 01:14:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=45368133</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45368133</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45368133</guid></item><item><title><![CDATA[New comment by sdesol in "Everyone's trying vectors and graphs for AI memory. We went back to SQL"]]></title><description><![CDATA[
<p>Honestly Gemini Flash Lite and models on Cerebras are extremely fast.  I know what you are saying. If the goal is to get a lot of results where they may or may not be relevant, then yes, it is an order of magnitude slower.<p>If you take into consideration the post analysis process, which is what inference is trying to solve, is it an order of magnitude slower?</p>
]]></description><pubDate>Wed, 24 Sep 2025 17:11:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=45363220</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45363220</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45363220</guid></item><item><title><![CDATA[New comment by sdesol in "Everyone's trying vectors and graphs for AI memory. We went back to SQL"]]></title><description><![CDATA[
<p>I'm guessing you are referring to <a href="https://github.com/gitsense/chat/tree/main/data/analyze" rel="nofollow">https://github.com/gitsense/chat/tree/main/data/analyze</a> or <a href="https://github.com/gitsense/chat/tree/main/packages/chat/widgets/app/components/chat-builder/messages" rel="nofollow">https://github.com/gitsense/chat/tree/main/packages/chat/wid...</a><p>The number is actually the order in the chat so 1.md would be the first message, 2.md would be the second and so forth.<p>If you go to <a href="https://chat.gitsense.com" rel="nofollow">https://chat.gitsense.com</a> and click on the "Load Personal Help Guide" you can see how it is used. Since I want you to be able to chat with the document, I will create a new chat tree and use the directory structure and the 1,2,3... markdown files to determine message order.</p>
]]></description><pubDate>Wed, 24 Sep 2025 17:05:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=45363144</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45363144</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45363144</guid></item><item><title><![CDATA[New comment by sdesol in "Everyone's trying vectors and graphs for AI memory. We went back to SQL"]]></title><description><![CDATA[
<p>You could instruct the LLM to classify messages with high-level tags; for example, for coffee, drinks, etc., always include beverage.<p>Given how fast inference has become and given current supported context window sizes for most SOTA models, I think summarizing and having the LLM decide what is relevant is not that fragile at all for most use cases.  This is what I do with my analyzers which I talk about at <a href="https://github.com/gitsense/chat/blob/main/packages/chat/widgets/app/components/chat-builder/trees/help/documentation/understanding-your-personalized-ai-search-assistant/1.md" rel="nofollow">https://github.com/gitsense/chat/blob/main/packages/chat/wid...</a></p>
]]></description><pubDate>Wed, 24 Sep 2025 16:50:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=45362948</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45362948</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45362948</guid></item><item><title><![CDATA[New comment by sdesol in "Everyone's trying vectors and graphs for AI memory. We went back to SQL"]]></title><description><![CDATA[
<p>I haven't looked at the code, but it might do what I do with my chat app which is talked about at
 <a href="https://github.com/gitsense/chat/blob/main/packages/chat/widgets/app/components/search/docs/gitsense/search-strategy.md" rel="nofollow">https://github.com/gitsense/chat/blob/main/packages/chat/wid...</a><p>The basic idea is, you don't search for a single term but rather you search for many. Depending on the instructions provided in the "Query Construction" stage, you may end up with a very high level search term like beverage or you may end up with terms like 'hot-drinks', 'cold-drinks', etc.<p>Once you have the query, you can do a "Broad Search" which returns an overview of the message and from there the LLM can determine which messages it should analyze further if required.<p>Edit.<p>I should add, this search strategy will only work well if you have a post-message process.  For example, after every message save/update, you have the LLM generate an overview.  These are my instructions for my tiny overview <a href="https://github.com/gitsense/chat/blob/main/data/analyze/tiny-overview/file-content/default/1.md" rel="nofollow">https://github.com/gitsense/chat/blob/main/data/analyze/tiny...</a> that is focused on generating the purpose and keywords that can be used to help the LLM define search terms.</p>
]]></description><pubDate>Wed, 24 Sep 2025 16:09:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=45362347</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45362347</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45362347</guid></item><item><title><![CDATA[New comment by sdesol in "Tesla market share in US drops to lowest since 2017"]]></title><description><![CDATA[
<p>> We're putting aside the political stuff because there isn't a lot to discuss<p>I don't agree, as we are not quantifying the emotional aspect of the purchasing process. If people "love" the brand, they are willing to overlook a lot of things. Tesla was a status symbol and is now seen as a regret purchase and a toxic brand for many (see Europe and Canada for examples).  I can't see how "politics" should not be considered as it does play a critical role in how people spend money. There is a reason why a lot of companies are not open about politics and I don't think I've ever seen a CEO that was so forthcoming with their beliefs as Elon Musk.</p>
]]></description><pubDate>Tue, 09 Sep 2025 02:45:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45176802</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45176802</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45176802</guid></item><item><title><![CDATA[New comment by sdesol in "GLM 4.5 with Claude Code"]]></title><description><![CDATA[
<p>> But in my testing, other models do not work well. It looks like prompts are either very optimized for Claude, or other models are just not great yet with such an agentic environment.<p>Anybody who has done any serious development with LLMs would know that prompts are not universal. The reason why Claude Code is good is because Anthropic knows Claude Sonnet is good, and that they only need to create prompts that work well with their models. They also have the ability to train their models to work with specific tools and so forth.<p>It really is a kind of fool's errand to try to create agents that can work well with many different models from different providers.</p>
]]></description><pubDate>Sat, 06 Sep 2025 06:18:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=45147052</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45147052</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45147052</guid></item><item><title><![CDATA[New comment by sdesol in "Anthropic raises $13B Series F"]]></title><description><![CDATA[
<p>It might not be their money, but they are paid a management fee and if they cannot provide some return, people will stop using them.</p>
]]></description><pubDate>Wed, 03 Sep 2025 15:38:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=45117057</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45117057</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45117057</guid></item><item><title><![CDATA[New comment by sdesol in "A staff engineer's journey with Claude Code"]]></title><description><![CDATA[
<p>It will certainly be interesting to see how businesses evolve in the upcoming years. What is written in stone is, you (employee) will be measured and I am curious to see what developers will be measured by in the future. Will you be at a greater risk of layoffs/lack of promotions/etc. if you spend more on AI? How do you as a developer prove that it is you and not the LLM that should be praised?</p>
]]></description><pubDate>Tue, 02 Sep 2025 23:45:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=45110579</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45110579</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45110579</guid></item><item><title><![CDATA[New comment by sdesol in "Anthropic raises $13B Series F"]]></title><description><![CDATA[
<p>> So it is a game of being the one that is left standing<p>Or the last investor. When this type of money is raised, you can be sure the earlier investors are looking for ways to have a soft landing.</p>
]]></description><pubDate>Tue, 02 Sep 2025 17:53:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=45106637</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45106637</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45106637</guid></item><item><title><![CDATA[New comment by sdesol in "Survey: a third of senior developers say over half their code is AI-generated"]]></title><description><![CDATA[
<p>Joking aside, if he is one of the top developers in the company and if he is "actually" a good developer, when compared to others outside of the company, then I can see this bill.<p>The current feature that I'm working on required 100 messages to finalize things and I would say the context window was around 35k - 50k per "chat completion".  My model of choice is Gemini 2.5 Flash which has an input cost of $0.30/1M.  Compare this to Sonnet which is $3.00/1M.<p>If the person was properly designing and instructing the LLM to build something advanced correctly, I can see the bill being quite high. I personally don't think you need to use Sonnet 99% of the time, but if somebody else is willing to pay the bill, why not.</p>
]]></description><pubDate>Mon, 01 Sep 2025 07:29:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=45090361</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45090361</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45090361</guid></item><item><title><![CDATA[New comment by sdesol in "Deploying DeepSeek on 96 H100 GPUs"]]></title><description><![CDATA[
<p>No, what I am saying is there are more applications for batch processing that will help with utilization. I can see developers and companies using off-hour processing to prep their data for agentic coding.</p>
]]></description><pubDate>Sat, 30 Aug 2025 16:37:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=45075958</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45075958</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45075958</guid></item><item><title><![CDATA[New comment by sdesol in "Deploying DeepSeek on 96 H100 GPUs"]]></title><description><![CDATA[
<p>I don't think you need to be big data to benefit.<p>A major issue we have right now is, we want the coding process to be more "Agentic", but we don't have an easy way for LLMs to determine what to pull into context to solve a problem. This is a problem that I am working on with my personal AI search assistant, which I talk about below:<p><a href="https://github.com/gitsense/chat/blob/main/packages/chat/widgets/app/components/chat-builder/trees/help/documentation/understanding-your-personalized-ai-search-assistant/1.md" rel="nofollow">https://github.com/gitsense/chat/blob/main/packages/chat/wid...</a><p>Analyzers are the "Brains" for my search, but generating the analysis is both tedious and can be costly. I'm working on the tedious part and with batch processing, you can probably process thousands of files for under 5 dollars with Gemini 2.5 Flash.<p>With batch processing and the ability to continuously analyze tens of thousands of files, I can see companies wanting to make "Agentic" coding smarter, which should help with GPU utilization and drive down the cost of software development.</p>
]]></description><pubDate>Fri, 29 Aug 2025 21:11:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=45069433</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45069433</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45069433</guid></item><item><title><![CDATA[New comment by sdesol in "Grok Code Fast 1"]]></title><description><![CDATA[
<p>As a bit of a side note, I want to like Cerebras, but using any of the models through OpenRouter that use them has led to too many throttling responses. Like you can't seem to make a few calls per minute. I'm not sure if Cerebras is throttling OpenRouter or if they are throttling everybody.<p>If somebody from Cerebras is reading this, are you having capacity issues?</p>
]]></description><pubDate>Fri, 29 Aug 2025 18:20:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=45067631</link><dc:creator>sdesol</dc:creator><comments>https://news.ycombinator.com/item?id=45067631</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45067631</guid></item></channel></rss>