<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: wehadit</title><link>https://news.ycombinator.com/user?id=wehadit</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 30 Apr 2026 02:10:09 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=wehadit" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by wehadit in "After outages, Amazon to make senior engineers sign off on AI-assisted changes"]]></title><description><![CDATA[
<p>Why not get it right the first time, before it's sent to the senior devs?<p>Granted, AI produces sloppy code, but there are ways to bring it up to senior-dev grade, mainly by getting rid of what I call Builder's Debt:
- iterate on it relentlessly until it's production grade, OR
- extract ACs from Jira, requirements from docs, and decisions and query resolutions from Slack, emails, and meeting notes; stitch them together; and design coding standards relevant to the requirements and context before coding</p>
]]></description><pubDate>Fri, 13 Mar 2026 02:12:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47359915</link><dc:creator>wehadit</dc:creator><comments>https://news.ycombinator.com/item?id=47359915</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47359915</guid></item><item><title><![CDATA[New comment by wehadit in "How to run Qwen 3.5 locally"]]></title><description><![CDATA[
<p>If you use the models the way we actually execute coding tasks, older models can outperform the latest ones.
There's a prep tax that comes due even before we start coding: extract requirements from tools, context from code, comments and decisions from conversations, and ACs from Jira/Notion; stitch them together; design tailored coding standards; and then code.
If you automate the prep tax, the generated code is close to production ready and may need one or two iterations at most.
I gave it a try and compared the results: the output was 92% accurate, while the same task done on Claude Code gave 68%. The prep tax is the key here</p>
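To make the "stitch them together" step concrete, here is a minimal sketch of a context pack that assembles extracted material into one prompt. The class name, fields, and example strings are all hypothetical (real inputs would come from Jira/Notion/Slack exports), purely to illustrate the shape of the prep step:

```python
from dataclasses import dataclass, field

@dataclass
class ContextPack:
    """Holds material extracted before coding; all fields are plain
    strings here for illustration, not tied to any real tool's API."""
    acceptance_criteria: list[str] = field(default_factory=list)
    requirements: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    coding_standards: list[str] = field(default_factory=list)

    def to_prompt(self, task: str) -> str:
        """Stitch the non-empty sections into one coding prompt,
        with the task itself last."""
        sections = [
            ("Acceptance criteria", self.acceptance_criteria),
            ("Requirements", self.requirements),
            ("Decisions from discussion", self.decisions),
            ("Coding standards", self.coding_standards),
        ]
        body = "\n\n".join(
            f"## {title}\n" + "\n".join(f"- {item}" for item in items)
            for title, items in sections
            if items  # skip sections with nothing extracted
        )
        return f"{body}\n\n## Task\n{task}"
```

The point of the structure is that every section is traceable back to a source, so a reviewer can check the prompt the same way they'd check a spec.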
]]></description><pubDate>Thu, 12 Mar 2026 20:37:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47356722</link><dc:creator>wehadit</dc:creator><comments>https://news.ycombinator.com/item?id=47356722</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47356722</guid></item><item><title><![CDATA[New comment by wehadit in "Stop Blaming Embeddings, Most RAG Failures Come from Bad Chunking"]]></title><description><![CDATA[
<p>Thank you, I'll have a look.</p>
]]></description><pubDate>Wed, 03 Dec 2025 22:08:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=46140892</link><dc:creator>wehadit</dc:creator><comments>https://news.ycombinator.com/item?id=46140892</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46140892</guid></item><item><title><![CDATA[Stop Blaming Embeddings, Most RAG Failures Come from Bad Chunking]]></title><description><![CDATA[
<p>Everyone keeps arguing about embeddings, vector DBs, and model choice, but in real systems, those aren’t the things breaking retrieval.
Chunking drift is. And almost nobody monitors it.
A tiny formatting change in a PDF or HTML file silently shifts boundaries. Overlaps become inconsistent. Semantic units get split mid-thought. Headings flatten. Cross-format differences explode. By the time retrieval quality drops, people start tweaking the model… while the actual problem happened upstream.
If you diff chunk boundaries across versions or track chunk-size variance, the drift is obvious. But most teams don’t even version their chunking logic, let alone validate segmentation or check adjacency similarity.
The industry treats chunking like a trivial preprocessing step. It's not.
It’s the single biggest source of retrieval collapse, and it’s usually invisible.
Before playing with new embeddings, fix your segmentation pipeline. Chunking is repetitive, undifferentiated engineering, but if you don’t stabilize it, the rest of your RAG stack is built on sand.</p>
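Diffing boundaries across versions is cheap to implement. A minimal sketch, assuming chunks are represented as (start, end) character offsets (the offsets below are made up for illustration, not tied to any particular chunker):

```python
import statistics

def boundary_drift(old_chunks, new_chunks):
    """Compare two chunking runs over the same document.

    Each run is a list of (start, end) character offsets. Returns the
    Jaccard similarity of the boundary sets (1.0 = identical
    segmentation) and the change in chunk-length variance."""
    old_bounds = {b for chunk in old_chunks for b in chunk}
    new_bounds = {b for chunk in new_chunks for b in chunk}
    jaccard = len(old_bounds & new_bounds) / len(old_bounds | new_bounds)

    def length_variance(chunks):
        lengths = [end - start for start, end in chunks]
        return statistics.pvariance(lengths) if len(lengths) > 1 else 0.0

    return jaccard, length_variance(new_chunks) - length_variance(old_chunks)

# A formatting change that shifts a single boundary shows up immediately:
v1 = [(0, 100), (100, 210), (210, 300)]
v2 = [(0, 100), (100, 250), (250, 300)]  # middle boundary drifted
jaccard, var_delta = boundary_drift(v1, v2)  # jaccard 0.6, variance up 1600.0
```

Run this on every re-ingest and alert when the Jaccard score drops or the variance jumps; that catches the silent boundary shifts before retrieval quality does.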
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46134816">https://news.ycombinator.com/item?id=46134816</a></p>
<p>Points: 2</p>
<p># Comments: 3</p>
]]></description><pubDate>Wed, 03 Dec 2025 14:25:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=46134816</link><dc:creator>wehadit</dc:creator><comments>https://news.ycombinator.com/item?id=46134816</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46134816</guid></item><item><title><![CDATA[New comment by wehadit in "Most Agentic AI failures I've debugged turned out to be ingestion drift"]]></title><description><![CDATA[
<p>That's another way to look at it. Getting AI to produce reliable, accurate output is where we feel certain steps need better structure, such as ingestion and chunking strategies.</p>
]]></description><pubDate>Wed, 03 Dec 2025 05:22:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=46130580</link><dc:creator>wehadit</dc:creator><comments>https://news.ycombinator.com/item?id=46130580</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46130580</guid></item><item><title><![CDATA[Most Agentic AI failures I've debugged turned out to be ingestion drift]]></title><description><![CDATA[
<p>Over the last few months, we’ve been working on building an autonomous agentic AI, and something unexpected kept showing up. I went in thinking the issues were with embeddings or the retriever, but the root cause was usually ingestion drift upstream.<p>Some patterns that kept repeating:
• PDFs extracting differently after a small template or export tool change
• headings collapsing or shifting levels
• hidden characters creeping into tokens
• tables losing their structure
• documents updated without being re-ingested
• different converters producing slightly different text layouts<p>We only noticed the drift once we started diffing extraction output week-to-week and tracking token count variance. Running two extractors on the same file also revealed inconsistencies that weren’t obvious from looking at the text.<p>Even with pinned extractor versions, mixed-format sources (Google Docs, Word, Confluence exports, scanned PDFs) still drifted subtly over time. The retriever was doing exactly what it was told, the input data just wasn’t consistent anymore.<p>Curious if others have seen this.
How do you keep ingestion stable in production RAG/Agentic AI systems?</p>
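The week-to-week diffing we ended up with can be sketched in a few lines. This is a minimal version, assuming each snapshot maps a document id to its extracted text (the ids, tolerance value, and field names are illustrative, not from any real pipeline):

```python
import hashlib
import re

def fingerprint(text):
    """Normalized hash plus token count for one extracted document."""
    normalized = re.sub(r"\s+", " ", text).strip()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return digest, len(normalized.split())

def detect_drift(last_week, this_week, token_tolerance=0.02):
    """Flag documents whose extraction changed between two snapshots.

    Both arguments map doc_id -> extracted text. A doc is flagged when
    its normalized hash changes; the relative token-count delta helps
    separate hidden-character noise (hash changed, length barely did)
    from real content changes."""
    drifted = {}
    for doc_id, old_text in last_week.items():
        if doc_id not in this_week:
            continue  # deletions are tracked elsewhere
        old_hash, old_tokens = fingerprint(old_text)
        new_hash, new_tokens = fingerprint(this_week[doc_id])
        if old_hash != new_hash:
            delta = abs(new_tokens - old_tokens) / max(old_tokens, 1)
            drifted[doc_id] = {
                "token_delta": delta,
                # same length, different bytes: likely invisible drift
                "suspicious": delta <= token_tolerance,
            }
    return drifted
```

Running this across snapshots is what surfaced the zero-width characters and converter differences for us; the "suspicious" bucket is exactly the drift that never shows up when you eyeball the text.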
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46128397">https://news.ycombinator.com/item?id=46128397</a></p>
<p>Points: 2</p>
<p># Comments: 2</p>
]]></description><pubDate>Tue, 02 Dec 2025 23:33:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=46128397</link><dc:creator>wehadit</dc:creator><comments>https://news.ycombinator.com/item?id=46128397</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46128397</guid></item><item><title><![CDATA[Is generation of reliable tailored code helpful?]]></title><description><![CDATA[
<p>Hey devs, I am building an agentic AI to be the AI tool for developers. I was wondering whether it would be helpful if the agentic AI generated the first working version of an app in one prompt, with reusable code tailored to project specs from Figma/Motiff, Postman, or requirements docs. Additionally, what if it also helped elevate your coding skills on every project, did code review, created unit test cases, and helped with task management?<p>Would this type of agentic AI/AI tool help developers?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42987231">https://news.ycombinator.com/item?id=42987231</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 08 Feb 2025 23:52:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=42987231</link><dc:creator>wehadit</dc:creator><comments>https://news.ycombinator.com/item?id=42987231</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42987231</guid></item><item><title><![CDATA[We need more than AI auto complete to do what matters most]]></title><description><![CDATA[
<p>The reality is that since the '90s, we've gone from spending 70% of our time on things we enjoy to 30%. The shift comes from information overload and a multitude of apps and processes that have made mundane tasks a necessity, eating up roughly 40% of our lives. "9 to 5" is now such a phased-out term that it irks people, because we struggle to make time for the things we enjoy. The same has happened in our personal lives. There's definitely a solution in AI.<p>If corporations can profit from it by automating their most complex processes, why can't we as individuals take advantage as well? It's time to empower ourselves by making the power of AI accessible for both work and personal tasks. The power of the desktop is old news.<p>Imagine a personal AI that recommends vacation plans by automatically working from your preferences, budget, and prior reservations: the five hours of travel research are now spent with friends and family.<p>Or a personal AI that extracts UI elements from Figma, UI requirements from a doc, and APIs from Postman to generate the first working version of your app using your coding standards. The time saved lets you spend time on what you enjoy most: applying your hard-earned skills to complex, challenging, customization-heavy tasks.<p>It's not only an AI tool for developers; it's also a tool for a developer who is a traveler. This is the tech world we should have. Let's not stop at AI autocomplete; let's empower ourselves with this kind of personal AI.
Do you agree that the gap I describe exists, and that the power of personal AI (not generic AI) will fill it?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42968013">https://news.ycombinator.com/item?id=42968013</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 07 Feb 2025 00:35:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=42968013</link><dc:creator>wehadit</dc:creator><comments>https://news.ycombinator.com/item?id=42968013</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42968013</guid></item></channel></rss>