<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: couAUIA</title><link>https://news.ycombinator.com/user?id=couAUIA</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 26 May 2026 18:20:18 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=couAUIA" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by couAUIA in "GitHub Actions down again today"]]></title><description><![CDATA[
<p>LOL, they added "Copilot AI Model Providers" to githubstatus and it has 100% uptime.<p>Thanks for pointing out that nobody is using that thing.</p>
]]></description><pubDate>Tue, 26 May 2026 12:19:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=48278738</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=48278738</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48278738</guid></item><item><title><![CDATA[Ask HN: Should I continue this project? (Being able to change AI harness)]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/charles-azam/OmniAgents">https://github.com/charles-azam/OmniAgents</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48026121">https://news.ycombinator.com/item?id=48026121</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 05 May 2026 17:58:21 +0000</pubDate><link>https://github.com/charles-azam/OmniAgents</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=48026121</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48026121</guid></item><item><title><![CDATA[Evaluate Your Own RAG: Why Best Practices Failed Us]]></title><description><![CDATA[
<p>Article URL: <a href="https://charlesazam.com/blog/rag/">https://charlesazam.com/blog/rag/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47035041">https://news.ycombinator.com/item?id=47035041</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 16 Feb 2026 14:02:22 +0000</pubDate><link>https://charlesazam.com/blog/rag/</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=47035041</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47035041</guid></item><item><title><![CDATA[New comment by couAUIA in "GLM-5 topped the coding benchmarks. Then I used it"]]></title><description><![CDATA[
<p>TL;DR: GLM-5 tops coding benchmarks. I tested it on an unpublished NP-hard optimization problem (KIRO) and the 89-task Terminal-Bench. Best case: competitive. Typical case: 30% invalid output, every trial timed out, and two identical runs could produce either a valid solution or complete garbage. Zhipu AI reports 56% on Terminal-Bench; I got 40%.</p>
]]></description><pubDate>Sat, 14 Feb 2026 20:21:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47018004</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=47018004</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47018004</guid></item><item><title><![CDATA[GLM-5 topped the coding benchmarks. Then I used it]]></title><description><![CDATA[
<p>Article URL: <a href="https://charlesazam.com/blog/glm5-benchmark-reality/">https://charlesazam.com/blog/glm5-benchmark-reality/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47018003">https://news.ycombinator.com/item?id=47018003</a></p>
<p>Points: 5</p>
<p># Comments: 1</p>
]]></description><pubDate>Sat, 14 Feb 2026 20:21:27 +0000</pubDate><link>https://charlesazam.com/blog/glm5-benchmark-reality/</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=47018003</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47018003</guid></item><item><title><![CDATA[New comment by couAUIA in "I benchmarked 4 coding agents on an NP-hard problem I solved 8 years ago"]]></title><description><![CDATA[
<p>I gave an unpublished fiber network optimization problem to Claude Code, Codex, Gemini CLI, and Mistral. The score is total fiber length (lower is better). A good human solution in 30 minutes: ~40,000. My best after days of C++: 34,123. Given one hour, Claude Code hit 34,061 — beating me by 62 points. A 7-word prompt hint improved every agent by 18-30%. About 15% of all trials produced completely invalid outputs.</p>
]]></description><pubDate>Thu, 12 Feb 2026 14:44:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=46989454</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=46989454</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46989454</guid></item><item><title><![CDATA[I benchmarked 4 coding agents on an NP-hard problem I solved 8 years ago]]></title><description><![CDATA[
<p>Article URL: <a href="https://charlesazam.com/blog/kiro-benchmark/">https://charlesazam.com/blog/kiro-benchmark/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46989453">https://news.ycombinator.com/item?id=46989453</a></p>
<p>Points: 3</p>
<p># Comments: 2</p>
]]></description><pubDate>Thu, 12 Feb 2026 14:44:20 +0000</pubDate><link>https://charlesazam.com/blog/kiro-benchmark/</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=46989453</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46989453</guid></item><item><title><![CDATA[I Accidentally Rebuilt OpenHands from Scratch – Here's What I Learned]]></title><description><![CDATA[
<p>Article URL: <a href="https://huggingface.co/blog/charles-azam/rebuilt-openhands">https://huggingface.co/blog/charles-azam/rebuilt-openhands</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46464090">https://news.ycombinator.com/item?id=46464090</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 02 Jan 2026 12:23:38 +0000</pubDate><link>https://huggingface.co/blog/charles-azam/rebuilt-openhands</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=46464090</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46464090</guid></item><item><title><![CDATA[New comment by couAUIA in "Evaluate Your Own RAG, Why Best Practices Failed Us"]]></title><description><![CDATA[
<p>At Jimmy, we're developing France's first Small Modular Reactor. Our engineers needed to quickly search and extract insights from thousands of complex scientific PDFs: nuclear research papers, regulatory documents, and multilingual content filled with equations and diagrams.<p>Manual search wasn't cutting it, so we built a RAG system to give our team instant access to critical technical knowledge.<p>What worked:
-  AWS Titan V2 crushed it (69.2% hit rate vs. 57.7% for Qwen and 39.1% for Mistral)
-  Chunk size barely mattered (no significant difference from 2K to 40K)
-  Qdrant: easy to use, solid performance, great for self-hosting
-  Mistral OCR: unmatched, the only tool that parsed our equations correctly
-  Naive chunking beat context-aware chunking (70.5% vs. 63.8%)
-  Dense-only search outperformed hybrid search (69.2% vs. 63.5%)<p>Hard lessons:
-  AWS OpenSearch is ridiculously expensive for no clear reason, yet AWS presents it as the default option
-  Mistral Embed works well in English but not in French</p>
]]></description><pubDate>Wed, 05 Nov 2025 13:47:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=45822766</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=45822766</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45822766</guid></item><item><title><![CDATA[Evaluate Your Own RAG, Why Best Practices Failed Us]]></title><description><![CDATA[
<p>Article URL: <a href="https://huggingface.co/blog/charles-azam/rag">https://huggingface.co/blog/charles-azam/rag</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45822765">https://news.ycombinator.com/item?id=45822765</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Wed, 05 Nov 2025 13:47:04 +0000</pubDate><link>https://huggingface.co/blog/charles-azam/rag</link><dc:creator>couAUIA</dc:creator><comments>https://news.ycombinator.com/item?id=45822765</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45822765</guid></item></channel></rss>