<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: kapitalx</title><link>https://news.ycombinator.com/user?id=kapitalx</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 30 May 2026 05:03:40 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=kapitalx" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Show HN: I Built Grid View for Switchboard – Claude Code CLI Manager]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/doctly/switchboard/releases/tag/v0.0.16">https://github.com/doctly/switchboard/releases/tag/v0.0.16</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47472719">https://news.ycombinator.com/item?id=47472719</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 21 Mar 2026 23:38:25 +0000</pubDate><link>https://github.com/doctly/switchboard/releases/tag/v0.0.16</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=47472719</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47472719</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: A desktop app for managing Claude Code sessions"]]></title><description><![CDATA[
<p>Fixed some bugs; it's much more stable now.<p><a href="https://github.com/doctly/switchboard/releases/tag/v0.0.8" rel="nofollow">https://github.com/doctly/switchboard/releases/tag/v0.0.8</a></p>
]]></description><pubDate>Fri, 13 Mar 2026 23:17:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47371304</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=47371304</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47371304</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: A desktop app for managing Claude Code sessions"]]></title><description><![CDATA[
<p>Thank you. Yeah, I looked at a few; another one is air.dev. The problem is they are recreating the actual coding pane. Switchboard runs the terminal directly.</p>
]]></description><pubDate>Fri, 13 Mar 2026 15:39:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47365931</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=47365931</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47365931</guid></item><item><title><![CDATA[Show HN: A desktop app for managing Claude Code sessions]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/doctly/switchboard">https://github.com/doctly/switchboard</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47351580">https://news.ycombinator.com/item?id=47351580</a></p>
<p>Points: 5</p>
<p># Comments: 3</p>
]]></description><pubDate>Thu, 12 Mar 2026 14:59:26 +0000</pubDate><link>https://github.com/doctly/switchboard</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=47351580</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47351580</guid></item><item><title><![CDATA[New comment by kapitalx in "Ask HN: What are you working on? (September 2025)"]]></title><description><![CDATA[
<p><a href="https://doctly.ai" rel="nofollow">https://doctly.ai</a><p>We're building Doctly.ai - PDF Extraction with AI.<p>We started out with document conversions to Markdown but quickly realized that most use cases were for JSON conversion. We recently launched our "Extractor Studio", where you can have AI analyze a few sample variations of your documents, come up with a schema for you, and publish it to an API endpoint.<p>We've built a technique on top of AI models that dramatically improves run-to-run consistency of JSON output.<p>Check out the blog post here: <a href="https://medium.com/@abasiri/introducing-doctlys-extractor-studio-a-faster-way-to-build-pdf-extractors-999093a0ad11" rel="nofollow">https://medium.com/@abasiri/introducing-doctlys-extractor-st...</a></p>
]]></description><pubDate>Tue, 30 Sep 2025 15:42:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=45426958</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=45426958</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45426958</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: Extractor Studio – The Fastest Way to Build PDF Extractors"]]></title><description><![CDATA[
<p>Check it out at <a href="https://doctly.ai" rel="nofollow">https://doctly.ai</a></p>
]]></description><pubDate>Tue, 30 Sep 2025 15:15:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=45426587</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=45426587</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45426587</guid></item><item><title><![CDATA[Show HN: Extractor Studio – The Fastest Way to Build PDF Extractors]]></title><description><![CDATA[
<p>Article URL: <a href="https://medium.com/@abasiri/introducing-doctlys-extractor-studio-a-faster-way-to-build-pdf-extractors-999093a0ad11">https://medium.com/@abasiri/introducing-doctlys-extractor-studio-a-faster-way-to-build-pdf-extractors-999093a0ad11</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45426585">https://news.ycombinator.com/item?id=45426585</a></p>
<p>Points: 10</p>
<p># Comments: 2</p>
]]></description><pubDate>Tue, 30 Sep 2025 15:15:10 +0000</pubDate><link>https://medium.com/@abasiri/introducing-doctlys-extractor-studio-a-faster-way-to-build-pdf-extractors-999093a0ad11</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=45426585</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45426585</guid></item><item><title><![CDATA[New comment by kapitalx in "PDF to Text, a challenging problem"]]></title><description><![CDATA[
<p>This is approximately the approach we're also taking at <a href="https://doctly.ai" rel="nofollow">https://doctly.ai</a>. Add to that a "multiple experts" approach for analyzing the image (for our 'ultra' version), and we get really good results. And we're constantly making it better.</p>
]]></description><pubDate>Tue, 13 May 2025 16:21:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=43974604</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43974604</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43974604</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: Qwen-2.5-32B is now the best open source OCR model"]]></title><description><![CDATA[
<p>To be fair, they didn't include themselves at all in the graph.</p>
]]></description><pubDate>Tue, 01 Apr 2025 22:05:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=43551815</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43551815</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43551815</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: Qwen-2.5-32B is now the best open source OCR model"]]></title><description><![CDATA[
<p>In addition, Gemini 2.5 Pro does really well with bounding boxes, but yeah, it's not open source :(</p>
]]></description><pubDate>Tue, 01 Apr 2025 21:57:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=43551762</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43551762</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43551762</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: Qwen-2.5-32B is now the best open source OCR model"]]></title><description><![CDATA[
<p>If you're limited to open source models, that's very true. But for larger models, and depending on your document needs, we're definitely seeing very high accuracy (95%-99%) for direct-to-JSON extraction (no intermediate Markdown step) with our solution at <a href="https://doctly.ai" rel="nofollow">https://doctly.ai</a>.</p>
]]></description><pubDate>Tue, 01 Apr 2025 21:57:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=43551756</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43551756</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43551756</guid></item><item><title><![CDATA[Show HN: We OCR'ed 60k pages of the JFK files with AI]]></title><description><![CDATA[
<p>Article URL: <a href="https://doctly.ai/jfk">https://doctly.ai/jfk</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43414002">https://news.ycombinator.com/item?id=43414002</a></p>
<p>Points: 11</p>
<p># Comments: 3</p>
]]></description><pubDate>Wed, 19 Mar 2025 16:17:13 +0000</pubDate><link>https://doctly.ai/jfk</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43414002</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43414002</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: OCR Benchmark Focusing on Automation"]]></title><description><![CDATA[
<p>I'll dig deeper into your code, but from scanning your post, it does look like you are addressing this. That's great.<p>If I do find anything, I'll share it with you for comments before I publish the post.</p>
]]></description><pubDate>Fri, 14 Mar 2025 23:38:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=43368479</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43368479</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43368479</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: OCR Benchmark Focusing on Automation"]]></title><description><![CDATA[
<p>Exactly. You still have to be explicit in order to remove bias, either by sorting the keys or by looking up specific keys. For arrays, I would say order still matters. For example, when you capture a list of invoice items, you should maintain order.</p>
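<p>A minimal sketch of that kind of comparison, assuming the prediction and the ground truth are both plain JSON. The helper names <code>flatten</code> and <code>field_accuracy</code> and the sample invoice are hypothetical, made up for illustration; this is not taken from any particular benchmark. Leaves are looked up by path, so object key order cannot affect the score, while array positions are part of the path, so item order still counts.</p>

```python
import json


def flatten(value, path=""):
    """Map every leaf value to a path such as 'items[0].sku' (hypothetical helper)."""
    out = {}
    if isinstance(value, dict):
        # Keys are sorted only for a deterministic traversal; scoring looks
        # leaves up by path, so the order of keys in the JSON never matters.
        for key in sorted(value):
            child = f"{path}.{key}" if path else key
            out.update(flatten(value[key], child))
    elif isinstance(value, list):
        # The index is part of the path, so array order is significant.
        for i, item in enumerate(value):
            out.update(flatten(item, f"{path}[{i}]"))
    else:
        out[path] = value
    return out


def field_accuracy(predicted, truth):
    """Fraction of ground-truth leaves that the prediction reproduces exactly."""
    want = flatten(truth)
    got = flatten(predicted)
    if not want:
        return 1.0
    hits = sum(1 for p, v in want.items() if p in got and got[p] == v)
    return hits / len(want)


# Hypothetical sample data.
truth = json.loads('{"vendor": "Acme", "items": [{"sku": "A"}, {"sku": "B"}]}')
reordered_keys = json.loads('{"items": [{"sku": "A"}, {"sku": "B"}], "vendor": "Acme"}')
reordered_items = json.loads('{"vendor": "Acme", "items": [{"sku": "B"}, {"sku": "A"}]}')

print(field_accuracy(reordered_keys, truth))   # key order is ignored
print(field_accuracy(reordered_items, truth))  # swapped invoice items are penalized
```

<p>Whether swapped array items should cost the full two fields, as they do here, is itself a scoring choice that a benchmark would need to state explicitly.</p>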
]]></description><pubDate>Fri, 14 Mar 2025 22:18:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=43367886</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43367886</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43367886</guid></item><item><title><![CDATA[New comment by kapitalx in "Show HN: OCR Benchmark Focusing on Automation"]]></title><description><![CDATA[
<p>Great list! I’ll definitely run your benchmark against Doctly.ai (our PDF-to-Markdown service), especially as we publish our workflow service, to see how we stack up.<p>One thing I’ve noticed in many benchmarks, though, is the potential for bias. I’m actually working on a post about this issue, so it’s top of mind for me. For example, in the omni benchmark, the ground truth expected a specific order for heading information, like logo, phone number, and customer details. While this data was all located near the top of the document, the exact ordering felt subjective. Should the model prioritize horizontal or vertical scanning? Since the ground truth was created by the company running the benchmark, their model naturally scored the highest for maintaining the same order as the ground truth.<p>However, this approach penalized other LLMs for not adhering to the "correct" order, even though the order itself was arguably arbitrary. This kind of bias can skew results and make it harder to evaluate models fairly. I’d love to see benchmarks that account for subjectivity or allow for multiple valid interpretations of document structure.<p>Did you run into this when looking at the benchmarks?<p>On a side note, Doctly.ai leverages multiple LLMs to evaluate documents, and runs a tournament with a judge for each page to get the best data (this is only on the Precision Ultra selection).</p>
]]></description><pubDate>Fri, 14 Mar 2025 21:16:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=43367384</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43367384</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43367384</guid></item><item><title><![CDATA[New comment by kapitalx in "Mistral OCR"]]></title><description><![CDATA[
<p>Customers are willing to pay for accuracy compared to existing solutions out there. We started out in need of an accurate solution for a RAG product we were building, but none of the solutions we tried were providing the accuracy we needed.</p>
]]></description><pubDate>Sun, 09 Mar 2025 07:59:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=43307123</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43307123</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43307123</guid></item><item><title><![CDATA[New comment by kapitalx in "Mistral OCR"]]></title><description><![CDATA[
<p>Looks to be API only for now. Documentation here: <a href="https://docs.mistral.ai/capabilities/document/" rel="nofollow">https://docs.mistral.ai/capabilities/document/</a></p>
]]></description><pubDate>Thu, 06 Mar 2025 19:50:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=43284408</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43284408</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43284408</guid></item><item><title><![CDATA[New comment by kapitalx in "Mistral OCR"]]></title><description><![CDATA[
<p>We've been getting great results with those as well. But of course there is always some chance of not getting it perfect, especially with different handwriting styles.<p>Give it a try; no credit card is needed. If you email me (ali@doctly.ai), I can give you extra free credits for testing.</p>
]]></description><pubDate>Thu, 06 Mar 2025 19:44:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=43284342</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43284342</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43284342</guid></item><item><title><![CDATA[New comment by kapitalx in "Mistral OCR"]]></title><description><![CDATA[
<p>Yes I used the API. They have examples here:<p><a href="https://docs.mistral.ai/capabilities/document/" rel="nofollow">https://docs.mistral.ai/capabilities/document/</a><p>I used base64 encoding of the image of the pdf page. The output was an object that has the markdown, and coordinates for the images:<p>[OCRPageObject(index=0, markdown='![img-0.jpeg](img-0.jpeg)', images=[OCRImageObject(id='img-0.jpeg', top_left_x=140, top_left_y=65, bottom_right_x=2136, bottom_right_y=1635, image_base64=None)], dimensions=OCRPageDimensions(dpi=200, height=1778, width=2300))] model='mistral-ocr-2503-completion' usage_info=OCRUsageInfo(pages_processed=1, doc_size_bytes=634209)</p>
]]></description><pubDate>Thu, 06 Mar 2025 19:41:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=43284314</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43284314</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43284314</guid></item><item><title><![CDATA[New comment by kapitalx in "Mistral OCR"]]></title><description><![CDATA[
<p>Great question. The language models are definitely beating the old tools. Take a look at Gemini, for example.<p>Doctly runs a tournament-style judge. It runs multiple generations across LLMs and picks the best one, which outperforms a single generation from a single model.</p>
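<p>To illustrate the general idea only: below is a minimal single-elimination sketch, not Doctly's actual implementation. The function <code>run_tournament</code>, the placeholder judge <code>longest_wins</code>, and the candidate strings are all hypothetical. In practice the judge would presumably be an LLM or a scoring model comparing two candidate outputs for the same page.</p>

```python
from typing import Callable, List


def run_tournament(candidates: List[str], judge: Callable[[str, str], str]) -> str:
    """Single elimination: pair up candidates, keep each winner, repeat."""
    if not candidates:
        raise ValueError("need at least one candidate")
    current = list(candidates)
    while len(current) > 1:
        winners = []
        # Judge neighbours pairwise: (0, 1), (2, 3), ...
        for i in range(0, len(current) - 1, 2):
            winners.append(judge(current[i], current[i + 1]))
        # With an odd count, the last candidate advances without a match.
        if len(current) % 2 == 1:
            winners.append(current[-1])
        current = winners
    return current[0]


def longest_wins(a: str, b: str) -> str:
    """Placeholder judge (assumption): prefers the longer output."""
    return a if len(a) >= len(b) else b


# Hypothetical candidate transcriptions of one page from different runs or models.
candidates = ["# Invoice", "# Invoice\nTotal: 42", "# Inv", "# Invoice\nTotal"]
best = run_tournament(candidates, longest_wins)
print(best)
```

<p>A bracket like this needs one fewer judge call than there are candidates, which is one reason to prefer it over comparing every pair.</p>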
]]></description><pubDate>Thu, 06 Mar 2025 19:30:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=43284206</link><dc:creator>kapitalx</dc:creator><comments>https://news.ycombinator.com/item?id=43284206</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43284206</guid></item></channel></rss>