Hacker News: kapitalx

Show HN: I Built Grid View for Switchboard – Claude Code CLI Manager

kapitalx — Sat, 21 Mar 2026 23:38:25 +0000

Article URL: https://github.com/doctly/switchboard/releases/tag/v0.0.16

Comments URL: https://news.ycombinator.com/item?id=47472719

Points: 2

# Comments: 0

New comment by kapitalx in "Show HN: A desktop app for managing Claude Code sessions"

kapitalx — Fri, 13 Mar 2026 23:17:37 +0000

Fixed some bugs and much more stable now.

https://github.com/doctly/switchboard/releases/tag/v0.0.8

New comment by kapitalx in "Show HN: A desktop app for managing Claude Code sessions"

kapitalx — Fri, 13 Mar 2026 15:39:58 +0000

Thank you. Yeah, I looked at a few; another one is air.dev. The problem is that they are recreating the actual coding pane. Switchboard runs the terminal directly.

Show HN: A desktop app for managing Claude Code sessions

kapitalx — Thu, 12 Mar 2026 14:59:26 +0000

Article URL: https://github.com/doctly/switchboard

Comments URL: https://news.ycombinator.com/item?id=47351580

Points: 5

# Comments: 3

New comment by kapitalx in "Ask HN: What are you working on? (September 2025)"

kapitalx — Tue, 30 Sep 2025 15:42:41 +0000

https://doctly.ai

We're building Doctly.ai - PDF Extraction with AI.

We started out with document conversions to Markdown but quickly realized that most use cases were for JSON conversion. We recently launched our "Extractor Studio" where you can have AI analyze a few sample variations of your documents and come up with a schema for you and publish it to an API endpoint.

We've built a technique on top of AI models that dramatically improves run to run consistency of JSON output.

Check out the blog post here: https://medium.com/@abasiri/introducing-doctlys-extractor-st...

New comment by kapitalx in "Show HN: Extractor Studio – The Fastest Way to Build PDF Extractors"

kapitalx — Tue, 30 Sep 2025 15:15:28 +0000

Check it out at https://doctly.ai

Show HN: Extractor Studio – The Fastest Way to Build PDF Extractors

kapitalx — Tue, 30 Sep 2025 15:15:10 +0000

Article URL: https://medium.com/@abasiri/introducing-doctlys-extractor-studio-a-faster-way-to-build-pdf-extractors-999093a0ad11

Comments URL: https://news.ycombinator.com/item?id=45426585

Points: 10

# Comments: 2

New comment by kapitalx in "PDF to Text, a challenging problem"

kapitalx — Tue, 13 May 2025 16:21:00 +0000

This is approximately the approach we're also taking at https://doctly.ai. Add to that a "multiple experts" approach for analyzing the image (for our 'ultra' version), and we get really good results. And we're constantly making it better.

New comment by kapitalx in "Show HN: Qwen-2.5-32B is now the best open source OCR model"

kapitalx — Tue, 01 Apr 2025 22:05:13 +0000

To be fair, they didn't include themselves at all in the graph.

New comment by kapitalx in "Show HN: Qwen-2.5-32B is now the best open source OCR model"

kapitalx — Tue, 01 Apr 2025 21:57:56 +0000

In addition, Gemini Pro 2.5 does really well with bounding boxes, but yeah, not open source :(

New comment by kapitalx in "Show HN: Qwen-2.5-32B is now the best open source OCR model"

kapitalx — Tue, 01 Apr 2025 21:57:17 +0000

If you're limited to open source models, that's very true. But for larger models, and depending on your document needs, we're definitely seeing very high accuracy (95%-99%) for direct-to-JSON extraction (no intermediate Markdown step) with our solution at https://doctly.ai.

Show HN: We OCR'ed 60k pages of the JFK files with AI

kapitalx — Wed, 19 Mar 2025 16:17:13 +0000

Article URL: https://doctly.ai/jfk

Comments URL: https://news.ycombinator.com/item?id=43414002

Points: 11

# Comments: 3

New comment by kapitalx in "Show HN: OCR Benchmark Focusing on Automation"

kapitalx — Fri, 14 Mar 2025 23:38:53 +0000

I'll dig deeper into your code, but from scanning your post, it does look like you are addressing this. That's great.

If I do find anything, I'll share with you for comments before I publish the post.

New comment by kapitalx in "Show HN: OCR Benchmark Focusing on Automation"

kapitalx — Fri, 14 Mar 2025 22:18:38 +0000

Exactly. You still have to be explicit in order to remove bias. Either by sorting the keys, or looking up specific keys. For arrays, I would say order still matters. For example when you capture a list of invoice items, you should maintain order.
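A minimal sketch of that comparison rule, assuming the extraction and the ground truth are both plain JSON-like Python data. The helper names `canonical` and `same_extraction` and the sample invoice are made up for illustration; they are not taken from any benchmark's code.

```python
import json


def canonical(value):
    """Serialize JSON-like data with object keys sorted.

    Key order inside objects is normalized away; array order is preserved.
    """
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def same_extraction(expected, actual):
    """True when the two values match, ignoring key order but not array order."""
    return canonical(expected) == canonical(actual)


# Hypothetical ground truth for an invoice.
truth = {
    "vendor": "Acme",
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Same data, keys emitted in a different order: should still match.
reordered_keys = {
    "items": [{"qty": 2, "sku": "A1"}, {"qty": 1, "sku": "B7"}],
    "vendor": "Acme",
}

# Same keys, but invoice items swapped: should not match.
reordered_items = {
    "vendor": "Acme",
    "items": [{"sku": "B7", "qty": 1}, {"sku": "A1", "qty": 2}],
}

print(same_extraction(truth, reordered_keys))
print(same_extraction(truth, reordered_items))
```

Looking up specific keys instead of sorting would work just as well; sorting is simply the shortest way to make the comparison explicit.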

New comment by kapitalx in "Show HN: OCR Benchmark Focusing on Automation"

kapitalx — Fri, 14 Mar 2025 21:16:46 +0000

Great list! I’ll definitely run your benchmark against Doctly.ai (our PDF-to-Markdown service), especially as we publish our workflow service, to see how we stack up.

One thing I’ve noticed in many benchmarks, though, is the potential for bias. I’m actually working on a post about this issue, so it’s top of mind for me. For example, in the omni benchmark, the ground truth expected a specific order for heading information—like logo, phone number, and customer details. While this data was all located near the top of the document, the exact ordering felt subjective. Should the model prioritize horizontal or vertical scanning? Since the ground truth was created by the company running the benchmark, their model naturally scored the highest for maintaining the same order as the ground-truth.

However, this approach penalized other LLMs for not adhering to the "correct" order, even though the order itself was arguably arbitrary. This kind of bias can skew results and make it harder to evaluate models fairly. I’d love to see benchmarks that account for subjectivity or allow for multiple valid interpretations of document structure.

Did you run into this when looking at the benchmarks?

On a side note, Doctly.ai leverages multiple LLMs to evaluate documents, and runs a tournament with a judge for each page to get the best data (this is only on the Precision Ultra selection).

New comment by kapitalx in "Mistral OCR"

kapitalx — Sun, 09 Mar 2025 07:59:26 +0000

Customers are willing to pay for accuracy compared to existing solutions out there. We started out in need of an accurate solution for a RAG product we were building, but none of the solutions we tried were providing the accuracy we needed.

New comment by kapitalx in "Mistral OCR"

kapitalx — Thu, 06 Mar 2025 19:50:40 +0000

Looks to be API only for now. Documentation here: https://docs.mistral.ai/capabilities/document/

New comment by kapitalx in "Mistral OCR"

kapitalx — Thu, 06 Mar 2025 19:44:23 +0000

We've been getting great results with those as well. But of course there is always some chance of not getting it perfect, especially with different handwriting.

Give it a try; no credit card is needed. If you email me (ali@doctly.ai), I can give you extra free credits for testing.

New comment by kapitalx in "Mistral OCR"

kapitalx — Thu, 06 Mar 2025 19:41:05 +0000

Yes I used the API. They have examples here:

https://docs.mistral.ai/capabilities/document/

I used base64 encoding of the image of the pdf page. The output was an object that has the markdown, and coordinates for the images:

[OCRPageObject(index=0, markdown='![img-0.jpeg](img-0.jpeg)', images=[OCRImageObject(id='img-0.jpeg', top_left_x=140, top_left_y=65, bottom_right_x=2136, bottom_right_y=1635, image_base64=None)], dimensions=OCRPageDimensions(dpi=200, height=1778, width=2300))] model='mistral-ocr-2503-completion' usage_info=OCRUsageInfo(pages_processed=1, doc_size_bytes=634209)
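One way to read that output is to check how much of the page the single returned image covers. The sketch below copies the coordinates and page dimensions from the object above into a small `Box` dataclass; `Box` and `coverage` are hypothetical names for illustration, not part of the Mistral client.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Pixel coordinates of an image region (hypothetical helper type)."""
    top_left_x: int
    top_left_y: int
    bottom_right_x: int
    bottom_right_y: int

    @property
    def width(self) -> int:
        return self.bottom_right_x - self.top_left_x

    @property
    def height(self) -> int:
        return self.bottom_right_y - self.top_left_y


def coverage(box: Box, page_width: int, page_height: int) -> float:
    """Fraction of the page area that the box occupies."""
    return (box.width * box.height) / (page_width * page_height)


# Values copied from the OCRImageObject and OCRPageDimensions shown above.
img = Box(top_left_x=140, top_left_y=65, bottom_right_x=2136, bottom_right_y=1635)
ratio = coverage(img, page_width=2300, page_height=1778)

print(f"image covers {ratio:.1%} of the page")
```

A large ratio combined with markdown that contains only the image reference means the page came back as a picture rather than as extracted text.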

New comment by kapitalx in "Mistral OCR"

kapitalx — Thu, 06 Mar 2025 19:30:06 +0000

Great question. The language models are definitely beating the old tools. Take a look at Gemini for example.

Doctly runs a tournament-style judge. It runs multiple generations across LLMs and picks the best one, which outperforms a single generation from a single model.
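A rough sketch of what a tournament over candidate generations could look like. The comment does not say how Doctly's judge or bracket actually works, so everything here is an assumption: a single-elimination bracket, a hypothetical `run_tournament` helper, and a toy `longest_wins` judge standing in for an LLM judge.

```python
from typing import Callable, List, Sequence

Judge = Callable[[str, str], str]


def run_tournament(candidates: Sequence[str], judge: Judge) -> str:
    """Single-elimination bracket: judge(a, b) returns the better candidate."""
    if not candidates:
        raise ValueError("need at least one candidate")
    current: List[str] = list(candidates)
    while len(current) > 1:
        winners: List[str] = []
        # Pair off neighbours and keep the winner of each pair.
        for i in range(0, len(current) - 1, 2):
            winners.append(judge(current[i], current[i + 1]))
        # An unpaired last candidate gets a bye into the next round.
        if len(current) % 2 == 1:
            winners.append(current[-1])
        current = winners
    return current[0]


def longest_wins(a: str, b: str) -> str:
    """Toy judge (assumption): prefer the longer output. A real judge would be a model call."""
    return a if len(a) >= len(b) else b


# Hypothetical generations for one page from different models or runs.
generations = ["# Invoice", "# Invoice\n\nTotal: $40", "# Inv"]
best = run_tournament(generations, longest_wins)
print(best)
```

Running this per page keeps the judge's job small: it only ever compares two candidates for the same page at a time.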