Hacker News: martimchaves

New comment by martimchaves in "Ask HN: What Are You Working On? (April 2026)"

martimchaves — Mon, 13 Apr 2026 10:46:46 +0000

Hey y'all, couple of projects to show off:

https://ragbandit.com - improve the retrieval stage of your RAG systems by tuning your document processing pipeline

https://smolinvoiceagent - an agent that processes invoices; you make corrections, and the agent learns your ways

https://vendor-simple-central.streamlit.app/ - this is just a POC, but it's a system to process and extract insights from Amazon's Vendor Central data

This post is really wholesome :)

New comment by martimchaves in "Ask HN: What Are You Working On? (April 2026)"

martimchaves — Mon, 13 Apr 2026 10:26:44 +0000

This is really cool, any challenges in developing this? How did you decide on the directions that you take when writing a word?

Couldn't get the letter constellations working on my end.

Country quizzes are a weak spot of mine, loved that. Would be cool to move the globe! Also, kudos for the bus cataloging!

New comment by martimchaves in "Ask HN: What Are You Working On? (April 2026)"

martimchaves — Mon, 13 Apr 2026 09:47:15 +0000

That sounds really cool, how are you going about this? Are you training a model using your own EEGs as inputs?

New comment by martimchaves in "Ask HN: What Are You Working On? (April 2026)"

martimchaves — Mon, 13 Apr 2026 09:39:27 +0000

I love tiled words, I've actually been doing the daily puzzle for a while, and have completed over 100 of them! It's part of my morning routine :)

I'm not sure if it would fit the theme, but sometimes I end up searching for what an expression means, or where it comes from. Maybe it would be cool to have a little info box after you discover what the word is. Just an idea! Not sure if it would clutter things, and you can always search for it yourself, but it's something I've been thinking about. I still remember looking up peanut gallery and sand dollar!

Show HN: Smol Invoice Agent, invoice processor that learns from your corrections

martimchaves — Wed, 08 Apr 2026 18:04:36 +0000

Hey HN, I built a small app called Smol Invoice Agent.

You can upload invoices, have them parsed, and correct the parsed data. The corrections are saved, and the next time you upload a similar invoice, the agent may apply them automatically if they make sense.

You can provide feedback on the automatically applied corrections. Rejected corrections are immediately reverted, and you can add an explanation, which will be taken into account the next time the agent encounters a similar invoice.
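A rough sketch of how a loop like this could be modeled, for illustration only. It is not the app's actual code: the `Correction` and `CorrectionMemory` names, the cosine-similarity lookup, and the 0.9 threshold are all assumptions, and a plain similarity cutoff stands in for the agent's judgment about whether a correction "makes sense".

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Correction:
    embedding: list[float]  # embedding of the invoice the correction was made on
    field_name: str         # parsed field the user corrected
    value: str              # value the user supplied
    rejected: bool = False  # set when the user rejects an automatic application
    note: str = ""          # optional explanation attached to a rejection


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class CorrectionMemory:
    threshold: float = 0.9  # hypothetical similarity cutoff
    items: list[Correction] = field(default_factory=list)

    def save(self, correction: Correction) -> None:
        self.items.append(correction)

    def apply(self, embedding: list[float], parsed: dict[str, str]) -> dict[str, str]:
        """Return a copy of `parsed` with matching, non-rejected corrections applied."""
        result = dict(parsed)
        for c in self.items:
            if c.rejected:
                continue  # rejected corrections are never re-applied
            if cosine(c.embedding, embedding) >= self.threshold:
                result[c.field_name] = c.value
        return result


memory = CorrectionMemory()
memory.save(Correction([1.0, 0.0], "vendor", "ACME Ltd"))

similar = memory.apply([0.99, 0.05], {"vendor": "ACME", "total": "10.00"})
different = memory.apply([0.0, 1.0], {"vendor": "Other", "total": "5.00"})

# Simulate the user rejecting the automatic correction, with an explanation.
memory.items[0].rejected = True
memory.items[0].note = "Different legal entity"
after_rejection = memory.apply([0.99, 0.05], {"vendor": "ACME", "total": "10.00"})
```

In this sketch the rejection note is only stored; a real agent would presumably pass it to the model as context the next time a similar invoice shows up.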

Here's a 4 min demo showing the full feedback loop in action: https://www.youtube.com/watch?v=txaJ0OhOFVw

Export as JSON or CSV. You can also use it via the API.

Tech stack: Postgres (with pgvector), FastAPI, Claude for extraction, Celery + Redis for background processing, TypeScript with React, Voyage AI for embeddings, Mistral for OCR (with pymupdf fallback).

It's pay-as-you-go, no subscription required, just top up your account and use as needed. You can try it for free.

I'd really appreciate hearing what you think, especially if you deal with invoices regularly. Is the correction/feedback loop something that would actually save you time?

Thanks! Martim

Comments URL: https://news.ycombinator.com/item?id=47693967

Points: 1

# Comments: 0

Show HN: A tool to create and evaluate document processing pipelines for RAG

martimchaves — Fri, 27 Mar 2026 13:53:11 +0000

Hey HN, I built [ragbandit](https://ragbandit.com), a tool to help you evaluate different document processing pipelines for the retrieval stage of your RAG systems.

I was a bit overwhelmed with the different ways that you can process documents to create embeddings for RAG, so I wanted to create a tool to experiment with different OCR models, refining the OCR results, different chunking methods, and different embedding models.

You can:
- search processed documents in the playground
- evaluate the retrieval results using an LLM-as-judge (not perfect, but can be a useful signal)
- compare different datasets (using aggregate metrics or a side-by-side comparison in the playground)

You can also manually inspect the results of each query, and of each intermediate document processing result.
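To give a feel for what comparing datasets on aggregate metrics can look like, here is a minimal sketch. It assumes each query gets a judge score between 0 and 1; the scores, the `summarize` helper, and the 0.5 pass mark are made up for illustration and are not ragbandit's actual metrics or API.

```python
from statistics import mean


def summarize(scores: list[float], pass_mark: float = 0.5) -> dict[str, float]:
    """Aggregate per-query judge scores into a mean and a pass rate."""
    return {
        "mean_score": mean(scores),
        "pass_rate": sum(s >= pass_mark for s in scores) / len(scores),
    }


# Hypothetical judge scores for the same four queries under two pipelines.
pipeline_a = [1.0, 0.5, 0.0, 1.0]
pipeline_b = [1.0, 1.0, 0.5, 1.0]

summary_a = summarize(pipeline_a)
summary_b = summarize(pipeline_b)
```

Aggregates like these hide which queries regressed, which is where inspecting individual query results comes in.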

To get a better idea, check out one of the use cases: https://ragbandit.com/use-cases/optimizing-insurance-documen...

To be completely fair, I haven't added that many options for the different stages of the document processing pipeline! There are tons of features that I'd like to add, but I've already spent quite a bit of time on this, so I'd really appreciate it if you could let me know whether this is something you'd find useful or interesting. Would you use something like this?

Tech stack: Postgres (with pgvector), FastAPI, [ragbandit-core](https://github.com/MartimChaves/ragbandit-core) (the document processing core is open source), TypeScript with React, Celery for background tasks (with Redis as the broker).

It's currently a credits-based subscription with optional top-ups. You can get 1000 credits to try it out (I ask for card info for these 1000 credits as a spam filter).

Thanks, Martim

Comments URL: https://news.ycombinator.com/item?id=47542679

Points: 2

# Comments: 0

New comment by martimchaves in "Show HN: Geo Racers – Race from London to Tokyo on a single bus pass"

martimchaves — Fri, 13 Feb 2026 17:03:50 +0000

Very fun game, thanks. When you try to change the speed of the game on mobile using Firefox, it glitches and keeps opening the speed modal. Pretty much crashes the browser. Also +1 to the taxi fare not being paid in the currency of the country, but in some other currency. More jobs would be great! Appreciate the difficulty of not knowing exactly where you'll be taken, but without consistent jobs sometimes it's really punishing (or at least feels really punishing). I was thinking it might be cool to add stamina, and fancier hotels may replenish more stamina. An incentive to splurge ah.

New comment by martimchaves in "Ingesting PDFs and why Gemini 2.0 changes everything"

martimchaves — Thu, 06 Feb 2025 11:40:21 +0000

I'm guessing that human accuracy may be lower or around that value, given that handwritten notes are generally difficult to read. A better metric for document parsing might be accuracy relative to human performance (how much better the LLM performs compared to a human).
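One simple way to express that metric, as a sketch: divide the model's accuracy by a human baseline measured on the same documents. The function name and the numbers below are made up for illustration; they are not from the article or from any benchmark.

```python
def relative_accuracy(model_accuracy: float, human_accuracy: float) -> float:
    """Ratio of model accuracy to human accuracy on the same documents."""
    if human_accuracy <= 0:
        raise ValueError("human_accuracy must be positive")
    return model_accuracy / human_accuracy


# Hypothetical figures: the model reads 90% of fields correctly, a human 75%.
ratio = relative_accuracy(0.90, 0.75)  # about 1.2, roughly 20% above the human baseline
```

A ratio above 1 means the model beats the human baseline on that set, and a ratio below 1 means it falls short of it.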

New comment by martimchaves in "The FizzBuzz that did not get me the job"

martimchaves — Sat, 25 Jan 2025 10:54:30 +0000

Cool read! I loved it when you changed the numbers to base 15, I thought that was a beautiful solution.

New comment by martimchaves in "Ask HN: What's Your Morning Routine?"

martimchaves — Mon, 30 Dec 2024 16:43:50 +0000

I feel you, exact same routine here.