<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ady9999</title><link>https://news.ycombinator.com/user?id=ady9999</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 18 Apr 2026 06:13:48 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ady9999" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Show HN: Unsiloed – VLMs for Document Ingestion]]></title><description><![CDATA[
<p>I'm excited to introduce Unsiloed Chunker, an open-source Python library designed for efficient document chunking in retrieval-augmented generation (RAG) applications.<p>Key Features:<p>Multi-threaded Processing: Speeds up chunking operations by processing multiple documents simultaneously.<p>Supports Multiple File Types: Handles PDF, DOCX, and PPTX formats.<p>Flexible Chunking Strategies: Offers fixed-size and page-based chunking methods.<p>Zero Dependencies: Lightweight and easy to integrate into your projects.
<p>Installation:<p><pre><code>pip install unsiloed-chunker
</code></pre>
Usage Example:<p><pre><code>from unsiloed_chunker import Chunker

chunker = Chunker(file_path="your_document.pdf")
chunks = chunker.chunk(strategy="fixed_size", chunk_size=500)
for chunk in chunks:
    print(chunk)
</code></pre>
For more details, check out the documentation.<p>I'd love to hear your feedback and suggestions!</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44272502">https://news.ycombinator.com/item?id=44272502</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 13 Jun 2025 21:36:10 +0000</pubDate><link>https://www.unsiloed.ai/</link><dc:creator>ady9999</dc:creator><comments>https://news.ycombinator.com/item?id=44272502</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44272502</guid></item><item><title><![CDATA[New comment by ady9999 in "Ingesting PDFs and why Gemini 2.0 changes everything"]]></title><description><![CDATA[
<p>We have been building smaller, more efficient VLMs for document extraction since well before this, and we are 10x faster than Unstructured and Reducto (the OCR vendors), with an accuracy of 90%.<p>P.S. You can find us at unsiloed-ai.com, or reach me at adnan.abbas@unsiloed-ai.com</p>
]]></description><pubDate>Fri, 07 Feb 2025 02:04:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=42968497</link><dc:creator>ady9999</dc:creator><comments>https://news.ycombinator.com/item?id=42968497</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42968497</guid></item></channel></rss>