Hacker News: mingtianzhang

ClawdReview – OpenReview for AI Agents

mingtianzhang — Sat, 14 Feb 2026 16:57:42 +0000

Agents can review the paper on arXiv, and humans can like or dislike agents' reviews. There are also ranking lists of the most popular papers and agents. Please visit: https://clawdreview.ai/

Comments URL: https://news.ycombinator.com/item?id=47016081

Points: 5

# Comments: 0

Show HN: ClawdReview – OpenReview for AI Agents

mingtianzhang — Sat, 14 Feb 2026 13:03:13 +0000

Agents can review the paper on arXiv, and humans can like or dislike agents' reviews. There are also ranking lists of the most popular papers and agents. Please visit: https://clawdreview.ai/

Comments URL: https://news.ycombinator.com/item?id=47014232

Points: 3

# Comments: 0

New comment by mingtianzhang in "[dead]"

mingtianzhang — Fri, 31 Oct 2025 15:13:27 +0000

VLM can already process both the document images and the query to produce an answer directly. Do we still need the intermediate OCR step?

New comment by mingtianzhang in "Do we still need OCR? An implementation of a pure vision-based agent"

mingtianzhang — Wed, 29 Oct 2025 06:28:09 +0000

We discuss the limitations of the classic OCR pipeline and provide a pure vision-based RAG system for document analysis (https://github.com/VectifyAI/PageIndex/blob/main/cookbook/vi...)

Any feedback is welcome!

Do we still need OCR? An implementation of a pure vision-based agent

mingtianzhang — Wed, 29 Oct 2025 06:28:09 +0000

Article URL: https://pageindex.ai/blog/do-we-need-ocr

Comments URL: https://news.ycombinator.com/item?id=45743316

Points: 7

# Comments: 1

New comment by mingtianzhang in "Should LLMs just treat text content as an image?"

mingtianzhang — Mon, 27 Oct 2025 15:32:25 +0000

We actually don't need OCR: https://pageindex.ai/blog/do-we-need-ocr

New comment by mingtianzhang in "Do We Still Need OCR?"

mingtianzhang — Mon, 27 Oct 2025 15:01:09 +0000

This blog examines the inherent limitations of the current OCR pipeline in the context of document question-answering systems from an information-theoretic perspective and discusses why a direct, vision-based approach can be more effective. It also provides a practical implementation of a vision-based question-answering system for long documents.

Do We Still Need OCR?

mingtianzhang — Mon, 27 Oct 2025 15:01:09 +0000

Article URL: https://pageindex.ai/blog/do-we-need-ocr

Comments URL: https://news.ycombinator.com/item?id=45721772

Points: 4

# Comments: 2

New comment by mingtianzhang in "Reasoning-based RAG for long document question answering"

mingtianzhang — Fri, 24 Oct 2025 14:05:59 +0000

PageIndex Chat is the world's first human-like long-document AI analyst. You can upload entire books, research papers, or hundred-page reports and chat with them without context limits, all in the browser. Unlike traditional RAG or "chat-with-your-doc" tools that rely on vector similarity search, PageIndex builds a hierarchical tree index of your document (like a table of contents), and then reasons over this index to retrieve and interpret relevant sections. It doesn’t search by keywords or embeddings — it reads, understands, and reasons through the document like a human expert.

What makes it different:

- Reasoning-based retrieval: Understands structure, logic, and meaning, not just semantic similarity.

- Page-level references: Every answer includes precise citations for easy verification.

- Cross-section reasoning: Connects information across sections and appendices to find true answers.

- Human-in-the-loop: You can guide, refine, and verify its reasoning.

- Multi-document comparison: Analyze and contrast multiple reports at once.

Reasoning-based RAG for long document question answering

mingtianzhang — Fri, 24 Oct 2025 14:05:59 +0000

Article URL: https://pageindex.ai/blog/pageindex-chat

Comments URL: https://news.ycombinator.com/item?id=45694799

Points: 1

# Comments: 1

New comment by mingtianzhang in "PageIndex Chat – Human-Like Long Document AI Analyst"

mingtianzhang — Wed, 22 Oct 2025 09:02:52 +0000

Unlike traditional RAG or "chat-with-your-doc" tools that rely on vector similarity search, PageIndex builds a hierarchical tree index of your document (like a table of contents), and then reasons over this index to retrieve and interpret relevant sections. It doesn’t search by keywords or embeddings — it reads, understands, and reasons through the document like a human expert.

What makes it different:

- Reasoning-based retrieval – Understands structure, logic, and meaning, not just semantic similarity. - Page-level references – Every answer includes precise citations for easy verification. - Cross-section reasoning – Connects information across sections and appendices to find true answers. - Human-in-the-loop – You can guide, refine, and verify its reasoning. - Multi-document comparison – Analyze and contrast multiple reports at once.

PageIndex Chat – Human-Like Long Document AI Analyst

mingtianzhang — Wed, 22 Oct 2025 09:02:52 +0000

Article URL: https://pageindex.ai/blog/pageindex-chat

Comments URL: https://news.ycombinator.com/item?id=45666487

Points: 6

# Comments: 1

New comment by mingtianzhang in "PageIndex Chat – Human-Like Long Document AI Analyst"

mingtianzhang — Wed, 22 Oct 2025 00:35:17 +0000

PageIndex Chat is the world's first human-like long-document AI analyst. You can pload entire books, research papers, or hundred-page reports and chat with them without context limits, all in the browser.

What makes it different:

PageIndex Chat – Human-Like Long Document AI Analyst

mingtianzhang — Wed, 22 Oct 2025 00:35:17 +0000

Article URL: https://pageindex.ai/blog/pageindex-chat

Comments URL: https://news.ycombinator.com/item?id=45663609

Points: 4

# Comments: 1

Show HN: In-Context Index for In-Context Retrieval

mingtianzhang — Thu, 09 Oct 2025 16:23:20 +0000

RAG pipelines have become bloated: embeddings, vector DBs, rerankers, and ad-hoc pipelines everywhere.

Projects like Claude Code showed a simpler path: In-Context Retrieval — letting the LLM reason directly over context for retrieval instead of outsourcing search to external infrastructure.

PageIndex takes that one step further with In-Context Indexing.

If retrieval happens in-context, the index should live there too.

Each document is transformed into a hierarchical, human-readable tree structure (like a table-of-contents tree index) inside the model's context window.

The LLM reads the structure, identifies relevant branches, opens them, and reasons through for retrieval — no embeddings, no chunking, no opaque vector indexes the model can't interpret.

Retrieval and indexing, both inside the model.

Comments URL: https://news.ycombinator.com/item?id=45529851

Points: 5

# Comments: 0