<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: parsakhaz</title><link>https://news.ycombinator.com/user?id=parsakhaz</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 11:38:44 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=parsakhaz" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by parsakhaz in "Why LLMs still have problems with OCR"]]></title><description><![CDATA[
<p>Yup, Moondream is great for this use case! You can use it locally with the quickstart: <a href="https://docs.moondream.ai/" rel="nofollow">https://docs.moondream.ai/</a><p>It is a 2B vision-language model that runs anywhere and can detect objects, point, answer queries, and more.</p>
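<p>For anyone who wants a concrete starting point, here is a rough local sketch. The checkpoint id and the query/detect/point method names below may not match the current release exactly, so check the quickstart for the exact interface:</p>
<pre><code># Rough local sketch; checkpoint id and method names are assumptions, verify against the quickstart.
from PIL import Image
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2",
    trust_remote_code=True,  # the checkpoint ships its own inference code
)

image = Image.open("photo.jpg")

# Visual question answering
print(model.query(image, "What is in this image?")["answer"])

# Open-vocabulary object detection (returns bounding boxes)
print(model.detect(image, "person")["objects"])

# Pointing: approximate centers of the named object
print(model.point(image, "person")["points"])
</code></pre>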
]]></description><pubDate>Mon, 10 Feb 2025 23:37:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=43006728</link><dc:creator>parsakhaz</dc:creator><comments>https://news.ycombinator.com/item?id=43006728</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43006728</guid></item><item><title><![CDATA[New comment by parsakhaz in "Coping with dumb LLMs using classic ML"]]></title><description><![CDATA[
<p>Thanks for the shout out :)</p>
]]></description><pubDate>Fri, 31 Jan 2025 21:18:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=42892164</link><dc:creator>parsakhaz</dc:creator><comments>https://news.ycombinator.com/item?id=42892164</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42892164</guid></item><item><title><![CDATA[New comment by parsakhaz in "Coping with dumb LLMs using classic ML"]]></title><description><![CDATA[
<p>We've run a couple of experiments and found that our open vision-language model, Moondream, works better than YOLOv11 in general cases. If accuracy matters most, it's worth trying the vision-language model directly. If you need real-time results, you can train YOLO models on data labeled by Moondream. We have a Hugging Face Space for video redaction, which is just object detection under the hood, and an online playground to try it out.</p>
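<p>As a rough illustration of the redaction flow (not the exact code in the Space): detect a label with Moondream, then blur each returned box. The detect() call and the normalized-box output shape below are approximate, so treat this as a sketch and check the docs for the exact format:</p>
<pre><code># Sketch: blur every detected face in a frame.
# The detect() output shape (normalized x_min/y_min/x_max/y_max) is assumed; verify in the docs.
from PIL import Image, ImageFilter
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("vikhyatk/moondream2", trust_remote_code=True)

frame = Image.open("frame.jpg")
for box in model.detect(frame, "face")["objects"]:
    left   = int(box["x_min"] * frame.width)
    top    = int(box["y_min"] * frame.height)
    right  = int(box["x_max"] * frame.width)
    bottom = int(box["y_max"] * frame.height)
    blurred = frame.crop((left, top, right, bottom)).filter(ImageFilter.GaussianBlur(12))
    frame.paste(blurred, (left, top))
frame.save("frame_redacted.jpg")
</code></pre>
<p>For real-time needs, the same detections can be exported as labels to train a YOLO model, as mentioned above.</p>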
]]></description><pubDate>Fri, 31 Jan 2025 21:18:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=42892153</link><dc:creator>parsakhaz</dc:creator><comments>https://news.ycombinator.com/item?id=42892153</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42892153</guid></item><item><title><![CDATA[New comment by parsakhaz in "Guide: How to use Moondream's free OpenAI compatible endpoint (5k queries/day)"]]></title><description><![CDATA[
<p>Send a message into our discord, and we will get it bumped up for you: <a href="https://discord.com/invite/tRUdpjDQfH" rel="nofollow">https://discord.com/invite/tRUdpjDQfH</a></p>
]]></description><pubDate>Wed, 15 Jan 2025 21:22:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=42717216</link><dc:creator>parsakhaz</dc:creator><comments>https://news.ycombinator.com/item?id=42717216</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42717216</guid></item><item><title><![CDATA[New comment by parsakhaz in "Guide: How to use Moondream's free OpenAI compatible endpoint (5k queries/day)"]]></title><description><![CDATA[
<p>We just rolled out OpenAI compatibility for Moondream 2B, which means you can now seamlessly switch from OpenAI's Vision API to Moondream with minimal changes to your existing code.<p>Our docs, like Moondream, are open source. If you find any issues with the page, or want to suggest a change, click "Edit this page" and you'll be routed to the docs' GitHub repository.<p>The best part: our API is free for up to 5k requests per day. There is zero friction in getting started with Moondream. We are also very active on Discord, so if you get stuck (or have a special request), let us know and we will be quick to help.<p>Looking forward to seeing all the cool stuff that people build!<p>---<p>What is Moondream?<p>Moondream 2B is a lightweight vision-language model optimized for visual understanding tasks. It excels at answering questions about images, describing scenes, identifying objects and attributes, and basic text recognition. While more compact than larger models, it provides efficient and accurate responses for straightforward visual question answering.<p>As a 2B-parameter model, it has some limitations to keep in mind: descriptions may be less detailed than those from larger models, complex multi-step reasoning can be challenging, and it may struggle with edge cases like very low-quality images or advanced spatial understanding. For best results, ask direct questions about image content rather than complex reasoning chains.</p>
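<p>To show how small the switch is, here is a sketch using the standard OpenAI Python SDK. The base URL and model name below are placeholders; the docs page has the exact values:</p>
<pre><code># Sketch: point the standard OpenAI client at Moondream instead of api.openai.com.
# base_url and model are placeholders; see https://docs.moondream.ai/openai-compatibility for the real values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.moondream.ai/v1",  # placeholder endpoint
    api_key="your-moondream-api-key",
)

response = client.chat.completions.create(
    model="moondream-2B",  # placeholder model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
</code></pre>
<p>Everything else stays the standard OpenAI chat-completions format with image_url content parts; only the base URL, API key, and model name change.</p>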
]]></description><pubDate>Wed, 15 Jan 2025 19:35:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=42715857</link><dc:creator>parsakhaz</dc:creator><comments>https://news.ycombinator.com/item?id=42715857</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42715857</guid></item><item><title><![CDATA[Guide: How to use Moondream's free OpenAI compatible endpoint (5k queries/day)]]></title><description><![CDATA[
<p>Article URL: <a href="https://docs.moondream.ai/openai-compatibility">https://docs.moondream.ai/openai-compatibility</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42715855">https://news.ycombinator.com/item?id=42715855</a></p>
<p>Points: 3</p>
<p># Comments: 3</p>
]]></description><pubDate>Wed, 15 Jan 2025 19:35:06 +0000</pubDate><link>https://docs.moondream.ai/openai-compatibility</link><dc:creator>parsakhaz</dc:creator><comments>https://news.ycombinator.com/item?id=42715855</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42715855</guid></item></channel></rss>