<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: jeadie</title><link>https://news.ycombinator.com/user?id=jeadie</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 17 Apr 2026 13:46:17 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=jeadie" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by jeadie in "Distributed DuckDB Instance"]]></title><description><![CDATA[
<p>This is exactly what we found. Ingest rates were tough. We partitioned and ran over multiple DuckDB instances too (and wrangled the complexity).<p>We ended up building a SQLite + Vortex file alternative for our use case: <a href="https://spice.ai/blog/introducing-spice-cayenne-data-accelerator" rel="nofollow">https://spice.ai/blog/introducing-spice-cayenne-data-acceler...</a></p>
]]></description><pubDate>Tue, 14 Apr 2026 10:20:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47763661</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=47763661</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47763661</guid></item><item><title><![CDATA[New comment by jeadie in "Distributed DuckDB Instance"]]></title><description><![CDATA[
<p>You might find <a href="https://github.com/apache/datafusion" rel="nofollow">https://github.com/apache/datafusion</a> and <a href="https://github.com/datafusion-contrib/datafusion-federation" rel="nofollow">https://github.com/datafusion-contrib/datafusion-federation</a> of interest</p>
]]></description><pubDate>Tue, 14 Apr 2026 10:14:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47763616</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=47763616</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47763616</guid></item><item><title><![CDATA[New comment by jeadie in "Vector database that can index 1B vectors in 48M"]]></title><description><![CDATA[
<p>We’re building vector indexes into DataFusion for search (starting with S3 vectors).<p>Open source at <a href="https://github.com/spiceai/spiceai" rel="nofollow">https://github.com/spiceai/spiceai</a></p>
]]></description><pubDate>Fri, 12 Sep 2025 23:50:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=45228028</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=45228028</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45228028</guid></item><item><title><![CDATA[New comment by jeadie in "Airport for DuckDB"]]></title><description><![CDATA[
<p>This is one of the ideas behind using DuckDB in github.com/spiceai/spiceai</p>
]]></description><pubDate>Fri, 23 May 2025 02:15:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=44069226</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=44069226</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44069226</guid></item><item><title><![CDATA[New comment by jeadie in "Show HN: TextQuery – Query CSV, JSON, XLSX Files with SQL"]]></title><description><![CDATA[
<p>There’s also <a href="https://github.com/spiceai/spiceai">https://github.com/spiceai/spiceai</a></p>
]]></description><pubDate>Tue, 06 May 2025 01:00:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=43900898</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=43900898</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43900898</guid></item><item><title><![CDATA[New comment by jeadie in "Pinecone integrates AI inferencing with vector database"]]></title><description><![CDATA[
<p>This is a common feature now. If anything, for being so early to vector databases, Pinecone was rather late to integrating embeddings.<p>Timescale added it most recently, but yes, a bunch of others have it too: Weaviate, Spice AI, Marqo, etc.</p>
]]></description><pubDate>Wed, 04 Dec 2024 09:56:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=42316005</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=42316005</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42316005</guid></item><item><title><![CDATA[New comment by jeadie in "Pg_parquet: An extension to connect Postgres and parquet"]]></title><description><![CDATA[
<p>Why not just federate Postgres and Parquet files? That way the query planner can push down as much of the query as possible and reduce how much data has to move about.</p>
]]></description><pubDate>Fri, 18 Oct 2024 00:52:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=41875366</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=41875366</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41875366</guid></item><item><title><![CDATA[New comment by jeadie in "Pg_lakehouse: Query Any Data Lake from Postgres"]]></title><description><![CDATA[
<p>This looks functionally similar to using <a href="http://github.com/spiceai/spiceai">http://github.com/spiceai/spiceai</a> with a PostgreSQL data accelerator.</p>
]]></description><pubDate>Mon, 13 May 2024 21:45:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=40348883</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=40348883</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40348883</guid></item><item><title><![CDATA[New comment by jeadie in "Ask HN: Who is hiring? (April 2024)"]]></title><description><![CDATA[
<p>Spice AI | Senior Software Engineer | GMT+10 (e.g. Australia) through GMT-7 (e.g. Seattle/SF/LA) | Remote | Full Time<p>Spice AI provides building blocks for data and AI-driven applications by composing real-time and historical time-series data, high-performance SQL query, machine learning training and inferencing, in a single, interconnected AI backend-as-a-service.<p>We just launched github.com/spiceai/spiceai, a unified SQL query interface and portable runtime to locally materialize, accelerate, and query data tables sourced from any database, data warehouse, or data lake.<p>We're hiring experienced software engineers, ideally with Rust and/or Golang production experience. We're focused on large data and distributed systems, so experience in these is important too. More details: <a href="https://spice.ai/careers#section-open-positions" rel="nofollow">https://spice.ai/careers#section-open-positions</a></p>
]]></description><pubDate>Mon, 01 Apr 2024 22:03:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=39899935</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=39899935</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39899935</guid></item><item><title><![CDATA[New comment by jeadie in "Show HN: Spice.ai – materialize, accelerate, and query SQL data from any source"]]></title><description><![CDATA[
<p>And yes, Iceberg is very high up on our list</p>
]]></description><pubDate>Thu, 28 Mar 2024 21:11:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=39857361</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=39857361</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39857361</guid></item><item><title><![CDATA[New comment by jeadie in "Show HN: Spice.ai – materialize, accelerate, and query SQL data from any source"]]></title><description><![CDATA[
<p>Yes! It can connect to FlightSQL-compatible servers (see <a href="https://docs.spiceai.org/data-connectors/flightsql" rel="nofollow">https://docs.spiceai.org/data-connectors/flightsql</a> ) and it's also a FlightSQL-compatible server</p>
]]></description><pubDate>Thu, 28 Mar 2024 21:11:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=39857358</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=39857358</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39857358</guid></item><item><title><![CDATA[New comment by jeadie in "Show HN: Yes, another vector embeddings API"]]></title><description><![CDATA[
<p>Have you seen github.com/marqo-ai/marqo? It does all this wrapping, and you don't even need to pay for OpenAI or Pinecone</p>
]]></description><pubDate>Fri, 09 Jun 2023 07:23:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=36254465</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=36254465</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36254465</guid></item><item><title><![CDATA[New comment by jeadie in "GGML – AI at the Edge"]]></title><description><![CDATA[
<p>I'm very glad that this has some added funding. I am building a serverless API on the Cloudflare edge network using GGML as the backbone --> tryinfima.com</p>
]]></description><pubDate>Thu, 08 Jun 2023 04:36:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=36237271</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=36237271</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36237271</guid></item><item><title><![CDATA[New comment by jeadie in "Weaviate – Open-Source AI Native Vector Database"]]></title><description><![CDATA[
<p>"AI Native" catching on</p>
]]></description><pubDate>Thu, 01 Jun 2023 01:38:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=36146712</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=36146712</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36146712</guid></item><item><title><![CDATA[New comment by jeadie in "PrivateGPT"]]></title><description><![CDATA[
<p>I've tried both Chroma and Qdrant. I don't think Chroma lacks that much. It's definitely newer, but it's also a great product. I think cloud support is coming in Q3 2023</p>
]]></description><pubDate>Mon, 22 May 2023 22:06:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=36037299</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=36037299</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36037299</guid></item><item><title><![CDATA[New comment by jeadie in "Ask HN: Seeking a Vector Database for ClickHouse Users – Suggestions Appreciated"]]></title><description><![CDATA[
<p>(Not affiliated with hyperDB)</p>
]]></description><pubDate>Thu, 20 Apr 2023 09:52:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=35637997</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=35637997</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35637997</guid></item><item><title><![CDATA[New comment by jeadie in "Ask HN: Seeking a Vector Database for ClickHouse Users – Suggestions Appreciated"]]></title><description><![CDATA[
<p>I've been using <a href="https://github.com/jdagdelen/hyperDB">https://github.com/jdagdelen/hyperDB</a> and it's been really easy to use. I think ClickHouse support is on the short-term roadmap.</p>
]]></description><pubDate>Thu, 20 Apr 2023 09:51:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=35637995</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=35637995</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35637995</guid></item><item><title><![CDATA[New comment by jeadie in "After All Is Said and Indexed – Unlocking Information in Recorded Speech"]]></title><description><![CDATA[
<p>Most people, like me, who end up needing vector DBs want to use LLMs on a specific, often private dataset/use case. Typically one starts with something like unstructured JSON data, then needs to pick and manage LLMs to create embeddings, then store these and the original JSON data in a vector DB. Then the application is some variety of CRUD operations + searching over both the original data and the embeddings.<p>Chroma, Pinecone, and I guess FAISS/HNSWlib/etc. only handle vector operations. Really what I'd want, which Marqo does, is something that handles everything end to end.</p>
]]></description><pubDate>Sat, 15 Apr 2023 23:46:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=35585799</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=35585799</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35585799</guid></item><item><title><![CDATA[New comment by jeadie in "After All Is Said and Indexed – Unlocking Information in Recorded Speech"]]></title><description><![CDATA[
<p>Not a dumb question at all! Essentially, what Marqo addresses, and what this blog shows, is that there is a lot of logic and work involved in doing what you said (i.e. pass raw data into an LLM, get embeddings, store them in a vector DB, then query both the embeddings and the original data).</p>
]]></description><pubDate>Sat, 15 Apr 2023 23:40:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=35585775</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=35585775</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35585775</guid></item><item><title><![CDATA[New comment by jeadie in "After All Is Said and Indexed – Unlocking Information in Recorded Speech"]]></title><description><![CDATA[
<p>It's a great tool. Unlike vector DBs alone, Marqo helps with the full process that a lot of people end up wanting to use vector DBs for (e.g. have structured data, use LLMs to create embeddings, and perform search/CRUD on embeddings + original data).</p>
]]></description><pubDate>Sat, 15 Apr 2023 23:32:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=35585723</link><dc:creator>jeadie</dc:creator><comments>https://news.ycombinator.com/item?id=35585723</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35585723</guid></item></channel></rss>