<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: proddata</title><link>https://news.ycombinator.com/user?id=proddata</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 27 May 2026 05:37:51 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=proddata" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by proddata in "The world of PostgreSQL wire compatibility"]]></title><description><![CDATA[
<p>CrateDB DevRel here :)<p>> databases providing an abstraction through the Postgres wire protocol<p>I would not call it an abstraction if one has a full parser, analyzer, planner, and execution engine. It is just a common language ;)</p>
]]></description><pubDate>Thu, 10 Feb 2022 19:10:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=30290970</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=30290970</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=30290970</guid></item><item><title><![CDATA[New comment by proddata in "How Time Series Databases Work, and Where They Don’t"]]></title><description><![CDATA[
<p>Not being able to keep up with the incoming data. But 100-200Hz I'd consider fine for most</p>
]]></description><pubDate>Wed, 20 Oct 2021 06:50:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=28927574</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=28927574</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=28927574</guid></item><item><title><![CDATA[New comment by proddata in "How Time Series Databases Work, and Where They Don’t"]]></title><description><![CDATA[
<p>What do you mean by high-frequency data? 100Hz, 1kHz, 100kHz? For those kinds of use cases, many time-series DBs fall apart. We have customers storing multiple millions of high-frequency measurements per second in arrays.<p>I would say Postgres in itself is not very storage efficient for large amounts of data, especially if you need any sort of indexes. Timescale basically mitigates that by automatically creating new tables in the background ("chunks") and keeping individual tables small.</p>
]]></description><pubDate>Mon, 18 Oct 2021 19:09:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=28910086</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=28910086</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=28910086</guid></item><item><title><![CDATA[New comment by proddata in "How Time Series Databases Work, and Where They Don’t"]]></title><description><![CDATA[
<p>Sorry, I mixed up the numbers: it is 2GB memory (0.5GB heap). So 1:500 is correct.</p>
]]></description><pubDate>Mon, 18 Oct 2021 18:59:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=28909937</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=28909937</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=28909937</guid></item><item><title><![CDATA[New comment by proddata in "How Time Series Databases Work, and Where They Don’t"]]></title><description><![CDATA[
<p>Most CrateDB clusters run on cloud providers' hardware (Azure, AWS, Alibaba). Using EBS (GP2 or now GP3) is also quite common. Due to the indexing / storage engine, gp disks are typically sufficient, and faster disks have little to no advantage.</p>
]]></description><pubDate>Mon, 18 Oct 2021 18:56:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=28909871</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=28909871</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=28909871</guid></item><item><title><![CDATA[New comment by proddata in "How Time Series Databases Work, and Where They Don’t"]]></title><description><![CDATA[
<p>- Depends. Just inserting, indexing, storing, and simple querying can be done with little memory (i.e. a 1:500 memory-to-disk ratio, 0.5GB RAM per 1TB disk). Typical production clusters with high query load are in the 1:150 range (i.e. 64GB RAM for 10TB disk).<p>Otherwise typical general purpose hardware (Standard SSDs, 1:4 vCPU:memory ratios, ...)</p>
]]></description><pubDate>Mon, 18 Oct 2021 09:26:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=28903652</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=28903652</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=28903652</guid></item><item><title><![CDATA[New comment by proddata in "How Time Series Databases Work, and Where They Don’t"]]></title><description><![CDATA[
<p>> is an OLAP database a common go-to for longer-timescale analytics (as in [1])?<p>I would not consider Clickhouse or CrateDB "classic" OLAP DBs. I can speak for CrateDB (I work there): it would definitely be able to handle 600GB and query across it in an ad-hoc manner.<p>We have users ingesting terabytes of events per day and running aggregations across 100 terabytes.</p>
]]></description><pubDate>Mon, 18 Oct 2021 06:23:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=28902770</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=28902770</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=28902770</guid></item><item><title><![CDATA[New comment by proddata in "How Time Series Databases Work, and Where They Don’t"]]></title><description><![CDATA[
<p>That article takes various concepts from typical TSDB solutions and seemingly only looks at the bad sides. Time series data has many different forms, not every form works for every TSDB solution.<p>For the 3 caveats at the top, there are already two TS solutions that look promising (QuestDB, TimescaleDB). Often an operational analytics DB (Clickhouse, CrateDB) might also be a solution.</p>
]]></description><pubDate>Mon, 18 Oct 2021 05:08:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=28902403</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=28902403</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=28902403</guid></item><item><title><![CDATA[New comment by proddata in "OpenSearch: AWS fork of Elasticsearch and Kibana"]]></title><description><![CDATA[
<p>Sorry, but this is not true at all.<p>Some of the biggest changes within ES come from Lucene, like a _massive_ reduction in memory footprint, enabling ES use cases that were not even possible before.</p>
]]></description><pubDate>Tue, 13 Apr 2021 05:23:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=26788390</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=26788390</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=26788390</guid></item><item><title><![CDATA[New comment by proddata in "ClickHouse as an alternative to Elasticsearch for log storage and analysis"]]></title><description><![CDATA[
<p>If you are looking for an OSS ES replacement, CrateDB might also be worth a look :)<p>Basically a best-of-both-worlds combination of ES and PostgreSQL, perfect for time-series and log analytics.</p>
]]></description><pubDate>Tue, 02 Mar 2021 18:18:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=26318448</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=26318448</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=26318448</guid></item><item><title><![CDATA[New comment by proddata in "Doubling down on permissive licensing and the Elasticsearch lockdown"]]></title><description><![CDATA[
<p>Yes, we have customers using CrateDB as part of their proprietary product.<p>Also, the SSPL is so vague that we would probably have to release not only CrateDB itself (which we already do), but also everything we use for the services we provide. We also could never make any kind of deals with OEMs, etc.</p>
]]></description><pubDate>Thu, 28 Jan 2021 14:54:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=25942611</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=25942611</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=25942611</guid></item><item><title><![CDATA[New comment by proddata in "Doubling down on permissive licensing and the Elasticsearch lockdown"]]></title><description><![CDATA[
<p>The thing is that all the arguments they now bring up for the move were true in 2018 as well ...</p>
]]></description><pubDate>Thu, 28 Jan 2021 07:46:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=25939286</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=25939286</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=25939286</guid></item><item><title><![CDATA[New comment by proddata in "Doubling down on permissive licensing and the Elasticsearch lockdown"]]></title><description><![CDATA[
<p>> So, they don't run Linux, don't use glibc? That can't be all that common? (I mean sure, there's the bsds.. But still..).<p>We do run Linux :)<p>But there is a difference between building on and building with.</p>
]]></description><pubDate>Wed, 27 Jan 2021 20:32:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=25933541</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=25933541</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=25933541</guid></item><item><title><![CDATA[New comment by proddata in "Doubling down on permissive licensing and the Elasticsearch lockdown"]]></title><description><![CDATA[
<p>> This begs the question: isn't "a restrictive OSS licence" not less "fully open source" than a more permissive licence like GPL, MIT or BSD?<p>We are going to change CrateDB fully to the Apache License v2 ;) I would say that counts as a "more permissive" license.<p>> Is that really only because of some enterprises not liking GPL?<p>There are various reasons for the change. A big part is definitely also the spirit of many of our contributors. We built CrateDB on open source software and also want to make the software available as open source. Being more open had also been planned for quite some time.</p>
]]></description><pubDate>Wed, 27 Jan 2021 20:02:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=25933170</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=25933170</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=25933170</guid></item><item><title><![CDATA[New comment by proddata in "Doubling down on permissive licensing and the Elasticsearch lockdown"]]></title><description><![CDATA[
<p>> If your business model cannot survive when a critical upstream piece of your infrastructure moves to GPL, you probably have a bad business model to begin with.<p>To be clear, CrateDB started out as OSS and we decided to stay OSS. Elasticsearch used the Apache License and so did CrateDB. All in the spirit of OSS. Elastic, however, are now the ones who decided that their business model isn't viable anymore.<p>> It sounds like they are making up excuses for not wanting to fully Open Source their code<p>We do want to make it fully open source! Everything that was under a more restrictive license is going to be offered under the Apache License.</p>
]]></description><pubDate>Wed, 27 Jan 2021 19:03:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=25932339</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=25932339</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=25932339</guid></item><item><title><![CDATA[New comment by proddata in "CrateDB: Purpose-built to scale modern applications in a machine data world"]]></title><description><![CDATA[
<p>Fair point - I will review this with our marketing and get that fixed</p>
]]></description><pubDate>Sat, 28 Nov 2020 09:22:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=25236479</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=25236479</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=25236479</guid></item><item><title><![CDATA[New comment by proddata in "CrateDB: Purpose-built to scale modern applications in a machine data world"]]></title><description><![CDATA[
<p>Many reasons actually ...<p>- Scalability
CrateDB is built for horizontal scale from the ground up on top of distributed technologies. We have customers using clusters with 80+ nodes in production for many years now.<p>Timescale just released their multi-node feature in beta and they follow a different concept than we do. While Timescale uses a leader (access node) / follower (data node) model with a single point of failure, CrateDB is built on a shared-nothing architecture. Many features you would want to see in a distributed system are present in CrateDB and still missing in TS:<p>- cluster wide replication
 - automatic rebalancing
 - cluster wide backup
 - shared nothing architecture / no single point of failure<p>- Full Text Search
CrateDB is built on Lucene and parts of ES and includes search capabilities you would typically need a separate product for when using PG/TS.<p>- Distributed Query Engine
Yes, PG/TS are fast if you query "small" amounts of data (e.g. the last day's data). But if you have a distributed system, you might also want to run queries on larger data sets.<p>- Geospatial Queries
Powered by Lucene's BKD trees<p>---<p>Disclaimer:
I work for Crate.io and I also think Timescale are doing awesome stuff in many ways and give Influx the competition they deserve. I don't see us in direct competition (at least not yet), as the focus of Timescale is clearly more on smaller use cases.</p>
]]></description><pubDate>Sat, 28 Nov 2020 09:20:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=25236465</link><dc:creator>proddata</dc:creator><comments>https://news.ycombinator.com/item?id=25236465</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=25236465</guid></item></channel></rss>