<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: dmoura</title><link>https://news.ycombinator.com/user?id=dmoura</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 18 Jun 2026 08:32:34 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=dmoura" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Open Dataset: Vehicle Accidents]]></title><description><![CDATA[
<p>Article URL: <a href="https://huggingface.co/datasets/nexar-ai/nexar_collision_prediction">https://huggingface.co/datasets/nexar-ai/nexar_collision_prediction</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43354796">https://news.ycombinator.com/item?id=43354796</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 13 Mar 2025 16:20:50 +0000</pubDate><link>https://huggingface.co/datasets/nexar-ai/nexar_collision_prediction</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=43354796</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43354796</guid></item><item><title><![CDATA[Nexar Dashcam Crash Prediction Challenge]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.kaggle.com/competitions/nexar-collision-prediction">https://www.kaggle.com/competitions/nexar-collision-prediction</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43293515">https://news.ycombinator.com/item?id=43293515</a></p>
<p>Points: 10</p>
<p># Comments: 4</p>
]]></description><pubDate>Fri, 07 Mar 2025 19:30:50 +0000</pubDate><link>https://www.kaggle.com/competitions/nexar-collision-prediction</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=43293515</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43293515</guid></item><item><title><![CDATA[Limitations of 'Understanding the Limitations of Mathematical Reasoning in LLMs']]></title><description><![CDATA[
<p>Article URL: <a href="https://desirivanova.com/post/gsm-symbolic/">https://desirivanova.com/post/gsm-symbolic/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42079309">https://news.ycombinator.com/item?id=42079309</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 07 Nov 2024 18:21:05 +0000</pubDate><link>https://desirivanova.com/post/gsm-symbolic/</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=42079309</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42079309</guid></item><item><title><![CDATA[New comment by dmoura in "My solopreneur story"]]></title><description><![CDATA[
<p>Thank you for sharing your learnings and for your transparency! Congrats!</p>
]]></description><pubDate>Sat, 23 Sep 2023 21:23:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=37627683</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=37627683</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37627683</guid></item><item><title><![CDATA[New comment by dmoura in "Fq: Jq for Binary Formats"]]></title><description><![CDATA[
<p>I prefer a SQL-like format. It’s not as complete, but it covers most day-to-day use cases. Take a look at <a href="https://github.com/dcmoura/spyql">https://github.com/dcmoura/spyql</a> (I am the author). Congrats on fq!</p>
]]></description><pubDate>Sat, 03 Jun 2023 15:31:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=36177454</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=36177454</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36177454</guid></item><item><title><![CDATA[The Five Dimensions of Sustainable Software Engineering]]></title><description><![CDATA[
<p>Article URL: <a href="http://luiscruz.github.io/2022/01/01/sustainable-se-intro.html">http://luiscruz.github.io/2022/01/01/sustainable-se-intro.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=33745280">https://news.ycombinator.com/item?id=33745280</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 25 Nov 2022 18:22:38 +0000</pubDate><link>http://luiscruz.github.io/2022/01/01/sustainable-se-intro.html</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33745280</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33745280</guid></item><item><title><![CDATA[Kangas: Explore Multimedia Datasets at Scale]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/comet-ml/kangas">https://github.com/comet-ml/kangas</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=33693106">https://news.ycombinator.com/item?id=33693106</a></p>
<p>Points: 9</p>
<p># Comments: 2</p>
]]></description><pubDate>Mon, 21 Nov 2022 14:51:50 +0000</pubDate><link>https://github.com/comet-ml/kangas</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33693106</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33693106</guid></item><item><title><![CDATA[Neural Geometry and Rendering ECCV2022]]></title><description><![CDATA[
<p>Article URL: <a href="https://ngr-co3d.github.io/">https://ngr-co3d.github.io/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=33592068">https://news.ycombinator.com/item?id=33592068</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 14 Nov 2022 10:10:13 +0000</pubDate><link>https://ngr-co3d.github.io/</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33592068</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33592068</guid></item><item><title><![CDATA[Reconstructing Training Data from Trained Neural Networks]]></title><description><![CDATA[
<p>Article URL: <a href="https://giladude1.github.io/reconstruction/">https://giladude1.github.io/reconstruction/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=33592049">https://news.ycombinator.com/item?id=33592049</a></p>
<p>Points: 11</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 14 Nov 2022 10:07:22 +0000</pubDate><link>https://giladude1.github.io/reconstruction/</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33592049</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33592049</guid></item><item><title><![CDATA[New comment by dmoura in "Command-line data analytics"]]></title><description><![CDATA[
<p>DuckDB is great! I love what you guys are building. 
The main gap for me is native support of JSON (lines), like you have for CSV and Parquet.</p>
]]></description><pubDate>Thu, 03 Nov 2022 14:09:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=33451240</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33451240</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33451240</guid></item><item><title><![CDATA[New comment by dmoura in "Command-line data analytics"]]></title><description><![CDATA[
<p>updated, thank you</p>
]]></description><pubDate>Thu, 03 Nov 2022 14:05:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=33451169</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33451169</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33451169</guid></item><item><title><![CDATA[New comment by dmoura in "Command-line data analytics"]]></title><description><![CDATA[
<p>Things you can do with SPyQL CLI that you can't with ClickHouse local (AFAIK, off the top of my head, not exhaustive):<p>- use Python code in your queries<p>- import Python libs (just install them with pip/conda)<p>- write your own UDFs in Python<p>- run OS commands from within the query (using os.system)<p>- have guaranteed row order (like in grep, sed, etc.)<p>And there is more, please take a look at:
<a href="https://spyql.readthedocs.io/en/latest/distinctive.html" rel="nofollow">https://spyql.readthedocs.io/en/latest/distinctive.html</a></p>
]]></description><pubDate>Thu, 03 Nov 2022 13:02:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=33450313</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33450313</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33450313</guid></item><item><title><![CDATA[New comment by dmoura in "Command-line data analytics"]]></title><description><![CDATA[
<p>Author of the benchmark and of SPyQL here.
ClickHouse is fantastic. Amazing performance. SPyQL is built on top of Python but can still be faster than jq and several other tools, as shown in the benchmark. SPyQL can handle large datasets, but ClickHouse local should always show better performance.<p>SPyQL CLI is designed to work in harmony with the shell (piping), to be very simple to use, and to leverage the Python ecosystem (you can import Python libs and use them in your queries).</p>
]]></description><pubDate>Thu, 03 Nov 2022 12:19:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=33449891</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33449891</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33449891</guid></item><item><title><![CDATA[New comment by dmoura in "JC – JSONifies the output of many CLI tools"]]></title><description><![CDATA[
<p>This is great!<p>I am the author of SPyQL [1]. By combining JC with SPyQL, you can easily query the JSON output and run Python commands on top of it from the command line :-) You can do aggregations and so forth in a much simpler and more intuitive way than with jq.<p>I just wrote a blog post [2] that illustrates it. It is more focused on CSV, but the commands would be the same if you were working with JSON.<p>[1] <a href="https://github.com/dcmoura/spyql" rel="nofollow">https://github.com/dcmoura/spyql</a>
[2] <a href="https://danielcmoura.com/blog/2022/spyql-cell-towers/" rel="nofollow">https://danielcmoura.com/blog/2022/spyql-cell-towers/</a></p>
]]></description><pubDate>Thu, 03 Nov 2022 11:07:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=33449368</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33449368</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33449368</guid></item><item><title><![CDATA[Command-line data analytics]]></title><description><![CDATA[
<p>Article URL: <a href="https://danielcmoura.com/blog/2022/spyql-cell-towers/">https://danielcmoura.com/blog/2022/spyql-cell-towers/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=33418573">https://news.ycombinator.com/item?id=33418573</a></p>
<p>Points: 103</p>
<p># Comments: 25</p>
]]></description><pubDate>Tue, 01 Nov 2022 08:52:09 +0000</pubDate><link>https://danielcmoura.com/blog/2022/spyql-cell-towers/</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33418573</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33418573</guid></item><item><title><![CDATA[SPyQL – SQL Powered by Python]]></title><description><![CDATA[
<p>Article URL: <a href="https://spyql.readthedocs.io/en/latest/index.html">https://spyql.readthedocs.io/en/latest/index.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=33400202">https://news.ycombinator.com/item?id=33400202</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 30 Oct 2022 23:35:29 +0000</pubDate><link>https://spyql.readthedocs.io/en/latest/index.html</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=33400202</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=33400202</guid></item><item><title><![CDATA[Make beautiful visualisations of large graphs online (2015)]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/dcmoura/3DHEB">https://github.com/dcmoura/3DHEB</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=31266438">https://news.ycombinator.com/item?id=31266438</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 04 May 2022 22:03:27 +0000</pubDate><link>https://github.com/dcmoura/3DHEB</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=31266438</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=31266438</guid></item><item><title><![CDATA[New comment by dmoura in "The fastest tool for querying large JSON files is written in Python (benchmark)"]]></title><description><![CDATA[
<p>Thank you all for your feedback. The benchmark was updated and the fastest tool is NOT written in Python. Here are the highlights:<p>* Added ClickHouse (written in C++) to the benchmark: I was unaware that the clickhouse-local tool would handle these tasks. ClickHouse is now the fastest (together with OctoSQL);<p>* OctoSQL (written in Go) was updated as a response to the benchmark: updates included switching to fastjson, short-circuiting LIMIT, and eagerly printing when outputting JSON and CSV. Now, OctoSQL is one of the fastest and its memory usage is stable;<p>* SPyQL (written in Python) is now third: SPyQL leverages orjson (Rust) to parse JSON, while the query engine is written in Python. When processing 1 GB of input data, SPyQL takes 4x-5x more time than the best, while still achieving up to 2x higher performance than jq (written in C);<p>* I removed Pandas from the benchmark and focused on command-line tools. I am planning a separate benchmark on Python libs where Pandas, Polars and Modin (and eventually others) will be included.<p>This benchmark is a living document.
If you are interested in receiving updates, please subscribe to the following issue: <a href="https://github.com/dcmoura/spyql/issues/72" rel="nofollow">https://github.com/dcmoura/spyql/issues/72</a><p>Thank you!</p>
]]></description><pubDate>Thu, 21 Apr 2022 16:48:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=31111863</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=31111863</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=31111863</guid></item><item><title><![CDATA[New comment by dmoura in "The fastest tool for querying large JSON files is written in Python (benchmark)"]]></title><description><![CDATA[
<p>Are you able to calculate aggregates, like an average?</p>
]]></description><pubDate>Tue, 19 Apr 2022 11:30:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=31081753</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=31081753</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=31081753</guid></item><item><title><![CDATA[New comment by dmoura in "The fastest tool for querying large JSON files is written in Python (benchmark)"]]></title><description><![CDATA[
<p>The initial idea was to focus on command-line tools... I added Pandas for comparison, as it is one of the most used libs to work with datasets. I will either remove Pandas from the equation or add Polars. By the way, I ran some benchmarks and Polars seems a bit faster than SPyQL for the aggregation challenge, but it does not scale (it loads everything into memory).</p>
]]></description><pubDate>Mon, 18 Apr 2022 09:13:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=31068728</link><dc:creator>dmoura</dc:creator><comments>https://news.ycombinator.com/item?id=31068728</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=31068728</guid></item></channel></rss>