<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: michaelmarkell</title><link>https://news.ycombinator.com/user?id=michaelmarkell</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 07 Apr 2026 08:37:27 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=michaelmarkell" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by michaelmarkell in "The wealth of the top 1% reaches a record $52T (2025)"]]></title><description><![CDATA[
<p>You're 100% right, thanks! Updated my comment.</p>
]]></description><pubDate>Fri, 16 Jan 2026 19:42:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=46651149</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=46651149</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46651149</guid></item><item><title><![CDATA[New comment by michaelmarkell in "The wealth of the top 1% reaches a record $52T (2025)"]]></title><description><![CDATA[
<p>That's roughly $15.3M per person in the top 1%, assuming 340 million Americans</p>
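<p>A minimal sketch of the arithmetic behind that figure (all inputs are from the comment above):

```typescript
// Back-of-envelope: $52T of wealth split across the top 1% of ~340M Americans.
const totalWealthUSD = 52e12;              // $52 trillion
const usPopulation = 340e6;                // ~340 million people
const topOnePercent = usPopulation * 0.01; // 3.4 million people
const perPersonUSD = totalWealthUSD / topOnePercent;
console.log((perPersonUSD / 1e6).toFixed(1)); // ≈ 15.3 (million USD per person)
```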
]]></description><pubDate>Fri, 16 Jan 2026 19:36:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=46651069</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=46651069</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46651069</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Claude for Excel"]]></title><description><![CDATA[
<p>Not really. Take for example:<p>item, date, price<p>abc, 01/01/2023, $30<p>cde, 02/01/2023, $40<p>... 100k rows ...<p>subtotal.        $1000<p>def, 03/01/2023, $20<p>"Hey Claude, what's the total from this file?"
> grep for headers
> "Ah, I see column 3 is the price value"
> SUM(C2:C) -> $2020
> "Great! I found your total!"<p>If you can find me an example of tech that solves this at scale on large, diverse Excel formats, then I'll concede, but I haven't found anything actually trustworthy for important data sets</p>
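<p>A toy illustration of the failure mode above (the row data is hypothetical; the point is that a naive column sum double-counts the embedded subtotal):

```typescript
// A subtotal row embedded mid-table breaks a naive SUM over the price column.
type Row = { item: string; price: number; isSubtotal?: boolean };

const rows: Row[] = [
  { item: "abc", price: 30 },
  { item: "cde", price: 40 },
  // ... imagine many more line items here ...
  { item: "subtotal", price: 1000, isSubtotal: true }, // subtotal of rows above
  { item: "def", price: 20 },
];

// What SUM(C2:C) effectively does: it includes the subtotal row.
const naiveTotal = rows.reduce((s, r) => s + r.price, 0); // double-counts

// A careful pipeline must detect and exclude subtotal rows first.
const lineItemTotal = rows
  .filter((r) => !r.isSubtotal)
  .reduce((s, r) => s + r.price, 0);
```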
]]></description><pubDate>Mon, 27 Oct 2025 20:33:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=45725932</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=45725932</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45725932</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Claude for Excel"]]></title><description><![CDATA[
<p>IMO, a real solution here has to be hybrid, not pure LLM, because these sheets can be massive and have very complicated structures. You want the LLM to identify and map column headers, while non-LLM tool calls run Excel operations like SUMIFs or VLOOKUPs. One of the most important traits in these systems is consistency under slight variations in file layout, since so much Excel work involves consolidating and reconciling reports produced quarterly or by a variety of sources, each with different reporting structures.<p>Disclosure: My company builds ingestion pipelines for large multi-tab Excel files, PDFs, and CSVs.</p>
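<p>A minimal sketch of that hybrid split (the function names and the header heuristic are illustrative stand-ins, not a real product API):

```typescript
// The LLM's only job: map messy source headers onto a canonical schema.
type HeaderMapping = Record<string, "item" | "date" | "price">;

// Hypothetical stand-in for an LLM call; a toy keyword heuristic here.
function llmMapHeaders(rawHeaders: string[]): HeaderMapping {
  const mapping: HeaderMapping = {};
  for (const h of rawHeaders) {
    const k = h.toLowerCase();
    if (k.includes("price") || k.includes("amount")) mapping[h] = "price";
    else if (k.includes("date")) mapping[h] = "date";
    else mapping[h] = "item";
  }
  return mapping;
}

// Deterministic aggregation over the mapped column — no LLM in this step.
function sumPriceColumn(
  dataRows: Record<string, string>[],
  mapping: HeaderMapping
): number {
  const priceHeader = Object.keys(mapping).find((h) => mapping[h] === "price");
  if (!priceHeader) throw new Error("no price column mapped");
  return dataRows.reduce((s, r) => s + Number(r[priceHeader] ?? 0), 0);
}
```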
]]></description><pubDate>Mon, 27 Oct 2025 18:00:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=45724275</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=45724275</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45724275</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Chess grandmaster Daniel Naroditsky has died"]]></title><description><![CDATA[
<p>Danya was like the Mr. Rogers of chess. He had a way of making you feel accepted into the chess community even if you were a beginner, and was such a clear thinker. I strive to be more like him, and am devastated by this loss.</p>
]]></description><pubDate>Mon, 20 Oct 2025 21:12:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=45649451</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=45649451</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45649451</guid></item><item><title><![CDATA[Fast UPDATEs for the ClickHouse column store]]></title><description><![CDATA[
<p>Article URL: <a href="https://clickhouse.com/blog/updates-in-clickhouse-2-sql-style-updates">https://clickhouse.com/blog/updates-in-clickhouse-2-sql-style-updates</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45018275">https://news.ycombinator.com/item?id=45018275</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 25 Aug 2025 19:58:30 +0000</pubDate><link>https://clickhouse.com/blog/updates-in-clickhouse-2-sql-style-updates</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=45018275</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45018275</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Does OLAP Need an ORM"]]></title><description><![CDATA[
<p>The way my company uses ClickHouse is that we have one giant flat table and have written our own abstraction layer on top of it based around "entities", which are functions of the data in the underlying table, potentially adding in some window functions or joins. Pretty much every query we write against ClickHouse tacks a big "GROUP BY ALL" onto the end, because we are always trying to squash down the number of rows and aggregate as aggressively as possible.<p>I imagine we're not alone in building this type of abstraction layer, and some type-safety would be very welcome there. I tried to build our system on top of Kysely (<a href="https://kysely.dev/" rel="nofollow">https://kysely.dev/</a>), but the ClickHouse extension was not far enough along to make sense for our use-case. As such, we had to build our own parser that compiles down to SQL, but there are many type-error edge cases, especially when we're joining against data from S3 that could be CSV, Parquet, etc.<p>Side note: one of the things I love most about ClickHouse is how easy it is to combine data from multiple sources other than just the source database at query time. I imagine this makes the problem of building an ORM much harder as well, since you could need to build type-checking against SQL queries to external databases, rather than to the source table itself</p>
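<p>For illustration, a tiny sketch of what such an entity layer might compile to (the <code>Entity</code> shape and names are hypothetical, not our actual system):

```typescript
// An "entity" is a function of the flat table: grouped dimensions plus
// aggregate measures, compiled to ClickHouse SQL with a trailing GROUP BY ALL.
type Entity = {
  table: string;
  dimensions: string[];             // columns to group on
  measures: Record<string, string>; // alias -> aggregate expression
};

function compileEntity(e: Entity): string {
  const measureCols = Object.entries(e.measures).map(
    ([alias, expr]) => `${expr} AS ${alias}`
  );
  const cols = [...e.dimensions, ...measureCols].join(", ");
  return `SELECT ${cols} FROM ${e.table} GROUP BY ALL`;
}

const revenueByQuarter: Entity = {
  table: "events",
  dimensions: ["artist_id", "quarter"],
  measures: { revenue: "sum(amount)", streams: "count()" },
};
```

(<code>GROUP BY ALL</code> is real ClickHouse syntax: it groups by every non-aggregated column in the SELECT list.)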
]]></description><pubDate>Sun, 17 Aug 2025 17:35:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=44933318</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=44933318</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44933318</guid></item><item><title><![CDATA[A New Postgres Block Storage Layout for Full Text Search]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.paradedb.com/blog/block_storage_part_one">https://www.paradedb.com/blog/block_storage_part_one</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44484066">https://news.ycombinator.com/item?id=44484066</a></p>
<p>Points: 21</p>
<p># Comments: 2</p>
]]></description><pubDate>Sun, 06 Jul 2025 21:08:54 +0000</pubDate><link>https://www.paradedb.com/blog/block_storage_part_one</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=44484066</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44484066</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Anthropic's Circuit Tracer"]]></title><description><![CDATA[
<p>From the Readme:<p>Given a model with pre-trained transcoders, it finds the circuit / attribution graph; i.e., it computes the direct effect that each non-zero transcoder feature, transcoder error node, and input token has on each other non-zero transcoder feature and output logit.
Given an attribution graph, it visualizes this graph and allows you to annotate these features.
Enables interventions on a model's transcoder features using the insights gained from the attribution graph; i.e. you can set features to arbitrary values, and observe how model output changes.<p>The blog post:
<a href="https://www.anthropic.com/research/open-source-circuit-tracing" rel="nofollow">https://www.anthropic.com/research/open-source-circuit-traci...</a></p>
]]></description><pubDate>Sat, 31 May 2025 15:09:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=44144780</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=44144780</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44144780</guid></item><item><title><![CDATA[Anthropic's Circuit Tracer]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/safety-research/circuit-tracer">https://github.com/safety-research/circuit-tracer</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44144779">https://news.ycombinator.com/item?id=44144779</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Sat, 31 May 2025 15:09:39 +0000</pubDate><link>https://github.com/safety-research/circuit-tracer</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=44144779</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44144779</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Ask HN: Who is hiring? (May 2025)"]]></title><description><![CDATA[
<p>Syncopate | NYC (Hybrid ~3d/week) | Full-time | Senior Full Stack Engineers / Focus on AI + Finance<p>Syncopate builds tools to help automate financial diligence and management of long-tail financial assets.<p>We've found product-market fit with ETL/analysis tools for niche financial data, starting with music rights, and we're looking to build out our capabilities across more Excel- and PDF-based workflows.<p>What we're looking for: a full-stack engineer with experience building data-heavy applications. Experience with analytics databases like ClickHouse and with data pipelining is a plus. Proficiency in TypeScript is required.<p>Big bonus points for:
1) High agency (previously a founder or built side-projects to completion)
2) Some knowledge of finance
3) Skill in Rust<p>You can reach out to me here <a href="https://www.linkedin.com/in/michael-markell-377b4221a/" rel="nofollow">https://www.linkedin.com/in/michael-markell-377b4221a/</a> or via email (michael at syncopate dot ai)<p>More about Syncopate (geared towards our music rights segment): <a href="https://syncopate.notion.site/" rel="nofollow">https://syncopate.notion.site/</a></p>
]]></description><pubDate>Thu, 01 May 2025 23:33:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=43864564</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=43864564</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43864564</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Show HN: Chonky – a neural approach for text semantic chunking"]]></title><description><![CDATA[
<p>In our use-case we have many gigabytes of PDFs that contain some qualitative data but also many pages of inline PDF tables. In an ideal world we’d be “compressing” those embedded tables into some text that says “there’s a table here with these columns; if you want to analyze it you can use this <tool>, but basically the table is talking about X; here are the relevant stats like mean, sum, cardinality.”<p>In the naive chunking approach, we would grab random sections of line items from these tables because they happen to reference text similar to the search query, but there’s no guarantee the data pulled into context is complete.</p>
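<p>A rough sketch of that “compress the table” idea (the placeholder format is invented for illustration):

```typescript
// Replace an embedded table with a short prose placeholder plus summary
// stats, so the chunker never emits raw line items into context.
function summarizeTable(columns: string[], values: number[]): string {
  const sum = values.reduce((s, v) => s + v, 0);
  const mean = sum / values.length;
  return (
    `[TABLE: columns (${columns.join(", ")}); ${values.length} rows; ` +
    `sum=${sum}; mean=${mean.toFixed(2)}. Query the table tool for row-level detail.]`
  );
}
```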
]]></description><pubDate>Sun, 13 Apr 2025 14:43:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=43673160</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=43673160</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43673160</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Show HN: Chonky – a neural approach for text semantic chunking"]]></title><description><![CDATA[
<p>It seems to me like chunking (or some higher order version of it like chunking into knowledge graphs) is the highest leverage thing someone can work on right now if trying to improve intelligence of AI systems like code completion, PDF understanding etc. I’m surprised more people aren’t working on this.</p>
]]></description><pubDate>Sun, 13 Apr 2025 13:37:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=43672718</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=43672718</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43672718</guid></item><item><title><![CDATA[New comment by michaelmarkell in "A Man Out to Prove How Dumb AI Still Is"]]></title><description><![CDATA[
<p>If I were to guess, most (adult) humans could not add two 3-digit numbers together with 100% accuracy. Maybe 99%? Computers can already do 100%, so we should probably be trying to figure out how to use language to extract the numbers and send them off to computers to do the calculations. Especially because in the real world, most numbers that matter involve more than simple 3-digit addition</p>
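<p>That division of labor, as a toy sketch (the extraction regex is illustrative; a real system would use the language model for extraction from messier text):

```typescript
// Let language handle extraction, and let the computer do the arithmetic:
// pull numerics out of free text, then sum them deterministically.
function extractNumbers(text: string): number[] {
  return (text.match(/-?\d+(?:\.\d+)?/g) ?? []).map(Number);
}

function extractAndSum(text: string): number {
  return extractNumbers(text).reduce((s, n) => s + n, 0);
}
```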
]]></description><pubDate>Fri, 04 Apr 2025 22:53:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=43588605</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=43588605</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43588605</guid></item><item><title><![CDATA[New comment by michaelmarkell in "OpenAI says it has evidence DeepSeek used its model to train competitor"]]></title><description><![CDATA[
<p>Can someone with more expertise help me understand what I'm looking at here? <a href="https://crt.sh/?id=10106356492" rel="nofollow">https://crt.sh/?id=10106356492</a><p>It looks like Deepseek had a subdomain called "openai-us1.deepseek.com". What is a legitimate use-case for hosting an openai proxy(?) on your subdomain like this?<p>Not implying anything's off here, but it's interesting to me that this OpenAI entity is one of the few subdomains they have on their site</p>
]]></description><pubDate>Thu, 30 Jan 2025 00:16:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=42873183</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=42873183</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42873183</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Dinner and Deception (2015)"]]></title><description><![CDATA[
<p>Archive link: <a href="https://web.archive.org/web/20240330143422/https://www.nytimes.com/2015/08/23/opinion/sunday/dinner-and-deception.html" rel="nofollow">https://web.archive.org/web/20240330143422/https://www.nytim...</a></p>
]]></description><pubDate>Thu, 12 Dec 2024 12:24:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=42398575</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=42398575</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42398575</guid></item><item><title><![CDATA[Dinner and Deception (2015)]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.nytimes.com/2015/08/23/opinion/sunday/dinner-and-deception.html">https://www.nytimes.com/2015/08/23/opinion/sunday/dinner-and-deception.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42398574">https://news.ycombinator.com/item?id=42398574</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Thu, 12 Dec 2024 12:24:53 +0000</pubDate><link>https://www.nytimes.com/2015/08/23/opinion/sunday/dinner-and-deception.html</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=42398574</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42398574</guid></item><item><title><![CDATA[Parsebox | Parser Combinators in the TypeScript Type System]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/sinclairzx81/parsebox">https://github.com/sinclairzx81/parsebox</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42077328">https://news.ycombinator.com/item?id=42077328</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 07 Nov 2024 15:12:57 +0000</pubDate><link>https://github.com/sinclairzx81/parsebox</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=42077328</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42077328</guid></item><item><title><![CDATA[New comment by michaelmarkell in "What happens when you make a move in lichess.org?"]]></title><description><![CDATA[
<p>Timing of moves</p>
]]></description><pubDate>Wed, 23 Oct 2024 20:56:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=41929210</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=41929210</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41929210</guid></item><item><title><![CDATA[New comment by michaelmarkell in "Man accused of using bots and AI to earn streaming revenue"]]></title><description><![CDATA[
<p>Artists do not get paid per stream on Spotify and many other DSPs. The platform sums up all of the ad revenue and divides it pro rata among all of the streamed artists, so the fraudulent streams dilute the pie for legitimate streams.</p>
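<p>The pro-rata model in miniature (the numbers are made up, just to show the dilution effect):

```typescript
// A fixed revenue pool is split by share of total streams, so bot streams
// shrink every legitimate artist's payout.
function proRataPayout(poolUSD: number, artistStreams: number, totalStreams: number): number {
  return poolUSD * (artistStreams / totalStreams);
}

const pool = 1_000_000;
const honest = proRataPayout(pool, 100_000, 1_000_000);  // $100,000
// The same artist after 1M fraudulent streams inflate the denominator:
const diluted = proRataPayout(pool, 100_000, 2_000_000); // $50,000
```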
]]></description><pubDate>Sun, 08 Sep 2024 12:32:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=41480022</link><dc:creator>michaelmarkell</dc:creator><comments>https://news.ycombinator.com/item?id=41480022</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41480022</guid></item></channel></rss>