<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: pleshkov</title><link>https://news.ycombinator.com/user?id=pleshkov</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 10 May 2026 08:42:40 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=pleshkov" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by pleshkov in "A polynomial autoencoder beats PCA on transformer embeddings"]]></title><description><![CDATA[
<p>I agree that we don't want to reconstruct the whole vector during retrieval, and that keeps poly-AE toy-like in its current state, not production ready. My main interest here is just squeezing out more recall pp in closed form, and then thinking about how to make it fast. Across these threads I've gotten good intermediate ideas that may help me bring it closer to a production form.</p>
]]></description><pubDate>Sat, 09 May 2026 23:06:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=48079159</link><dc:creator>pleshkov</dc:creator><comments>https://news.ycombinator.com/item?id=48079159</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48079159</guid></item><item><title><![CDATA[New comment by pleshkov in "A polynomial autoencoder beats PCA on transformer embeddings"]]></title><description><![CDATA[
<p>Just checked the normalization point. You were partially right: sqrt-normalization cuts the difference roughly in half. I'm updating the numbers in the post.
One interesting detail: I did a smoke test of poly-AE without whitening, and the result didn't change. I won't mention it in the post because right now I'm not sure whether it's a random effect or whether the polynomial lift really compensates for normalization.</p>
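<p>For anyone reproducing the smoke test, here is a minimal sketch of the kind of PCA-whitening step being ablated. This assumes standard PCA whitening; it is not necessarily the exact code from the post, and the function name is illustrative.</p><pre><code>import numpy as np

def whiten(X, eps=1e-8):
    # PCA whitening: center, rotate onto principal axes, and rescale
    # each axis to unit variance so the cloud becomes isotropic.
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    std = S / np.sqrt(len(X) - 1)   # per-axis std in the rotated basis
    return (Xc @ Vt.T) / (std + eps)
</code></pre>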
]]></description><pubDate>Sat, 09 May 2026 22:29:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=48078897</link><dc:creator>pleshkov</dc:creator><comments>https://news.ycombinator.com/item?id=48078897</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48078897</guid></item><item><title><![CDATA[New comment by pleshkov in "A polynomial autoencoder beats PCA on transformer embeddings"]]></title><description><![CDATA[
<p>The polynomial lift in this post originally came out of an unsuccessful experiment with hyperbolic embeddings. The idea was to embed corpora into a hyperbolic ball (anisotropic embeddings have a tree-like structure that hyperbolic space could exploit). The lift was a tool to map the hyperbolic latent back to Euclidean space for retrieval. The hyperbolic part didn't work, but the lift, evaluated standalone, kept showing real signal, and that became this post.</p>
]]></description><pubDate>Sat, 09 May 2026 10:18:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48073715</link><dc:creator>pleshkov</dc:creator><comments>https://news.ycombinator.com/item?id=48073715</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48073715</guid></item><item><title><![CDATA[New comment by pleshkov in "A polynomial autoencoder beats PCA on transformer embeddings"]]></title><description><![CDATA[
<p>Fair point — lam is technically a hyperparameter. In practice I used lam=1e-3 (the default in the code) across all four models without tuning, and the gap to PCA is robust enough that small variations don't change the conclusion. So more accurately: "one hyperparameter with a benign default" rather than "no hyperparameters" — you're right, I overstated.</p>
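<p>For reference, a minimal sketch of the closed-form fit where lam enters. The names here are illustrative rather than the post's actual code; the only assumption is a standard ridge solve for the decoder weights over the lifted features.</p><pre><code>import numpy as np

def fit_ridge_decoder(Phi, X, lam=1e-3):
    # Closed-form ridge regression: W = (Phi^T Phi + lam*I)^{-1} Phi^T X.
    # lam is the single hyperparameter discussed above; 1e-3 is the default.
    D = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(D), Phi.T @ X)
</code></pre>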
]]></description><pubDate>Fri, 08 May 2026 15:33:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=48064610</link><dc:creator>pleshkov</dc:creator><comments>https://news.ycombinator.com/item?id=48064610</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48064610</guid></item><item><title><![CDATA[New comment by pleshkov in "A polynomial autoencoder beats PCA on transformer embeddings"]]></title><description><![CDATA[
<p>Good catch; this is the obvious ablation I should have included. I'll re-run with per-axis normalized PCA as a separate baseline and post numbers in this thread tomorrow.
My prior: I expect some of the gap to come from normalization, but not all of it. The no-improvement results on isotropic datasets (§4) suggest there's signal the polynomial cross-terms catch that a linear projection structurally can't. But that's a prediction; let me actually run it.</p>
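<p>For concreteness, a sketch of what I mean by that baseline, assuming per-axis normalization means standardizing each dimension before the PCA fit. The function name is illustrative.</p><pre><code>import numpy as np
from sklearn.decomposition import PCA

def per_axis_normalized_pca(X, k):
    # Standardize each axis (zero mean, unit variance), then fit PCA.
    # If this closes the gap, the win was normalization, not the lift.
    mu = X.mean(axis=0)
    sigma = X.std(axis=0) + 1e-12
    return PCA(n_components=k).fit((X - mu) / sigma), mu, sigma
</code></pre>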
]]></description><pubDate>Fri, 08 May 2026 15:23:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48064486</link><dc:creator>pleshkov</dc:creator><comments>https://news.ycombinator.com/item?id=48064486</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48064486</guid></item><item><title><![CDATA[New comment by pleshkov in "A polynomial autoencoder beats PCA on transformer embeddings"]]></title><description><![CDATA[
<p>Author here. Fair characterization, and a fair critique of the geometric story.
A few clarifications. I don't claim {x_i, x_i·x_j} is the right lift specifically — the post itself shows datasets where the quadratic decoder gives essentially no improvement over PCA. The contribution is empirical: "second-order is the simplest nonlinear decoder you can fit in closed form, and on anisotropic embeddings it picks up real signal that linear decoders miss."
Whether degree 3 would help further is an open question. The feature count blows up fast: at d=100 that's roughly 175K features, and the Ridge solve at that scale starts memorizing the corpus rather than generalizing (§7 in the post discusses this trap already at d=256). So degree 2 is partly a choice, partly a practical ceiling for the closed-form route.</p>
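<p>The blow-up is simple counting (stars and bars). A quick sketch, with the caveat that it counts the full lift with all cross-terms, which comes out slightly above the ~175K figure I quoted:</p><pre><code>from math import comb

def lift_size(d, degree):
    # Monomials of exact degree k in d variables: C(d + k - 1, k).
    # Sum over k = 1..degree, no constant term.
    return sum(comb(d + k - 1, k) for k in range(1, degree + 1))

print(lift_size(100, 2))  # 5150   -> the degree-2 lift {x_i, x_i*x_j}
print(lift_size(100, 3))  # 176850 -> roughly the 175K quoted above
</code></pre>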
]]></description><pubDate>Fri, 08 May 2026 15:16:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48064372</link><dc:creator>pleshkov</dc:creator><comments>https://news.ycombinator.com/item?id=48064372</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48064372</guid></item><item><title><![CDATA[New comment by pleshkov in "A polynomial autoencoder beats PCA on transformer embeddings"]]></title><description><![CDATA[
<p>Author here — questions and pushback both welcome.</p>
]]></description><pubDate>Tue, 05 May 2026 11:32:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=48021033</link><dc:creator>pleshkov</dc:creator><comments>https://news.ycombinator.com/item?id=48021033</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48021033</guid></item></channel></rss>