<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: abhgh</title><link>https://news.ycombinator.com/user?id=abhgh</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 23 Apr 2026 18:28:27 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=abhgh" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by abhgh in "Qwen3.5 Fine-Tuning Guide"]]></title><description><![CDATA[
<p>They are great for specialized use-cases where: (a) the problem is not hard enough that you need reasoning, (b) it is not diverse enough that you need a world model, (c) you want cheap inference (and you can make it happen hardware-wise), and (d) you either have enough data or a workflow that accumulates data (with enough data, fine-tuning can sometimes beat a premier model while ensuring low latency - ofc, assuming (a) and (b) apply).<p>I make it sound like a rare perfect storm needs to exist to justify fine-tuning, but these circumstances are not uncommon - to an extent, (a), (c) and (d) were already prerequisites for deploying traditional ML systems.</p>
]]></description><pubDate>Wed, 04 Mar 2026 15:52:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47249298</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=47249298</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47249298</guid></item><item><title><![CDATA[New comment by abhgh in "Ask HN: What are you working on? (February 2026)"]]></title><description><![CDATA[
<p>I notice you mentioned dspy - do you also support prompt optimization?</p>
]]></description><pubDate>Mon, 09 Feb 2026 12:22:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=46944469</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46944469</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46944469</guid></item><item><title><![CDATA[New comment by abhgh in "I miss thinking hard"]]></title><description><![CDATA[
<p>This is an amazing quote - thank you. This is also my argument for why I can't use LLMs for writing (proofreading is OK) - what I write is not produced as a side-effect of thinking through a problem, writing <i>is</i> how I think through a problem.</p>
]]></description><pubDate>Wed, 04 Feb 2026 07:13:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=46882521</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46882521</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46882521</guid></item><item><title><![CDATA[The Gumbel-Max Trick]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.quipu-strands.com/gumbel">https://blog.quipu-strands.com/gumbel</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46874077">https://news.ycombinator.com/item?id=46874077</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 03 Feb 2026 17:32:03 +0000</pubDate><link>https://blog.quipu-strands.com/gumbel</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46874077</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46874077</guid></item><item><title><![CDATA[New comment by abhgh in "Ask HN: When has a "dumb" solution beaten a sophisticated one for you?"]]></title><description><![CDATA[
<p>I once modeled user journeys on a website using fancy ML models that honored sequence information, i.e., the order of page visits, only to be beaten by a bag-of-words decision-tree model (i.e., each page URL becomes a vector dimension, but order is lost), which was supposed to be my <i>baseline</i>.<p>What I had overlooked was that journeys on that particular website were fairly constrained by design, i.e., if you landed on the home page, did a bunch of stuff, and put product X in the cart, there was pretty much one sequence of pages (or in the worst case, a small handful) that you'd traverse for the journey. This meant the bag-of-words (BoW) representation was more or less as expressive as the sequence model; certain pages showing up in the BoW vector corresponded (mostly) to a single sequence. But the DT could learn faster with less data.</p>
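<p>To make the representation concrete, here is a minimal sketch (the page URLs are made up for illustration) of the order-blind encoding a decision tree would consume:</p>

```python
from collections import Counter

def bow_vector(journey, vocab):
    """Encode a journey (an ordered list of page URLs) as a count
    vector over a fixed page vocabulary; visit order is discarded."""
    counts = Counter(journey)
    return [counts[page] for page in vocab]

# Hypothetical page URLs, purely for illustration.
vocab = ["/home", "/search", "/product/X", "/cart", "/checkout"]

# Two journeys visiting the same pages in different orders...
a = bow_vector(["/home", "/search", "/product/X", "/cart"], vocab)
b = bow_vector(["/home", "/product/X", "/search", "/cart"], vocab)

# ...produce the identical vector: the representation is order-blind,
# which costs little when one set of pages implies (mostly) one sequence.
assert a == b == [1, 1, 1, 1, 0]
```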
]]></description><pubDate>Sun, 18 Jan 2026 07:37:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=46665630</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46665630</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46665630</guid></item><item><title><![CDATA[New comment by abhgh in "Vibe Coding Killed Cursor"]]></title><description><![CDATA[
<p>I use Claude Code within Pycharm and I see the git diff format for changes there.<p>EDIT: It shows the side-by-side view by default, but it is easy to toggle to a unified view. There's probably a way to permanently set this somewhere.</p>
]]></description><pubDate>Fri, 02 Jan 2026 17:12:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=46466942</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46466942</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46466942</guid></item><item><title><![CDATA[New comment by abhgh in "A linear-time alternative for Dimensionality Reduction and fast visualisation"]]></title><description><![CDATA[
<p>Thank you. Your comment about using LLMs to semantically parse diverse data as a first step makes sense. In fact, come to think of it, in the area of prompt optimization too - such as MIPROv2 [1] - the LLM is used to create initial prompt guesses based on its understanding of the data. And I agree that UMAP still works well out of the box and has done so pretty much since its introduction.<p>[1] Section C.1 in the Appendix here <a href="https://arxiv.org/pdf/2406.11695" rel="nofollow">https://arxiv.org/pdf/2406.11695</a></p>
]]></description><pubDate>Tue, 16 Dec 2025 13:12:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=46288081</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46288081</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46288081</guid></item><item><title><![CDATA[New comment by abhgh in "A linear-time alternative for Dimensionality Reduction and fast visualisation"]]></title><description><![CDATA[
<p>I was not aware this existed and it looks cool! I am definitely going to set aside some time to explore it further.<p>I have a couple of questions for now:
(1) I am confused by your last sentence. It seems you're saying embeddings are a substitute for clustering. My understanding is that you usually apply a clustering algorithm over the embeddings - good embeddings just ensure that the grouping produced by the clustering algo "makes sense".<p>(2) Have you tried PaCMAP [1]? I found it to produce high-quality results quickly when I tried it. I haven't tried it in a while though - and I vaguely remember that it wouldn't install properly on my machine (a Mac) the last time I reached for it. Their group has some new stuff coming out too (on the linked page).<p>[1] <a href="https://github.com/YingfanWang/PaCMAP" rel="nofollow">https://github.com/YingfanWang/PaCMAP</a></p>
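<p>Re (1), a minimal sketch of what I mean - a clustering algorithm running on top of embeddings - with a toy k-means and made-up 2-D "embeddings" (a real pipeline would use a proper library and much higher-dimensional vectors):</p>

```python
def kmeans(points, k, iters=10):
    """Toy k-means over embedding vectors (tuples of floats)."""
    centroids = points[:k]  # deterministic init, fine for a sketch
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each embedding to its nearest centroid.
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # Move each centroid to the mean of its cluster.
        centroids = [tuple(sum(d) / len(cl) for d in zip(*cl)) if cl
                     else centroids[j] for j, cl in enumerate(clusters)]
    return clusters

# Toy 2-D "embeddings": two well-separated blobs. Good embeddings put
# similar items close together; the clustering step on top is what
# actually produces the groups.
pts = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
       (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
groups = kmeans(pts, k=2)
assert sorted(len(g) for g in groups) == [3, 3]
```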
]]></description><pubDate>Tue, 16 Dec 2025 10:47:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=46287030</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46287030</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46287030</guid></item><item><title><![CDATA[New comment by abhgh in "Algorithms for Optimization [pdf]"]]></title><description><![CDATA[
<p>Thanks for the example. Yes, true, this is for expensive functions - to be precise, functions that depend on data that is hard to gather, so you interleave computing the value of the function with strategically gathering just as much data as is needed to compute it. The video on their page [1] is quite illustrative: calculate the shortest path on a graph where the edge weights are expensive to obtain. Note how the edge weights they end up obtaining form a narrow band around the shortest path they find.<p>[1] <a href="https://willieneis.github.io/bax-website/" rel="nofollow">https://willieneis.github.io/bax-website/</a></p>
]]></description><pubDate>Mon, 01 Dec 2025 07:11:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=46104423</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46104423</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46104423</guid></item><item><title><![CDATA[New comment by abhgh in "Algorithms for Optimization [pdf]"]]></title><description><![CDATA[
<p>Timefold looks very interesting. This might be irrelevant but have you looked at stuff like InfoBax [1]?<p>[1] <a href="https://willieneis.github.io/bax-website/" rel="nofollow">https://willieneis.github.io/bax-website/</a></p>
]]></description><pubDate>Mon, 01 Dec 2025 05:40:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=46103884</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46103884</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46103884</guid></item><item><title><![CDATA[New comment by abhgh in "Terence Tao: At the Erdos problem website, AI assistance now becoming routine"]]></title><description><![CDATA[
<p>You don't - the way I use LLMs for explanations is that I keep going back and forth between the LLM explanation and Google search/Wikipedia. And of course, asking the LLM to cite sources helps.<p>This might sound cumbersome, but without the LLM I wouldn't have (1) known what to search for, in a way (2) that lets me incrementally build a mental model. So it's a net win for me. The only gap I see is coverage/recall: when asked for different techniques to accomplish something, an LLM might miss some techniques - and what is missed depends on the specific LLM. My solution here is asking multiple LLMs and going back to Google search.</p>
]]></description><pubDate>Tue, 25 Nov 2025 17:35:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=46048268</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=46048268</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46048268</guid></item><item><title><![CDATA[New comment by abhgh in "Awk Technical Notes (2023)"]]></title><description><![CDATA[
<p>Love awk. In the early days of my career, I used to write ETL pipelines, and awk helped me condense a lot of stuff into a small number of LOC. I particularly prided myself on writing terse one-liners (some probably undecipherable, ha!), but did occasionally write scripts. Now I mostly reach for Python.</p>
]]></description><pubDate>Fri, 14 Nov 2025 20:45:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=45931970</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=45931970</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45931970</guid></item><item><title><![CDATA[UMAP Projections of Animals to 2D]]></title><description><![CDATA[
<p>Article URL: <a href="https://duhaime.s3.amazonaws.com/apps/umap-zoo/index.html">https://duhaime.s3.amazonaws.com/apps/umap-zoo/index.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45922649">https://news.ycombinator.com/item?id=45922649</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 14 Nov 2025 00:57:33 +0000</pubDate><link>https://duhaime.s3.amazonaws.com/apps/umap-zoo/index.html</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=45922649</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45922649</guid></item><item><title><![CDATA[New comment by abhgh in "Claude Haiku 4.5"]]></title><description><![CDATA[
<p>I'm curious to know if Anthropic mentions anywhere that they use speculative decoding. OpenAI does seem to use it, based on this tweet [1].<p>[1] <a href="https://x.com/stevendcoffey/status/1853582548225683814" rel="nofollow">https://x.com/stevendcoffey/status/1853582548225683814</a></p>
]]></description><pubDate>Thu, 16 Oct 2025 04:20:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=45601429</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=45601429</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45601429</guid></item><item><title><![CDATA[New comment by abhgh in "Let's Take Esoteric Programming Languages Seriously"]]></title><description><![CDATA[
<p>Wouldn't this be an optimization problem - that is to say, something z3 should be able to handle [1], [2]?<p>I was about to suggest probabilistic programming as well, e.g., PyMC [3], but it looks like you want the optimization to occur autonomously after you've specified the problem - which is different from the program drawing insights from organically accumulated data.<p>[1] <a href="https://github.com/Z3Prover/z3?tab=readme-ov-file" rel="nofollow">https://github.com/Z3Prover/z3?tab=readme-ov-file</a><p>[2] <a href="https://microsoft.github.io/z3guide/programming/Z3%20Python%20-%20Readonly/Introduction" rel="nofollow">https://microsoft.github.io/z3guide/programming/Z3%20Python%...</a><p>[3] <a href="https://www.pymc.io/welcome.html" rel="nofollow">https://www.pymc.io/welcome.html</a></p>
]]></description><pubDate>Sun, 12 Oct 2025 08:06:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=45556290</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=45556290</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45556290</guid></item><item><title><![CDATA[New comment by abhgh in "Show HN: Traceroute Visualizer"]]></title><description><![CDATA[
<p>Hadn't seen this before, very nice read, thank you!</p>
]]></description><pubDate>Fri, 03 Oct 2025 20:44:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=45467577</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=45467577</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45467577</guid></item><item><title><![CDATA[New comment by abhgh in "Gaussian Processes for Machine Learning (2006) [pdf]"]]></title><description><![CDATA[
<p>Thank you for your kind comment!</p>
]]></description><pubDate>Thu, 21 Aug 2025 02:01:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=44968357</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=44968357</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44968357</guid></item><item><title><![CDATA[New comment by abhgh in "Gaussian Processes for Machine Learning (2006) [pdf]"]]></title><description><![CDATA[
<p>Aside from secondmind [1] I don't know of any companies (only because I haven't looked)... But if I had to look for places with a strong research culture around GPs, I would find relevant papers on arXiv and Google Scholar, and see if any of them come from industry labs. If I had to take a guess at industries using Bayesian tools at work, maybe advertising and healthcare would be the ones to look at. I would also look out for places that hire econometricians.<p>Also, thank you for the book recommendation!<p>[1] <a href="https://www.secondmind.ai/" rel="nofollow">https://www.secondmind.ai/</a></p>
]]></description><pubDate>Mon, 18 Aug 2025 23:41:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=44946579</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=44946579</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44946579</guid></item><item><title><![CDATA[New comment by abhgh in "Gaussian Processes for Machine Learning (2006) [pdf]"]]></title><description><![CDATA[
<p>This is the definitive reference on the topic! If you want something concise that doesn't ignore the math, I have some notes as well [1].<p>[1] <a href="https://blog.quipu-strands.com/bayesopt_1_key_ideas_GPs#gaussian_processes" rel="nofollow">https://blog.quipu-strands.com/bayesopt_1_key_ideas_GPs#gaus...</a></p>
]]></description><pubDate>Mon, 18 Aug 2025 20:23:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=44944879</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=44944879</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44944879</guid></item><item><title><![CDATA[New comment by abhgh in "Achieving 10,000x training data reduction with high-fidelity labels"]]></title><description><![CDATA[
<p>Active Learning is a very tricky area to get right ... over the years I have had mixed luck with it for text classification, to the point that my colleague and I decided to perform a thorough empirical study [1] that normalized the various experiment settings individual papers had reported. We observed that, post normalization, randomly picking instances to label is better!<p>[1] <a href="https://aclanthology.org/2024.emnlp-main.1240/" rel="nofollow">https://aclanthology.org/2024.emnlp-main.1240/</a></p>
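<p>For readers unfamiliar with the setup: pool-based active learning repeatedly chooses which unlabeled instance to send for labeling. A toy 1-D sketch of the two strategies being compared (uncertainty sampling vs. random picking) - nothing like the paper's actual models, just the shape of the loop:</p>

```python
import random

def fit_threshold(xs, ys):
    """Toy 1-D 'classifier': threshold halfway between class means."""
    m0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    m1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (m0 + m1) / 2

def active_learn(pool, oracle, strategy, rounds=10, seed=0):
    """Pool-based loop: each round, pick one instance, query its label
    from the oracle, refit. Strategy is 'uncertainty' (closest to the
    current decision boundary) or 'random'."""
    rng = random.Random(seed)
    labeled = [0, len(pool) - 1]          # seed set: one point per class
    while True:
        xs = [pool[i] for i in labeled]
        ys = [oracle(pool[i]) for i in labeled]
        thr = fit_threshold(xs, ys)
        if len(labeled) >= 2 + rounds:
            return thr
        rest = [i for i in range(len(pool)) if i not in labeled]
        if strategy == "uncertainty":
            labeled.append(min(rest, key=lambda i: abs(pool[i] - thr)))
        else:
            labeled.append(rng.choice(rest))

pool = [i / 20 for i in range(21)]        # instances in [0, 1]
oracle = lambda x: 0 if x < 0.5 else 1    # true boundary at 0.5
for strategy in ("uncertainty", "random"):
    thr = active_learn(pool, oracle, strategy)
    assert 0.2 < thr < 0.8                # both recover a sane boundary
```

<p>(On this easy toy problem both strategies land near the true boundary; the paper's point is that, once experiment settings are normalized, the random baseline is hard to beat on real text-classification tasks too.)</p>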
]]></description><pubDate>Fri, 08 Aug 2025 06:08:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=44833978</link><dc:creator>abhgh</dc:creator><comments>https://news.ycombinator.com/item?id=44833978</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44833978</guid></item></channel></rss>