<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: shenberg</title><link>https://news.ycombinator.com/user?id=shenberg</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 18 Jun 2026 08:12:49 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=shenberg" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by shenberg in "How to earn a billion dollars"]]></title><description><![CDATA[
<p>Some examples of >$1B companies off the top of my head: DataDog, Sentry, Snowflake, Okta, MongoDB</p>
]]></description><pubDate>Mon, 15 Jun 2026 07:04:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48537586</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=48537586</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48537586</guid></item><item><title><![CDATA[New comment by shenberg in "OpenCV 5 Is Here: The Biggest Leap in Years for Computer Vision"]]></title><description><![CDATA[
<p>moondream is a beast</p>
]]></description><pubDate>Tue, 09 Jun 2026 08:04:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48458089</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=48458089</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48458089</guid></item><item><title><![CDATA[New comment by shenberg in "DeepSeek V4 Pro beats GPT-5.5 Pro on precision"]]></title><description><![CDATA[
<p>Seems 100% AI-generated and automated, and the judge also seems suspect. In the first one it's actually GPT-5.5 Pro that has the correct email regex: the DeepSeek one will match a@b.com1 as "a@b.com", while 5.5 correctly requires a word boundary at the end of the email.
I quit after this. No test cases = useless judge.</p>
]]></description><pubDate>Mon, 08 Jun 2026 07:29:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=48442300</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=48442300</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48442300</guid></item><item><title><![CDATA[New comment by shenberg in "Anthropic is expanding to Colossus2. Will use GB200"]]></title><description><![CDATA[
<p>11% MFU does not mean 89% of the GPUs are idle; it means the GPUs are being used inefficiently.</p>
]]></description><pubDate>Thu, 21 May 2026 08:29:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=48219521</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=48219521</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48219521</guid></item><item><title><![CDATA[New comment by shenberg in "PA bench: Evaluating web agents on real world personal assistant workflows"]]></title><description><![CDATA[
<p>Using existing enterprise apps probably - this solution is scalable for the vendor and it's easier to sell using existing software as-is than to start out by writing new custom tools.</p>
]]></description><pubDate>Thu, 26 Feb 2026 11:42:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47164728</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=47164728</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47164728</guid></item><item><title><![CDATA[New comment by shenberg in "Elsevier shuts down its finance journal citation cartel"]]></title><description><![CDATA[
<p>Midway through I realized this was AI writing (took me a while). Then I read a quote in the text, attributed to a comment: "The tragedy isn’t that they cheated; it’s that the system was designed to let them thrive for a decade before anyone bothered to look at the data." I couldn't find this comment on EJMR, or anywhere on the internet except the OP post, for that matter.</p>
]]></description><pubDate>Mon, 23 Feb 2026 19:40:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47127645</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=47127645</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47127645</guid></item><item><title><![CDATA[New comment by shenberg in "Audio is the one area small labs are winning"]]></title><description><![CDATA[
<p>Moshi was an amazing tech demo. Building the entire stack from scratch in 6 months with a small team was an impressive show of skill: 7B text LLM data + training, emotive TTS for synth data generation (again model + data collection), synth data pipeline, novel speech codec, Rust inference stack for low latency, and an audio LLM architecture incl. a text "thoughts" stream, which was novel.<p>But this is a fluff piece: "underfunded" means a total of around $400 million ($330 million in the initial round, $70 million for Gradium). Compare that to ElevenLabs, who used a $2 million pre-seed to create their initial product.<p>A bunch of other stuff there is disingenuous, like comparing their 7B model to Llama-3 405B (hint: the 7B model is a _lot_ dumber). There's also an outright lie: that a team of 4 made Moshi, which is corrected _in the same piece_ to 8 if you read far enough.</p>
]]></description><pubDate>Mon, 16 Feb 2026 10:02:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47033127</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=47033127</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47033127</guid></item><item><title><![CDATA[New comment by shenberg in "Ask HN: Who wants to be hired? (February 2026)"]]></title><description><![CDATA[
<p>Location: Paris, France (US citizen, EU resident)<p>Remote: Yes<p>Willing to relocate: Not until 2027<p>Technologies:<p>ML / DS: PyTorch, CUDA, distributed training & inference, performance profiling/optimization (audio & speech focus, some LLM inference acceleration)<p>Systems: C/C++/asm, low-level performance work, reliability/scale engineering<p>Backend / Infra: Python/Java/C# prod services, ETL / data pipelines, k8s (incl. operator work)<p>Roles: Tech Lead, Research Engineer (training and/or inference of large models)<p>Résumé/CV: <a href="https://www.linkedin.com/in/roeeshenberg/" rel="nofollow">https://www.linkedin.com/in/roeeshenberg/</a><p>Email: roee.shenberg@upai.dev<p>I’m a hands-on engineer who’s spent the last 6 years doing freelance ML + data science, primarily in audio/speech, and before that 10+ years in startups building and scaling production systems.<p>I’m looking for where research meets real systems: training and/or inference for large models, especially roles that value end-to-end ownership. Open to freelance engagements or full-time roles.</p>
]]></description><pubDate>Thu, 05 Feb 2026 11:18:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=46898465</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=46898465</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46898465</guid></item><item><title><![CDATA[New comment by shenberg in "FlashAttention-T: Towards Tensorized Attention"]]></title><description><![CDATA[
<p>There are two ingredients that don't fit the "attention-is-kernel-smoothing" framing as far as I can tell: positional encoding and causal masking (another way to say positional encoding, I guess).<p>Also, simplicial attention is pretty much what the OP was going for, but the hardware lottery is such that it's gonna be pretty difficult to get competitive in terms of engineering. Not that people aren't trying (e.g. <a href="https://arxiv.org/pdf/2507.02754" rel="nofollow">https://arxiv.org/pdf/2507.02754</a>).</p>
]]></description><pubDate>Wed, 04 Feb 2026 12:15:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=46884929</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=46884929</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46884929</guid></item><item><title><![CDATA[New comment by shenberg in "Cyclic Subgroup Sum"]]></title><description><![CDATA[
<p>I don't understand how using group-theory language to describe number-theoretic properties provides extra insight in this case (e.g. the conjecture "all perfect numbers are even" is more concise than the group-theoretic description given on the page). Can you expand on why you believe the tools of group theory have something to say about this?
(e.g. for polynomial roots, the connection with symmetry groups comes from symmetries of factorized polynomials, while there's no obvious-to-me connection here as there is no unique-up-to-symmetry integer factorization)</p>
]]></description><pubDate>Tue, 27 Jan 2026 08:59:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=46777318</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=46777318</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46777318</guid></item><item><title><![CDATA[New comment by shenberg in "Exe.dev"]]></title><description><![CDATA[
<p>ssh exe.dev works</p>
]]></description><pubDate>Sat, 27 Dec 2025 16:38:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=46402983</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=46402983</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46402983</guid></item><item><title><![CDATA[New comment by shenberg in "Ask HN: How are Markov chains so different from tiny LLMs?"]]></title><description><![CDATA[
<p>The short and unsatisfying answer is that LLM generation is a Markov chain, except that instead of counting n-grams in order to generate the posterior distribution, the training process compresses the statistics into the LLM's weights.<p>There was an interesting paper a while back which investigated using unbounded n-gram models as a complement to LLMs: <a href="https://arxiv.org/pdf/2401.17377" rel="nofollow">https://arxiv.org/pdf/2401.17377</a> (I found the implementation to be clever and I'm somewhat surprised it received so little follow-up work)</p>
]]></description><pubDate>Fri, 21 Nov 2025 09:17:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=46002735</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=46002735</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46002735</guid></item><item><title><![CDATA[New comment by shenberg in "US declines to join more than 70 countries in signing UN cybercrime treaty"]]></title><description><![CDATA[
<p>When countries like North Korea, which depends on cybercrime to fund itself, are signatories, you have to wonder whether this agreement means what its title says.</p>
]]></description><pubDate>Thu, 30 Oct 2025 14:56:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=45760748</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=45760748</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45760748</guid></item><item><title><![CDATA[New comment by shenberg in "What would an efficient and trustworthy meeting culture look like?"]]></title><description><![CDATA[
<p>The reality of meetings in most places I've seen is that key stakeholders have already formed an opinion beforehand; the meeting is a place to disseminate decisions that have already been made and to align the organization.</p>
]]></description><pubDate>Mon, 28 Jul 2025 08:38:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=44708661</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=44708661</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44708661</guid></item><item><title><![CDATA[New comment by shenberg in "Learnings from building AI agents"]]></title><description><![CDATA[
<p>When I read "51% fewer false positives" followed immediately by "Median comments per pull request cut by half", it makes me wonder how many true positives they find. That's maybe unfair, as my reference is automated tooling in the security world, where the true-positive/false-positive ratio is so bad that a 50% reduction in false positives is a drop in the bucket.</p>
]]></description><pubDate>Thu, 26 Jun 2025 14:45:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=44387971</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=44387971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44387971</guid></item><item><title><![CDATA[New comment by shenberg in "Sam Altman said startups with $10M were 'hopeless' competing with OpenAI"]]></title><description><![CDATA[
<p>The DeepSeek v3 model had a net training cost of >$5m for the final training run, and the paper lists over 100 authors[1], meaning highly-paid engineers. It is also one of a sequence of models (v1, v2, math, coder) trained in order to build the institutional knowledge necessary to get to the frontier, so the total still ends up far above the $10m mark. It's hardly a "trio of super-smart engineers".<p>[1] <a href="https://arxiv.org/abs/2412.19437v1" rel="nofollow">https://arxiv.org/abs/2412.19437v1</a></p>
]]></description><pubDate>Tue, 28 Jan 2025 17:58:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=42855604</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=42855604</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42855604</guid></item><item><title><![CDATA[New comment by shenberg in "Israel, Hamas reach ceasefire deal to end 15 months of war in Gaza"]]></title><description><![CDATA[
<p>That's really not true; see e.g. the Wikipedia page on population transfer in the Ottoman Empire[1]. The practice dates way back to the Assyrian and Persian empires explicitly moving conquered peoples around in their empires in order to safeguard their rule. This book on population transfer in the Ottoman Empire[2] explicitly states, with references, that the Ottomans' habits were inherited from the steppe Turks, the Byzantines (=the Romans) and the Arabs.<p>[1] <a href="https://en.wikipedia.org/w/index.php?title=Population_transfer_in_the_Ottoman_Empire" rel="nofollow">https://en.wikipedia.org/w/index.php?title=Population_transf...</a>
[2] <a href="https://websites.umich.edu/~gocek/Work/ja/Gocek.Muge.ja.population.transfers.pdf" rel="nofollow">https://websites.umich.edu/~gocek/Work/ja/Gocek.Muge.ja.popu...</a></p>
]]></description><pubDate>Thu, 16 Jan 2025 12:09:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=42724230</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=42724230</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42724230</guid></item><item><title><![CDATA[New comment by shenberg in "Z-Library Helps Students to Overcome Academic Poverty, Study Finds"]]></title><description><![CDATA[
<p>Anecdotally, a pro-audio software company I worked with had to lay off a third of its staff when its copy protection was cracked and sales tanked immediately afterwards; sales recovered once a new copy-protection scheme was developed and applied. And just to be clear, software licenses in direct-to-user sales are not that company's only revenue stream (they sell hardware and software to OEMs).<p>This is to say, the evidence in this natural experiment points towards piracy reducing sales by a lot.</p>
]]></description><pubDate>Thu, 21 Nov 2024 11:55:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=42203422</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=42203422</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42203422</guid></item><item><title><![CDATA[New comment by shenberg in "SpawELO – small free matchmaking system for LAN parties"]]></title><description><![CDATA[
<p>Under the leaderboard tab, if the "Solution" column has an icon, it's clickable. 2nd place solution is by Jeremy Howard (of fast.ai fame), which I'd summarize as TrueSkill Through Time (Microsoft Research paper) + some overfitting on the public leaderboard (1st place was #26 in the public leaderboard).</p>
]]></description><pubDate>Sun, 03 Nov 2024 17:09:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=42034311</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=42034311</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42034311</guid></item><item><title><![CDATA[New comment by shenberg in "No "Zero-Shot" Without Exponential Data"]]></title><description><![CDATA[
<p>The CLIP plot (Fig. 2) is damning; however, some of the generative models show flat responses in Fig. 3 (e.g. Adobe GigaGAN, DALL-E-mini). While those are technically linear relationships, they are also exactly what we'd want: an image-generation aesthetic score that doesn't depend on concept frequency. Maybe the issue is with the contrastive training target used in CLIP?</p>
]]></description><pubDate>Thu, 09 May 2024 16:20:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=40309787</link><dc:creator>shenberg</dc:creator><comments>https://news.ycombinator.com/item?id=40309787</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40309787</guid></item></channel></rss>