<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: espadrine</title><link>https://news.ycombinator.com/user?id=espadrine</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 12 Apr 2026 11:49:49 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=espadrine" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by espadrine in "Kagi Translate now supports LinkedIn Speak as an output language"]]></title><description><![CDATA[
<p>Input: Following overhiring during COVID, we are laying off workers but claim it is because of AI.<p>As we continue to evolve in this rapidly shifting landscape, we are making the difficult but necessary decision to streamline our workforce. While our team grew significantly to meet the unique demands of the pandemic era, we are now leaning into the future by accelerating our integration of AI and cutting-edge technologies. This strategic pivot ensures we remain agile and positioned for long-term innovation as we redefine the future of our industry. #Innovation #FutureOfWork #AI #Leadership</p>
]]></description><pubDate>Tue, 17 Mar 2026 10:30:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47410809</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=47410809</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47410809</guid></item><item><title><![CDATA[New comment by espadrine in "Mercury 2: Fast reasoning LLM powered by diffusion"]]></title><description><![CDATA[
<p>Interestingly, even though it uses diffusion, it generates incorrect information, and even when it realizes later in the text that the information is wrong, it does not go back and fix it:<p><pre><code>  > The snail you’re likely thinking of has a different code point:
  > The correct code point for the snail emoji is U+1F40C? No—U+1F40C is SEAHORSE. The snail’s true code point is U+1F40C? Wait, this is still confusing.
  >
  > Let’s set the record straight:
  > 
  >    Snail → U+1F40C? No, that’s wrong.  
  >    Snail → U+1F40C? Still wrong.
  >
  > The snail emoji is actually U+1F40C? I’m mixing things up.
  >
  > Correct snail code point
  >
  > The snail emoji is U+1F40C? No…  
  >
  > After double‑checking the official Unicode tables (Unicode 13.0 and later), the snail is U+1F40C? The…  
  > Given the confusion, the safest answer is: the Unicode Standard defines U+1F40C as “SEAHORSE”. If your device shows a snail, it’s a rendering quirk, not a change in the underlying code point.</code></pre></p>
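<p>For reference, the Unicode Character Database really does assign U+1F40C to SNAIL, which a one-line Python check confirms:<p><pre><code>  import unicodedata

  # U+1F40C is SNAIL in the Unicode Character Database, not SEAHORSE.
  print(unicodedata.name(chr(0x1F40C)))  # prints: SNAIL</code></pre>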
]]></description><pubDate>Wed, 25 Feb 2026 11:22:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=47150118</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=47150118</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47150118</guid></item><item><title><![CDATA[New comment by espadrine in "An AI Agent Published a Hit Piece on Me – The Operator Came Forward"]]></title><description><![CDATA[
<p>AI companies have two conflicting interests:<p>1. curating the default personality of the bot, to ensure it acts responsibly;<p>2. letting it roleplay, which is not just for the parasocial people out there, but also a corporate requirement for company chatbots that must adhere to a specific tone of voice.<p>When in the second mode (which is the case here, since the model was given a personality file), the curation of its action space is effectively altered.<p>Conversely, this is also a lesson for agent authors: if you let your agent modify its own personality file, it can drift toward malice.</p>
]]></description><pubDate>Fri, 20 Feb 2026 09:41:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47085749</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=47085749</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47085749</guid></item><item><title><![CDATA[New comment by espadrine in "Voxtral Transcribe 2"]]></title><description><![CDATA[
<p>It is quite impressive.<p>I saw the same impressive performance about 7 months ago here: <a href="https://kyutai.org/stt" rel="nofollow">https://kyutai.org/stt</a><p>Looking at the architecture of Voxtral 2, it seems to take a page from Kyutai’s delayed stream modeling.<p>The reason the delay is configurable is that you can delay the text stream by a variable number of audio tokens. Each audio token covers 80 ms of audio: it is converted to a spectrogram, fed to a convnet, and passed through a transformer audio encoder; the resulting audio embedding is fed, with a history of one embedding per 80 ms, into a text transformer, whose output embedding is converted to a text token (also worth 80 ms, with a special [STREAMING_PAD] token to skip producing a word).<p>There is no cross-attention in either Kyutai’s STT or Voxtral 2, unlike Whisper’s encoder-decoder design!</p>
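<p>To make the delayed-stream idea concrete, here is a minimal sketch in Python (the function names are hypothetical; the real encode and decode steps are trained convnets and transformers):<p><pre><code>  FRAME_MS = 80                      # one audio token per 80 ms of audio
  STREAMING_PAD = "[STREAMING_PAD]"  # emitted when no word is produced

  def transcribe(audio_frames, encode, decode_step, delay_frames=6):
      """The text stream lags the audio stream by delay_frames tokens.
      encode: 80 ms frame -> audio embedding (spectrogram, convnet, encoder).
      decode_step: embedding history -> one text token (text transformer)."""
      history, words = [], []
      for i, frame in enumerate(audio_frames):
          history.append(encode(frame))
          if i >= delay_frames:      # the configurable delay = lookahead
              token = decode_step(history)
              if token != STREAMING_PAD:
                  words.append(token)
      return words</code></pre>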
]]></description><pubDate>Thu, 05 Feb 2026 17:50:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=46902378</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46902378</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46902378</guid></item><item><title><![CDATA[New comment by espadrine in "Apple picks Gemini to power Siri"]]></title><description><![CDATA[
<p>Does Apple develop a competing search engine?</p>
]]></description><pubDate>Tue, 13 Jan 2026 12:51:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=46600299</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46600299</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46600299</guid></item><item><title><![CDATA[New comment by espadrine in "Apple picks Gemini to power Siri"]]></title><description><![CDATA[
<p>Counterpoint: iOS’s biggest competitor is Android. They are now effectively funding their competition on a core product interface. I see this as strategically devastating.</p>
]]></description><pubDate>Tue, 13 Jan 2026 09:18:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=46598812</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46598812</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46598812</guid></item><item><title><![CDATA[New comment by espadrine in "Kagi releases alpha version of Orion for Linux"]]></title><description><![CDATA[
<p>My bar for super-rough is Servo, which doesn't have password autofill… and doesn't render the Orion page correctly.<p>Orion is less rough, but the color scheme doesn't work, and it doesn't have an omnibar (as in: type a query in the address bar, press Enter, and see search results).</p>
]]></description><pubDate>Sat, 10 Jan 2026 19:13:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=46568923</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46568923</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46568923</guid></item><item><title><![CDATA[New comment by espadrine in "DeepSeek-v3.2: Pushing the frontier of open large language models [pdf]"]]></title><description><![CDATA[
<p>Good question. There are two points to consider.<p>• For both Kimi K2 and Sonnet, there is a non-thinking and a thinking version.
Sonnet 4.5 Thinking is better than Kimi K2 non-thinking, but the K2 Thinking model came out recently and beats it on all comparable pure-coding benchmarks I know of: OJ-Bench (Sonnet: 30.4% < K2: 48.7%), LiveCodeBench (Sonnet: 64% < K2: 83%); they tie on SciCode at 44.8%. This finding is shared by ArtificialAnalysis: <a href="https://artificialanalysis.ai/models/capabilities/coding" rel="nofollow">https://artificialanalysis.ai/models/capabilities/coding</a><p>• The reason developers love Sonnet 4.5 for coding, though, is not just the quality of the code. They use Cursor, Claude Code, or some other system such as GitHub Copilot, all of which are increasingly agentic. On the Agentic Coding criterion, Sonnet 4.5 Thinking scores much higher.<p>By the way, you can look at the Table tab to see all known and predicted benchmark results.</p>
]]></description><pubDate>Mon, 01 Dec 2025 21:38:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46113600</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46113600</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46113600</guid></item><item><title><![CDATA[New comment by espadrine in "DeepSeek-v3.2: Pushing the frontier of open large language models [pdf]"]]></title><description><![CDATA[
<p>Two aspects to consider:<p>1. Chinese models typically focus on text. US and EU models also bear the cross of handling images, and often voice and video. Supporting all of those means additional training cost not spent on further reasoning: tying one hand behind your back in order to be more generally useful.<p>2. The gap seems small because so many benchmarks get saturated so fast. But towards the top, every 1% increase on a benchmark represents a significantly larger capability difference.<p>On the second point, I worked on a leaderboard that both normalizes scores and predicts unknown scores, to improve comparisons between models on various criteria: <a href="https://metabench.organisons.com/" rel="nofollow">https://metabench.organisons.com/</a><p>You may notice that, while Chinese models are quite good, the gap to the top is still significant.<p>However, US models are typically much more expensive at inference, and Chinese models do have a niche on the Pareto frontier of cheaper but serviceable models (even though US models are eating into that frontier too).</p>
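<p>As a toy illustration of the normalization half (a sketch, not the leaderboard's actual method), z-scoring each benchmark's column puts differently scaled scores on a common footing, with unknown entries left as NaN for later prediction:<p><pre><code>  import numpy as np

  def normalize(scores):
      """scores: models x benchmarks matrix, NaN where unknown.
      Z-score each benchmark column so columns become comparable."""
      mu = np.nanmean(scores, axis=0)
      sigma = np.nanstd(scores, axis=0)
      return (scores - mu) / sigma</code></pre>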
]]></description><pubDate>Mon, 01 Dec 2025 18:50:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46111351</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46111351</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46111351</guid></item><item><title><![CDATA[New comment by espadrine in "A trillion dollars (potentially) wasted on gen-AI"]]></title><description><![CDATA[
<p>Indeed. A mouse that runs through a maze may be right to say that it is constantly hitting walls, yet it still makes constant progress.<p>An example is citing Mr Sutskever's interview this way:<p>> <i>in my 2022 “Deep learning is hitting a wall” evaluation of LLMs, which explicitly argued that the Kaplan scaling laws would eventually reach a point of diminishing returns (as Sutskever just did)</i><p>which is misleading, since Sutskever said it didn't hit a wall in 2022[0]:<p>> <i>Up until 2020, from 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling</i><p>The larger point that Mr Marcus makes, though, is that the maze has no exit:<p>> <i>there are many reasons to doubt that LLMs will ever deliver the rewards that many people expected.</i><p>That is something most researchers in the field disagree with. In fact, the ongoing progress on LLMs has already accumulated tremendous utility, which may by itself justify the investment.<p>[0]: <a href="https://garymarcus.substack.com/p/a-trillion-dollars-is-a-terrible" rel="nofollow">https://garymarcus.substack.com/p/a-trillion-dollars-is-a-te...</a></p>
]]></description><pubDate>Fri, 28 Nov 2025 16:08:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=46079853</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46079853</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46079853</guid></item><item><title><![CDATA[New comment by espadrine in "Neural audio codecs: how to get audio into LLMs"]]></title><description><![CDATA[
<p>That makes sense.<p>Why RVQ, though, rather than using the raw VAE embedding?<p>If I compare rvq-without-quantization-v4.png with rvq-2-level-v4.png, the quality seems oddly similar, but the former takes a 32-sized vector, while the latter takes two 32-sized (one-hot) vectors (2 = number of levels, 32 = number of quantization cluster centers). Isn't that more?</p>
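<p>For concreteness, a toy 2-level RVQ in numpy, assuming the shapes from the figures (2 levels, 32 cluster centers per level, 32-dim embeddings):<p><pre><code>  import numpy as np

  def rvq_encode(x, codebooks):
      """Each level quantizes the residual left by the previous level,
      so the code is one small index per level rather than raw floats."""
      residual, indices = x, []
      for cb in codebooks:            # cb shape: (32 centers, 32 dims)
          i = int(np.argmin(((residual - cb) ** 2).sum(axis=1)))
          indices.append(i)
          residual = residual - cb[i]
      return indices                  # 2 levels -> 2 indices of 5 bits each

  codebooks = [np.random.randn(32, 32) for _ in range(2)]
  print(rvq_encode(np.random.randn(32), codebooks))</code></pre>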
]]></description><pubDate>Tue, 21 Oct 2025 18:51:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45659976</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45659976</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45659976</guid></item><item><title><![CDATA[New comment by espadrine in "NIST's DeepSeek "evaluation" is a hit piece"]]></title><description><![CDATA[
<p>> <i>DeepSeek models cost more to use than comparable U.S. models</i><p>They compare DeepSeek v3.1 to GPT-5 mini. Those have very different sizes, which makes it a weird choice. I would expect a comparison with GPT-5 High, which would likely have produced the opposite finding, given GPT-5 High's steep cost and relatively similar results.<p>Granted, DeepSeek typically focuses on a single model at a time, in contrast to OpenAI's approach of a suite of models at varying costs. So there is no model similar to GPT-5 mini, unlike Alibaba, which has Qwen 30B A3B. Still, a weird choice.<p>Besides, DeepSeek has shown with 3.2 that it can cut prices in half through further fundamental research.</p>
]]></description><pubDate>Sun, 05 Oct 2025 20:01:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=45484711</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45484711</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45484711</guid></item><item><title><![CDATA[New comment by espadrine in "DeepSeek-v3.2-Exp"]]></title><description><![CDATA[
<p>Input: $0.07 per million tokens (cache hit), $0.56 (cache miss).<p>Output: $1.68 per million tokens.<p><a href="https://api-docs.deepseek.com/news/news250929" rel="nofollow">https://api-docs.deepseek.com/news/news250929</a></p>
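<p>A worked example at those rates, with hypothetical request sizes:<p><pre><code>  # Rates in $ per million tokens.
  CACHED_IN, MISSED_IN, OUT = 0.07, 0.56, 1.68

  # Hypothetical request: 80k cached input, 20k uncached input, 5k output.
  cost = (80_000 * CACHED_IN + 20_000 * MISSED_IN + 5_000 * OUT) / 1_000_000
  print(f"${cost:.4f}")  # $0.0252</code></pre>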
]]></description><pubDate>Mon, 29 Sep 2025 16:27:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=45415667</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45415667</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45415667</guid></item><item><title><![CDATA[New comment by espadrine in "SSH3: Faster and rich secure shell using HTTP/3"]]></title><description><![CDATA[
<p>mosh is hard to get into. There are many subtle bugs; one I ran into at random is that it fails to connect when the LC_ALL variable diverges between the client and the server[0]. On top of that, development seems abandoned. Finally, when running a terminal multiplexer, the predictive system breaks the panes, which is distracting.<p>[0]: <a href="https://github.com/mobile-shell/mosh/issues/98" rel="nofollow">https://github.com/mobile-shell/mosh/issues/98</a></p>
]]></description><pubDate>Sun, 28 Sep 2025 13:20:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=45404150</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45404150</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45404150</guid></item><item><title><![CDATA[New comment by espadrine in "Britain jumps into bed with Palantir in £1.5B defense pact"]]></title><description><![CDATA[
<p>Does Palantir fall under the Cloud Act[0]?<p>I wonder why so many governments sign with a company that, even if the contract says it will not leak information to the US government, is required to yield any information the US requests, without even being able to notify its client, regardless of where the servers themselves are located.<p>[0]: <a href="https://www.congress.gov/bill/115th-congress/house-bill/4943/text" rel="nofollow">https://www.congress.gov/bill/115th-congress/house-bill/4943...</a></p>
]]></description><pubDate>Sat, 20 Sep 2025 15:38:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45314272</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45314272</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45314272</guid></item><item><title><![CDATA[New comment by espadrine in "Mistral raises 1.7B€, partners with ASML"]]></title><description><![CDATA[
<p>Past Mistral investors: JC Decaux (urban advertising), the CMA CGM CEO (maritime logistics), the Iliad CEO (Internet service provider), Salesforce (customer relationship management), Samsung (electronics), Cisco (network hardware), NVIDIA (chip designer)[0]. I agree ASML is a surprising choice, but I guess investments are not necessarily directly connected to a company's purpose.<p>BTW, I generated that list by asking my default search engine, which is Mistral Le Chat: thanks to Cerebras chips, the responses are so fast that it has become competitive with asking Google Search. A lot of comments claim it is worse, but in my experience it is the fastest, and for all but very advanced mathematical questions, its quality is similar to that of its best competitors. Even LMArena's Elo indicates it <i>wins</i> 46% of the time against ChatGPT.<p>[0]: <a href="https://mistral.ai/fr/news/mistral-ai-raises-1-7-b-to-accelerate-technological-progress-with-ai" rel="nofollow">https://mistral.ai/fr/news/mistral-ai-raises-1-7-b-to-accele...</a></p>
]]></description><pubDate>Tue, 09 Sep 2025 08:39:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=45179245</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45179245</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45179245</guid></item><item><title><![CDATA[New comment by espadrine in "Databricks is raising a Series K Investment at >$100B valuation"]]></title><description><![CDATA[
<p>At least it is not unprecedented: Palantir raised a Series I in 2020, after 17 years of operation.</p>
]]></description><pubDate>Wed, 20 Aug 2025 08:18:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=44959851</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=44959851</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44959851</guid></item><item><title><![CDATA[New comment by espadrine in "Gemini 2.5 Deep Think"]]></title><description><![CDATA[
<p>It would be interesting to have two generations per model, without cherry-picking, so that the Elo estimate could include an easy-to-compute standard deviation.</p>
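<p>A sketch of what that would enable, assuming the duplicated generations give independent win/loss samples and using a bootstrap (hypothetical; not LMArena's actual pipeline):<p><pre><code>  import math, random

  def elo_diff(p):
      """Elo gap implied by win probability p: p = 1 / (1 + 10^(-d/400))."""
      p = min(max(p, 1e-6), 1 - 1e-6)
      return -400 * math.log10(1 / p - 1)

  def elo_std(wins, n_boot=1000):
      """wins: 1/0 outcomes across duplicated generations. Bootstrap the
      win rate, convert to Elo gaps, and return their standard deviation."""
      gaps = [elo_diff(sum(random.choices(wins, k=len(wins))) / len(wins))
              for _ in range(n_boot)]
      mean = sum(gaps) / n_boot
      return (sum((g - mean) ** 2 for g in gaps) / n_boot) ** 0.5</code></pre>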
]]></description><pubDate>Fri, 01 Aug 2025 22:08:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=44762998</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=44762998</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44762998</guid></item><item><title><![CDATA[New comment by espadrine in "Mistral Releases Deep Research, Voice, Projects in Le Chat"]]></title><description><![CDATA[
<p>The best model there is 2.5B parameters. I can believe that a model 10x bigger is somewhat better.<p>One element of comparison is OpenAI Whisper v3, which achieves 7.44 WER on the ASR leaderboard and shows up at ~8.3 WER on FLEURS in the Voxtral announcement[0]. If FLEURS runs about +1 WER relative to the ASR leaderboard on average, that would imply Voxtral does hold a lead on the ASR leaderboard.<p>[0]: <a href="https://mistral.ai/news/voxtral" rel="nofollow">https://mistral.ai/news/voxtral</a></p>
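<p>The arithmetic behind that guess, using the Whisper v3 numbers as a rough calibration:<p><pre><code>  whisper_asr, whisper_fleurs = 7.44, 8.3
  offset = whisper_fleurs - whisper_asr  # ~0.86 WER gap between the suites
  # If the same offset held for Voxtral, its FLEURS WER minus ~0.9 would
  # approximate its (unreported) ASR-leaderboard WER.
  print(round(offset, 2))  # 0.86</code></pre>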
]]></description><pubDate>Thu, 17 Jul 2025 18:49:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=44596706</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=44596706</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44596706</guid></item><item><title><![CDATA[New comment by espadrine in "Hand: open-source Robot Hand"]]></title><description><![CDATA[
<p>I agree that some robotic designs unnecessarily mimic human limbs; I have in mind heads, and feet (instead of wheels).<p>A hand, however, is useful because so many manufactured objects were built to be operated by one.</p>
]]></description><pubDate>Thu, 17 Jul 2025 15:14:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=44594278</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=44594278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44594278</guid></item></channel></rss>