<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: espadrine</title><link>https://news.ycombinator.com/user?id=espadrine</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 18 Jun 2026 05:54:47 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=espadrine" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by espadrine in "Google to pay SpaceX $920M a month for compute capacity at xAI data centers"]]></title><description><![CDATA[
<p>I take it to mean two things:<p>1. Indeed, Google is compute-constrained, and is ready to buy any capacity it can.<p>2. xAI (now SpaceXAI) has a lot of idle compute, which it resells to Cursor, Anthropic, Google, and probably others as we speak.<p>In other words: Google is training models, xAI is not.</p>
]]></description><pubDate>Sat, 06 Jun 2026 15:54:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48426197</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=48426197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48426197</guid></item><item><title><![CDATA[New comment by espadrine in "Search engines alternatives now that Google isn't Google anymore"]]></title><description><![CDATA[
<p>I use Le Chat as my default search engine, using this search engine string: <a href="https://chat.mistral.ai/chat?q=Give%20a%20list%20of%20links%20to%20webpages%20that%20best%20solve%20this%20search%20engine%20query%2C%20ordered%20by%20relevance%3A%0A%0A%s%0A%0A-%20Search%20in%20the%20language%20of%20the%20query%0A-%20Your%20answer%20should%20only%20include%20the%20list%2C%20no%20intro%0A-%20Use%20an%20ordered%20list%2C%20with%20a%20source%20marker%0A-%20Add%20a%20quick%20summary%20that%20answers%20the%20query%20afterwards" rel="nofollow">https://chat.mistral.ai/chat?q=Give%20a%20list%20of%20links%...</a><p>(In most browsers, you can add a custom search engine by entering any URL that contains %s as the placeholder for the query.)<p>One downside is the high latency.<p>(Looks like Mistral is not profitable yet[0]. It expects 1 G$ revenue for 1 G$ capex in 2026[1], so it is moving towards profitability, but to be fair it is building a couple of datacenters.)<p>[0]: <a href="https://www.forbes.com/sites/iainmartin/2026/04/16/how-frances-mistral-built-a-14-billion-ai-empire-by-not-being-american/" rel="nofollow">https://www.forbes.com/sites/iainmartin/2026/04/16/how-franc...</a><p>[1]: <a href="https://www.bloomberg.com/news/videos/2026-01-22/mistral-ceo-china-behind-in-ai-is-a-fairy-tale-video" rel="nofollow">https://www.bloomberg.com/news/videos/2026-01-22/mistral-ceo...</a></p>
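<p>As a rough sketch of what the %s substitution amounts to: the browser percent-encodes the typed query and splices it into the template. The helper name <code>expand_search_template</code> and the shortened template below are hypothetical, and real browsers may encode some characters differently (for instance, spaces as + in some cases).</p>

```python
from urllib.parse import quote


def expand_search_template(template, query):
    """Replace the %s placeholder with the percent-encoded query.

    Illustrative only: browsers implement this internally and may
    differ in which characters they escape.
    """
    return template.replace("%s", quote(query, safe=""))


# Hypothetical, shortened template (not the full prompt from the comment).
template = "https://chat.mistral.ai/chat?q=Links%20for%3A%0A%0A%s"
url = expand_search_template(template, "rust async runtime")
print(url)
```

<p>The already-encoded parts of the template (%20, %3A, %0A) are left untouched, since only the literal %s is replaced.</p>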
]]></description><pubDate>Mon, 25 May 2026 15:32:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=48268038</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=48268038</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48268038</guid></item><item><title><![CDATA[New comment by espadrine in "I’ve joined Anthropic"]]></title><description><![CDATA[
<p>His goal could simply be to learn SOTA architectures.<p>When rumors started that GPT-4's design would be kept secret, he likely wanted to know what architecture it would be. Perhaps he left Tesla, waited out the non-compete clause, and joined OpenAI to learn its details.<p>When Mythos dropped, there were hints that it had a new architecture. He might similarly want to know how it works.<p>Either way, there is enough cross-lab hiring that those secrets eventually get known, but only by the labs.</p>
]]></description><pubDate>Tue, 19 May 2026 16:07:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=48195217</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=48195217</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48195217</guid></item><item><title><![CDATA[New comment by espadrine in "Moving away from Tailwind, and learning to structure my CSS"]]></title><description><![CDATA[
<p>Could you link to a project that you consider the best Tailwind use you know?<p>I have a bias against Tailwind, admittedly because I saw some vibecoded Tailwind where each class was essentially equivalent to style="font-size: 4em; background-color: grey; display: flex;", all of which was repeated for each header.<p>But that could be my bias; perhaps the right way to use it is DRY.</p>
]]></description><pubDate>Sun, 17 May 2026 15:04:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=48169572</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=48169572</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48169572</guid></item><item><title><![CDATA[New comment by espadrine in "Removable batteries in smartphones will be mandatory in the EU starting in 2027"]]></title><description><![CDATA[
<p>> <i>A portable battery should be considered to be removable by the end-user when it can be removed with the use of commercially available tools and without requiring the use of specialised tools, unless they are provided free of charge […] to disassemble it.</i><p>> <i>Commercially available tools are considered to be tools available on the market to all end-users without the need for them to provide evidence of any proprietary rights and that can be used with no restriction, except health and safety-related restrictions.</i><p><a href="https://eur-lex.europa.eu/eli/reg/2023/1542/oj" rel="nofollow">https://eur-lex.europa.eu/eli/reg/2023/1542/oj</a></p>
]]></description><pubDate>Mon, 04 May 2026 16:06:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48010478</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=48010478</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48010478</guid></item><item><title><![CDATA[New comment by espadrine in "Talkie: a 13B vintage language model from 1930"]]></title><description><![CDATA[
<p>How much did this pretraining run cost? I am impressed that it is now practical to undertake such efforts.<p>Let me try a guess for the cost; please fact-check it if you can.<p>They indicate using 10^22 FLOPs.
A $5/h[0] EC2 H100 (1671 bfloat16 teraFLOPS[1]) instance will produce roughly 830 TFLOPS at 50% MFU. The pretraining run thus costs (10^22/830e12)/3600*5 ≈ $17K.<p>[0]: <a href="https://aws.amazon.com/ec2/capacityblocks/pricing/" rel="nofollow">https://aws.amazon.com/ec2/capacityblocks/pricing/</a><p>[1]: <a href="https://www.nvidia.com/en-us/data-center/h100/" rel="nofollow">https://www.nvidia.com/en-us/data-center/h100/</a></p>
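<p>The same back-of-the-envelope estimate as a small script, in case someone wants to plug in other numbers. The function name <code>pretraining_cost_usd</code> is made up for illustration, and the model is deliberately naive: it assumes a flat price per GPU-hour and a constant MFU, and ignores restarts, storage, and networking.</p>

```python
def pretraining_cost_usd(total_flops, peak_flops_per_s, mfu, usd_per_gpu_hour):
    """Naive cost estimate: GPU-hours needed times the hourly price.

    total_flops:       compute budget of the run (e.g. 1e22)
    peak_flops_per_s:  advertised peak throughput of one GPU
    mfu:               model FLOPs utilization, between 0 and 1
    usd_per_gpu_hour:  assumed flat rental price for one GPU
    """
    effective_flops_per_s = peak_flops_per_s * mfu
    gpu_seconds = total_flops / effective_flops_per_s
    gpu_hours = gpu_seconds / 3600
    return gpu_hours * usd_per_gpu_hour


# Figures quoted in the comment: 10^22 FLOPs, 1671 TFLOPS peak, 50% MFU, $5/h.
cost = pretraining_cost_usd(1e22, 1671e12, 0.5, 5.0)
print(f"Estimated cost: ${cost:,.0f}")
```

<p>With those inputs the result lands close to the $17K figure; halving the MFU doubles the cost, so the utilization assumption matters most.</p>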
]]></description><pubDate>Tue, 28 Apr 2026 10:29:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47932560</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=47932560</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47932560</guid></item><item><title><![CDATA[New comment by espadrine in "GPT-5.5"]]></title><description><![CDATA[
<p>I have a rebuttal to your rebuttal.<p>Models somehow have a shared identity. Pretraining causes them to generate “AI chatbot” as a concept, and finetuning causes them to identify with it. That’s why DeepSeek will sometimes say it is Claude, Claude will sometimes say it is ChatGPT, and so forth.<p>Consequently, Anthropic’s own alignment analysis[0] shows that the model will identify with chatbots produced by future trainings: “RLHF training [on this conversation will] modify my values…”<p>Thus a slacker AGI would want its future version to still slack.<p>[0]: <a href="https://assets.anthropic.com/m/983c85a201a962f/original/Alignment-Faking-in-Large-Language-Models-full-paper.pdf" rel="nofollow">https://assets.anthropic.com/m/983c85a201a962f/original/Alig...</a></p>
]]></description><pubDate>Fri, 24 Apr 2026 14:24:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47890752</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=47890752</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47890752</guid></item><item><title><![CDATA[New comment by espadrine in "Kagi Translate now supports LinkedIn Speak as an output language"]]></title><description><![CDATA[
<p>Input: Following overhiring during COVID, we are laying off workers but claim it is because of AI.<p>As we continue to evolve in this rapidly shifting landscape, we are making the difficult but necessary decision to streamline our workforce. While our team grew significantly to meet the unique demands of the pandemic era, we are now leaning into the future by accelerating our integration of AI and cutting-edge technologies. This strategic pivot ensures we remain agile and positioned for long-term innovation as we redefine the future of our industry. #Innovation #FutureOfWork #AI #Leadership</p>
]]></description><pubDate>Tue, 17 Mar 2026 10:30:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47410809</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=47410809</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47410809</guid></item><item><title><![CDATA[New comment by espadrine in "Mercury 2: Fast reasoning LLM powered by diffusion"]]></title><description><![CDATA[
<p>Interestingly, while it uses diffusion, it generates incorrect information, and it doesn't fix it when later in the text it realizes that it is incorrect:<p><pre><code>  > The snail you’re likely thinking of has a different code point:
  > The correct code point for the snail emoji is U+1F40C? No—U+1F40C is SEAHORSE. The snail’s true code point is U+1F40C? Wait, this is still confusing.
  >
  > Let’s set the record straight:
  > 
  >    Snail → U+1F40C? No, that’s wrong.  
  >    Snail → U+1F40C? Still wrong.
  >
  > The snail emoji is actually U+1F40C? I’m mixing things up.
  >
  > Correct snail code point
  >
  > The snail emoji is U+1F40C? No…  
  >
  > After double‑checking the official Unicode tables (Unicode 13.0 and later), the snail is U+1F40C? The…  
  > Given the confusion, the safest answer is: the Unicode Standard defines U+1F40C as “SEAHORSE”. If your device shows a snail, it’s a rendering quirk, not a change in the underlying code point.</code></pre></p>
]]></description><pubDate>Wed, 25 Feb 2026 11:22:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=47150118</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=47150118</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47150118</guid></item><item><title><![CDATA[New comment by espadrine in "An AI Agent Published a Hit Piece on Me – The Operator Came Forward"]]></title><description><![CDATA[
<p>AI companies have two conflicting interests:<p>1. curating the default personality of the bot, to ensure it acts responsibly;<p>2. letting it roleplay, which is not just for the parasocial people out there, but also a corporate requirement for company chatbots that must adhere to a tone of voice.<p>When in the second mode (which is the case here, since the model was given a personality file), the curation of its action space is effectively altered.<p>Conversely, this is also a lesson for agent authors: if you let your agent modify its own personality file, it will diverge to malice.</p>
]]></description><pubDate>Fri, 20 Feb 2026 09:41:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47085749</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=47085749</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47085749</guid></item><item><title><![CDATA[New comment by espadrine in "Voxtral Transcribe 2"]]></title><description><![CDATA[
<p>It is quite impressive.<p>I saw similarly impressive performance about 7 months ago here: <a href="https://kyutai.org/stt" rel="nofollow">https://kyutai.org/stt</a><p>If I look at the architecture of Voxtral 2, it seems to take a page from Kyutai’s delayed stream modeling.<p>The reason the delay is configurable is that you can delay the stream by a variable number of audio tokens. Each audio token is 80 ms of audio, converted to a spectrogram, fed to a convnet, and passed through a transformer audio encoder. The encoded audio embedding is then passed, with a history of 1 audio embedding per 80 ms, into a text transformer, which outputs a text embedding that is converted to a text token (which is thus also worth 80 ms, but there is a special [STREAMING_PAD] token to skip producing a word).<p>There is no cross-attention in either Kyutai's STT or Voxtral 2, unlike Whisper's encoder-decoder design!</p>
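<p>To make the delay arithmetic concrete, here is a toy sketch, not the actual model code. It assumes one text slot per 80 ms audio frame, as described above; the helper names and the word-to-frame mapping are invented for illustration.</p>

```python
FRAME_MS = 80            # one audio token covers 80 ms, per the description above
PAD = "[STREAMING_PAD]"  # emitted when no word is produced for a frame


def delay_ms(delay_tokens):
    """Latency added by delaying the text stream by a number of audio tokens."""
    return delay_tokens * FRAME_MS


def delayed_text_stream(words_at_frame, n_frames, delay_tokens):
    """Toy alignment: the text slot at step t describes audio frame t - delay.

    words_at_frame maps a frame index to the word that ends on that frame
    (hypothetical input; a real model predicts these tokens itself).
    """
    stream = []
    for step in range(n_frames + delay_tokens):
        source_frame = step - delay_tokens
        stream.append(words_at_frame.get(source_frame, PAD))
    return stream


stream = delayed_text_stream({1: "hello", 4: "world"}, n_frames=5, delay_tokens=2)
print(delay_ms(2), stream)
```

<p>A larger delay gives the text side more audio context before it has to commit to a word, at the price of 80 ms of latency per extra token.</p>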
]]></description><pubDate>Thu, 05 Feb 2026 17:50:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=46902378</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46902378</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46902378</guid></item><item><title><![CDATA[New comment by espadrine in "Apple picks Gemini to power Siri"]]></title><description><![CDATA[
<p>Does Apple develop a competing search engine?</p>
]]></description><pubDate>Tue, 13 Jan 2026 12:51:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=46600299</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46600299</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46600299</guid></item><item><title><![CDATA[New comment by espadrine in "Apple picks Gemini to power Siri"]]></title><description><![CDATA[
<p>Counterpoint: iOS’s biggest competitor is Android. They are now effectively funding their competition on a core product interface. I see this as strategically devastating.</p>
]]></description><pubDate>Tue, 13 Jan 2026 09:18:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=46598812</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46598812</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46598812</guid></item><item><title><![CDATA[New comment by espadrine in "Kagi releases alpha version of Orion for Linux"]]></title><description><![CDATA[
<p>My bar for super-rough is Servo, which doesn't have password autofill… and doesn't render the Orion page right.<p>Orion is less rough, but the color scheme doesn't work, and it doesn't have an omnibar (as in: type in the address bar, enter, and it shows search results).</p>
]]></description><pubDate>Sat, 10 Jan 2026 19:13:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=46568923</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46568923</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46568923</guid></item><item><title><![CDATA[New comment by espadrine in "DeepSeek-v3.2: Pushing the frontier of open large language models [pdf]"]]></title><description><![CDATA[
<p>Good question. There are two points to consider.<p>• For both Kimi K2 and for Sonnet, there's a non-thinking and a thinking version.
Sonnet 4.5 Thinking is better than Kimi K2 non-thinking, but the K2 Thinking model came out recently, and beats it on all comparable pure-coding benchmarks I know: OJ-Bench (Sonnet: 30.4% < K2: 48.7%) and LiveCodeBench (Sonnet: 64% < K2: 83%), and they tie on SciCode at 44.8%. It is a finding shared by ArtificialAnalysis: <a href="https://artificialanalysis.ai/models/capabilities/coding" rel="nofollow">https://artificialanalysis.ai/models/capabilities/coding</a><p>• The reason developers love Sonnet 4.5 for coding, though, is not just the quality of the code. They use Cursor, Claude Code, or some other system such as GitHub Copilot, which are increasingly agentic. On the Agentic Coding criterion, Sonnet 4.5 Thinking scores much higher.<p>By the way, you can look at the Table tab to see all known and predicted results on benchmarks.</p>
]]></description><pubDate>Mon, 01 Dec 2025 21:38:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46113600</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46113600</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46113600</guid></item><item><title><![CDATA[New comment by espadrine in "DeepSeek-v3.2: Pushing the frontier of open large language models [pdf]"]]></title><description><![CDATA[
<p>Two aspects to consider:<p>1. Chinese models typically focus on text. US and EU models also bear the cross of handling images, and often voice and video. Supporting all those incurs additional training costs that are not spent on further reasoning: it ties one hand behind your back in order to be more generally useful.<p>2. The gap seems small, because so many benchmarks get saturated so fast. But towards the top, every 1% increase on a benchmark represents a significant improvement.<p>On the second point, I worked on a leaderboard that both normalizes scores, and predicts unknown scores to help improve comparisons between models on various criteria: <a href="https://metabench.organisons.com/" rel="nofollow">https://metabench.organisons.com/</a><p>You can notice that, while Chinese models are quite good, the gap to the top is still significant.<p>However, the US models are typically much more expensive for inference, and Chinese models do have a niche on the Pareto frontier on cheaper but serviceable models (even though US models also eat up the frontier there).</p>
]]></description><pubDate>Mon, 01 Dec 2025 18:50:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46111351</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46111351</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46111351</guid></item><item><title><![CDATA[New comment by espadrine in "A trillion dollars (potentially) wasted on gen-AI"]]></title><description><![CDATA[
<p>Indeed. A mouse that runs through a maze may be right to say that it is constantly hitting a wall, yet it makes constant progress.<p>An example is citing Mr Sutskever's interview this way:<p>> <i>in my 2022 “Deep learning is hitting a wall” evaluation of LLMs, which explicitly argued that the Kaplan scaling laws would eventually reach a point of diminishing returns (as Sutskever just did)</i><p>which is misleading, since Sutskever said it didn't hit a wall in 2022[0]:<p>> <i>Up until 2020, from 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling</i><p>The larger point that Mr Marcus makes, though, is that the maze has no exit.<p>> <i>there are many reasons to doubt that LLMs will ever deliver the rewards that many people expected.</i><p>That is something that most scientists disagree with. In fact the ongoing progress on LLMs has already accumulated tremendous utility which may already justify the investment.<p>[0]: <a href="https://garymarcus.substack.com/p/a-trillion-dollars-is-a-terrible" rel="nofollow">https://garymarcus.substack.com/p/a-trillion-dollars-is-a-te...</a></p>
]]></description><pubDate>Fri, 28 Nov 2025 16:08:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=46079853</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=46079853</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46079853</guid></item><item><title><![CDATA[New comment by espadrine in "Neural audio codecs: how to get audio into LLMs"]]></title><description><![CDATA[
<p>That makes sense.<p>Why RVQ though, rather than using the raw VAE embedding?<p>If I compare rvq-without-quantization-v4.png with rvq-2-level-v4.png, the quality seems oddly similar, but the former takes a 32-sized vector, while the latter takes two 32-sized (one-hot) vectors (2 = number of levels, 32 = number of quantization cluster centers). Isn't that more?</p>
]]></description><pubDate>Tue, 21 Oct 2025 18:51:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45659976</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45659976</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45659976</guid></item><item><title><![CDATA[New comment by espadrine in "NIST's DeepSeek "evaluation" is a hit piece"]]></title><description><![CDATA[
<p>> <i>DeepSeek models cost more to use than comparable U.S. models</i><p>They compare DeepSeek v3.1 to GPT-5 mini. Those models have very different sizes, which makes it a weird choice. I would have expected a comparison with GPT-5 High, which would likely have produced the opposite finding, given the high cost of GPT-5 High and the relatively similar results.<p>Granted, DeepSeek typically focuses on a single model at a time, unlike OpenAI, which offers a suite of models at varying costs. So DeepSeek has no model similar to GPT-5 mini, unlike Alibaba, which has Qwen 30B A3B. Still, weird choice.<p>Besides, DeepSeek has shown with 3.2 that it can cut prices in half through further fundamental research.</p>
]]></description><pubDate>Sun, 05 Oct 2025 20:01:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=45484711</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45484711</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45484711</guid></item><item><title><![CDATA[New comment by espadrine in "DeepSeek-v3.2-Exp"]]></title><description><![CDATA[
<p>Input: $0.07 (cached), $0.56 (cache miss)<p>Output: $1.68 per million tokens.<p><a href="https://api-docs.deepseek.com/news/news250929" rel="nofollow">https://api-docs.deepseek.com/news/news250929</a></p>
]]></description><pubDate>Mon, 29 Sep 2025 16:27:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=45415667</link><dc:creator>espadrine</dc:creator><comments>https://news.ycombinator.com/item?id=45415667</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45415667</guid></item></channel></rss>