<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: baibai008989</title><link>https://news.ycombinator.com/user?id=baibai008989</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 04 Apr 2026 09:19:01 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=baibai008989" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by baibai008989 in "Show HN: Three new Kitten TTS models – smallest less than 25MB"]]></title><description><![CDATA[
<p>The dependency-chain issue is a real barrier for edge deployment. I've been running TTS models on a Raspberry Pi for a home-automation project, and anything that pulls in torch + CUDA makes the whole thing a non-starter. 25MB is genuinely exciting for that use case.<p>Curious about the latency characteristics, though. 1.5x realtime on a 9700 is fine for batch processing, but for interactive use you need first-chunk latency under roughly 200ms or the conversation feels broken. Does anyone know if it supports streaming output, or is it full-utterance only?<p>The phoneme-based approach should help with pronunciation consistency too. The models I've tried that work on raw text tend to mispronounce technical terms unpredictably: the same word can come out differently across runs.</p>
]]></description><pubDate>Fri, 20 Mar 2026 11:06:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47452981</link><dc:creator>baibai008989</dc:creator><comments>https://news.ycombinator.com/item?id=47452981</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47452981</guid></item></channel></rss>