<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ipotapov</title><link>https://news.ycombinator.com/user?id=ipotapov</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 01 May 2026 20:12:34 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ipotapov" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ipotapov in "VibeVoice: Open-source frontier voice AI"]]></title><description><![CDATA[
<p>I built speech-swift, which focuses on on-device speech processing like VibeVoice, but specifically leverages Apple Silicon's capabilities for ASR, TTS, and VAD without cloud dependency. Our ASR supports 52 languages with a real-time factor of 0.06. <a href="https://soniqo.audio/benchmarks" rel="nofollow">https://soniqo.audio/benchmarks</a></p>
]]></description><pubDate>Wed, 29 Apr 2026 06:14:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47944740</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47944740</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47944740</guid></item><item><title><![CDATA[New comment by ipotapov in "Arietta: A framework for creating local AI voice assistants w. knowledge, tools"]]></title><description><![CDATA[
<p>I built speech-swift, which focuses on on-device ASR, TTS, and VAD for Apple Silicon, similar to Arietta's local-first approach. It also offers speaker diarization and noise suppression, which makes it usable for more complete voice assistant applications. <a href="https://github.com/soniqo/speech-swift" rel="nofollow">https://github.com/soniqo/speech-swift</a></p>
]]></description><pubDate>Wed, 29 Apr 2026 06:13:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47944739</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47944739</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47944739</guid></item><item><title><![CDATA[New comment by ipotapov in "Show HN: Parlor Jarvis – Realtime AI (audio+screen in, voice out) & multilingual"]]></title><description><![CDATA[
<p>I built speech-swift, which focuses on on-device ASR and TTS, similar to Parlor Jarvis's multilingual capabilities, but specifically optimized for Apple Silicon with 52 languages and a real-time factor of 0.06. It also includes speaker diarization and noise suppression. <a href="https://github.com/soniqo/speech-swift" rel="nofollow">https://github.com/soniqo/speech-swift</a></p>
]]></description><pubDate>Sun, 26 Apr 2026 19:27:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=47913158</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47913158</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47913158</guid></item><item><title><![CDATA[MLX vs. CoreML on Apple Silicon: A Practical Guide to Picking the Right Back End]]></title><description><![CDATA[
<p>Article URL: <a href="https://old.reddit.com/r/apple/comments/1sq4dry/mlx_vs_coreml_on_apple_silicon_a_practical_guide/">https://old.reddit.com/r/apple/comments/1sq4dry/mlx_vs_coreml_on_apple_silicon_a_practical_guide/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47839316">https://news.ycombinator.com/item?id=47839316</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 20 Apr 2026 19:26:22 +0000</pubDate><link>https://old.reddit.com/r/apple/comments/1sq4dry/mlx_vs_coreml_on_apple_silicon_a_practical_guide/</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47839316</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47839316</guid></item><item><title><![CDATA[MLX vs. CoreML on Apple Silicon: A Practical Guide to Picking the Right Back End]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.ivan.digital/mlx-vs-coreml-on-apple-silicon-a-practical-guide-to-picking-the-right-backend-and-why-you-should-f77ddea7b27a">https://blog.ivan.digital/mlx-vs-coreml-on-apple-silicon-a-practical-guide-to-picking-the-right-backend-and-why-you-should-f77ddea7b27a</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47827676">https://news.ycombinator.com/item?id=47827676</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 19 Apr 2026 21:10:04 +0000</pubDate><link>https://blog.ivan.digital/mlx-vs-coreml-on-apple-silicon-a-practical-guide-to-picking-the-right-backend-and-why-you-should-f77ddea7b27a</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47827676</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47827676</guid></item><item><title><![CDATA[Parakeet Streaming ASR on Apple Silicon via CoreML – and Swift Demo App]]></title><description><![CDATA[
<p>Article URL: <a href="https://old.reddit.com/r/iOSProgramming/comments/1sickii/streaming_asr_on_apple_silicon_via_coreml_and_the/">https://old.reddit.com/r/iOSProgramming/comments/1sickii/streaming_asr_on_apple_silicon_via_coreml_and_the/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47728486">https://news.ycombinator.com/item?id=47728486</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 11 Apr 2026 07:57:49 +0000</pubDate><link>https://old.reddit.com/r/iOSProgramming/comments/1sickii/streaming_asr_on_apple_silicon_via_coreml_and_the/</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47728486</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47728486</guid></item><item><title><![CDATA[Show HN: Monetary policy tracker – AI sentiment analysis of 26 central bank]]></title><description><![CDATA[
<p>Article URL: <a href="https://monetary.live/">https://monetary.live/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47709333">https://news.ycombinator.com/item?id=47709333</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 09 Apr 2026 20:22:22 +0000</pubDate><link>https://monetary.live/</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47709333</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47709333</guid></item><item><title><![CDATA[PixelSmile: Toward Fine-Grained Facial Expression Editing]]></title><description><![CDATA[
<p>Article URL: <a href="https://arxiv.org/abs/2603.25728">https://arxiv.org/abs/2603.25728</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47571420">https://news.ycombinator.com/item?id=47571420</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 30 Mar 2026 07:22:06 +0000</pubDate><link>https://arxiv.org/abs/2603.25728</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47571420</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47571420</guid></item><item><title><![CDATA[Nvidia PersonaPlex 7B on Apple Silicon: Full-Duplex Speech-to-Speech in Swift]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.ivan.digital/nvidia-personaplex-7b-on-apple-silicon-full-duplex-speech-to-speech-in-native-swift-with-mlx-0aa5276f2e23">https://blog.ivan.digital/nvidia-personaplex-7b-on-apple-silicon-full-duplex-speech-to-speech-in-native-swift-with-mlx-0aa5276f2e23</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47258801">https://news.ycombinator.com/item?id=47258801</a></p>
<p>Points: 374</p>
<p># Comments: 125</p>
]]></description><pubDate>Thu, 05 Mar 2026 07:43:41 +0000</pubDate><link>https://blog.ivan.digital/nvidia-personaplex-7b-on-apple-silicon-full-duplex-speech-to-speech-in-native-swift-with-mlx-0aa5276f2e23</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47258801</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47258801</guid></item><item><title><![CDATA[The Modern Search Engine: The Complete Pipeline – How It Ranks Results]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.ivan.digital/inside-the-modern-search-engine-the-complete-pipeline-how-it-ranks-results-learns-from-7409d19ba09b">https://blog.ivan.digital/inside-the-modern-search-engine-the-complete-pipeline-how-it-ranks-results-learns-from-7409d19ba09b</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47258151">https://news.ycombinator.com/item?id=47258151</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 05 Mar 2026 06:13:41 +0000</pubDate><link>https://blog.ivan.digital/inside-the-modern-search-engine-the-complete-pipeline-how-it-ranks-results-learns-from-7409d19ba09b</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47258151</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47258151</guid></item><item><title><![CDATA[Fine-Tuning Qwen3 Embeddings for product category classification]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.ivan.digital/fine-tuning-qwen3-embeddings-for-product-category-classification-on-the-large-scale-product-corpus-3a0919506bc8">https://blog.ivan.digital/fine-tuning-qwen3-embeddings-for-product-category-classification-on-the-large-scale-product-corpus-3a0919506bc8</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47234950">https://news.ycombinator.com/item?id=47234950</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 03 Mar 2026 16:37:10 +0000</pubDate><link>https://blog.ivan.digital/fine-tuning-qwen3-embeddings-for-product-category-classification-on-the-large-scale-product-corpus-3a0919506bc8</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47234950</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47234950</guid></item><item><title><![CDATA[New comment by ipotapov in "Full speech pipeline in native Swift/MLX – ASR, TTS, speech-to-speech, on-device"]]></title><description><![CDATA[
<p>Been building this for a few months now and it's turned into a complete on-device audio pipeline for Apple Silicon:</p>
<p>ASR (Qwen3) → TTS (Qwen3 + CosyVoice, 10 languages) → Speech-to-Speech (PersonaPlex 7B, full-duplex) → Speaker Diarization (pyannote + WeSpeaker) → Voice Activity Detection (Silero, real-time streaming) → Forced Alignment (word-level timestamps)</p>
<p>No Python, no server, no CoreML: pure Swift through MLX. Models download automatically from HuggingFace on first run. The whole diarization stack is ~32 MB.</p>
<p>Everything is protocol-based and composable: VAD gates ASR, diarization feeds into transcription, embeddings enable speaker verification. Mix and match.</p>
<p>Repo: github.com/ivan-digital/qwen3-asr-swift (Apache 2.0)</p>
<p>Blog post with architecture details: blog.ivan.digital</p>
<p>There's a lot of surface area here and contributions are very welcome, whether it's new model ports, iOS integration, performance work, or just filing issues. If you've been wanting to do anything with audio or MLX in Swift, come build with us.</p>
]]></description><pubDate>Tue, 03 Mar 2026 06:41:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47228958</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47228958</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47228958</guid></item><item><title><![CDATA[Full speech pipeline in native Swift/MLX – ASR, TTS, speech-to-speech, on-device]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/ivan-digital/qwen3-asr-swift">https://github.com/ivan-digital/qwen3-asr-swift</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47228957">https://news.ycombinator.com/item?id=47228957</a></p>
<p>Points: 5</p>
<p># Comments: 2</p>
]]></description><pubDate>Tue, 03 Mar 2026 06:41:15 +0000</pubDate><link>https://github.com/ivan-digital/qwen3-asr-swift</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47228957</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47228957</guid></item><item><title><![CDATA[New comment by ipotapov in "Show HN: I ported Manim to TypeScript (run 3b1B math animations in the browser)"]]></title><description><![CDATA[
<p>You mention using MathJax for LaTeX rendering, which is great for web compatibility. Have you run into limitations in text rendering from the lack of Pango? That could affect clarity in complex equations. Also, how does it perform on large animations compared to traditional Manim? Does the browser handle them smoothly?</p>
]]></description><pubDate>Sat, 28 Feb 2026 13:00:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47194811</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=47194811</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47194811</guid></item><item><title><![CDATA[New comment by ipotapov in "Unrolling the Codex agent loop"]]></title><description><![CDATA[
<p>Regarding the user instruction aggregation process in the agent loop, I'm curious how you manage context retention in multi-turn interactions. Have you explored any techniques for dynamically adjusting the context based on the evolving user requirements?</p>
]]></description><pubDate>Sat, 24 Jan 2026 12:40:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=46743077</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=46743077</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46743077</guid></item><item><title><![CDATA[New comment by ipotapov in "Claude's new constitution"]]></title><description><![CDATA[
<p>The 'Broad Safety' guideline seems vague at first, but it might be beneficial to incorporate user feedback loops where the AI adjusts based on real-world outcomes. This could enhance its adaptability and ethics over time, rather than depending solely on the initial constitution.</p>
]]></description><pubDate>Wed, 21 Jan 2026 19:29:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=46710317</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=46710317</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46710317</guid></item><item><title><![CDATA[New comment by ipotapov in "Web development is fun again"]]></title><description><![CDATA[
<p>I've been using GitHub Copilot to handle the tedious webpack/Babel config files, and it's a game changer for modern web dev. No more spending hours debugging build pipeline issues: it generates configs that are about 90% correct and need only minor tweaks.</p>
]]></description><pubDate>Mon, 05 Jan 2026 07:09:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46496076</link><dc:creator>ipotapov</dc:creator><comments>https://news.ycombinator.com/item?id=46496076</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46496076</guid></item></channel></rss>