<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: b89kim</title><link>https://news.ycombinator.com/user?id=b89kim</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 01 Jul 2026 00:49:46 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=b89kim" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by b89kim in "Building a robotics research setup that lives next to my desk"]]></title><description><![CDATA[
<p>If you're using depth, you're better off starting with a diffusion policy (DP). We benchmarked ACT, DP, pi0, and pi05 on the same task, and ACT underperformed in most cases.<p>There is already plenty of research on multimodal diffusion policies. While DP typically doesn't require pre-training, you can increase your dataset size by combining a depth estimation model with open data.</p>
]]></description><pubDate>Sat, 20 Jun 2026 11:13:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48608359</link><dc:creator>b89kim</dc:creator><comments>https://news.ycombinator.com/item?id=48608359</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48608359</guid></item><item><title><![CDATA[New comment by b89kim in "Building a robotics research setup that lives next to my desk"]]></title><description><![CDATA[
<p>Adding a depth channel rarely yields a massive performance gain, likely due to data scarcity and the fact that modern VLAs are good at guessing distance directly from RGB. I have used multiple RGB-D cameras, but it is hard to get stable images without jitter. Depth can still be useful for high-level reasoning. PI also uses bounding-box or segmentation data from PI-05 for that.<p>PI smartly combined discretized tokens with flow matching for efficient training, and it works well in most cases. Still, an end-effector (EEF) representation may be better for teleop with devices like a SpaceMouse, VR controllers, or a Vive Tracker. PI-07 also supports EEF, but I am not sure how much data is needed to fine-tune PI-05 for that.<p>I'd suggest starting with the default pi05 model. Data strategy is probably more important than model improvements, since VLA performance is highly dependent on the data/action distribution, which is easy to modify. After that, you can add high-level reasoning like PI05. I visited a Chinese VLA company that has already adopted the PI-05 approach, and it works quite well in practice.</p>
]]></description><pubDate>Sat, 20 Jun 2026 10:01:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=48607943</link><dc:creator>b89kim</dc:creator><comments>https://news.ycombinator.com/item?id=48607943</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48607943</guid></item><item><title><![CDATA[New comment by b89kim in "Building a robotics research setup that lives next to my desk"]]></title><description><![CDATA[
<p>- A single arm is sufficient for validating basic pick/place tasks, but more complex scenarios require a bi-arm setup.<p>- Calibration is not required for VLA models.<p>- RGB or stereo RGB inputs are sufficient for ACT, DP, and PI0/PI05.<p>- ROS2 is not strictly required, but it can be useful for sharing and co-developing code. For instance, the Stanford team built a custom framework for diffusion policy instead. I also developed a similar framework because ROS2 is not optimized for bimanual manipulation or VLA workloads.</p>
]]></description><pubDate>Sat, 20 Jun 2026 06:01:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48606736</link><dc:creator>b89kim</dc:creator><comments>https://news.ycombinator.com/item?id=48606736</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48606736</guid></item><item><title><![CDATA[New comment by b89kim in "Building a robotics research setup that lives next to my desk"]]></title><description><![CDATA[
<p>I can confirm that 50-100 demonstrations are enough for fine-tuning pi0/pi05. I did research with ALOHA and a humanoid. It works from 20~40 episodes (5~10 min), but the success rate would be 70~80%. The pi0 technical paper suggests using 1~4 hours of data or more. I could get a 95% success rate for pick-and-place with 1 hour of humanoid data. In any case, the hours required for a good success rate (SR) depend on the generality of the data. Long-horizon tasks over 5 min do not work as well as in the paper because PI removed the high-level (subtask) reasoning part from the released pi05.</p>
]]></description><pubDate>Sat, 20 Jun 2026 03:58:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48606232</link><dc:creator>b89kim</dc:creator><comments>https://news.ycombinator.com/item?id=48606232</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48606232</guid></item><item><title><![CDATA[New comment by b89kim in "Raspberry Pi 5 – 16GB RAM"]]></title><description><![CDATA[
<p>The Raspberry Pi is a single-board computer with native support for UART, SPI, I2C, CSI, and more. There's a large ecosystem of HATs, sensors, and peripherals built specifically for it. Most mini PCs rely on USB for peripherals, which isn't ideal for embedded use cases. Additionally, mini PCs tend to be discontinued within 2–3 years, whereas the Pi has a much longer lifecycle of over a decade. It's closer to an ARM development board, and those alternatives aren't cheap.<p>There are plenty of Pi clone boards at lower prices, but they have smaller communities and less documentation. When you hit an unexpected problem, it can be hard to find solutions or get support.</p>
]]></description><pubDate>Thu, 11 Jun 2026 00:54:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=48484918</link><dc:creator>b89kim</dc:creator><comments>https://news.ycombinator.com/item?id=48484918</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48484918</guid></item><item><title><![CDATA[New comment by b89kim in "Pyodide: a Python distribution based on WebAssembly"]]></title><description><![CDATA[
<p>ChatGPT's Canvas uses Pyodide for sandboxing, but it's not designed for coding agents. A Node.js environment is usually better for agents. Pyodide restricts server-side functionality, and fetching external URLs often needs proxying due to the sandbox. That said, Pyodide is still a good option for interactive visualizers or for deploying small web apps that require data processing.</p>
]]></description><pubDate>Tue, 17 Mar 2026 03:56:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47408419</link><dc:creator>b89kim</dc:creator><comments>https://news.ycombinator.com/item?id=47408419</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47408419</guid></item><item><title><![CDATA[New comment by b89kim in "How to run Qwen 3.5 locally"]]></title><description><![CDATA[
<p>I’ve been testing these on other tasks: IK, Kalman filters, and UI/DB boilerplate. Qwen3.5 is multimodal and specialized for JS/webdev and agentic coding. It’s not surprising that MoE models have some limitations in specific areas. I understand that most LLMs have limited ability in mathematical/physical reasoning, and I don't think these tasks represent general performance. I'm just sharing personal experiences for those who are curious.</p>
]]></description><pubDate>Sun, 08 Mar 2026 17:05:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47298919</link><dc:creator>b89kim</dc:creator><comments>https://news.ycombinator.com/item?id=47298919</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47298919</guid></item><item><title><![CDATA[New comment by b89kim in "How to run Qwen 3.5 locally"]]></title><description><![CDATA[
<p>I’ve been benchmarking GGUF quants for Python tasks on a few hardware configs.<p><pre><code>  - 4090: 27b-q4_k_m
  - A100: 27b-q6_k
  - 3*A100: 122b-a10b-q6_k_L
</code></pre>
Using the Qwen team's "thinking" presets, I found that non-agentic coding performance doesn't feel like a significant leap over unquantized GPT-OSS-120B. It shows some hallucination and repetition in MuJoCo code with the default presence penalty. 27b-q4_k_m on a 4090 generates 30~35 tok/s with good quality.</p>
]]></description><pubDate>Sun, 08 Mar 2026 08:51:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47295706</link><dc:creator>b89kim</dc:creator><comments>https://news.ycombinator.com/item?id=47295706</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47295706</guid></item></channel></rss>