<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: teamchong</title><link>https://news.ycombinator.com/user?id=teamchong</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 09 May 2026 03:12:54 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=teamchong" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by teamchong in "Show HN: Prompt-to-Excalidraw demo with Gemma 4 E2B in the browser (3.1GB)"]]></title><description><![CDATA[
<p>Sorry it’s not working for you. I built this as a personal project for self-learning, but I plan to take a look at this issue next weekend. You can check out a video demo of it here: <a href="https://github.com/user-attachments/assets/71ae6e5c-a5ec-4d09-9de5-cf67ff42edfb" rel="nofollow">https://github.com/user-attachments/assets/71ae6e5c-a5ec-4d0...</a></p>
]]></description><pubDate>Sun, 19 Apr 2026 15:46:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47825148</link><dc:creator>teamchong</dc:creator><comments>https://news.ycombinator.com/item?id=47825148</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47825148</guid></item><item><title><![CDATA[New comment by teamchong in "Show HN: Prompt-to-Excalidraw demo with Gemma 4 E2B in the browser (3.1GB)"]]></title><description><![CDATA[
<p>Firefox has WebGPU already, but the subgroups extension isn't shipped yet. Every matmul/softmax kernel here leans on subgroupShuffleXor for reductions, and that's the blocker. It's the same reason MLC WebLLM and friends don't run on Firefox either. Once Mozilla ships it, this should work.</p>
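As a rough illustration, support can be feature-detected before trying to load the model. This is a sketch against the "subgroups" feature name from the current WebGPU proposal; the exact name and API surface may still change:

```javascript
// Check whether the WebGPU adapter advertises subgroup support.
// The GPU object is a parameter so the check can also run outside a browser.
async function hasSubgroups(gpu = globalThis.navigator?.gpu) {
  if (!gpu) return false; // no WebGPU at all (older Firefox, Node, etc.)
  const adapter = await gpu.requestAdapter();
  return adapter?.features.has("subgroups") ?? false;
}
```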
]]></description><pubDate>Sun, 19 Apr 2026 14:47:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47824715</link><dc:creator>teamchong</dc:creator><comments>https://news.ycombinator.com/item?id=47824715</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47824715</guid></item><item><title><![CDATA[Show HN: Prompt-to-Excalidraw demo with Gemma 4 E2B in the browser (3.1GB)]]></title><description><![CDATA[
<p>Article URL: <a href="https://teamchong.github.io/turboquant-wasm/draw.html">https://teamchong.github.io/turboquant-wasm/draw.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47823460">https://news.ycombinator.com/item?id=47823460</a></p>
<p>Points: 163</p>
<p># Comments: 62</p>
]]></description><pubDate>Sun, 19 Apr 2026 11:17:27 +0000</pubDate><link>https://teamchong.github.io/turboquant-wasm/draw.html</link><dc:creator>teamchong</dc:creator><comments>https://news.ycombinator.com/item?id=47823460</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47823460</guid></item><item><title><![CDATA[New comment by teamchong in "Show HN: TurboQuant-WASM – Google's vector quantization in the browser"]]></title><description><![CDATA[
<p>I made some adjustments; can you try again? Is it faster now?<p><a href="https://teamchong.github.io/turboquant-wasm/search.html" rel="nofollow">https://teamchong.github.io/turboquant-wasm/search.html</a></p>
]]></description><pubDate>Wed, 08 Apr 2026 07:09:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47686425</link><dc:creator>teamchong</dc:creator><comments>https://news.ycombinator.com/item?id=47686425</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47686425</guid></item><item><title><![CDATA[New comment by teamchong in "Show HN: TurboQuant-WASM – Google's vector quantization in the browser"]]></title><description><![CDATA[
<p>You’re right that float32 is faster on raw query time; quantization adds an extra decode step. The main benefit is download size, since gzip won’t compress float32 vectors much, and that matters most in browser contexts.</p>
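For intuition on the size claim, here is a minimal sketch of scalar int8 quantization. The per-vector min/max scaling is an assumption for illustration, not necessarily the scheme TurboQuant-WASM actually uses:

```javascript
// Quantize a float32 vector to uint8 codes plus (min, scale) metadata.
function quantize(vec) {
  let min = Infinity, max = -Infinity;
  for (const v of vec) { if (v < min) min = v; if (v > max) max = v; }
  const scale = (max - min) / 255 || 1; // avoid divide-by-zero on constant vectors
  const q = new Uint8Array(vec.length);
  for (let i = 0; i < vec.length; i++) q[i] = Math.round((vec[i] - min) / scale);
  return { q, min, scale };
}

// Reconstruct an approximation of the original vector.
function dequantize({ q, min, scale }) {
  return Float32Array.from(q, (v) => v * scale + min);
}

const vec = Float32Array.from({ length: 768 }, () => Math.random());
const packed = quantize(vec);
// 768 floats = 3072 bytes; 768 uint8 codes = 768 bytes plus a few bytes of
// min/scale metadata, so roughly a 4x smaller payload per vector. Random-ish
// embedding bytes also don't gzip well, which is why gzip alone can't recover
// that factor.
```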
]]></description><pubDate>Sun, 05 Apr 2026 00:16:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47644881</link><dc:creator>teamchong</dc:creator><comments>https://news.ycombinator.com/item?id=47644881</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47644881</guid></item><item><title><![CDATA[Show HN: TurboQuant-WASM – Google's vector quantization in the browser]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/teamchong/turboquant-wasm">https://github.com/teamchong/turboquant-wasm</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47639567">https://news.ycombinator.com/item?id=47639567</a></p>
<p>Points: 165</p>
<p># Comments: 7</p>
]]></description><pubDate>Sat, 04 Apr 2026 14:53:16 +0000</pubDate><link>https://github.com/teamchong/turboquant-wasm</link><dc:creator>teamchong</dc:creator><comments>https://news.ycombinator.com/item?id=47639567</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47639567</guid></item><item><title><![CDATA[New comment by teamchong in "Show HN: VectorJSON – O(n) streaming parser to handle LLM JSON outputs"]]></title><description><![CDATA[
<p>I built this after hitting GC stalls parsing streaming tool calls in my AI agent.
LLM outputs are getting large: code edits, file writes, 50-200KB JSON payloads.<p>Every AI SDK I looked at (Vercel, Anthropic, TanStack, OpenClaw) does <code>buffer += chunk; JSON.parse(buffer)</code> on every token.
That's O(n²): a 100KB tool call arrives in ~8000 chunks, and each chunk re-parses the entire accumulated buffer from scratch.
The cumulative parse time adds up to 13.4 seconds for the Anthropic SDK.
Each intermediate buffer string and parsed object becomes garbage immediately, producing
thousands of short-lived allocations that put constant pressure on the GC.<p>VectorJSON scans only the new bytes on each chunk, so it's O(n) total: same payload, 6.6ms.
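To make the contrast concrete, here is a hypothetical sketch, not any SDK's actual code; the completeness check is simplified to brace-depth tracking, which covers object/array payloads like tool calls but not bare scalars:

```javascript
// O(n^2): re-parse the whole accumulated buffer on every chunk.
function parseNaive(chunks) {
  let buffer = "";
  let last;
  for (const chunk of chunks) {
    buffer += chunk;
    try {
      last = JSON.parse(buffer); // scans all accumulated bytes again
    } catch {
      // buffer is still incomplete JSON; wait for more chunks
    }
  }
  return last;
}

// O(n): visit each new byte exactly once, tracking string state and
// nesting depth, and parse only once the document is complete.
function parseIncremental(chunks) {
  let buffer = "";
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (const chunk of chunks) {
    for (const c of chunk) {
      if (escaped) { escaped = false; continue; }
      if (c === "\\") { escaped = inString; continue; }
      if (c === '"') { inString = !inString; continue; }
      if (inString) continue; // braces inside strings don't count
      if (c === "{" || c === "[") depth++;
      else if (c === "}" || c === "]") depth--;
    }
    buffer += chunk;
    if (depth === 0 && buffer.trim()) return JSON.parse(buffer);
  }
  return undefined; // stream ended mid-document
}
```

The real library does the single final materialization lazily in WASM rather than with one big JSON.parse, but the asymptotic difference is the same.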
Parsing happens in WASM linear memory, so no JS objects are created until you access a field.<p>Built on zimdjson (a Zig port of simdjson by @travisstaloch) compiled to WASM.
Fields materialize lazily through a Proxy: if you only read 3 fields from a 100KB payload, the other 97% never touches the JS heap.
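A hypothetical sketch of the lazy-materialization idea: in the real library the backing store is WASM linear memory with per-field indexing, while here a cached JSON.parse stands in for it, so the sketch only shows the deferral (no JS value exists until the first field access):

```javascript
// Wrap raw JSON text in a Proxy that resolves fields on demand.
function lazyDocument(jsonText) {
  let cache = null; // materialized on first access, not at construction
  const materialize = () => (cache ??= JSON.parse(jsonText));
  return new Proxy({}, {
    get: (_, key) => materialize()[key],
    has: (_, key) => key in materialize(),
  });
}
```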
Deep comparison runs entirely in WASM memory: 2-4× faster than recursive JS deepEqual, with 24× less heap pressure.<p>It's not a replacement for JSON.parse; for single-shot full materialization, JSON.parse is faster (it's optimized C++ in V8).
VectorJSON is built for streaming, partial access, and deep comparison.</p>
]]></description><pubDate>Wed, 18 Feb 2026 21:53:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47066973</link><dc:creator>teamchong</dc:creator><comments>https://news.ycombinator.com/item?id=47066973</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47066973</guid></item><item><title><![CDATA[Show HN: VectorJSON – O(n) streaming parser to handle LLM JSON outputs]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/teamchong/vectorjson">https://github.com/teamchong/vectorjson</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47066966">https://news.ycombinator.com/item?id=47066966</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Wed, 18 Feb 2026 21:52:58 +0000</pubDate><link>https://github.com/teamchong/vectorjson</link><dc:creator>teamchong</dc:creator><comments>https://news.ycombinator.com/item?id=47066966</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47066966</guid></item></channel></rss>