Hacker News: samhoss93

New comment by samhoss93 in "Show HN: Piqc – An open-source GPU waste scanner for LLM inference clusters"

samhoss93 — Fri, 05 Jun 2026 14:18:57 +0000

piqc scans your Kubernetes cluster (read-only) and identifies which models are running on the wrong GPU tier and what the cost attribution is. It runs in a minute. I'd like to hear the community's experiences and thoughts on our detection approach and its benefits.

Show HN: Piqc – An open-source GPU waste scanner for LLM inference clusters

samhoss93 — Fri, 05 Jun 2026 13:48:48 +0000

Article URL: https://github.com/paralleliq/piqc

Comments URL: https://news.ycombinator.com/item?id=48412542

Points: 1

# Comments: 1

New comment by samhoss93 in "Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA"

samhoss93 — Mon, 01 Jun 2026 22:44:15 +0000

Great README. Genuinely one of the clearest walkthroughs of inference internals. The KV cache section is worth lingering on, as most OOM and throughput issues trace back to it and are normally difficult to reason about. Sequence length and batch size fill the cache in a way that shows up under real traffic.

Looking forward to going over the completed course.
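
To put rough numbers on the KV cache point, here is a back-of-the-envelope sketch. The model shape (32 layers, 32 KV heads, head dimension 128, fp16) is an assumed 7B-class configuration with full multi-head attention, not anything taken from the Tiny-vLLM repo, and `kv_cache_bytes` is a hypothetical helper. It gives the worst case where every sequence in the batch is at full length, and ignores paging and fragmentation.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_elem=2):
    """Worst-case KV cache size in bytes.

    The leading 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes fp16/bf16 cache entries.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


GIB = 1024 ** 3

# Assumed model shape (illustrative only).
NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM = 32, 32, 128

for seq_len in (1024, 4096):
    for batch_size in (1, 8, 32):
        size = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM,
                              seq_len, batch_size)
        print(f"seq_len={seq_len:5d} batch={batch_size:3d} "
              f"kv_cache={size / GIB:6.2f} GiB")
```

With these assumed numbers each token costs 0.5 MiB of cache, so 8 concurrent sequences at 4096 tokens already need 16 GiB before weights and activations are counted. The cost is linear in both sequence length and batch size, which is why it tends to stay hidden in small tests and bite under production load.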