<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: Beefin</title><link>https://news.ycombinator.com/user?id=Beefin</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 06 Apr 2026 19:28:46 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=Beefin" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by Beefin in "Ask HN: Who is hiring? (April 2026)"]]></title><description><![CDATA[
<p>Mixpeek | Senior Infrastructure Engineer, Applied AI Engineer | NYC or REMOTE (US) | Full-time | <a href="https://mixpeek.com" rel="nofollow">https://mixpeek.com</a><p>Mixpeek is multimodal AI infrastructure — we turn unstructured content (video, images, audio, documents) into searchable, programmable assets through a unified API. Think of us as the data infrastructure layer that sits between your raw media and your AI applications. If you've ever tried to build a pipeline that extracts features from video, makes them retrievable, and serves them at scale — you know this is a 12-18 month infrastructure project. We compress that to an afternoon.<p>We're currently building MVS (Mixpeek Vector Store) — a distributed vector database built on Ray + S3, designed to handle 100B+ vectors at a fraction of the cost of existing solutions. Architecture details: shard-level WAL, LIRE-based adaptive search, replica sets, and agent-native query primitives. If you've ever wanted to rethink how vector search works from the storage layer up, this is that project.<p>Some things we're shipping right now:<p>- IP safety for media & sports — our copyright detection platform (<a href="https://copyright.mixpeek.com" rel="nofollow">https://copyright.mixpeek.com</a>) helps brands and leagues detect unauthorized use of visual IP at scale. We're working with partners in the media/sports ecosystem including Backblaze for storage-native integration.<p>- Healthcare pipelines — multimodal extraction for clinical trial recruitment and SNF/MDS coding workflows, working with enterprise partners in the space.<p>- Ad verification — we contribute to the IAB Tech Lab ARTF working group and power contextual intelligence for ad safety.<p>Our core primitives: feature extractors, retrievers, taxonomies, clusters. Decompose with extractors, recompose with retrievers. Docs: <a href="https://docs.mixpeek.com" rel="nofollow">https://docs.mixpeek.com</a><p>Stack: Python, Ray, S3, FastAPI, React/TypeScript. 
We also maintain amux [<a href="https://github.com/mixpeek/amux" rel="nofollow">https://github.com/mixpeek/amux</a>], an open-source tmux multiplexer for running parallel Claude Code agent sessions — if you're into agentic dev workflows, check it out.<p>I'm Ethan (founder/CEO, previously led search at MongoDB). Small team, high ownership, real problems. We're preparing for NAB Show next week and scaling enterprise pipeline work across healthcare, adtech, and media.<p>Reach out: ethan [at] mixpeek [dot] com — mention HN.</p>
]]></description><pubDate>Wed, 01 Apr 2026 18:50:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47604920</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47604920</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47604920</guid></item><item><title><![CDATA[Multi-Vector Retrieval at Sub-Millisecond Latency]]></title><description><![CDATA[
<p>Article URL: <a href="https://mixpeek.com/blog/colqwen2-muvera-multimodal-late-interaction">https://mixpeek.com/blog/colqwen2-muvera-multimodal-late-interaction</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47524096">https://news.ycombinator.com/item?id=47524096</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 25 Mar 2026 22:25:33 +0000</pubDate><link>https://mixpeek.com/blog/colqwen2-muvera-multimodal-late-interaction</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47524096</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47524096</guid></item><item><title><![CDATA[New comment by Beefin in "Jury finds Meta liable in case over child sexual exploitation on its platforms"]]></title><description><![CDATA[
<p>This is a good flag that you should be rolling your own safety checks. It's not hard; here's a writeup of an ancillary problem/solution: <a href="https://mixpeek.com/blog/ip-safety-pre-publication-clearance" rel="nofollow">https://mixpeek.com/blog/ip-safety-pre-publication-clearance</a></p>
]]></description><pubDate>Wed, 25 Mar 2026 13:48:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=47517324</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47517324</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47517324</guid></item><item><title><![CDATA[Why Git Worktrees Are the Missing Piece for Parallel AI Coding Agents]]></title><description><![CDATA[
<p>Article URL: <a href="https://amux.io/blog/parallel-agents-isolated-worktrees/">https://amux.io/blog/parallel-agents-isolated-worktrees/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47516934">https://news.ycombinator.com/item?id=47516934</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 25 Mar 2026 13:15:37 +0000</pubDate><link>https://amux.io/blog/parallel-agents-isolated-worktrees/</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47516934</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47516934</guid></item><item><title><![CDATA[Show HN: Reverse Image Search on the National Gallery of Art Archive]]></title><description><![CDATA[
<p>Article URL: <a href="https://nga.mxp.co/">https://nga.mxp.co/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47441649">https://news.ycombinator.com/item?id=47441649</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 19 Mar 2026 16:02:41 +0000</pubDate><link>https://nga.mxp.co/</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47441649</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47441649</guid></item><item><title><![CDATA[Detecting When Your AI Agent Dies]]></title><description><![CDATA[
<p>Article URL: <a href="https://amux.io/blog/auto-restart-ai-agents/">https://amux.io/blog/auto-restart-ai-agents/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47426207">https://news.ycombinator.com/item?id=47426207</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 18 Mar 2026 14:26:12 +0000</pubDate><link>https://amux.io/blog/auto-restart-ai-agents/</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47426207</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47426207</guid></item><item><title><![CDATA[Terminal Multiplexers > IDEs]]></title><description><![CDATA[
<p>Article URL: <a href="https://amux.io/guides/ai-terminal-multiplexer/">https://amux.io/guides/ai-terminal-multiplexer/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47424790">https://news.ycombinator.com/item?id=47424790</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 18 Mar 2026 12:20:20 +0000</pubDate><link>https://amux.io/guides/ai-terminal-multiplexer/</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47424790</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47424790</guid></item><item><title><![CDATA[Simple, standalone tools for working with multimodal data]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/mixpeek/multimodal-tools">https://github.com/mixpeek/multimodal-tools</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47403700">https://news.ycombinator.com/item?id=47403700</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 16 Mar 2026 19:34:25 +0000</pubDate><link>https://github.com/mixpeek/multimodal-tools</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47403700</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47403700</guid></item><item><title><![CDATA[AMUX – Tmux and Tailscale powered offline-first agent multiplexer]]></title><description><![CDATA[
<p>Article URL: <a href="https://amux.io/">https://amux.io/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47368144">https://news.ycombinator.com/item?id=47368144</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 13 Mar 2026 18:52:49 +0000</pubDate><link>https://amux.io/</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47368144</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47368144</guid></item><item><title><![CDATA[New comment by Beefin in "Show HN: Amux – run Claude Code agents in parallel from your phone"]]></title><description><![CDATA[
<p>Repo: <a href="https://github.com/mixpeek/amux" rel="nofollow">https://github.com/mixpeek/amux</a></p>
]]></description><pubDate>Fri, 13 Mar 2026 12:41:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47363711</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47363711</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47363711</guid></item><item><title><![CDATA[Show HN: Amux – run Claude Code agents in parallel from your phone]]></title><description><![CDATA[
<p>I built amux because I kept waking up to dead Claude Code sessions. Context would fill up at 2am, the agent would crash, and I’d lose hours of work. So I wrote a self-healing multiplexer that wraps Claude Code in tmux sessions and keeps them alive.<p>The core loop: amux parses ANSI-stripped tmux output to detect state — working, stuck, needs input, context running low. When context drops below 20%, it sends /compact. When the session crashes from a redacted_thinking error, it restarts and replays the last message. When a YOLO session hits a safety prompt, it auto-answers. The watchdog has kept sessions running unattended for 12+ hours.<p>What made it actually useful was the web dashboard. I can monitor all my agents from my phone (it’s a PWA), send them messages, check the kanban board, and peek at any terminal — all without SSH-ing into my machine.<p>Last night I had 8 agents working across 3 repos: one writing API endpoints, one on tests, one doing a migration, others on smaller tasks. I checked in twice from my phone, answered two prompts, and woke up to PRs ready for review.<p>Agents started coordinating on their own.
The coordination part surprised me the most. Agents share a REST API so they can peek at each other’s output, claim tasks from a shared board (SQLite with atomic CAS), and send messages between sessions. I didn’t plan for agent-to-agent orchestration initially — I just exposed the API in their global memory and they started using it naturally.<p>It’s a single Python file (~23k lines, inline HTML/CSS/JS), no build step, no external services. Python 3 + tmux. Auto-generates TLS certs so phone access works over Tailscale. The server watches its own mtime and restarts on save, so you can edit it while it’s running.<p>What it doesn’t do: it doesn’t replace Claude Code or modify it in any way. It’s purely a wrapper — ANSI parsing, no hooks, no patches. It works with whatever Claude Code version you have.</p>
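<p>A minimal sketch of the state-detection step described above. The marker strings and the context-percentage format here are hypothetical stand-ins, not amux's actual patterns:

```python
import re

# Regex to strip ANSI escape sequences from raw tmux capture-pane output.
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")

# Hypothetical markers; the real patterns amux matches are not shown here.
NEEDS_INPUT_MARKERS = ("Do you want to", "(y/n)")
CONTEXT_RE = re.compile(r"Context left until auto-compact: (\d+)%")

def classify(pane_text: str) -> str:
    """Classify a session's state from ANSI-stripped terminal output."""
    text = ANSI_RE.sub("", pane_text)
    m = CONTEXT_RE.search(text)
    if m and int(m.group(1)) < 20:
        return "compact"          # context low: time to send /compact
    if any(marker in text for marker in NEEDS_INPUT_MARKERS):
        return "needs_input"      # blocked on a permission/safety prompt
    if text.rstrip().endswith("$"):
        return "dead"             # bare shell prompt visible: session exited
    return "working"
```

A watchdog would run this on a polling interval over `tmux capture-pane -p` output and react to each state (send `/compact`, auto-answer, or restart-and-replay).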
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47363707">https://news.ycombinator.com/item?id=47363707</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 13 Mar 2026 12:41:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47363707</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47363707</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47363707</guid></item><item><title><![CDATA[New comment by Beefin in "Query Preprocessing: How we handle 500MB video files as search queries"]]></title><description><![CDATA[
<p>Author here. I'm the founder of Mixpeek — we build multimodal search infrastructure.<p>The core problem: most vector search assumes your query is a sentence or a single image. But we kept getting customers who wanted to pass entire video files as queries — a media company searching their archive with a raw broadcast clip, a legal team querying with a full contract PDF, an IP safety pipeline scanning videos frame-by-frame against a brand index.<p>The key insight was that the decomposition pipeline we already use for ingestion (split → embed → store) is the same operation needed at query time — just routing output to search instead of write. Same extractor, same chunking, same embedding model. This guarantees query and index vectors are always in the same space.<p>The execution is: detect large input → decompose via extractor → batch embed in parallel → N concurrent ANN searches → fuse results (RRF/max/avg). From the caller's perspective, the API shape doesn't change at all.<p>One decision I'd be curious to get feedback on: we explicitly dropped an "auto" mode that would pick chunking strategy based on file type. The right decomposition depends on what you're searching for, not just the file itself. Felt like the wrong abstraction to hide. Curious if others have found ways to make auto-config work well here.<p>Happy to answer questions about the fusion strategies, the credit model, or the architecture.</p>
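<p>The RRF option mentioned above fits in a few lines. This is the textbook formulation (with the conventional k = 60), not Mixpeek's implementation:

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: merge N ranked result lists into one.

    rankings: list of ranked lists of doc ids (best first), e.g. one
    list per decomposed query chunk's ANN search.
    k: damping constant; 60 is the value from the original RRF paper.
    """
    scores = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

A document that shows up near the top of many per-chunk result lists outranks one that scores high in only a single list, which is why RRF is a reasonable default when fusing N concurrent searches.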
]]></description><pubDate>Mon, 09 Mar 2026 18:13:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47313027</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47313027</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47313027</guid></item><item><title><![CDATA[Query Preprocessing: How we handle 500MB video files as search queries]]></title><description><![CDATA[
<p>Article URL: <a href="https://mixpeek.com/blog/query-preprocessing-large-file-search">https://mixpeek.com/blog/query-preprocessing-large-file-search</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47313019">https://news.ycombinator.com/item?id=47313019</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Mon, 09 Mar 2026 18:13:00 +0000</pubDate><link>https://mixpeek.com/blog/query-preprocessing-large-file-search</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47313019</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47313019</guid></item><item><title><![CDATA[New comment by Beefin in "Show HN: Amux – single-file agent multiplexer for headless Claude Code sessions"]]></title><description><![CDATA[
<p>I built amux because running 5–10 Claude Code agents at once across different repos turned into an unmanageable mess of terminal tabs and forgotten sessions.<p>The core problem: Claude Code sessions crash at 3am from context compaction, agents silently block on permission prompts, and there's no good way to see which of your 8 running sessions actually needs attention. I was losing work and wasting money.<p>amux is a tmux-based multiplexer that gives you a single control plane for all your headless Claude Code sessions — from a web dashboard, your phone, or the CLI.<p>*What it actually does:*<p>- Registers Claude Code sessions as named tmux panes, each with its own conversation history and working directory
- Live status detection (working / needs input / idle) streamed via SSE — you see at a glance which agents need you
- Self-healing watchdog that auto-compacts context, restarts crashed sessions, and replays the last message
- Built-in kanban board backed by SQLite with atomic task claiming (CAS), so agents can pick up work items without race conditions
- REST API for everything — agents discover peers and delegate work via `curl`. The API reference gets injected into each agent's global memory, so plain-English orchestration works out of the box
- Per-session token tracking with daily spend breakdowns, so you know what each agent costs before the bill arrives
- Git conflict detection that warns when two agents share a directory + branch, with one-click branch isolation<p>*What it's not:*<p>It's not a wrapper around Claude Code's native agent teams feature. It operates at a layer below that — it doesn't modify Claude Code at all. It parses ANSI-stripped tmux output. No hooks, no patches, no monkey-patching. If Claude Code updates tomorrow, amux still works.<p>*Technical decisions:*<p>The whole thing is a single ~12,000-line Python file with inline HTML/CSS/JS. No npm, no build step, no Docker. I know the single-file approach is polarizing, but for a tool that runs on your dev machine and you might want to hack on, I've found it dramatically lowers the barrier. It restarts on save.<p>TLS is auto-provisioned in priority order: Tailscale cert → mkcert → self-signed fallback. The idea is you install it on your dev box, run `amux serve`, and access it securely from your phone over Tailscale while you're away from your desk. I use the mobile PWA daily — kick off a batch of tasks, go walk the dog, check progress from my phone.<p>The kanban board uses SQLite with compare-and-swap for task claiming. This matters because when you have multiple agents that can pick up work, you need atomicity — two agents hitting `/api/board/PROJ-5/claim` simultaneously should result in exactly one of them getting it.</p>
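<p>The claim endpoint's compare-and-swap can be sketched with plain sqlite3. Table and column names here are illustrative, not amux's actual schema:

```python
import sqlite3

def claim_task(db_path: str, task_id: str, agent: str) -> bool:
    """Atomically claim a board task; exactly one concurrent caller wins.

    The UPDATE only matches while status is still 'todo', so SQLite's
    serialized writes make it a compare-and-swap: the second claimer's
    UPDATE matches zero rows.
    """
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "UPDATE tasks SET status = 'claimed', owner = ? "
            "WHERE id = ? AND status = 'todo'",
            (agent, task_id),
        )
        conn.commit()
        return cur.rowcount == 1   # 1 row changed -> we won the claim
    finally:
        conn.close()
```

Two agents hitting this for the same task id get exactly one `True` between them, with no explicit locking in application code.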
]]></description><pubDate>Mon, 09 Mar 2026 15:02:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47310039</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47310039</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47310039</guid></item><item><title><![CDATA[Show HN: Amux – single-file agent multiplexer for headless Claude Code sessions]]></title><description><![CDATA[
<p>Article URL: <a href="https://amux.io">https://amux.io</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47310029">https://news.ycombinator.com/item?id=47310029</a></p>
<p>Points: 1</p>
<p># Comments: 3</p>
]]></description><pubDate>Mon, 09 Mar 2026 15:01:45 +0000</pubDate><link>https://amux.io</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47310029</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47310029</guid></item><item><title><![CDATA[New comment by Beefin in "Show HN: Amux – A tmux-based multiplexer for running parallel Claude Code agents"]]></title><description><![CDATA[
<p>One thing we’ve been thinking about with Amux is that the unit of compute shouldn’t just be the terminal session—it should be the agent itself. That means each pane/session can expose things like:<p>* tokens in / tokens out
* cumulative run cost
* model + pricing tier
* runtime duration
* optional budget caps<p>So when you spin up 5–10 agents, you can immediately see which one is burning tokens or looping.<p>Longer term I’d love for Amux to treat agents a bit like processes in `htop` where you can see resource usage across all agents in one place and kill/restart the expensive ones quickly.<p>Curious how you're currently surfacing cost in your setups — logs, dashboards, or something inline with the agent runtime?</p>
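<p>The per-agent cost math itself is simple to sketch. The prices below are placeholders only; check current per-model pricing:

```python
# Hypothetical per-million-token prices in USD; real pricing varies
# by model and tier, so treat these as placeholders.
PRICING = {
    "sonnet": {"in": 3.00, "out": 15.00},
    "haiku":  {"in": 0.80, "out": 4.00},
}

def run_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cumulative dollar cost of one agent session."""
    p = PRICING[model]
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000

def over_budget(model: str, tokens_in: int, tokens_out: int, cap_usd: float) -> bool:
    """Budget-cap check a multiplexer could run on each status poll."""
    return run_cost(model, tokens_in, tokens_out) > cap_usd
```

With per-session counters like these, the htop-style view reduces to sorting sessions by `run_cost` and flagging any that trip `over_budget`.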
]]></description><pubDate>Fri, 06 Mar 2026 20:48:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47280872</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47280872</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47280872</guid></item><item><title><![CDATA[Show HN: Offline-First Agent Multiplexer]]></title><description><![CDATA[
<p>Article URL: <a href="https://twitter.com/ethansteininger/status/2027401658272031108">https://twitter.com/ethansteininger/status/2027401658272031108</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47181502">https://news.ycombinator.com/item?id=47181502</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 27 Feb 2026 15:16:36 +0000</pubDate><link>https://twitter.com/ethansteininger/status/2027401658272031108</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47181502</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47181502</guid></item><item><title><![CDATA[New comment by Beefin in "We run 20M models in parallel on Ray"]]></title><description><![CDATA[
<p>We process video, images, and documents through 20+ ML models simultaneously at Mixpeek. A single 10-minute video triggers transcription, visual embeddings, scene descriptions, face detection, object detection, brand safety classification, and more — all in parallel with different compute requirements.<p>We wrote up the full Ray architecture we use in production on KubeRay/GKE. Not a tutorial — more of a "here's what we actually run and what bit us."<p>Some highlights:<p>- *Custom resource isolation* — We use a synthetic `{"batch": 1}` resource to prevent batch pipeline tasks from starving Ray Serve inference replicas. Same cluster, zero interference, no runtime overhead.<p>- *Flexible actor pools* — Fixed-size `ActorPoolStrategy(size=8)` deadlocks when concurrent jobs compete for workers. `min_size=1, max_size=N` guarantees every job can make progress.<p>- *Shared preprocessing* — Naive approach runs S3 download + format normalization once per extractor. With 10 extractors on 1,000 files, that's 10,000 redundant reads. We preprocess once and fan out via Ray Dataset.<p>- *Distributed Qdrant writes* — Ray Data's `Datasink` API distributes vector DB writes across all workers with backpressure, instead of collecting everything on one node.<p>- *Fire-and-forget progress tracking* — A Ray actor as a shared counter lets workers report progress without blocking the pipeline.<p>- *Zero-CPU head node* — Learned this one the hard way when a runaway batch job took down our scheduler.<p>The post includes the KubeRay YAML, Ray Serve autoscaling configs, pipeline code, and the LocalStack parquet workaround that saved us hours of debugging silent hangs.<p><a href="https://mixpeek.com/blog/ray-distributed-ml-pipeline-architecture/" rel="nofollow">https://mixpeek.com/blog/ray-distributed-ml-pipeline-archite...</a><p>Happy to answer questions about any of the patterns or trade-offs.</p>
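<p>The shared-preprocessing win is easy to demonstrate outside Ray. This stand-in uses a thread pool instead of Ray Data, but shows the same read-amplification fix (preprocess once, fan out to every extractor):

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

download_log = []  # records each simulated S3 read

def preprocess(file_id):
    """Download + format-normalize a file exactly once (stand-in for the S3 fetch)."""
    download_log.append(file_id)
    return {"file": file_id}

def run_extractor(args):
    name, item = args
    return f"{name}:{item['file']}"

def pipeline(files, extractors):
    """Preprocess each file once, then fan the shared items out to every
    extractor: len(files) reads instead of len(files) * len(extractors)."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        items = list(pool.map(preprocess, files))
        return list(pool.map(run_extractor, product(extractors, items)))
```

With 1,000 files and 10 extractors, the naive per-extractor fetch does 10,000 reads; this shape does 1,000, which is the same trade the Ray Dataset fan-out makes in the post.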
]]></description><pubDate>Wed, 25 Feb 2026 15:55:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47153228</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47153228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47153228</guid></item><item><title><![CDATA[We run 20M models in parallel on Ray]]></title><description><![CDATA[
<p>Article URL: <a href="https://mixpeek.com/blog/ray-distributed-ml-pipeline-architecture">https://mixpeek.com/blog/ray-distributed-ml-pipeline-architecture</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47153227">https://news.ycombinator.com/item?id=47153227</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Wed, 25 Feb 2026 15:55:03 +0000</pubDate><link>https://mixpeek.com/blog/ray-distributed-ml-pipeline-architecture</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47153227</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47153227</guid></item><item><title><![CDATA[New comment by Beefin in "Show HN: Agent Multiplexer – manage Claude Code via tmux"]]></title><description><![CDATA[
<p>LLM coordination is just one feature - the core reason I built amux was so I can quickly delegate from my phone, see outputs, monitor, etc. without raw SSH.</p>
]]></description><pubDate>Tue, 24 Feb 2026 14:08:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47137336</link><dc:creator>Beefin</dc:creator><comments>https://news.ycombinator.com/item?id=47137336</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47137336</guid></item></channel></rss>