Hacker News: hamza_q_

New comment by hamza_q_ in "Show HN: Iron-Wolf – Wolfenstein 3D source port in Rust"

hamza_q_ — Sat, 21 Feb 2026 20:25:32 +0000

Cool! I did an incomplete version in Rust a while back as well. Not a source port though; I tried to recreate the game from scratch myself, without looking at the C source code.

https://github.com/hamzaq2000/wolf3d-reimpl-rs

New comment by hamza_q_ in "Our new SAM audio model transforms audio editing"

hamza_q_ — Fri, 26 Dec 2025 20:26:32 +0000

Yeah would love contributions! Here's a brief overview of how I think it can be done:

Senko has two clustering types: (1) spectral for audio < 20 mins in length, and (2) UMAP+HDBSCAN for >= 20 mins. In the clustering code, spectral actually already supports oracle/min/max speakers, but UMAP+HDBSCAN doesn't. However, someone forked Senko and added min/max speakers to that here (for oracle, I guess min = max): https://github.com/DedZago/senko/commit/c33812ae185a5cd420f2...

So I think all that's required is basically just testing this thoroughly to make sure it doesn't introduce any regressions in clustering quality. And then just wiring the oracle/min/max parameters to the Diarizer class, or diarize() func.
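A rough sketch of what that wiring could look like, assuming the 20-minute cutoff described above. The names `SpeakerBounds`, `resolve_bounds`, and `pick_clustering` are hypothetical and not part of Senko's actual API; the sketch only shows how an oracle count folds into min = max and how the clustering type follows from duration.

```python
from dataclasses import dataclass
from typing import Optional

# Cutoff from the comment above: spectral below 20 minutes,
# UMAP+HDBSCAN at or above it.
LONG_AUDIO_SECONDS = 20 * 60


@dataclass(frozen=True)
class SpeakerBounds:
    """Hypothetical container for resolved speaker-count constraints."""
    min_speakers: Optional[int]
    max_speakers: Optional[int]


def resolve_bounds(oracle=None, min_speakers=None, max_speakers=None):
    """Fold an exact ("oracle") speaker count into min/max bounds (min == max)."""
    if oracle is not None:
        if min_speakers is not None or max_speakers is not None:
            raise ValueError("pass either oracle or min/max, not both")
        if oracle < 1:
            raise ValueError("oracle must be at least 1")
        return SpeakerBounds(oracle, oracle)
    if min_speakers is not None and min_speakers < 1:
        raise ValueError("min_speakers must be at least 1")
    if (min_speakers is not None and max_speakers is not None
            and min_speakers > max_speakers):
        raise ValueError("min_speakers cannot exceed max_speakers")
    return SpeakerBounds(min_speakers, max_speakers)


def pick_clustering(duration_seconds):
    """Choose the clustering type from the audio duration alone."""
    if duration_seconds < LONG_AUDIO_SECONDS:
        return "spectral"
    return "umap_hdbscan"
```

Both clustering paths would then receive the same `SpeakerBounds`, so the caller never needs to know which one ran.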

New comment by hamza_q_ in "Our new SAM audio model transforms audio editing"

hamza_q_ — Thu, 25 Dec 2025 19:30:23 +0000

Thanks for checking it out!

Yeah unfortunately, since the diarization is based on acoustic features, it really does require high-fidelity voice recordings to get the best results. However, I just added another knob to the Diarizer class called mer_cos, which controls the speaker merging threshold. The default is 0.875, so perhaps try lowering it to 0.8. That should help.
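To illustrate why a lower threshold tends to yield fewer speakers, here is a minimal sketch. It assumes the threshold is a cosine similarity between speaker centroids at or above which two speakers get merged; that interpretation, the `merge_speakers` helper, and the toy 2-D centroids are assumptions for illustration, not Senko's actual implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def merge_speakers(centroids, mer_cos=0.875):
    """Greedily merge speakers whose centroids are at least mer_cos similar.

    Returns one label per input centroid; merged speakers share a label.
    """
    labels = list(range(len(centroids)))
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if cosine(centroids[i], centroids[j]) >= mer_cos:
                old, new = labels[j], labels[i]
                # Relabel everything in j's group to i's group.
                labels = [new if label == old else label for label in labels]
    return labels


# Toy centroids: the first two have a cosine similarity of roughly 0.85.
centroids = [(1.0, 0.0), (0.85, 0.53), (0.0, 1.0)]
print(merge_speakers(centroids, mer_cos=0.875))  # [0, 1, 2]
print(merge_speakers(centroids, mer_cos=0.8))    # [0, 0, 2]
```

With the default, the two similar centroids stay separate; at 0.8 they collapse into one speaker, which is the effect you want when lower-quality audio splits one voice into several clusters.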

I'll also get around to adding an oracle/min/max speakers feature at some point, for cases where you know the exact number of speakers ahead of time, or wanna set upper/lower bounds. Gotten busy with another project, so haven't done it yet. PRs welcome though! haha

New comment by hamza_q_ in "Our new SAM audio model transforms audio editing"

hamza_q_ — Tue, 23 Dec 2025 20:19:06 +0000

Use Demucs bruh https://github.com/adefossez/demucs

New comment by hamza_q_ in "Our new SAM audio model transforms audio editing"

hamza_q_ — Tue, 23 Dec 2025 20:05:14 +0000

Yeah I was frustrated by slow and hard to use OSS diarization too; recently released a library to address that, check it out: https://github.com/narcotic-sh/senko

Also https://zanshin.sh, if you'd like speaker diarization when watching YouTube videos

New comment by hamza_q_ in "Vince Zampella, developer of Call of Duty and Battlefield has died"

hamza_q_ — Mon, 22 Dec 2025 20:51:09 +0000

Thanks for COD: MW2 (2009), Vince. The game of my childhood. Rest in Peace.

New comment by hamza_q_ in "Show HN: Chirp – Local Windows dictation with ParakeetV3 no executable required"

hamza_q_ — Fri, 14 Nov 2025 21:34:51 +0000

Cool use of ONNX! Fluid Inference also have great implementations of Parakeet v2/v3 in CoreML for Apple devices and OpenVINO for Intel:

https://github.com/FluidInference/FluidAudio

https://github.com/FluidInference/eddy-audio

New comment by hamza_q_ in "Ask HN: Who wants to be hired? (November 2025)"

hamza_q_ — Tue, 04 Nov 2025 06:19:06 +0000

Location: Vancouver, BC, Canada

Remote: Yes

Willing to relocate: Yes

Technologies: diarization, Voice AI, PyTorch, CoreML, Svelte/SvelteKit, Flask, SQLite, Tauri

Résumé/CV: https://hamzaq.com/Hamza_Qayyum_Resume_Public.pdf

Email: mhamzaqayyum [at] icloud [dot] com

---------

Projects:

- Senko: very fast, accurate, speaker diarization (https://senko.sh)

- Zanshin: novel media player that allows you to navigate by speaker (https://zanshin.sh)

New comment by hamza_q_ in "Ghostty with ⌘+F search"

hamza_q_ — Sat, 25 Oct 2025 18:26:50 +0000

Thought about it, but it seems they have some stringent prerequisites they'd like met: https://github.com/ghostty-org/ghostty/issues/189

I didn't care for those; just told Claude Code to add in the feature directly. So they probably wouldn't accept the PR if I made one.

Ghostty with ⌘+F search

hamza_q_ — Sat, 25 Oct 2025 03:44:57 +0000

Article URL: https://github.com/hamzaq2000/ghostty-cmd-f

Comments URL: https://news.ycombinator.com/item?id=45701223

Points: 4

# Comments: 2

Oral History of Ken Thompson [video]

hamza_q_ — Wed, 08 Oct 2025 17:59:19 +0000

Article URL: https://www.youtube.com/watch?v=OmVHkL0IWk4

Comments URL: https://news.ycombinator.com/item?id=45518843

Points: 6

# Comments: 1

Show HN: Lightning-Fast Diarization on Apple Silicon

hamza_q_ — Thu, 25 Sep 2025 06:36:43 +0000

Article URL: https://github.com/narcotic-sh/senko

Comments URL: https://news.ycombinator.com/item?id=45369869

Points: 3

# Comments: 0

New comment by hamza_q_ in "Show HN: Navigate by speaker in YouTube videos"

hamza_q_ — Sun, 21 Sep 2025 17:17:40 +0000

Thanks :) Agreed, the limiting factor has been the speed of diarization (generating the "who speaks when" data). But the diarization backend of this app that I developed can now process 1 hour of audio in ~8 seconds on an M3 Mac. So that's more or less a solved problem now (at least on Mac); just UI work remains.

Show HN: Navigate by speaker in YouTube videos

hamza_q_ — Sun, 21 Sep 2025 17:08:27 +0000

Article URL: https://zanshin.sh

Comments URL: https://news.ycombinator.com/item?id=45324612

Points: 2

# Comments: 2

New comment by hamza_q_ in "US High school students' scores fall in reading and math"

hamza_q_ — Thu, 11 Sep 2025 16:49:58 +0000

We do know; it's just not in the popular consciousness yet. Read a bit of Marshall McLuhan.

New comment by hamza_q_ in "US High school students' scores fall in reading and math"

hamza_q_ — Thu, 11 Sep 2025 16:48:04 +0000

Taking bets on how fast Marshall McLuhan re-enters the public consciousness :)

New comment by hamza_q_ in "Is the decline of reading making politics dumber?"

hamza_q_ — Fri, 05 Sep 2025 00:56:13 +0000

It's remarkable that Marshall McLuhan's ideas haven't entered the public consciousness yet.

New comment by hamza_q_ in "Senko – Very Fast Speaker Diarization"

hamza_q_ — Tue, 02 Sep 2025 17:40:31 +0000

1 hour of audio processed in 5 seconds (RTX 4090, Ryzen 9 7950X). ~17x faster than Pyannote 3.1.

On M3 MacBook Air, 1 hour in 23.5 seconds (~14x faster).

This is a custom speaker diarization pipeline I've developed; it's a modified version of the pipeline found in the excellent 3D-Speaker project by Alibaba Research.

My optimizations/modifications were the following:

- changed VAD model

- multi-threaded Fbank feature extraction

- batched inference of CAM++ embeddings model

- clustering accelerated by RAPIDS when an NVIDIA GPU is available
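As a rough illustration of the second and third items, here is a toy sketch of the general pattern: feature extraction fanned out over a thread pool, then embeddings computed one batch at a time. `extract_features` and `embed_batch` are hypothetical stand-ins written in plain Python, not Senko's Fbank or CAM++ code, and segments are assumed to be non-empty.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_features(segment):
    """Stand-in for per-segment feature extraction (hypothetical)."""
    return [abs(sample) for sample in segment]


def embed_batch(batch):
    """Stand-in for one model forward pass over a whole batch (hypothetical)."""
    return [sum(features) / len(features) for features in batch]


def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_pipeline(segments, batch_size=4, workers=4):
    # pool.map preserves input order, so features line up with segments.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        features = list(pool.map(extract_features, segments))
    embeddings = []
    for batch in batched(features, batch_size):
        embeddings.extend(embed_batch(batch))
    return embeddings


segments = [[-1.0, 1.0], [2.0, -4.0], [0.0, 0.0]]
print(run_pipeline(segments, batch_size=2, workers=2))  # [1.0, 3.0, 0.0]
```

In a real pipeline the threads only pay off when the extraction step releases the GIL (native code usually does), and batching pays off because one large forward pass is typically cheaper than many small ones.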

Optimizations aside, massive credit needs to be given to the CAM++ speaker embeddings model, whose efficiency is where the majority of the speed comes from.

This pipeline powers the Zanshin media player, which is an attempt at a usable integration of diarization in a media player. Check it out here: https://zanshin.sh And discuss here: https://news.ycombinator.com/item?id=45104866

Let me know what you think! Were you also frustrated by how slow speaker diarization is? Does Senko's speed unlock new use cases for you? Cheers, everyone.

Senko – Very Fast Speaker Diarization

hamza_q_ — Tue, 02 Sep 2025 17:40:31 +0000

Article URL: https://github.com/narcotic-sh/senko

Comments URL: https://news.ycombinator.com/item?id=45106441

Points: 2

# Comments: 1

Show HN: Zanshin – Navigate through media by speaker w/ fast diarization

hamza_q_ — Tue, 02 Sep 2025 16:02:19 +0000

残心/Zanshin is a media player that allows you to:

  - Visualize who speaks when & for how long
  - Jump/skip speaker segments
  - Remove/disable speakers (auto-skip)
  - Set different playback speeds for each speaker
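To make the time savings concrete, here is a small sketch of how listening time changes once speakers can be skipped or sped up individually. The `listening_time` helper and the segment data are made up for illustration; this is not Zanshin's code.

```python
def listening_time(segments, speeds=None, disabled=()):
    """Total seconds needed to play the segments.

    segments: iterable of (speaker, start_seconds, end_seconds)
    speeds:   optional {speaker: playback_rate}; missing speakers play at 1x
    disabled: speakers whose segments are skipped entirely
    """
    speeds = speeds or {}
    total = 0.0
    for speaker, start, end in segments:
        if speaker in disabled:
            continue  # auto-skip this speaker
        total += (end - start) / speeds.get(speaker, 1.0)
    return total


segments = [("host", 0, 60), ("guest", 60, 180), ("ad", 180, 210)]
print(listening_time(segments))                                          # 210.0
print(listening_time(segments, speeds={"guest": 2.0}, disabled={"ad"}))  # 120.0
```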

It's a better, more efficient way to listen to podcasts, interviews, press conferences, etc.

It has first-class support for YouTube videos; just drop in a URL. Also supports your local media files. All processing runs on-device.

Download today for macOS. Also works on Linux and WSL, but currently without packaging. You can get it running though with just a few terminal commands. Check out the repo for instructions: https://github.com/narcotic-sh/zanshin

Zanshin is powered by Senko, a new, very fast, speaker diarization pipeline I've developed.

On an M3 MacBook Air, it takes over 5 minutes to process 1 hour of audio using Pyannote 3.1, the leading open-source diarization pipeline. With Senko, it only takes ~24 seconds, a ~14x speed improvement. And on an RTX 4090 + Ryzen 9 7950X machine, processing 1 hour of audio takes just 5 seconds with Senko, a ~17x speed improvement.
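A quick sanity check of those figures, using only the numbers quoted above. The helper names are just for illustration.

```python
HOUR_SECONDS = 3600


def realtime_factor(audio_seconds, processing_seconds):
    """How many seconds of audio are processed per second of compute."""
    return audio_seconds / processing_seconds


def implied_baseline_seconds(processing_seconds, speedup):
    """Baseline processing time implied by a quoted speedup."""
    return processing_seconds * speedup


# RTX 4090 + Ryzen 9 7950X: 1 hour in 5 seconds, ~17x faster.
print(realtime_factor(HOUR_SECONDS, 5))   # 720.0
print(implied_baseline_seconds(5, 17))    # 85

# M3 MacBook Air: 1 hour in ~24 seconds, ~14x faster.
print(realtime_factor(HOUR_SECONDS, 24))  # 150.0
print(implied_baseline_seconds(24, 14))   # 336
```

The implied M3 baseline of about 336 seconds (roughly 5.6 minutes) lines up with the "over 5 minutes" quoted for Pyannote 3.1.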

Senko's speed is what makes Zanshin possible. Senko is a modified version of the speaker diarization pipeline found in the excellent 3D-Speaker project. Check out Senko here: https://github.com/narcotic-sh/senko

Cheers, everyone; enjoy 残心/Zanshin and Senko. I hope you find them useful. Let me know what you think!

Side note: I am looking for a job. If you like my work and have an opportunity for me, I'm all ears :) You can contact me at mhamzaqayyum [at] icloud.com

Comments URL: https://news.ycombinator.com/item?id=45104866

Points: 2

# Comments: 0