<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: sorenjan</title><link>https://news.ycombinator.com/user?id=sorenjan</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 15 Apr 2026 22:25:00 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=sorenjan" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by sorenjan in "Show HN: Ghost Pepper – Local hold-to-talk speech-to-text for macOS"]]></title><description><![CDATA[
<p>Whisper supports a prompt; you can put your "Donold" there.<p><a href="https://developers.openai.com/cookbook/examples/whisper_prompting_guide" rel="nofollow">https://developers.openai.com/cookbook/examples/whisper_prom...</a></p>
]]></description><pubDate>Mon, 06 Apr 2026 21:25:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47667362</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47667362</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47667362</guid></item><item><title><![CDATA[New comment by sorenjan in "Demonstrating Real Time AV2 Decoding on Consumer Laptops"]]></title><description><![CDATA[
<p>More info about AV2 here: <a href="https://news.ycombinator.com/item?id=45547537">https://news.ycombinator.com/item?id=45547537</a></p>
]]></description><pubDate>Sun, 05 Apr 2026 11:52:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47648426</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47648426</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47648426</guid></item><item><title><![CDATA[New comment by sorenjan in "TinyLoRA – Learning to Reason in 13 Parameters"]]></title><description><![CDATA[
<p>A version of this comment is posted in all submissions about Low Rank Adapters. I don't see how "Learning to reason in 13 parameters" would apply to low power radio communication, so it's even less relevant this time.<p>> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.<p><a href="https://news.ycombinator.com/newsguidelines.html">https://news.ycombinator.com/newsguidelines.html</a></p>
]]></description><pubDate>Wed, 01 Apr 2026 14:05:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47601076</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47601076</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47601076</guid></item><item><title><![CDATA[New comment by sorenjan in "Bring Back MiniDV with This Raspberry Pi FireWire Hat"]]></title><description><![CDATA[
<p>Can you expand on the Gemini tagging part? What did you do with the tags, import them into Jellyfin after cutting the videos into parts?</p>
]]></description><pubDate>Wed, 01 Apr 2026 12:46:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47600108</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47600108</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47600108</guid></item><item><title><![CDATA[New comment by sorenjan in "TinyLoRA – Learning to Reason in 13 Parameters"]]></title><description><![CDATA[
<p>They're using the truncated SVD, which is computationally cheaper than the full variant.</p>
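<p>For anyone unfamiliar, roughly what the truncation keeps (a numpy sketch; here I slice a full SVD for clarity, whereas real truncated solvers like scipy's svds or randomized SVD compute only the top-k triplets directly, which is where the savings come from):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32))

# Full SVD, then keep only the k largest singular triplets.
# (Illustrative only: actual truncated-SVD solvers compute just
# these k triplets instead of the whole decomposition.)
k = 4
U, s, Vt = np.linalg.svd(W, full_matrices=False)
W_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # best rank-k approximation

print(W_k.shape)  # same shape as W, but only rank k
```
</p>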
]]></description><pubDate>Wed, 01 Apr 2026 12:41:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47600061</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47600061</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47600061</guid></item><item><title><![CDATA[TinyLoRA – Learning to Reason in 13 Parameters]]></title><description><![CDATA[
<p>Article URL: <a href="https://arxiv.org/abs/2602.04118">https://arxiv.org/abs/2602.04118</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47541733">https://news.ycombinator.com/item?id=47541733</a></p>
<p>Points: 234</p>
<p># Comments: 45</p>
]]></description><pubDate>Fri, 27 Mar 2026 12:11:12 +0000</pubDate><link>https://arxiv.org/abs/2602.04118</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47541733</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47541733</guid></item><item><title><![CDATA[New comment by sorenjan in "Drawvg Filter for FFmpeg"]]></title><description><![CDATA[
<p>The cropdetect example made me wonder if they're thinking about including support for YOLO or similar models. They're already including Whisper for speech to text; I think YOLO would enable things like automatic face censoring and general frame-content-aware editing. Or maybe Segment Anything, to have more fine-grained masks available.<p>On the other hand, when I compared the binaries (ffmpeg, ffprobe, ffplay) I downloaded the other day with the ones I had installed since around September, they were almost 100 MB larger. I don't remember the exact size of the old ones, but the new ones are 640 MB and the old ones were well under 600 MB. The only difference in included libraries was Cairo and the JPEG-XS lib. So while I think a bunch of new ML models would be really cool, maybe they don't want to go down that route. But some kind of pluggable system with accelerated ML models would be helpful, I think.</p>
]]></description><pubDate>Fri, 20 Mar 2026 11:47:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47453269</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47453269</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47453269</guid></item><item><title><![CDATA[New comment by sorenjan in "Lazycut: A simple terminal video trimmer using FFmpeg"]]></title><description><![CDATA[
<p>I've never tried doing frame-perfect clips like that; that does sound annoying. But from a cursory read of the source, I don't think this program will solve that issue either, because the time stamps in your examples are all correct and the TUI is using ffmpeg with -ss and -t as well.<p><pre><code>  func BuildFFmpegCommand(opts ExportOptions) string {
   output := opts.Output
   if output == "" {
    output = generateOutputName(opts.Input)
   }
   duration := opts.OutPoint - opts.InPoint
  
   args := []string{"ffmpeg", "-y",
    "-ss", fmt.Sprintf("%.3f", opts.InPoint.Seconds()),
    "-i", filepath.Base(opts.Input),
    "-t", fmt.Sprintf("%.3f", duration.Seconds()),
   }
</code></pre>
I think the best way of getting frame-accurate clips like that is putting the starting time after the input (or rather before the output), which decodes the video up to that time and re-encodes it instead of copying. Both of these commands give the expected output:<p><pre><code>  ffmpeg -i master.mp4 -ss 0 -t 1 -c:v libx264 green.mp4
  ffmpeg -i master.mp4 -ss 1 -t 1 -c:v libx264 red.mp4</code></pre></p>
]]></description><pubDate>Tue, 17 Mar 2026 00:41:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47407133</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47407133</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47407133</guid></item><item><title><![CDATA[New comment by sorenjan in "Lazycut: A simple terminal video trimmer using FFmpeg"]]></title><description><![CDATA[
<p>Missed opportunity to reference the famous Dropbox HN comment.<p>I just think there are other closely related use cases where a separate program can add more value, especially in the terminal. I wouldn't suggest most people use ffmpeg instead of a GUI; those are too dissimilar. Another example is cutting out a part of a video: with ffmpeg you need to make two temporary videos and then concatenate them, and that process would greatly benefit from a better UX.</p>
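<p>Spelled out, the manual process looks roughly like this (a sketch that just assembles the commands; all file names and cut points are made-up examples):

```python
# Remove the segment [start, end) from a video by trimming the two
# parts to keep and joining them with ffmpeg's concat demuxer.
# Stream copy (-c copy) means cuts snap to keyframes.
def cut_out_commands(infile, start, end, outfile):
    return [
        # part before the cut
        f"ffmpeg -y -i {infile} -to {start} -c copy part1.mp4",
        # part after the cut
        f"ffmpeg -y -ss {end} -i {infile} -c copy part2.mp4",
        # parts.txt contains lines: file 'part1.mp4' / file 'part2.mp4'
        f"ffmpeg -y -f concat -i parts.txt -c copy {outfile}",
    ]

for cmd in cut_out_commands("input.mp4", 30, 45, "output.mp4"):
    print(cmd)
```
</p>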
]]></description><pubDate>Mon, 16 Mar 2026 17:03:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47401659</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47401659</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47401659</guid></item><item><title><![CDATA[New comment by sorenjan in "Lazycut: A simple terminal video trimmer using FFmpeg"]]></title><description><![CDATA[
<p>I disagree, I don't want another ffmpeg binary, I already have one. Winget works well, especially since this is already a terminal program.</p>
]]></description><pubDate>Mon, 16 Mar 2026 16:47:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47401418</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47401418</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47401418</guid></item><item><title><![CDATA[New comment by sorenjan in "Lazycut: A simple terminal video trimmer using FFmpeg"]]></title><description><![CDATA[
<p>I don't find trimming videos with ffmpeg particularly difficult; it's basically just -ss xx -to xx -c copy. Sure, you need to get those time stamps using a media player, but you probably already have one, so that isn't really an issue.<p>What I've found to be trickier is dividing a video into multiple clips, where one clip may start at the end of another, but not necessarily.</p>
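<p>Concretely, the sort of thing I mean (timestamps and names are made-up examples):

```python
# One stream-copy trim per clip; clips can share boundaries or
# overlap freely, since each command reads the original file.
# With -ss/-to after -i they refer to the original timeline.
clips = [
    ("00:00:10", "00:01:00", "clip1.mp4"),
    ("00:01:00", "00:02:30", "clip2.mp4"),  # starts where clip1 ends
    ("00:02:00", "00:03:00", "clip3.mp4"),  # overlaps clip2
]
cmds = [f"ffmpeg -i input.mp4 -ss {a} -to {b} -c copy {out}"
        for a, b, out in clips]
print("\n".join(cmds))
```
</p>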
]]></description><pubDate>Mon, 16 Mar 2026 16:45:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47401386</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47401386</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47401386</guid></item><item><title><![CDATA[New comment by sorenjan in "$96 3D-printed rocket that recalculates its mid-air trajectory using a $5 sensor"]]></title><description><![CDATA[
<p>Both Russia and Ukraine build millions of drones per year, most of them FPV drones that are basically remote-controlled flying grenades. There's plenty of electronic warfare with radio jamming, so in some places they use drone-mounted spools of fiber optic cable to control them. It's probably been the most impactful weapon type of the war over the past few years.</p>
]]></description><pubDate>Sun, 15 Mar 2026 12:52:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47386937</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47386937</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47386937</guid></item><item><title><![CDATA[New comment by sorenjan in "Okmain: How to pick an OK main colour of an image"]]></title><description><![CDATA[
<p><pre><code>  > uvx --with pillow --with okmain python -c "from PIL import Image; import okmain; print(okmain.colors(Image.open('bluemarble.jpg')))"
  [RGB(r=79, g=87, b=120), RGB(r=27, g=33, b=66), RGB(r=152, g=155, b=175), RGB(r=0, g=0, b=0)]
</code></pre>
It would make sense to add an entrypoint in the pyproject.toml so you can use uvx okmain directly.</p>
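<p>Something like this (hypothetical; it assumes a main()-style function exists, or gets added, in the okmain package):

```toml
# pyproject.toml — hypothetical console-script entry point
[project.scripts]
okmain = "okmain:main"
```

After that, uvx okmain bluemarble.jpg would work, assuming main() handles the argument parsing.</p>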
]]></description><pubDate>Fri, 13 Mar 2026 17:33:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47367287</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47367287</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47367287</guid></item><item><title><![CDATA[New comment by sorenjan in "TUI Studio – visual terminal UI design tool"]]></title><description><![CDATA[
<p>I wonder if one of the LLMs could generate code from a screenshot of a layout designed by this.</p>
]]></description><pubDate>Fri, 13 Mar 2026 15:06:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47365476</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47365476</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47365476</guid></item><item><title><![CDATA[New comment by sorenjan in "TUI Studio – visual terminal UI design tool"]]></title><description><![CDATA[
<p>You can right click on it and choose "Show controls", at least in Firefox.</p>
]]></description><pubDate>Fri, 13 Mar 2026 15:00:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47365392</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47365392</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47365392</guid></item><item><title><![CDATA[New comment by sorenjan in "Show HN: Satellite imagery object detection using text prompts"]]></title><description><![CDATA[
<p>Planet labs has a solution specifically for ships.<p><a href="https://www.planet.com/pulse/illuminate-the-dark-fleet-with-planet-maritime-domain-awareness/" rel="nofollow">https://www.planet.com/pulse/illuminate-the-dark-fleet-with-...</a></p>
]]></description><pubDate>Wed, 11 Mar 2026 22:05:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47342788</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47342788</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47342788</guid></item><item><title><![CDATA[New comment by sorenjan in "Qwen3.5 Fine-Tuning Guide"]]></title><description><![CDATA[
<p>A flathead screwdriver is not a valid analogy, because LLMs are big, complicated, and opaque machines. And while other ML methods are non-deterministic as well, Gaussian processes, decision trees, or even CNNs are easier to make sense of than these huge black boxes.<p>And I still haven't seen a single example of anyone actually using a finetuned Qwen in industrial inspection, which leads me to believe that nobody is actually using it for that, but some people want to use it because it's their new favorite toy. You don't need a VLM to count cells in microscopy images, or find scratches in painted parts, or estimate output from a log in a saw mill. I can see the use case for things like describing a scene from a surveillance camera, finding a car of a certain model and colour, or other tasks that demand more reasoning or description. But in those cases latency is not super important compared to getting the right output, which was the tradeoff discussed from the start of this thread.<p>The last thing I'd want to deal with is a computer saying something like "You're absolutely right, it was wrong of me to classify the metal debris as food".</p>
]]></description><pubDate>Sun, 08 Mar 2026 18:12:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47299546</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47299546</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47299546</guid></item><item><title><![CDATA[New comment by sorenjan in "Qwen3.5 Fine-Tuning Guide"]]></title><description><![CDATA[
<p>Seems like using a sledgehammer to hammer in screws, and inviting nondeterminism into important systems. Besides being way larger and more complex than what most specialized industrial processes need, these models are also vulnerable to adversarial attacks.<p><a href="https://www.lakera.ai/blog/visual-prompt-injections" rel="nofollow">https://www.lakera.ai/blog/visual-prompt-injections</a><p><a href="https://www.theverge.com/2021/3/8/22319173/openai-machine-vision-adversarial-typographic-attacka-clip-multimodal-neuron" rel="nofollow">https://www.theverge.com/2021/3/8/22319173/openai-machine-vi...</a></p>
]]></description><pubDate>Thu, 05 Mar 2026 21:37:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47267653</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47267653</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47267653</guid></item><item><title><![CDATA[New comment by sorenjan in "Qwen3.5 Fine-Tuning Guide"]]></title><description><![CDATA[
<p>The discussion was about fine-tuned Qwen models, not industrial inspection in general. I would also find it interesting to learn about what kind of edge AI industrial inspection task you could do with fine-tuned llms, not some handwavy answer about how sometimes latency is important in real time systems. Of course it is, so generally you don't use models with several billion parameters unless you need to.</p>
]]></description><pubDate>Thu, 05 Mar 2026 00:24:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47255902</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47255902</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47255902</guid></item><item><title><![CDATA[New comment by sorenjan in "Qwen3.5 Fine-Tuning Guide"]]></title><description><![CDATA[
<p>But that's not something you'd use an LLM for. There have been computer vision systems sorting bad peas for more than a decade[0]; of course there are plenty of use cases for very fast inspection systems. But when would you use an LLM for anything like that?<p>[0] <a href="https://www.youtube.com/watch?v=eLDxXPziztw" rel="nofollow">https://www.youtube.com/watch?v=eLDxXPziztw</a></p>
]]></description><pubDate>Wed, 04 Mar 2026 19:14:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47252377</link><dc:creator>sorenjan</dc:creator><comments>https://news.ycombinator.com/item?id=47252377</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47252377</guid></item></channel></rss>