Hacker News: manca

New comment by manca in "FFmpeg 8.0 adds Whisper support"

manca — Wed, 13 Aug 2025 18:24:19 +0000

The only problem with this PR/diff is that it creates just an avfilter wrapper around the whisper.cpp library and requires users to manage the dependencies on their own. This is not helpful for novice users, who will first need to:

1. git clone whisper.cpp

2. Make sure they have all dependencies for `that` library

3. Hope the build passes

4. Download the actual model

AND only then will they be able to use the `-af "whisper=model...` filter.

If they try to use the filter without all the prereqs, it will fail and they'll end up frustrated.

It'd be better to implement a native Whisper avfilter and only require the user to download the model -- I feel like this would streamline the whole process and actually get people to use it much more.

New comment by manca in "I tried vibe coding in BASIC and it didn't go well"

manca — Sun, 20 Jul 2025 04:10:14 +0000

I literally had the same experience when I asked the top code LLMs (Claude Code, GPT-4o) to rewrite the code from an Erlang/Elixir codebase to Java. They got some things right, but most things wrong, and it took a lot of debugging to figure out what went wrong.

It's the absolute proof that they are still dumb prediction machines, fully relying on the type of content they've been trained on. They can't generalize (yet) and if you want to use them for novel things, they'll fail miserably.

New comment by manca in "Grok 4"

manca — Fri, 11 Jul 2025 02:47:59 +0000

Elon mentioned that Grok 4's image and video understanding capabilities are somewhat limited, and he suggested a new version of the foundation model is being trained to address these issues. According to the "Humanity's Last Exam" benchmark, though, it seems to perform reasonably well, if not the best among the SOTA models.

I agree, though - the timing of the release is a bit unfortunate and it felt a bit rushed, since not even a model card is available.

New comment by manca in "Show HN: LinkedIn Hype Crew"

manca — Wed, 16 Oct 2024 18:02:42 +0000

Isn't this exactly what Google's new NotebookLM does?

Cool idea anyway :)

New comment by manca in "LosslessCut: The Swiss army knife of lossless video/audio editing"

manca — Sun, 30 Jun 2024 21:30:04 +0000

When I read lossless, I immediately thought about the editing of the real lossless formats like ProRes, MJPEG2000, HuffYUV, etc. But what this ultimately does is remux the original container into a new one without touching the elementary stream (no re-encoding).

It's no wonder that it uses FFmpeg to do the heavy lifting, but I think it's worthwhile for the community to understand how this process ultimately works.

In a nutshell, every single modern video format you know about - mp4, mov, avi, ts, etc - is ultimately the file extension of a container that can hold multiple video and audio tracks. The tracks are called Elementary Streams (ES) and they are separately encoded using appropriate codecs such as H264/AVC, H265/HEVC, AAC, etc. Then, during a process called "muxing", they are put together in a container and each sample/frame is timestamped so the ESes can stay in sync.

Now, since the ES is compressed, you don't get frame-level accuracy when seeking, for example, because the only frame that can be decoded on its own is an I-frame. Every subsequent frame (P or B) is decoded using information from other frames, ultimately tracing back to that I-frame. This sequence of IPPBPPB... is called a GOP (Group of Pictures).

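The cool part is that you can glean the type of a frame without decoding it by looking at the NAL (Network Abstraction Layer) units, which have specific headers that identify each frame type or picture slice. For example, in an H264 byte stream each NAL unit follows a 0x000001 start code, and the low five bits of the next byte give the NAL unit type: 5 marks a slice of an IDR picture (a keyframe), while 7 is the sequence parameter set (SPS) that typically precedes it.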

Putting all this together, you can look into the ES bitstream and detect GOP boundaries without decoding the stream. The challenge here is of course that you can't just cut in the middle of a GOP, but the solution for that is to either be ok with some <1sec accuracy, or decode just the GOP containing the cut point (often around 30 frames) and re-encode it so the output starts with a new I-frame (any fully decoded frame can be re-encoded as an I-frame). Everywhere else, all you do is literally super fast bit manipulation and copying from one container into another. That's why this is such an efficient process if all you care about is cutting the original video into segments.
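To make the boundary detection concrete, here is a minimal sketch of scanning an H264 Annex B byte stream for NAL units without decoding anything. It assumes the stream uses 0x000001 start codes (the 4-byte form ends with the same three bytes) and that `nal_unit_type` 5 marks an IDR slice; the function names and the toy `sample` bytes are made up for illustration and are not taken from LosslessCut or FFmpeg.

```python
# Sketch: find NAL units in an H.264 Annex B byte stream and report which
# ones are IDR slices (keyframes). Nothing is decoded; only headers are read.

START_CODE = b"\x00\x00\x01"  # the 4-byte start code 00 00 00 01 ends with this
NAL_TYPE_IDR = 5              # coded slice of an IDR picture


def iter_nal_units(data: bytes):
    """Yield (header_offset, nal_unit_type) for each NAL unit found in data."""
    pos = data.find(START_CODE)
    while pos != -1:
        header_at = pos + len(START_CODE)
        if header_at >= len(data):
            break
        # The low 5 bits of the NAL header byte carry nal_unit_type.
        yield header_at, data[header_at] & 0x1F
        pos = data.find(START_CODE, header_at)


def keyframe_offsets(data: bytes):
    """Return header offsets of IDR slices, i.e. candidate cut points."""
    return [off for off, nal_type in iter_nal_units(data) if nal_type == NAL_TYPE_IDR]


# Hypothetical toy stream: SPS, PPS, one IDR slice, then two non-IDR slices.
sample = (
    b"\x00\x00\x00\x01\x67\xAA"  # SPS (0x67 & 0x1F == 7)
    b"\x00\x00\x01\x68\xBB"      # PPS (0x68 & 0x1F == 8)
    b"\x00\x00\x01\x65\xCC"      # IDR slice (0x65 & 0x1F == 5)
    b"\x00\x00\x01\x41\xDD"      # non-IDR slice (0x41 & 0x1F == 1)
    b"\x00\x00\x01\x41\xEE"      # non-IDR slice
)
nal_types = [nal_type for _, nal_type in iter_nal_units(sample)]
idr_offsets = keyframe_offsets(sample)
```

In a real file the ES first has to be pulled out of the container, and a single picture can span several slice NAL units, so a production tool would group slices into frames before choosing a cut point.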

New comment by manca in "Show HN: Hacker Search – A semantic search engine for Hacker News"

manca — Thu, 02 May 2024 17:59:56 +0000

I love projects like this. It shows the true potential of what LLMs and RAG can unlock. Imagine applying the same method to the actual content within the threads to extract the sentiment, as well as summarize the key points of a particular thread -- the options are limitless.

My only piece of advice, though: try to do the reranking using some other rerankers instead of an LLM -- you'll save both on the latency AND the cost.

Other than that, good job.

New comment by manca in "Can we RAG the whole web?"

manca — Tue, 30 Apr 2024 18:36:02 +0000

This is exactly what https://www.perplexity.ai/ is trying to do. Maybe not "RAGing" the entire internet, but certainly mapping a natural language query to their own (probably) vector database which contains the "source of truth" from the internet.

The way they build that database and what models they use for text tokenization, embeddings generation and ranking at "internet" scale is the secret sauce that enabled them to raise more than $165M to date.

For sure this is where internet search will be in a couple of years, and that's why Google got really concerned when the original ChatGPT was released. That said, don't assume Google is not already working on something similar. In fact, the main theme of their Google Next conference was LLMs and RAG.

New comment by manca in "Ask HN: How does deploying a fine-tuned model work"

manca — Wed, 24 Apr 2024 06:28:21 +0000

A lot of the answers to your question focus solely on the infra piece of the deployment process, which is just one, albeit important, piece of the puzzle.

Each model is built using some predefined model architecture, and the majority of today's LLMs are implementations of the Transformer architecture, based on the "Attention Is All You Need" paper from 2017. That said, when you fine-tune a model, you usually start from a checkpoint and then, using techniques like LoRA or QLoRA, you compute new weights. You do this in your training/fine-tuning script using PyTorch, or some other framework.
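As a rough illustration of what "computing new weights" means with LoRA: the base weight matrix W stays frozen and two small matrices B and A are trained, and the effective weights are W + (alpha / r) * B·A, where r is the adapter rank. The sketch below uses plain Python lists so it runs without PyTorch; the function names and the tiny matrices are hypothetical, and a real fine-tuning run would do this per layer on tensors.

```python
# Sketch of merging a LoRA adapter into a frozen base weight matrix:
#   W_merged = W + (alpha / r) * (B @ A)
# Plain lists are used instead of tensors so the example is self-contained.


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base, lora_b, lora_a, alpha):
    """Return base + (alpha / rank) * lora_b @ lora_a, with rank = rows of lora_a."""
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Hypothetical 3x3 base layer with a rank-1 adapter (B is 3x1, A is 1x3).
base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
lora_b = [[1.0], [2.0], [3.0]]
lora_a = [[0.5, 0.0, -0.5]]
merged = merge_lora(base, lora_b, lora_a, alpha=2.0)
```

The adapter here has 6 trainable values versus 9 in the base matrix; at real layer sizes the gap is several orders of magnitude, which is why this is cheap to train. QLoRA follows the same idea but keeps the frozen base weights in a quantized form.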

Once the training is done you get the final weights -- a binary blob of floats. Now you need to load those weights back into the inference architecture of the model. You do that by using the same framework that was used for training (PyTorch) to construct the inferencing pipeline. You can build your own framework/inferencing engine too if you want and try to beat PyTorch :) The pipeline will consist of things like:

- loading the model weights

- doing pre-processing on your input

- building the inference graph

- running your input (embeddings/vectors) through the graph

- generating predictions/results

Now, the execution of this pipeline can be done on GPU(s) so all the computations (matrix multiplications) are super fast and the results are generated quickly, or it can still run on good old CPUs, but much slower. Tricks like quantization of model weights can be used here to reduce the model size and speed up the execution by trading off some accuracy.

Tools like Ollama or vLLM abstract away all the above steps and that's why they are very popular -- they might even allow you to bring your own (fine-tuned) model.
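For a feel of what weight quantization does, here is a minimal sketch of symmetric int8 quantization: every float weight is mapped to an integer in [-127, 127] with one shared scale, then mapped back at inference time. This is one simple scheme chosen for illustration, not necessarily what any particular runtime uses; the function names and sample weights are made up.

```python
# Sketch of symmetric per-tensor int8 quantization of model weights.
# Each float is stored as an 8-bit integer plus one shared float scale.


def quantize_int8(weights):
    """Return (int8_values, scale) such that value * scale approximates the weight."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate float weights."""
    return [v * scale for v in quantized]


# Hypothetical weights; real tensors hold millions of values per layer.
weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Note how the small weight 0.003 collapses to 0: that rounding loss is the accuracy being traded for a 4x size reduction relative to 32-bit floats.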

On top of the pure model execution, you can create a web service that will serve your model via an HTTP or gRPC endpoint. It could accept a user query/input and return JSON with the results. Then it can be incorporated into any application, or become part of another service, etc.
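A bare-bones version of such a service can be sketched with nothing but the Python standard library. Everything named here is an assumption for illustration: `predict` is a placeholder for the real model call, the `{"query": ...}` request shape is invented, and port 8000 is arbitrary. A real deployment would more likely use a proper web framework or a dedicated model server.

```python
# Sketch of a tiny HTTP endpoint that accepts a JSON query and returns JSON.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(text):
    """Hypothetical stand-in for real model inference."""
    return {"input": text, "num_tokens": len(text.split())}


def handle_payload(raw):
    """Turn a raw JSON request body into (HTTP status, JSON response bytes)."""
    try:
        query = json.loads(raw)["query"]
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a 'query' field"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps(predict(str(query))).encode()


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_payload(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving predictions on localhost."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the server; a client then POSTs a body such as `{"query": "hello"}` and gets the prediction back as JSON. Keeping `handle_payload` separate from the handler class makes the request logic testable without opening a socket.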

So, the answer is much more than "get the GPU and run with it" and I think it's important to be aware of all the steps required if you want to really understand what goes into deploying custom ML models and putting them to a good use.

New comment by manca in "Ask HN: People who switched from GPT to their own models. How was it?"

manca — Tue, 27 Feb 2024 05:46:06 +0000

If you don't care about the details of how those model servers work, then something that abstracts out the whole process like LM Studio or Ollama is all you need.

However, if you want to get into the weeds of how this actually works, I recommend you look up model quantization and some libraries like ggml[1] that actually do that for you.

[1] https://github.com/ggerganov/ggml

New comment by manca in "Ask HN: People who switched from GPT to their own models. How was it?"

manca — Tue, 27 Feb 2024 05:39:33 +0000

I've tried code-llama with Ollama, along with Continue.dev, and found it to be pretty good. The only downside is that I couldn't "productively" run the 70B version, even on my MBP with M3 Max with 36GB of RAM (which interestingly should be enough to hold quantized model weights). It was simply painfully slow. The 34B one works well enough for most of my use-cases, so I am happy.
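A quick back-of-the-envelope check of the memory claim, assuming 4-bit quantization (the comment does not say which quantization level was used) and counting only the weights themselves:

```python
# Rough estimate of memory needed just for model weights.
# Ignores the KV cache, activations, and per-block quantization metadata.


def weight_memory_gb(num_params, bits_per_weight):
    """Return weight storage in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


estimates = {
    "34B @ 4-bit": weight_memory_gb(34e9, 4),
    "70B @ 4-bit": weight_memory_gb(70e9, 4),
    "70B @ 16-bit": weight_memory_gb(70e9, 16),
}
for name, gigabytes in estimates.items():
    print(f"{name}: ~{gigabytes:.0f} GB")
```

Under that assumption the 70B weights alone come to about 35 GB, which technically fits in 36GB but leaves almost nothing for the KV cache, the OS, and other apps, so heavy slowdown is not surprising. The 34B model at roughly 17 GB has plenty of headroom.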

New comment by manca in "Galaksija: The Soviet-Era, Z80-based microcomputer"

manca — Thu, 12 Oct 2023 04:52:22 +0000

Galaksija was truly a "masterpiece" at that time, made by a single person by stitching together various smuggled parts from the West. I have huge admiration and respect for Voja, especially after he decided to give up everything in Serbia, move to the US and start from scratch on his own in his late sixties!

He's a very humble man despite his remarkable impact and influence on the early tech industry in Yugoslavia. He and Dejan Ristanovic [1] started one of the first PC magazines in the 80's, which was a bastion of progress filled with ingenious articles and insights collected from all over the world, mostly by word of mouth (remember there was no internet back then). They and a few others actually founded the first ISP and BBS in Yugoslavia in the late eighties.

Anyway, I am glad to see this article on HN and would suggest you all watch Voja's interview [2] given to the Computer History Museum in Mountain View, where Galaksija rightfully got its own piece of history.

[1] https://en.wikipedia.org/wiki/Dejan_Ristanovi%C4%87

[2] https://www.youtube.com/watch?v=nPLyzOEobw8&ab_channel=Compu...

New comment by manca in "Kevin Mitnick has died"

manca — Thu, 20 Jul 2023 04:57:22 +0000

I will always remember when the "Takedown" movie came out. I loved the original "Hackers" and couldn't wait for "Hackers 2" which was Takedown.

I had learned about Mitnick a few years prior to the movie and was fascinated by his life story and what he had done up to that point (including his "takedown" by the FBI). It's an understatement to say that his work, character and some sort of positive social manipulation had a great influence on my upbringing and later my professional career. Back then I enjoyed playing pranks on my friends and "hacking" them with all sorts of trojans and ejecting their CD-ROMs :)

I am very sad to hear that he's gone. RIP Legend.

New comment by manca in "Backend of Meta Threads is built with Python 3.10"

manca — Thu, 06 Jul 2023 17:20:02 +0000

I've been using Threads since late yesterday and it's been very slow... sometimes it wouldn't even load content when you browse specific profiles (especially the active ones with tens of thousands of followers). So I am not surprised the backend is in Python :)

New comment by manca in "Tell HN: Nearly all of Evernote’s remaining staff has been laid off"

manca — Thu, 06 Jul 2023 03:09:19 +0000

I remember when Evernote launched and everyone was super hyped about it -- even some of the biggest VC names promoted them. Not to mention the funding they raised ($290M). It was literally iOS Notes on steroids. I even used it for a while, but somehow it didn't stick.

They hired a bunch of great people and had some good backend tech -- sad to see this happen to them.

New comment by manca in "Show HN: Banterai – Talk to any AI celebrity, human-like voice conversations"

manca — Wed, 05 Apr 2023 04:00:31 +0000

Nicely done. Does Azure Speech to Text also handle speech synthesis and provide out-of-the-box voices for different characters, or did you have to build your own model to do this? It's impressive if their service can do it all: speech recognition, speech to text and text to speech, and in near real-time. I should take a closer look at the Azure ML stack :)

New comment by manca in "Show HN: Banterai – Talk to any AI celebrity, human-like voice conversations"

manca — Wed, 05 Apr 2023 03:30:47 +0000

Great job. I must say that the speech synthesis sounds pretty realistic. I talked with Jobs, Musk and Obama and liked how they sounded and more importantly how they handled the questions. Do you mind sharing the entire stack you used to build this? Very well done!

New comment by manca in "Google Summer of Code 2023"

manca — Fri, 17 Mar 2023 15:31:19 +0000

Glad to see this is still going. I was sad to see CodeJam being shut down recently. We need more programs like GSoC and CJ as they encourage students to take part in something great and contribute to the open source community.

My GSoC year was 2010 and it was definitely an amazing experience -- not just getting to meet and work alongside an amazing community, but also getting to sharpen my software engineering skills, improve my communication and have fun along the way.

If you're a student, please find something interesting you'd like to work on and apply! Find where the folks hang out and reach out to them! They'll be happy to help you get started! Back in the day we used irc.freenode.net as our communication hub for pretty much all OSS talk, but I am sure there are Slack or Discord servers now available for most projects.

Have fun!

New comment by manca in "Show HN: FFmpeg Command Visualizer and Editor"

manca — Sat, 29 Oct 2022 21:52:19 +0000

This is brilliant. A long time ago when I worked on Windows video apps, I used to use Graph Studio[1] to visualize the video graph comprised of countless DirectShow filters. It occurred to me multiple times that such a tool would be super useful for ffmpeg as well.

It really helps visualize your filter graphs, especially when building complex video processing pipelines. Too bad this is not open source... I'd be more than happy to contribute.

[1] https://github.com/cplussharp/graph-studio-next

New comment by manca in "Kagi/Orion status update: First three months"

manca — Thu, 01 Sep 2022 16:49:14 +0000

I've been following Kagi's development for some time now and the idea looks promising. Not sure if the team plans to develop its own crawling engine and index that won't depend entirely on sources such as Google/Bing/Wikipedia, etc. Right now, it seems like the results are 90% Google results without the ads (which is still a big plus). However, I'd like to see if they can pull off indexing the web (maybe a smaller part of it?) on their own -- that way they can completely decouple from Google and not put their fate in the hands of much bigger players.

Anyway, exciting stuff and I wish the team the best of luck!

New comment by manca in "Ask HN: In 2022, what is the proper way to get into machine/deep learning?"

manca — Wed, 17 Aug 2022 03:50:51 +0000

I agree with this. In my experience, most of the data scientists I have worked with never left the world of Jupyter notebooks. For them, code management, CI/CD, dev/stage/prod separation, etc. is a world of its own that they are not very comfortable with. Heck, they even used Sagemaker to create a git repo for their Jupyter notebooks.

It doesn't mean that there aren't data scientists who have some engineering experience as well, but this seems to be rare. For that reason, getting those ML models that they painstakingly build to where they'll generate some real value is super hard. They just don't know where to start. Working across multiple teams and multiple functions is very challenging and it often creates friction. Therefore, creating tools and systems that will enable those data scientists to see the actual value of their labor is paramount.

That's why we're seeing a huge resurgence of so-called MLOps tools and platforms that aim to solve all or some of the problems of the entire stack. We are very, very early in this journey, but I believe the 2020's will be for ML and AI what the 2010's were for the cloud and data, i.e. new Snowflakes and Databricks but for the actual ML apps. It's exciting.