<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: fine_tune</title><link>https://news.ycombinator.com/user?id=fine_tune</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 29 Apr 2026 09:29:35 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=fine_tune" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by fine_tune in "The Atlantic Quantum team is joining Google"]]></title><description><![CDATA[
<p>LLMs are only 3-5 years old (NLP is much older OFC); for all we know they'll be a research dead end like LSTMs are today - LLMs/multimodal models just look super hot right now. "Attention Is All You Need" was released in 2017, and it took 5 years to prove it was useful; for all we know the next hot thing has already been published and LLMs are obsolete - Google might have been right to wait.<p>Besides, I don't think the top people at Google's DeepMind - and I can only infer this from watching them speak online - actually think LLMs are "the one".</p>
]]></description><pubDate>Thu, 02 Oct 2025 18:25:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=45453432</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=45453432</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45453432</guid></item><item><title><![CDATA[New comment by fine_tune in "A New Internet Business Model?"]]></title><description><![CDATA[
<p>It's Cloudflare trying to enshittify the internet with microtransactions[0] and take their N% cut (of course it will start at like 2%, but ask any Uber driver how that's going).<p>The problem is the arguments they make for why this should happen are quite compelling, especially to those running sites (you'll see plenty of complaints on this forum about it), but there's also a large group of people who think information/code/data should be "free" (see open source code/maps/anything you can think of). So really it's just a moral debate that will be lost in the interest of profit (which is, ya know, good and bad; if AI companies did more caching we probably wouldn't need this, but here we are).<p>[0] <a href="https://blog.cloudflare.com/introducing-ai-crawl-control/" rel="nofollow">https://blog.cloudflare.com/introducing-ai-crawl-control/</a></p>
]]></description><pubDate>Mon, 22 Sep 2025 15:45:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45335068</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=45335068</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45335068</guid></item><item><title><![CDATA[New comment by fine_tune in "How to become a pure mathematician or statistician (2008)"]]></title><description><![CDATA[
<p>I got rage baited by this so hard; I can't comprehend thinking this way.<p>I've hung out with PhDs, economists, bankers, trust fund kids, scientists, and artists - who maybe weren't top tier enough, but none thought this way.<p>Literally the weirdest take on a forum filled with dreamers, but every take is valid.</p>
]]></description><pubDate>Fri, 12 Sep 2025 20:32:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=45226423</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=45226423</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45226423</guid></item><item><title><![CDATA[New comment by fine_tune in "Top model scores may be skewed by Git history leaks in SWE-bench"]]></title><description><![CDATA[
<p>I was going to argue "LLMs need code samples to do well on languages, and if we're honest, C# is a language mostly held in private repos", but GitHub's 2024 report[0] says it's the 5th most used language (I'm too lazy to check whether this report includes private repos, but I'll assume it doesn't).<p>So kinda neat to see this paper!<p>[0] <a href="https://github.blog/news-insights/octoverse/octoverse-2024/#the-most-popular-programming-languages" rel="nofollow">https://github.blog/news-insights/octoverse/octoverse-2024/#...</a></p>
]]></description><pubDate>Thu, 11 Sep 2025 19:02:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=45214975</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=45214975</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45214975</guid></item><item><title><![CDATA[New comment by fine_tune in "Top Secret: Automatically filter sensitive information"]]></title><description><![CDATA[
<p>I'm no Ruby expert, so forgive my ignorance, but it looks like a small NER model packaged as a string convenience wrapper named `filter` that tries to filter sensitive info out of input strings.<p>I assume the NER model is small enough to run on CPU at under ~1s per pass, at the trade-off of storage per instance (1s is fast enough in dev; in prod with long convos, that's a lot of inference time). Generally a neat idea though.<p>Couple of questions:<p>- NER often doesn't transfer well across domains; how accurate is the model?<p>- How do you actually allocate compute/storage for inferring on the NER model?<p>- Are you batching these `filter` calls, or is it just sequential one-by-one calls?</p>
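To make the idea concrete, here's a toy Python sketch of that kind of `filter` wrapper. A regex stand-in plays the role of the NER model here purely for illustration; the actual gem's model, API, and entity types will differ.

```python
import re

# Stand-in for a small NER model: returns (start, end, label) spans.
# A real implementation would run a compact CPU NER model here instead.
def detect_entities(text):
    patterns = {
        "EMAIL": r"[\w.+-]+@[\w-]+\.[\w.]+",
        "PHONE": r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b",
    }
    spans = []
    for label, pat in patterns.items():
        for m in re.finditer(pat, text):
            spans.append((m.start(), m.end(), label))
    return spans

def filter(text):
    # Mask detected spans right-to-left so earlier offsets stay valid.
    # (Named `filter` to mirror the wrapper, shadowing the builtin.)
    for start, end, label in sorted(detect_entities(text), reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text
```

The right-to-left replacement matters: masking from the front would shift every later span's offsets.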
]]></description><pubDate>Fri, 22 Aug 2025 23:05:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=44991013</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=44991013</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44991013</guid></item><item><title><![CDATA[New comment by fine_tune in "Gemini Embedding: Powering RAG and context engineering"]]></title><description><![CDATA[
<p>Yes and no. For human search it's kinda neat: you might find some duplicates, or some nearby-neighbour bugs that help you solve a whole class of issues.<p>But the cool kids? They'd do something worse;<p>They'd define some complicated agentic setup that cloned your code base into containers firewalled off from the world, with prompts like:<p>You're an expert software dev in MY_FAVE_LANG. Here's a bug description: 'LONG BUG DESCRIPTION'. Explore the code and write a solution. Here are some tools (read_file, write_file, ETC)<p>You'd then spawn as many of these as you can, per task, and have them all generate pull requests for the tasks. Review them with an LLM, then manually, and accept the PRs you wanted. Now you're in the ultra money.<p>You'd use RAG to guide an untuned LLM on your code base's styles and how to write code. You'd write docs like "how to write an API, how to write a DB migration, ETC" and give those as a tool to the agents writing the code.<p>With time and effort, you can make agents specific to your code base through fine-tuning, but who's got that kind of money?</p>
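The fan-out described above can be sketched in a few lines. Everything here is hypothetical: `run_agent` and `llm_review` are stubs standing in for a sandboxed agent run and a reviewer-model pass.

```python
# Hypothetical sketch: N agents per task, each producing a candidate
# patch, then an LLM review pass before any human looks at the PRs.

def run_agent(task, agent_id):
    # Stub: a real agent would explore a cloned repo in a container
    # and return an actual diff.
    return {"task": task, "agent": agent_id,
            "patch": f"candidate fix for {task} (variant {agent_id})"}

def llm_review(candidate):
    # Stub: a real reviewer model would score the patch; here we
    # arbitrarily accept only variant 0 to show the filtering step.
    return candidate["agent"] == 0

def solve(tasks, agents_per_task=3):
    accepted = []
    for task in tasks:
        candidates = [run_agent(task, i) for i in range(agents_per_task)]
        # LLM review filters first; humans pick from the survivors.
        accepted += [c for c in candidates if llm_review(c)]
    return accepted
```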
]]></description><pubDate>Thu, 31 Jul 2025 20:15:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=44749657</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=44749657</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44749657</guid></item><item><title><![CDATA[New comment by fine_tune in "Gemini Embedding: Powering RAG and context engineering"]]></title><description><![CDATA[
<p>RAG is taking a bunch of docs, chunking them into text blocks of a certain length (how best to do this is up for debate), and creating a search API that takes a query (like a Google search) and compares it to the document chunks (very much how you're describing). Take the returned chunks, ignore the score from vector search, feed those chunks into a re-ranker with the original query (this step is important; vector search alone mostly sucks), filter the re-ranked results to the top 1-2, and then format a prompt like:<p>The user asked 'long query', we fetched some docs (see below), answer the query based on the docs (reference the docs if you feel like it)<p>Doc1.pdf - Chunk N
    Eat cheese<p>Doc2.pdf - Chunk Y
    Dont eat cheese<p>You then expose the search API as a "tool" for the LLM to call, slightly reformatting the prompt above into a multi-turn convo, and suddenly you're in ze money.<p>But once your users are happy with those results they'll want something dumb like the latest football scores; then you need a web tool - and then it never ends.<p>To be fair though, it's pretty powerful once you've got it in place.</p>
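The chunk, retrieve, re-rank, and prompt steps can be sketched end to end in Python. Word overlap stands in for both the embedding search and the re-ranker here; a real pipeline would use an embedding model for the first pass and a cross-encoder for the second.

```python
# Toy RAG pipeline: chunk docs, retrieve, re-rank, build the prompt.

def chunk(doc_name, text, size=40):
    # Split a doc into fixed-size word windows, keeping provenance.
    words = text.split()
    return [(doc_name, i, " ".join(words[i:i + size]))
            for i in range(0, len(words), size)]

def score(query, chunk_text):
    # Word overlap as a stand-in for embedding similarity / re-ranking.
    q, c = set(query.lower().split()), set(chunk_text.lower().split())
    return len(q & c) / (len(q) or 1)

def retrieve(query, chunks, top_k=2):
    # In a real system this is two passes: cheap vector search over
    # everything, then an expensive re-ranker over the candidates.
    return sorted(chunks, key=lambda c: score(query, c[2]), reverse=True)[:top_k]

def build_prompt(query, chunks):
    hits = retrieve(query, chunks)
    docs = "\n\n".join(f"{name} - Chunk {i}\n    {text}"
                       for name, i, text in hits)
    return (f"The user asked '{query}', we fetched some docs (see below), "
            f"answer the query based on the docs.\n\n{docs}")
```

Swapping `score` for a real embedding model plus cross-encoder is the whole upgrade path; the surrounding plumbing barely changes.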
]]></description><pubDate>Thu, 31 Jul 2025 19:32:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=44749228</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=44749228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44749228</guid></item><item><title><![CDATA[New comment by fine_tune in "Resurrecting a dead torrent tracker and finding 3M peers"]]></title><description><![CDATA[
<p>You bought a house where a murder happened X years ago and are wondering if you're guilty of the murder. Probably not - as long as you don't do more murders in it.<p>I suppose real life is more interesting though; the guy who picked up the domain that stopped the global ransomware crisis was picked up after Defcon, if memory serves.<p>Ironically, you're probably at more risk from the GDPR for leaking the IP addresses that connected to the box via your blog post.<p>I'm not a lawyer/solicitor though; don't take my advice.</p>
]]></description><pubDate>Tue, 17 Jun 2025 18:23:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=44302172</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=44302172</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44302172</guid></item><item><title><![CDATA[New comment by fine_tune in "AI masters Minecraft: DeepMind program finds diamonds without being taught"]]></title><description><![CDATA[
<p>Attempted to train this on a real workload I converted over the weekend: ~8M "steps" so far and it rarely scores above 5% (most runs are 0%), though it did score 60% once, ~7M steps ago.<p>Adding more than 1 GPU didn't improve speed, but that's pretty standard as we don't have fancy interconnect. Bit annoying they didn't use TensorBoard for logging, but overall it seems like a pretty cool lib - I'll leave it a few days and see if it can learn (no other algo has, so I don't have much hope).</p>
]]></description><pubDate>Mon, 07 Apr 2025 10:53:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=43609875</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=43609875</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43609875</guid></item><item><title><![CDATA[New comment by fine_tune in "Fine-tune Google's Gemma 3"]]></title><description><![CDATA[
<p>Love Unsloth btw, use it for some other stuff at work; the GRPO stuff was fun :)<p>I know it's coming, but "mUlTi GpU PlZ" :pleading: <3</p>
]]></description><pubDate>Thu, 20 Mar 2025 10:21:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=43421430</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=43421430</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43421430</guid></item><item><title><![CDATA[New comment by fine_tune in "Fine-tune Google's Gemma 3"]]></title><description><![CDATA[
<p>> I'm interested to know if anyone is using fine-tuning to train a model on proprietary or in-house codebases and documentation.<p>I've done it. Half the team thought it was great 20% of the time; half the team hated it from day 0. I used roughly 500K lines of code.<p>> How much effort is required to turn code into something one can use for fine-tuning?<p>Very little to moderate: less than 200 lines of Python, Qwen FIM, HF, llama.cpp, and the llama.cpp code extension.<p>> RAG solutions seem to have their limitations, and fine-tuning might be a more effective approach.<p>The only problem either way is keeping the information up to date; RAG just adds more cost to the inference process (which at my dev speed is pretty important).<p>> How much effort is required to turn code into something one can use for fine-tuning?<p>"Fill in the middle" fine-tuning is the process of taking a file, cutting out some text in the middle, and asking the AI to guess what was there - there's a Hugging Face example that will have you doing it in an hour or less. Your ops team saying "no, you can't literally copy all code to a single folder" is probably the biggest hurdle (tell them you'll do it in CI, and then they can stand up a FIM training endpoint that accepts a CSV; pretty easy).</p>
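Building one FIM training example really is about this small. A sketch, assuming Qwen-style sentinel strings - the exact `<|fim_*|>` tokens vary per model family, so check your model's docs before training:

```python
import random

# Turn one source file into one "fill in the middle" training example:
# pick two cut points, hide the span between them, and ask the model
# to reproduce it given the surrounding prefix and suffix.

def make_fim_example(code, rng=random):
    a, b = sorted(rng.sample(range(len(code)), 2))
    prefix, middle, suffix = code[:a], code[a:b], code[b:]
    # Prefix and suffix are given as context; middle is the target.
    return (f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}"
            f"<|fim_middle|>{middle}")
```

Run this over every file in the repo (or the CSV your CI job produces) and you have a FIM dataset; the per-example logic doesn't get more complicated than this.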
]]></description><pubDate>Wed, 19 Mar 2025 21:35:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=43417559</link><dc:creator>fine_tune</dc:creator><comments>https://news.ycombinator.com/item?id=43417559</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43417559</guid></item></channel></rss>