Hacker News: heavyarms

Ask HN: How long before we get "coding agent in a box"?

heavyarms — Mon, 22 Dec 2025 17:22:12 +0000

I've been using Claude Code since the early beta days and, since the 4.5 Sonnet release, it's changed my workflow a lot. At least in my view, the current iteration of frontier coding agents is good enough to automate a lot of rote software development tasks and is worth the money... if put in the hands of capable developers who know how to use them. But giving all of your developers unrestricted access to something like Claude Code is also signing yourself up for huge variability in OpEx budgets.

I understand the current hardware limitations and that you can't just put a frontier LLM in a black box and hook it up to your existing MBP via USB-C. In my estimation, something like an Apple Mac Studio M3 (256GB or more of unified memory) is maybe one possible option ($7,500 - $10,000) for running a 405B open-weights model... but it wouldn't be very fast. And it wouldn't come close to the level of quality or workflow of Claude Code.
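A rough back-of-the-envelope sketch of why 256GB is about the floor for a 405B model. It assumes the weights are quantized to 4 bits (my assumption, not something stated above) and counts only the weights, ignoring KV cache, activations, and runtime overhead. The function name is made up for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Weights-only footprint of a 405B-parameter model at common precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(405, bits):.1f} GB")
# 16-bit: 810.0 GB
# 8-bit: 405.0 GB
# 4-bit: 202.5 GB
```

So only the 4-bit case fits in 256GB, and with not much headroom left for context.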

To really run a current frontier LLM locally with something like >30 tokens per second would probably require four A100s... add in NVLink bridges, expensive cooling, 256GB RAM, a cool case with LED lights (optional) and we're talking about ~$60,000? $80,000?

So my question is: How many generations—or what specific architectural shifts (specialized ASICs, better quantization, etc.)—do we need before we can buy a dedicated co-processor box that sits on a desk and runs a Sonnet-level agent at viable speeds... at a price point where it makes sense vs. spending $500-$2,000 per month per developer on API fees? In my opinion that "makes sense to me, here's the credit card" price point might be $10,000 right now, but I could be wrong.
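To put numbers on that price point, here is a minimal break-even sketch using the figures above ($10,000 box, $500-$2,000 per month per developer). It assumes one box per developer and ignores power, maintenance, and depreciation, so treat it as a lower bound on the payback time. The function name is hypothetical.

```python
import math


def breakeven_months(box_price: float, monthly_api_cost: float) -> int:
    """Months until cumulative API spend reaches the one-time box price."""
    return math.ceil(box_price / monthly_api_cost)


for monthly in (500, 2000):
    print(f"${monthly}/month -> {breakeven_months(10_000, monthly)} months")
# $500/month -> 20 months
# $2000/month -> 5 months
```

At the heavy-usage end the box pays for itself in under half a year; at the light end it takes closer to two years, which is about when the hardware would be a generation behind anyway.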

And a related question: Who will do this? Anthropic could probably make a killing right now IF they could sell "Claude Code in a box for $10,000" but would they ever want to? It would be cannibalizing the majority of their business. But Apple might do this. And it might only be one or two generations of hardware upgrades away. They just need the "frontier LLM" to stick into the box.

Comments URL: https://news.ycombinator.com/item?id=46356220

Points: 2

# Comments: 3

New comment by heavyarms in "Gemini 2.0 is now available to everyone"

heavyarms — Wed, 05 Feb 2025 18:37:21 +0000

The last time I checked (a few days ago) it only had an "Upload Image" option... and I have been playing with Gemini on and off for months and I have never been able to actually upload an image.

It's basically what I've come to expect from most Google products at this point: half-baked, buggy, confusing, not intuitive.

New comment by heavyarms in "Google brings real-time information from The Associated Press to Gemini"

heavyarms — Wed, 15 Jan 2025 20:46:04 +0000

There's not a lot of detail in the announcement but I assume this is some kind of RAG system. I wonder if it will cover some short time period (past week, past month?) or if they are trying to cover the whole time period since the knowledge cutoff of the current model.

New comment by heavyarms in "AI companies cause most of traffic on forums"

heavyarms — Mon, 30 Dec 2024 19:28:55 +0000

What mechanism would make it possible to enforce non-paywalled, non-authenticated access to public web pages? This is a classic "problem of the commons" type of issue.

The AI companies are signing deals with large media and publishing companies to get access to data without the threat of legal action. But nobody is going to voluntarily make deals with millions of personal blogs, vintage car forums, local book clubs, etc. and set up a micropayment system.

Any attempt to force some kind of micropayment or "prove you are not a robot" system will add a lot of friction for actual users and will be easily circumvented. If you are LinkedIn and you can devote a large portion of your R&D budget to this, you can maybe get it to work. But if you're running a blog on stamp collecting, you probably will not.

New comment by heavyarms in "Elon Musk wanted an OpenAI for-profit"

heavyarms — Fri, 13 Dec 2024 20:49:38 +0000

If this is a GPT-generated joke, I'd say they cracked AGI.

New comment by heavyarms in "Test Driven Development (TDD) for your LLMs? Yes please, more of that please"

heavyarms — Wed, 04 Dec 2024 21:33:13 +0000

Whenever I see one of these posts, I click just to see if the proposed solution to testing the output of an LLM is to use the output of an LLM... and in almost all cases it is. It doesn't matter how many buzzwords and acronyms you use to describe what you're doing, at the end of the day it's turtles all the way down.

The issue is not the technology. When it comes to natural language (LLM responses that are sentences, prose, etc.) there is no actual standard by which you can even judge the output. There is no gold standard for natural language. Otherwise language would be boring. There is also no simple method for determining truth... philosophers have been discussing this for thousands of years and after all that effort we now know that... ¯\_(ツ)_/¯... and also, Earth is Flat and Birds Are Not Real.

Take, for example, the first sentence of my comment: "Whenever I see one of these posts, I click just to see if the proposed solution to testing the output of an LLM is to use the output of an LLM... and in almost all cases it is." This is absolutely true, in my own head, as my selective memory is choosing to remember that one time I clicked on a similar post on HN. But beyond the simple question of whether it is true or not, even an army of human fact checkers and literature majors could probably not come up with a definitive and logical analysis regarding the quality and veracity of my prose. Is it even a grammatically correct sentence structure... with the run-on ellipsis and whatnot... ??? Is it meant to be funny? Or snarky? Who knows ¯\_(ツ)_/¯ WTF is that random pile of punctuation marks in the middle of that sentence... does the LLM even have a token for that?

New comment by heavyarms in "Apple Intelligence: every new AI feature coming to the iPhone and Mac"

heavyarms — Mon, 10 Jun 2024 18:32:41 +0000

If you're running a company that is paying multiple vendors for basic AI features and LLM functionality, it might be worth doing the calculation of how much of that functionality might be covered by getting all of your employees on iOS and macOS...

New comment by heavyarms in "Ex-athletic director arrested for framing principal with AI-generated voice"

heavyarms — Thu, 25 Apr 2024 15:25:45 +0000

There are lots of valid use cases for speech synthesis and text-to-speech technology, and there are like 1 or 2 valid/legal use cases for voice cloning that I can think of. Ignoring the moral and ethical questions, why would anybody devote time and resources building a company around a very niche solution... one in which your customer churn rate is partially dependent on users not ending up in prison.

edit: typo

New comment by heavyarms in "To opt out of cross-site tracking, please enable cross-site tracking"

heavyarms — Thu, 22 Feb 2024 17:47:23 +0000

Had to chuckle when I looked at the Digital Advertising Alliance WebChoices browser tool (in Safari or any browser with cross-site tracking disabled). It allows you to opt out of being tracked, as long as you enable cross-site tracking and let them add a cookie. ¯\_(ツ)_/¯

To opt out of cross-site tracking, please enable cross-site tracking

heavyarms — Thu, 22 Feb 2024 17:47:22 +0000

Article URL: https://optout.aboutads.info/?c=2&lang=EN

Comments URL: https://news.ycombinator.com/item?id=39470537

Points: 1

# Comments: 1

New comment by heavyarms in "Nvidia emerges as leading investor in AI companies"

heavyarms — Mon, 11 Dec 2023 19:02:42 +0000

This makes sense on a number of fronts.

1. If you have capital to invest, you could do worse than AI startups at the moment.

2. Nvidia's long-term threat is not just direct competitors (AMD, Intel), but the big cloud players going to in-house chips. Supporting the next wave of your customers makes sense.

3. Using Nvidia is the path of least resistance right now. If you only invest in startups using your products (and you are an active investor), you give startups another reason to avoid taking a risk on the alternative.

edit: typo

New comment by heavyarms in "Gemini AI"

heavyarms — Wed, 06 Dec 2023 16:44:15 +0000

I assume if one of the names in the paper was O'Shaughnessy you would immediately think: "Irish immigrant!" Schmidt? German immigrant!

New comment by heavyarms in "The false positive rate of AI detectors and its effect on freelance writers"

heavyarms — Wed, 01 Nov 2023 19:07:41 +0000

Ignoring the obvious issue that this whole anonymous story seems suspiciously perfect for selling a related product...

On the one hand... Companies spent the past couple of decades engaging in various SEO hacks to rank high on search results and OpenAI scraped the internet to train a language model. Theoretically, it seems possible that some of the SEO techniques at least partially colored the flavor of LLM-generated text, and an "AI detector" could pick that up. So if you do a great job writing SEO-optimized text (wordy, structured, lots of repeated keywords, etc.) you are more likely to be flagged.

But really... "AI Detector" services are snake oil and will lead to the creation of "Anti AI Detector" services that offer protective spells against the original snake oil. See, we eliminate a bunch of jobs with AI but we create whole new disciplines of work that didn't exist before. "AI Generated Content Obfuscation Specialist - III - W2" coming to a job board near you soon.

New comment by heavyarms in "Phind Model beats GPT-4 at coding, with GPT-3.5 speed and 16k context"

heavyarms — Tue, 31 Oct 2023 19:04:28 +0000

I've been thinking along the same lines. The token window IMO should be a conceptual inverted pyramid, where the most recent tokens are retained verbatim but previous iterations are compressed/pooled more and more as the context grows. I'm sure there's some effort/research in this direction. It seems pretty obvious.
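A toy sketch of that inverted-pyramid idea, just to make it concrete. It uses plain numbers as stand-ins for token representations and mean pooling as the compression step, with block sizes that double the further back you go. All of those choices (and the function name) are my assumptions for illustration, not how any actual model handles context.

```python
from statistics import fmean


def pyramid_compress(items, keep=4):
    """Keep the newest `keep` items verbatim; mean-pool older items in
    blocks whose size doubles the further back they are."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    recent = list(items[-keep:])
    older = list(items[:-keep])

    pooled = []
    block = 2          # size of the block nearest to the verbatim window
    end = len(older)
    while end > 0:     # walk backwards from newest-old to oldest
        start = max(0, end - block)
        pooled.append(fmean(older[start:end]))
        end = start
        block *= 2     # older history gets pooled more aggressively
    pooled.reverse()   # restore oldest-first order
    return pooled + recent


print(pyramid_compress(list(range(1, 15)), keep=4))
# [2.5, 6.5, 9.5, 11, 12, 13, 14]
```

Fourteen items shrink to seven: the last four survive untouched, the two before them collapse into one value, the four before that into one, and so on. Total length grows roughly logarithmically with the history instead of linearly.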

New comment by heavyarms in "Why America doesn't build"

heavyarms — Fri, 27 Oct 2023 19:23:51 +0000

I think the claim is based on the public political statements made by leaders in Texas. The fact that there is a huge discrepancy in what they say publicly against the science of global warming and the utility of renewables versus what the investment numbers say is the really sad part. Basically it boils down to: I'm going to lie through my teeth to pander to the stupid people who vote for me, but I'm also going to create favorable conditions for my wealthy buddies to make a killing in renewables.

New comment by heavyarms in "Efficient streaming language models with attention sinks"

heavyarms — Mon, 02 Oct 2023 17:49:00 +0000

Having only read the abstract, I'm probably way off the mark here, but my first thought was: LLM + LSTM.

New comment by heavyarms in "Iowa School District is using ChatGPT to determine banned books"

heavyarms — Wed, 16 Aug 2023 19:41:49 +0000

When people are faced with negative outcomes resulting from things they approve of, they do this passive-aggressive bit where they pretend to have a valid point.

New comment by heavyarms in "JFK Assassination Records – 2022 Additional Documents Release"

heavyarms — Fri, 16 Dec 2022 16:11:04 +0000

I used to buy into some of this JFK stuff when I was an X-Files-watching teenager. What really burst the bubble for me was a documentary I watched where a team of snipers and forensic scientists re-created the exact shot with mannequins with bones and ballistic gel. They didn't even have to try that hard. Using the same rifle and ammo, the first shot they tried resulted in almost the same exact trajectory. I can't find a clip of that exact documentary (circa 2004-2006), but there are others who have done the same. You don't have to look hard to find very comprehensive and scientific explanations for the exact trajectory of that specific shot. But you do have to look very hard to find an actual explanation of why it is impossible that goes beyond the level of "golly gee folks, I done shot lots of guns in my life and let me tell you, it ain't possible."

https://youtu.be/Q7ERXm9OwuE?t=250

New comment by heavyarms in "Riffusion – Stable Diffusion fine-tuned to generate music"

heavyarms — Fri, 16 Dec 2022 15:37:30 +0000

I dabble in music production and know some of the people in the "Lofi" world, so I know for a fact that this is not true. It's just a formulaic sub-genre where people are trying to make similar instrumentals with the same vibe. It would be jarring to listen to a playlist while studying if each song had wildly different tempos, instruments, etc.

Also, the music doesn't sound "Lofi" because it's generated by algorithms. A lot of hard work and software goes into taking a clean, pitch-perfect digital signal and making it sound like something playing on a record player from the 70s.

New comment by heavyarms in "System – A resource that aims to explain how everything in the world is related"

heavyarms — Tue, 15 Mar 2022 20:45:46 +0000

First of all, I'd like to say that this looks like a great project and I wish you the best of luck. I've done a bit of work on building knowledge graphs from semi-structured data and I know that every aspect of it is challenging. Obviously there are the data pipelines, ETL, semantic matching/categorization, statistical models, etc. Just building a simple UI for presenting a large knowledge graph was more challenging than most front end work I've ever done.

Question: if the goal is to build a knowledge graph that can "explain how anything in the world is related to everything else" how do you measure progress toward that goal? And how do you measure the quality? Just having a bunch of topics and relationships is not a great metric in my opinion. Obviously this is still very early, but here's an example I found in about 30 seconds of clicking around:

"Evidence suggests that Heart Failure is related to Income and COVID-19." [https://www.system.com/view/topic/P0XELnR0PaK]

There are topics in System for "Obesity" and "Smoking", but those are not associated with Heart Failure.