Hacker News: Gregaros

New comment by Gregaros in "Even (very) noisy LLM evaluators are useful for improving AI agents"

Gregaros — Fri, 29 May 2026 16:12:19 +0000

They should define this, but after having read the entire article I think it’s clear they mean “frameworks for evaluating the output of an agent” rather than what first might come to mind as “LLM evals”.

Their thesis is that even when the eval is useless for correctness of a single agentic action in production, it allows you to choose between two agents by cross-comparing in a large aggregated collection of tasks. Effectively: you can tune your agentic parameters.

Nothing new to the idea that taking many samples and averaging can work when a single datapoint doesn’t. Presumably this is part of a conversation in which we’re lacking context.
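
To make the averaging point concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration, not something from the article: two hypothetical agents with true success rates of 60% and 70%, and an evaluator that flips its verdict 40% of the time, so any single judgment is close to a coin toss.

```python
import random


def noisy_eval(success: bool, noise: float, rng: random.Random) -> bool:
    """Return the true outcome, flipped with probability `noise`."""
    return (not success) if rng.random() < noise else success


def estimated_pass_rate(true_rate: float, noise: float,
                        n_tasks: int, rng: random.Random) -> float:
    """Fraction of tasks the noisy evaluator marks as passed."""
    passes = 0
    for _ in range(n_tasks):
        success = rng.random() < true_rate       # did the agent really succeed?
        passes += noisy_eval(success, noise, rng)  # what the evaluator reports
    return passes / n_tasks


# Hypothetical numbers, chosen only for illustration.
NOISE = 0.40      # evaluator is wrong 40% of the time
N_TASKS = 100_000

rng = random.Random(0)
rate_a = estimated_pass_rate(0.60, NOISE, N_TASKS, rng)  # agent A
rate_b = estimated_pass_rate(0.70, NOISE, N_TASKS, rng)  # agent B

# In expectation the observed rate is noise + (1 - 2 * noise) * true_rate,
# so the gap between agents shrinks but keeps its sign.
print(f"A: {rate_a:.3f}  B: {rate_b:.3f}  B looks better: {rate_b > rate_a}")
```

The noise compresses the gap between the two agents (a 10-point true difference becomes roughly 2 points here), which is why you need a large task set before the ranking is trustworthy.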

New comment by Gregaros in "Anthropic Cofounder Chris Olah's Remarks on Pope Leo XIV's "Magnifica Humanitas""

Gregaros — Mon, 25 May 2026 20:12:49 +0000

The poorest of the poor, subsistence farmers are barely producing enough to feed themselves; they trade and barter the little bit they can manage but it is not much and has little impact that goes beyond a tiny village-level radius. Nobody is displacing that because nobody needs to compete with that.

New comment by Gregaros in "Show HN: Auto-identity-remove – Automated data broker opt-out runner for macOS"

Gregaros — Mon, 18 May 2026 13:10:49 +0000

whitepages vs yellowpages

New comment by Gregaros in "Hardening Firefox with Claude Mythos Preview"

Gregaros — Thu, 07 May 2026 22:30:02 +0000

> Mozilla uses the term "vulnerability" for even sec-high, even though they say right below that it doesn't mean the same thing as a practical exploit.

That’s not evident in what you pasted at all.

What you pasted says

> sec-critical and sec-high are assigned to vulnerabilities that can be triggered with normal user behavior […] We make no technical difference between these […] sec-critical bugs are reserved for issues that are publicly disclosed or known to be exploited in the wild.

> sec-low is assigned to bugs that are annoying but far from causing user harm (e.g, a safe crash).

From this one infers that the "180 were sec-high" bugs found are actually exploitable but not known to have been found in the wild, and are NOT mere annoying bugs.

The difference between 180 and 270 does nothing to deflate the significance, or lack thereof, of the implication re: Mythos.

New comment by Gregaros in "US v. Heppner (S.D.N.Y. 2026) no attorney-client privilege for AI chats [pdf]"

Gregaros — Thu, 16 Apr 2026 11:25:23 +0000

> This also seems like a win for society, if there is some sort of pattern with ai helping with crimes.

That fails to recognize the tradeoff between freedom and security. Society suffers if we, for instance, lock everyone up, despite the reduction in crime that would bring. The balance between the two cannot be ignored to justify outcomes, though it is American tradition to value liberty over security when the two come in conflict.

New comment by Gregaros in "Tell HN: Anthropic no longer allowing Claude Code subscriptions to use OpenClaw"

Gregaros — Sat, 04 Apr 2026 02:23:25 +0000

Still very interesting timing to ban third party harnesses, given the proximity to the Claude Code leak …

New comment by Gregaros in "Empathy for Dummies"

Gregaros — Thu, 09 Oct 2025 13:33:20 +0000

I _don’t_ think that empathy has anything to do with it though.

Behaviour modification yes, but that is “stop talking so critically”. Or “don’t be so harsh” or “give this person special treatment”. WHEN to do that might be key here (perhaps the colleague’s husband has cancer, or their child missed school 3 days this week with the flu, or their project wasn’t productionalized, or they’re new to the role, etc.), and so a blanket “don’t talk so harshly” isn’t called for; what is really desired is social calibration.

But it seems everyone is getting caught up on the literal interpretation of this figure of speech instead.

New comment by Gregaros in "Empathy for Dummies"

Gregaros — Wed, 08 Oct 2025 16:56:56 +0000

Impossible to say what was behind any specific request, but what is generally meant by “Have a little empathy” and its kin is: “Stop criticizing/judging/etc. or communicating with the individual being discussed that sharply, because we feel the individual has good reasons/a good excuse/a good justification for sympathy and/or some leniency here.”

New comment by Gregaros in "Empathy for Dummies"

Gregaros — Wed, 08 Oct 2025 12:59:25 +0000

  On many occasions, I have been told to “be more empathetic.”

  When I ask why, I typically get this reaction:  

  This is a ridiculous question. I am not going to answer it because it is so ridiculous.

  Empathy is the right thing to do! You should feel bad for that person. We’re humans, after all.

  These explanations never really helped.

Even after reading this, I am not sure the author really gets what is behind the request.

New comment by Gregaros in "From $479 to $2,800 a month for ACA health insurance next year"

Gregaros — Fri, 22 Aug 2025 21:22:23 +0000

> The person in the article even laughs it off since she turns 65 soon and will then switch to Medicare.

You might be surprised how few Americans can ‘laugh off’ even a one-time payment increase of $2,300–let alone a monthly recurring one.

New comment by Gregaros in "My open source project was relicensed by a YC company [license updated]"

Gregaros — Tue, 08 Jul 2025 03:15:11 +0000

This is correct—I do not engage ethics only when it won’t cost me, nor take convenience into account when determining where my lines are. Perhaps I’m privileged to have that option.

New comment by Gregaros in "New sphere-packing record stems from an unexpected source"

Gregaros — Mon, 07 Jul 2025 19:46:19 +0000

_May_ be a case for extending out what has been explored by theory to cover more useful ground (or not, depending on whether real-world use cases like yours are too heterogeneous for effective general techniques).

New comment by Gregaros in "Adding a feature because ChatGPT incorrectly thinks it exists"

Gregaros — Mon, 07 Jul 2025 15:54:00 +0000

100%. Not sure why you’re downvoted here, there’s nothing controversial here even if you disagree with the framing.

I would go on to say that this interaction between ‘holes’ exposed by LLM expectations _and_ demonstrated user base interest _and_ expert input (by the devs’ decision to implement changes) is an ideal outcome that would not have occurred if each of the pieces were not in place to facilitate these interactions, and there’s probably something here to learn from and expand on in the age of LLMs altering user experiences.

New comment by Gregaros in "My open source project was relicensed by a YC company [license updated]"

Gregaros — Mon, 07 Jul 2025 11:37:23 +0000

Presumably the value in knowing "you need to sort a string in place and then discuss how a random forest gets trained" is that it impacts your answers - for instance, by allowing you to look this up before the interview while appearing to the interviewers to be operating under the same conditions as the other candidates, who did not know to. Your performance then appears as a signal of broader knowledge and capability than you possess - you have, as is the entire point here and which I should not need to spell out, gained an advantage over other candidates by virtue of the information which was intentionally leaked.

If the point of the interview were "answer those questions AND know enough to answer the follow up questions" _once told what to expect and prep_, they’d be sharing those questions with all candidates. If you feel that saying to the interviewers "by the way, I did know this because [X] told me they’d be here" wouldn’t impact outcomes, then great. If you feel you’d need to hide that, then you’re aware this involves dishonesty - and if you still struggle to see how that’s unethical, let’s just make sure we never need to work together.

New comment by Gregaros in "My open source project was relicensed by a YC company [license updated]"

Gregaros — Fri, 04 Jul 2025 13:40:13 +0000

> If cheating means asking someone in the company you're interviewing for a peek at what will be asked then great. In my book that's using leverage.

In my book that is unambiguously unethical and should get the contact fired. I am shocked to see this approach promoted in such a blasé manner.

New comment by Gregaros in "Oklo, the Earth's Two-billion-year-old only Known Natural Nuclear Reactor (2018)"

Gregaros — Fri, 20 Jun 2025 18:11:26 +0000

I thought where you were going with this was "that realized the best way to dispose of their nuclear waste was to dump it in the deep past." I’d read that novel.

New comment by Gregaros in "Compiling LLMs into a MegaKernel: A path to low-latency inference"

Gregaros — Fri, 20 Jun 2025 16:25:14 +0000

Some further questions:

1. For tasks like autocomplete, keyword routing, or voice transcription, what would the latency and power savings look like on an ASIC vs. even a megakernel GPU setup? Would that justify a fixed-function approach in edge devices or embedded systems?

2. ASICs obviously kill retraining, but could we envision a hybrid setup where a base model is hardwired and a small, soft, learnable module (e.g., LoRA-style residual layers) runs on a general-purpose co-processor?

3. Would the transformer’s fixed topology lend itself to spatial reuse in ASIC design, or is the model’s size (e.g. GPT-3-class) still prohibitive without aggressive weight pruning or quantization?
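
For question 2, a toy sketch of what "hardwired base plus small learnable module" could mean in arithmetic terms. This is only an illustration under my own assumptions: the class name, the sizes, and the values are all hypothetical, plain Python lists stand in for real tensors, and the immutable tuple merely plays the role of weights baked into silicon.

```python
def matvec(matrix, vector):
    """Multiply a matrix (sequence of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class FrozenBasePlusAdapter:
    """y = W x + B (A x): W is fixed, only the low-rank A and B can change."""

    def __init__(self, base_weight, rank):
        # Immutable copy: stands in for the hardwired (ASIC) weights.
        self.base = tuple(tuple(row) for row in base_weight)
        d_out, d_in = len(base_weight), len(base_weight[0])
        # Low-rank pair, as it might live on a general-purpose co-processor.
        self.a = [[0.01] * d_in for _ in range(rank)]   # down-projection
        self.b = [[0.0] * rank for _ in range(d_out)]   # up-projection, zero init

    def forward(self, x):
        base_out = matvec(self.base, x)              # fixed-function path
        delta = matvec(self.b, matvec(self.a, x))    # learnable residual path
        return [y + d for y, d in zip(base_out, delta)]


layer = FrozenBasePlusAdapter([[1.0, 0.0], [0.0, 1.0]], rank=1)
before = layer.forward([2.0, 3.0])   # adapter contributes nothing yet
layer.b[0][0] = 1.0                  # pretend a fine-tuning step updated B
after = layer.forward([2.0, 3.0])    # first output shifts, base is untouched
print(before, after)
```

With the up-projection starting at zero, the device behaves exactly like the hardwired model until the adapter is trained, and the updatable state is only `rank * (d_in + d_out)` numbers per layer.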

New comment by Gregaros in "Compiling LLMs into a MegaKernel: A path to low-latency inference"

Gregaros — Fri, 20 Jun 2025 16:24:27 +0000

Curious if anyone has thoughts on going even further: eschewing software-based inference in favor of a purely ASIC approach to a static LLM. Cost benefits? Software-level additional, fine-tunable layers to allow a degree of improvement and flexibility? We are quickly approaching ‘good enough’ for some tasks. At what point does that mean we’re comfortable locking something in for the ~2-4 year lifespan of a device if there _were_ advantages offered by a hyper-specialized chip?

New comment by Gregaros in "Magistral — the first reasoning model by Mistral AI"

Gregaros — Tue, 10 Jun 2025 16:00:59 +0000

Really anyone that writes for a living. I have a referee report on a paper asking me to correct something to be an em-dash.

New comment by Gregaros in "OpenAI's Stargate project struggling to get off the ground, due to tariffs"

Gregaros — Tue, 13 May 2025 16:39:59 +0000

gtfo. Nobody voted Trump to _raise_ taxes.