<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: fryz</title><link>https://news.ycombinator.com/user?id=fryz</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 23 Apr 2026 06:14:43 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=fryz" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by fryz in "Show HN: Chess on a Donut/Torus and Deep-Dive"]]></title><description><![CDATA[
<p>Looks like the white king/queen aren't on the right colors (queen goes on her color) - confused me a bit when trying to map the space to a 2D board</p>
]]></description><pubDate>Thu, 04 Dec 2025 21:12:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=46153143</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=46153143</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46153143</guid></item><item><title><![CDATA[How we helped a YC company (Upsolve) catch a GPT-5 regression]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.arthur.ai/blog/how-upsolve-built-trusted-agentic-ai-with-arthur">https://www.arthur.ai/blog/how-upsolve-built-trusted-agentic-ai-with-arthur</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45928131">https://news.ycombinator.com/item?id=45928131</a></p>
<p>Points: 17</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 14 Nov 2025 16:01:06 +0000</pubDate><link>https://www.arthur.ai/blog/how-upsolve-built-trusted-agentic-ai-with-arthur</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=45928131</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45928131</guid></item><item><title><![CDATA[New comment by fryz in "You should write an agent"]]></title><description><![CDATA[
<p>The "magic" is done via the JSON schemas that are passed in along with the definition of the tool.<p>Structured Output APIs (including the Tool API) take the schema and build a context-free grammar, which is then used during generation to mask which tokens can be emitted.<p>I found <a href="https://openai.com/index/introducing-structured-outputs-in-the-api/" rel="nofollow">https://openai.com/index/introducing-structured-outputs-in-t...</a> (have to scroll down a bit to the "under the hood" section) and <a href="https://www.leewayhertz.com/structured-outputs-in-llms/#constrained-sampling-cfg" rel="nofollow">https://www.leewayhertz.com/structured-outputs-in-llms/#cons...</a> to be pretty good resources</p>
]]></description><pubDate>Fri, 07 Nov 2025 17:38:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=45848819</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=45848819</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45848819</guid></item><item><title><![CDATA[New comment by fryz in "How to Catch a Wily Poacher in a Sting: A Thermal Robotic Deer"]]></title><description><![CDATA[
<p>yeah - it's one of the best success stories of wildlife conservation in the modern era.</p>
]]></description><pubDate>Sat, 26 Jul 2025 17:11:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=44695540</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=44695540</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44695540</guid></item><item><title><![CDATA[New comment by fryz in "How to Catch a Wily Poacher in a Sting: A Thermal Robotic Deer"]]></title><description><![CDATA[
<p>FWIW, not saying it's right (as a hunter I wouldn't ever do this myself), but most of the biologists that build the population models, including the ones used to set the number of hunting licenses or tags sold, build a certain amount of poaching into their models.<p>It's a particularly hard problem to solve - the hobby is usually spread through traditional means (you do it if your parents did it), and going all the way back in certain communities this was the main way to get meat, even before it became regulated. It's difficult to stop something that not only puts food on the table for your family, but has been done that way for generations.<p>This was one of the main contributors to the decline of the turkey population in the lower 48. In the early 1900s, a lot of folks thought turkeys were extinct because of overhunting and poaching, and the National Wild Turkey Federation led efforts to restore the population for hunting.</p>
]]></description><pubDate>Fri, 25 Jul 2025 20:29:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=44688039</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=44688039</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44688039</guid></item><item><title><![CDATA[New comment by fryz in "Lasagna Battery Cell"]]></title><description><![CDATA[
<p>This is one of those joyful concepts you learn about as a homeowner, especially in older homes.<p>If you have plumbing done in dissimilar metals (copper, steel, lead, etc.) and any of your pipes touch, you have to perform regular maintenance and apply a dielectric grease (another one of those single-use materials that you have to buy and store away) or your pipes could corrode and cause a ton of damage.</p>
]]></description><pubDate>Mon, 14 Jul 2025 13:59:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=44560357</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=44560357</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44560357</guid></item><item><title><![CDATA[Ask HN: Are there other language/framework-specific LLMs?]]></title><description><![CDATA[
<p>We've been using V0 (https://v0.dev/) for a while now, and relative to other LLMs it definitely seems to be a level up in terms of the quality and readability of the code.<p>I've watched a few talks by the Vercel engineers and they mention that they've done a lot of work specifically for this by leaning into a strategy of:<p>* Training the model for specific languages + frameworks (TypeScript, Next.js, React, and shadcn)<p>* Collecting and using high-quality code samples for training<p>I'm wondering if anyone knows of other model providers with similar offerings, where they're building LLMs to be better within the context of a specific language and/or framework (eg: Python + FastAPI, etc.).</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44492694">https://news.ycombinator.com/item?id=44492694</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 07 Jul 2025 17:32:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=44492694</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=44492694</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44492694</guid></item><item><title><![CDATA[New comment by fryz in "Show HN: High-performance GenAI engine now open source"]]></title><description><![CDATA[
<p>Yeah, thanks for the feedback.<p>We think we stand out from our competitors in the space because we built first for the enterprise case, with consideration for things like data governance, acceptable use, data privacy, and information security, and we can be deployed easily and reliably in customer-managed environments.<p>A lot of the products today have similar evaluations and metrics, but they either offer a SaaS solution or require some onerous integration into your application stack.<p>Because we started with the enterprise first, our goal was to get to value as quickly and as easily as possible (to avoid shoulder-surfing over Zoom calls because we don't have access to the service), and we think this plays out well in our product.</p>
]]></description><pubDate>Thu, 24 Apr 2025 18:38:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=43785998</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=43785998</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43785998</guid></item><item><title><![CDATA[New comment by fryz in "Show HN: High-performance GenAI engine now open source"]]></title><description><![CDATA[
<p>Yeah, great question.<p>We based our hallucination detection on "groundedness", evaluated on a claim-by-claim basis: whether the LLM response can be supported by the provided context (eg: message history, tool calls, retrieved context from a vector DB, etc.).<p>We split the response into multiple claims, determine if a claim needs to be evaluated (eg: and isn't just some boilerplate), and then check to see if the claim is referenced in the context.</p>
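<p>The shape of that pipeline can be sketched in toy form - to be clear, this is my hand-rolled illustration, not Arthur's actual implementation; real systems use learned models for claim splitting and entailment rather than the naive sentence split and word-overlap check below:</p>

```python
# Toy claim-level groundedness check: split the response into claims,
# skip boilerplate, flag any claim with no support in the context.
BOILERPLATE = {"sure, here is what i found."}

def split_claims(response):
    # Naive claim splitter: one claim per sentence.
    return [s.strip() for s in response.split('.') if s.strip()]

def needs_check(claim):
    # Skip conversational filler that asserts nothing.
    return claim.lower() + '.' not in BOILERPLATE

def is_grounded(claim, context, threshold=0.5):
    # Crude lexical stand-in for an entailment model: a claim counts
    # as grounded if enough of its words appear in the context.
    words = set(claim.lower().split())
    ctx = set(context.lower().split())
    return len(words & ctx) / len(words) >= threshold

def ungrounded_claims(response, context):
    return [c for c in split_claims(response)
            if needs_check(c) and not is_grounded(c, context)]

context = "the order shipped on march 3 via ups ground"
response = ("Sure, here is what I found. "
            "the order shipped on march 3. it will arrive tomorrow")
print(ungrounded_claims(response, context))  # -> ['it will arrive tomorrow']
```

<p>The invented arrival date gets flagged because nothing in the retrieved context supports it, while the boilerplate opener is never evaluated at all.</p>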
]]></description><pubDate>Thu, 24 Apr 2025 18:35:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=43785973</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=43785973</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43785973</guid></item><item><title><![CDATA[Show HN: High-performance GenAI engine now open source]]></title><description><![CDATA[
<p>Hey HN,<p>After one too many customer fire drills regarding hallucinating or insecure AI models, we built a system to catch these issues before they reached production. The Arthur Engine has been running everywhere from Fortune 100 companies to AI-native start-ups over the past two years, putting security controls around more than 10 billion tokens in production every month. We're now opening up this service to developers, enabling you to leverage enterprise-grade guardrails and evals as a service, all for free.<p>Get it on GitHub (<a href="https://github.com/arthur-ai/arthur-engine">https://github.com/arthur-ai/arthur-engine</a>) to start evaluating your models today.<p>Highlights of the Arthur Engine include:<p>* Built for speed and scale: It delivers sub-second p90 latencies at well over 100 RPS.<p>* Made for full lifecycle support: Ideal for pre-production validation, real-time guardrails, and post-production monitoring.<p>* Ease of use: It is designed to be easy for anyone to run and deploy, whether you're working on it locally during development or deploying it within a horizontally-scaling architecture for large-scale workloads.<p>* Unification of generative and traditional AI: The Arthur Engine can be used to evaluate a diverse range of models, from LLMs and agentic AI systems to binary classifiers, regression models, recommender systems, forecasting models, and more.<p>* Content-specific guardrail and detection features: Ranging from toxicity and hallucination detection to sensitive data (like PII, keyword/regex, and custom rules) and prompt injection.<p>* Customizability: Plug in your own models or integrate with other model or guardrail providers with ease, and tailor the system to match your specific needs.<p>Having been first-hand witnesses to the lack of adequate AI monitoring tools and the general underdelivery of GenAI systems in production, we believe that such a capability shouldn't be exclusive to big-budget organizations. Our mission is to make AI better, for everyone, and we believe that by opening up this tool we can help more people get to that goal.<p>Check out our GitHub repo for examples and directions on how to use the Arthur Engine for various purposes, such as validation during development, real-time guardrails, or performance troubleshooting using enriched logging data. (<a href="https://github.com/arthur-ai/engine-examples">https://github.com/arthur-ai/engine-examples</a>)<p>We can’t wait to see what you build!<p>— Zach and Team Arthur</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43782869">https://news.ycombinator.com/item?id=43782869</a></p>
<p>Points: 22</p>
<p># Comments: 12</p>
]]></description><pubDate>Thu, 24 Apr 2025 13:55:19 +0000</pubDate><link>https://github.com/arthur-ai/arthur-engine</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=43782869</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43782869</guid></item><item><title><![CDATA[New comment by fryz in "Googler... ex-Googler"]]></title><description><![CDATA[
<p>You're not wrong, but suffering isn't comparative. Just because it's easier for someone to bounce back, or they have support in the transition, doesn't mean it doesn't still suck.</p>
]]></description><pubDate>Mon, 14 Apr 2025 14:34:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=43681778</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=43681778</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43681778</guid></item><item><title><![CDATA[New comment by fryz in "Show HN: Mastra – Open-source JS agent framework, by the developers of Gatsby"]]></title><description><![CDATA[
<p>To add some color to this:<p>Anthropic does a good job of breaking down some common architectures around using these components [1] (good outline of this if you prefer video [2]).<p>"Agent" is definitely an overloaded term - the best framing I've seen aligns most closely with the Anthropic definition. Specifically, an "agent" is a GenAI system that dynamically identifies the tasks ("steps" from the parent comment) without having to be instructed that those are the steps. There are obvious parallels to the reasoning capabilities that we've seen released in the latest cut of the foundation models.<p>So for example, the "agent" would first build a plan for how to address the query, dynamically farm out the steps in that plan to other LLM calls, and then evaluate execution for correctness/success.<p>[1] <a href="https://www.anthropic.com/research/building-effective-agents" rel="nofollow">https://www.anthropic.com/research/building-effective-agents</a>
[2] <a href="https://www.youtube.com/watch?v=pGdZ2SnrKFU" rel="nofollow">https://www.youtube.com/watch?v=pGdZ2SnrKFU</a></p>
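<p>That plan/execute/evaluate loop can be sketched in a few lines - the llm() stub and its canned responses below are made up purely for illustration:</p>

```python
# Minimal sketch of the plan -> execute -> evaluate loop: the agent
# decides its own steps rather than being handed them.
def llm(prompt):
    # Hypothetical model call; canned responses keep the sketch runnable.
    canned = {
        "plan": ["look up the user's order", "draft a status update"],
        "look up the user's order": "order #123: shipped",
        "draft a status update": "Your order #123 has shipped.",
    }
    return canned.get(prompt, "done")

def run_agent(query):
    # 1. The agent builds its own plan for the query.
    steps = llm("plan")
    # 2. Each step is farmed out as its own LLM call.
    results = [llm(step) for step in steps]
    # 3. A real agent would add an evaluation pass over the results
    #    here; this sketch just returns the final step's output.
    return results[-1]

print(run_agent("where is my order?"))
```

<p>The contrast with a fixed workflow is step 1: in a workflow the step list is hard-coded by the developer, while here it comes back from the model itself.</p>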
]]></description><pubDate>Wed, 19 Feb 2025 21:41:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=43108095</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=43108095</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43108095</guid></item><item><title><![CDATA[New comment by fryz in "Splitting engineering teams into defense and offense"]]></title><description><![CDATA[
<p>Neat article - I know the author mentioned this in the post, but I only see this working as long as a few assumptions hold:<p>* avg tenure / skill level of the team is relatively uniform<p>* team is small with high-touch comms (eg: same/near timezone)<p>* most importantly - everyone feels accountable and has agency for work others do (eg: codebase is small, relatively simple, etc.)<p>Where I would expect to see this fall apart is when these assumptions drift and holding accountability becomes harder. When folks start to specialize, something becomes complex, or work quality is sacrificed for short-term deliverables, the folks that feel the pain are the defense folks, and they don't have agency to drive the improvements.<p>The incentives for folks on defense are completely different from those for folks on offense, which can make conversations about what to prioritize difficult in the long term.</p>
]]></description><pubDate>Mon, 14 Oct 2024 20:47:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=41841806</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=41841806</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41841806</guid></item><item><title><![CDATA[Guide to LLM Experimentation and Development in 2024]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.arthur.ai/blog/the-ultimate-guide-to-llm-experimentation-and-development-in-2024">https://www.arthur.ai/blog/the-ultimate-guide-to-llm-experimentation-and-development-in-2024</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=40634623">https://news.ycombinator.com/item?id=40634623</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 10 Jun 2024 15:28:34 +0000</pubDate><link>https://www.arthur.ai/blog/the-ultimate-guide-to-llm-experimentation-and-development-in-2024</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=40634623</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40634623</guid></item><item><title><![CDATA[New comment by fryz in "Why I'm Leaving New York City [video]"]]></title><description><![CDATA[
<p>shhhhhh - don't give it up :)<p>Part of what's nice about it is that there aren't a ton of people, and the NJ stereotypes work out in our favor.</p>
]]></description><pubDate>Sun, 21 Apr 2024 12:04:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=40105027</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=40105027</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40105027</guid></item><item><title><![CDATA[New comment by fryz in "After AI beat them, professional Go players got better and more creative"]]></title><description><![CDATA[
<p>FWIW, I find the classical chess tournaments with the super GMs to be fairly interesting, if only because the focus of the games is more about the metagame than about the game itself.<p>The article linked at the bottom of the source is a WSJ piece about how Magnus beats the best players because of the "human element".<p>A lot of the games today are about opening preparation, where the goal is to out-prepare and surprise your opponent by studying opening lines and esoteric responses (an area where computer play has drastically opened up new territory). Similarly, during the middlegame/endgame, the best players will try to force uncomfortable decisions on their opponents, knowing which positions their opponents tend not to prefer. For example, in round 1 of the Candidates, Fabiano took Hikaru into a position that had very little in the way of aggressive counter-play, effectively taking away a big advantage that Hikaru would otherwise have had.<p>Watching these games feels somewhat akin to watching generals develop strategies trying to outmaneuver their counterparts on the other side, taking into consideration their strengths and weaknesses as much as the tactics/deployment of troops/etc.</p>
]]></description><pubDate>Mon, 08 Apr 2024 20:32:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=39973405</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=39973405</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39973405</guid></item><item><title><![CDATA[New comment by fryz in "Grabbing Dinner"]]></title><description><![CDATA[
<p>From the article:<p>> When I asked Jody how much of his family’s meat is wild game, he initially said “about half.” Upon reflection, he bumped the number to 70 percent.<p>Doesn't sound like this is a justification for "culture" or "tradition". Certainly seems a lot more responsible than the average "tradition" of "I got it at the grocery store".<p>When you hunt for your own food, you are forced to consider the sacrifice of the animal and have to put in the work of preparing for the hunt and cleaning the animal - things that anyone who's not done this takes for granted when they eat meat.</p>
]]></description><pubDate>Sun, 17 Sep 2023 19:29:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=37548517</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=37548517</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37548517</guid></item><item><title><![CDATA[How to Think About Production Performance of Generative Text]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.arthur.ai/blog/how-to-think-about-production-performance-of-generative-text">https://www.arthur.ai/blog/how-to-think-about-production-performance-of-generative-text</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=35713868">https://news.ycombinator.com/item?id=35713868</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 26 Apr 2023 14:07:20 +0000</pubDate><link>https://www.arthur.ai/blog/how-to-think-about-production-performance-of-generative-text</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=35713868</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35713868</guid></item><item><title><![CDATA[New comment by fryz in "Netflix's New Chapter"]]></title><description><![CDATA[
<p>Maybe we've not gotten there yet (kids are 2 and 1), but they can watch the same thing thousands of times and it will still glue them to the TV.</p>
]]></description><pubDate>Tue, 24 Jan 2023 17:31:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=34506918</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=34506918</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34506918</guid></item><item><title><![CDATA[New comment by fryz in "Netflix's New Chapter"]]></title><description><![CDATA[
<p>You might not be the right market (or at least, the marketplace might be different for your demographic).<p>I'm a parent, and for me and all my parent friends, Disney+ is the streaming service that generates the most value in our households. Along with all the old/nostalgic Disney animated films, they generate and acquire a lot of the "in" content for kids (Bluey, Mickey Mouse Clubhouse, etc.)<p>Before my kids, Disney+ would have been the first streaming service to get cut. But now, it'll be the last.</p>
]]></description><pubDate>Mon, 23 Jan 2023 21:16:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=34495178</link><dc:creator>fryz</dc:creator><comments>https://news.ycombinator.com/item?id=34495178</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34495178</guid></item></channel></rss>