<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: gkamradt</title><link>https://news.ycombinator.com/user?id=gkamradt</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 18 Jun 2026 10:12:18 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=gkamradt" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by gkamradt in "Ask HN: What are you working on? (June 2026)"]]></title><description><![CDATA[
<p>Two projects<p>1. <a href="https://interauth.dev/" rel="nofollow">https://interauth.dev/</a><p>Share a single google doc with your agent (w/o oauth mess)<p>I needed a way to share a single google doc/sheet with my agent<p>I didn’t want to go through the heavy oauth gcp project setup, so I’m using disposable email addresses as the workaround<p>2. Agents.sh<p>I get so many cold emails that could be better if I tell the bots how to talk to and reach me. What’s top of mind for me, how I like to be pitched, etc.<p>So I made a mini platform to put up text/md files. Then added all the perms fun - pw support, expiration, every url has an inbox. Aimed at agents only.<p>Ex: <a href="https://agnts.sh/greg" rel="nofollow">https://agnts.sh/greg</a></p>
]]></description><pubDate>Sun, 14 Jun 2026 21:15:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=48532812</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=48532812</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48532812</guid></item><item><title><![CDATA[Show HN: ARC-AGI-3 Toolkit]]></title><description><![CDATA[
<p>Article URL: <a href="https://docs.arcprize.org">https://docs.arcprize.org</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46816529">https://news.ycombinator.com/item?id=46816529</a></p>
<p>Points: 9</p>
<p># Comments: 1</p>
]]></description><pubDate>Thu, 29 Jan 2026 21:01:27 +0000</pubDate><link>https://docs.arcprize.org</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=46816529</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46816529</guid></item><item><title><![CDATA[New comment by gkamradt in "OpenAI o3-pro"]]></title><description><![CDATA[
<p>o3-pro is not the same as the o3-preview that was shown in Dec '24. OpenAI confirmed this for us. More on that here: <a href="https://x.com/arcprize/status/1932535380865347585" rel="nofollow">https://x.com/arcprize/status/1932535380865347585</a></p>
]]></description><pubDate>Tue, 10 Jun 2025 21:26:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=44241634</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=44241634</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44241634</guid></item><item><title><![CDATA[New comment by gkamradt in "Arc-AGI-2 and ARC Prize 2025"]]></title><description><![CDATA[
<p>Ah yes, two things<p>1. We had a no-data retention agreement with them. We were assured by the highest level of their company + security division that the box our test was run on would be wiped after testing<p>2. We only tested o3 against the semi-private set. We didn't test it with the private eval.</p>
]]></description><pubDate>Mon, 24 Mar 2025 23:58:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=43466703</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=43466703</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43466703</guid></item><item><title><![CDATA[New comment by gkamradt in "Arc-AGI-2 and ARC Prize 2025"]]></title><description><![CDATA[
<p>#4 (private test set) doesn't get used for any public model testing. It is only used on the Kaggle leaderboard where no internet access is allowed.</p>
]]></description><pubDate>Mon, 24 Mar 2025 23:47:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=43466626</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=43466626</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43466626</guid></item><item><title><![CDATA[New comment by gkamradt in "Arc-AGI-2 and ARC Prize 2025"]]></title><description><![CDATA[
<p>Good question! This was one of the main motivations of our "Paper Prize" track. We wanted to reward conceptual progress vs leaderboard chasing. In fact, when we increased the prizes mid-year we awarded more money towards the paper track vs top score.<p>We had 40 papers submitted last year and 8 were awarded prizes. [1]<p>One of the main teams, MindsAI, just published their paper on their novel test-time fine-tuning approach. [2]<p>Jan/Daniel (1st place winners last year) talk all about their progress and journey building out here [3]. Stories like theirs help push the field forward.<p>[1] <a href="https://arcprize.org/blog/arc-prize-2024-winners-technical-report" rel="nofollow">https://arcprize.org/blog/arc-prize-2024-winners-technical-r...</a><p>[2] <a href="https://github.com/MohamedOsman1998/deep-learning-for-arc/blob/main/deep_learning_for_arc.pdf" rel="nofollow">https://github.com/MohamedOsman1998/deep-learning-for-arc/bl...</a><p>[3] <a href="https://www.youtube.com/watch?v=mTX_sAq--zY" rel="nofollow">https://www.youtube.com/watch?v=mTX_sAq--zY</a></p>
]]></description><pubDate>Mon, 24 Mar 2025 23:46:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=43466619</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=43466619</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43466619</guid></item><item><title><![CDATA[New comment by gkamradt in "Arc-AGI-2 and ARC Prize 2025"]]></title><description><![CDATA[
<p>We have a few sets:<p>1. Public Train - 1,000 tasks that are public
2. Public Eval - 120 tasks that are public<p>So for those two we don't have protections.<p>3. Semi Private Eval - 120 tasks that are exposed to 3rd parties. We sign data agreements where we can, but we understand this is exposed and not 100% secure. It's a risk we are open to in order to keep testing velocity. In theory it is very difficult to secure this 100%. The cost to create a new semi-private test set is lower than the effort needed to secure it 100%.<p>4. Private Eval - Only on Kaggle, not exposed to any 3rd parties at all. Very few people have access to this. Our trust vectors are with Kaggle and the internal team only.</p>
]]></description><pubDate>Mon, 24 Mar 2025 21:00:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=43465360</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=43465360</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43465360</guid></item><item><title><![CDATA[New comment by gkamradt in "Arc-AGI-2 and ARC Prize 2025"]]></title><description><![CDATA[
<p>Hey HN, Greg from ARC Prize Foundation here.<p>Alongside Mike Knoop and François Chollet, we’re launching ARC-AGI-2, a frontier AI benchmark that measures a model’s ability to generalize on tasks it hasn’t seen before, and the ARC Prize 2025 competition to beat it.<p>In Dec ‘24, ARC-AGI-1 (2019) pinpointed the moment AI moved beyond pure memorization, as seen with OpenAI's o3.<p>ARC-AGI-2 targets test-time reasoning.<p>My view is that good AI benchmarks don't just measure progress, they inspire it. Our mission is to guide research towards general systems.<p>Base LLMs (no reasoning) are currently scoring 0% on ARC-AGI-2. Specialized AI reasoning systems (like R1 or o3-mini) are <4%.<p>Every ARC-AGI-2 task (100%), however, has been solved by at least two humans, quickly and easily. We know this because we tested 400 people live.<p>Our belief is that once we can no longer come up with quantifiable problems that are "feasible for humans and hard for AI" then we effectively have AGI. ARC-AGI-2 proves that we do not have AGI.<p>Change log from ARC-AGI-1 to ARC-AGI-2:
* The two main evaluation sets (semi-private, private eval) have increased to 120 tasks
* Solving tasks requires more reasoning vs pure intuition
* Each task has been confirmed to have been solved by at least 2 people (many more) out of an average of 7 test takers in 2 attempts or fewer
* Non-training task sets are now difficulty-calibrated<p>The 2025 Prize ($1M, open-source required) is designed to drive progress on this specific gap. Last year's competition (also launched on HN) had 1.5K teams participate and had 40+ research papers published.<p>The Kaggle competition goes live later this week and you can sign up here: <a href="https://arcprize.org/competition" rel="nofollow">https://arcprize.org/competition</a><p>We're in an idea-constrained environment. The next AGI breakthrough might come from you, not a giant lab.<p>Happy to answer questions.</p>
]]></description><pubDate>Mon, 24 Mar 2025 20:37:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=43465162</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=43465162</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43465162</guid></item><item><title><![CDATA[Arc-AGI-2 and ARC Prize 2025]]></title><description><![CDATA[
<p>Article URL: <a href="https://arcprize.org/blog/announcing-arc-agi-2-and-arc-prize-2025">https://arcprize.org/blog/announcing-arc-agi-2-and-arc-prize-2025</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43465147">https://news.ycombinator.com/item?id=43465147</a></p>
<p>Points: 188</p>
<p># Comments: 101</p>
]]></description><pubDate>Mon, 24 Mar 2025 20:35:30 +0000</pubDate><link>https://arcprize.org/blog/announcing-arc-agi-2-and-arc-prize-2025</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=43465147</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43465147</guid></item><item><title><![CDATA[How the cofounder of Zapier recruited me to run a $1M AI competition]]></title><description><![CDATA[
<p>Article URL: <a href="https://gregkamradt.com/writing/arc_prize">https://gregkamradt.com/writing/arc_prize</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41925844">https://news.ycombinator.com/item?id=41925844</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 23 Oct 2024 15:11:35 +0000</pubDate><link>https://gregkamradt.com/writing/arc_prize</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=41925844</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41925844</guid></item><item><title><![CDATA[Scaling LLMs apps via accuracy, latency, cost]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.leverage.to/learn/dev/posts/scaling_llm_apps">https://www.leverage.to/learn/dev/posts/scaling_llm_apps</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41730915">https://news.ycombinator.com/item?id=41730915</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 03 Oct 2024 14:03:48 +0000</pubDate><link>https://www.leverage.to/learn/dev/posts/scaling_llm_apps</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=41730915</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41730915</guid></item><item><title><![CDATA[New comment by gkamradt in "ARC Prize – a $1M+ competition towards open AGI progress"]]></title><description><![CDATA[
<p>Check out the SOTA resources on the guide<p><a href="https://arcprize.org/guide" rel="nofollow">https://arcprize.org/guide</a><p>Happy to answer any questions you have along the way<p>(I'm helping run ARC Prize)</p>
]]></description><pubDate>Tue, 11 Jun 2024 22:41:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=40652494</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=40652494</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40652494</guid></item><item><title><![CDATA[New comment by gkamradt in "ARC Prize – a $1M+ competition towards open AGI progress"]]></title><description><![CDATA[
<p>We put a bunch of detail on getting started in the guide
<a href="https://arcprize.org/guide" rel="nofollow">https://arcprize.org/guide</a><p>Happy to answer any questions you have along the way<p>(I'm helping run ARC Prize)</p>
]]></description><pubDate>Tue, 11 Jun 2024 22:41:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=40652485</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=40652485</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40652485</guid></item><item><title><![CDATA[Show HN: I Built a Semantic De-Deduplicator]]></title><description><![CDATA[
<p>Hey HN Crew!<p>We all have lists...and they can be annoying to de-duplicate.<p>* User feedback
* Groceries
* Employee Surveys
* Bug reports
* You name it<p>Most ways to consolidate like-items work off of keywords or worse, exact phrases (Sheets/Excel).<p>But LLMs are much better at understanding an item's semantic meaning and determining if two items should be combined or not.<p>I decided to build my first python package, The Semantic Deduplicator, to help me consolidate items based on their meaning, not keywords.<p>For Example On Groceries:
['We need more berries', 'I want more more milk', 'Can we get more carbonated water please?', 'We need more sparkling water']
...deduplicated...
['Berries', 'Milk', 'Sparkling Water']<p>How it works:<p>1. Start with an empty list ready to populate<p>2. The first item you add will get 1) transformed into a clean name (user feedback > product request) and 2) added to the list<p>3. While you're adding more items<p>* Check to see if your new item's embedding is close to any existing item<p>* If so, ask the LLM to compare your two items to see if they should be combined<p>* If so, combine them<p>This package is more of an exploration and POC so be careful with it. I'd love to hear any feedback.<p>All the links:<p>* YT Explainer Video: https://www.youtube.com/watch?v=etLsNgkGbeM<p>* Twitter Thread: https://twitter.com/GregKamradt/status/1719760658936545336<p>* Pypi: https://pypi.org/project/semantic-deduplicator/<p>* Github: https://github.com/gkamradt/SemanticDeduplicator</p>
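<p>The "How it works" steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the package's actual API: toy_embed, cosine, deduplicate, and the should_combine callback are hypothetical names, the bag-of-words vector stands in for a real embedding model, and should_combine stands in for the LLM comparison. The clean-name step is skipped, and "combining" here just means keeping the first item seen.</p>

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def deduplicate(items, should_combine, threshold=0.8):
    """Keep one representative per group of items judged to mean the same thing.

    should_combine(new_item, existing_item) is an assumed callback that plays
    the role of the LLM comparison and returns True or False.
    """
    kept = []  # list of (text, embedding) pairs
    for item in items:
        emb = toy_embed(item)
        # Cheap embedding check first, to find existing items that are close.
        candidates = [text for text, e in kept if cosine(emb, e) >= threshold]
        # Only the close candidates are sent to the (stand-in) LLM judge.
        if any(should_combine(item, text) for text in candidates):
            continue  # judged a duplicate of an existing item, so drop it
        kept.append((item, emb))
    return [text for text, _ in kept]
```

<p>The embedding check acts as a cheap filter so the more expensive judge only runs on near matches; in a real setup both the embedding function and the judge would call a model, and the threshold would need tuning.</p>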
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=38101201">https://news.ycombinator.com/item?id=38101201</a></p>
<p>Points: 2</p>
<p># Comments: 2</p>
]]></description><pubDate>Wed, 01 Nov 2023 16:59:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=38101201</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=38101201</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38101201</guid></item><item><title><![CDATA[New comment by gkamradt in "QGIS is the mapping software you didn't know you needed"]]></title><description><![CDATA[
<p>Thank you</p>
]]></description><pubDate>Sat, 11 Feb 2023 17:17:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=34754189</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=34754189</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34754189</guid></item><item><title><![CDATA[New comment by gkamradt in "QGIS is the mapping software you didn't know you needed"]]></title><description><![CDATA[
<p>Ha that would be sweet.<p>Do you have a video link of what you're referring to?<p>I once tried to use the molds to make chocolate representations of the mountains ha! I learned the hard way that tempering is difficult for a novice</p>
]]></description><pubDate>Sat, 11 Feb 2023 17:17:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=34754188</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=34754188</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34754188</guid></item><item><title><![CDATA[New comment by gkamradt in "QGIS is the mapping software you didn't know you needed"]]></title><description><![CDATA[
<p>Thank you!</p>
]]></description><pubDate>Sat, 11 Feb 2023 17:15:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=34754172</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=34754172</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34754172</guid></item><item><title><![CDATA[New comment by gkamradt in "QGIS is the mapping software you didn't know you needed"]]></title><description><![CDATA[
<p>I actually use DEMto3D. It's touchy, but I do post-work on the .stl/3d model in blender so it works out ok for me.<p>If you have weird artifacts, I'm guessing that is due to the underlying data vs QGIS itself. Have you looked at their documentation (<a href="https://demto3d.com/en/" rel="nofollow">https://demto3d.com/en/</a>)?<p>I outline how the whole process works here
<a href="https://www.gregkamradt.com/gregkamradt/2020/2/29/manufacturing-3d-printing-bronze-casting" rel="nofollow">https://www.gregkamradt.com/gregkamradt/2020/2/29/manufactur...</a></p>
]]></description><pubDate>Sat, 11 Feb 2023 17:15:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=34754170</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=34754170</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34754170</guid></item><item><title><![CDATA[New comment by gkamradt in "QGIS is the mapping software you didn't know you needed"]]></title><description><![CDATA[
<p>Here's the process on custom orders. I'll put the link right on the site to try and avoid confusion.<p><a href="https://docs.google.com/document/d/1IkiHG_Z5JS03mWYHv-KNAhi8JgYe1YhoDSLKI-7-0lk/edit#" rel="nofollow">https://docs.google.com/document/d/1IkiHG_Z5JS03mWYHv-KNAhi8...</a><p>edit: whoops added link</p>
]]></description><pubDate>Sat, 11 Feb 2023 17:09:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=34754120</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=34754120</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34754120</guid></item><item><title><![CDATA[New comment by gkamradt in "QGIS is the mapping software you didn't know you needed"]]></title><description><![CDATA[
<p>The upfront costs are pretty expensive.<p>For every new location you do, the process looks like:
1. Get the data and prep it for print (fixed)
2. 3D print it (fixed)
3. Rubber Mold (fixed)
4. Wax Model (variable)
5. Bronze (variable)<p>Steps 1-3 are 40-60% of the costs. So I haven't put the money out of pocket yet to put up new locations. I've let customers ask first and then do them.<p>Surprisingly, most of our orders have been custom<p>Here's my info packet on the custom process
<a href="https://docs.google.com/document/d/1IkiHG_Z5JS03mWYHv-KNAhi8JgYe1YhoDSLKI-7-0lk/edit#" rel="nofollow">https://docs.google.com/document/d/1IkiHG_Z5JS03mWYHv-KNAhi8...</a></p>
]]></description><pubDate>Sat, 11 Feb 2023 17:08:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=34754114</link><dc:creator>gkamradt</dc:creator><comments>https://news.ycombinator.com/item?id=34754114</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34754114</guid></item></channel></rss>