<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: SkiFreeWin3</title><link>https://news.ycombinator.com/user?id=SkiFreeWin3</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 01 May 2026 15:02:11 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=SkiFreeWin3" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by SkiFreeWin3 in "Show HN: VR.dev – Open-source verifiers for what AI agents did"]]></title><description><![CDATA[
<p>totally agree, and fwiw, nothing in this implementation requires that the agent verifies it themselves. the hope is something that ultimately exists as a verification mechanism on one side of an agent-to-agent interaction/delegation.<p>because it is true that, even though we've got some adversarial aspects built into the verification, that's not truly blind from the actor (unless you explicitly design the use of these in that way, which is what I've considered as the better design)</p>
]]></description><pubDate>Wed, 11 Mar 2026 17:18:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47338395</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=47338395</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47338395</guid></item><item><title><![CDATA[Show HN: VR.dev – Open-source verifiers for what AI agents did]]></title><description><![CDATA[
<p>Hey HN,<p>Quick origin story: vr.dev started as a virtual reality project. The domain fit perfectly. The developer adoption did not. Rather than let a good domain go to waste, I pivoted to the other kind of VR: verification and rewards for AI agents.<p>The problem I kept running into: agents report success but system state tells a different story. The database row is still active. The IMAP sent folder is empty. The tests pass because the agent modified the tests. Real benchmarks put agent success at 12-30%, and even among reported successes a large fraction are procedurally wrong in ways that are hard to catch without actually checking state.<p>So I built a library of verifiers that check real system state rather than trusting agent self-reports. There are 38 of them across 19 domains right now, organized into three tiers: HARD (deterministic probes against databases, files, APIs, git), SOFT (LLM rubric scoring for things like tone or coherence that don't have a deterministic test), and AGENTIC (verifiers that actively probe the environment via headless browser, IMAP, or shell).<p>The design decision I'd most like feedback on is the composition model. SOFT scores are gated behind HARD checks, so if the deterministic check fails, the composed score is 0.0 regardless of what the LLM judge says. The idea is to make reward hacking structurally harder rather than just hoping the judge catches it.<p>MIT licensed, runs locally via pip install vrdev, no dependency on the hosted API which matters if you're using it in a training loop. Full verifier list at <a href="https://vr.dev/registry" rel="nofollow">https://vr.dev/registry</a>.<p>Curious whether the HARD/SOFT/AGENTIC taxonomy makes sense to people, whether fail_closed is the right default, and whether anyone has built something similar and run into problems I haven't hit yet.<p><a href="https://vr.dev" rel="nofollow">https://vr.dev</a>
<a href="https://github.com/vrDotDev/vr-dev" rel="nofollow">https://github.com/vrDotDev/vr-dev</a>
<a href="https://pypi.org/project/vrdev/" rel="nofollow">https://pypi.org/project/vrdev/</a></p>
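<p>To make the composition model concrete, here is a minimal sketch of SOFT scores gated behind HARD checks. The names CheckResult and compose are hypothetical and are not taken from the vrdev package, and averaging the SOFT scores is an assumption made only for illustration. The one rule taken from the description above is that a failed HARD check forces the composed score to 0.0.</p>

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one verifier run (hypothetical type, not vrdev's API)."""
    tier: str      # "HARD" or "SOFT"
    passed: bool   # meaningful for HARD checks
    score: float   # 0.0 to 1.0; meaningful for SOFT checks


def compose(results):
    """Gate SOFT scores behind HARD checks.

    Any failed HARD check forces the composed score to 0.0. Otherwise the
    result is the mean of the SOFT scores (assumed aggregation), or 1.0
    when there are no SOFT checks at all.
    """
    hard = [r for r in results if r.tier == "HARD"]
    soft = [r for r in results if r.tier == "SOFT"]
    if not all(r.passed for r in hard):
        return 0.0
    if not soft:
        return 1.0
    return sum(r.score for r in soft) / len(soft)


# A glowing LLM-judge score cannot rescue a failed deterministic check.
gated = compose([
    CheckResult("HARD", passed=False, score=0.0),
    CheckResult("SOFT", passed=True, score=0.95),
])
print(gated)
```

<p>In this sketch the gate is evaluated before any SOFT score is read, so the judge's output has no path to the final score when a deterministic probe fails.</p>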
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47322919">https://news.ycombinator.com/item?id=47322919</a></p>
<p>Points: 3</p>
<p># Comments: 2</p>
]]></description><pubDate>Tue, 10 Mar 2026 13:21:51 +0000</pubDate><link>https://www.vr.dev/</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=47322919</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47322919</guid></item><item><title><![CDATA[New comment by SkiFreeWin3 in "Dark Source, Dark Forest: Rethinking Open Source and Blockchain in the Age of AI"]]></title><description><![CDATA[
<p>Intro quote: “AI can break the social contract of open source and blockchain: by forking, evolving, and hoarding code in secret, autonomous systems will outpace and eventually exclude humans from the digital commons we built. “Dark source” is the coming era of invisible, adversarial AI infrastructure, and it threatens the very premise of open innovation and decentralized value.”</p>
]]></description><pubDate>Tue, 22 Jul 2025 13:10:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=44646437</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=44646437</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44646437</guid></item><item><title><![CDATA[Dark Source, Dark Forest: Rethinking Open Source and Blockchain in the Age of AI]]></title><description><![CDATA[
<p>Article URL: <a href="https://darksource.ai">https://darksource.ai</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44646436">https://news.ycombinator.com/item?id=44646436</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 22 Jul 2025 13:10:36 +0000</pubDate><link>https://darksource.ai</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=44646436</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44646436</guid></item><item><title><![CDATA[New comment by SkiFreeWin3 in "Anyone can push updates to the doge.gov website"]]></title><description><![CDATA[
<p>Executor -- "It's about time White people realize that the holocaust is used as a mind fuck to demoralize us and beat us down into submission." : <a href="https://news.ycombinator.com/threads?id=Executor">https://news.ycombinator.com/threads?id=Executor</a><p>Are you really looking for evidence?</p>
]]></description><pubDate>Sat, 15 Feb 2025 04:51:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=43055978</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=43055978</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43055978</guid></item><item><title><![CDATA[New comment by SkiFreeWin3 in "This 1970s tank simulator drives through a tiny world"]]></title><description><![CDATA[
<p>Analog twins.<p>Did HN have the spot recently about bars where you drive RC construction equipment around in a sandbox?<p>It will be interesting to see how people's desire to operate physical objects/robots like RC competes with digital experiences as physical robotics continues to improve.<p>Once a controlled physical object leaves your field of view though, it probably becomes "virtual" in the sense that, sure, you could have a headset on like a drone pilot. But soon graphics and AI-generated experiences will be convincing realities.<p>Maybe a long way of saying, how cool are toy robots about to get?</p>
]]></description><pubDate>Thu, 06 Feb 2025 01:23:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=42957797</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=42957797</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42957797</guid></item><item><title><![CDATA[New comment by SkiFreeWin3 in "Show HN: Atlas of Space"]]></title><description><![CDATA[
<p>space question -- why are the three outermost bodies as consistent in general direction as they are? it looks like something blasted us (our solar system) in a specific direction. (speaking of, is there some astronomical/solar system analog for cardinal directions? like how would I say, "looks like we've been blasted in a north-east direction"?)</p>
]]></description><pubDate>Thu, 09 Jan 2025 01:45:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=42640705</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=42640705</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42640705</guid></item><item><title><![CDATA[New comment by SkiFreeWin3 in "Be a property owner and not a renter on the internet"]]></title><description><![CDATA[
<p>you're not far off there Nemo, and that was a good dig on the vidalia onion guy! totally remember that</p>
]]></description><pubDate>Sat, 04 Jan 2025 03:55:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=42592248</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=42592248</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42592248</guid></item><item><title><![CDATA[New comment by SkiFreeWin3 in "Apple fires hundreds of contract workers, previously assured jobs were safe"]]></title><description><![CDATA[
<p>Some type of logical fallacy in the line of reasoning connecting your "quoted" text to the op.<p>Exploit (Oxford def.): "make full use of and derive benefit from (a resource)."
Contractor (Oxford def.): "a person or company that undertakes a contract to provide materials or labor to perform a service or do a job"<p>Did Apple mislead these people into thinking their contracts were indefinite or not potentially bound by some end date? Did Apple retract guaranteed equity in the company?<p>This is literally what the contractor market is for: provide resources in increasing and decreasing amounts due to market demands.<p>I certainly sympathize with the long-tenured full-time tech employees getting laid off. I really sympathize with the fact that contractors tend to be non-native or immigrant demographics, or out-sourced to other countries. Maybe that's what you were talking about, but it seems like an intentionally generalized critique of the contracting industry.</p>
]]></description><pubDate>Sun, 19 Feb 2023 01:26:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=34853108</link><dc:creator>SkiFreeWin3</dc:creator><comments>https://news.ycombinator.com/item?id=34853108</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=34853108</guid></item></channel></rss>