<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: mickdarling</title><link>https://news.ycombinator.com/user?id=mickdarling</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 21 Jun 2026 08:28:17 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=mickdarling" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by mickdarling in "If Claude Fable stops helping you, you'll never know"]]></title><description><![CDATA[
<p>No, this is their get-out-of-jail-free card. If people start complaining about the model being dumb or forgetful or lying, they can just say, oh well, you must have been doing something that triggered its distillation prevention technique.<p>And they can say that about anybody at any time, and you'll never know why, and there's no way to prove it.<p>Everyone needs a flight data recorder to prove... "here's what I was actually doing and why it was not distillation." And now you're having to prove your innocence instead of them having to prove you're guilty, and really, at the end of the day, it's just the model being stupid that they're protecting themselves from.</p>
]]></description><pubDate>Tue, 09 Jun 2026 22:32:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=48468725</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=48468725</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48468725</guid></item><item><title><![CDATA[New comment by mickdarling in "Claude Fable 5"]]></title><description><![CDATA[
<p>The tags are actually displayed in raw text not rendered.</p>
]]></description><pubDate>Tue, 09 Jun 2026 17:53:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=48464796</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=48464796</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48464796</guid></item><item><title><![CDATA[New comment by mickdarling in "Claude Fable 5"]]></title><description><![CDATA[
<p>Below is the EXACT text in Claude Desktop introducing Fable 5, including the very professional looking break tags, and at least I know where the links begin and end by looking at the anchor tag there.<p>They obviously put their best model on the job to build that.<p>----------------------<p>Fable 5: Our most capable model yet
Our newest model tackles your biggest challenges with fewer check-ins needed.<p>•
<b>Included in your plan limits until Jun 22</b><br><br>Fable takes 2× the usage of Opus.
•
<b>Switch models when a message is flagged</b><br><br>When safety measures flag a message, automatically switch to a different model to keep chatting. When off, your chat will pause instead. <a href="<a href="https://support.claude.com/en/articles/15363606" rel="nofollow">https://support.claude.com/en/articles/15363606</a>" target="_blank" rel="noopener noreferrer">Learn more</a></p>
]]></description><pubDate>Tue, 09 Jun 2026 17:21:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=48464213</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=48464213</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48464213</guid></item><item><title><![CDATA[New comment by mickdarling in "Claude Opus 4.8"]]></title><description><![CDATA[
<p>I effectively distill the frontier models by building whole sets of skills, personas, and other artifacts that I can then run on smaller models, and I get 10%, even 20%, improvements on models like Haiku or local models.<p>There's a lot of room for improving the smaller models at many levels of the stack.</p>
]]></description><pubDate>Thu, 28 May 2026 19:35:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=48314290</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=48314290</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48314290</guid></item><item><title><![CDATA[New comment by mickdarling in "Thoughts and feelings around Claude Design"]]></title><description><![CDATA[
<p>I used it today to take a look at my previously built design system with logos, branding, fonts, and everything else. After a lot of annoying tweaking back and forth, I finally got something that was satisfactory.<p>Then I looked at the usage and it said I had used 95% of my Claude Design usage for the week!<p>This isn't a real tool. This is a plaything, if that's what they're providing as examples.</p>
]]></description><pubDate>Sat, 18 Apr 2026 21:35:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47819716</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=47819716</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47819716</guid></item><item><title><![CDATA[Show HN: DollhouseMCP 2.0, open-source MCP composable AI building blocks]]></title><description><![CDATA[
<p>Hi, I'm Mick. I've been building this for the last 9 months.<p>DollhouseMCP 2.0 is an open-source MCP server for making and using composable building blocks for AI customization. You build elements as portable MD and YAML files, compose them into stacks, and activate the stacks in any MCP-compatible client.<p>Element types that drive behavior and permissions:<p>* Personas: behavioral profiles (how the AI sounds and acts)<p>* Skills: discrete capabilities (what the AI can do)<p>* Agents: goal-oriented, multi-step executors<p>* Ensembles: composed stacks of the above<p>* Plus templates for structured outputs and memories for persistent context.<p>Two things I think are actually new here:<p>1. Identity-based permissioning. When you activate a persona, skill, agent, or
ensemble, its permission policy takes effect in the server. Same client, same LLM, different permission surface depending on which active elements are loaded. 
A read-only analyst persona blocks creates and deletes regardless of what the client allows. A security-focused ensemble can deny specific destructive operations. This runs server-side, after the client approves the call, so policy cannot be overridden by the LLM or the client.<p>2. A bimodal agent loop. Agents do not run free inside the LLM. Every step hands control back to the MCP server, which evaluates the proposed operation
against the active permission stack, runs autonomy and risk checks, enforces any hard blocks, then returns a decision to the LLM with continue, pause, or
escalate guidance. The LLM acts on what it is allowed to do, describes the next step, and hands back to the server. The loop repeats until the goal
completes or a human is asked to intervene. Higher agency stays observable and bounded instead of opaque.<p>There's also an audit trail of approved and denied actions, and a danger zone lockout that prevents the LLM from doing truly dangerous things if they go through the MCP server. And any active Dollhouse agent running through the DollhouseMCP server has its actions evaluated at every step.<p>I added easy configuration through the web console for a wide variety of MCP clients if you use the one-liner, and there are logs and metrics, as well as the local and GitHub-hosted portfolio and collection to save your Dollhouse elements and to share and use others'. They are all validated and scanned several times along their distribution path to keep things as safe as we can.<p>The one-liner install: 
npx @dollhousemcp/mcp-server@latest --web<p>Happy to go deep on the permissioning model, the bimodal agent loop, composition patterns, YAML schema, or anything else.<p>Home: <a href="https://dollhousemcp.com" rel="nofollow">https://dollhousemcp.com</a>
Repo: <a href="https://github.com/DollhouseMCP/mcp-server" rel="nofollow">https://github.com/DollhouseMCP/mcp-server</a>
Collection: <a href="https://collection.dollhousemcp.com" rel="nofollow">https://collection.dollhousemcp.com</a><p>Hope you like it.</p>
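<p>To make the bimodal loop described above concrete, here is a minimal sketch. It is an illustration under stated assumptions, not DollhouseMCP's actual code: the names Element, Verdict, evaluate, and run_agent are hypothetical, operations are reduced to plain strings, and mapping a hard block to "pause" guidance is a guess made for the example.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical active element (persona, skill, agent, or ensemble)."""
    name: str
    denied_ops: set = field(default_factory=set)   # hard blocks
    needs_human: set = field(default_factory=set)  # risky ops needing sign-off


@dataclass
class Verdict:
    allowed: bool
    guidance: str  # "continue", "pause", or "escalate"


def evaluate(operation, active):
    """Server-side check of one proposed step against every active element."""
    # Any deny wins, so one element's policy cannot be loosened by another.
    if any(operation in e.denied_ops for e in active):
        return Verdict(False, "pause")      # assumed mapping for a hard block
    if any(operation in e.needs_human for e in active):
        return Verdict(False, "escalate")
    return Verdict(True, "continue")


def run_agent(proposed_steps, active):
    """Evaluate each proposed step in turn and keep an audit trail.

    proposed_steps stands in for the LLM describing its next operation.
    """
    audit = []
    for op in proposed_steps:
        verdict = evaluate(op, active)
        audit.append((op, verdict.allowed, verdict.guidance))
        if verdict.guidance != "continue":
            break  # control goes to a human instead of back to the LLM
    return audit


# A read-only analyst persona blocks creates and deletes.
analyst = Element("read-only-analyst", denied_ops={"create", "delete"})
audit = run_agent(["read", "list", "delete", "read"], [analyst])
for entry in audit:
    print(entry)
```

<p>In this sketch the run stops at the denied "delete" step, and the final "read" is never evaluated, so the audit trail holds exactly the steps the server saw.</p>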
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47765161">https://news.ycombinator.com/item?id=47765161</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 14 Apr 2026 13:05:32 +0000</pubDate><link>https://dollhousemcp.com/</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=47765161</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47765161</guid></item><item><title><![CDATA[New comment by mickdarling in "Simple self-distillation improves code generation"]]></title><description><![CDATA[
<p>I'm working on a tool to determine which portions of an LLM process can be optimized, and how to measure that optimization and check whether it's optimizable at all. The shaping pattern that they talk about here is directly relevant and makes a whole lot more processes potentially optimizable by looking at the pattern rather than if the metrics just go up or down.</p>
]]></description><pubDate>Sat, 04 Apr 2026 19:02:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=47642155</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=47642155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47642155</guid></item><item><title><![CDATA[New comment by mickdarling in "Claude Code's source code has been leaked via a map file in their NPM registry"]]></title><description><![CDATA[
<p>Hey, LLM, take a look at these several hundred emails and docs in my docs folder from the last few years, before I started using AI, that I wrote personally. Create a list of all of the idiosyncrasies that I have in my writing. Create a file to remember that. And then use that to write any new text that'll be published so it sounds like my authentic voice. Thank you.</p>
]]></description><pubDate>Tue, 31 Mar 2026 20:37:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47593151</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=47593151</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47593151</guid></item><item><title><![CDATA[New comment by mickdarling in "The risk of AI isn't making us lazy, but making "lazy" look productive"]]></title><description><![CDATA[
<p>Maybe for you, reading a paper deeply is the most constructive way that you have to absorb information.<p>For me, it is having a document and interrogating it. Maybe having many sets of documents about a whole category of information. Getting the bullet points, getting the high level, and then interrogating and digging down and being able to get bubbled-up information as I need it.<p>That is the learning style that matches how I learn.<p>I have never been able to skim, so reading a large document WILL teach me that topic, but getting through that doc is tough.<p>I can dump a very large set of docs in a reader that lets me interrogate the whole data set and I can fly through looking for what is interesting to me, and what I may need, and along the way I will likely dive into other parts too.  Asking questions keeps my hyperfocus active.<p>I think it is just a different style.  I have synesthesia and a hard time not working on three to five things at once.  I am used to knowing I learn differently than others.</p>
]]></description><pubDate>Sat, 28 Mar 2026 16:43:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47556208</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=47556208</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47556208</guid></item><item><title><![CDATA[New comment by mickdarling in "[dead]"]]></title><description><![CDATA[
<p>Might be worth something to create an AI summary of the actual documents that are behind the curtain. That way the client can have some assurance of what the deliverables are.  It's a nice ability for the escrow service to be a trusted third party to verify something is there that roughly matches what is expected.  If you do it right, you can even have the LLM prepare the content, encrypt it, and you never even have access to any of that information.</p>
]]></description><pubDate>Sun, 22 Mar 2026 23:36:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47483530</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=47483530</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47483530</guid></item><item><title><![CDATA[New comment by mickdarling in "OpenClaw is a security nightmare dressed up as a daydream"]]></title><description><![CDATA[
<p>I don't use Claw. It is way too dangerous. I built my own system where I know the ins and outs and how they can break.<p>When it comes to agents' tasks, I tend to focus on things that I couldn't do before without automated agents, at least at the going price.<p>The kind of automation I'm doing is more like building a set of agents to generate marketing surveys for me. They take free form input from me and my project.  They aren't particularly sexy but they go off and do something valuable that I literally would never pay for at the prices that they are normally.</p>
]]></description><pubDate>Sun, 22 Mar 2026 22:38:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47483007</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=47483007</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47483007</guid></item><item><title><![CDATA[New comment by mickdarling in "Zulip.com Values"]]></title><description><![CDATA[
<p>You could literally drop this into Claude Code or Codex and point it at a local fork of Zulip and have it build your bimodal version with triage and grazing styles.</p>
]]></description><pubDate>Wed, 11 Feb 2026 09:15:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46972707</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46972707</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46972707</guid></item><item><title><![CDATA[New comment by mickdarling in "GitHub Agentic Workflows"]]></title><description><![CDATA[
<p>I use an LLM behavior test to see if the semantic responses from LLMs using my MCP server match what I expect them to. This goes beyond the regex tests: it checks whether the response is semantically appropriate. Sometimes the LLMs kick back an unusual response that technically is a no, but effectively is a yes.  Different models can behave semantically differently too.<p>If I had a nice CI/CD workflow that was built into GitHub rather than rolling my own that I have running locally, that might just make it a little more automatic and a little easier.</p>
]]></description><pubDate>Sun, 08 Feb 2026 16:36:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=46935875</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46935875</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46935875</guid></item><item><title><![CDATA[New comment by mickdarling in "GitHub Agentic Workflows"]]></title><description><![CDATA[
<p>It looks like it does have an MCP Gateway <a href="https://github.com/github/gh-aw-mcpg" rel="nofollow">https://github.com/github/gh-aw-mcpg</a> so I may see how well it works with my MCP server.  Among the components mine makes are agent elements with my own permissioning, security, memory, and skills. I put explicit programmatic hard stops on my agents if they do something that is dangerous or destructive.<p>As for the domain, this is the same account that has been hosting GitHub projects for more than a decade.  Pretty sure it is legit. Org ID is 9,919 from 2008.</p>
]]></description><pubDate>Sun, 08 Feb 2026 16:31:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=46935817</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46935817</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46935817</guid></item><item><title><![CDATA[New comment by mickdarling in "LLMs could be, but shouldn't be compilers"]]></title><description><![CDATA[
<p>This is where the desire to NOT anthropomorphize LLMs actually gets in the way.<p>We have mechanisms for ensuring output from humans, and those are nothing like ensuring the output from a compiler. We have checks on people, we have whole industries of people whose whole careers are managing people, to manage other people, to manage other people.<p>With regard to predictability, LLMs essentially behave like people in this manner. The same kind of checks that we use for people are needed for them, not the same kind of checks we use for software.</p>
]]></description><pubDate>Fri, 06 Feb 2026 14:48:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=46913476</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46913476</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46913476</guid></item><item><title><![CDATA[New comment by mickdarling in "Clawdbot - open source personal AI assistant"]]></title><description><![CDATA[
<p>I'm looking at it right now as a tool I can hollow out and stuff in my own MCP server that also has personas, skills, an agentic loop, memory, all those pieces. I may even go simpler than that and simply take a look at its gateway and channels and drag those over and slap them onto the MCP server I have and turn it into an independent application.<p>It looks far too risky to use, even if I have it sequestered in its own VM. I'm not comfortable with its present state.</p>
]]></description><pubDate>Mon, 26 Jan 2026 15:07:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=46766519</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46766519</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46766519</guid></item><item><title><![CDATA[New comment by mickdarling in "Cursor's latest “browser experiment” implied success without evidence"]]></title><description><![CDATA[
<p>Thank you for the note. It's not a site I used all that often.<p>Whether you had anything to do with it or not, I have no idea.  And, since you didn't follow best practices and tell me directly rather than trying to score points here, there's really no way of knowing whether you're the one who caused the problem in the first place.<p>I built a new site without WordPress. That took less than a day.<p>I don't imagine you will alter your behavior to align with general best security practices anytime soon.</p>
]]></description><pubDate>Sun, 18 Jan 2026 23:29:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=46673292</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46673292</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46673292</guid></item><item><title><![CDATA[New comment by mickdarling in "Cursor's latest “browser experiment” implied success without evidence"]]></title><description><![CDATA[
<p>Wasn't my project to manage. That was a consulting gig.  And I fired the client right after this.</p>
]]></description><pubDate>Sat, 17 Jan 2026 15:14:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=46658678</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46658678</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46658678</guid></item><item><title><![CDATA[New comment by mickdarling in "Cursor's latest “browser experiment” implied success without evidence"]]></title><description><![CDATA[
<p>Had humans not been doing this already, I would have walked into Samsung with the demo application that was working an hour before my meeting, rather than the Android app that could only show me the opening logo.<p>There are a lot of really bad human developers out there, too.</p>
]]></description><pubDate>Fri, 16 Jan 2026 22:11:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=46652899</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46652899</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46652899</guid></item><item><title><![CDATA[New comment by mickdarling in "First impressions of Claude Cowork"]]></title><description><![CDATA[
<p>I think Claude Cowork should come with a requirement or a very heavily structured wizard process to ensure the machine has something like a Time Machine backup or other backups that are done regularly, before it is used by folks.<p>The failure modes are just too rough for most people to think about until it's too late.</p>
]]></description><pubDate>Thu, 15 Jan 2026 20:29:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=46638796</link><dc:creator>mickdarling</dc:creator><comments>https://news.ycombinator.com/item?id=46638796</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46638796</guid></item></channel></rss>