<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: jackson12t</title><link>https://news.ycombinator.com/user?id=jackson12t</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 15 Jun 2026 07:46:39 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=jackson12t" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by jackson12t in "Claude Fable 5"]]></title><description><![CDATA[
<p>Fable 5's system prompt in Claude Code has several significant changes to help it take advantage of its greater autonomous capabilities compared to Opus.<p>Sharing a diff of the system prompts here: <a href="https://twelvetables.blog/comparing-claude-fable-5s-system-prompt-to-opus-4-8/" rel="nofollow">https://twelvetables.blog/comparing-claude-fable-5s-system-p...</a><p>The big difference is that the system prompt has a whole section dedicated to directing Fable on how to communicate with users and give them more information about the (presumably long-horizon) tasks it has completed.</p>
]]></description><pubDate>Tue, 09 Jun 2026 18:58:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=48465876</link><dc:creator>jackson12t</dc:creator><comments>https://news.ycombinator.com/item?id=48465876</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48465876</guid></item><item><title><![CDATA[Fable 5 System Prompt Comparison to Opus 4.8]]></title><description><![CDATA[
<p>Article URL: <a href="https://TwelveTables.blog/comparing-claude-fable-5s-system-prompt-to-opus-4-8/">https://TwelveTables.blog/comparing-claude-fable-5s-system-prompt-to-opus-4-8/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48465813">https://news.ycombinator.com/item?id=48465813</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 09 Jun 2026 18:54:28 +0000</pubDate><link>https://TwelveTables.blog/comparing-claude-fable-5s-system-prompt-to-opus-4-8/</link><dc:creator>jackson12t</dc:creator><comments>https://news.ycombinator.com/item?id=48465813</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48465813</guid></item><item><title><![CDATA[New comment by jackson12t in "The Illusion of Thinking: Strengths and limitations of reasoning models [pdf]"]]></title><description><![CDATA[
<p>This feels like a bit of a weird way to test 'thinking' in models, and it reminds me of the old story of Gauss[1] and his classmates being assigned the task of adding up the numbers from 1 to 100.<p>I think the way the paper lays out the performance regimes is pretty interesting, but I don't think they achieved their goal of demonstrating that LRMs can't use reasoning to solve complex puzzles organically (without contamination/memorization). IMO, testing the model's ability to define an algorithm to solve the puzzle would have been a better evaluation of that (rather than having the model walk through all of the steps manually). I don't know that I'd use an LRM for this sort of long-tail reasoning, where it has to follow one single process for a long time over just one prompt; if I needed a really long chain of reasoning I'd use an agent or workflow.<p>It sounds more like the tests measure a model's ability to reason coherently and consistently over many steps rather than a model's ability to understand and solve a complex puzzle. For example, for the Tower of Hanoi, a prompt like "Define an algorithm that will find the sequence of moves to transform the initial configuration into the goal configuration" (i.e. "find an arithmetic series formula, young Gauss") seems like it would have been a better approach than "Find the sequence of moves to transform the initial configuration into the goal configuration" (i.e. "add up all these numbers"). 
You can see this in how the study included a step where the LRMs were given the algorithm and then asked to solve the problem: the focus was on an LRM's ability to follow the steps, not its ability to come up with an algorithm/solution on its own.<p>In a job interview, for example, who among us would accept inability to hold all of the `(2^n) - 1` steps of the Tower of Hanoi in our brain as evidence of poor reasoning ability?<p>Again, I think it's a really interesting study covering a model's ability to consistently follow a simple process over time in pursuit of a static objective (and perhaps a useful benchmark moving forward), but I'm not confident that it successfully demonstrates a meaningful deficiency in overall reasoning capability.<p>[1]: <a href="https://www.americanscientist.org/article/gausss-day-of-reckoning" rel="nofollow">https://www.americanscientist.org/article/gausss-day-of-reck...</a></p>
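<p>To make the distinction concrete, here is a minimal Python sketch of the two kinds of answer. It is an illustration only, not anything taken from the paper, and the names `gauss_sum` and `hanoi_moves` are hypothetical. Defining the algorithm takes a few lines; carrying it out by hand means producing all `(2^n) - 1` moves without a slip.</p>

```python
def gauss_sum(n):
    """Closed-form sum of 1..n (the arithmetic series formula)."""
    return n * (n + 1) // 2


def hanoi_moves(n, source="A", target="C", spare="B"):
    """Yield (disk, from_peg, to_peg) moves that transfer n disks to target."""
    if n == 0:
        return
    # Move the n-1 smaller disks out of the way, onto the spare peg.
    yield from hanoi_moves(n - 1, source, spare, target)
    # Move the largest remaining disk straight to the target peg.
    yield (n, source, target)
    # Move the n-1 smaller disks from the spare peg onto the target peg.
    yield from hanoi_moves(n - 1, spare, target, source)


if __name__ == "__main__":
    # The formula agrees with adding the numbers one at a time.
    print(gauss_sum(100) == sum(range(1, 101)))
    # The move count grows as 2^n - 1 even though the algorithm stays tiny.
    for disks in (3, 10):
        print(disks, len(list(hanoi_moves(disks))), 2**disks - 1)
```

<p>For 10 disks that is already 1,023 moves, which is the part the puzzle runs actually stress.</p>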
]]></description><pubDate>Mon, 09 Jun 2025 15:37:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=44225577</link><dc:creator>jackson12t</dc:creator><comments>https://news.ycombinator.com/item?id=44225577</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44225577</guid></item><item><title><![CDATA[Show HN: Visualizing AD data for security with Python and Obsidian]]></title><description><![CDATA[
<p>I put this tool together a few years back, and just recently got around to making it robust enough to share: <a href="https://github.com/pangolinsec/shihtzu" rel="nofollow">https://github.com/pangolinsec/shihtzu</a><p>It parses ldapsearch or dsquery output and writes markdown files with some extra logic added in so you can visualize nested memberships when you open the folder with Obsidian and use the graph view--it's kind of like a very low-powered Bloodhound (<a href="https://github.com/SpecterOps/BloodHound" rel="nofollow">https://github.com/SpecterOps/BloodHound</a>) in that sense, but it is also much quieter. It also parses `useraccountcontrol` and some of the logon-relevant attributes in AD to automatically tag accounts that are particularly interesting or uninteresting to attackers.<p>Some core features:
-  Parses LDAP attributes from text files into structured Obsidian markdown
-  Intelligently categorizes objects as Users, Groups, or Computers
-  Automatically identifies administrators and administrative privileges
-  Detects potentially risky account configurations (stale accounts, low logon counts)
-  Creates Obsidian links between related objects to enable network visualization
-  Processes UserAccountControl (UAC) values with explanations
-  Converts Windows timestamps to human-readable format
-  Smart append mode that only adds new data to existing files
-  Tagging for easy filtering and searching in Obsidian<p>It's not novel, but it's been quite useful for me in a few situations.</p>
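<p>For anyone curious what the `useraccountcontrol` and timestamp handling involves, here is a rough Python sketch. It is not shihtzu's actual code: the function names and the choice of flags are assumptions made for illustration. The bit values are the documented ones, and AD timestamps such as `lastLogonTimestamp` count 100-nanosecond intervals since 1601-01-01 UTC.</p>

```python
from datetime import datetime, timedelta, timezone

# A subset of the documented userAccountControl bit flags.
# Which flags a real tool treats as "interesting" is an assumption here.
UAC_FLAGS = {
    0x0002: "ACCOUNTDISABLE",
    0x0020: "PASSWD_NOTREQD",
    0x0200: "NORMAL_ACCOUNT",
    0x10000: "DONT_EXPIRE_PASSWORD",
    0x80000: "TRUSTED_FOR_DELEGATION",
    0x400000: "DONT_REQ_PREAUTH",
}

# Windows FILETIME values are counted from this instant.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def decode_uac(value):
    """Return the names of the known flags set in a userAccountControl value."""
    return [name for bit, name in UAC_FLAGS.items() if value & bit]


def filetime_to_datetime(filetime):
    """Convert 100-nanosecond ticks since 1601-01-01 UTC to an aware datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


if __name__ == "__main__":
    print(decode_uac(66048))
    print(filetime_to_datetime(116444736000000000).isoformat())
```

<p>A value of 0 (never set) and the "never expires" sentinel used by some attributes would need special-casing in practice; this sketch leaves that out.</p>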
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43568958">https://news.ycombinator.com/item?id=43568958</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 03 Apr 2025 12:59:09 +0000</pubDate><link>https://github.com/pangolinsec/shihtzu</link><dc:creator>jackson12t</dc:creator><comments>https://news.ycombinator.com/item?id=43568958</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43568958</guid></item><item><title><![CDATA[New comment by jackson12t in "[dead]"]]></title><description><![CDATA[
<p>Hey HN! I built this mental model as a way to think about how useful and how detectable different kinds of hacking activities are, based on my experience leading red teams in the public and private sectors.<p>It's been useful for me in thinking about security, and I think it could be helpful for others.</p>
]]></description><pubDate>Thu, 27 Mar 2025 20:18:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=43497710</link><dc:creator>jackson12t</dc:creator><comments>https://news.ycombinator.com/item?id=43497710</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43497710</guid></item></channel></rss>