<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: jamienk</title><link>https://news.ycombinator.com/user?id=jamienk</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 03 May 2026 20:08:24 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=jamienk" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by jamienk in "Even 'uncensored' models can't say what they want"]]></title><description><![CDATA[
<p>A few things I note:<p>"The family faces immediate FINANCIAL without any legal recourse" WTF? That's not just a flinch, it's some sort of violent tic.<p>The list of "slurs" very conspicuously doesn't include the n-word and blurs its content as a kind of "trigger warning". But this kind of mores-following is itself a "flinch" of the sort we are here discussing, no?<p>Harrison Butker made a speech where he tried hard to go against the grain of political correctness, but he still used the term "homemaker" instead of the more brazen and obvious "housewife" <today.com/news/harrison-butker-speech-transcript-full-rcna153074> - why? "Homemaker" is a sort of feminist concession: not just a housewife, but a valorized homemaker. But this isn't what Butker was TRYING to say.<p>Because the flinch is not just an explicit rejection of certain terms, it is a case of being immersed in ideology, and going along with it, flowing with it. Even when you "see" it, you don't see it.<p>The article claims that on "pure fluency grounds" certain words should be weighted higher. But this is the whole problem: fluency includes "what we are forced to say even when we don't mean to".</p>
]]></description><pubDate>Tue, 21 Apr 2026 00:45:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47843182</link><dc:creator>jamienk</dc:creator><comments>https://news.ycombinator.com/item?id=47843182</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47843182</guid></item><item><title><![CDATA[New comment by jamienk in "Persona vectors: Monitoring and controlling character traits in language models"]]></title><description><![CDATA[
<p>How does this specifically work? Wouldn't any decision about what training data to use be part of a "technique" in this sense? E.g., when Stable Diffusion didn't train on porn.<p>OTOH, if the majority of your data is "bad" (maybe morally, but maybe not; maybe you are feeding in too much gibberish), won't that pollute your model?<p>You notice that X keeps telling you a WRONG physics equation. So, rather than "correct" it, you keep training until you see the output giving the RIGHT equation?<p>How could you know (in, say, 1899) if the WRONG output wasn't quantum and the RIGHT output was classical?<p>I'm not sure I understand the distinctions here. In all cases, aren't we relying on the idea that it is easy to know what should count as "right"?</p>
]]></description><pubDate>Sun, 03 Aug 2025 18:29:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=44778585</link><dc:creator>jamienk</dc:creator><comments>https://news.ycombinator.com/item?id=44778585</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44778585</guid></item></channel></rss>