<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: dk970</title><link>https://news.ycombinator.com/user?id=dk970</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 08 Apr 2026 01:36:15 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=dk970" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by dk970 in "How do you prevent PII from leaking into test fixtures and staging datasets?"]]></title><description><![CDATA[
<p>We've been bitten by this a few times — real email addresses ending up in CSV test fixtures that got committed to repos. Curious how data engineering teams actually handle this in practice.
Do you gate on it in CI? Manual review? Just trust the process?
We built a small local CLI scanner for this — deterministic pattern matching, no network calls, exits non-zero on HIGH risk findings so you can block PRs. Happy to share if useful but mostly curious what others are doing.</p>
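<p>To make the idea concrete, here is a minimal sketch of that kind of check. It is not the tool's actual code: the pattern names, the regexes, and the HIGH/LOW risk levels are assumptions chosen for illustration. It does deterministic regex matching on local files only, and returns a non-zero code when any HIGH finding appears.</p>

```python
import re
from pathlib import Path

# Hypothetical patterns and risk levels, for illustration only.
PATTERNS = {
    "email": ("HIGH", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    "us_ssn": ("HIGH", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    "ipv4": ("LOW", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
}


def scan_text(text):
    """Return (line_number, kind, risk) tuples for every pattern hit in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, (risk, pattern) in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, kind, risk))
    return findings


def main(paths):
    """Scan each file and return 1 if any HIGH finding exists, else 0."""
    exit_code = 0
    for path in paths:
        # Local read only; no network calls anywhere in this sketch.
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        for lineno, kind, risk in scan_text(text):
            print(f"{path}:{lineno}: {risk} {kind}")
            if risk == "HIGH":
                exit_code = 1
    return exit_code
```

<p>In a CI job you would call something like sys.exit(main(sys.argv[1:])) from the entry point, so that a HIGH finding fails the step and blocks the PR.</p>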
]]></description><pubDate>Tue, 07 Apr 2026 20:38:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47681049</link><dc:creator>dk970</dc:creator><comments>https://news.ycombinator.com/item?id=47681049</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47681049</guid></item><item><title><![CDATA[How do you prevent PII from leaking into test fixtures and staging datasets?]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/certifieddata/pii-scan">https://github.com/certifieddata/pii-scan</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47681048">https://news.ycombinator.com/item?id=47681048</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 07 Apr 2026 20:38:36 +0000</pubDate><link>https://github.com/certifieddata/pii-scan</link><dc:creator>dk970</dc:creator><comments>https://news.ycombinator.com/item?id=47681048</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47681048</guid></item><item><title><![CDATA[New comment by dk970 in "DispoRx: Using Agentic AI as a high-fidelity simulator for ER workflows"]]></title><description><![CDATA[
<p>Premature send. Also, is there any logging or validation around why agentic decisions are being made?</p>
]]></description><pubDate>Tue, 07 Apr 2026 16:33:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47677871</link><dc:creator>dk970</dc:creator><comments>https://news.ycombinator.com/item?id=47677871</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47677871</guid></item><item><title><![CDATA[New comment by dk970 in "DispoRx: Using Agentic AI as a high-fidelity simulator for ER workflows"]]></title><description><![CDATA[
<p>This is a super interesting platform play. Question: do you guys use synthetic data, or how else do you protect PII?</p><p>dk970</p>
]]></description><pubDate>Tue, 07 Apr 2026 16:32:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47677851</link><dc:creator>dk970</dc:creator><comments>https://news.ycombinator.com/item?id=47677851</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47677851</guid></item><item><title><![CDATA[New comment by dk970 in "Good Taste the Only Real Moat Left"]]></title><description><![CDATA[
<p>The new world order is what not to build...</p>
]]></description><pubDate>Tue, 07 Apr 2026 15:57:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47677303</link><dc:creator>dk970</dc:creator><comments>https://news.ycombinator.com/item?id=47677303</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47677303</guid></item></channel></rss>