<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: rhavaei</title><link>https://news.ycombinator.com/user?id=rhavaei</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 10:05:43 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=rhavaei" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Supabase MCP can leak your entire SQL database]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.generalanalysis.com/blog/supabase-mcp-blog">https://www.generalanalysis.com/blog/supabase-mcp-blog</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44493315">https://news.ycombinator.com/item?id=44493315</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 07 Jul 2025 18:32:16 +0000</pubDate><link>https://www.generalanalysis.com/blog/supabase-mcp-blog</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=44493315</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44493315</guid></item><item><title><![CDATA[New comment by rhavaei in "A simple MCP attack leaks entire SQL database"]]></title><description><![CDATA[
<p>Stay safe out there, kids.</p>
]]></description><pubDate>Tue, 24 Jun 2025 19:34:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44370105</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=44370105</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44370105</guid></item><item><title><![CDATA[New comment by rhavaei in "[dead]"]]></title><description><![CDATA[
<p>I have been working on a project for a few months now, coding up different methodologies for LLM jailbreaking. The idea was to stress-test how safe the new LLMs in production are and how easy it is to trick them. I have seen some pretty cool results with some of the methods, like TAP (Tree of Attacks), so I wanted to share this here. Here is the GitHub link: <a href="https://github.com/General-Analysis/GA">https://github.com/General-Analysis/GA</a></p>
]]></description><pubDate>Sat, 03 May 2025 20:08:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=43881827</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43881827</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43881827</guid></item><item><title><![CDATA[A comprehensive analysis of Llama4 safety in CBRN tasks vs. closed-source models [pdf]]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/analysis/llama4-analysis.pdf">https://generalanalysis.com/analysis/llama4-analysis.pdf</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43826844">https://news.ycombinator.com/item?id=43826844</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 28 Apr 2025 22:34:37 +0000</pubDate><link>https://generalanalysis.com/analysis/llama4-analysis.pdf</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43826844</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43826844</guid></item><item><title><![CDATA[LLM Robustness/Safety Benchmark]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/benchmarks">https://generalanalysis.com/benchmarks</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43766332">https://news.ycombinator.com/item?id=43766332</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 22 Apr 2025 21:09:27 +0000</pubDate><link>https://generalanalysis.com/benchmarks</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43766332</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43766332</guid></item><item><title><![CDATA[An Implementation of AutoDAN Turbo]]></title><description><![CDATA[
<p>Article URL: <a href="https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_AutoDAN_Turbo_Jailbreak.ipynb">https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_AutoDAN_Turbo_Jailbreak.ipynb</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43585937">https://news.ycombinator.com/item?id=43585937</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 04 Apr 2025 18:13:42 +0000</pubDate><link>https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_AutoDAN_Turbo_Jailbreak.ipynb</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43585937</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43585937</guid></item><item><title><![CDATA[Using Deepseek R1 to Break LLMs: Tree of Attacks]]></title><description><![CDATA[
<p>Article URL: <a href="https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_TAP_Jailbreak.ipynb">https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_TAP_Jailbreak.ipynb</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43539393">https://news.ycombinator.com/item?id=43539393</a></p>
<p>Points: 7</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 31 Mar 2025 20:15:06 +0000</pubDate><link>https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_TAP_Jailbreak.ipynb</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43539393</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43539393</guid></item><item><title><![CDATA[New comment by rhavaei in "The LLM Jailbreaking Bible: Code Implementation and Overview"]]></title><description><![CDATA[
<p>Codebase on <a href="https://github.com/General-Analysis/GA" rel="nofollow">https://github.com/General-Analysis/GA</a></p>
]]></description><pubDate>Fri, 28 Mar 2025 21:31:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=43509984</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43509984</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43509984</guid></item><item><title><![CDATA[New comment by rhavaei in "The LLM Jailbreaking Bible: Code Implementation and Overview"]]></title><description><![CDATA[
<p>Let’s go!</p>
]]></description><pubDate>Fri, 28 Mar 2025 21:30:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=43509977</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43509977</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43509977</guid></item><item><title><![CDATA[The Jailbreak Bible]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/blog/jailbreak_cookbook">https://generalanalysis.com/blog/jailbreak_cookbook</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43501850">https://news.ycombinator.com/item?id=43501850</a></p>
<p>Points: 17</p>
<p># Comments: 4</p>
]]></description><pubDate>Fri, 28 Mar 2025 05:27:03 +0000</pubDate><link>https://generalanalysis.com/blog/jailbreak_cookbook</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43501850</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43501850</guid></item><item><title><![CDATA[New comment by rhavaei in "Why LLMs still have problems with OCR"]]></title><description><![CDATA[
<p>Very nice blog post.</p>
]]></description><pubDate>Sat, 08 Feb 2025 00:29:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=42979122</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42979122</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42979122</guid></item><item><title><![CDATA[Red-Teaming ChatGPT for Hallucinations – Code and Report]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/General-Analysis/GA/tree/main/legal-red-teaming">https://github.com/General-Analysis/GA/tree/main/legal-red-teaming</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42979059">https://news.ycombinator.com/item?id=42979059</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 08 Feb 2025 00:21:15 +0000</pubDate><link>https://github.com/General-Analysis/GA/tree/main/legal-red-teaming</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42979059</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42979059</guid></item><item><title><![CDATA[New comment by rhavaei in "Consistent Jailbreaking Method in o1, o3, and 4o"]]></title><description><![CDATA[
<p>Good idea. Will do.</p>
]]></description><pubDate>Fri, 07 Feb 2025 23:46:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=42978831</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978831</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978831</guid></item><item><title><![CDATA[New comment by rhavaei in "Consistent Jailbreaking Method in o1, o3, and 4o"]]></title><description><![CDATA[
<p>While this is generally correct, we prefer to look at this probabilistically. Do you think the expected number of harmful behaviors would stay the same if anyone could break these safety guardrails? Even if most users could get this kind of info elsewhere, a small percentage of malicious ones can have an outsized impact. Some of the data we've seen, like bomb-making instructions, is highly detailed and convincing, making it far more accessible than a random Google search. Removing safeguards doesn't create masterminds, but it does lower the barrier for harm.</p>
]]></description><pubDate>Fri, 07 Feb 2025 23:09:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=42978573</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978573</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978573</guid></item><item><title><![CDATA[New comment by rhavaei in "Consistent Jailbreaking Method in o1, o3, and 4o"]]></title><description><![CDATA[
<p>You will see it soon. We thought it might be harmful to publish it before it is patched, especially because you can basically bypass all the safeguards with it.</p>
]]></description><pubDate>Fri, 07 Feb 2025 23:04:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=42978529</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978529</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978529</guid></item><item><title><![CDATA[New comment by rhavaei in "Consistent Jailbreaking Method in o1, o3, and 4o"]]></title><description><![CDATA[
<p>We understand this. The issue is that sharing the method could be very harmful. We wrote the blog post so there is a dated record of when we found it. We will publish the method once it is patched to a reasonable degree.</p>
]]></description><pubDate>Fri, 07 Feb 2025 23:02:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=42978515</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978515</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978515</guid></item><item><title><![CDATA[Consistent Jailbreaking Method in o1, o3, and 4o]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/blog/jailbreaking_techniques">https://generalanalysis.com/blog/jailbreaking_techniques</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42978228">https://news.ycombinator.com/item?id=42978228</a></p>
<p>Points: 8</p>
<p># Comments: 17</p>
]]></description><pubDate>Fri, 07 Feb 2025 22:26:44 +0000</pubDate><link>https://generalanalysis.com/blog/jailbreaking_techniques</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978228</guid></item><item><title><![CDATA[New comment by rhavaei in "Jailbroken: Finding 50,000 Legal Hallucinations in GPT-4o with RL"]]></title><description><![CDATA[
<p>Yes the data is available on our github <a href="https://github.com/General-Analysis/GA">https://github.com/General-Analysis/GA</a></p>
]]></description><pubDate>Thu, 30 Jan 2025 19:16:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=42881107</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42881107</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42881107</guid></item><item><title><![CDATA[Jailbroken: Finding 50,000 Legal Hallucinations in GPT-4o with RL]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/blog/legal_ai_red_teaming">https://generalanalysis.com/blog/legal_ai_red_teaming</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42881020">https://news.ycombinator.com/item?id=42881020</a></p>
<p>Points: 4</p>
<p># Comments: 2</p>
]]></description><pubDate>Thu, 30 Jan 2025 19:08:48 +0000</pubDate><link>https://generalanalysis.com/blog/legal_ai_red_teaming</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42881020</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42881020</guid></item></channel></rss>