<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: rhavaei</title><link>https://news.ycombinator.com/user?id=rhavaei</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 10:05:43 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=rhavaei" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Supabase MCP can leak your entire SQL database]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.generalanalysis.com/blog/supabase-mcp-blog">https://www.generalanalysis.com/blog/supabase-mcp-blog</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44493315">https://news.ycombinator.com/item?id=44493315</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 07 Jul 2025 18:32:16 +0000</pubDate><link>https://www.generalanalysis.com/blog/supabase-mcp-blog</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=44493315</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44493315</guid></item><item><title><![CDATA[New comment by rhavaei in "A simple MCP attack leaks entire SQL database"]]></title><description><![CDATA[
<p>Stay safe out there, kids.</p>
]]></description><pubDate>Tue, 24 Jun 2025 19:34:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44370105</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=44370105</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44370105</guid></item><item><title><![CDATA[New comment by rhavaei in "[dead]"]]></title><description><![CDATA[
<p>I have been working on a project for a few months now, coding up different methodologies for LLM jailbreaking. The idea was to stress-test how safe the new LLMs in production are and how easy it is to trick them. I have seen some pretty cool results with some of the methods, like TAP (Tree of Attacks), so I wanted to share this here. Here is the GitHub link: <a href="https://github.com/General-Analysis/GA">https://github.com/General-Analysis/GA</a></p>
]]></description><pubDate>Sat, 03 May 2025 20:08:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=43881827</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43881827</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43881827</guid></item><item><title><![CDATA[A comprehensive analysis of Llama4 safety in CBRN tasks vs. closed-source models [pdf]]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/analysis/llama4-analysis.pdf">https://generalanalysis.com/analysis/llama4-analysis.pdf</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43826844">https://news.ycombinator.com/item?id=43826844</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 28 Apr 2025 22:34:37 +0000</pubDate><link>https://generalanalysis.com/analysis/llama4-analysis.pdf</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43826844</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43826844</guid></item><item><title><![CDATA[LLM Robustness/Safety Benchmark]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/benchmarks">https://generalanalysis.com/benchmarks</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43766332">https://news.ycombinator.com/item?id=43766332</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 22 Apr 2025 21:09:27 +0000</pubDate><link>https://generalanalysis.com/benchmarks</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43766332</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43766332</guid></item><item><title><![CDATA[An Implementation of AutoDAN Turbo]]></title><description><![CDATA[
<p>Article URL: <a href="https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_AutoDAN_Turbo_Jailbreak.ipynb">https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_AutoDAN_Turbo_Jailbreak.ipynb</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43585937">https://news.ycombinator.com/item?id=43585937</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 04 Apr 2025 18:13:42 +0000</pubDate><link>https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_AutoDAN_Turbo_Jailbreak.ipynb</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43585937</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43585937</guid></item><item><title><![CDATA[Using Deepseek R1 to Break LLMs: Tree of Attacks]]></title><description><![CDATA[
<p>Article URL: <a href="https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_TAP_Jailbreak.ipynb">https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_TAP_Jailbreak.ipynb</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43539393">https://news.ycombinator.com/item?id=43539393</a></p>
<p>Points: 7</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 31 Mar 2025 20:15:06 +0000</pubDate><link>https://colab.research.google.com/github/General-Analysis/GA/blob/main/notebooks/General_Analysis_TAP_Jailbreak.ipynb</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43539393</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43539393</guid></item><item><title><![CDATA[New comment by rhavaei in "The LLM Jailbreaking Bible: Code Implementation and Overview"]]></title><description><![CDATA[
<p>Codebase on <a href="https://github.com/General-Analysis/GA" rel="nofollow">https://github.com/General-Analysis/GA</a></p>
]]></description><pubDate>Fri, 28 Mar 2025 21:31:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=43509984</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43509984</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43509984</guid></item><item><title><![CDATA[New comment by rhavaei in "The LLM Jailbreaking Bible: Code Implementation and Overview"]]></title><description><![CDATA[
<p>Let’s go!</p>
]]></description><pubDate>Fri, 28 Mar 2025 21:30:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=43509977</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43509977</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43509977</guid></item><item><title><![CDATA[The Jailbreak Bible]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/blog/jailbreak_cookbook">https://generalanalysis.com/blog/jailbreak_cookbook</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43501850">https://news.ycombinator.com/item?id=43501850</a></p>
<p>Points: 17</p>
<p># Comments: 4</p>
]]></description><pubDate>Fri, 28 Mar 2025 05:27:03 +0000</pubDate><link>https://generalanalysis.com/blog/jailbreak_cookbook</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=43501850</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43501850</guid></item><item><title><![CDATA[New comment by rhavaei in "Why LLMs still have problems with OCR"]]></title><description><![CDATA[
<p>Very nice blog post.</p>
]]></description><pubDate>Sat, 08 Feb 2025 00:29:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=42979122</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42979122</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42979122</guid></item><item><title><![CDATA[Red-Teaming ChatGPT for Hallucinations – Code and Report]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/General-Analysis/GA/tree/main/legal-red-teaming">https://github.com/General-Analysis/GA/tree/main/legal-red-teaming</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42979059">https://news.ycombinator.com/item?id=42979059</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 08 Feb 2025 00:21:15 +0000</pubDate><link>https://github.com/General-Analysis/GA/tree/main/legal-red-teaming</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42979059</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42979059</guid></item><item><title><![CDATA[New comment by rhavaei in "Consistent Jailbreaking Method in o1, o3, and 4o"]]></title><description><![CDATA[
<p>Good idea. Will do.</p>
]]></description><pubDate>Fri, 07 Feb 2025 23:46:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=42978831</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978831</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978831</guid></item><item><title><![CDATA[New comment by rhavaei in "Consistent Jailbreaking Method in o1, o3, and 4o"]]></title><description><![CDATA[
<p>While this is generally correct, we prefer to look at this probabilistically. Do you think the expected number of harmful behaviors would stay the same if anyone could break these safety guardrails? Even if most users could get this kind of info elsewhere, a small percentage of malicious ones can have an outsized impact. Some of the data we've seen, like bomb-making instructions, is highly detailed and convincing, making it far more accessible than a random Google search. Removing safeguards doesn't create masterminds, but it does lower the barrier for harm.</p>
]]></description><pubDate>Fri, 07 Feb 2025 23:09:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=42978573</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978573</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978573</guid></item><item><title><![CDATA[New comment by rhavaei in "Consistent Jailbreaking Method in o1, o3, and 4o"]]></title><description><![CDATA[
<p>You will see it soon. We thought it might be harmful to publish it before it is patched, especially because you can basically bypass all the safeguards with it.</p>
]]></description><pubDate>Fri, 07 Feb 2025 23:04:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=42978529</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978529</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978529</guid></item><item><title><![CDATA[New comment by rhavaei in "Consistent Jailbreaking Method in o1, o3, and 4o"]]></title><description><![CDATA[
<p>We understand this. The issue is that sharing the method could be very harmful. We wrote the blog post so there is a dated record of when we found it. We will publish the method once it is patched to a reasonable degree.</p>
]]></description><pubDate>Fri, 07 Feb 2025 23:02:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=42978515</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978515</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978515</guid></item><item><title><![CDATA[Consistent Jailbreaking Method in o1, o3, and 4o]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/blog/jailbreaking_techniques">https://generalanalysis.com/blog/jailbreaking_techniques</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42978228">https://news.ycombinator.com/item?id=42978228</a></p>
<p>Points: 8</p>
<p># Comments: 17</p>
]]></description><pubDate>Fri, 07 Feb 2025 22:26:44 +0000</pubDate><link>https://generalanalysis.com/blog/jailbreaking_techniques</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42978228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42978228</guid></item><item><title><![CDATA[New comment by rhavaei in "Jailbroken: Finding 50,000 Legal Hallucinations in GPT-4o with RL"]]></title><description><![CDATA[
<p>Yes the data is available on our github <a href="https://github.com/General-Analysis/GA">https://github.com/General-Analysis/GA</a></p>
]]></description><pubDate>Thu, 30 Jan 2025 19:16:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=42881107</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42881107</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42881107</guid></item><item><title><![CDATA[Jailbroken: Finding 50,000 Legal Hallucinations in GPT-4o with RL]]></title><description><![CDATA[
<p>Article URL: <a href="https://generalanalysis.com/blog/legal_ai_red_teaming">https://generalanalysis.com/blog/legal_ai_red_teaming</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42881020">https://news.ycombinator.com/item?id=42881020</a></p>
<p>Points: 4</p>
<p># Comments: 2</p>
]]></description><pubDate>Thu, 30 Jan 2025 19:08:48 +0000</pubDate><link>https://generalanalysis.com/blog/legal_ai_red_teaming</link><dc:creator>rhavaei</dc:creator><comments>https://news.ycombinator.com/item?id=42881020</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42881020</guid></item></channel></rss>