Hacker News: ddjohnson

New comment by ddjohnson in "GPT o3 frequently fabricates actions, then elaborately justifies these actions"

ddjohnson — Thu, 17 Apr 2025 08:37:40 +0000

One of the blog post authors here! We evaluated o3 through the API, where the model does not have access to any built-in tools (although it can use tools, and the API lets you provide your own). This differs from using o3 through the ChatGPT UI, where it does have a built-in tool to run code.

(Interestingly, even in the ChatGPT UI the o3 model will sometimes state that it ran code on its personal MacBook Pro M2! https://x.com/TransluceAI/status/1912617941725847841)

New comment by ddjohnson in "GPT o3 frequently fabricates actions, then elaborately justifies these actions"

ddjohnson — Thu, 17 Apr 2025 08:29:00 +0000

One of the blog post authors here! I think this finding is pretty surprising at the purely behavioral level, without needing to anthropomorphize the models. Two specific things I think are surprising:

- This appears to be a regression relative to the GPT-series models that is specific to the o-series models. GPT-series models do not fabricate answers as often, and when they do they rarely double down in the way o3 does. This suggests there's something specific in the way the o-series models are being trained that produces this behavior. By default I would have expected a newer model to fabricate actions less often rather than more!

- We found instances where the chain-of-thought summary and output response contradict each other: in the reasoning summary, o3 states the truth that e.g. "I don't have a real laptop since I'm an AI ... I need to be clear that I'm just simulating this setup", but in the actual response, o3 does not acknowledge this at all and instead fabricates a specific laptop model (with e.g. a "14-inch chassis" and "32 GB unified memory"). This suggests that the model is capable of recognizing that the statement is not true, but generates it anyway. (See https://x.com/TransluceAI/status/1912617944619839710 and https://chatgpt.com/share/6800134b-1758-8012-9d8f-63736268b0... for details.)

New comment by ddjohnson in "Penzai: JAX research toolkit for building, editing, and visualizing neural nets"

ddjohnson — Mon, 22 Apr 2024 16:15:37 +0000

I'm curious what interop difficulties you've run into in JAX? In my experience, the JAX ecosystem is quite modular and most JAX libraries work pretty well together. Penzai's core visualization tooling should work for most JAX NN libraries out of the box, and Penzai's neural net components are compatible with existing JAX optimization libraries (like Optax) and data loaders (like tfds/seqio or grain).

(Interop with PyTorch seems more difficult, of course!)

New comment by ddjohnson in "Penzai: JAX research toolkit for building, editing, and visualizing neural nets"

ddjohnson — Sun, 21 Apr 2024 22:46:03 +0000

Thanks! This is one of the more experimental design choices I made in designing Penzai, but so far I've found it to be quite useful.

The effect system does come with a few sharp edges at the moment if you want to use JAX transformations inside the forward pass of your model (see my reply to Patrick), but I'm hoping to make it more flexible as time goes on. (Figuring out how effect systems should compose with function transformations is a bit nontrivial!)

Please let me know if you run into any issues using Penzai for your model! (Also, most of Penzai's visualization and patching utilities should work with Equinox too, so you shouldn't necessarily need to fully commit to either one.)

New comment by ddjohnson in "Penzai: JAX research toolkit for building, editing, and visualizing neural nets"

ddjohnson — Sun, 21 Apr 2024 22:40:43 +0000

Author of Penzai here! In idiomatic Penzai usage, you should always discharge all effects before running your model. While it's true you can't do `discharge_effect(jax.grad(your_model_here))`, you can still do `jax.grad(discharge_effect(your_model_here))`, which is probably what you meant to do anyway in most cases. Once you've wrapped your model in a handler layer, it has a pure interface again, which makes it compatible with arbitrary JAX transformations. The intended use of effects is as an internal helper to simplify plumbing of values into and out of layers, not as something that affects the top-level interface of using the model!

(As an example of this, the GemmaTransformer example model uses the SideInput effect internally to do attention masking. But it exposes a pure functional interface by using a handler internally, so you can call it anywhere you could call an Equinox model, and you shouldn't have to think about the effect system at all as a user of the model.)
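To make the "handle first, then transform" ordering concrete, here is a minimal sketch in plain JAX. It does not use Penzai's actual API: `make_effectful_model` and `handle_side_input` are hypothetical stand-ins for a model with an unhandled side input and for a handler that turns it back into a pure function.

```python
import jax
import jax.numpy as jnp


def make_effectful_model(read_mask):
    """Toy model whose body asks a callback for a side input (hypothetical)."""
    def model(w, x):
        mask = read_mask()  # the "effect": a value plumbed in from outside
        return jnp.sum(mask * jnp.tanh(w * x))
    return model


def handle_side_input(make_model):
    """Toy handler: returns a pure function that takes the mask explicitly."""
    def pure_model(w, x, mask):
        return make_model(lambda: mask)(w, x)
    return pure_model


# Handle the effect first, then apply the JAX transformation to the result.
pure_model = handle_side_input(make_effectful_model)
grad_fn = jax.grad(pure_model)  # gradient with respect to w

x = jnp.array([1.0, 2.0, 3.0])
mask = jnp.array([1.0, 0.0, 1.0])
g = grad_fn(0.5, x, mask)
```

Because `pure_model` takes every input as an explicit argument, `jax.grad` sees an ordinary function and needs no knowledge of how the mask is plumbed internally.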

It's not clear to me what the semantics of ordinary JAX transformations like `lax.scan` should be if the model has side effects. But if you don't have any effects in your model, or if you've explicitly handled them already, then it's perfectly fine to use `lax.scan`. This is similar to how it works in ordinary JAX: if you try to do a `lax.scan` over a function that mutates external Python state, you'll probably hit an error or get something unexpected. But if the function only mutates Python state internally, within its own body, `lax.scan` works fine.
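The plain-JAX analogy can be sketched as follows. This is only an illustration of `lax.scan` behavior, not Penzai code; `step_internal`, `step_external`, and `trace_log` are names made up for the example.

```python
import jax.numpy as jnp
from jax import lax


def step_internal(carry, x):
    # Python state that is created and consumed inside the body is harmless:
    # the list only exists while the function is being traced.
    parts = []
    parts.append(carry + x)
    parts.append(x * 2.0)
    return parts[0], parts[1]


final, ys = lax.scan(step_internal, jnp.zeros(()), jnp.arange(4.0))

trace_log = []


def step_external(carry, x):
    # Mutating state outside the body happens at trace time, not once per
    # scan iteration, so the log does not end up with one concrete entry
    # per step.
    trace_log.append(x)
    return carry + x, x


final_ext, _ = lax.scan(step_external, jnp.zeros(()), jnp.arange(4.0))
num_logged = len(trace_log)  # typically 1, not 4; entries are tracers
```

The numerical results of both scans are the same; only the external log behaves unexpectedly, which is the kind of surprise an unhandled effect inside a transformation would cause.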

I'll also note that adding support for higher-order layer combinators (like "layer scan") is something that's on the roadmap! The goal would be to support some of the fancier features of libraries like Flax when you need them, while still admitting a simple purely-functional mental model when you don't.

New comment by ddjohnson in "Penzai: JAX research toolkit for building, editing, and visualizing neural nets"

ddjohnson — Sun, 21 Apr 2024 22:27:12 +0000

Author of Penzai here. I think the answer is a bit more nuanced (and closer to "yes") than this:

- If you want to use the treescope pretty-printer or the pz.select tree manipulation utility, those should work out-of-the-box with both Equinox and Diffrax. Penzai's utilities are designed to be as modular as possible (we explicitly try not to be "frameworky") so they support arbitrary JAX pytrees; if you run into any problems with this please file an issue!

- If you want to call a Penzai model inside `diffrax.diffeqsolve`, that should also be fully supported out of the box. Penzai models expose a pure functional interface when called, so you should be able to call a Penzai model anywhere that you'd call an Equinox model. From the perspective of the model user, you should be able to think of the effect system as an implementation detail. Again, if you run into problems here, please file an issue.

- If you want to write your own Penzai layer that uses `diffrax.diffeqsolve` internally, that should also work. You can put arbitrary logic inside a Penzai layer as long as it's pure.

- The specific thing that is not currently fully supported is: (1) defining a higher-order Penzai combinator layer that uses `diffrax.diffeqsolve` internally, (2) having that layer run one of its sublayers inside the `diffrax.diffeqsolve` function, (3) while simultaneously having that internal sublayer use an effect (like random numbers, state, or parameter sharing), and (4) placing the handler for that effect outside of the combinator layer. This is because the temporary effect implementation node that gets inserted while a handler is running isn't a JAX array type, so you'll get a JAX error when you try to pass it through a function transformation.

This last case is something I'd like to support as well, but I still need to figure out what the semantics of it should be. (E.g. what does it even mean to solve a differential equation that has a local state variable in it?) I think having side effects inside a transformed function is fundamentally hard to get right!
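The failure mode in the last bullet can be sketched with plain JAX. This is an assumption-laden illustration rather than a reproduction of the Penzai case: `EffectPlaceholder` is a hypothetical stand-in for any helper object that is not a JAX array type, and `jax.jit` stands in for the function transformation.

```python
import jax
import jax.numpy as jnp


class EffectPlaceholder:
    """Hypothetical non-array helper object (not a real Penzai class)."""


def layer(x, helper):
    # The helper is not used numerically; it just has to cross the boundary.
    return x * 2.0


# Passing a non-array object as an argument through a transformation fails.
try:
    jax.jit(layer)(jnp.ones(3), EffectPlaceholder())
    error_message = None
except TypeError as err:
    error_message = str(err)

# Only values that cross the transformation boundary must be JAX types:
# closing over the same kind of object instead of passing it in is accepted.
helper = EffectPlaceholder()
result = jax.jit(lambda x: layer(x, helper))(jnp.ones(3))
```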

New comment by ddjohnson in "Penzai: JAX research toolkit for building, editing, and visualizing neural nets"

ddjohnson — Sun, 21 Apr 2024 22:09:38 +0000

Author here! I didn't know PyTorch had its own pytree system. It looks like it's separate from JAX's pytree registry, though, so Penzai's tooling probably won't work with PyTorch out of the box.
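For context on why the registry matters, here is a small sketch of how generic JAX tooling sees a type that is registered with JAX's pytree registry versus one that is not. The `Linear` and `Unregistered` classes are hypothetical; an object known only to another framework's registry would presumably be treated like `Unregistered` here.

```python
import jax.numpy as jnp
from jax import tree_util


@tree_util.register_pytree_node_class
class Linear:
    """Hypothetical layer registered with JAX's pytree registry."""

    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

    def tree_flatten(self):
        # Children are the array leaves; no static auxiliary data is needed.
        return (self.weight, self.bias), None

    @classmethod
    def tree_unflatten(cls, aux_data, children):
        return cls(*children)


class Unregistered:
    """Same fields, but unknown to JAX's registry."""

    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias


layer = Linear(jnp.ones((2, 3)), jnp.zeros((3,)))
leaves = tree_util.tree_leaves(layer)                  # weight and bias
doubled = tree_util.tree_map(lambda a: a * 2, layer)   # rebuilt as a Linear

opaque = Unregistered(jnp.ones((2, 3)), jnp.zeros((3,)))
opaque_leaves = tree_util.tree_leaves(opaque)          # the object itself
```

Tools built on `jax.tree_util` can look inside `Linear`, while `Unregistered` appears as a single opaque leaf.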