Hacker News: derefr

New comment by derefr in "Slightly reducing the sloppiness of AI generated front end"

derefr — Fri, 12 Jun 2026 20:56:57 +0000

I'd be curious to see a version prompted to recapitulate the style of a Windows 9x app.

Everyone these days seems to fondly recall win9x as the last era when there was an actual visual "system" that applications actually obeyed (...or rather, that every app was forced into obeying, since Windows just wasn't very extensible to performant custom third-party controls until DirectDraw came along. But I digress.) I wonder whether LLMs can build something that actually obeys those rules (i.e. composes everything out of a hierarchy of [simulacra of] first-party W95-era GDI controls — think "Minesweeper is a grid of buttons with icons on them", that kind of thing), rather than just vaguely looking like W95.

New comment by derefr in "Malware developers added nuclear and biological weapons text to their spyware"

derefr — Fri, 12 Jun 2026 20:48:12 +0000

My hypothesis is that making the knowledge of how this stuff works accessible to the public results in a lot of false-positives (from people just playing around) that intelligence agencies have to then sift through / tune filters against; which creates a noise floor for real foreign nuke programs to hide in.

So governments ban anything that could result in false positives (since nobody needs to be doing any of that stuff outside of designated labs anyway), to lower that noise floor; to in turn make catching the foreign nuke programs tractable.

(It's a bit like how fancy mansions always have a completely flat and barren part of the property between an outer perimeter and the start of any gardens/outbuildings/water features/etc. That barren area is a killbox: since nothing is supposed to be there, anything at all that does appear there is a valid target for the mansion's guards to shoot at [or otherwise engage with], without needing to get a clear identification and command approval first. This wouldn't work if the killbox was covered in vision-obscuring decorative features; nor if the mansion had employees, animals, etc. that had a valid reason to wander into the killbox. So such things are prevented, in order to make the problem of perimeter security tractable.)

New comment by derefr in "WASI 0.3"

derefr — Fri, 12 Jun 2026 17:20:17 +0000

> The promise of wasi components has not been fulfilled. The market wants to hotload and link artifacts dynamically. The wasi project requires insider wizardry to use it that way: the offering has been statically linking components before you ship. Defeating 99% of the use cases.

I think both of these points on the spectrum (on the one end, fully static linking of WASI components within a monolithic single-source project; and on the other end, dynamic-at-runtime compiling + linking + loading + "hot instantiating" of arbitrary black-box WASI-component artifacts, with dynamic [presumably reflection-based?] API discovery to drive interaction with those components) are strawmen. There are relatively few "real" use-cases for WASI on either of these ends.

Most of the stuff anyone is really interested in using WASI for (AFAICT, from the use-cases for it I've seen in the wild) involves something closer to the midpoint between these two points. Somewhat dynamic and modular, but with no JIT compilation / components-fetching-components / hot component instantiation / dynamic reflection stuff going on.

Specifically, the "point" of WASI (in my opinion, and in the opinion of most people I've spoken with about it), is to serve as a sort of meta-standard (with tooling) for concrete "pluggable runtime" systems to implement plugin-support SDKs in terms of.

In such "plugin ecosystems", every "plugin" (WASI component) is of the same shape (i.e. exposing the same endpoints, and expecting the same capabilities.) And so the host runtime, and each of its plugins, can be precompiled (separately, in separate projects!) against that shape. And the plugin host can load arbitrary wasm components into a pre-baked plugin "slot" at runtime, because there's no dynamism / introspection / reflection / component framework support required or involved. The plugin host isn't a component itself; it's just ordinary host runtime code, written once. The component framework doesn't load the component; the host runtime does. Etc.

In a sense, this is "custom binding" as you were talking about. But it's custom binding against a WASI-specced target; which is what enables different plugins to be runtime-fungible within the same plugin "slot" from the host's perspective. (While giving you sandboxing for free, unlike the traditional "a plugin is just a DLL that exports certain symbols" approach.) WASI does all the work plugin hosts want of it at component compile/link time: verifying that the plugin is of the expected ABI shape, and guaranteeing sandboxing (by a WASI component inherently being a thing developed to run as an isolate under an abstract machine.) The fact that a compiled+linked WASI-component artifact is a blob of WebAssembly built against certain WIT interfaces, isn't just something tagged onto it by external metadata; it's also something inherent to the structure of the resulting artifact, i.e. a property determinable via static analysis.

At runtime — or slightly earlier, at plugin "install" time — a plugin host might not even keep the component as a component. It might preprocess it into a DLL, or whatever its runtime's equivalent of a DLL is. (A Java class file, say.) The crucial thing was that the component was a component when it was handed off from the downstream developer to the plugin host. Because then any code generated to wrap the component into a host-runtime-native module, is code trusted by the host, rather than code controlled by the downstream developer.
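As a loose analogy only (plain Python, not real WASI or WIT tooling), the fixed-shape plugin "slot" described above might be sketched like this. The names `SLOT_SHAPE`, `check_shape`, and `PluginHost` are hypothetical, and the "shape" here is just export names and parameter counts rather than an actual WIT interface:

```python
import inspect

# Hypothetical stand-in for a WIT world: export name -> number of parameters.
SLOT_SHAPE = {"handle_request": 1, "describe": 0}


def check_shape(plugin) -> None:
    """Reject a plugin at install time if its exports do not match the slot."""
    for name, arity in SLOT_SHAPE.items():
        fn = getattr(plugin, name, None)
        if not callable(fn):
            raise TypeError(f"plugin is missing export {name!r}")
        if len(inspect.signature(fn).parameters) != arity:
            raise TypeError(f"export {name!r} has the wrong arity")


class PluginHost:
    """Ordinary host code, written once against SLOT_SHAPE."""

    def __init__(self):
        self.slot = None

    def install(self, plugin) -> None:
        check_shape(plugin)  # verification happens before any plugin code runs
        self.slot = plugin

    def serve(self, request: str) -> str:
        return self.slot.handle_request(request)


class EchoPlugin:
    """A plugin that matches the slot's shape."""

    def describe(self):
        return "echo"

    def handle_request(self, request):
        return request.upper()


class BrokenPlugin:
    """A plugin that lacks the handle_request export."""

    def describe(self):
        return "broken"
```

The host never inspects a plugin beyond "does it fit the slot", so any two plugins that pass `check_shape` are interchangeable from its perspective. A real host would get the sandboxing and the shape check from the component toolchain rather than from runtime reflection like this.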

---

If your goal is to compile some one-off blob of WebAssembly code into an artifact that you can then e.g. treat like an old-school ActiveX component from browser JavaScript, then yeah, you don't need WASI at all for that. You're not trying to create a plugin ecosystem. You don't need to spec out a standard "socket" for downstream devs to plug into. You're just "plugging X into a thing that expects only and exactly X". So skip WASI; just use a one-off custom binding. (Though I would note that the WASI work has acted as a forcing function for the WebAssembly-component ABI, enabling you to write much richer custom bindings than you would have been able to write before WASI.)

But if you're:

- developing a FaaS runtime like Cloudflare Workers

- developing a game engine that allows "mods"

- developing a cloud-hosted agent sandbox, where the toplevel is code (that invokes LLMs, that invoke capabilities)

- developing a modern replacement for Wordpress, with an aim to allow just as much extensibility but to not repeat Wordpress's endless vulnerabilities

(etc)

...or, in other words, if you are developing an application or service that O(N) downstream-developed things (workloads, plugins, mods, extensions, whatever you want to call them) all plug into; where you want these things to all plug into your host system in a very precise and controlled way, rather than being given free rein to touch anything they want; and where these interaction patterns can all be described in one of a few very specific shapes, with a precisely definable spec for 1. what API the runtime wants to call into on the component; and 2. what APIs the runtime wants to hand to the component, to enable the component to call those APIs...

...then WASI was developed precisely for you.

And, more specifically, WASI was created so that you could:

1. use WASI to define that API spec (as machine-readable WIT files); and then

2. give that spec, and those WIT files, to the developers in your ecosystem;

3. so that they could then use existing WebAssembly+WASI tooling to build WebAssembly components that target your API spec. (Most likely not by expecting them to independently bootstrap a WebAssembly+WASI dev environment; but rather, by you shipping an SDK that embeds WebAssembly+WASI tooling and your WIT files together.)

(I would also note that this — i.e. "the thing WASI solves for" — is actually a rather rare use-case on the whole. Your average dev isn't [and shouldn't be!] building an ecosystem for API-sandboxed plugins of their code. The few devs that do need to construct their own plugin ecosystems around their project, probably can therefore be expected to go quite deep on learning any required "insider wizardry." If that was even required. Which it generally isn't, when all you're trying to do is to load one of N unknown-until-runtime but statically-defined-ABI-shape plugin components; rather than trying to load arbitrary runtime-generated dynamically-defined-ABI-shape components, allowing those components to load or compile+exec further components, etc.)

New comment by derefr in "WASI 0.3"

derefr — Fri, 12 Jun 2026 16:58:39 +0000

> WASI should be ... stable.

The WASI standard is not at 1.0 yet! The people designing WASI are still trying to figure out what people want WASI to be at this point.

This is very likely to involve a lot of major reworking before 1.0, in response to feedback from orgs actually trying to implement WASI-based WebAssembly embeddings into their systems and runtimes. 0.1.x -> 0.2.x was one reworking; 0.2.x -> 0.3.x is another. There may be more of these before an approach is finally settled upon / "locked in" for 1.x.

---

> Let's keep WebAssembly lean and fast!

AFAICT, the entire point of the changes (incl. the more detailed component model) in WASI 0.3 is performance. Not performance of WebAssembly as a black box, though; but rather, performance of the running system as a whole, when a lot of FFI traffic is flowing across the WASI boundary. The richer component model enables lower impedance mismatches and "thinner" FFI-layer implementations.

For example, from the OP:

> WASI 0.2 handed you an output-stream that you wrote into imperatively. WASI 0.3 has you pass in a stream and get back a future that resolves when the write completes.

For some host languages/runtimes, "imperative blocking write calls" is already how writes against IO descriptors are exposed to the programmer. For those languages, WASI 0.2 made sense.

But in other host languages/runtimes, writes against IO descriptors are inherently non-blocking, returning promises or yielding. For those languages, WASI 0.2 "left performance on the table." WASI 0.2 required such languages to implement a blocking IO write abstraction on top of their non-blocking IO write semantics, in order to pass that blocking-IO-write primitive into the WebAssembly component... even if the WebAssembly component was internally concurrent (e.g. compiled from a language like Golang) and so would highly benefit from a non-blocking-IO-write primitive!

Meanwhile, if you require that the host expose a non-blocking-IO-write primitive (as WASI 0.3 does), then for hosts with native non-blocking IO, doing so is free; while for hosts with only blocking IO, non-blocking IO can be "faked" basically for free (i.e. with a global or per-resource linearized write queue on the host side.) And likewise, non-blocking-IO-aware WebAssembly components can freely take advantage of NIO; while WebAssembly components that expect blocking IO only need the tiniest added bit of a codegen shim (`blocking_write(x) => await nonblocking_write(x);`) to fit into a WASI 0.3 world.

In other words, implementing nonblocking IO abstractions on top of blocking IO abstractions costs FFI performance, but implementing blocking IO abstractions on top of nonblocking IO abstractions is "free" (in FFI terms.) Nonblocking IO should therefore be considered the more "primitive" of the two; and so, if you have to choose only BIO or NIO to expose as a capability across a boundary to an unknown peer, NIO should be the one you choose.
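To make the two directions concrete, here is a minimal Python `asyncio` sketch, not WASI code. It assumes a host whose only native primitive is a blocking `write`; `BlockingSink`, `QueuedNonblockingWriter`, and `blocking_write` are hypothetical names invented for illustration. The writer fakes a non-blocking write with a per-resource linearized queue, and the shim shows a blocking-style caller sitting on top of it:

```python
import asyncio


class BlockingSink:
    """A host resource that only offers a blocking write (assumed for illustration)."""

    def __init__(self):
        self.data = []

    def write(self, chunk: bytes) -> None:
        self.data.append(chunk)


class QueuedNonblockingWriter:
    """Fakes a non-blocking write on a blocking-only host with a linearized queue."""

    def __init__(self, sink: BlockingSink):
        self._sink = sink
        self._queue = asyncio.Queue()

    async def run(self) -> None:
        # Single consumer: writes reach the sink in submission order.
        while True:
            chunk, done = await self._queue.get()
            if chunk is None:  # sentinel: stop the worker
                done.set_result(None)
                return
            await asyncio.to_thread(self._sink.write, chunk)
            done.set_result(None)

    def nonblocking_write(self, chunk) -> asyncio.Future:
        done = asyncio.get_running_loop().create_future()
        self._queue.put_nowait((chunk, done))
        return done  # resolves when the write completes


async def blocking_write(writer: QueuedNonblockingWriter, chunk: bytes) -> None:
    """Component-side shim: a 'blocking' write is just an awaited non-blocking one."""
    await writer.nonblocking_write(chunk)


async def main():
    sink = BlockingSink()
    writer = QueuedNonblockingWriter(sink)
    worker = asyncio.create_task(writer.run())
    # Concurrency-aware caller: submit both writes, then wait for both.
    await asyncio.gather(writer.nonblocking_write(b"a"), writer.nonblocking_write(b"b"))
    # Blocking-style caller: one write at a time through the shim.
    await blocking_write(writer, b"c")
    await writer.nonblocking_write(None)
    await worker
    return sink.data
```

Running `asyncio.run(main())` drives both styles of caller against the same writer. The shim is a single `await`, which is the sense in which blocking-on-top-of-nonblocking is cheap.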

---

That being said...

The WASI devs were likely aware of the "FFI optimization opportunities being left on the floor" in WASI 0.2. They likely already wanted to take things in this direction from the beginning. But in WASI 0.2, without async, it was impossible to express the concept of nonblocking IO (i.e. of IO operations returning a promise/future.) They needed to introduce this more "opinionated" (i.e. richer) component model in order to get here.

AFAICT, WASI 0.2 was never intended to be a Release Candidate of the WASI spec. (And WASI 0.3 likely isn't either!)

Rather, WASI 0.2 had areas (like IO) that were purposefully left "under-baked". The WASI team knew people needed some version of these primitives in order for WebAssembly components to usefully integrate into systems at all. But they hadn't yet put in the work on designing how certain aspects of WASI (e.g. async) would work. So they designed WASI 0.2 as a prototype design, based on the limited toolbag of primitives they had already fully agreed upon. Some aspects of WASI turned out to only "want" that limited toolbag of primitives, and so didn't change at all under WASI 0.3 (and might even be in their final shapes.) Other aspects "wanted" things that weren't there, and so experienced over-constrained designs under WASI 0.2, replaced with less-constrained designs under WASI 0.3.

I fully expect there will be more such changes under WASI 0.4. If you don't want to be a guinea pig for major WASI changes, you might want to wait for WASI 1.0. (However long that takes.)

New comment by derefr in "A Call to Action: Stop the FCC's KYC Regime"

derefr — Fri, 12 Jun 2026 16:10:59 +0000

Do they actually need to purchase numbers to do that, though?

I always imagined that there are certain shady providers ("grey-market Twilio" sort of idea) that just let you run single outbound call/text requests through a giant pool of numbers shared with other customers of the service. Perhaps specifically a bank of residential numbers plugged into banks of regular cell phones, like a residential IP proxy service provider.

New comment by derefr in "Queues Don't Fix Overload (2014)"

derefr — Thu, 11 Jun 2026 17:51:13 +0000

You're essentially describing what a silicon engineer would call independent "clock domains" (the stations) and "clock-domain-crossing signals" (the workpieces.) And, indeed, you would also tend to handle clock-domain-crossing signals by sticking an async FIFO between the two clock domains.

New comment by derefr in "Show HN: Lathe – Use LLMs to learn a new domain, not skip past it"

derefr — Sun, 07 Jun 2026 21:24:59 +0000

That's almost it, yes.

But in my experience, to actually get where they're going quickly (as opposed to spending hours and hundreds of dollars stumbling around in the dark), coding agents generally need more interactive hand-holding than that. If you just fire off one non-interactive session and wait for it to come to a stop, the problem usually isn't fully+correctly solved at the point the LLM decides to "finish." And if you then start another non-interactive session to continue the work, the new session will have lost the old session's state/memory/context, and so will stumble through many of the same mistakes / misapprehensions.

What you really want, for a CLI program with a "use coding agent to do X" workflow-step, is for the CLI program to play the role of a human in a temporary durable coding-agent conversation session: prompting the agent; then waiting for it to finish responding (and side-effecting); then either asking the agent itself to evaluate an "am I done yet" predicate with a constrained output syntax; or having the CLI program do its own out-of-band validation of the changes made to the shared state by the agent; where, in either case, if the agent isn't "done yet", then the workflow step must continue poking it — or prompt the human to make a decision on how to proceed (possibly involving providing direct input to the LLM, but this is not ideal; ideally the CLI "abstracts away" the need for the end-user to understand the intricacies of the conversation the program is having with the LLM. Even more ideally, the conversation just whizzes by and the human doesn't have to think about an LLM being involved at all.)

Basically, think of this not as the CLI program saying to an agent "answer me this question" or "edit this file for me", but rather, the CLI program popping open a mini "guided + 99%-of-the-time automated" TUI coding-agent micro-IDE "inside" the workflow, in about the same way that git pops open your EDITOR inside `git commit`.
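A rough sketch of that workflow step in Python, under heavy assumptions: `send` stands in for whatever mechanism posts a message into one durable agent session (no such standard API is implied), and `is_done` is the CLI program's own out-of-band validation. Both names, and the canned follow-up prompt, are hypothetical:

```python
from typing import Callable, List


def run_agent_step(
    send: Callable[[str], str],
    is_done: Callable[[], bool],
    task: str,
    max_turns: int = 5,
) -> List[str]:
    """Drive one durable agent session until out-of-band validation passes.

    `send` posts a message to an existing session (keeping its context) and
    returns the agent's reply. `is_done` is the CLI's own check of the shared
    state, e.g. running a test suite.
    """
    transcript = [send(task)]
    for _ in range(max_turns - 1):
        if is_done():
            return transcript
        # Same session, so the agent keeps everything it has already learned.
        transcript.append(send("Validation still fails; please keep going."))
    if is_done():
        return transcript
    raise RuntimeError("agent did not finish; ask the human how to proceed")


# Demo with a fake agent that needs three turns before validation passes.
state = {"turns": 0}


def fake_send(message: str) -> str:
    state["turns"] += 1
    return f"turn {state['turns']}"


transcript = run_agent_step(fake_send, lambda: state["turns"] >= 3, "fix the build")
```

The point of the shape is that the loop, the stopping rule, and the escalation to the human all live in ordinary program code; only the body of each turn is delegated to the agent.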

New comment by derefr in "Show HN: Lathe – Use LLMs to learn a new domain, not skip past it"

derefr — Sun, 07 Jun 2026 18:29:58 +0000

This approach works well, I agree. But I keep wishing that I could invert it. The architecture I feel like I keep yearning for, is a traditional CLI program that encodes most workflow knowledge/decisions as real code; but which does "just a little bit of coding agent invocation" during one specific workflow step.

Not sure how to accomplish this. Anyone have any suggestions? Are there libraries for this yet? (And how would they even work? It feels like, to do this right, there would have to be some background service that CLI software could expect to interact with via a well-known local IPC socket — similar to how e.g. the docker daemon works. But I'm unaware of any coding agent software/frameworks that expose such an IPC capability...)

New comment by derefr in "Azure Linux Desktop"

derefr — Sat, 06 Jun 2026 17:40:56 +0000

I think the GP used the word "tuned" incorrectly / to make the wrong point here.

A general-purpose OS is one on top of which you can build a stack for any use-case you can think of, and which will cope with whatever stack you lay on it about equally well, because it hasn't been forced into a particular shape where it's much better at some things but much worse at others. A "jack of all trades, master of none" OS.

Microsoft would call all consumer and server editions of Windows "general-purpose OSes." But Windows Datacenter Edition and Windows IoT Core would be non-general-purpose OSes — the former only exists to run hypervisors/SANs, and it doesn't support "stripping off" that layer, so if you used it for anything else, that layer would always be there, bloating things up; and the latter only exists to run on embedded devices, and it doesn't support "adding back" the extra frameworks and services regular Windows has, that would be required to use it for "more" than embedded use-cases.

An OS being "tuned" for a particular substrate (what the OS is good at running on), meanwhile, has nothing to do with the OS's use-case (what can be run well on the OS.)

An analogy: each mobile OEM's spin of Android only works on that OEM's own phones, because that OEM's phones have the required hardware wired to the right SoC pins, and the Android spin ships with a BSP that defines a device tree that matches that expected wiring. Thus, those OEM Android spins are "tuned for" those phones.

But in the end, they're all just Android phones, and they can all do the same things. All of these Android spins are "general-purpose OSes." They're all made to enable you to put any Android software you like on top of them, and run it just fine. (Contrast Android spins made by industrial vendors specifically for automotive or kiosk use-cases, where a given car company or kiosk manufacturer then produces a hardware-customized-and-tuned spin of that already-appliance-purposed spin. You wouldn't use a car-infotainment Android upstream for other use-cases; you'd have to undo all the car-infotainment stuff.)

Azure Linux is exactly like a phone-OEM "tuning" of Android (and unlike a vertical-specific Android spin.) Azure Linux is also like, for another example, the vendor-specific Linux "distros" [really, tunings] that ship as (usually binary-only) images for various Single-Board Computers.

In all three cases, a "tuned" fork of an OS is still intended to run anything a user might want to run on the platform the "tuned" fork was forked off of. It exposes a general-purpose surface to the developer — just one that happens to do some of the general-purpose things you ask it to do, more performantly than a non-"tuned" OS would on the same hardware/substrate.

And, in all three cases, the "tuned" fork accomplishes that by relying on device-specific knowledge and capabilities (i.e. drivers, device-tree entries, kernel patches, etc) that have been burned into the "tuned" fork rather than upstreamed. There's still a HAL between you and that stuff; your workload doesn't need to know the "tuned" fork has been tuned. It just benefits automatically, from the OS having a deeper understanding of the hardware/substrate.

New comment by derefr in "Azure Linux Desktop"

derefr — Sat, 06 Jun 2026 17:20:44 +0000

> Yeah of course, it's a Linux distribution.

That is not a given. There are Linux distributions that run anywhere but are not general-purpose. For example, the various "immutable" Linux distros that exist solely to be used as Kubernetes nodes to host containers.

New comment by derefr in "Pokemon Emerald Ported to WebAssembly (100k FPS)"

derefr — Sat, 06 Jun 2026 16:59:36 +0000

I get the sense that these disassembly/decompilation projects believe that some types of IP-laden asset data can be shipped embedded into the project — not necessarily "legally", but in that they'll likely get away with doing so indefinitely — as long as:

1. those assets are stored in proprietary formats that only the game code itself understands, and

2. no tool exists in the project to extract the assets from these proprietary formats into open formats, unless that tool itself exists only in source-code form in the codebase, and requires the ROM as an input to compile it (even if in the case of such a tool, the ROM is doing nothing but serving as a "key" to unlock compilation.)

Basically, if you have to prove you have your own copy of the IP in order to make their embedded copy of the IP "legible", then it's very hard to construct an evidence-based DMCA takedown order that actually makes any coherent point about the project "distributing" said IP.

That being said, shipping assets like this at all, even if you "can get away with it", is ultimately just a kind of laziness / shortcut-taking. They do it because there's either no clear/simple/obvious way to automatically extract the given asset data from the ROM (e.g. because the relevant data is split up into various data planes + metadata bits that are stored "exploded" all over the ROM), so they just did it once by hand, committing the results; or because there's no clear/simple/obvious way to store the extracted asset data such that a regular compiler/assembler natively understands how to embed it into the binary in the particular form it was found in the original ROM. (Remember, re-assembling/compiling to the original ROM is always the test these projects use to ensure their disassembly/decompilation is preserving semantics. So they need to replicate every weird layout quirk the original dev tooling imposed upon the original ROM. And sometimes the original dev tooling included special-purpose domain-specific asset-codegen tools that aren't part of regular compiler toolchains.)

What these projects should actually be doing, is taking on the schlep: writing the extract tooling anyway, even if it's just "copy these bytes from here and these bytes from there, and spit them out as hex in an .asm file with this header"; and/or writing matching asset-codegen tooling to the tooling that likely existed in the platform SDK, to run before compile/assemble time, converting the extracted ROM asset files into a form (probably a bunch of little assembly files) that will land in the right places when linked back together to form the original ROM.
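The "copy these bytes from here and these bytes from there" kind of extract tool can be very small. A minimal sketch, assuming the output assembler accepts a `.byte` directive and that the span list is maintained by hand; `extract_to_asm`, the label, and the header string are all made up for illustration:

```python
from typing import List, Tuple


def extract_to_asm(
    rom: bytes, spans: List[Tuple[int, int]], label: str, header: str
) -> str:
    """Copy (offset, length) spans out of a ROM and emit them as .byte lines."""
    lines = [header, f"{label}:"]
    for offset, length in spans:
        chunk = rom[offset:offset + length]
        if len(chunk) != length:
            raise ValueError(f"span {offset:#x}+{length} runs past end of ROM")
        hex_bytes = ", ".join(f"0x{b:02X}" for b in chunk)
        lines.append(f"    .byte {hex_bytes}")
    return "\n".join(lines) + "\n"
```

Because the tool reads from the user's own ROM at build time, the generated file never needs to be committed to the repository.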

And, to be clear, they mostly do do this! These projects are very good at doing this!

But sometimes — especially on a larger project with many contributors — one or two things like this aren't audited properly, and fall through the cracks. Or they start out as temporary "bootstrap" approaches made during a private phase of development to get things working + compiling to a correct image; and then not all of those get cleaned up before the repo gets made public.

New comment by derefr in "Nvidia is proposing a beast of a CPU system for Windows PCs"

derefr — Sat, 06 Jun 2026 16:43:46 +0000

I must be using LLMs very differently than y'all, because I can't think of a single thing I would rely on an LLM that's "dumb as a stump" to do for me.

To me, LLMs are for asking research questions + exploring design spaces + pointing at codebases to investigate bugs. And those all benefit from the model being as "smart" (in terms of both fluid intelligence and burned-in knowledge) as possible.

I'm guessing there exist problems where "intelligence past a certain point" doesn't matter, so these medium-sized models can match the performance of the bigger models. But what problems might those be?

New comment by derefr in "Nvidia is proposing a beast of a CPU system for Windows PCs"

derefr — Sat, 06 Jun 2026 16:37:27 +0000

Can you say more? I don't have any memory of Qualcomm-related scandals(?), but I just read the news; I've never really been a user of their chips.

New comment by derefr in "Nvidia is proposing a beast of a CPU system for Windows PCs"

derefr — Sat, 06 Jun 2026 16:35:22 +0000

> The game changer is the unified 128 GB memory. That is the path Apple took years ago. Instead of separate memory for the CPU and GPU, everything shares a single pool. It is increasingly popular.

> The memory is not as fast as dedicated GPU memory, but it is cheap enough while delivering enough bandwidth to run AI models locally.

So, the reason "dedicated GPU memory" is fast, isn't because it's "dedicated"; it's because the types of memory built into GPU cards — GDDR and HBM — are designed for throughput over latency.

Which is to say, GDDR and HBM memory could be shared with the CPU in UMA while still being "fast" (for GPU use-cases.) In fact, the PS4/5 and Xbox 360 / One X / Series consoles have UMA architectures that use GDDR memory as their main memory, with no regular DDR memory to be found.

What I don't understand: why don't we see UMA architectures where there's both regular DDR and GDDR/HBM memory mapped into the address space of the CPU+GPU? That seems like the best of both worlds: you'd have some memory that's "tuned" for random-access CPU usage (regular DDR), and some memory that's "tuned" for streaming GPU usage (GDDR/HBM), but either type of memory can still be put to the use it wasn't "tuned" for, just with slightly-worse performance.

I guess you'd need to do a bit of software work:

1. a bit of work in the OS kernel / malloc library to get CPU workloads to "prefer" allocating DDR memory over the GDDR/HBM memory until they've exhausted DDR memory (or maybe not, if you just tell the kernel the GDDR/HBM memory is something like a zswap thinpool);

2. and a bit of work in supported ML frameworks, to teach them about a hybrid strategy between UMA "allocate anywhere, it's all the same" and NUMA "keep assets in VRAM if possible; if you spill assets to RAM, then they must stream into VRAM on access" (i.e. "at allocation time, allocate as if the system were NUMA, VRAM first then spilling to RAM; but at execution time, use the UMA codepaths, no need to copy RAM into VRAM.")

...but once that's done, it's done.
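As a toy model of the allocation-time half of that idea (not how any real kernel or ML framework does it), each kind of workload could simply carry its own pool preference order and spill to the other pool when the preferred one is full. `Pool`, `allocate`, and the capacities are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    capacity: int
    used: int = 0

    def try_alloc(self, size: int) -> bool:
        if self.used + size > self.capacity:
            return False
        self.used += size
        return True


def allocate(size: int, preference: list) -> str:
    """Allocate from the first pool in `preference` with room; return its name."""
    for pool in preference:
        if pool.try_alloc(size):
            return pool.name
    raise MemoryError(f"no pool can hold {size} units")


ddr = Pool("DDR", capacity=8)
gddr = Pool("GDDR", capacity=8)
CPU_ORDER = [ddr, gddr]  # CPU workloads: latency-tuned memory first
GPU_ORDER = [gddr, ddr]  # GPU workloads: throughput-tuned memory first
```

Since both pools sit in one address space, a spilled allocation is only slower, not inaccessible, so nothing needs to be copied at execution time.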

New comment by derefr in "New Texas Instruments 5532 chips are not the 5532s we’ve used for decades"

derefr — Wed, 03 Jun 2026 20:28:30 +0000

Keeping in mind, though, that this is a jellybean part. You're supposed to be able to order "a" 5532 without specifying the supplier, because many vendors produce "a" 5532, and they're all the same. Different vendors' 5532s are supposed to be able to be treated as the same SKU — literally dumped into co-mingled stock in warehouses — with no ill consequence!

(And yes, until TI's recent move, that was true of the 5532. All the other vendors' 5532s had matching datasheet specs, including the 22V max input voltage, because a design built for "a" 5532 was usually built to run it up to 100%, and a vendor couldn't offer their part as a swap-in if it couldn't do that.)

But now, if your purchasing department (or the supplier they purchase from) happens to order TI 5532s, or if the warehouse they're sourcing from has comingled any TI 5532s into the general 5532 stock, then your product is now broken, with no real recourse except to change your entire supply chain to one that specifically excludes TI.

New comment by derefr in "CS336: Language Modeling from Scratch"

derefr — Mon, 01 Jun 2026 18:21:18 +0000

I imagine it's a lot like FPGAs:

- the hardware you need for a production use-case is relatively small, because production {models, bitstreams} have been heavily size-optimized, stripping out everything not needed to get a good result for the target use-cases

- but the hardware you need when tinkering/learning how to design {compute kernels, IP blocks} in the first place, must be quite a bit more powerful / higher-capacity, because your experiments will intentionally be the opposite of optimized: they'll be built for legibility / introspectability / debuggability at every level, which massively inflates and de-optimizes the resulting {model, bitstream}.

(And, to be clear here, "running someone else's finished model, which was designed and optimized to be used on something like a 4090, against your own prompt" is a kind of experimenting, which is cheap, in the same way that "deploying someone else's pre-baked FPGA bitstream, that was designed and synthesized for a $20 target FPGA, onto your own instance of that $20 FPGA, and then feeding your own input signals to it" is cheap. But that's not the kind of experimenting you'd be doing in this course while learning to design your own models!)

New comment by derefr in "Windows GOG DOS Games on M-Series Macs"

derefr — Mon, 01 Jun 2026 16:49:22 +0000

Same thing I'd want to be "interesting" about any emulator. Focusing on just graphics for a moment (but there are equivalent examples in other domains):

1. Running games that only ran at 10FPS on original hardware at a smooth 60+FPS, by calling the game's own rendering logic more frequently than the original hardware could "afford" to, but without breaking game logic (i.e. by forcibly decoupling the game's physics ticks from its presentation ticks);

2. Using out-of-viewport but in-{tile/frame}buffer data to expand the viewport to fill my screen (which can be very janky under some rendering paradigms, due to offscreen parts of tile/frame buffers being dynamically partial-updated with a loading seam; but which can work very well under other rendering paradigms, like the SNES's mode 7 where the tilemap was usually just fully populated once at mode-switch time);

3. Making games that used vector-graphics for at least part of their display, and soft- or hard-rasterized those vector graphics into the native low-resolution framebuffer, instead rasterize those graphics at my display's native resolution;

4. Having the emulator recognize particular bitmap assets (tiles, sprites, 3D meshes/textures) the game is telling it to render, and swap these out for hand-crafted HiDPI / high-poly versions of those assets from an asset-pack file (as opposed to relying on the caprices of a DLSS-like upscaling model.)
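For item 1, one common way to decouple the two tick rates is to keep the game's logic running at its original rate and have the presentation layer interpolate between the last two logic states. This is a generic technique rather than anything specific to a particular emulator, and the sketch below assumes a single moving object whose logic advances it by 6 units per tick:

```python
def simulate(frames: int, logic_hz: int = 10, render_hz: int = 60):
    """Run game logic at logic_hz while presenting at render_hz.

    Returns the x position drawn on each presented frame.
    """
    steps_per_tick = render_hz // logic_hz  # presented frames per logic tick
    prev_x, curr_x = 0.0, 0.0
    drawn = []
    for frame in range(frames):
        phase = frame % steps_per_tick
        if phase == 0:
            # Logic tick: the game's own update rule, left unchanged.
            prev_x, curr_x = curr_x, curr_x + 6.0
        # Presentation only: blend between the last two logic states.
        drawn.append(prev_x + (curr_x - prev_x) * phase / steps_per_tick)
    return drawn
```

The logic state still changes only ten times a second, so game behaviour is untouched; the cost of this particular approach is that what is drawn lags the logic by up to one tick.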

Mind you, to have features like this work well, they often leave the realm of "interpreting the control-register pokes from the game differently", and enter the realm of "the game being patched to take advantage of the capabilities of the emulator." Then, as with these GOG games, you're no longer just shipping a ROM "and an emulator configured to run it well"; rather, you're shipping a co-designed product: an emulator tuned to run that ROM, and a ROM tuned to run in that emulator.

---

By doing this, you technically leave the realm that MAME-like "archival preservation" emulation usually aims for, of "faithful emulation" of both a game's logic and its presentation.

However! "Faithful emulation" folks shouldn't despair. The nice thing about this technique, is that this is all done by wrapping the original ROM in an emulator + shipping runtime-applied IPS patches.

In other words, the original game ROM is still there, unmodified, under an "isolation layer"; and everything being done to modify it is done using "reversible, conservation-grade" techniques.
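Applying an IPS patch at load time is simple enough to show in full. The sketch below follows the commonly documented IPS layout (a `PATCH` header, records of 3-byte offset plus 2-byte size, a size of zero marking a run-length record, and an `EOF` trailer) and returns a patched copy, leaving the original ROM bytes untouched. It ignores extensions such as the truncation field some tools append after `EOF`, and the well-known ambiguity of a record whose offset bytes happen to spell `EOF`:

```python
def apply_ips(rom: bytes, patch: bytes) -> bytes:
    """Return a patched copy of `rom`; the original bytes object is untouched."""
    if patch[:5] != b"PATCH":
        raise ValueError("not an IPS patch")
    out = bytearray(rom)
    pos = 5
    while patch[pos:pos + 3] != b"EOF":
        if pos + 5 > len(patch):
            raise ValueError("truncated IPS patch")
        offset = int.from_bytes(patch[pos:pos + 3], "big")
        size = int.from_bytes(patch[pos + 3:pos + 5], "big")
        pos += 5
        if size == 0:
            # RLE record: 2-byte run length, then the byte to repeat.
            run = int.from_bytes(patch[pos:pos + 2], "big")
            data = patch[pos + 2:pos + 3] * run
            pos += 3
        else:
            data = patch[pos:pos + size]
            pos += size
        if offset + len(data) > len(out):
            # Patches may extend the image past its original length.
            out.extend(b"\x00" * (offset + len(data) - len(out)))
        out[offset:offset + len(data)] = data
    return bytes(out)
```

Turning an enhancement off is then just a matter of not calling `apply_ips` for that patch before handing the image to the emulator core.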

Which means the emulator can provide a launcher UI to turn any of those presentation "enhancement" features on-and-off. If you're the "faithful emulation" type, you can just turn them off!

(And, under this paradigm, even with the "enhanced emulation" features on, the game logic is still preserved as-is; you're only modifying the presentation. The original game engine is still running; the original instructions are still executing cycle-accurately to how they should. So the "game feel" is preserved perfectly. If you were good at the original game, you'll still be good at playing an "enhanced emulation" of the game; nothing will be "off" about it. Even input movies recorded against the un-enhanced game should replay unmodified against the enhanced game!)

Contrast this to the average "HD remaster", where the game is at the very least recompiled for a new platform (with different timing guarantees), if not entirely rewritten atop a new engine. In that process, there's no "isolation layer"; no way to guarantee a preservation of any part of the original game logic in the remastered artifact. And like George Lucas, game developers coming back to their own works 20–40 years later, just can't help but want to tweak things. So these HD remasters end up breaking "game feel" in all sorts of ways.

New comment by derefr in "Chuwi Minibook X"

derefr — Mon, 01 Jun 2026 03:08:53 +0000

"Bigger screen" (i.e. being bigger on the length/width dimension) is a bad thing in this discussion. Some people want a programming/writing laptop that fits in a handbag, so that they don't have to decide to bring it, but can just leave it in their bag the way many people do with an iPad.

New comment by derefr in "OpenRouter raises $113M Series B"

derefr — Sun, 31 May 2026 14:45:48 +0000

That answers for the "sold" part but not for the "used" part.

I.e. nothing about this statement prevents Anthropic from running ads within Claude, as long as they run the ad-placement auctions themselves, and so aren't leaking any of the data they're using to decide which placements are relevant to which users+sessions. (This is the same thing Google does for SERP ad auctions.)

But actually, and perhaps more interestingly, nothing about this statement prevents Anthropic from building a Google AdSense competitor either. Other sites (or mobile apps, etc) could plop in an Anthropic ad iframe; and it'd be Anthropic's knowledge of your interactions with Claude that would drive what ads would show up in that iframe. The embedding site doesn't know what ads the users are seeing, so that's still not "selling users' data to third parties", per se.

New comment by derefr in "Zig ELF Linker Improvements Devlog"

derefr — Sat, 30 May 2026 18:42:14 +0000

So, this linker does fast incremental linking, which is great for development iteration speed.

But I assume that any kind of incremental linking, is mutually exclusive with link-time optimization? I.e. you'd never want to use this option for a release build?