FEX 2511 Tagged

You would think that after doing this month after month we would eventually run out of things to work on, but in true emulator fashion the work never ends. Let’s jump into what has changed for the release this month!

More JIT improvements (Is this surprising yet?)

This month we have another smattering of changes that primarily affect the JIT, along with some related systems around it. Let’s break it down.

Potential memory savings

It’s been known for a while that FEX’s memory usage hasn’t been amazing. We add a decent amount of memory overhead to every emulated process, which adds up quickly. This becomes a huge burden on systems with only 8GB of RAM, but it also hits 16GB RAM users; this is why most of our developers run on systems with at least 32GB of RAM to sidestep the problem. The vast majority of the problem comes from our JIT’s lookup caches, called the L1, L2, and L3 caches depending on the tier. The L1 and L2 caches are per-thread resources, while the L3 cache is shared between all threads in a process. All these caches do is store where to find the JIT code for the relevant x86 code; it’s not the JIT code itself until the L3 cache!

Turns out that these per-thread caches can actually consume quite a bit of memory, and because they are per-thread, the cost scales up heavily in games that create a lot of threads. In some heavier games like Death Stranding, the combined L1 and L2 caches can end up consuming around a gigabyte of memory! Just for lookup tables to find JIT code! We’ve also seen other games consume more or less depending on what they end up doing, with usage usually growing the longer a game runs.
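To make the tiers a bit more concrete, here is a minimal sketch of what a per-thread lookup path could look like. This is not FEX’s actual code: `LookupEntry`, `ThreadLookupCache`, and the choice of a direct-mapped table backed by a hash map are all assumptions made for illustration. The point is only that each thread owns its own tables, so their size gets multiplied by the thread count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Illustrative only: every name here is hypothetical, not FEX's real layout.
struct LookupEntry {
  uint64_t GuestRIP = 0; // x86 address this slot describes
  uint64_t HostCode = 0; // where the translated code lives, 0 means "empty"
};

class ThreadLookupCache {
public:
  explicit ThreadLookupCache(size_t L1Entries) : L1(L1Entries) {
    // Power-of-two size so the slot index is a cheap mask.
    assert(L1Entries != 0 && (L1Entries & (L1Entries - 1)) == 0);
  }

  // Record a translation in the larger, slower second tier.
  void Insert(uint64_t GuestRIP, uint64_t HostCode) { L2[GuestRIP] = HostCode; }

  // Returns the host code address, or 0 if this thread has never seen it.
  // On 0, a real emulator would consult the shared tier or compile the block.
  uint64_t Find(uint64_t GuestRIP) {
    LookupEntry& Slot = L1[GuestRIP & (L1.size() - 1)];
    if (Slot.HostCode != 0 && Slot.GuestRIP == GuestRIP) {
      ++L1Hits;
      return Slot.HostCode;
    }
    auto It = L2.find(GuestRIP);
    if (It == L2.end()) {
      return 0;
    }
    // Promote into the first tier so the next lookup is a single probe.
    Slot.GuestRIP = GuestRIP;
    Slot.HostCode = It->second;
    return It->second;
  }

  size_t L1Bytes() const { return L1.size() * sizeof(LookupEntry); }
  size_t L1HitCount() const { return L1Hits; }

private:
  std::vector<LookupEntry> L1;               // fixed-size, direct-mapped
  std::unordered_map<uint64_t, uint64_t> L2; // grows with every new block
  size_t L1Hits = 0;
};
```

With made-up numbers, a table of 1,048,576 entries at 16 bytes each is 16 MiB, and 64 threads each holding one is already 1 GiB. The real sizes in FEX differ, but the multiplication is the same.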

To help mitigate this problem, we are introducing two new FEX options. The first option is to disable the L2 cache entirely. While this is a fairly heavy hammer, it’s the L2 that consumes the majority of the RAM, so it’s a good first step. The second option enables a heuristic to grow or shrink the L1 cache dynamically based on how frequently it is used. This can also dramatically reduce the size of the L1 cache, but because the L1 is usually only a few dozen megabytes, it isn’t quite as interesting.
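As a rough idea of what a grow-or-shrink heuristic can look like, here is a small sketch. The thresholds, the `ResizePolicy` struct, and the `NextL1Size` function are all invented for this example and say nothing about the heuristic FEX actually uses; it only shows the general shape of sizing a table from how much of it was touched in a sampling window.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical bounds, picked only for the example.
struct ResizePolicy {
  size_t MinEntries = 1024;
  size_t MaxEntries = size_t{1} << 20;
};

// Given the current entry count and how many distinct slots were used during
// the last sampling window, pick the entry count for the next window.
inline size_t NextL1Size(size_t Current, size_t SlotsTouched,
                         const ResizePolicy& Policy = {}) {
  // More than 3/4 of the table in use: double it to cut down on evictions.
  if (SlotsTouched * 4 > Current * 3) {
    return std::min(Current * 2, Policy.MaxEntries);
  }
  // Less than 1/8 of the table in use: halve it and give the memory back.
  if (SlotsTouched * 8 < Current) {
    return std::max(Current / 2, Policy.MinEntries);
  }
  return Current;
}
```

The gap between the grow and shrink thresholds is deliberate in this sketch: it keeps a table that sits near one boundary from being resized back and forth every window.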

The reason these options aren’t enabled by default is that there is a chance for them to introduce stuttering, which is hard to distinguish from regular “JIT” stutter because it tends to happen at the same time. This is why we have also introduced another optimization: we implemented our own writer-priority mutex for our lookup caches, which is super low latency. This cuts the lock contention time to about a third of what it was with our previous C++ mutex, which helps reduce stuttering.
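For readers unfamiliar with the term, “writer priority” means that once a writer is waiting, new readers queue up behind it instead of continuing to pile in. The sketch below shows that policy using a plain `std::mutex` and condition variables. It is a textbook construction, not FEX’s implementation, and it makes no attempt at the low-latency part; the class name `WriterPriorityMutex` is just a label for the example.

```cpp
#include <condition_variable>
#include <mutex>

// Textbook writer-priority reader/writer lock. Illustrative only.
class WriterPriorityMutex {
public:
  void lock_shared() {
    std::unique_lock<std::mutex> Guard(State);
    // The priority rule: readers also wait while any writer is queued.
    ReadersCV.wait(Guard, [this] { return !WriterActive && WritersWaiting == 0; });
    ++ActiveReaders;
  }

  bool try_lock_shared() {
    std::lock_guard<std::mutex> Guard(State);
    if (WriterActive || WritersWaiting > 0) return false;
    ++ActiveReaders;
    return true;
  }

  void unlock_shared() {
    std::lock_guard<std::mutex> Guard(State);
    // The last reader out hands the lock to a queued writer, if any.
    if (--ActiveReaders == 0 && WritersWaiting > 0) WritersCV.notify_one();
  }

  void lock() {
    std::unique_lock<std::mutex> Guard(State);
    ++WritersWaiting;
    WritersCV.wait(Guard, [this] { return !WriterActive && ActiveReaders == 0; });
    --WritersWaiting;
    WriterActive = true;
  }

  bool try_lock() {
    std::lock_guard<std::mutex> Guard(State);
    if (WriterActive || ActiveReaders > 0) return false;
    WriterActive = true;
    return true;
  }

  void unlock() {
    std::lock_guard<std::mutex> Guard(State);
    WriterActive = false;
    // Queued writers go first; readers are released only when none remain.
    if (WritersWaiting > 0) {
      WritersCV.notify_one();
    } else {
      ReadersCV.notify_all();
    }
  }

private:
  std::mutex State;
  std::condition_variable ReadersCV;
  std::condition_variable WritersCV;
  int ActiveReaders = 0;
  int WritersWaiting = 0;
  bool WriterActive = false;
};
```

Because it provides `lock`/`unlock` and `lock_shared`/`unlock_shared`, a class like this can be used with `std::lock_guard` on the write side and `std::shared_lock` on the read side. For a lookup cache that fits the access pattern: lookups are frequent and short, while inserts are rare but should not be starved by a constant stream of readers.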

So go out, enable the options if you’re in a low-memory situation and let us know if it works well for you!

Fixes crashes due to out-of-bounds branch encoding

One issue that FEX has fought over the past couple of years is that when our JIT encounters a problem with branch targets, we couldn’t restart and try again. This would result in either broken code being generated or FEX throwing a message about it, both of which typically result in a game crashing. This month we introduced the ability to safely restart the JIT if we encounter this situation and compile again knowing that branch targets need to be “far jump” aware. This usually fixes older games that rely heavily on x87, but it can technically happen in a bunch of random edge cases.
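The general shape of “try the short encoding, restart with far jumps if it doesn’t fit” can be sketched as follows. Everything here is a simplified model with hypothetical names (`Block`, `TryCompile`, `Compile`), not FEX’s JIT. The one real-world detail it leans on is that an AArch64 conditional branch (`B.cond`) encodes a 19-bit signed immediate scaled by 4, which gives it a reach of roughly ±1 MiB.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical compile request: the byte offset each branch needs to reach.
struct Block {
  std::vector<int64_t> BranchOffsets;
};

enum class BranchMode { Near, Far };

struct CompileResult {
  BranchMode Mode;
  size_t InstructionCount;
};

// AArch64 B.cond: imm19 * 4, so offsets in [-1 MiB, +1 MiB) are encodable.
constexpr int64_t NearRange = int64_t{1} << 20;

inline bool FitsNear(int64_t Offset) {
  return Offset >= -NearRange && Offset < NearRange;
}

// One compile attempt. Returns false if a branch cannot be encoded in the
// requested mode, leaving Out untouched so nothing half-built escapes.
inline bool TryCompile(const Block& B, BranchMode Mode, CompileResult& Out) {
  size_t Count = 0;
  for (int64_t Offset : B.BranchOffsets) {
    if (Mode == BranchMode::Near) {
      if (!FitsNear(Offset)) return false;
      Count += 1; // a single conditional branch
    } else {
      Count += 2; // e.g. an inverted short branch over an unconditional one
    }
  }
  Out = CompileResult{Mode, Count};
  return true;
}

// Optimistically compile with near branches; on failure, throw the attempt
// away and compile the whole block again in far mode.
inline CompileResult Compile(const Block& B) {
  CompileResult Result{BranchMode::Near, 0};
  if (TryCompile(B, BranchMode::Near, Result)) return Result;
  TryCompile(B, BranchMode::Far, Result); // far mode accepts every offset here
  return Result;
}
```

Restarting from scratch is simpler than patching a half-emitted block, and since out-of-range branches are rare, the common case still gets the compact encoding.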

If you have had a game that spuriously crashed, this might fix it!

Enable AVX for 32-bit by default

This is a fairly minor change, but for 32-bit Linux emulation we weren’t enabling AVX out of a concern for potential stack overflow problems. We have fixed a couple of edge case bugs and have decided to turn it on by default, since some algorithms can see performance benefits from AVX that we want to make sure we get. If we find any games that have stack overflows with AVX enabled, then adding a game profile to disable it is trivial.

This isn’t wired up to the WoW64 emulation because of some missing features in the Wine/Windows side, but that lives outside of FEX’s control.

Performance!

This month we also added a few performance improvements. Primarily, we fixed an oversight where string instructions were still using the TSO memory model by default, which was causing significant performance issues in games like Dishonored. We also slightly optimized the x87 register exchange instruction.

Minor bug fixes this month

We found a handful of bugs around the project. We found out that the game Ender Magnolia was crashing when thunks were enabled due to a quirky interaction between OpenGL and Vulkan. So using GL and Vulkan thunks at the same time should be a bit more stable this month.

Additionally, we had a few bugs show up in our memory allocator, which have been stamped out. These were also showing up in Ender Magnolia by chance, resulting in some hard-to-diagnose memory corruption. With that fixed, our memory allocator is now even more robust!


See the 2511 Release Notes or the detailed change log on GitHub.

Written on November 5, 2025