FEX 2512 Tagged
Another month and here we are with a new release! We also celebrated our seven-year anniversary late last month, but enough about that boring stuff; let’s talk about what we improved!
Remap procfs cmdline using PR_SET_MM_MAP
This has been a thorn in our side for a while. Whenever an application read its cmdline, FEX needed to rewrite the file contents to remove the FEXInterpreter argument. It turns out the kernel has had an interface for remapping this file for quite a while; we just weren’t utilizing it. Now, instead of mangling the data, we use the correct interface from the kernel. This means that things like Mesa application profiles and KDE Plasma see the correct application name in all instances.
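To make the old approach concrete, here is a minimal sketch of the kind of mangling described above: parsing the NUL-separated cmdline contents and dropping a leading interpreter argument. This is not FEX's code; `SplitCmdline` and `StripInterpreter` are hypothetical names, and matching the interpreter by path suffix is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Split raw /proc/<pid>/cmdline contents: each argument ends with a NUL byte.
std::vector<std::string> SplitCmdline(std::string_view raw) {
  std::vector<std::string> args;
  std::size_t pos = 0;
  while (pos < raw.size()) {
    std::size_t end = raw.find('\0', pos);
    if (end == std::string_view::npos) {
      end = raw.size();  // tolerate a missing final terminator
    }
    args.emplace_back(raw.substr(pos, end - pos));
    pos = end + 1;
  }
  return args;
}

// Hypothetical helper: drop argv[0] when its path ends with the interpreter
// name, then re-serialize the rest in the same NUL-terminated format.
std::string StripInterpreter(std::string_view raw, std::string_view interpreter) {
  std::vector<std::string> args = SplitCmdline(raw);
  std::size_t first = 0;
  if (!args.empty() && args[0].size() >= interpreter.size() &&
      args[0].compare(args[0].size() - interpreter.size(),
                      interpreter.size(), interpreter) == 0) {
    first = 1;
  }
  std::string out;
  for (std::size_t i = first; i < args.size(); ++i) {
    out += args[i];
    out.push_back('\0');
  }
  return out;
}
```

Every reader of the file has to go through a rewrite like this for the result to be consistent, which is why letting the kernel present the right contents directly is the more robust option.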

Big shoutout to the external contributors that implemented this for us!
Implement support for JIT codebuffer guard page based restart
This one takes a bit of explaining, both what it is and why it is necessary. When writing our AArch64 code emitter, we made the decision not to do range checks for how much memory is remaining in our JIT code buffer. We instead used a heuristic to determine how much space is required, which usually worked. The problem with heuristics, of course, is that they can fail, and our “fallback” case was to crash. This was a known problem that we would need to resolve at some point, and this month we finally got around to it. Because we now utilize larger “multiblock” JIT blocks, we had become more likely to hit this crash, usually in x87-heavy code, where the JIT translation is heavy.
Now when the heuristic fails, our code emitter will try writing to our guard page, and we will catch the SIGSEGV and restart the JIT with a larger code buffer. This fixes these edge-case crashes and makes our JIT more robust in the process.
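The general technique can be sketched in a few dozen lines. The following is an illustrative stand-in, not FEX's implementation: it assumes a POSIX system, `EmitWithRestart` and the globals are hypothetical names, the "emitter" just writes filler bytes sequentially, and doubling the buffer on each restart is an assumed growth policy.

```cpp
#include <cassert>
#include <csetjmp>
#include <csignal>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

namespace {
sigjmp_buf g_restart_point;                 // where to resume after an overflow
volatile std::uintptr_t g_guard_begin = 0;  // current guard page range
volatile std::uintptr_t g_guard_end = 0;

void OnSegv(int, siginfo_t* info, void*) {
  auto addr = reinterpret_cast<std::uintptr_t>(info->si_addr);
  if (addr >= g_guard_begin && addr < g_guard_end) {
    siglongjmp(g_restart_point, 1);  // buffer overflowed: restart emission
  }
  signal(SIGSEGV, SIG_DFL);  // unrelated fault: re-fault with default action
}
}  // namespace

// Writes `count` bytes with no range checks. On a guard page hit, retries
// with a buffer twice as large. Returns the buffer size that finally fit,
// or 0 if the mapping could not be created.
std::size_t EmitWithRestart(std::size_t count) {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  struct sigaction sa {}, old {};
  sa.sa_sigaction = OnSegv;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, &old);

  std::size_t size = page;
  for (;;) {
    // Map the code buffer plus one trailing page, then revoke access to it.
    void* mem = mmap(nullptr, size + page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
      sigaction(SIGSEGV, &old, nullptr);
      return 0;
    }
    auto* buf = static_cast<volatile unsigned char*>(mem);
    int rc = mprotect(static_cast<unsigned char*>(mem) + size, page, PROT_NONE);
    assert(rc == 0);
    (void)rc;
    g_guard_begin = reinterpret_cast<std::uintptr_t>(mem) + size;
    g_guard_end = g_guard_begin + page;

    // savemask=1 so the signal mask is restored when jumping out of the handler.
    if (sigsetjmp(g_restart_point, 1) == 0) {
      for (std::size_t i = 0; i < count; ++i) {
        buf[i] = 0x1f;  // filler byte standing in for emitted code
      }
      munmap(mem, size + page);
      sigaction(SIGSEGV, &old, nullptr);
      return size;
    }
    // Reached via siglongjmp: discard the buffer and try a larger one.
    munmap(mem, size + page);
    size *= 2;
  }
}
```

The appeal of this design is that the common path pays nothing for bounds checking; only the rare overflow takes the signal and the retry. Note that the sketch relies on writes being sequential, so an overflow always lands on the guard page first.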
Initial code caching features landing
There’s an absolute ton of work going into this, and it’s not ready for users yet, but it would be remiss not to call out all the effort on this front. This month we landed initial support for “code maps” and offline “code cache” generation. There is not yet any way for a user to actually utilize these code maps and caches, but these are the required steps to get us to the transparent code caching that we are expecting to have. Watch out over the coming months as we finish fleshing out this feature and get it fully wired up.
Fix APIC ID count
This is a bit of a weird feature that we had accidentally missed. When reading CPUID, processes get what is called an APIC ID, which is essentially just a core index. Some applications will use this ID as a way to determine how many unique CPU cores are available on the system. We were accidentally always returning zero, which caused some applications to think the system had only one CPU. With this fixed, the FPGA software in which this was detected now generates the correct number of worker threads for the cores in the system. This of course improves its synthesis time dramatically, since it scales well with the number of cores in the system.
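As a rough illustration of why a constant APIC ID is harmful, consider an application that samples the ID once per core it can run on and counts the distinct values. The helper below is hypothetical and takes the sampled IDs as plain input rather than executing CPUID, so it runs on any host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical core-count probe: the number of distinct APIC IDs observed
// across all samples is taken to be the number of CPU cores.
std::size_t CountCoresByApicId(const std::vector<std::uint32_t>& apic_ids) {
  return std::set<std::uint32_t>(apic_ids.begin(), apic_ids.end()).size();
}
```

With every sample equal to zero, a probe like this reports a single core no matter how many cores the machine has; with distinct per-core IDs it reports the real count.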
Disable io_uring syscalls
Our good friends over at felix86 alerted us to an issue around io_uring causing infinite loops in node.js and libuv. Upon further investigation, we determined that there is an ABI break in io_uring between x86 and Arm64 that we previously didn’t know about. This comes down to how the user submission queues in io_uring can embed epoll_event structures, which have different layouts between the architectures.
Because we can’t safely rewrite the queue data to handle this layout difference, we have determined that the only course of action is to disable the syscalls. Luckily, most games don’t rely on this syscall interface, or applications will have a legacy fallback for when it is unsupported. In that vein, node.js now works again.
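The layout difference itself is easy to demonstrate. On x86-64 the kernel declares epoll_event as packed, so the 64-bit data field sits directly after the 32-bit events field; on Arm64 natural alignment applies and padding is inserted. The structs below are stand-ins written for this sketch (not the real headers), and `ToArm64` is a hypothetical helper showing the field-by-field copy an emulator can do when it owns the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for epoll_event as laid out on x86-64 (packed).
struct __attribute__((packed)) EpollEventX86_64 {
  std::uint32_t events;
  std::uint64_t data;
};

// Stand-in for epoll_event as laid out on Arm64 (naturally aligned).
struct EpollEventArm64 {
  std::uint32_t events;
  alignas(8) std::uint64_t data;
};

static_assert(sizeof(EpollEventX86_64) == 12, "x86-64 layout is 12 bytes");
static_assert(offsetof(EpollEventX86_64, data) == 4, "data follows events");
static_assert(sizeof(EpollEventArm64) == 16, "Arm64 layout is 16 bytes");
static_assert(offsetof(EpollEventArm64, data) == 8, "data is 8-byte aligned");

// Hypothetical conversion: copy each field into the host layout.
EpollEventArm64 ToArm64(const EpollEventX86_64& guest) {
  EpollEventArm64 host{};
  host.events = guest.events;
  host.data = guest.data;
  return host;
}
```

A copy like this is fine when the structure is passed through a regular syscall argument. With io_uring the structure lives in queue memory shared between the application and the kernel, so there is no safe point at which to rewrite it.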
FEAT_LRCPC2 performance errata
This month we found out that a large number of Cortex and Neoverse CPU cores have an erratum that only affects the instructions added in FEAT_LRCPC2. We have disabled this extension on the affected CPU cores, which can give a reasonable performance improvement in games that were bound by TSO emulation.
JIT and emulation bug fixes
There were a bunch of bug fixes in both our JIT and Linux syscall emulation this month as usual, but this month’s report is already running long so if you’re interested, take a peek at our pull requests to find out more.
See the 2512 Release Notes or the detailed change log on GitHub.
