FEX 2501 Tagged
Welcome back to another new FEX-Emu release in the new year! While everyone was out celebrating the holidays, we still managed to get some work done. So let’s get in to what we did this last month!
Official WINE WoW64 and Arm64ec package support
This month we have updated our Ubuntu ppa repository to now support a fex-emu-wine package. This package provides wow64 and arm64ec emulator DLL files that can be applied directly to an AArch64 build of WINE, thus allowing you to do x86 and x86-64 emulation inside of WINE directly and removing a ton of CPU overhead in the emulation! This is relatively fresh so there will be some teething issues around getting it setup, like the current upstream WINE may not integrate directly in to these builds yet. Check out our wiki for more information about getting this hooked up.
Partial support for inline self-modifying code and trap flag
As we work towards supporting more edge-case behaviour of anti-tamper and anti-debugger software. We have spent some time this month implementing support for inline self-modifying code (SMC) and the trap flag, allowing for complete Denuvo support.
Inline self-modifying code is used by Denuvo in patterns like
inc dword ptr [rip+6]
<opcode for cpuid> - 1
dec dword ptr [rip-2]
to obfuscate checks. FEX’s previous SMC handling operated purely at a block level using memory protection: an SMC write would invalidate all cached code in the touched page and then continue to the end of the current block, at which point any modified code would be regenerated. With inline SMC however this isn’t enough, since guest instructions that map to the currently executing block of host code are being changed, the host code needs to be changed too, waiting until the end of the block isn’t enough. This is now handled in FEX by detecting such cases, reconstructing the x86 context of the instruction performing the SMC (the inc in the above example), regenerating the host code at the x86 PC but limiting the length to the single modifying instruction and then jumping to that. At which point the same SMC will be detected again but since the block doesn’t contain the guest instruction being modified it has been reduced back to the first case which is handled as described prior.
The trap flag on the other hand is interesting because this is an anti-debugger tactic that some badly behaving launchers use. This is because of how debuggers treat the trap flag versus how it works when a debugger isn’t running, this lets the application detect the debugger and throw an error. FEX didn’t handle this at all prior which was causing these launchers to throw their hands up and stop running.
A note is that some of this work is only wired up on the WINE side rather than the FEX-Emu Linux emulation side, so mileage may vary!
JIT bug fixes and performance improvements
As usual, a lot of fixes landed for our JIT, ranging from incorrect backpatching of unaligned atomics, to incorrect instruction handling, to improving performance of a couple of instructions. Let’s break down what we fixed this month.
Fixed backpatching of unaligned atomics with small immediates
ARM’s FEAT_LRCPC2 extension added TSO instructions for small immediate offsets in the range of -256 to 255. These still have the regular atomic limitation of ARM where the address needs to be naturally aligned (or within 16 byte granule!) of the access type. FEX needs to emulate unaligned memory accesses from x86 by backpatching these instructions to be a DMB plus load or store. We were incorrectly patching these instructions with the small offsets. This will improve stability of emulation on hardware that supports the new FEAT_LRCPC2 instructions
Fix float to integer overflow behaviour
This is a very important change for how FEX handles when converting a float value to an integer and an overflow occurs. While we knew of the problem, we didn’t realize how wide reaching the problem was causing problems. In particular this fixes The Talos Principle’s audio cuting out, Animal Well’s music having chirping artifacts, SOMA not allowing interactions with things in the world, Satisfactory’s server crashing, and Metaphor Refantazio infinite looping before getting in-game!
There are sure to be a bunch of other little fixes that this also fixes because it’s a pervasive problem that games rely upon!
Fix ModRM decoding of 3DNow! instructions
While 3DNow! isn’t used in any recent games, to the point that AMD has removed the instruction set from Zen CPU cores, older games still use this extension if possible. Turns out we had a gap in our testing infrastructure for when a 3DNow! instruction used the SIB encoding form of the instruction. This would result in crashes and misinterpreting of instructions. This will fix some older 32-bit games using 3DNow! and of course we added new unittests to our testing infrastructure to make sure it keeps working.
Fixes H0F3A table decoding
This fix doesn’t affect any known applications, but because of how x86 compilers aggressively pad instruction sizes, this could crop up anywhere without us noticing. When the H0F3A instruction table gets decoded, FEX was incorrectly applying the REX_W prefix to instructions that would ignore the prefix. Out of all the instructions in the table, only three actually care about the prefix while the others always ignore it. If this padding occured then FEX would think it is an unknown instruction and crash. This has now been resolved which should keep us from ever hitting the issue.
Generate 80-bit SVE loadstores when necessary
For all the users that have SVE supporting hardware (There aren’t a lot of you!), we have added a new optimization that converts two loads or stores in to a single 80-bit masked loadstore instruction. While this isn’t going to be a huge improvement because this only occurs with x87 code, it’s another little optimization in the list of things that SVE improves for x86 emulation.
Increase minimum kernel requirement from 5.0 to 5.15
We’re moving in to the future with some changes that require increasing our minimum kernel version. Because we were allowing such an old version of the Linux kernel, we were hitting some heartburn in some codepaths. In order to make this easier, we are moving up the minimum kernel requirement to an LTS release of the kernel released back in 2021 already! We don’t expect this to cause too many problems, since this is an kernel supported by Ubuntu for 22.04
Drop official support for ArchLinux
Due to a clarification from the ArchLinux team this last month, they are no longer allowing packages in the AUR that don’t support x86-64. Due to this change and that FEX only supports running on an AArch64 host, they have removed our official packages from AUR. There’s nothing that we can do about this besides dropping support for ArchLinux.
See the 2501 Release Notes or the detailed change log in Github.