FEX 2206 Tagged!

Quite a large number of changes this month, since we cancelled last month’s release.

Steam’s webhelper working again

Steam started enabling the Chromium sandbox. Seccomp isn’t supported in FEX-Emu, so the webhelper was crashing early on. We now use an application profile to forcibly stop it from trying to use the sandbox. This lets the game library be visible again, although it can take a while to appear.

Fix LRCPC and add support for LRCPC2

There was a bug in our CMPXCHG implementation: it accidentally wasn’t using ARM’s acquire-release semantics. Fixing this bug allowed us to re-enable our TSO emulation using LRCPC. Additionally, we have added support for LRCPC2, which gives us some immediate-encoded instructions to further reduce overhead.

On hardware that supports LRCPC these can result in a reasonable performance uplift.

SHA-1 and SHA-256 instructions implemented

These SHA instructions have been implemented and the CPUID bit is now exposed. This is a GPR-based implementation; an implementation using AArch64’s equivalent SHA instructions will come at a later time.

Self-modifying code support improvements

A lot has changed to support self-modifying code more extensively. FEX-Emu now tracks guest allocations of executable memory, and when that code is modified we clear the JIT caches. This happens both for true self-modifying code and for libraries being loaded. Fault handling is used to detect when code is modified in memory, so that we can track changes. This is a new setting in FEXConfig called mtrack. The older syscall-only tracking path is deprecated but still available for testing.
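The tracking idea can be modelled in a few lines. This is a toy Python model, not FEX-Emu’s code: the `CodeCache` class, its method names, and the 4 KiB page size are all assumptions made for illustration. A real implementation would rely on a memory fault to notice the write; here a write to guest memory simply calls `on_write` directly.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class CodeCache:
    """Toy model of a JIT cache that is cleared when guest code changes."""

    def __init__(self):
        self.tracked_pages = set()   # pages holding executable guest memory
        self.translations = {}       # guest address -> translated block (a label here)

    def track_executable(self, addr, length):
        # Record every page touched by an executable mapping.
        first = addr // PAGE_SIZE
        last = (addr + length - 1) // PAGE_SIZE
        self.tracked_pages.update(range(first, last + 1))

    def translate(self, addr):
        # "Compile" on first use, then serve from the cache.
        if addr not in self.translations:
            self.translations[addr] = f"jit_block_{addr:#x}"
        return self.translations[addr]

    def on_write(self, addr):
        # Stand-in for the fault handler: a write to a tracked page
        # means translated code may be stale, so clear the cache.
        if addr // PAGE_SIZE in self.tracked_pages and self.translations:
            self.translations.clear()
            return True
        return False


cache = CodeCache()
cache.track_executable(0x10000, 2 * PAGE_SIZE)
cache.translate(0x10040)
flushed_by_data_write = cache.on_write(0x50000)   # untracked page
flushed_by_code_write = cache.on_write(0x10800)   # tracked page
```

Only the write that lands on a tracked executable page clears the cache. Loading a library looks the same to this model: new code appears in tracked memory, so old translations are dropped.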

Option to emulate x87 with 64-bit float operations

Big shout out to CallumDev for implementing this long awaited feature.

A major performance problem when emulating x86 is that older games tend to be compiled to use the x87 extension. This is especially true for 32-bit games. The problem with this extension is that by default it uses 80-bit floats, which AArch64 doesn’t support. We end up emulating the entire extension with a soft-float implementation which, while quite accurate, is obscenely slow.

This performance hack removes a significant amount of that overhead by executing x87 instructions with 64-bit scalar float operations instead. This is known to be inaccurate, but most Windows games actually configure the x87 unit for lower precision than 80-bit. Additionally, most games don’t need the extra precision that 80-bit provides, so it is usually safe to emulate it less accurately.

This may still have some bugs; we know of at least one game with issues that aren’t explained by pure precision problems. The feature can be enabled in FEXConfig under the Hacks tab; look for “X87 Reduced Precision”.
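To make the precision gap concrete: the 80-bit x87 format carries a 64-bit significand, while a 64-bit float carries 53 bits. The sketch below uses Python’s `float`, which is a 64-bit IEEE 754 double on common platforms, to show an integer that fits the wider format but not the narrower one. The helper `fits_in_significand` is a hypothetical name for this illustration and only considers significand width, not exponent range.

```python
import sys

DOUBLE_BITS = sys.float_info.mant_dig   # 53 on IEEE 754 double platforms
X87_EXTENDED_BITS = 64                  # significand width of the 80-bit format


def fits_in_significand(value, bits):
    """Return True if a positive integer is exactly representable with `bits` significand bits."""
    # Trailing zero bits are absorbed by the exponent, so strip them first.
    while value % 2 == 0:
        value //= 2
    return value.bit_length() <= bits


n = 2**53 + 1
exact_in_double = fits_in_significand(n, DOUBLE_BITS)
exact_in_x87 = fits_in_significand(n, X87_EXTENDED_BITS)
rounded = int(float(n))   # the value a 64-bit float actually stores
```

Here `n` needs 54 significand bits, so it survives in the 80-bit format but rounds to `2**53` as a 64-bit float. Most game code never depends on those extra bits, which is why the reduced-precision mode is usually safe.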

Clone3 syscall fixed

Since version 2.34, glibc uses the clone3 syscall for creating threads. FEX’s implementation was mostly untested, which resulted in all applications breaking. Stack pointer behaviour was broken; with this fixed, glibc 2.34 works out of the box.
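As background only, and not a statement about what exactly was wrong in FEX: per the Linux man pages, clone3 describes the child stack by its lowest address plus a size, whereas the older clone is normally handed the top of the stack on architectures where the stack grows down. A minimal sketch of that conversion, with a hypothetical function name and made-up example values:

```python
def initial_stack_pointer(stack_base, stack_size):
    """Top-of-stack address for a downward-growing stack described clone3-style."""
    if stack_size <= 0:
        raise ValueError("stack_size must be positive")
    # clone3 passes the lowest address and the size; the child starts at the top.
    return stack_base + stack_size


# Hypothetical values: a 64 KiB stack mapped at 0x7f0000000000.
sp = initial_stack_pointer(0x7F0000000000, 64 * 1024)
```

Mixing up the two conventions would leave a new thread with a stack pointer at the wrong end of its stack.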

FEXRootFSFetcher: Don’t try to continue downloads

FEX-Emu’s CDN doesn’t support continuing file downloads, so resuming is now disabled to avoid causing issues.

FEXCore: Reclaimable thread pool allocator

FEXCore now uses an intrusive pooling allocator to allow sleeping threads to give back memory to the pool. This allows multiple threads to share a memory resource, reducing memory usage by a significant amount if an application has a bunch of sleeping threads.
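The memory saving can be illustrated with a toy model. This sketch is not FEXCore’s allocator: it is not intrusive, ignores locking, and the `BufferPool` class and its method names are assumptions for illustration. It only shows why handing memory back on sleep lets many threads share a few buffers.

```python
class BufferPool:
    """Toy pool: buffers released by sleeping threads are reused by others."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.free = []            # buffers available for reuse
        self.total_allocated = 0  # how many buffers were ever created

    def claim(self):
        # Reuse a free buffer if one exists, otherwise allocate a new one.
        if self.free:
            return self.free.pop()
        self.total_allocated += 1
        return bytearray(self.buffer_size)

    def release(self, buf):
        # Called when the owning thread goes to sleep.
        self.free.append(buf)


pool = BufferPool(buffer_size=1024)

# Eight "threads" take turns: each claims a buffer, works, then sleeps.
for _ in range(8):
    buf = pool.claim()
    pool.release(buf)

buffers_without_pool = 8
buffers_with_pool = pool.total_allocated
```

With every thread keeping its own buffer, eight would be needed; when sleepers give theirs back, this run only ever allocates one.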

FEXBash: Set PS1 environment variable to show running under emulation

Once running FEXBash it can be hard to tell if you’re running your bash terminal under emulation. Setting PS1 to FEXBash> makes it easier to tell that the terminal is running under emulation.

FEXBash> uname -a

Linux ryanh-TR2 5.17.5 #FEX-2206 SMP Jun 4 2022 15:11:07 x86_64 x86_64 x86_64 GNU/Linux

OpcodeDispatcher: Fixes PEXTRB

Newer Unreal Engine releases were generating a PEXTRB instruction that our frontend decoder was decoding incorrectly. Typically this would result in a crash. This fixes both Dirt 4 and Psychonauts 2.
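For reference, PEXTRB extracts one byte from a 128-bit XMM register, selected by the low four bits of its immediate, and zero-extends it into the destination. A minimal Python sketch of those semantics follows; the function name and the register contents are made up for illustration, and it says nothing about how FEX’s decoder or dispatcher is written.

```python
def pextrb(xmm_bytes, imm8):
    """Return the byte of a 128-bit register selected by the low 4 bits of imm8."""
    if len(xmm_bytes) != 16:
        raise ValueError("an XMM register holds exactly 16 bytes")
    index = imm8 & 0x0F          # only imm8[3:0] selects the byte
    return xmm_bytes[index]      # a Python int, so already zero-extended


# Register contents in little-endian byte order: byte 0 is the lowest byte.
xmm0 = bytes(range(0x10, 0x20))
low = pextrb(xmm0, 0)
high = pextrb(xmm0, 15)
wrapped = pextrb(xmm0, 0x25)     # upper imm8 bits are ignored, so this is byte 5
```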

Misc

  • CMake
    • Add support for mold
    • Add flag for defined signed overflow handling
  • Arm64: Optimize constant generation with ADRP+ADR
  • EmulatedFiles: Fixes temporary file generation flags
  • Struct Verifier: Fixes some bugs with DRM headers not getting picked up
  • Linux v5.17 and v5.18 support
  • JIT: Code relocation support
  • OpcodeDispatcher:
    • Adds support for non-temporal loads and stores
    • Implements support for PAUSE instruction
  • Syscalls:
    • 32-bit mmap syscalls fixes
      • Has been broken since the start, most applications use mmap2 instead
      • Fixes Kega Fusion
  • CompileService: Removed since it is no longer required
    • We no longer try to compile in a reentrant safe fashion
  • JITSymbols: Cleaner printing of RIP relative to a file
  • Standard TODO markers for code searching
  • Some 32-bit FS/GS writing fixes
    • Not really used so didn’t affect anything

See the 2206 Release Notes or the detailed change log on GitHub.

Written on June 4, 2022