FastMM4-AVX FPU Stack Corruption in 32-bit Move Procedures

Security Advisory GHSA-3x29-6h9j-vcvm. Published: March 6, 2026. Fixed in FastMM4-AVX v1.0.10.

Summary

Legacy 32-bit move routines in FastMM4-AVX used x87 fild and fistp instruction sequences for memory copy operations. If calling code entered these routines with a dirty x87 floating-point stack state (live entries already present on the 8-entry x87 register stack), the copy code could trigger an x87 stack overflow, raise floating-point exceptions, or silently corrupt data during ReallocMem or internal move paths. The initial mitigation used the emms instruction to clear the x87 state before each fild sequence. The final fix replaced all x87 copy sequences with rep movsd, eliminating the x87 dependency entirely. Fixed in v1.0.10.

Severity: Medium (CVSS 4.0 Score: 5.9)

Vulnerability Details

GHSA ID: GHSA-3x29-6h9j-vcvm
Vulnerability Type: FPU Stack Corruption; Use of Uninitialized Resource (CWE-908); Improper Check or Handling of Exceptional Conditions (CWE-703); Improper Check for Unusual or Exceptional Conditions (CWE-754)
Attack Type: Local
Attack Vector: Local process invoking the memory manager with non-clean x87 FPU state
Maintainer: Maxim Masiutin
Product: FastMM4-AVX Memory Manager
Affected Component: 32-bit move routines in FastMM4.pas
Affected Versions: All versions prior to v1.0.10
Fixed Version: v1.0.10
Impact: Memory Corruption, Data Corruption, Process Crash

CVSS Score

CVSS Version: 4.0
Score: 5.9
Severity: Medium
Vector String: AV:L/AC:H/AT:P/PR:N/UI:N/VC:N/VI:H/VA:H/SC:N/SI:N/SA:N

Technical Details

Background

FastMM4-AVX is a high-performance memory manager for 32-bit and 64-bit Pascal and Delphi applications. It extends the original FastMM4 with AVX/AVX2/AVX-512 optimized memory copy routines and improved multi-threading support. The 32-bit code path includes legacy move routines that used x87 floating-point instructions for copying fixed-size memory blocks.

Root Cause

The x87 floating-point unit on x86 processors maintains an 8-entry register stack. The fild instruction pushes a value onto this stack; fistp stores the top value and pops it. The affected move routines (Move36, Move44, Move52, Move60, Move68) and fallback copy helpers used sequences of fild/fistp pairs to copy memory 8 bytes at a time.

The routines did not save or initialize the incoming x87 register state before using it. If a caller had live entries on the x87 stack when GetMem, ReallocMem, or any internal copy path invoked one of these routines, the subsequent fild instructions could overflow the 8-entry x87 stack. If the x87 invalid-operation exception is unmasked, the overflow raises a floating-point exception that can crash the process. If it is masked, the overflowing fild loads the indefinite value instead of the source data, so the following fistp writes incorrect bytes to the destination and the copy is silently corrupted.
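The copy pattern described above can be sketched as follows. This is an illustration for 32-bit Delphi or Free Pascal in Delphi mode, not the FastMM4-AVX source: the name CopyTwoQwordsX87 and the 16-byte size are hypothetical. Both fild instructions push before any fistp pops, so this sketch needs two empty x87 registers on entry.

```pascal
program X87CopyPattern;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ASMMODE INTEL}{$ENDIF}

{ Hypothetical helper (not the FastMM4-AVX source). 32-bit x86 only.
  Copies 16 bytes with two nested fild/fistp pairs. }
procedure CopyTwoQwordsX87(const ASource; var ADest); assembler;
asm
  { register convention: eax = @ASource, edx = @ADest }
  fild  qword ptr [eax]        { push source bytes 0..7 }
  fild  qword ptr [eax + 8]    { push source bytes 8..15 }
  fistp qword ptr [edx + 8]    { pop into destination bytes 8..15 }
  fistp qword ptr [edx]        { pop into destination bytes 0..7 }
end;

var
  Src, Dst: array[0..1] of Int64;
begin
  Src[0] := $0123456789ABCDEF;
  Src[1] := -1;
  Dst[0] := 0;
  Dst[1] := 0;
  { Called here with a clean x87 stack, so both pushes succeed. }
  CopyTwoQwordsX87(Src, Dst);
  Writeln('copy matches source: ', (Dst[0] = Src[0]) and (Dst[1] = Src[1]));
end.
```

With a clean stack the round trip is exact, because a 64-bit integer fits in the x87 extended-precision format. The helper itself never checks how many registers are free, which is the unverified precondition this advisory describes.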

Affected Routines

  • Move36, Move44, Move52, Move60, Move68 in the 32-bit code path
  • Additional fallback loops in copy helpers that used x87 load and store sequences

x87 Architecture Context

  • x87 is an 8-entry register stack (ST(0) through ST(7))
  • fild pushes onto the x87 stack and signals an invalid-operation stack overflow (#IS) if the register it would push into is not empty, for example when all 8 entries are already occupied
  • fistp stores the top value and pops the x87 stack
  • If the incoming x87 state already has live entries, repeated fild instructions in a copy routine can overflow the stack
  • Overflow raises an x87 exception or causes invalid downstream floating-point behavior in the calling application
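The stack behavior listed above can be followed with a small toy model. This is a sketch in plain Pascal that only tracks which registers are in use; it executes no x87 instructions, and the names TX87Model, ModelInit, ModelPush, and ModelPop are hypothetical.

```pascal
program X87StackModel;
{$APPTYPE CONSOLE}

{ Toy model only: tracks which of the 8 x87 registers are in use.
  It does not execute x87 instructions. All names are hypothetical. }
type
  TX87Model = record
    Top: Integer;                    { index of the register acting as ST(0) }
    InUse: array[0..7] of Boolean;   { False = tagged empty }
  end;

procedure ModelInit(var M: TX87Model);
var
  I: Integer;
begin
  M.Top := 0;
  for I := 0 to 7 do
    M.InUse[I] := False;
end;

{ Models a push such as fild or fldz. Returns False on stack overflow,
  that is, when the register being pushed into is not empty. }
function ModelPush(var M: TX87Model): Boolean;
begin
  M.Top := (M.Top + 7) and 7;        { TOP decrements modulo 8 }
  ModelPush := not M.InUse[M.Top];
  M.InUse[M.Top] := True;
end;

{ Models a store-and-pop such as fistp. }
procedure ModelPop(var M: TX87Model);
begin
  M.InUse[M.Top] := False;
  M.Top := (M.Top + 1) and 7;        { TOP increments modulo 8 }
end;

var
  M: TX87Model;
  I: Integer;
  Ok: Boolean;
begin
  ModelInit(M);
  Ok := ModelPush(M);                { fild on a clean stack }
  ModelPop(M);                       { matching fistp }
  Writeln('push on clean stack ok: ', Ok);

  for I := 1 to 8 do
    Ok := ModelPush(M);              { caller leaves 8 live entries }
  Ok := ModelPush(M);                { the copy routine's first fild }
  Writeln('push on full stack ok: ', Ok);
end.
```

The model reports a successful push on the clean stack and a failed push once the caller has left all eight entries live, which is the state PoC A sets up.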

Proof of Concept

Both proof-of-concept cases require a build from a revision prior to commit 8aba44b with the 32-bit x87 move path reachable.

PoC A: Dirty x87 Stack Before ReallocMem

Push 8 values onto the x87 stack to fill it completely, then invoke ReallocMem with a size that routes through an affected move routine:

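program PocFpuA;
{$APPTYPE CONSOLE}
uses FastMM4;
var p: Pointer;
begin
  GetMem(p, 36);
  // Fill all eight x87 registers, ST(0) through ST(7), with 0.0
  asm
    fldz
    fldz
    fldz
    fldz
    fldz
    fldz
    fldz
    fldz
  end;
  // Resize while the x87 stack is full; the advisory uses this call to reach an affected move routine
  ReallocMem(p, 36);
  FreeMem(p);
end.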

Expected vulnerable behavior: invalid floating-point operation exception, crash, or silent data corruption in the memory copy path.

Expected fixed behavior: no x87-stack-dependent failure; the move operation completes correctly using rep movsd.

PoC B: Force Non-SSE Legacy Path

Compile with DontUseASMVersion to route through the Pascal implementation and avoid SSE optimizations, then rerun PoC A:

cd Tests/Advanced
fpc -B -Mdelphi -Twin32 -Pi386 -dDontUseASMVersion AdvancedTest.dpr
AdvancedTest.exe

Fix Applied in Version 1.0.10

The vulnerability was addressed in two stages:

Stage 1: Initial Mitigation

The emms instruction was inserted before each fild sequence in the affected 32-bit move functions. emms marks all eight entries in the x87 tag word as empty, discarding any live x87 or MMX register contents so that the following fild pushes start from an empty stack. This reduced the risk but did not eliminate the x87 dependency.

  • Commit d40934e: Fixed FPU stack corruption in Move36/44/52/60/68 by adding emms before fild sequences
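The Stage 1 approach can be sketched as follows. This is a minimal illustration for 32-bit Delphi or Free Pascal in Delphi mode on an MMX-capable CPU, not the code from commit d40934e: the helper name CopyQwordWithEmms and the 8-byte size are hypothetical. The driver fills the x87 stack before the call, as PoC A does.

```pascal
program EmmsBeforeFild;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ASMMODE INTEL}{$ENDIF}

{ Hypothetical helper (not the FastMM4-AVX source). 32-bit x86 with MMX only. }
procedure CopyQwordWithEmms(const ASource; var ADest); assembler;
asm
  { register convention: eax = @ASource, edx = @ADest }
  emms                         { tag all eight x87 registers as empty }
  fild  qword ptr [eax]        { push now has a free register }
  fistp qword ptr [edx]        { store and pop }
end;

var
  Src, Dst: Int64;
begin
  Src := $0123456789ABCDEF;
  Dst := 0;
  { Leave eight live entries on the x87 stack, as a dirty caller would. }
  asm
    fldz
    fldz
    fldz
    fldz
    fldz
    fldz
    fldz
    fldz
  end;
  CopyQwordWithEmms(Src, Dst);
  Writeln('copy matches source: ', Dst = Src);
end.
```

The sketch also shows the cost of this mitigation: emms discards whatever the caller had live on the x87 stack, and the copy still runs through x87 registers.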

Stage 2: Final Fix

All x87 fild/fistp copy sequences in the affected routines and fallback loops were replaced with rep movsd. This eliminates the x87 dependency entirely.

  • Commit 8aba44b: Replace fild/fistp with rep movsd in 32-bit Move procedures; add security warnings

Why rep movsd Is Safer

  • No x87 register dependency; x87 state is not read or modified
  • No MMX or emms dependency
  • Works correctly on all x86 CPUs targeted by the legacy 32-bit path
  • Removes ABI sensitivity to caller x87 cleanliness
  • Simpler execution model and easier to maintain securely
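A copy built on rep movsd can be sketched as follows. This is an illustration for 32-bit Delphi or Free Pascal in Delphi mode, not the code from commit 8aba44b: the name CopyDwords and its parameter list are hypothetical. It uses only integer registers, so the caller's x87 state is neither read nor changed.

```pascal
program RepMovsdSketch;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ASMMODE INTEL}{$ENDIF}

{ Hypothetical helper (not the FastMM4-AVX source). 32-bit x86 only.
  Copies ADwordCount 32-bit words from ASource to ADest. }
procedure CopyDwords(const ASource; var ADest; ADwordCount: Integer); assembler;
asm
  { register convention: eax = @ASource, edx = @ADest, ecx = ADwordCount }
  push esi                     { esi and edi must be preserved for the caller }
  push edi
  mov  esi, eax                { rep movsd reads from [esi] }
  mov  edi, edx                { and writes to [edi] }
  cld                          { copy forward }
  rep  movsd                   { copy ecx dwords }
  pop  edi
  pop  esi
end;

var
  Src, Dst: array[0..8] of LongWord;   { 9 dwords = 36 bytes }
  I: Integer;
  Same: Boolean;
begin
  for I := 0 to 8 do
  begin
    Src[I] := LongWord(I) * 3 + 1;
    Dst[I] := 0;
  end;
  CopyDwords(Src, Dst, 9);
  Same := True;
  for I := 0 to 8 do
    if Dst[I] <> Src[I] then
      Same := False;
  Writeln('36-byte copy matches source: ', Same);
end.
```

The 36-byte size is chosen to match the smallest affected routine named in this advisory; a fixed-size routine would load the dword count as a constant instead of taking it as a parameter.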

Workarounds

Upgrading to v1.0.10 or later is the only complete fix. If upgrading is not immediately possible, the vulnerable path can be avoided:

  • Ensure SSEVersion is configured to use SSE2 or higher (the default on any SSE2-capable CPU). The x87 copy path is only reached on builds that explicitly disable SSE or target hardware without SSE2.
  • Do not define DontUseASMVersion in 32-bit build configurations, as this routes memory operations through the Pascal implementation that includes the x87 copy routines.

CWE Mapping

  • CWE-703: Improper Check or Handling of Exceptional Conditions. The legacy x87 copy path does not robustly handle exceptional x87 stack conditions from caller context.
  • CWE-754: Improper Check for Unusual or Exceptional Conditions. Behavior depends on a non-guaranteed precondition (clean incoming x87 state), causing failures in valid but unusual runtime states.
  • CWE-908: Use of Uninitialized Resource. The routine implicitly relies on incoming processor floating-point state that is not initialized by the routine itself.

Related Issues in Other Software

The following public security issues are not the same vulnerability, but are technically similar: they involve incorrect handling of x86 FPU or extended processor state assumptions that lead to corruption, crashes, or unsafe behavior.

  • CVE-2022-49557: Linux kernel x86/KVM FPU state size handling bug caused out-of-bounds writes and data corruption when FPU/XSAVE assumptions did not hold on certain CPUs. (NVD)
  • CVE-2022-50425: Linux kernel x86 FPU xstate copy logic issue causing NULL dereference due to incorrect assumptions about available init state components. (NVD)
  • CVE-2026-23005: Linux kernel x86/KVM guest XSAVE and XFD state inconsistency that can trigger #NM and kernel panic. (NVD)
  • CVE-2018-3665: Lazy FPU state restore side-channel issue on Intel systems, demonstrating security impact from incorrect floating-point state handling across execution contexts. (NVD)

References