The CoreCLR implementation was introduced in #35597.
The fence must be an intrinsic:
- on architectures with strong memory ordering (e.g. x64), it acts as a compiler fence that prevents reordering optimizations.
- on architectures with weak memory ordering, it also emits an ordering instruction (e.g. dmb ishld on arm64).
Also consider: #35761