Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@AndyAyersMS
Copy link
Member

The inliner now can inline a subset of methods with pinned locals --
namely those where the language compiler (eg CSC) can determine that a
try/finally is not necessary to ensure unpinning, and where the jit
likewise an determines that a try/finally is likewise not needed in
the root method.

Generally speaking this allows inlining in cases where the inline method
is fairly simple and does not any contain exception handling, and the
call site is not within a try region.

When inlining methods that have pinned locals and also return a value,
the jit ensures that the return value is spilled to a temp so that the
unpins can be placed just after the inlined method body and won't alter
the return value.

Closes #7774.

@AndyAyersMS
Copy link
Member Author

cc @dotnet/jit-contrib... not sure who specifically to tag anymore

if (retExpr != nullptr)
{
const bool interferesWithReturn = gtHasRef(inlineInfo->retExpr, tmpNum, false);
assert(!interferesWithReturn);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this assert is required for correctness (which I believe it is) then it should be a noway_assert instead of a vanilla assert.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it, sure. I though this check might get expensive, and it's just there to ensure we did the right thing earlier...

}
#endif // DEBUG

// Emit the unpin.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should describe the unpin operation in a comment:
as assign a null value to the pinned variable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AndyAyersMS
Copy link
Member Author

Ubuntu hit a java remoting error. Will rerun.

@dotnet-bot retest Ubuntu x64 Checked Build and Test

@AndyAyersMS
Copy link
Member Author

Running a few stress legs:

@dotnet-bot test Windows_NT gcstress0x3
@dotnet-bot test Windows_NT gcstress0xc
@dotnet-bot test Windows_NT jitstress2
@dotnet-bot test Ubuntu gcstress0x3
@dotnet-bot test Ubuntu gcstress0xc
@dotnet-bot test Ubuntu jitstress2

@AndyAyersMS
Copy link
Member Author

JitStress=2 failures caused by a prior change, see #7792.

Will need to look into the GC stress failures more deeply. The FixedAddressValueType test case is one I've already worked on some during initial bring-up.

@briansull
Copy link

LGTM

@AndyAyersMS
Copy link
Member Author

Fix for #7792 is in, so will rerun the jit stress legs... still need to look at the GC stress issues.

@dotnet-bot test Windows_NT jitstress2
@dotnet-bot test Ubuntu jitstress2

@AndyAyersMS
Copy link
Member Author

Diffs on CoreCLR show size increases, some substantial.

Total bytes of diff: 38405 (0.44 % of base)
    diff is a regression.
Total byte diff includes 0 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had    0 unique methods,        0 unique bytes
Top file regressions by size (bytes):
       19435 : System.Reflection.Metadata.dasm (5.97 % of base)
       16449 : Microsoft.CodeAnalysis.dasm (2.20 % of base)
        1505 : System.Private.CoreLib.dasm (0.05 % of base)
         788 : System.Runtime.Numerics.dasm (1.38 % of base)
         228 : System.IO.Compression.dasm (0.29 % of base)
5 total files with size differences.
Top method regessions by size (bytes):
        3555 : Microsoft.CodeAnalysis.dasm - Microsoft.Cci.PeWriter:WriteHeaders(ref,ref,ref,ref,byref):this
        3522 : System.Reflection.Metadata.dasm - System.Reflection.PortableExecutable.PEBuilder:WritePEHeader(ref,ref,struct):this
         943 : System.Reflection.Metadata.dasm - System.Reflection.PortableExecutable.ManagedTextSection:WriteCorHeader(ref,int,int,int):this
         878 : Microsoft.CodeAnalysis.dasm - Microsoft.Cci.PeWriter:WriteCorHeader(ref,ref)
         788 : System.Runtime.Numerics.dasm - Number:ParseNumber(byref,int,byref,ref,ref,bool):bool
208 total methods with size differences.

The more greatly-impacted methods tend to make a series of calls to helper methods that pin and then (typically) call other helper methods. The mechanics of pinning require some extra code for each inlinee.

Some of the diffs don't look all that good, and cleaning them up will require careful reasoning about the interaction of the values in the pinned local and GC safe points. Here's an example:

;;; Microsoft.CodeAnalysis.CodeGen.ILBuilder:CloseStateMachineScope():this

;;; BEFORE 

       mov      edx, eax
       mov      rcx, gword ptr [rsi+16]
       mov      r8d, edi
       call     [Microsoft.Cci.BlobUtilities:WriteUInt32(ref,int,int)]
       nop  

;;; AFTER (rsp+28H is the inline-introduced pinned local)

       ; prolog
       xor      rax, rax                       // prolog zero-init
       mov      qword ptr [rsp+28H], rax       // untracked local
       ...
       ; inlinee body
       mov      rdx, gword ptr [rsi+16]
       xor      rcx, rcx                       // zero-init
       mov      bword ptr [rsp+28H], rcx       // must-init local
       cmp      eax, dword ptr [rdx+8]
       jae      SHORT G_M14678_IG08
       movsxd   rax, eax
       lea      rax, bword ptr [rdx+rax+16]
       mov      bword ptr [rsp+28H], rax       // set the pin
       mov      rax, bword ptr [rsp+28H]
       mov      dword ptr [rax], edi
       mov      bword ptr [rsp+28H], rcx       // CSC unpin
       mov      bword ptr [rsp+28H], rcx       // JIT unpin
       ...
G_M14678_IG08:
       call     CORINFO_HELP_RNGCHKFAIL
       int3

Since the inline-introduced pinned local is an untracked GC reference, it is zeroed in the prolog. Since it is also a must-init local it's then zeroed just before the inline body. After the inline body the local is unpinned by code both from the C# compiler and from the jit. Some of these zeroings could be proved redundant and removed, at the cost of extra jit time.

Where there are lots of inline-introduced pinned locals, it would be nice to stack-pack them since in some cases the prolog init costs for these will become costly, especially so if the call sites are only conditionally reached from method entry.

It might be prudent to add an size penalty to cases that will introduce pinned locals so the jit will only inline the really trivial cases, but there is often extra IL required to describe a pin so this may be tricky to get right.

@AndyAyersMS
Copy link
Member Author

Ubuntu stress hit a java remoting error. Retrying.

@dotnet-bot test Ubuntu jitstress2

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Oct 26, 2016

Started looking at GC stress failures from here and here

I can't repro the failure in GC\API\GC\GetGenerationWR2 and don't see any new inlines there.

I can repro the failure in baseservices\compilerservices\FixedAddressValueType with about a 50% likelihood, and this test is impacted by my changes as the 4 calls to get static addresses are all now inlined. The code the jit generates looks ok to me. I am not sure if this is a robust test -- it seems possible the class with the GC heap allocated static could end up getting allocated at the same address both times. Also the inlines done into Main reduce the number of GC stress points and so this co-incident allocation may end up as a more likely outcome.

@swaroop-sridhar @swgillespie any thoughts on the FixedAddressValueType failure? I could add NoInline attributes to the calls that pin, I suppose...?

@swgillespie
Copy link

I can't say I'm familiar with this area so I'm not sure. I have put many NoInline attributes on GC tests that are perturbed by inlining so I would have no qualms about doing the same here, though.

@AndyAyersMS
Copy link
Member Author

I see baseservices\compilerservices\FixedAddressValueType failing frequently under 0xC stress even without my changes. So I think this test should be marked as stress-incompatible.

@AndyAyersMS
Copy link
Member Author

Relaunching the stress legs...

@dotnet-bot test Windows_NT gcstress0x3
@dotnet-bot test Windows_NT gcstress0xc
@dotnet-bot test Windows_NT jitstress2

@AndyAyersMS
Copy link
Member Author

Ooops, inadvertently lost the fix i did for non-windows builds... hold on a sec.

@AndyAyersMS
Copy link
Member Author

Looks like more possible flakiness with 0xC stress... will rerun and investigate.

@dotnet-bot test Windows_NT gcstress0x3
@dotnet-bot test Windows_NT gcstress0xc
@dotnet-bot test Windows_NT jitstress2

@AndyAyersMS
Copy link
Member Author

The test with the GCStress=0xC failure (methodical/explicit/basic/refloc_c.il) has identical code gen before and after this change. So it's not a regression.

@AndyAyersMS
Copy link
Member Author

@briansull I think this is ready to go, other than the issue you raised about how paranoid we should be about potential interference between the jit-introduced unpins and the return expression. Thoughts?

@briansull
Copy link

briansull commented Oct 31, 2016

LGTM
(I reviewed my comment and I think that you should protect this with a noway_assert based upon the risk vs. the cost. I expected the cost to be negligible because in the limit we only inline a very smaller number of methods that need this unpin operation. )

@AndyAyersMS
Copy link
Member Author

@briansull will update per your feedback.

@AndyAyersMS
Copy link
Member Author

Formatter leg had some kind of network reset... retrying

@dotnet-bot retest Windows_NT x64 Formatting

The inliner now can inline a subset of methods with pinned locals --
namely those where the language compiler (eg CSC) can determine that a
try/finally is not necessary to ensure unpinning, and where the jit
likewise an determines that a try/finally is likewise not needed in
the root method.

Generally speaking this allows inlining in cases where the inline method
is fairly simple and does not any contain exception handling, and the
call site is not within a try region.

When inlining methods that have pinned locals and also return a value,
the jit ensures that the return value is spilled to a temp so that the
unpins can be placed just after the inlined method body and won't alter
the return value.

Mark FixedAddressValueType as GCStressIncompatible since the "unfixed"
class static may get re-allocated at the same address. This seems to
happen even without these changes but happens much more frequently with
them.

Closes #7774.
@AndyAyersMS
Copy link
Member Author

Rebased to resolve a merge conflict.

@AndyAyersMS AndyAyersMS merged commit c06646d into dotnet:master Nov 11, 2016
@AndyAyersMS AndyAyersMS deleted the InlinePins branch November 11, 2016 19:24
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants