Consolidate implementation of ExtractBit #22225

grant-d · 2019-01-26T00:42:59Z

ExtractBit, InsertBit, ClearBit

[Edited]
This is intended to be the last PR in this series (LZCNT, LOG2, TZCNT, ROTL, POPCNT)
Previously, all the above changes were in this PR; I decided to split them into separate PRs to make CR easier.
This PR is now repurposed to focus on the final set of changes.

There are several implementations of ExtractBit, InsertBit, Clearbit in the stack.

This PR consolidates them into a central class
The actual implementations are mostly unchanged, ie same standard idioms from original callsites
All BitOps methods pass units

The benefits here are:

Less proliferation of identical helpers
Some optimizations, API consistency
Existing code/callsites are more self-documenting
New code can use these new pseudo-intrinsics
Dependency/profile analysis

All known callers only use byte, int or uint arguments.
We may decide to add long, ulong for consistency.

The know callers are mostly downstream (also see review comments):

Related PR for CoreRT: [NO MERGE] BitOps analysis CoreRT (WIP) corert#6911
Related PR for CoreFX: [NO MERGE] BitOps analysis CoreFX (WIP) corefx#34917
Related PR for Roslyn: [NO MERGE] BitOps analysis Roslyn (WIP) roslyn#32901 (example of external consumer)

cc @tannergooding, @jkotas

src/System.Private.CoreLib/shared/System/BitOps.cs

- Change return type to byte

src/System.Private.CoreLib/shared/System/BitOps.cs

grant-d · 2019-01-28T22:54:00Z

OK @tannergooding per your request I have shaken the tree: removed all unused code, and only left the consolidated methods. We can perf-tune those later on.
Beneath each region I have left the commented-out original call sites for validation.

Let me know if that's suitable, I can make any changes needed.

src/System.Private.CoreLib/shared/System/BitOps.cs

tannergooding · 2019-01-28T22:55:53Z

CC. @GrabYourPitchforks, who I believe was interested in some of this work.

src/System.Private.CoreLib/shared/System/BitOps.cs

src/System.Private.CoreLib/shared/System/SpanHelpers.Byte.cs

src/System.Private.CoreLib/shared/System/BitOps.cs

jkotas · 2019-02-14T23:26:21Z

This is adding a bunch of dead methods that are not used from anywhere else in CoreLib (different from previous PRs).

We should have the API review discussion first whether these methods are worth adding to public surface. The value & (1 << bit) != 0 pattern is pretty idiomatic - it is unclear to me whether these methods are useful.

grant-d · 2019-02-14T23:29:35Z

it is unclear to me whether these methods are useful

Makes sense, we can wait for api review.
I am curious - does the JIT turn such idioms into native instructions, eg bts ?

grant-d · 2019-02-14T23:31:56Z

@jkotas would you mind reviewing the single remaining change in this PR, that was a missed optimization from the previous PR?
[Edit] Change moved to separate PR (#22612)

tannergooding · 2019-02-14T23:34:26Z

The value & (1 << bit) != 0 pattern is pretty idiomatic - it is unclear to me whether these methods are useful.

I think they are likely useful in the scenario where they can be backed by intrinsics. Whether or not BT, BTS, BTR, and BTC are useful is probably what should be determined.

jkotas · 2019-02-14T23:34:34Z

does the JIT turn such idioms into native instructions, eg bts ?

I do not think it does today. https://github.com/dotnet/coreclr/issues/11471#issue-227453324 tracks it (look for * bt{s|r|c} formation).

jkotas · 2019-02-14T23:39:31Z

BT, BTS, BTR, and BTC are useful is probably what should be determined.

If they are useful, they should be handled as peephole optimization. All code out there is using the idiomatic pattern today, and we should not be asking all of it to be rewritten to do something else to run well.

grant-d · 2019-02-14T23:40:16Z

OK, closing this PR pending API review

tannergooding · 2019-02-14T23:55:43Z

If they are useful, they should be handled as peephole optimization. All code out there is using the idiomatic pattern today, and we should not be asking all of it to be rewritten to do something else to run well.

I was suggesting that it might be beneficial to both recognize the code pattern and have a standard API for it.

Much like Rotate, it is typical to create a "helper" method for these kind of operations (we've used InsertBit, SetBit, ClearBit, RemoveBit, GetBit, GetFlag, TestBit, FlipBit, and WriteBit ourselves in various dotnet repos). The helper method makes it very clear the intended semantics, at a glance.

grant-d · 2019-02-14T23:58:37Z

The helper method makes it very clear the intended semantics

I agree with that: the callsites I updated in the previous PRs are more self documenting, especially where operator-heavy code is present, like inline rotls.

beneficial to both recognize the code pattern and have a standard API for it
Agree. So we may or may not update the legacy callsites, but either way new code can use the intrinsic.

This also opens the door to analysis - we can then directly measure how many callers use (say) ClearBit both in sourcecode and at runtime (profiling) and thus make a decision to JIT optimize it or not.

jkotas · 2019-02-15T00:04:28Z

Much like Rotate

The difference is that the idiomatic pattern for Rotate is not a simple code. The idiomatic pattern for bit test is compact code (ie the method won't save you much typing).

tannergooding · 2019-02-15T00:14:13Z

ie the method won't save you much typing

Right, but the point of the method isn't to save on typing. It is to provide clarity in the code.
var result = SetBit(data, 31) is generally more clear than either var result = data | 0x80000000 or var result = data | (1 << 31). I would guess this is also why we have created helper methods for the cases we are using such a pattern internally.

jkotas · 2019-02-15T00:49:33Z

It is matter of personal preference. I do not think I have ever created/used SetBit and friends methods myself. data | (1 << 31) worked just fine for me.

There are ~500 occurrences of 1 << in CoreLib and even more in CoreFX. I do not see much use of SetBit and friends methods. This suggests that the direct use of this pattern is orders of magnitudes more common than folks wrapping it in helper methods.

mikedn · 2019-02-15T06:48:23Z

The JIT generates BT for a while now. It could generate BTS & co. from code like x | (1 << y) but these instructions have problematic perf characteristics - they're useful on Intels and not so much on AMDs. And the reason why they're useful on Intels is not that these instructions have great performance but rather that variable shift count instructions have bad performance. Also, their memory operand forms are unusable, such that or [mem], reg/imm is preferable.

As such, it's probably more useful to make the JIT use SHLX/SHRX/SARX instead of the traditional shift instructions when available, these are efficient on both AMD and Intel and cover more scenarios.

BTS & co. might still be useful for its "test and set" capability as that saves an additional instruction. But that's more difficult to recognize in the JIT. Perhaps an intrinsic would be useful, though still not extraordinarily valuable.

grant-d · 2019-02-15T07:30:57Z

Here's an example where I personally believe the utility methods make the code more self-documenting and simpler to digest.
The old code is commented out. At a glance I can tell what the new code is doing, vs the old code where I have to read more carefully:

In System.Diagnostics.Tracing.SessionMask

        public bool this[int perEventSourceSessionId]
        {
            get
            {
                return BitOps.ExtractBit(m_mask, perEventSourceSessionId);
                //return (m_mask & (1 << perEventSourceSessionId)) != 0;
            }
            set
            {
                if (value)
                    m_mask = BitOps.InsertBit(m_mask, perEventSourceSessionId);
                    //m_mask |= ((uint)1 << perEventSourceSessionId);
                else
                    m_mask = BitOps.ClearBit(m_mask, perEventSourceSessionId);
                    //m_mask &= ~((uint)1 << perEventSourceSessionId);

                // Or, even better, a one-liner
                BitOps.WriteBit(ref m_mask, perEventSourceSessionId, value);
            }
        }

In XmlQueryType.cs (similar fix, except using the ref based signatures)

            public bool this[int index1, int index2]
            {
                get
                {
                    Debug.Assert(index1 < _bits.Length && index2 < _bits.Length, "Index out of range.");
                    return (_bits[index1] & ((ulong)1 << index2)) != 0;
                }
                set
                {
                    Debug.Assert(index1 < _bits.Length && index2 < _bits.Length, "Index out of range.");
                    if (value == true)
                    {
                        _bits[index1] |= (ulong)1 << index2;
                    }
                    else
                    {
                        _bits[index1] &= ~((ulong)1 << index2);
                    }
                }
            }

grant-d · 2019-02-15T07:45:39Z

[Edited]

I would guess this is also why we have created helper methods for the cases we are using such a pattern internally

It's hard to figure out a narrow search phrase for all these methods, but here's some I came across:

In System.Globalization.TextInfo

private static bool IsWordSeparator(UnicodeCategory category) 
{
    return (c_wordSeparatorMask & (1 << (int) category)) != 0;
}

In XmlSubtreeReader.cs

private bool InAttributeActiveState
{
    get
    {
#if DEBUG
        Debug.Assert(0 == (AttributeActiveStates & (1 << (int)State.Initial)));
        Debug.Assert(0 != (AttributeActiveStates & (1 << (int)State.Interactive)));
        // elided 9 more, which ~might be easier understood as:
        Debug.Assert(BitOps.ExtractBit(AttributeActiveStates, (int)State.Interactive));
#endif
        return 0 != (AttributeActiveStates & (1 << (int)_state));
    }
}

In XmlSerializationReader.cs

private static bool IsTextualNode(XmlNodeType nodeType)
{
    return 0 != (s_isTextualNodeBitmap & (1 << (int)nodeType));
}

In OptimizerPatterns.cs

        public bool MatchesPattern(OptimizerPatternName pattern)
        {
            Debug.Assert(Enum.IsDefined(typeof(OptimizerPatternName), pattern));
            return (_patterns & (1 << (int)pattern)) != 0;
        }

In XmlNavigatorFilter.cs

        public override bool IsFiltered(XPathNavigator navigator)
        {
            return ((1 << (int)navigator.NodeType) & _mask) == 0;
        }

In DefaultBinder.CanConvert.cs

        private static bool CanPrimitiveWiden(Type source, Type target)
        {
            Primitives widerCodes = s_primitiveConversions[(int)(Type.GetTypeCode(source))];
            Primitives targetCode = (Primitives)(1 << (int)(Type.GetTypeCode(target)));

            return 0 != (widerCodes & targetCode);
        }

In Utf8JsonWriter.cs
(I think there were several like this, I didn't think to capture them earlier

        private void SetFlagToAddListSeparatorBeforeNextItem()
        {
            // Surprising this is not manually inlined, but it supports the point about helpers
            _currentDepth |= 1 << 31;
        }

2 more in Roslyn
3 more in IOT
3 meandering ones in WPF
2 more in AD

jkotas · 2019-02-15T07:57:43Z

Several more in Roslyn
And more in IOT
Some meandering ones in WPF
Some more in AD

What's interesting about these examples is that there are no two helpers that are the same. Everybody customizes them for their usage pattern. Here are the variations from these 4 examples.

uint InsertBit(int position, uint bits)
uint RemoveBit(int position, uint bits)
void ClearBit(ref byte data, int bitNumber) // throws when the bit number is invalid
void SetBit(ref byte data, int bitNumber) // throws when the bit number is invalid
bool GetBit(byte data, int bitNumber) // throws when the bit number is invalid
bool? GetFlag(int bitMask, int bitToCheck)
void SetFlag(ref int bitMask, int bitToSet, bool value) // Interlocked operation
void SetBit(ref int flags, int mask) // Interlocked operation
void ClearBit(ref int value, uint bitmask)
void SetBit(ref int value, uint bitmask)

grant-d · 2019-02-15T08:22:12Z

[Edited - also edited previous message, added more inline examples]
Interesting, thanks for the analysis.
So we would not be able to replace all the callsites:

7 inline examples above - we could do 4 of these
2 more in Roslyn - we could do these (and assert from their callsites)
3 more in IOT - per your analysis, probably can't do these
3 meandering ones in WPF - cannot do
2 more in AD - we could do these

However, it does support there being a use-case for helpers - devs don't always want to inline the algebra.
The onus would be to find common-enough signatures that fits most use-cases.

For reference, here's the original PR signatures:

// Overloads exist for byte|uint|int

public static bool ExtractBit(byte value, int bitOffset); // Used in SessionMask example above

public static bool WriteBit(ref byte value, int bitOffset, bool on); // Used in SessionMask
public static byte WriteBit(byte value, int bitOffset, bool on);

public static bool InsertBit(ref byte value, int bitOffset);
public static byte InsertBit(byte value, int bitOffset); // Used in SessionMask

// ClearBit, ComplementBit similar

jkotas · 2019-02-15T14:24:43Z

there being a use-case for helpers - devs don't always want to inline the algebra.

When majority of cases do inline the algebra, is it worth having an API for small minority of cases in the framework?

It feels similar to the situation with "throw helpers". Majority of code throws exceptions directly. Some folks like to have helper methods for throwing exceptions, but there is a lot of variety, styles and trade-offs. We had API suggestions for introducing a helper methods for this in the framework (like https://github.com/dotnet/corefx/issues/17068). None of them made it because of the use case is not significant enough, and it is easy enough for folks who likes to have these to have their own local copy that they like.

grant-d · 2019-02-15T16:22:11Z

Fair point, I agree with you on throw helpers.
If we focus on ExtractBit or InsertBit then the idioms are simple enough that I see your point, but for something like the below methods, I think they self-document easier than the math:

// Devs would likely need to unit this one to make sure it's right
public static uint ComplementBit(uint value, int bitOffset)
    =>  ~(~(1u << bitOffset) ^ value); 

// Easy once you know the idiom but at a glance it certainly doesn't say `BTR`
public static uint ClearBit(uint value, int bitOffset)
    => value & ~(1u << bitOffset);

// True BTR behavior, ie returns original value
public static bool ClearBit(ref uint value, int bitOffset)
{
    uint mask = 1u << bitOffset;
    bool btr = (value & mask) != 0;
    value &= ~mask;
    return btr;
}

NickStrupat · 2019-09-11T02:11:56Z

@grant-d Hi Grant,

I'm hunting for the latest version of your BitOps.cs with all the goodies in it. Could you be so kind as to point me to it?

benaadams · 2019-09-11T02:36:03Z

BitOperations.cs under System.Numerics.BitOperations https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/shared/System/Numerics/BitOperations.cs

NickStrupat · 2019-09-11T02:56:15Z

@benaadams ah yes, thanks! I saw that the other day, but I'm looking for the unapproved ones like Iff, Clear, Set, etc.

I'm working on a BitSpan which wraps Span<Byte> + starting bit offset and bit length.

NickStrupat · 2019-09-11T03:03:53Z

found this
https://github.com/grant-d/coreclr/blob/db62682354d8952bd9a438460d5501118f8d1f00/src/System.Private.CoreLib/shared/System/BitOps.cs

grant-d · 2019-09-11T19:56:35Z

This might also help:
https://github.com/k2workflow/Clay/blob/526271f8349be49ac0dbbd8be77750bbf950c4e2/src/SourceCode.Clay/Numerics/BitOps.cs
I can't vouch for correctness or perf on any of this, but it has (at least) a bunch of useful ideas.

Identofy duplicate callsites for BitOps

5ff7696

grant-d changed the title ~~BitOps analysis NOMERGE (WIP)~~ [NO MERGE] BitOps analysis (WIP) Jan 26, 2019

grant-d commented Jan 26, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

grant-d commented Jan 26, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

Add new methods

0139220

grant-d commented Jan 26, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

grant-d commented Jan 26, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

- Add unsigned overload for TZCNT

4b26701

- Change return type to byte

grant-d commented Jan 26, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

grant-d commented Jan 26, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

Grant Dickinson added 2 commits January 28, 2019 09:55

Comments

c51d22e

Add instrinsics support to some methods

781bc08

tannergooding reviewed Jan 28, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

tannergooding reviewed Jan 28, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

Grant Dickinson added 3 commits January 28, 2019 14:08

CR changes: Standardize on uint instead of byte

bb2a8b1

CR: Move statics to top of file

05ce353

Trim unused code

fe01190

tannergooding reviewed Jan 28, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

grant-d commented Jan 28, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

Remove commented-out code

f917003

tannergooding reviewed Jan 28, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

Grant Dickinson added 3 commits January 28, 2019 15:26

Cleanup

d7b7e39

Update CoreClr call sites

6beaa00

More call sites

2a0f621

grant-d pushed a commit to grant-d/corert that referenced this pull request Jan 29, 2019

Update call sites for dotnet/coreclr#22225

b8ff2cb

This was referenced Jan 29, 2019

[NO MERGE] BitOps analysis CoreRT (WIP) dotnet/corert#6911

Closed

[NO MERGE] BitOps analysis CoreFX (WIP) dotnet/corefx#34917

Closed

grant-d commented Jan 29, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/SpanHelpers.Byte.cs Outdated Show resolved Hide resolved

grant-d commented Feb 14, 2019

View reviewed changes

src/System.Private.CoreLib/shared/System/BitOps.cs Outdated Show resolved Hide resolved

Revert until api review

40a68f9

grant-d mentioned this pull request Feb 14, 2019

Fix perf issue related to LeadingZeroCount #22612

Merged

Unrevert

45e6e6a

grant-d closed this Feb 14, 2019

grant-d deleted the grant-d.bits branch February 14, 2019 23:40

Consolidate implementation of ExtractBit #22225

Consolidate implementation of ExtractBit #22225

Uh oh!

Conversation

grant-d commented Jan 26, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ExtractBit, InsertBit, ClearBit

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

grant-d commented Jan 28, 2019

Uh oh!

Uh oh!

tannergooding commented Jan 28, 2019

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jkotas commented Feb 14, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

grant-d commented Feb 14, 2019

Uh oh!

grant-d commented Feb 14, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tannergooding commented Feb 14, 2019

Uh oh!

jkotas commented Feb 14, 2019

Uh oh!

jkotas commented Feb 14, 2019

Uh oh!

grant-d commented Feb 14, 2019

Uh oh!

tannergooding commented Feb 14, 2019

Uh oh!

grant-d commented Feb 14, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jkotas commented Feb 15, 2019

Uh oh!

tannergooding commented Feb 15, 2019

Uh oh!

jkotas commented Feb 15, 2019

Uh oh!

mikedn commented Feb 15, 2019

Uh oh!

grant-d commented Feb 15, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

grant-d commented Feb 15, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jkotas commented Feb 15, 2019

Uh oh!

grant-d commented Feb 15, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jkotas commented Feb 15, 2019

Uh oh!

grant-d commented Feb 15, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NickStrupat commented Sep 11, 2019

Uh oh!

benaadams commented Sep 11, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NickStrupat commented Sep 11, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NickStrupat commented Sep 11, 2019

Uh oh!

grant-d commented Jan 26, 2019 •

edited

Loading

jkotas commented Feb 14, 2019 •

edited

Loading

grant-d commented Feb 14, 2019 •

edited

Loading

grant-d commented Feb 14, 2019 •

edited

Loading

grant-d commented Feb 15, 2019 •

edited

Loading

grant-d commented Feb 15, 2019 •

edited

Loading

grant-d commented Feb 15, 2019 •

edited

Loading

grant-d commented Feb 15, 2019 •

edited

Loading

benaadams commented Sep 11, 2019 •

edited

Loading

NickStrupat commented Sep 11, 2019 •

edited

Loading