Commit 1e029d0
Enable EVEX feature: embedded broadcast for Vector128/256/512.Add() in limited cases (#84821)
* Enable EVEX feature: embedded broadcast
Embedded broadcast is enabled in Vector256<float>.Add() in limited cases:
1. Vector256.Add(Vec, Vector256.Create(DCon));
2. Vector256<float> VecCns = Vector256.Create(DCon);
Vector256.Add(Vec, VecCns);
3. Vector256.Add(Vec, Vector256.Create(LCL_VAR));
4. Vector256<float> VecCns = Vector256.Create(LCL_VAR);
Vector256.Add(Vec, VecCns);
Note: Cases 2 and 4 can only be optimized when DOTNET_TieredCompilation=0.
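To illustrate what the cases above enable, here is a minimal C++ model of the two ways the addition can be evaluated. It is not JIT code: `addWithMaterializedVector` and `addWithEmbeddedBroadcast` are hypothetical names, and the eight-lane array merely stands in for `Vector256<float>`. With embedded broadcast, the instruction reads one scalar from memory and the hardware replicates it across the lanes, so no full constant vector has to be loaded.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Vector256<float>: eight 32-bit lanes.
using Vec8f = std::array<float, 8>;

// Without embedded broadcast: the scalar is first expanded into a full
// 32-byte vector, which is then used as the second operand.
Vec8f addWithMaterializedVector(const Vec8f& vec, float scalar)
{
    Vec8f broadcast;
    broadcast.fill(scalar); // models Vector256.Create(scalar)

    Vec8f result{};
    for (std::size_t i = 0; i < vec.size(); ++i)
    {
        result[i] = vec[i] + broadcast[i];
    }
    return result;
}

// With embedded broadcast: the add reads a single 4-byte scalar from memory
// and applies it to every lane.
Vec8f addWithEmbeddedBroadcast(const Vec8f& vec, const float* scalarInMemory)
{
    assert(scalarInMemory != nullptr);

    Vec8f result{};
    for (std::size_t i = 0; i < vec.size(); ++i)
    {
        result[i] = vec[i] + *scalarInMemory;
    }
    return result;
}
```

Both functions produce the same lanes; the difference the commit targets is only in how the second operand is fetched.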
* remove some irrelevant changes from previous main.
* Enable containment at Broadcast intrinsic
to improve embedded broadcast enablement.
* Convert the broadcast check logic into a flag
* bug fixes:
1. fixed the containment logic in lowering, to accommodate the situation where
both operands of an EB-compatible node are EB candidates.
2. fixed EVEX.b being set unexpectedly on some non-EVEX instructions on x86
* apply format patch.
* Add "insOpts" data structure to xarch:
insOpts may contain information on the EVEX.b bit,
currently only embedded broadcast
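As a rough sketch of the idea, assuming a two-value option enum and a helper that patches the prefix: in the 4-byte EVEX prefix (0x62 followed by three payload bytes), the `b` bit is bit 4 of the last payload byte. The enumerator names, `applyInsOpts`, and `exampleUsage` below are illustrative and are not taken from the changed files.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative option enum; the names are assumptions.
enum class InsOpts : unsigned
{
    None,
    EvexB, // request EVEX.b (embedded broadcast for a memory operand)
};

// EVEX.b is bit 4 of the third payload byte (layout: z L'L b V' aaa).
constexpr std::uint8_t kEvexBBit = 1u << 4;

// Returns the third EVEX payload byte with the b bit set when requested.
// isEvexEncoded guards against touching non-EVEX instructions.
constexpr std::uint8_t applyInsOpts(std::uint8_t p2, InsOpts opts, bool isEvexEncoded)
{
    return (isEvexEncoded && opts == InsOpts::EvexB)
               ? static_cast<std::uint8_t>(p2 | kEvexBBit)
               : p2;
}

// Small usage example: only the EVEX-encoded request changes the byte.
inline void exampleUsage()
{
    assert(applyInsOpts(0x08, InsOpts::EvexB, true) == 0x18);
    assert(applyInsOpts(0x08, InsOpts::EvexB, false) == 0x08);
}
```

Carrying the request as a separate option, rather than inferring it in the encoder, keeps the decision with the code that knows the operand is a broadcast scalar.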
* Add "OperIsBroadcastScalar" check:
This check ensures the intrinsic is actually a broadcast
scalar intrinsic. It is needed because gentree flags use
overlapping definitions and GTF_BROADCAST_EMBEDDED conflicts
with some of them, so we must make sure the flag we check
does not come from another overlapping flag.
* rebase the branch and resolve conflicts
* changes based on the reviews:
1. removed the gentree flag GTF_EMBEDDED_BROADCAST.
2. mark the embedded broadcast node by making it contained.
3. improved the logic in GetMemOpSize() to return the correct pointer size
when embedded broadcast is enabled.
4. improved the logic in genOperandDesc() to emit a scalar when a constant
vector operand is found to be created from a scalar.
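Item 3 can be pictured as follows. This is a simplified sketch rather than the actual `GetMemOpSize()`: the function name `memOperandSizeInBytes` and its parameters are assumptions made for illustration. The point is that a full-width vector operand reads 16, 32, or 64 bytes, whereas an embedded broadcast operand reads only one element (4 bytes for float or int, 8 for double or long).

```cpp
#include <cassert>
#include <cstddef>

// Simplified model of how many bytes the memory operand of a SIMD
// instruction reads. Name and signature are illustrative assumptions.
std::size_t memOperandSizeInBytes(std::size_t vectorSizeInBytes,
                                  std::size_t elementSizeInBytes,
                                  bool        isEmbeddedBroadcast)
{
    assert(elementSizeInBytes > 0);
    assert(vectorSizeInBytes % elementSizeInBytes == 0);

    // With embedded broadcast the instruction loads a single element and
    // replicates it, so the pointer refers to one scalar, not a vector.
    return isEmbeddedBroadcast ? elementSizeInBytes : vectorSizeInBytes;
}
```

For a `Vector256<float>` operand, for example, this model yields 32 bytes normally and 4 bytes under embedded broadcast.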
* apply format patch
* bug fixes
* bug fixes
* apply format patch
* Enable embedded broadcast for Vector128<float>.Add
* Enable embedded broadcast for Vector512<float>.Add
* add embedded broadcast support for double
* Add EB support to AVX_BroadcastScalarToVector*
* apply format patch
* Enable embedded broadcast for double const vector
* Enable embedded broadcast for integer Add.
* Changes based on the review:
1. Change GenTreeHWIntrinsic::OperIsEmbBroadcastHWIntrinsic
to OperIsEmbBroadcastCompatible
2. removed OperIsBroadcastScalar
3. formatting
4. correct errors in the comments.
* removed the gentree flag: GTF_VECCON_FROMSCALAR
* Bug fixes on embedded broadcast with AVX_Broadcast
* enable embedded broadcast in R_R_A path
* apply format patch
* bug fixes:
re-introduce "OperIsBroadcastScalar":
in some cases a non-broadcast node (e.g. Load, Read) is
contained by an embedded broadcast operation and embedded broadcast
is enabled unexpectedly; this method filters out those cases.
* Changes based on reviews:
1. code style improvement
2. fixed typos and errors in the comments.
3. extract the operand swap logic when lowering Create node into
a function: TryCanonizeEmbBroadcastCandicate()
* unfold VecCon node when lowering if this node is
eligible for embedded broadcast.
* apply format patch
* bug fixes:
1. added missing default branch
2. filter out some possible embedded broadcast cases
for some better optimization
* resolve the mishandling for the previous conflict.
* move the unfolding logic to ContainChecks
* Code changes based on the review
* apply format patch
* support embedded broadcast for GT_IND
as the operand of a broadcast node.
* bug fixes:
Long type should only be used on 64-bit systems.
* apply format patch
* Introduce MakeHWIntrinsicSrcContained():
This function handles the case where a constant vector
is the operand of an embedded broadcast op.
If the constant vector is eligible for embedded broadcast,
it is unfolded into the corresponding broadcast
intrinsic form.
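The eligibility test described above amounts to checking that every element of the constant vector holds the same value, so that a single scalar can reproduce it. The sketch below shows that check on raw bytes; `isBroadcastOfOneElement` is a hypothetical helper, not a function from this change, and the real lowering code applies further conditions (for example on element type) that are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Returns true when `bytes` consists of one element of `elementSize` bytes
// repeated across the whole vector. The comparison is bitwise, so +0.0f and
// -0.0f are treated as different values.
bool isBroadcastOfOneElement(const std::vector<unsigned char>& bytes, std::size_t elementSize)
{
    assert(elementSize > 0);
    if (bytes.empty() || bytes.size() % elementSize != 0)
    {
        return false;
    }

    // Compare every later element against the first one.
    for (std::size_t offset = elementSize; offset < bytes.size(); offset += elementSize)
    {
        if (std::memcmp(bytes.data(), bytes.data() + offset, elementSize) != 0)
        {
            return false;
        }
    }
    return true;
}
```

A bitwise comparison is the conservative choice here: the broadcast must reproduce the constant exactly, not merely a value that compares equal.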
* Code changes based on reviews:
1. a helper function to detect embedded broadcast compatible flag
2. contain logic improvement.
3. typo fixes.
* Code changes based on review
* apply format patch
* Code changes based on review:
1. deleted irrelevant comments.
2. Move the contain check up to cover more cases.
* Code changes based on review:
1. Update comment to keep up with the changes in InstrDesc.
2. Removed unneeded argument in the irrelevant method.
1 parent e126ca3 commit 1e029d0
13 files changed in src/coreclr/jit: +566 -62 lines changed