Improve enum decoding -- alternative with binary search instead of map lookup #985

osa1 · 2025-05-09T15:44:41Z

This PR is the same as #981, except instead of map lookup we do binary search.

On Wasm this is slightly faster than #981 for sparse enums and slightly slower for dense ones. On native (AOT) it is slightly slower than #981 for both.
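To make the two lookup strategies concrete, here is a minimal sketch in Python rather than Dart. The names (`SPARSE_ENUM`, `lookup_by_map`, `lookup_by_binary_search`) are hypothetical and not taken from the package; the sketch only assumes that the known enum numbers are kept sorted next to their values.

```python
from bisect import bisect_left

# Hypothetical sparse enum: (number, name) pairs, sorted by number.
SPARSE_ENUM = [(-5, "NEG"), (1, "A"), (100, "B"), (4000, "C")]

NUMBERS = [number for number, _ in SPARSE_ENUM]
NAMES = [name for _, name in SPARSE_ENUM]
BY_NUMBER = dict(SPARSE_ENUM)


def lookup_by_map(number):
    """Map lookup; returns None for unknown enum numbers."""
    return BY_NUMBER.get(number)


def lookup_by_binary_search(number):
    """Binary search over the sorted numbers; returns None if not found."""
    i = bisect_left(NUMBERS, number)
    if i < len(NUMBERS) and NUMBERS[i] == number:
        return NAMES[i]
    return None
```

Both functions return the same result for every input; they differ only in cost, which is what the benchmarks below compare.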

Wasm results:

// PR 981
protobuf_PackedEnumDecoding(RunTimeRaw): 36483.33333333333 us.
protobuf_PackedSparseEnumDecoding(RunTimeRaw): 56450.0 us.

// This PR
protobuf_PackedEnumDecoding(RunTimeRaw): 38016.66666666667 us.
protobuf_PackedSparseEnumDecoding(RunTimeRaw): 53750.0 us.

AOT results:

// PR 981
protobuf_PackedEnumDecoding(RunTimeRaw): 42096.54 us.
protobuf_PackedSparseEnumDecoding(RunTimeRaw): 55845.825 us.

// This PR
protobuf_PackedEnumDecoding(RunTimeRaw): 44518.14 us.
protobuf_PackedSparseEnumDecoding(RunTimeRaw): 60258.2 us.

Note: in the benchmarks we have a small number (13) of known values in the enum. I wonder if the benchmark results would change in favor of one or the other with a larger number of known values.

This is an alternative to google#980 that doesn't make a big difference in the performance of AOT-compiled benchmarks, but makes a big difference when compiling to Wasm, compared to google#980.

When decoding an enum value, we call a callback in the enum field's `FieldInfo`. The callback then indexes a map mapping enum numbers to Dart values.

When these conditions hold:

- The known enum numbers are all positive. (so that we can use a list and index it with the number)
- The number of known enum numbers is more than 70% of the largest known enum number. (so that the list won't have a lot of `null` entries, wasting heap space)

We now generate a list instead of a map to map enum numbers to Dart values.

Note: similar to the map, the list is runtime allocated. No new code is generated per message or enum type.

Wasm benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 48200.0 us`
- PR google#980: `protobuf_PackedEnumDecoding(RunTimeRaw): 42120.0 us`
  - Diff: -12.6%
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 35733.3 us`
  - Diff against PR google#980: -15%
  - Diff against master: -25%

AOT benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 49180.0 us`
- PR google#980: `protobuf_PackedEnumDecoding(RunTimeRaw): 45726.82 us`
  - Diff: -7%
- This PR: `protobuf_PackedEnumDecoding(RunTimeRaw): 42929.7 us`
  - Diff against PR google#980: -6%
  - Diff against master: -12%
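The list-or-map decision described in this commit message can be sketched roughly as follows. This is an illustration in Python, not the package's Dart code: `build_enum_lookup`, `lookup`, and `min_density` are hypothetical names, and treating 0 as an allowed enum number is an assumption of the sketch.

```python
def build_enum_lookup(values, min_density=0.7):
    """Return a list indexed by enum number if the enum is dense, else a dict.

    `values` maps enum numbers to enum values. The enum counts as dense when
    no number is negative (assumption: 0 is allowed) and the count of known
    numbers exceeds `min_density` times the largest known number.
    """
    numbers = list(values)
    largest = max(numbers, default=-1)
    dense = (
        bool(numbers)
        and min(numbers) >= 0
        and len(numbers) > min_density * largest
    )
    if dense:
        table = [None] * (largest + 1)  # gaps stay None
        for number, value in values.items():
            table[number] = value
        return table
    return dict(values)


def lookup(table, number):
    """Look up an enum number; returns None for unknown numbers."""
    if isinstance(table, list):
        # Range check first, so negative or too-large numbers never index.
        return table[number] if 0 <= number < len(table) else None
    return table.get(number)
```

The range check in `lookup` matters for the list case: a number outside the table has to be reported as unknown rather than indexing past either end.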

…ng_2

With an upcoming change we'll improve the decoding performance of enums, but there will be a difference between "sparse" and "dense" enum decoding performance, even though both will be faster. To be able to measure the difference, add a "sparse" enum type and a benchmark for decoding it.

"Sparse" means the enum has large gaps between known enum values, or negative enum values. When decoding this kind of enum, the mapping from the wire `varint` to the Dart value for the enum needs to be done by binary search, map lookup, or similar.

For "dense" enums, we can have a list of enum values and index the list directly with the `varint` value, after a range check.

These changes will be done in the follow-up PR(s).
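The wire-level step that comes before either lookup can be sketched as below. This is a Python illustration, not the package's Dart decoder, and `decode_packed_enum_numbers` is a hypothetical name. It assumes the input is the payload of a packed repeated enum field, that is, a concatenation of varints with the field tag and length prefix already stripped.

```python
def decode_packed_enum_numbers(payload: bytes):
    """Decode a packed enum payload into signed 32-bit enum numbers."""
    numbers = []
    value = 0
    shift = 0
    for byte in payload:
        value |= (byte & 0x7F) << shift  # low 7 bits carry data
        if byte & 0x80:  # continuation bit: more bytes follow
            shift += 7
            continue
        # Varint complete: keep the low 32 bits and reinterpret as signed.
        value &= 0xFFFFFFFF
        if value >= 0x80000000:
            value -= 0x100000000
        numbers.append(value)
        value = 0
        shift = 0
    if shift:
        raise ValueError("truncated varint")
    return numbers
```

Negative enum numbers are sign-extended on the wire and occupy ten bytes each, so an enum with negative values cannot be handled by directly indexing a list with the decoded number.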

…ng_3

osa1 · 2025-05-13T19:38:12Z

Closing in favor of #981.

When decoding an enum value, we call a callback in the enum field's `FieldInfo`. The callback then indexes a map mapping enum numbers to Dart values.

When these conditions hold:

- The known enum numbers are all positive. (so that we can use a list and index it with the number)
- The number of known enum numbers is more than 70% of the largest known enum number. (so that the list won't have a lot of `null` entries, wasting heap space)

We now generate a list instead of a map to map enum numbers to Dart values.

Similar to the map, the list is runtime allocated. No new code is generated per message or enum type.

AOT benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 47585.14 us.`
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 38974.566666666666 us.`
- Diff: -18%

Wasm benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 52225.0 us.`
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 34283.33333333333 us.`
- Diff: -34%

**Alternatives considered:**

- #980 uses a map always, but eliminates the `valueOf` closure.
- #985 uses a list always, and does binary search in the list when the list is "shallow".
- #987 is the same as #985, but instead of calling the `valueOf` closure it stores an extra field in `FieldInfo`s for whether to binary search or directly index.

These are all slower than the current PR.

osa1 added 10 commits May 8, 2025 10:47

Merge remote-tracking branch 'origin/master' into improve_enum_decodi…

1f2064d

…ng_2

Merge branch 'sparse_enum_benchmarks' into improve_enum_decoding_2

947dee6

Use binary search on sparse lists

65ee8cc

Merge remote-tracking branch 'origin/master' into improve_enum_decodi…

c9c222e

…ng_3

Fix formatting

468128e

Hide ProtobufEnum generated code interface

cc91f0a

Documentation formatting

d4043e9

Hide $_binarySearch too

e1dfbaf

This was referenced May 12, 2025

Improved enum decoding 4 #987

Closed

Improve enum decoding #980

Closed

Remove invalid assertion

d6345a8

osa1 mentioned this pull request May 12, 2025

Improve enum decoding -- alternative #981

Merged

osa1 closed this May 13, 2025

osa1 deleted the improve_enum_decoding_3 branch May 13, 2025 19:38
