Skip to content

Improve enum decoding -- alternative with binary search instead of map lookup #985

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

osa1
Copy link
Member

@osa1 osa1 commented May 9, 2025

This PR is the same as #981, except instead of map lookup we do binary search.

This is slightly faster than #981 on Wasm, and slightly slower than #981 on native.

Wasm results:

// PR 981
protobuf_PackedEnumDecoding(RunTimeRaw): 36483.33333333333 us.
protobuf_PackedSparseEnumDecoding(RunTimeRaw): 56450.0 us.

// This PR
protobuf_PackedEnumDecoding(RunTimeRaw): 38016.66666666667 us.
protobuf_PackedSparseEnumDecoding(RunTimeRaw): 53750.0 us.

AOT results:

// PR 981
protobuf_PackedEnumDecoding(RunTimeRaw): 42096.54 us.
protobuf_PackedSparseEnumDecoding(RunTimeRaw): 55845.825 us.

// This PR
protobuf_PackedEnumDecoding(RunTimeRaw): 44518.14 us.
protobuf_PackedSparseEnumDecoding(RunTimeRaw): 60258.2 us.

Note: in the benchmarks we have a small number (13) known values in the enum. I wonder if the benchmark results change in favor of one or the other with larger number of known values.

osa1 added 10 commits May 8, 2025 10:47
This is an alternative to google#980 that doesn't make a big difference in
terms of performance of AOT compiled benchmarks, but makes a big
difference when compiling to Wasm, comapred to google#980.

When decoding an enum value, we call a callback in the enum field's
`FieldInfo`. The callback then indexes a map mapping enum numbers to
Dart values.

When these conditions hold:

- The known enum numbers are all positive. (so that we can use a list
  and index it with the number)

- The known enum numbers are more than 70% of the large known enum
  number. (so that the list won't have a lot of `null` entries, wasting
  heap space)

We now generate a list instead of a map to map enum numbers to Dart
values.

Note: similar to the map, the list is runtime allocated. No new code
generated per message or enum type.

Wasm benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 48200.0 us`
- PR google#980: `protobuf_PackedEnumDecoding(RunTimeRaw): 42120.0 us`
    - Diff: -12.6%
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 35733.3 us`
    - Diff against PR google#980: -15%
    - Diff against master: -25%

AOT benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 49180.0 us`
- PR google#980: `protobuf_PackedEnumDecoding(RunTimeRaw): 45726.82 us`
    - Diff: -7%
- This PR: `protobuf_PackedEnumDecoding(RunTimeRaw): 42929.7 us`
    - Diff against PR google#980: -6%
    - Diff agianst master: -12%
With upcoming change we'll improve decoding performance of enums, but
there will be a difference between "sparse" and "dense" enum decoding
performance even though they'll both be faster.

To be able to measure the difference add a "sparse" enum type and a
benchmark for decoding it.

"Sparse" means the enum has large gaps between known enum values, or
negative enum values.

When decoding this kind of enums, the mapping from the wire `varint` to
the Dart value for the enum needs to be done by binary search, map
lookup, or similar.

For "dense" enums, we can have a list of enum values and index the list
directly with the `varint` value, after a range check.

These changes will be done in the follow-up PR(s).
This was referenced May 12, 2025
@osa1
Copy link
Member Author

osa1 commented May 13, 2025

Closing in favor or #981.

@osa1 osa1 closed this May 13, 2025
@osa1 osa1 deleted the improve_enum_decoding_3 branch May 13, 2025 19:38
osa1 added a commit that referenced this pull request May 15, 2025
When decoding an enum value, we call a callback in the enum field's `FieldInfo`. The callback then indexes a map mapping enum numbers to Dart values.

When these conditions hold:

- The known enum numbers are all positive. (so that we can use a list and index it with the number)

- The known enum numbers are more than 70% of the large known enum number. (so that the list won't have a lot of `null` entries, wasting heap space)

We now generate a list instead of a map to map enum numbers to Dart values.

Similar to the map, the list is runtime allocated. No new code generated per message or enum type.

AOT benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 47585.14 us.`
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 38974.566666666666 us.`
- Diff: -18%

Wasm benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 52225.0 us.`
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 34283.33333333333 us.`
- Diff: -34%

**Alternatives considered:**

- #980 uses a map always, but eliminates the `valueOf` closure.
- #985 uses a list always, and does binary search in the list when the list is "shallow".
- #987 is the same as #985, but instead of calling the `valueOf` closure it stores an extra field in `FieldInfo`s for whether to binary search or directly index.

These are all slower than the current PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant