Improve enum decoding -- alternative #981

Merged: 17 commits, May 15, 2025

@osa1 (Member) commented May 8, 2025

When decoding an enum value, we call a callback in the enum field's FieldInfo. The callback then indexes a map mapping enum numbers to Dart values.

When these conditions hold:

  • The known enum numbers are all non-negative. (so that we can use a list and index it with the number)

  • The number of known enum numbers is more than 70% of the largest known enum number. (so that the list won't have a lot of null entries, wasting heap space)

We now generate a list instead of a map to map enum numbers to Dart values.

Similar to the map, the list is allocated at runtime. No new code is generated per message or enum type.
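
The two conditions above can be sketched as a single check. This is only an illustration of the heuristic as described, not the code in this PR: the name `canUseListLookup` is hypothetical, the input is assumed to contain distinct enum numbers, and zero is assumed to be a usable list index.

```dart
/// Hypothetical helper: decides whether an enum's numbers are "dense" enough
/// to be looked up through a list instead of a map.
/// Assumes `enumNumbers` contains distinct values.
bool canUseListLookup(List<int> enumNumbers) {
  if (enumNumbers.isEmpty) return false;
  var largest = 0;
  for (final n in enumNumbers) {
    // A negative number cannot be used as a list index.
    if (n < 0) return false;
    if (n > largest) largest = n;
  }
  // Require the count of known numbers to exceed 70% of the largest number,
  // so that few list slots stay null. Integer math avoids rounding issues.
  return enumNumbers.length * 10 > largest * 7;
}
```

For example, an enum numbered 0 through 4 passes the check, while one with numbers 0, 1 and 1000 falls back to the map.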

AOT benchmarks:

  • Before: protobuf_PackedEnumDecoding(RunTimeRaw): 47585.14 us.
  • After: protobuf_PackedEnumDecoding(RunTimeRaw): 38974.566666666666 us.
  • Diff: -18%

Wasm benchmarks:

  • Before: protobuf_PackedEnumDecoding(RunTimeRaw): 52225.0 us.
  • After: protobuf_PackedEnumDecoding(RunTimeRaw): 34283.33333333333 us.
  • Diff: -34%

Alternatives considered:

These are all slower than the current PR.


cl/757724889

This is an alternative to google#980 that doesn't make a big difference in
terms of performance of AOT compiled benchmarks, but makes a big
difference when compiling to Wasm, compared to google#980.

When decoding an enum value, we call a callback in the enum field's
`FieldInfo`. The callback then indexes a map mapping enum numbers to
Dart values.

When these conditions hold:

- The known enum numbers are all non-negative. (so that we can use a list
  and index it with the number)

- The number of known enum numbers is more than 70% of the largest known
  enum number. (so that the list won't have a lot of `null` entries,
  wasting heap space)

We now generate a list instead of a map to map enum numbers to Dart
values.

Note: similar to the map, the list is allocated at runtime. No new code
is generated per message or enum type.

Wasm benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 48200.0 us`
- PR google#980: `protobuf_PackedEnumDecoding(RunTimeRaw): 42120.0 us`
    - Diff: -12.6%
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 35733.3 us`
    - Diff against PR google#980: -15%
    - Diff against master: -25%

AOT benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 49180.0 us`
- PR google#980: `protobuf_PackedEnumDecoding(RunTimeRaw): 45726.82 us`
    - Diff: -7%
- This PR: `protobuf_PackedEnumDecoding(RunTimeRaw): 42929.7 us`
    - Diff against PR google#980: -6%
    - Diff against master: -12%

@osa1 osa1 requested a review from mkustermann May 8, 2025 09:57

@osa1 (Member, Author) commented May 9, 2025

@mkustermann this change performs better, and while it's also a breaking change (it adds a new member to a public class that can be extended), in practice this won't break anything and it doesn't require a major version bump.

@mkustermann (Collaborator) left a comment

I'd be ok with landing this, but see my comments on the other PR. If they make sense, maybe that could be an even better solution?

osa1 added 2 commits May 9, 2025 15:42
With an upcoming change we'll improve the decoding performance of enums,
but there will be a difference between "sparse" and "dense" enum decoding
performance even though they'll both be faster.

To be able to measure the difference, add a "sparse" enum type and a
benchmark for decoding it.

"Sparse" means the enum has large gaps between known enum values, or
negative enum values.

When decoding this kind of enum, the mapping from the wire `varint` to
the Dart value for the enum needs to be done by binary search, map
lookup, or similar.

For "dense" enums, we can have a list of enum values and index the list
directly with the `varint` value, after a range check.

These changes will be done in the follow-up PR(s).
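
The two lookup strategies described above might look roughly like the sketch below. The function names and the parallel-list layout for the sparse case are assumptions made for illustration; the follow-up PRs may implement the lookups differently.

```dart
/// Dense lookup: index the list directly after a range check.
/// Returns null for numbers outside the list or for unknown slots.
T? lookupDense<T>(List<T?> byValue, int number) {
  if (number < 0 || number >= byValue.length) return null;
  return byValue[number];
}

/// Sparse lookup: binary search over enum numbers sorted in ascending
/// order, with the matching value stored at the same position in `values`.
/// Returns null when the number is not known.
T? lookupSparse<T>(List<int> sortedNumbers, List<T> values, int number) {
  var lo = 0;
  var hi = sortedNumbers.length - 1;
  while (lo <= hi) {
    final mid = lo + ((hi - lo) >> 1);
    final candidate = sortedNumbers[mid];
    if (candidate == number) return values[mid];
    if (candidate < number) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return null;
}
```

The dense path costs one comparison pair and one indexed load per decoded value, whereas the sparse path needs a number of steps logarithmic in the count of known values, which is why the two cases are benchmarked separately.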

@osa1 (Member, Author) commented May 12, 2025

@mkustermann I tried:

Among these, this PR makes the most sense to merge, as it's backwards compatible, and it performs the best.

If you could take another look I'd like to merge this.

In the meantime I'll test it internally.

@osa1 osa1 requested a review from mkustermann May 12, 2025 11:31

@osa1 (Member, Author) commented May 13, 2025

Internal testing reveals a failure which may be related. I'll investigate further before merging this.

@osa1 (Member, Author) commented May 15, 2025

> Internal testing reveals a failure which may be related. I'll investigate further before merging this.

This turned out to be some unsound user code compiled with dart2js unsound mode. Not an issue with this change. Merging.

@osa1 osa1 merged commit a6c8e56 into google:master May 15, 2025
17 checks passed
@osa1 osa1 deleted the improve_enum_decoding_2 branch May 15, 2025 09:33

@devoncarew (Collaborator) commented

@osa1 - I'm seeing some issue w/ this change locally when generating the firestore API.

For ProtobufEnum subclasses, the generated code is:

  ...
  static final $core.List<TargetChange_TargetChangeType?> _byValue = $pb.ProtobufEnum.$_initByValueList(values, 4);
  ...

But the ProtobufEnum.$_initByValueList reference doesn't exist:

The method '$_initByValueList' isn't defined for the type 'ProtobufEnum'.

It looks like ProtobufEnum.$_initByValueList() was added to package:protobuf. I think we'll want to rev the package version (to a new minor version) + changelog, and plan to publish concurrent w/ the next protoc_plugin publish.
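
For readers unfamiliar with the generated line above, here is a rough sketch of what a call shaped like `$_initByValueList(values, 4)` could plausibly compute, assuming the second argument is the largest enum number. `SimpleEnum` and `initByValueList` are hypothetical stand-ins written for this illustration; the actual `ProtobufEnum` member in `package:protobuf` may differ.

```dart
/// Hypothetical stand-in for a generated enum value; not the real
/// `ProtobufEnum` class.
class SimpleEnum {
  final int value;
  final String name;
  const SimpleEnum(this.value, this.name);
}

/// Hypothetical helper: builds a list of length `maxValue + 1` in which
/// slot `i` holds the enum value numbered `i`, or null if `i` is unknown.
/// Assumes every value's number lies in the range 0..maxValue.
List<T?> initByValueList<T extends SimpleEnum>(List<T> values, int maxValue) {
  final byValue = List<T?>.filled(maxValue + 1, null);
  for (final v in values) {
    byValue[v.value] = v;
  }
  return byValue;
}
```

Because the helper lives in the runtime library while the call site lives in generated code, regenerated protos need a library version that already ships it, which is the mismatch reported here.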

@osa1 (Member, Author) commented May 16, 2025

@devoncarew right, sorry, I should've documented this in the changelog probably.

This changes both the library and the plugin. Library changes are backwards compatible: if you don't change your generated classes, you can update the library.

But if you update the plugin and re-generate your protos, then you need the new version of the library.

So I think this should be a minor version bump in the library and a major one in the plugin.

osa1 added a commit to osa1/protobuf.dart that referenced this pull request May 16, 2025
devoncarew pushed a commit that referenced this pull request May 16, 2025