Skip to content

Conversation

@MarcoPolo
Copy link
Contributor

@MarcoPolo MarcoPolo commented Apr 30, 2025

An alternative to #607 with a small refactor that I think makes the batch logic more obvious.

I recommend reviewing initially with the ignore whitespace parameter (?w=1)

Here's what I like about these changes compared to #607

  • MessageBatch struct is simpler. It now only holds a list of messages and the rpc scheduler to use (previously called publish strategy). There is no longer a need for the waitgroup because we add the rpcs and publish the rpcs synchronously.
  • We refactor some code to use Go's iterators (from 1.23). This cleans up the rpc queuing logic and publishing logic.
  • A zero value MessageBatch is valid, so we don't need a MessageBatch constructor.

Some notes that may not be obvious by just looking at the diff.

  • Unless we are willing to refactor the validation pipeline, topic, and the pubsub code, it makes the most sense to reference the batch within the Message struct. This isn't really that bad. It's reasonable for a message to have a reference to the batch that it is a part of. Even if we were willing to refactor a lot of code, I'm not sure it's worth it. We don't want a message tied to a batch to exercise significantly different code flows.
    • Edit: I managed this with a couple of smaller than I expected changes.
  • It's a better user experience if a user gets an error if a message fails do validate when adding that message to the batch, not when publishing the message.

I think this approach is better so I'm going to update #607 with this code

to support batch publishing messages
@MarcoPolo MarcoPolo requested a review from raulk April 30, 2025 03:56
@MarcoPolo MarcoPolo force-pushed the marco/batch-publishing-2 branch from 6b2b624 to 4ae9632 Compare April 30, 2025 04:02
@MarcoPolo MarcoPolo force-pushed the marco/batch-publishing-2 branch from 4ae9632 to 0cda186 Compare April 30, 2025 17:09
@sukunrt sukunrt self-requested a review April 30, 2025 18:37
@MarcoPolo
Copy link
Contributor Author

Chatting with @sukunrt and he brought up the good point that sendRPC is not safe for concurrent calls, so this code as-is is racey. We also discussed some alternative designs which I’ll draft up next.

@MarcoPolo MarcoPolo marked this pull request as draft April 30, 2025 18:40
@MarcoPolo MarcoPolo force-pushed the marco/batch-publishing-2 branch from 534d7a3 to 1cc1308 Compare April 30, 2025 21:02
@MarcoPolo
Copy link
Contributor Author

MarcoPolo commented Apr 30, 2025

Closing this PR as I'll merge these changes to #607. I like the approach here better. And this avoids having multiple PRs for the same thing

@MarcoPolo MarcoPolo closed this Apr 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants