perf: avoid unnecessary buffer copy in internalWrite #1013
See the code comments for details.
a428ff3 to a7173ff
// and does not need to be converted again.
// @ts-expect-error TS2367 'encoding' can be 'buffer', but it's not in the
// official type definition
const buffer = encoding === "buffer" ? chunk : Buffer.from(chunk, encoding);
That's fundamentally different.
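const stream = require('node:stream');

const t = new stream.Transform({
  transform(chunk, encoding, callback) {
    // With the default decodeStrings setting, every chunk arrives as a
    // Buffer and encoding is reported as 'buffer'.
    console.log(chunk, encoding);
    callback();
  }
});
t.write('hello');
// Prints: <Buffer 68 65 6c 6c 6f> buffer
t.write(Buffer.from('hello'));
// Prints: <Buffer 68 65 6c 6c 6f> buffer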
The 'buffer' value is how Node.js streams communicate that the chunk is not an encoded string and does not need to be decoded. In this case it means you do not need to copy the chunk by using Buffer.from(chunk, encoding).
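To make the copy-avoidance point concrete, here is a minimal sketch. The helper name toBuffer is hypothetical and exists only for illustration; it mirrors the one-line conditional from the diff above and is not the project's actual function. It assumes the chunk is already a Buffer whenever the encoding is 'buffer'.

```javascript
// Hypothetical helper (name is an assumption) mirroring the conditional
// in the diff: reuse the chunk when the stream reports 'buffer'.
function toBuffer(chunk, encoding) {
  return encoding === "buffer" ? chunk : Buffer.from(chunk, encoding);
}

const original = Buffer.from("hello");

// 'buffer' encoding: the very same Buffer instance is returned, no copy.
const reused = toBuffer(original, "buffer");

// A real string encoding: the string is decoded into a new Buffer.
const decoded = toBuffer("hello", "utf8");

// For contrast, Buffer.from(buffer) always allocates and copies.
const copied = Buffer.from(original);

console.log(reused === original); // same instance
console.log(copied === original); // different instance, same bytes
```

The identity check is the point here: the conditional skips an allocation and a memcpy for every chunk that is already a Buffer, while string chunks still go through the normal decoding path.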
Could you add more info about the "buffer" encoding?
It does not seem to be supported on Node 22.16.0 (see inline comment).
// We already have the data as a buffer, let's push it as is to avoid
// unnecessary additional conversion down the stream pipeline.
// @ts-expect-error TS2345 'buffer' is not in the official type definition
this.push(buffer, "buffer");
I have tried push(buffer, "whatever") and this works, so the encoding only seems to matter for string values.
The docs say "Encoding of string chunks. Must be a valid Buffer encoding, such as 'utf8' or 'ascii'." - and indeed invalid encodings fail for strings.
But using a non existent encoding for a buffer looks like UB that we should not rely on.
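The observation above can be reproduced with a small sketch. The no-op read() source and the variable names are assumptions made for illustration; only Readable, push, and read are real Node.js APIs. It shows that the encoding argument plays no part in handling a Buffer chunk, while a string chunk with an unknown encoding is rejected.

```javascript
const { Readable } = require("node:stream");

// Hypothetical minimal source: read() is a no-op because chunks are
// pushed manually below.
const source = new Readable({ read() {} });

const payload = Buffer.from("hello");

// The encoding label is not used to decode a chunk that is already a Buffer.
source.push(payload, "buffer");

// Pull the buffered data back out of the stream.
const out = source.read();
console.log(out.toString("utf8"));

// For comparison: a string chunk must name a valid Buffer encoding.
let stringRejected = false;
try {
  source.push("hello", "whatever");
} catch (err) {
  stringRejected = true;
}
console.log(stringRejected);
```

This only demonstrates the behavior being discussed; whether passing the label is worth keeping for readability is the judgment call debated in the thread.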
It's not undefined behavior at all. It's directly handled in the code. It's not a "non existent encoding".
Passing 'buffer' to push is allowed even if not strictly necessary. I included it here so that the intent is absolutely clear. It has no cost, is not an error, and just works. There will never be a case when push(buffer, 'buffer') means anything other than buffer being a Buffer.
Thanks for the additional details @jasnell. IIUC the first change makes sense to avoid having to copy the buffer when it is already a buffer. Still not clear about
Can you measure any perf difference for that second change?
'buffer' is a legit encoding value in _transform.
It does not seem to affect push and is documented as not supported.
One challenge/question: it's not clear that this function is actually being tested with pnpm run test ... to check, I added a throw new Error('boom') and ran pnpm run test, and everything still passed.
/cc @vicb @anonrig