
EncoderNLS.Convert doesn't always out the correct value for 'completed' #12183

@GrabYourPitchforks

Description

Repro code:

using System;
using System.Text;

// ASCII encoding whose encoder fallback substitutes "hello" for unencodable input.
var encoding = Encoding.GetEncoding("us-ascii", new EncoderReplacementFallback("hello"), DecoderFallback.ExceptionFallback);
var encoder = encoding.GetEncoder();

// Pass a lone high surrogate with flush: false and a 2-byte output buffer.
encoder.Convert(new[] { '\ud800' }, 0, 1, new byte[2], 0, 2, flush: false, out int charsUsed, out int bytesUsed, out bool completed);

Console.WriteLine($"charsUsed: {charsUsed}");
Console.WriteLine($"bytesUsed: {bytesUsed}");
Console.WriteLine($"completed: {completed}");

Expected output:

charsUsed: 1
bytesUsed: 0
completed: False

Actual output:

charsUsed: 1
bytesUsed: 0
completed: True

The culprit seems to be the line below.

https://github.com/dotnet/coreclr/blob/77fcaf6b738941a0c5dc3c00b70b49c7d9f63b69/src/System.Private.CoreLib/shared/System/Text/EncoderNLS.cs#L201

The values flush and this.HasState can never both be true at the same time, because specifying flush = true mandates that the EncoderNLS instance not store any remaining high surrogate character for future invocations. Since (flush && this.HasState) == false always, De Morgan's law gives (!flush || !this.HasState) == true always, so this clause contributes nothing to the final value of completed.
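
To make the argument concrete, here is a minimal sketch that enumerates every combination of the two booleans. CompletedClause and its members are hypothetical names introduced only for illustration; they model the clause in isolation and are not part of EncoderNLS. The reachability rule is an assumption taken directly from the reasoning above.

```csharp
using System;

// Hypothetical helper, not part of EncoderNLS: it models only the boolean
// clause discussed above.
static class CompletedClause
{
    // The clause under discussion: (!flush || !hasState).
    public static bool Evaluate(bool flush, bool hasState) => !flush || !hasState;

    // Assumption from the issue text: flush == true means the encoder keeps
    // no pending high surrogate, so (flush && hasState) never occurs.
    public static bool IsReachable(bool flush, bool hasState) => !(flush && hasState);

    // Prints one line per combination, marking the unreachable one.
    public static void PrintTruthTable()
    {
        foreach (bool flush in new[] { false, true })
        foreach (bool hasState in new[] { false, true })
        {
            string note = IsReachable(flush, hasState) ? "reachable" : "unreachable";
            Console.WriteLine($"flush={flush}, hasState={hasState}: clause={Evaluate(flush, hasState)} ({note})");
        }
    }
}
```

The only combination for which the clause evaluates to false is flush = true with hasState = true, which is exactly the combination that cannot occur. In every reachable state the clause is true.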

I believe the appropriate fix is to ignore the flush parameter entirely and to look only at three conditions: (a) all chars were consumed, (b) there's no leftover state on the EncoderNLS instance, and (c) there's no leftover state in the fallback buffer.
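
A rough sketch of what that check could look like follows. ProposedCompleted.Compute and its parameter names are hypothetical and do not mirror the actual EncoderNLS fields; the sketch only models the three conditions listed above as plain values.

```csharp
using System;

// Hypothetical model of the proposed check. The parameters stand in for the
// encoder's internal state and are illustrative only.
static class ProposedCompleted
{
    public static bool Compute(int charsUsed, int charCount, bool encoderHasState, int fallbackRemaining)
    {
        bool allCharsConsumed = charsUsed == charCount;   // (a) all chars were consumed
        bool noEncoderState = !encoderHasState;           // (b) no leftover high surrogate
        bool noFallbackState = fallbackRemaining == 0;    // (c) fallback buffer is drained

        // The flush parameter is deliberately not consulted.
        return allCharsConsumed && noEncoderState && noFallbackState;
    }
}
```

For the repro above, assuming the encoder retains the lone high surrogate as state, the inputs would be charsUsed = 1, charCount = 1, encoderHasState = true, fallbackRemaining = 0, and the result would be false, which matches the expected output.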
