[release/9.0] [Tar] Fill in the file size even if the file is empty. #107633

github-actions · 2024-09-10T17:31:23Z

Backport of #106409 to release/9.0

Customer Impact

Customer reported
Found internally

Reported in #95354

When TarWriter writes a regular file entry that has a size field zero, the entry ends up badly formatted.

MemoryStream output = new();
TarWriter archive = new(output, TarEntryFormat.Pax);
PaxTarEntry te = new(TarEntryType.RegularFile, "zero_size")
{
    DataStream = new MemoryStream(0)
};
archive.WriteEntry(te);

// Get the bytes stored in the size field in the header metadata.
// Fixed known location after name, mode, uid and gid fields.
var offset = 2*512+100+3*8;
var sizeBuffer = output.GetBuffer()[offset..(offset+12)];

byte[] expected = new byte[] {0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0};
Console.WriteLine($"Proper size field: {sizeBuffer.SequenceEqual(expected)}");

// Expected behavior:
// The comparison returns true because it's all "0" (char 0x32) followed by "\0": "00000000000\0"

// Actual behavior:
// The comparison returns false because it's just binary zeros.

The fix consists in making sure that when we write an entry, we write a value in the size metadata field not only when the data stream has a size greater than zero, but also when the size is zero and the data stream is not null. In any other case, we keep not writing anything to that field (like when there's no data stream).

Regression

Yes
No

It's been a bug from the beginning.

Testing

A TarWriter unit test was added to write a regular file entry that met this condition and checks the actual header bytes to confirm they look as expected.

Risk

Low. It's a targeted change for handling a very specific and uncommon case of regular file entry with a zero length data.

…tandard behavior.

Co-authored-by: Carlos Sánchez López <[email protected]>

dotnet-policy-service · 2024-09-10T17:31:54Z

Tagging subscribers to this area: @dotnet/area-system-formats-tar
See info in area-owners.md if you want to be subscribed.

src/libraries/System.Formats.Tar/tests/TarWriter/TarWriter.Tests.cs

ericstj

I support the scenario. We try to be as compatible as possible with file formats - both in what we produce and consume.

Can you let me know if we previously supported opening the bad archives we produced with these empty entries before this change? If we did, can we make sure to add a test to confirm that it continues to work?

If we didn't support reading them before, no need to add the test.

… is correct. Add test to verify that entries created with the malformed size field are still readable after the bug fix.

carlossanlop · 2024-09-11T00:13:15Z

Can you let me know if we previously supported opening the bad archives we produced with these empty entries before this change? If we did, can we make sure to add a test to confirm that it continues to work?

If we didn't support reading them before, no need to add the test.

@ericstj - I tested the behavior that we had before the bug fix when reading entries that had a malformed size field value. I can confirm we were able to read those entries.

I modified the existing test to not only confirm that the bytes in the size field have the new expected value, but that we can also read such entry with a proper TarReader.

I also added a new test that creates an entry, then overwrites the size field value with the old malformed data (pure zeros), then opened the stream with a TarReader, and confirmed that we are still able to read such malformed entries without problem.

ericstj

Thank you for adding the tests.

ericstj · 2024-09-12T17:02:18Z

@artl93 can you review this one for RC2?

sobczyk · 2024-09-13T21:44:50Z

You guys are fast. I was on PTO.
I thought a simple round trip test could also be added for all formats that store file size in that part of header.

carlossanlop · 2024-09-13T22:05:25Z

You guys are fast. I was on PTO. I thought a simple round trip test could also be added for all formats that store file size in that part of header.

The basic header fields are all exactly the same independently of the format, and size is one of those fields that remains unchanged regardless of the format. But you're welcome to add more tests with roundtrips for all formats. I kept my unit tests focused to only V7 as it's the simplest format.

sobczyk and others added 4 commits September 10, 2024 17:31

[Tar] Fill in the file size even if the file is empty. This matches s…

25df872

…tandard behavior.

fix tests, set size only when datastream is not null

5accf39

Add unit tests for size header of an empty file

43df333

Scope disposables

6bca7c2

Co-authored-by: Carlos Sánchez López <[email protected]>

ghost added the area-System.Formats.Tar label Sep 10, 2024

carlossanlop self-assigned this Sep 10, 2024

carlossanlop added the Servicing-consider Issue for next servicing release review label Sep 10, 2024

carlossanlop added this to the 9.0.0 milestone Sep 10, 2024

carlossanlop requested a review from a team September 10, 2024 18:16

campersau reviewed Sep 10, 2024

View reviewed changes

src/libraries/System.Formats.Tar/tests/TarWriter/TarWriter.Tests.cs Outdated Show resolved Hide resolved

ericstj reviewed Sep 10, 2024

View reviewed changes

src/libraries/System.Formats.Tar/tests/TarWriter/TarWriter.Tests.cs Outdated Show resolved Hide resolved

ericstj reviewed Sep 10, 2024

View reviewed changes

Update test to verify entry can be read after confirming the sequence…

4d667ba

… is correct. Add test to verify that entries created with the malformed size field are still readable after the bug fix.

ericstj approved these changes Sep 11, 2024

View reviewed changes

ericstj removed the Servicing-consider Issue for next servicing release review label Sep 12, 2024

ericstj requested a review from artl93 September 12, 2024 17:02

artl93 approved these changes Sep 12, 2024

View reviewed changes

artl93 added the Servicing-approved Approved for servicing release label Sep 12, 2024

carlossanlop merged commit c04c2b7 into release/9.0 Sep 12, 2024

carlossanlop deleted the backport/pr-106409-to-release/9.0 branch September 12, 2024 20:08

carlossanlop mentioned this pull request Sep 13, 2024

Additional Tar tests for filled in size when data is empty #107815

Merged

github-actions bot locked and limited conversation to collaborators Oct 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[release/9.0] [Tar] Fill in the file size even if the file is empty. #107633

[release/9.0] [Tar] Fill in the file size even if the file is empty. #107633

Uh oh!

github-actions bot commented Sep 10, 2024 •

edited by carlossanlop

Loading

Uh oh!

dotnet-policy-service bot commented Sep 10, 2024

Uh oh!

Uh oh!

Uh oh!

ericstj left a comment •

edited

Loading

Uh oh!

carlossanlop commented Sep 11, 2024

Uh oh!

ericstj left a comment

Uh oh!

ericstj commented Sep 12, 2024

Uh oh!

sobczyk commented Sep 13, 2024

Uh oh!

carlossanlop commented Sep 13, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

[release/9.0] [Tar] Fill in the file size even if the file is empty. #107633

[release/9.0] [Tar] Fill in the file size even if the file is empty. #107633

Uh oh!

Conversation

github-actions bot commented Sep 10, 2024 • edited by carlossanlop Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Customer Impact

Regression

Testing

Risk

Uh oh!

dotnet-policy-service bot commented Sep 10, 2024

Uh oh!

Uh oh!

Uh oh!

ericstj left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

carlossanlop commented Sep 11, 2024

Uh oh!

ericstj left a comment

Choose a reason for hiding this comment

Uh oh!

ericstj commented Sep 12, 2024

Uh oh!

sobczyk commented Sep 13, 2024

Uh oh!

carlossanlop commented Sep 13, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

github-actions bot commented Sep 10, 2024 •

edited by carlossanlop

Loading

ericstj left a comment •

edited

Loading