Skip to content

Conversation

@github-actions
Copy link
Contributor

@github-actions github-actions bot commented Sep 10, 2024

Backport of #106409 to release/9.0

/cc @carlossanlop @sobczyk @am11

Customer Impact

  • Customer reported
  • Found internally

Reported in #95354

When TarWriter writes a regular file entry that has a size field zero, the entry ends up badly formatted.

MemoryStream output = new();
TarWriter archive = new(output, TarEntryFormat.Pax);
PaxTarEntry te = new(TarEntryType.RegularFile, "zero_size")
{
    DataStream = new MemoryStream(0)
};
archive.WriteEntry(te);

// Get the bytes stored in the size field in the header metadata.
// Fixed known location after name, mode, uid and gid fields.
var offset = 2*512+100+3*8;
var sizeBuffer = output.GetBuffer()[offset..(offset+12)];

byte[] expected = new byte[] {0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0};
Console.WriteLine($"Proper size field: {sizeBuffer.SequenceEqual(expected)}");

// Expected behavior:
// The comparison returns true because it's all "0" (char 0x32) followed by "\0": "00000000000\0"

// Actual behavior:
// The comparison returns false because it's just binary zeros.

The fix consists in making sure that when we write an entry, we write a value in the size metadata field not only when the data stream has a size greater than zero, but also when the size is zero and the data stream is not null. In any other case, we keep not writing anything to that field (like when there's no data stream).

Regression

  • Yes
  • No

It's been a bug from the beginning.

Testing

A TarWriter unit test was added to write a regular file entry that met this condition and checks the actual header bytes to confirm they look as expected.

Risk

Low. It's a targeted change for handling a very specific and uncommon case of regular file entry with a zero length data.

@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-formats-tar
See info in area-owners.md if you want to be subscribed.

@carlossanlop carlossanlop self-assigned this Sep 10, 2024
@carlossanlop carlossanlop added the Servicing-consider Issue for next servicing release review label Sep 10, 2024
@carlossanlop carlossanlop added this to the 9.0.0 milestone Sep 10, 2024
@carlossanlop carlossanlop requested a review from a team September 10, 2024 18:16
Copy link
Member

@ericstj ericstj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support the scenario. We try to be as compatible as possible with file formats - both in what we produce and consume.

Can you let me know if we previously supported opening the bad archives we produced with these empty entries before this change? If we did, can we make sure to add a test to confirm that it continues to work?

If we didn't support reading them before, no need to add the test.

… is correct.

Add test to verify that entries created with the malformed size field are still readable after the bug fix.
@carlossanlop
Copy link
Contributor

Can you let me know if we previously supported opening the bad archives we produced with these empty entries before this change? If we did, can we make sure to add a test to confirm that it continues to work?

If we didn't support reading them before, no need to add the test.

@ericstj - I tested the behavior that we had before the bug fix when reading entries that had a malformed size field value. I can confirm we were able to read those entries.

I modified the existing test to not only confirm that the bytes in the size field have the new expected value, but that we can also read such entry with a proper TarReader.

I also added a new test that creates an entry, then overwrites the size field value with the old malformed data (pure zeros), then opened the stream with a TarReader, and confirmed that we are still able to read such malformed entries without problem.

Copy link
Member

@ericstj ericstj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding the tests.

@ericstj ericstj removed the Servicing-consider Issue for next servicing release review label Sep 12, 2024
@ericstj
Copy link
Member

ericstj commented Sep 12, 2024

@artl93 can you review this one for RC2?

@ericstj ericstj requested a review from artl93 September 12, 2024 17:02
@artl93 artl93 added the Servicing-approved Approved for servicing release label Sep 12, 2024
@carlossanlop carlossanlop merged commit c04c2b7 into release/9.0 Sep 12, 2024
@carlossanlop carlossanlop deleted the backport/pr-106409-to-release/9.0 branch September 12, 2024 20:08
@sobczyk
Copy link
Contributor

sobczyk commented Sep 13, 2024

You guys are fast. I was on PTO.
I thought a simple round trip test could also be added for all formats that store file size in that part of header.

@carlossanlop
Copy link
Contributor

You guys are fast. I was on PTO. I thought a simple round trip test could also be added for all formats that store file size in that part of header.

The basic header fields are all exactly the same independently of the format, and size is one of those fields that remains unchanged regardless of the format. But you're welcome to add more tests with roundtrips for all formats. I kept my unit tests focused to only V7 as it's the simplest format.

@github-actions github-actions bot locked and limited conversation to collaborators Oct 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Formats.Tar Servicing-approved Approved for servicing release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants