stas00 commented Feb 7, 2023

This PR fixes the tokenizer's save_pretrained so that it no longer writes the name_or_path entry to tokenizer_config.json, because:

  1. It usually contains the local path that was used to save the model, which is not only invalid once the model is published on the hub but could also reveal personal information.
  2. It is not used anywhere, since one needs to know name_or_path before one can load this file.

It also adjusts one tokenizer test so that it no longer checks for the name_or_path entry.
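
For illustration, the intended effect can be sketched in plain Python. This is not the actual transformers code: the helper write_tokenizer_config, the demo function, and the sample config values are hypothetical. Only the name_or_path key and the tokenizer_config.json filename come from the description above.

```python
import json
import tempfile
from pathlib import Path


def write_tokenizer_config(config: dict, save_dir: str) -> Path:
    """Hypothetical helper: write the config without the name_or_path entry."""
    # Work on a copy so the caller's dict is left untouched.
    cleaned = dict(config)
    # Drop the entry if present; a missing key is not an error.
    cleaned.pop("name_or_path", None)

    out_path = Path(save_dir) / "tokenizer_config.json"
    out_path.write_text(
        json.dumps(cleaned, indent=2, sort_keys=True) + "\n",
        encoding="utf-8",
    )
    return out_path


def demo() -> dict:
    """Save a made-up config to a temporary directory and read it back."""
    config = {
        # Made-up local path of the kind that should not end up in the file.
        "name_or_path": "/home/user/checkpoints/my-model",
        "model_max_length": 512,
    }
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = write_tokenizer_config(config, tmp_dir)
        return json.loads(path.read_text(encoding="utf-8"))
```

Calling demo() returns the saved config as a dict, which makes it easy to check that name_or_path is gone while the other entries are preserved.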

HuggingFaceDocBuilderDev commented Feb 7, 2023

The documentation is not available anymore as the PR was closed or merged.

stas00 marked this pull request as ready for review on February 7, 2023, 01:43
stas00 requested a review from sgugger on February 7, 2023, 01:43

sgugger left a comment

Thanks for fixing this!

stas00 merged commit b9af152 into main on Feb 7, 2023
stas00 deleted the tokenizer-save-config branch on February 7, 2023, 18:51