AutoImageProcessor #20111

amyeroberts · 2022-11-07T19:12:55Z

What does this PR do?

Adds the AutoImageProcessor class and makes model image processors available to import.

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

sgugger

Thanks for working on this. On the list of minimal things, I'd add the doc page for image processors and the test file for AutoImageProcessor.

sgugger · 2022-11-07T19:56:08Z

docs/source/en/_toctree.yml

You'll need to write this doc ;-)
Should probably remove the one for feature extractor while we're at it.

I think we'll still need the feature extractor one for the audio models.

amyeroberts · 2022-11-08T12:36:05Z

docs/source/en/internal/image_processing_utils.mdx

Renaming to match the naming pattern of FeatureExtractionMixin

docs/source/en/model_doc/mobilevit.mdx

HuggingFaceDocBuilderDev · 2022-11-08T13:03:15Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

amyeroberts · 2022-11-08T15:30:54Z

src/transformers/models/mobilevit/__init__.py

This was removed as it repeats the code from lines 72-78.

sgugger

Great work! Just have two comments.

sgugger · 2022-11-08T15:46:22Z

src/transformers/models/auto/image_processing_auto.py

This block should go before trying to get the config (so at line 287).

sgugger · 2022-11-08T15:48:14Z

tests/models/auto/test_image_processing_auto.py

Would be cool to also check it works if there is code for a dynamic feature extractor (using "hf-internal-testing/test_dynamic_feature_extractor").

I don't think we can use hf-internal-testing/test_dynamic_feature_extractor directly, as it's for wav2vec2. I could add an equivalent vision feature extractor under e.g. hf-internal-testing/test_dynamic_feature_extractor_vision ?

Ah yes, good point. Adding a new repo sounds fine!

This is a bit tricky because of the renaming that happens in AutoImageProcessor.from_pretrained from FeatureExtractor -> ImageProcessor in this line.

It tries to import NewImageProcessor which doesn't exist in https://huggingface.co/hf-internal-testing/test_dynamic_feature_extractor_vision/blob/main/feature_extractor.py

So either:

https://huggingface.co/hf-internal-testing/test_dynamic_feature_extractor_vision/blob/main/feature_extractor.py has class NewImageProcessor(CLIPFeatureExtractor)

There's some logic which doesn't rename if the image processor isn't in IMAGE_PROCESSOR_MAPPING_NAMES, although this leads to loading a feature extractor, which seems like unwanted behaviour.

What do you think the intended behaviour should be?

Or we can just ignore custom code for feature extractors (I don't think there are any out for now). It shouldn't slow down the PR as it is an edge case we can deal with later!
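For readers following the thread, the second option above (rename only when the result is a known image processor) can be sketched roughly as follows. This is a minimal illustration, not the code in image_processing_auto.py: `KNOWN_IMAGE_PROCESSORS` and `resolve_image_processor_name` are hypothetical names standing in for the lookup against IMAGE_PROCESSOR_MAPPING_NAMES.

```python
from typing import Optional

# Hypothetical stand-in for the class names registered in
# IMAGE_PROCESSOR_MAPPING_NAMES (assumption for illustration only).
KNOWN_IMAGE_PROCESSORS = {"CLIPImageProcessor", "ViTImageProcessor"}


def resolve_image_processor_name(config: dict) -> Optional[str]:
    """Pick a class name from a preprocessor_config.json-style dict."""
    # An explicit image_processor_type always wins.
    name = config.get("image_processor_type")
    if name is not None:
        return name

    # Otherwise fall back to the legacy feature_extractor_type key.
    legacy = config.get("feature_extractor_type")
    if legacy is None:
        return None

    # Rename FeatureExtractor -> ImageProcessor, but only keep the renamed
    # value if it is a known image processor. Custom classes such as a
    # dynamic "NewFeatureExtractor" keep their original name, which means
    # a feature extractor would be loaded (the behaviour questioned above).
    renamed = legacy.replace("FeatureExtractor", "ImageProcessor")
    return renamed if renamed in KNOWN_IMAGE_PROCESSORS else legacy


print(resolve_image_processor_name({"feature_extractor_type": "CLIPFeatureExtractor"}))
print(resolve_image_processor_name({"feature_extractor_type": "NewFeatureExtractor"}))
```

The first call resolves to the renamed image processor, while the second keeps the custom feature extractor name because no matching image processor is registered.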

ydshieh · 2022-11-08T16:52:09Z

Still reviewing, but in processing_auto.py/from_pretrained, the beginning part

https://github.com/huggingface/transformers/blob/1ebc7bb995c5e43961a7c8079ca3bf29f06f2411/src/transformers/models/auto/processing_auto.py#L197

around this, there is no ImageProcessingMixin. I feel this is a miss and should appear here?

src/transformers/image_processing_utils.py

ydshieh · 2022-11-08T17:34:13Z

tests/models/auto/test_image_processing_auto.py

+            config = AutoImageProcessor.from_pretrained(tmpdirname)
+            self.assertIsInstance(config, CLIPImageProcessor)
+
+    def test_image_processor_from_local_directory_from_feature_extractor_key(self):


ydshieh

Thank you @amyeroberts for this great work!

LGTM overall, but I have a small doubt
#20111 (comment)

amyeroberts · 2022-11-08T17:55:02Z

Still reviewing, but in processing_auto.py/from_pretrained, the beginning part

https://github.com/huggingface/transformers/blob/1ebc7bb995c5e43961a7c8079ca3bf29f06f2411/src/transformers/models/auto/processing_auto.py#L197

around this, there is no ImageProcessingMixin. I feel this is a miss and should appear here?

@ydshieh Yes, you're right. I've added a check now here. Can you confirm if this matches with what you think should have been added?

ydshieh · 2022-11-08T18:06:56Z

Can you confirm if this matches with what you think should have been added?

Yes!

ydshieh · 2022-11-08T18:46:03Z

One comment (no need to be done in this PR): I think it would be great if we can remove the feature_extractor_type key after loading the image processor.

from transformers import CLIPModel, AutoProcessor, CLIPProcessor, CLIPImageProcessor, CLIPFeatureExtractor, AutoImageProcessor

p = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch32")
print(p.feature_extractor_type)
p.save_pretrained("temp-clip")

gives

CLIPFeatureExtractor

on the terminal, and in the output file preprocessor_config.json, we have

  "feature_extractor_type": "CLIPFeatureExtractor",
  "image_processor_type": "CLIPImageProcessor",
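One way the suggested cleanup could look is sketched below. It operates on a plain dict shaped like preprocessor_config.json rather than on the library's classes, and `strip_legacy_key` is a hypothetical helper, not an existing transformers function.

```python
import json


def strip_legacy_key(config: dict) -> dict:
    """Return a copy without feature_extractor_type when it is redundant."""
    cleaned = dict(config)  # leave the caller's dict untouched
    # Only drop the legacy key when the new key is present, so configs that
    # still rely on feature_extractor_type stay loadable.
    if "image_processor_type" in cleaned:
        cleaned.pop("feature_extractor_type", None)
    return cleaned


config = {
    "feature_extractor_type": "CLIPFeatureExtractor",
    "image_processor_type": "CLIPImageProcessor",
}
print(json.dumps(strip_legacy_key(config), indent=2))
```

Keeping the legacy key when it is the only type information is a deliberate choice in this sketch, since older checkpoints may carry nothing else.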

fcakyon · 2022-11-22T19:31:19Z

src/transformers/models/auto/image_processing_auto.py

+        ("swin", "ViTImageProcessor"),
+        ("swinv2", "ViTImageProcessor"),
+        ("van", "ConvNextImageProcessor"),
+        ("videomae", "VideoMAEImageProcessor"),


As far as I know, videomae and xclip are video models, and video feature extractor classes are different from image feature extractor classes. Would it be better to separate them as VideoProcessors, or should they stay as they are? @NielsRogge
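To make the question concrete, here is a rough sketch of what separating the entries could look like, using only the four mapping entries quoted from the diff. `VIDEO_MODEL_TYPES` and `split_by_modality` are hypothetical names introduced for illustration; nothing in the PR defines them.

```python
from collections import OrderedDict

# The four entries quoted from image_processing_auto.py in the diff above.
IMAGE_PROCESSOR_NAMES = OrderedDict(
    [
        ("swin", "ViTImageProcessor"),
        ("swinv2", "ViTImageProcessor"),
        ("van", "ConvNextImageProcessor"),
        ("videomae", "VideoMAEImageProcessor"),
    ]
)

# Hypothetical: model types that would be treated as video models.
VIDEO_MODEL_TYPES = {"videomae"}


def split_by_modality(mapping, video_types):
    """Split one model-type -> class-name mapping into image and video parts."""
    image = OrderedDict((k, v) for k, v in mapping.items() if k not in video_types)
    video = OrderedDict((k, v) for k, v in mapping.items() if k in video_types)
    return image, video


image_map, video_map = split_by_modality(IMAGE_PROCESSOR_NAMES, VIDEO_MODEL_TYPES)
print(list(image_map))
print(list(video_map))
```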

* AutoImageProcessor skeleton
* Update references
* Add mapping in init
* Add model image processors to __init__ for importing
* Add AutoImageProcessor tests
* Fix up
* Image Processor documentation
* Remove pdb
* Update docs/source/en/model_doc/mobilevit.mdx
* Update docs
* Don't add whitespace on json files
* Remove fixtures
* Move checking model config down
* Fix up
* Add check for image processor
* Remove FeatureExtractorMixin in docstrings
* Rename model_tmpfile to config_tmpfile
* Don't make None if not in image processor map

sgugger reviewed Nov 7, 2022

amyeroberts force-pushed the autoimageprocessor branch from 2018283 to b0d1f98 Compare November 8, 2022 12:18

amyeroberts commented Nov 8, 2022

docs/source/en/model_doc/mobilevit.mdx (outdated)

amyeroberts marked this pull request as ready for review November 8, 2022 12:47

patrickvonplaten mentioned this pull request Nov 8, 2022

[Loading] Make sure loading edge cases work huggingface/diffusers#1192

Merged

2 tasks

This was referenced Nov 8, 2022

Cannot load CLIPProcessor / CLIPFeatureExtractor locally #20121

Closed

Why is CLIPImageProcessor not in general init? #20122

Closed

amyeroberts commented Nov 8, 2022

amyeroberts requested a review from ydshieh November 8, 2022 15:42

sgugger approved these changes Nov 8, 2022

amyeroberts added 14 commits November 8, 2022 16:59

AutoImageProcessor skeleton

8c31462

Update references

34a09b7

Add mapping in init

b201493

Add model image processors to __init__ for importing

bc9cc61

Add AutoImageProcessor tests

be0dd6b

Fix up

6f56b38

Image Processor documentation

a53ed6b

Remove pdb

3bf22a7

Update docs/source/en/model_doc/mobilevit.mdx

41f18a7

Update docs

11edd59

Don't add whitespace on json files

7053051

Remove fixtures

37eb0c4

Move checking model config down

bb4118b

Fix up

015c77e

amyeroberts force-pushed the autoimageprocessor branch from 1ebc7bb to 015c77e Compare November 8, 2022 17:00

ydshieh reviewed Nov 8, 2022

src/transformers/image_processing_utils.py

Add check for image processor

00d4d0f

ydshieh reviewed Nov 8, 2022

tests/models/auto/test_image_processing_auto.py (outdated)

Remove FeatureExtractorMixin in docstrings

2e08d16

ydshieh approved these changes Nov 8, 2022

Rename model_tmpfile to config_tmpfile

74e0914

Don't make None if not in image processor map

aa431f2

amyeroberts merged commit 4eb918e into huggingface:main Nov 8, 2022

amyeroberts deleted the autoimageprocessor branch November 8, 2022 19:54

fcakyon reviewed Nov 22, 2022

ZoeyyHz mentioned this pull request Apr 14, 2023

The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead. chenfei-wu/TaskMatrix#337

Open
