Conversation

@chenyingshu
Collaborator

What does this PR do?

Adds # (feature)

  1. Add SegFormer: mindone/transformers/models/segformer
     • Add SegFormer model: mindone/transformers/models/segformer/modeling_segformer.py
     • Add SegFormerImageProcessor: mindone/transformers/models/segformer/image_processing_segformer.py
     • Unit tests in tests/transformers_tests/models/segformer
  2. Add image-segmentation pipeline: mindone/transformers/pipelines/image_segmentation.py
     • Tested using the SegFormer model
  3. Use mindone.transformers.AutoImageProcessor in the pipeline: mindone/transformers/pipelines/__init__.py
     • Tested using the SegFormer model

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the
    documentation guidelines
  • Did you build and run the code without any errors?
  • Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
  • Did you write any new necessary tests?

@chenyingshu requested a review from vigo999 as a code owner on July 29, 2025, 06:11
Contributor

@gemini-code-assist (bot) left a comment

Summary of Changes

Hello @chenyingshu, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've added the SegFormer model and a new image segmentation pipeline to the mindone/transformers library. This enhancement allows for advanced image segmentation tasks and integrates SegFormer's image processing and modeling capabilities, making it readily available for use within the existing framework. The changes include new model and image processor files, updates to auto-configuration, and the introduction of a dedicated pipeline for image segmentation.

Highlights

  • New SegFormer Model Integration: I've introduced the SegFormer model, including its core architecture and image processing capabilities, by adding new files under mindone/transformers/models/segformer.
  • Image Segmentation Pipeline: I've implemented a dedicated image segmentation pipeline in mindone/transformers/pipelines/image_segmentation.py, enabling easy use of segmentation models.
  • AutoImageProcessor Enhancement: The AutoImageProcessor has been updated and integrated into the new pipeline, ensuring seamless image preprocessing for SegFormer and other compatible models.
  • Core Module and Auto-Configuration Updates: I've updated various __init__.py files and auto-configuration modules to correctly import and register the new SegFormer components and the image segmentation pipeline.
  • New Unit Tests for SegFormer: Comprehensive unit tests for the SegFormer model have been added to ensure its correctness and compatibility within the mindone/transformers framework.
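
The registration step mentioned in the highlights can be pictured as a lookup table from a model type to its image processor class. The following is a minimal sketch under stated assumptions, not the library's auto-class implementation: the registry, `register_processor`, `processor_for`, and the stand-in `SegformerImageProcessor` class are all hypothetical names introduced for illustration.

```python
from typing import Dict, Type


class SegformerImageProcessor:
    """Hypothetical stand-in; the real class lives in the library."""


# Hypothetical registry: model type string -> image processor class.
_PROCESSOR_REGISTRY: Dict[str, Type] = {}


def register_processor(model_type: str, processor_cls: Type) -> None:
    """Register a processor class, refusing silent overwrites."""
    if model_type in _PROCESSOR_REGISTRY:
        raise ValueError(f"{model_type!r} is already registered")
    _PROCESSOR_REGISTRY[model_type] = processor_cls


def processor_for(model_type: str) -> Type:
    """Look up the processor class for a model type."""
    try:
        return _PROCESSOR_REGISTRY[model_type]
    except KeyError:
        raise ValueError(f"No image processor registered for {model_type!r}") from None


register_processor("segformer", SegformerImageProcessor)
```

With a table like this, a pipeline only needs the model type string to find a matching processor, which is why a new model has to be registered before an auto-class can resolve it.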

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces the SegFormer model and an image segmentation pipeline. The implementation is largely adapted from HuggingFace Transformers and looks solid. I've provided a few suggestions to improve correctness and maintainability, such as fixing a type hint, improving a validation check, and suggesting refactoring to reduce code duplication. I also recommend using explicit imports instead of wildcards for better code clarity.

return_tensors: Optional[Union[str, TensorType]] = None,
data_format: ChannelDimension = ChannelDimension.FIRST,
input_data_format: Optional[Union[str, ChannelDimension]] = None,
) -> PIL.Image.Image:
Contributor

high

The return type hint for this function is PIL.Image.Image, but it actually returns a BatchFeature object. Please correct the type hint to reflect the actual return type.

Suggested change
- ) -> PIL.Image.Image:
+ ) -> BatchFeature:
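
A minimal sketch of the point made in this comment, using a hypothetical stand-in `BatchFeature` class (a plain `dict` subclass, not the library's class) and a hypothetical `preprocess` function. It shows a return annotation that agrees with what the function returns, plus a small runtime check of that agreement.

```python
from typing import get_type_hints


class BatchFeature(dict):
    """Hypothetical stand-in for the library's BatchFeature container."""


def preprocess(images: list) -> BatchFeature:
    # Wrap the processed data in the container that is actually returned,
    # so the annotation and the return value agree.
    return BatchFeature({"pixel_values": list(images)})


result = preprocess([[0.0, 1.0], [1.0, 0.0]])
declared = get_type_hints(preprocess)["return"]
# True when the returned object is an instance of the declared return type.
annotation_matches = isinstance(result, declared)
```

Had the annotation remained an image type, a check like this would report a mismatch, since the function hands back a container rather than an image.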

Collaborator Author

fixed

Comment on lines 728 to 729
if labels is not None and self.config.num_labels < 1:
    raise ValueError(f"Number of labels should be >=0: {self.config.num_labels}")
Contributor

high

The validation check for num_labels can be improved. For integer values, self.config.num_labels < 1 already rejects 0, but the error message says ">=0", which contradicts the check. The loss calculation logic only handles num_labels > 1 or num_labels == 1, so if num_labels were 0 no loss would be computed, which is likely not the intended behavior. Writing the check as num_labels <= 0 and making the error message more specific states the intent directly.

Suggested change
- if labels is not None and self.config.num_labels < 1:
-     raise ValueError(f"Number of labels should be >=0: {self.config.num_labels}")
+ if labels is not None and self.config.num_labels <= 0:
+     raise ValueError(f"Number of labels should be > 0, but got {self.config.num_labels}")
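
The suggested check and the two loss branches named in the comment can be sketched as a standalone function. This is an illustration under assumptions: `select_loss` is a hypothetical helper, and the returned branch names are placeholders rather than the loss functions the model actually uses.

```python
def select_loss(num_labels: int) -> str:
    """Hypothetical helper; assumed to be called only when labels are provided."""
    if num_labels <= 0:
        raise ValueError(f"Number of labels should be > 0, but got {num_labels}")
    if num_labels > 1:
        # Placeholder name for the multi-class branch.
        return "multi_class"
    # Remaining case is num_labels == 1; placeholder name for that branch.
    return "single_label"


for n in (2, 1, 0):
    try:
        print(n, select_loss(n))
    except ValueError as err:
        print(n, "rejected:", err)
```

Validating first means every value that reaches the branches has a defined loss, so no input falls through without one.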

Collaborator Author

revised

@chenyingshu
Collaborator Author

/gemini review

Contributor

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces the SegFormer model and an image segmentation pipeline. The changes are extensive, including model implementation, image processor, pipeline logic, and updates to auto-classes. My review has identified a critical bug in the new pipeline's preprocessing logic for certain models, an incorrect type hint, and a minor error message issue. These should be addressed to ensure correctness and robustness.

return_tensors: Optional[Union[str, TensorType]] = None,
data_format: ChannelDimension = ChannelDimension.FIRST,
input_data_format: Optional[Union[str, ChannelDimension]] = None,
) -> PIL.Image.Image:
Contributor

high

The return type hint for the preprocess method is PIL.Image.Image, but the method actually returns a BatchFeature object. This should be corrected to reflect the actual return type for accuracy and code clarity.

Suggested change
- ) -> PIL.Image.Image:
+ ) -> "BatchFeature":
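
The quoted form in this suggestion is a string (forward-reference) annotation. One common reason to quote a type, assumed here rather than confirmed by the PR, is that the name is not yet defined, or is only imported for type checking, at the point where the function is defined. A minimal sketch with a hypothetical stand-in class and function:

```python
from typing import get_type_hints


def preprocess(images: list) -> "BatchFeature":
    # The quoted annotation is stored as a string and is not evaluated when the
    # function is defined, so BatchFeature may be defined later in the module.
    return BatchFeature({"pixel_values": list(images)})


class BatchFeature(dict):
    """Hypothetical stand-in for the library's BatchFeature container."""


# get_type_hints resolves the string against the module globals on demand.
resolved = get_type_hints(preprocess)["return"]
batch = preprocess([1, 2, 3])
```

An unquoted annotation in the same position would raise a NameError at definition time, because the class does not exist yet when the `def` statement runs.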

Collaborator Author

fixed

if "images" in kwargs:
    inputs = kwargs.pop("images")
if inputs is None:
    raise ValueError("Cannot call the image-classification pipeline without an inputs argument!")
Contributor

medium

The ValueError message here refers to the image-classification pipeline, but this code is within the ImageSegmentationPipeline. This appears to be a copy-paste error and should be corrected to avoid confusion.

Suggested change
- raise ValueError("Cannot call the image-classification pipeline without an inputs argument!")
+ raise ValueError("Cannot call the image-segmentation pipeline without an inputs argument!")
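
The input handling under review can be sketched in isolation. `resolve_inputs` is a hypothetical helper written for illustration, and `threshold` in the usage lines is an arbitrary example keyword; neither is claimed to match the pipeline's actual signature.

```python
def resolve_inputs(inputs=None, **kwargs):
    """Hypothetical helper: accept `images` as a keyword alias for `inputs`."""
    if "images" in kwargs:
        inputs = kwargs.pop("images")
    if inputs is None:
        # The message names the pipeline the caller is actually using.
        raise ValueError("Cannot call the image-segmentation pipeline without an inputs argument!")
    return inputs, kwargs


# The alias is consumed; any other keyword arguments are passed through untouched.
image, remaining = resolve_inputs(images="street.png", threshold=0.5)
```

Naming the right pipeline in the message matters mostly for debugging, since the exception text is often the only clue about which call failed.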

Collaborator Author

revised

@vigo999 added this pull request to the merge queue on Aug 18, 2025
Merged via the queue into mindspore-lab:master with commit 9e1be13 on Aug 18, 2025
3 checks passed