fix(examples/hunyuanvideo-i2v): update to MS 2.6.0 and MS 2.7.0 #1385
Conversation
Code Review
This PR updates the MindSpore version requirements in the README, fixes a bug in the text encoder, adds a missing GroupNorm to the diffusers code, and refreshes the inference performance numbers in the README. The changes touch the README, the text encoder initialization, utils, and the VAE blocks.
```diff
 else:
-    last_hidden_state = outputs[2][-(hidden_state_skip_layer + 1)]
+    last_hidden_state = outputs[1][-(hidden_state_skip_layer + 1)]
```
The index `1` is used here to access the hidden states from the text encoder's output when `model_return_dict` is `False`. Make sure this index is correct for every supported text encoder type, and add a comment explaining why index `1` is chosen and what it represents in the `outputs` tuple.
If the structure of the `outputs` tuple changes in a future version of the transformers library, this index could become invalid and cause runtime errors. Consider validating the length of the `outputs` tuple and raising an error if it doesn't match the expected length.
Suggested change:
```diff
-    last_hidden_state = outputs[1][-(hidden_state_skip_layer + 1)]
+    last_hidden_state = outputs[1][-(hidden_state_skip_layer + 1)]  # outputs[1] represents the hidden states
```
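The reviewer's validation idea can be sketched in plain Python. This is a hypothetical helper, not code from the PR: the name `select_hidden_state` and the assumption that `outputs[1]` holds the per-layer hidden states are illustrative only.

```python
# Hypothetical sketch of the suggested defensive check: validate the text
# encoder's output tuple before indexing into its hidden states.
def select_hidden_state(outputs, hidden_state_skip_layer):
    # With return_dict=False, outputs is a plain tuple; we assume here that
    # outputs[1] is the tuple of per-layer hidden states.
    if len(outputs) < 2:
        raise ValueError(
            f"Expected at least 2 elements in text encoder outputs, got {len(outputs)}"
        )
    hidden_states = outputs[1]
    index = -(hidden_state_skip_layer + 1)
    if abs(index) > len(hidden_states):
        raise ValueError(
            f"hidden_state_skip_layer={hidden_state_skip_layer} is out of range "
            f"for {len(hidden_states)} hidden states"
        )
    return hidden_states[index]
```

A check like this turns a silent wrong-index bug into an immediate, descriptive error if the transformers output layout ever changes.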
```diff
 else:
     last_hidden_state = outputs[self.key_idx]
-outputs_hidden_states = outputs[2] if len(outputs) >= 3 else None  # TODO: double-check if use t5
+outputs_hidden_states = outputs[1] if len(outputs) >= 2 else None  # TODO: double-check if use t5
```
Similar to the previous comment, the index `1` is used here to access the hidden states. Add a comment explaining why index `1` is chosen and what it represents in the `outputs` tuple, and consider validating the tuple's length as well.
Suggested change:
```diff
-outputs_hidden_states = outputs[1] if len(outputs) >= 2 else None  # TODO: double-check if use t5
+outputs_hidden_states = outputs[1] if len(outputs) >= 2 else None  # outputs[1] represents the hidden states
```
```diff
 self.conv2 = CausalConv3d(out_channels, conv_3d_out_channels, kernel_size=3, stride=1)

-self.nonlinearity = get_activation(non_linearity)()
+self.nonlinearity = get_activation(non_linearity)
```
The `nonlinearity` attribute is assigned the `get_activation(non_linearity)` function itself rather than an instance of the activation function, so the activation is never instantiated. This could lead to unexpected behavior during the forward pass. Instantiate it by calling `get_activation(non_linearity)()`.
Suggested change:
```diff
-self.nonlinearity = get_activation(non_linearity)
+self.nonlinearity = get_activation(non_linearity)()
```
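The class-versus-instance distinction the reviewer flags can be shown with a toy registry. This is a minimal sketch, not the real diffusers/MindSpore `get_activation`: the `SiLU` class and registry below are assumptions for illustration.

```python
import math

class SiLU:
    """Toy activation: x * sigmoid(x), written as x / (1 + e^-x)."""
    def __call__(self, x):
        return x / (1.0 + math.exp(-x))

# Hypothetical registry-style factory that returns the activation *class*.
_ACTIVATIONS = {"silu": SiLU}

def get_activation(name):
    return _ACTIVATIONS[name]  # the class itself, not an instance

# Buggy: stores the class; calling it with a tensor would try to
# *construct* a SiLU object, not apply the activation.
nonlinearity_buggy = get_activation("silu")

# Fixed: the trailing () instantiates it, giving a callable layer.
nonlinearity = get_activation("silu")()
```

With the buggy form, `self.nonlinearity(x)` in the forward pass invokes the class constructor rather than `__call__`, which fails or silently produces a non-tensor object.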
```python
class GroupNorm(nn.Cell):
    r"""Applies Group Normalization over a mini-batch of inputs.

    This layer implements the operation as described in
    the paper `Group Normalization <https://arxiv.org/abs/1803.08494>`__

    .. math::
        y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta

    The input channels are separated into :attr:`num_groups` groups, each containing
    ``num_channels / num_groups`` channels. :attr:`num_channels` must be divisible by
    :attr:`num_groups`. The mean and standard-deviation are calculated
    separately over each group. :math:`\gamma` and :math:`\beta` are learnable
    per-channel affine transform parameter vectors of size :attr:`num_channels` if
    :attr:`affine` is ``True``.
    This layer uses statistics computed from input data in both training and
    evaluation modes.

    Args:
        num_groups (int): number of groups to separate the channels into
        num_channels (int): number of channels expected in input
        eps: a value added to the denominator for numerical stability. Default: 1e-5
        affine: a boolean value that when set to ``True``, this module
            has learnable per-channel affine parameters initialized to ones (for weights)
            and zeros (for biases). Default: ``True``.

    Shape:
        - Input: :math:`(N, C, *)` where :math:`C=\text{num\_channels}`
        - Output: :math:`(N, C, *)` (same shape as input)

    Examples::
        >>> input = mint.randn(20, 6, 10, 10)
        >>> # Separate 6 channels into 3 groups
        >>> m = GroupNorm(3, 6)
        >>> # Separate 6 channels into 6 groups (equivalent with InstanceNorm)
        >>> m = GroupNorm(6, 6)
        >>> # Put all 6 channels into a single group (equivalent with LayerNorm)
        >>> m = GroupNorm(1, 6)
        >>> # Activating the module
        >>> output = m(input)
    """
```
```diff
 1. Install
-   [CANN 8.0.RC3.beta1](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.RC3.beta1)
+   [8.1.RC1](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.1.RC1) or [8.2.RC1](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.2.RC1)
```
Add the `CANN` prefix to the new version links (e.g. "CANN 8.1.RC1") for consistency with the previous wording.
What does this PR do?
Fixes # (issue)
- Missing `GroupNorm` in diffusers.
Adds # (feature)
Before submitting
Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@SamitHuang @vigo999 @Cui-yshoho