Skip to content

Conversation

@shirayu
Copy link
Contributor

@shirayu shirayu commented Feb 14, 2023

This PR replaces the loss with the moving average of loss.
Current calculation of loss for logging resets at the begging of each epoch.
This causes jumping of loss.

Images

Before

Before

After

After

@kohya-ss
Copy link
Owner

Thank you for this! Moving average seems to be good.
However, I think it will be better that loss/epoch is the average loss of the epoch, not the average loss of entire training.

In addition, it is a bit annoying to calculate the sum of losses every step, if the epoch has thousands of steps. TensorBoard has a 'smoothing' feature, so it will be acceptable raw loss values instead of average...

@shirayu
Copy link
Contributor Author

shirayu commented Feb 14, 2023

loss/epoch is the same value to the moving average because window size is the number of steps of an epoch.

Surely, we can also watch the raw loss value and smoothed loss value in TensorBoard.

My first intention of this PR was the improvement of CLI loss indication.
I prefer that the loss value displayed on the screen with the progress bar be continuous.

As you point out, it is wasteful to sum every step.
I will try improving it a little.

@shirayu
Copy link
Contributor Author

shirayu commented Feb 14, 2023

I removed the call of sum() at 8aed512.

@kohya-ss
Copy link
Owner

loss/epoch is the same value to the moving average because window size is the number of steps of an epoch.

Oh, sorry, I misunderstood.

I prefer that the loss value displayed on the screen with the progress bar be continuous.

I agree this.

The update looks good! I will merge as soon as I have time.

@kohya-ss kohya-ss changed the base branch from main to dev February 16, 2023 13:00
@kohya-ss kohya-ss merged commit 914d150 into kohya-ss:dev Feb 16, 2023
@shirayu shirayu deleted the improve_loss_track branch February 16, 2023 17:31
wkpark pushed a commit to wkpark/sd-scripts that referenced this pull request Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants