Problem generalising 01. PyTorch Workflow Fundamentals model_0 to different data #374
amadanmath
started this conversation in
General
Replies: 1 comment
-
I had exactly the same problem when playing around with the exercises. Here's the Stack Overflow solution I found that helped me.
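The usual culprit is input scale: plain SGD with a fixed learning rate struggles when `X` is large, and min-max scaling `X` into [0, 1] before training fixes it. A minimal sketch of that idea (illustrative values and names, not necessarily the linked answer verbatim):

```python
import torch

# Hypothetical example: "large" inputs scaled back into [0, 1]
X = torch.arange(0.0, 100.0, 2.0).unsqueeze(dim=1)
X_min, X_max = X.min(), X.max()
X_scaled = (X - X_min) / (X_max - X_min)  # min-max scale into [0, 1]

# Build the train/test split from X_scaled and train model_0 as usual;
# apply the same (X_min, X_max) transform to new inputs at inference time.
```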
-
I have tried to play with `model_0` from https://www.learnpytorch.io/01_pytorch_workflow/. When I execute it as written (except for increasing `epochs`), I can see both the training and evaluation loss shrink, and the parameters end up very close to the gold values around the 170th epoch. For the record, here are the dataset and gold parameters used:
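(This is the setup from the chapter, reproduced here for reference, so treat it as a sketch of my exact code:)

```python
import torch

# Setup as in 01_pytorch_workflow (reconstructed from the chapter)
weight = 0.7   # gold weight
bias = 0.3     # gold bias

X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = weight * X + bias

# 80/20 train/test split
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]
```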
However, when I just change the dataset parameters so that `X` spans a much larger range, something like this (the exact numbers are illustrative):
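```python
# Illustrative numbers only; the point is that X is now large
X = torch.arange(0.0, 100.0, 2.0).unsqueeze(dim=1)
y = weight * X + bias  # same gold weight=0.7, bias=0.3
```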
the loss and the state dictionary stabilise around the 360th epoch, at parameter values that are not really close to the gold ones. Why was the training unsuccessful here, and what else should be adjusted in order for this to work as expected? Does the training process assume the input data is between 0 and 1? If so, which part of the training loop code (sketched below) is sensitive to this? (It may be somewhere later in the lessons, but I have not yet gotten there.)
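For reference, the training loop is essentially the one from the chapter (reproduced from the lesson, so details may differ slightly from my exact run):

```python
import torch
from torch import nn

loss_fn = nn.L1Loss()  # MAE, as in the chapter
optimizer = torch.optim.SGD(params=model_0.parameters(), lr=0.01)

for epoch in range(epochs):
    model_0.train()
    y_pred = model_0(X_train)        # forward pass
    loss = loss_fn(y_pred, y_train)  # training loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    model_0.eval()
    with torch.inference_mode():
        test_pred = model_0(X_test)
        test_loss = loss_fn(test_pred, y_test)
```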
It seems to only happen when `X` is large; when I made `bias` and `weight` larger, so that `y` was large, the training converged to the correct answer (though I had to increase `epochs` and the learning rate in order to reach it). The thing I don't understand is why making `X` larger made the result stabilise on a wrong value (as opposed to, e.g., not converging at all).