Description
Hello, I have ported this code over from PyTorch: https://github.com/nikhilbarhate99/PPO-PyTorch/blob/master/PPO.py
It's just a PPO implementation. However, this line is where I'm seeing a difference in behavior: https://github.com/nikhilbarhate99/PPO-PyTorch/blob/master/PPO.py#L97
I have tested the PyTorch code locally with numpy and it works flawlessly. The equivalent line in C#, however, throws this exception:
System.Runtime.InteropServices.ExternalException: 'shape '[1, 3, 1, 1, 3]' is invalid for input of size 3'
The C# equivalent looks like this (ignore the missing Dispose calls for now):
public void Act(torch.Tensor inputs, out torch.Tensor actionD, out torch.Tensor actionLogProbD, out torch.Tensor stateValD)
{
    torch.Tensor actionMean = _actor.forward(inputs);
    torch.Tensor covMat = torch.diag(_actionVar).unsqueeze(0);
    var dist = new MultivariateNormal(actionMean, covariance_matrix: covMat);
    torch.Tensor action = dist.sample();
    torch.Tensor actionLogProb = dist.log_prob(action); // Exception occurs when trying to call this
    torch.Tensor stateVal = _critic.forward(inputs);
    actionD = action.detach();
    actionLogProbD = actionLogProb.detach();
    stateValD = stateVal.detach();
}
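As a cross-check that does not depend on either library, the value log_prob should produce for a diagonal covariance matrix can be computed by hand from the standard multivariate normal density. This is only a minimal sketch in plain Python: the function name diag_mvn_log_prob is hypothetical, and the numbers are the mean, variance, and action values from the Python printout in this post.

```python
import math

def diag_mvn_log_prob(x, mean, var):
    """Log-density of a multivariate normal with diagonal covariance.

    x, mean, var are equal-length lists; var holds the diagonal entries.
    (Hypothetical helper, not part of PyTorch or TorchSharp.)
    """
    k = len(x)
    # log|Sigma| of a diagonal matrix is the sum of the log variances
    log_det = sum(math.log(v) for v in var)
    # Mahalanobis term (x - mu)^T Sigma^-1 (x - mu) for a diagonal Sigma
    maha = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    return -0.5 * (k * math.log(2 * math.pi) + log_det + maha)

mean = [0.2025, -0.0714, 0.1417]
var = [0.36, 0.36, 0.36]
action = [0.4416, -0.7422, 0.0073]

print(diag_mvn_log_prob(action, mean, var))
```

The result is a single scalar per sample, so whatever a library does internally, the output of log_prob for these inputs should hold exactly one number.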
I have checked each tensor individually and verified that the sizes are identical in C# and Python. Below I have printed the actionMean, _actionVar, covMat, and action tensors:
Python:
tensor([ 0.2025, -0.0714, 0.1417])
tensor([0.3600, 0.3600, 0.3600])
tensor([[[0.3600, 0.0000, 0.0000],
[0.0000, 0.3600, 0.0000],
[0.0000, 0.0000, 0.3600]]])
tensor([[ 0.4416, -0.7422, 0.0073]])
C#:
[-0.019071, -0.026232, -0.074312]
[0.36, 0.36, 0.36]
[[[0.36, 0, 0]
[0, 0.36, 0]
[0, 0, 0.36]]]
[[-0.065244, 0.017988, -0.77489]]
Simply put, they are [3], [3], [1x3x3], and [1x3] respectively. I'm not sure why the error happens in TorchSharp and not in PyTorch, unless it's just a bug. I don't understand where the nonsensical dimensions in the exception message come from. I went over this all day and even talked with some LLMs, and they're convinced the code is ported properly (I am too at this point, since I have verified the tensor sizes myself). Unfortunately I can only talk to LLMs, since I don't know anyone in this field and I have no formal education in it. Any help is appreciated; I'd like to get this up and running in C#.
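For reference, here is a rough sketch of the shape bookkeeping as I understand PyTorch's MultivariateNormal to do it: the event shape is the last dimension of the mean, and the batch shape is the broadcast of the mean's leading dimensions with the covariance's leading dimensions. It is plain Python with no torch dependency; the helper names broadcast_shapes and mvn_shapes are hypothetical, and this says nothing about what TorchSharp does internally.

```python
from itertools import zip_longest

def broadcast_shapes(a, b):
    """Broadcast two shapes using the usual right-aligned rule."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"cannot broadcast {a} with {b}")
        # A dimension of size 1 stretches to match the other one
        out.append(y if x == 1 else x)
    return tuple(reversed(out))

def mvn_shapes(loc_shape, cov_shape):
    """Shapes a multivariate normal is expected to produce (sketch)."""
    event_shape = tuple(loc_shape[-1:])
    batch_shape = broadcast_shapes(tuple(loc_shape[:-1]), tuple(cov_shape[:-2]))
    return {
        "batch_shape": batch_shape,
        "event_shape": event_shape,
        "sample_shape": batch_shape + event_shape,
        "log_prob_shape": batch_shape,
    }

# Mean of shape [3] and covariance of shape [1x3x3], as in this post
shapes = mvn_shapes((3,), (1, 3, 3))
print(shapes)
```

Under this rule the sample comes out as [1x3], which matches the action tensor printed in both languages, and log_prob should come out as [1]. None of the intermediate shapes needs five dimensions.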