Conversation

@NiklasGustafsson (Contributor)

There are a couple of things in this WIP/draft PR:

  1. Overriding _to() implementations in modules that are known not to have any parameters or buffers. This will save a small amount of runtime overhead.

  2. An alternative implementation of Linear that aligns more closely with how it works in PyTorch. Rather than creating native module instances and managing their lifetimes, this approach involves only a .NET instance and calls into the torch.nn.functional APIs to perform the forward pass. It's simpler and gets us out of the business of managing native module instances. The downside is that it means a lot of work without any new functionality; it just retires some technical debt and aligns us better with the Python implementation, which could be reason enough. A rough sketch of the idea is shown right below.
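
Here is roughly what #2 could look like for Linear. Purely illustrative: the torch.empty overload, the torch.nn.functional.linear call, and the ConditionallyRegisterParameter calls follow patterns used elsewhere in TorchSharp and may not match the final code exactly.

    using System;
    using TorchSharp;
    using TorchSharp.Modules;
    using static TorchSharp.torch;
    using static TorchSharp.torch.nn;

    // Managed-only Linear: no native module handle; forward just calls the functional API.
    public sealed class Linear : torch.nn.Module<Tensor, Tensor>
    {
        internal Linear(long inputSize, long outputSize, bool hasBias = true,
                        Device? device = null, ScalarType? dtype = null) : base(nameof(Linear))
        {
            weight = new Parameter(torch.empty(new long[] { outputSize, inputSize }, dtype: dtype, device: device));
            init.kaiming_uniform_(weight, a: Math.Sqrt(5));

            if (hasBias) {
                bias = new Parameter(torch.empty(new long[] { outputSize }, dtype: dtype, device: device));
                var (fanIn, _) = init.CalculateFanInAndFanOut(weight);
                var bound = fanIn > 0 ? 1.0 / Math.Sqrt(fanIn) : 0.0;
                init.uniform_(bias, -bound, bound);
            }

            // Parameters are registered explicitly; RegisterComponents() is intentionally not called.
            ConditionallyRegisterParameter(nameof(weight), weight);
            ConditionallyRegisterParameter(nameof(bias), bias);
        }

        public override Tensor forward(Tensor input)
        {
            // The whole forward pass goes through the functional API -- no native module instance involved.
            return torch.nn.functional.linear(input, weight, bias);
        }

        public Parameter weight;
        public Parameter? bias;
    }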

@MovGP0, @lostmsu, @kaiidams, @ChengYen-Tang, @dayo05 -- your thoughts on #2?

@MovGP0 (Contributor) commented Mar 6, 2023

internal Linear(long inputSize, long outputSize, bool hasBias = true, Device device = null, ScalarType? dtype = null) : base(nameof(Linear))

It should be Device? device = null. Same problem with the other constructors.
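
Presumably the fix is just the nullable annotation, i.e. something like:

    internal Linear(long inputSize, long outputSize, bool hasBias = true, Device? device = null, ScalarType? dtype = null) : base(nameof(Linear))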

@MovGP0 (Contributor) commented Mar 6, 2023

init.kaiming_uniform_(weight, a: Math.Sqrt(5));

I think we could cache the result of Math.Sqrt(5) in a private static readonly field, but it probably won't matter in the grand scheme of things.

Note: I've checked whether the compiler is smart enough to replace it with a constant; it isn't.
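
A minimal sketch of the suggested caching (the field name is made up):

    private static readonly double Sqrt5 = Math.Sqrt(5);
    ...
    init.kaiming_uniform_(weight, a: Sqrt5);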

@NiklasGustafsson (Contributor, PR author)

> It should be Device? device = null. Same problem with the other constructors.

Right. I took the #nullable enable out. Putting it back.

@NiklasGustafsson (Contributor, PR author)

Well, caching it probably won't make much of a difference, but it also won't hurt.

@lostmsu (Contributor) commented Mar 6, 2023

> Well, caching it probably won't make much of a difference, but it also won't hurt.

Seems like a premature optimization that worsens readability to me.

@lostmsu (Contributor) left a comment

Mostly NITs on the code itself; however, I am unsure about the idea of a manually reimplemented Linear. What is the plan for ensuring we actually track upstream changes?

As far as PRs go, I think the part that overrides _to should be separate. OCD 🙃

THSNN_Linear_set_weight(handle, value!.Handle);
torch.CheckForErrors();
ConditionallyRegisterParameter("weight", value);
if (value is null) throw new ArgumentNullException("weight");

Contributor:
NIT: nameof(weight)
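
I.e., presumably:

    if (value is null) throw new ArgumentNullException(nameof(weight));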

public sealed class Linear : torch.nn.Module<Tensor, Tensor>
{
internal Linear(IntPtr handle, IntPtr boxedHandle) : base(handle, boxedHandle)
internal Linear(long inputSize, long outputSize, bool hasBias = true, Device device = null, ScalarType? dtype = null) : base(nameof(Linear))

@lostmsu (Contributor) commented Mar 6, 2023
Is there any reason to not have this constructor public? It does not take handles anymore.

@NiklasGustafsson (PR author):
Only because the Python-likeness we're striving for would prefer that users use the factories in torch.nn instead of the constructors.
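
For illustration, the intended usage would be via the factory rather than the constructor:

    // Preferred: the torch.nn factory, mirroring Python's torch.nn.Linear(...)
    var lin = torch.nn.Linear(512, 256, hasBias: true);
    // rather than 'new Linear(512, 256)' -- the constructor stays internal.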

@NiklasGustafsson (Contributor, PR author)

> Mostly NITs on the code itself; however, I am unsure about the idea of a manually reimplemented Linear. What is the plan for ensuring we actually track upstream changes?

We would have to get to all of the modules, so it's not just Linear. In some cases, it may be easier to track. There have been times when we've moved from one version to another, and the native module API didn't have some new feature, but the functional API did.

/// <param name="device">The desired device of the parameters and buffers in this module</param>
/// <param name="dtype">The desired floating point or complex dtype of the parameters and buffers in this module</param>
public static Linear Linear(long inputSize, long outputSize, bool hasBias = true, Device? device = null, ScalarType? dtype = null)
public static Linear Linear(long inputSize, long outputSize, bool hasBias = true, Device device = null, ScalarType? dtype = null)

Contributor:
Should this receive in_features and out_features to be compatible with PyTorch?
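
I.e., presumably renaming the parameters for PyTorch compatibility, along the lines of:

    public static Linear Linear(long in_features, long out_features, bool hasBias = true, Device? device = null, ScalarType? dtype = null)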

public static Linear Linear(long inputSize, long outputSize, bool hasBias = true, Device? device = null, ScalarType? dtype = null)
public static Linear Linear(long inputSize, long outputSize, bool hasBias = true, Device device = null, ScalarType? dtype = null)
{
var res = THSNN_Linear_ctor(inputSize, outputSize, hasBias, out var boxedHandle);

@kaiidams (Contributor) commented Mar 11, 2023
I hope these temporary objects are gone with this change.

                public static Tensor relu(Tensor x, bool inplace = false)
                {
                    using (var m = nn.ReLU(inplace)) {
                        return m.call(x);
                    }
                }
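
Presumably, once modules like ReLU no longer wrap native instances, wrappers like this could dispatch to the tensor ops directly instead of allocating a temporary module; a sketch, assuming Tensor.relu()/relu_() remain available:

    public static Tensor relu(Tensor x, bool inplace = false)
    {
        // No temporary nn.ReLU module; just call the tensor operation.
        return inplace ? x.relu_() : x.relu();
    }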

var (fanIn, _) = init.CalculateFanInAndFanOut(weight);
init.uniform_(_bias, -bound, bound);
}
//NOTE: it's important not to call 'RegisterComponents' here.

Contributor:
Adding the reason why would be helpful for future readers. I assume it is called in the base class?

}
}

private Parameter? _weight;

Contributor:
Why not make this readonly and non-nullable?

Comment on lines 19 to 20
this._p = p;
this._inplace = inplace;

Contributor:
Should these be part of the state dict?

Comment on lines +32 to +33
scope.Include(this);
scope.Detach(data);

Contributor:
This needs some explanation.

Comment on lines 326 to 330
var output = lin.call(input);

output[0, 511] = 10; // When we modify the copy, the original should be altered, too.

Assert.Equal(device.type, output.device_type);

Contributor:
Is this how Identity behaves in PyTorch? o-O

@NiklasGustafsson (PR author):
Yup:

>>> id = torch.nn.Identity()
>>> input = torch.zeros(10,10)
>>> output = id(input)
>>> output[0,0] = 13
>>> input
tensor([[13.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

@NiklasGustafsson (PR author):
From the PyTorch source code:

    def forward(self, input: Tensor) -> Tensor:
        return input
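
A TorchSharp-side check of the same aliasing behavior might look roughly like this (names and values are illustrative):

    // Identity returns its input unchanged, so output and input alias the same storage.
    var id = torch.nn.Identity();
    var input = torch.zeros(10, 10);
    var output = id.call(input);
    output[0, 0] = 13;   // visible through 'input' as well, since no copy was made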
