That is, for example instead of dfdq we would simply write dq, and always assume that the gradient is with respect to the final output
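I think the wording here should be "the gradient of the final output": dq stands for the gradient of the final output f with respect to q, not the gradient of q with respect to f.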
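To make the naming point concrete, here is a minimal sketch, assuming the small circuit f = (x + y) * z with input values I picked for illustration. Each `d<name>` variable holds the gradient of the final output `f` with respect to `<name>`:

```python
# Illustrative input values (an assumption for this sketch).
x, y, z = -2.0, 5.0, -4.0

# Forward pass.
q = x + y   # intermediate value
f = q * z   # final output

# Backward pass: every d<name> below means d(f)/d(<name>).
dq = z          # df/dq, written as dq rather than dfdq
dz = q          # df/dz
dx = 1.0 * dq   # dq/dx = 1, chained with df/dq
dy = 1.0 * dq   # dq/dy = 1, chained with df/dq
```

In every line of the backward pass the quantity being differentiated is `f`; only the variable it is differentiated with respect to changes.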
Second,
Every gate in a circuit diagram gets some inputs and can right away compute two things: 1. its output value and 2. the local gradient of its inputs with respect to its output value
I think item 2 should read: "the local gradient of its output value with respect to its inputs".
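As a rough illustration of the corrected sentence, here is a sketch of a single multiply gate. The function name `multiply_gate` and the input values are hypothetical, chosen only for this example. The gate returns its output value together with the local gradients of that output with respect to each input:

```python
def multiply_gate(a, b):
    """Hypothetical gate: returns its output and its local gradients."""
    out = a * b
    # Local gradients of the OUTPUT with respect to each INPUT.
    dout_da = b
    dout_db = a
    return out, (dout_da, dout_db)

out, (dout_da, dout_db) = multiply_gate(3.0, -4.0)

# During backpropagation the local gradients are multiplied by the
# gradient of the final output with respect to this gate's output.
upstream = 1.0  # assumes this gate produces the final output
da = dout_da * upstream
db = dout_db * upstream
```

The gate can compute `dout_da` and `dout_db` right away from its own inputs, without knowing anything about the rest of the circuit.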