You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The documentation for GpuKernel_sched states that it may return LS*GS > N and your code should be able to handle that (which is fine), but I found that it's actually returning values where LS*GS < N.
Is this intended? For a specific example, calling GpuKernel_sched() w/N=273280 on a Titan X (target_g=768, target_l=512) returns LS=352 GS=768, LS*GS = 270336.
It looks like the function tries to make sure LS*GS >= N here: