What should .dtype for tensor subclass return? #442

@gau-nernst

Description

What is the recommended way to expose the dtype that the tensor appears to be, i.e. what should be returned when calling subclass_tensor.dtype?

I see that the current AffineQuantizedTensor and NF4Tensor show the original dtype. I understand that this helps with compatibility for existing code (e.g. in gpt-fast, the KVCache dtype is taken from the weight dtype):

dtype = self.output.weight.dtype

However, I personally feel that this is a bit unintuitive, because the weight is actually not FP32/BF16 anymore (it only appears to be so, for compatibility reasons I suppose).
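
For concreteness, here is a minimal sketch of how such a wrapper subclass can advertise the original floating-point dtype while holding int8 data internally. This is not the actual AffineQuantizedTensor/NF4Tensor code; QuantizedLikeTensor, int_data, and scale are made-up names, and the __torch_dispatch__ fallback just dequantizes:

```python
import torch
from torch.utils._pytree import tree_map


class QuantizedLikeTensor(torch.Tensor):
    """Hypothetical wrapper subclass: int8 storage, but .dtype reports the original dtype."""

    @staticmethod
    def __new__(cls, int_data, scale, orig_dtype):
        # The wrapper's metadata is what the outside world sees:
        # .dtype reports orig_dtype even though the payload is int8.
        return torch.Tensor._make_wrapper_subclass(
            cls, int_data.shape, dtype=orig_dtype, device=int_data.device
        )

    def __init__(self, int_data, scale, orig_dtype):
        self.int_data = int_data  # packed int8 payload
        self.scale = scale        # per-tensor scale used for dequantization

    def dequantize(self):
        return self.int_data.to(self.dtype) * self.scale.to(self.dtype)

    def __repr__(self):
        return (f"QuantizedLikeTensor(shape={tuple(self.shape)}, "
                f"dtype={self.dtype}, storage_dtype={self.int_data.dtype})")

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        # A real subclass would route to quantized kernels here; this sketch
        # simply dequantizes and falls back to the plain op.
        unwrap = lambda t: t.dequantize() if isinstance(t, cls) else t
        return func(*tree_map(unwrap, args), **tree_map(unwrap, kwargs or {}))


int_data = torch.randint(-128, 127, (4, 4), dtype=torch.int8)
w = QuantizedLikeTensor(int_data, torch.tensor(0.05), torch.bfloat16)
print(w.dtype)           # torch.bfloat16 -- the "apparent" (original) dtype
print(w.int_data.dtype)  # torch.int8     -- the actual storage dtype
```

This is what keeps existing code like the gpt-fast line above working: weight.dtype still reports the compute dtype even though the storage is quantized.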

@msaroufim also mentions that:

This is unfortunately a big limitation with subclasses, mostly because of limitations with autograd that are very difficult to get rid of.
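
My reading of the autograd part (an assumption on my side, not something stated above): autograd only allows floating-point and complex tensors to require gradients, so a subclass that reported its true int8 storage dtype could not be used as a trainable parameter. A quick check with plain tensors:

```python
import torch

# Autograd rejects non-floating dtypes, so a weight whose .dtype were int8
# could not be a gradient-requiring leaf tensor:
try:
    torch.zeros(4, dtype=torch.int8, requires_grad=True)
except RuntimeError as e:
    print(e)  # "Only Tensors of floating point and complex dtype can require gradients"
```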
