What is the recommended way to expose the dtype that a tensor subclass appears to have, i.e. what should calling `subclass_tensor.dtype` return?
I see that the current AffineQuantizedTensor and NF4Tensor show the original dtype. I understand that this helps with compatibility for existing code (e.g. in gpt-fast, the KVCache dtype is taken from the weight dtype):
ao/torchao/_models/llama/model.py (line 122 at f172c47):

```python
dtype = self.output.weight.dtype
```
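
Here is a minimal sketch of the behavior in question, assuming a torchao build where `quantize_` and `int8_weight_only` are exported from `torchao.quantization` (the exact API names may differ across versions):

```python
import torch
from torchao.quantization import quantize_, int8_weight_only

linear = torch.nn.Linear(16, 16, dtype=torch.bfloat16)
quantize_(linear, int8_weight_only())

# The weight is now a tensor subclass backed by int8 data,
# yet its reported dtype is still the original bfloat16.
print(type(linear.weight))  # a subclass of torch.Tensor
print(linear.weight.dtype)  # torch.bfloat16
```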
However, personally I find this a bit unintuitive, because the weight is actually no longer FP32/BF16 (it only appears to be, for compatibility reasons I suppose).
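
One way to surface the actual storage dtypes is to recurse through the inner tensors. This is only a sketch, and it assumes the subclass implements the traceable-subclass `__tensor_flatten__` protocol (which the torchao subclasses generally do):

```python
import torch

def storage_dtypes(t: torch.Tensor):
    """Recursively collect the dtypes of a subclass's inner tensors.

    __tensor_flatten__ returns the attribute names of the inner
    tensors plus some context; plain tensors are the base case.
    """
    if hasattr(t, "__tensor_flatten__"):
        inner_attrs, _ctx = t.__tensor_flatten__()
        return {name: storage_dtypes(getattr(t, name)) for name in inner_attrs}
    return t.dtype  # plain tensor: this is the real storage dtype
```

On the example above, `storage_dtypes(linear.weight)` would map the inner tensor names (e.g. the int data and scales) to their int8/float dtypes, while `linear.weight.dtype` still reports bfloat16.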
@msaroufim also mentions that:

> This is unfortunately a big limitation with subclasses, mostly because of limitations with autograd that are very difficult to get rid of.