[FLOAT8] Add Hardware Compatibility Check for FP8 Quantization

### Add Hardware Compatibility Check for FP8 Quantization

#### Issue Summary
The current implementation provides three APIs for running model computation in FP8. However, with dynamic activation quantization, these FP8 computations are supported only on NVIDIA GPUs with the SM89 and SM90 architectures. When a model is quantized to FP8 on unsupported hardware, the error surfaces only at runtime, which causes confusion and wastes resources.

#### Proposed Solution
At the model quantization stage, check whether the target hardware supports FP8 computation and raise an error if it does not. This way, users learn immediately that their hardware cannot handle FP8 quantization rather than discovering it at runtime. The error message could also point to weight-only quantization, which is more widely supported.

APIs where the error should be added:
```
    "float8_dynamic_activation_float8_weight",
    "float8_static_activation_float8_weight"
```
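
A minimal sketch of what such a check could look like, assuming the supported set is exactly the SM89 and SM90 capabilities named above. The helper names (`is_fp8_supported`, `check_fp8_hardware`, `current_capability`) are hypothetical and not part of the existing codebase; only `torch.cuda.is_available()` and `torch.cuda.get_device_capability()` are real PyTorch calls.

```python
from typing import Optional, Tuple

# Assumption: the (major, minor) compute capabilities this issue names as
# supporting FP8 dynamic activation quantization (SM89 and SM90).
FP8_SUPPORTED_CAPABILITIES = {(8, 9), (9, 0)}


def is_fp8_supported(capability: Optional[Tuple[int, int]]) -> bool:
    """Return True if the given compute capability is in the supported set."""
    return capability in FP8_SUPPORTED_CAPABILITIES


def current_capability() -> Optional[Tuple[int, int]]:
    """Return the current CUDA device's capability, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


def check_fp8_hardware(api_name: str,
                       capability: Optional[Tuple[int, int]]) -> None:
    """Raise at quantization time instead of failing later at runtime."""
    if is_fp8_supported(capability):
        return
    found = ("no CUDA device" if capability is None
             else f"SM{capability[0]}{capability[1]}")
    raise RuntimeError(
        f"{api_name} requires an NVIDIA GPU with SM89 or SM90, "
        f"but found {found}. Consider weight-only quantization instead."
    )
```

Each of the quantization entry points listed above could call `check_fp8_hardware("<api name>", current_capability())` before touching the model. Taking the capability as an argument keeps the check testable on machines without a GPU.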

[FLOAT8] Add Hardware Compatibility Check for FP8 Quantization #1188