Description
I'm collecting a few issues I've seen. I have no clear picture of how to solve them at the moment, but I'm aggregating them here in the hope that inspiration will strike.
Problems
Problem 1
The issue below, from a repro shared by @jerryzh168, is solved by installing ao and then running cd out of the ao directory. IIRC PyTorch has a similar problem.
Traceback (most recent call last):
File "/home/jerryzh/ao/example.py", line 2, in <module>
from torchao.quantization.quant_primitives import MappingType, ZeroPointDomain
File "/home/jerryzh/ao/torchao/__init__.py", line 8, in <module>
from . import _C
ImportError: cannot import name '_C' from partially initialized module 'torchao' (most likely due to a circular import) (/home/jerryzh/ao/torchao/__init__.py)
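One plausible reading of the traceback above is that the source checkout shadows the installed package, because the script's directory comes first on sys.path and the checkout has no built _C there. A minimal sketch to check where a package would be imported from, without actually importing it, might look like this. The helper names locate_package and is_shadowed_by_cwd are hypothetical and not part of torchao.

```python
import importlib.util
import os


def locate_package(name):
    """Return the directory `name` would be imported from, or None.

    Uses find_spec, which does not import a top-level package.
    """
    spec = importlib.util.find_spec(name)
    # Built-in, frozen, and namespace packages have no single file location.
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return os.path.dirname(os.path.realpath(spec.origin))


def is_shadowed_by_cwd(name):
    """True if `name` resolves to a directory under the current working directory."""
    location = locate_package(name)
    if location is None:
        return False
    cwd = os.path.realpath(os.getcwd())
    return os.path.commonpath([cwd, location]) == cwd
```

Calling is_shadowed_by_cwd("torchao") from inside the ao checkout would be expected to return True in the situation described above, which is a hint to cd elsewhere before running the script.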
Problem 2
Building the fp6 kernels fails (https://hastebin.com/share/riridivafa.rust) even though the nvcc and gcc versions seem fine, in a repro shared by @CoffeeVampir3.
Problem 3
This error shows up when you either pip install ao or build it with mismatched CUDA versions. Repro shared by @vayuda:
python test/quantization/test_quant_api.py
Traceback (most recent call last):
File "/u/pj8wfq/ao/test/quantization/test_quant_api.py", line 21, in <module>
from torchao.dtypes import (
File "/u/pj8wfq/ao/torchao/__init__.py", line 8, in <module>
from . import _C
ImportError: /u/pj8wfq/ao/torchao/_C.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr9_M_addrefEv
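Since this problem is attributed to a CUDA version mismatch, a quick sanity check is to compare the CUDA version PyTorch was built with against the local nvcc. The sketch below only does the string parsing and comparison, so it runs without torch or nvcc installed. The function names are hypothetical, and comparing on major.minor is an assumption about how strict the match needs to be.

```python
import re


def parse_nvcc_release(nvcc_version_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_cuda_version(version_string):
    """Extract (major, minor) from a string such as torch.version.cuda."""
    if not version_string:
        return None  # CPU-only builds of PyTorch report None here
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_versions_match(torch_cuda, nvcc_version_output):
    """True only if both versions parse and agree on major.minor (assumed policy)."""
    torch_version = parse_cuda_version(torch_cuda)
    nvcc_version = parse_nvcc_release(nvcc_version_output)
    if torch_version is None or nvcc_version is None:
        return False
    return torch_version == nvcc_version
```

In a real environment you would pass torch.version.cuda as the first argument and the captured output of nvcc --version as the second.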
Problem 4
PyPI binaries are crashing on non-CUDA devices:
File "/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/torchao/__init__.py", line 14, in <module>
from . import _C
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory
Solutions
We need graceful solutions, but in the meantime I'm embarrassed to say I've been recommending a nuclear option, which is to disable the C extensions.
Specifically, in torchao/__init__.py, delete
if not _IS_FBCODE:
    from . import _C
    from . import ops
And in setup.py, delete
ext_modules=get_extensions(),
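A less drastic option than deleting the imports might be to guard them, so that a missing or incompatible extension produces a warning instead of a hard failure at import time. This is only a sketch of the idea, not what torchao does. try_import_extension is a hypothetical helper, and whether the rest of the package works without _C is an assumption that would need checking.

```python
import importlib
import warnings


def try_import_extension(module_name, package=None):
    """Import a compiled extension module, returning None if it cannot be loaded.

    Missing shared libraries (e.g. libcudart) and undefined symbols both
    surface as ImportError when Python loads an extension module.
    """
    try:
        return importlib.import_module(module_name, package=package)
    except (ImportError, OSError) as exc:
        warnings.warn(
            f"Could not load {module_name!r}; continuing without it: {exc}",
            RuntimeWarning,
            stacklevel=2,
        )
        return None
```

Inside a package's __init__.py this could be called as try_import_extension("._C", package=__name__), and code that needs the custom ops would then have to check for None before using them.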