
Conversation

@karpathy
Owner

WIP, allowing people to run the code either on CPU (any potato) or MPS (MacBook GPUs)

Atm struggling a bit to figure out how to adjust the pyproject.toml to switch pytorch to the basic version on demand. Current workaround is to delete these lines from the pyproject.toml:

# target torch to cuda 12.8
[tool.uv.sources]
torch = [
    { index = "pytorch-cu128" },
]

[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true
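
(For what it's worth, deleting those lines makes torch resolve from PyPI, whose default wheels already ship MPS support on macOS. Pointing at the CPU wheel index instead has the same shape; a sketch, with the index name assumed:)

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu" },
]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true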

@karpathy
Owner Author

ok but it sounds like it's unrelated to this PR or running on mac.

the main thing preventing me from merging this branch to master is the toml issue, i think. looking...

@burtenshaw
Contributor

burtenshaw commented Oct 17, 2025

I took a look at this and I think you can deal with it using dependency groups and source selectors. There are some uv docs on it.

I opened #99 (Add mps and cpu dependency management) with the suggested changes.
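
For the record, a sketch of that dependency-groups pattern from the uv docs (extra names here are illustrative, not necessarily what #99 uses):

[project.optional-dependencies]
cpu = ["torch"]
gpu = ["torch"]

[tool.uv]
conflicts = [[{ extra = "cpu" }, { extra = "gpu" }]]

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", extra = "cpu" },
    { index = "pytorch-cu128", extra = "gpu" },
]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true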
Repository owner deleted two comments from i-zaitsev Oct 17, 2025
@karpathy
Owner Author

karpathy commented Oct 17, 2025

@burtenshaw thanks but i get an error with this (doing uv sync on my macbook)

uv sync
  × Failed to resolve dependencies for `nanochat` (v0.1.0)
  ╰─▶ Requirements contain conflicting indexes for package `torch` in split `python_full_version >= '3.12' and sys_platform == 'linux'`:
      - https://download.pytorch.org/whl/cpu
      - https://download.pytorch.org/whl/cu128

I think it's time I spend some quality time with uv docs.

@burtenshaw
Contributor


@karpathy Sorry! Then it could be as simple as limiting the CUDA index to Linux only:

[tool.uv.sources]
torch = [
    { index = "pytorch-cu128", marker = "platform_system == 'Linux'"},
]
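
Put together with the index block from the top of the thread, the full shape would be something like this (a sketch; off Linux, torch then resolves from PyPI, whose default wheels cover CPU and MPS):

[tool.uv.sources]
torch = [
    { index = "pytorch-cu128", marker = "platform_system == 'Linux'" },
]

[[tool.uv.index]]
name = "pytorch-cu128"
url = "https://download.pytorch.org/whl/cu128"
explicit = true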

@kiankyars

kiankyars commented Oct 18, 2025


I can confirm that uv sync works by limiting the CUDA index to Linux on my M1 Mac.

kian@Kian nanochat % uv run python
Python 3.10.18 (main, Jul 11 2025, 22:25:58) [Clang 20.1.4 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.backends
<module 'torch.backends' from '/Users/kian/Code/nanochat/.venv/lib/python3.10/site-packages/torch/backends/__init__.py'>
>>> torch.backends.mos.is_available()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kian/Code/nanochat/.venv/lib/python3.10/site-packages/torch/backends/__init__.py", line 60, in __getattr__
    return self.m.__getattribute__(attr)
AttributeError: module 'torch.backends' has no attribute 'mos'. Did you mean: 'mps'?
>>> torch.backends.mps.is_available()
True

@kiankyars

This should still be checked with the GPU and CPU configs to see whether anything throws errors.
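
A minimal auto-detection sketch for that kind of check (one way to route between the three backends; not the PR's actual code):

import torch

def autodetect_device() -> torch.device:
    # prefer CUDA, then Apple's MPS, then plain CPU
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")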

for i in range(needed_tokens):
    scratch[i] = token_buffer.popleft()
# Create the inputs/targets as 1D tensors
inputs_cpu = scratch[:-1].to(dtype=torch.int32)

FYI: I'm getting the following error at L42

!self.is_mps() INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp":1414, please report a bug to PyTorch. as_strided_tensorimpl does not work with MPS; call self.as_strided(...) instead


To fix it, change L41-42 to

tokens = [token_buffer.popleft() for _ in range(needed_tokens)]
scratch = torch.tensor(tokens, dtype=torch.int64, pin_memory=device.type == "cuda")
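
For context, a sketch of how the fixed fragment could sit in the loader (the device variable comes from the fix above; the targets line is assumed by symmetry with inputs_cpu, not taken from the diff):

# building the scratch tensor fresh on CPU sidesteps the MPS as_strided error above;
# pinned memory only helps CUDA host-to-device copies, so pin only on CUDA
tokens = [token_buffer.popleft() for _ in range(needed_tokens)]
scratch = torch.tensor(tokens, dtype=torch.int64, pin_memory=(device.type == "cuda"))
inputs_cpu = scratch[:-1].to(dtype=torch.int32)
targets_cpu = scratch[1:]  # assumed counterpart to inputs_cpu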

@rake93

rake93 commented Oct 19, 2025

@karpathy - I successfully got this running on a Windows machine (CPU-only). I encountered a few issues during setup, which I have detailed below along with their resolutions.

Getting the nanochat-d32 model running on Windows CPU wasn't straightforward, but the journey was incredibly educational!

Issues Encountered:

  1. CUDA requirement errors → Switched to CPU/MPS branch and implemented device auto-detection
  2. Model files not found → Created proper cache directory structure with model tag subdirectories
  3. Tokenizer path issues → Organized tokenizer files in separate cache location
  4. BFloat16/Float32 dtype mismatch → Implemented automatic weight conversion from bfloat16 to float32 for CPU compatibility (see the sketch after this list)
  5. Web server lacking CPU support → Modified chat_web.py to support CPU device type with auto-detection
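
A hedged sketch of what that bfloat16-to-float32 conversion could look like (function and variable names are illustrative, not the actual nanochat code):

import torch

def cast_bf16_weights_for_cpu(state_dict: dict) -> dict:
    # some CPU kernels are slow or unsupported in bfloat16,
    # so cast those weights up to float32 before loading on CPU
    return {
        name: t.float() if t.dtype == torch.bfloat16 else t
        for name, t in state_dict.items()
    }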

Attaching a screenshot of the nanochat application running:


@boreys

boreys commented Oct 20, 2025

Re: @burtenshaw's marker suggestion above — yes, this is working on macOS with M2.

burtenshaw and others added 4 commits October 20, 2025 06:51
@lukestanley
Contributor

lukestanley commented Oct 21, 2025

@karpathy Instead of assuming GPU support for Linux users, a cleaner way would be to make the device-specific package index for Torch an "extra", switching to the extra as my PR showed.

Adding the index:
https://github.com/karpathy/nanochat/pull/17/files#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711
Picking the index:
https://github.com/karpathy/nanochat/pull/17/files#diff-f01ca501612c5f260e12ac1171d6705e6887825cba06360c99429531072ef130

If the right index is picked automatically for each detected device, it could "just work".
Sorry I didn't get around to validating my branch on my 3090 yet!
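
For what it's worth, a hedged usage sketch of how the extras approach would look from the command line (extra names "cpu" and "gpu" are assumptions, not necessarily the PR's actual names):

uv sync --extra gpu   # resolve torch from the CUDA index
uv sync --extra cpu   # resolve the CPU/MPS build instead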
