add cpu|mps support #88
Conversation
…s. I also discovered what I think is a bit of a bug, where I was casting wte to bfloat16 in the wrong place (the model init) instead of in init_weights
…rors still, so wip
ok but it sounds like it's unrelated to this PR or running on mac. the only thing that is mostly preventing me from merging this branch to master is the toml issue i think. looking...
Add mps and cpu dependency management
@burtenshaw thanks but i get an error with this (doing uv sync on my macbook) I think it's time I spend some quality time with uv docs.
Co-authored-by: Tancrède Lepoint <[email protected]>
@karpathy Sorry! Then it could just be as simple as limiting the cuda index to only linux:
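A sketch of what that could look like in pyproject.toml, using uv's documented support for environment markers on sources; the index name and CUDA version here are illustrative, not necessarily what nanochat pins:

# Pull torch from the CUDA index only on Linux; other platforms
# fall back to the default (CPU/MPS) wheels from PyPI.
[tool.uv.sources]
torch = [
    { index = "pytorch-cuda", marker = "sys_platform == 'linux'" },
]

[[tool.uv.index]]
name = "pytorch-cuda"
url = "https://download.pytorch.org/whl/cu128"
explicit = true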
This should still be checked with gpu and cpu configs to see if it throws errors.
for i in range(needed_tokens):
    scratch[i] = token_buffer.popleft()
# Create the inputs/targets as 1D tensors
inputs_cpu = scratch[:-1].to(dtype=torch.int32)
FYI: I'm getting the following error at L42
!self.is_mps() INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp":1414, please report a bug to PyTorch. as_strided_tensorimpl does not work with MPS; call self.as_strided(...) instead
To fix it, change L41-42 to
tokens = [token_buffer.popleft() for _ in range(needed_tokens)]
scratch = torch.tensor(tokens, dtype=torch.int64, pin_memory=device.type == "cuda")
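For reference, a self-contained sketch of the fixed pattern; the deque stand-in, the device line, and the surrounding setup are my additions for illustration, and only the two lines above come from the suggestion:

from collections import deque
import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
token_buffer = deque(range(100))  # stand-in for the real token stream
needed_tokens = 9

# Build the scratch tensor in one shot on the CPU; pinned memory is a
# CUDA-only concept, so it is only requested when targeting CUDA.
tokens = [token_buffer.popleft() for _ in range(needed_tokens)]
scratch = torch.tensor(tokens, dtype=torch.int64, pin_memory=device.type == "cuda")

# Create the inputs/targets as 1D tensors
inputs_cpu = scratch[:-1].to(dtype=torch.int32)

This avoids preallocating and filling the tensor element by element, which appears to be the path that tripped the MPS as_strided assert above.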
@karpathy - I successfully got this running on a Windows machine (CPU-only). I encountered a few issues during setup, which I have detailed below along with their resolutions. Getting the nanochat-d32 model running on Windows CPU wasn't straightforward, but the journey was incredibly educational! Issues Encountered:
Attaching a screenshot of the nanochat application running:
yes, this is working on macOS with an M2
…various minor other changes, e.g. changing max_iterations to num_iterations in the sft script for consistency in naming
@karpathy Instead of assuming GPU support for Linux users, a cleaner way would be to make the device-specific package index for Torch an "extra", switching to the extra as my PR showed. Adds index: if the device-specific indexes are all autodetected, it could "just work".
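A hedged sketch of that extras approach, following uv's documented PyTorch workflow; the extra names and index URLs are illustrative, not the exact lines from that PR:

[project.optional-dependencies]
cpu = ["torch"]
gpu = ["torch"]

[tool.uv]
# The two flavors are mutually exclusive
conflicts = [[{ extra = "cpu" }, { extra = "gpu" }]]

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", extra = "cpu" },
    { index = "pytorch-cuda", extra = "gpu" },
]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[[tool.uv.index]]
name = "pytorch-cuda"
url = "https://download.pytorch.org/whl/cu128"
explicit = true

Users would then pick a flavor explicitly, e.g. uv sync --extra cpu or uv sync --extra gpu.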
… flagship build which is linux. sorry to pollute the repo history...


WIP, allowing people to run the code either on CPU (any potato) or MPS (Macbook GPUs)
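On the runtime side, the device can be autodetected rather than assumed; a minimal sketch using standard torch APIs (the helper name is mine, not necessarily what this PR uses):

import torch

def autodetect_device() -> torch.device:
    # Prefer CUDA, then Apple-silicon MPS, then fall back to plain CPU
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = autodetect_device()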
Atm struggling a bit to figure out how to adjust the pyproject.toml to switch pytorch to the basic version on demand. Current workaround is to delete these lines from the pyproject.toml: