Commit 1a5d1c2

njbrake and aittalam authored
chore: integrate whisper.cpp as a submodule (#813)
* feat: integrate whisper.cpp as a submodule with patches
* simplify naming
* Not final state but wanted to get this up: with these edits now the apply-patches script perfectly lines up with the current whisper.cpp folder in the main branch
* apply patches in CI
* Update ci.yml
* Update ci.yml
* fix: register whisper.cpp submodule
* pin whisper.cpp to commit
* move to patches
* patches
* Refactor
* Update Makefile

Co-authored-by: Davide Eynard <[email protected]>

---------

Co-authored-by: Davide Eynard <[email protected]>
1 parent 6b46419 commit 1a5d1c2

56 files changed: +3103 -27743 lines changed

.github/workflows/ci.yml

Lines changed: 8 additions & 1 deletion
@@ -14,12 +14,19 @@ jobs:
       - name: Clone
         id: checkout
         uses: actions/checkout@v4
+      - name: Initialize submodules # <-- ADD THIS STEP
+        run: |
+          git submodule update --init --recursive

       - name: Dependencies
         id: depends
         run: |
           sudo apt-get update
-          sudo apt-get install make
+          sudo apt-get install make patch
+
+      - name: Apply whisper.cpp patches
+        run: |
+          bash whisper.cpp.patches/apply-patches.sh

       - name: Cache cosmocc toolchain
         id: cache-cosmocc-toolchain
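
The CI change above adds `patch` to the installed packages because the patch script shells out to it. A minimal sketch of a preflight check that a local setup could run before applying patches is shown here; `require_tools` is a hypothetical helper name and is not part of this commit.

```sh
#!/bin/bash
# Hypothetical helper (not part of the commit): report every required
# command-line tool that cannot be found on PATH.
require_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required tool: $tool" >&2
            missing=1
        fi
    done
    # Non-zero status if at least one tool was not found.
    return "$missing"
}
```

A caller could then write `require_tools make patch git || exit 1` ahead of the patch step, so a missing tool is reported by name instead of surfacing as a failure partway through patching.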

.gitmodules

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+[submodule "whisper.cpp"]
+	path = whisper.cpp
+	url = https://github.com/ggerganov/whisper.cpp.git

Makefile

Lines changed: 18 additions & 0 deletions
@@ -8,6 +8,8 @@ MAKEFLAGS += --no-builtin-rules
 .DELETE_ON_ERROR:
 .FEATURES: output-sync

+# setup target needs to run before build/config.mk checks make version
+ifneq ($(MAKECMDGOALS),setup)
 include build/config.mk
 include build/rules.mk

@@ -17,6 +19,7 @@ include llama.cpp/BUILD.mk
 include stable-diffusion.cpp/BUILD.mk
 include whisper.cpp/BUILD.mk
 include localscore/BUILD.mk
+endif

 # the root package is `o//` by default
 # building a package also builds its sub-packages
@@ -85,5 +88,20 @@ cosmocc: $(COSMOCC) # cosmocc toolchain setup
 .PHONY: cosmocc-ci
 cosmocc-ci: $(COSMOCC) $(PREFIX)/bin/ape # cosmocc toolchain setup in ci context

+.PHONY: setup
+setup: # Initialize and configure all dependencies (submodules, patches, etc.)
+	@echo "Setting up dependencies..."
+	@mkdir -p o/tmp
+	@if [ ! -f whisper.cpp/.git ]; then \
+		echo "Initializing whisper.cpp submodule..."; \
+		git submodule update --init whisper.cpp; \
+	fi
+	@echo "Applying whisper.cpp patches..."
+	@export TMPDIR=$$(pwd)/o/tmp && ./whisper.cpp.patches/apply-patches.sh
+	@echo "Setup complete!"
+
+
+ifneq ($(MAKECMDGOALS),setup)
 include build/deps.mk
 include build/tags.mk
+endif
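
The `setup` recipe decides whether to initialize the submodule by testing `[ ! -f whisper.cpp/.git ]`. With current git versions an initialized submodule normally has a regular `.git` file containing a `gitdir:` pointer, whereas a standalone clone has a `.git` directory. The sketch below shows a slightly more permissive check that accepts both; `submodule_initialized` is a hypothetical name used only for illustration, not something this commit defines.

```sh
#!/bin/bash
# Hypothetical helper (not part of the commit): succeed if the directory
# looks like a checked-out git working tree. A submodule usually has a
# .git *file* (a "gitdir: ..." pointer); a standalone clone has a .git
# *directory*. The -e test accepts either form.
submodule_initialized() {
    [ -e "$1/.git" ]
}
```

Under this assumption, the recipe's condition would read `if ! submodule_initialized whisper.cpp; then git submodule update --init whisper.cpp; fi`. Whether accepting a standalone clone is desirable depends on how the project wants to pin the whisper.cpp revision.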

README.md

Lines changed: 15 additions & 0 deletions
@@ -485,6 +485,21 @@ will be used to build it), `wget` (or `curl`), and `unzip` available at
 [https://cosmo.zip/pub/cosmos/bin/](https://cosmo.zip/pub/cosmos/bin/).
 Windows users need [cosmos bash](https://justine.lol/cosmo3/) shell too.

+### Dependency Setup
+
+Some dependencies are managed as git submodules with
+llamafile-specific patches. Before building, you need to initialize and
+configure these dependencies:
+
+```sh
+make setup
+```
+
+The patches modify dependencies. These modifications remain as local
+changes in the submodule working directories.
+
+### Building
+
 ```sh
 make -j8
 sudo make install PREFIX=/usr/local

RELEASE.md

Lines changed: 14 additions & 2 deletions
@@ -6,7 +6,19 @@ The two primary artifacts of the release are the `llamafile-<version>.zip` and t

 ## Release Process

-Note: Step 2 and 3 are only needed if you are making a new release of the ggml-cuda.so and ggml-rocm.so shared libraries. You only need to do this when you are making changes to the CUDA code or the API's surrounding it. Otherwise you can use the previous release of the shared libraries.
+Note: Steps 2 and 3 are only needed if you are making a new release of the ggml-cuda.so and ggml-rocm.so shared libraries. You only need to do this when you are making changes to the CUDA code or the API's surrounding it. Otherwise you can use the previous release of the shared libraries.
+
+### Preparing the Build Environment
+
+Before building, ensure all dependencies are initialized and configured:
+
+```sh
+make setup
+```
+
+This initializes git submodules (e.g., whisper.cpp) and applies llamafile patches. The patches integrate dependencies with llamafile's build system and add llamafile-specific functionality.
+
+### Release Steps

 1. Update the version number in `version.h`
 2. Build the ggml-cuda.so and ggml-rocm.so shared libraries on Linux. You need to do this for Llamafile and LocalScore. Llamafile uses TINYBLAS as a default and LocalScore uses CUBLAS as a default for CUDA.
@@ -126,4 +138,4 @@ You can use the script to create the appropriately named binaries:

 `./llamafile/release.sh -v <version> -s <source_dir> -d <dest_dir>`

-Make sure to move the llamafile-<version>.zip file to the <dest_dir> as well, and you are good to release after you've tested.
+Make sure to move the llamafile-<version>.zip file to the <dest_dir> as well, and you are good to release after you've tested.

whisper.cpp

Submodule whisper.cpp added at 6739eb8

whisper.cpp.patches/apply-patches.sh

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+#!/bin/bash
+# Apply llamafile patches to whisper.cpp submodule
+
+set -e
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+WHISPER_DIR="$SCRIPT_DIR/../whisper.cpp"
+PATCHES_DIR="$SCRIPT_DIR/patches"
+LLAMAFILE_FILES_DIR="$SCRIPT_DIR/llamafile-files"
+
+cd "$WHISPER_DIR"
+
+# Check if patches are already applied
+if [ -f "BUILD.mk" ]; then
+    echo "Patches appear to be already applied. Skipping..."
+    exit 0
+fi
+
+echo "Applying patches to whisper.cpp submodule..."
+
+# Step 1: Apply patches to modify files in their original locations
+echo "Applying modifications to upstream files..."
+for patch_file in "$PATCHES_DIR"/*.patch; do
+    if [ -f "$patch_file" ]; then
+        echo "Applying $(basename "$patch_file")..."
+        patch -p0 < "$patch_file"
+    fi
+done
+
+# Step 2: Copy modified files from their original locations to root
+echo "Copying modified files to root directory..."
+cp examples/server/server.cpp .
+cp examples/common.cpp .
+cp examples/common.h .
+cp examples/main/main.cpp .
+cp examples/stream/stream.cpp .
+cp examples/grammar-parser.cpp .
+cp examples/grammar-parser.h .
+cp examples/dr_wav.h .
+cp examples/server/httplib.h .
+cp src/whisper.cpp .
+cp src/whisper-mel-cuda.cu .
+cp include/whisper.h .
+
+# Step 3: Copy new llamafile-specific files to root
+echo "Copying llamafile-specific files..."
+cp "$LLAMAFILE_FILES_DIR/BUILD.mk" .
+cp "$LLAMAFILE_FILES_DIR/README.llamafile" .
+cp "$LLAMAFILE_FILES_DIR/README.md" .
+# Copy header files, excluding those that were patched
+for file in "$LLAMAFILE_FILES_DIR"/*.h; do
+    [ -f "$file" ] || continue
+    filename=$(basename "$file")
+    if [ "$filename" != "common.h" ] && [ "$filename" != "whisper.h" ] && \
+       [ "$filename" != "grammar-parser.h" ] && [ "$filename" != "dr_wav.h" ] && \
+       [ "$filename" != "httplib.h" ]; then
+        cp "$file" .
+    fi
+done
+cp "$LLAMAFILE_FILES_DIR"/*.c . 2>/dev/null || true
+# Copy cpp files, excluding those that were patched
+for file in "$LLAMAFILE_FILES_DIR"/*.cpp; do
+    [ -f "$file" ] || continue
+    filename=$(basename "$file")
+    if [ "$filename" != "common.cpp" ] && [ "$filename" != "server.cpp" ] && \
+       [ "$filename" != "whisper.cpp" ] && [ "$filename" != "main.cpp" ] && \
+       [ "$filename" != "stream.cpp" ] && [ "$filename" != "grammar-parser.cpp" ]; then
+        cp "$file" .
+    fi
+done
+cp "$LLAMAFILE_FILES_DIR"/*.hpp . 2>/dev/null || true
+# Don't copy .cu files since whisper-mel-cuda.cu is now patched
+cp "$LLAMAFILE_FILES_DIR/main.1" .
+cp "$LLAMAFILE_FILES_DIR/main.1.asc" .
+cp "$LLAMAFILE_FILES_DIR/jfk.wav" .
+cp -r "$LLAMAFILE_FILES_DIR/doc" .
+
+# Step 4: Remove unnecessary files and directories
+echo "Removing unnecessary files and directories..."
+rm -rf bindings
+rm -rf .github
+rm -rf .devops
+rm -rf examples
+rm -rf ggml
+rm -rf src
+rm -rf include
+rm -rf tests
+rm -rf scripts
+rm -rf spm-headers
+rm -rf cmake
+rm -rf models
+rm -rf samples
+rm -rf grammars
+rm -f .gitignore
+rm -f .gitmodules
+rm -f Package.swift
+rm -f README_sycl.md
+rm -f AUTHORS
+rm -f CMakeLists.txt
+rm -f Makefile
+
+echo ""
+echo "Patches applied successfully!"
+echo "Note: These changes are not committed to the submodule."
+echo "To reset the submodule to its clean state, run:"
+echo " cd whisper.cpp && git reset --hard && git clean -fd"
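
Step 3 of the script skips files that the patches already cover by chaining `!=` comparisons, once for headers and once for `.cpp` files. One possible way to express the same exclusion set is a single `case` statement, sketched below. The function names `is_patched_file` and `copy_unpatched` are hypothetical and do not appear in the commit; the file names are the ones listed in the script.

```sh
#!/bin/bash
# Hypothetical refactoring sketch (not part of the commit).

# Succeed when the file name is one of the upstream files that the
# patches already modify, so the copy loop can skip it.
is_patched_file() {
    case "$1" in
        common.h|whisper.h|grammar-parser.h|dr_wav.h|httplib.h) return 0 ;;
        common.cpp|server.cpp|whisper.cpp|main.cpp|stream.cpp|grammar-parser.cpp) return 0 ;;
        *) return 1 ;;
    esac
}

# Copy each existing file argument into the destination directory,
# skipping names that is_patched_file recognizes.
copy_unpatched() {
    local dest=$1 file
    shift
    for file in "$@"; do
        [ -f "$file" ] || continue   # an unmatched glob stays literal; skip it
        is_patched_file "$(basename "$file")" || cp "$file" "$dest"
    done
}
```

With these helpers the two loops could collapse into `copy_unpatched . "$LLAMAFILE_FILES_DIR"/*.h "$LLAMAFILE_FILES_DIR"/*.cpp`. Because the header and source names never collide, merging the two exclusion lists should not change which files are copied.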
File renamed without changes.
File renamed without changes.
