
Commit e73cce5

Aisuko authored and mudler committed

feat(conda): Add separate env for huggingface (#1146)
**Description**

This PR is related to #1117.

**Notes for Reviewers**

* Add conda env `huggingface.yml`
* Change the import order, and remove the unused packages
* Add `run.sh` and a `make` command to the main Dockerfile and Makefile
* Add test cases for it. They can be triggered and succeed under the VSCode Python extension, but they hang when run with `python -m unittest test_huggingface.py` in the terminal:

```
Running tests (unittest): /workspaces/LocalAI/extra/grpc/huggingface
Running tests: /workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_embedding
              /workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_load_model
              /workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_server_startup
./test_huggingface.py::TestBackendServicer::test_embedding Passed
./test_huggingface.py::TestBackendServicer::test_load_model Passed
./test_huggingface.py::TestBackendServicer::test_server_startup Passed
Total number of tests expected to run: 3
Total number of tests run: 3
Total number of tests passed: 3
Total number of tests failed: 0
Total number of tests failed with errors: 0
Total number of tests skipped: 0
Finished running tests!
```

**[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**

- [x] Yes, I signed my commits.

Signed-off-by: GitHub <[email protected]>
Signed-off-by: Ettore Di Giacinto <[email protected]>
1 parent 2711f79 commit e73cce5

File tree

9 files changed: +253 −6 lines changed

Dockerfile

Lines changed: 1 addition & 1 deletion

```diff
@@ -14,7 +14,7 @@ ARG TARGETARCH
 ARG TARGETVARIANT
 
 ENV BUILD_TYPE=${BUILD_TYPE}
-ENV EXTERNAL_GRPC_BACKENDS="huggingface-embeddings:/build/extra/grpc/huggingface/huggingface.py,autogptq:/build/extra/grpc/autogptq/run.sh,bark:/build/extra/grpc/bark/run.sh,diffusers:/build/extra/grpc/diffusers/run.sh,exllama:/build/extra/grpc/exllama/exllama.py,vall-e-x:/build/extra/grpc/vall-e-x/ttsvalle.py,vllm:/build/extra/grpc/vllm/run.sh"
+ENV EXTERNAL_GRPC_BACKENDS="huggingface-embeddings:/build/extra/grpc/huggingface/run.sh,autogptq:/build/extra/grpc/autogptq/run.sh,bark:/build/extra/grpc/bark/run.sh,diffusers:/build/extra/grpc/diffusers/run.sh,exllama:/build/extra/grpc/exllama/exllama.py,vall-e-x:/build/extra/grpc/vall-e-x/ttsvalle.py,vllm:/build/extra/grpc/vllm/run.sh"
 ENV GALLERIES='[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]'
 ARG GO_TAGS="stablediffusion tts"
```
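The `EXTERNAL_GRPC_BACKENDS` value is a comma-separated list of `name:launcher-path` pairs. As a rough illustration of the format, such a string could be split as below; the `parse_backends` helper is hypothetical, not LocalAI's actual parsing code:

```python
# Hypothetical helper (not LocalAI code): parse an EXTERNAL_GRPC_BACKENDS-style
# value of the form "name:/launcher/path,name:/launcher/path" into a dict.
def parse_backends(value: str) -> dict:
    backends = {}
    for entry in value.split(","):
        # Split on the first ":" only, so any later colons stay in the path.
        name, _, path = entry.partition(":")
        backends[name.strip()] = path.strip()
    return backends

spec = ("huggingface-embeddings:/build/extra/grpc/huggingface/run.sh,"
        "vllm:/build/extra/grpc/vllm/run.sh")
print(parse_backends(spec)["huggingface-embeddings"])  # /build/extra/grpc/huggingface/run.sh
```

Note how this PR changes the `huggingface-embeddings` entry from pointing at `huggingface.py` directly to the `run.sh` conda wrapper.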

Makefile

Lines changed: 1 addition & 0 deletions

```diff
@@ -414,6 +414,7 @@ prepare-extra-conda-environments:
 	$(MAKE) -C extra/grpc/bark
 	$(MAKE) -C extra/grpc/diffusers
 	$(MAKE) -C extra/grpc/vllm
+	$(MAKE) -C extra/grpc/huggingface
 
 backend-assets/grpc:
 	mkdir -p backend-assets/grpc
```

extra/grpc/huggingface/Makefile

Lines changed: 18 additions & 0 deletions

```makefile
.PHONY: huggingface
huggingface:
	@echo "Creating virtual environment..."
	@conda env create --name huggingface --file huggingface.yml
	@echo "Virtual environment created."

.PHONY: run
run:
	@echo "Running huggingface..."
	bash run.sh
	@echo "huggingface run."

# Running the tests from the command line hangs; they only work with an IDE like VSCode.
.PHONY: test
test:
	@echo "Testing huggingface..."
	bash test.sh
	@echo "huggingface tested."
```

extra/grpc/huggingface/README.md

Lines changed: 5 additions & 0 deletions

# Creating a separate environment for the huggingface project

```
make huggingface
```

extra/grpc/huggingface/huggingface.py

Lines changed: 50 additions & 5 deletions

```diff
@@ -1,13 +1,20 @@
+"""
+Extra gRPC server for HuggingFace SentenceTransformer models.
+"""
 #!/usr/bin/env python3
-import grpc
 from concurrent import futures
-import time
-import backend_pb2
-import backend_pb2_grpc
+
 import argparse
 import signal
 import sys
 import os
+
+import time
+import backend_pb2
+import backend_pb2_grpc
+
+import grpc
+
 from sentence_transformers import SentenceTransformer
 
 _ONE_DAY_IN_SECONDS = 60 * 60 * 24
@@ -17,18 +24,56 @@
 
 # Implement the BackendServicer class with the service methods
 class BackendServicer(backend_pb2_grpc.BackendServicer):
+    """
+    A gRPC servicer for the backend service.
+
+    This class implements the gRPC methods for the backend service, including Health, LoadModel, and Embedding.
+    """
     def Health(self, request, context):
+        """
+        A gRPC method that returns the health status of the backend service.
+
+        Args:
+            request: A HealthRequest object that contains the request parameters.
+            context: A grpc.ServicerContext object that provides information about the RPC.
+
+        Returns:
+            A Reply object that contains the health status of the backend service.
+        """
         return backend_pb2.Reply(message=bytes("OK", 'utf-8'))
+
     def LoadModel(self, request, context):
+        """
+        A gRPC method that loads a model into memory.
+
+        Args:
+            request: A LoadModelRequest object that contains the request parameters.
+            context: A grpc.ServicerContext object that provides information about the RPC.
+
+        Returns:
+            A Result object that contains the result of the LoadModel operation.
+        """
         model_name = request.Model
         try:
             self.model = SentenceTransformer(model_name)
         except Exception as err:
             return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
+
         # Implement your logic here for the LoadModel service
         # Replace this with your desired response
         return backend_pb2.Result(message="Model loaded successfully", success=True)
+
     def Embedding(self, request, context):
+        """
+        A gRPC method that calculates embeddings for a given sentence.
+
+        Args:
+            request: An EmbeddingRequest object that contains the request parameters.
+            context: A grpc.ServicerContext object that provides information about the RPC.
+
+        Returns:
+            An EmbeddingResult object that contains the calculated embeddings.
+        """
         # Implement your logic here for the Embedding service
         # Replace this with your desired response
         print("Calculated embeddings for: " + request.Embeddings, file=sys.stderr)
@@ -66,4 +111,4 @@ def signal_handler(sig, frame):
     )
     args = parser.parse_args()
 
-    serve(args.addr)
+    serve(args.addr)
```
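The servicer's control flow can be seen without pulling in `grpc`, `backend_pb2`, or `sentence-transformers`. The sketch below is a minimal stand-in: the `Reply`/`Result` dataclasses and `MockBackendServicer` are hypothetical substitutes for the generated protobuf types and the real backend, with the model load replaced by a dummy assignment:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the generated backend_pb2 message types.
@dataclass
class Reply:
    message: bytes

@dataclass
class Result:
    success: bool
    message: str = ""

class MockBackendServicer:
    """Mirrors the Health/LoadModel flow of the real BackendServicer."""

    def Health(self, request=None, context=None):
        return Reply(message=b"OK")

    def LoadModel(self, model_name, context=None):
        try:
            # Stand-in for SentenceTransformer(model_name).
            if not model_name:
                raise ValueError("empty model name")
            self.model = f"loaded:{model_name}"
        except Exception as err:
            return Result(success=False, message=f"Unexpected {err=}")
        return Result(success=True, message="Model loaded successfully")

servicer = MockBackendServicer()
print(servicer.Health().message)  # b'OK'
print(servicer.LoadModel("bert-base-nli-mean-tokens").message)  # Model loaded successfully
```

The real implementation returns the same `success`/`message` shape, which is what the test cases below assert on.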
extra/grpc/huggingface/huggingface.yml

Lines changed: 77 additions & 0 deletions

```yaml
name: huggingface
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.08.22=h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.4=h6a678d5_0
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - ncurses=6.4=h6a678d5_0
  - openssl=3.0.11=h7f8727e_2
  - pip=23.2.1=py311h06a4308_0
  - python=3.11.5=h955ad1f_0
  - readline=8.2=h5eee18b_0
  - setuptools=68.0.0=py311h06a4308_0
  - sqlite=3.41.2=h5eee18b_0
  - tk=8.6.12=h1ccaba5_0
  - tzdata=2023c=h04d1e81_0
  - wheel=0.41.2=py311h06a4308_0
  - xz=5.4.2=h5eee18b_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
      - certifi==2023.7.22
      - charset-normalizer==3.3.0
      - click==8.1.7
      - filelock==3.12.4
      - fsspec==2023.9.2
      - grpcio==1.59.0
      - huggingface-hub==0.17.3
      - idna==3.4
      - install==1.3.5
      - jinja2==3.1.2
      - joblib==1.3.2
      - markupsafe==2.1.3
      - mpmath==1.3.0
      - networkx==3.1
      - nltk==3.8.1
      - numpy==1.26.0
      - nvidia-cublas-cu12==12.1.3.1
      - nvidia-cuda-cupti-cu12==12.1.105
      - nvidia-cuda-nvrtc-cu12==12.1.105
      - nvidia-cuda-runtime-cu12==12.1.105
      - nvidia-cudnn-cu12==8.9.2.26
      - nvidia-cufft-cu12==11.0.2.54
      - nvidia-curand-cu12==10.3.2.106
      - nvidia-cusolver-cu12==11.4.5.107
      - nvidia-cusparse-cu12==12.1.0.106
      - nvidia-nccl-cu12==2.18.1
      - nvidia-nvjitlink-cu12==12.2.140
      - nvidia-nvtx-cu12==12.1.105
      - packaging==23.2
      - pillow==10.0.1
      - protobuf==4.24.4
      - pyyaml==6.0.1
      - regex==2023.10.3
      - requests==2.31.0
      - safetensors==0.4.0
      - scikit-learn==1.3.1
      - scipy==1.11.3
      - sentence-transformers==2.2.2
      - sentencepiece==0.1.99
      - sympy==1.12
      - threadpoolctl==3.2.0
      - tokenizers==0.14.1
      - torch==2.1.0
      - torchvision==0.16.0
      - tqdm==4.66.1
      - transformers==4.34.0
      - triton==2.1.0
      - typing-extensions==4.8.0
      - urllib3==2.0.6
prefix: /opt/conda/envs/huggingface
```
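Every pip dependency in the environment file is pinned with an exact `==` version, which keeps the conda env reproducible. A small, hypothetical sanity check for that convention (not part of the PR) could look like:

```python
import re

# Hypothetical check (not part of the PR): every pip entry should be an exact
# "name==version" pin, as in huggingface.yml. Range specifiers like ">=" fail.
PIN = re.compile(r"^[A-Za-z0-9._-]+==[A-Za-z0-9.]+$")

pins = ["grpcio==1.59.0", "sentence-transformers==2.2.2", "torch==2.1.0"]
assert all(PIN.match(p) for p in pins)
print("all pinned")  # all pinned
```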

extra/grpc/huggingface/run.sh

Lines changed: 10 additions & 0 deletions

```bash
##
## A bash script wrapper that runs the huggingface server with conda

# Activate conda environment
source activate huggingface

# Get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

python $DIR/huggingface.py
```
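run.sh resolves its own directory via `BASH_SOURCE`, which requires bash. A portable sketch of the same idea using `$0` (an assumption for illustration, not the shipped script) is:

```shell
# Portable sketch (not the shipped script): resolve the script's own directory
# with $0 instead of BASH_SOURCE, so it also works under plain sh.
DIR="$( cd "$( dirname "$0" )" >/dev/null 2>&1 && pwd )"
echo "script directory: $DIR"
```

Resolving `DIR` this way lets the wrapper launch `huggingface.py` correctly no matter which working directory it is invoked from.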

extra/grpc/huggingface/test.sh

Lines changed: 10 additions & 0 deletions

```bash
##
## A bash script wrapper that runs the huggingface tests with conda

# Activate conda environment
source activate huggingface

# Get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

python -m unittest $DIR/test_huggingface.py
```
extra/grpc/huggingface/test_huggingface.py

Lines changed: 81 additions & 0 deletions

```python
"""
A test script to test the gRPC service
"""
import unittest
import subprocess
import time
import backend_pb2
import backend_pb2_grpc

import grpc


class TestBackendServicer(unittest.TestCase):
    """
    TestBackendServicer is the class that tests the gRPC service
    """
    def setUp(self):
        """
        This method sets up the gRPC service by starting the server
        """
        self.service = subprocess.Popen(["python3", "huggingface.py", "--addr", "localhost:50051"])

    def tearDown(self) -> None:
        """
        This method tears down the gRPC service by terminating the server
        """
        self.service.terminate()
        self.service.wait()

    def test_server_startup(self):
        """
        This method tests if the server starts up successfully
        """
        time.sleep(2)
        try:
            self.setUp()
            with grpc.insecure_channel("localhost:50051") as channel:
                stub = backend_pb2_grpc.BackendStub(channel)
                response = stub.Health(backend_pb2.HealthMessage())
                self.assertEqual(response.message, b'OK')
        except Exception as err:
            print(err)
            self.fail("Server failed to start")
        finally:
            self.tearDown()

    def test_load_model(self):
        """
        This method tests if the model is loaded successfully
        """
        try:
            self.setUp()
            with grpc.insecure_channel("localhost:50051") as channel:
                stub = backend_pb2_grpc.BackendStub(channel)
                response = stub.LoadModel(backend_pb2.ModelOptions(Model="bert-base-nli-mean-tokens"))
                self.assertTrue(response.success)
                self.assertEqual(response.message, "Model loaded successfully")
        except Exception as err:
            print(err)
            self.fail("LoadModel service failed")
        finally:
            self.tearDown()

    def test_embedding(self):
        """
        This method tests if the embeddings are generated successfully
        """
        try:
            self.setUp()
            with grpc.insecure_channel("localhost:50051") as channel:
                stub = backend_pb2_grpc.BackendStub(channel)
                response = stub.LoadModel(backend_pb2.ModelOptions(Model="bert-base-nli-mean-tokens"))
                self.assertTrue(response.success)
                embedding_request = backend_pb2.PredictOptions(Embeddings="This is a test sentence.")
                embedding_response = stub.Embedding(embedding_request)
                self.assertIsNotNone(embedding_response.embeddings)
        except Exception as err:
            print(err)
            self.fail("Embedding service failed")
        finally:
            self.tearDown()
```
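The tests above start the real server as a subprocess in `setUp` and terminate it in `tearDown`. The same lifecycle pattern can be exercised without any gRPC dependency; the sleeping child process below is a hypothetical stand-in for `huggingface.py`:

```python
import subprocess
import sys
import unittest

class TestServerLifecycle(unittest.TestCase):
    def setUp(self):
        # Hypothetical stand-in "server": a child that sleeps until terminated.
        self.service = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )

    def tearDown(self):
        self.service.terminate()
        self.service.wait()

    def test_server_started(self):
        # poll() returns None while the child process is still running.
        self.assertIsNone(self.service.poll())

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestServerLifecycle)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```

Terminating and waiting on the child in `tearDown` ensures no orphan server processes are left behind between test cases, which matters when each test binds the same port.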
