openai-api-serving

running local open-source model like glm-4 and qwen2.5 with OpenAI compatible API
also with embedding and rerank support

examples:

export MODEL_ROOT=/data/huggingface/models
export LLM_MODEL=deepseek-ai/DeepSeek-R1-Distill-Qwen-14B/
export EMBEDDING_MODEL=maidalun1020/bce-embedding-base_v1/
export RERANK_MODEL=maidalun1020/bce-reranker-base_v1/
CUDA_VISIBLE_DEVICES=2,3 python3 openai_api_all_in_one.py

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
client_embedding.py		client_embedding.py
client_llm.py		client_llm.py
client_rerank.py		client_rerank.py
client_tool_call.py		client_tool_call.py
openai_api_all_in_one.py		openai_api_all_in_one.py
openai_api_app.py		openai_api_app.py
openai_api_embedding_app.py		openai_api_embedding_app.py
openai_api_glm4_app.py		openai_api_glm4_app.py
openai_api_protocol.py		openai_api_protocol.py
openai_api_qwen2_app.py		openai_api_qwen2_app.py
openai_api_rerank_app.py		openai_api_rerank_app.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

Repository files navigation

openai-api-serving

examples:

About

Uh oh!

Releases

Packages

Languages

Uh oh!

License

Uh oh!

jasonkylelol/openai-api-serving

Folders and files

Latest commit

History

Repository files navigation

openai-api-serving

examples:

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages