2.2.5 Backend: Aphrodite Engine
av edited this page Apr 26, 2025 · 4 revisions
Handle: aphrodite
URL: http://localhost:33921
PygmalionAI's large-scale inference engine
# [Optional] pre-pull the image, ~5GB
harbor pull aphrodite
# Start the service
harbor up aphrodite
# [Optional] When loading closed/gated models
# provision the token
harbor hf token <your-token>
# Open HF Search to find the models
harbor find gptq awq
# Download model repo to the global HF cache
# user/repo format
harbor hf download infly/INF-34B-Chat-AWQ
# Get/set the model to run
# in the aphrodite engine
harbor aphrodite model infly/INF-34B-Chat-AWQ
# See available options
harbor run aphrodite --help
# Get/set the extra arguments for
# the aphrodite engine
harbor aphrodite args
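Once the service is up, the URL listed above can be probed to confirm that the engine has loaded the model. The sketch below assumes the engine serves an OpenAI-compatible API under `/v1`, which this page does not state. The helper names `aphrodite_endpoint` and `aphrodite_models` and the `APHRODITE_URL` variable are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Base URL from this page; override via the (hypothetical) APHRODITE_URL variable.
APHRODITE_URL="${APHRODITE_URL:-http://localhost:33921}"

# Build an endpoint URL, e.g. "models" -> <base>/v1/models.
# Assumption: the API lives under /v1 (OpenAI-compatible layout).
aphrodite_endpoint() {
  printf '%s/v1/%s\n' "${APHRODITE_URL%/}" "$1"
}

# Ask the running engine which models it reports.
# curl exits non-zero if the service is down or returns an HTTP error.
aphrodite_models() {
  curl -fsS "$(aphrodite_endpoint models)"
}
```

After `harbor up aphrodite`, calling `aphrodite_models` should print a JSON listing if the assumption about the API layout holds. A non-zero exit code most likely means the engine is still loading or is not running.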
You can change the version (Docker image tag) of the engine:
# Get the current version - "latest" by default
harbor config get aphrodite.version
# Set the version
harbor config set aphrodite.version latest
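When switching tags, it can help to record the previous value so the change is easy to revert. The following is a minimal sketch that uses only the commands shown on this page. The function name `pin_aphrodite_version` is hypothetical, and it is an assumption that `harbor pull aphrodite` fetches the image for the newly configured tag.

```shell
#!/bin/sh
# Hypothetical helper: set the engine version and pre-pull the matching image.
pin_aphrodite_version() {
  tag="$1"
  if [ -z "$tag" ]; then
    echo "usage: pin_aphrodite_version <tag>" >&2
    return 1
  fi

  # Print the current value so it can be restored later.
  echo "previous version: $(harbor config get aphrodite.version)"

  # Switch to the requested tag.
  harbor config set aphrodite.version "$tag"

  # Assumption: pull uses the tag that was just configured.
  harbor pull aphrodite
}
```

Calling `pin_aphrodite_version latest` restores the default; any other tag has to exist in the image registry.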