OSC/async_llm_api


This project sends chat prompts to an Ollama API generate endpoint, enabling batch processing of multiple requests.
It is intended as example code to be extended for your specific use case.
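
Since the script's internals aren't reproduced in this README, the following is a minimal sketch of the underlying pattern, assuming a standard Ollama /api/generate JSON payload ("model", "prompt", "stream") and response field ("response"); the base URL and model name are placeholders, and the real async_api.py may be structured differently.

```python
# Minimal illustration (not the repository's async_api.py): send a batch of
# prompts concurrently with asyncio + aiohttp against an Ollama-style endpoint.
import asyncio
import aiohttp

BASE_URL = "http://localhost:11434"  # placeholder: point this at your server

async def generate(session: aiohttp.ClientSession, model: str, prompt: str) -> str:
    """POST one prompt to the generate endpoint and return the response text."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    async with session.post(f"{BASE_URL}/api/generate", json=payload) as resp:
        resp.raise_for_status()
        data = await resp.json()
        return data.get("response", "")

async def run_batch(prompts: list[str], model: str) -> list[str]:
    """Fire all requests at once and gather the results in prompt order."""
    async with aiohttp.ClientSession() as session:
        tasks = [generate(session, model, p) for p in prompts]
        return await asyncio.gather(*tasks)

if __name__ == "__main__":
    answers = asyncio.run(run_batch(["Why is the sky blue?", "Name three primes."], "llama3"))
    for answer in answers:
        print(answer[:100])
```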

Installation

  1. Install uv
    • pip install uv
  2. Create an environment
    • uv venv
  3. Activate the environment
    • source .venv/bin/activate
  4. In the activated environment, install the requirements
    • uv pip install -r requirements.txt

Usage

python async_api.py --filename prompts.txt --num_requests 10 --models model1 model2 --base_url https://your/server/api --randomize_models False --api_token $API_TOKEN 

Arguments:
--filename - Path to a file with prompts, one prompt per line
--num_requests - Number of requests to send; defaults to all prompts in the file
--models - Model names; one or more models must be specified
--base_url - Base URL of an OpenAI-API-compliant server
--randomize_models - Whether or not to use a random model from the list for each request
--api_token - JWT token for the auth header
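
As an illustration of how these flags could be wired together, here is one plausible parsing setup; the actual argument handling in async_api.py may differ, and the pick_model / auth_headers helpers below are assumptions, not code from the repository.

```python
# Illustrative sketch of a CLI matching the arguments documented above.
import argparse
import random

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Send asynchronous LLM API requests")
    parser.add_argument("--filename", required=True, help="file with one prompt per line")
    parser.add_argument("--num_requests", type=int, default=None, help="requests to send; default is all prompts")
    parser.add_argument("--models", nargs="+", required=True, help="one or more model names")
    parser.add_argument("--base_url", required=True, help="base URL of the API server")
    # Passed as a string on the CLI (e.g. False); convert with
    # args.randomize_models.lower() == "true" before use.
    parser.add_argument("--randomize_models", default="False", help="use a random model per request")
    parser.add_argument("--api_token", default=None, help="JWT token for the auth header")
    return parser.parse_args()

def pick_model(models: list[str], index: int, randomize: bool) -> str:
    # Either sample a random model per request or cycle through the list in order.
    return random.choice(models) if randomize else models[index % len(models)]

def auth_headers(api_token: str | None) -> dict[str, str]:
    # Assumption: the server expects a Bearer-style Authorization header.
    return {"Authorization": f"Bearer {api_token}"} if api_token else {}
```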
