# inference-manager

The inference-manager manages inference runtimes (e.g., vLLM and Ollama) in containers, loads models, and processes inference requests.

## Inference Request Flow

Please see inference_request_flow.md.
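
To make the division of responsibilities concrete, here is a minimal Go sketch of the pattern described above: a manager that registers models with a runtime, then routes each request to the runtime serving that model. All names here (`Runtime`, `Manager`, `stubRuntime`, and their methods) are hypothetical and for illustration only; the actual routing logic is documented in inference_request_flow.md.

```go
package main

import "fmt"

// Runtime abstracts an inference runtime such as vLLM or Ollama.
// Hypothetical interface for illustration; not the project's actual API.
type Runtime interface {
	Name() string
	LoadModel(modelID string) error
	Infer(modelID, prompt string) (string, error)
}

// Manager routes requests to the runtime registered for each model.
type Manager struct {
	runtimes map[string]Runtime // modelID -> runtime serving it
}

func NewManager() *Manager {
	return &Manager{runtimes: make(map[string]Runtime)}
}

// Register loads a model into a runtime and records the assignment.
func (m *Manager) Register(modelID string, rt Runtime) error {
	if err := rt.LoadModel(modelID); err != nil {
		return err
	}
	m.runtimes[modelID] = rt
	return nil
}

// Process forwards an inference request to the runtime serving the model.
func (m *Manager) Process(modelID, prompt string) (string, error) {
	rt, ok := m.runtimes[modelID]
	if !ok {
		return "", fmt.Errorf("no runtime registered for model %q", modelID)
	}
	return rt.Infer(modelID, prompt)
}

// stubRuntime stands in for a containerized runtime in this sketch.
type stubRuntime struct{ name string }

func (s *stubRuntime) Name() string { return s.name }

func (s *stubRuntime) LoadModel(modelID string) error {
	fmt.Printf("%s: loading model %s\n", s.name, modelID)
	return nil
}

func (s *stubRuntime) Infer(modelID, prompt string) (string, error) {
	return fmt.Sprintf("[%s/%s] response to: %s", s.name, modelID, prompt), nil
}

func main() {
	m := NewManager()
	if err := m.Register("llama3", &stubRuntime{name: "ollama"}); err != nil {
		panic(err)
	}
	out, err := m.Process("llama3", "hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```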