LCORE-455: Removed global vars to make compatible with uvicorn workers > 1 #604
Conversation
Walkthrough

This PR replaces module-level auth dependency variables with inline Depends(get_auth_dependency()) across many FastAPI endpoints, adjusts related imports, and introduces lazy-loading of authentication configuration in src/authentication/__init__.py. It also refactors application startup to use FastAPI's lifespan context and updates the CLI to set a config path environment variable for worker-local config loading.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant User
    participant FastAPI App
    participant Lifespan as Lifespan Context
    participant Config
    participant Llama as Llama Stack Client
    participant MCP as MCP Registry
    participant DB
    User->>FastAPI App: Start process
    activate FastAPI App
    FastAPI App->>Lifespan: enter()
    activate Lifespan
    Lifespan->>Config: Load configuration
    Lifespan->>Llama: Initialize / check version
    Lifespan->>MCP: Register servers
    Lifespan->>DB: Initialize and create tables
    Lifespan-->>FastAPI App: ready
    deactivate Lifespan
    note over FastAPI App: App serves requests
```

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant Endpoint
    participant DI as Depends(get_auth_dependency())
    participant Auth as authentication.get_auth_dependency
    participant Cfg as configuration
    participant FS as File Loader
    Client->>Endpoint: HTTP request
    Endpoint->>DI: Resolve dependency
    DI->>Auth: get_auth_dependency()
    Auth->>Cfg: Read auth module
    alt Config not loaded (LogicError)
        Auth->>FS: Load from LIGHTSPEED_STACK_CONFIG_PATH
        FS-->>Cfg: Configuration loaded
        Auth->>Cfg: Re-read auth module
    end
    Auth-->>DI: Auth dependency
    DI-->>Endpoint: Auth tuple
    Endpoint-->>Client: Response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
src/app/endpoints/models.py (1)

Lines 77-79: Avoid logging secrets; redact Llama Stack configuration. Logging the entire configuration at INFO risks leaking API keys and sensitive data. Log only non-secret fields (e.g., url, use_as_library_client) and move to DEBUG if needed.

```diff
-logger.info("Llama stack config: %s", llama_stack_configuration)
+logger.info(
+    "Llama stack configured: url=%s, library_client=%s",
+    getattr(llama_stack_configuration, "url", "<unset>"),
+    getattr(llama_stack_configuration, "use_as_library_client", "<unset>"),
+)
```

src/app/endpoints/query.py (3)
Lines 199-201: Avoid logging secrets; redact Llama Stack configuration. Same concern as in models.py. Do not print the full configuration; log only safe fields.

```diff
-logger.info("Llama stack config: %s", configuration.llama_stack_configuration)
+safe_cfg = configuration.llama_stack_configuration
+logger.info(
+    "Llama stack configured: url=%s, library_client=%s",
+    getattr(safe_cfg, "url", "<unset>"),
+    getattr(safe_cfg, "use_as_library_client", "<unset>"),
+)
```
Lines 317-321: Bug: constructing AnyUrl at runtime will raise. Pass a str and let Pydantic validate. pydantic.AnyUrl is a type, not a runtime constructor; AnyUrl(chunk.source) will error.

```diff
-                        doc_url=(
-                            AnyUrl(chunk.source)
-                            if chunk.source.startswith("http")
-                            else None
-                        ),
+                        doc_url=chunk.source if chunk.source.startswith("http") else None,
```
Lines 309-334: Referenced documents computed but not used (shadowing bug). You build referenced_docs from rag_chunks but return the earlier referenced_documents from retrieve_response. Use the newly built list when present.

```diff
     referenced_docs = []
     doc_sources = set()
     for chunk in summary.rag_chunks:
         if chunk.source and chunk.source not in doc_sources:
             doc_sources.add(chunk.source)
             referenced_docs.append(
                 ReferencedDocument(
-                    doc_url=(
-                        AnyUrl(chunk.source)
-                        if chunk.source.startswith("http")
-                        else None
-                    ),
+                    doc_url=chunk.source if chunk.source.startswith("http") else None,
                     doc_title=chunk.source,
                 )
             )

     logger.info("Building final response...")
+    # Prefer chunk-derived docs when available, otherwise keep those parsed from turn steps
+    referenced_documents = referenced_docs or referenced_documents
     response = QueryResponse(
         conversation_id=conversation_id,
         response=summary.llm_response,
         rag_chunks=summary.rag_chunks if summary.rag_chunks else [],
         tool_calls=tool_calls if tool_calls else None,
         referenced_documents=referenced_documents,
     )
```

src/app/main.py (2)
Lines 55-73: Fix app metadata init to avoid pre-load config access; update after startup. Initialize with placeholders, then set title/summary/description inside lifespan after configuration is loaded.

```diff
-app = FastAPI(
-    title=f"{service_name} service - OpenAPI",
-    summary=f"{service_name} service API specification.",
-    description=f"{service_name} service API specification.",
+app = FastAPI(
+    title="Lightspeed service - OpenAPI",
+    summary="Lightspeed service API specification.",
+    description="Lightspeed service API specification.",
     version=version.__version__,
@@
     lifespan=lifespan,
 )
```

And inside lifespan, right after loading configuration:

```diff
@@
-    configuration.load_configuration(os.environ["LIGHTSPEED_STACK_CONFIG_PATH"])
+    # Load configuration and update app metadata
+    configuration.load_configuration(config_path)
+    service_name = configuration.configuration.name
+    _app.title = f"{service_name} service - OpenAPI"
+    _app.summary = f"{service_name} service API specification."
+    _app.description = f"{service_name} service API specification."
```
Lines 86-88: Middleware decorator must use 'http'. @app.middleware("") is invalid; use "http" so the middleware registers.

```diff
-@app.middleware("")
+@app.middleware("http")
 async def rest_api_metrics(
```
🧹 Nitpick comments (3)
src/app/endpoints/models.py (2)
Line 7: Use canonical FastAPI import for Depends. Import Depends from fastapi, not fastapi.params, to match project convention and avoid confusion. As per coding guidelines.

```diff
-from fastapi.params import Depends
+from fastapi import Depends
```
Lines 50-51: Add response_model for better schema guarantees. Expose the success schema via response_model to enforce and document the response type.

```diff
-@router.get("/models", responses=models_responses)
+@router.get("/models", response_model=ModelsResponse, responses=models_responses)
```

src/app/endpoints/query.py (1)
Lines 171-173: Add response_model to document and enforce the success payload. This improves OpenAPI and client generation.

```diff
-@router.post("/query", responses=query_response)
+@router.post("/query", response_model=QueryResponse, responses=query_response)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- src/app/endpoints/authorized.py (2 hunks)
- src/app/endpoints/config.py (2 hunks)
- src/app/endpoints/conversations.py (4 hunks)
- src/app/endpoints/conversations_v2.py (4 hunks)
- src/app/endpoints/feedback.py (4 hunks)
- src/app/endpoints/health.py (3 hunks)
- src/app/endpoints/info.py (2 hunks)
- src/app/endpoints/metrics.py (1 hunk)
- src/app/endpoints/models.py (2 hunks)
- src/app/endpoints/query.py (1 hunk)
- src/app/endpoints/root.py (2 hunks)
- src/app/endpoints/streaming_query.py (1 hunk)
- src/app/main.py (2 hunks)
- src/authentication/__init__.py (2 hunks)
- src/lightspeed_stack.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/app/endpoints/query.py
src/app/endpoints/metrics.py
src/app/endpoints/streaming_query.py
src/app/endpoints/conversations_v2.py
src/lightspeed_stack.py
src/app/main.py
src/app/endpoints/authorized.py
src/app/endpoints/conversations.py
src/authentication/__init__.py
src/app/endpoints/health.py
src/app/endpoints/feedback.py
src/app/endpoints/models.py
src/app/endpoints/config.py
src/app/endpoints/info.py
src/app/endpoints/root.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/query.py
src/app/endpoints/metrics.py
src/app/endpoints/streaming_query.py
src/app/endpoints/conversations_v2.py
src/app/main.py
src/app/endpoints/authorized.py
src/app/endpoints/conversations.py
src/app/endpoints/health.py
src/app/endpoints/feedback.py
src/app/endpoints/models.py
src/app/endpoints/config.py
src/app/endpoints/info.py
src/app/endpoints/root.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py
: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/app/endpoints/query.py
src/app/endpoints/metrics.py
src/app/endpoints/streaming_query.py
src/app/endpoints/conversations_v2.py
src/lightspeed_stack.py
src/app/main.py
src/app/endpoints/authorized.py
src/app/endpoints/conversations.py
src/authentication/__init__.py
src/app/endpoints/health.py
src/app/endpoints/feedback.py
src/app/endpoints/models.py
src/app/endpoints/config.py
src/app/endpoints/info.py
src/app/endpoints/root.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/query.py
src/app/endpoints/metrics.py
src/app/endpoints/streaming_query.py
src/app/endpoints/conversations_v2.py
src/app/main.py
src/app/endpoints/authorized.py
src/app/endpoints/conversations.py
src/app/endpoints/health.py
src/app/endpoints/feedback.py
src/app/endpoints/models.py
src/app/endpoints/config.py
src/app/endpoints/info.py
src/app/endpoints/root.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/query.py
src/app/endpoints/metrics.py
src/app/endpoints/streaming_query.py
src/app/endpoints/conversations_v2.py
src/app/endpoints/authorized.py
src/app/endpoints/conversations.py
src/app/endpoints/health.py
src/app/endpoints/feedback.py
src/app/endpoints/models.py
src/app/endpoints/config.py
src/app/endpoints/info.py
src/app/endpoints/root.py
**/__init__.py
📄 CodeRabbit inference engine (CLAUDE.md)
Package __init__.py files contain brief package descriptions
Files:
src/authentication/__init__.py
🧬 Code graph analysis (15)
src/app/endpoints/query.py (1)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)

src/app/endpoints/metrics.py (4)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)
- src/authorization/middleware.py (1): authorize (111-122)
- src/metrics/utils.py (1): setup_model_metrics (20-57)
- src/models/config.py (2): config (138-144), Action (311-350)

src/app/endpoints/streaming_query.py (1)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)

src/app/endpoints/conversations_v2.py (4)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)
- src/authorization/middleware.py (1): authorize (111-122)
- src/models/config.py (2): config (138-144), Action (311-350)
- src/models/responses.py (3): ConversationDeleteResponse (513-546), ConversationResponse (463-510), ConversationsListResponseV2 (670-677)

src/lightspeed_stack.py (2)
- src/configuration.py (1): configuration (65-69)
- src/runners/uvicorn.py (1): start_uvicorn (12-31)

src/app/main.py (6)
- src/app/database.py (2): create_tables (29-31), initialize_database (101-129)
- src/client.py (3): AsyncLlamaStackClientHolder (18-55), load (23-47), get_client (49-55)
- src/configuration.py (2): configuration (65-69), load_configuration (52-58)
- src/log.py (1): get_logger (7-13)
- src/utils/common.py (1): register_mcp_servers_async (15-34)
- src/utils/llama_stack_version.py (1): check_llama_stack_version (22-39)

src/app/endpoints/authorized.py (1)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)

src/app/endpoints/conversations.py (2)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)
- src/authorization/middleware.py (1): authorize (111-122)

src/authentication/__init__.py (1)
- src/configuration.py (4): configuration (65-69), LogicError (32-33), authentication_configuration (100-105), load_configuration (52-58)

src/app/endpoints/health.py (3)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)
- src/authorization/middleware.py (1): authorize (111-122)
- src/models/responses.py (2): ProviderHealthStatus (199-219), ReadinessResponse (222-275)

src/app/endpoints/feedback.py (2)
- src/models/responses.py (1): ForbiddenResponse (448-460)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)

src/app/endpoints/models.py (2)
- src/authorization/middleware.py (1): authorize (111-122)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)

src/app/endpoints/config.py (1)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)

src/app/endpoints/info.py (4)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)
- src/authorization/middleware.py (1): authorize (111-122)
- src/client.py (1): AsyncLlamaStackClientHolder (18-55)
- src/configuration.py (1): configuration (65-69)

src/app/endpoints/root.py (1)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (26)
src/lightspeed_stack.py (2)

Lines 77-79: LGTM! Environment variable pattern enables worker-local config loading. The env var approach correctly addresses the process isolation issue with uvicorn workers > 1. Each worker can now independently load configuration via get_auth_dependency().

Line 60: Configuration validation is consistent. Both the main process (at line 60) and each worker in get_auth_dependency invoke the same configuration.load_configuration, so all schema and permission errors are validated identically in both contexts.

src/app/endpoints/authorized.py (1)
Line 40: LGTM! Dependency injection pattern correctly migrated. The change from module-level auth_dependency to inline Depends(get_auth_dependency()) enables per-worker authentication setup, fixing the multi-worker compatibility issue.

src/app/endpoints/conversations_v2.py (1)
Line 95: LGTM! Consistent dependency injection pattern across all conversation endpoints. All three endpoints (get_conversations_list, get_conversation, delete_conversation) correctly migrated from module-level auth dependency to per-request resolution. Also applies to: 125-125, 161-161.
src/app/endpoints/streaming_query.py (1)
Line 572: LGTM! Streaming endpoint correctly migrated to per-request auth dependency. The change ensures each worker process can independently authenticate streaming query requests, which is critical for the high-concurrency streaming endpoint.
src/app/endpoints/config.py (1)
Line 64: LGTM! Config endpoint correctly uses per-request auth dependency. The migration is consistent with other endpoints. Note that get_auth_dependency() may load configuration as a side effect (lines 19-24 in src/authentication/__init__.py), and this endpoint also returns configuration; this is acceptable since the lazy loading ensures config is available before this handler executes.

src/app/endpoints/models.py (1)
Lines 54-55: Auth DI change looks good. Per-request Depends(get_auth_dependency()) aligns with the worker-per-process model and removes globals.
src/app/endpoints/query.py (1)
Lines 176-178: Auth DI change looks good. The switch to Depends(get_auth_dependency()) is correct and consistent.
src/app/endpoints/root.py (1)
Lines 781-783: Auth DI change looks good. The per-request dependency is consistent with the rest of the API.
src/app/endpoints/health.py (2)
Lines 81-83: Auth DI change looks good. The readiness handler now uses the per-request auth dependency; consistent with PR goals.

Lines 127-129: Auth DI change looks good. The liveness handler is updated consistently.
src/authentication/__init__.py (2)
Line 4: Import added for environment variable access. The os import is appropriately added to support reading LIGHTSPEED_STACK_CONFIG_PATH from the environment.

Lines 7-9: Imports reorganized for explicit dependency access. The imports now explicitly reference AuthInterface and LogicError, with configuration as a module import, supporting the new lazy-loading pattern. This follows the coding guideline for absolute imports.

src/app/endpoints/info.py (2)
Line 6: FastAPI imports follow standard order. The import statement correctly includes Depends alongside other FastAPI components as per coding guidelines.

Line 40: Dependency injection updated for per-request resolution. The endpoint now uses Depends(get_auth_dependency()) instead of a precomputed auth_dependency, aligning with the PR's goal to support multiple Uvicorn workers by eliminating shared global state. Each request will now obtain a fresh authentication dependency.

src/app/endpoints/metrics.py (2)
Line 5: FastAPI dependencies consolidated. The import statement appropriately groups FastAPI components including Depends, following standard import conventions.

Line 24: Authentication dependency updated for multi-worker compatibility. The parameter now uses Depends(get_auth_dependency()) for per-request dependency resolution, consistent with the PR's objective to eliminate global state for Uvicorn multi-worker deployments.

src/app/endpoints/feedback.py (5)
Line 3: Import added for JSON serialization. The json import supports json.dump() usage at line 156 in store_feedback(). This import is necessary and correctly placed.

Line 6: Datetime imports consolidated. UTC and datetime are now imported together in a single statement, improving import hygiene.

Line 22: ForbiddenResponse imported for endpoint responses. The import is correctly added to support the response model at line 44 in the feedback_response dictionary.

Line 90: Feedback endpoint updated for per-request dependency resolution. The auth parameter now uses Depends(get_auth_dependency()), aligning with the multi-worker compatibility goal by removing reliance on module-level state.

Line 186: Status update endpoint updated for per-request dependency resolution. Consistent with the other endpoint, the auth parameter uses Depends(get_auth_dependency()) to support worker-isolated authentication.

src/app/endpoints/conversations.py (4)
Line 6: FastAPI imports standardized. Depends is now imported alongside other FastAPI components, following the coding guideline for standard FastAPI imports.

Line 181: List endpoint updated for multi-worker compatibility. get_conversations_list_endpoint_handler now uses Depends(get_auth_dependency()) for per-request authentication, eliminating shared module-level state.

Line 243: Get endpoint updated for multi-worker compatibility. get_conversation_endpoint_handler now uses Depends(get_auth_dependency()) for per-request authentication, consistent with the PR's objective.

Line 371: Delete endpoint updated for multi-worker compatibility. delete_conversation_endpoint_handler now uses Depends(get_auth_dependency()) for per-request authentication, completing the migration away from module-level dependency instances.
```python
logger.info("Initializing app")

service_name = configuration.configuration.name
```
Critical: configuration accessed before it's loaded.

service_name = configuration.configuration.name will raise LogicError at import time because configuration is loaded later in lifespan. Move name resolution into lifespan and update app metadata after loading.

```diff
-service_name = configuration.configuration.name
+# Defer service name resolution until configuration is loaded in lifespan
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/app/main.py around line 26, the code reads
configuration.configuration.name at import time which raises LogicError because
the configuration is loaded later in the app lifespan; move the service name
resolution into the app lifespan/startup routine where configuration is loaded,
remove the top-level access, and after loading configuration assign the name to
the app metadata (for example app.title or app.state.metadata['name']) so the
application metadata is updated only after configuration is available.
```python
    configuration.load_configuration(os.environ["LIGHTSPEED_STACK_CONFIG_PATH"])
    await AsyncLlamaStackClientHolder().load(configuration.configuration.llama_stack)
    client = AsyncLlamaStackClientHolder().get_client()
    # check if the Llama Stack version is supported by the service
```
Handle missing LIGHTSPEED_STACK_CONFIG_PATH gracefully.
Direct indexing os.environ[...] will crash with KeyError. Validate and fail with a clear message (or provide a safe default if desired).
```diff
-    configuration.load_configuration(os.environ["LIGHTSPEED_STACK_CONFIG_PATH"])
+    config_path = os.getenv("LIGHTSPEED_STACK_CONFIG_PATH")
+    if not config_path:
+        logger.error("LIGHTSPEED_STACK_CONFIG_PATH is not set; cannot load configuration")
+        raise RuntimeError("Required env var LIGHTSPEED_STACK_CONFIG_PATH is not set")
+    configuration.load_configuration(config_path)
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    config_path = os.getenv("LIGHTSPEED_STACK_CONFIG_PATH")
    if not config_path:
        logger.error("LIGHTSPEED_STACK_CONFIG_PATH is not set; cannot load configuration")
        raise RuntimeError("Required env var LIGHTSPEED_STACK_CONFIG_PATH is not set")
    configuration.load_configuration(config_path)
    await AsyncLlamaStackClientHolder().load(configuration.configuration.llama_stack)
    client = AsyncLlamaStackClientHolder().get_client()
    # check if the Llama Stack version is supported by the service
```
```python
    try:
        module = configuration.authentication_configuration.module
    except LogicError:
        # Only load once if not already loaded
        config_path = os.getenv(
            "LIGHTSPEED_STACK_CONFIG_PATH",
            "tests/configuration/lightspeed-stack.yaml",
        )
        configuration.load_configuration(config_path)
        module = configuration.authentication_configuration.module
```
🧩 Analysis chain
Review the default configuration path and add concurrency protection.
Three concerns with the lazy-loading implementation:

1. Test path as production default: Line 24 defaults to "tests/configuration/lightspeed-stack.yaml", which is a test fixture path. In a production multi-worker deployment, this path may not exist or may contain inappropriate test configuration. Consider using a production-appropriate default or making the environment variable mandatory.
2. Misleading comment: Line 21 states "Only load once if not already loaded", but the code does not prevent multiple concurrent workers or threads from simultaneously calling configuration.load_configuration(config_path) when they all encounter LogicError. If multiple requests hit this code path concurrently before configuration loads, race conditions may occur.
3. Repeated loading on exception: Each time LogicError is raised (configuration not loaded), the code will call load_configuration. While this ensures eventual loading, the comment implies single execution, which is not guaranteed.
Verify if concurrent initialization is protected elsewhere:
🏁 Script executed:

```bash
#!/bin/bash
# Check if configuration.load_configuration has thread-safety mechanisms
ast-grep --pattern $'def load_configuration($$$) -> $RET:
  $$$'
```

Length of output: 539
🏁 Script executed:

```bash
#!/bin/bash
# show the lazy-load block in authentication module
sed -n '1,50p' src/authentication/__init__.py
# search for explicit locks
rg -n 'threading\.Lock' -C2 .
```

Length of output: 2104
Enforce safe, production-ready configuration loading
- Replace the test-fixture default ("tests/configuration/lightspeed-stack.yaml") with a production-appropriate path, or make LIGHTSPEED_STACK_CONFIG_PATH mandatory.
- Surround configuration.load_configuration(config_path) with a synchronization primitive (e.g. a module-level threading.Lock) to prevent concurrent races.
- Update or remove the "Only load once if not already loaded" comment to reflect the actual single-load guarantee under concurrency. [src/authentication/__init__.py:18-27]
🤖 Prompt for AI Agents
In src/authentication/__init__.py around lines 18 to 27, the code uses a
test-fixture default config path and loads configuration without
synchronization; change this to require LIGHTSPEED_STACK_CONFIG_PATH to be set
(remove the "tests/..." default) or replace it with a production path, wrap the
call to configuration.load_configuration(config_path) in a module-level
threading.Lock to prevent concurrent races during lazy load, and update the
comment to reflect that the lock enforces a single-load guarantee under
concurrency.
Nice one, LGTM
Description
When running Uvicorn with workers > 1, LCS did not work properly because each worker is a separate process, and global variables are not shared across processes. To fix this, the code was changed to initialize necessary resources within each worker process, rather than relying on global variables.
Note: To load the configuration across processes, an environment variable was added to track the location of the config file.
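The CLI-side handoff can be sketched like this. It is a hedged illustration based on the PR summary, not the project's exact code: the main function signature is an assumption, and the start_uvicorn call is left commented out since spawning workers is out of scope here.

```python
# Sketch: the parent process publishes the config file location via an
# environment variable before spawning uvicorn workers. Child processes
# inherit the environment even though they do not share the parent's globals.
import os

def main(config_path: str, workers: int = 2) -> None:
    """Validate config in the parent, then record its path for workers."""
    # configuration.load_configuration(config_path)  # validate once up front

    # Each worker re-reads this variable and loads its own copy of the config.
    os.environ["LIGHTSPEED_STACK_CONFIG_PATH"] = config_path

    # start_uvicorn(workers=workers)  # hand off to uvicorn with workers > 1
```

This works because os.environ set in the parent is inherited by worker processes at spawn time, while Python module globals are not.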
Testing
Change the lightspeed_stack.yaml config to increase the number of workers from 1 to any number >1, then stress test an endpoint which uses llama-stack (/query, /streaming_query).

LCS was configured with 2 / 4 workers and I tested the /query and /streaming_query endpoints. A full stress test would still need to be performed.