
Conversation


@umago umago commented Jul 18, 2025

Description

This patch adds the /metrics endpoint to the project.

Differences from road-core/service:

  • The metrics are prefixed with "ls_" instead of "ols_"
  • The "provider_model_configuration" does not set the non-default model/providers to 0 because we currently do not have a way to set a default model/provider in the configuration. A TODO was left in the code.

Supported metrics:

  • rest_api_calls_total: Counter to track REST API calls
  • response_duration_seconds: Histogram to measure how long it takes to handle requests
  • provider_model_configuration: Indicates what provider + model customers are using
  • llm_calls_total: How many LLM calls were made for each provider + model
  • llm_calls_failures_total: How many LLM calls failed
  • llm_calls_validation_errors_total: How many LLM calls had validation errors

Missing metrics:

  • llm_token_sent_total: How many tokens were sent
  • llm_token_received_total: How many tokens were received

The above metrics are missing because the token counting PR (#215) has not been merged yet.
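
For reference, the declarations below are a minimal sketch of how these metrics could be defined with prometheus_client. The metric names follow the list above; the help strings and label names are assumptions, not the exact contents of src/metrics/__init__.py.

from prometheus_client import Counter, Gauge, Histogram

# REST API metrics
rest_api_calls_total = Counter(
    "ls_rest_api_calls_total", "REST API calls counter", ["path", "status_code"]
)
response_duration_seconds = Histogram(
    "ls_response_duration_seconds", "Response durations", ["path"]
)

# Gauge tracking which provider/model combinations are configured
provider_model_configuration = Gauge(
    "ls_provider_model_configuration", "LLM provider/model configuration", ["provider", "model"]
)

# LLM call metrics
llm_calls_total = Counter("ls_llm_calls_total", "LLM calls counter", ["provider", "model"])
llm_calls_failures_total = Counter("ls_llm_calls_failures_total", "LLM call failures")
llm_calls_validation_errors_total = Counter(
    "ls_llm_validation_errors_total", "LLM calls with validation errors"
)

# Placeholders to be wired up once token counting (#215) merges
llm_token_sent_total = Counter("ls_llm_token_sent_total", "Tokens sent", ["provider", "model"])
llm_token_received_total = Counter(
    "ls_llm_token_received_total", "Tokens received", ["provider", "model"]
)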

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #LCORE-220
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Introduced a new /metrics API endpoint providing real-time Prometheus-compatible metrics for API usage, performance, and LLM interactions.
    • Added middleware to track REST API request counts and durations by endpoint and response status.
    • Enhanced observability with detailed metrics for LLM provider/model usage, total calls, failures, validation errors, and token counts.
    • Model selection updated to include provider information for improved metric granularity.
  • Bug Fixes

    • Improved error tracking and validation error reporting for LLM-related endpoints.
  • Chores

    • Added and updated dependencies to support metrics functionality.
    • Expanded and updated unit tests to cover new metrics features and ensure robust monitoring.


coderabbitai bot commented Jul 18, 2025

Walkthrough

This update introduces a comprehensive metrics and observability layer to the application. Prometheus-compatible metrics are added, tracked, and exposed via a new /metrics endpoint. Instrumentation is integrated into REST and LLM-related endpoints, with supporting middleware and utility modules. Related tests are updated and expanded to verify metrics functionality and correct integration.

Changes

  • pyproject.toml: Added prometheus-client and starlette as dependencies.
  • src/app/endpoints/metrics.py: Added a new /metrics endpoint serving Prometheus metrics.
  • src/app/endpoints/query.py, src/app/endpoints/streaming_query.py: Refactored model selection to return (model_id, provider_id); added LLM call metrics instrumentation.
  • src/app/main.py: Added metrics middleware for REST endpoints; initialized model metrics at startup.
  • src/app/routers.py: Registered the new metrics router.
  • src/metrics/__init__.py: Introduced Prometheus metric definitions for REST and LLM tracking.
  • src/metrics/utils.py: Added utility for initializing provider/model configuration metrics.
  • tests/unit/app/endpoints/test_metrics.py: Added unit test for the /metrics endpoint.
  • tests/unit/app/endpoints/test_query.py: Updated tests for new model selection API and added assertions for LLM metrics.
  • tests/unit/app/endpoints/test_streaming_query.py: Updated tests for new model selection API and LLM validation error metrics.
  • tests/unit/app/test_routers.py: Updated router tests to account for the new metrics router.
  • tests/unit/metrics/__init__.py: Added docstring file for metrics unit tests.
  • tests/unit/metrics/test_utis.py: Added unit test for setup_model_metrics utility.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant FastAPI App
    participant Metrics Middleware
    participant Endpoint Handler
    participant Prometheus Client

    Client->>FastAPI App: HTTP request (any endpoint)
    FastAPI App->>Metrics Middleware: Pass request
    Metrics Middleware->>Endpoint Handler: Forward request
    Endpoint Handler-->>Metrics Middleware: Response
    Metrics Middleware->>Prometheus Client: Record metrics (duration, status, path)
    Metrics Middleware-->>FastAPI App: Return response
    FastAPI App-->>Client: Response

    Client->>FastAPI App: GET /metrics
    FastAPI App->>metrics_endpoint_handler: Handle request
    metrics_endpoint_handler->>Prometheus Client: Collect latest metrics
    Prometheus Client-->>metrics_endpoint_handler: Metrics data
    metrics_endpoint_handler-->>FastAPI App: PlainTextResponse (metrics)
    FastAPI App-->>Client: Prometheus metrics
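
To make the middleware leg of the diagram concrete, the sketch below shows the general shape of such a FastAPI middleware. It assumes the metric objects from the project's metrics module and an app_routes_paths list built at startup; the route filter and label order are assumptions, not the actual src/app/main.py code.

import time

from fastapi import FastAPI, Request, Response

import metrics  # the project's ls_-prefixed Prometheus metrics module

app = FastAPI()
app_routes_paths: list[str] = []  # populated after the routers are registered

@app.middleware("http")
async def rest_api_metrics(request: Request, call_next) -> Response:
    """Count REST API calls and record response duration per path and status."""
    path = request.url.path
    start = time.perf_counter()
    response = await call_next(request)
    duration = time.perf_counter() - start
    if path in app_routes_paths:  # only track known application routes
        metrics.rest_api_calls_total.labels(path, str(response.status_code)).inc()
        metrics.response_duration_seconds.labels(path).observe(duration)
    return response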

Estimated code review effort

3 (~40 minutes)

Suggested reviewers

  • tisnik

Poem

In fields of code, the metrics bloom,
Prometheus sings in every room.
From queries deep to tokens spent,
Each call and error now is meant
To surface in the metrics stream—
Observability, a rabbit’s dream!
🐇📊


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3359ba and db5b65b.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • pyproject.toml (1 hunks)
  • src/app/endpoints/metrics.py (1 hunks)
  • src/app/endpoints/query.py (8 hunks)
  • src/app/endpoints/streaming_query.py (5 hunks)
  • src/app/main.py (2 hunks)
  • src/app/routers.py (2 hunks)
  • src/metrics/__init__.py (1 hunks)
  • src/metrics/utils.py (1 hunks)
  • tests/unit/app/endpoints/test_metrics.py (1 hunks)
  • tests/unit/app/endpoints/test_query.py (15 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
  • tests/unit/app/test_routers.py (5 hunks)
  • tests/unit/metrics/__init__.py (1 hunks)
  • tests/unit/metrics/test_utis.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (13)
  • tests/unit/metrics/__init__.py
  • pyproject.toml
  • src/app/routers.py
  • src/app/endpoints/metrics.py
  • tests/unit/app/test_routers.py
  • tests/unit/metrics/test_utis.py
  • tests/unit/app/endpoints/test_metrics.py
  • src/metrics/utils.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • src/app/main.py
  • tests/unit/app/endpoints/test_streaming_query.py
  • src/metrics/__init__.py
🔇 Additional comments (15)
tests/unit/app/endpoints/test_query.py (15)

15-15: LGTM: Import updated to match function signature change.

The import correctly reflects the updated function name that now returns a tuple of model and provider IDs.


66-66: Good defensive mocking practice.

Adding the steps attribute initialization prevents potential AttributeErrors in tests that access this property, making the fixture more robust.


111-111: LGTM: Metrics integration added correctly.

The LLM calls metric is properly patched and will be validated in the test assertions, ensuring the metrics functionality works as expected.


134-137: LGTM: Mock updated to match new function signature.

The mock correctly returns a tuple of (model_id, provider_id) to match the updated function signature, ensuring test consistency.


152-154: LGTM: Metrics assertion validates successful LLM call tracking.

The test correctly verifies that the LLM calls counter is incremented with the appropriate provider and model labels when a successful call occurs.


182-183: LGTM: Test function name updated to match the function being tested.

The rename maintains consistency between the test name and the function it tests.


194-200: LGTM: Proper tuple unpacking and assertion of both return values.

The test correctly unpacks the tuple returned by the updated function and validates both the model ID and provider ID.


202-203: LGTM: Consistent test function renames and tuple handling.

All test functions have been systematically updated to match the new function signature, with proper tuple unpacking and assertion of both returned values.

Also applies to: 219-226, 228-229, 248-249


239-239: LGTM: Updated to expect specific HTTPException instead of generic Exception.

Using HTTPException is more appropriate for FastAPI endpoints and provides better error specificity than generic exceptions.

Also applies to: 256-256


321-321: LGTM: Validation error metrics properly tested with negative case.

The test correctly verifies that validation error metrics are not incremented when there are no shield violations, ensuring accurate metric reporting.

Also applies to: 345-346


775-777: LGTM: Refactored to use consistent fixture pattern.

Using the prepare_agent_mocks fixture improves consistency across tests and reduces code duplication.

Also applies to: 779-779


867-909: LGTM: New test provides important coverage for shield violation metrics.

The test correctly verifies that validation error metrics are incremented when shield violations occur, complementing the negative test case and ensuring comprehensive metrics coverage.


1002-1002: LGTM: Failure metrics properly tested for connection errors.

The test correctly verifies that the failure metric is incremented when APIConnectionError occurs, ensuring proper error tracking for monitoring.

Also applies to: 1020-1020


1009-1021: LGTM: Connection error test properly validates error handling and metrics.

The test correctly simulates APIConnectionError, validates the proper HTTP response, and ensures failure metrics are incremented, providing comprehensive error scenario coverage.


1323-1327: LGTM: Auth test updated consistently with function signature change.

The mock patch correctly uses the updated function name and returns a tuple to match the new signature, maintaining test consistency.

umago commented Jul 18, 2025

Need to add more unit tests; also waiting on #215 to merge to add the token metrics.

@umago umago force-pushed the metrics-LCORE-220 branch 2 times, most recently from ca00143 to ed6c493 on July 22, 2025 16:00
@umago umago marked this pull request as ready for review July 22, 2025 16:05

@tisnik tisnik left a comment


it looks ok

import os
from pathlib import Path
from typing import Any
from typing import Any, Tuple
Contributor

tuple ;) does not need to be imported

Contributor Author

Oops, bad habits haha. I will fix it

"""Select the model ID based on the request or available models."""
def select_model_and_provider_id(
models: ModelListResponse, query_request: QueryRequest
) -> Tuple[str, str | None]:
Contributor

tuple


tisnik commented Jul 22, 2025

@coderabbitai full review


coderabbitai bot commented Jul 22, 2025

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (1)
src/metrics/utils.py (1)

13-33: Add error handling for client operations.

The function lacks error handling for potential failures when accessing the client or listing models, which could cause the application to crash during startup.

def setup_model_metrics() -> None:
    """Perform setup of all metrics related to LLM model and provider."""
-   client = LlamaStackClientHolder().get_client()
-   models = [
-       model
-       for model in client.models.list()
-       if model.model_type == "llm"  # pyright: ignore[reportAttributeAccessIssue]
-   ]
+   try:
+       client = LlamaStackClientHolder().get_client()
+       all_models = client.models.list()
+       models = [
+           model
+           for model in all_models
+           if hasattr(model, 'model_type') and model.model_type == "llm"
+       ]
+   except Exception as e:
+       logger.error("Failed to retrieve models for metrics setup: %s", e)
+       return

    for model in models:
-       provider = model.provider_id
-       model_name = model.identifier
+       provider = getattr(model, 'provider_id', None)
+       model_name = getattr(model, 'identifier', None)
        if provider and model_name:
-           label_key = (provider, model_name)
-           metrics.provider_model_configuration.labels(*label_key).set(1)
-           logger.debug(
-               "Set provider/model configuration for %s/%s to 1",
-               provider,
-               model_name,
-           )
+           try:
+               label_key = (provider, model_name)
+               metrics.provider_model_configuration.labels(*label_key).set(1)
+               logger.debug(
+                   "Set provider/model configuration for %s/%s to 1",
+                   provider,
+                   model_name,
+               )
+           except Exception as e:
+               logger.error("Failed to set metric for %s/%s: %s", provider, model_name, e)
🧹 Nitpick comments (5)
tests/unit/app/endpoints/test_metrics.py (1)

6-26: Review the test structure and metric validations.

The test correctly validates the basic endpoint functionality, but there are a few concerns:

  1. The test passes None as the request parameter, which works but suggests the endpoint doesn't use the request data
  2. Lines 21 and 23 check for _created gauge metrics which are typically auto-generated by Prometheus client libraries - these might be implementation-dependent and could make the test brittle

Consider these improvements:

def test_metrics_endpoint():
    """Test the metrics endpoint handler."""
-   response = metrics_endpoint_handler(None)
+   # Create a mock request object or use a proper Request instance
+   from fastapi import Request
+   mock_request = Request({"type": "http", "method": "GET", "url": "http://test/metrics"})
+   response = metrics_endpoint_handler(mock_request)
    
    assert response is not None
    assert response.status_code == 200
    assert "text/plain" in response.headers["Content-Type"]

    response_body = response.body.decode()

    # Check if the response contains Prometheus metrics format
    assert "# TYPE ls_rest_api_calls_total counter" in response_body
    assert "# TYPE ls_response_duration_seconds histogram" in response_body
    assert "# TYPE ls_provider_model_configuration gauge" in response_body
    assert "# TYPE ls_llm_calls_total counter" in response_body
    assert "# TYPE ls_llm_calls_failures_total counter" in response_body
-   assert "# TYPE ls_llm_calls_failures_created gauge" in response_body
    assert "# TYPE ls_llm_validation_errors_total counter" in response_body
-   assert "# TYPE ls_llm_validation_errors_created gauge" in response_body
    assert "# TYPE ls_llm_token_sent_total counter" in response_body
    assert "# TYPE ls_llm_token_received_total counter" in response_body
+   
+   # Verify the response follows Prometheus exposition format
+   lines = response_body.split('\n')
+   assert any(line.startswith('# HELP') for line in lines), "Should contain HELP comments"
tests/unit/metrics/test_utis.py (2)

1-1: Fix the filename typo.

The filename should be test_utils.py instead of test_utis.py to match the module being tested (utils.py).


6-24: Good test structure with comprehensive mocking.

The test properly mocks external dependencies and verifies the expected behavior. Consider adding edge case tests for better coverage.

Consider adding these additional test cases:

def test_setup_model_metrics_no_models(mocker):
    """Test setup_model_metrics when no LLM models are available."""
    mock_client = mocker.patch("client.LlamaStackClientHolder.get_client").return_value
    mock_metric = mocker.patch("metrics.provider_model_configuration")
    mock_client.models.list.return_value = []
    
    setup_model_metrics()
    
    # Should not call set on any metrics
    mock_metric.labels.assert_not_called()

def test_setup_model_metrics_missing_provider_or_model(mocker):
    """Test setup_model_metrics with models missing provider_id or identifier."""
    mock_client = mocker.patch("client.LlamaStackClientHolder.get_client").return_value
    mock_metric = mocker.patch("metrics.provider_model_configuration")
    
    fake_models = [
        mocker.Mock(provider_id=None, identifier="test_model", model_type="llm"),
        mocker.Mock(provider_id="test_provider", identifier=None, model_type="llm"),
    ]
    mock_client.models.list.return_value = fake_models
    
    setup_model_metrics()
    
    # Should not call set for models with missing data
    mock_metric.labels.assert_not_called()
src/app/endpoints/streaming_query.py (1)

437-438: Consider the timing of metrics incrementation.

The llm_calls_total metric is incremented immediately after model selection but before the actual streaming response is generated. This means successful model selection is counted as a successful LLM call, even if the streaming fails later.

Consider moving the metrics increment after the streaming completes successfully:

        response, conversation_id = await retrieve_response(
            client,
            model_id,
            query_request,
            token,
            mcp_headers=mcp_headers,
        )
-       # Update metrics for the LLM call
-       metrics.llm_calls_total.labels(provider_id, model_id).inc()
        metadata_map: dict[str, dict[str, Any]] = {}

        async def response_generator(turn_response: Any) -> AsyncIterator[str]:
            """Generate SSE formatted streaming response."""
            chunk_id = 0
            complete_response = ""

            # Send start event
            yield stream_start_event(conversation_id)

            async for chunk in turn_response:
                for event in stream_build_event(chunk, chunk_id, metadata_map):
                    complete_response += json.loads(event.replace("data: ", ""))[
                        "data"
                    ]["token"]
                    chunk_id += 1
                    yield event

            yield stream_end_event(metadata_map)
+           # Update metrics for successful LLM call completion
+           metrics.llm_calls_total.labels(provider_id, model_id).inc()
src/app/endpoints/query.py (1)

302-309: Validation error detection logic is sound.

The post-response validation error checking iterates through response steps and correctly identifies shield violations. The metrics increment and early break are appropriate.

However, consider adding a debug log for when validation errors are detected:

    # Check for validation errors in the response
    steps = getattr(response, "steps", [])
    for step in steps:
        if step.step_type == "shield_call" and step.violation:
            # Metric for LLM validation errors
+           logger.debug("Shield validation violation detected: %s", step.violation)
            metrics.llm_calls_validation_errors_total.inc()
            break
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28abfaf and ed6c493.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • pyproject.toml (1 hunks)
  • src/app/endpoints/metrics.py (1 hunks)
  • src/app/endpoints/query.py (9 hunks)
  • src/app/endpoints/streaming_query.py (5 hunks)
  • src/app/main.py (2 hunks)
  • src/app/routers.py (2 hunks)
  • src/metrics/__init__.py (1 hunks)
  • src/metrics/utils.py (1 hunks)
  • tests/unit/app/endpoints/test_metrics.py (1 hunks)
  • tests/unit/app/endpoints/test_query.py (15 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
  • tests/unit/app/test_routers.py (5 hunks)
  • tests/unit/metrics/__init__.py (1 hunks)
  • tests/unit/metrics/test_utis.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (5)
tests/unit/metrics/test_utis.py (1)
src/metrics/utils.py (1)
  • setup_model_metrics (13-32)
src/metrics/utils.py (2)
src/client.py (1)
  • LlamaStackClientHolder (19-52)
src/log.py (1)
  • get_logger (7-13)
tests/unit/app/endpoints/test_metrics.py (1)
src/app/endpoints/metrics.py (1)
  • metrics_endpoint_handler (14-16)
src/app/main.py (3)
src/log.py (1)
  • get_logger (7-13)
src/metrics/utils.py (1)
  • setup_model_metrics (13-32)
src/app/routers.py (1)
  • include_routers (20-38)
src/app/endpoints/query.py (2)
src/app/endpoints/streaming_query.py (1)
  • retrieve_response (490-567)
src/models/requests.py (1)
  • QueryRequest (61-146)
🔇 Additional comments (42)
tests/unit/metrics/__init__.py (1)

1-1: LGTM! Clean package initialization.

The docstring clearly describes the package purpose and follows Python conventions for test package organization.

pyproject.toml (1)

16-17: Dependencies are secure and up-to-date

Both added packages meet the latest stable release criteria and have no known vulnerabilities:

  • In pyproject.toml (lines 16–17):
    • prometheus-client >= 0.22.1 is already at the most recent stable 0.22.1.
    • starlette >= 0.47.1 will pull in 0.47.2 by default, the current stable release.

No further changes are required.

src/app/routers.py (2)

16-16: LGTM! Proper import addition.

The metrics module import is correctly added to the existing endpoint imports.


38-38: LGTM! Consistent router inclusion.

The metrics router is properly included without a version prefix, following the same pattern as other infrastructure endpoints like health and authorized.

src/app/endpoints/metrics.py (1)

1-17: LGTM! Standard Prometheus metrics endpoint implementation.

The metrics endpoint is correctly implemented following Prometheus best practices:

  • Proper HTTP GET method for metrics scraping
  • PlainTextResponse with correct Prometheus content type
  • Uses official prometheus_client library functions
  • Unused request parameter properly handled with underscore prefix
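
A handler following that pattern could look roughly like this (an illustrative sketch; the router and tag setup are assumed rather than copied from src/app/endpoints/metrics.py):

from fastapi import APIRouter, Request
from fastapi.responses import PlainTextResponse
from prometheus_client import CONTENT_TYPE_LATEST, generate_latest

router = APIRouter(tags=["metrics"])

@router.get("/metrics", response_class=PlainTextResponse)
def metrics_endpoint_handler(_request: Request) -> PlainTextResponse:
    """Return the current Prometheus metrics in text exposition format."""
    return PlainTextResponse(generate_latest(), media_type=CONTENT_TYPE_LATEST)
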
tests/unit/app/test_routers.py (2)

18-18: LGTM! Proper test import addition.

The metrics import is correctly added to match the router configuration changes.


48-48: LGTM! Comprehensive test updates for new router.

The tests are properly updated to account for the new metrics router:

  • Router count correctly incremented from 10 to 11
  • Both test functions consistently updated
  • Proper assertions for router presence and prefix verification

Also applies to: 59-59, 68-68, 79-79

src/app/endpoints/streaming_query.py (4)

26-26: Good addition of metrics import.

The metrics module import is correctly placed and follows the existing import structure.


427-429: Verify the model selection function update.

The change from select_model_id to select_model_and_provider_id is consistent with the updates in query.py and properly handles the tuple return value.


233-234: Appropriate placement of validation error metrics.

The validation error metric is correctly incremented when a shield violation is detected during streaming.


478-479: Proper error metrics handling.

The failure metric is correctly incremented when an API connection error occurs.

src/app/endpoints/query.py (6)

8-8: Tuple import is now used.

The Tuple import is now properly utilized in the function signature at line 172, addressing the previous review comment.


27-27: Good addition of metrics import.

The metrics module import is correctly placed and follows the existing import structure.


126-128: Function call properly updated for tuple return.

The call to select_model_and_provider_id correctly unpacks the tuple return value into model_id and provider_id.


136-137: Metrics tracking well-placed for successful calls.

The llm_calls_total metric is incremented after the response is successfully retrieved, which is appropriate timing for non-streaming queries.


158-159: Proper failure metrics handling.

The failure metric is correctly incremented when an API connection error occurs, maintaining consistency with the streaming endpoint.


170-216: Function signature update implemented correctly.

The function signature change from select_model_id to select_model_and_provider_id is properly implemented with the correct return type Tuple[str, str | None]. The logic correctly handles both auto-selection and explicit model/provider specification.
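
As a rough illustration of that behavior, the selection could be structured as in the sketch below. This is hypothetical: the QueryRequest attribute names, HTTP status code, and error message are assumptions; only the signature matches the PR.

from typing import Any, Tuple

from fastapi import HTTPException, status

def select_model_and_provider_id(
    models: Any, query_request: Any
) -> Tuple[str, str | None]:
    """Return the (model_id, provider_id) pair to use for this request."""
    model_id = query_request.model  # assumed attribute names
    provider_id = query_request.provider
    if not model_id:
        # Auto-select the first available LLM model and its provider.
        llm_models = [m for m in models if m.model_type == "llm"]
        if not llm_models:
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,  # assumed status code
                detail="No LLM model found in the list of available models",
            )
        model_id = llm_models[0].identifier
        provider_id = llm_models[0].provider_id
    return model_id, provider_id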

src/app/main.py (4)

3-15: LGTM - Import statements are correct and necessary.

All new imports are properly utilized in the code: typing imports for middleware function signature, FastAPI imports for request/response handling, Starlette routing imports for route filtering, and metrics imports for observability functionality.


71-76: Correct implementation of dynamic route path extraction.

The list comprehension properly:

  • Extracts paths after all routers are included
  • Filters for relevant route types (Mount, Route, WebSocketRoute)
  • Provides the necessary data structure for middleware filtering

77-77: Proper placement of model metrics initialization.

The setup_model_metrics() call is correctly positioned after router inclusion, ensuring all application components are initialized before setting up model-related metrics.


43-66: No issues found – middleware metrics logic is sound and app_routes_paths is properly initialized prior to registration.

The startup order is correct:

  • Routers are included (routers.include_routers(app))
  • app_routes_paths is defined immediately afterward
  • The rest_api_metrics middleware is registered last

All metrics collection, timing, and filtering behavior can proceed safely.
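
For orientation, the startup ordering described here corresponds roughly to the following sketch (assumed shape and import paths, not the literal contents of src/app/main.py):

from fastapi import FastAPI
from starlette.routing import Mount, Route, WebSocketRoute

import routers  # assumed import path for src/app/routers.py
from metrics.utils import setup_model_metrics

app = FastAPI()

# 1. Register all routers, including the new /metrics router.
routers.include_routers(app)

# 2. Collect concrete route paths so the metrics middleware can filter on them.
app_routes_paths = [
    route.path
    for route in app.routes
    if isinstance(route, (Mount, Route, WebSocketRoute))
]

# 3. Initialize the provider/model configuration gauge once everything is wired up.
setup_model_metrics()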

tests/unit/app/endpoints/test_streaming_query.py (4)

225-227: Correct update to reflect API signature change.

The test properly adapts to the new select_model_and_provider_id function that returns a tuple (model_id, provider_id) instead of a single model ID.


736-768: Proper integration of metrics validation in shield call tests.

The test correctly mocks the validation error metric and asserts it's not incremented when there's no violation, ensuring metrics accuracy.


771-811: Correct verification of metric increment on shield violation.

The test properly validates that the validation error metric is incremented exactly once when a shield violation occurs.


1541-1542: Consistent API signature update in auth tuple test.

The test correctly adapts to the new function signature, maintaining test integrity while reflecting the API change to return model and provider IDs as a tuple.

src/metrics/__init__.py (5)

1-7: Clean module structure with appropriate Prometheus imports.

The module properly imports all necessary Prometheus metric types for comprehensive observability.


9-20: Well-designed REST API metrics with appropriate labels.

The counter and histogram metrics provide comprehensive coverage of API usage patterns and performance characteristics. The naming conventions follow Prometheus standards.


22-28: Appropriate gauge metric for tracking model/provider configurations.

The gauge correctly represents active provider/model combinations with descriptive labels for popularity analysis.


30-41: Comprehensive LLM call tracking with appropriate granularity.

The metrics provide good coverage: successful calls tracked by provider/model, while failures and validation errors are tracked as totals. The label strategy is well-reasoned.


43-51: Token metrics properly structured for future implementation.

The counter definitions are correctly designed with appropriate labels. The TODO comments clearly indicate these are placeholders pending the token counting feature from PR #215.

tests/unit/app/endpoints/test_query.py (12)

15-15: Correct import update reflecting API signature change.

The import properly updates to the new function name that returns both model and provider IDs.


66-66: Good defensive programming in test fixture.

Initializing mock_agent.create_turn.return_value.steps as an empty list prevents AttributeError in tests that access this property.


111-111: Proper metrics integration testing.

The test correctly mocks and verifies that the LLM calls metric is incremented with the appropriate provider and model labels.

Also applies to: 152-154


182-200: Complete test adaptation to new function signature.

The test correctly updates function name, unpacks the tuple return value, and asserts both model_id and provider_id values.


202-226: Proper handling of default model selection test.

The test correctly verifies that when no model is specified, the first available LLM model and its provider are returned.


228-246: Improved exception handling with HTTPException.

The test correctly updates to expect HTTPException instead of generic Exception, providing better error specificity.


248-259: Consistent error handling for no available models.

The test maintains proper error validation while adapting to the new function signature and exception type.


321-321: Proper negative testing for validation error metrics.

The test correctly verifies that validation error metrics are not incremented when no shield violations occur.

Also applies to: 345-346


867-909: Comprehensive test for shield violation metrics tracking.

The test properly simulates a shield violation scenario and verifies that the validation error metric is incremented correctly while maintaining full coverage of the function behavior.


1002-1003: Correct metrics tracking for connection failures.

The test properly verifies that the failure metric is incremented when API connection errors occur.

Also applies to: 1020-1020


775-777: Good refactoring to use consistent test fixtures.

The test now uses the standard prepare_agent_mocks fixture for better consistency and maintainability.


1323-1326: Consistent API signature update in auth testing.

The test correctly adapts to the new function signature while maintaining verification of proper auth token handling.

@tisnik tisnik changed the title from "Add metrics to lightspeed-stack" to "LCORE-220: Add metrics to lightspeed-stack" on Jul 22, 2025
@umago umago force-pushed the metrics-LCORE-220 branch from ed6c493 to b3359ba on July 22, 2025 21:12
This patch is adding the /metrics endpoint to the project.

Differences with road-core/service:

 * The metrics are prefixed with "ls_" instead of "ols_"
 * The "provider_model_configuration" does not set the non-default
   model/providers to 0 because we currently do not have a way to set a
   default model/provider in the configuration. A TODO was left in the
   code.

Supported metrics:

 * rest_api_calls_total: Counter to track REST API calls
 * response_duration_seconds: Histogram to measure how long it takes
   to handle requests
 * provider_model_configuration: Indicates what provider + model
   customers are using
 * llm_calls_total: How many LLM calls were made for each provider + model
 * llm_calls_failures_total: How many LLM calls failed
 * llm_calls_validation_errors_total: How many LLM calls had validation
   errors

Missing metrics:

 * llm_token_sent_total: How many tokens were sent
 * llm_token_received_total: How many tokens were received

The above metrics are missing because token counting PR
(lightspeed-core#215) is not
merged yet.

Signed-off-by: Lucas Alvares Gomes <[email protected]>