Conversation

umago
Contributor

@umago umago commented Jul 24, 2025

Description

This patch allows users to configure a default model/provider pair in the configuration file.

Now models are selected as:

  • If no model/provider is specified in either the configuration or the request, lightspeed-stack will use the FIRST MODEL AVAILABLE from llama-stack.
  • If the default model/provider is specified in the configuration file and a model/provider IS NOT PROVIDED IN THE REQUEST, lightspeed-stack will use the model/provider FROM THE CONFIGURATION FILE.
  • If the default model/provider is specified in the configuration file and a model/provider IS PROVIDED IN THE REQUEST, lightspeed-stack will use the model/provider FROM THE REQUEST.

tl;dr: the precedence order for model selection is: request, configuration, first available in llama-stack.
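
As an illustration of that precedence, here is a minimal Python sketch. This is not the actual select_model_and_provider_id code from this PR; the function name and the model attribute names (model_type, identifier, provider_id) are assumptions:

    def pick_model_and_provider(query_request, inference_config, models):
        """Illustrative precedence: request > configuration > first available LLM."""
        # 1. A model/provider explicitly given in the request always wins.
        if query_request.model and query_request.provider:
            return query_request.model, query_request.provider

        # 2. Otherwise fall back to the defaults from the configuration file.
        if inference_config and inference_config.default_model:
            return inference_config.default_model, inference_config.default_provider

        # 3. Finally, use the first LLM model reported by llama-stack.
        first_llm = next((m for m in models if m.model_type == "llm"), None)
        if first_llm is None:
            raise ValueError("No LLM model available from llama-stack")
        return first_llm.identifier, first_llm.provider_id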

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #LCORE-418
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Added support for specifying default model and provider in the configuration, improving model selection flexibility.
    • Introduced detailed debug logging for model and provider selection processes.
    • Exposed inference configuration via application settings for easier access.
  • Bug Fixes

    • Enhanced validation to ensure both default model and provider are set together, preventing inconsistent configurations.
    • Improved error handling for invalid model/provider selections.
  • Tests

    • Expanded and refined test coverage to verify correct model/provider selection and configuration validation in various scenarios.
    • Improved metric setup tests to ensure correct handling of default and non-default models.
    • Added tests for inference configuration validation and serialization.

Contributor

coderabbitai bot commented Jul 24, 2025

Walkthrough

The changes introduce configuration-based defaults for model and provider selection in the application, updating the selection logic, configuration schema, metrics setup, and corresponding tests. Additional validation ensures both default model and provider are set together. Logging is enhanced for traceability, and tests are expanded for more granular coverage of the new behaviors.

Changes

File(s) | Change Summary
src/app/endpoints/query.py | Enhanced select_model_and_provider_id to use request, configuration defaults, or fallback to first LLM; improved logging and validation.
src/metrics/utils.py | Metrics setup now uses configuration-based default model/provider; logs and sets metrics accordingly.
src/models/config.py | Added InferenceConfiguration class with default_model and default_provider fields and validator; extended Configuration with optional inference field.
src/configuration.py | Added inference property to AppConfig to expose inference configuration.
tests/unit/app/endpoints/test_query.py | Split and expanded tests for select_model_and_provider_id into three cases: request-based, configuration-based, and fallback; removed old unified test.
tests/unit/metrics/test_utis.py | Expanded metrics test to handle multiple models, patch configuration, and verify metric values for default and non-default models.
tests/unit/models/test_config.py | Added tests for InferenceConfiguration validation; updated serialization tests to include new inference fields.
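
For context, a minimal sketch of what the new InferenceConfiguration class could look like, assuming pydantic v2; only the default_model and default_provider field names come from this PR, and the validator name is invented:

    from typing import Optional

    from pydantic import BaseModel, model_validator


    class InferenceConfiguration(BaseModel):
        """Sketch of the new 'inference' configuration section."""

        default_model: Optional[str] = None
        default_provider: Optional[str] = None

        @model_validator(mode="after")
        def check_defaults_set_together(self) -> "InferenceConfiguration":
            # Either both defaults are set or neither is; mixing is rejected.
            if (self.default_model is None) != (self.default_provider is None):
                raise ValueError(
                    "default_model and default_provider must be set together"
                )
            return self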

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Endpoint
    participant Config
    participant ModelsList

    Client->>Endpoint: Send QueryRequest (may include model/provider)
    Endpoint->>Endpoint: select_model_and_provider_id()
    alt Model/Provider in Request
        Endpoint->>Endpoint: Use provided model/provider
    else Defaults in Config
        Endpoint->>Config: Get default_model/default_provider
        Endpoint->>Endpoint: Use config defaults
    else Fallback
        Endpoint->>ModelsList: Get first available LLM model
        Endpoint->>Endpoint: Use first LLM model/provider
    end
    Endpoint->>Endpoint: Validate selection
    Endpoint-->>Client: Return selected model/provider or error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

In the warren of code, defaults now appear,
Model and provider, both chosen with care.
If none are supplied, the config’s the guide,
And if still not found, the first LLM’s tried.
With tests all expanded and metrics precise—
This rabbit declares: the logic feels nice! 🐇

@umago umago marked this pull request as ready for review July 24, 2025 15:53
Contributor

@tisnik tisnik left a comment


I think this change makes sense, but I would like to discuss where to put the provider + model configuration. Basically, the llama_stack section was meant to be just for the core<->lstack API, i.e. how to run Llama Stack and how to call it (as a library or via REST API).

We have a customization section, but perhaps a new section should be created? WDYT?

@umago
Contributor Author

umago commented Jul 24, 2025

@tisnik makes sense, I can create a new section for this. Thanks for the review

@tisnik tisnik changed the title Allow configurating a default model/provider LCORE-418: Allow configurating a default model/provider Jul 25, 2025
@umago umago force-pushed the LCORE-418-default-model-provider-config branch from 1c5dbf1 to 33b11eb Compare July 25, 2025 08:29
This patch allows users to configure a default model/provider pair
in the configuration file.

Now models are selected as:
 * If no model/provider is specified in either the configuration or the request,
   lightspeed-stack will use the FIRST MODEL AVAILABLE from llama-stack.
 * If the default model/provider is specified in the configuration file
   and a model/provider IS NOT PROVIDED IN THE REQUEST, lightspeed-stack
   will use the model/provider FROM THE CONFIGURATION FILE.
 * If the default model/provider is specified in the configuration file
   and a model/provider IS PROVIDED IN THE REQUEST, lightspeed-stack
   will use the model/provider FROM THE REQUEST.

tl;dr the precedence order to use a model is: request, configuration,
first available in llama-stack.

Signed-off-by: Lucas Alvares Gomes <[email protected]>
@umago umago force-pushed the LCORE-418-default-model-provider-config branch from 33b11eb to 5943d1b Compare July 25, 2025 08:31
@umago
Contributor Author

umago commented Jul 25, 2025

@tisnik I created a new section called "inference"; for now it only has the default_model/default_provider, but in the future we could add things like temperature, max_tokens, top_k, etc.
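
For illustration, a hypothetical Python view of the parsed section; the model and provider values below are invented:

    # Hypothetical shape of the new "inference" section once the configuration
    # file is parsed; only default_model and default_provider exist in this PR.
    inference_section = {
        "default_model": "granite-7b",     # made-up model id
        "default_provider": "ollama",      # made-up provider id
        # Possible future additions mentioned above (not part of this PR):
        # "temperature": 0.7,
        # "max_tokens": 1024,
        # "top_k": 40,
    }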

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (1)
tests/unit/metrics/test_utis.py (1)

1-1: Rename the test file to correct the typo

The test filename in tests/unit/metrics/test_utis.py should be updated to match the module name (utils.py):

  • tests/unit/metrics/test_utis.py → tests/unit/metrics/test_utils.py
- tests/unit/metrics/test_utis.py
+ tests/unit/metrics/test_utils.py
🧹 Nitpick comments (1)
tests/unit/metrics/test_utis.py (1)

56-69: Assertion logic is correct but could be more robust.

The assertions properly verify that only LLM models are processed and that the default model receives metric value 1 while others get 0. However, the chained call structure with .labels().set() could be fragile.

Consider using individual call assertions for better clarity:

-    mock_metric.assert_has_calls(
-        [
-            mocker.call.labels("test_provider-0", "test_model-0"),
-            mocker.call.labels().set(0),
-            mocker.call.labels("default_provider", "default_model"),
-            mocker.call.labels().set(1),
-            mocker.call.labels("test_provider-1", "test_model-1"),
-            mocker.call.labels().set(0),
-        ],
-        any_order=False,  # Order matters here
-    )
+    # Verify each model's metric setting individually
+    mock_metric.labels.assert_any_call("test_provider-0", "test_model-0")
+    mock_metric.labels.assert_any_call("default_provider", "default_model") 
+    mock_metric.labels.assert_any_call("test_provider-1", "test_model-1")
+    
+    # Verify the labels().set() calls
+    assert mock_metric.labels().set.call_count == 3
+    mock_metric.labels().set.assert_any_call(0)  # for non-default models
+    mock_metric.labels().set.assert_any_call(1)  # for default model
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33b11eb and 5943d1b.

📒 Files selected for processing (7)
  • src/app/endpoints/query.py (2 hunks)
  • src/configuration.py (2 hunks)
  • src/metrics/utils.py (2 hunks)
  • src/models/config.py (2 hunks)
  • tests/unit/app/endpoints/test_query.py (2 hunks)
  • tests/unit/metrics/test_utis.py (1 hunks)
  • tests/unit/models/test_config.py (9 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • src/configuration.py
  • src/models/config.py
  • src/app/endpoints/query.py
  • tests/unit/models/test_config.py
  • src/metrics/utils.py
  • tests/unit/app/endpoints/test_query.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/unit/metrics/test_utis.py (1)
src/metrics/utils.py (1)
  • setup_model_metrics (11-45)
🔇 Additional comments (2)
tests/unit/metrics/test_utis.py (2)

11-18: LGTM! Configuration patching is correctly implemented.

The patches properly mock the new configuration paths for default model and provider, ensuring the test isolates the function's behavior from actual configuration dependencies.


21-52: Excellent test coverage with comprehensive mock setup.

The mock models effectively test all scenarios:

  • Default model identification (metric value 1)
  • Non-default LLM models (metric value 0)
  • Non-LLM model filtering (ignored)

The setup aligns perfectly with the function's logic.
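
For readers skimming the test, a minimal sketch of the behaviour it asserts, assuming prometheus_client; this is not the repository's actual setup_model_metrics, and the gauge name and model attribute names are guesses:

    from prometheus_client import Gauge

    # Hypothetical gauge; the real metric name and labels in the repo may differ.
    provider_model_configuration = Gauge(
        "provider_model_configuration",
        "1 if this provider/model pair is the configured default, 0 otherwise",
        ["provider", "model"],
    )


    def setup_model_metrics_sketch(models, default_provider, default_model):
        """Flag the configured default provider/model pair; ignore non-LLM models."""
        for model in models:
            if model.model_type != "llm":
                continue  # e.g. embedding models are skipped entirely
            is_default = (
                model.provider_id == default_provider
                and model.identifier == default_model
            )
            provider_model_configuration.labels(
                model.provider_id, model.identifier
            ).set(1 if is_default else 0)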

Contributor

@tisnik tisnik left a comment


LGTM!

@tisnik tisnik merged commit c1c7ba3 into lightspeed-core:main Jul 25, 2025
17 checks passed