LCORE-724: Endpoints for conversation cache v2 #591
Conversation
Walkthrough
Adds conversation caching and a new v2 conversations API: the query and streaming endpoints now persist query/response pairs into the configured conversation cache, a new utility stores CacheEntry objects, and a versioned conversations_v2 router with list/get/delete endpoints is introduced and mounted at /v2.
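The conversations_v2 router itself is not shown in this excerpt. Below is a minimal sketch of what a list/get/delete router mounted at /v2 could look like; the handler names, paths, and in-memory store are assumptions for illustration, not the PR's actual code.

```python
"""Illustrative shape of a versioned conversations router (assumed, not actual)."""

from fastapi import APIRouter, HTTPException, status

router = APIRouter(prefix="/conversations", tags=["conversations_v2"])

# Hypothetical stand-in for the configured conversation cache.
_cache: dict[str, list[dict]] = {}


@router.get("")
async def list_conversations() -> list[str]:
    """List identifiers of all cached conversations."""
    return list(_cache)


@router.get("/{conversation_id}")
async def get_conversation(conversation_id: str) -> list[dict]:
    """Return the cached query/response pairs for one conversation."""
    if conversation_id not in _cache:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Conversation not found",
        )
    return _cache[conversation_id]


@router.delete("/{conversation_id}")
async def delete_conversation(conversation_id: str) -> dict:
    """Drop one conversation from the cache."""
    if _cache.pop(conversation_id, None) is None:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Conversation not found",
        )
    return {"deleted": conversation_id}
```

Mounting under the version prefix would then be a one-liner such as app.include_router(router, prefix="/v2").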
Sequence Diagram(s)
```mermaid
sequenceDiagram
    autonumber
    participant C as Client
    participant Q as QueryEndpoint
    participant L as LLM Provider
    participant K as Conversation Cache
    C->>Q: POST /query (query, ids, provider/model)
    Q->>L: generate(query, provider_id, model_id)
    L-->>Q: llm_response
    Q->>K: store_conversation_into_cache(config, user_id, conv_id, provider_id, model_id, query, llm_response)
    K-->>Q: ack
    Q-->>C: 200 OK (response)
```
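In code, the first diagram's flow might reduce to the sketch below; generate and store_conversation_into_cache here are stand-ins with the argument order shown in the diagram, not the PR's actual signatures.

```python
"""Sketch of the non-streaming query flow (stand-in names, assumed signatures)."""

from typing import Any


async def generate(query: str, provider_id: str, model_id: str) -> str:
    """Stand-in for the LLM provider call."""
    return f"answer to {query!r} via {provider_id}/{model_id}"


def store_conversation_into_cache(  # signature mirrors the diagram: 7 arguments
    config: Any,
    user_id: str,
    conversation_id: str,
    provider_id: str,
    model_id: str,
    query: str,
    response: str,
) -> None:
    """Placeholder for the cache write; the real helper lives in src/utils/endpoints.py."""


async def handle_query(
    config: Any,
    user_id: str,
    conversation_id: str,
    provider_id: str,
    model_id: str,
    query: str,
) -> str:
    """Generate a response, persist the query/response pair, and return the answer."""
    response = await generate(query, provider_id, model_id)
    store_conversation_into_cache(
        config, user_id, conversation_id, provider_id, model_id, query, response
    )
    return response
```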
```mermaid
sequenceDiagram
    autonumber
    participant C as Client
    participant S as StreamingQueryEndpoint
    participant L as LLM (stream)
    participant T as Transcript Store
    participant K as Conversation Cache
    C->>S: POST /streaming_query (query, ids)
    S->>L: start_stream(query, provider_id, model_id)
    loop stream chunks
        L-->>S: chunk/delta
        S-->>C: SSE chunk
    end
    S->>T: persist transcript/attachments
    S->>K: store_conversation_into_cache(config, user_id, conv_id, provider_id, model_id, query, final_response)
    K-->>S: ack
    S-->>C: stream complete
```
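In the streaming flow the cache write happens only after the last chunk. A small sketch of that accumulate-then-persist pattern, with assumed names:

```python
"""Accumulate-then-persist pattern for the streaming flow (names assumed)."""

from collections.abc import AsyncIterator, Callable


async def stream_and_cache(
    chunks: AsyncIterator[str],
    persist: Callable[[str], None],
) -> AsyncIterator[str]:
    """Forward chunks to the client, then cache the assembled response."""
    pieces: list[str] = []
    async for chunk in chunks:
        pieces.append(chunk)
        yield chunk  # emitted to the client as an SSE event
    # Only the final, complete response is written to the conversation cache.
    persist("".join(pieces))
```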
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/app/endpoints/query.py (3 hunks)
- src/app/endpoints/streaming_query.py (2 hunks)
- src/utils/endpoints.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
- Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
- src/utils/endpoints.py
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
- All modules start with descriptive module-level docstrings explaining purpose
- Use logger = logging.getLogger(__name__) for module logging after import logging
- Define type aliases at module level for clarity
- All functions require docstrings with brief descriptions
- Provide complete type annotations for all function parameters and return types
- Use typing_extensions.Self in model validators where appropriate
- Use modern union syntax (str | int) and Optional[T] or T | None consistently
- Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
- Avoid in-place parameter modification; return new data structures instead of mutating arguments
- Use appropriate logging levels: debug, info, warning, error with clear messages
- All classes require descriptive docstrings explaining purpose
- Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
- Abstract base classes should use abc.ABC and @abstractmethod for interfaces
- Provide complete type annotations for all class attributes
- Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
- src/utils/endpoints.py
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py
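A short toy module illustrating several of these module-level conventions at once; the function and type alias are invented purely for demonstration:

```python
"""Toy module showing the conventions above (docstring, logger, alias, typing)."""

import logging

logger = logging.getLogger(__name__)

# Type alias defined at module level for clarity.
ConversationHistory = list[tuple[str, str]]


def get_last_response(history: ConversationHistory) -> str | None:
    """Return the most recent response, or None for an empty history."""
    if not history:
        logger.debug("Empty conversation history")
        return None
    return history[-1][1]
```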
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
- Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py

src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
- Use async def for I/O-bound operations and external API calls
Files:
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py

src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
- In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py
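Combining the three endpoint-specific guidelines above (standard FastAPI imports, async def for I/O, HTTPException for errors), a compliant endpoint might look like the following; fetch_conversation is a hypothetical helper, not a function from this repository:

```python
"""Toy endpoint combining the three guidelines (illustrative helper names)."""

from fastapi import APIRouter, HTTPException, status

router = APIRouter()


async def fetch_conversation(conversation_id: str) -> dict | None:
    """Hypothetical async store lookup (I/O-bound, hence async def)."""
    return None


@router.get("/conversations/{conversation_id}")
async def get_conversation(conversation_id: str) -> dict:
    """Fetch a conversation, raising 404 when it is unknown."""
    conversation = await fetch_conversation(conversation_id)
    if conversation is None:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=f"Conversation {conversation_id} not found",
        )
    return conversation
```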
🧠 Learnings (1)
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
PR: lightspeed-core/lightspeed-stack#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/client.py : Use Llama Stack client import: from llama_stack_client import AsyncLlamaStackClient
Applied to files:
src/utils/endpoints.py
src/app/endpoints/query.py
🧬 Code graph analysis (3)
src/utils/endpoints.py (2)
- src/models/cache_entry.py (1): CacheEntry (6-19)
- src/configuration.py (3): AppConfig (36-153), conversation_cache_configuration (133-137), conversation_cache (147-153)

src/app/endpoints/streaming_query.py (1)
- src/utils/endpoints.py (1): store_conversation_into_cache (139-157)

src/app/endpoints/query.py (2)
- src/models/cache_entry.py (1): CacheEntry (6-19)
- src/utils/endpoints.py (1): store_conversation_into_cache (139-157)
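Per this graph, CacheEntry occupies lines 6-19 of src/models/cache_entry.py. A plausible shape for the model is sketched below; the pydantic base and the field names are guesses inferred from the endpoint signatures, not the PR's actual definition:

```python
"""Guessed shape of CacheEntry (the real model is in src/models/cache_entry.py)."""

from pydantic import BaseModel


class CacheEntry(BaseModel):
    """One cached query/response pair for a conversation."""

    query: str
    response: str
    provider: str
    model: str
```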
🪛 GitHub Actions: Python linter
src/utils/endpoints.py
[error] 139-139: pylint: R0913 Too many arguments (7/5) in function definition (too-many-arguments).
[error] 139-139: pylint: R0917 Too many positional arguments (7/5) (too-many-positional-arguments).
🪛 GitHub Actions: Pyright
src/utils/endpoints.py
[error] 157-157: Pyright error: 'insert_or_append' is not a known attribute of 'None' (reportOptionalMemberAccess).
🪛 GitHub Actions: Ruff
src/app/endpoints/query.py
[error] 36-36: Ruff: F401 unused import 'models.cache_entry.CacheEntry'. Remove unused import. This is fixable with 'ruff --fix'.
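One way all three findings could be resolved (a sketch under assumptions, not the PR's actual fix): drop the unused CacheEntry import from query.py, bundle the five scalar fields into a CacheEntry argument to get under pylint's five-argument limit, and guard the optional cache so Pyright sees it narrowed before the method call. The insert_or_append signature is inferred from the Pyright message.

```python
"""Sketch of the utility after the lint fixes (signatures assumed, not verified)."""

import logging

logger = logging.getLogger(__name__)


def store_conversation_into_cache(
    config: "AppConfig",   # defined in src/configuration.py
    user_id: str,
    conversation_id: str,
    entry: "CacheEntry",   # bundles the five scalars, satisfying R0913/R0917
) -> None:
    """Persist one query/response pair into the configured conversation cache."""
    cache = config.conversation_cache
    if cache is None:
        # Narrowing the Optional here resolves Pyright's
        # reportOptionalMemberAccess on the call below.
        logger.debug("No conversation cache configured; skipping store")
        return
    cache.insert_or_append(user_id, conversation_id, entry)
```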
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
Force-pushed from 2af7fad to 659a32c (Compare)
Force-pushed from 659a32c to 9bf87a4 (Compare)
Force-pushed from c441403 to a4fb511 (Compare)
Force-pushed from a4fb511 to 79c3f84 (Compare)
@coderabbitai: full review
@tisnik I'll conduct a full review of the PR changes for the conversation cache v2 endpoints.
✅ Actions performed: Full review triggered.
Description
LCORE-724: Endpoints for conversation cache v2
Type of change
Related Tickets & Documents
Summary by CodeRabbit