LCORE-494: Add database persistence layer and conversation tracking system #347
Conversation
Walkthrough

This change introduces full database-backed conversation metadata management, adding SQLite and PostgreSQL support with configuration models, initialization routines, and SQLAlchemy ORM models. It implements conversation ownership validation and persistence, updates API endpoints to enforce authorization, and adds a new …

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant FastAPI
    participant Auth
    participant DB
    participant LLM
    Client->>FastAPI: GET /conversations (auth)
    FastAPI->>Auth: Validate user
    Auth-->>FastAPI: user_id
    FastAPI->>DB: Query UserConversation by user_id
    DB-->>FastAPI: List of conversations
    FastAPI-->>Client: ConversationsListResponse
    Client->>FastAPI: POST /query (auth, conversation_id)
    FastAPI->>Auth: Validate user
    Auth-->>FastAPI: user_id
    FastAPI->>DB: Validate conversation ownership
    DB-->>FastAPI: True/False
    alt Ownership valid
        FastAPI->>LLM: Generate response
        LLM-->>FastAPI: LLM response
        FastAPI->>DB: Persist/update UserConversation
        FastAPI-->>Client: Query response
    else Ownership invalid
        FastAPI-->>Client: 403 Forbidden
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
📜 Recent review details

Configuration used: CodeRabbit UI

⛔ Files ignored due to path filters (1)
📒 Files selected for processing (20)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
Nice!!
LGTM, but:
- one nitpick
- in the future we'd need to limit the number of records, i.e. clean up the oldest ones during insert (as in OLS). Would you mind creating a ticket, please?
Actionable comments posted: 13
🧹 Nitpick comments (3)
src/app/database.py (2)

16-17: Consider thread safety for global variables.

Global mutable variables `engine` and `SessionLocal` could lead to race conditions in multi-threaded environments. Consider using thread-local storage or proper synchronization mechanisms. Apply this pattern for thread safety:

```diff
+import threading
+
-engine: Engine | None = None
-SessionLocal: sessionmaker | None = None
+_local = threading.local()
+
+def _get_engine() -> Engine | None:
+    return getattr(_local, 'engine', None)
+
+def _set_engine(eng: Engine) -> None:
+    _local.engine = eng
```
62-62: Clarify the custom schema logic.

The condition checks that the namespace is not None and not "public", but PostgreSQL's default schema is already "public". Consider simplifying this logic or adding a comment to explain the intent.

```diff
-    is_custom_schema = config.namespace is not None and config.namespace != "public"
+    # Only create custom schema if explicitly specified and different from default
+    is_custom_schema = config.namespace is not None and config.namespace != "public"
```

tests/unit/app/endpoints/test_conversations.py (1)
431-431: Remove the disclaimer comment.

The AI-generated disclaimer is not appropriate for production code and should be removed.

```diff
-# Generated entirely by AI, no human review, so read with that in mind.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (13)
- pyproject.toml (1 hunks)
- src/app/database.py (1 hunks)
- src/app/endpoints/conversations.py (3 hunks)
- src/app/endpoints/query.py (4 hunks)
- src/app/endpoints/streaming_query.py (3 hunks)
- src/app/main.py (2 hunks)
- src/configuration.py (2 hunks)
- src/models/base.py (1 hunks)
- src/models/config.py (3 hunks)
- src/models/conversations.py (1 hunks)
- src/models/responses.py (1 hunks)
- src/utils/common.py (1 hunks)
- tests/unit/app/endpoints/test_conversations.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (7)
src/app/main.py (1)
- src/app/database.py (2): initialize_database (81-104), create_tables (29-31)

src/configuration.py (1)
- src/models/config.py (1): DatabaseConfiguration (53-91)

src/app/endpoints/conversations.py (3)
- src/models/responses.py (2): ConversationsListResponse (418-471), ConversationDetails (389-415)
- src/models/conversations.py (1): UserConversation (11-27)
- src/utils/common.py (1): retrieve_user_id (17-26)

src/app/endpoints/streaming_query.py (1)
- src/app/endpoints/query.py (2): validate_conversation_ownership (115-123), persist_user_conversation_details (126-149)

src/app/database.py (3)
- src/models/base.py (1): Base (6-7)
- src/models/config.py (4): config (85-91), SQLiteDatabaseConfiguration (27-30), PostgreSQLDatabaseConfiguration (33-50), db_type (76-82)
- src/configuration.py (1): database_configuration (118-123)

src/app/endpoints/query.py (1)
- src/models/conversations.py (1): UserConversation (11-27)

src/models/conversations.py (1)
- src/models/base.py (1): Base (6-7)
🪛 GitHub Actions: Unit tests
src/utils/common.py
[error] 26-26: TypeError: 'NoneType' object is not subscriptable in test_retrieve_user_id when calling retrieve_user_id with None.
tests/unit/app/endpoints/test_conversations.py
[error] 470-470: TypeError: 'Mock' object does not support the context manager protocol in test_successful_conversations_list_retrieval.
[error] 503-503: TypeError: 'Mock' object does not support the context manager protocol in test_empty_conversations_list.
[error] 530-530: TypeError: 'Mock' object does not support the context manager protocol in test_database_exception.
src/app/endpoints/conversations.py
[error] 155-155: TypeError: 'Mock' object does not support the context manager protocol in get_session() usage during test of conversations endpoint.
src/app/database.py
[error] 37-37: RuntimeError: Database session not initialized. Call initialize_database() first. Occurred in multiple tests related to query and streaming_query endpoint handlers.
🔇 Additional comments (18)
pyproject.toml (1)
36-36: SQLAlchemy 2.0.42 dependency verified.

SQLAlchemy 2.0.42 is the latest 2.x release (July 29, 2025) and no known security vulnerabilities are reported against this version. You can safely keep the `>=2.0.42` constraint, but continue to monitor release notes and vulnerability databases for future updates.

src/configuration.py (2)

16-16: LGTM! Import addition is correctly placed and follows the existing import pattern.

117-124: LGTM! The property implementation follows the established pattern used by other configuration properties in this class. The assertion, return type annotation, and docstring are all consistent with the existing code style.
src/models/base.py (1)
1-8: LGTM! This is a clean, standard implementation of a SQLAlchemy 2.x ORM base class. The use of `DeclarativeBase` follows modern SQLAlchemy patterns, and the pylint disable is appropriate for a base class that intentionally has minimal public methods.

src/app/main.py (1)

10-10: LGTM! Import statement is correctly placed and follows the existing import pattern.
src/app/endpoints/streaming_query.py (2)
429-445: LGTM! Proper conversation ownership validation.

The ownership validation logic is correctly implemented with appropriate error handling and security logging. This ensures users can only access conversations they own.

496-498: LGTM! Consistent conversation persistence.

The conversation persistence is correctly implemented and consistent with the query endpoint, ensuring conversation metadata is properly tracked.
src/models/conversations.py (1)
11-27: LGTM! Well-designed ORM model with proper indexing.

The model is correctly structured with:
- Appropriate field types and constraints
- Performance-optimized indexes on user_id and model
- Timezone-aware datetime fields with server defaults
- Proper primary key setup for UUID storage
src/app/endpoints/query.py (4)
115-123: LGTM! Clean and efficient ownership validation.

The function correctly validates conversation ownership with a simple database query and proper session management.

126-149: LGTM! Proper conversation persistence with create/update logic.

The function correctly handles both new conversation creation and existing conversation updates, with appropriate field updates and transaction management.

164-180: LGTM! Proper security integration with ownership validation.

The ownership validation is correctly placed before query processing, with appropriate error handling and logging for security events.

213-215: LGTM! Conversation persistence properly placed after successful processing.

The persistence call is correctly positioned after successful response generation, ensuring conversation metadata is only saved for completed interactions.
src/app/database.py (1)
104-104: LGTM on session factory configuration.

The sessionmaker configuration with `autocommit=False` and `autoflush=False` is appropriate for explicit transaction management.

src/models/config.py (5)
27-30: LGTM on SQLite configuration model.

Simple and clean configuration model for SQLite with appropriate field typing.

75-92: LGTM on database type detection logic.

The property methods for `db_type` and `config` are well-implemented with appropriate error handling for edge cases.

314-314: LGTM on configuration integration.

Clean integration of database configuration into the main Configuration class with sensible defaults.
64-73: Use a secure, persistent directory for the default SQLite database.

The current fallback to `/tmp/lightspeed-stack.db` is world-writable (even with a sticky bit) and non-persistent across reboots. It's safer to place the file under a user-specific data directory with restricted permissions.

Affected location:
- src/models/config.py lines 64-73

Proposed change:

```diff
 if total_configured_dbs == 0:
-    # Default to SQLite in a (hopefully) tmpfs if no database configuration is provided.
-    # This is good for backwards compatibility for deployments that do not mind having
-    # no persistent database.
-    sqlite_file_name = "/tmp/lightspeed-stack.db"
-    self.sqlite = SQLiteDatabaseConfiguration(db_path=sqlite_file_name)
+    # Default to a secure, user-local data directory if no DB is configured.
+    import os
+    from pathlib import Path
+
+    # Use XDG_DATA_HOME if set, otherwise fall back to ~/.local/share
+    data_root = Path(os.getenv(
+        "XDG_DATA_HOME",
+        Path.home() / ".local" / "share"
+    ))
+    app_dir = data_root / "lightspeed-stack"
+    app_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
+
+    sqlite_file_name = str(app_dir / "lightspeed-stack.db")
+    self.sqlite = SQLiteDatabaseConfiguration(db_path=sqlite_file_name)
```

This ensures the database file:
- Resides in a user-scoped directory (`~/.local/share/lightspeed-stack` by default).
- Has `0o700` permissions (owner-only read/write/exec).
- Persists across system reboots.
46-49: Fix port validation logic.

The validation allows port 0, which is not a valid TCP port for client connections. Port validation should accept 1-65535.

```diff
-    if self.port <= 0:
-        raise ValueError("Port value should not be negative")
+    if self.port < 1:
+        raise ValueError("Port value should be between 1 and 65535")
     if self.port > 65535:
         raise ValueError("Port value should be less than 65536")
```

Likely an incorrect or invalid review comment.
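Taken on its own, the suggested range check is trivial to exercise; a minimal stdlib-only sketch (the helper name `validate_port` is hypothetical, not from the PR):

```python
def validate_port(port: int) -> int:
    """Accept only valid TCP client ports, 1-65535."""
    if port < 1:
        raise ValueError("Port value should be between 1 and 65535")
    if port > 65535:
        raise ValueError("Port value should be less than 65536")
    return port


# Probe the boundaries: 0 and 65536 are rejected, 1..65535 pass
results = {}
for candidate in (0, 1, 5432, 65535, 65536):
    try:
        validate_port(candidate)
        results[candidate] = True
    except ValueError:
        results[candidate] = False
```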
```python
@router.get("/conversations", responses=conversations_list_responses)
def get_conversations_list_endpoint_handler(
    _auth: Any = Depends(auth_dependency),
) -> ConversationsListResponse:
    """Handle request to retrieve all conversations for the authenticated user."""
    check_configuration_loaded(configuration)

    user_id = retrieve_user_id(_auth)
    logger.info("Retrieving conversations for user %s", user_id)

    with get_session() as session:
        try:
            # Get all conversations for this user
            user_conversations = (
                session.query(UserConversation).filter_by(user_id=user_id).all()
            )

            # Return conversation summaries with metadata
            conversations = [
                ConversationDetails(
                    conversation_id=conv.id,
                    created_at=conv.created_at.isoformat() if conv.created_at else None,
                    last_message_at=(
                        conv.last_message_at.isoformat()
                        if conv.last_message_at
                        else None
                    ),
                    message_count=conv.message_count,
                    model=conv.model,
                )
                for conv in user_conversations
            ]

            logger.info(
                "Found %d conversations for user %s", len(conversations), user_id
            )

            return ConversationsListResponse(conversations=conversations)

        except Exception as e:
            logger.exception(
                "Error retrieving conversations for user %s: %s", user_id, e
            )
            raise HTTPException(
                status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
                detail={
                    "response": "Unknown error",
                    "cause": f"Unknown error while getting conversations for user {user_id}",
                },
            ) from e
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
LGTM! Well-structured endpoint with proper error handling.
The endpoint implementation is solid with proper user ID extraction, database querying, response model construction, and comprehensive exception handling.
However, the pipeline failure indicates a test issue where the `get_session()` mock doesn't support the context manager protocol. Let me verify the test setup:
🏁 Script executed:

```shell
#!/bin/bash
# Description: Check how get_session is mocked in tests to fix the context manager issue.
# Expected: Find mock setup that needs to be fixed to support context manager protocol.

# Search for test files that mock get_session
rg -A 10 -B 5 "get_session.*mock" tests/

# Search for test imports and mock usage patterns
rg -A 5 "from.*database.*import.*get_session" tests/

# Look at the specific test file mentioned in the error
fd "test_conversations.py" tests/ --exec cat {}
```

Length of output: 26783
Address Test Mocks for Context Manager Protocol
The handler's logic is sound, but tests fail because the patched `get_session()` returns a plain mock without `__enter__`/`__exit__`. You have two options:

- Adjust tests to mock the context manager properly: use a `MagicMock`, or configure `mock_session.__enter__.return_value = mock_session` and `mock_session.__exit__.return_value = None`.
- Refactor the endpoint to drop the `with`-context and explicitly close the session, matching how tests expect `close()` to be called.
Suggested diff (option 2):
```diff
@@ def get_conversations_list_endpoint_handler(...):
-    with get_session() as session:
-        try:
-            ...  # existing logic
-        except Exception as e:
-            ...
+    session = get_session()
+    try:
+        # Get all conversations for this user
+        user_conversations = (
+            session.query(UserConversation).filter_by(user_id=user_id).all()
+        )
+
+        # Build response payload
+        conversations = [
+            ConversationDetails(
+                conversation_id=conv.id,
+                created_at=conv.created_at.isoformat() if conv.created_at else None,
+                last_message_at=conv.last_message_at.isoformat()
+                if conv.last_message_at else None,
+                message_count=conv.message_count,
+                model=conv.model,
+            )
+            for conv in user_conversations
+        ]
+
+        logger.info(
+            "Found %d conversations for user %s", len(conversations), user_id
+        )
+        return ConversationsListResponse(conversations=conversations)
+
+    except Exception as e:
+        logger.exception(
+            "Error retrieving conversations for user %s: %s", user_id, e
+        )
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail={
+                "response": "Unknown error",
+                "cause": f"Unknown error while getting conversations for user {user_id}",
+            },
+        ) from e
+
+    finally:
+        session.close()
```
- This ensures `session.close()` is called and aligns with existing test assertions.
- If you prefer option 1, update every `mocker.patch("app.endpoints.conversations.get_session", ...)` to return a mock CM, e.g.:

```python
mock_session = mocker.MagicMock()
mock_session.__enter__.return_value = mock_session
mock_session.__exit__.return_value = None
```
🧰 Tools
🪛 GitHub Actions: Unit tests
[error] 155-155: TypeError: 'Mock' object does not support the context manager protocol in get_session() usage during test of conversations endpoint.
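The context-manager fix discussed above can be sketched with only the standard library; `get_session` here is a local stand-in for the patched factory, not the real import:

```python
from unittest.mock import MagicMock

# Build a mock session whose context manager protocol returns itself,
# mirroring `with get_session() as session:` in the endpoint under test.
mock_session = MagicMock()
mock_session.__enter__.return_value = mock_session
mock_session.__exit__.return_value = None
mock_session.query.return_value.filter_by.return_value.all.return_value = []

# Stand-in for the patched get_session factory
get_session = MagicMock(return_value=mock_session)

# Exercise it the way the endpoint would
with get_session() as session:
    rows = session.query("UserConversation").filter_by(user_id="u1").all()
```

A plain `Mock` lacks the magic methods, which is exactly what produced the `TypeError` in the pipeline; `MagicMock` preconfigures them, and the explicit `__enter__`/`__exit__` assignments make the yielded object deterministic.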
🤖 Prompt for AI Agents
In src/app/endpoints/conversations.py around lines 145 to 195, the use of
get_session() as a context manager causes test failures because the mock does
not implement __enter__ and __exit__. To fix this, refactor the endpoint to call
get_session() without a with-statement, wrap the logic in try-except-finally,
and explicitly call session.close() in the finally block. This change aligns the
code with existing test mocks expecting a close() method call and ensures proper
resource cleanup.
src/utils/common.py
Outdated
```diff
 def retrieve_user_id(auth: Any) -> str:
     """Retrieve the user ID from the authentication handler's tuple.

     Args:
-        auth: The Authentication handler (FastAPI Depends) that will
-            handle authentication Logic.
+        auth: The tuple returned by the authentication handler.

     Returns:
         str: The user ID.
     """
-    return "user_id_placeholder"
+    return auth[0]
```
Add input validation and error handling.
The function assumes auth
is a non-empty tuple but doesn't validate this, leading to potential TypeError
or IndexError
exceptions. The pipeline failure confirms this issue when the function is called with None
.
Apply this diff to add proper validation:
```diff
 def retrieve_user_id(auth: Any) -> str:
     """Retrieve the user ID from the authentication handler's tuple.

     Args:
         auth: The tuple returned by the authentication handler.

     Returns:
         str: The user ID.
+
+    Raises:
+        ValueError: If auth is None or not a tuple with at least one element.
     """
-    return auth[0]
+    if auth is None:
+        raise ValueError("Authentication data cannot be None")
+
+    if not isinstance(auth, (tuple, list)) or len(auth) == 0:
+        raise ValueError("Authentication data must be a non-empty tuple or list")
+
+    return str(auth[0])
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def retrieve_user_id(auth: Any) -> str:
    """Retrieve the user ID from the authentication handler's tuple.

    Args:
        auth: The tuple returned by the authentication handler.

    Returns:
        str: The user ID.

    Raises:
        ValueError: If auth is None or not a tuple with at least one element.
    """
    if auth is None:
        raise ValueError("Authentication data cannot be None")

    if not isinstance(auth, (tuple, list)) or len(auth) == 0:
        raise ValueError("Authentication data must be a non-empty tuple or list")

    return str(auth[0])
```
🧰 Tools
🪛 GitHub Actions: Unit tests
[error] 26-26: TypeError: 'NoneType' object is not subscriptable in test_retrieve_user_id when calling retrieve_user_id with None.
🤖 Prompt for AI Agents
In src/utils/common.py around lines 17 to 27, the function retrieve_user_id
assumes auth is a non-empty tuple but does not validate this, causing potential
TypeError or IndexError when auth is None or empty. Add input validation to
check if auth is a tuple and non-empty before accessing auth[0]. If validation
fails, raise a clear exception or handle the error gracefully to prevent
pipeline failures.
/lgtm
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/app/database.py (1)

110-119: Add default case and error handling for robustness.

The match statement lacks a default case for unsupported database types, and engine creation calls need error handling to prevent unhandled exceptions.

```diff
 match db_config.db_type:
     case "sqlite":
         sqlite_config = db_config.config
         assert isinstance(sqlite_config, SQLiteDatabaseConfiguration)
-        engine = _create_sqlite_engine(sqlite_config, **create_engine_kwargs)
+        try:
+            engine = _create_sqlite_engine(sqlite_config, **create_engine_kwargs)
+        except Exception as e:
+            logger.error("Failed to initialize SQLite database: %s", e)
+            raise
     case "postgres":
         postgres_config = db_config.config
         assert isinstance(postgres_config, PostgreSQLDatabaseConfiguration)
-        engine = _create_postgres_engine(postgres_config, **create_engine_kwargs)
+        try:
+            engine = _create_postgres_engine(postgres_config, **create_engine_kwargs)
+        except Exception as e:
+            logger.error("Failed to initialize PostgreSQL database: %s", e)
+            raise
+    case _:
+        raise ValueError(f"Unsupported database type: {db_config.db_type}")
```
🧹 Nitpick comments (2)
src/app/database.py (2)

72-72: Fix trailing whitespace.

Line 72 has trailing whitespace that needs to be removed.

97-121: Fix Black formatting issue.

The file needs Black formatting to pass the pipeline check. Run `black --write src/app/database.py` to fix the formatting issue.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (17)
- pyproject.toml (1 hunks)
- src/app/database.py (1 hunks)
- src/app/endpoints/conversations.py (3 hunks)
- src/app/endpoints/query.py (3 hunks)
- src/app/endpoints/streaming_query.py (3 hunks)
- src/app/main.py (2 hunks)
- src/configuration.py (2 hunks)
- src/constants.py (1 hunks)
- src/models/base.py (1 hunks)
- src/models/config.py (3 hunks)
- src/models/conversations.py (1 hunks)
- src/models/responses.py (1 hunks)
- tests/unit/app/endpoints/test_conversations.py (2 hunks)
- tests/unit/app/endpoints/test_feedback.py (2 hunks)
- tests/unit/app/endpoints/test_query.py (5 hunks)
- tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
- tests/unit/models/test_config.py (2 hunks)
✅ Files skipped from review due to trivial changes (3)
- src/configuration.py
- pyproject.toml
- src/constants.py
🚧 Files skipped from review as they are similar to previous changes (13)
- tests/unit/app/endpoints/test_feedback.py
- tests/unit/models/test_config.py
- src/models/responses.py
- src/app/endpoints/streaming_query.py
- src/app/endpoints/conversations.py
- src/models/base.py
- tests/unit/app/endpoints/test_streaming_query.py
- src/app/main.py
- tests/unit/app/endpoints/test_query.py
- src/app/endpoints/query.py
- src/models/conversations.py
- tests/unit/app/endpoints/test_conversations.py
- src/models/config.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/app/database.py (4)
- src/log.py (1): get_logger (7-13)
- src/configuration.py (2): configuration (49-54), database_configuration (118-123)
- src/models/base.py (1): Base (6-7)
- src/models/config.py (4): config (90-96), SQLiteDatabaseConfiguration (27-30), PostgreSQLDatabaseConfiguration (33-55), db_type (81-87)
🪛 GitHub Actions: Black
src/app/database.py
[error] 1-1: Black formatting check failed. File would be reformatted. Run 'black --write' to fix code style issues.
🪛 GitHub Actions: Python linter
src/app/database.py
[warning] 72-72: Pylint: Trailing whitespace (C0303)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-pr
🔇 Additional comments (4)
src/app/database.py (4)
1-18: LGTM! Clean imports and module structure.

The imports are well-organized and the global variable pattern for a singleton database engine/session is appropriate for this use case.

20-41: LGTM! Proper defensive programming for resource access.

The accessor functions correctly implement fail-fast behavior with clear error messages when resources aren't initialized.

43-55: LGTM! Proper error handling for SQLite engine creation.

The explicit directory existence check and comprehensive error handling align with the author's design decision for explicit directory management.

57-95: LGTM! Comprehensive PostgreSQL engine implementation.

The function properly handles:
- Connection URL construction with all parameters
- Custom schema creation with SQL injection protection via `text()`
- SSL certificate configuration
- Comprehensive error handling for both engine and schema operations

The schema creation is safe from SQL injection since `config.namespace` comes from validated configuration.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/app/endpoints/conversations.py (1)

144-195: Address test mock compatibility for context manager usage.

The endpoint implementation is solid with proper database querying, response construction, and error handling. However, based on the pipeline failures, the tests need to be updated to properly mock the context manager protocol used by `get_session()`.

The endpoint uses `with get_session() as session:` but the test mocks don't implement `__enter__` and `__exit__` methods. This causes `TypeError: 'Mock' object does not support the context manager protocol`.

The tests should be updated to use `MagicMock` with proper context manager support:

```python
mock_session = mocker.MagicMock()
mock_session.__enter__.return_value = mock_session
mock_session.__exit__.return_value = None
```

src/app/database.py (1)
114-122: Add default case for unsupported database types.

The match statement should handle unexpected database types to provide clear error messages for unsupported configurations.

```diff
     case "postgres":
         postgres_config = db_config.config
         assert isinstance(postgres_config, PostgreSQLDatabaseConfiguration)
         engine = _create_postgres_engine(postgres_config, **create_engine_kwargs)
+    case _:
+        raise ValueError(f"Unsupported database type: {db_config.db_type}")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (18)
- pyproject.toml (1 hunks)
- src/app/database.py (1 hunks)
- src/app/endpoints/conversations.py (6 hunks)
- src/app/endpoints/query.py (4 hunks)
- src/app/endpoints/streaming_query.py (3 hunks)
- src/app/main.py (2 hunks)
- src/configuration.py (2 hunks)
- src/constants.py (1 hunks)
- src/models/base.py (1 hunks)
- src/models/config.py (3 hunks)
- src/models/conversations.py (1 hunks)
- src/models/responses.py (1 hunks)
- src/utils/endpoints.py (2 hunks)
- tests/unit/app/endpoints/test_conversations.py (14 hunks)
- tests/unit/app/endpoints/test_feedback.py (2 hunks)
- tests/unit/app/endpoints/test_query.py (5 hunks)
- tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
- tests/unit/models/test_config.py (2 hunks)
✅ Files skipped from review due to trivial changes (4)
- pyproject.toml
- src/configuration.py
- tests/unit/app/endpoints/test_feedback.py
- src/constants.py
🚧 Files skipped from review as they are similar to previous changes (10)
- tests/unit/app/endpoints/test_query.py
- tests/unit/models/test_config.py
- src/models/base.py
- src/app/main.py
- src/app/endpoints/streaming_query.py
- src/models/responses.py
- tests/unit/app/endpoints/test_streaming_query.py
- src/app/endpoints/query.py
- src/models/conversations.py
- src/models/config.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: the direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for...
Learnt from: eranco74
PR: lightspeed-core/lightspeed-stack#348
File: src/utils/endpoints.py:91-94
Timestamp: 2025-08-06T06:02:21.060Z
Learning: The direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for the missing agent rehydration feature in the LLS client SDK. This allows preserving conversation IDs when handling existing agents.
Applied to files:
src/app/endpoints/conversations.py
tests/unit/app/endpoints/test_conversations.py
🧬 Code Graph Analysis (2)
src/utils/endpoints.py (2)
- src/models/conversations.py (1): UserConversation (11-27)
- src/app/database.py (1): get_session (34-40)

src/app/database.py (4)
- src/log.py (1): get_logger (7-13)
- src/configuration.py (2): configuration (49-54), database_configuration (118-123)
- src/models/base.py (1): Base (6-7)
- src/models/config.py (4): config (90-96), SQLiteDatabaseConfiguration (27-30), PostgreSQLDatabaseConfiguration (33-55), db_type (81-87)
🪛 GitHub Actions: Unit tests
src/app/database.py
[error] 37-37: RuntimeError: Database session not initialized. Call initialize_database() first. This error caused multiple test failures in test_conversations.py during calls to get_session().
src/app/endpoints/conversations.py
[error] 516-577: AttributeError: 'app.endpoints.conversations' module does not have the attribute 'retrieve_user_id'. This caused failures in TestGetConversationsListEndpoint tests when attempting to mock 'retrieve_user_id'.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-pr
🔇 Additional comments (14)
src/utils/endpoints.py (1)
21-29: LGTM! Critical security function for conversation authorization.
The validate_conversation_ownership function provides essential authorization by ensuring users can only access their own conversations. The implementation is clean, with proper database session management using the context manager pattern.
src/app/endpoints/conversations.py (3)
76-99: LGTM! Well-structured response schema.
The conversations_list_responses dictionary provides clear API documentation with appropriate example data and error cases.
200-201: LGTM! Enhanced security with ownership validation.
The parameter change from _auth to auth and the addition of validate_conversation_ownership properly secure the endpoint by ensuring users can only access their own conversations.
Also applies to: 216-221
276-277: LGTM! Consistent security implementation.
The delete endpoint correctly implements the same ownership validation pattern as the get endpoint, ensuring users can only delete their own conversations.
Also applies to: 292-297
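For readers skimming the thread, a minimal sketch of the ownership-check pattern being praised (the names mirror the PR's get_session and UserConversation, but the in-memory store below is an illustrative stand-in, not the actual implementation):

```python
from contextlib import contextmanager

# In-memory stand-in for the database: conversation_id -> owning user_id.
# The real get_session yields a SQLAlchemy Session instead of a dict.
_CONVERSATIONS = {"conv-1": "user-a", "conv-2": "user-b"}


@contextmanager
def get_session():
    yield _CONVERSATIONS


def validate_conversation_ownership(user_id: str, conversation_id: str) -> bool:
    """Return True only if the conversation exists and belongs to this user."""
    with get_session() as session:
        owner = session.get(conversation_id)
        return owner is not None and owner == user_id
```

The endpoint can then return HTTP 403 whenever this check fails, before any upstream call is made.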
src/app/database.py (5)
20-26: LGTM! Clear error handling for uninitialized engine.
The function properly validates initialization state and provides a clear error message for debugging.
34-40: LGTM! Consistent error handling pattern.
Matches the get_engine() function with appropriate error messaging for uninitialized sessions.
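The initialize-before-use guard discussed here (and behind the RuntimeError in the failing pipeline) can be sketched as follows; the real module wraps a SQLAlchemy engine and sessionmaker, which plain strings stand in for in this illustration:

```python
# Module-level state, populated once at application startup.
_engine = None
_session_factory = None


def initialize_database(url: str) -> None:
    """Create the engine and session factory once, at startup."""
    global _engine, _session_factory
    _engine = f"engine({url})"  # real code: create_engine(url)

    def factory():
        return f"session-for-{_engine}"  # real code: sessionmaker(bind=engine)()

    _session_factory = factory


def get_engine():
    if _engine is None:
        raise RuntimeError(
            "Database engine not initialized. Call initialize_database() first."
        )
    return _engine


def get_session():
    if _session_factory is None:
        raise RuntimeError(
            "Database session not initialized. Call initialize_database() first."
        )
    return _session_factory()
```

Any test that reaches get_session() without calling initialize_database() (or mocking get_session) hits the RuntimeError, which is exactly the pipeline failure reported above.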
43-54: LGTM! Robust SQLite engine creation with proper error handling.
The function validates directory existence, creates the engine, and handles exceptions appropriately with logging and meaningful error messages.
84-96: LGTM! Comprehensive PostgreSQL schema handling.
The schema creation logic properly handles custom namespaces with appropriate error handling, logging, and meaningful error messages.
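The schema-handling step can be illustrated with a tiny helper; the identifier check and DDL shape below are assumptions for the sketch, not the PR's exact code (which executes the statement via SQLAlchemy):

```python
def schema_ddl(namespace: str) -> str:
    """Build idempotent DDL to ensure a PostgreSQL schema namespace exists.

    Illustrative only: the strict alphanumeric/underscore check here is an
    assumption, used in place of proper identifier quoting.
    """
    if not namespace or not namespace.replace("_", "").isalnum():
        raise ValueError(f"Invalid schema name: {namespace!r}")
    return f'CREATE SCHEMA IF NOT EXISTS "{namespace}"'
```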
101-124: Review test database initialization
I couldn't find any calls to initialize_database() in your tests, though I did see get_session being mocked in tests/unit/app/endpoints/test_conversations.py. To resolve the failures:
- Verify which tests are erroring and whether they rely on an uninitialized SessionLocal or an unmocked get_session.
- For any test that uses the real session factory, either call initialize_database() in its setup or mock out SessionLocal/get_session just as you do in test_conversations.py.
Please confirm the failing test(s) and ensure each one either invokes initialize_database() before using get_session(), or replaces that dependency with a proper mock.
tests/unit/app/endpoints/test_conversations.py (5)
25-37: LGTM! Useful helper function for test data creation.
The create_mock_conversation function provides a clean way to create mock conversation objects with all required attributes, improving test maintainability.
40-55: LGTM! Proper context manager mock implementation.
The mock_database_session function correctly implements the context manager protocol, addressing the previous test failures with __enter__ and __exit__ methods.
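A sketch of the helper being discussed, reconstructed from the review (details may differ from the PR's actual test code):

```python
from unittest.mock import MagicMock


def mock_database_session(mock_conversations):
    """Build a mock usable as `with get_session() as session:` in tests."""
    session = MagicMock()
    # Make session.query(...).filter_by(...).all() return the canned rows.
    session.query.return_value.filter_by.return_value.all.return_value = (
        mock_conversations
    )
    ctx = MagicMock()
    ctx.__enter__.return_value = session  # entering the `with` yields the session
    ctx.__exit__.return_value = False     # don't suppress exceptions
    return ctx
```

Patching get_session to return this object lets the endpoint's `with` block run unchanged against canned data.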
232-233: LGTM! Consistent parameter naming update.
The change from _auth to auth is correctly applied across all test cases, maintaining consistency with the endpoint changes that enable conversation ownership validation.
Also applies to: 246-247, 270-271, 294-295, 320-321, 352-353, 373-374, 387-388, 410-411, 434-435, 460-461, 485-486
501-509: LGTM! Comprehensive error handling test.
The test properly verifies that the endpoint handles missing configuration correctly, with the appropriate status code and error message.
572-593: LGTM! Thorough exception handling test.
The test validates that database exceptions are properly caught, logged, and returned as HTTP 500 errors with meaningful messages.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/unit/app/endpoints/test_conversations.py (1)
499-581: Comprehensive test coverage for the new endpoint!
The TestGetConversationsListEndpoint class provides excellent coverage of all key scenarios: configuration errors, successful retrieval, empty results, and database exceptions. The tests properly utilize the new helper functions and follow established patterns.
However, consider adding a human review pass despite the AI-generation comment, to ensure alignment with team standards.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
pyproject.toml (1 hunks)
src/app/database.py (1 hunks)
src/app/endpoints/conversations.py (6 hunks)
src/app/endpoints/query.py (4 hunks)
src/app/endpoints/streaming_query.py (3 hunks)
src/app/main.py (2 hunks)
src/configuration.py (2 hunks)
src/constants.py (1 hunks)
src/models/base.py (1 hunks)
src/models/config.py (3 hunks)
src/models/conversations.py (1 hunks)
src/models/responses.py (1 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_conversations.py (21 hunks)
tests/unit/app/endpoints/test_feedback.py (2 hunks)
tests/unit/app/endpoints/test_query.py (5 hunks)
tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
tests/unit/models/test_config.py (2 hunks)
✅ Files skipped from review due to trivial changes (3)
- src/constants.py
- tests/unit/app/endpoints/test_feedback.py
- pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (14)
- src/models/base.py
- src/utils/endpoints.py
- src/app/endpoints/streaming_query.py
- src/app/main.py
- tests/unit/app/endpoints/test_query.py
- src/app/database.py
- src/app/endpoints/conversations.py
- src/models/conversations.py
- src/models/responses.py
- tests/unit/models/test_config.py
- src/configuration.py
- src/models/config.py
- tests/unit/app/endpoints/test_streaming_query.py
- src/app/endpoints/query.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: the direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for...
Learnt from: eranco74
PR: lightspeed-core/lightspeed-stack#348
File: src/utils/endpoints.py:91-94
Timestamp: 2025-08-06T06:02:21.060Z
Learning: The direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for the missing agent rehydration feature in the LLS client SDK. This allows preserving conversation IDs when handling existing agents.
Applied to files:
tests/unit/app/endpoints/test_conversations.py
🧬 Code Graph Analysis (1)
tests/unit/app/endpoints/test_conversations.py (4)
src/app/endpoints/conversations.py (1)
- get_conversations_list_endpoint_handler (145-194)
src/models/responses.py (1)
- ConversationsListResponse (418-471)
src/configuration.py (2)
- configuration (49-54)
- AppConfig (22-123)
tests/unit/test_configuration.py (1)
- test_configuration_not_loaded (290-296)
🔇 Additional comments (9)
tests/unit/app/endpoints/test_conversations.py (9)
10-10: LGTM!
The import additions for the new endpoint handler and response models are appropriate and necessary for testing the conversations list functionality.
Also applies to: 13-17
25-37: LGTM!
The create_mock_conversation helper function is well-designed and properly mocks all required attributes, including the datetime isoformat() method calls. This promotes test maintainability and consistency.
40-55: Excellent solution for context manager mocking!
This helper function properly addresses the context manager protocol issues from previous reviews by implementing __enter__ and __exit__ methods. The use of MagicMock for the context manager and proper session yielding aligns with the endpoint's `with get_session() as session:` usage pattern.
232-232: LGTM! Parameter and mocking updates align with endpoint changes.
The updates from _auth to auth and the addition of validate_conversation_ownership mocking correctly reflect the new conversation ownership validation logic in the endpoints. These changes ensure tests remain compatible with the enhanced authorization features.
Also applies to: 246-246, 258-258, 271-271, 282-282, 296-296, 309-309, 321-321, 337-337, 371-371, 385-385, 397-397, 409-409, 420-420, 434-434, 447-447, 461-461, 476-476, 487-487
312-313: Good improvement to async mocking!
Using mocker.AsyncMock() is more appropriate for mocking async client methods and ensures proper async behavior simulation in tests.
503-511: LGTM!
The configuration validation test correctly checks the error handling when configuration is not loaded, following the established pattern from other endpoint tests.
513-549: LGTM!
The successful retrieval test is well-structured, using the helper functions effectively to create realistic test data and properly validating both the response structure and content.
551-562: LGTM!
The empty conversations test properly validates the edge-case behavior and ensures the API returns a consistent response structure even with no data.
564-580: LGTM!
The database exception test properly validates error handling behavior, ensuring the endpoint responds appropriately to database failures with the correct HTTP status and error message format.
LGTM
First TODO: please create JIRA ticket - automatic cleanup of old conversations from DB in order to keep the DB size under defined max value
Second TODO: please create JIRA ticket to update documentation (README.md)
Hi,
Thanks for the patch; overall it looks pretty solid. Thank you for working on this. I left a few comments inline, let me know what you think. They are mostly nits, but I think they would improve the code structure.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/app/endpoints/conversations.py (1)
76-99: Add missing model field in example responses.
The example responses should include all fields that will be present in the actual response, to maintain consistency with the ConversationDetails model.
  {
      "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
      "created_at": "2024-01-01T00:00:00Z",
      "last_message_at": "2024-01-01T00:05:00Z",
      "message_count": 5,
+     "model": "gemini/gemini-2.0-flash",
  },
  {
      "conversation_id": "456e7890-e12b-34d5-a678-901234567890",
      "created_at": "2024-01-01T01:00:00Z",
      "last_message_at": "2024-01-01T01:02:00Z",
      "message_count": 2,
+     "model": "gemini/gemini-2.0-flash",
  },
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
pyproject.toml (1 hunks)
src/app/database.py (1 hunks)
src/app/endpoints/conversations.py (6 hunks)
src/app/endpoints/query.py (4 hunks)
src/app/endpoints/streaming_query.py (3 hunks)
src/app/main.py (2 hunks)
src/configuration.py (2 hunks)
src/constants.py (1 hunks)
src/models/base.py (1 hunks)
src/models/config.py (3 hunks)
src/models/conversations.py (1 hunks)
src/models/responses.py (1 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_conversations.py (21 hunks)
tests/unit/app/endpoints/test_feedback.py (2 hunks)
tests/unit/app/endpoints/test_query.py (5 hunks)
tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
tests/unit/models/test_config.py (2 hunks)
✅ Files skipped from review due to trivial changes (2)
- pyproject.toml
- src/models/conversations.py
🚧 Files skipped from review as they are similar to previous changes (14)
- tests/unit/app/endpoints/test_feedback.py
- src/models/base.py
- src/app/main.py
- src/constants.py
- src/app/endpoints/query.py
- src/configuration.py
- tests/unit/models/test_config.py
- src/app/database.py
- src/utils/endpoints.py
- src/models/config.py
- src/app/endpoints/streaming_query.py
- tests/unit/app/endpoints/test_streaming_query.py
- tests/unit/app/endpoints/test_query.py
- src/models/responses.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: the direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for...
Learnt from: eranco74
PR: lightspeed-core/lightspeed-stack#348
File: src/utils/endpoints.py:91-94
Timestamp: 2025-08-06T06:02:21.060Z
Learning: The direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for the missing agent rehydration feature in the LLS client SDK. This allows preserving conversation IDs when handling existing agents.
Applied to files:
src/app/endpoints/conversations.py
tests/unit/app/endpoints/test_conversations.py
🧬 Code Graph Analysis (1)
src/app/endpoints/conversations.py (5)
src/models/responses.py (4)
- ConversationResponse (303-350)
- ConversationDeleteResponse (353-386)
- ConversationsListResponse (418-471)
- ConversationDetails (389-415)
src/auth/__init__.py (1)
- get_auth_dependency (14-43)
src/app/database.py (1)
- get_session (34-40)
src/models/conversations.py (1)
- UserConversation (11-27)
src/utils/endpoints.py (2)
- check_configuration_loaded (32-38)
- validate_conversation_ownership (21-29)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pyright
🔇 Additional comments (8)
src/app/endpoints/conversations.py (4)
12-21: LGTM: Import changes are well-structured.
The new imports properly support the database persistence and conversation ownership validation features. All necessary models and utilities are included.
144-195: LGTM: Well-implemented conversations list endpoint.
The endpoint properly handles user authentication, database querying, response construction, and error handling. The use of a context manager for database session management is correct.
Note: Previous test failures related to the mock context manager protocol were already addressed in past commits.
200-200: LGTM: Proper ownership validation added.
The authentication parameter change enables access to user credentials, and the ownership validation ensures users can only retrieve their own conversations. This addresses the security concerns mentioned in the PR objectives.
Also applies to: 216-221
282-282: LGTM: Consistent ownership validation.
The same ownership validation pattern is correctly applied to the delete endpoint, ensuring consistent authorization across conversation management operations.
Also applies to: 298-303
tests/unit/app/endpoints/test_conversations.py (4)
25-56: LGTM: Well-designed helper functions.
The helper functions properly mock database objects and session management. The mock_database_session function correctly implements the context manager protocol, addressing the previous test failures mentioned in past review comments.
506-586: LGTM: Comprehensive test coverage for new endpoint.
The test class covers all important scenarios including success cases, edge cases, and error handling. The tests follow established patterns and use the helper functions appropriately.
Note: The AI-generation comment (line 505) is transparent about the code's origin, which is good practice for review purposes.
232-232: LGTM: Test parameter updates are consistent.
The authentication parameter name changes from _auth to auth consistently reflect the endpoint modifications and maintain test functionality.
Also applies to: 246-246, 271-271, 296-296, 321-321, 356-356, 377-377, 391-391, 415-415, 440-440, 467-467, 493-493
258-258: LGTM: Proper mocking of ownership validation.
The addition of validate_conversation_ownership mocking ensures tests continue to work with the new authorization logic while focusing on testing the endpoint's core functionality.
Also applies to: 282-282, 309-309, 337-337, 403-403, 426-426, 453-453, 482-482
"last_message_at": "2024-01-01T01:02:00Z",
"message_count": 2,
Do we have a defined use for these details?
If not maybe remove them for now to prevent extra writes to the database on every message.
Hmm not really. I assumed UIs might want them
Force-pushed from 2ee403a to 84d7ca2 (Compare)
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
pyproject.toml (1 hunks)
src/app/database.py (1 hunks)
src/app/endpoints/conversations.py (6 hunks)
src/app/endpoints/query.py (5 hunks)
src/app/endpoints/streaming_query.py (5 hunks)
src/app/main.py (2 hunks)
src/configuration.py (2 hunks)
src/constants.py (1 hunks)
src/models/config.py (3 hunks)
src/models/database/__init__.py (1 hunks)
src/models/database/base.py (1 hunks)
src/models/database/conversations.py (1 hunks)
src/models/responses.py (1 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_conversations.py (21 hunks)
tests/unit/app/endpoints/test_feedback.py (2 hunks)
tests/unit/app/endpoints/test_query.py (11 hunks)
tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
tests/unit/models/test_config.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (16)
- src/models/database/__init__.py
- tests/unit/app/endpoints/test_feedback.py
- src/configuration.py
- src/app/main.py
- src/models/database/base.py
- src/utils/endpoints.py
- src/constants.py
- tests/unit/models/test_config.py
- src/models/responses.py
- src/app/database.py
- tests/unit/app/endpoints/test_streaming_query.py
- src/models/database/conversations.py
- tests/unit/app/endpoints/test_query.py
- src/app/endpoints/query.py
- tests/unit/app/endpoints/test_conversations.py
- src/models/config.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: the direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for...
Learnt from: eranco74
PR: lightspeed-core/lightspeed-stack#348
File: src/utils/endpoints.py:91-94
Timestamp: 2025-08-06T06:02:21.060Z
Learning: The direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for the missing agent rehydration feature in the LLS client SDK. This allows preserving conversation IDs when handling existing agents.
Applied to files:
src/app/endpoints/conversations.py
🧬 Code Graph Analysis (1)
src/app/endpoints/streaming_query.py (3)
src/models/database/conversations.py (1)
- UserConversation (11-36)
src/utils/endpoints.py (1)
- validate_conversation_ownership (21-31)
src/app/endpoints/query.py (3)
- persist_user_conversation_details (75-104)
- evaluate_model_hints (107-146)
- select_model_and_provider_id (240-300)
🪛 GitHub Actions: Unit tests
src/app/endpoints/conversations.py
[error] 168-168: Pydantic validation error in ConversationDetails: 'last_used_model' and 'last_used_provider' fields received invalid input types (expected string but got Mock objects). See https://errors.pydantic.dev/2.11/v/string_type.
🔇 Additional comments (7)
src/app/endpoints/streaming_query.py (5)
26-26: LGTM!
The new imports are necessary for the database conversation ownership validation, persistence, and model hints evaluation functionality.
Also applies to: 38-40
387-387: LGTM!
The pylint disable comment is appropriate given the additional local variables introduced by the database operations and validation logic.
401-419: Excellent security enhancement!
The conversation ownership validation properly ensures users can only access their own conversations. The implementation follows security best practices with proper error handling, logging, and an HTTP 403 response.
424-429: Great UX improvement!
The integration with evaluate_model_hints enables intelligent model selection that remembers the last used model/provider for conversations while still allowing request-level overrides. This enhances user experience by maintaining consistency within conversations.
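The hint-evaluation behavior described here reduces to a simple fallback; this sketch is illustrative only, and the PR's actual evaluate_model_hints signature likely differs:

```python
from typing import Optional, Tuple


def evaluate_model_hints(
    requested_model: Optional[str],
    requested_provider: Optional[str],
    last_used_model: Optional[str],
    last_used_provider: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Prefer explicit request values; otherwise reuse the conversation's
    last used model/provider so follow-up turns stay consistent."""
    model = requested_model or last_used_model
    provider = requested_provider or last_used_provider
    return model, provider
```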
477-482: Perfect placement for conversation persistence!
The call to persist_user_conversation_details is well-positioned after response generation but before the metrics update. This ensures conversation metadata is properly tracked for future ownership validation and model hint evaluation.
src/app/endpoints/conversations.py (2)
12-21: Well-structured imports and API documentation!
The new imports support the database integration and conversation listing functionality. The conversations_list_responses schema provides clear API documentation with appropriate example data structures.
Also applies to: 76-103
205-205: Excellent security enhancement for existing endpoints!
The addition of conversation ownership validation to the existing GET and DELETE endpoints ensures proper authorization. The consistent use of validate_conversation_ownership across all endpoints provides a unified security approach.
Also applies to: 221-226, 287-287, 303-308
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (20)
pyproject.toml (1 hunks)
src/app/database.py (1 hunks)
src/app/endpoints/conversations.py (6 hunks)
src/app/endpoints/query.py (5 hunks)
src/app/endpoints/streaming_query.py (5 hunks)
src/app/main.py (2 hunks)
src/configuration.py (2 hunks)
src/constants.py (1 hunks)
src/models/config.py (3 hunks)
src/models/database/__init__.py (1 hunks)
src/models/database/base.py (1 hunks)
src/models/database/conversations.py (1 hunks)
src/models/requests.py (2 hunks)
src/models/responses.py (1 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_conversations.py (21 hunks)
tests/unit/app/endpoints/test_feedback.py (2 hunks)
tests/unit/app/endpoints/test_query.py (11 hunks)
tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
tests/unit/models/test_config.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (18)
- tests/unit/app/endpoints/test_feedback.py
- tests/unit/models/test_config.py
- src/utils/endpoints.py
- src/models/database/base.py
- src/constants.py
- src/configuration.py
- tests/unit/app/endpoints/test_streaming_query.py
- tests/unit/app/endpoints/test_conversations.py
- src/app/endpoints/conversations.py
- src/models/database/conversations.py
- src/models/database/__init__.py
- src/app/endpoints/query.py
- src/app/database.py
- src/app/main.py
- src/app/endpoints/streaming_query.py
- tests/unit/app/endpoints/test_query.py
- src/models/responses.py
- src/models/config.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/models/requests.py (1)
src/utils/suid.py (1)
- check_suid (15-28)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (1)
src/models/requests.py (1)
187-193: LGTM! Proper UUID validation for optional conversation_id.
The validator correctly handles the optional nature of the conversation_id field by checking for truthiness before validation, uses the established suid.check_suid utility, and provides clear error messages.
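The truthiness-then-validate pattern looks roughly like this; check_suid is stubbed here with a stdlib UUID check, while the real utility lives in src/utils/suid.py:

```python
import uuid
from typing import Optional


def check_suid(value: str) -> bool:
    """Stand-in for src/utils/suid.check_suid: accept canonical UUID strings."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def validate_conversation_id(value: Optional[str]) -> Optional[str]:
    # Skip validation entirely when the optional field is absent;
    # otherwise require a well-formed UUID, as the reviewed validator does.
    if value and not check_suid(value):
        raise ValueError(f"Improper conversation ID '{value}'")
    return value
```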
Force-pushed from 10e22e5 to 833008c (Compare)
LGTM, thank you!
Actionable comments posted: 1
🧹 Nitpick comments (7)
src/app/endpoints/conversations.py (4)
76-104
: Align documented responses with actual behavior (use 500 and add auth errors).
The handler raises HTTP 500 on DB errors, but the documented responses show a 503 “Unable to connect to Llama Stack” which doesn’t fit the DB-backed endpoint. Recommend:
- Replace 503 with 500 and a generic “Unknown error while getting conversations”.
- Optionally add 401/403 to reflect auth/ownership requirements.
Apply this diff to the documented responses:
- 503: {
-     "detail": {
-         "response": "Unable to connect to Llama Stack",
-         "cause": "Connection error.",
-     }
- },
+ 500: {
+     "detail": {
+         "response": "Unknown error",
+         "cause": "Unknown error while getting conversations",
+     }
+ },
+ 401: {
+     "detail": {
+         "response": "Unauthorized",
+         "cause": "Authentication required.",
+     }
+ },
+ 403: {
+     "detail": {
+         "response": "Forbidden",
+         "cause": "Not allowed to view these conversations.",
+     }
+ },
148-151
: Expose an explicit response_model for OpenAPI and validation.
Add response_model to ensure schema enforcement and accurate OpenAPI.
-@router.get("/conversations", responses=conversations_list_responses)
+@router.get(
+    "/conversations",
+    response_model=ConversationsListResponse,
+    responses=conversations_list_responses,
+)
162-164: Deterministic ordering (and groundwork for pagination).
Return the most recent conversations first and make ordering explicit. This also makes tests and client UX consistent.
-        user_conversations = (
-            session.query(UserConversation).filter_by(user_id=user_id).all()
-        )
+        q = (
+            session.query(UserConversation)
+            .filter_by(user_id=user_id)
+            .order_by(UserConversation.last_message_at.desc().nullslast())
+        )
+        user_conversations = q.all()
If you want to go further, introduce limit/offset query params and apply .offset(offset).limit(limit) to q. Example signature change:
# add to imports at top of file:
# from fastapi import Query
def get_conversations_list_endpoint_handler(
    auth: Any = Depends(auth_dependency),
    limit: int = Query(default=50, ge=1, le=500),
    offset: int = Query(default=0, ge=0),
) -> ConversationsListResponse:
    ...
    q = q.offset(offset).limit(limit)
189-199: Avoid echoing user IDs in error messages.
Returning user_id in client-visible error details is unnecessary and can be considered PII leakage.
-            detail={
-                "response": "Unknown error",
-                "cause": f"Unknown error while getting conversations for user {user_id}",
-            },
+            detail={
+                "response": "Unknown error",
+                "cause": "Unknown error while getting conversations",
+            },
tests/unit/app/endpoints/test_conversations.py (3)
25-45: Mock helper covers required attributes; consider using real datetimes.
The isoformat-mocking works, but using actual datetime objects mirrors production better and reduces mock-behavior coupling.
Example:
from datetime import datetime, timezone

mock_conversation.created_at = (
    datetime.fromisoformat(created_at.replace("Z", "+00:00")).astimezone(timezone.utc)
    if created_at
    else None
)
mock_conversation.last_message_at = (
    datetime.fromisoformat(last_message_at.replace("Z", "+00:00")).astimezone(timezone.utc)
    if last_message_at
    else None
)
526-565: Augment assertions to cover model/provider fields.
Since the endpoint returns last_used_model/provider, assert those too, to lock in the contract.
assert response.conversations[0].last_used_model == "gemini/gemini-2.0-flash"
assert response.conversations[0].last_used_provider == "gemini"
assert response.conversations[1].last_used_model == "gemini/gemini-2.5-flash"
assert response.conversations[1].last_used_provider == "gemini"
579-595: Add a test for session acquisition failures (post-refactor).
After wrapping the with-statement in a try (as suggested), add a case where get_session().__enter__ raises, to ensure a clean HTTP 500 is returned.
Example:
def test_database_exception_on_enter(self, mocker, setup_configuration):
    mocker.patch("app.endpoints.conversations.configuration", setup_configuration)
    # Configure get_session to raise on __enter__
    mock_session_context = mocker.MagicMock()
    mock_session_context.__enter__.side_effect = Exception("DB connect failed")
    mock_session_context.__exit__.return_value = None
    mocker.patch("app.endpoints.conversations.get_session", return_value=mock_session_context)
    with pytest.raises(HTTPException) as exc_info:
        get_conversations_list_endpoint_handler(auth=MOCK_AUTH)
    assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
    assert "Unknown error" in exc_info.value.detail["response"]
I can send a full patch if you adopt the handler refactor.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (20)
pyproject.toml (1 hunks)
src/app/database.py (1 hunks)
src/app/endpoints/conversations.py (6 hunks)
src/app/endpoints/query.py (5 hunks)
src/app/endpoints/streaming_query.py (5 hunks)
src/app/main.py (2 hunks)
src/configuration.py (2 hunks)
src/constants.py (1 hunks)
src/models/config.py (3 hunks)
src/models/database/__init__.py (1 hunks)
src/models/database/base.py (1 hunks)
src/models/database/conversations.py (1 hunks)
src/models/requests.py (1 hunks)
src/models/responses.py (1 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_conversations.py (21 hunks)
tests/unit/app/endpoints/test_feedback.py (2 hunks)
tests/unit/app/endpoints/test_query.py (11 hunks)
tests/unit/app/endpoints/test_streaming_query.py (5 hunks)
tests/unit/models/test_config.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/unit/app/endpoints/test_feedback.py
🚧 Files skipped from review as they are similar to previous changes (17)
- src/models/database/__init__.py
- pyproject.toml
- src/constants.py
- src/app/main.py
- src/models/requests.py
- src/utils/endpoints.py
- src/models/database/base.py
- tests/unit/app/endpoints/test_streaming_query.py
- src/configuration.py
- tests/unit/models/test_config.py
- src/models/responses.py
- src/app/endpoints/streaming_query.py
- src/models/database/conversations.py
- tests/unit/app/endpoints/test_query.py
- src/app/database.py
- src/models/config.py
- src/app/endpoints/query.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-06T06:02:21.060Z
Learnt from: eranco74
PR: lightspeed-core/lightspeed-stack#348
File: src/utils/endpoints.py:91-94
Timestamp: 2025-08-06T06:02:21.060Z
Learning: The direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for the missing agent rehydration feature in the LLS client SDK. This allows preserving conversation IDs when handling existing agents.
Applied to files:
tests/unit/app/endpoints/test_conversations.py
src/app/endpoints/conversations.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (6)
src/app/endpoints/conversations.py (4)
12-21
: Imports look correct and scoped to new functionality.
Brings in response models, ORM model, session factory, and ownership validator cleanly. No concerns.
169-179: Validate Optional fields for Pydantic compatibility.
If last_used_model/provider can be NULL in the DB, ensure ConversationDetails defines these as Optional[str]. Otherwise, model validation will fail when None is encountered.
Please confirm ConversationDetails uses Optional[str] for last_used_model and last_used_provider. If not, either update the model or guard here with `or ""` per your API contract.
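To make the Optional[str] point concrete, here is a stripped-down stand-in for the model (illustrative only; the real ConversationDetails is a Pydantic BaseModel with more fields):

```python
from typing import Optional


class ConversationDetails:
    """Hand-rolled stand-in mimicking Pydantic's type enforcement:
    None is accepted only for fields declared Optional."""

    def __init__(
        self,
        conversation_id: str,
        last_used_model: Optional[str] = None,
        last_used_provider: Optional[str] = None,
    ):
        if not isinstance(conversation_id, str):
            raise TypeError("conversation_id must be a string")
        for name, value in (
            ("last_used_model", last_used_model),
            ("last_used_provider", last_used_provider),
        ):
            if value is not None and not isinstance(value, str):
                raise TypeError(f"{name} must be a string or None")
        self.conversation_id = conversation_id
        self.last_used_model = last_used_model
        self.last_used_provider = last_used_provider
```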
205-227
: Ownership validation added in read path — good.
Validating conversation ownership before calling the upstream Llama Stack prevents information disclosure. Nicely placed after ID validation.
287-309
: Ownership validation added in delete path — good.
Ensures authorized deletion only. Placement mirrors the GET handler consistently.
tests/unit/app/endpoints/test_conversations.py (2)
13-17
: Response model imports are correct.
Matches the new endpoint and response types. No issues.
47-63
: Context manager mock for get_session is spot on.
Properly emulates `with get_session() as session:` and allows configuring query behavior. Good reuse via helper.
with get_session() as session:
    try:
        # Get all conversations for this user
        user_conversations = (
            session.query(UserConversation).filter_by(user_id=user_id).all()
        )

        # Return conversation summaries with metadata
        conversations = [
            ConversationDetails(
                conversation_id=conv.id,
                created_at=conv.created_at.isoformat() if conv.created_at else None,
                last_message_at=(
                    conv.last_message_at.isoformat()
                    if conv.last_message_at
                    else None
                ),
                message_count=conv.message_count,
                last_used_model=conv.last_used_model,
                last_used_provider=conv.last_used_provider,
            )
            for conv in user_conversations
        ]

        logger.info(
            "Found %d conversations for user %s", len(conversations), user_id
        )

        return ConversationsListResponse(conversations=conversations)

    except Exception as e:
        logger.exception(
            "Error retrieving conversations for user %s: %s", user_id, e
        )
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail={
                "response": "Unknown error",
                "cause": f"Unknown error while getting conversations for user {user_id}",
            },
        ) from e
🛠️ Refactor suggestion
Catch exceptions from session acquisition as well.
If `get_session().__enter__` raises (e.g., a DB connection failure), it happens before the `try:` block and bypasses your HTTPException mapping. Wrap the `with` in the `try`.
- with get_session() as session:
- try:
- # Get all conversations for this user
- user_conversations = (
- session.query(UserConversation).filter_by(user_id=user_id).all()
- )
- ...
- return ConversationsListResponse(conversations=conversations)
-
- except Exception as e:
- logger.exception(
- "Error retrieving conversations for user %s: %s", user_id, e
- )
- raise HTTPException(
- status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
- detail={
- "response": "Unknown error",
- "cause": f"Unknown error while getting conversations for user {user_id}",
- },
- ) from e
+ try:
+ with get_session() as session:
+ # Get all conversations for this user
+ user_conversations = (
+ session.query(UserConversation).filter_by(user_id=user_id).all()
+ )
+ ...
+ return ConversationsListResponse(conversations=conversations)
+ except Exception as e:
+ logger.exception(
+ "Error retrieving conversations for user %s: %s", user_id, e
+ )
+ raise HTTPException(
+ status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+ detail={
+ "response": "Unknown error",
+ "cause": "Unknown error while getting conversations",
+ },
+ ) from e
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
try:
    with get_session() as session:
        # Get all conversations for this user
        user_conversations = (
            session.query(UserConversation).filter_by(user_id=user_id).all()
        )
        # Return conversation summaries with metadata
        conversations = [
            ConversationDetails(
                conversation_id=conv.id,
                created_at=conv.created_at.isoformat() if conv.created_at else None,
                last_message_at=(
                    conv.last_message_at.isoformat()
                    if conv.last_message_at
                    else None
                ),
                message_count=conv.message_count,
                last_used_model=conv.last_used_model,
                last_used_provider=conv.last_used_provider,
            )
            for conv in user_conversations
        ]
        logger.info(
            "Found %d conversations for user %s", len(conversations), user_id
        )
        return ConversationsListResponse(conversations=conversations)
except Exception as e:
    logger.exception(
        "Error retrieving conversations for user %s: %s", user_id, e
    )
    raise HTTPException(
        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
        detail={
            "response": "Unknown error",
            "cause": "Unknown error while getting conversations",
        },
    ) from e
🤖 Prompt for AI Agents
In src/app/endpoints/conversations.py around lines 159 to 200, the current try
block does not cover the get_session() context manager acquisition, so
exceptions raised during session creation (like DB connection failures) are not
caught and mapped to HTTPException. To fix this, move the try block to wrap the
entire with get_session() statement, ensuring any exceptions during session
acquisition are caught and handled properly.
lightspeed-core#347 isn't passing for some reason, so this is a dummy commit to try to trigger CI.
# Background / Context

When the user queries lightspeed-stack, they get a conversation ID in the response that they can later use to retrieve the conversation history and continue the conversation.

# Issue / Requirement / Reason for change

We've put the burden of remembering the conversation ID on the user, which is not ideal. The user (e.g. a UI) has to store the conversation ID somewhere, e.g. browser local storage, but this is not ideal because it can easily get lost. It also complicates the UI implementation.

Another issue is that a user could read another user's conversation if they only knew the ID, which is hard, but not impossible. It's not immediately obvious that the conversation ID is a secret that should not be shared with others.

# Solution / Feature Overview

Add a database persistence layer to lightspeed-stack to store user conversation IDs alongside the user ID, and a bit of metadata about the conversation.

Allow users to retrieve their conversations by calling a new endpoint `/conversations` (without an ID; calling with an ID is an existing endpoint that lists the message history for a conversation), which will return a list of conversations associated with the user, along with metadata like creation time, last message time, and message count.

Verify that the user owns the conversation when querying or streaming a conversation or asking for its message history, and automatically associate the conversation with the user in the database when the user queries or streams a conversation.

# Implementation Details

**Database Layer (src/app/database.py):**
- SQLAlchemy
- Support for both SQLite and Postgres databases
- PostgreSQL custom schema (table namespacing) support with automatic schema creation. This is useful because llama-stack itself needs a database for persisting conversation history, and we want to avoid conflicts with that when users use the same Postgres database as llama-stack.
- Automatically enable SQL statement tracing when debug logging is enabled

**Models (src/models/database):**
- Base model class for shared SQLAlchemy configuration
- UserConversation model with indexed user_id fields for efficient querying, automatic timestamp tracking (created_at, last_message_at) and message count tracking

**Configuration Updates:**
- Extended the configuration system to support database connection settings

**API:**
- Conversations endpoint without an ID now returns a list of conversations for the authenticated user, along with metadata
- Query/streaming endpoint automatically associates conversations with the user in the database, and verifies ownership of conversations

**Dependencies:**
- Added SQLAlchemy dependency

**Tests:** Some unit tests were added

# Future Work

This change does not include migrations, so any future changes to the databases require writing custom backwards compatibility code. We should probably add Alembic or something simpler in the near future.
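The description lists the UserConversation model's fields but not its source. A minimal sketch of what such a model and the ownership check could look like, using an in-memory SQLite database (column names inferred from the endpoint code quoted in the review; the table name, defaults, and session setup are assumptions, not the PR's actual code):

```python
import datetime

from sqlalchemy import Column, DateTime, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class UserConversation(Base):
    """Conversation metadata keyed by conversation ID, owned by one user."""

    __tablename__ = "user_conversation"

    id = Column(String, primary_key=True)  # llama-stack conversation ID
    user_id = Column(String, index=True, nullable=False)  # indexed for listing
    created_at = Column(DateTime, default=datetime.datetime.utcnow)
    last_message_at = Column(DateTime, default=datetime.datetime.utcnow)
    message_count = Column(Integer, default=0)
    last_used_model = Column(String, nullable=True)
    last_used_provider = Column(String, nullable=True)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as session:
    session.add(UserConversation(id="c1", user_id="u1"))
    session.commit()
    # Ownership check: the row must match both the conversation and the user
    owned = (
        session.query(UserConversation).filter_by(id="c1", user_id="u1").first()
    )
    print(owned is not None)  # → True
```

For PostgreSQL, the same models work unchanged; only the engine URL (and, per the description, an optional schema for table namespacing) would differ.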
Merging. The E2E Tests check is broken and was fixed in #376.
Type of change
Related Tickets & Documents
MGMT-21300
MGMT-21008
Checklist before requesting a review
Testing
More tests will be added (and unit tests will be fixed) once I get some initial feedback on the implementation.
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Tests
Documentation