-
Notifications
You must be signed in to change notification settings - Fork 45
LCORE-282: ensure compatibility with OLS streaming query #519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds media-type-aware streaming (application/json | text/plain), OLS-style event types (token, tool_call, tool_result), a central stream_event formatter, media-type-aware error/end handling, and propagates media_type and conversation_id across streaming handlers and responses. Changes
Sequence Diagram(s)sequenceDiagram
participant C as Client
participant API as /streaming_query
participant G as Generator
participant H as stream_event / handlers
C->>API: POST /streaming_query (media_type)
API->>G: start generator(media_type, conversation_id)
Note right of API: return StreamingResponse(media_type)
G->>H: stream_event({start meta}, "start", media_type)
H-->>G: formatted start event (JSON / text)
G-->>C: emit start
loop tokens / tool calls / results
G->>H: stream_event(data, event_type, media_type)
H-->>G: formatted event (JSON / text)
G-->>C: emit event
end
alt error
G->>H: generic_llm_error(err, media_type)
H-->>G: formatted error event
G-->>C: emit error + end
else success
G->>H: stream_end_event(metadata, media_type)
H-->>G: formatted end event
G-->>C: emit end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/app/endpoints/streaming_query.py (2)
1-1
: CI hygiene: run Black, remove trailing whitespace, and address broad exceptions
- Run
black .
(e.g.pip install black
thenblack .
) to fix formatting.- Remove trailing whitespace in
src/app/endpoints/streaming_query.py
at lines 256, 677, and 680.- For each
except Exception
(lines 520, 705, 714), either narrow the caught exception to a more specific type or add# pylint: disable=broad-except
with a brief justification for why broad-catching is unavoidable.
389-433
: Swap out directformat_stream_data
calls in streaming handlers
The_handle_shield_event
,_handle_inference_event
, and_handle_tool_execution_event
functions still emit JSON frames viaformat_stream_data
; refactor each to usestream_event
so thattext/plain
yields raw token strings and all other media types yield JSON frames.
🧹 Nitpick comments (5)
src/models/responses.py (1)
553-576
: Tighten typing for StreamedChunk.type and future-proof the shapeConstrain
type
to known literals and keep room for extension. This improves validation and OpenAPI docs.-class StreamedChunk(BaseModel): +class StreamedChunk(BaseModel): @@ - type: str = Field( + # Known chunk types; extend as protocol evolves. + type: Literal["text", "tool_call", "tool_result", "end"] = Field( description="The type of chunk", - examples=["text", "tool_call", "tool_result", "end"], + examples=["text", "tool_call", "tool_result", "end"], )Optional: if you plan different payloads per type, consider a Tagged Union (discriminated union) with
type
as the discriminator to enforce event-specific schemas.src/app/endpoints/streaming_query.py (4)
79-101
: Start event should be media-type awareFor
text/plain
, emitting a JSON “start” frame is unexpected. Makestream_start_event
media-type aware and no-op for text.-def stream_start_event(conversation_id: str) -> str: +def stream_start_event(conversation_id: str, media_type: str) -> str: @@ - return format_stream_data( + if media_type == "text/plain": + return "" + return format_stream_data( { "event": "start", "data": {"conversation_id": conversation_id}, } )And call:
- yield stream_start_event(conversation_id) + yield stream_start_event(conversation_id, media_type)
244-271
: Align error payload with media type and avoid duplicate loggingYou already log at error+exception; one is enough. Also, for JSON, consider nesting under
detail
to mirror ErrorResponse.- logger.error("Error while obtaining answer for user question") - logger.exception(error) + logger.exception("Error while obtaining answer for user question") @@ - return format_stream_data( - { - "event": "error", - "data": { - "response": response, - "cause": cause, - }, - } - ) + return format_stream_data({ + "event": "error", + "data": {"detail": {"response": response, "cause": cause}}, + })Note: only apply if downstream expects
detail
. Otherwise keep current shape.
440-553
: Tool execution: parse metadata safely and emit via stream_event
- Use
json.loads
first, fallback toast.literal_eval
only if needed.- Emit using
stream_event
for media-type correctness.- Minor: use non-greedy metadata regex to avoid over-capture.
- for match in METADATA_PATTERN.findall(text_content_item.text): + for match in METADATA_PATTERN.findall(text_content_item.text): try: - meta = ast.literal_eval(match) + try: + meta = json.loads(match) + except Exception: + meta = ast.literal_eval(match) if "document_id" in meta: metadata_map[meta["document_id"]] = meta except Exception: # pylint: disable=broad-except logger.debug("An exception was thrown in processing %s", match) @@ - yield format_stream_data({ - "event": LLM_TOOL_RESULT_EVENT, - "data": {"id": chunk_id, "token": {"tool_name": r.tool_name, "summary": summary}}, - }) + yield stream_event({"id": chunk_id, "token": {"tool_name": r.tool_name, "summary": summary}}, LLM_TOOL_RESULT_EVENT, media_type)And consider tightening the regex (declare near definition):
-METADATA_PATTERN = re.compile(r"\nMetadata: (\{.+})\n") +METADATA_PATTERN = re.compile(r"\nMetadata:\s*(\{.*?\})\s*\n", re.DOTALL)
349-371
: Use the event constant and drop role field in shield eventsTo align with OLS naming and recent test removals, switch
"event": "token"
toLLM_TOKEN_EVENT
and omitrole
from payloads.- "event": "token", + "event": LLM_TOKEN_EVENT, "data": { "id": chunk_id, - "role": chunk.event.payload.step_type, "token": "No Violation", },Apply the same removal in the violation branch.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/app/endpoints/streaming_query.py
(15 hunks)src/models/responses.py
(1 hunks)tests/unit/app/endpoints/test_streaming_query.py
(0 hunks)
💤 Files with no reviewable changes (1)
- tests/unit/app/endpoints/test_streaming_query.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/app/endpoints/streaming_query.py (4)
src/models/responses.py (1)
StreamedChunk
(553-575)src/utils/endpoints.py (1)
get_system_prompt
(71-102)src/metrics/utils.py (1)
update_llm_token_count_from_turn
(60-77)src/utils/types.py (1)
append_tool_calls_from_llama
(65-78)
🪛 GitHub Actions: Black
src/app/endpoints/streaming_query.py
[error] 1-1: Black formatting check failed. 1 file would be reformatted: src/app/endpoints/streaming_query.py. Command 'uv tool run black --check .' exited with code 1. Run 'black --write .' to fix formatting.
🪛 GitHub Actions: Ruff
src/app/endpoints/streaming_query.py
[error] 32-32: Lint step 'uv tool run ruff check . --per-file-ignores=tests/:S101 --per-file-ignores=scripts/:S101' failed. F401: 'models.responses.StreamedChunk' imported but unused in src/app/endpoints/streaming_query.py:32. 1 fixable with 'ruff --fix'.
🪛 GitHub Actions: Pyright
src/app/endpoints/streaming_query.py
[error] 702-702: Step: uv run pyright src - Pyright error: Cannot access attribute 'usage' for class 'Turn'. Attribute 'usage' is unknown (reportAttributeAccessIssue).
[error] 703-703: Step: uv run pyright src - Pyright error: Cannot access attribute 'usage' for class 'Turn'. Attribute 'usage' is unknown (reportAttributeAccessIssue).
[error] 704-704: Step: uv run pyright src - Pyright error: Cannot access attribute 'usage' for class 'Turn'. Attribute 'usage' is unknown (reportAttributeAccessIssue).
🪛 GitHub Actions: Python linter
src/app/endpoints/streaming_query.py
[warning] 256-256: Trailing whitespace (trailing-whitespace).
[warning] 677-677: Trailing whitespace (trailing-whitespace).
[warning] 680-680: Trailing whitespace (trailing-whitespace).
[warning] 703-703: Line too long (105/100) (line-too-long).
[warning] 704-704: Line too long (110/100) (line-too-long).
[warning] 714-714: Catching too general exception (broad-exception-caught).
[warning] 584-584: Too many statements (57/50) (too-many-statements).
[warning] 32-32: Unused import: StreamedChunk from models.responses (unused-import).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (2)
src/app/endpoints/streaming_query.py (2)
52-56
: LGTM: centralized OLS event constantsConsistent event naming reduces drift. No changes needed.
761-766
: LGTM: explicit media_type on StreamingResponseGood to surface the negotiated media type here.
cfa516a
to
c2a1ccc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/models/requests/test_query_request.py (1)
137-156
: Add test coverage for invalid media_type validation.The test correctly verifies that no warning is logged for supported media types. However, it's missing coverage for the new strict validation that raises
ValueError
for unsupported media types.Add a test case to verify the new validation behavior:
def test_validate_media_type_invalid(self) -> None: """Test the validate_media_type method with invalid media type.""" with pytest.raises( ValueError, match="media_type must be either 'application/json' or 'text/plain'" ): QueryRequest( query="Tell me about Kubernetes", provider="OpenAI", model="gpt-3.5-turbo", media_type="application/xml", )
🧹 Nitpick comments (3)
tests/unit/models/requests/test_query_request.py (1)
148-148
: Consider using the new MEDIA_TYPE_TEXT constant.The test hardcodes
"text/plain"
instead of using the newMEDIA_TYPE_TEXT
constant fromsrc/constants.py
. Using the constant would improve maintainability and ensure consistency with the validation logic.Apply this diff:
+from constants import MEDIA_TYPE_TEXT + class TestQueryRequest:qr = QueryRequest( query="Tell me about Kubernetes", provider="OpenAI", model="gpt-3.5-turbo", - media_type="text/plain", + media_type=MEDIA_TYPE_TEXT, )tests/unit/app/endpoints/test_streaming_query.py (1)
1657-1658
: Address pylint warnings: move imports to module level.While the import-outside-toplevel pattern works, it triggers multiple pylint C0415 warnings throughout the new test classes (lines 1657-1920). Consider moving these imports to the module level for consistency with the project's coding guidelines.
Apply this pattern to all affected test methods:
+from app.endpoints.streaming_query import ( + stream_event, + stream_end_event, + prompt_too_long_error, + generic_llm_error, + LLM_TOKEN_EVENT, + LLM_TOOL_CALL_EVENT, + LLM_TOOL_RESULT_EVENT, +) +from constants import MEDIA_TYPE_JSON, MEDIA_TYPE_TEXT class TestOLSStreamEventFormatting: """Test the stream_event function for both media types (OLS compatibility).""" def test_stream_event_json_token(self): """Test token event formatting for JSON media type.""" - from app.endpoints.streaming_query import stream_event, LLM_TOKEN_EVENT - from constants import MEDIA_TYPE_JSONsrc/app/endpoints/streaming_query.py (1)
665-665
: Consider refactoring to reduce function complexity.The
streaming_query_endpoint_handler
function has 54 statements, exceeding the recommended limit of 50 (R0915). While the current implementation is functional, consider extracting helper functions for:
- User conversation validation (lines 703-720)
- Response generation and metrics (lines 722-831)
This is optional and can be deferred to a future refactor.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/app/endpoints/streaming_query.py
(22 hunks)src/constants.py
(1 hunks)src/models/requests.py
(2 hunks)tests/unit/app/endpoints/test_streaming_query.py
(10 hunks)tests/unit/models/requests/test_query_request.py
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/constants.py
src/models/requests.py
src/app/endpoints/streaming_query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py
: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/constants.py
tests/unit/models/requests/test_query_request.py
tests/unit/app/endpoints/test_streaming_query.py
src/models/requests.py
src/app/endpoints/streaming_query.py
src/constants.py
📄 CodeRabbit inference engine (CLAUDE.md)
Keep shared constants in a central src/constants.py with descriptive comments
Files:
src/constants.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py
: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
tests/unit/models/requests/test_query_request.py
tests/unit/app/endpoints/test_streaming_query.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py
: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
tests/unit/models/requests/test_query_request.py
tests/unit/app/endpoints/test_streaming_query.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}
: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/requests.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py
: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/requests.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/streaming_query.py
🧬 Code graph analysis (2)
tests/unit/app/endpoints/test_streaming_query.py (2)
src/app/endpoints/streaming_query.py (5)
streaming_query_endpoint_handler
(665-852)stream_event
(203-228)stream_end_event
(153-200)prompt_too_long_error
(311-333)generic_llm_error
(336-359)src/models/requests.py (1)
QueryRequest
(72-223)
src/app/endpoints/streaming_query.py (1)
src/models/responses.py (1)
ToolCall
(45-50)
🪛 GitHub Actions: Python linter
tests/unit/app/endpoints/test_streaming_query.py
[warning] 1788-1788: Pylint C0301: Line too long (106/100) (line-too-long)
[warning] 1657-1657: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_event, app.endpoints.streaming_query.LLM_TOKEN_EVENT) (import-outside-toplevel)
[warning] 1658-1658: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1668-1668: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_event, app.endpoints.streaming_query.LLM_TOKEN_EVENT) (import-outside-toplevel)
[warning] 1669-1669: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_TEXT) (import-outside-toplevel)
[warning] 1678-1678: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_event, app.endpoints.streaming_query.LLM_TOOL_CALL_EVENT) (import-outside-toplevel)
[warning] 1679-1679: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1695-1695: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_event, app.endpoints.streaming_query.LLM_TOOL_CALL_EVENT) (import-outside-toplevel)
[warning] 1696-1696: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_TEXT) (import-outside-toplevel)
[warning] 1712-1712: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_event, app.endpoints.streaming_query.LLM_TOOL_RESULT_EVENT) (import-outside-toplevel)
[warning] 1713-1713: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1726-1726: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_event, app.endpoints.streaming_query.LLM_TOOL_RESULT_EVENT) (import-outside-toplevel)
[warning] 1727-1727: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_TEXT) (import-outside-toplevel)
[warning] 1740-1740: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_event) (import-outside-toplevel)
[warning] 1741-1741: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_TEXT) (import-outside-toplevel)
[warning] 1754-1754: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_end_event) (import-outside-toplevel)
[warning] 1755-1755: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1779-1779: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_end_event) (import-outside-toplevel)
[warning] 1780-1780: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1793-1793: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.prompt_too_long_error) (import-outside-toplevel)
[warning] 1794-1794: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1807-1807: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.prompt_too_long_error) (import-outside-toplevel)
[warning] 1808-1808: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1819-1819: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.generic_llm_error) (import-outside-toplevel)
[warning] 1820-1820: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1832-1832: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.generic_llm_error) (import-outside-toplevel)
[warning] 1833-1833: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1846-1846: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.generic_llm_error) (import-outside-toplevel)
[warning] 1847-1847: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1860-1860: Pylint W0621: Redefining name 'QueryRequest' from outer scope (line 47) (redefined-outer-name)
[warning] 1860-1860: Pylint W0404: Reimport 'QueryRequest' (imported line 47) (reimported)
[warning] 1860-1860: Pylint C0415: Import outside toplevel (models.requests.QueryRequest) (import-outside-toplevel)
[warning] 1875-1875: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_event, app.endpoints.streaming_query.LLM_TOKEN_EVENT, app.endpoints.streaming_query.LLM_TOOL_CALL_EVENT, app.endpoints.streaming_query.LLM_TOOL_RESULT_EVENT) (import-outside-toplevel)
[warning] 1881-1881: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
[warning] 1919-1919: Pylint C0415: Import outside toplevel (app.endpoints.streaming_query.stream_end_event) (import-outside-toplevel)
[warning] 1920-1920: Pylint C0415: Import outside toplevel (constants.MEDIA_TYPE_JSON) (import-outside-toplevel)
src/app/endpoints/streaming_query.py
[warning] 844-844: Pylint W0718: Catching too general exception Exception (broad-exception-caught)
[warning] 665-665: Pylint R0915: Too many statements (54/50) (too-many-statements)
🔇 Additional comments (17)
src/constants.py (1)
58-60
: LGTM!The media type constants are well-defined with clear naming and appropriate placement. The comment accurately describes their purpose for streaming responses.
src/models/requests.py (1)
147-151
: LGTM!The field description and examples are clear and accurate, properly documenting the supported media types.
tests/unit/app/endpoints/test_streaming_query.py (5)
186-192
: LGTM: Error handling now returns StreamingResponse.The test correctly validates the updated error handling behavior where connection errors return a
StreamingResponse
withapplication/json
media type instead of raising anHTTPException
. This aligns with the OLS-compatible error handling introduced in the main endpoint.
846-848
: LGTM: Test updated for OLS event type changes.The assertion correctly expects
"event": "token"
instead of the previousturn_complete
event type, aligning with the OLS-compatible event structure.
880-881
: LGTM: Role field removed for OLS compatibility.The assertion correctly validates that the role field is no longer present in the event payload, as documented in the comment on line 880.
1103-1104
: LGTM: Heartbeat now uses token event.The test correctly expects a
"token"
event type with"heartbeat"
value, consistent with the OLS-compatible event structure.
1647-1936
: Excellent OLS compatibility test coverage.The new test classes provide comprehensive coverage of:
- Media type-aware event formatting (JSON vs text/plain)
- End event structure and document references
- Error handling for both media types
- Integration tests validating the complete OLS event structure
The test assertions are thorough and correctly validate the absence of the
role
field and the presence of expected OLS event types.src/app/endpoints/streaming_query.py (10)
109-112
: LGTM: OLS event type constants are well-defined.The new event type constants centralize event naming and improve maintainability. The comment clearly indicates their purpose for OLS compatibility.
153-200
: LGTM: Media type handling correctly implemented.The function properly formats end events for both JSON and text/plain media types. The text format provides a clean, human-readable list of referenced documents.
Note: A past review comment suggests moving
available_quotas
from the top level into thedata
object for consistency with other event payloads (line 198).
203-228
: LGTM: Centralized event formatting with proper error handling.The
stream_event
function provides a clean abstraction for media type-aware event formatting. The error logging for unknown event types (line 221) helps with debugging and monitoring.
231-277
: LGTM: media_type properly propagated through event handlers.The
stream_build_event
signature correctly adds themedia_type
parameter, and all handler invocations consistently pass it through. This ensures media type-aware formatting throughout the event pipeline.
311-333
: LGTM: Appropriate error handling with correct status code.The
prompt_too_long_error
function correctly uses HTTP status code 413 (Request Entity Too Large) for oversized prompts and provides clear error messages for both media types.
336-359
: LGTM: Generic error handler with proper logging.The
generic_llm_error
function appropriately logs exceptions (lines 346-347) and provides consistent error responses for both media types.
365-660
: LGTM: Consistent media type handling across all event handlers.All event handlers have been systematically updated to:
- Accept the
media_type
parameter- Use the
stream_event
abstraction instead of directformat_stream_data
calls- Utilize the new event type constants (LLM_TOKEN_EVENT, LLM_TOOL_CALL_EVENT, LLM_TOOL_RESULT_EVENT)
The refactoring is thorough and maintains consistency across the entire event emission pipeline.
759-789
: LGTM: Media type correctly propagated through response generator.The response generator properly:
- Determines media type from the request with appropriate fallback (line 760)
- Conditionally sends the start event only for JSON responses (lines 763-764)
- Passes media type to all event emission functions (lines 784, 789)
844-852
: Broad exception catch is appropriate for error boundary.The broad
Exception
catch on line 844 serves as a top-level error boundary to ensure any unexpected errors are gracefully converted to streaming error responses rather than causing unhandled exceptions. The pylint W0718 warning can be safely ignored in this context.The error response correctly:
- Determines the appropriate media type (line 846)
- Formats the error using
generic_llm_error
(line 847)- Returns a
StreamingResponse
with proper media type (line 852)
829-831
: LGTM: StreamingResponse properly includes media_type.The response correctly determines and propagates the media type to the
StreamingResponse
constructor, ensuring the HTTP Content-Type header matches the actual response format.
c2a1ccc
to
fca0b38
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/unit/app/endpoints/test_streaming_query.py (1)
1668-1668
: Fix line length.Line exceeds the 100-character limit (106 characters).
Apply this diff to fix:
- expected = 'data: {"event": "token", "data": {"id": 0, "role": "inference", "token": "Hello"}}\n\n' # pylint: disable=line-too-long + expected = ( + 'data: {"event": "token", "data": ' + '{"id": 0, "role": "inference", "token": "Hello"}}\n\n' + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/app/endpoints/streaming_query.py
(22 hunks)src/constants.py
(1 hunks)src/models/requests.py
(2 hunks)tests/unit/app/endpoints/test_streaming_query.py
(11 hunks)tests/unit/models/requests/test_query_request.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- src/constants.py
- tests/unit/models/requests/test_query_request.py
- src/models/requests.py
🧰 Additional context used
📓 Path-based instructions (7)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py
: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
tests/unit/app/endpoints/test_streaming_query.py
src/app/endpoints/streaming_query.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py
: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
tests/unit/app/endpoints/test_streaming_query.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py
: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
tests/unit/app/endpoints/test_streaming_query.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/app/endpoints/streaming_query.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/streaming_query.py
🧬 Code graph analysis (2)
tests/unit/app/endpoints/test_streaming_query.py (2)
src/app/endpoints/streaming_query.py (5)
stream_event
(203-228)stream_end_event
(153-200)prompt_too_long_error
(311-333)generic_llm_error
(336-359)streaming_query_endpoint_handler
(675-864)src/models/requests.py (1)
QueryRequest
(72-223)
src/app/endpoints/streaming_query.py (2)
src/models/responses.py (1)
ToolCall
(45-50)src/models/config.py (1)
Action
(327-366)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: e2e_tests
- GitHub Check: build-pr
🔇 Additional comments (9)
tests/unit/app/endpoints/test_streaming_query.py (2)
43-57
: LGTM! Imports properly support OLS compatibility testing.The new imports for streaming event helpers, error formatters, event type constants, and media type constants are correctly sourced and align with the OLS compatibility requirements described in the PR objectives.
1660-1930
: Excellent OLS compatibility test coverage!The test suite comprehensively validates OLS-compatible streaming behavior across multiple dimensions:
- Event formatting for both JSON and text media types (token, tool_call, tool_result)
- End event structure with referenced documents and quotas
- Error handling for both prompt-too-long and generic errors
- Media type validation
- Complete event structure verification
This thorough coverage will help prevent regressions in OLS compatibility.
src/app/endpoints/streaming_query.py (7)
7-7
: LGTM! Imports and constants properly support OLS compatibility.The additions appropriately support the new OLS-compatible streaming functionality:
AsyncGenerator
enables proper typing for async error generators- Media type constants provide clear semantic values
- Event type constants follow clear naming conventions
Also applies to: 40-40, 109-112
65-84
: LGTM! OpenAPI documentation properly reflects dual media type support.The response documentation now correctly documents both
application/json
andtext/plain
media types with appropriate examples, improving API discoverability for consumers.
153-200
: Media type handling looks good, but note existing concern about available_quotas placement.The function correctly implements media-type-aware formatting for end events. However, a previous review comment noted that
available_quotas
at line 198 sits at the top level alongsideevent
anddata
, which may be inconsistent with client expectations. Consider whether this should be nested inside thedata
object.Based on past review comments
203-228
: LGTM! Clean centralized event formatting.The
stream_event
function provides a clean abstraction for media-type-aware event formatting. The implementation properly handles text/plain with appropriate formatting for each event type and includes error logging for unknown types.
231-670
: Excellent media type propagation throughout event handlers!The changes systematically propagate
media_type
through the entire event pipeline:
- All handler functions consistently accept
media_type
with sensible defaults- All event emissions route through
stream_event
for consistent formatting- Backward compatibility maintained via default parameters
- Clean separation of concerns between event construction and formatting
The implementation demonstrates good architectural discipline.
311-359
: LGTM! Media-type-aware error handling properly implemented.Both error handling functions (
prompt_too_long_error
andgeneric_llm_error
) provide appropriate formatting for both media types:
- Text format: Simple, readable error messages
- JSON format: Structured error responses with appropriate fields (status_code, response, cause)
The functions follow the same clean pattern as
stream_event
and include proper logging.
769-774
: LGTM! Proper media type integration in endpoint handler.The changes correctly integrate media type throughout the streaming lifecycle:
- Media type properly extracted from request with JSON default
- Start event conditionally sent only for JSON (text/plain doesn't need it)
- Media type passed through to all event generators
- Streaming response properly typed with media_type
The broad exception handler at lines 854-864 is intentional (as documented in comments) to ensure all errors are converted to OLS-compatible streaming responses, maintaining the streaming connection. The
APIConnectionError
at line 843 correctly raisesHTTPException
since connection failures prevent streaming.Also applies to: 793-799, 839-864
# await the async function - should return a streaming response with error | ||
response = await streaming_query_endpoint_handler( | ||
request, query_request, auth=MOCK_AUTH | ||
) | ||
|
||
assert isinstance(response, StreamingResponse) | ||
assert response.media_type == "application/json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify connection error handling behavior change.
The test now expects StreamingResponse
with media_type "application/json"
instead of raising HTTPException
. This aligns with the new generic_llm_error
flow but represents a significant behavior change in error handling.
Based on the implementation in src/app/endpoints/streaming_query.py
(lines 854-864), APIConnectionError
still raises HTTPException
while other exceptions return StreamingResponse
. Please confirm this test accurately reflects the expected behavior for connection errors.
🏁 Script executed:
#!/bin/bash
# Verify how APIConnectionError is handled in the endpoint
rg -A 10 'except APIConnectionError' src/app/endpoints/streaming_query.py
Length of output: 438
Adjust test to expect HTTPException for connection errors. The handler’s except APIConnectionError
block raises an HTTPException
(500), so update tests/unit/app/endpoints/test_streaming_query.py
(lines 194–200) to assert that an HTTPException
is raised rather than returning a StreamingResponse
.
🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_streaming_query.py around lines 194 to 200, the
test expects a StreamingResponse but the handler catches APIConnectionError and
raises an HTTPException(500); update the test to assert that calling
streaming_query_endpoint_handler raises HTTPException instead of returning a
StreamingResponse — use the test framework's context manager (e.g.,
pytest.raises(HTTPException)) or equivalent to call the handler and assert the
exception status_code is 500 and any relevant message.
09a77d2
to
c8774bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unit/app/endpoints/test_streaming_query.py (1)
898-942
: Inconsistency: comments claim "role field removed" but implementation includes it.Comments at lines 898, 941, 969, 998, 1029, and 1081 state "Role field removed for OLS compatibility", but the implementation in
streaming_query.py
explicitly includes"role": "inference"
in event data (e.g., lines 456-461, 496-502, 531-534).The new OLS compatibility tests (lines 1902, 1920, 1937) actually assert that the role field is present. Either remove these misleading comments or update the implementation to match the stated intent.
♻️ Duplicate comments (2)
src/models/requests.py (1)
213-223
: Address the previous review comment about using constants.The validation logic correctly raises a
ValueError
for unsupported media types. However, as noted in the previous review comment, the hardcoded strings"application/json"
and"text/plain"
should be replaced with theMEDIA_TYPE_JSON
andMEDIA_TYPE_TEXT
constants fromsrc/constants.py
to prevent maintenance issues where constants and validation logic could drift apart.Based on the previous review comment, apply this refactor:
+from constants import MEDIA_TYPE_JSON, MEDIA_TYPE_TEXT + """Models for REST API requests."""@model_validator(mode="after") def validate_media_type(self) -> Self: """Validate media_type field.""" if self.media_type and self.media_type not in [ - "application/json", - "text/plain", + MEDIA_TYPE_JSON, + MEDIA_TYPE_TEXT, ]: raise ValueError( - "media_type must be either 'application/json' or 'text/plain'" + f"media_type must be either '{MEDIA_TYPE_JSON}' or '{MEDIA_TYPE_TEXT}'" ) return selftests/unit/app/endpoints/test_streaming_query.py (1)
201-207
: Test expects StreamingResponse but handler raises HTTPException.The test expects a
StreamingResponse
to be returned, but the handler'sexcept APIConnectionError
block (lines 871-881 instreaming_query.py
) raises anHTTPException
with status 500. Update the test to assert that anHTTPException
is raised.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/app/endpoints/streaming_query.py
(21 hunks)src/constants.py
(1 hunks)src/models/requests.py
(2 hunks)tests/unit/app/endpoints/test_streaming_query.py
(12 hunks)tests/unit/models/requests/test_query_request.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/constants.py
🧰 Additional context used
📓 Path-based instructions (9)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py
: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
tests/unit/app/endpoints/test_streaming_query.py
src/models/requests.py
tests/unit/models/requests/test_query_request.py
src/app/endpoints/streaming_query.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py
: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
tests/unit/app/endpoints/test_streaming_query.py
tests/unit/models/requests/test_query_request.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py
: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
tests/unit/app/endpoints/test_streaming_query.py
tests/unit/models/requests/test_query_request.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/models/requests.py
src/app/endpoints/streaming_query.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}
: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/requests.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py
: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/requests.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/streaming_query.py
🧬 Code graph analysis (2)
tests/unit/app/endpoints/test_streaming_query.py (2)
src/app/endpoints/streaming_query.py (5)
stream_event
(206-231)stream_end_event
(156-203)prompt_too_long_error
(314-336)generic_llm_error
(339-362)streaming_query_endpoint_handler
(689-892)src/models/requests.py (1)
QueryRequest
(72-223)
src/app/endpoints/streaming_query.py (1)
src/models/responses.py (1)
ToolCall
(47-52)
🪛 GitHub Actions: Black
src/app/endpoints/streaming_query.py
[error] 1-1: Black formatting check failed. 1 file would be reformatted. Run 'black --write' to fix code style issues.
🪛 GitHub Actions: Python linter
src/app/endpoints/streaming_query.py
[error] 235-235: pylint: C0301 Line too long (120/100) in streaming_query.py
🪛 GitHub Actions: Unit tests
tests/unit/app/endpoints/test_streaming_query.py
[error] 1-1: Test failure: streaming query endpoint did not emit expected SSE event 'start'. The assertion '"event": "start"' was not found in the response stream.
[error] 1-1: Test failure: streaming query endpoint with transcripts enabled/disabled failed for SSE content validation (same root cause as above).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (13)
src/models/requests.py (1)
149-150
: LGTM! Field description accurately documents supported media types.The updated field description clearly indicates support for both
"application/json"
and"text/plain"
, which aligns with the PR objectives.tests/unit/models/requests/test_query_request.py (1)
137-156
: Invalid media_type ValueError coverage already exists The invalid‐media_type scenario is covered in tests/unit/app/endpoints/test_streaming_query.py, so no additional test in test_query_request.py is needed.Likely an incorrect or invalid review comment.
tests/unit/app/endpoints/test_streaming_query.py (3)
1678-1956
: LGTM! Comprehensive OLS compatibility test coverage.The new test classes provide excellent coverage for OLS-compatible event formatting across both JSON and text media types. Tests verify:
- Token, tool_call, and tool_result event structures
- End event formatting with referenced documents
- Error handling for prompt_too_long and generic errors
- Media type validation and event structure compliance
43-57
: LGTM! Imports align with OLS compatibility changes.The new imports for streaming helpers, event constants, and media types are properly used throughout the new test classes and align with the implementation in
streaming_query.py
.
799-800
: Include conversation_id in your stream_build_event test setup
The turn_start test callsstream_build_event(chunk, 0, {})
, leavingconversation_id=None
, so the emitted event omits"conversation_id"
. Update the test to pass the expected conversation_id (for example,stream_build_event(chunk, 0, {}, MEDIA_TYPE_JSON, '<your-id>')
) or include it in the metadata_map.src/app/endpoints/streaming_query.py (8)
112-115
: LGTM! OLS-compatible event type constants.The event type constants are well-named and follow the naming convention. They're properly used throughout the event handlers.
156-203
: LGTM! Media-type-aware stream end event.The
stream_end_event
function properly handles both JSON and text media types, formatting referenced documents appropriately for each. The text format provides a clean separator and document list, while JSON maintains the structured SSE format.
206-231
: LGTM! Centralized event formatting with media type support.The
stream_event
function provides clean media-type-aware formatting for all event types. The text/plain branch returns simple tokens or JSON-wrapped tool data, while the JSON branch uses consistent SSE formatting. Unknown event types are logged and return empty string, which is safe.
314-362
: LGTM! Comprehensive media-type-aware error handling.Both
prompt_too_long_error
andgeneric_llm_error
provide appropriate error formatting for JSON and text media types. The JSON format includes structured error details (status_code, response, cause) while text format provides simple error messages. Error logging is appropriate.
368-398
: Verify: text media type emits JSON SSE for start event.When
media_type == MEDIA_TYPE_TEXT
, the start event at line 389-391 still emits a JSON SSE format (data: {...}\n\n
). Is this intentional? The PR description states text/plain support was added, but control events (start, turn_complete) remain in JSON SSE format even for text/plain requests.If this is intentional to keep metadata events structured, consider documenting this behavior. Otherwise, for pure text/plain responses, the start event might not need to be emitted at all (or emit as plain text).
454-683
: Verify role field requirement for OLS compatibility.All event handlers include
"role": "inference"
in their data dictionaries (e.g., lines 456-461, 496-502, 531-534, 569-574, etc.), but test comments claim "role field removed for OLS compatibility". The new OLS compatibility tests actually assert that the role field is present (test lines 1902, 1920, 1937).Clarify the intended behavior: should the role field be included or excluded for OLS compatibility? If it should be included (as the implementation and new tests suggest), remove the misleading comments in the test file.
882-892
: Inconsistent error handling between APIConnectionError and other exceptions.Lines 882-892 catch generic exceptions and return a
StreamingResponse
with error content, butAPIConnectionError
(lines 871-881) raises anHTTPException
. This is inconsistent.If the goal is to return OLS-compatible streaming error responses for all errors (as the comment states), consider also returning a
StreamingResponse
forAPIConnectionError
instead of raisingHTTPException
. Otherwise, clarify why connection errors should terminate the stream while other errors should be streamed.
805-811
: LGTM! Media type propagation through event generation.The
stream_build_event
call properly passesmedia_type
andconversation_id
, andstream_end_event
receivesmedia_type
. The media type flow from request through to event generation is correct.
07bab23
to
1510599
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just couple of nits
|
||
# media_type is not used in 'lightspeed-stack' that only supports application/json. | ||
# the field is kept here to enable compatibility with 'road-core' clients. | ||
media_type: Optional[str] = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add media_type
to the QueryRequest docstring, it was missing :)
src/models/requests.py
Outdated
"media_type was set in the request but is not supported. The value will be ignored." | ||
"""Validate media_type field.""" | ||
if self.media_type and self.media_type not in [ | ||
"application/json", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As pointed out by coderabbit, use constants for media types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in overall, just pls look at @are-ces 's comment.
1510599
to
908e0f3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/unit/app/endpoints/test_streaming_query.py (1)
201-207
: Test expectation doesn't match implementation behavior.The test expects
StreamingResponse
, but the handler'sAPIConnectionError
block (lines 881-891 insrc/app/endpoints/streaming_query.py
) raisesHTTPException(500)
. Update the test to assert that anHTTPException
is raised with status code 500.Apply this diff:
- # await the async function - should return a streaming response with error - response = await streaming_query_endpoint_handler( - request, query_request, auth=MOCK_AUTH - ) - - assert isinstance(response, StreamingResponse) - assert response.media_type == "application/json" + # await the async function - should raise HTTPException + with pytest.raises(HTTPException) as exc_info: + await streaming_query_endpoint_handler( + request, query_request, auth=MOCK_AUTH + ) + + assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR + assert "Unable to connect to Llama Stack" in exc_info.value.detail["response"]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/app/endpoints/streaming_query.py
(21 hunks)src/constants.py
(1 hunks)src/models/requests.py
(4 hunks)tests/unit/app/endpoints/test_streaming_query.py
(12 hunks)tests/unit/models/requests/test_query_request.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/unit/models/requests/test_query_request.py
- src/models/requests.py
🧰 Additional context used
📓 Path-based instructions (8)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/constants.py
src/app/endpoints/streaming_query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py
: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/constants.py
tests/unit/app/endpoints/test_streaming_query.py
src/app/endpoints/streaming_query.py
src/constants.py
📄 CodeRabbit inference engine (CLAUDE.md)
Keep shared constants in a central src/constants.py with descriptive comments
Files:
src/constants.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py
: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
tests/unit/app/endpoints/test_streaming_query.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py
: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
tests/unit/app/endpoints/test_streaming_query.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/streaming_query.py
🧬 Code graph analysis (2)
tests/unit/app/endpoints/test_streaming_query.py (2)
src/app/endpoints/streaming_query.py (5)
stream_event
(210-235)stream_end_event
(157-207)prompt_too_long_error
(322-344)generic_llm_error
(347-370)streaming_query_endpoint_handler
(689-902)src/models/requests.py (1)
QueryRequest
(73-225)
src/app/endpoints/streaming_query.py (1)
src/models/responses.py (1)
ToolCall
(96-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (ci)
🔇 Additional comments (13)
src/constants.py (1)
119-122
: LGTM!The media type constants are well-defined, follow naming conventions, and use standard MIME types. The descriptive comment provides clear context for their usage in streaming responses.
tests/unit/app/endpoints/test_streaming_query.py (3)
43-49
: LGTM!The new imports for streaming helpers and event constants are necessary for the OLS compatibility tests. The imports are correctly organized and follow Python conventions.
57-57
: LGTM!The import of media type constants is correct and necessary for the OLS compatibility tests.
1732-1993
: LGTM!The new OLS compatibility test classes are comprehensive and well-structured. They effectively validate:
- Media type-aware formatting for both JSON and text outputs
- Correct event types (token, tool_call, tool_result, end)
- Proper data structure without the "role" field (OLS compatibility)
- Error handling for both prompt_too_long and generic errors
- End event structure with all required fields
The tests follow pytest conventions and provide good coverage for the OLS compatibility features.
src/app/endpoints/streaming_query.py (9)
7-7
: LGTM!The import additions are appropriate:
uuid
is used for generating conversation IDscast
from typing is used for type safety- Media type constants are imported from the centralized constants module
All imports are used in the code and follow Python conventions.
Also applies to: 9-9, 43-43
113-116
: LGTM!The OLS-compatible event type constants are well-defined and clearly documented. They provide a single source of truth for event types used throughout the streaming logic.
69-87
: LGTM!The OpenAPI response schema has been appropriately updated to document both media types (application/json and text/plain). The examples clearly illustrate the expected format for each media type, improving API documentation.
157-207
: LGTM!The
stream_end_event
function has been effectively updated to support both media types:
- Text/plain: returns formatted reference documents with a separator
- JSON: returns structured SSE-formatted data with referenced documents
The default parameter ensures backward compatibility, and the implementation correctly filters and formats metadata.
210-235
: LGTM!The
stream_event
function provides clean centralized formatting for streaming events:
- Text/plain: extracts token text directly or formats tool events as JSON strings
- JSON: wraps events in SSE format consistently
- Includes error logging for unknown event types in text mode
The implementation is clear and maintainable.
238-244
: LGTM!The updated function signature for
stream_build_event
properly adds media_type and conversation_id parameters with appropriate defaults. The multi-line formatting follows Python conventions and stays within line length limits.
264-288
: LGTM!The dispatch logic in
stream_build_event
correctly passes the media_type parameter to all handlers, and appropriately forwards conversation_id to the turn_start handler. The pattern matching is clean and maintainable.
294-370
: LGTM!The error handling functions are well-structured and provide OLS-compatible responses:
_handle_error_event
: handles chunk errors with media type awarenessprompt_too_long_error
: returns 413 status code with descriptive error detailsgeneric_llm_error
: provides structured error responses for generic LLM failuresAll functions follow consistent patterns, include appropriate logging, and format errors correctly for both media types.
376-684
: LGTM!All event handlers have been consistently updated to support media type awareness:
- Handlers now accept
media_type
parameter with appropriate defaults- Most handlers use
stream_event
for consistent formatting- Each handler correctly branches on media type to produce text or JSON output
- Tool execution handler properly handles different tool types (memory queries, knowledge search, etc.)
The implementation maintains consistency across all event types while providing the flexibility needed for OLS compatibility.
908e0f3
to
ea5eadd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/app/endpoints/streaming_query.py (1)
195-207
: Consider moving available_quotas into the data object for consistency.Currently,
available_quotas
sits at the top level alongsideevent
anddata
(line 205). This is inconsistent with other events where all payload information is nested insidedata
. For a uniform event structure, consider moving it inside thedata
object:return format_stream_data( { "event": "end", "data": { "rag_chunks": [], # TODO(jboos): implement RAG chunks when summary is available "referenced_documents": referenced_docs_dict, "truncated": None, # TODO(jboos): implement truncated "input_tokens": 0, # TODO(jboos): implement input tokens "output_tokens": 0, # TODO(jboos): implement output tokens + "available_quotas": {}, # TODO(jboos): implement available quotas }, - "available_quotas": {}, # TODO(jboos): implement available quotas } )This maintains consistency with the event structure used throughout the streaming API.
tests/unit/app/endpoints/test_streaming_query.py (1)
201-207
: Fix test to expect HTTPException for connection errors.The test expects a
StreamingResponse
, but the handler'sexcept APIConnectionError
block (lines 880-891 instreaming_query.py
) raises anHTTPException
with status 500. This mismatch is causing the pipeline failure.Update the test to assert that an
HTTPException
is raised:- # await the async function - should return a streaming response with error - response = await streaming_query_endpoint_handler( - request, query_request, auth=MOCK_AUTH - ) - - assert isinstance(response, StreamingResponse) - assert response.media_type == "application/json" + # await the async function - should raise HTTPException for connection errors + with pytest.raises(HTTPException) as exc_info: + await streaming_query_endpoint_handler( + request, query_request, auth=MOCK_AUTH + ) + + assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR + assert "Unable to connect to Llama Stack" in exc_info.value.detail["response"]As per coding guidelines
🧹 Nitpick comments (1)
src/app/endpoints/streaming_query.py (1)
376-408
: Optional: Reformat long line instead of using pylint disable.Line 400 uses an explicit
pylint: disable=line-too-long
comment. Consider breaking the line for better readability:- yield ( - f"data: {json.dumps({'event': 'start', 'data': {'conversation_id': conversation_id}})}\n\n" # pylint: disable=line-too-long - ) + yield ( + f"data: {json.dumps(" + f"{{'event': 'start', 'data': {{'conversation_id': '{conversation_id}'}}}}" + f")}\n\n" + )This maintains readability while staying within the line length limit.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/app/endpoints/streaming_query.py
(21 hunks)src/constants.py
(1 hunks)src/models/requests.py
(4 hunks)tests/unit/app/endpoints/test_streaming_query.py
(12 hunks)tests/unit/models/requests/test_query_request.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/unit/models/requests/test_query_request.py
- src/constants.py
🧰 Additional context used
📓 Path-based instructions (9)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/models/requests.py
src/app/endpoints/streaming_query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py
: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/models/requests.py
tests/unit/app/endpoints/test_streaming_query.py
src/app/endpoints/streaming_query.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}
: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/requests.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py
: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/requests.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py
: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
tests/unit/app/endpoints/test_streaming_query.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py
: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
tests/unit/app/endpoints/test_streaming_query.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/streaming_query.py
🧬 Code graph analysis (2)
tests/unit/app/endpoints/test_streaming_query.py (2)
src/app/endpoints/streaming_query.py (5)
stream_event
(210-235)stream_end_event
(157-207)prompt_too_long_error
(322-344)generic_llm_error
(347-370)streaming_query_endpoint_handler
(689-906)src/models/requests.py (2)
QueryRequest
(73-225)Attachment
(16-70)
src/app/endpoints/streaming_query.py (1)
src/models/responses.py (1)
ToolCall
(96-101)
🪛 GitHub Actions: Unit tests
tests/unit/app/endpoints/test_streaming_query.py
[error] 1-1: Test 'test_streaming_query_endpoint_on_connection_error' failed: response.media_type == 'application/json' but was 'text/event-stream'.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: e2e_tests (ci)
- GitHub Check: build-pr
🔇 Additional comments (7)
src/models/requests.py (1)
11-11
: LGTM! Media type validation properly implemented.The changes correctly:
- Import media type constants from the constants module
- Update the field description and examples to use the constants
- Enforce validation by raising
ValueError
for invalid media types instead of just logging a warningThis aligns with the PR objective of ensuring OLS compatibility by strictly validating supported media types.
Also applies to: 84-84, 151-152, 217-224
tests/unit/app/endpoints/test_streaming_query.py (2)
43-50
: LGTM! Test imports align with new public API.The new imports correctly reference the expanded public API from
streaming_query.py
(stream_event, stream_end_event, error handlers, event constants) and media type constants. These are necessary for testing OLS compatibility features.Also applies to: 57-57
1725-1993
: LGTM! Comprehensive OLS compatibility tests.The new test classes thoroughly validate:
- Event formatting for both JSON and text media types
- Token, tool call, and tool result event structures
- End event structure with referenced documents
- Error handling for prompt too long and generic LLM errors
- Media type validation in QueryRequest
- Proper exclusion of "role" field for OLS compatibility
The tests are well-organized and follow pytest best practices.
src/app/endpoints/streaming_query.py (4)
113-116
: LGTM! OLS-compatible event constants.The new event type constants (
LLM_TOKEN_EVENT
,LLM_TOOL_CALL_EVENT
,LLM_TOOL_RESULT_EVENT
) provide clear abstractions for the different event types and align with OLS compatibility requirements.
210-235
: LGTM! Central event formatting function.The
stream_event
function properly centralizes media-type-aware event formatting. It correctly:
- Returns plain text tokens for
MEDIA_TYPE_TEXT
- Returns SSE-formatted JSON for
MEDIA_TYPE_JSON
- Handles tool call and tool result events appropriately
- Logs unknown event types and returns empty string as a safe fallback
The function signature includes complete type hints and a clear docstring following Google style.
322-370
: LGTM! Well-structured error handling functions.The error handling functions (
prompt_too_long_error
andgeneric_llm_error
) properly:
- Support both text and JSON media types
- Include appropriate error details (status_code, response, cause)
- Use proper logging (error level with exception details)
- Follow Google-style docstrings with complete type hints
- Return OLS-compatible error structures
These align with the PR objective of providing richer error objects compatible with OLS.
784-785
: LGTM! Media type properly propagated throughout streaming flow.The media type is correctly:
- Determined from
query_request.media_type
or defaulted toMEDIA_TYPE_JSON
(line 785)- Passed to
stream_build_event
along withconversation_id
(lines 809-811)- Passed to
stream_end_event
(line 815)- Used to set appropriate HTTP Content-Type header as
text/event-stream
for SSE (line 878)This ensures consistent media-type-aware formatting throughout the streaming response lifecycle.
Also applies to: 809-811, 815-815, 874-879
ea5eadd
to
f4fe32f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/unit/app/endpoints/test_streaming_query.py (1)
201-207
: Critical: Test expects wrong response type for connection errors.The test expects a
StreamingResponse
, but the implementation atsrc/app/endpoints/streaming_query.py
lines 880-891 raises anHTTPException
with status code 500 forAPIConnectionError
. This test will fail when the connection error path is exercised.Apply this diff to fix the test:
- # await the async function - should return a streaming response with error - response = await streaming_query_endpoint_handler( - request, query_request, auth=MOCK_AUTH - ) - - assert isinstance(response, StreamingResponse) - assert response.media_type == "text/event-stream" + # await the async function - should raise HTTPException for connection errors + with pytest.raises(HTTPException) as exc_info: + await streaming_query_endpoint_handler( + request, query_request, auth=MOCK_AUTH + ) + + assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR + assert "Unable to connect to Llama Stack" in exc_info.value.detail["response"]
🧹 Nitpick comments (2)
src/models/requests.py (1)
84-84
: Optional: Update docstring to use constant names.The docstring at line 84 mentions "application/json or text/plain" as hardcoded strings. For consistency, consider referencing the constant names instead.
Apply this diff:
- media_type: The optional media type for response format (application/json or text/plain). + media_type: The optional media type for response format (MEDIA_TYPE_JSON or MEDIA_TYPE_TEXT).src/app/endpoints/streaming_query.py (1)
376-408
: Simplify text media type start event formatting.For
MEDIA_TYPE_TEXT
at line 400, the function still outputs SSE-formatted JSON. This is inconsistent with howstream_event
handles text media type (which returns plain text for tokens). While this might be intentional for control events, it creates confusion and results in a very long line that exceeds limits.Consider either:
- Using
stream_event
for consistency, or- Simplifying the text case if control events must always be JSON
Apply this diff to improve readability at minimum:
if media_type == MEDIA_TYPE_TEXT: - yield ( - f"data: {json.dumps({'event': 'start', 'data': {'conversation_id': conversation_id}})}\n\n" # pylint: disable=line-too-long - ) + start_data = { + 'event': 'start', + 'data': {'conversation_id': conversation_id} + } + yield f"data: {json.dumps(start_data)}\n\n" else: yield format_stream_data( { "event": "start", "data": {"conversation_id": conversation_id}, } )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/app/endpoints/streaming_query.py
(21 hunks)src/constants.py
(1 hunks)src/models/requests.py
(4 hunks)tests/unit/app/endpoints/test_streaming_query.py
(12 hunks)tests/unit/models/requests/test_query_request.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/models/requests/test_query_request.py
🧰 Additional context used
📓 Path-based instructions (10)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/models/requests.py
src/constants.py
src/app/endpoints/streaming_query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py
: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/models/requests.py
src/constants.py
tests/unit/app/endpoints/test_streaming_query.py
src/app/endpoints/streaming_query.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}
: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/requests.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py
: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/requests.py
src/constants.py
📄 CodeRabbit inference engine (CLAUDE.md)
Keep shared constants in a central src/constants.py with descriptive comments
Files:
src/constants.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py
: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
tests/unit/app/endpoints/test_streaming_query.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py
: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
tests/unit/app/endpoints/test_streaming_query.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/streaming_query.py
🧬 Code graph analysis (2)
tests/unit/app/endpoints/test_streaming_query.py (3)
src/app/endpoints/streaming_query.py (4)
stream_event
(210-235)stream_end_event
(157-207)prompt_too_long_error
(322-344)generic_llm_error
(347-370)src/models/requests.py (1)
QueryRequest
(73-225)src/utils/types.py (2)
ToolCallSummary
(73-86)TurnSummary
(89-163)
src/app/endpoints/streaming_query.py (1)
src/models/responses.py (1)
ToolCall
(96-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (ci)
🔇 Additional comments (8)
src/models/requests.py (1)
11-11
: LGTM! Constants are properly imported and used in validation.The validation logic correctly uses
MEDIA_TYPE_JSON
andMEDIA_TYPE_TEXT
constants fromsrc/constants.py
, addressing the previous review feedback about avoiding hardcoded strings.Also applies to: 218-224
tests/unit/app/endpoints/test_streaming_query.py (1)
1732-1976
: LGTM! Comprehensive OLS compatibility test coverage.The new OLS compatibility tests thoroughly exercise:
- Event formatting for both JSON and text media types
- Token, tool call, and tool result events
- End event structure with referenced documents
- Error handling for both media types
- Integration scenarios verifying role field omission
The test organization and coverage are excellent.
src/app/endpoints/streaming_query.py (6)
113-117
: LGTM! Event type constants are well-defined.The OLS-compatible event type constants are clearly defined and appropriately scoped for use throughout the streaming logic.
210-235
: LGTM! Stream event formatting correctly handles both media types.The
stream_event
function properly routes based on media type:
- Text media type returns plain tokens and formatted tool events
- JSON media type returns SSE-formatted events with proper structure
- Unknown event types are logged and handled gracefully
157-207
: LGTM! End event formatting is media-type aware.The function correctly handles both media types:
- Text returns plain reference document links
- JSON returns structured SSE with referenced documents and placeholder token counts
The placement of
available_quotas
at the root level (line 205) is consistent with test expectations intest_streaming_query.py
line 1993.
322-370
: LGTM! Error handling functions provide OLS-compatible responses.Both
prompt_too_long_error
andgeneric_llm_error
correctly:
- Handle media type routing for text vs JSON responses
- Include appropriate status codes and error details
- Log errors at appropriate levels
- Provide structured error information compatible with OLS
411-684
: LGTM! Media type is properly propagated through all event handlers.All event handlers have been updated to:
- Accept
media_type
parameter with appropriate defaults- Use
stream_event
for consistent media-type-aware formatting- Maintain consistent patterns for token, tool call, and tool result events
The refactoring ensures OLS compatibility across all event types.
784-786
: LGTM! Media type handling and exception flow are correct.The implementation properly:
- Derives
media_type
from the query request (lines 784-786)- Propagates it through event generation (lines 809-815)
- Uses it for response formatting (lines 874-879)
- Handles errors with OLS-compatible responses (lines 892-906)
- Sets appropriate Content-Type headers based on media type (lines 902-905)
The exception handling correctly distinguishes between
APIConnectionError
(raises HTTPException) and other errors (returns streaming error response).Also applies to: 809-815, 874-906
LGTM nice work |
Goal: with minimal filtering, OLS can use the LCORE streaming query endpoint.
Today, OLS cannot use the original lightspeed-stack streaming query endpoint with just filtering. The following changes are necessary for OLS compatibility.
Tool Event Ambiguity Problem
Original lightspeed-stack format:
OLS expected format:
Both events have identical
event
type androle
field. A filter cannot distinguish between a tool call and its result without semantic understanding of the tool execution lifecycle. It's possible (by just looking for double tool_call events) but annoying.Data Structure Mismatches
Original format:
OLS expected:
Structural differences require data transformation, not just filtering. Again, nothing that can't actually be solved with some additional code but it's annoying for OLS so let's just fix it.
Media Type Support
OLS supports both
application/json
andtext/plain
so I addedtext/plain
as a supported mediatype.Error Handling
Original format:
OLS expected:
Let's just provide better errors for OLS.
Role Field
The role field is not required by OLS or Ansible so I removed it
Summary by CodeRabbit
New Features
Bug Fixes
Tests