LCORE-724: Endpoints for conversation cache v2 #591
Conversation
Walkthrough
Adds conversation caching and a new v2 conversations API: the query and streaming endpoints now persist query/response pairs into the configured conversation cache, a new utility stores CacheEntry objects, and a versioned conversations_v2 router with list/get/delete endpoints is introduced and mounted at /v2.
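The conversations_v2 router itself is not shown in this excerpt. Below is a minimal sketch of what a list/get/delete router mounted at /v2 could look like; the handler names, paths, and in-memory store are assumptions for illustration, not the PR's actual code.

```python
"""Illustrative shape of a versioned conversations router (assumed, not actual)."""

from fastapi import APIRouter, HTTPException, status

router = APIRouter(prefix="/conversations", tags=["conversations_v2"])

# Hypothetical stand-in for the configured conversation cache.
_cache: dict[str, list[dict]] = {}


@router.get("")
async def list_conversations() -> list[str]:
    """List identifiers of all cached conversations."""
    return list(_cache)


@router.get("/{conversation_id}")
async def get_conversation(conversation_id: str) -> list[dict]:
    """Return the cached query/response pairs for one conversation."""
    if conversation_id not in _cache:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Conversation not found",
        )
    return _cache[conversation_id]


@router.delete("/{conversation_id}")
async def delete_conversation(conversation_id: str) -> dict:
    """Drop one conversation from the cache."""
    if _cache.pop(conversation_id, None) is None:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Conversation not found",
        )
    return {"deleted": conversation_id}
```

Mounting under the version prefix would then be a one-liner such as app.include_router(router, prefix="/v2").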
Sequence Diagram(s)
```mermaid
sequenceDiagram
    autonumber
    participant C as Client
    participant Q as QueryEndpoint
    participant L as LLM Provider
    participant K as Conversation Cache
    C->>Q: POST /query (query, ids, provider/model)
    Q->>L: generate(query, provider_id, model_id)
    L-->>Q: llm_response
    Q->>K: store_conversation_into_cache(config, user_id, conv_id, provider_id, model_id, query, llm_response)
    K-->>Q: ack
    Q-->>C: 200 OK (response)
```
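In code, the first diagram's flow might reduce to the sketch below; generate and store_conversation_into_cache here are stand-ins with the argument order shown in the diagram, not the PR's actual signatures.

```python
"""Sketch of the non-streaming query flow (stand-in names, assumed signatures)."""

from typing import Any


async def generate(query: str, provider_id: str, model_id: str) -> str:
    """Stand-in for the LLM provider call."""
    return f"answer to {query!r} via {provider_id}/{model_id}"


def store_conversation_into_cache(  # signature mirrors the diagram: 7 arguments
    config: Any,
    user_id: str,
    conversation_id: str,
    provider_id: str,
    model_id: str,
    query: str,
    response: str,
) -> None:
    """Placeholder for the cache write; the real helper lives in src/utils/endpoints.py."""


async def handle_query(
    config: Any,
    user_id: str,
    conversation_id: str,
    provider_id: str,
    model_id: str,
    query: str,
) -> str:
    """Generate a response, persist the query/response pair, and return the answer."""
    response = await generate(query, provider_id, model_id)
    store_conversation_into_cache(
        config, user_id, conversation_id, provider_id, model_id, query, response
    )
    return response
```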
```mermaid
sequenceDiagram
    autonumber
    participant C as Client
    participant S as StreamingQueryEndpoint
    participant L as LLM (stream)
    participant T as Transcript Store
    participant K as Conversation Cache
    C->>S: POST /streaming_query (query, ids)
    S->>L: start_stream(query, provider_id, model_id)
    loop stream chunks
        L-->>S: chunk/delta
        S-->>C: SSE chunk
    end
    S->>T: persist transcript/attachments
    S->>K: store_conversation_into_cache(config, user_id, conv_id, provider_id, model_id, query, final_response)
    K-->>S: ack
    S-->>C: stream complete
```
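In the streaming flow the cache write happens only after the last chunk. A small sketch of that accumulate-then-persist pattern, with assumed names:

```python
"""Accumulate-then-persist pattern for the streaming flow (names assumed)."""

from collections.abc import AsyncIterator, Callable


async def stream_and_cache(
    chunks: AsyncIterator[str],
    persist: Callable[[str], None],
) -> AsyncIterator[str]:
    """Forward chunks to the client, then cache the assembled response."""
    pieces: list[str] = []
    async for chunk in chunks:
        pieces.append(chunk)
        yield chunk  # emitted to the client as an SSE event
    # Only the final, complete response is written to the conversation cache.
    persist("".join(pieces))
```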
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/app/endpoints/query.py (3 hunks)
- src/app/endpoints/streaming_query.py (2 hunks)
- src/utils/endpoints.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
- Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
- src/utils/endpoints.py
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
- All modules start with descriptive module-level docstrings explaining purpose
- Use logger = logging.getLogger(__name__) for module logging after import logging
- Define type aliases at module level for clarity
- All functions require docstrings with brief descriptions
- Provide complete type annotations for all function parameters and return types
- Use typing_extensions.Self in model validators where appropriate
- Use modern union syntax (str | int) and Optional[T] or T | None consistently
- Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
- Avoid in-place parameter modification; return new data structures instead of mutating arguments
- Use appropriate logging levels: debug, info, warning, error with clear messages
- All classes require descriptive docstrings explaining purpose
- Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
- Abstract base classes should use abc.ABC and @abstractmethod for interfaces
- Provide complete type annotations for all class attributes
- Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
- src/utils/endpoints.py
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py
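A short toy module illustrating several of these module-level conventions at once; the function and type alias are invented purely for demonstration:

```python
"""Toy module showing the conventions above (docstring, logger, alias, typing)."""

import logging

logger = logging.getLogger(__name__)

# Type alias defined at module level for clarity.
ConversationHistory = list[tuple[str, str]]


def get_last_response(history: ConversationHistory) -> str | None:
    """Return the most recent response, or None for an empty history."""
    if not history:
        logger.debug("Empty conversation history")
        return None
    return history[-1][1]
```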
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
- Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py

src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
- Use async def for I/O-bound operations and external API calls
Files:
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py

src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
- In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
- src/app/endpoints/streaming_query.py
- src/app/endpoints/query.py
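Combining the three endpoint-specific guidelines above (standard FastAPI imports, async def for I/O, HTTPException for errors), a compliant endpoint might look like the following; fetch_conversation is a hypothetical helper, not a function from this repository:

```python
"""Toy endpoint combining the three guidelines (illustrative helper names)."""

from fastapi import APIRouter, HTTPException, status

router = APIRouter()


async def fetch_conversation(conversation_id: str) -> dict | None:
    """Hypothetical async store lookup (I/O-bound, hence async def)."""
    return None


@router.get("/conversations/{conversation_id}")
async def get_conversation(conversation_id: str) -> dict:
    """Fetch a conversation, raising 404 when it is unknown."""
    conversation = await fetch_conversation(conversation_id)
    if conversation is None:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=f"Conversation {conversation_id} not found",
        )
    return conversation
```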
🧠 Learnings (1)
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
PR: lightspeed-core/lightspeed-stack#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/client.py : Use Llama Stack client import: from llama_stack_client import AsyncLlamaStackClient
Applied to files:
src/utils/endpoints.py
src/app/endpoints/query.py
🧬 Code graph analysis (3)
src/utils/endpoints.py (2)
- src/models/cache_entry.py (1): CacheEntry (6-19)
- src/configuration.py (3): AppConfig (36-153), conversation_cache_configuration (133-137), conversation_cache (147-153)

src/app/endpoints/streaming_query.py (1)
- src/utils/endpoints.py (1): store_conversation_into_cache (139-157)

src/app/endpoints/query.py (2)
- src/models/cache_entry.py (1): CacheEntry (6-19)
- src/utils/endpoints.py (1): store_conversation_into_cache (139-157)
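Per this graph, CacheEntry occupies lines 6-19 of src/models/cache_entry.py. A plausible shape for the model is sketched below; the pydantic base and the field names are guesses inferred from the endpoint signatures, not the PR's actual definition:

```python
"""Guessed shape of CacheEntry (the real model is in src/models/cache_entry.py)."""

from pydantic import BaseModel


class CacheEntry(BaseModel):
    """One cached query/response pair for a conversation."""

    query: str
    response: str
    provider: str
    model: str
```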
🪛 GitHub Actions: Python linter
src/utils/endpoints.py
[error] 139-139: pylint: R0913 Too many arguments (7/5) in function definition (too-many-arguments).
[error] 139-139: pylint: R0917 Too many positional arguments (7/5) (too-many-positional-arguments).
🪛 GitHub Actions: Pyright
src/utils/endpoints.py
[error] 157-157: Pyright error: 'insert_or_append' is not a known attribute of 'None' (reportOptionalMemberAccess).
🪛 GitHub Actions: Ruff
src/app/endpoints/query.py
[error] 36-36: Ruff: F401 unused import 'models.cache_entry.CacheEntry'. Remove unused import. This is fixable with 'ruff --fix'.
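One way all three findings could be resolved (a sketch under assumptions, not the PR's actual fix): drop the unused CacheEntry import from query.py, bundle the five scalar fields into a CacheEntry argument to get under pylint's five-argument limit, and guard the optional cache so Pyright sees it narrowed before the method call. The insert_or_append signature is inferred from the Pyright message.

```python
"""Sketch of the utility after the lint fixes (signatures assumed, not verified)."""

import logging

logger = logging.getLogger(__name__)


def store_conversation_into_cache(
    config: "AppConfig",   # defined in src/configuration.py
    user_id: str,
    conversation_id: str,
    entry: "CacheEntry",   # bundles the five scalars, satisfying R0913/R0917
) -> None:
    """Persist one query/response pair into the configured conversation cache."""
    cache = config.conversation_cache
    if cache is None:
        # Narrowing the Optional here resolves Pyright's
        # reportOptionalMemberAccess on the call below.
        logger.debug("No conversation cache configured; skipping store")
        return
    cache.insert_or_append(user_id, conversation_id, entry)
```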
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
Force-pushed from 2af7fad to 659a32c (Compare)
Force-pushed from 659a32c to 9bf87a4 (Compare)
Force-pushed from c441403 to a4fb511 (Compare)
Force-pushed from a4fb511 to 79c3f84 (Compare)
@coderabbitai: full review
@tisnik I'll conduct a full review of the PR changes for the conversation cache v2 endpoints.
✅ Actions performed: Full review triggered.
Description
LCORE-724: Endpoints for conversation cache v2
Type of change
Related Tickets & Documents
Summary by CodeRabbit