
Conversation

@eranco74
Contributor

@eranco74 eranco74 commented Jul 15, 2025

Add /conversations endpoint for conversation history management

Description

  • Add GET /v1/conversations/{conversation_id} to retrieve conversation history
  • Add DELETE /v1/conversations/{conversation_id} to delete conversations
  • Use llama-stack client.agents.session.retrieve and .delete methods (see the sketch after this list)
  • Map conversation ID to agent ID for LlamaStack operations
  • Add ConversationResponse and ConversationDeleteResponse models
  • Include conversations router in main app routing
  • Maintain consistent error handling and authentication patterns
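
A minimal sketch of what these endpoints look like (illustrative only, not the exact PR code: the client URL, the in-memory mapping, and authentication are simplified, and llama-stack client method signatures may differ between versions):

```python
# Illustrative sketch only: helper names, the client URL, and the in-memory
# mapping are assumptions, not the PR code.
from fastapi import APIRouter, HTTPException, status
from llama_stack_client import LlamaStackClient

router = APIRouter(prefix="/conversations", tags=["conversations"])

# Populated by the query/streaming endpoints after an agent is created.
conversation_id_to_agent_id: dict[str, str] = {}

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed llama-stack URL


def _agent_id_for(conversation_id: str) -> str:
    agent_id = conversation_id_to_agent_id.get(conversation_id)
    if agent_id is None:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail=f"Conversation {conversation_id} not found",
        )
    return agent_id


@router.get("/{conversation_id}")
def get_conversation(conversation_id: str) -> dict:
    """Return the llama-stack session data for a conversation."""
    agent_id = _agent_id_for(conversation_id)
    session = client.agents.session.retrieve(
        session_id=conversation_id, agent_id=agent_id
    )
    return {"conversation_id": conversation_id, "session_data": session.model_dump()}


@router.delete("/{conversation_id}")
def delete_conversation(conversation_id: str) -> dict:
    """Delete the llama-stack session behind a conversation."""
    agent_id = _agent_id_for(conversation_id)
    client.agents.session.delete(session_id=conversation_id, agent_id=agent_id)
    return {"conversation_id": conversation_id, "success": True}
```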

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

LCORE-354

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features
    • Introduced API endpoints to retrieve and delete conversation history with comprehensive responses and error handling.
    • Added response formats for conversation retrieval and deletion actions to enhance API documentation.
    • Registered conversation-related routes under the versioned API prefix.
  • Bug Fixes
    • Enhanced mapping between conversation IDs and agent IDs for improved session management reliability.
  • Tests
    • Updated tests to cover the new conversations API endpoints and verify their integration within the application.

@eranco74
Contributor Author

eranco74 commented Jul 15, 2025

curl -s -X 'GET' -H "Authorization: Bearer ${OCM_TOKEN}" 'http://0.0.0.0:8090/v1/conversations/46f02ed5-29ee-41b6-a452-25938f926a52' -H 'accept: application/json' | jq
{
  "conversation_id": "46f02ed5-29ee-41b6-a452-25938f926a52",
  "session_data": {
    "session_id": "46f02ed5-29ee-41b6-a452-25938f926a52",
    "session_name": "404b89b4-fa13-4c6e-b3f7-3ae079bd1f3b",
    "started_at": "2025-07-15T14:14:41.556546Z",
    "turns": [
      {
        "input_messages": [
          {
            "content": "what can you do?",
            "role": "user",
            "context": null
          }
        ],
        "output_message": {
          "content": "I can help you with various aspects of OpenShift cluster management using the assisted installer. Here's a list of things I can do:\n\n*   **Cluster Information:**\n    *   Retrieve comprehensive information about a specific cluster.\n    *   List all clusters associated with your account.\n    *   Get events related to a specific cluster to track installation progress and diagnose issues.\n*   **Host Management:**\n    *   Get events specific to a particular host within a cluster.\n    *   Assign a role to a host (master, worker, or auto-assign).\n*   **Cluster Creation and Configuration:**\n    *   Create a new OpenShift cluster.\n    *   Configure virtual IP addresses (VIPs) for cluster API and ingress traffic.\n    *   Add operator bundles to be installed with the cluster.\n*   **Installation:**\n    *   Trigger the installation process for a prepared cluster.\n*   **Other:**\n    *   List available OpenShift versions for installation.\n    *   List available operator bundles for cluster installation.\n    *   Get the ISO download URL for a cluster.\n\nTo get started, you can ask me questions like:\n\n*   \"What is the status of cluster `<cluster_id>`?\"\n*   \"Create a cluster named `<cluster_name>` with version `<version>`.\"\n*   \"List all available OpenShift versions.\"\n\nHow can I assist you today?\n",
          "role": "assistant",
          "stop_reason": "end_of_turn",
          "tool_calls": []
        },
        "session_id": "46f02ed5-29ee-41b6-a452-25938f926a52",
        "started_at": "2025-07-15T14:14:48.298571Z",
        "steps": [
          {
            "api_model_response": {
              "content": "I can help you with various aspects of OpenShift cluster management using the assisted installer. Here's a list of things I can do:\n\n*   **Cluster Information:**\n    *   Retrieve comprehensive information about a specific cluster.\n    *   List all clusters associated with your account.\n    *   Get events related to a specific cluster to track installation progress and diagnose issues.\n*   **Host Management:**\n    *   Get events specific to a particular host within a cluster.\n    *   Assign a role to a host (master, worker, or auto-assign).\n*   **Cluster Creation and Configuration:**\n    *   Create a new OpenShift cluster.\n    *   Configure virtual IP addresses (VIPs) for cluster API and ingress traffic.\n    *   Add operator bundles to be installed with the cluster.\n*   **Installation:**\n    *   Trigger the installation process for a prepared cluster.\n*   **Other:**\n    *   List available OpenShift versions for installation.\n    *   List available operator bundles for cluster installation.\n    *   Get the ISO download URL for a cluster.\n\nTo get started, you can ask me questions like:\n\n*   \"What is the status of cluster `<cluster_id>`?\"\n*   \"Create a cluster named `<cluster_name>` with version `<version>`.\"\n*   \"List all available OpenShift versions.\"\n\nHow can I assist you today?\n",
              "role": "assistant",
              "stop_reason": "end_of_turn",
              "tool_calls": []
            },
            "step_id": "50f92905-ed41-48b5-8b45-4b2e707a04f0",
            "step_type": "inference",
            "turn_id": "07b28273-dbf6-4eb0-a82f-eb80302eebc9",
            "completed_at": "2025-07-15T14:14:50.554458Z",
            "started_at": "2025-07-15T14:14:48.300589Z"
          }
        ],
        "turn_id": "07b28273-dbf6-4eb0-a82f-eb80302eebc9",
        "completed_at": "2025-07-15T14:14:50.554798Z",
        "output_attachments": []
      },
      {
        "input_messages": [
          {
            "content": "hello",
            "role": "user",
            "context": null
          }
        ],
        "output_message": {
          "content": "Hello! How can I help you with your OpenShift cluster today?\n",
          "role": "assistant",
          "stop_reason": "end_of_turn",
          "tool_calls": []
        },
        "session_id": "46f02ed5-29ee-41b6-a452-25938f926a52",
        "started_at": "2025-07-15T14:14:41.580122Z",
        "steps": [
          {
            "api_model_response": {
              "content": "Hello! How can I help you with your OpenShift cluster today?\n",
              "role": "assistant",
              "stop_reason": "end_of_turn",
              "tool_calls": []
            },
            "step_id": "4c696e66-087f-4ec8-8c54-47dda53446ea",
            "step_type": "inference",
            "turn_id": "ccf7540d-53f5-4451-a206-294a801300b8",
            "completed_at": "2025-07-15T14:14:44.119780Z",
            "started_at": "2025-07-15T14:14:41.581002Z"
          }
        ],
        "turn_id": "ccf7540d-53f5-4451-a206-294a801300b8",
        "completed_at": "2025-07-15T14:14:44.120312Z",
        "output_attachments": []
      },
      {
        "input_messages": [
          {
            "content": "list my clusters",
            "role": "user",
            "context": null
          }
        ],
        "output_message": {
          "content": "OK. I found these clusters:\n\n*   **name:** eran, **id:** 34f9e522-fe23-4a0f-9453-3d67296bf66f, **openshift\\_version:** 4.19.2, **status:** pending-for-input\n*   **name:** eran, **id:** 9a9d563a-7ba4-4687-872d-28a3bd73d61d, **openshift\\_version:** 4.19.2, **status:** pending-for-input\n*   **name:** demo, **id:** 14f96f92-9598-48de-8dae-67c2c678fe1b, **openshift\\_version:** 4.19.2, **status:** insufficient\n*   **name:** eran, **id:** bc665614-e503-4ab6-81f9-1aa5e584b0b1, **openshift\\_version:** 4.19.1, **status:** pending-for-input\n*   **name:** chatbot, **id:** cd61e6e6-b463-4d93-9983-44efafbe7da7, **openshift\\_version:** 4.16.18, **status:** pending-for-input\n*   **name:** bla, **id:** 28ccbe37-1c9a-4df3-a047-b0aeee2fb6ca, **openshift\\_version:** 4.18.16, **status:** pending-for-input\n\nWhat would you like to do next? For example, would you like to know more about a specific cluster, such as the events related to it?\n",
          "role": "assistant",
          "stop_reason": "end_of_turn",
          "tool_calls": []
        },
        "session_id": "46f02ed5-29ee-41b6-a452-25938f926a52",
        "started_at": "2025-07-15T14:14:55.334067Z",
        "steps": [
          {
            "api_model_response": {
              "content": "",
              "role": "assistant",
              "stop_reason": "end_of_turn",
              "tool_calls": [
                {
                  "arguments": {},
                  "call_id": "call_c03497fd-0320-46da-b22e-279d354e7dd6",
                  "tool_name": "list_clusters",
                  "arguments_json": "{}"
                }
              ]
            },
            "step_id": "0b92e4dc-a886-483a-b3f3-71ed6d55bf45",
            "step_type": "inference",
            "turn_id": "d0508742-5622-4846-9b4d-a2e54d9e449f",
            "completed_at": "2025-07-15T14:14:55.971837Z",
            "started_at": "2025-07-15T14:14:55.335767Z"
          },
          {
            "step_id": "c8886428-93b4-400c-892f-8fa0938e25c5",
            "step_type": "tool_execution",
            "tool_calls": [
              {
                "arguments": {},
                "call_id": "call_c03497fd-0320-46da-b22e-279d354e7dd6",
                "tool_name": "list_clusters",
                "arguments_json": "{}"
              }
            ],
            "tool_responses": [
              {
                "call_id": "call_c03497fd-0320-46da-b22e-279d354e7dd6",
                "content": [
                  {
                    "text": "[{\"name\": \"eran\", \"id\": \"34f9e522-fe23-4a0f-9453-3d67296bf66f\", \"openshift_version\": \"4.19.2\", \"status\": \"pending-for-input\"}, {\"name\": \"eran\", \"id\": \"9a9d563a-7ba4-4687-872d-28a3bd73d61d\", \"openshift_version\": \"4.19.2\", \"status\": \"pending-for-input\"}, {\"name\": \"demo\", \"id\": \"14f96f92-9598-48de-8dae-67c2c678fe1b\", \"openshift_version\": \"4.19.2\", \"status\": \"insufficient\"}, {\"name\": \"eran\", \"id\": \"bc665614-e503-4ab6-81f9-1aa5e584b0b1\", \"openshift_version\": \"4.19.1\", \"status\": \"pending-for-input\"}, {\"name\": \"chatbot\", \"id\": \"cd61e6e6-b463-4d93-9983-44efafbe7da7\", \"openshift_version\": \"4.16.18\", \"status\": \"pending-for-input\"}, {\"name\": \"bla\", \"id\": \"28ccbe37-1c9a-4df3-a047-b0aeee2fb6ca\", \"openshift_version\": \"4.18.16\", \"status\": \"pending-for-input\"}]",
                    "type": "text"
                  }
                ],
                "tool_name": "list_clusters",
                "metadata": null
              }
            ],
            "turn_id": "d0508742-5622-4846-9b4d-a2e54d9e449f",
            "completed_at": "2025-07-15T14:15:01.503481Z",
            "started_at": "2025-07-15T14:14:55.972386Z"
          },
          {
            "api_model_response": {
              "content": "OK. I found these clusters:\n\n*   **name:** eran, **id:** 34f9e522-fe23-4a0f-9453-3d67296bf66f, **openshift\\_version:** 4.19.2, **status:** pending-for-input\n*   **name:** eran, **id:** 9a9d563a-7ba4-4687-872d-28a3bd73d61d, **openshift\\_version:** 4.19.2, **status:** pending-for-input\n*   **name:** demo, **id:** 14f96f92-9598-48de-8dae-67c2c678fe1b, **openshift\\_version:** 4.19.2, **status:** insufficient\n*   **name:** eran, **id:** bc665614-e503-4ab6-81f9-1aa5e584b0b1, **openshift\\_version:** 4.19.1, **status:** pending-for-input\n*   **name:** chatbot, **id:** cd61e6e6-b463-4d93-9983-44efafbe7da7, **openshift\\_version:** 4.16.18, **status:** pending-for-input\n*   **name:** bla, **id:** 28ccbe37-1c9a-4df3-a047-b0aeee2fb6ca, **openshift\\_version:** 4.18.16, **status:** pending-for-input\n\nWhat would you like to do next? For example, would you like to know more about a specific cluster, such as the events related to it?\n",
              "role": "assistant",
              "stop_reason": "end_of_turn",
              "tool_calls": []
            },
            "step_id": "15eaca38-fed5-4f2b-b782-08d2b47f10c5",
            "step_type": "inference",
            "turn_id": "d0508742-5622-4846-9b4d-a2e54d9e449f",
            "completed_at": "2025-07-15T14:15:06.211257Z",
            "started_at": "2025-07-15T14:15:01.503940Z"
          }
        ],
        "turn_id": "d0508742-5622-4846-9b4d-a2e54d9e449f",
        "completed_at": "2025-07-15T14:15:06.211531Z",
        "output_attachments": []
      }
    ]
  }
}

@coderabbitai
Contributor

coderabbitai bot commented Jul 15, 2025

"""

Walkthrough

A new REST API module for managing conversation history was introduced, providing endpoints to retrieve and delete conversations by ID. Supporting response models were added, and the router was registered in the main application. Conversation-to-agent mapping is now maintained in query and streaming endpoints. Associated tests were updated for router inclusion.
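
For orientation, a minimal sketch of the router registration described above (the import path and `router` attribute are assumptions based on the file layout in the table below; the actual code may differ):

```python
# Sketch only: registering the new conversations router under the /v1 prefix.
from fastapi import FastAPI

from app.endpoints import conversations  # assumed import for src/app/endpoints/conversations.py

app = FastAPI()
app.include_router(conversations.router, prefix="/v1")
```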

Changes

File(s) Change Summary
src/app/endpoints/conversations.py New module: Implements GET and DELETE endpoints for conversation management with authentication, validation, logging, and error handling.
src/models/responses.py Added ConversationResponse and ConversationDeleteResponse Pydantic models for API responses.
src/app/routers.py Registered the new conversations router under the /v1 prefix.
src/app/endpoints/query.py,
src/app/endpoints/streaming_query.py
Updated to maintain mapping from conversation_id to agent_id after agent creation.
tests/unit/app/test_routers.py Updated tests to assert inclusion of the new conversations router and its prefix.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant FastAPI
    participant Auth
    participant ConversationRouter
    participant LlamaStackClient

    Client->>FastAPI: GET /v1/conversations/{conversation_id}
    FastAPI->>Auth: Validate authentication
    Auth-->>FastAPI: Auth result
    FastAPI->>ConversationRouter: Validate conversation_id, lookup agent_id
    ConversationRouter->>LlamaStackClient: Retrieve session data
    LlamaStackClient-->>ConversationRouter: Session data / Error
    ConversationRouter-->>FastAPI: Response or HTTPException
    FastAPI-->>Client: ConversationResponse or error

    Client->>FastAPI: DELETE /v1/conversations/{conversation_id}
    FastAPI->>Auth: Validate authentication
    Auth-->>FastAPI: Auth result
    FastAPI->>ConversationRouter: Validate conversation_id, lookup agent_id
    ConversationRouter->>LlamaStackClient: Delete session
    LlamaStackClient-->>ConversationRouter: Deletion result / Error
    ConversationRouter-->>FastAPI: DeleteResponse or HTTPException
    FastAPI-->>Client: ConversationDeleteResponse or error

Poem

In the warren of code where the queries hop through,
Now rabbits can fetch or delete chats anew!
With agent maps tidy and routers in tow,
Our conversations are easy to stow.
Test bunnies rejoice—API’s grown,
In every new endpoint, a carrot is sown! 🥕
"""



@eranco74 eranco74 marked this pull request as ready for review July 15, 2025 14:54
@eranco74
Contributor Author

@coderabbitai

@coderabbitai
Contributor

coderabbitai bot commented Jul 15, 2025

Tip

For best results, initiate chat on the files or code changes.

@eranco74, hello! I see you've mentioned me. How can I help you today?

I notice you shared what appears to be conversation logs or API responses earlier. Are you looking for assistance with analyzing that data, or is there something specific you'd like me to help you with regarding this pull request?

@eranco74
Contributor Author

@coderabbitai review

Contributor

@umago umago left a comment

Thanks for the patch! Comments inline

Contributor

In order to keep the API compatibility with road-core, we need to return the same structure from GET .../conversations/<id>.

This return model doesn't seem compatible with road-core [0][1]

[0] https://github.com/road-core/service/blob/9d65d15a4d1dec47e5aac15ee86fef39db975006/ols/app/models/models.py#L220-L247

[1] https://github.com/road-core/service/blob/9d65d15a4d1dec47e5aac15ee86fef39db975006/ols/app/endpoints/conversations.py#L62C6-L62C25

Contributor Author

Does it have to be compatible?
What llama-stack is returning seems a lot better than what I see in the links above.
input_messages and output_message seem better than content + type.
Not to mention the timestamps.
I agree that I should remove a few fields (steps, tool_execution, etc.), but is it a requirement to align with what we have in road-core?

Contributor Author

@eranco74 eranco74 Jul 15, 2025

What about something like this:

{
  "conversation_id": "46f02ed5-29ee-41b6-a452-25938f926a52",
  "chat_history": {
      {
        "messages": [
          {
            "content": "what can you do?",
            "type": "user"
          },
          {
            "content": "I can help you with various aspects of OpenShift cluster management using the assisted installer. Here's a list of things I can do:\n\n*   **Cluster Information:**\n    *   Retrieve comprehensive information about a specific cluster...",
            "type": "assistant"
          }
        ],
        "started_at": "2025-07-15T14:14:48.298571Z",
        "completed_at": "2025-07-15T14:14:50.554798Z"
      },
      {
        "messages": [
          {
            "content": "hello",
            "type": "user"
          },
          {
            "content": "Hello! How can I help you with your OpenShift cluster today?",
            "type": "assistant"
          }
        ],
        "started_at": "2025-07-15T14:14:41.580122Z",
        "completed_at": "2025-07-15T14:14:44.120312Z"
      }
    ]
  }
}
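
A hypothetical Pydantic sketch of this proposed shape, just to make the structure concrete (field names mirror the JSON above; this is not code from the PR):

```python
# Hypothetical models for the proposed response shape; illustration only.
from datetime import datetime
from typing import Literal

from pydantic import BaseModel


class ChatMessage(BaseModel):
    content: str
    type: Literal["user", "assistant"]


class ChatTurn(BaseModel):
    messages: list[ChatMessage]
    started_at: datetime
    completed_at: datetime


class ConversationHistoryResponse(BaseModel):
    conversation_id: str
    chat_history: list[ChatTurn]
```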

Contributor

Let's see what others say. From my understanding, teams like Ansible are trying to replace the road-core/service with this new llama-stack implementation without having to change their client. @manstis please correct me if I am wrong.

If that's the case I think we will have to make it backward compatible.

That said, I do agree on cleaning up a bit the llama-stack output to remove the noise and I do like the timestamps too, they are useful.

I don't think the problem is adding new fields, we can totally do that as long as we keep the structure and old fields as before (assuming we want to keep it backward compatible).

Contributor

@umago we don't use /conversations in road-core so are unaffected if there's backwards compatibility issues.

Contributor Author

Thanks @manstis.
In order to add the timestamps, I need to nest the content inside the messages list.
While we're breaking compatibility anyway 😄 can we change type back to role?
Let me know who else we should consult on this.

@tisnik?

Contributor Author

@eranco74 eranco74 Jul 16, 2025

Hmm, perhaps I can do this:

  "chat_history": {
          {
            "content": "what can you do?",
            "type": "user"
              "timestamp": "2025-07-15T14:14:48.298571Z",
          },
          {
            "content": "I can help you with various aspects of OpenShift cluster management using the assisted installer. Here's a list of things I can do:\n\n*   **Cluster Information:**\n    *   Retrieve comprehensive information about a specific cluster...",
            "type": "assistant"
            "timestamp": "2025-07-15T14:14:50.554798Z"
          },
      }
      ``` 

Contributor Author

Although I prefer the messages list: it provides better logical grouping of the messages, better represents the conversation flow, and makes it easier to manipulate the output for analysis.
The original API flattens this context.

Contributor

Thanks @manstis for clarifying. Our team also doesn't rely on backward compatibility for this endpoint. Perhaps one last thing we can do is ask in the internal Slack whether any of the teams rely on it. If no one cares, yeah, I'm all for changing it to something new and better.

@eranco74 eranco74 requested a review from umago July 16, 2025 10:59
@Akrog

Akrog commented Jul 16, 2025

Has this been properly tested?

I mean, either I am missing something about what we mean by a "conversation" or our whole conversation tracking is wrong.

Are we referring to a conversation as a single turn?
If we are not referring to that, and instead mean what I have in mind, which is the whole series of user inputs and service responses, in other words from the moment a user starts a chat until they clear it or start a new one, then it won't work with what we have.

Our current code creates a new Agent on each call [1][2].

This means that each turn will be completely separated and the internal conversation history in llama-stack will not provide previous context.

Last time I checked this, I did it with 2 simple queries:

me> My name is Gorka.
agent> Nice to meet you Gorka.

me> What is my name?
agent> I don't have that information

From what I saw there are a good number of limitations currently around this:

  • Llama-stack-client doesn't allow setting the agent ID when instantiating the request
  • Llama-stack-client doesn't allow retrieving an agent from the llama-stack server
  • Llama-stack-client doesn't allow setting the name of an agent (llama-stack server allows it)
  • There is no filtering capabilities in the llama-stack server to be able to search agents by name

Also, keeping the conversation_id_to_agent_id mapping in lightspeed-stack memory is insufficient, because a restart of the pod will lose the service's ability to get information.

We are also not persisting the sessions in the llama-stack server, because we are not passing enable_session_persistence=True to the Agent on instantiation.

I don't remember if session persistence on llama-stack server requires some specific configuration (I think it didn't).

@eranco74
Contributor Author

@Akrog I agree it's not GA, but it's a start.

This means that each turn will be completely separated and the internal conversation history in llama-stack will not provide previous context.

You have it wrong: we currently use agent_cache to reuse the same agent for the same conversation.
You can see the multiple turns in the output I pasted in the first comment (the output format has changed since, but the multiple turns are there).

Also, keeping the conversation_id_to_agent_id mapping in lightspeed-stack memory is insufficient, because a restart of the pod will lose the service's ability to get information.

That is correct; we will handle that in https://issues.redhat.com/browse/LCORE-373

We are also not persisting the sessions in the llama-stack server, because we are not passing enable_session_persistence=True to the Agent on instantiation.

We are passing enable_session_persistence=True
see https://github.com/search?q=repo%3Alightspeed-core%2Flightspeed-stack%20enable_session_persistence&type=code
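
For context, here is a minimal sketch of the agent-caching pattern being described (the helper names are assumptions, and the Agent constructor arguments may differ between llama-stack-client versions):

```python
# Sketch of the agent-cache pattern discussed above; assumed names, not the PR code.
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

_agent_cache: dict[str, Agent] = {}  # conversation_id -> Agent


def get_or_create_agent(
    client: LlamaStackClient, conversation_id: str, model: str
) -> Agent:
    """Reuse one agent (and its llama-stack session) for every turn of a conversation."""
    agent = _agent_cache.get(conversation_id)
    if agent is None:
        agent = Agent(
            client,
            model=model,
            instructions="You are a helpful assistant.",
            enable_session_persistence=True,  # persist sessions on the llama-stack server
        )
        _agent_cache[conversation_id] = agent
    return agent
```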

Contributor

@umago umago left a comment

Thanks LGTM now

@tisnik tisnik merged commit eb13e53 into lightspeed-core:main Jul 16, 2025
17 checks passed
@Akrog

Akrog commented Jul 16, 2025

@eranco74

Sorry, apparently I was looking at an old code base (as shown in the links I pasted), and there (and when I tried this) there was no caching of the agents and no passing of the persistence parameter.
