
[Feat]: Batch Prediction and Cached Prompt should work together #2122

@LinusOstlund

Description


Is your feature request related to a problem? Please describe.

Batch predictions should be able to use a cached context.

Describe the solution you'd like

We are currently working on a classification prompt that requires an extensive system prompt. We have been experimenting with batch processing together with context caching, to combine the 50% discount from batch processing with the 75% discount from context caching. If the two discounts stack, cached input tokens in a batch job would cost roughly 0.5 × 0.25 = 12.5% of the on-demand price.

We have tried several approaches now, but the batch job fails every time. Here's an entry from the resulting predictions.jsonl:

{"status":"Internal error occurred. Failed to get generateContentResponse: {\"error\": {\"code\": 404, \"message\": \"Not found: cached content metadata for 3814716010150232064.\", \"status\": \"NOT_FOUND\"}}"
  • There is no official Google material (tutorials, documentation, model cards, etc.) stating whether batch prediction and context caching are supported when used together.
  • If it is supported, it would be lovely to see a tutorial.
  • If it is NOT supported, please add a disclaimer to the batch / context cache docs.

Describe alternatives you've considered

It is supported on OpenAI.

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
