OpenAI speculative execution #44

@Ying1123

With the OpenAI backend, the current frontend issues a separate API call for each gen in the example below:

@sgl.function
def example(s):
  s += "Construct a character."
  s += "Name: " + gen("name") + " Birthday: " + gen("birthday") + " Job: " + gen("job")

We can optimize this to send fewer calls and save money:

  1. Generate a longer completion in the first gen call, and skip the later calls if that completion already contains the remaining fields in the expected form.
  2. Allow using OpenAI's n=10 keyword argument to sample multiple completions when forked. We can also provide the interface example.run(n=10).
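To make the first idea concrete, here is a rough sketch of the matching step only, not a proposed implementation. It assumes the first gen call was allowed to run long, and that the literal text between the gen calls (" Birthday: " and " Job: " in the example) can be used as separators. The helper name `split_speculative` and its signature are hypothetical and not part of the sglang API. If the completion does not follow the template, it returns None and the frontend would fall back to one call per gen.

```python
from typing import Dict, List, Optional


def split_speculative(
    completion: str, names: List[str], separators: List[str]
) -> Optional[Dict[str, str]]:
    """Try to fill several gen variables from one long completion.

    names:      variable names in template order, e.g. ["name", "birthday", "job"]
    separators: literal text between consecutive gens, e.g. [" Birthday: ", " Job: "]
    Returns a dict of values, or None if the completion does not match the template.
    """
    if len(names) != len(separators) + 1:
        raise ValueError("expected exactly one separator between consecutive names")

    values: Dict[str, str] = {}
    rest = completion
    for name, sep in zip(names, separators):
        head, found, rest = rest.partition(sep)
        if not found:
            # The model did not reproduce the template text: speculation failed.
            return None
        values[name] = head.strip()

    # The last variable ends at the first newline (an assumed stop condition).
    last = rest.split("\n", 1)[0].strip()
    if not last:
        return None
    values[names[-1]] = last
    return values


names = ["name", "birthday", "job"]
separators = [" Birthday: ", " Job: "]

hit = split_speculative(
    "Alice Smith Birthday: 1990-04-12 Job: Engineer\nHobby: chess",
    names,
    separators,
)
miss = split_speculative("Alice Smith\nShe was born in 1990.", names, separators)
print(hit)
print(miss)
```

On a hit, all three variables are filled from a single API call. On a miss, only the first value can be trusted, and the remaining gens are issued as usual.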
