Skip to content

Commit 23fba67

Browse files
authored
Fix index in ChatCompletionChunk (#1648)
Fix a small inconsistency compared the OpenAI's chat-completion behavior (introduced in #1427 cc @drbh). When using `stream=True`, each chunk has an `index` value in `ChatCompletionChoice`. This index is not meant to be the index of the generated token but the index of the choice, which is always 0 (since TGI always return a single choice). See https://platform.openai.com/docs/api-reference/chat/object: > index _integer_ > The index of the choice in the list of choices. --- So instead of ```js data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":1,"delta":{"role":"assistant","content":"I"},"logprobs":null,"finish_reason":null}]} data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":2,"delta":{"role":"assistant","content":"'"},"logprobs":null,"finish_reason":null}]} data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":3,"delta":{"role":"assistant","content":"m"},"logprobs":null,"finish_reason":"length"}]} ``` if should return ```js data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":0,"delta":{"role":"assistant","content":"I"},"logprobs":null,"finish_reason":null}]} data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":0,"delta":{"role":"assistant","content":"'"},"logprobs":null,"finish_reason":null}]} data:{"id":"","object":"text_completion","created":1710508199,"model":"HuggingFaceH4/zephyr-7b-beta","system_fingerprint":"1.4.3-sha-e6bb3ff","choices":[{"index":0,"delta":{"role":"assistant","content":"m"},"logprobs":null,"finish_reason":"length"}]} ``` **EDIT:** I also edited ToolCall.index to be always `0` (instead of the generated token index) but for this one I'm actually unsure. It might be the index of the tool in the array of tools? OpenAI's documentation doesn't provide any information about it: > index _integer_ --- I also noticed that in OpenAI's example, the last chunk doesn't have a delta and is the only one that has a `finish_reason` returning. TGI is slightly different since the last chunk has both the last delta (i.e. the last generated token) + the finish reason. I don't think this is worth fixing since it is not a requirement according to the docs/specs (at least not that I know of).
1 parent 0d9917f commit 23fba67

File tree

2 files changed

+2
-4
lines changed

2 files changed

+2
-4
lines changed

router/src/lib.rs

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -524,7 +524,6 @@ impl ChatCompletionChunk {
524524
delta: Option<String>,
525525
tool_calls: Option<Vec<String>>,
526526
created: u64,
527-
index: u32,
528527
logprobs: Option<ChatCompletionLogprobs>,
529528
finish_reason: Option<String>,
530529
) -> Self {
@@ -535,12 +534,12 @@ impl ChatCompletionChunk {
535534
model,
536535
system_fingerprint,
537536
choices: vec![ChatCompletionChoice {
538-
index,
537+
index: 0,
539538
delta: ChatCompletionDelta {
540539
role: "assistant".to_string(),
541540
content: delta,
542541
tool_calls: tool_calls.map(|tc| DeltaToolCall {
543-
index,
542+
index: 0,
544543
id: String::new(),
545544
r#type: "function".to_string(),
546545
function: Function {

router/src/server.rs

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -895,7 +895,6 @@ async fn chat_completions(
895895
content,
896896
tool_calls,
897897
current_time,
898-
stream_token.index,
899898
logprobs,
900899
stream_token.details.map(|d| d.finish_reason.to_string()),
901900
))

0 commit comments

Comments
 (0)