[Frontend][3/N] Improve all pooling task | Support binary embedding response #27066
Merged
Commits
35 commits
92a6d2e init (noooop)
e54fe8b init (noooop)
83803c4 + response_compression_pooling_output (noooop)
fc826d1 fix (noooop)
3b10e99 support endianness (noooop)
fa03e6c + tensor_serial (noooop)
b033bda + embedding_requests_base64_client (noooop)
24bb3cf Merge branch 'main' into binary_response (noooop)
63502bb fix (noooop)
c8a6d0f typo (noooop)
f4882a4 Support binary embedding response (noooop)
836fc05 + embedding_requests_bytes_client.py (noooop)
687a889 fix (noooop)
1b7790b fix (noooop)
f8e37b6 Update tests/utils_/test_tensor_serial.py (noooop)
9bc193f Update vllm/entrypoints/openai/serving_embedding.py (noooop)
5b85cf5 fix tests (noooop)
6d81371 fix (noooop)
9431501 fix (noooop)
861738e clean up (noooop)
2aabfdf fix (noooop)
8fe01ff Merge branch 'main' into binary_response (noooop)
b825588 Update vllm/utils/tensor_serial.py (noooop)
ba217ea fix (noooop)
dacda88 Merge branch 'main' into binary_response (noooop)
0f6382e Update vllm/utils/serial_utils.py (noooop)
4590a98 Update vllm/utils/serial_utils.py (noooop)
60c8b52 Update vllm/utils/serial_utils.py (noooop)
3cd65fc fix (noooop)
acc4b50 fix (noooop)
3a1302b rename (noooop)
07b2508 StreamingResponse (noooop)
68276e6 Merge branch 'main' into binary_response (noooop)
405f11d rename (noooop)
95986f1 fix (noooop)
37 changes: 37 additions & 0 deletions
tests/entrypoints/pooling/openai/test_binary_encode_and_decode.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # SPDX-FileCopyrightText: Copyright contributors to the vLLM project | ||
| import pytest | ||
| import torch | ||
| | ||
| from tests.models.utils import check_embeddings_close | ||
| from vllm.entrypoints.openai.utils import ( | ||
|     EMBED_DTYPE_TO_TORCH_DTYPE, | ||
|     ENDIANNESS, | ||
|     binary2tenser, | ||
|     tenser2binary, | ||
| ) | ||
| | ||
| | ||
| @pytest.mark.parametrize("endianness", ENDIANNESS) | ||
| @pytest.mark.parametrize("embed_dtype", EMBED_DTYPE_TO_TORCH_DTYPE.keys()) | ||
| @torch.inference_mode | ||
| def test_encode_and_decode(embed_dtype: str, endianness: str): | ||
|     for i in range(10): | ||
|         tenser = torch.rand(2, 3, 5, 7, 11, 13, device="cpu", dtype=torch.float32) | ||
|         shape = tenser.shape | ||
|         binary = tenser2binary(tenser, embed_dtype, endianness) | ||
|         new_tenser = binary2tenser(binary, shape, embed_dtype, endianness).to( | ||
|             torch.float32 | ||
|         ) | ||
| | ||
|         # Compare the parameter itself; the quoted literal "embed_dtype" never matched. | ||
|         if embed_dtype == "float32": | ||
|             torch.testing.assert_close(tenser, new_tenser, atol=1e-7, rtol=1e-7) | ||
|         else:  # float16, bfloat16, and fp8 lose precision in the round trip | ||
|             torch.testing.assert_close(tenser, new_tenser, atol=0.1, rtol=0.1) | ||
|         check_embeddings_close( | ||
|             embeddings_0_lst=tenser.view(1, -1), | ||
|             embeddings_1_lst=new_tenser.view(1, -1), | ||
|             name_0="gt", | ||
|             name_1="new", | ||
|             tol=1e-2, | ||
|         ) | ||
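
The round trip this test exercises (values to raw bytes with a chosen dtype and byte order, then back) can be sketched with the standard library alone. This is only an illustration under stated assumptions, not vLLM's implementation: the helper names `floats_to_bytes` and `bytes_to_floats` and the two lookup tables are hypothetical, and only `float32` and `float16` are covered because those are the floating-point widths Python's `struct` module supports among the dtypes named in the test.

```python
import struct

# Hypothetical lookup tables for this sketch (not vLLM's EMBED_DTYPE_TO_TORCH_DTYPE).
STRUCT_CODES = {"float32": "f", "float16": "e"}
BYTE_ORDER = {"little": "<", "big": ">"}


def floats_to_bytes(values, dtype="float32", endianness="little"):
    """Pack a flat sequence of floats into raw bytes with an explicit byte order."""
    fmt = f"{BYTE_ORDER[endianness]}{len(values)}{STRUCT_CODES[dtype]}"
    return struct.pack(fmt, *values)


def bytes_to_floats(data, dtype="float32", endianness="little"):
    """Unpack raw bytes produced by floats_to_bytes back into a list of floats."""
    order = BYTE_ORDER[endianness]
    code = STRUCT_CODES[dtype]
    # The element count is derived from the payload size and the item width.
    count = len(data) // struct.calcsize(order + code)
    return list(struct.unpack(f"{order}{count}{code}", data))


values = [0.5, 0.25, 0.1]
for dtype in STRUCT_CODES:
    for endianness in BYTE_ORDER:
        payload = floats_to_bytes(values, dtype, endianness)
        decoded = bytes_to_floats(payload, dtype, endianness)
        # float16 is lossy, so compare with a tolerance rather than exactly.
        assert all(abs(a - b) < 1e-3 for a, b in zip(values, decoded))
```

The decoder has to be told the same dtype and byte order that the encoder used, since the raw bytes carry neither. That is why the test above parametrizes over both and passes the original shape back in when decoding.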