[Bug]: When using AzureOpenAI for embedding azure_deployment needs to be provided as well  #1770

@pavlovmilen

What happened?

As a chromadb user working with Azure OpenAI, I want to use my embedding function when creating a collection. Currently OpenAIEmbeddingFunction does not let me specify azure_deployment, so it throws a DeploymentNotFound exception (see logs):

from chromadb.utils import embedding_functions

my_azure_openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPEN_AI_API_KEY,
    model_name="text-embedding-ada-002",
    api_base="https://my_azure_open_ai_endpoint.openai.azure.com",
    api_type="azure",
    api_version="2023-05-15",
)
# client is an already-created chromadb client
client.get_or_create_collection("my_collection", embedding_function=my_azure_openai_ef)

In embedding_functions.py, this code path runs when api_type="azure":

        self._client = openai.AzureOpenAI(
            api_key=api_key,
            api_version=api_version,
            azure_endpoint=api_base,
            default_headers=default_headers,
        ).embeddings

The issue is that this call is missing azure_deployment="your_deployment_name_here".
Without it, my Azure deployment cannot be reached.
See the attached log for more details.
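One possible shape for the fix is sketched below. It is not the library's actual code: `deployment_id`, `build_azure_client_kwargs`, and `make_azure_embeddings_client` are hypothetical names used only for illustration. The only assumption about the openai package (1.0 or later) is that `openai.AzureOpenAI` accepts an `azure_deployment` keyword, as in the test setup shown in this issue.

```python
from typing import Any, Dict, Optional


def build_azure_client_kwargs(
    api_key: str,
    api_base: str,
    api_version: str,
    deployment_id: Optional[str] = None,  # hypothetical new parameter
    default_headers: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Collect the keyword arguments that would be passed to openai.AzureOpenAI."""
    kwargs: Dict[str, Any] = {
        "api_key": api_key,
        "api_version": api_version,
        "azure_endpoint": api_base,
        "default_headers": default_headers,
    }
    # Pass azure_deployment only when the caller supplied one, so existing
    # callers that never set it see no change in behavior.
    if deployment_id is not None:
        kwargs["azure_deployment"] = deployment_id
    return kwargs


def make_azure_embeddings_client(**settings: Any) -> Any:
    """Build the embeddings resource; requires the openai package (>= 1.0)."""
    import openai  # imported lazily so the helper above works without it

    return openai.AzureOpenAI(**build_azure_client_kwargs(**settings)).embeddings
```

Keeping the new parameter optional means the change would be backward compatible: the existing call in embedding_functions.py is reproduced exactly when no deployment is given.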

To test it I have this setup:

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=OPEN_AI_API_KEY,
    azure_endpoint="https://my_azure_open_ai_endpoint.openai.azure.com",
    azure_deployment="my_azure_open_ai_deployment",
    api_version="2023-05-15",
)

def text_embedding(text):
    # Request a single embedding and return its vector
    response = client.embeddings.create(model="text-embedding-ada-002", input=[text])
    return response.data[0].embedding

When I omit azure_deployment="my_azure_open_ai_deployment", my text embedding function throws a not-found exception. Azure OpenAI endpoints need the deployment name in order to work.
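To illustrate why the deployment name matters: the Azure OpenAI REST API addresses embeddings per deployment, with the deployment name as part of the request path. The stdlib-only sketch below builds that URL. The helper name `azure_embeddings_url` is made up for this example, and the endpoint and deployment values are the placeholders used in this issue.

```python
from urllib.parse import quote, urlencode


def azure_embeddings_url(endpoint: str, deployment: str, api_version: str) -> str:
    """Return the Azure OpenAI REST URL for an embeddings request."""
    base = endpoint.rstrip("/")
    # The deployment name is a path segment, so a wrong or missing name
    # points the request at a resource that does not exist (HTTP 404).
    path = f"/openai/deployments/{quote(deployment, safe='')}/embeddings"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


print(
    azure_embeddings_url(
        "https://my_azure_open_ai_endpoint.openai.azure.com",
        "my_azure_open_ai_deployment",
        "2023-05-15",
    )
)
```

If I read the openai client correctly, when no azure_deployment is set it appears to fall back to the `model` value for this path segment. That would explain the DeploymentNotFound error whenever the deployment is not itself named "text-embedding-ada-002", but treat this as an assumption rather than documented behavior.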

See https://github.com/openai/openai-python?tab=readme-ov-file#microsoft-azure-openai for more details
Also, could you update the docs to include a link to API versioning in Azure:
https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#rest-api-versioning

Versions

ChromaDB 0.4.21, Python 3.11.5

Relevant log output

Cell In[2], line 14
     13 def text_embedding(text):
---> 14     response = client.embeddings.create(model="text-embedding-ada-002", input=[text])
     15     return response.data[0].embedding

File c:\Python\Lib\site-packages\openai\resources\embeddings.py:113, in Embeddings.create(self, input, model, dimensions, encoding_format, user, extra_headers, extra_query, extra_body, timeout)
    107         embedding.embedding = np.frombuffer(  # type: ignore[no-untyped-call]
    108             base64.b64decode(data), dtype="float32"
    109         ).tolist()
    111     return obj
--> 113 return self._post(
    114     "/embeddings",
    115     body=maybe_transform(params, embedding_create_params.EmbeddingCreateParams),
    116     options=make_request_options(
...
   (...)
    987     stream_cls=stream_cls,
    988 )


Error code: 404 - {'error': {'code': 'DeploymentNotFound', 'message': 'The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.'}}

Labels: bug, good first issue
