
Conversation

@LiuLi1998 (Contributor) commented Aug 11, 2025

Related issue: #21313
This PR adds support for the Anthropic /v1/messages REST API endpoint to the vLLM FastAPI server.

  • Support /v1/messages API
  • Compatible with all existing tool call parsers in the OpenAI API
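
For context, a minimal client-side sketch of what this enables (the server address, API key, and model name below are placeholder assumptions, not values from this PR):

```python
# Minimal sketch: exercising the new /v1/messages endpoint with the official
# anthropic Python SDK pointed at a local vLLM server.
from anthropic import Anthropic

client = Anthropic(base_url="http://localhost:8000", api_key="dummy")

response = client.messages.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # whichever model the server is serving
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello over the Anthropic protocol!"}],
)
print(response.content[0].text)
```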

@LiuLi1998 requested a review from aarnphm as a code owner August 11, 2025 08:20
@mergify bot added the frontend label Aug 11, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces support for the Anthropic Messages API by adding a new API server, protocol definitions, and a serving layer for format conversion. The implementation is based on the existing OpenAI-compatible server. My review has identified several critical and high-severity issues, including a potential NoneType access error, incorrect Pydantic model usage that could lead to validation errors, a risk of generating duplicate tool call IDs, and another case of incorrect attribute access on a Pydantic model that would cause a runtime error. I have provided specific code suggestions to address these issues and ensure the stability and correctness of the new endpoint.


critical

When handler is None, messages(raw_request) is also None. Calling create_error_response on a None object will raise an AttributeError, causing an unhandled exception and a 500 server error. You should construct an ErrorResponse directly to ensure a proper error is returned. You will need to import ErrorResponse from vllm.entrypoints.openai.protocol and HTTPStatus from http.

Suggested change:

- return messages(raw_request).create_error_response(
-     message="The model does not support Chat Completions API")
+ return ErrorResponse(message="The model does not support Chat Completions API",
+                      type="model_not_found",
+                      code=HTTPStatus.NOT_FOUND.value)
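
Put together, the guarded path might look like this (a sketch only; the messages() lookup and the handler method are stand-ins for this PR's surrounding code, and the ErrorResponse fields follow the suggestion above):

```python
# Sketch of the guarded error path. `messages()` and `create_messages()` are
# stand-ins for this PR's surrounding code, not confirmed vLLM APIs.
from http import HTTPStatus

from vllm.entrypoints.openai.protocol import ErrorResponse


async def create_messages(request, raw_request):
    handler = messages(raw_request)  # may be None for unsupported models
    if handler is None:
        # Never call a method on a possibly-None handler; build the error directly.
        return ErrorResponse(
            message="The model does not support Chat Completions API",
            type="model_not_found",
            code=HTTPStatus.NOT_FOUND.value,
        )
    return await handler.create_messages(request, raw_request)
```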


critical

anthropic_request.tool_choice is a Pydantic model instance, not a dictionary. Accessing its attributes should be done with dot notation (e.g., .name). Using .get("name") will result in an AttributeError at runtime.

Suggested change:

- "name": anthropic_request.tool_choice.get("name")
+ "name": anthropic_request.tool_choice.name


high

The id field is defined as a required field. The model_post_init method, which attempts to set a default value, is called after Pydantic's validation. If id is not provided during initialization, a ValidationError will be raised before model_post_init can execute. To correctly provide a default value for an optional field, you should use default_factory in the field definition and remove the model_post_init method.

Suggested change:

- id: str
+ id: str = Field(default_factory=lambda: f"msg_{int(time.time() * 1000)}")
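
To illustrate the difference (a hypothetical model, not the PR's actual class):

```python
# Sketch: a plain required field fails validation before model_post_init ever
# runs, while default_factory fills the value at validation time.
import time

from pydantic import BaseModel, Field, ValidationError


class MessageWithFactory(BaseModel):
    id: str = Field(default_factory=lambda: f"msg_{int(time.time() * 1000)}")


print(MessageWithFactory().id)  # e.g. "msg_1755859200123"


class MessageRequired(BaseModel):
    id: str


try:
    MessageRequired()
except ValidationError as exc:
    print(exc)  # "id: Field required" -- raised before any post-init hook
```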


high

Using int(time.time()) to generate tool call IDs is not safe as it can produce duplicate IDs for tool calls created in the same second. This can lead to incorrect behavior when matching tool calls to their results. It's better to use a UUID-based approach for uniqueness. You can use random_tool_call_id from vllm.entrypoints.chat_utils for this, which needs to be imported.

Suggested change:

- "id": block.id or f"call_{int(time.time())}",
+ "id": block.id or random_tool_call_id(),


high

A text AnthropicContentBlock is created even if generator.choices[0].message.content is None. The Anthropic API requires the text field on text blocks, and serializing with exclude_none=True would drop a None text field entirely, so this produces an invalid content block. You should only create the text content block when content is actually available.

Suggested change:

- content: List[AnthropicContentBlock] = [
-     AnthropicContentBlock(
-         type="text",
-         text=generator.choices[0].message.content
-     )
- ]
+ content: List[AnthropicContentBlock] = []
+ if generator.choices[0].message.content:
+     content.append(
+         AnthropicContentBlock(
+             type="text",
+             text=generator.choices[0].message.content))
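
A small illustration of the failure mode (using a simplified stand-in model, not the PR's actual AnthropicContentBlock):

```python
# Sketch: with text=None, exclude_none=True drops the field entirely,
# yielding {"type": "text"} with no "text" key -- not a valid Anthropic
# text content block. ContentBlock here is a simplified stand-in.
from typing import Optional

from pydantic import BaseModel


class ContentBlock(BaseModel):
    type: str
    text: Optional[str] = None


print(ContentBlock(type="text").model_dump(exclude_none=True))
# -> {'type': 'text'}  (a text block must carry a "text" field)
```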

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mgoin (Member) commented Aug 11, 2025

Exciting! Make sure to add some unit tests before this is ready, and I wonder if we can also include a smoke test that Claude Code or some other application can communicate with the API correctly.

@mgoin self-requested a review August 11, 2025 19:09
@njhill (Member) commented Aug 11, 2025

Thanks @LiuLi1998! Would you also be willing to help with ongoing support/maintenance of the API?

@LiuLi1998 (Contributor, Author)

> Exciting! Make sure to add some unit tests before this is ready, and I wonder if we can also include a smoke test that Claude Code or some other application can communicate with the API correctly.

Thanks for the input! I agree; I'll add some tests soon to make sure everything works as expected.

@LiuLi1998 (Contributor, Author)

> Thanks @LiuLi1998! Would you also be willing to help with ongoing support/maintenance of the API?

Definitely! I'm glad to take part in the support and maintenance of the API.

@LiuLi1998 (Contributor, Author)

> Exciting! Make sure to add some unit tests before this is ready, and I wonder if we can also include a smoke test that Claude Code or some other application can communicate with the API correctly.

I've added initial tests. I'm not entirely sure whether the current approach follows best practices or covers everything needed; I'd really appreciate your feedback on improvements or any other cases to cover.

@mgoin changed the title from "Support Anthropic API Endponit" to "Support Anthropic API /v1/messages Endpoint" Aug 13, 2025
@mergify bot commented Aug 18, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @LiuLi1998.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify bot added the needs-rebase label Aug 18, 2025
@mergify bot removed the needs-rebase label Aug 20, 2025
@LiuLi1998 (Contributor, Author)

@mgoin While adding tests, I triggered the CI and encountered the following error:
ModuleNotFoundError: No module named 'anthropic'.
Could someone advise how to add the required dependency to the project's requirements? Should I add anthropic to requirements.txt or requirements-test.txt (or another file)? Any guidance on the correct procedure would be appreciated!

@mergify bot added the ci/build label Aug 20, 2025
@mergify bot commented Aug 23, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @LiuLi1998.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify bot added the needs-rebase label Aug 23, 2025
auto-merge was automatically disabled October 22, 2025 07:11

Head branch was pushed to by a user without write access

@LiuLi1998 (Contributor, Author)

The failing test looks related. Might the CI env need python3?

[2025-10-21T16:33:42Z] ERROR entrypoints/anthropic/test_messages.py::test_simple_messages - FileNotFoundError: [Errno 2] No such file or directory: 'python -m'

Already fixed, thanks for the help.

@simon-mo merged commit c9461e0 into vllm-project:main Oct 22, 2025
87 checks passed
JorgenTrondsen pushed a commit to JorgenTrondsen/vllm that referenced this pull request Oct 22, 2025
usberkeley pushed a commit to usberkeley/vllm that referenced this pull request Oct 23, 2025
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 23, 2025
@shoted commented Oct 24, 2025

How to support both the OpenAI and Anthropic APIs?

@tlipoca9 (Contributor)

> How to support both the OpenAI and Anthropic APIs?

@LiuLi1998 +1, is it possible? I also want it.

kingsmad pushed a commit to kingsmad/vllm that referenced this pull request Oct 25, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
@LiuLi1998 (Contributor, Author)

> How to support both the OpenAI and Anthropic APIs?

> @LiuLi1998 +1, is it possible? I also want it.

Currently, the OpenAI and Anthropic APIs are served by separate API servers and cannot be used at the same time.

@shoted commented Oct 28, 2025

> How to support both the OpenAI and Anthropic APIs?

> @LiuLi1998 +1, is it possible? I also want it.

> Currently, the OpenAI and Anthropic APIs are served by separate API servers and cannot be used at the same time.

Is there a plan to support them simultaneously?

See https://docs.anthropic.com/en/api/messages
for the API specification. This API mimics the Anthropic messages API.
"""
logger.debug("Received messages request %s", request.model_dump_json())
Member

Would this be super slow? It calls request.model_dump_json() unconditionally.

Contributor Author

This will only log when VLLM_LOGGING_LEVEL=DEBUG.
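
The %s formatting itself is deferred, but the request.model_dump_json() argument is still evaluated eagerly at every log level; a minimal sketch of the standard guard (stdlib logging semantics, not necessarily how this PR resolves it):

```python
# Sketch: arguments to logger.debug() are evaluated before the call even when
# DEBUG is disabled, so the serialization cost is paid on every request.
# Checking isEnabledFor() first skips model_dump_json() entirely.
# `request` stands for the incoming messages request in the handler.
import logging

logger = logging.getLogger(__name__)

if logger.isEnabledFor(logging.DEBUG):
    logger.debug("Received messages request %s", request.model_dump_json())
```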

return JSONResponse(content=generator.model_dump())

elif isinstance(generator, AnthropicMessagesResponse):
logger.debug(
Member

Similar problem: an unconditional call of generator.model_dump(exclude_none=True).



@router.post(
"/v1/messages",
Member

It seems the Anthropic API only adds a new /v1/messages endpoint, so why not merge it with the OpenAI server, serving both /v1/chat/completions and /v1/messages together?

Contributor Author

I think they are two different protocols. It's possible to merge them for functional compatibility, but I think it could lead to semantic confusion.

Member

I don't think Kaichao is saying to merge them into one endpoint, just to host them side by side, so when you run vllm serve you get /v1/completions, /v1/chat/completions, /v1/messages, etc. I agree this would be optimal for user ease.

Contributor Author

> I don't think Kaichao is saying to merge them into one endpoint, just to host them side by side, so when you run vllm serve you get /v1/completions, /v1/chat/completions, /v1/messages, etc. I agree this would be optimal for user ease.

I agree it's the most user-friendly solution.

@bbartels (Contributor)

@youkaichao @mgoin @shoted Raised #27882 to add /v1/messages to the OpenAI api_server.

@shoted commented Nov 4, 2025

> @youkaichao @mgoin @shoted Raised #27882 to add /v1/messages to the OpenAI api_server.

nice, bro

ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025