SmolVLM Support #5368

@cfelicio

Is your feature request related to a problem? Please describe.

Not a problem so much as a limitation. The vision models currently available in the gallery are great, but they require capable hardware.

Describe the solution you'd like

llama.cpp recently added support for vision (https://news.ycombinator.com/item?id=43943047), so it is now possible to run much smaller models, such as SmolVLM, that are good enough for tasks like real-time surveillance video analysis.

Describe alternatives you've considered

I'm currently using Gemma3 as an alternative, but it's very hardware-intensive.

Additional context

N/A

Labels: enhancement
