databendlabs/databend-aiserver

databend-aiserver

Databend AI Server extends any data warehouse with AI-ready UDFs, seamlessly fusing object storage, embeddings, and SQL pipelines.

UDFs (called with the ai_ prefix, e.g. ai_list_files)

  • list_files(stage, limit) – UDTF that emits one row per object in an external stage; limit is optional.
  • read_pdf(stage, path) – extract PDF text.
  • read_docx(stage, path) – extract DOCX text.
  • embed_1024(text) – 1024-dim embeddings (batch-friendly, default model qwen).

Quickstart

uv sync
uv run databend-aiserver --port 8815
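
Before registering the UDFs, it can help to confirm that something is actually listening on the port passed to --port. The snippet below is a minimal sketch using only the Python standard library. It assumes the server is reachable on 127.0.0.1 (the README does not state the bind address), and is_port_open is a hypothetical helper name, not part of databend-aiserver. It only checks that a TCP connection succeeds and says nothing about whether the UDFs themselves are healthy.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 8815, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Assumption: 127.0.0.1 is a guess at where the server listens; 8815 is the
    port used in the Quickstart command.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_port_open() after starting the server should return True once the process has bound its port; a False result usually means the server is still starting, is bound to a different address, or exited.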

Sample SQL

CREATE CONNECTION my_s3_connection
  STORAGE_TYPE = 's3'
  ACCESS_KEY_ID = '<your-access-key-id>'
  SECRET_ACCESS_KEY = '<your-secret-access-key>';

CREATE STAGE docs_stage
  URL='s3://load/files/'
  CONNECTION = (CONNECTION_NAME = 'my_s3_connection');

SELECT * FROM ai_list_files(@docs_stage, 50);
SELECT ai_read_pdf(@docs_stage, 'reports/q1.pdf');
SELECT ai_read_docx(@docs_stage, 'reports/q1.docx');
SELECT ai_embed_1024(doc_body) FROM docs_tbl;

ai_list_files returns columns: stage, relative_path, path, is_dir, size, mode, content_type, etag, and truncated (true when the optional limit is hit).
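
To make the column list concrete, here is a small client-side sketch that summarizes rows already fetched from ai_list_files. It assumes each row has been converted to a dict keyed by the column names above; how rows are fetched, the Python types of the values, and the helper name summarize_listing are all assumptions for illustration, and the sample rows are made up.

```python
from typing import Any

# Column names in the order the README lists them.
LIST_FILES_COLUMNS = (
    "stage", "relative_path", "path", "is_dir", "size",
    "mode", "content_type", "etag", "truncated",
)


def summarize_listing(rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: count files, sum their sizes, and report truncation."""
    files = [r for r in rows if not r["is_dir"]]
    return {
        "file_count": len(files),
        # Treat a missing size as 0 (assumption: size may be NULL for some objects).
        "total_bytes": sum(r["size"] or 0 for r in files),
        # If any row carries truncated = true, the listing stopped at the limit.
        "truncated": any(r["truncated"] for r in rows),
    }


# Made-up sample rows; only the keys mirror the documented columns.
sample_rows = [
    dict(zip(LIST_FILES_COLUMNS,
             ("docs_stage", "reports/", "files/reports/", True, None,
              None, None, None, False))),
    dict(zip(LIST_FILES_COLUMNS,
             ("docs_stage", "reports/q1.pdf", "files/reports/q1.pdf", False, 2048,
              None, "application/pdf", None, False))),
]
summary = summarize_listing(sample_rows)
```

If truncated comes back true, the listing is incomplete, so rerunning ai_list_files with a larger limit is the safer choice before relying on the totals.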

Tests

# Full suite (CI should run this)
uv run pytest

# Quicker local loop (skips tests marked slow)
uv run pytest -m "not slow"

Built by the Databend team — engineers who redefine what's possible with data.
