VecturaKit is a Swift-based vector database designed for on-device apps through local vector storage and retrieval.
Inspired by Dripfarm's SVDB, VecturaKit uses MLTensor and swift-embeddings for generating and managing embeddings. It features Model2Vec support with the 32M parameter model as default for fast static embeddings.
The framework offers VecturaKit as the core vector database with pluggable embedding providers. Use SwiftEmbedder for swift-embeddings integration or MLXEmbedder for Apple's MLX framework acceleration.
It also includes CLI tools (vectura-cli and vectura-mlx-cli) for easily trying out the package.
Explore the following books to understand more about AI and iOS development:
- Exploring On-Device AI for Apple Platforms Development
- Exploring AI-Assisted Coding for iOS Development
- Features
- Supported Platforms
- Installation
- Usage
- MLX Integration
- Command Line Interface
- License
- Contributing
- Support
- Model2Vec Support: Uses the retrieval 32M parameter Model2Vec model as default for fast static embeddings.
- Auto-Dimension Detection: Automatically detects embedding dimensions from models.
- On-Device Storage: Stores and manages vector embeddings locally.
- Hybrid Search: Combines vector similarity with BM25 text search for relevant search results (
VecturaKit). - Batch Processing: Indexes documents in parallel for faster data ingestion.
- Persistent Storage: Automatically saves and loads document data, preserving the database state across app sessions.
- Configurable Search: Customizes search behavior with adjustable thresholds, result limits, and hybrid search weights.
- Custom Storage Location: Specifies a custom directory for database storage.
- Custom Storage Provider: Implements custom storage backends (SQLite, Core Data, cloud storage) by conforming to the
VecturaStorageprotocol. - MLX Support: Uses Apple's MLX framework for accelerated embedding generation through
MLXEmbedder. - CLI Tools: Includes
vectura-cli(Swift embeddings) andvectura-mlx-cli(MLX embeddings) for database management and testing.
- macOS 14.0 or later
- iOS 17.0 or later
- tvOS 17.0 or later
- visionOS 1.0 or later
- watchOS 10.0 or later
Swift Package Manager handles the distribution of Swift code and comes built into the Swift compiler.
To integrate VecturaKit into your project using Swift Package Manager, add the following dependency in your Package.swift file:
dependencies: [
.package(url: "https://github.com/rryam/VecturaKit.git", from: "1.1.0"),
],VecturaKit uses the following Swift packages:
- swift-embeddings: Used in
VecturaKitfor generating text embeddings using various models. - swift-argument-parser: Used for creating the command-line interface.
- mlx-swift-examples: Provides MLX-based embeddings and vector search capabilities, specifically for
VecturaMLXKit.
Get up and running with VecturaKit in minutes. Here is an example of adding and searching documents:
import VecturaKit
Task {
do {
let config = VecturaConfig(name: "my-db")
let embedder = SwiftEmbedder(modelSource: .default)
let vectorDB = try await VecturaKit(config: config, embedder: embedder)
// Add documents
let ids = try await vectorDB.addDocuments(texts: [
"The quick brown fox jumps over the lazy dog",
"Swift is a powerful programming language"
])
// Search documents
let results = try await vectorDB.search(query: "programming language", numResults: 5)
print("Found \(results.count) results!")
} catch {
print("Error: \(error)")
}
}import VecturaKitimport Foundation
import VecturaKit
let config = VecturaConfig(
name: "my-vector-db",
directoryURL: nil, // Optional custom storage location
dimension: nil, // Auto-detect dimension from embedder (recommended)
searchOptions: VecturaConfig.SearchOptions(
defaultNumResults: 10,
minThreshold: 0.7,
hybridWeight: 0.5, // Balance between vector and text search
k1: 1.2, // BM25 parameters
b: 0.75
)
)
// Create an embedder (SwiftEmbedder uses swift-embeddings library)
let embedder = SwiftEmbedder(modelSource: .default)
let vectorDB = try await VecturaKit(config: config, embedder: embedder)Single document:
let text = "Sample text to be embedded"
let documentId = try await vectorDB.addDocument(
text: text,
id: UUID() // Optional, will be generated if not provided
)Multiple documents in batch:
let texts = [
"First document text",
"Second document text",
"Third document text"
]
let documentIds = try await vectorDB.addDocuments(
texts: texts,
ids: nil // Optional array of UUIDs
)Search by text (hybrid search):
let results = try await vectorDB.search(
query: "search query",
numResults: 5, // Optional
threshold: 0.8 // Optional
)
for result in results {
print("Document ID: \(result.id)")
print("Text: \(result.text)")
print("Similarity Score: \(result.score)")
print("Created At: \(result.createdAt)")
}Search by vector embedding:
let results = try await vectorDB.search(
query: embeddingArray, // [Float] matching config.dimension
numResults: 5, // Optional
threshold: 0.8 // Optional
)Update document:
try await vectorDB.updateDocument(
id: documentId,
newText: "Updated text"
)Delete documents:
try await vectorDB.deleteDocuments(ids: [documentId1, documentId2])Reset database:
try await vectorDB.reset()Get document count:
let count = await vectorDB.documentCount
print("Database contains \(count) documents")VecturaKit allows you to implement your own storage backend by conforming to the VecturaStorage protocol. This is useful for integrating with different storage systems like SQLite, Core Data, or cloud storage.
Define a custom storage provider:
import Foundation
import VecturaKit
final class MyCustomStorageProvider: VecturaStorage {
private var documents: [UUID: VecturaDocument] = [:]
func createStorageDirectoryIfNeeded() async throws {
// Initialize your storage system
}
func loadDocuments() async throws -> [VecturaDocument] {
// Load documents from your storage
return Array(documents.values)
}
func saveDocument(_ document: VecturaDocument) async throws {
// Save document to your storage
documents[document.id] = document
}
func deleteDocument(withID id: UUID) async throws {
// Delete document from your storage
documents.removeValue(forKey: id)
}
func updateDocument(_ document: VecturaDocument) async throws {
// Update document in your storage
documents[document.id] = document
}
}Use the custom storage provider:
let config = VecturaConfig(name: "my-db")
let customStorage = MyCustomStorageProvider()
let vectorDB = try await VecturaKit(
config: config,
storageProvider: customStorage
)
// Use vectorDB normally - all storage operations will use your custom provider
let documentId = try await vectorDB.addDocument(text: "Sample text")VecturaKit supports Apple's MLX framework through the MLXEmbedder for accelerated on-device machine learning performance.
import VecturaKit
import VecturaMLXKit
import MLXEmbedderslet config = VecturaConfig(
name: "my-mlx-vector-db",
dimension: nil // Auto-detect dimension from MLX embedder
)
// Create MLX embedder
let embedder = try await MLXEmbedder(configuration: .nomic_text_v1_5)
let vectorDB = try await VecturaKit(config: config, embedder: embedder)let texts = [
"First document text",
"Second document text",
"Third document text"
]
let documentIds = try await vectorDB.addDocuments(texts: texts)let results = try await vectorDB.search(
query: "search query",
numResults: 5, // Optional
threshold: 0.8 // Optional
)
for result in results {
print("Document ID: \(result.id)")
print("Text: \(result.text)")
print("Similarity Score: \(result.score)")
print("Created At: \(result.createdAt)")
}Update document:
try await vectorDB.updateDocument(
id: documentId,
newText: "Updated text"
)Delete documents:
try await vectorDB.deleteDocuments(ids: [documentId1, documentId2])Reset database:
try await vectorDB.reset()VecturaKit includes command-line tools for database management with different embedding backends.
# Add documents (dimension auto-detected from model)
vectura add "First document" "Second document" "Third document" \
--db-name "my-vector-db"
# Search documents
vectura search "search query" \
--db-name "my-vector-db" \
--threshold 0.7 \
--num-results 5
# Update document
vectura update <document-uuid> "Updated text content" \
--db-name "my-vector-db"
# Delete documents
vectura delete <document-uuid-1> <document-uuid-2> \
--db-name "my-vector-db"
# Reset database
vectura reset \
--db-name "my-vector-db"
# Run demo with sample data
vectura mock \
--db-name "my-vector-db" \
--threshold 0.7 \
--num-results 10Common options for vectura-cli:
--db-name, -d: Database name (default: "vectura-cli-db")--dimension, -v: Vector dimension (auto-detected by default)--threshold, -t: Minimum similarity threshold (default: 0.7)--num-results, -n: Number of results to return (default: 10)--model-id, -m: Model ID for embeddings (default: "minishlab/potion-retrieval-32M")
# Add documents
vectura-mlx add "First document" "Second document" "Third document" --db-name "my-mlx-vector-db"
# Search documents
vectura-mlx search "search query" --db-name "my-mlx-vector-db" --threshold 0.7 --num-results 5
# Update document
vectura-mlx update <document-uuid> "Updated text content" --db-name "my-mlx-vector-db"
# Delete documents
vectura-mlx delete <document-uuid-1> <document-uuid-2> --db-name "my-mlx-vector-db"
# Reset database
vectura-mlx reset --db-name "my-mlx-vector-db"
# Run demo with sample data
vectura-mlx mock --db-name "my-mlx-vector-db"Options for vectura-mlx-cli:
--db-name, -d: Database name (default: "vectura-mlx-cli-db")--threshold, -t: Minimum similarity threshold (default: no threshold)--num-results, -n: Number of results to return (default: 10)
VecturaKit is released under the MIT License. See the LICENSE file for more information. Copyright (c) 2025 Rudrank Riyam.
Contributions are welcome! Please fork the repository and submit a pull request with your improvements.
- Clone the repository
- Open
Package.swiftin Xcode or VS Code - Run tests to ensure everything works:
swift test - Make your changes and test them
- Follow SwiftLint rules (run
swiftlint lint) - Use Swift 6.0+ features where appropriate
- Maintain backward compatibility when possible
- Document public APIs with DocC comments