RAG Content provides a shared codebase for generating vector databases. It serves as the core framework for Lightspeed-related projects (e.g., OpenShift Lightspeed, OpenStack Lightspeed) to generate their own vector databases that can be used for RAG.
The `lightspeed_rag_content` library is not available via pip, but it is included:

- in the base container image, or
- it can be installed via uv.
To install the library via uv:

- Run the command:

  ```bash
  uv sync
  ```

- Test that the library can be imported (expect `lightspeed_rag_content` in the output):

  ```bash
  uv run python -c "import lightspeed_rag_content; print(lightspeed_rag_content.__name__)"
  ```
The base container image can be manually generated or pulled from a container registry.
There are two prebuilt images: one with CPU support only (approximately 3.7 GB) and one with GPU (CUDA) support (approximately 12 GB).
- Pull the CPU variant:

  ```bash
  podman pull quay.io/lightspeed-core/rag-content-cpu:latest
  ```

- Pull the GPU variant:

  ```bash
  podman pull quay.io/lightspeed-core/rag-content-gpu:latest
  ```
To build the image locally, follow these steps:
- Install the requirements: `make` and `podman`.

- Generate the container image:

  ```bash
  podman build -t localhost/lightspeed-rag-content-cpu:latest .
  ```

- The `lightspeed_rag_content` library and its dependencies will be installed in the image (expect `lightspeed_rag_content` in the output):

  ```bash
  podman run localhost/lightspeed-rag-content-cpu:latest python -c "import lightspeed_rag_content; print(lightspeed_rag_content.__name__)"
  ```
You can generate the vector database using one of the following:
- Llama-Index Faiss Vector Store
- Llama-Index Postgres (PGVector) Vector Store
- Llama-Stack Faiss Vector-IO
- Llama-Stack SQLite-vec Vector-IO
The Llama-Index approaches require you to download the embedding model, and we also recommend doing so for the Llama-Stack targets, even though they should work without manually downloading the model.
All cases require you to prepare documentation in text format, which will be chunked and mapped to embeddings generated using the model (an illustrative sketch of the embedding step follows the preparation steps below):
- Download the embedding model (sentence-transformers/all-mpnet-base-v2) from HuggingFace as follows:

  ```bash
  mkdir ./embeddings_model
  uv run python ./scripts/download_embeddings_model.py -l ./embeddings_model/ -r sentence-transformers/all-mpnet-base-v2
  ```

- Prepare dummy documentation:

  ```bash
  mkdir -p ./custom_docs/0.1
  echo "Vector Database is an efficient way how to provide information to LLM" > ./custom_docs/0.1/info.txt
  ```
- Prepare a custom script (`./custom_processor.py`) for populating the vector database. We provide an example of what such a script might look like using the `lightspeed_rag_content` library. Note that in your case the script will be different:

  ```python
  from lightspeed_rag_content.metadata_processor import MetadataProcessor
  from lightspeed_rag_content.document_processor import DocumentProcessor
  from lightspeed_rag_content import utils


  class CustomMetadataProcessor(MetadataProcessor):

      def __init__(self, url):
          self.url = url

      def url_function(self, file_path: str) -> str:
          # Return a URL for the file, so it can be referenced when used
          # in an answer
          return self.url


  if __name__ == "__main__":
      parser = utils.get_common_arg_parser()
      args = parser.parse_args()

      # Instantiate custom Metadata Processor
      metadata_processor = CustomMetadataProcessor("https://www.redhat.com")

      # Instantiate Document Processor
      document_processor = DocumentProcessor(
          chunk_size=args.chunk,
          chunk_overlap=args.overlap,
          model_name=args.model_name,
          embeddings_model_dir=args.model_dir,
          num_workers=args.workers,
          vector_store_type=args.vector_store_type,
      )

      # Load and embed the documents; this method can be called multiple
      # times for different sets of documents
      document_processor.process(args.folder, metadata=metadata_processor)

      # Save the new vector database to the output directory
      document_processor.save(args.index, args.output)
  ```
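Under the hood, the document processor splits each input file into chunks and maps every chunk to an embedding vector produced by the model downloaded above. As a rough standalone illustration of that embedding step (a sketch using the sentence-transformers library directly, not part of the workflow itself):

```python
# Illustration only: embed two text chunks with the locally downloaded model.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("./embeddings_model")

chunks = [
    "Vector Database is an efficient way how to provide information to LLM",
    "Embeddings map text chunks to dense vectors for similarity search",
]

# encode() returns one 768-dimensional vector per chunk for all-mpnet-base-v2
vectors = model.encode(chunks)
print(vectors.shape)  # (2, 768)
```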
Generate the vector database from the documentation using the script from the previous section (Generating the Vector Database):

```bash
uv run ./custom_processor.py \
  -o ./vector_db/custom_docs/0.1 \
  -f ./custom_docs/0.1/ \
  -md embeddings_model/ \
  -mn sentence-transformers/all-mpnet-base-v2 \
  -i custom_docs-0_1
```
Once the command is done, you can find the vector database at `./vector_db`, the embedding model at `./embeddings_model`, and the Index ID set to `custom-docs-0_1`.
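To double-check the generated Faiss store outside of this repository's scripts, here is a minimal retrieval sketch with llama-index. The persist directory, embedding-model path, and index ID are taken from the example above, and it assumes the output directory contains a standard llama-index persisted storage context (docstore, index store, and Faiss vector store):

```python
from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.faiss import FaissVectorStore

# Use the same local embedding model that was used to build the index
Settings.embed_model = HuggingFaceEmbedding(model_name="./embeddings_model")

persist_dir = "./vector_db/custom_docs/0.1"  # the -o directory from above
vector_store = FaissVectorStore.from_persist_dir(persist_dir)
storage_context = StorageContext.from_defaults(
    vector_store=vector_store, persist_dir=persist_dir
)
index = load_index_from_storage(storage_context)

# Retrieve the top 3 chunks for a sample query and print score + snippet
retriever = index.as_retriever(similarity_top_k=3)
for result in retriever.retrieve("What is a vector database?"):
    print(result.score, result.node.get_content()[:80])
```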
To generate a vector database stored in Postgres (PGVector), run the following commands:
- Start Postgres with the pgvector extension by running:

  ```bash
  make start-postgres-debug
  ```

  The data folder of Postgres is created at `./postgresql/data`. This command also creates the `./output` directory, in which the metadata is saved.

- Run:

  ```bash
  POSTGRES_USER=postgres \
  POSTGRES_PASSWORD=somesecret \
  POSTGRES_HOST=localhost \
  POSTGRES_PORT=15432 \
  POSTGRES_DATABASE=postgres \
  uv run python ./custom_processor.py \
    -o ./output \
    -f custom_docs/0.1/ \
    -md embeddings_model/ \
    -mn sentence-transformers/all-mpnet-base-v2 \
    -i custom_docs-0_1 \
    --vector-store-type postgres
  ```
This generates embeddings in PostgreSQL, which can be used for RAG, and writes `metadata.json` to `./output`. The generated embeddings are stored in the `data_table_name` table:

```bash
$ podman exec -it pgvector bash
$ psql -U postgres
psql (16.4 (Debian 16.4-1.pgdg120+2))
Type "help" for help.

postgres=# \dt
             List of relations
 Schema |      Name       | Type  |  Owner
--------+-----------------+-------+----------
 public | data_table_name | table | postgres
(1 row)
```
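To sanity-check the stored embeddings programmatically, here is a minimal retrieval sketch using llama-index's PGVectorStore. The connection parameters mirror the environment variables above; the `table_name="table_name"` and the 768-dimension value are assumptions based on the `data_table_name` table shown above and the all-mpnet-base-v2 model:

```python
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.postgres import PGVectorStore

# Use the same local embedding model that was used to build the database
Settings.embed_model = HuggingFaceEmbedding(model_name="./embeddings_model")

# llama-index prefixes the table name with "data_", which is why the table
# created above shows up as "data_table_name"
vector_store = PGVectorStore.from_params(
    host="localhost",
    port="15432",
    user="postgres",
    password="somesecret",
    database="postgres",
    table_name="table_name",
    embed_dim=768,  # all-mpnet-base-v2 produces 768-dimensional vectors
)

index = VectorStoreIndex.from_vector_store(vector_store)
retriever = index.as_retriever(similarity_top_k=3)
for result in retriever.retrieve("What is a vector database?"):
    print(result.score, result.node.get_content()[:80])
```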
The process is basically the same as in the Llama-Index Faiss Vector Store, but passing the `--vector-store-type` parameter; so you generate the vector database using the `custom_processor.py` script from the earlier section (Generating the Vector Database):
```bash
uv run ./custom_processor.py \
  -o ./vector_db/custom_docs/0.1 \
  -f ./custom_docs/0.1/ \
  -md embeddings_model/ \
  -mn sentence-transformers/all-mpnet-base-v2 \
  -i custom_docs-0_1 \
  --vector-store-type=llamastack-faiss
```
Once the command is done, you can find the vector database at `./vector_db/custom_docs/0.1` with the name `faiss_store.db`, as well as a barebones llama-stack configuration file named `llama-stack.yaml`, provided only for reference since it is not necessary for the final deployment. The vector-io provider will be named `custom-docs-0_1`:
```yaml
providers:
  vector_io:
  - provider_id: custom-docs-0_1
    provider_type: inline::faiss
    config:
      kvstore:
        type: sqlite
        namespace: null
        db_path: /home/<user>/rag-content/vector_db/custom_docs/0.1/faiss_store.db
```
Once we have a database, we can use the `query_rag.py` script to check some results:

```bash
python scripts/query_rag.py \
  -p vector_db/custom_docs/0.1 \
  -x custom-docs-0_1 \
  -m embeddings_model \
  -k 5 \
  -q "how can I configure a cinder backend"
```
The process is the same as in the Llama-Stack Faiss case, but passing a different value for the `--vector-store-type` parameter; so you generate the vector database using the `custom_processor.py` script from the earlier section (Generating the Vector Database):
```bash
uv run ./custom_processor.py \
  -o ./vector_db/custom_docs/0.1 \
  -f ./custom_docs/0.1/ \
  -md embeddings_model/ \
  -mn sentence-transformers/all-mpnet-base-v2 \
  -i custom_docs-0_1 \
  --vector-store-type=llamastack-sqlite-vec
```
Once the command is done, you can find the vector database at `./vector_db/custom_docs/0.1` with the name `sqlitevec_store.db`, as well as a barebones llama-stack configuration file named `llama-stack.yaml`, provided only for reference since it is not necessary for the final deployment. The vector-io provider will be named `custom-docs-0_1`:
```yaml
providers:
  vector_io:
  - provider_id: custom-docs-0_1
    provider_type: inline::sqlite-vec
    config:
      db_path: /home/<user>/rag-content/vector_db/custom_docs/0.1/sqlitevec_store.db
```
Once we have a database, we can use the `query_rag.py` script to check some results:

```bash
python scripts/query_rag.py \
  -p vector_db/custom_docs/0.1 \
  -x custom-docs-0_1 \
  -m embeddings_model \
  -k 5 \
  -q "how can I configure a cinder backend"
```
This repository uses the `uv.lock` lock file. The lock file needs to be regenerated when new dependency updates are available. Use the following commands to do so:

```bash
uv lock --upgrade
uv sync
```
To generate all `requirements*` files (`requirements-build.in`, `requirements-build.txt`, `requirements.txt`), the following command must be executed:

```bash
scripts/generate_packages_to_prefetch.py
```
This project is licensed under the Apache License 2.0. See the LICENSE file for details.