Intel® AI for Enterprise RAG simplifies transforming your enterprise data into actionable insights. Powered by Intel® Xeon® processors and Intel® Gaudi® AI accelerators, it integrates components from industry partners to offer a streamlined approach to deploying enterprise solutions.
Enable intelligent ChatQ&A experiences that understand your business context:
- Domain-Specific Intelligence - Enrich conversations with your organizational knowledge without training or fine-tuning models
- Rapid Deployment - Transform enterprise documents into conversational AI experiences in minutes, not months
- Enterprise-Ready Scale - Deploy secure, compliant ChatQ&A solutions that grow with your business needs
- One-Click Enterprise Deployment - Fully automated Kubernetes cluster provisioning with Ansible playbooks, supporting both single-node and multi-node configurations with comprehensive infrastructure setup.
- Optimized AI Hardware Support - Native support for Intel® Xeon® processors and Intel® Gaudi® AI accelerators with Horizontal Pod Autoscaling (HPA), balloons policy for CPU pinning on NUMA architectures, and performance-tuned configurations.
- Enterprise-Grade Security & Compliance - Integrated Identity and Access Management (IAM) with Keycloak, programmable guardrails for fine-grained control, Pod Security Standards (PSS) enforcement for secure enterprise operations, role-based access control for vector databases, and Intel® Trust Domain Extensions (TDX) support for confidential computing.
- Comprehensive Monitoring & Observability - Integrated telemetry stack with Prometheus, Grafana dashboards, distributed tracing with Tempo, and centralized logging with Loki for full pipeline visibility.
If you're interested in getting a glimpse of how Intel® AI for Enterprise RAG works, check out the following demo.
Note
The video below showcases the beta release of our project. In subsequent releases, users can expect an improved UI design, a streamlined installation process, and other enhancements.
Feel free to check out the architecture of the pipeline. For the detailed microservices architecture, refer here.
- Intel® AI for Enterprise RAG
- Requirements
- Getting Started
- Documentation
- Support
- Publications
- License
- Security
- Intel’s Human Rights Principles
- Model Card Guidance
- Contributing
- Trademark Information
Category | Details |
---|---|
Operating System | Ubuntu 22.04/24.04 |
Hardware Platforms | 4th Gen Intel® Xeon® Scalable processors<br>5th Gen Intel® Xeon® Scalable processors<br>6th Gen Intel® Xeon® Scalable processors<br>3rd Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 2 AI Accelerator<br>4th Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 2 AI Accelerator<br>6th Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 3 AI Accelerator |
Kubernetes Version | 1.29.5, 1.29.12, 1.30.8, 1.31.4 |
Python | 3.10 |
- Hugging Face Model Access: Ensure you have the necessary access to download and use the chosen Hugging Face model. Default models can be inspected in config.yaml.
- For multi-node clusters, a CSI driver with a StorageClass supporting the access mode ReadWriteMany (RWX) is required. An NFS server with an RWX-capable CSI driver can be installed via the simplified Kubernetes cluster deployment section, or see deployment/README.md for more detailed instructions.
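As an illustration, a workload can claim shared RWX storage from such a StorageClass with a PersistentVolumeClaim like the one below. This is a sketch only: the StorageClass name `nfs-csi` and the requested size are placeholder assumptions, not values shipped with ERAG.

```yaml
# Hypothetical PVC requesting shared (RWX) storage for a multi-node cluster.
# storageClassName is an assumed example; use the class your CSI driver provides.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-model-cache
spec:
  accessModes:
    - ReadWriteMany        # required for volumes shared across nodes
  storageClassName: nfs-csi
  resources:
    requests:
      storage: 100Gi
```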
These are the minimal requirements to run Intel® AI for Enterprise RAG with default settings. If more (or fewer) resources are available, feel free to adjust the parameters in resources-reference-cpu.yaml or resources-reference-hpu.yaml, depending on the chosen hardware.
To deploy the solution using Xeon only, you will need access to a platform with an Intel® Xeon® Scalable processor that meets the requirements below:
- Logical cores: a minimum of `88` logical cores
- RAM: a minimum of `250GB` of RAM
- Disk space: `200GB` of disk space is generally recommended, though this is highly dependent on the model size
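As a quick pre-flight sanity check, you can compare the host against these minimums with standard Linux tools. This is a rough sketch: the thresholds mirror the Xeon-only list above, and the disk check only looks at free space on `/`.

```bash
# Rough pre-flight check against the documented Xeon-only minimums.
cores=$(nproc)                                                # logical cores
mem_gb=$(free -g | awk '/^Mem:/ {print $2}')                  # total RAM in GB
disk_gb=$(df -BG --output=avail / | tail -1 | tr -dc '0-9')   # free GB on /

echo "cores=${cores} ram=${mem_gb}GB disk_free=${disk_gb}GB"
[ "$cores" -ge 88 ]   || echo "WARN: fewer than 88 logical cores"
[ "$mem_gb" -ge 250 ] || echo "WARN: less than 250GB of RAM"
[ "$disk_gb" -ge 200 ] || echo "WARN: less than 200GB of free disk space"
```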
To deploy the solution on a platform with a Gaudi® AI Accelerator, you will need access to an instance that meets these minimum requirements:
- Logical cores: a minimum of `56` logical cores
- RAM: a minimum of `250GB` of RAM, though this is highly dependent on database size
- Disk space: `500GB` of disk space is generally recommended, though this is highly dependent on the model size and database size
- Gaudi cards: `8`
- Gaudi driver: `1.21.3`
Install the prerequisites.
```bash
cd deployment/
sudo apt-get install python3-venv
python3 -m venv erag-venv
source erag-venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
ansible-galaxy collection install -r requirements.yaml --upgrade
```
Create a copy of the configuration file:
```bash
cd deployment
cp -r inventory/sample inventory/test-cluster
```
(Optional) Execute the following command to install and configure third-party applications, including Docker, Helm, make, zip, and jq, needed to run Intel® AI for Enterprise RAG correctly.
```bash
ansible-playbook -u $USER -K playbooks/application.yaml --tags configure -e @inventory/test-cluster/config.yaml
```
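To confirm those tools are available afterwards, a small check like this can help. It is a sketch only; the tool list mirrors the one above, and `check_tool` is a hypothetical helper, not part of the project.

```bash
# Report which of the required third-party tools are on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

for tool in docker helm make zip jq; do
  check_tool "$tool"
done
```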
If you need to deploy a new Kubernetes cluster or don't know where to start, follow the instructions below. If you already have a cluster prepared, skip to the next section.
- Edit the inventory file:
  - Open `inventory/test-cluster/inventory.ini`.
  - Replace `LOCAL_USER`, `REMOTE_USER`, and `MACHINE_IP` with your actual values.

Example `inventory.ini` for a single-node cluster:
```ini
# Kubernetes Cluster Inventory
[local]
localhost ansible_connection=local ansible_user=LOCAL_USER

[all]
# Control plane nodes
node1 ansible_host=MACHINE_IP

# Define node groups
[kube_control_plane]
node1

[kube_node]
node1

[etcd:children]
kube_control_plane

[k8s_cluster:children]
kube_control_plane
kube_node

# Vars
[k8s_cluster:vars]
ansible_become=true
ansible_user=REMOTE_USER
ansible_connection=ssh
```
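For a multi-node cluster, the same group layout extends naturally. The sketch below is illustrative only: the hostnames and `NODE*_IP` values are placeholders, not values from the sample inventory.

```ini
# Illustrative multi-node inventory: one control plane node, two workers.
[local]
localhost ansible_connection=local ansible_user=LOCAL_USER

[all]
node1 ansible_host=NODE1_IP
node2 ansible_host=NODE2_IP
node3 ansible_host=NODE3_IP

[kube_control_plane]
node1

[kube_node]
node2
node3

[etcd:children]
kube_control_plane

[k8s_cluster:children]
kube_control_plane
kube_node

[k8s_cluster:vars]
ansible_become=true
ansible_user=REMOTE_USER
ansible_connection=ssh
```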
Tip

For password-based SSH connections to the node, add `--ask-pass` to every ansible command.
For more information on preparing an Ansible inventory, see the Ansible Inventory Documentation.
- Edit the configuration file:
  - Open `inventory/test-cluster/config.yaml`.
  - Set `deploy_k8s` to `true`.
  - Fill in the required values for your environment. If you don't have any cluster deployed, ignore the `kubeconfig` parameter for now.

Note

The inventory provides the ability to install additional components that might be needed when preparing a Kubernetes (K8s) cluster.

- Set `gaudi_operator: true` if you are working with Gaudi nodes and want to install the Gaudi software stack via the operator.
- Set `install_csi: nfs` if you are setting up a multi-node cluster and want to deploy an NFS server with a CSI plugin that creates a `StorageClass` with RWX (ReadWriteMany) capabilities. Velero requires NFS to be included.
- Set `install_csi: netapp-trident` if you are deploying with a NetApp ONTAP storage backend for enterprise-grade storage with advanced features.
- (Optional) Validate hardware resources:

```bash
ansible-playbook playbooks/validate.yaml --tags hardware -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```

If this is a Gaudi deployment, add the additional flag `-e is_gaudi_platform=true`.
- Deploy the cluster:

```bash
ansible-playbook -K playbooks/infrastructure.yaml --tags configure,install -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```
- Add the `kubeconfig` path in config.yaml.
- (Optional) Validate config.yaml:

```bash
ansible-playbook playbooks/validate.yaml --tags config -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```
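Pulled together, the settings touched in these steps might look like this in `config.yaml`. This is an illustrative fragment under assumptions: the values are placeholders, and the real file contains many more options than shown here.

```yaml
# Illustrative config.yaml fragment for a playbook-provisioned cluster.
deploy_k8s: true                     # let the infrastructure playbooks provision K8s
kubeconfig: /home/user/.kube/config  # placeholder; fill in after the cluster is deployed
gaudi_operator: false                # set true on Gaudi nodes
install_csi: nfs                     # RWX storage for multi-node (or netapp-trident)
```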
If you are using your own custom Kubernetes cluster (not provisioned by the provided infrastructure playbooks), you may need to install additional infrastructure components before deploying the application. These include the NFS server for shared storage, Gaudi operator (for Habana Gaudi AI accelerator support), Velero, or other supported services.
To prepare your cluster:
- Edit the configuration file:
  - Open `inventory/test-cluster/config.yaml`.
  - Set `deploy_k8s: false` and update the other fields as needed for your environment.
  - If you need NFS, set `install_csi: nfs` and configure the NFS-related variables (backing up with Velero requires NFS to be included).
  - If you need Gaudi support, set `gaudi_operator: true` and specify the desired `habana_driver_version`.
- Validate hardware resources and `config.yaml`:

```bash
ansible-playbook playbooks/validate.yaml --tags hardware,config -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```

If this is a Gaudi deployment, add the flag `-e is_gaudi_platform=true`.
- Install infrastructure components (NFS, Gaudi operator, or others):

```bash
ansible-playbook -K playbooks/infrastructure.yaml --tags post-install -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```

This will install and configure the NFS server, Gaudi operator, or Velero as specified in your configuration.
Note
You can enable several components in the same run if more than one is needed. Additional components may be supported via post-install in the future.
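For a pre-existing cluster, the relevant fragment of `config.yaml` might look like the following. The values are illustrative assumptions; only keys mentioned above are shown, and the driver version simply echoes the one in the requirements table.

```yaml
# Illustrative fragment for an existing (not playbook-provisioned) cluster.
deploy_k8s: false
kubeconfig: /home/user/.kube/config   # placeholder path to your cluster's kubeconfig
install_csi: nfs                      # deploy the NFS server + RWX StorageClass
gaudi_operator: true                  # only on Gaudi nodes
habana_driver_version: "1.21.3"
```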
Once your cluster is prepared and the required infrastructure is installed, proceed with the application installation.
```bash
ansible-playbook -u $USER -K playbooks/application.yaml --tags configure,install -e @inventory/test-cluster/config.yaml
```
To verify that the components were installed correctly, run the script `./scripts/test_connection.sh`, connect to the UI, or execute the e2e tests.
Refer to deployment/README.md or docs for a more detailed deployment guide and in-depth instructions on ERAG components.
Submit questions, feature requests, and bug reports on the GitHub Issues page.
Feel free to check out these articles about Intel® AI for Enterprise RAG:
- NetApp AIPod Mini – Deployment Automation
- Multi-node deployments using Intel® AI for Enterprise RAG
- Rethinking AI Infrastructure: How NetApp and Intel Are Unlocking the Future with AIPod Mini
- Deploying Scalable Enterprise RAG on Kubernetes with Ansible Automation
Intel® AI for Enterprise RAG is licensed under the Apache License Version 2.0. Refer to the "LICENSE" file for the full license text and copyright notice.
This distribution includes third-party software governed by separate license terms. This third-party software, even if included with the distribution of the Intel software, may be governed by separate license terms, including without limitation, third-party license terms, other Intel software license terms, and open-source software license terms. These separate license terms govern your use of the third-party programs as set forth in the "THIRD-PARTY-PROGRAMS" file.
The Security Policy outlines our guidelines and procedures for ensuring the highest level of security and trust for our users who consume Intel® AI for Enterprise RAG.
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
You, not Intel, are responsible for determining model suitability for your use case. For information regarding model limitations, safety considerations, biases, or other information consult the model cards (if any) for models you use, typically found in the repository where the model is available for download. Contact the model provider with questions. Intel does not provide model cards for third party models.
If you want to contribute to the project, please refer to the guide in CONTRIBUTING.md file.
Intel, the Intel logo, OpenVINO, the OpenVINO logo, Pentium, Xeon, and Gaudi are trademarks of Intel Corporation or its subsidiaries.
Other names and brands may be claimed as the property of others.
© Intel Corporation