Intel® AI for Enterprise RAG simplifies transforming your enterprise data into actionable insights. Powered by Intel® Xeon® processors and Intel® Gaudi® AI accelerators, it integrates components from industry partners to offer a streamlined approach to deploying enterprise solutions.
Enable intelligent ChatQ&A experiences that understand your business context:
- Domain-Specific Intelligence - Enrich conversations with your organizational knowledge without training or fine-tuning models
- Rapid Deployment - Transform enterprise documents into conversational AI experiences in minutes, not months
- Enterprise-Ready Scale - Deploy secure, compliant ChatQ&A solutions that grow with your business needs
- One-Click Enterprise Deployment - Fully automated Kubernetes cluster provisioning with Ansible playbooks, supporting both single-node and multi-node configurations with comprehensive infrastructure setup.
- Optimized AI Hardware Support - Native support for Intel® Xeon® processors and Intel® Gaudi® AI accelerators with Horizontal Pod Autoscaling (HPA), balloons policy for CPU pinning on NUMA architectures, and performance-tuned configurations.
- Enterprise-Grade Security & Compliance - Integrated Identity and Access Management (IAM) with Keycloak, programmable guardrails for fine-grained control, Pod Security Standards (PSS) enforcement for secure enterprise operations, role-based access control for vector databases, and Intel® Trust Domain Extensions (TDX) support for confidential computing.
- Comprehensive Monitoring & Observability - Integrated telemetry stack with Prometheus, Grafana dashboards, distributed tracing with Tempo, and centralized logging with Loki for full pipeline visibility.
If you're interested in getting a glimpse of how Intel® AI for Enterprise RAG works, check out the following demo.
Note
The video below showcases the beta release of our project. In subsequent releases, users can expect an improved UI design, a streamlined installation process, and other enhancements.
Feel free to check out the architecture of the pipeline. For the detailed microservices architecture, refer here.
- Intel® AI for Enterprise RAG
- Requirements
- Getting Started
- Documentation
- Support
- Publications
- License
- Security
- Intel’s Human Rights Principles
- Model Card Guidance
- Contributing
- Trademark Information
Category | Details |
---|---|
Operating System | Ubuntu 22.04/24.04 |
Hardware Platforms | 4th Gen Intel® Xeon® Scalable processors<br>5th Gen Intel® Xeon® Scalable processors<br>6th Gen Intel® Xeon® Scalable processors<br>3rd Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 2 AI Accelerator<br>4th Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 2 AI Accelerator<br>6th Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 3 AI Accelerator |
Kubernetes Version | 1.29.5, 1.29.12, 1.30.8, 1.31.4 |
Python | 3.10 |
- Hugging Face Model Access: Ensure you have the necessary access to download and use the chosen Hugging Face model. Default models can be inspected in config.yaml.
- For multi-node clusters, a CSI driver with a StorageClass supporting the access mode ReadWriteMany (RWX) is required. An NFS server with an RWX-capable CSI driver can be installed via the simplified Kubernetes cluster deployment section, or see deployment/README.md for more detailed instructions.
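As an illustration, a workload can claim shared RWX storage from such a StorageClass with a PersistentVolumeClaim like the one below. This is a sketch only: the StorageClass name `nfs-csi` and the requested size are placeholder assumptions, not values shipped with ERAG.

```yaml
# Hypothetical PVC requesting shared (RWX) storage for a multi-node cluster.
# storageClassName is an assumed example; use the class your CSI driver provides.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-model-cache
spec:
  accessModes:
    - ReadWriteMany        # required for volumes shared across nodes
  storageClassName: nfs-csi
  resources:
    requests:
      storage: 100Gi
```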
These are the minimal requirements to run Intel® AI for Enterprise RAG with default settings. If more (or fewer) resources are available, feel free to adjust the parameters in resources-reference-cpu.yaml or resources-reference-hpu.yaml, depending on the chosen hardware.
To deploy the solution using Xeon only, you will need access to a platform with an Intel® Xeon® Scalable processor that meets the requirements below:
- Logical cores: a minimum of `88` logical cores
- RAM: a minimum of `250GB` of RAM
- Disk space: `200GB` of disk space is generally recommended, though this is highly dependent on the model size
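As a quick pre-flight sanity check, you can compare the host against these minimums with standard Linux tools. This is a rough sketch: the thresholds mirror the Xeon-only list above, and the disk check only looks at free space on `/`.

```bash
# Rough pre-flight check against the documented Xeon-only minimums.
cores=$(nproc)                                                # logical cores
mem_gb=$(free -g | awk '/^Mem:/ {print $2}')                  # total RAM in GB
disk_gb=$(df -BG --output=avail / | tail -1 | tr -dc '0-9')   # free GB on /

echo "cores=${cores} ram=${mem_gb}GB disk_free=${disk_gb}GB"
[ "$cores" -ge 88 ]   || echo "WARN: fewer than 88 logical cores"
[ "$mem_gb" -ge 250 ] || echo "WARN: less than 250GB of RAM"
[ "$disk_gb" -ge 200 ] || echo "WARN: less than 200GB of free disk space"
```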
To deploy the solution on a platform with a Gaudi® AI Accelerator, you will need access to an instance that meets these minimum requirements:
- Logical cores: a minimum of `56` logical cores
- RAM: a minimum of `250GB` of RAM, though this is highly dependent on database size
- Disk space: `500GB` of disk space is generally recommended, though this is highly dependent on the model size and database size
- Gaudi cards: `8`
- Gaudi driver: `1.21.3`
Install the prerequisites.
```bash
cd deployment/
sudo apt-get install python3-venv
python3 -m venv erag-venv
source erag-venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
ansible-galaxy collection install -r requirements.yaml --upgrade
```
Create a copy of the configuration file:
```bash
cd deployment
cp -r inventory/sample inventory/test-cluster
```
(Optional) Execute the following command to install and configure third-party applications, including Docker, Helm, make, zip, and jq, needed to run Intel® AI for Enterprise RAG correctly.
```bash
ansible-playbook -u $USER -K playbooks/application.yaml --tags configure -e @inventory/test-cluster/config.yaml
```
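To confirm those tools are available afterwards, a small check like this can help. It is a sketch only; the tool list mirrors the one above, and `check_tool` is a hypothetical helper, not part of the project.

```bash
# Report which of the required third-party tools are on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

for tool in docker helm make zip jq; do
  check_tool "$tool"
done
```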
If you need to deploy a new Kubernetes cluster or don't know where to start, follow the instructions below. If you already have a cluster prepared, skip to the next section.
- Edit the inventory file:
  - Open `inventory/test-cluster/inventory.ini`.
  - Replace `LOCAL_USER`, `REMOTE_USER`, and `MACHINE_IP` with your actual values.

Example `inventory.ini` for a single-node cluster:
```ini
# Kubernetes Cluster Inventory
[local]
localhost ansible_connection=local ansible_user=LOCAL_USER

[all]
# Control plane nodes
node1 ansible_host=MACHINE_IP

# Define node groups
[kube_control_plane]
node1

[kube_node]
node1

[etcd:children]
kube_control_plane

[k8s_cluster:children]
kube_control_plane
kube_node

# Vars
[k8s_cluster:vars]
ansible_become=true
ansible_user=REMOTE_USER
ansible_connection=ssh
```
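For a multi-node cluster, the same group layout extends naturally. The sketch below is illustrative only: the hostnames and `NODE*_IP` values are placeholders, not values from the sample inventory.

```ini
# Illustrative multi-node inventory: one control plane node, two workers.
[local]
localhost ansible_connection=local ansible_user=LOCAL_USER

[all]
node1 ansible_host=NODE1_IP
node2 ansible_host=NODE2_IP
node3 ansible_host=NODE3_IP

[kube_control_plane]
node1

[kube_node]
node2
node3

[etcd:children]
kube_control_plane

[k8s_cluster:children]
kube_control_plane
kube_node

[k8s_cluster:vars]
ansible_become=true
ansible_user=REMOTE_USER
ansible_connection=ssh
```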
Tip

For password-based SSH connections to the node, add `--ask-pass` to every ansible command.
For more information on preparing an Ansible inventory, see the Ansible Inventory Documentation.
- Edit the configuration file:
  - Open `inventory/test-cluster/config.yaml`.
  - Set `deploy_k8s` to `true`.
  - Fill in the required values for your environment. If you don't have any cluster deployed, ignore the `kubeconfig` parameter for now.

Note

The inventory provides the ability to install additional components that might be needed when preparing a Kubernetes (K8s) cluster.

- Set `gaudi_operator: true` if you are working with Gaudi nodes and want to install the Gaudi software stack via the operator.
- Set `install_csi: nfs` if you are setting up a multi-node cluster and want to deploy an NFS server with a CSI plugin that creates a `StorageClass` with RWX (ReadWriteMany) capabilities. Velero requires NFS to be included.
- Set `install_csi: netapp-trident` if you are deploying with a NetApp ONTAP storage backend for enterprise-grade storage with advanced features.
- (Optional) Validate hardware resources:

```bash
ansible-playbook playbooks/validate.yaml --tags hardware -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```

If this is a Gaudi deployment, add the additional flag `-e is_gaudi_platform=true`.
- Deploy the cluster:

```bash
ansible-playbook -K playbooks/infrastructure.yaml --tags configure,install -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```
- Add the `kubeconfig` path in config.yaml.
- (Optional) Validate config.yaml:

```bash
ansible-playbook playbooks/validate.yaml --tags config -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```
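Pulled together, the settings touched in these steps might look like this in `config.yaml`. This is an illustrative fragment under assumptions: the values are placeholders, and the real file contains many more options than shown here.

```yaml
# Illustrative config.yaml fragment for a playbook-provisioned cluster.
deploy_k8s: true                     # let the infrastructure playbooks provision K8s
kubeconfig: /home/user/.kube/config  # placeholder; fill in after the cluster is deployed
gaudi_operator: false                # set true on Gaudi nodes
install_csi: nfs                     # RWX storage for multi-node (or netapp-trident)
```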
If you are using your own custom Kubernetes cluster (not provisioned by the provided infrastructure playbooks), you may need to install additional infrastructure components before deploying the application. These include the NFS server for shared storage, Gaudi operator (for Habana Gaudi AI accelerator support), Velero, or other supported services.
To prepare your cluster:
- Edit the configuration file:
  - Open `inventory/test-cluster/config.yaml`.
  - Set `deploy_k8s: false` and update the other fields as needed for your environment.
  - If you need NFS, set `install_csi: nfs` and configure the NFS-related variables (backing up with Velero requires NFS to be included).
  - If you need Gaudi support, set `gaudi_operator: true` and specify the desired `habana_driver_version`.
- Validate hardware resources and `config.yaml`:

```bash
ansible-playbook playbooks/validate.yaml --tags hardware,config -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```

If this is a Gaudi deployment, add the flag `-e is_gaudi_platform=true`.
- Install infrastructure components (NFS, Gaudi operator, or others):

```bash
ansible-playbook -K playbooks/infrastructure.yaml --tags post-install -i inventory/test-cluster/inventory.ini -e @inventory/test-cluster/config.yaml
```

This will install and configure the NFS server, Gaudi operator, or Velero as specified in your configuration.
Note
You can enable several components in the same run if more than one is needed. Additional components may be supported via post-install in the future.
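For a pre-existing cluster, the relevant fragment of `config.yaml` might look like the following. The values are illustrative assumptions; only keys mentioned above are shown, and the driver version simply echoes the one in the requirements table.

```yaml
# Illustrative fragment for an existing (not playbook-provisioned) cluster.
deploy_k8s: false
kubeconfig: /home/user/.kube/config   # placeholder path to your cluster's kubeconfig
install_csi: nfs                      # deploy the NFS server + RWX StorageClass
gaudi_operator: true                  # only on Gaudi nodes
habana_driver_version: "1.21.3"
```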
Once your cluster is prepared and the required infrastructure is installed, proceed with the application installation.
```bash
ansible-playbook -u $USER -K playbooks/application.yaml --tags configure,install -e @inventory/test-cluster/config.yaml
```
To verify that the components were installed correctly, run the script `./scripts/test_connection.sh`, connect to the UI, or execute the e2e tests.
Refer to deployment/README.md or docs for a more detailed deployment guide and in-depth instructions on ERAG components.
Submit questions, feature requests, and bug reports on the GitHub Issues page.
Feel free to check out these articles about Intel® AI for Enterprise RAG:
- NetApp AIPod Mini – Deployment Automation
- Multi-node deployments using Intel® AI for Enterprise RAG
- Rethinking AI Infrastructure: How NetApp and Intel Are Unlocking the Future with AIPod Mini
- Deploying Scalable Enterprise RAG on Kubernetes with Ansible Automation
Intel® AI for Enterprise RAG is licensed under the Apache License Version 2.0. Refer to the "LICENSE" file for the full license text and copyright notice.
This distribution includes third-party software governed by separate license terms. This third-party software, even if included with the distribution of the Intel software, may be governed by separate license terms, including without limitation, third-party license terms, other Intel software license terms, and open-source software license terms. These separate license terms govern your use of the third-party programs as set forth in the "THIRD-PARTY-PROGRAMS" file.
The Security Policy outlines our guidelines and procedures for ensuring the highest level of security and trust for our users who consume Intel® AI for Enterprise RAG.
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
You, not Intel, are responsible for determining model suitability for your use case. For information regarding model limitations, safety considerations, biases, or other information consult the model cards (if any) for models you use, typically found in the repository where the model is available for download. Contact the model provider with questions. Intel does not provide model cards for third party models.
If you want to contribute to the project, please refer to the guide in CONTRIBUTING.md file.
Intel, the Intel logo, OpenVINO, the OpenVINO logo, Pentium, Xeon, and Gaudi are trademarks of Intel Corporation or its subsidiaries.
Other names and brands may be claimed as the property of others.
© Intel Corporation