Note: rag.py is adapted from Underfitted's demo posted here: https://github.com/svpino/gentle-intro-to-rag/blob/main/rag.ipynb
This project demonstrates a simple RAG application that answers questions based on content from a plain text file ("grimmstales.txt") using a local embedding API and the Llama 3.1 model.
- Activate the virtual environment:
  `source .venv/bin/activate` (Linux/macOS) or `.\.venv\Scripts\activate` (Windows)
- Install dependencies:
  `pip install -r requirements.txt`
- Make sure you have Ollama or LM Studio installed and running with the Llama 3.1 model:
  `ollama pull llama3.1` (or use LM Studio with the appropriate model loaded)
- You'll need a text file named `grimmstales.txt` in the same directory (this demo uses the public domain "Grimms' Fairy Tales").
- You can use any large text file by changing the `TEXT_FILE` variable in `rag.py` (see the note below).
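For illustration, assuming `TEXT_FILE` is a simple module-level constant near the top of `rag.py` (the actual definition may differ), swapping documents is a one-line change:

```python
# In rag.py (hypothetical form of the constant; check the actual file):
TEXT_FILE = "grimmstales.txt"  # swap in the path to any other large plain text file
```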
Make sure your virtual environment is activated, then run:
`python rag.py`

This script:
- Loads a plain text document ("grimmstales.txt")
- Splits it into manageable chunks
- Creates vector embeddings for each chunk using a local embedding API (compatible with LM Studio or similar)
- Stores the embeddings in a FAISS vector store
- Sets up a retrieval system to find relevant chunks based on questions
- Configures an LLM (Llama 3.1 via Ollama or LM Studio) to generate answers
- Combines everything into a RAG pipeline that answers questions based on the text file content (see the sketch after this list)
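For orientation, the steps above correspond roughly to the LangChain-style sketch below. It is illustrative only, not a copy of `rag.py`: the package names (`langchain-community`, `langchain-openai`, `langchain-ollama`, `langchain-text-splitters`, `faiss-cpu`), the embedding endpoint (`http://localhost:1234/v1`, LM Studio's default), the embedding model name, and the chunking parameters are all assumptions; adjust them to match your local setup and the actual script.

```python
# Illustrative sketch of the pipeline described above; see rag.py for the real code.
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_ollama import ChatOllama
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

TEXT_FILE = "grimmstales.txt"

# Load the plain text document and split it into manageable chunks.
documents = TextLoader(TEXT_FILE, encoding="utf-8").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Embed each chunk via a local OpenAI-compatible embedding API and index with FAISS.
embeddings = OpenAIEmbeddings(
    base_url="http://localhost:1234/v1",  # LM Studio's default server address (assumption)
    api_key="not-needed",                 # local servers typically ignore the key
    model="nomic-embed-text",             # placeholder: whichever embedding model is loaded
    check_embedding_ctx_length=False,     # skip OpenAI-specific token counting
)
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever()

# Configure the LLM (Llama 3.1 via Ollama) and wire retrieval + generation together.
llm = ChatOllama(model="llama3.1")
prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def format_docs(docs):
    # Join the retrieved chunks into a single context string for the prompt.
    return "\n\n".join(doc.page_content for doc in docs)

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke("Which tale features a character who can spin straw into gold?"))
```

Splitting the text into overlapping chunks keeps each embedding focused on a short passage while the overlap preserves context across chunk boundaries, which helps retrieval return coherent passages.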
Here are some example questions you can use to demo the RAG system with "Grimms' Fairy Tales" (a usage sketch follows the list):
- What lesson does the story of "Hansel and Gretel" teach about resourcefulness?
- Which tale features a character who can spin straw into gold?
- In "The Twelve Dancing Princesses," how do the princesses manage to escape each night?
This script was converted from a Jupyter notebook created by Santiago Valdarrama (svpino): https://github.com/svpino/gentle-intro-to-rag/blob/main/rag.ipynb