LLM for Galaxy Clusters

In this repository, we train an LLM to be an expert on galaxy clusters using a curated set of scientific articles.

How to use:

This code has two main components:

1. Ingestion: the code reads a curated set of scientific articles on galaxy clusters, supplied as PDFs. It then creates vector embeddings for these articles and saves them in a Chroma database (chroma.sqlite3).
2. Question answering: the code uses LangChain to call an LLM (gpt-3.5-turbo) to answer the user's questions. The responses are augmented with retrieval-augmented generation (RAG), which grounds the LLM's answers in the ingested PDFs.
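
The two components above can be illustrated with a minimal, dependency-free sketch. This is not the code in LLM-GalaxyClusters.py: the repository relies on LangChain, Chroma, and an OpenAI model, whereas the sketch below swaps in a toy bag-of-words embedding and an in-memory list so that it runs with the standard library alone. All function names and the example chunks are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Ingestion step: pair each text chunk with its embedding (stand-in for the vector database)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, question, k=2):
    """Retrieval step: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Augmentation step: place the retrieved chunks in front of the question."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# Hypothetical chunks standing in for text extracted from the PDFs.
chunks = [
    "The intracluster medium is hot gas that emits X-rays.",
    "Weak lensing measures the total mass of galaxy clusters.",
    "Spiral galaxies have rotating disks of stars and gas.",
]
index = build_index(chunks)
question = "How is the mass of galaxy clusters measured?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline, the prompt built in the last step would be sent to the LLM, and the toy `embed` function would be replaced by a learned embedding model so that retrieval matches on meaning rather than shared words.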

If you are interested in contributing to the code or the base of PDFs, please contact me via [email protected] or leave a GitHub issue.

The list of scientific articles used to train the LLM can be found in LLM-GalaxyClusters.bib.
