A Flask API powered by a locally hosted multimodal LLM (via Ollama) that generates LaTeX code from images of equations, tables, and figures. It supports conference styles such as IEEE while keeping all processing local for data privacy and flexibility. Ideal for researchers and academics.
- Multi-Content Support: Extracts LaTeX for equations, tables, and formatted text.
- Preferred Model: Optimized for the multimodal LLM `llava:34b`.
- Fuzzy Matching: Handles variations in LaTeX syntax for robust extraction (see the sketch after this list).
- Conference Styles: Outputs tailored for formats like IEEE.
- Privacy First: No external API calls; all processing runs locally.
- Extensible: Built on Flask with well-structured, modular code.
- Dockerized: Easy deployment with a Docker image.
- docker-compose: Run the API with Ollama using `docker-compose`.
- CI/CD: Automated testing and Docker image builds with GitHub Actions.
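How the fuzzy matching works is defined by the repository's code, which is not reproduced here; purely as a rough illustration, pulling a LaTeX environment out of model output that may be wrapped in markdown fences or extra prose could look something like this (the function name and regexes are hypothetical):

```python
import re

def extract_latex(response_text: str, environment: str = "table") -> str | None:
    """Best-effort extraction of a LaTeX environment from raw LLM output.

    Illustrative sketch only: the model may wrap the LaTeX in markdown code
    fences or surround it with prose, so try the environment first, then any
    fenced block, and finally fall back to the trimmed response.
    """
    # 1. Look for \begin{env} ... \end{env}, tolerating starred variants.
    env_match = re.search(
        rf"\\begin\{{{environment}\*?\}}.*?\\end\{{{environment}\*?\}}",
        response_text,
        flags=re.DOTALL,
    )
    if env_match:
        return env_match.group(0).strip()

    # 2. Fall back to the contents of a markdown code fence, if present.
    fence = "`" * 3
    fence_match = re.search(
        fence + r"(?:latex)?\s*(.*?)" + fence, response_text, flags=re.DOTALL
    )
    if fence_match:
        return fence_match.group(1).strip()

    # 3. Last resort: return the trimmed response as-is.
    return response_text.strip() or None
```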
To use this API, you must either have Ollama installed locally or have access to a running Ollama API host with multimodal LLM support (e.g., `llava:34b`).
- Ollama Installed Locally:
  - Download and install Ollama by following the instructions at https://ollama.com/.
  - Ensure the Ollama service is running and accessible at `http://localhost:11434` (the default).
- Accessible Ollama API Host:
  - If you are not running Ollama locally, set the `OLLAMA_API_HOST` environment variable to the accessible API host URL of a running Ollama instance.
    Example: `OLLAMA_API_HOST=http://<remote-host>:<port>`
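Either way, it can help to confirm that the configured host is reachable before starting the API. A minimal sketch using the requests library against Ollama's `/api/tags` model-listing endpoint (the environment variable matches the configuration above):

```python
import os
import requests

# Read the same variable the API uses; fall back to the local default.
host = os.getenv("OLLAMA_API_HOST", "http://localhost:11434")

try:
    # Ollama's /api/tags endpoint lists the locally available models.
    resp = requests.get(f"{host}/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    print(f"Ollama reachable at {host}, models: {models}")
    if not any(name.startswith("llava") for name in models):
        print("Note: no llava model found - run `ollama pull llava:34b` first.")
except requests.RequestException as exc:
    print(f"Could not reach Ollama at {host}: {exc}")
```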
Install the dependencies listed in `requirements.txt`:
python-dotenv==1.0.1
pylint==3.3.3
pytest==8.3.4
black==24.10.0
Flask==3.1.0
requests==2.32.3
pillow==11.1.0
pytesseract==0.3.13
ollama==0.4.5
flasgger==0.9.7.1
- Clone the repository:
  git clone https://github.com/yrangana/Image_to_LaTeX.git
  cd Image_to_LaTeX
- Create and activate a virtual environment:
  python -m venv venv
  source venv/bin/activate   (on Windows: venv\Scripts\activate)
- Install dependencies:
  pip install -r requirements.txt
  (or: make install)
- Configure environment variables in a `.env` file (see the sketch below for how they are read):
  API_PORT=5050
  OLLAMA_API_HOST=http://localhost:11434
  OLLAMA_MODEL=llava:34b
  UPLOAD_DIR=/tmp
  ALLOWED_EXTENSIONS=png,jpg,jpeg
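For reference, python-dotenv (already in `requirements.txt`) is the usual way such a `.env` file is loaded; the following is a minimal sketch of reading these variables and may differ from the actual code in `app.py`:

```python
import os
from dotenv import load_dotenv

# Load variables from the .env file into the process environment.
load_dotenv()

API_PORT = int(os.getenv("API_PORT", "5050"))
OLLAMA_API_HOST = os.getenv("OLLAMA_API_HOST", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llava:34b")
UPLOAD_DIR = os.getenv("UPLOAD_DIR", "/tmp")
ALLOWED_EXTENSIONS = set(os.getenv("ALLOWED_EXTENSIONS", "png,jpg,jpeg").split(","))
```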
Run the Flask API:
python app.py
or
make run
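Once the server is running, a quick way to confirm it responds is to call the health endpoint, for example from Python (the port assumes the `API_PORT` value shown above):

```python
import requests

# Simple liveness check against the locally running API.
resp = requests.get("http://127.0.0.1:5050/api/health", timeout=5)
print(resp.status_code, resp.json())
```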
Use the Makefile to run linters and tests:
make lint (Lint the codebase with pylint)
make test (Run tests with pytest)
make clean (Clean up temporary files)
- run: Starts the Flask server.
- lint: Runs pylint to check code quality.
- test: Runs tests using pytest.
- clean: Cleans up temporary files like `.pytest_cache`.
- `/api/generate`:
  - Method: POST
  - Description: Generates LaTeX code from an uploaded image.
  - Parameters:
    - file: The image file (required).
    - type: The content type (table, equation, or text).
    - model: The multimodal LLM model (optional; default: llava:34b).
- `/api/health`:
  - Method: GET
  - Description: Checks the API health status.
curl -X POST "http://127.0.0.1:5050/api/generate" \
-H "Content-Type: multipart/form-data" \
-F "[email protected]" \
-F "type=table"
-F "model=llava:34b"
{
"status": "success",
"data": {
"latex": "\\begin{table}[!t]\n\\centering\n\\caption{Example Table}\n\\label{tab:example}\n\\begin{tabular}{|c|c|c|}\n\\hline\nA & B & C \\\\\n\\hline\n1 & 2 & 3 \\\\\n4 & 5 & 6 \\\\\n\\hline\n\\end{tabular}\n\\end{table}",
"type": "table",
"model": "llava:34b"
}
}
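The same request can also be made from Python with the requests library. This is a sketch: `example.png` is a placeholder path, and the field names mirror the parameters listed above.

```python
import requests

url = "http://127.0.0.1:5050/api/generate"

# Send the image as multipart/form-data, mirroring the curl example above.
with open("example.png", "rb") as image_file:
    response = requests.post(
        url,
        files={"file": image_file},
        data={"type": "table", "model": "llava:34b"},
        timeout=120,  # multimodal generation can take a while
    )

response.raise_for_status()
print(response.json()["data"]["latex"])
```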
- Build the Docker image:
docker build -t image_to_latex .
- Run the Docker container:
docker run -p 5050:5050 -e OLLAMA_API_HOST=http://host.docker.internal:11434 --name image_to_latex image_to_latex
- Access the API at `http://localhost:5050`.
A `docker-compose.yml` file is provided to run the API together with Ollama; it contains the configuration needed to run both services side by side.
- Build the Docker images:
  docker-compose build
- Run the Docker containers:
  docker-compose up -d
  During the first run, Ollama will take some time to download the llava:34b model.
- Access the API at `http://localhost:5050`.
The API documentation is available at http://localhost:5050/apidocs.
A Streamlit web app is available to interact with the API and test it. To run the Streamlit app, use the following command:
streamlit run sreamlit_app/main.py
The Streamlit app will be available at http://localhost:8501.
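The app's source is not reproduced here, but a minimal Streamlit client along these lines illustrates the interaction with the API (names, layout, and defaults are illustrative only, not the repository's actual app):

```python
import requests
import streamlit as st

st.title("Image to LaTeX")

# Hypothetical defaults; adjust to match your running API instance.
api_url = st.text_input("API URL", "http://localhost:5050/api/generate")
content_type = st.selectbox("Content type", ["table", "equation", "text"])
uploaded = st.file_uploader("Upload an image", type=["png", "jpg", "jpeg"])

if uploaded is not None and st.button("Generate LaTeX"):
    with st.spinner("Generating..."):
        resp = requests.post(
            api_url,
            files={"file": (uploaded.name, uploaded.getvalue())},
            data={"type": content_type},
            timeout=300,
        )
    if resp.ok:
        st.code(resp.json()["data"]["latex"], language="latex")
    else:
        st.error(f"Request failed: {resp.status_code} {resp.text}")
```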
- Fork the repository.
- Create a new branch for your feature or bugfix:
git checkout -b feature/my-feature
- Commit your changes:
git commit -m "Add my new feature"
- Push the branch:
git push origin feature/my-feature
- Open a pull request.
- Framework: Built with Flask and flasgger for API documentation.
- LLM Support: Powered by Ollama for multimodal LaTeX generation.
- Streamlit: Web app for interactive testing and demonstration.