A web application that uses Retrieval-Augmented Generation (RAG) to analyze research papers and extract key insights or identify research gaps.
- PDF Analysis: Upload any research paper in PDF format
- Key Insights Extraction: Automatically identify and summarize the main findings
- Research Gaps Detection: Highlight limitations and areas for future research
- User-Friendly Interface: Simple web UI powered by Gradio
- API Access: FastAPI backend allows programmatic access
- FastAPI: Modern, high-performance web framework
- Gradio: Simple UI for machine learning models
- LangChain: Framework for LLM applications
- Azure OpenAI: Hosted large language model used to generate the analysis
- FAISS: Vector similarity search for document retrieval
- HuggingFace Embeddings: Sentence transformers for text representation
- Python 3.9+
- Azure OpenAI API access
- Clone the repository:
    git clone https://github.com/your-username/AI-Research-Analyzer.git
    cd AI-Research-Analyzer
- Install dependencies:
    pip install -r requirements.txt
- Set up environment variables:
    Create a .env file in the project root (copy the repository's example file if one is provided), then edit .env with your Azure OpenAI API key and endpoint.
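Before starting the app, it can help to confirm that the .env file actually contains the Azure OpenAI settings. The following is a minimal sketch, not part of the project: the key names `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` are assumptions (the project may use different names), and `parse_env_text` and `missing_keys` are hypothetical helpers written for illustration.

```python
# Assumed variable names; adjust to whatever keys the project's .env uses.
REQUIRED_KEYS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(parse_env_text(open(".env").read()))` returns an empty list when both assumed settings are filled in.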
Run the application with:
    python -m app.main
The application will be available at http://127.0.0.1:8000.
- Access the web interface at http://127.0.0.1:8000
- Upload a research paper in PDF format
- Select analysis type: "Key Insights" or "Research Gaps"
- View the generated analysis
The application uses a Retrieval-Augmented Generation (RAG) architecture:
- Document Processing: PDFs are loaded and split into manageable chunks
- Vector Embedding: Text chunks are converted to vector embeddings with HuggingFace sentence transformers and indexed in FAISS
- Retrieval: When a query is made, the chunks most similar to it are retrieved from the FAISS index
- Generation: Retrieved content is sent to the LLM with a specialized prompt
- Response: A structured analysis is returned to the user
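The steps above can be sketched with the standard library alone. This is an illustration of the flow, not the project's implementation: word-count vectors stand in for the HuggingFace embeddings, a linear scan stands in for the FAISS index, the LLM call is left out, and every function name and default value here is hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts, standing in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query (linear scan)."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(analysis_type: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        f"You are analyzing a research paper. Task: {analysis_type}.\n"
        f"Use only the excerpts below.\n\n{context}"
    )
```

A typical call chain would be `build_prompt("Research Gaps", retrieve("limitations and future work", split_into_chunks(paper_text)))`, with the resulting prompt passed to the language model. In a real deployment the embedding model and vector index determine retrieval quality, so this sketch only shows how the pieces connect.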
- Support for additional document formats (DOCX, TXT)
- Batch processing of multiple papers
- More analysis types (methodology critique, literature comparison)
- Visualization of document relationships and key concepts
This project is licensed under the MIT License - see the LICENSE file for details.

