This project fetches the content of any website, cleans it up, and generates a short, human-readable summary using OpenAI’s GPT models. It removes navigation noise like ads, scripts, and styles so the summary focuses only on meaningful text.
- Scrapes a webpage using BeautifulSoup
- Cleans the page content by removing unnecessary tags (scripts, styles, inputs, etc.)
- Builds a custom prompt for OpenAI with the cleaned text
- Calls the OpenAI Chat Completions API to generate a summary in Markdown format
- Outputs a concise overview of the website (including announcements or news if present)
The internet is full of lengthy articles, blogs, and announcements. Reading through them can be time-consuming.
This project solves that problem by:
- Quickly summarizing content so you understand the core message without scrolling endlessly
- Making it easier to scan multiple sources at once
- Helping with productivity and research (ideal for students, writers, data engineers, or anyone needing quick insights)
In short, it’s like having a personal research assistant that condenses webpages into bite-sized notes.
- Python 3.9+
- Requests – fetch website content
- BeautifulSoup (bs4) – parse and clean HTML
- dotenv – manage API keys securely
- OpenAI Python SDK – generate AI-powered summaries