AgentOS is a lightweight, single-file implementation that provides a robust foundation for building autonomous AI agents. It implements the core concepts outlined in Karpathy's Agent OS architecture while maintaining simplicity and extensibility.

The-Swarm-Corporation/AgentOS

AgentOS

License: MIT | Python 3.8+ | Code style: black

A minimal, production-ready implementation of Andrej Karpathy's Agent Operating System architecture, developed by Swarms.ai and partners.

[Image: AgentOS Architecture]

Overview

AgentOS is a lightweight, single-file foundation for building autonomous AI agents. It implements the core concepts of Andrej Karpathy's Agent OS architecture while staying simple and extensible. It is developed by Swarms.ai and its partners as a production-ready implementation.

Features

  • Unified Model Interface: Seamless integration with multiple LLM providers through LiteLLM
    • Support for Anthropic Claude models (Opus, Sonnet, Haiku)
    • Integration with OpenAI GPT models
    • Access to optimized variants (GPT-4o, GPT-4o-mini)
  • Browser Automation: Built-in browser agent capabilities for web interaction using browser-use
  • Multi-Modal Support:
    • Text processing and generation
    • Video analysis through Google's Gemini models
    • Audio processing and speech synthesis
    • Image handling capabilities
  • Resource Management:
    • Efficient handling of computational resources
    • Dynamic model selection based on task requirements
    • Automatic GPU/CPU optimization
  • HuggingFace Integration:
    • Direct access to open-source models
    • Support for text generation and multiple NLP tasks
    • Automatic model quantization and optimization
  • Extensible Architecture: Easy to add new capabilities and tools
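
The "dynamic model selection based on task requirements" feature can be pictured as a small routing table. The sketch below is an illustration only, not the AgentOS implementation: `ModelChoice`, `ROUTES`, `select_model`, the task kinds, and the `claude-opus` and `gemini` labels are all hypothetical placeholders. Only `gpt-4o` and `gpt-4o-mini` are names taken from the feature list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    name: str    # model label (placeholder, not a verified provider identifier)
    reason: str  # short explanation of why this model was picked


# Hypothetical routing table: task kind -> model label.
ROUTES = {
    "video_analysis": "gemini",     # placeholder label
    "long_reasoning": "claude-opus",  # placeholder label
    "quick_chat": "gpt-4o-mini",
}
DEFAULT_MODEL = "gpt-4o"


def select_model(task_kind: str) -> ModelChoice:
    """Return the routed model for a task kind, or the default if unknown."""
    if task_kind in ROUTES:
        return ModelChoice(ROUTES[task_kind], f"routed for {task_kind}")
    return ModelChoice(DEFAULT_MODEL, "no specific route; using default")


choice = select_model("quick_chat")
print(choice.name, "-", choice.reason)
```

A table like this keeps the routing policy in data rather than in branching code, so adding a new task kind is a one-line change.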

Core Components

  • Model Management: Dynamic selection and utilization of language models
  • Browser Automation: Autonomous web-based task execution
  • Resource Orchestration: Efficient management of computational resources
  • Context Management: Maintains system state and task dependencies

Installation

pip3 install -U agentos-sdk

Usage

from agentos_sdk import AgentOS
from dotenv import load_dotenv

# Load environment variables (such as provider API keys) from a local .env file
load_dotenv()

agent = AgentOS(plan_on=False, max_loops=1)

# Describe the task in natural language
agent.run(
    "Generate a video of a cat surfing on a wave at sunset, cinematic style. "
    "Save it as 'cat_surfing.mp4'. We should also add cat sounds and meowing sounds."
)
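
Because the example above relies on `load_dotenv()` to supply credentials, it can help to fail early when a key is missing. The following is a small sketch under stated assumptions: `missing_env_vars` is a hypothetical helper, and the variable names in the usage lines are only examples of keys that LLM providers commonly expect. Which keys AgentOS actually needs depends on the tools and models you use.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    required: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Example key names only (assumption): adjust to the providers you use.
required_keys = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]
missing = missing_env_vars(required_keys)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Run this check after `load_dotenv()` and before constructing the agent so that a missing key surfaces as a clear message rather than a provider error mid-task.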

Available Tools

AgentOS ships with the following built-in tools:

| Tool Name | Description | Use Case Examples |
| --- | --- | --- |
| Browser Agent | Autonomous web browser automation tool that can navigate websites, extract information, and perform web-based tasks | Web scraping, form filling, data extraction, website testing |
| Hugging Face Model | Interface for using various Hugging Face models for text generation and other NLP tasks | Text generation, language translation, text classification, custom model inference |
| LiteLLM Model | Unified interface for multiple LLM providers including OpenAI, Anthropic, and others | Text generation, chat completion, content creation, advanced reasoning |
| Safe Calculator | Secure mathematical expression evaluator with built-in safety checks | Mathematical calculations, formula evaluation, secure computation, numeric processing |
| Terminal Developer Agent | Advanced agent for performing terminal operations and development tasks | File operations, code execution, system commands, development tasks |
| Generate Speech | Text-to-speech conversion tool supporting multiple voices and models | Audio content creation, voice synthesis, accessibility features, audio narration |
| Generate Video | AI-powered video generation tool using Google's Veo 3.0 model | Video content creation, visual storytelling, animation generation, creative content |
| Create Files | Tool for creating new files in the workspace with specified content | Document creation, code file generation, report writing, configuration files |
| Update Files | Tool for updating existing files by overwriting their content | Content modification, file updates, document revisions, configuration changes |
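
As an illustration of what a "secure mathematical expression evaluator with built-in safety checks" can look like, the sketch below parses an expression with Python's `ast` module and evaluates only whitelisted arithmetic nodes. This is a common technique, not necessarily how the Safe Calculator tool in AgentOS is written. `safe_eval`, `_eval`, and `MAX_EXPONENT` are hypothetical names.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}
MAX_EXPONENT = 100  # arbitrary guard against huge powers (assumption)


def _eval(node: ast.AST) -> Number:
    # Plain numeric literals only (bool is excluded explicitly).
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        left, right = _eval(node.left), _eval(node.right)
        if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
            raise ValueError("exponent too large")
        return _BINARY[type(node.op)](left, right)
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_eval(node.operand))
    # Names, calls, attributes, and everything else land here.
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def safe_eval(expression: str) -> Number:
    """Evaluate a basic arithmetic expression without calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)


print(safe_eval("2 + 3 * 4"))
```

Since function calls and names are never evaluated, an input such as `__import__('os')` is rejected with a `ValueError` instead of being executed.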

Community

Join our community of agent engineers and researchers for technical support, cutting-edge updates, and exclusive access to world-class agent engineering insights!

| Platform | Description | Link |
| --- | --- | --- |
| 📚 Documentation | Official documentation and guides | docs.swarms.world |
| 📝 Blog | Latest updates and technical articles | Medium |
| 💬 Discord | Live chat and community support | Join Discord |
| 🐦 Twitter | Latest news and announcements | @kyegomez |
| 👥 LinkedIn | Professional network and updates | The Swarm Corporation |
| 📺 YouTube | Tutorials and demos | Swarms Channel |
| 🎫 Events | Join our community events | Sign up here |
| 🚀 Onboarding Session | Get onboarded with Kye Gomez, creator and lead maintainer of Swarms | Book Session |

Contributing

We welcome contributions from the community. Please see our contributing guidelines for more information.

License

This project is licensed under the MIT License.

Todo

  • Add a deep research agent or sub-agent
  • Implement video and audio processing
  • Create a better system prompt and add multi-shot examples showing when to use each tool
