ChatMe 🤖

A modern, cross-platform AI chat application built with Tauri, React, and TypeScript. ChatMe supports multiple AI providers through a responsive interface and adds voice interaction and a powerful agent mode with full system access.

🆕 What's New in v0.4.0

🚀 Enhanced Agent Mode: Execute terminal commands, launch apps, and manage processes
📊 Command Execution Cards: Interactive, expandable results with timing and copy buttons
⌨️ Keyboard Shortcuts: Ctrl+K to focus, Ctrl+R to repeat last command
🎨 Improved UI: Better dark mode support, polished settings page
🔒 Permission System: Safety checks for dangerous operations
💬 Natural AI Flow: AI explains, executes, and continues in one response

🚀 Key Highlights

🤖 Powerful Agent Mode: Full system access with terminal commands and app control
🎤 Voice Interaction: Speech-to-text input and text-to-speech output with customizable voices
🎯 Multi-Provider Support: OpenAI, Google Gemini, Claude, Ollama, and custom APIs
📱 Mobile-Optimized: Responsive design that works beautifully on all devices
🎨 Modern UI: Custom title bar, dark/light themes, and smooth animations
⚡ Real-time Features: Live streaming responses and interactive command execution

📱 Cross-Platform Features

Touch-Friendly Controls: Large tap targets for accessibility
Adaptive Layout: Auto-resizing sidebar and components for different screen sizes
Image Upload: File selection with drag-and-drop and preview functionality
Mobile-Optimized: UI components that work well on tablets and small screens
Gesture Support: Smooth scrolling and intuitive interactions

🌟 Platform Support

🖥️ Desktop: Windows, macOS, Linux
🌐 Web: Progressive Web App capabilities
📱 Mobile: UI is responsive and mobile-friendly (native mobile apps coming soon)

📸 Screenshots

[Screenshot: Light Theme]

[Screenshot: Dark Theme]

✨ Features

🎯 Multi-Provider AI Support

OpenAI: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo and many more
Google Gemini: Gemini Pro, Gemini 1.5 Flash/Pro and many more (both original and OpenAI-compatible APIs)
Anthropic Claude: Claude 3 models
Ollama: Local models (Llama 2, CodeLlama, Mistral, etc.)
Custom APIs: Support for Mistral AI, Groq, Together AI, Perplexity, and any OpenAI-compatible API

🤖 Enhanced Agent Mode (NEW!)

Full System Access: Execute terminal commands, launch applications, and manage files
Natural Conversation Flow: AI explains, executes, and continues conversation in a single response
Interactive Command Cards: Expandable command execution results with timing and copy buttons
Terminal Command Execution: Run npm, git, python, and any shell commands
Application Management: Launch and manage system applications
Process Control: List and terminate running processes
File Operations: Copy, move, delete, rename files and folders
Permission System: Safety checks for dangerous operations
Smart Context Awareness: AI understands your working directory and project context
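
The list above mentions safety checks for dangerous operations but does not say which rules are applied. As a purely illustrative sketch, a check of this kind could match a command against a deny-list before running it. The `RISKY_PATTERNS` list and the `needsConfirmation` helper are hypothetical and are not taken from ChatMe's source.

```typescript
// Hypothetical deny-list; ChatMe's real permission rules are not documented here.
const RISKY_PATTERNS: RegExp[] = [
  /\brm\s+-[a-z]*[rf]/i,        // rm with -r or -f flags
  /\bsudo\b/i,                  // privilege escalation
  /\bmkfs\b/i,                  // filesystem formatting
  /\b(shutdown|reboot)\b/i,     // power state changes
];

// Returns true when the command matches any risky pattern and
// should be confirmed by the user before execution.
function needsConfirmation(command: string): boolean {
  return RISKY_PATTERNS.some((pattern) => pattern.test(command));
}

console.log(needsConfirmation("rm -rf ./build")); // true
console.log(needsConfirmation("npm install"));    // false
```

A pattern list like this is easy to audit but also easy to bypass, so it should be read as a prompt for user confirmation rather than a security boundary.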

🎤 Speech Features

Speech-to-Text: Click the microphone button to dictate messages using voice input
Text-to-Speech: AI responses can be read aloud with natural-sounding voices
Auto-Speak Mode: Automatically read AI responses when they arrive
Voice Customization: Choose from multiple system voices with adjustable rate, pitch, and volume
Real-time Transcription: See your speech being converted to text in real-time
Browser Integration: Uses Web Speech API for cross-platform compatibility

🎨 Modern Interface

Custom Title Bar: Integrated window controls with drag functionality (desktop)
Mobile-First Design: Touch-optimized interface for mobile devices
Responsive Layout: Auto-adapting sidebar and layout for all screen sizes
Dark/Light Theme: System-aware theme switching
Agent Toggle: Quick agent mode toggle with visual indicator in the title bar
Smooth Animations: Polished user experience with transitions

📱 Mobile Features

Touch-Optimized Controls: Minimum 44px touch targets for accessibility
Mobile Sidebar: Auto-collapsing navigation to save screen space
Image Upload: Camera and gallery integration for image uploads
Gesture Support: Smooth scrolling and touch interactions
Mobile Keyboard: Optimized input handling for mobile keyboards
Responsive Images: Properly scaled image previews and displays

💬 Advanced Chat Features

Real-time Streaming: Live message streaming for supported providers
Image Analysis: Upload and analyze images with AI vision models
AI Thinking Display: Collapsible sections showing AI reasoning process
Message Management: Copy, export, and share conversations
Chat History: Persistent chat storage with SQLite
Markdown Support: Rich text rendering with syntax highlighting
Custom Components: Enhanced rendering for file listings and command executions
Scroll Management: Intelligent auto-scrolling and manual scroll control
Keyboard Shortcuts: Ctrl+K (focus input), Ctrl+R (repeat last command), Enter (send)

⚙️ Configuration & Settings

Provider Management: Easy setup and switching between AI providers
API Testing: Built-in connection testing with CORS-aware error handling
Model Selection: Support for different models per provider
Parameter Tuning: Temperature, max tokens, and other settings
Speech Settings: Configure voice input/output preferences and voice selection
Agent Configuration: Set working directories and enable/disable agent mode
Default Configurations: Set preferred providers for new chats

🎙️ Speech Features Guide

Voice Input (Speech-to-Text)

Click the microphone button in the input box
Grant permission when prompted by your browser
Start speaking - see real-time transcription
Click again to stop recording
Edit if needed and send your message

AI Voice Output (Text-to-Speech)

Click the speaker icon next to any AI message
Listen to the response being read aloud
Stop playback by clicking the stop button
Enable auto-speak in settings for automatic reading

Speech Settings Configuration

Speech Recognition: Toggle voice input on/off
Auto-Speak Responses: Automatically read AI responses
Voice Selection: Choose from available system voices
Speech Rate: Adjust reading speed (0.5x to 2x)
Speech Pitch: Control voice pitch (0 to 2)
Volume Control: Set playback volume (0% to 100%)
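
The ranges above map closely onto the `rate`, `pitch`, and `volume` properties of the Web Speech API's `SpeechSynthesisUtterance`, except that the API expects volume as a value from 0 to 1. The sketch below shows one way to clamp and convert the settings; the `VoiceSettings` shape and the `toUtteranceParams` helper are assumptions for illustration, not ChatMe's actual settings schema.

```typescript
// Hypothetical settings shape; field names are illustrative only.
interface VoiceSettings {
  rate: number;          // reading speed, 0.5 to 2
  pitch: number;         // voice pitch, 0 to 2
  volumePercent: number; // playback volume, 0 to 100
}

// Clamp a number into the inclusive range [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Convert UI settings into values suitable for SpeechSynthesisUtterance.
// The percentage is divided by 100 because the API takes volume as 0..1.
function toUtteranceParams(s: VoiceSettings): { rate: number; pitch: number; volume: number } {
  return {
    rate: clamp(s.rate, 0.5, 2),
    pitch: clamp(s.pitch, 0, 2),
    volume: clamp(s.volumePercent, 0, 100) / 100,
  };
}

const utteranceParams = toUtteranceParams({ rate: 3, pitch: 1, volumePercent: 80 });
console.log(utteranceParams); // { rate: 2, pitch: 1, volume: 0.8 }
```

In a browser, the result could be copied onto a `new SpeechSynthesisUtterance(text)` before passing it to `speechSynthesis.speak`.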

🤖 Agent Mode Guide

Enabling Agent Mode

Toggle the agent switch in the settings or title bar
Set working directory (optional) for file operations
Ask in natural language and the AI will run the matching commands
Watch commands execute in real-time with results

Agent Capabilities

Terminal Commands: "Run npm install" → Executes the command
Application Launch: "Open Chrome browser" → Launches Chrome
File Operations: "Copy this file to backup folder" → Performs file operation
Process Management: "Show running processes" → Lists all processes
Git Operations: "Check git status" → Runs git commands
Build & Deploy: "Build the project" → Executes build scripts
System Info: "Check Node version" → Runs system commands

Command Execution Display

Expandable Cards: Click to view command output
Execution Time: See how long commands took
Copy Button: Copy commands with one click
Status Indicators: Success/error/running states
Working Directory: Shows where commands executed
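
The fields shown on a command card suggest a small record per executed command. The sketch below models such a record and formats a one-line summary. The `CommandCard` type, its field names, and the helper functions are hypothetical and only mirror the items listed above.

```typescript
// Status values mirror the success/error/running indicators listed above.
type RunState = "running" | "success" | "error";

// Hypothetical record behind one command card.
interface CommandCard {
  command: string;
  workingDirectory: string;
  state: RunState;
  durationMs: number | null; // null while the command is still running
  output: string;
}

// Render a duration as milliseconds below one second, otherwise as seconds.
function formatDuration(ms: number): string {
  if (ms < 1000) return `${Math.round(ms)} ms`;
  return `${(ms / 1000).toFixed(2)} s`;
}

// Build the collapsed, single-line view of a card.
function summarizeCard(card: CommandCard): string {
  const time = card.durationMs === null ? "running" : formatDuration(card.durationMs);
  return `[${card.state}] ${card.command} (${time}) in ${card.workingDirectory}`;
}

const exampleCard: CommandCard = {
  command: "npm --version",
  workingDirectory: "/home/user/project",
  state: "success",
  durationMs: 1234,
  output: "10.2.5",
};
console.log(summarizeCard(exampleCard));
// [success] npm --version (1.23 s) in /home/user/project
```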

Keyboard Shortcuts

Ctrl+K / Cmd+K: Focus the input box from anywhere
Ctrl+R / Cmd+R: Repeat the last command
Enter: Send message
Shift+Enter: New line in message
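
As a minimal sketch of how these bindings could be resolved, the function below maps a key press to an action name. It treats Ctrl and Cmd as equivalent modifiers, as the list above does. The `KeyPress` shape is a reduced stand-in for a browser `KeyboardEvent`, and the action names are assumptions made for this example.

```typescript
// Minimal subset of a keyboard event, so the sketch runs outside a browser.
interface KeyPress {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
  shiftKey: boolean;
}

// Hypothetical action names for the shortcuts listed above.
type ShortcutAction =
  | "focus-input"
  | "repeat-last-command"
  | "send-message"
  | "insert-newline"
  | null;

function resolveShortcut(press: KeyPress): ShortcutAction {
  const modifier = press.ctrlKey || press.metaKey; // Ctrl on Windows/Linux, Cmd on macOS
  const key = press.key.toLowerCase();
  if (modifier && key === "k") return "focus-input";
  if (modifier && key === "r") return "repeat-last-command";
  if (key === "enter") return press.shiftKey ? "insert-newline" : "send-message";
  return null; // not a shortcut; let the input handle it
}

console.log(resolveShortcut({ key: "k", ctrlKey: true, metaKey: false, shiftKey: false }));
// focus-input
```

In a real handler, a non-null result would typically be paired with `event.preventDefault()` so that, for example, Ctrl+R does not reload the window.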

Example Agent Mode Interactions

User: "Check if npm is installed and show me the version"
AI: "I'll check if npm is installed and show you the version. [Executes: npm --version] ✅ npm version 10.2.5 is installed. This is a recent version compatible with most modern Node.js projects."

User: "Run the development server"
AI: "I'll start the development server for you. [Executes: npm run dev] The server is now running on http://localhost:1420. You can open this in your browser to see your application."

User: "Show me all TypeScript files in the src folder"
AI: "I'll search for all TypeScript files in the src directory. [Executes: search for *.ts and *.tsx files] Found 42 TypeScript files in your src folder. The main components are in src/components/ and pages are in src/pages/."

User: "Open Chrome and go to GitHub"
AI: "I'll launch Chrome and navigate to GitHub for you. [Executes: launch Chrome with https://github.com] Chrome has been opened with GitHub loaded. You can now browse your repositories."

🚀 Getting Started

📦 Downloads

Desktop Applications

Windows: Download .exe installer from Releases
macOS: Download .dmg file (Intel and Apple Silicon supported)
Linux: Download .deb package or .AppImage

Mobile Support

Responsive Web UI: The application works well on mobile browsers
Native Mobile Apps: Coming soon - currently in development

Prerequisites

Node.js (v18 or higher)
Rust (latest stable)
pnpm/npm/yarn (package manager)

Installation

Clone the repository

```
git clone https://github.com/Amanbig/ChatMe.git
cd ChatMe
```

Install dependencies
```
npm install
```
Start development server
```
npm run tauri dev
```
Build for production
```
npm run tauri build
```

🌐 Testing on Mobile Devices

While native mobile apps are in development, you can test the responsive UI:

Browser dev tools: Use mobile device simulation
Local network: Access the dev server from mobile browsers
Responsive testing: Resize desktop window to test different screen sizes

🔧 Configuration

Setting up AI Providers

Launch ChatMe and navigate to Settings
Select a Provider from the available cards
Configure the API:
- API Key: Your provider's API key
- Base URL: Custom endpoint (optional for most providers)
- Model: Choose from available models
- Parameters: Adjust temperature, max tokens, etc.
Test Connection to verify setup
Set as Default (optional)

Provider-Specific Setup

OpenAI

URL: https://api.openai.com/v1/chat/completions
Models: gpt-4, gpt-4-turbo, gpt-3.5-turbo

Google Gemini

OpenAI-Compatible: https://generativelanguage.googleapis.com/v1beta/openai/chat/completions
Original API: https://generativelanguage.googleapis.com/v1beta
Models: gemini-pro, gemini-1.5-flash, gemini-1.5-pro

Mistral AI (Custom)

URL: https://api.mistral.ai/v1/chat/completions
Models: mistral-large-latest, mixtral-8x7b-instruct

Ollama (Local)

URL: http://localhost:11434
Models: Any locally installed Ollama model
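
For the OpenAI-compatible endpoints listed above (OpenAI, the Gemini OpenAI-compatible URL, and Mistral AI), a request can be assembled from the same fields the settings page asks for. The following is a sketch under stated assumptions: the `ProviderSetup` and `ChatTurn` types, the `buildChatRequest` helper, and the default temperature and token limit are all illustrative and do not describe ChatMe's internal client. Ollama's native API uses a different request shape and is not covered here.

```typescript
// Hypothetical provider record mirroring the settings fields above.
interface ProviderSetup {
  label: string;
  url: string;          // full chat-completions endpoint
  model: string;
  apiKey: string;
  temperature?: number; // assumed default: 0.7
  maxTokens?: number;   // assumed default: 1024
}

interface ChatTurn {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build an OpenAI-style chat-completions request without sending it.
function buildChatRequest(provider: ProviderSetup, turns: ChatTurn[]) {
  return {
    url: provider.url,
    method: "POST" as const,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${provider.apiKey}`,
    },
    body: JSON.stringify({
      model: provider.model,
      messages: turns,
      temperature: provider.temperature ?? 0.7,
      max_tokens: provider.maxTokens ?? 1024,
    }),
  };
}

const openAiSetup: ProviderSetup = {
  label: "OpenAI",
  url: "https://api.openai.com/v1/chat/completions",
  model: "gpt-4",
  apiKey: "sk-example", // placeholder, not a real key
};
const chatRequest = buildChatRequest(openAiSetup, [{ role: "user", content: "Hello" }]);
console.log(chatRequest.url, chatRequest.headers.Authorization);
```

The returned object can be passed to `fetch(chatRequest.url, { method: chatRequest.method, headers: chatRequest.headers, body: chatRequest.body })`. Keeping construction separate from sending makes the request easy to inspect when a connection test fails.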

🛠️ Tech Stack

Frontend

Tauri: Cross-platform desktop framework
React 18: Modern React with hooks and context
TypeScript: Type-safe development
Vite: Lightning-fast build tool
shadcn/ui: Modern component library
Tailwind CSS: Utility-first styling
React Router: Client-side routing
Sonner: Toast notifications
React Markdown: Rich text rendering with custom components

Backend

Rust: High-performance backend
SQLite: Local database storage with migrations
Reqwest: HTTP client for API calls
Tauri Commands: File system operations and agent capabilities
Serde: JSON serialization
Tokio: Async runtime

Features

Real-time Events: Tauri event system for streaming
Speech Integration: Web Speech API for voice input/output
Agent System: Intelligent file operations with LLM-driven commands
Streaming Support: Server-sent events simulation for real-time responses
Error Handling: Comprehensive error management with user-friendly messages
Theme System: Dark/light mode switching with system preference detection
Custom Rendering: Enhanced markdown with interactive file components

📁 Project Structure

```
ChatMe/
├── src/                    # Frontend React code
│   ├── components/         # Reusable UI components
│   │   ├── ui/             # shadcn/ui components
│   │   └── app/            # Application-specific components
│   ├── pages/              # Page components
│   ├── lib/                # Utilities and API client
│   └── assets/             # Static assets
├── src-tauri/              # Rust backend
│   ├── src/                # Rust source code
│   │   ├── commands.rs     # Tauri commands
│   │   ├── database.rs     # Database operations
│   │   └── models.rs       # Data models
│   └── Cargo.toml          # Rust dependencies
├── public/                 # Public assets
└── package.json            # Node.js dependencies
```

🚀 Building & Distribution

Development

```
npm run dev          # Start dev server
npm run build        # Build frontend
cargo tauri dev      # Run Tauri development
```

Production

```
npm run build              # Build frontend
cargo tauri build          # Build desktop app
cargo tauri build --debug  # Build with debug info
```

Version Management

ChatMe uses semantic versioning (SemVer) across all components. Version updates are automated:

```
# Update version manually
npm run update-version 0.4.0

# Or use semantic version bumps
npm run version:patch    # 0.3.0 → 0.3.1
npm run version:minor    # 0.3.0 → 0.4.0
npm run version:major    # 0.3.0 → 1.0.0
```

Version files updated automatically:

package.json and package-lock.json
src-tauri/Cargo.toml and Cargo.lock
src-tauri/tauri.conf.json
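
The patch, minor, and major bumps follow the usual SemVer arithmetic: the bumped component increases by one and every component to its right resets to zero. The sketch below shows that rule in isolation. `bumpVersion` is a hypothetical helper written for this example and is not the project's `update-version.js` script.

```typescript
type BumpKind = "patch" | "minor" | "major";

// Apply a SemVer bump to a plain MAJOR.MINOR.PATCH string.
// Pre-release and build metadata suffixes are not handled in this sketch.
function bumpVersion(version: string, kind: BumpKind): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Not a MAJOR.MINOR.PATCH version: ${version}`);
  const major = Number(match[1]);
  const minor = Number(match[2]);
  const patch = Number(match[3]);
  switch (kind) {
    case "major":
      return `${major + 1}.0.0`;
    case "minor":
      return `${major}.${minor + 1}.0`;
    case "patch":
      return `${major}.${minor}.${patch + 1}`;
  }
}

console.log(bumpVersion("0.3.0", "patch")); // 0.3.1
console.log(bumpVersion("0.3.0", "minor")); // 0.4.0
console.log(bumpVersion("0.3.0", "major")); // 1.0.0
```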

Release Process:

Update version using scripts above
Commit changes: git commit -m "bump version to x.x.x"
Push to main: git push
GitHub Actions will automatically build the app and create a release

Platform Support

Windows: Native .exe and .msi installers
macOS: .app bundle and .dmg installer (Intel and Apple Silicon)
Linux: .deb, .rpm, and AppImage formats
Cross-platform: Responsive UI that works on various screen sizes

🤝 Contributing

We welcome contributions! Please follow these steps:

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Commit changes: git commit -m 'Add amazing feature'
Push to branch: git push origin feature/amazing-feature
Open a Pull Request

Development Guidelines

Use TypeScript for type safety
Follow the existing code style
Add tests for new features
Update documentation as needed

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Tauri Team: For the amazing desktop framework
shadcn: For the beautiful UI components
Vercel: For the inspiration and design patterns
OpenAI, Anthropic, Google: For providing excellent AI APIs

📞 Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Email: [email protected]

Made with ❤️ by Amanbig
