Chat with local LLMs right inside your browser.
Privacy-first extension for local AI chat with multi-provider support. Connect Ollama, LM Studio, llama.cpp, or OpenAI-compatible servers (vLLM, LocalAI, KoboldCPP) and run conversations directly in your browser with provider-aware routing and local-first workflows.
*You control the endpoint. If you configure a remote provider, your data is sent to that provider.
Works on Chrome, Brave, Edge, and Opera, with experimental Firefox support.
Powerful, local-first capabilities
Everything you need for productive AI conversations, with full control over provider endpoints.
Multi-Provider Local Models
- Ollama, LM Studio, llama.cpp, vLLM, LocalAI, KoboldCPP
- Provider-aware model selection + capability detection
- Streaming responses + stop control
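The streaming + stop control flow above can be sketched against an OpenAI-compatible endpoint. This is a minimal illustration under stated assumptions, not the extension's actual code: the base URL, model id, and the line-by-line SSE parsing shortcut are placeholders.

```ts
// Minimal streaming sketch against an OpenAI-compatible server
// (LM Studio's default base URL http://localhost:1234 -- an assumption).
const controller = new AbortController();

async function streamChat(prompt: string, onToken: (t: string) => void) {
  const res = await fetch("http://localhost:1234/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model", // placeholder id; list real models from the provider
      messages: [{ role: "user", content: prompt }],
      stream: true, // server answers with server-sent-event chunks
    }),
    signal: controller.signal, // a Stop button calls controller.abort()
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each chunk carries lines like: data: {"choices":[{"delta":{"content":"..."}}]}
    // (a production client buffers partial lines across chunks).
    for (const line of decoder.decode(value, { stream: true }).split("\n")) {
      const payload = line.replace(/^data:\s*/, "").trim();
      if (!payload || payload === "[DONE]") continue;
      const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
      if (delta) onToken(delta);
    }
  }
}
```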
Multi-Chat Sessions
- Multiple concurrent chats
- Session persistence + history
- Tab refresh awareness
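One way session persistence across tab refreshes can work is the extension storage API. A minimal sketch, assuming a hypothetical session shape and the promise-based `chrome.storage.local` available in Manifest V3:

```ts
// Sketch: persist chat sessions in chrome.storage.local so they
// survive tab refreshes. The ChatSession shape is hypothetical.
interface ChatSession {
  id: string;
  title: string;
  messages: { role: "user" | "assistant"; content: string }[];
  updatedAt: number;
}

async function saveSession(session: ChatSession): Promise<void> {
  const { sessions = {} } = await chrome.storage.local.get("sessions");
  sessions[session.id] = session; // upsert by id
  await chrome.storage.local.set({ sessions });
}

async function loadSessions(): Promise<ChatSession[]> {
  const { sessions = {} } = await chrome.storage.local.get("sessions");
  return Object.values(sessions); // render into the session list UI
}
```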
Enhanced Content Extraction
- Defuddle integration for readable-content parsing
- Smart scroll strategies
- GitHub & YouTube extraction helpers
Advanced Text-to-Speech
- Language grouping + search
- Custom text preview
- Cross-browser voice loading
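Cross-browser voice loading works around a real quirk: Chromium populates `speechSynthesis.getVoices()` asynchronously, while Firefox usually returns voices on the first call. A minimal wrapper:

```ts
// Sketch: resolve once voices are actually available. On Chromium the
// first getVoices() call is often empty until "voiceschanged" fires.
function loadVoices(): Promise<SpeechSynthesisVoice[]> {
  return new Promise((resolve) => {
    const voices = speechSynthesis.getVoices();
    if (voices.length > 0) return resolve(voices);
    speechSynthesis.addEventListener(
      "voiceschanged",
      () => resolve(speechSynthesis.getVoices()),
      { once: true },
    );
  });
}

// Group by base language ("en-US" -> "en") for the language picker.
function groupByLanguage(voices: SpeechSynthesisVoice[]) {
  const groups = new Map<string, SpeechSynthesisVoice[]>();
  for (const v of voices) {
    const lang = v.lang.split("-")[0];
    groups.set(lang, [...(groups.get(lang) ?? []), v]);
  }
  return groups;
}
```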
Prompt Templates
- Summarize content
- Translate text
- Custom templates
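A custom template can be as simple as a string with placeholders. The `{{name}}` syntax below is an assumption for illustration, not necessarily the extension's actual format:

```ts
// Sketch: built-in and custom templates share one placeholder convention.
const templates: Record<string, string> = {
  summarize: "Summarize the following content in three bullet points:\n\n{{content}}",
  translate: "Translate the following text to {{language}}:\n\n{{content}}",
};

function fillTemplate(template: string, vars: Record<string, string>): string {
  // Replace every {{key}} with its value; unknown keys become empty strings.
  return template.replace(/\{\{(\w+)\}\}/g, (_, key: string) => vars[key] ?? "");
}

// fillTemplate(templates.translate, { language: "German", content: pageText });
```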
File Upload & Processing
- PDF, TXT, MD, DOCX support
- Background embedding progress
- Configurable max file size
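The upload path can be sketched as two small pieces: a size check against the configurable limit, and progress messages from the background worker. The message type and the limit's source are hypothetical:

```ts
// Sketch: reject oversized files up front, then let the background worker
// report embedding progress to the popup.
const maxFileSizeMb = 10; // would be read from user settings

function validateUpload(file: File): void {
  if (file.size > maxFileSizeMb * 1024 * 1024) {
    throw new Error(`${file.name} exceeds the ${maxFileSizeMb} MB limit`);
  }
}

// In the background worker, after each embedded chunk:
function reportEmbeddingProgress(done: number, total: number): void {
  chrome.runtime.sendMessage({
    type: "embedding-progress", // hypothetical message name
    percent: Math.round((done / total) * 100),
  });
}
```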
Backup & Restore
- Full backup/export flows
- Restore with partial failure reporting
- Versioned manifests
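A versioned manifest plus restore-with-partial-failure-reporting might look like the sketch below; all field names and the per-item `restoreSession` step are hypothetical:

```ts
// Sketch: restore collects per-item failures instead of aborting on the
// first error, so one corrupt session does not block the rest.
interface BackupManifest {
  version: number;    // bumped on format changes; checked before restoring
  exportedAt: string; // ISO-8601 timestamp
  sessions: unknown[];
  settings: unknown;
}

declare function restoreSession(session: unknown): Promise<void>; // hypothetical

async function restore(manifest: BackupManifest) {
  const failures: { index: number; reason: string }[] = [];
  for (const [index, session] of manifest.sessions.entries()) {
    try {
      await restoreSession(session);
    } catch (err) {
      failures.push({ index, reason: String(err) }); // record and keep going
    }
  }
  return { restored: manifest.sessions.length - failures.length, failures };
}
```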
Advanced Configuration
- Provider base URL control
- Selected model ref + capability checks
- Debug logging
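A capability check against a configured base URL can be as small as one request. `GET /api/tags` is Ollama's model-listing endpoint; the config shape here is an assumption:

```ts
// Sketch: verify the configured base URL answers and that the selected
// model is actually installed.
interface ProviderConfig {
  baseUrl: string;       // e.g. "http://localhost:11434" for a default Ollama
  selectedModel: string; // e.g. "llama3.2"
}

async function checkOllamaModel(config: ProviderConfig): Promise<boolean> {
  const res = await fetch(`${config.baseUrl}/api/tags`);
  if (!res.ok) return false;
  const { models } = (await res.json()) as { models: { name: string }[] };
  // Ollama names include the tag ("llama3.2:latest"), so match the prefix.
  return models.some((m) => m.name.startsWith(config.selectedModel));
}
```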
Search & Export
- Result grouping + scoped tabs
- Print/PDF export entrypoint
- Local-only processing
Privacy First
- No mandatory cloud APIs
- Local data storage
- Open source code
Built with modern technologies
Optimized for performance, reliability, and extensibility.
Get started in 3 easy steps
A quick setup path to start chatting with local models.
Stay connected
© 2026 Ollama Client. Open source • Privacy-first • Multi-provider local AI