Chat with Local LLM Models
Right in Your Browser

Privacy-first extension for local AI chat with multi-provider support. Connect Ollama, LM Studio, or llama.cpp and run conversations directly in your browser. Includes enhanced content extraction, semantic retrieval, and advanced text-to-speech workflows with local-first data handling.

🚀 Install Extension 💻 View Source

Local-First Privacy Model

∞ AI Models

0* Mandatory Cloud APIs

*You control the endpoint. If you configure a remote provider, your data will go to that provider.

🌐 Chrome · 🦁 Brave · 📘 Edge · 🎭 Opera · 🦊 Firefox

Powerful Features

Everything you need for productive AI conversations, all running with local-first defaults and user-controlled endpoints

🤖

Multi-Provider Local Models

Use local models through Ollama, LM Studio, or llama.cpp. Route models by provider while keeping a single chat experience.

Ollama, LM Studio, llama.cpp support
Provider-aware model selection
Streaming responses
Stop generation control
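
As an illustration of how streamed responses can be consumed, here is a minimal sketch that assumes the provider is Ollama and that its native `/api/chat` endpoint is used, which streams newline-delimited JSON objects carrying `message.content` and a final `done: true`. The helper name `parseChatChunk` and the result shape are hypothetical and are not taken from the extension's source. LM Studio and llama.cpp expose OpenAI-compatible endpoints that stream server-sent events instead, so they would need a different parser.

```typescript
// Only the fields of a streamed Ollama /api/chat line that this sketch reads.
interface ChatStreamLine {
  message?: { content?: string };
  done?: boolean;
}

interface ParseResult {
  text: string;  // text decoded from the complete lines in this call
  rest: string;  // trailing partial line to carry into the next call
  done: boolean; // true once a line with done: true has been seen
}

// Hypothetical helper: combine the leftover from the previous network chunk
// with the new chunk, decode every complete line, and keep the remainder.
function parseChatChunk(carry: string, chunk: string): ParseResult {
  const lines = (carry + chunk).split("\n");
  const rest = lines.pop() ?? ""; // last element is an incomplete line or ""
  let text = "";
  let done = false;
  for (const line of lines) {
    if (line.trim() === "") continue; // ignore blank separator lines
    const parsed = JSON.parse(line) as ChatStreamLine;
    text += parsed.message?.content ?? "";
    if (parsed.done === true) done = true;
  }
  return { text, rest, done };
}
```

A stop button can be built on the standard `AbortController`: pass its `signal` to `fetch` and call `abort()` to cancel the in-flight request. Whether the extension implements stopping this way is an assumption.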

💬

Multi-Chat Sessions

Organize your conversations with multiple chat sessions, all saved locally using IndexedDB.

Multiple concurrent chats
Session persistence
Tab refresh awareness
Conversation history

🔧

Enhanced Content Extraction

Advanced content extraction with lazy loading support, site-specific configuration, and intelligent fallback strategies for modern web pages.

Defuddle integration for better parsing
Lazy loading & infinite scroll support
Site-specific extraction overrides
Automated YouTube transcript extraction
GitHub repository & profile support
Smart scroll strategies (none, instant, gradual, smart)
Network idle detection
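
One way to picture site-specific overrides is as a list of URL patterns, each carrying a partial configuration that is merged over the defaults. The sketch below is illustrative only: the four scroll strategy names come from the list above, but the type names, the `networkIdleMs` field, and the default values are assumptions rather than the extension's actual settings schema.

```typescript
type ScrollStrategy = "none" | "instant" | "gradual" | "smart";

interface ExtractionConfig {
  scrollStrategy: ScrollStrategy;
  networkIdleMs: number; // hypothetical: quiet period to wait for before extracting
}

interface SiteOverride {
  pattern: string;                   // regular expression tested against the page URL
  config: Partial<ExtractionConfig>; // only the fields this site changes
}

// Hypothetical defaults, chosen for illustration.
const DEFAULT_EXTRACTION: ExtractionConfig = {
  scrollStrategy: "smart",
  networkIdleMs: 500,
};

// The first matching override wins; unmatched URLs get a copy of the defaults.
function resolveExtractionConfig(url: string, overrides: SiteOverride[]): ExtractionConfig {
  for (const override of overrides) {
    if (new RegExp(override.pattern).test(url)) {
      return { ...DEFAULT_EXTRACTION, ...override.config };
    }
  }
  return { ...DEFAULT_EXTRACTION };
}
```

With this shape, a site that loads everything up front can set `scrollStrategy: "none"` while still inheriting the remaining defaults.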

🔊

Advanced Text-to-Speech

Built-in speech synthesis with searchable voice selection, customizable rate and pitch controls, and seamless cross-browser compatibility for an accessible AI chat experience.

Searchable voice selector with language grouping
Adjustable speech rate (0.5x - 2.0x)
Voice pitch control (0.0 - 2.0)
Custom text testing & preview
Cross-browser voice loading optimization
Per-message TTS controls
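
The rate and pitch limits listed above can be enforced before values reach the speech engine. This is a small sketch under the assumption that out-of-range or missing values fall back to the Web Speech API defaults of 1.0; the function and type names are hypothetical.

```typescript
interface SpeechSettings {
  rate: number;  // 0.5 to 2.0, as listed above
  pitch: number; // 0.0 to 2.0, as listed above
}

// Keep a value inside [min, max]; use the fallback when it is not a number.
function clampSetting(value: number, min: number, max: number, fallback: number): number {
  if (Number.isNaN(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: fill in missing fields and clamp the rest.
function normalizeSpeechSettings(input: Partial<SpeechSettings>): SpeechSettings {
  return {
    rate: clampSetting(input.rate ?? 1.0, 0.5, 2.0, 1.0),
    pitch: clampSetting(input.pitch ?? 1.0, 0.0, 2.0, 1.0),
  };
}
```

In a browser, the resulting values would be assigned to the `rate` and `pitch` properties of a `SpeechSynthesisUtterance` before it is passed to `speechSynthesis.speak()`.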

📝

Prompt Templates

Pre-built templates for common tasks to boost your productivity.

Summarize content
Translate text
Explain code
Write professional emails
Custom templates

📁

File Upload & Processing

Upload PDF, TXT, MD, and DOCX files. They are automatically chunked and embedded for semantic search and retrieval-augmented generation (RAG).

PDF, TXT, MD, DOCX support
Automatic chunking strategies
Background embedding with progress UI
Configurable max file size
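
To make the chunking step concrete, here is a sketch of one common strategy: fixed-size character windows with overlap, so that text cut at a boundary still appears whole in a neighbouring chunk. The page does not say which strategies the extension ships, so treat the function name, parameters, and approach as assumptions.

```typescript
// Split text into windows of chunkSize characters; consecutive windows
// share `overlap` characters. The final window may be shorter.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and less than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}
```

Each chunk would then be sent to an embedding model and stored alongside its vector. Real pipelines often split on sentence or paragraph boundaries instead of raw character counts.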

⚙️

Advanced Configuration

Configure provider endpoints, per-model settings, excluded URLs, and debug logging.

Provider base URL and model controls
Excluded URLs (regex support)
Model-specific settings
Debug logging
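
As an example of how regex-based URL exclusion might behave, the sketch below tests a page URL against a list of user-supplied patterns and skips any pattern that fails to compile. The function names and the decision to ignore invalid patterns silently are assumptions made for illustration.

```typescript
// Compile user-supplied patterns, dropping any that are not valid regular expressions.
function compileExclusionPatterns(patterns: string[]): RegExp[] {
  const compiled: RegExp[] = [];
  for (const pattern of patterns) {
    try {
      compiled.push(new RegExp(pattern));
    } catch {
      // Invalid pattern: skipped here; a real settings UI would likely report it.
    }
  }
  return compiled;
}

// A URL is excluded when at least one valid pattern matches it.
function isExcludedUrl(url: string, patterns: string[]): boolean {
  return compileExclusionPatterns(patterns).some((re) => re.test(url));
}
```

Anchoring patterns with `^` avoids accidental matches when the excluded host name appears inside another URL's query string.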

🔒

Privacy First

Local-first by default with full endpoint control. If you point to a remote provider, prompts are sent to that provider.

No mandatory cloud API calls
User-controlled provider endpoints
Local data storage
No telemetry
Open source code

Get Started in 3 Easy Steps

Simple setup process to get you chatting with AI models locally

Start a Local Provider

Run Ollama, LM Studio, or llama.cpp with a local API endpoint. Configure the matching provider URL in extension settings.
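
For orientation, the sketch below lists the default local addresses these three servers normally listen on and builds the URL used to list available models. The ports and paths reflect each project's usual defaults (Ollama on 11434 with `/api/tags`; LM Studio on 1234 and the llama.cpp server on 8080, both with the OpenAI-compatible `/v1/models`). Your installation may differ, and the type and function names are hypothetical rather than the extension's own.

```typescript
type ProviderId = "ollama" | "lmstudio" | "llamacpp";

interface ProviderEndpoint {
  baseUrl: string;    // where the local server listens by default
  modelsPath: string; // path that returns the list of available models
}

// Usual defaults for each server; adjust if you changed the port.
const DEFAULT_ENDPOINTS: Record<ProviderId, ProviderEndpoint> = {
  ollama: { baseUrl: "http://localhost:11434", modelsPath: "/api/tags" },
  lmstudio: { baseUrl: "http://localhost:1234", modelsPath: "/v1/models" },
  llamacpp: { baseUrl: "http://localhost:8080", modelsPath: "/v1/models" },
};

// Build the model-list URL, allowing a user-configured base URL to replace the default.
function modelsUrl(provider: ProviderId, baseUrlOverride?: string): string {
  const endpoint = DEFAULT_ENDPOINTS[provider];
  const base = (baseUrlOverride ?? endpoint.baseUrl).replace(/\/+$/, ""); // drop trailing slashes
  return base + endpoint.modelsPath;
}
```

If the override points at a machine other than `localhost`, prompts and page content are sent to that machine, which is the trade-off described in the privacy notes.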

Add Extension

Install the Ollama Client extension from the Chrome Web Store with a single click.

Start Chatting

Open the side panel, select a model, and start chatting on any website.
⚠️ CORS issue? Follow the setup guide.

🚀 Install Now - It's Free
