Local-first · v0.6.x
Chat with local LLMs, right in your browser.
Connect Ollama, LM Studio, llama.cpp, or any OpenAI-compatible server. Provider-aware routing, streaming responses, local retrieval — all without a mandatory cloud API.
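For readers curious what "streaming responses" from an OpenAI-compatible server look like on the wire, here is a minimal sketch. It assumes the server emits Server-Sent Events with `data:` lines and a final `[DONE]` marker, as OpenAI-compatible `/v1/chat/completions` endpoints generally do when called with `stream: true`. The helper name `extractDeltas` is hypothetical and is not the extension's API.

```typescript
// Sketch only: parses Server-Sent Events text from an OpenAI-compatible
// chat completion stream. extractDeltas is a hypothetical helper name.

interface StreamChunk {
  choices?: { delta?: { content?: string } }[];
}

function extractDeltas(sseText: string): { text: string; done: boolean } {
  let text = "";
  let done = false;
  for (const rawLine of sseText.split("\n")) {
    const line = rawLine.trim();
    // Skip blank separator lines and anything that is not a data field.
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    try {
      const chunk = JSON.parse(payload) as StreamChunk;
      // Each chunk carries a small piece of the reply in choices[0].delta.content.
      text += chunk.choices?.[0]?.delta?.content ?? "";
    } catch {
      // Malformed line: ignored in this sketch. A real client would buffer
      // partial lines across network reads before parsing.
    }
  }
  return { text, done };
}
```

A client would call this on each decoded network chunk and append `text` to the message being rendered until `done` is true.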
Features
What you can do without leaving your browser.
Multi-provider local models
Ollama, LM Studio, llama.cpp, vLLM, LocalAI, KoboldCPP. Capability-aware routing with provider-aware model selection.
Multi-chat sessions
Multiple concurrent chats with branch navigation. Persisted locally in IndexedDB / SQLite with tab-refresh awareness.
Enhanced content extraction
Defuddle integration with smart scroll strategies plus GitHub and YouTube extraction helpers.
Text-to-speech
Searchable voices, pitch and rate controls, per-message playback. Cross-browser voice loading.
Prompt templates
Pre-built prompts for summarization, translation, and code help. Custom templates supported.
File upload & RAG
PDF, TXT, MD, DOCX support with background embedding progress. Configurable max file size.
Backup & restore
Full data snapshots via ZIP backups with versioned manifests. Restore reports partial failures.
Advanced configuration
Provider base URL control, selected model ref, capability checks, debug logging.
Privacy first
No mandatory cloud APIs. Local data storage. Open source. You control the endpoint.
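As an illustration of the "Backup & restore" item above, the sketch below shows one way a restore could check a versioned manifest and report partial failures instead of aborting on the first error. Every name and shape here (`BackupManifest`, `restoreBackup`, the version number) is an assumption made for illustration; the extension's actual manifest format may differ.

```typescript
// Hypothetical shapes for illustration only; not the extension's real format.
interface BackupManifest {
  manifestVersion: number;
  entries: string[]; // names of the files stored inside the ZIP
}

interface RestoreReport {
  restored: string[];
  failed: { entry: string; reason: string }[];
}

const SUPPORTED_MANIFEST_VERSION = 1; // assumed value

function restoreBackup(
  manifest: BackupManifest,
  restoreEntry: (entry: string) => void,
): RestoreReport {
  // Refuse manifests written by a newer, unknown format.
  if (manifest.manifestVersion > SUPPORTED_MANIFEST_VERSION) {
    throw new Error(`Unsupported manifest version ${manifest.manifestVersion}`);
  }
  const report: RestoreReport = { restored: [], failed: [] };
  for (const entry of manifest.entries) {
    try {
      restoreEntry(entry);
      report.restored.push(entry);
    } catch (err) {
      // Record the failure and keep going so one bad entry does not block the rest.
      const reason = err instanceof Error ? err.message : String(err);
      report.failed.push({ entry, reason });
    }
  }
  return report;
}
```

The returned report lets the UI list which entries came back and which did not, together with a reason for each failure.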
Get started
Three steps from install to first chat.
Step 1
Start a local provider
Run Ollama, LM Studio, or llama.cpp. Note the endpoint URL.
Step 2
Install the extension
Add it from the Chrome Web Store and pin it for easy access.
Step 3
Pick a model and chat
Open the side panel, pick a model, and chat on any website. Hitting a CORS error? The setup guide covers it.
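To make "note the endpoint URL" from Step 1 concrete, here is a small sketch that turns a provider base URL into the OpenAI-compatible model listing URL. The default ports are the ones these tools commonly use out of the box and should be treated as assumptions; your setup may differ. The names `DEFAULT_BASE_URLS`, `modelsUrl`, and `defaultModelsUrl` are hypothetical and are not the extension's API.

```typescript
// Commonly used local defaults; treat them as assumptions and adjust to your setup.
const DEFAULT_BASE_URLS = {
  ollama: "http://localhost:11434",
  lmstudio: "http://localhost:1234",
  llamacpp: "http://localhost:8080",
} as const;

type Provider = keyof typeof DEFAULT_BASE_URLS;

// Builds the OpenAI-compatible model listing URL from a base URL,
// tolerating trailing slashes and a base URL that already ends in /v1.
function modelsUrl(baseUrl: string): string {
  const trimmed = baseUrl.trim().replace(/\/+$/, "");
  const root = trimmed.endsWith("/v1") ? trimmed : `${trimmed}/v1`;
  return `${root}/models`;
}

function defaultModelsUrl(provider: Provider): string {
  return modelsUrl(DEFAULT_BASE_URLS[provider]);
}
```

Requesting that URL with `fetch` is a quick way to confirm the provider is reachable. If the request succeeds from a terminal but fails in the browser, the provider is probably rejecting the extension's origin. With Ollama, for example, allowed origins are controlled by the `OLLAMA_ORIGINS` environment variable.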
Stack
Built with React 19 · WXT · TypeScript · shadcn/ui · sql.js · Defuddle.
Endpoint privacy is yours to configure. See the privacy policy for what stays local and what leaves your device.