Local-first

Chat with local LLMs right inside your browser.

Privacy-first extension for local AI chat with multi-provider support. Connect Ollama, LM Studio, llama.cpp, or OpenAI-compatible servers (vLLM, LocalAI, KoboldCPP) and run conversations directly in your browser with provider-aware routing and local-first workflows.

Local-first privacy model · your choice of AI models · 0* mandatory cloud APIs

*You control the endpoint. If you configure a remote provider, your data will go to that provider.

Works on Chrome, Brave, Edge, and Opera, with experimental Firefox support.

Features

Powerful, local-first capabilities

Everything you need for productive AI conversations, with full control over provider endpoints.

Multi-Provider Local Models

Use local models through Ollama, LM Studio, llama.cpp, or other OpenAI-compatible servers, with capability-aware routing (sketched after this list).

  • Ollama, LM Studio, llama.cpp, vLLM, LocalAI, KoboldCPP
  • Provider-aware model selection + capabilities
  • Streaming responses + stop control
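
A minimal sketch of what capability-aware routing can look like; the types, provider entries, and default ports below are illustrative assumptions, not the extension's actual internals:

```ts
// Hypothetical capability-aware routing; names are illustrative.
type Capability = "chat" | "embeddings" | "vision";

interface Provider {
  name: string;               // e.g. "ollama", "lmstudio"
  baseUrl: string;            // user-controlled endpoint
  capabilities: Capability[];
}

const providers: Provider[] = [
  { name: "ollama", baseUrl: "http://localhost:11434", capabilities: ["chat", "embeddings"] },
  { name: "lmstudio", baseUrl: "http://localhost:1234/v1", capabilities: ["chat"] },
];

// Pick the first configured provider that supports the requested capability.
function route(cap: Capability): Provider {
  const match = providers.find((p) => p.capabilities.includes(cap));
  if (!match) throw new Error(`No configured provider supports "${cap}"`);
  return match;
}
```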

Multi-Chat Sessions

Organize conversations across multiple sessions stored locally in IndexedDB; see the schema sketch after this list.

  • Multiple concurrent chats
  • Session persistence + history
  • Tab refresh awareness
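
Since the stack includes Dexie.js, local persistence can be sketched like this; the database, table, and field names are assumptions, not the extension's real schema:

```ts
import Dexie, { type Table } from "dexie";

interface Session { id: string; title: string; updatedAt: number }
interface Message { id: string; sessionId: string; role: "user" | "assistant"; content: string; createdAt: number }

class ChatDB extends Dexie {
  sessions!: Table<Session, string>;
  messages!: Table<Message, string>;

  constructor() {
    super("local-first-chat"); // hypothetical database name
    // Only indexed fields are declared; all data stays in the browser's IndexedDB.
    this.version(1).stores({
      sessions: "id, updatedAt",
      messages: "id, sessionId, createdAt",
    });
  }
}

export const db = new ChatDB();
```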

Enhanced Content Extraction

Modern extraction with lazy-loading support and site-specific strategies for clean context capture; see the sketch after this list.

  • Defuddle integration for parsing
  • Smart scroll strategies
  • GitHub & YouTube extraction helpers

Advanced Text-to-Speech

Speech synthesis with searchable voices, pitch/rate controls, and per-message playback; a voice-loading sketch follows the list.

  • Language grouping + search
  • Custom text preview
  • Cross-browser voice loading
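
Cross-browser voice loading trips people up because Chrome populates the voice list asynchronously. A minimal sketch using the standard Web Speech API:

```ts
// Wait for "voiceschanged" if the voice list is not yet populated (Chrome).
function loadVoices(): Promise<SpeechSynthesisVoice[]> {
  return new Promise((resolve) => {
    const voices = speechSynthesis.getVoices();
    if (voices.length) return resolve(voices);
    speechSynthesis.addEventListener(
      "voiceschanged",
      () => resolve(speechSynthesis.getVoices()),
      { once: true },
    );
  });
}

async function speak(text: string, voiceName: string, rate = 1, pitch = 1) {
  const voices = await loadVoices();
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.voice = voices.find((v) => v.name === voiceName) ?? null;
  utterance.rate = rate;   // 0.1–10, 1 is normal speed
  utterance.pitch = pitch; // 0–2, 1 is normal pitch
  speechSynthesis.speak(utterance);
}
```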

Prompt Templates

Pre-built prompts for summarization, translation, and code help; a template sketch follows the list.

  • Summarize content
  • Translate text
  • Custom templates
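
A prompt template can be as simple as a named string with a placeholder; this shape is illustrative, not the extension's actual template format:

```ts
interface PromptTemplate {
  name: string;
  // "{{content}}" is swapped for the extracted page text before sending.
  body: string;
}

const templates: PromptTemplate[] = [
  { name: "Summarize", body: "Summarize the following content concisely:\n\n{{content}}" },
  { name: "Translate", body: "Translate the following text to English:\n\n{{content}}" },
];

function render(t: PromptTemplate, content: string): string {
  return t.body.replaceAll("{{content}}", content);
}
```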

File Upload & Processing

Upload PDF, text, and DOCX files and auto-embed them for semantic retrieval, with no OCR pipeline required; an embedding sketch follows the list.

  • PDF, TXT, MD, DOCX support
  • Background embedding progress
  • Configurable max file size
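
A sketch of background embedding against Ollama's `/api/embeddings` endpoint; the model name and error handling here are assumptions, not the extension's defaults:

```ts
// Embed one text chunk via a local Ollama server.
async function embedChunk(text: string, baseUrl = "http://localhost:11434") {
  const res = await fetch(`${baseUrl}/api/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  if (!res.ok) throw new Error(`Embedding failed: ${res.status}`);
  const { embedding } = (await res.json()) as { embedding: number[] };
  return embedding;
}
```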

Backup & Restore

Export and restore full data snapshots as manifest-backed ZIP backups; a sample manifest shape follows the list.

  • Full backup/export flows
  • Restore with partial failure reporting
  • Versioned manifests
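
For illustration, a versioned manifest might record enough to drive restore and partial-failure reporting; every field below is hypothetical, not the actual backup format:

```ts
interface BackupManifest {
  version: number;   // schema version, checked on restore
  createdAt: string; // ISO timestamp
  entries: {
    path: string;    // file inside the ZIP, e.g. "sessions.json"
    count: number;   // records exported, used for partial-failure reporting
  }[];
}
```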

Advanced Configuration

Full control over provider base URLs, model settings, and exclusions.

  • Provider base URL control
  • Selected model ref + capability checks
  • Debug logging

Search & Export

Search UX enhancements with scoped tabs and print/PDF export.

  • Result grouping + scoped tabs
  • Print/PDF export entrypoint
  • Local-only processing

Privacy First

Local-first by default with user-controlled endpoints.

  • No mandatory cloud APIs
  • Local data storage
  • Open source code

Stack

Built with modern technologies

Optimized for performance, reliability, and extensibility.

React
WXT
TypeScript
shadcn/ui
Dexie.js
Readability

Setup

Get started in 3 easy steps

Quick setup path to get chatting with local models.

Step 1

Start a local provider

Run Ollama, LM Studio, or llama.cpp and set the provider URL inside the extension settings; a quick reachability check is sketched below.
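
Verifying the server is reachable takes one request. A sketch against Ollama's model-list endpoint (OpenAI-compatible servers expose a similar `/v1/models` route); adjust the base URL for your provider:

```ts
// List the models a local Ollama server has available.
async function checkProvider(baseUrl = "http://localhost:11434") {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`Provider unreachable: ${res.status}`);
  const { models } = await res.json();
  console.log("Available models:", models.map((m: { name: string }) => m.name));
}
```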

Step 2

Add the extension

Install from the Chrome Web Store and pin the extension for easy access.

Step 3

Start chatting

Open the side panel, pick a model, and chat on any website. CORS issue? Follow the setup guide.

Read the setup guide

Links

Stay connected

© 2026 Ollama Client. Open source • Privacy-first • Multi-provider local AI