Ollama Client

Local-first · v0.6.x

Chat with local LLMs, right in your browser.

Connect Ollama, LM Studio, llama.cpp, or any OpenAI-compatible server. Provider-aware routing, streaming responses, local retrieval — all without a mandatory cloud API.

Features

What you can do without leaving your browser.

Multi-provider local models

Ollama, LM Studio, llama.cpp, vLLM, LocalAI, KoboldCPP. Capability-aware routing with provider-aware model selection.

Multi-chat sessions

Multiple concurrent chats with branch navigation. Persisted locally in IndexedDB / SQLite with tab-refresh awareness.

Enhanced content extraction

Defuddle integration with smart scroll strategies plus GitHub and YouTube extraction helpers.

Text-to-speech

Searchable voices, pitch and rate controls, per-message playback. Cross-browser voice loading.

Prompt templates

Pre-built prompts for summarization, translation, and code help. Custom templates supported.

File upload & RAG

PDF, TXT, MD, DOCX support with background embedding progress. Configurable max file size.

Backup & restore

Full data snapshots via ZIP backups with versioned manifests. Restore reports partial failures.

Advanced configuration

Provider base URL control, selected model ref, capability checks, debug logging.

Privacy first

No mandatory cloud APIs. Local data storage. Open source. You control the endpoint.

Get started

Three steps from install to first chat.

  1. Start a local provider

    Run Ollama, LM Studio, or llama.cpp. Note the endpoint URL.

  2. Install the extension

    Add it from the Chrome Web Store and pin it for easy access.

  3. Pick a model and chat

    Open the side panel, pick a model, and chat on any website. If you hit a CORS error, the setup guide covers the fix.
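Before opening the side panel, it can help to confirm that the endpoint URL from the first step actually answers. The sketch below is illustrative only and is not the extension's own code: the names `modelsUrl`, `parseModelNames`, `listModels`, and `FetchLike` are hypothetical. It assumes Ollama lists models at `/api/tags` and that OpenAI-compatible servers (LM Studio, llama.cpp, and similar) list them at `/v1/models`. Commonly used default base URLs are `http://localhost:11434` for Ollama, `http://localhost:1234` for LM Studio, and `http://localhost:8080` for llama.cpp, but your setup may differ.

```typescript
// Hypothetical helpers for checking a local provider endpoint.
// Assumption: Ollama serves GET /api/tags -> { models: [{ name }] },
// OpenAI-compatible servers serve GET /v1/models -> { data: [{ id }] }.

type ProviderKind = "ollama" | "openai-compatible";

// Minimal shape of the fetch function we need, so the sketch has no
// dependency on a particular runtime's type definitions.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Build the model-listing URL, tolerating trailing slashes in the base URL.
function modelsUrl(baseUrl: string, kind: ProviderKind): string {
  const base = baseUrl.replace(/\/+$/, "");
  return kind === "ollama" ? `${base}/api/tags` : `${base}/v1/models`;
}

// Pull model names out of either response shape, ignoring malformed entries.
function parseModelNames(body: unknown, kind: ProviderKind): string[] {
  if (typeof body !== "object" || body === null) return [];
  const record = body as Record<string, unknown>;
  const list = kind === "ollama" ? record.models : record.data;
  if (!Array.isArray(list)) return [];
  const key = kind === "ollama" ? "name" : "id";
  return list
    .map((item) =>
      typeof item === "object" && item !== null
        ? (item as Record<string, unknown>)[key]
        : undefined
    )
    .filter((name): name is string => typeof name === "string");
}

// Ask the provider which models it has; throws on a non-2xx response.
async function listModels(
  baseUrl: string,
  kind: ProviderKind,
  fetchImpl: FetchLike
): Promise<string[]> {
  const response = await fetchImpl(modelsUrl(baseUrl, kind));
  if (!response.ok) {
    throw new Error(`Provider responded with HTTP ${response.status}`);
  }
  return parseModelNames(await response.json(), kind);
}
```

In a browser or a recent Node.js runtime you could call `listModels("http://localhost:11434", "ollama", fetch)`. Passing `fetch` in as a parameter is a design choice for this sketch: it keeps the parsing logic testable without a running server. If the request is rejected before any HTTP status comes back when made from a browser context, CORS is one likely cause.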

Stack

Built with React 19 · WXT · TypeScript · shadcn/ui · sql.js · Defuddle.

Endpoint privacy is yours to configure — see the privacy policy for what stays local and what leaves.