Provider Setup Guide

Recommended setup for Ollama (primary), plus LM Studio and llama.cpp local endpoints. OpenAI-compatible servers (vLLM, LocalAI, KoboldCPP) also work when pointed at the correct base URL. This guide reflects v0.6.2 behavior.

Need the overview? Visit the landing page or review the privacy policy.

1. Install the Extension

Install from the Chrome Web Store: Ollama Client

2. Choose a Provider

Ollama (recommended baseline)

Default endpoint:

http://localhost:11434

LM Studio

Default profile endpoint:

http://localhost:1234/v1

llama.cpp server

Default profile endpoint:

http://localhost:8000/v1

OpenAI-compatible servers

vLLM, LocalAI, and KoboldCPP expose an OpenAI-compatible API; use your server's base URL, for example:

http://localhost:8000/v1
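As a sketch, a minimal chat request against any OpenAI-compatible server looks like the following. The port and model name here are placeholders, not defaults this extension sets; check your server's /v1/models listing for the real model identifier.

```shell
# Minimal OpenAI-compatible chat request (sketch; port and model name
# are placeholders — substitute what your server actually serves).
curl -s http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "your-model-name",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": false
      }'
```

If this returns a JSON completion, the extension's OpenAI-compatible profile should work against the same base URL.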

Provider Connection Flow

High-level flow from extension to provider and back.

flowchart LR
  A["Extension UI"] --> B["Provider Config"]
  B --> C["Choose Provider"]
  C --> D["Ollama"]
  C --> E["LM Studio"]
  C --> F["llama.cpp"]
  D --> G["Local API"]
  E --> G
  F --> G
  G --> H["Streaming Response"]
  H --> I["Chat UI + Local Storage"]

3. Start Ollama (primary path)

Install Ollama from ollama.com, then start it:

ollama serve

Pull at least one chat model:

ollama pull qwen2.5:3b

Pull one embeddings model for RAG:

ollama pull all-minilm:latest
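To sanity-check the embeddings model, you can call Ollama's embeddings endpoint directly. This sketch assumes the newer /api/embed route; older Ollama versions expose /api/embeddings with a "prompt" field instead of "input".

```shell
# Sketch: request one embedding to confirm the model loads and responds.
# Assumes the /api/embed route; older Ollama versions use /api/embeddings
# with a "prompt" field instead of "input".
curl -s http://localhost:11434/api/embed \
  -H 'Content-Type: application/json' \
  -d '{"model": "all-minilm:latest", "input": "hello world"}'
```

A JSON response containing an embeddings array means RAG indexing has what it needs.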

You need at least one chat model and one embeddings model installed for the full experience.

Optional helper script in this repo: tools/ollama-env.sh (helps with LAN and Firefox origin setup).

4. Configure Providers in Extension

  1. Open the extension settings.
  2. Go to the Providers tab.
  3. Enable the provider(s) you want.
  4. Set the base URL and run the connection test.
  5. Select a model from the model menu in chat.

5. Verify Endpoints

Ollama check

curl http://localhost:11434/api/tags

LM Studio check

curl http://localhost:1234/v1/models

llama.cpp check

curl http://localhost:8000/v1/models
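Each check should return JSON. Ollama's /api/tags, for instance, lists installed models under a "models" array. As a sketch, here is one way to extract the model names without a jq dependency; the JSON below is an inline sample of the response shape, but the same filter works on live curl output.

```shell
# Sample of the JSON shape /api/tags returns (trimmed); in practice,
# pipe `curl -s http://localhost:11434/api/tags` into the same filter.
sample='{"models":[{"name":"qwen2.5:3b"},{"name":"all-minilm:latest"}]}'

# Pull out the "name" fields with grep/sed (avoids a jq dependency).
names=$(printf '%s' "$sample" \
  | grep -o '"name":"[^"]*"' \
  | sed 's/"name":"\(.*\)"/\1/')
echo "$names"
```

If the list is empty, pull a model first (see step 3).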

6. Important Reality Checks

  • Chat generation is provider-routed.
  • Model pull/delete/unload/version features are Ollama-focused.
  • Embedding and indexing currently depend on Ollama embeddings.

7. CORS and Browser Notes

Chrome-based browsers use declarativeNetRequest (DNR) for request handling. Firefox's extension APIs behave differently.

For Firefox and strict environments, you may need explicit OLLAMA_ORIGINS setup when using Ollama.
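A sketch of that setup follows. The origin string is an assumption, not a value this guide confirms; adjust it to whatever origin Firefox reports for the extension. OLLAMA_HOST is only needed if you also want LAN access.

```shell
# Example environment for Firefox/strict setups (config sketch only;
# the origin pattern is an assumption — adjust to your extension's origin).
export OLLAMA_ORIGINS="moz-extension://*"   # allow the extension's origin
export OLLAMA_HOST="0.0.0.0:11434"          # optional: also listen on the LAN
ollama serve
```

The tools/ollama-env.sh helper script mentioned in step 3 automates this.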

8. Troubleshooting

  • Confirm the provider process is running.
  • Confirm the configured endpoint URL matches the URL the server is actually listening on.
  • Run the provider connection test before debugging model behavior.
  • Check the background console for streaming/provider errors.