# Provider Setup Guide
This guide covers the recommended setup for Ollama (the primary provider), plus the LM Studio and llama.cpp local endpoints. OpenAI-compatible servers (vLLM, LocalAI, KoboldCPP) also work when configured accordingly. This guide reflects v0.6.2 behavior.
Need the overview? Visit the landing page or review the privacy policy.
## 1. Install the Extension

## 2. Choose a Provider
- Ollama (recommended baseline): `http://localhost:11434`
- LM Studio: `http://localhost:1234/v1`
- llama.cpp server: `http://localhost:8000/v1`
- OpenAI-compatible servers: `http://localhost:8000/v1`

### Provider Connection Flow
```mermaid
flowchart LR
  A["Extension UI"] --> B["Provider Config"]
  B --> C["Choose Provider"]
  C --> D["Ollama"]
  C --> E["LM Studio"]
  C --> F["llama.cpp"]
  D --> G["Local API"]
  E --> G
  F --> G
  G --> H["Streaming Response"]
  H --> I["Chat UI + Local Storage"]
```
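To see where this flow ends up, you can reproduce the kind of chat request the extension issues with curl against Ollama's `/api/chat` endpoint. A minimal sketch, assuming the `qwen2.5:3b` model from step 3 is pulled (non-streaming here for readability; the extension itself uses `"stream": true`):

```shell
# Sketch: a one-off chat request against the local Ollama API.
# Falls back to a note when no local server is reachable.
curl -fsS http://localhost:11434/api/chat -d '{
  "model": "qwen2.5:3b",
  "messages": [{"role": "user", "content": "Say hello"}],
  "stream": false
}' || echo "Ollama is not reachable on localhost:11434"
```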
## 3. Start Ollama (primary path)
```shell
ollama serve
ollama pull qwen2.5:3b
ollama pull all-minilm:latest
```
An optional helper script in this repo, `tools/ollama-env.sh`, helps with LAN and Firefox origin setup.
## 4. Configure Providers in the Extension
- Open the extension settings.
- Go to the Providers tab.
- Enable the provider(s) you want.
- Set the base URL and run the connection test.
- Select a model from the model menu in chat.
## 5. Verify Endpoints

```shell
curl http://localhost:11434/api/tags
curl http://localhost:1234/v1/models
curl http://localhost:8000/v1/models
```

## 6. Important Reality Checks
- Chat generation is provider-routed.
- Model pull/delete/unload/version features are Ollama-focused.
- Embedding and indexing currently depend on Ollama embeddings.
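Since indexing depends on Ollama embeddings, that path can be sanity-checked directly. A minimal sketch, assuming the `all-minilm:latest` model from step 3 and Ollama's `/api/embeddings` endpoint:

```shell
# Sketch: request an embedding from the local Ollama server.
# On success, prints a JSON object containing an "embedding" array.
curl -fsS http://localhost:11434/api/embeddings -d '{
  "model": "all-minilm:latest",
  "prompt": "example text to embed"
}' || echo "embedding endpoint unavailable on localhost:11434"
```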
## 7. CORS and Browser Notes
For Firefox and other strict environments, you may need to set `OLLAMA_ORIGINS` explicitly when using Ollama.
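A minimal sketch of one such setup (the exact origin pattern depends on your browser; Firefox extension pages use `moz-extension://` origins, Chromium-based browsers use `chrome-extension://`):

```shell
# Sketch: allow browser-extension origins before starting Ollama.
# Wildcards are accepted; tighten this to your extension's exact origin if possible.
OLLAMA_ORIGINS="moz-extension://*,chrome-extension://*" ollama serve
```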
## 8. Troubleshooting
- Confirm the provider process is running.
- Confirm the endpoint URL in settings matches the URL the provider is actually serving on.
- Run the provider connection test before debugging model behavior.
- Check the background console for streaming/provider errors.
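The endpoint checks from step 5 can be combined into one quick script; a sketch, assuming the default base URLs listed above:

```shell
# Sketch: probe each default provider endpoint and report reachability.
check() {
  if curl -fsS --max-time 2 "$1" >/dev/null 2>&1; then
    echo "OK   $1"
  else
    echo "FAIL $1"
  fi
}
check http://localhost:11434/api/tags   # Ollama
check http://localhost:1234/v1/models   # LM Studio
check http://localhost:8000/v1/models   # llama.cpp / OpenAI-compatible
```

A `FAIL` line points you at which provider to restart or reconfigure before debugging the extension itself.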