Provider Setup
The recommended setup uses Ollama as the primary provider, with LM Studio and llama.cpp as alternatives. Any other OpenAI-compatible server (such as vLLM, LocalAI, or KoboldCPP) also works once configured. This guide reflects v0.6.x behavior.
1. Install the extension
Install Ollama Client from the Chrome Web Store.
2. Pick a provider
| Provider | Default endpoint | Notes |
|---|---|---|
| Ollama | http://localhost:11434 | Recommended baseline. Full feature support (pull, delete, version). |
| LM Studio | http://localhost:1234/v1 | OpenAI-compatible. Chat + embeddings. |
| llama.cpp server | http://localhost:8000/v1 | OpenAI-compatible. Run with llama-server, which listens on port 8080 unless started with --port 8000; otherwise adjust the URL. |
| vLLM / LocalAI / KoboldCPP | http://localhost:8000/v1 | Any OpenAI-compatible server; use your actual URL. |
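
For scripting against these defaults, the table can be transcribed into a small lookup helper. This is only a sketch: the function name `default_endpoint` and the provider keys (`lmstudio`, `llamacpp`, and so on) are hypothetical labels chosen for illustration, not identifiers used by the extension.

```shell
#!/usr/bin/env sh
# default_endpoint PROVIDER
# Prints the default base URL listed in the table above.
# The provider keys are hypothetical names, not extension identifiers.
default_endpoint() {
  case "$1" in
    ollama)   echo "http://localhost:11434" ;;
    lmstudio) echo "http://localhost:1234/v1" ;;
    # The table lists the same default for every other OpenAI-compatible server.
    llamacpp|vllm|localai|koboldcpp) echo "http://localhost:8000/v1" ;;
    *) echo "unknown provider: $1" >&2; return 1 ;;
  esac
}
```

For example, `default_endpoint lmstudio` prints the LM Studio URL from the table. Replace any value with your actual URL if your server runs elsewhere.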
3. Start Ollama (primary path)
Install Ollama from ollama.com, then start it:

    ollama serve

Pull at least one chat model:

    ollama pull qwen2.5:3b

Pull one embeddings model for RAG:

    ollama pull all-minilm:latest

You need at least one chat model and one embeddings model installed for the full experience.
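
To confirm that both models are installed, one option is to scan the output of `ollama list`. The sketch below assumes the usual layout of that output (a header row, then one model per line with the name in the first column); the helper names `has_model` and `report_models` are made up for this example.

```shell
#!/usr/bin/env sh
# has_model NAME
# Succeeds if NAME appears in the first column of `ollama list`-style
# output read from stdin. The first line is skipped as a header (assumption).
has_model() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# report_models LISTING
# Reports whether the two models used in this guide are present in LISTING.
report_models() {
  listing="$1"
  for model in qwen2.5:3b all-minilm:latest; do
    if printf '%s\n' "$listing" | has_model "$model"; then
      echo "ok: $model"
    else
      echo "missing: $model (try: ollama pull $model)"
    fi
  done
}
```

A typical call would be `report_models "$(ollama list)"`. If you pulled different models, change the names in the loop.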
4. Configure the extension
- Open the extension’s options page.
- Go to the Providers tab.
- Enable the providers you want.
- Set the base URL and run a connection test.
- Pick a model from the chat model menu.
5. Verify endpoints
    # Ollama
    curl http://localhost:11434/api/tags

    # LM Studio
    curl http://localhost:1234/v1/models

    # llama.cpp
    curl http://localhost:8000/v1/models

6. Reality checks
- Chat generation is fully provider-agnostic.
- Pull / delete / unload / version actions are Ollama-only.
- Embedding generation currently flows through Ollama; other providers can read embeddings but not produce them.
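
The three points above amount to a small capability check. The sketch below encodes them as stated in this section; the function name `supports`, the provider keys, and the action labels (`chat`, `embed`, and so on) are hypothetical and do not come from the extension's code.

```shell
#!/usr/bin/env sh
# supports PROVIDER ACTION
# Exit status 0 if this guide says ACTION works with PROVIDER, 1 otherwise.
# Action labels are illustrative names, not extension identifiers.
supports() {
  provider="$1"
  action="$2"
  case "$action" in
    # Chat generation is provider-agnostic.
    chat) return 0 ;;
    # Model management and embedding generation go through Ollama only.
    pull|delete|unload|version|embed) [ "$provider" = "ollama" ] ;;
    *) return 1 ;;
  esac
}
```

For instance, `supports lmstudio pull || echo "use Ollama for model management"` prints the hint, because pull is listed as Ollama-only.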
7. CORS and browser notes
Chrome-based browsers route extension requests through Declarative Net Request (DNR). Firefox uses a different extension API model.
8. Troubleshooting
- Confirm the provider process is actually running.
- Confirm the endpoint URL matches the runtime URL exactly (port, scheme, /v1 suffix).
- Use the Test connection button in Providers settings before debugging model behavior.
- Check the background console (chrome://extensions → service worker) for streaming or provider errors.
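
The first two checks can also be run from a terminal. The following is a minimal sketch, not the extension's own connection test: it assumes that a base URL ending in /v1 is an OpenAI-compatible server (listing models at /models) and that anything else is a native Ollama endpoint (listing models at /api/tags). The names `models_url` and `probe` are hypothetical.

```shell
#!/usr/bin/env sh
# models_url BASE
# Prints the model-listing URL for BASE. A base ending in /v1 is treated as
# OpenAI-compatible; any other base is assumed to be native Ollama.
models_url() {
  base="${1%/}"   # drop one trailing slash, if present
  case "$base" in
    */v1) printf '%s/models\n' "$base" ;;
    *)    printf '%s/api/tags\n' "$base" ;;
  esac
}

# probe BASE
# Prints "ok" or "unreachable" for BASE, giving up after 3 seconds.
probe() {
  url="$(models_url "$1")"
  if curl --silent --fail --max-time 3 --output /dev/null "$url"; then
    echo "ok          $url"
  else
    echo "unreachable $url"
  fi
}
```

Call `probe` once per configured base URL, for example `probe http://localhost:11434` or `probe http://localhost:1234/v1`. An "unreachable" result usually means the process is not running or the port, scheme, or /v1 suffix does not match.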