1. Install the Extension
Install from the Chrome Web Store: Ollama Client
2. Choose a Provider
Ollama (recommended baseline)
Default endpoint: http://localhost:11434
LM Studio
Default profile endpoint: http://localhost:1234/v1
llama.cpp server (launch example below)
Default profile endpoint: http://localhost:8000/v1
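If you use llama.cpp, one way to expose its OpenAI-compatible server on the port above is the llama-server binary (a minimal sketch; the binary name and model path depend on your build and are placeholders here):
# point -m at your own GGUF file
llama-server -m ./models/your-model.gguf --port 8000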
3. Start Ollama (primary path)
Install Ollama from ollama.com, then start it:
ollama serve
Pull at least one chat model:
ollama pull qwen2.5:3b
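You can sanity-check generation from the command line before touching the extension (a quick sketch using the model pulled above):
curl http://localhost:11434/api/generate -d '{"model": "qwen2.5:3b", "prompt": "Say hello", "stream": false}'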
Optional helper script in this repo: tools/ollama-env.sh (helps with LAN and Firefox origin setup).
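If you skip the script but want Ollama reachable from other machines on your LAN, the manual equivalent is roughly the following (a sketch; binding to all interfaces has security implications, so restrict it to trusted networks):
OLLAMA_HOST=0.0.0.0 ollama serve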
4. Configure Providers in Extension
- Open the extension settings.
- Go to the Providers tab.
- Enable the provider(s) you want.
- Set the base URL and run the connection test.
- Select a model from the model menu in chat.
5. Verify Endpoints
Ollama check
curl http://localhost:11434/api/tags
LM Studio check
curl http://localhost:1234/v1/models
llama.cpp check
curl http://localhost:8000/v1/models
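To confirm a provider actually has models available (not just that the port is open), you can list model names; this sketch assumes jq is installed:
# Ollama: names of pulled models
curl -s http://localhost:11434/api/tags | jq '.models[].name'
# OpenAI-compatible providers (LM Studio, llama.cpp): model IDs
curl -s http://localhost:1234/v1/models | jq '.data[].id'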
6. Important Reality Checks
- Chat generation is routed through whichever provider you select.
- Model pull/delete/unload/version features are currently Ollama-focused.
- Embedding and RAG indexing currently depend on Ollama's embedding APIs (example below).
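For example, enabling RAG indexing means Ollama needs an embedding model pulled and its embedding endpoint answering (a sketch; nomic-embed-text is just one common embedding model, not a requirement of the extension):
ollama pull nomic-embed-text
curl http://localhost:11434/api/embeddings -d '{"model": "nomic-embed-text", "prompt": "hello world"}'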
7. CORS and Browser Notes
Chromium-based browsers handle the extension's requests with declarativeNetRequest (DNR) rules; Firefox's extension APIs behave differently.
For Firefox and other strict environments, you may need an explicit OLLAMA_ORIGINS setup when using Ollama.
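A minimal sketch of that setup for Firefox (the exact origin list depends on your browser and install; the wildcard here is an assumption, not a recommendation):
# allow extension-scheme origins to call the Ollama API
OLLAMA_ORIGINS="moz-extension://*" ollama serve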
8. Troubleshooting
- Confirm the provider process is running (quick checks below).
- Confirm that the endpoint URL in the provider settings exactly matches the runtime URL.
- Use the provider connection test before debugging model behavior.
- Check the extension background console for streaming/provider errors.
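A quick command-line triage for the Ollama path before digging into the extension (adapt the URLs for other providers):
# is the server answering at all?
curl -sS http://localhost:11434/api/version
# does it have at least one model?
curl -sS http://localhost:11434/api/tags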