LocalRAG talks to Ollama on your machine for local embeddings and chat. Ollama is a separate install—not a Python package. LocalRAG follows the Ollama HTTP API: embeddings use POST /api/embed (not the legacy /api/embeddings endpoint); chat uses POST /api/chat; model discovery uses GET /api/tags; pulls use POST /api/pull. Request and response bodies for those calls are typed in localrag/ollama/schemas.py. Use a reasonably current Ollama release so those routes match.
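For orientation, the request bodies for the two routes LocalRAG calls most look roughly like this. This is a minimal sketch using the public Ollama API field names; the authoritative typed models are in localrag/ollama/schemas.py, and the model names shown are the defaults from `.env.example`:

```python
import json

# Payload sketches for the Ollama routes LocalRAG uses.
# Field names follow the public Ollama HTTP API; the typed
# request/response models live in localrag/ollama/schemas.py.
embed_request = {
    "model": "nomic-embed-text",               # default embed model in .env.example
    "input": ["first chunk", "second chunk"],  # /api/embed takes a string or list
}
chat_request = {
    "model": "llama3.2",                       # default chat model in .env.example
    "messages": [{"role": "user", "content": "Summarize the retrieved context."}],
    "stream": False,                           # one JSON object instead of a stream
}

# Each is POSTed as JSON, e.g. to http://127.0.0.1:11434/api/embed
print(json.dumps(embed_request))
```

The `/api/embed` response carries an `embeddings` list (one vector per input), which is why LocalRAG batches chunks into a single `input` list rather than calling the legacy one-string `/api/embeddings` route.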
The canonical instructions are on the official site:
- Home & docs: ollama.com
- Download / install: ollama.com/download
Follow the steps there for your OS (Windows, macOS, or Linux). The site covers installers, PATH, and optional GPU notes.
- Check the CLI (open a new terminal after install):

  ```
  ollama --version
  ```

- Run the server (LocalRAG expects it reachable at the default `http://127.0.0.1:11434`):

  ```
  ollama serve
  ```

  On many setups the Ollama app starts this for you in the background; if `localrag` or the API cannot reach Ollama, run `ollama serve` explicitly.

- Pull the models LocalRAG uses by default (names match `.env.example`):

  ```
  ollama pull nomic-embed-text
  ollama pull llama3.2
  ```

  You can change models via `OLLAMA_EMBED_MODEL` and `OLLAMA_LLM_MODEL` in `.env`.

- Optional: run LocalRAG's helper to check connectivity and pull the defaults:

  ```
  uv run localrag setup
  ```
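If you'd rather verify connectivity by hand, `GET /api/tags` lists the models the server has installed. A stdlib-only sketch (the function name and return convention are illustrative, not part of LocalRAG's API):

```python
import json
from urllib.error import URLError
from urllib.request import urlopen

def installed_models(base="http://127.0.0.1:11434"):
    """Return installed model names via GET /api/tags, or None if unreachable."""
    try:
        with urlopen(f"{base}/api/tags", timeout=2) as resp:
            data = json.load(resp)
    except (URLError, OSError):
        return None  # server not running / wrong host or port
    return [m["name"] for m in data.get("models", [])]

# Example: check that the default models from .env.example were pulled.
names = installed_models()
if names is None:
    print("Ollama is not reachable; run `ollama serve`.")
else:
    print("installed:", names)
```

A `None` result means the same thing as `uv run localrag setup` failing its connectivity check: the server is not listening on the configured address.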
If you use LocalRAG's `docker-compose.yml`, Ollama runs in a container; pull models inside that container (see the README Docker section). You do not need a host install of Ollama for that path, only for native `uv run localrag` / local API usage.
- Library & API details: github.com/ollama/ollama
- Model list: ollama.com/library