RAG
Local Inference
Chat
MongoDB Atlas

Local RAG Chat With Ollama And Nano

Build a tiny Atlas-backed RAG chat flow using local nano embeddings and Ollama for generation.

Requires Ollama
MongoDB Atlas
Prerequisites

Ollama is installed and running locally.

The llama3.2:3b model is already pulled in Ollama.

vai nano setup has already completed successfully.

MongoDB Atlas is configured through MONGODB_URI or vai config set mongodb-uri.
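The Atlas prerequisite can be sanity-checked before starting. A minimal Python sketch (illustrative only; `vai` performs its own validation at run time, and this helper is not part of the CLI):

```python
import os

def atlas_configured(env=None):
    """Return True if an Atlas connection string appears to be set.

    Checks MONGODB_URI for a mongodb:// or mongodb+srv:// scheme, the same
    variable this demo's prerequisites call for. Illustrative only; `vai`
    validates the actual connection when it runs.
    """
    env = os.environ if env is None else env
    uri = env.get("MONGODB_URI", "")
    return uri.startswith(("mongodb://", "mongodb+srv://"))

print(atlas_configured({"MONGODB_URI": "mongodb+srv://user:pw@cluster0.example.net/"}))  # → True
```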

Under the hood

See the exact VAI command, the matching Voyage AI layer, and the MongoDB query shape behind the demo.

vai chat --db "$DEMO_DB" --collection "$DEMO_COLLECTION" --local --llm-provider ollama --llm-model "$OLLAMA_MODEL" --llm-base-url http://localhost:11434 --no-history --no-stream

This is the high-level chat entrypoint. VAI handles query embedding, Atlas retrieval, prompt assembly, and Ollama generation behind a single command.
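The "MongoDB query shape" behind the retrieval step is an Atlas `$vectorSearch` aggregation stage. A hedged sketch of what it plausibly looks like for this demo (the `embedding` path matches the ingest flags; the index name, candidate count, and projection are illustrative guesses, not confirmed `vai` defaults):

```python
def vector_search_pipeline(query_vector, index="vector_index", k=5):
    """Build an Atlas $vectorSearch aggregation pipeline.

    `path: "embedding"` mirrors the demo's `--field embedding`; the index
    name and candidate counts are placeholders, not confirmed vai defaults.
    """
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": "embedding",
                "queryVector": query_vector,
                "numCandidates": 20 * k,   # oversample, then keep the top k
                "limit": k,
            }
        },
        # Project the chunk text plus its score for prompt assembly.
        {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]

pipeline = vector_search_pipeline([0.0] * 1024)  # nano embeddings are 1024-dim
```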


Exact commands

The full walkthrough is included here so anyone can replay the demo exactly as published.

$ vai --version
$ ollama list
$ vai nano status
$ export DEMO_DB=vai_demo
$ export DEMO_COLLECTION=ollama_nano_chat_demo_$(date +%s)
$ export DEMO_FILE="docs/demos/ollama-nano-chat-docs.jsonl"
$ export OLLAMA_MODEL="llama3.2:3b"
$ cat "$DEMO_FILE"
$ vai ingest --file "$DEMO_FILE" --db "$DEMO_DB" --collection "$DEMO_COLLECTION" --field embedding --text-field text --local --batch-size 3
$ vai index create --db "$DEMO_DB" --collection "$DEMO_COLLECTION" --field embedding --dimensions 1024
$ vai chat --db "$DEMO_DB" --collection "$DEMO_COLLECTION" --local --llm-provider ollama --llm-model "$OLLAMA_MODEL" --llm-base-url http://localhost:11434 --no-history --no-stream
Chat input
> Which models are used in this demo, and what benefits does the document mention?
Chat input
> What is the workflow from ingest to exit?
Chat input
> /quit
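The `vai index create` step above maps onto an Atlas Vector Search index definition. A sketch of the JSON shape it plausibly creates (the 1024 dimensions come straight from `--dimensions 1024`; the `cosine` similarity metric is an assumption, not a confirmed default):

```python
import json

# Hypothetical reconstruction of the index definition behind
# `vai index create --field embedding --dimensions 1024`.
# Atlas Vector Search definitions use this "fields" shape; the similarity
# metric below is an assumed default, not confirmed from the CLI.
index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",     # matches --field embedding
            "numDimensions": 1024,   # matches --dimensions 1024
            "similarity": "cosine",  # assumption; dotProduct/euclidean also exist
        }
    ]
}

print(json.dumps(index_definition, indent=2))
```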

Related demos

More shareable workflows from the same VAI demo library.

Pipeline
MongoDB Atlas
Featured
End-to-End Atlas Pipeline

Run the full workflow in one command: create sample docs, chunk them, embed them, store them in Atlas, and auto-create the vector index.

Atlas
API key

VAI command

vai pipeline /tmp/vai-demo-docs/ --db vai_demo --collection knowledge --create-index

Prerequisites

A valid VOYAGE_API_KEY is set in the environment.
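The chunking step in that pipeline can be sketched simply: split each document into fixed-size, overlapping pieces before embedding. The size and overlap values here are illustrative; `vai pipeline`'s actual defaults are not shown on this page:

```python
def chunk(text, size=200, overlap=40):
    """Fixed-size character chunking with overlap (an illustrative strategy,
    not vai's confirmed one). Overlap preserves context across boundaries."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk("lorem " * 100, size=120, overlap=20)
```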

Retrieval
Reranking
Featured
Two-Stage Retrieval With Reranking

Walk through the classic retrieval stack: embed the query, run Atlas vector search, rerank the candidates, then compare the result to a vector-only pass.

Atlas
API key

VAI command

vai query 'how does vector search work?' --db vai_demo --collection knowledge --model voyage-4-lite

Prerequisites

A valid VOYAGE_API_KEY is set in the environment.
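The two stages in that retrieval stack can be sketched with toy data: a fast vector pass that over-fetches candidates, then a slower reranker that reorders them. The scoring functions below are stand-ins (2-d vectors and lexical overlap), not Voyage's actual embedding or reranking models:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def two_stage(query_vec, docs, rerank_fn, first_k=4, final_k=2):
    """Stage 1: cheap vector similarity keeps the top first_k candidates.
    Stage 2: a costlier reranker (here a stand-in) reorders them; keep final_k."""
    stage1 = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)[:first_k]
    return sorted(stage1, key=rerank_fn, reverse=True)[:final_k]

# Toy corpus: tiny 2-d "embeddings" plus text.
docs = [
    {"text": "vector search basics", "vec": [1.0, 0.1]},
    {"text": "cooking pasta", "vec": [0.0, 1.0]},
    {"text": "how vector search works in Atlas", "vec": [0.9, 0.2]},
]
# Stand-in reranker: term overlap with the query, not a real cross-encoder.
query_terms = {"vector", "search"}
top = two_stage([1.0, 0.0], docs, lambda d: len(query_terms & set(d["text"].split())),
                first_k=3, final_k=2)
```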

Getting Started
Local Inference
Featured
Local Inference With Ollama

Run a local CLI workflow with Ollama generation and local embeddings, without a Voyage API key.

Offline-capable
Requires Ollama

VAI command

vai embed "Local inference keeps retrieval private, fast, and API-key free." --local --dimensions 256

Prerequisites

Ollama is installed and running locally.
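The `--dimensions 256` flag on that embed call implies the local model can emit shorter vectors. One common mechanism is Matryoshka-style truncation, where the leading coordinates are kept and renormalized; whether vai nano works exactly this way is an assumption, not something this page states:

```python
import math

def truncate_embedding(vec, dims):
    """Keep the first `dims` coordinates and renormalize to unit length.
    (Matryoshka-style truncation; an assumed mechanism, not confirmed for vai.)"""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head

short = truncate_embedding([0.5] * 1024, 256)
```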