
Chunking Strategies Before Embedding

Compare fixed, sentence, and markdown chunking on the same sample document before any embedding or storage layer is introduced.

Offline-capable
Prerequisites

The `vai` CLI is installed locally. No API key is required for chunking-only workflows.

Under the hood

See the exact VAI command, the matching Voyage AI layer, and the MongoDB query shape behind the demo.

vai chunk /tmp/sample.md --strategy markdown

This is a purely local preprocessing demo. The important thing is not the command syntax itself, but how each strategy changes the units of meaning that later get embedded.
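To make the difference concrete, here is a minimal, illustrative Python sketch of the fixed and sentence strategies. It is not the vai CLI's actual implementation; the function names and the 60-character window are made up for this example.

```python
# Illustrative sketch of two chunking strategies on the same text --
# not how the vai CLI does it, just the underlying idea.
import re

SAMPLE = ("Vector search finds semantically similar content. "
          "It uses embeddings -- numbers that capture meaning. "
          "Keyword search misses synonyms and context.")

def chunk_fixed(text, size=60):
    """Cut the text into windows of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def chunk_sentence(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s for s in re.split(r'(?<=[.!?])\s+', text) if s]

fixed = chunk_fixed(SAMPLE)        # windows that may cut mid-word
sentences = chunk_sentence(SAMPLE) # one chunk per sentence
```

Fixed chunks are predictable in size but can split a word or idea in half; sentence chunks always end at a natural boundary, so each one is a complete unit of meaning.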


Exact commands

The full walkthrough is included here so anyone can replay the demo exactly as published.

$ echo '=> step 1: chunk your docs. step 2: embed. step 3: store in Atlas.'
$ printf '# Vector Search Guide\n\nVector search finds semantically similar content.\nIt uses embeddings -- numbers that capture meaning.\n\n## How It Works\n\nDocuments are chunked, embedded, and stored in Atlas.\nAt query time, your question is embedded too.\n\n## Why It Matters\n\nKeyword search misses synonyms and context.\nVector search understands intent, not just words.' > /tmp/sample.md
$ echo '=> strategy: fixed -- simple, predictable sizes'
$ vai chunk /tmp/sample.md --strategy fixed --size 100
$ echo '=> strategy: sentence -- respects natural language boundaries'
$ vai chunk /tmp/sample.md --strategy sentence
$ echo '=> strategy: markdown -- heading-aware, best for .md files'
$ vai chunk /tmp/sample.md --strategy markdown
$ echo '=> rule: markdown->markdown | code->recursive | PDF->paragraph'
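A rough idea of what the heading-aware markdown strategy does to the sample file from the walkthrough, sketched in Python. This is illustrative only; vai's real splitter may handle nesting, overlap, and edge cases differently.

```python
# Hedged sketch of heading-aware chunking: start a new chunk at every
# markdown heading so each chunk is a self-contained section.
import re

DOC = """# Vector Search Guide

Vector search finds semantically similar content.
It uses embeddings -- numbers that capture meaning.

## How It Works

Documents are chunked, embedded, and stored in Atlas.
At query time, your question is embedded too.

## Why It Matters

Keyword search misses synonyms and context.
Vector search understands intent, not just words.
"""

def chunk_markdown(md):
    """Break a markdown string into one chunk per heading-led section."""
    chunks, current = [], []
    for line in md.splitlines():
        if re.match(r'^#{1,6} ', line) and current:
            chunks.append('\n'.join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append('\n'.join(current).strip())
    return chunks

parts = chunk_markdown(DOC)  # the title section plus two '##' sections
```

Because each chunk carries its own heading, the later embedding step gets context ("How It Works") along with the body text, which is why the rule above maps `.md` files to the markdown strategy.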

Related demos

More shareable workflows from the same VAI demo library.

Pipeline
MongoDB Atlas
Featured
End-to-End Atlas Pipeline

Run the full workflow in one command: create sample docs, chunk them, embed them, store them in Atlas, and auto-create the vector index.

Atlas
API key

VAI command

vai pipeline /tmp/vai-demo-docs/ --db vai_demo --collection knowledge --create-index
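The command above chains every stage: chunk, embed, store, index, query. A toy, self-contained sketch of those stages follows; the hash-based vector is a stand-in for real Voyage AI embeddings, and a plain Python list stands in for an Atlas collection with a vector index.

```python
# Conceptual pipeline sketch: chunk -> embed -> store -> query.
# The "embedding" is a deterministic toy, not a real model.
import hashlib
import math

def toy_embed(text, dims=8):
    """Deterministic stand-in for an embedding model: hash -> unit vector."""
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255 for b in digest[:dims]]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two unit vectors is just their dot product."""
    return sum(x * y for x, y in zip(a, b))

store = []  # stands in for the Atlas collection
for chunk in ["Chunking splits docs.", "Embeddings capture meaning."]:
    store.append({"text": chunk, "vector": toy_embed(chunk)})

query_vec = toy_embed("Embeddings capture meaning.")
best = max(store, key=lambda d: cosine(d["vector"], query_vec))
```

The real pipeline swaps in Voyage AI for `toy_embed` and an Atlas `$vectorSearch` index for the `max(...)` scan, but the data flow is the same.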


Prerequisites

A valid VOYAGE_API_KEY is set in the environment.

Getting Started
Local Inference
Featured
Local Inference With Ollama

Run a local CLI workflow with Ollama generation and local embeddings, without a Voyage API key.

Offline-capable
Requires Ollama

VAI command

vai embed "Local inference keeps retrieval private, fast, and API-key free." --local --dimensions 256
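One common way reduced-dimension embeddings work (Matryoshka-style training) is to truncate the full vector and re-normalize it. Whether `--dimensions 256` does exactly this under the hood is an assumption, not something the demo confirms; the sketch below only illustrates the general idea.

```python
# Assumed, illustrative mechanism for a --dimensions flag: keep the
# first N components of a full embedding, then re-normalize to unit
# length so cosine similarity still behaves.
import math

def truncate_embedding(vec, dims):
    """Keep the first `dims` components and rescale to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(v * v for v in head)) or 1.0
    return [v / norm for v in head]

full = [0.5, 0.5, 0.5, 0.5]          # pretend 4-dim "full" embedding
small = truncate_embedding(full, 2)  # reduced to 2 dimensions
```

Smaller vectors mean cheaper storage and faster similarity search, at some cost in retrieval quality, which is the trade-off the `--dimensions` flag exposes.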


Prerequisites

Ollama is installed and running locally.