How it works
Pick one of three chunking modes:
- By heading mirrors how heading-aware retrievers split content: each chunk is one H2/H3 section.
- By word count mirrors how fixed-size chunkers (LangChain, LlamaIndex) split content: you set a target, and each chunk holds roughly that many words with sentence-boundary snapping.
- By paragraph mirrors how semantic splitters work: your paragraph breaks become the chunk boundaries.
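As a rough illustration of the word-count mode, here is a minimal sketch of fixed-size chunking with sentence-boundary snapping. The function name, the naive regex sentence splitter, and the default target are assumptions for the example, not the tool's actual implementation:

```python
import re

def chunk_by_word_count(text, target=120):
    """Split text into ~target-word chunks, snapping to sentence boundaries."""
    # Naive sentence split on ., !, or ? followed by whitespace — an assumption;
    # real splitters also handle abbreviations, quotes, and ellipses.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = len(sentence.split())
        # Close the current chunk once adding this sentence would overshoot
        # the target, so no chunk ever breaks mid-sentence.
        if current and count + words > target:
            chunks.append(' '.join(current))
            current, count = [], 0
        current.append(sentence)
        count += words
    if current:
        chunks.append(' '.join(current))
    return chunks
```

Because chunks close only at sentence boundaries, actual sizes float around the target rather than hitting it exactly, which is the trade-off sentence snapping makes against strict fixed-size splitting.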
Why chunk quality matters
- Chunks are the unit of citation: assistants quote chunks, not pages. A well-chunked page can carry 3-5 citations; a badly chunked page carries zero.
- Retrieval is brittle: if a chunk breaks mid-claim, the embedding no longer matches the user's intent and the whole chunk falls out of the top-K results.
- Small chunks starve: a 15-word chunk loses the supporting context that makes its claim citable, so most assistants skip it.
- Big chunks dilute: a 400+ word chunk has a vague embedding that matches many queries weakly and none strongly.
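The size rules of thumb above can be sketched as a simple linter. The function name and the 15/400-word cut-offs are taken directly from the bullets as illustrative thresholds; tune them for your own corpus:

```python
def flag_chunk(chunk, min_words=15, max_words=400):
    """Flag chunks likely to be skipped (too small) or diluted (too big)."""
    n = len(chunk.split())
    if n <= min_words:
        return "starved"   # too little context to be citable
    if n >= max_words:
        return "diluted"   # vague embedding, weak match everywhere
    return "ok"
```

Running this over every chunk of a page gives a quick first pass; anything flagged is a candidate for merging (starved) or splitting (diluted) before deeper rewriting.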
Pair with
Run the Passage Optimizer on any chunk that flags as weak, the Headings Optimizer to make your H2s retrieval-friendly, and the Fact Density Analyzer to raise the citation value of each chunk. The full pattern is in our content-to-cite guide.