# LLM Wiki vs RAG
The fundamental choice in LLM-based knowledge systems: synthesize knowledge at ingest time (LLM Wiki) or at query time (RAG).
## The Difference
| | RAG | LLM Wiki |
|---|---|---|
| When synthesis happens | Query time — every question | Ingest time — once per source |
| What’s stored | Raw chunks + embeddings | Structured, interlinked markdown pages |
| Knowledge accumulation | None — re-derived each query | Compounds with every source added |
| Cross-references | Built dynamically, unreliable | Pre-built, maintained by LLM |
| Contradiction detection | None | Flagged during ingest |
| Query cost | High (retrieval + synthesis every time) | Low (wiki already synthesized) |
| Ingest cost | Low (just embed) | Higher (LLM reads and writes pages) |
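The cost asymmetry in the last two table rows can be made concrete with a toy cost model. The function names and the one-synthesis-call-per-source vs. one-per-query accounting below are illustrative assumptions, not measurements:

```python
# Toy cost model (illustrative only): count LLM synthesis calls.
def rag_llm_calls(num_sources: int, num_queries: int) -> int:
    """RAG: ingest just embeds (no synthesis call assumed here);
    every query pays for retrieval + synthesis."""
    return num_queries

def wiki_llm_calls(num_sources: int, num_queries: int) -> int:
    """LLM Wiki: each source is read and written into pages once at
    ingest; queries read the already-synthesized wiki for free."""
    return num_sources

# With 50 sources queried 1,000 times, ingest-time synthesis wins.
assert wiki_llm_calls(50, 1000) < rag_llm_calls(50, 1000)
```

The crossover is simply where queries outnumber sources, which is the common case for a knowledge base you return to repeatedly.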
## The Key Insight
In RAG, asking a question that requires synthesizing five documents means the LLM has to find and piece together the relevant fragments every time. Nothing is built up.
In LLM Wiki, that synthesis was done when the sources were ingested. The answer is already partially written.
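The contrast can be sketched as two tiny pipelines, with a stubbed `synthesize` standing in for the LLM call. Every name here (`rag_query`, `ingest_to_wiki`, and so on) is hypothetical, chosen only to mirror the distinction above:

```python
def synthesize(fragments):
    # Stand-in for an LLM call that merges fragments into prose.
    return " / ".join(sorted(fragments))

def rag_query(question, chunks):
    # Query time: retrieve relevant chunks, synthesize fresh every call.
    relevant = [c for c in chunks if question in c]
    return synthesize(relevant)

wiki_pages = {}

def ingest_to_wiki(topic, fragments):
    # Ingest time: synthesis happens once; the page accumulates
    # across sources instead of being re-derived per query.
    existing = set(wiki_pages.get(topic, []))
    wiki_pages[topic] = sorted(existing | set(fragments))

def wiki_query(topic):
    # Query time: the answer is already partially written.
    return " / ".join(wiki_pages.get(topic, []))
```

The point of the sketch is the asymmetry: `rag_query` does its synthesis work on every call, while `wiki_query` is a plain read because `ingest_to_wiki` already did that work.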
## The Tradeoff
Synthesizing at ingest time means paying once, saving on every query. But it also means the wiki is only as good as what was extracted. A summary that loses precision gets stored as truth — nothing downstream will catch it automatically. This is why the “discuss key takeaways first” step and lint passes matter.
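One way to picture a lint pass: a deliberately naive contradiction check over a page's lines. A real pass would use the LLM itself to judge contradictions; this keyword version is purely illustrative, and the `NEGATIONS` table is an assumption of this sketch:

```python
# Naive ingest-time lint: flag line pairs where one line looks like
# the direct negation of another. Illustrative only.
NEGATIONS = {"is": "is not", "can": "cannot", "does": "does not"}

def lint_contradictions(page_lines):
    """Return (line, negated_line) pairs that look contradictory."""
    flags = []
    for i, a in enumerate(page_lines):
        for b in page_lines[i + 1:]:
            for pos, neg in NEGATIONS.items():
                if neg in b and b.replace(neg, pos) == a:
                    flags.append((a, b))
    return flags
```

Even a crude check like this illustrates the principle: because the wiki stores synthesized claims as truth, quality gates have to run at ingest, not at query time.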
## When to Use Each
- RAG — large corpora, many users, a need to quote original text verbatim, compliance requirements
- LLM Wiki — personal research, deep dives, topics you’re building expertise in over time
## See Also
- llm-wiki-pattern — the full pattern
- wiki-schema — the schema that governs the wiki