# LLM Wiki vs RAG

The fundamental choice in LLM-based knowledge systems: synthesize knowledge at ingest time (LLM Wiki) or at query time (RAG).

## The Difference

| | RAG | LLM Wiki |
|---|---|---|
| When synthesis happens | Query time — every question | Ingest time — once per source |
| What’s stored | Raw chunks + embeddings | Structured, interlinked markdown pages |
| Knowledge accumulation | None — re-derived each query | Compounds with every source added |
| Cross-references | Built dynamically, unreliable | Pre-built, maintained by LLM |
| Contradiction detection | None | Flagged during ingest |
| Query cost | High (retrieval + synthesis every time) | Low (wiki already synthesized) |
| Ingest cost | Low (just embed) | Higher (LLM reads and writes pages) |
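The cost asymmetry in the last two rows can be sketched with a back-of-envelope model. The unit costs below are illustrative assumptions, not measurements:

```python
def rag_cost(n_sources, n_queries, embed=1, query_synthesis=10):
    # RAG: cheap ingest (just embed each source), but every query
    # pays for retrieval plus LLM synthesis. Unit costs are made up.
    return n_sources * embed + n_queries * query_synthesis

def wiki_cost(n_sources, n_queries, ingest_synthesis=15, page_lookup=2):
    # LLM Wiki: expensive ingest (the LLM reads and writes pages),
    # cheap queries (the wiki is already synthesized).
    return n_sources * ingest_synthesis + n_queries * page_lookup
```

With these toy numbers, RAG wins for a rarely-queried corpus (`rag_cost(10, 1)` is 20 vs `wiki_cost(10, 1)` at 152), while the wiki wins once the same sources are queried often (`rag_cost(10, 100)` is 1010 vs `wiki_cost(10, 100)` at 350).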

## The Key Insight

In RAG, asking a question that requires synthesizing five documents means the LLM has to find and piece together the relevant fragments every time. Nothing is built up between queries.

With an LLM Wiki, that synthesis was done when the sources were ingested. The answer is already partially written.
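The contrast can be made concrete with a toy sketch. Here `fake_llm`, the keyword-overlap retrieval, and the in-memory `wiki_pages` dict are stand-ins for a real LLM call, a vector index, and a page store:

```python
def fake_llm(prompt):
    # Stand-in for a real LLM call.
    return f"synthesized<{prompt}>"

def rag_answer(question, chunks):
    # RAG: retrieval + synthesis happen here, at query time, every time.
    relevant = [c for c in chunks if any(w in c for w in question.split())]
    return fake_llm(" | ".join(relevant))

wiki_pages = {}

def wiki_ingest(title, source):
    # LLM Wiki: synthesis happens here, at ingest time, once per source.
    # Each new source is folded into the existing page, so it compounds.
    prior = wiki_pages.get(title, "")
    wiki_pages[title] = fake_llm(f"{prior} + {source}")

def wiki_answer(title):
    # Query time is just a lookup; no retrieval, no re-synthesis.
    return wiki_pages[title]
```

Note where `fake_llm` is called: inside `rag_answer` (per query) versus inside `wiki_ingest` (per source).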

## The Tradeoff

Synthesizing at ingest time means paying once, saving on every query. But it also means the wiki is only as good as what was extracted. A summary that loses precision gets stored as truth — nothing downstream will catch it automatically. This is why the “discuss key takeaways first” step and lint passes matter.
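One cheap lint pass of the kind mentioned above is a check that a stored page did not drop numbers that appeared in the source, a rough proxy for lost precision. The function name and heuristic here are illustrative, not from the original text:

```python
import re

def lint_lost_numbers(source_text, page_text):
    # Flag numbers present in the source but missing from the wiki page.
    # A crude proxy for a summary that lost precision; a fuller lint
    # suite might also check dates, named entities, and negations.
    src_nums = set(re.findall(r"\d+(?:\.\d+)?", source_text))
    page_nums = set(re.findall(r"\d+(?:\.\d+)?", page_text))
    return sorted(src_nums - page_nums)
```

For example, `lint_lost_numbers("latency fell 40% to 1.2 ms", "latency improved")` flags `["1.2", "40"]` for review.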

## When to Use Each

- **RAG** — large corpora, many users, need to query original text verbatim, compliance requirements
- **LLM Wiki** — personal research, deep dives, topics you’re building expertise in over time
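The criteria above can be collapsed into a toy chooser. The flags and the fallback are assumptions for illustration, not rules stated in the text:

```python
def choose_approach(needs_verbatim=False, compliance=False,
                    many_users=False, long_term_topic=False):
    # RAG when the original text itself matters or scale/compliance dominate.
    if needs_verbatim or compliance or many_users:
        return "RAG"
    # LLM Wiki when knowledge should compound for one researcher over time;
    # defaulting to RAG otherwise is an arbitrary choice for this sketch.
    return "LLM Wiki" if long_term_topic else "RAG"
```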

## See Also