Karpathy’s LLM Wiki: Obsidian, RAG, and Compounding Notes

Imagine asking the same AI assistant about the same documents again and again, while each answer disappears into chat history. Andrej Karpathy’s “LLM Wiki” is a workflow pattern for building personal knowledge bases with large language models. Its central idea is simple: instead of retrieving raw document chunks at question time, as many RAG systems do, an LLM builds and maintains a persistent, interlinked set of markdown pages. It matters because the knowledge becomes a durable artifact rather than a temporary answer.

The pattern is for people and teams who accumulate knowledge over time. Karpathy lists personal tracking, research, book reading, business wikis, competitive analysis, due diligence, trip planning, course notes, and hobby research as possible contexts. In this setup, the human chooses sources, asks questions, and guides emphasis. The LLM performs the maintenance work: summarizing, cross-referencing, filing, updating pages, and noting contradictions. Developers and technical users should care because Karpathy describes the document as an idea file meant to be pasted into an LLM agent such as OpenAI Codex, Claude Code, OpenCode, or similar tools.

The workflow fits inside a local note system, especially Obsidian, when the user wants readable files rather than a hidden database. Obsidian’s own documentation says it stores notes as Markdown-formatted plain text files in a vault, which is a folder on the local file system. Karpathy describes Obsidian as the interface where he browses the wiki in real time while the LLM edits it. Public implementations inspired by the idea also use Obsidian as the reading and editing surface, with raw sources going into one folder and generated wiki pages going into another.
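The two-folder layout can be sketched in a few lines; the vault and folder names ("raw" and "wiki") are illustrative assumptions, not something Karpathy or Obsidian prescribes:

```python
from pathlib import Path

# Hypothetical vault layout: raw sources in one folder, generated wiki
# pages in another. Obsidian treats any folder of Markdown files as a
# vault, so opening "llm-wiki-vault" makes both layers browsable.
vault = Path("llm-wiki-vault")
for sub in ("raw", "wiki"):
    (vault / sub).mkdir(parents=True, exist_ok=True)

print(sorted(p.name for p in vault.iterdir()))  # → ['raw', 'wiki']
```

Because everything is plain Markdown on the local file system, the same vault stays readable with or without the LLM in the loop.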

In practice, the system has three layers: raw sources, the wiki, and a schema file. The raw sources are the curated documents and remain the source of truth. The wiki is made of LLM-generated markdown files, including summaries, entity pages, concept pages, comparisons, overviews, and syntheses. The schema file, such as CLAUDE.md for Claude Code or AGENTS.md for Codex, tells the model how to structure pages and follow workflows. The analogy is a newsroom archive: reporters gather sources, but an editor keeps the index, backgrounders, and cross-references coherent.
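A minimal sketch of what such a schema file might contain; the headings and rules below are illustrative assumptions, not Karpathy's published schema:

```markdown
# Wiki schema (illustrative)

## Folders
- raw/: curated source documents; the source of truth. Never edit these.
- wiki/: generated pages; safe to rewrite on every pass.

## Page conventions
- One page per entity or concept, connected with [[wikilinks]].
- Every claim points back to a file in raw/.
- When sources conflict, record the contradiction instead of silently picking a side.

## Workflow
- After new files land in raw/, update the affected wiki pages and the index.
```

Dropping a file like this into the vault root is what lets tools such as Claude Code (via CLAUDE.md) or Codex (via AGENTS.md) pick up the conventions automatically.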

What comes next is cautious experimentation, not a guaranteed replacement for RAG. Karpathy writes that an index file can work at moderate scale and can avoid embedding-based RAG infrastructure in some cases, while later comments and projects suggest adding search, dashboards, CLIs, or local runtimes as the wiki grows. Public sources do not clearly confirm that LLM wikis outperform RAG in all settings. A practical next step today is to create a small Obsidian vault, place a few trusted sources in a raw folder, define a simple schema, and ask an LLM agent to generate and maintain a compact markdown wiki.
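The maintenance side of that last step can be pictured as a small script. The index page, paths, and `[[wikilink]]` convention below are assumptions for illustration, not part of any published implementation:

```python
from pathlib import Path

def rebuild_index(vault: Path) -> str:
    """Regenerate a wiki index page linking every raw source — the kind
    of bookkeeping page the LLM agent would keep current on each pass."""
    raw_pages = sorted((vault / "raw").glob("*.md"))
    lines = ["# Index", ""]
    lines += [f"- [[raw/{p.stem}]]" for p in raw_pages]
    page = "\n".join(lines) + "\n"
    (vault / "wiki").mkdir(parents=True, exist_ok=True)
    (vault / "wiki" / "index.md").write_text(page, encoding="utf-8")
    return page
```

In the actual pattern the agent, not a script, rewrites such pages, but the shape of the work is the same: read the raw layer, regenerate the wiki layer, keep the links coherent.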
