Retrieval-augmented generation (RAG)
A pattern where the system retrieves relevant document chunks first, then hands them to the model to generate an answer. Useful for million-document enterprise stacks. A working REALTOR rarely needs it: paste the material into a 200K-token context window instead.
What it does (the operator translation)
The full version: split your documents into chunks, convert each chunk into a numerical fingerprint (an embedding), and store those fingerprints in a vector database. When a question comes in, convert the question the same way, search the database for the closest-matching chunks, hand those chunks to the model, and generate an answer.
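To make those steps concrete, here is a minimal sketch in Python. It is a toy, not a production design: the embed function is a simple word-count stand-in for a real embedding model, the "vector database" is a plain list, and every function name and sample listing is made up for illustration.

```python
import math
import re
from collections import Counter


def chunk(text, size=40):
    """Split text into chunks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(documents, size=40):
    """Stand-in for a vector database: a list of (chunk, vector) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc, size)]


def retrieve(index, question, k=2):
    """Return the k chunks whose vectors are closest to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Assemble the text that would be handed to the model."""
    context = "\n---\n".join(chunks)
    return (
        "Answer using only the context below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical sample documents, invented for this example.
docs = [
    "The Maple Street listing has three bedrooms and a fenced yard.",
    "The Oak Avenue condo has a rooftop deck and one parking space.",
]
index = build_index(docs)
question = "Which listing has a fenced yard?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
```

Only the retrieved chunk reaches the prompt. A real stack swaps in an embedding model and a database for the two stand-ins, which is where the infrastructure cost comes from.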
That's a real engineering pattern for real enterprise problems. A law firm with 50 million contracts. A healthcare system with 20 years of patient records. A SaaS company with a documentation site nobody can search.
A working REALTOR is not that. You have 487 contacts in your sphere, 200 active listings in Realtracs you can pull on demand, and 14 days of messages on your phone. All of that fits inside a 200K-token context window. You don't need to retrieve. You paste.
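A quick way to sanity-check the "it fits" claim on your own exports is a rough token estimate. The sketch below assumes about four characters per token, a common rule of thumb for English text and not an exact count, and the contact rows are made-up placeholders. The function names and the answer reserve are also assumptions for illustration.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # the window size used in the text
CHARS_PER_TOKEN = 4              # rough rule of thumb for English (assumption)


def estimate_tokens(text):
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_context(texts, reserve_for_answer=4_000):
    """Return (estimated_total_tokens, fits) for a list of pasted texts.

    `reserve_for_answer` holds back room for the model's reply; the
    default is an arbitrary illustrative value.
    """
    total = sum(estimate_tokens(t) for t in texts)
    budget = CONTEXT_WINDOW_TOKENS - reserve_for_answer
    return total, total <= budget


# Placeholder export: 487 identical fake rows standing in for a sphere CSV.
contacts_csv = "name,phone,last_contact\n" + (
    "Jane Doe,615-555-0100,2024-01-15\n" * 487
)
total, ok = fits_in_context([contacts_csv])

# A deliberately oversized input to show the other side of the check.
big_total, big_ok = fits_in_context(["x" * 1_000_000])
```

If the check comes back true with room to spare, paste. If your real exports come back false, that is the signal to start asking whether retrieval is worth it.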
Yan/Husain Part I — actual builders — make the point cleanly: don't build retrieval infrastructure if context will fit. The added complexity of an embeddings + vector DB stack only pays when you've genuinely outgrown the paste.
The vendor pitch — "we built a RAG system on your CRM data" — is usually a wrapper around something the agent could do by exporting the CSV and pasting it. Same answer, no infrastructure, no monthly fee. Simon Willison's weblog covers the same anti-pattern — engineering complexity that doesn't pay for the use case. Yan/Husain Part III on strategy puts it cleanly: "Don't buy SaaS for what an LLM can do."
Why a working REALTOR cares (the breakpoint)
95%+ of working agents don't need RAG. The breakpoint that flips it: a multi-thousand-listing IDX with five years of historical activity, where the team needs ad-hoc Q&A across that whole corpus. That's enterprise. That's not the median agent.
What this is NOT (the category-flip)
RAG is NOT memory. It's a way to bring the right text into a single prompt. And it's NOT magic. If your underlying documents are wrong, RAG retrieves the wrong text and the model answers from it confidently, which compounds the hallucination problem.
Related terms
Context window · Large language model · Weak signal · Hallucination
Where this comes up in The Listing Machine
The Listing Machine uses RAG as its example of an "engineering pattern that doesn't help a 12-deal agent." The Tuesday-morning paste workflow does the same job for the median REALTOR, without the vector database underneath.