Explainer

Retrieval-Augmented Generation

Giving a model fresh, private knowledge at answer time.

Retrieval-Augmented Generation (RAG) connects a model to an external source of documents. Instead of relying only on what it memorized during training, the system retrieves relevant passages and includes them in the prompt.

This addresses two problems at once: the model can answer questions about private or recent information it was never trained on, and its answers can point to the specific passages they drew on.

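A typical pipeline embeds your documents as vectors, embeds the question the same way, finds the closest matches, and hands those passages to the model as context. The quality of retrieval usually matters more than the choice of model: if the right passage is never retrieved, the model has nothing to ground its answer in.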
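
The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the embed function here is a toy word-count vector standing in for a learned embedding model, and the names embed, cosine, retrieve, build_prompt, and DOCS are all hypothetical. The resulting prompt would then be sent to whichever model you use; that call is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts.

    A real system would call a learned embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=2):
    """Return the indices of the k documents closest to the question."""
    q = embed(question)
    scored = [(cosine(q, embed(d)), i) for i, d in enumerate(documents)]
    # Highest similarity first; ties broken by original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


def build_prompt(question, documents, k=2):
    """Place the retrieved passages in the prompt, numbered so they can be cited."""
    hits = retrieve(question, documents, k)
    context = "\n".join(f"[{i + 1}] {documents[i]}" for i in hits)
    return (
        "Answer using only the numbered sources below and cite them by number.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical private documents the model was never trained on.
DOCS = [
    "The refund window for annual plans is 30 days from purchase.",
    "Support is available by email on weekdays.",
    "Annual plans renew automatically unless cancelled.",
]

if __name__ == "__main__":
    print(build_prompt("What is the refund window for annual plans?", DOCS))
```

Numbering each passage in the prompt is one simple way to let the answer cite its source. Swapping the toy embed function for a stronger embedding model changes which passages are retrieved without touching the rest of the sketch, which is one reason retrieval quality has so much influence on the final answer.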