
01 RAG Intro

Overall objective of these notes: With LangChain as the orchestration framework, improve LLMs by applying a RAG system.

  • This folder contains conceptual notes.

  • This repo contains working code.


Intuition behind LLMxRAG

There are 2 sources of info:

  • public - what LLMs have

  • private - what LLMs DO NOT have. RAG is how we provide this private info to the LLM.

Sam

More technically: After applying RAG, the LLM has 2 sources of memory:

  • parametric: public info the LLM already learned (stored in its weights)

  • non-parametric: private documents we provide at query time


RAG stands for retrieval-augmented generation.

  • retrieval: pulls the relevant private documents

  • augmented: adds those documents to the prompt sent to the LLM

  • generation: the LLM uses our private documents to respond

Overall, an LLMxRAG system's purpose is to enhance an LLM's accuracy & relevance.
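
The retrieval, augmentation, and generation steps can be sketched in a few lines of plain Python. This is only a minimal illustration, not LangChain code: `PRIVATE_DOCS`, `retrieve`, `augment`, and `generate` are hypothetical names, the word-overlap scoring is a stand-in for a real retriever, and `generate` is a stub instead of an actual LLM call.

```python
# Toy stand-ins for our private documents (the non-parametric memory).
PRIVATE_DOCS = [
    "Our refund window is 30 days from delivery.",
    "Support is available on weekdays from 9am to 5pm.",
]


def _words(text: str) -> set[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Retrieval: rank docs by how many words they share with the query."""
    q = _words(query)
    ranked = sorted(docs, key=lambda d: len(q & _words(d)), reverse=True)
    return ranked[:k]


def augment(query: str, context: list[str]) -> str:
    """Augmentation: place the retrieved text into the prompt."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Generation: stub standing in for a real LLM call (assumption)."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


query = "What is the refund window?"
context = retrieve(query, PRIVATE_DOCS)
prompt = augment(query, context)
answer = generate(prompt)
```

The point of the sketch is the data flow: the LLM never sees the private documents unless the retrieval step selects them and the augmentation step writes them into the prompt.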


Acronyms


My acronyms, in sequential order.

Indexing Pipeline

  • docs | our internal documents, unknown to the LLM

  • KB | the knowledge base where we store these docs as embeddings

  • I | the indexing pipeline which prepares info for the generation pipeline

Generation (RAG) Pipeline

  • Q | The user's query/prompt.

  • R | The retrieval process where we pull docs from KB

  • A | The augmentation step, where we send the docs pulled by R to the LLM together with Q

  • G | The generation step, where the LLM answers Q

Hierarchies


2 major structural hierarchies:

  • Layer hierarchy ⟶ architectural (how the system is built)

  • Pipeline hierarchy ⟶ functional (what the system does)

Hierarchy 1: Layers

See LLM-RAG/07-RAG-Ops.

Critical Layers

  • Data

  • Model

  • Model Deployment

  • App Orchestration

Essential Layers

  • Prompt

  • Evaluation

  • Monitoring

  • Security & Privacy

  • Caching

Enhancement Layers

  • Human-in-the-Loop

  • Cost Optimization

  • Explainability

  • Collaboration & Experimentation

Hierarchy 2: Pipelines

See LLM-RAG/03-INDEXING and LLM-RAG/04-GENERATING.

Indexing Pipeline

  • load: connect, extract, metadata, transform

  • chunk: divide, merge, overlap

  • embed

  • store
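
The four indexing steps above can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not the repo's code and not LangChain's API: the function names, the word-based chunking, and the bag-of-words "embedding" are all hypothetical simplifications (a real pipeline would use a document loader, a text splitter, an embedding model, and a vector store).

```python
from collections import Counter


def load(raw: dict[str, str]) -> list[dict]:
    """Load: turn raw sources into records holding text plus metadata."""
    return [{"text": text.strip(), "source": name} for name, text in raw.items()]


def chunk(record: dict, size: int = 8, overlap: int = 2) -> list[dict]:
    """Chunk: windows of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = record["text"].split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = " ".join(words[start:start + size])
        chunks.append({"text": piece, "source": record["source"]})
        if start + size >= len(words):  # last window reached the end of the text
            break
    return chunks


def embed(text: str) -> Counter:
    """Embed: toy bag-of-words vector (assumption; stands in for an embedding model)."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def store(kb: list[dict], chunks: list[dict]) -> None:
    """Store: keep each chunk alongside its vector in an in-memory knowledge base."""
    for c in chunks:
        kb.append({**c, "vector": embed(c["text"])})


raw_docs = {
    "policy.txt": "Refunds are accepted within 30 days of delivery. "
                  "Contact support on weekdays to start a return.",
}
kb: list[dict] = []
for record in load(raw_docs):
    store(kb, chunk(record))
```

With `size=8` and `overlap=2`, the 16-word sample text becomes three chunks, and each chunk starts with the last two words of the one before it, so a sentence cut at a chunk boundary still appears with some surrounding context.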

Generating Pipeline

  • R | retrieving

  • A | prompt managing (augmenting)

  • G | LLM constructing (generating)
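
A hedged sketch of the R, A, and G steps over an embedded knowledge base follows. It assumes the KB holds toy bag-of-words vectors and scores them with cosine similarity; `KB`, `embed`, `cosine`, `retrieve`, `build_prompt`, and `fake_llm` are hypothetical names, and `fake_llm` is a placeholder for whatever model client the real system uses.

```python
import math
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption; stands in for an embedding model)."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical knowledge base, as an indexing pipeline might have produced it.
KB = [
    {"text": t, "vector": embed(t)}
    for t in (
        "Refunds are accepted within 30 days of delivery.",
        "Support is available on weekdays from 9am to 5pm.",
        "Shipping is free for orders over 50 dollars.",
    )
]


def retrieve(query: str, kb: list[dict], k: int = 2) -> list[dict]:
    """R: return the k entries whose vectors are closest to the query vector."""
    q = embed(query)
    return sorted(kb, key=lambda e: cosine(q, e["vector"]), reverse=True)[:k]


def build_prompt(query: str, retrieved: list[dict]) -> str:
    """A: fill a prompt template with the retrieved context and the query."""
    context = "\n".join(f"- {e['text']}" for e in retrieved)
    return f"Context:\n{context}\n\nAnswer the question using the context.\nQuestion: {query}"


def generate(prompt: str, llm: Callable[[str], str]) -> str:
    """G: hand the augmented prompt to the LLM callable."""
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    """Placeholder model (assumption): echoes the first context line."""
    return prompt.splitlines()[1].removeprefix("- ")


query = "When is support available?"
top = retrieve(query, KB, k=1)
answer = generate(build_prompt(query, top), fake_llm)
```

Passing the model in as a callable keeps the three steps separable, so the retriever or prompt template can be changed and evaluated without touching the generation step.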

[Figure: Indexing Pipeline]

[Figure: Generation Pipeline]


Footnotes

Images from textbook:


RAG system analogy

  • Pipelines (assembly line)

  • System-level components (quality inspector)

  • RAGOps infrastructure / layers (electricity in factory)


Ch 9 - dev framework

6 stages:

  1. Initiation: gather reqs, design architecture

  2. Design: I & G pipelines

  3. Development: develop pipelines, create prototype for evaluation

  4. Evaluation: assess components & system performance

  5. Deployment

  6. Maintenance: track & improve