01 RAG Intro

Overall objective of these notes: With LangChain as the orchestration framework, improve LLMs by applying a RAG system.

This folder contains conceptual notes.
This repo contains working code.

Intuition behind LLMxRAG

There are 2 sources of info:

public - what LLMs have
private - what LLMs DO NOT have. RAG is how we provide this private info to the LLM.

Sam

More technically: After applying RAG, the LLM has 2 sources of memory:

parametric: public info LLM already learned
non-parametric: private documents we provide

RAG stands for retrieval-augmented generation.

retrieval: pull the relevant private documents
augmented: add those documents to the prompt sent to the LLM
generation: the LLM uses those documents to respond

Overall, an LLMxRAG system's purpose is to enhance an LLM's accuracy & relevance.
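
A minimal sketch of the three steps, in plain Python rather than LangChain. Everything here is an assumption for illustration: the sample documents, the word-overlap scoring, and the `generate` stub that stands in for a real LLM call.

```python
import re

# Hypothetical private documents the LLM has never seen.
PRIVATE_DOCS = [
    "Our refund window is 45 days from delivery.",
    "Support hours are 9am to 5pm on weekdays.",
]


def words(text: str) -> set[str]:
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Retrieval: rank docs by shared words with the query (toy scoring)."""
    query_words = words(query)
    ranked = sorted(docs, key=lambda d: len(query_words & words(d)), reverse=True)
    return ranked[:k]


def augment(query: str, context: list[str]) -> str:
    """Augmentation: place the retrieved text into the prompt."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Generation: placeholder for an LLM call (assumption, not a real model)."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


question = "What is the refund window?"
prompt = augment(question, retrieve(question, PRIVATE_DOCS))
print(generate(prompt))
```

In a real system the retrieval step would query a vector store and `generate` would call a model; the shape of the flow (retrieve, then build the prompt, then call the model) is the part that carries over.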

Acronyms

My acronyms, in sequential order.

Indexing Pipeline

docs | our internal documents, unknown to the LLM
KB | the knowledge base where we store these docs as embeddings
I | the indexing pipeline which prepares info for the generation pipeline

Generation (RAG) Pipeline

Q | The user's query/prompt.
R | The retrieval process where we pull docs from KB
A | The augmentation step, where we send the retrieved docs (R) together with Q to the LLM
G | The generation process, where the LLM answers Q

Hierarchies

2 major structural hierarchies:

Layer hierarchy ⟶ architectural (how the system is built)
Pipeline hierarchy ⟶ functional (what the system does)

Hierarchy 1: Layers

See LLM-RAG/07-RAG-Ops

Critical Layers

Data
Model
Model Deployment
App Orchestration

Essential Layers

Prompt
Evaluation
Monitoring
Security & Privacy
Caching

Enhancement Layers

Human-in-the-Loop
Cost Optimization
Explainability
Collaboration & Experimentation

Hierarchy 2: Pipelines

See LLM-RAG/03-INDEXING and LLM-RAG/04-GENERATING

Indexing Pipeline

load: Connect ⟶ Extract ⟶ Metadata ⟶ Transform
chunk: Divide ⟶ Merge ⟶ Overlap
embed
store
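
A rough sketch of load, chunk, embed, store using only the standard library. The word-window chunker, the bag-of-words `embed`, and the list-of-dicts KB are all stand-ins I am assuming for illustration; a real pipeline would use a document loader, a text splitter, an embedding model, and a vector store. The Merge step is left out to keep it short.

```python
import re
from collections import Counter


def chunk(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Divide text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    tokens = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: word counts (assumption; real systems use a learned model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def index(docs: dict[str, str]) -> list[dict]:
    """Run load -> chunk -> embed -> store and return an in-memory KB."""
    kb = []
    for source, text in docs.items():            # load: docs are already in memory here
        for i, piece in enumerate(chunk(text)):  # chunk
            kb.append({
                "source": source,                # metadata kept with each chunk
                "chunk_id": i,
                "text": piece,
                "vector": embed(piece),          # embed
            })                                   # store
    return kb


kb = index({"policy.txt": "a b c d e f g h i j"})
print([entry["text"] for entry in kb])
```

The overlap means neighbouring chunks repeat a few words, so a sentence that straddles a boundary still appears whole in at least one chunk.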

Generating Pipeline

R | retrieving
A | prompt managing (augmenting)
G | LLM constructing (generating)
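
One way the R, A, G stages could fit together, sketched without LangChain. The cosine ranking over bag-of-words vectors, the `PROMPT_TEMPLATE` wording, and the `llm` callable are my own assumptions for illustration; in practice the KB would hold precomputed embeddings and `llm` would be a real model client.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (assumption; stands in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, kb: list[str], k: int = 2) -> list[str]:
    """R: return the k KB chunks most similar to the query."""
    q = embed(query)
    return sorted(kb, key=lambda text: cosine(q, embed(text)), reverse=True)[:k]


# Hypothetical prompt template; the wording is illustrative only.
PROMPT_TEMPLATE = "Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def build_prompt(query: str, chunks: list[str]) -> str:
    """A: fill the template with the retrieved chunks and the user's query."""
    context = "\n".join(f"- {c}" for c in chunks)
    return PROMPT_TEMPLATE.format(context=context, question=query)


def answer(query: str, kb: list[str], llm) -> str:
    """G: pass the augmented prompt to any callable llm(prompt) -> str."""
    return llm(build_prompt(query, retrieve(query, kb)))


kb = [
    "Refunds are accepted within 45 days.",
    "Support is open on weekdays.",
    "Shipping takes 3 days.",
]
# An echo function stands in for the model so the prompt itself is visible.
print(answer("How many days for refunds?", kb, llm=lambda prompt: prompt))
```

Passing `llm` in as a callable keeps prompt management separate from the model, so the same R and A code can be exercised with a stub before a real model is wired in.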

[Figure: Indexing Pipeline]

[Figure: Generation Pipeline]

Footnotes

Images from textbook:

Hierarchy 1: Layers (just the critical layers)
Hierarchy 2: Pipelines

RAG system analogy

Pipelines (assembly line)
System-level components (quality inspector)
RAGOps infrastructure / layers (electricity in factory)

Ch 9 - dev framework

6 stages:

Initiation: gather reqs, design architecture
Design: I & G pipelines
Development: develop pipelines, create prototype for evaluation
Evaluation: assess components & system performance
Deployment
Maintenance: track & improve