01 RAG Intro
Overall objective of these notes: With LangChain as the orchestration framework, improve LLMs by applying a RAG system.
- This folder contains conceptual notes.
- This repo contains working code.
Intuition behind LLMxRAG
There are 2 sources of info:
- public - what LLMs have
- private - what LLMs DO NOT have. RAG is how we provide this private info to the LLM.
Sam
More technically: After applying RAG, the LLM has 2 sources of memory:
- parametric: public info the LLM already learned
- non-parametric: private documents we provide
RAG stands for retrieval-augmented generation.
- retrieval: pulls the relevant private documents
- augmented: adds those documents to the prompt sent to the LLM
- generation: the LLM uses our private documents to respond
Overall, an LLMxRAG system's purpose is to enhance an LLM's accuracy & relevance.
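To make the three words concrete, here is a minimal sketch of retrieve ⟶ augment ⟶ generate in plain Python. Everything in it is an assumption for illustration: the sample documents, the word-overlap scoring (a real system would normally compare embeddings), and the `generate` stub that stands in for an actual LLM call.

```python
import re

# Hypothetical private documents (the non-parametric memory).
PRIVATE_DOCS = [
    "Refund requests must be filed within 30 days of purchase.",
    "The office is closed on public holidays.",
    "Support tickets are answered within two business days.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, docs, k=1):
    """Retrieval: rank docs by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:k]


def augment(query, context):
    """Augmentation: place the retrieved docs in the prompt next to the query."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def generate(prompt):
    """Generation: stub standing in for a real LLM call (assumption)."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


query = "How many days do I have to request a refund?"
context = retrieve(query, PRIVATE_DOCS)
prompt = augment(query, context)
answer = generate(prompt)
print(prompt)
```

The LLM only ever sees `prompt`, so anything it knows about the private documents has to arrive through the augmentation step.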
Acronyms
My acronyms, in sequential order.
Indexing Pipeline
- docs | our internal documents, unknown to the LLM
- KB | the knowledge base where we store these docs as embeddings
- I | the indexing pipeline, which prepares info for the generation pipeline
Generation (RAG) Pipeline
- Q | the user's query/prompt
- R | the retrieval process, where we pull docs from KB
- A | the augmentation, where we add the docs retrieved by R to the prompt sent to the LLM
- G | the generation process, where the LLM answers Q
Hierarchies
2 major structural hierarchies:
- Layer hierarchy ⟶ architectural (how the system is built)
- Pipeline hierarchy ⟶ functional (what the system does)
Hierarchy 1: Layers
See LLM-RAG/07-RAG-Ops
Critical Layers
- Data
- Model
- Model Deployment
- App Orchestration
Essential Layers
- Prompt
- Evaluation
- Monitoring
- Security & Privacy
- Caching
Enhancement Layers
- Human-in-the-Loop
- Cost Optimization
- Explainability
- Collaboration & Experimentation
Hierarchy 2: Pipelines
See LLM-RAG/03-INDEXING and LLM-RAG/04-GENERATING
Indexing Pipeline
- load: Connect ⟶ Extract ⟶ Metadata ⟶ Transform
- chunk: Divide ⟶ Merge ⟶ Overlap
- embed
- store
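The four indexing stages can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the pipeline from the linked notes: the `docs` input stands in for an already-loaded document with metadata, `chunk` only divides and overlaps (no merge step), `embed` is a hashed bag-of-words vector rather than a real embedding model, and the "store" is a Python list instead of a vector database. All names are hypothetical.

```python
import math
import zlib


def chunk(text, size=8, overlap=2):
    """Divide text into windows of `size` words; neighbours share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text, dim=16):
    """Toy embedding (assumption): hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def index(docs, size=8, overlap=2):
    """Chunk, embed, and store every loaded document; returns the KB as a list."""
    kb = []
    for doc in docs:
        for piece in chunk(doc["text"], size, overlap):
            kb.append({"source": doc["source"], "text": piece, "vector": embed(piece)})
    return kb


# "load" is simulated: one hypothetical document of 20 words plus its metadata.
docs = [{"source": "handbook.txt", "text": " ".join(f"w{i}" for i in range(20))}]
kb = index(docs)
for entry in kb:
    print(entry["source"], "|", entry["text"])
```

The overlap means the last two words of one chunk reappear at the start of the next, so a sentence that straddles a chunk boundary is still retrievable from at least one chunk.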
Generating Pipeline
- R | retrieving
- A | prompt managing (augmenting)
- G | LLM constructing (generating)
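As a rough illustration of how R, A, and G hand off to each other once the KB already holds embeddings, the sketch below does a cosine-similarity top-k lookup, builds a prompt from the hits, and passes it to a model. The KB entries, their 3-dimensional vectors, the query vector, and `fake_llm` are all made up for the example; in practice the vectors would come from an embedding model and the final call from a real LLM client.

```python
import math

# Hypothetical KB entries with hand-written vectors (assumption).
KB = [
    {"text": "Refunds are accepted within 30 days.", "vector": [0.9, 0.1, 0.0]},
    {"text": "The office closes at 6 pm.", "vector": [0.1, 0.9, 0.0]},
    {"text": "Refunds are paid to the original card.", "vector": [0.8, 0.0, 0.2]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vector, kb, k=2):
    """R: return the k KB entries most similar to the query vector."""
    return sorted(kb, key=lambda e: cosine(query_vector, e["vector"]), reverse=True)[:k]


def augment(question, entries):
    """A: fill a simple prompt template with the retrieved text."""
    context = "\n".join(f"{i}. {e['text']}" for i, e in enumerate(entries, start=1))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def generate(prompt, llm):
    """G: hand the augmented prompt to whatever callable wraps the LLM."""
    return llm(prompt)


def fake_llm(prompt):
    """Stand-in for a real model call (assumption)."""
    return "stub answer"


question = "How do refunds work?"
query_vector = [1.0, 0.0, 0.0]  # pretend embedding of the question
hits = retrieve(query_vector, KB)
prompt = augment(question, hits)
answer = generate(prompt, fake_llm)
print(prompt)
```

Passing the model in as a plain callable keeps the three stages independent, which makes it easy to swap the stub for a real client later.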
Footnotes
Images from textbook:
RAG system analogy
- Pipelines (assembly line)
- System-level components (quality inspector)
- RAGOps infrastructure / layers (electricity in factory)
Ch 9 - dev framework
6 stages:
- Initiation: gather reqs, design architecture
- Design: I & G pipelines
- Development: develop pipelines, create prototype for evaluation
- Evaluation: assess components & system performance
- Deployment
- Maintenance: track & improve