
04 GENERATING

TL;DR

Sam

LLM x RAG systems have 2 sources of memory available.

  • Parametric: learned during initial LLM training.

  • Non-parametric: info stored in our KB

    • Indexing pipeline: creates KB

    • Generation pipeline: retrieves from KB (THIS DOC)

Generation Pipeline: input Q ⟶ respond with LLM x RAG:

  1. Retrieval: Retrieve info from KB based on Q.

  2. Augmentation: Augment Q with fetched info, create prompt for LLM.

  3. Generation: Generate response via LLM.
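The three steps above can be sketched end to end. This is a minimal illustration, not a real implementation: `retrieve`, `augment`, and `generate` are hypothetical stand-ins (a toy word-overlap matcher and a template string in place of a real embedding search and LLM call).

```python
# Minimal RAG generation pipeline sketch: retrieve -> augment -> generate.
# All components are toy stand-ins; a real system would use an embedding
# search for retrieve() and an LLM API call for generate().

KB = [
    "RAG combines retrieval with generation.",
    "FAISS is a library for vector similarity search.",
    "BM25 is a keyword-based ranking function.",
]

def tokenize(text):
    return {w.strip(".,?!") for w in text.lower().split()}

def retrieve(question, kb, k=2):
    # Toy retriever: rank KB docs by word overlap with the question.
    q = tokenize(question)
    return sorted(kb, key=lambda d: -len(q & tokenize(d)))[:k]

def augment(question, docs):
    # Build an augmented prompt from the question and the retrieved docs.
    context = "\n".join(docs)
    return f"Answer based only on this context:\n{context}\n\nQuestion: {question}"

def generate(prompt):
    # Stand-in for an LLM call; just reports the prompt size here.
    return f"[LLM response to a {len(prompt)}-char prompt]"

answer = generate(augment("What is FAISS?", retrieve("What is FAISS?", KB)))
print(answer)
```

Swapping in a real retriever or LLM only changes the bodies of these three functions; the pipeline shape stays the same.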

Generation Pipeline


1. Retrieval


Process:

  1. Input Q

  2. Search KB for matching docs (stored embeddings)

  3. Fetch info

  4. Output list
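The search step (match Q against stored embeddings) reduces to a nearest-neighbor search over vectors. A sketch with made-up 3-d vectors standing in for real model embeddings; the brute-force cosine ranking shown here is exactly what vector stores like FAISS accelerate at scale.

```python
import math

# Step 2 of the process: search the KB's stored embeddings for the best
# matches. The tiny hand-made 3-d vectors are toy stand-ins for real
# embeddings produced by a model.

kb = {
    "doc_cats":    [0.9, 0.1, 0.0],
    "doc_dogs":    [0.8, 0.2, 0.1],
    "doc_finance": [0.0, 0.1, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(query_vec, kb, k=2):
    # Brute-force: rank every stored embedding by similarity to the query.
    ranked = sorted(kb, key=lambda doc_id: cosine(query_vec, kb[doc_id]), reverse=True)
    return ranked[:k]

print(search([0.85, 0.15, 0.05], kb))  # a "pet-related" query vector
```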

Retrieval methods

LangChain abstracts these algorithms as retrievers.


TF-IDF: keyword-based, uses TF and IDF to score words.

BM25: probabilistic variant of TF-IDF. Adds length normalization & saturation effects so longer documents aren’t unfairly favored.
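BM25's length normalization and saturation can be seen in a minimal sketch of the standard Okapi BM25 formula (with the usual defaults k1 = 1.5, b = 0.75); the corpus here is made up for illustration.

```python
import math

# Minimal Okapi BM25 sketch (standard parameters k1=1.5, b=0.75).
# The b term normalizes for document length; k1 saturates the
# contribution of repeated terms.

docs = [
    "the cat sat on the mat".split(),
    "the cat cat cat cat cat".split(),
    "dogs chase cats in the park and the park is large".split(),
]
avgdl = sum(len(d) for d in docs) / len(docs)

def idf(term):
    n = sum(term in d for d in docs)
    # BM25's smoothed IDF: rarer terms score higher.
    return math.log((len(docs) - n + 0.5) / (n + 0.5) + 1)

def bm25(query, doc, k1=1.5, b=0.75):
    score = 0.0
    for term in query:
        tf = doc.count(term)
        denom = tf + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf(term) * tf * (k1 + 1) / denom
    return score

scores = [bm25(["cat"], d) for d in docs]
```

Note the saturation effect: the second document contains "cat" five times but scores well under five times the first document's score.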

Static Word Embeddings: vector-based semantics (fixed meaning per word)

  • Represents words as dense vectors (e.g., Word2Vec, GloVe)

Contextual Embeddings: context-aware semantics (meanings shift with context)

  • Handles polysemy & nuanced meanings

  • Embeddings from models (e.g., BERT, GPT)


  1. Vector stores and DBs:

    1. Combine FAISS with a contextual embedding model

    2. Pinecone / Milvus / Weaviate combine dense and keyword-based retrieval methods ⟶ provide hybrid search functionality.

  2. Cloud providers: Includes infrastructure, APIs, and tools for info retrieval

  3. Web info: Connect to Wikipedia / Arxiv / AskNews / etc. See LangChain's retriever integrations.

2. Augmentation

Apply prompt engineering. The goal is to best augment the LLM with the Q & retrieved info.

Prompting techniques:

  • Contextual: “Answer based only on the context provided below.”

  • Controlled generation: “Say ‘I don’t know’ when the provided context doesn’t have the needed info.”

  • Few-shot: Provide examples in the prompt.

  • Chain-of-thought (CoT): Provide intermediate reasoning steps.
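Several of these techniques typically get combined into one prompt string during augmentation. A sketch, where the few-shot Q/A pair and the retrieved chunk are made up for illustration:

```python
# Sketch of the augmentation step: combine contextual, controlled-generation,
# and few-shot prompting into a single prompt. The example Q/A pair and the
# retrieved chunks are invented for illustration.

def build_prompt(question, retrieved_chunks):
    context = "\n".join(f"- {c}" for c in retrieved_chunks)
    return (
        "Answer based only on the context provided below.\n"          # contextual
        "Say 'I don't know' if the context lacks the needed info.\n"  # controlled generation
        "\nExample:\nQ: What is BM25?\nA: A probabilistic variant of TF-IDF.\n"  # few-shot
        f"\nContext:\n{context}\n"
        f"\nQ: {question}\nA:"
    )

prompt = build_prompt("What does FAISS do?",
                      ["FAISS performs vector similarity search."])
print(prompt)
```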

3. Generation

Key question: Which LLM to use?

Consider these 3 major themes.

3.1 Foundation v fine-tuned


Foundation models: massive pre-trained LLMs.

  • are: autoregressive next-token prediction models

  • how: trained via self-supervised next-token prediction on large corpora

  • benefits: Deployment speed, resource efficiency

SFT (supervised fine-tuning):

  • is: a process to adjust foundation model's weights for specific tasks

  • how: start with a pre-trained model ⟶ prepare labelled dataset ⟶ train model. This adjusts the model parameters to perform better on the given task.

  • benefits: Domain specialization, retrieval integration with the KB, response customization, output control

3.2 Open source v proprietary


Open source: more flexible, but requires your own infrastructure and maintenance.

Criteria

  • Customization: Open source allows (1) deep integration with custom retrievers (2) control over fine-tuning

  • Ease of use: Open source is more difficult. Proprietary can offer prebuilt RAG solutions.

  • Deployment flexibility: Open source can be deployed where you choose (private cloud, on-premises)

  • Cost: Open source has higher up-front fixed costs, lower variable costs over time.

3.3 Model size


Small models:

pros:

  • Face fewer resource constraints

  • Are easier to deploy

cons:

  • Have limited reasoning capability (rely heavily on KB)

  • Could struggle with context windows & diverse queries.