Probability MOC

This outline builds from the fundamental principles of probability to more advanced topics relevant to data science, such as Bayesian methods and sampling theory.

Links

  • [[Pr-Notation]]

  • [[Pr-Glossary]]

Week 1: Introduction & Foundations

  • [[Pr-Basic Definitions]]

  • Sample space, events, outcomes.

  • Axioms of probability (Kolmogorov axioms).

  • [[Pr-Basic Set Operations]]

  • Union, intersection, complement, etc.

  • Venn diagrams and their use in probability.

Combinatorics & Basic Probability Tools

  • [[Pr-Counting Techniques]]

  • Permutations vs. combinations.

  • Applications in probability (e.g., drawing cards, selecting subsets).

  • [[Pr-Classical Probability]]

  • Equally likely outcomes.

  • Examples: dice rolls, coin tosses, card draws.

  • [[Pr-Conditional Probability]]

  • Definition and examples (e.g., medical test scenarios).

  • The “conditional” approach to solving multi-step probability problems.

  • [[Pr-Law of Total Probability]]

Bayes’ Theorem & Bayesian Thinking

  • [[Pr-Bayes Theorem]]

  • Statement of the theorem.

  • Classic examples (e.g., medical diagnostics, spam filtering).

  • Prior, Likelihood, Posterior
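A minimal sketch of how prior, likelihood, and posterior fit together in the medical-diagnostics setting mentioned above. The function name and the prevalence, sensitivity, and false-positive figures are illustrative assumptions, not values from these notes.

```python
def posterior_positive(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) via Bayes' theorem."""
    # Law of total probability: P(+) = P(+|D) P(D) + P(+|not D) P(not D)
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence

# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p = posterior_positive(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(p, 3))
```

With these assumed inputs the posterior is roughly 0.16: even after a positive result, the disease remains unlikely because the prior is so small.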

RVs and Distributions

  • [[Pr-Distributions and RVs]] - Theory

  • [[Pr-Distributions - Example Problems]] - Applications

  • Random Variables

  • Conceptual understanding: “mapping outcomes to numbers.”

  • Support sets (finite, countably infinite, uncountable).

Discrete

  • Key Discrete Distributions

  • Bernoulli, Binomial, Geometric, Negative Binomial.

  • Poisson distribution and its relationship to Binomial.

  • Examples and data science use cases (e.g., modeling counts/events).

  • Expectation & Variance of Discrete Random Variables

  • How to compute mean and variance for discrete distributions.

  • Moment generating functions (brief introduction).
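The relationship between the Poisson and Binomial distributions listed above can be checked numerically. This is a rough sketch using only the standard library; the values of `n` and `p` are hypothetical, chosen so that `n` is large, `p` is small, and `n * p` is about 3.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.003      # hypothetical: many trials, rare success
lam = n * p             # matching Poisson rate

# Largest pointwise difference between the two pmfs over k = 0..20.
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(max_gap)
```

The gap shrinks further as `n` grows with `n * p` held fixed, which is the sense in which the Poisson is a limit of the Binomial.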

Continuous

  • Continuous Random Variables

  • Probability density functions (pdf) vs. cumulative distribution functions (cdf).

  • Key Continuous Distributions

  • Uniform, Normal (Gaussian), Exponential.

  • Gamma, Beta (brief introduction for Bayesian applications).

  • Expectation & Variance of Continuous Random Variables

  • Computing integrals; examples for Normal, Exponential, etc.

  • Applications in analytics (e.g., wait times, measurement errors).

Joint Distributions & Independence

[[Pr-2.3 Joint Two RVs]]

  • Joint, Marginal, and Conditional Distributions

  • Definitions, relationships, and computing marginals from joints.

  • Independence & Conditional Independence

  • Criteria for independence.

  • Examples illustrating dependence vs. independence (e.g., variables that are uncorrelated yet dependent).

  • Covariance & Correlation

  • Definitions, properties.

  • Relationship to independence.
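One way to see that zero covariance does not imply independence is a small exact calculation. The choice of X uniform on {-1, 0, 1} with Y = X² is a standard textbook-style example assumed here for illustration; it is not taken from the linked notes.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1}; Y = X**2 is fully determined by X.
xs = [-1, 0, 1]
px = Fraction(1, 3)

e_x = sum(px * x for x in xs)            # E[X]
e_y = sum(px * x**2 for x in xs)         # E[Y] = E[X^2]
e_xy = sum(px * x * x**2 for x in xs)    # E[XY] = E[X^3]
cov = e_xy - e_x * e_y                   # Cov(X, Y)

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_joint = px          # X = 0 forces Y = 0
p_product = px * px   # P(X=0) * P(Y=0), since P(Y=0) = P(X=0)

print(cov, p_joint, p_product)
```

The covariance is exactly zero, yet the joint probability differs from the product of the marginals, so X and Y are dependent.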

Transformations & Multiple Random Variables

  • Functions of Random Variables

  • Methods to find distributions of transformed variables (e.g., sum of variables).

  • Jacobians, Change of Variables

  • Practical approach for continuous transformations.

  • Convolutions

  • Sum of independent random variables (e.g., sum of Poisson, sum of Exponential).

  • Multivariate Distributions

  • Brief introduction to multivariate normal distribution and its role in data science.

Sampling Distributions & Limit Theorems

[[Pr-2.5 2.6 Sampling]]

  • Law of Large Numbers (LLN)

  • Intuition and formal statement (weak/strong versions in brief).

  • Relevance in data-driven estimations (e.g., sample means).

  • Central Limit Theorem (CLT)

  • Statement, intuition, and how it underpins hypothesis testing in data science.

  • Applications: confidence intervals, error estimation, etc.

  • Sampling Distributions

  • Basic concept of sampling distribution of a statistic (mean, proportion, etc.).

  • Connection to bootstrap methods.
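The sampling distribution of a mean and the CLT's prediction for its spread can be illustrated with a short simulation. This is a sketch under assumed settings: Uniform(0, 1) draws, a sample size of 30, 5000 repetitions, and a fixed seed are all arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
n, reps = 30, 5000       # hypothetical sample size and number of repetitions

# Each entry is the mean of n independent draws from Uniform(0, 1).
sample_means = [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]

center = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)

# CLT scaling: sd of the sample mean is sigma / sqrt(n), with sigma^2 = 1/12 here.
predicted_spread = (1 / 12 / n) ** 0.5

print(center, spread, predicted_spread)
```

The simulated centre should sit near 0.5 (the LLN at work) and the simulated spread near the predicted value; a histogram of `sample_means` would look approximately normal.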

Intro to Bayesian Inference & Conjugate Priors

  • Bayesian Inference Cycle

  • Prior, likelihood, posterior, predictive distribution.

  • Conjugate Priors

  • Examples for common distributions (Beta-Binomial, Gamma-Poisson, etc.).

  • Analytical tractability and why conjugacy is helpful.

  • Posterior Predictive Checks

  • Using posterior distributions for predictions and model validation.
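Conjugacy makes the Beta-Binomial update a matter of adding counts. The sketch below assumes a uniform Beta(1, 1) prior and made-up data (7 successes in 10 trials); the helper name is hypothetical.

```python
def beta_binomial_update(alpha, beta, successes, trials):
    """Posterior Beta parameters after Binomial data under a Beta(alpha, beta) prior."""
    return alpha + successes, beta + (trials - successes)

# Hypothetical data: uniform Beta(1, 1) prior, 7 successes in 10 trials.
post_alpha, post_beta = beta_binomial_update(1, 1, successes=7, trials=10)

# Mean of Beta(a, b) is a / (a + b); for this model it is also the
# posterior predictive probability that the next trial succeeds.
posterior_mean = post_alpha / (post_alpha + post_beta)

print(post_alpha, post_beta, posterior_mean)
```

No integration is needed, which is the analytical tractability that the conjugate-prior bullet refers to.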

Markov Chains & Stochastic Processes

  • Markov Chains

  • Definition and basic properties (the Markov, or memoryless, property).

  • Transition matrices, steady-state distributions.

  • Applications in Data Science

  • Markov Chain Monte Carlo (MCMC) concept overview (e.g., Metropolis-Hastings, Gibbs sampling).

  • Random walks (e.g., PageRank-style algorithms in SEO/data science).

  • Other Stochastic Processes

  • Poisson processes in event modeling (brief).
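For the transition matrices and steady-state distributions listed above, a rough sketch of finding a steady state by repeatedly applying the transition matrix. The two-state matrix `P` is a made-up example; for it, solving π = πP by hand gives π = (5/6, 1/6).

```python
def step(dist, P):
    """Advance one step: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical transition matrix; each row sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]

dist = [1.0, 0.0]        # start in state 0 with certainty
for _ in range(200):     # iterate until the distribution stops changing
    dist = step(dist, P)

print(dist)
```

Starting from `[0.0, 1.0]` instead leads to the same limit, which is what makes the steady state a property of the chain rather than of the starting point (for a chain like this one, where every state can reach every other).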

Applications

  • Probability in Machine Learning & Statistical Modeling

  • Logistic regression as a model with a Bernoulli (or Binomial) likelihood.

  • Naive Bayes classifier as an illustrative example.

  • Uncertainty & Variability in Data Science Pipelines

  • Prediction intervals, confidence intervals.

  • Propagation of errors.

  • Simulation & Monte Carlo Methods

  • Generating random variables (Inverse transform, Box–Muller method, etc.).

  • Simulation-based approaches to solve complex probability problems.
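A minimal sketch of the inverse transform method for an Exponential distribution: invert the cdf F(x) = 1 - exp(-rate · x) and feed it uniform draws. The rate, seed, and sample count below are arbitrary assumptions.

```python
import math
import random

def sample_exponential(rate, rng):
    """Inverse transform: solve U = 1 - exp(-rate * x) for x."""
    u = rng.random()                    # U in [0, 1), so 1 - u is in (0, 1]
    return -math.log(1.0 - u) / rate

rng = random.Random(42)   # fixed seed so the run is repeatable
rate = 2.0                # hypothetical rate; theoretical mean is 1 / rate = 0.5
draws = [sample_exponential(rate, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

print(sample_mean)
```

The sample mean should land close to 0.5, a simple Monte Carlo check that the generated values follow the intended distribution.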