Symbols
- Notation 1 | Notation 2: these were in my probability notes from the Stanford course; I'm not sure how useful they are.
Specific to pop vs sample
| Category | Measure | Population Parameter (Greek) | Sample Statistic (Latin) |
|---|---|---|---|
| Central Tendency | Mean | \(\mu\) | \(\bar{x}\) |
| Dispersion | Standard Deviation | \(\sigma\) (sigma) | \(s\) |
| Dispersion | Variance | \(\sigma^2\) | \(s^2\) |
| Proportion | Proportion | \(p\) | \(\hat{p}\) |
| Size | Size | \(N\) | \(n\) |
| Correlation | Correlation | \(\rho\) (rho) | \(r\) |
| Correlation | Covariance | \(\sigma_{xy}\) | \(s_{xy}\) |
| Regression | Slope | \(\beta\) | \(\hat{\beta}\) or \(b\) |
| Regression | Intercept | \(\beta_0\) | \(\hat{\beta}_0\) or \(a\) |
Not specific to pop vs sample
| Symbol (Name) | Used For |
|---|---|
| \(\alpha\) (alpha) | Significance level (probability of a Type I error) |
| \(\beta\) (beta) | Probability of a Type II error |
| \(\nu\) (nu) | Degrees of freedom (df) |
| \(\Omega\) (capital omega) | Sample space |
| \(\omega\) (omega) | An outcome from the sample space |
| \(\theta\) (theta), \(\beta\) (beta) | Population parameters |
| \(X, Y, Z, T\) | Random variables |
| \(x, y, z, t\) | Values of random variables |
Combinatorial Operators¶
| Symbol Name | Explanation |
|---|---|
| \(n!\) | Factorial |
| \(n!!\) | Double factorial |
| \(!n\) | Number of derangements of \(n\) objects |
| \(n P r\) | Permutation (\(n\) permute \(r\)) |
| \(n C r, \binom{n}{r}\) | Combination (\(n\) choose \(r\)) |
| \(\binom{n}{r_1, \ldots, r_k}\) | Multinomial coefficient |
| \(\left(\binom{n}{r}\right)\) | Multiset coefficient (\(n\) multichoose \(r\)) |
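Most of these operators are available directly in Python's standard library (`math.factorial`, `math.perm`, `math.comb`, Python 3.8+); a quick sketch, with derangements computed via the standard recurrence \(!n = (n-1)\,(\,!(n-1) + \,!(n-2))\):

```python
import math

n, r = 5, 2

print(math.factorial(n))  # n! = 120
print(math.perm(n, r))    # nPr = n! / (n-r)! = 20
print(math.comb(n, r))    # nCr = n! / (r! (n-r)!) = 10

# Multiset coefficient ("n multichoose r") = C(n + r - 1, r)
print(math.comb(n + r - 1, r))  # C(6, 2) = 15

def derangements(n: int) -> int:
    """Number of derangements !n via !n = (n-1) * (!(n-1) + !(n-2))."""
    if n == 0:
        return 1
    prev2, prev1 = 1, 0  # !0 = 1, !1 = 0
    for i in range(2, n + 1):
        prev2, prev1 = prev1, (i - 1) * (prev1 + prev2)
    return prev1

print(derangements(5))  # !5 = 44
```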
Stats vs Probability¶
Notational differences arise because the two fields approach similar concepts from different perspectives:
- Probability is focused on modeling and reasoning about uncertainty, typically using theoretical distributions.
- Statistics is focused on analyzing and summarizing data, often inferring from samples to populations.
Here’s a detailed breakdown of the notational differences:
1. Random Variables vs. Observed Data¶
Probability:¶
- Random variables are denoted with uppercase letters (\(X, Y, Z\)).
- Values they take are denoted with lowercase letters (\(x, y, z\)).
- Example: "The random variable \(X\) has a value \(x\) with probability \(P(X = x)\)."
Statistics:¶
- Observed data points (realizations of random variables) are denoted with lowercase letters (\(x, y, z\)).
- Example: "The sample data point \(x_i\) is a realization of \(X\)."
2. Population vs. Sample Parameters¶
Probability:¶
- Parameters of a distribution are typically denoted by Greek letters:
    - Mean: \(\mu\)
    - Variance: \(\sigma^2\)
    - Standard deviation: \(\sigma\)
    - Correlation: \(\rho\)
- These are treated as fixed and known quantities.
Statistics:¶
- Sample-based estimates of these parameters use Latin letters or "hat" notation:
    - Sample mean: \(\bar{x}\) or \(\hat{\mu}\)
    - Sample variance: \(s^2\)
    - Sample standard deviation: \(s\)
    - Sample correlation: \(r\)
- These are treated as random and estimated from data.
3. Expectation and Moments¶
Probability:¶
- Expected value: \(\mathbb{E}[X]\)
- Variance: \(\text{Var}(X) = \mathbb{E}[(X - \mathbb{E}[X])^2]\)
- Higher-order moments:
    - \(\mathbb{E}[X^k]\) (raw moments)
    - \(\mathbb{E}[(X - \mu)^k]\) (central moments)
- These are theoretical and depend on the assumed distribution of \(X\).
Statistics:¶
- Sample mean: \(\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i\)
- Sample variance: \(s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2\)
- Sample moments:
    - Raw moment: \(\frac{1}{n} \sum_{i=1}^n x_i^k\)
    - Central moment: \(\frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})^k\)
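These sample formulas can be checked with a short Python sketch against the standard-library `statistics` module (the data values here are made up for illustration):

```python
import statistics

data = [4.2, 3.9, 5.1, 4.8, 4.4, 5.0]
n = len(data)

xbar = sum(data) / n                                  # sample mean, x-bar
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)     # sample variance, s^2

# statistics.variance also uses the n-1 denominator, so these agree
assert abs(xbar - statistics.mean(data)) < 1e-9
assert abs(s2 - statistics.variance(data)) < 1e-9

def raw_moment(xs, k):
    """k-th raw sample moment: (1/n) * sum(x_i ** k)."""
    return sum(x ** k for x in xs) / len(xs)

def central_moment(xs, k):
    """k-th central sample moment: (1/n) * sum((x_i - xbar) ** k)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)

print(raw_moment(data, 1))      # equals the sample mean
print(central_moment(data, 2))  # biased variance (divides by n, not n-1)
```

Note that the second central moment divides by \(n\), so it is the *biased* variance, not \(s^2\).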
4. Probability Distributions¶
Probability:¶
- Focuses on population-level distributions:
    - Probability mass function (PMF): \(P(X = x)\)
    - Probability density function (PDF): \(f_X(x)\)
    - Cumulative distribution function (CDF): \(F_X(x) = P(X \leq x)\)
Statistics:¶
- Focuses on empirical distributions:
    - Relative frequency of observed data.
    - Empirical CDF: \(F_n(x) = \frac{\text{number of } x_i \leq x}{n}\).
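The empirical CDF definition translates directly into code; a minimal sketch using the standard-library `bisect` module on made-up data:

```python
import bisect

def empirical_cdf(sample):
    """Return F_n, where F_n(x) is the fraction of sample points <= x."""
    xs = sorted(sample)
    n = len(xs)
    def F_n(x):
        # bisect_right counts how many sorted values are <= x
        return bisect.bisect_right(xs, x) / n
    return F_n

data = [2.0, 1.0, 3.0, 2.0, 5.0]
F = empirical_cdf(data)
print(F(0.5))  # 0.0  (no points <= 0.5)
print(F(2.0))  # 0.6  (three of five points are <= 2)
print(F(9.0))  # 1.0  (all points <= 9)
```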
5. Notation for Inference¶
Probability:¶
- Known distribution parameters are fixed:
    - \(X \sim N(\mu, \sigma^2)\) (Normal distribution).
    - We derive properties of \(X\), like \(P(a \leq X \leq b)\).
Statistics:¶
- Parameters are unknown and estimated:
    - \(\hat{\mu}, \hat{\sigma}^2\) are estimates of \(\mu, \sigma^2\).
    - Confidence intervals: \(\mu \in (\hat{\mu} - c, \hat{\mu} + c)\) with some confidence level \(1 - \alpha\).
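A sketch of such an interval using the normal approximation \(\bar{x} \pm z_{1-\alpha/2}\, s/\sqrt{n}\), with Python's standard-library `statistics.NormalDist` (the data are made up, and for a sample this small a t-interval would really be more appropriate):

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

data = [5.1, 4.8, 5.4, 5.0, 4.7, 5.3, 4.9, 5.2]
n = len(data)
mu_hat = mean(data)   # point estimate of mu (the sample mean)
s = stdev(data)       # sample standard deviation (n-1 denominator)

alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for 95% confidence
c = z * s / sqrt(n)                      # half-width of the interval

print(f"{1 - alpha:.0%} CI for mu: ({mu_hat - c:.3f}, {mu_hat + c:.3f})")
```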
6. Conditional Dependence¶
Probability:¶
- \(P(X \mid Y)\): Conditional probability of \(X\) given \(Y\).
- Conditional expectation: \(\mathbb{E}[X \mid Y]\).
Statistics:¶
- Regression models estimate conditional relationships:
    - \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\) in simple linear regression (fitted values use the estimated coefficients).
- The focus is on estimation and interpretation.
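The least-squares estimates in simple linear regression have closed forms, \(\hat{\beta}_1 = s_{xy}/s_{xx}\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\); a small sketch on made-up data:

```python
def simple_linear_regression(xs, ys):
    """Least-squares slope and intercept for y = b0 + b1 * x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    s_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    s_xx = sum((x - xbar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # slope: s_xy / s_xx
    b0 = ybar - b1 * xbar     # intercept: ybar - b1 * xbar
    return b0, b1

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly y = 2x
b0, b1 = simple_linear_regression(xs, ys)
print(b0, b1)  # slope close to 2, intercept close to 0
```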
7. Likelihood and Estimation¶
Probability:¶
- Likelihood: \(L(\theta) = P(X \mid \theta)\), where \(\theta\) are fixed parameters.
- Probability is derived based on the assumed \(\theta\).
Statistics:¶
- Likelihood: \(L(\theta) = P(X \mid \theta)\), but \(\theta\) is treated as an unknown to be estimated.
- Maximum likelihood estimation (MLE): \(\hat{\theta} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} P(X \mid \theta)\)
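As a concrete sketch, assume made-up Bernoulli coin-flip data: the closed-form MLE for the heads probability \(\theta\) is the sample proportion, and a crude grid search over the log-likelihood recovers the same answer:

```python
import math

# Observed coin flips (1 = heads): 7 heads in 10 tosses (made-up data)
flips = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

def log_likelihood(theta, data):
    """log L(theta) = sum of log P(x_i | theta) for Bernoulli(theta)."""
    return sum(math.log(theta if x == 1 else 1 - theta) for x in data)

# Grid search for the arg-max of L(theta) over (0, 1)
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, flips))

print(theta_hat)                # 0.7
print(sum(flips) / len(flips))  # 0.7 -- the closed-form Bernoulli MLE
```

Maximizing the log-likelihood rather than the likelihood itself is standard: the logarithm is monotone, so the arg-max is unchanged, and sums are numerically better behaved than long products.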
Summary Table¶
| Concept | Probability | Statistics |
|---|---|---|
| Random variables | \(X, Y\) | \(X, Y\) |
| Observed values | \(x, y\) | \(x_i, y_i\) |
| Population mean | \(\mu_X\) | \(\mu_X\) |
| Sample mean | — | \(\bar{x}\) or \(\hat{\mu}\) |
| Expectation | \(\mathbb{E}[X]\) | — |
| Variance | \(\text{Var}(X)\) | \(s^2\) (sample) |
| Density / mass function | \(f_X(x)\) | — |
| Empirical distribution | — | \(F_n(x)\) |
| Parameters | \(\theta\) (fixed) | \(\hat{\theta}\) (estimated) |