Statistical Inference 1~4

1 Probability Theory

1.1 Set Theory

Definition 1.1.1 The set, S, of all possible outcomes of a particular experiment is called the sample space for the experiment.

Definition 1.1.2 An event is any collection of possible outcomes of an experiment, that is, any subset of S (including S itself).

Definition 1.1.5 Two events $A$ and $B$ are disjoint if $A \cap B = \emptyset$. The events $A_1, A_2, \dots$ are pairwise disjoint if $A_i \cap A_j = \emptyset$ for all $i \neq j$.

Definition 1.1.6 If $A_1, A_2, \dots$ are pairwise disjoint and $\cup_{i=1}^ {\infty}A_i = S$, then the collection $A_1, A_2, \dots$ forms a partition of $S$.
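For a finite sample space, the two conditions in this definition (pairwise disjoint, union equal to $S$) can be checked directly. The sketch below is only an illustration; the helper name `is_partition` and the die example are assumptions, not part of the source.

```python
from itertools import combinations

def is_partition(sample_space, events):
    """Return True if `events` are pairwise disjoint and their union is `sample_space`."""
    pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(events, 2))
    covers = set().union(*events) == set(sample_space)
    return pairwise_disjoint and covers

S = {1, 2, 3, 4, 5, 6}  # assumed example: one roll of a die
print(is_partition(S, [{1, 2}, {3, 4}, {5, 6}]))   # True
print(is_partition(S, [{1, 2, 3}, {3, 4, 5, 6}]))  # False: the events share 3
```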

1.2 Basics of Probability Theory

Definition 1.2.1 A collection of subsets of $S$ is called a sigma algebra (or Borel field), denoted by $B$, if it satisfies the following three properties:

  • $\emptyset \in B$.
  • If $A \in B$, then $A^c \in B$.
  • If $A_1, A_2, \dots \in B$, then $\cup_{i=1}^{\infty}A_i \in B$.
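When $S$ is finite, the three properties can be verified by brute force, since closure under pairwise unions already implies closure under countable unions for a finite collection. The following is a minimal sketch under that finiteness assumption; `power_set` and `is_sigma_algebra` are hypothetical helper names.

```python
from itertools import chain, combinations

def power_set(sample_space):
    """All subsets of a finite sample space, as frozensets."""
    items = list(sample_space)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(s) for s in subsets}

def is_sigma_algebra(sample_space, collection):
    """Check the three defining properties for a collection of subsets of a finite S."""
    S = frozenset(sample_space)
    B = {frozenset(a) for a in collection}
    only_subsets = all(a <= S for a in B)
    has_empty = frozenset() in B
    closed_complement = all(S - a in B for a in B)
    # Finite case only: pairwise closure is enough here.
    closed_union = all(a | b in B for a in B for b in B)
    return only_subsets and has_empty and closed_complement and closed_union

S = {1, 2, 3}
print(is_sigma_algebra(S, [set(), S]))        # True: the trivial sigma algebra
print(is_sigma_algebra(S, power_set(S)))      # True: the power set
print(is_sigma_algebra(S, [set(), {1}, S]))   # False: the complement {2, 3} is missing
```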

Definition 1.2.4 Given a sample space $S$ and an associated sigma algebra $B$, a probability function is a function $P$ with domain $B$ that satisfies

  • $P(A) \ge 0$ for all $A \in B$.
  • $P(S) = 1$.
  • If $A_1, A_2, \dots \in B$ are pairwise disjoint, then $P(\cup_{i=1}^{\infty}A_i)= \sum_{i=1}^{\infty}P(A_i)$.
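To see the three axioms at work, the sketch below builds $P$ from point masses on a finite sample space and checks each axiom exhaustively. The fair-die masses are an assumed example, and on a finite space countable additivity reduces to finite additivity, which follows by induction from the two-event check used here.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed example: a fair six-sided die, so every outcome has mass 1/6.
pmf = {s: Fraction(1, 6) for s in range(1, 7)}
S = frozenset(pmf)

def P(event):
    """Probability of an event (a subset of S) as the sum of its outcome masses."""
    return sum((pmf[s] for s in event), Fraction(0))

# On a finite S the power set is a sigma algebra, so it serves as the domain B.
B = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(S), r) for r in range(len(S) + 1))]

nonnegative = all(P(A) >= 0 for A in B)
normalized = P(S) == 1
additive = all(P(A1 | A2) == P(A1) + P(A2)
               for A1, A2 in combinations(B, 2) if A1.isdisjoint(A2))

print(nonnegative, normalized, additive)  # True True True
```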

1.3 Conditional Probability and Independence

1.4 Random Variables

1.5 Distribution Functions

1.6 Density and Mass Functions

1.8 Miscellanea

2 Transformations and Expectations

2.1 Distribution of Functions of a Random Variable

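Theorem 2.1.4 Let $X$ have pdf $f_X(x)$ and let $Y = g(X)$, where $g$ is a monotone function. Let $\mathcal{X}$ and $\mathcal{Y}$ be defined by (2.1.7). Suppose that $f_X(x)$ is continuous on $\mathcal{X}$ and that $g^{-1}(y)$ has a continuous derivative on $\mathcal{Y}$. Then the pdf of $Y$ is given by

$$f_Y(y) = f_X(g^{-1}(y)) \left| \frac{d}{dy} g^{-1}(y) \right| \text{ for } y \in \mathcal{Y}, \quad \text{and } f_Y(y) = 0 \text{ otherwise}.$$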
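The change-of-variables formula $f_Y(y) = f_X(g^{-1}(y))\,|\frac{d}{dy}g^{-1}(y)|$ can be checked numerically on a case with a known answer. The sketch below assumes $X \sim \text{uniform}(0,1)$ and $g(x) = -\log x$, for which $Y$ should have pdf $e^{-y}$ on $y > 0$; the derivative is approximated by a central difference, and `transformed_pdf` is a hypothetical helper name.

```python
import math

def transformed_pdf(f_x, g_inv, y, h=1e-6):
    """f_Y(y) = f_X(g^{-1}(y)) * |d/dy g^{-1}(y)|, with a central-difference derivative."""
    derivative = (g_inv(y + h) - g_inv(y - h)) / (2 * h)
    return f_x(g_inv(y)) * abs(derivative)

def f_x(x):
    """pdf of the assumed uniform(0, 1) distribution."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def g_inv(y):
    """Inverse of the assumed transformation g(x) = -log(x)."""
    return math.exp(-y)

for y in (0.5, 1.0, 2.0):
    # The two printed values should agree up to numerical error.
    print(y, transformed_pdf(f_x, g_inv, y), math.exp(-y))
```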

2.2 Expected Values

2.3 Moments and Moment Generating Functions

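Definition 2.3.6 Let $X$ be a random variable with cdf $F_X$. The moment generating function (mgf) of $X$ (or $F_X$), denoted by $M_X(t)$, is

$$M_X(t) = \mathrm{E}e^{tX},$$

provided that the expectation exists for $t$ in some neighborhood of 0.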

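Theorem 2.3.7 If $X$ has mgf $M_X(t)$, then

$$\mathrm{E}X^n = M_X^{(n)}(0), \quad \text{where } M_X^{(n)}(0) = \frac{d^n}{dt^n} M_X(t) \Big|_{t=0}.$$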
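The statement that the $n$th moment equals the $n$th derivative of the mgf at $t = 0$ can be illustrated numerically. The sketch below assumes an exponential distribution with scale $\beta = 2$, whose mgf is $1/(1 - \beta t)$ for $t < 1/\beta$, so that $\mathrm{E}X = \beta = 2$ and $\mathrm{E}X^2 = 2\beta^2 = 8$; the derivatives are finite-difference approximations, not exact values.

```python
def mgf_exponential(t, beta=2.0):
    """mgf of an exponential distribution with scale beta, valid for t < 1/beta."""
    return 1.0 / (1.0 - beta * t)

def moment_from_mgf(mgf, n, h=1e-3):
    """Approximate the n-th derivative of mgf at 0 (n = 1 or 2) by central differences."""
    if n == 1:
        return (mgf(h) - mgf(-h)) / (2 * h)
    if n == 2:
        return (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2
    raise ValueError("only n = 1 or n = 2 is supported in this sketch")

first = moment_from_mgf(mgf_exponential, 1)   # close to E X = 2
second = moment_from_mgf(mgf_exponential, 2)  # close to E X^2 = 8
print(first, second)
```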

2.4 Differentiating Under an Integral Sign

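Theorem 2.4.1 (Leibnitz’s Rule) If $f(x,\theta)$, $a(\theta)$, and $b(\theta)$ are differentiable with respect to $\theta$, then

$$\frac{d}{d\theta}\int_{a(\theta)}^{b(\theta)} f(x,\theta)\,dx = f(b(\theta),\theta)\frac{d}{d\theta}b(\theta) - f(a(\theta),\theta)\frac{d}{d\theta}a(\theta) + \int_{a(\theta)}^{b(\theta)}\frac{\partial}{\partial\theta}f(x,\theta)\,dx.$$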
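As a quick sanity check of the rule, take the illustrative choice (not from the source) $f(x,\theta) = \theta x$, $a(\theta) = 0$, and $b(\theta) = \theta$. Direct integration followed by differentiation gives

$$\frac{d}{d\theta}\int_0^{\theta} \theta x\,dx = \frac{d}{d\theta}\,\frac{\theta^3}{2} = \frac{3\theta^2}{2},$$

while the three terms of the rule (upper-limit term, lower-limit term, and integral of the partial derivative) give

$$f(\theta,\theta)\cdot 1 - f(0,\theta)\cdot 0 + \int_0^{\theta} x\,dx = \theta^2 - 0 + \frac{\theta^2}{2} = \frac{3\theta^2}{2},$$

so the two computations agree.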

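Notice that if $a(\theta)$ and $b(\theta)$ are constant, we have a special case of Leibnitz’s Rule:

$$\frac{d}{d\theta}\int_a^b f(x,\theta)\,dx = \int_a^b \frac{\partial}{\partial\theta}f(x,\theta)\,dx.$$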

2.6 Miscellanea

3 Common Families of Distributions

3.1 Introduction

3.2 Discrete Distributions

Discrete Uniform Distribution

Hypergeometric Distribution

Binomial Distribution

Poisson Distribution

Negative Binomial Distribution

Geometric Distribution

3.3 Continuous Distributions

Uniform Distribution

Gamma Distribution

Normal Distribution

Beta Distribution

Cauchy Distribution

Lognormal Distribution

Exponential Distribution

Double Exponential Distribution

3.4 Exponential Families

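A family of pdfs or pmfs is called an exponential family if it can be expressed as

$$f(x|\theta) = h(x)\,c(\theta)\exp\left(\sum_{i=1}^{k} w_i(\theta)\,t_i(x)\right),$$

where $h(x) \ge 0$ and $t_1(x), \dots, t_k(x)$ are real-valued functions of $x$ that do not depend on $\theta$, and $c(\theta) \ge 0$ and $w_1(\theta), \dots, w_k(\theta)$ are real-valued functions of $\theta$ that do not depend on $x$.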
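A standard illustration, written here under the usual form $f(x|\theta) = h(x)c(\theta)\exp(\sum_i w_i(\theta)t_i(x))$, is the binomial family with $n$ known and $0 < p < 1$:

$$f(x|p) = \binom{n}{x}p^x(1-p)^{n-x} = \binom{n}{x}(1-p)^n \exp\left(x \log\frac{p}{1-p}\right), \quad x = 0, 1, \dots, n.$$

Reading off the factors gives $h(x) = \binom{n}{x}$ on $x = 0, \dots, n$ (and 0 elsewhere), $c(p) = (1-p)^n$, $w_1(p) = \log\frac{p}{1-p}$, and $t_1(x) = x$, so $k = 1$. The factorization relies on $n$ being fixed; it does not hold if $n$ is treated as a parameter.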

3.5 Location and Scale Families

3.6 Inequalities and Identities

3.6.1 Probability Inequalities

Theorem 3.6.1 (Chebychev’s Inequality) Let $X$ be a random variable and let $g(x)$ be a nonnegative function. Then, for any $r > 0$,

$$P(g(X) \ge r) \le \frac{\mathrm{E}g(X)}{r}.$$
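The bound $P(g(X) \ge r) \le \mathrm{E}g(X)/r$ can be evaluated exactly for a small discrete distribution. The sketch below assumes a fair die and $g(x) = (x - 3.5)^2$, so that $\mathrm{E}g(X)$ is the variance; `chebychev_sides` is a hypothetical helper that returns both sides of the inequality.

```python
from fractions import Fraction

def chebychev_sides(pmf, g, r):
    """Return (P(g(X) >= r), E g(X) / r) for a discrete pmf given as {x: probability}."""
    lhs = sum(p for x, p in pmf.items() if g(x) >= r)
    rhs = sum(g(x) * p for x, p in pmf.items()) / r
    return lhs, rhs

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # assumed example: fair die

def g(x):
    """Squared distance from the mean 7/2; a nonnegative function."""
    return (x - Fraction(7, 2)) ** 2

lhs, rhs = chebychev_sides(pmf, g, 4)
print(lhs, rhs, lhs <= rhs)  # 1/3 35/48 True
```

Only the outcomes 1 and 6 have $g(x) \ge 4$, which is why the left side is $1/3$; the bound $35/48$ is valid but far from tight here.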

3.6.2 Identities

3.8 Miscellanea

3.8.2 Chebychev and Beyond

4 Multiple Random Variables

4.1 Joint and Marginal Distributions

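The marginal pmfs of a discrete bivariate random vector $(X, Y)$ with joint pmf $f_{X,Y}(x,y)$ are $f_X(x) = \sum_{y} f_{X,Y}(x,y)$ and $f_Y(y) = \sum_{x} f_{X,Y}(x,y)$.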

4.2 Conditional Distributions and Independence

4.3 Bivariate Transformations

4.4 Hierarchical Models and Mixture Distributions

Theorem 4.4.3 If $X$ and $Y$ are any two random variables, then

$$\mathrm{E}X = \mathrm{E}(\mathrm{E}(X|Y)),$$

provided that the expectations exist.

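Theorem 4.4.7 For any two random variables $X$ and $Y$,

$$\mathrm{Var}X = \mathrm{E}(\mathrm{Var}(X|Y)) + \mathrm{Var}(\mathrm{E}(X|Y)),$$

provided that the expectations exist.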
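Both identities, $\mathrm{E}X = \mathrm{E}(\mathrm{E}(X|Y))$ and $\mathrm{Var}X = \mathrm{E}(\mathrm{Var}(X|Y)) + \mathrm{Var}(\mathrm{E}(X|Y))$, can be confirmed exactly on a small joint pmf. The joint distribution below is made up purely for illustration, and exact rational arithmetic is used so the comparisons are equalities rather than approximations.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y), chosen only for illustration.
joint = {(0, 0): F(1, 4), (1, 0): F(1, 4), (1, 1): F(1, 8), (2, 1): F(3, 8)}

def expect(fn):
    """E fn(X, Y) under the joint pmf."""
    return sum(fn(x, y) * p for (x, y), p in joint.items())

# Marginal pmf of Y: sum the joint pmf over x.
f_y = {}
for (x, y), p in joint.items():
    f_y[y] = f_y.get(y, 0) + p

def cond_mean(y):
    """E(X | Y = y)."""
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / f_y[y]

def cond_var(y):
    """Var(X | Y = y)."""
    m = cond_mean(y)
    return sum((x - m) ** 2 * p for (x, yy), p in joint.items() if yy == y) / f_y[y]

mean_x = expect(lambda x, y: x)
var_x = expect(lambda x, y: (x - mean_x) ** 2)

mean_of_cond_mean = sum(cond_mean(y) * p for y, p in f_y.items())
mean_of_cond_var = sum(cond_var(y) * p for y, p in f_y.items())
var_of_cond_mean = sum((cond_mean(y) - mean_of_cond_mean) ** 2 * p for y, p in f_y.items())

print(mean_x == mean_of_cond_mean)                    # True
print(var_x == mean_of_cond_var + var_of_cond_mean)   # True
```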

4.5 Covariance and Correlation

covariance: $\mathrm{Cov}(X,Y) = \mathrm{E}((X - \mu_X)(Y - \mu_Y))$

correlation: $\rho_{XY} = \mathrm{Cov}(X,Y)/(\sigma_X \sigma_Y)$

Theorem 4.5.3 For any random variables $X$ and $Y$,

$$\mathrm{Cov}(X,Y) = \mathrm{E}XY - \mu_X\mu_Y.$$

Theorem 4.5.5 If $X$ and $Y$ are independent random variables, then $\mathrm{Cov}(X,Y) = 0$ and $\rho_{XY} = 0$.

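Theorem 4.5.6 If $X$ and $Y$ are any two random variables and $a$ and $b$ are any two constants, then

$$\mathrm{Var}(aX + bY) = a^2\mathrm{Var}X + b^2\mathrm{Var}Y + 2ab\,\mathrm{Cov}(X,Y).$$

If $X$ and $Y$ are independent random variables, then $\mathrm{Var}(aX + bY) = a^2\mathrm{Var}X + b^2\mathrm{Var}Y$.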

Theorem 4.5.7 For any random variables $X$ and $Y$,

  • $-1 \le \rho_{XY} \le 1$.
  • $\rho_{XY}^2 = 1$ if and only if there exist numbers $a \neq 0$ and $b$ such that $P(Y = aX + b) = 1$. If $\rho_{XY}=1$, then $a > 0$, and if $\rho_{XY} = -1$, then $a < 0$.
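The covariance results in this section can be checked on a small example. The sketch below uses a made-up joint pmf and verifies three things: the shortcut $\mathrm{Cov}(X,Y) = \mathrm{E}XY - \mu_X\mu_Y$, the expansion $\mathrm{Var}(aX+bY) = a^2\mathrm{Var}X + b^2\mathrm{Var}Y + 2ab\,\mathrm{Cov}(X,Y)$, and the bound $-1 \le \rho_{XY} \le 1$. The constants $a = 2$ and $b = -3$ are arbitrary choices.

```python
import math
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y), chosen only for illustration.
joint = {(0, 0): F(1, 4), (1, 0): F(1, 4), (1, 1): F(1, 8), (2, 1): F(3, 8)}

def expect(fn):
    """E fn(X, Y) under the joint pmf."""
    return sum(fn(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)

cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y

a, b = 2, -3  # arbitrary constants
var_linear = expect(lambda x, y: (a * x + b * y - (a * mu_x + b * mu_y)) ** 2)
var_formula = a * a * var_x + b * b * var_y + 2 * a * b * cov_definition

# The square root is irrational in general, so the correlation is a float.
rho = float(cov_definition) / math.sqrt(float(var_x) * float(var_y))

print(cov_definition == cov_shortcut)  # True
print(var_linear == var_formula)       # True
print(-1 <= rho <= 1)                  # True
```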

4.6 Multivariate Distributions

4.7 Inequalities

4.7.1 Numerical Inequalities

Lemma 4.7.1 Let $a$ and $b$ be any positive numbers, and let $p$ and $q$ be any positive numbers (necessarily greater than 1) satisfying

$$\frac{1}{p} + \frac{1}{q} = 1.$$

Then

$$\frac{1}{p}a^p + \frac{1}{q}b^q \ge ab,$$

with equality if and only if $a^p = b^q$.

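Theorem 4.7.2 (Hölder’s Inequality) Let $X$ and $Y$ be any two random variables, and let $p$ and $q$ be positive numbers satisfying $\frac{1}{p} + \frac{1}{q} = 1$. Then

$$|\mathrm{E}XY| \le \mathrm{E}|XY| \le (\mathrm{E}|X|^p)^{1/p}(\mathrm{E}|Y|^q)^{1/q}.$$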

Theorem 4.7.5 (Minkowski’s Inequality) Let $X$ and $Y$ be any two random variables. Then for $1 \le p < \infty$,

$$[\mathrm{E}|X + Y|^p]^{1/p} \le [\mathrm{E}|X|^p]^{1/p} + [\mathrm{E}|Y|^p]^{1/p}.$$
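Hölder’s inequality, $|\mathrm{E}XY| \le \mathrm{E}|XY| \le (\mathrm{E}|X|^p)^{1/p}(\mathrm{E}|Y|^q)^{1/q}$ with $1/p + 1/q = 1$, and Minkowski’s inequality, $[\mathrm{E}|X+Y|^p]^{1/p} \le [\mathrm{E}|X|^p]^{1/p} + [\mathrm{E}|Y|^p]^{1/p}$, can both be spot-checked numerically. The joint pmf and the choice $p = 3$ below are assumptions made for illustration; a single example does not prove either inequality.

```python
# Hypothetical joint pmf of (X, Y); the probabilities sum to 1.
joint = {(-1, 2): 0.2, (0, 1): 0.3, (2, -1): 0.1, (3, 4): 0.4}

def expect(fn):
    """E fn(X, Y) under the joint pmf."""
    return sum(fn(x, y) * prob for (x, y), prob in joint.items())

def norm(fn, p):
    """(E|fn(X, Y)|^p)^(1/p)."""
    return expect(lambda x, y: abs(fn(x, y)) ** p) ** (1 / p)

p = 3.0
q = p / (p - 1)  # chosen so that 1/p + 1/q = 1

abs_mean_xy = abs(expect(lambda x, y: x * y))
mean_abs_xy = expect(lambda x, y: abs(x * y))
holder_bound = norm(lambda x, y: x, p) * norm(lambda x, y: y, q)

minkowski_lhs = norm(lambda x, y: x + y, p)
minkowski_rhs = norm(lambda x, y: x, p) + norm(lambda x, y: y, p)

print(abs_mean_xy <= mean_abs_xy <= holder_bound)  # True
print(minkowski_lhs <= minkowski_rhs)              # True
```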

4.7.2 Functional Inequalities

Theorem 4.7.7 (Jensen’s Inequality) For any random variable $X$, if $g(x)$ is a convex function, then

$$\mathrm{E}g(X) \ge g(\mathrm{E}X).$$
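A small check of $\mathrm{E}g(X) \ge g(\mathrm{E}X)$ for two convex functions, $g(x) = x^2$ and $g(x) = e^x$, is sketched below. The fair-die distribution is an assumed example; for $g(x) = x^2$ the gap between the two sides is exactly $\mathrm{Var}X$.

```python
import math
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # assumed example: fair die

def expect(g):
    """E g(X) under the pmf."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x)

# g(x) = x^2, computed exactly with rationals.
lhs_square = expect(lambda x: x ** 2)
rhs_square = mean ** 2

# g(x) = exp(x), computed in floating point.
lhs_exp = expect(lambda x: math.exp(x))
rhs_exp = math.exp(float(mean))

print(lhs_square >= rhs_square)  # True
print(lhs_exp >= rhs_exp)        # True
```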

Theorem 4.7.9 (Covariance Inequality) Let $X$ be any random variable and $g(x)$ and $h(x)$ any functions such that $\mathrm{E}g(X)$, $\mathrm{E}h(X)$, and $\mathrm{E}(g(X)h(X))$ exist.

  • If $g(x)$ is a nondecreasing function and $h(x)$ is a nonincreasing function, then $\mathrm{E}(g(X)h(X)) \le (\mathrm{E}g(X))(\mathrm{E}h(X))$.

  • If $g(x)$ and $h(x)$ are either both nondecreasing or both nonincreasing, then $\mathrm{E}(g(X)h(X)) \ge (\mathrm{E}g(X))(\mathrm{E}h(X))$.
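Both cases of the covariance inequality compare $\mathrm{E}(g(X)h(X))$ with $(\mathrm{E}g(X))(\mathrm{E}h(X))$: the product of means is an upper bound when the monotonicity directions are opposite and a lower bound when they agree. The sketch below checks each case once, assuming a fair die with $g(x) = x$ and $h(x) = \pm x^3$; these functions are illustrative choices only.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # assumed example: fair die

def expect(fn):
    """E fn(X) under the pmf."""
    return sum(fn(x) * p for x, p in pmf.items())

def g(x):
    return x            # nondecreasing

def h_down(x):
    return -x ** 3      # nonincreasing

def h_up(x):
    return x ** 3       # nondecreasing

# Opposite directions: E(g h) <= (E g)(E h).
opposite_ok = expect(lambda x: g(x) * h_down(x)) <= expect(g) * expect(h_down)

# Same direction: E(g h) >= (E g)(E h).
same_ok = expect(lambda x: g(x) * h_up(x)) >= expect(g) * expect(h_up)

print(opposite_ok, same_ok)  # True True
```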

4.9 Miscellanea

4.9.1 The Exchange Paradox

4.9.2 More on the Arithmetic-Geometric-Harmonic Mean Inequality

4.9.3 The Borel Paradox