Statistical Inference 1~4
1 Probability Theory
1.1 Set Theory
Definition 1.1.1 The set, S, of all possible outcomes of a particular experiment is called the sample space for the experiment.
Definition 1.1.2 An event is any collection of possible outcomes of an experiment, that is, any subset of $S$ (including $S$ itself).
Definition 1.1.5 Two events $A$ and $B$ are disjoint if $A \cap B = \emptyset$. The events $A_1, A_2, \dots$ are pairwise disjoint if $A_i \cap A_j = \emptyset$ for all $i \neq j$.
Definition 1.1.6 If $A_1, A_2, \dots$ are pairwise disjoint and $\cup_{i=1}^ {\infty}A_i = S$, then the collection $A_1, A_2, \dots$ forms a partition of $S$.
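The two conditions of Definition 1.1.6 are easy to check mechanically. A minimal Python sketch, using an assumed example (a six-sided die partitioned by residue mod 3, not from the text):

```python
# Check that a collection of events partitions a sample space:
# pairwise disjoint, and their union equals S.
S = frozenset(range(1, 7))
parts = [frozenset({3, 6}), frozenset({1, 4}), frozenset({2, 5})]

# Pairwise disjoint: every distinct pair has empty intersection.
assert all(a & b == frozenset() for i, a in enumerate(parts)
           for b in parts[i + 1:])
# Exhaustive: the union of the parts is all of S.
assert frozenset().union(*parts) == S
```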
1.2 Basics of Probability Theory
Definition 1.2.1 A collection of subsets of $S$ is called a sigma algebra (or Borel field), denoted by $B$, if it satisfies the following three properties:
 $\emptyset \in B$.
 If $A \in B$, then $A^c \in B$.
 If $A_1, A_2, \dots \in B$, then $\cup_{i=1}^{\infty}A_i \in B$.
Definition 1.2.4 Given a sample space $S$ and an associated sigma algebra $B$, a probability function is a function $P$ with domain $B$ that satisfies
 $P(A) \ge 0$ for all $A \in B$.
 $P(S) = 1$.
 If $A_1, A_2, \dots \in B$ are pairwise disjoint, then $P(\cup_{i=1}^{\infty}A_i)= \sum_{i=1}^{\infty}P(A_i)$.
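The three axioms can be verified directly for the simplest case: a finite sample space with equally likely outcomes, where $P(A) = |A|/|S|$. A minimal Python sketch with an assumed fair-die example (not from the text):

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: a fair six-sided die; B is the power set of S.
S = frozenset(range(1, 7))

def P(A):
    """Counting probability: P(A) = |A| / |S| for equally likely outcomes."""
    return Fraction(len(A), len(S))

subsets = [frozenset(c) for r in range(len(S) + 1)
           for c in combinations(S, r)]

assert all(P(A) >= 0 for A in subsets)     # axiom 1: nonnegativity
assert P(S) == 1                           # axiom 2: P(S) = 1
evens, odds = frozenset({2, 4, 6}), frozenset({1, 3, 5})
assert P(evens | odds) == P(evens) + P(odds) == 1   # axiom 3 (disjoint case)
```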
1.3 Conditional Probability and Independence
1.4 Random Variables
1.5 Distribution Functions
1.6 Density and Mass Functions
1.8 Miscellanea
2 Transformations and Expectations
2.1 Distribution of Functions of a Random Variable
Theorem 2.1.4 Let $X$ have pdf $f_X(x)$ and let $Y = g(X)$, where $g$ is a monotone function. Let $\mathcal{X}$ and $\mathcal{Y}$ be defined by (2.1.7). Suppose that $f_X(x)$ is continuous on $\mathcal{X}$ and that $g^{-1}(y)$ has a continuous derivative on $\mathcal{Y}$. Then the pdf of $Y$ is given by
$$
f_Y(y)=f_X(g^{-1}(y))\left|\frac{d}{dy}g^{-1}(y)\right|, \quad y \in \mathcal{Y},
$$
and $f_Y(y) = 0$ otherwise.
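The change-of-variable formula can be checked by Monte Carlo. A sketch with an assumed example (not from the text): $X \sim \mathrm{Uniform}(0,1)$ and $g(x) = -\ln x$, which is monotone, so $Y = g(X) \sim \mathrm{Exp}(1)$ with $f_Y(y) = e^{-y}$:

```python
import math
import random

random.seed(0)

# Theorem 2.1.4 with f_X = 1 on (0,1), g(x) = -ln x, g^{-1}(y) = e^{-y}:
# f_Y(y) = f_X(g^{-1}(y)) * |d/dy g^{-1}(y)| = e^{-y} for y > 0.
def f_Y(y):
    g_inv = math.exp(-y)           # g^{-1}(y)
    dg_inv = math.exp(-y)          # |d/dy g^{-1}(y)|
    f_X = 1.0 if 0 < g_inv < 1 else 0.0
    return f_X * dg_inv

# Monte Carlo: fraction of Y values in [1, 1.2] vs the integral of f_Y.
samples = [-math.log(1 - random.random()) for _ in range(200_000)]
frac = sum(1 <= y <= 1.2 for y in samples) / len(samples)
integral = math.exp(-1) - math.exp(-1.2)   # ∫_1^{1.2} e^{-y} dy
assert abs(frac - integral) < 0.01
```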
2.2 Expected Values
2.3 Moments and Moment Generating Functions
Definition 2.3.6 Let $X$ be a random variable with cdf $F_X$. The moment generating function (mgf) of $X$ (or $F_X$), denoted by $M_X(t)$, is $M_X(t) = \mathrm{E}e^{tX}$, provided that the expectation exists for $t$ in some neighborhood of 0.
Theorem 2.3.7 If $X$ has mgf $M_X(t)$, then $\mathrm{E}X^n = M_X^{(n)}(0) = \frac{d^n}{dt^n}M_X(t)\big|_{t=0}$.
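Theorem 2.3.7 can be illustrated numerically by differentiating an mgf at $t = 0$ with finite differences. A sketch with an assumed standard example, $X \sim \mathrm{Exp}(1)$ with $M_X(t) = 1/(1-t)$ for $t < 1$ (not from the text):

```python
# Moments of X ~ Exp(1) from its mgf M(t) = 1/(1 - t):
# E X = M'(0) = 1 and E X^2 = M''(0) = 2.
def M(t):
    return 1.0 / (1.0 - t)

h = 1e-5
first = (M(h) - M(-h)) / (2 * h)            # central difference ≈ M'(0)
second = (M(h) - 2 * M(0) + M(-h)) / h**2   # central difference ≈ M''(0)

assert abs(first - 1.0) < 1e-6    # E X = 1
assert abs(second - 2.0) < 1e-3   # E X^2 = 2
```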
2.4 Differentiating Under an Integral Sign
Theorem 2.4.1 (Leibnitz’s Rule) If $f(x,\theta), a(\theta), b(\theta)$ are differentiable with respect to $\theta$, then
$$
\frac{d}{d\theta}\int_{a(\theta)}^{b(\theta)}f(x,\theta)\,dx=f(b(\theta),\theta)\frac{d}{d\theta}b(\theta)-f(a(\theta),\theta)\frac{d}{d\theta}a(\theta)+\int_{a(\theta)}^{b(\theta)}\frac{\partial}{\partial\theta}f(x,\theta)\,dx
$$
Notice that if $a(\theta)$ and $b(\theta)$ are constant, we have a special case of Leibnitz’s Rule:
$$
\frac{d}{d\theta}\int_{a}^{b}f(x,\theta)\,dx=\int_{a}^{b}\frac{\partial}{\partial\theta}f(x,\theta)\,dx
$$
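Leibnitz’s Rule can be sanity-checked numerically. A sketch with an assumed example $f(x,\theta) = e^{-\theta x}$, $a(\theta) = 0$, $b(\theta) = \theta$ (not from the text), comparing a finite-difference derivative of the integral with the three terms of the rule:

```python
import math

# I(t) = ∫_0^t exp(-t*x) dx has the closed form (1 - exp(-t^2)) / t.
def I(t):
    return (1 - math.exp(-t * t)) / t

def trapezoid(fn, lo, hi, n=20_000):
    """Composite trapezoid rule for ∫_lo^hi fn(x) dx."""
    h = (hi - lo) / n
    s = 0.5 * (fn(lo) + fn(hi)) + sum(fn(lo + i * h) for i in range(1, n))
    return s * h

t = 1.3
lhs = (I(t + 1e-6) - I(t - 1e-6)) / 2e-6           # d/dt of the integral
rhs = (math.exp(-t * t) * 1.0                      # f(b(t), t) * b'(t)
       - math.exp(-t * 0) * 0.0                    # f(a(t), t) * a'(t) = 0
       + trapezoid(lambda x: -x * math.exp(-t * x), 0.0, t))  # ∫ ∂f/∂t dx
assert abs(lhs - rhs) < 1e-5
```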
2.6 Miscellanea
3 Common Families of Distributions
3.1 Introduction
3.2 Discrete Distributions
Discrete Uniform Distribution
Hypergeometric Distribution
Binomial Distribution
Poisson Distribution
Negative Binomial Distribution
Geometric Distribution
3.3 Continuous Distributions
Uniform Distribution
Gamma Distribution
Normal Distribution
Beta Distribution
Cauchy Distribution
Lognormal Distribution
Exponential Distribution
Double Exponential Distribution
3.4 Exponential Families
A family of pdfs or pmfs is called an exponential family if it can be expressed as
$$
f(x|\boldsymbol{\theta})=h(x)c(\boldsymbol{\theta})\exp\left(\sum_{i=1}^{k}w_i(\boldsymbol{\theta})t_i(x)\right).
$$
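As a concrete instance, the binomial$(n, p)$ pmf factors into exponential-family form with $h(x) = \binom{n}{x}$, $c(p) = (1-p)^n$, $w(p) = \log\frac{p}{1-p}$, and $t(x) = x$. A minimal Python sketch checking that the factored form reproduces the direct pmf (an assumed example, not worked in the notes):

```python
import math

n, p = 10, 0.3

def pmf_direct(x):
    """Binomial pmf: C(n, x) p^x (1-p)^(n-x)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def pmf_expfam(x):
    """Same pmf in exponential-family form h(x) c(p) exp(w(p) t(x))."""
    h = math.comb(n, x)
    c = (1 - p)**n
    w = math.log(p / (1 - p))
    return h * c * math.exp(w * x)

assert all(abs(pmf_direct(x) - pmf_expfam(x)) < 1e-12 for x in range(n + 1))
```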
3.5 Location and Scale Families
3.6 Inequalities and Identities
3.6.1 Probability Inequalities
Theorem 3.6.1 (Chebychev’s Inequality) Let $X$ be a random variable and let $g(x)$ be a nonnegative function. Then, for any $r > 0$, $P(g(X) \ge r) \le \frac{\mathrm{E}g(X)}{r}$.
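Taking $g(x) = (x-\mu)^2/\sigma^2$ and $r = k^2$ gives the familiar form $P(|X-\mu| \ge k\sigma) \le 1/k^2$. A quick empirical check with an assumed example, $X \sim N(0,1)$ (the bound holds for any distribution, and is typically loose):

```python
import random

random.seed(42)

# Empirical tail probability P(|X - mu| >= k*sigma) vs Chebychev's bound 1/k^2.
mu, sigma, k = 0.0, 1.0, 2.0
xs = [random.gauss(mu, sigma) for _ in range(100_000)]
tail = sum(abs(x - mu) >= k * sigma for x in xs) / len(xs)

# For N(0,1) the true tail is about 0.0455, well under the bound 0.25.
assert tail <= 1 / k**2
```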
3.6.2 Identities
3.8 Miscellanea
3.8.2 Chebychev and Beyond
4 Multiple Random Variables
4.1 Joint and Marginal Distributions
The marginal pmf of $X$ is $f_X(x) = \sum_{y} f_{X,Y}(x,y)$.
4.2 Conditional Distributions and Independence
4.3 Bivariate Transformations
4.4 Hierarchical Models and Mixture Distributions
Theorem 4.4.3 If $X$ and $Y$ are any two random variables, then $\mathrm{E}X = \mathrm{E}(\mathrm{E}(X|Y))$, provided that the expectations exist.
Theorem 4.4.7 For any two random variables $X$ and $Y$, $\mathrm{Var}\,X = \mathrm{E}(\mathrm{Var}(X|Y)) + \mathrm{Var}(\mathrm{E}(X|Y))$, provided that the expectations exist.
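Both identities can be checked by simulating a hierarchical model. A sketch with an assumed hierarchy (not from the text): $Y \sim \mathrm{Uniform}(0,1)$ and $X \mid Y \sim N(Y, 1)$, so $\mathrm{E}(X|Y) = Y$ and $\mathrm{Var}(X|Y) = 1$, which predicts $\mathrm{E}X = 1/2$ and $\mathrm{Var}\,X = 1 + 1/12$:

```python
import random
from statistics import mean, variance

random.seed(7)

# Hierarchy: Y ~ Uniform(0,1), X | Y ~ N(Y, 1).
N = 200_000
ys = [random.random() for _ in range(N)]
xs = [random.gauss(y, 1.0) for y in ys]

# E X = E(E(X|Y)) = E Y = 1/2
assert abs(mean(xs) - 0.5) < 0.02
# Var X = E(Var(X|Y)) + Var(E(X|Y)) = 1 + Var Y = 1 + 1/12
assert abs(variance(xs) - (1 + 1 / 12)) < 0.03
```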
4.5 Covariance and Correlation
covariance: $\mathrm{Cov}(X,Y) = \mathrm{E}((X - \mu_X)(Y - \mu_Y))$
correlation: $\rho_{XY} = \mathrm{Cov}(X,Y)/(\sigma_X \sigma_Y)$
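Both definitions translate directly into code. A sketch computing covariance and correlation from samples of an assumed linear-plus-noise pair (not from the text), $X \sim N(0,1)$ and $Y = 2X + \varepsilon$ with $\varepsilon \sim N(0,1)$, for which $\mathrm{Cov}(X,Y) = 2$ and $\rho_{XY} = 2/\sqrt{5}$:

```python
import random
from statistics import mean

random.seed(1)

n = 50_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [2 * x + random.gauss(0, 1) for x in xs]

# Cov(X,Y) = E((X - mu_X)(Y - mu_Y)) and rho = Cov / (sigma_X sigma_Y),
# estimated with plug-in (divide-by-n) sample moments.
mx, my = mean(xs), mean(ys)
cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
sx = mean((x - mx) ** 2 for x in xs) ** 0.5
sy = mean((y - my) ** 2 for y in ys) ** 0.5
rho = cov / (sx * sy)

assert abs(cov - 2.0) < 0.1            # true Cov(X, 2X + e) = 2 Var X = 2
assert abs(rho - 2 / 5**0.5) < 0.02    # true rho = 2 / sqrt(5)
```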
Theorem 4.5.3 For any random variables $X$ and $Y$, $\mathrm{Cov}(X,Y) = \mathrm{E}XY - \mu_X\mu_Y$.
Theorem 4.5.5 If $X$ and $Y$ are independent random variables, then $\mathrm{Cov}(X,Y) = 0$ and $\rho_{XY} = 0$.
Theorem 4.5.6 If $X$ and $Y$ are any two random variables and $a$ and $b$ are any two constants, then $\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}\,X + b^2\,\mathrm{Var}\,Y + 2ab\,\mathrm{Cov}(X,Y)$.
Theorem 4.5.7 For any random variables $X$ and $Y$,
 $-1 \le \rho_{XY} \le 1$.
 $\rho_{XY}^2 = 1$ if and only if there exist numbers $a \neq 0$ and $b$ such that $P(Y = aX + b) = 1$. If $\rho_{XY}=1$, then $a > 0$, and if $\rho_{XY} = -1$, then $a < 0$.
4.6 Multivariate Distributions
4.7 Inequalities
4.7.1 Numerical Inequalities
Lemma 4.7.1 Let $a$ and $b$ be any positive numbers, and let $p$ and $q$ be any positive numbers (necessarily greater than 1) satisfying
$$
\frac{1}{p} + \frac{1}{q} = 1.
$$
Then,
$$
\frac{1}{p}a^p + \frac{1}{q}b^q \ge ab,
$$
with equality if and only if $a^p = b^q$.
Theorem 4.7.2 (Holder’s Inequality) Let $X$ and $Y$ be any two random variables, and let $p$ and $q$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Then $|\mathrm{E}XY| \le \mathrm{E}|XY| \le (\mathrm{E}|X|^p)^{1/p}(\mathrm{E}|Y|^q)^{1/q}$.
Theorem 4.7.5 (Minkowski’s Inequality) Let $X$ and $Y$ be any two random variables. Then for $1 \le p < \infty$, $(\mathrm{E}|X+Y|^p)^{1/p} \le (\mathrm{E}|X|^p)^{1/p} + (\mathrm{E}|Y|^p)^{1/p}$.
4.7.2 Functional Inequalities
Theorem 4.7.7 (Jensen’s Inequality) For any random variable $X$, if $g(x)$ is a convex function, then $\mathrm{E}g(X) \ge g(\mathrm{E}X)$.
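A quick empirical illustration with the assumed convex function $g(x) = x^2$ and samples of $X \sim \mathrm{Uniform}(0,1)$ (not from the text): the sample mean of squares always dominates the square of the sample mean, since their difference is the (nonnegative) sample variance:

```python
import random
from statistics import mean

random.seed(3)

# Jensen with g(x) = x^2: E g(X) >= g(E X).
# For Uniform(0,1), E X^2 = 1/3 while (E X)^2 = 1/4.
xs = [random.random() for _ in range(10_000)]
assert mean(x**2 for x in xs) >= mean(xs) ** 2
```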
Theorem 4.7.9 (Covariance Inequality) Let $X$ be any random variable and $g(x)$ and $h(x)$ any functions such that $\mathrm{E}g(X)$, $\mathrm{E}h(X)$, and $\mathrm{E}(g(X)h(X))$ exist.

If $g(x)$ is a nondecreasing function and $h(x)$ is a nonincreasing function, then $\mathrm{E}(g(X)h(X)) \le (\mathrm{E}g(X))(\mathrm{E}h(X))$.

If $g(x)$ and $h(x)$ are either both nondecreasing or both nonincreasing, then $\mathrm{E}(g(X)h(X)) \ge (\mathrm{E}g(X))(\mathrm{E}h(X))$.
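The oppositely-monotone case can be illustrated by simulation with an assumed pair (not from the text): $g(x) = x$ (nondecreasing) and $h(x) = 1 - x$ (nonincreasing) with $X \sim \mathrm{Uniform}(0,1)$, where $\mathrm{E}(g(X)h(X)) = 1/6 \le 1/4 = (\mathrm{E}g(X))(\mathrm{E}h(X))$:

```python
import random
from statistics import mean

random.seed(9)

# Covariance inequality, opposite monotonicity: E(g(X)h(X)) <= Eg(X) Eh(X).
xs = [random.random() for _ in range(100_000)]
lhs = mean(x * (1 - x) for x in xs)           # ≈ E X(1-X) = 1/6
rhs = mean(xs) * mean(1 - x for x in xs)      # ≈ (1/2)(1/2) = 1/4
assert lhs <= rhs
```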