# Statistical Inference 5-8

# 5 Properties of a Random Sample

## 5.1 Basic Concepts of Random Samples

## 5.2 Sums of Random Variables from a Random Sample

**Lemma 5.2.5**: Let \(X_1, \dots, X_n\) be a random sample from a population and let \(g(x)\) be a function such that \(\mathrm{E}\, g(X_1)\) and \(\mathrm{Var}\, g(X_1)\) exist. Then

$$\mathrm{E}\left(\sum_{i=1}^{n} g(X_i)\right) = n \,\mathrm{E}\, g(X_1)$$

and

$$\mathrm{Var}\left(\sum_{i=1}^{n} g(X_i)\right) = n \,\mathrm{Var}\, g(X_1).$$

**Theorem 5.2.6**: Let \(X_1, \dots, X_n\) be a random sample from a population with mean \(\mu\) and variance \(\sigma^2 < \infty\). Then

- \(\mathrm{E}\,\bar{X} = \mu\),
- \(\mathrm{Var}\,\bar{X} = \sigma^2/n\),
- \(\mathrm{E}\,S^2 = \sigma^2\).

**Theorem 5.2.7**: Let \(X_1, \dots, X_n\) be a random sample from a population with mgf \(M_X(t)\). Then the mgf of the sample mean is

$$M_{\bar{X}}(t) = \left[M_X(t/n)\right]^n.$$

**Example 5.2.8**: Let \(X_1, \dots, X_n\) be a random sample from a \(\mathrm{n}(\mu, \sigma^2)\) population. Then the mgf of the sample mean is

$$M_{\bar{X}}(t) = \left[\exp\left(\mu t/n + \sigma^2 (t/n)^2/2\right)\right]^n = \exp\left(\mu t + \frac{\sigma^2 t^2}{2n}\right).$$

Thus, \(\bar{X}\) has a \(\mathrm{n}(\mu, \sigma^2/n)\) distribution.

Another simple example is given by a \(\mathrm{gamma}(\alpha, \beta)\) random sample. Here, we can also easily derive the distribution of the sample mean. The mgf of the sample mean is

$$M_{\bar{X}}(t) = \left[\left(\frac{1}{1 - \beta t/n}\right)^{\alpha}\right]^n = \left(\frac{1}{1 - (\beta/n)t}\right)^{n\alpha},$$

which we recognize as the mgf of a \(\mathrm{gamma}(n\alpha, \beta/n)\), the distribution of \(\bar{X}\).
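A quick simulation sketch (my addition, not part of the text) checks this: the mean of \(n\) \(\mathrm{gamma}(\alpha, \beta)\) draws should behave like a \(\mathrm{gamma}(n\alpha, \beta/n)\) variable, with mean \(\alpha\beta\) and variance \(\alpha\beta^2/n\).

```python
import numpy as np

# Empirical check (editor's example): the sample mean of n Gamma(alpha, beta)
# draws has mean alpha*beta and variance alpha*beta**2 / n, matching the
# gamma(n*alpha, beta/n) distribution derived above.
rng = np.random.default_rng(0)
alpha, beta, n = 2.0, 3.0, 10
reps = 200_000

# numpy's gamma uses shape/scale, matching the (alpha, beta) form here
samples = rng.gamma(shape=alpha, scale=beta, size=(reps, n))
xbar = samples.mean(axis=1)

emp_mean, emp_var = xbar.mean(), xbar.var()
print(emp_mean, emp_var)  # close to alpha*beta = 6.0 and alpha*beta^2/n = 1.8
```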

**Theorem 5.2.9**: If \(X\) and \(Y\) are independent continuous random variables with pdfs \(f_X(x)\) and \(f_Y(y)\), then the pdf of \(Z = X + Y\) is

$$f_Z(z) = \int_{-\infty}^{\infty} f_X(w) f_Y(z - w) \, dw.$$
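The convolution integral can be checked numerically (my sketch, not from the text): convolving two Uniform(0,1) densities on a grid should recover the triangular density of \(Z = X + Y\) on \([0, 2]\).

```python
import numpy as np

# Numerical check (editor's example): discrete convolution of two
# Uniform(0,1) pdfs approximates the triangular pdf of their sum.
dx = 0.001
x = np.arange(0.0, 1.0, dx)
f = np.ones_like(x)          # Uniform(0,1) pdf on the grid

fz = np.convolve(f, f) * dx  # Riemann-sum approximation of the integral
z = np.arange(fz.size) * dx

# The triangular pdf peaks at z = 1 with height 1, and integrates to 1
peak = fz[np.argmin(np.abs(z - 1.0))]
print(peak, fz.sum() * dx)
```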

**Example 5.2.10 (Sum of Cauchy random variables)**: If \(Z\) and \(W\) are independent standard Cauchy random variables, the convolution formula shows that \((Z + W)/2\) is again standard Cauchy; more generally, the mean of \(n\) iid standard Cauchy random variables is standard Cauchy for every \(n\), so averaging does not concentrate the distribution.
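A simulation illustration (mine, not from the text): no matter how large \(n\) is, the quartiles of the mean of \(n\) standard Cauchy variables stay at \(\pm 1\), the quartiles of a single standard Cauchy.

```python
import numpy as np

# Simulation (editor's example): the mean of n standard Cauchy variables is
# again standard Cauchy, so its quartiles remain at -1 and 1 for any n.
rng = np.random.default_rng(1)
n, reps = 100, 100_000
xbar = rng.standard_cauchy(size=(reps, n)).mean(axis=1)

q1, med, q3 = np.quantile(xbar, [0.25, 0.5, 0.75])
print(q1, med, q3)  # near -1, 0, 1, matching a single standard Cauchy
```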

**Theorem 5.2.11**: Suppose \(X_1, \dots, X_n\) is a random sample from a pdf or pmf \(f(x|\theta)\), where

$$f(x|\theta) = h(x) c(\theta) \exp\left(\sum_{i=1}^{k} w_i(\theta) t_i(x)\right)$$

is a member of an exponential family. Define statistics \(T_1, \dots, T_k\) by

$$T_i(X_1, \dots, X_n) = \sum_{j=1}^{n} t_i(X_j), \quad i = 1, \dots, k.$$

If the set \(\{(w_1(\theta), \dots, w_k(\theta)) : \theta \in \Theta\}\) contains an open subset of \(\mathbb{R}^k\), then the distribution of \((T_1, \dots, T_k)\) is an exponential family of the form

$$f_T(u_1, \dots, u_k | \theta) = H(u_1, \dots, u_k) [c(\theta)]^n \exp\left(\sum_{i=1}^{k} w_i(\theta) u_i\right).$$

**Example 5.2.12 (Sum of Bernoulli random variables)**: Let \(X_1, \dots, X_n\) be a random sample from a \(\mathrm{Bernoulli}(p)\) population. Then \(X_1 + \cdots + X_n \sim \mathrm{binomial}(n, p)\), as Theorem 5.2.11 confirms.

## 5.3 Sampling from the Normal Distribution

### 5.3.1 Properties of the Sample Mean and Variance

**Theorem 5.3.1**: Let \(X_1, \dots, X_n\) be a random sample from a \(\mathrm{n}(\mu, \sigma^2)\) distribution, and let \(\bar{X} = (1/n)\sum_{i=1}^{n} X_i\) and \(S^2 = [1/(n-1)]\sum_{i=1}^{n} (X_i - \bar{X})^2\). Then

- \(\bar{X}\) and \(S^2\) are independent random variables,
- \(\bar{X}\) has a \(\mathrm{n}(\mu, \sigma^2/n)\) distribution,
- \((n-1)S^2/\sigma^2\) has a chi squared distribution with \(n-1\) degrees of freedom.
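All three claims can be checked by simulation (my sketch, not from the text): \((n-1)S^2/\sigma^2\) should have the \(\chi^2_{n-1}\) mean \(n-1\) and variance \(2(n-1)\), and \(\bar{X}\) should be uncorrelated with \(S^2\).

```python
import numpy as np

# Simulation check (editor's example): for normal samples with n = 8,
# (n-1)S^2/sigma^2 should match chi-squared with 7 df (mean 7, variance 14),
# and Xbar should be uncorrelated with S^2 (a symptom of independence).
rng = np.random.default_rng(2)
mu, sigma, n, reps = 5.0, 2.0, 8, 200_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s2 = x.var(axis=1, ddof=1)           # sample variance with n-1 divisor
q = (n - 1) * s2 / sigma**2

corr = np.corrcoef(xbar, s2)[0, 1]
print(q.mean(), q.var(), corr)       # close to 7, 14, and 0
```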

**Lemma 5.3.2**:

- If \(Z\) is a \(\mathrm{n}(0, 1)\) random variable, then \(Z^2 \sim \chi^2_1\).
- If \(X_1, \dots, X_n\) are independent and \(X_i \sim \chi^2_{p_i}\), then \(X_1 + \cdots + X_n \sim \chi^2_{p_1 + \cdots + p_n}\).

## 5.4 Order Statistics

**Theorem 5.4.3**: Let \(X_{(1)}, \dots, X_{(n)}\) denote the order statistics of a random sample from a discrete distribution with pmf \(f_X(x_i) = p_i\), where \(x_1 < x_2 < \cdots\) are the possible values of \(X\) in ascending order, and define \(P_i = p_1 + \cdots + p_i\). Then

$$P(X_{(j)} \le x_i) = \sum_{k=j}^{n} \binom{n}{k} P_i^k (1 - P_i)^{n-k}$$

and

$$P(X_{(j)} = x_i) = \sum_{k=j}^{n} \binom{n}{k} \left[ P_i^k (1 - P_i)^{n-k} - P_{i-1}^k (1 - P_{i-1})^{n-k} \right].$$

**Theorem 5.4.4**: Let \(X_{(1)}, \dots, X_{(n)}\) denote the order statistics of a random sample \(X_1, \dots, X_n\) from a continuous population with cdf \(F_X(x)\) and pdf \(f_X(x)\). Then the pdf of \(X_{(j)}\) is

$$f_{X_{(j)}}(x) = \frac{n!}{(j-1)!(n-j)!} f_X(x) [F_X(x)]^{j-1} [1 - F_X(x)]^{n-j}.$$

**Theorem 5.4.6**: Under the same conditions, the joint pdf of \(X_{(i)}\) and \(X_{(j)}\), \(1 \le i < j \le n\), is

$$f_{X_{(i)}, X_{(j)}}(u, v) = \frac{n!}{(i-1)!(j-1-i)!(n-j)!} f_X(u) f_X(v) [F_X(u)]^{i-1} [F_X(v) - F_X(u)]^{j-1-i} [1 - F_X(v)]^{n-j}$$

for \(-\infty < u < v < \infty\).

## 5.5 Convergence Concepts

### 5.5.1 Convergence in Probability

**Definition 5.5.1**: A sequence of random variables \(X_1, X_2, \dots\) converges in probability to a random variable \(X\) if, for every \(\epsilon > 0\),

$$\lim_{n \to \infty} P(|X_n - X| \ge \epsilon) = 0$$

or, equivalently,

$$\lim_{n \to \infty} P(|X_n - X| < \epsilon) = 1.$$

**Theorem 5.5.2 (Weak Law of Large Numbers)**: Let \(X_1, X_2, \dots\) be iid random variables with \(\mathrm{E}\,X_i = \mu\) and \(\mathrm{Var}\,X_i = \sigma^2 < \infty\). Then, for every \(\epsilon > 0\), \(\lim_{n \to \infty} P(|\bar{X}_n - \mu| < \epsilon) = 1\); that is, \(\bar{X}_n\) converges in probability to \(\mu\).
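A simulation illustration (mine, not from the text): running means of Bernoulli(0.3) draws drift toward \(p = 0.3\), as the weak law predicts.

```python
import numpy as np

# Simulation (editor's example): the running mean of Bernoulli(0.3) draws
# settles near p = 0.3 as n grows.
rng = np.random.default_rng(9)
x = rng.binomial(1, 0.3, size=1_000_000)
running = np.cumsum(x) / np.arange(1, x.size + 1)
print(running[999], running[-1])  # the final value is very close to 0.3
```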

### 5.5.2 Almost Sure Convergence

**Definition 5.5.6**: A sequence of random variables \(X_1, X_2, \dots\) converges almost surely to a random variable \(X\) if, for every \(\epsilon > 0\), \(P(\lim_{n \to \infty} |X_n - X| < \epsilon) = 1\).

**Theorem 5.5.9 (Strong Law of Large Numbers)**: Let \(X_1, X_2, \dots\) be iid random variables with \(\mathrm{E}\,X_i = \mu\) and \(\mathrm{Var}\,X_i = \sigma^2 < \infty\). Then \(\bar{X}_n\) converges almost surely to \(\mu\).

### 5.5.3 Convergence in Distribution

**Definition 5.5.10**: A sequence of random variables \(X_1, X_2, \dots\) converges in distribution to a random variable \(X\) if \(\lim_{n \to \infty} F_{X_n}(x) = F_X(x)\) at all points \(x\) where \(F_X(x)\) is continuous.

**Theorem 5.5.14 (Central Limit Theorem)**: Let \(X_1, X_2, \dots\) be a sequence of iid random variables whose mgfs exist in a neighborhood of 0 (that is, \(M_{X_i}(t)\) exists for \(|t| < h\), for some positive \(h\)). Let \(\mathrm{E}\,X_i = \mu\) and \(\mathrm{Var}\,X_i = \sigma^2 > 0\). (Both \(\mu\) and \(\sigma^2\) are finite since the mgf exists.) Define \(\bar{X}_n = (1/n)\sum_{i=1}^{n} X_i\). Let \(G_n(x)\) denote the cdf of \(\sqrt{n}(\bar{X}_n - \mu)/\sigma\). Then, for any \(x\), \(-\infty < x < \infty\),

$$\lim_{n \to \infty} G_n(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy;$$

that is, \(\sqrt{n}(\bar{X}_n - \mu)/\sigma\) has a limiting standard normal distribution.
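A simulation sketch (mine, not from the text): standardized means of Exponential(1) draws, \(\sqrt{n}(\bar{X}_n - 1)/1\), should be approximately standard normal for large \(n\).

```python
import numpy as np

# Simulation (editor's example): for Exponential(1) data (mean 1, variance 1),
# the standardized sample mean is approximately n(0, 1) when n is large.
rng = np.random.default_rng(3)
n, reps = 200, 50_000
x = rng.exponential(scale=1.0, size=(reps, n))
z = np.sqrt(n) * (x.mean(axis=1) - 1.0) / 1.0

# Compare P(Z <= 1) with the standard normal value Phi(1) ~ 0.8413
p = (z <= 1.0).mean()
print(p)
```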

### 5.5.4 The Delta Method

## 5.6 Generating a Random Sample

### 5.6.1 Direct Methods

### 5.6.2 Indirect Methods

### 5.6.3 The Accept/Reject Algorithm
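The algorithm itself is not written out in these notes; a minimal sketch (my example, with a Beta(2,2) target and uniform proposals): since the target pdf \(f(x) = 6x(1-x)\) satisfies \(f(x) \le M = 1.5\) on \([0, 1]\), a proposal \(Y \sim \mathrm{Uniform}(0,1)\) is accepted when an independent \(U \le f(Y)/M\).

```python
import numpy as np

# Accept/reject sketch (editor's example): sample from the Beta(2,2) density
# f(x) = 6x(1-x) using Uniform(0,1) proposals with envelope constant M = 1.5.
rng = np.random.default_rng(4)

def beta22_pdf(x):
    return 6.0 * x * (1.0 - x)

M = 1.5
y = rng.uniform(size=300_000)        # proposals
u = rng.uniform(size=300_000)
draws = y[u <= beta22_pdf(y) / M]    # keep accepted proposals only

print(draws.mean(), draws.var())     # Beta(2,2) has mean 1/2, variance 1/20
```

The expected acceptance rate is \(1/M = 2/3\), so about two thirds of the proposals survive.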

### 5.6.4 The MCMC Methods

**Gibbs Sampler**

**Metropolis Algorithm**
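A random-walk Metropolis sketch (my example, not from the text): target the standard normal density, known only up to its constant, with symmetric normal proposals; a move is accepted with probability \(\min(1, f(\text{prop})/f(x))\).

```python
import numpy as np

# Random-walk Metropolis (editor's example): target N(0, 1), using only the
# unnormalized log-density -x^2/2 and symmetric normal proposals.
rng = np.random.default_rng(5)

def log_target(x):
    return -0.5 * x * x  # log of the unnormalized n(0, 1) density

x = 0.0
chain = np.empty(100_000)
for i in range(chain.size):
    prop = x + rng.normal(scale=1.0)                  # symmetric proposal
    if np.log(rng.uniform()) < log_target(prop) - log_target(x):
        x = prop                                      # accept the move
    chain[i] = x                                      # else keep current x

burned = chain[10_000:]              # discard burn-in
print(burned.mean(), burned.var())   # close to 0 and 1
```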

# 6 Principles of Data Reduction

## 6.1 Introduction

Three principles of data reduction:

- The Sufficiency Principle
- The Likelihood Principle
- The Equivariance Principle

## 6.2 The Sufficiency Principle

### 6.2.1 Sufficient Statistics

**Definition 6.2.1**: A statistic \(T(X)\) is a sufficient statistic for \(\theta\) if the conditional distribution of the sample \(X\) given the value of \(T(X)\) does not depend on \(\theta\).

**Theorem 6.2.2**: If \(p(x|\theta)\) is the joint pdf or pmf of \(X\) and \(q(t|\theta)\) is the pdf or pmf of \(T(X)\), then \(T(X)\) is a sufficient statistic for \(\theta\) if, for every \(x\) in the sample space, the ratio \(p(x|\theta)/q(T(x)|\theta)\) is constant as a function of \(\theta\).

**Theorem 6.2.6 (Factorization Theorem)**: Let \(f(x|\theta)\) denote the joint pdf or pmf of a sample \(X\). A statistic \(T(X)\) is a sufficient statistic for \(\theta\) if and only if there exist functions \(g(t|\theta)\) and \(h(x)\) such that, for all sample points \(x\) and all parameter points \(\theta\),

$$f(x|\theta) = g(T(x)|\theta) h(x).$$

### 6.2.2 Minimal Sufficient Statistics

### 6.2.3 Ancillary Statistics

**Definition 6.2.16**: A statistic \(S(X)\) whose distribution does not depend on the parameter \(\theta\) is called an ancillary statistic.

### 6.2.4 Sufficient, Ancillary, and Complete Statistics

**Example 6.2.20 (Ancillary precision)**

**Theorem 6.2.24 (Basu's Theorem)**: If \(T(X)\) is a complete and minimal sufficient statistic, then \(T(X)\) is independent of every ancillary statistic.

**Theorem 6.2.25 (Complete statistic in the exponential family)**: Let \(X_1, \dots, X_n\) be iid observations from an exponential family with pdf or pmf of the form

$$f(x|\theta) = h(x) c(\theta) \exp\left(\sum_{j=1}^{k} w_j(\theta) t_j(x)\right),$$

where \(\theta = (\theta_1, \dots, \theta_k)\). Then the statistic

$$T(X) = \left(\sum_{i=1}^{n} t_1(X_i), \dots, \sum_{i=1}^{n} t_k(X_i)\right)$$

is complete as long as the parameter space \(\Theta\) contains an open set in \(\mathbb{R}^k\).

## 6.3 The Likelihood Principle

## 6.4 The Equivariance Principle

# 7 Point Estimation

## 7.1 Introduction

This chapter is divided into two parts. The first part deals with methods for finding estimators, and the second part deals with evaluating these estimators.

**Definition 7.1.1**: A point estimator is any function \(W(X_1, \dots, X_n)\) of a sample; that is, any statistic is a point estimator.

## 7.2 Methods of Finding Estimators

### 7.2.1 Method of Moments

### 7.2.2 Maximum Likelihood Estimators

### 7.2.3 Bayes Estimators

### 7.2.4 The EM Algorithm

## 7.3 Methods of Evaluating Estimators

### 7.3.1 Mean Squared Error

**Definition 7.3.1**: The mean squared error (MSE) of an estimator \(W\) of a parameter \(\theta\) is the function of \(\theta\) defined by \(\mathrm{E}_\theta (W - \theta)^2\).
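A simulation sketch (mine, not from the text) comparing the MSE of two variance estimators for normal data, the unbiased \(S^2\) and the MLE \((n-1)S^2/n\): the MLE is biased but trades bias for variance and has smaller MSE here.

```python
import numpy as np

# Simulation (editor's example): for n(0, sigma^2) data, compare the MSE of
# the unbiased S^2 (divisor n-1) and the MLE of sigma^2 (divisor n).
rng = np.random.default_rng(6)
sigma2, n, reps = 4.0, 10, 200_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2 = x.var(axis=1, ddof=1)     # unbiased estimator, MSE = 2 sigma^4/(n-1)
mle = x.var(axis=1, ddof=0)    # MLE, MSE = (2n-1) sigma^4 / n^2

mse_s2 = ((s2 - sigma2) ** 2).mean()
mse_mle = ((mle - sigma2) ** 2).mean()
print(mse_s2, mse_mle)         # the MLE's MSE is the smaller of the two
```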

### 7.3.2 Best Unbiased Estimators

**Definition 7.3.7**: An estimator \(W^*\) is a best unbiased estimator of \(\tau(\theta)\) if it satisfies \(\mathrm{E}_\theta W^* = \tau(\theta)\) for all \(\theta\) and, for any other estimator \(W\) with \(\mathrm{E}_\theta W = \tau(\theta)\), we have \(\mathrm{Var}_\theta W^* \le \mathrm{Var}_\theta W\) for all \(\theta\). \(W^*\) is also called a uniform minimum variance unbiased estimator (UMVUE) of \(\tau(\theta)\).

**Theorem 7.3.9 (Cramer-Rao Inequality)**: Let \(X_1, \dots, X_n\) be a sample with pdf \(f(x|\theta)\), and let \(W(X) = W(X_1, \dots, X_n)\) be any estimator satisfying

$$\frac{d}{d\theta} \mathrm{E}_\theta W(X) = \int_{\mathcal{X}} \frac{\partial}{\partial \theta} \left[ W(x) f(x|\theta) \right] dx$$

and

$$\mathrm{Var}_\theta W(X) < \infty.$$

Then

$$\mathrm{Var}_\theta W(X) \ge \frac{\left(\frac{d}{d\theta} \mathrm{E}_\theta W(X)\right)^2}{\mathrm{E}_\theta \left( \left( \frac{\partial}{\partial \theta} \log f(X|\theta) \right)^2 \right)}.$$

**Corollary 7.3.10 (Cramer-Rao Inequality, iid case)**: If the assumptions of Theorem 7.3.9 are satisfied and, additionally, if \(X_1, \dots, X_n\) are iid with pdf \(f(x|\theta)\), then

$$\mathrm{Var}_\theta W(X) \ge \frac{\left(\frac{d}{d\theta} \mathrm{E}_\theta W(X)\right)^2}{n \,\mathrm{E}_\theta \left( \left( \frac{\partial}{\partial \theta} \log f(X|\theta) \right)^2 \right)}.$$

**Lemma 7.3.11**: If \(f(x|\theta)\) satisfies

$$\frac{d}{d\theta} \mathrm{E}_\theta \left( \frac{\partial}{\partial \theta} \log f(X|\theta) \right) = \int \frac{\partial}{\partial \theta} \left[ \left( \frac{\partial}{\partial \theta} \log f(x|\theta) \right) f(x|\theta) \right] dx$$

(true for an exponential family), then

$$\mathrm{E}_\theta \left( \left( \frac{\partial}{\partial \theta} \log f(X|\theta) \right)^2 \right) = -\mathrm{E}_\theta \left( \frac{\partial^2}{\partial \theta^2} \log f(X|\theta) \right).$$

**Corollary 7.3.15 (Attainment)**: Let \(X_1, \dots, X_n\) be iid \(f(x|\theta)\), where \(f(x|\theta)\) satisfies the conditions of the Cramer-Rao Theorem. Let \(L(\theta|x) = \prod_{i=1}^{n} f(x_i|\theta)\) denote the likelihood function. If \(W(X) = W(X_1, \dots, X_n)\) is any unbiased estimator of \(\tau(\theta)\), then \(W(X)\) attains the Cramer-Rao Lower Bound if and only if

$$a(\theta) \left[ W(x) - \tau(\theta) \right] = \frac{\partial}{\partial \theta} \log L(\theta|x)$$

for some function \(a(\theta)\).

### 7.3.3 Sufficiency and Unbiasedness

**Theorem 7.3.17 (Rao-Blackwell)**: Let \(W\) be any unbiased estimator of \(\tau(\theta)\), and let \(T\) be a sufficient statistic for \(\theta\). Define \(\phi(T) = \mathrm{E}(W|T)\). Then \(\mathrm{E}_\theta \phi(T) = \tau(\theta)\) and \(\mathrm{Var}_\theta \phi(T) \le \mathrm{Var}_\theta W\) for all \(\theta\); that is, \(\phi(T)\) is a uniformly better unbiased estimator of \(\tau(\theta)\).
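A simulation sketch (mine, not from the text): for Bernoulli(\(p\)) data, \(W = X_1\) is unbiased for \(p\), and conditioning on the sufficient statistic \(T = \sum X_i\) gives \(\phi(T) = \mathrm{E}(X_1|T) = T/n = \bar{X}\), which is still unbiased but far less variable.

```python
import numpy as np

# Simulation (editor's example): Rao-Blackwellizing W = X_1 with T = sum(X_i)
# yields phi(T) = Xbar; the variance drops from p(1-p) to p(1-p)/n.
rng = np.random.default_rng(10)
p, n, reps = 0.3, 20, 200_000
x = rng.binomial(1, p, size=(reps, n))

w = x[:, 0]               # crude unbiased estimator of p
phi = x.mean(axis=1)      # Rao-Blackwellized version, E(X_1 | T) = T/n

print(w.mean(), phi.mean())  # both near 0.3 (still unbiased)
print(w.var(), phi.var())    # variance shrinks by roughly a factor of n
```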

**Theorem 7.3.19**: If \(W\) is a best unbiased estimator of \(\tau(\theta)\), then \(W\) is unique.

**Theorem 7.3.20**: If \(\mathrm{E}_\theta W = \tau(\theta)\), then \(W\) is the best unbiased estimator of \(\tau(\theta)\) if and only if \(W\) is uncorrelated with all unbiased estimators of 0.

**Theorem 7.3.23**: Let \(T\) be a complete sufficient statistic for a parameter \(\theta\), and let \(\phi(T)\) be any estimator based only on \(T\). Then \(\phi(T)\) is the unique best unbiased estimator of its expected value.

### 7.3.4 Loss Function Optimality

# 8 Hypothesis Testing

## 8.1 Introduction

**Definition 8.1.1** A hypothesis is a statement about a population parameter.

**Definition 8.1.2**: The two complementary hypotheses in a hypothesis testing problem are called the null hypothesis and the alternative hypothesis. They are denoted by \(H_0\) and \(H_1\), respectively.

**Definition 8.1.3** A hypothesis testing procedure or hypothesis test is a rule that specifies:

- For which sample values the decision is made to accept \(H_0\) as true.
- For which sample values \(H_0\) is rejected and \(H_1\) is accepted as true.

The subset of the sample space for which \(H_0\) will be rejected is called the *rejection region* or *critical region*. The complement of the rejection region is called the *acceptance region*.

## 8.2 Methods of Finding Tests

### 8.2.1 Likelihood Ratio Tests

**Definition 8.2.1**: The likelihood ratio test statistic for testing \(H_0: \theta \in \Theta_0\) versus \(H_1: \theta \in \Theta_0^c\) is

$$\lambda(x) = \frac{\sup_{\Theta_0} L(\theta|x)}{\sup_{\Theta} L(\theta|x)}.$$

A likelihood ratio test (LRT) is any test that has a rejection region of the form \(\{x : \lambda(x) \le c\}\), where \(c\) is any number satisfying \(0 \le c \le 1\).
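A worked sketch (my example, following the classic normal-mean case): for a \(\mathrm{n}(\mu, 1)\) sample and \(H_0: \mu = \mu_0\), the restricted supremum is at \(\mu_0\) and the unrestricted MLE is \(\bar{x}\), so \(\lambda(x) = \exp(-n(\bar{x} - \mu_0)^2/2)\).

```python
import numpy as np

# LRT sketch (editor's example): compute lambda(x) for H0: mu = mu0 with
# n(mu, 1) data, and compare against the closed form exp(-n(xbar-mu0)^2/2).
rng = np.random.default_rng(7)
mu0, n = 0.0, 25
x = rng.normal(0.3, 1.0, size=n)   # data generated away from H0
xbar = x.mean()

def log_lik(mu, x):
    return -0.5 * np.sum((x - mu) ** 2)  # log-likelihood up to a constant

lam = np.exp(log_lik(mu0, x) - log_lik(xbar, x))
closed_form = np.exp(-n * (xbar - mu0) ** 2 / 2.0)
print(lam, closed_form)  # the two agree; small lambda is evidence against H0
```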

### 8.2.2 Bayesian Tests

### 8.2.3 Union-Intersection and Intersection-Union Tests

## 8.3 Methods of Evaluating Tests

### 8.3.1 Error Probabilities and the Power Function

Type I and Type II Errors:

| | Accept \(H_0\) | Reject \(H_0\) |
|---|---|---|
| \(H_0\) true | Correct | Type I Error |
| \(H_1\) true | Type II Error | Correct |

**Definition 8.3.1**: The power function of a hypothesis test with rejection region \(R\) is the function of \(\theta\) defined by \(\beta(\theta) = P_\theta(X \in R)\).
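A sketch (my example, not from the text): the power function of the one-sided z-test that rejects \(H_0: \mu \le 0\) when \(\bar{X} > c\), for \(\mathrm{n}(\mu, 1)\) samples of size \(n\). In closed form \(\beta(\mu) = 1 - \Phi(\sqrt{n}(c - \mu))\), which the simulation below reproduces.

```python
import math
import numpy as np

# Power-function sketch (editor's example): reject H0 when Xbar > c for
# n(mu, 1) data; compare the closed-form power with a simulated estimate.
def phi(z):  # standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, c = 16, 0.4
rng = np.random.default_rng(8)

def power_sim(mu, reps=100_000):
    xbar = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
    return (xbar > c).mean()

results = {}
for mu in (0.0, 0.25, 0.5):
    exact = 1.0 - phi(math.sqrt(n) * (c - mu))
    results[mu] = (exact, power_sim(mu))
    print(mu, results[mu])  # power increases with mu
```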