Statistical Inference, Chapters 5-8

5 Properties of a Random Sample

5.1 Basic Concepts of Random Samples

5.2 Sums of Random Variables from a Random Sample

Lemma 5.2.5: Let \(X_1, \dots, X_n\) be a random sample from a population and let \(g(x)\) be a function such that \(\mathrm{E}\,g(X_1)\) and \(\mathrm{Var}\,g(X_1)\) exist. Then

\(\mathrm{E}\left(\sum_{i=1}^n g(X_i)\right) = n\,\mathrm{E}\,g(X_1)\) and \(\mathrm{Var}\left(\sum_{i=1}^n g(X_i)\right) = n\,\mathrm{Var}\,g(X_1).\)


Theorem 5.2.6: Let \(X_1, \dots, X_n\) be a random sample from a population with mean \(\mu\) and variance \(\sigma^2 < \infty\). Then

  • \(\mathrm{E}\,\bar{X} = \mu\),
  • \(\mathrm{Var}\,\bar{X} = \sigma^2/n\),
  • \(\mathrm{E}\,S^2 = \sigma^2\).

Theorem 5.2.7: Let \(X_1, \dots, X_n\) be a random sample from a population with mgf \(M_X(t)\). Then the mgf of the sample mean is \(M_{\bar{X}}(t) = \left[M_X(t/n)\right]^n\).

Example 5.2.8: Let \(X_1, \dots, X_n\) be a random sample from a \(\mathrm{n}(\mu, \sigma^2)\) population. Then the mgf of the sample mean is

\(M_{\bar{X}}(t) = \left[\exp\left(\mu t/n + \sigma^2 (t/n)^2/2\right)\right]^n = \exp\left(\mu t + \frac{\sigma^2 t^2}{2n}\right).\)

Thus, \(\bar{X}\) has a \(\mathrm{n}(\mu, \sigma^2/n)\) distribution.

Another simple example is given by a \(\mathrm{gamma}(\alpha, \beta)\) random sample. Here, we can also easily derive the distribution of the sample mean. The mgf of the sample mean is

\(M_{\bar{X}}(t) = \left[\left(\frac{1}{1 - \beta t/n}\right)^{\alpha}\right]^n = \left(\frac{1}{1 - \beta t/n}\right)^{n\alpha},\)

which we recognize as the mgf of a \(\mathrm{gamma}(n\alpha, \beta/n)\), the distribution of \(\bar{X}\).
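As a hedged numerical check of the gamma case (the sample mean of a \(\mathrm{gamma}(\alpha, \beta)\) sample is \(\mathrm{gamma}(n\alpha, \beta/n)\), so its mean is \(\alpha\beta\) and its variance is \(\alpha\beta^2/n\); the values \(\alpha=2,\ \beta=3,\ n=5\) below are illustrative):

```python
import random

random.seed(0)
alpha, beta, n = 2.0, 3.0, 5      # illustrative gamma(alpha, beta) population, sample size n
reps = 100_000

# Simulate the sample mean of n gamma(alpha, beta) draws, many times.
means = []
for _ in range(reps):
    draws = [random.gammavariate(alpha, beta) for _ in range(n)]
    means.append(sum(draws) / n)

m1 = sum(means) / reps
m2 = sum((x - m1) ** 2 for x in means) / reps

# Theory: the sample mean is gamma(n*alpha, beta/n), so its mean is
# alpha*beta = 6.0 and its variance is alpha*beta**2 / n = 3.6.
print(m1, alpha * beta)
print(m2, alpha * beta ** 2 / n)
```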

Theorem 5.2.9 If \(X\) and \(Y\) are independent continuous random variables with pdfs \(f_X(x)\) and \(f_Y(y)\), then the pdf of \(Z = X + Y\) is

\(f_Z(z) = \int_{-\infty}^{\infty} f_X(w)\, f_Y(z - w)\, dw.\)

Example 5.2.10 (Sum of Cauchy random variables): if \(Z_1, \dots, Z_n\) are iid Cauchy(0, 1), then by the convolution formula \(\bar{Z}\) is again Cauchy(0, 1) — averaging does not reduce the spread.

Theorem 5.2.11 Suppose \(X_1, \dots, X_n\) is a random sample from a pdf or pmf \(f(x|\theta)\), where

\(f(x|\theta) = h(x)\, c(\theta) \exp\left(\sum_{i=1}^k w_i(\theta)\, t_i(x)\right)\)

is a member of an exponential family. Define statistics \(T_1, \dots, T_k\) by

\(T_i(X_1, \dots, X_n) = \sum_{j=1}^n t_i(X_j), \quad i = 1, \dots, k.\)

If the set \(\{(w_1(\theta), \dots, w_k(\theta)),\ \theta \in \Theta\}\) contains an open subset of \(\mathbb{R}^k\), then the distribution of \((T_1, \dots, T_k)\) is an exponential family of the form

\(f_T(u_1, \dots, u_k|\theta) = H(u_1, \dots, u_k)\, [c(\theta)]^n \exp\left(\sum_{i=1}^k w_i(\theta)\, u_i\right).\)

Example 5.2.12 (Sum of Bernoulli random variables)
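The Bernoulli case can be worked out with a one-line mgf check, consistent with Theorem 5.2.7: if \(X_1, \dots, X_n\) are iid Bernoulli(\(p\)), then \(M_{X_1}(t) = (1-p) + p e^t\), so

\(M_{\sum_i X_i}(t) = \left[(1-p) + p e^t\right]^n,\)

which is the mgf of a binomial(\(n, p\)) distribution; hence \(\sum_{i=1}^n X_i \sim \mathrm{binomial}(n, p)\).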

5.3 Sampling from the Normal Distribution

5.3.1 Properties of the Sample Mean and Variance

Theorem 5.3.1: Let \(X_1, \dots, X_n\) be a random sample from a \(\mathrm{n}(\mu, \sigma^2)\) distribution, and let \(\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i\) and \(S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2\). Then

  • \(\bar{X}\) and \(S^2\) are independent random variables,
  • \(\bar{X}\) has a \(\mathrm{n}(\mu, \sigma^2/n)\) distribution,
  • \((n-1)S^2/\sigma^2\) has a chi squared distribution with \(n-1\) degrees of freedom.
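A simulation sketch of Theorem 5.3.1's three conclusions (the values \(\mu = 1\), \(\sigma = 2\), \(n = 10\) are illustrative): \(\bar{X}\) should have mean \(\mu\) and variance \(\sigma^2/n\), the scaled sample variance should match the \(\chi^2_{n-1}\) moments (mean \(n-1\), variance \(2(n-1)\)), and the sample correlation of \(\bar{X}\) and \(S^2\) should be near 0.

```python
import random

random.seed(1)
mu, sigma, n, reps = 1.0, 2.0, 10, 100_000

xbars, chis = [], []
for _ in range(reps):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(x) / n
    s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    xbars.append(xbar)
    chis.append((n - 1) * s2 / sigma ** 2)

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((t - m) ** 2 for t in v) / (len(v) - 1)

# Xbar ~ n(mu, sigma^2/n): mean 1.0, variance 0.4
print(mean(xbars), var(xbars))
# (n-1)S^2/sigma^2 ~ chi^2_{n-1}: mean 9, variance 18
print(mean(chis), var(chis))
# Independence: sample correlation of Xbar and S^2 should be near 0.
mx, mc = mean(xbars), mean(chis)
cov = sum((a - mx) * (b - mc) for a, b in zip(xbars, chis)) / reps
corr = cov / (var(xbars) * var(chis)) ** 0.5
print(corr)
```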

Lemma 5.3.2:

  • If \(Z\) is a \(\mathrm{n}(0, 1)\) random variable, then \(Z^2 \sim \chi^2_1\).
  • If \(X_1, \dots, X_n\) are independent and \(X_i \sim \chi^2_{p_i}\), then \(X_1 + \cdots + X_n \sim \chi^2_{p_1 + \cdots + p_n}\).

5.4 Order Statistics

Theorem 5.4.3: Let \(X_{(1)}, \dots, X_{(n)}\) denote the order statistics of a random sample \(X_1, \dots, X_n\) from a discrete distribution with pmf \(f_X(x_i) = p_i\), where \(x_1 < x_2 < \cdots\) are the possible values of \(X\) in ascending order. Define \(P_0 = 0\) and \(P_i = p_1 + \cdots + p_i\). Then

\(P(X_{(j)} \leq x_i) = \sum_{k=j}^n \binom{n}{k} P_i^k (1 - P_i)^{n-k}\)

and

\(P(X_{(j)} = x_i) = \sum_{k=j}^n \binom{n}{k} \left[P_i^k (1 - P_i)^{n-k} - P_{i-1}^k (1 - P_{i-1})^{n-k}\right].\)


Theorem 5.4.4: Let \(X_{(1)}, \dots, X_{(n)}\) denote the order statistics of a random sample \(X_1, \dots, X_n\) from a continuous population with cdf \(F_X(x)\) and pdf \(f_X(x)\). Then the pdf of \(X_{(j)}\) is

\(f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!}\, f_X(x)\, [F_X(x)]^{j-1}\, [1 - F_X(x)]^{n-j}.\)

Theorem 5.4.6: Under the same setup, the joint pdf of \(X_{(i)}\) and \(X_{(j)}\), \(1 \leq i < j \leq n\), is

\(f_{X_{(i)}, X_{(j)}}(u, v) = \frac{n!}{(i-1)!\,(j-1-i)!\,(n-j)!}\, f_X(u)\, f_X(v)\, [F_X(u)]^{i-1}\, [F_X(v) - F_X(u)]^{j-1-i}\, [1 - F_X(v)]^{n-j}\)

for \(-\infty < u < v < \infty\).

5.5 Convergence Concepts

5.5.1 Convergence in Probability

Definition 5.5.1: A sequence of random variables \(X_1, X_2, \dots\) converges in probability to a random variable \(X\) if, for every \(\epsilon > 0\), \(\lim_{n\to\infty} P(|X_n - X| \geq \epsilon) = 0\) or, equivalently, \(\lim_{n\to\infty} P(|X_n - X| < \epsilon) = 1\).


Theorem 5.5.2 (Weak Law of Large Numbers): Let \(X_1, X_2, \dots\) be iid random variables with \(\mathrm{E}\,X_i = \mu\) and \(\mathrm{Var}\,X_i = \sigma^2 < \infty\). Then, for every \(\epsilon > 0\), \(\lim_{n\to\infty} P(|\bar{X}_n - \mu| < \epsilon) = 1\); that is, \(\bar{X}_n\) converges in probability to \(\mu\).

5.5.2 Almost Sure Convergence

Definition 5.5.6: A sequence of random variables \(X_1, X_2, \dots\) converges almost surely to a random variable \(X\) if, for every \(\epsilon > 0\), \(P(\lim_{n\to\infty} |X_n - X| < \epsilon) = 1\).

Theorem 5.5.9 (Strong Law of Large Numbers): Let \(X_1, X_2, \dots\) be iid random variables with \(\mathrm{E}\,X_i = \mu\) and \(\mathrm{Var}\,X_i = \sigma^2 < \infty\). Then, for every \(\epsilon > 0\), \(P(\lim_{n\to\infty} |\bar{X}_n - \mu| < \epsilon) = 1\); that is, \(\bar{X}_n\) converges almost surely to \(\mu\).

5.5.3 Convergence in Distribution

Definition 5.5.10: A sequence of random variables \(X_1, X_2, \dots\) converges in distribution to a random variable \(X\) if \(\lim_{n\to\infty} F_{X_n}(x) = F_X(x)\) at all points \(x\) where \(F_X(x)\) is continuous.

Theorem 5.5.14 (Central Limit Theorem): Let \(X_1, X_2, \dots\) be a sequence of iid random variables whose mgfs exist in a neighborhood of 0 (that is, \(M_{X_i}(t)\) exists for \(|t| < h\), for some positive \(h\)). Let \(\mathrm{E}\,X_i = \mu\) and \(\mathrm{Var}\,X_i = \sigma^2 > 0\). (Both \(\mu\) and \(\sigma^2\) are finite since the mgf exists.) Define \(\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i\). Let \(G_n(x)\) denote the cdf of \(\sqrt{n}(\bar{X}_n - \mu)/\sigma\). Then, for any \(x\), \(-\infty < x < \infty\),

\(\lim_{n\to\infty} G_n(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\, dy;\)

that is, \(\sqrt{n}(\bar{X}_n - \mu)/\sigma\) has a limiting standard normal distribution.
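A simulation sketch of the CLT (assuming, for illustration, a uniform(0, 1) population, so \(\mu = 1/2\) and \(\sigma^2 = 1/12\)): the empirical cdf of \(\sqrt{n}(\bar{X}_n - \mu)/\sigma\) is compared with the standard normal cdf at a few points.

```python
import random, math

random.seed(2)
n, reps = 30, 50_000
mu, sigma = 0.5, math.sqrt(1 / 12)   # uniform(0,1) population (illustrative)

# Standardized sample means sqrt(n)*(Xbar - mu)/sigma.
z = [math.sqrt(n) * (sum(random.random() for _ in range(n)) / n - mu) / sigma
     for _ in range(reps)]

def phi(x):                          # standard normal cdf via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

for x in (-1.0, 0.0, 1.0):
    emp = sum(v <= x for v in z) / reps
    print(x, emp, phi(x))            # empirical vs limiting cdf, close even for n = 30
```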

5.5.4 The Delta Method

5.6 Generating a Random Sample

5.6.1 Direct Methods

5.6.2 Indirect Methods

5.6.3 The Accept/Reject Algorithm
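A minimal sketch of the accept/reject idea (the target and candidate are assumptions for illustration): to sample from a Beta(2, 2) target with density \(f(y) = 6y(1-y)\) using a uniform(0, 1) candidate \(g\), take \(M = \sup_y f(y)/g(y) = 1.5\) and accept a candidate draw \(Y\) with probability \(f(Y)/(M g(Y))\).

```python
import random

random.seed(3)

def target_pdf(y):            # Beta(2,2) density on (0,1), chosen for illustration
    return 6.0 * y * (1.0 - y)

M = 1.5                       # sup_y f(y)/g(y) with g = uniform(0,1) density

def accept_reject():
    while True:
        y = random.random()               # draw Y from the candidate g
        u = random.random()               # draw U ~ uniform(0,1)
        if u < target_pdf(y) / M:         # accept with probability f(y)/(M g(y))
            return y

sample = [accept_reject() for _ in range(100_000)]
mean = sum(sample) / len(sample)
print(mean)                   # Beta(2,2) has mean 1/2
```

On average the loop needs \(M\) candidate draws per accepted value, which is why a tight bound \(M\) matters.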

5.6.4 The MCMC Methods

Gibbs Sampler
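A sketch of a Gibbs sampler for an illustrative target (not from the text): a standard bivariate normal with correlation \(\rho\), whose full conditionals are \(X \mid Y = y \sim \mathrm{n}(\rho y,\ 1 - \rho^2)\) and symmetrically for \(Y \mid X\). The sampler alternates draws from these conditionals.

```python
import random

random.seed(4)
rho = 0.7                        # illustrative correlation of the bivariate normal target
sd = (1 - rho ** 2) ** 0.5       # conditional standard deviation

x, y = 0.0, 0.0
xs, ys = [], []
for i in range(60_000):
    # Alternate draws from the full conditionals:
    # X | Y=y ~ n(rho*y, 1-rho^2) and Y | X=x ~ n(rho*x, 1-rho^2).
    x = random.gauss(rho * y, sd)
    y = random.gauss(rho * x, sd)
    if i >= 10_000:              # discard burn-in draws
        xs.append(x)
        ys.append(y)

m = len(xs)
mx, my = sum(xs) / m, sum(ys) / m
vx = sum((a - mx) ** 2 for a in xs) / m
vy = sum((b - my) ** 2 for b in ys) / m
cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / m
print(mx, vx)                    # marginal mean and variance, near 0 and 1
print(cov / (vx * vy) ** 0.5)    # sample correlation, near rho
```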

Metropolis Algorithm
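A sketch of the Metropolis algorithm with a symmetric random-walk proposal (the standard normal target, known only up to a normalizing constant, is an illustrative choice): a proposed move is accepted with probability \(\min(1,\ f(\text{prop})/f(x))\), and the unknown constant cancels in the ratio.

```python
import random, math

random.seed(5)

def unnorm_target(x):            # standard normal target, known only up to a constant
    return math.exp(-0.5 * x * x)

x = 0.0
chain = []
for i in range(100_000):
    prop = x + random.uniform(-1.0, 1.0)     # symmetric random-walk proposal
    # Metropolis step: accept with probability min(1, f(prop)/f(x)).
    if random.random() < unnorm_target(prop) / unnorm_target(x):
        x = prop
    if i >= 10_000:                          # discard burn-in draws
        chain.append(x)

m = sum(chain) / len(chain)
v = sum((t - m) ** 2 for t in chain) / len(chain)
print(m, v)        # near 0 and 1 for the standard normal target
```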

6 Principles of Data Reduction

6.1 Introduction

Three principles of data reduction:

  • The Sufficiency Principle
  • The Likelihood Principle
  • The Equivariance Principle

6.2 The Sufficiency Principle

6.2.1 Sufficient Statistics

Definition 6.2.1: A statistic \(T(X)\) is a sufficient statistic for \(\theta\) if the conditional distribution of the sample \(X\) given the value of \(T(X)\) does not depend on \(\theta\).

Theorem 6.2.2: If \(p(x | \theta)\) is the joint pdf or pmf of \(X\) and \(q(t | \theta)\) is the pdf or pmf of \(T(X)\), then \(T(X)\) is a sufficient statistic for \(\theta\) if, for every \(x\) in the sample space, the ratio \(p(x | \theta)/q(T(x) | \theta)\) is constant as a function of \(\theta\).

Theorem 6.2.6 (Factorization Theorem): Let \(f(x | \theta)\) denote the joint pdf or pmf of a sample \(X\). A statistic \(T(X)\) is a sufficient statistic for \(\theta\) if and only if there exist functions \(g(t | \theta)\) and \(h(x)\) such that, for all sample points \(x\) and all parameter points \(\theta\),

\(f(x | \theta) = g(T(x) | \theta)\, h(x).\)

6.2.2 Minimal Sufficient Statistics

6.2.3 Ancillary Statistics

Definition 6.2.16: A statistic \(S(X)\) whose distribution does not depend on the parameter \(\theta\) is called an ancillary statistic.

6.2.4 Sufficient, Ancillary, and Complete Statistics

Example 6.2.20 (Ancillary precision)

Theorem 6.2.24 (Basu's Theorem): If \(T(X)\) is a complete and minimal sufficient statistic, then \(T(X)\) is independent of every ancillary statistic.

Theorem 6.2.25 (Complete statistic in the exponential family): Let \(X_1, \dots, X_n\) be iid observations from an exponential family with pdf or pmf of the form

\(f(x | \theta) = h(x)\, c(\theta) \exp\left(\sum_{j=1}^k w_j(\theta)\, t_j(x)\right),\)

where \(\theta = (\theta_1, \dots, \theta_k)\). Then the statistic

\(T(X) = \left(\sum_{i=1}^n t_1(X_i), \dots, \sum_{i=1}^n t_k(X_i)\right)\)

is complete as long as the parameter space \(\Theta\) contains an open set in \(\mathbb{R}^k\).

6.3 The Likelihood Principle

6.4 The Equivariance Principle

7 Point Estimation

7.1 Introduction

This chapter is divided into two parts. The first part deals with methods for finding estimators, and the second part deals with evaluating these estimators.

Definition 7.1.1 A point estimator is any function of a sample; that is, any statistic is a point estimator.

7.2 Methods of Finding Estimators

7.2.1 Method of Moments

7.2.2 Maximum Likelihood Estimators

7.2.3 Bayes Estimators

7.2.4 The EM Algorithm

7.3 Methods of Evaluating Estimators

7.3.1 Mean Squared Error

Definition 7.3.1 The mean squared error of an estimator \(W\) of a parameter \(\theta\) is the function of \(\theta\) defined by \(\mathrm{E}_\theta (W - \theta)^2\).
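A hedged simulation sketch using this definition (the normal setup with \(\sigma^2 = 1\) and \(n = 5\) is illustrative): since \(\mathrm{E}_\theta(W - \theta)^2 = \mathrm{Var}_\theta W + (\mathrm{Bias}_\theta W)^2\), a biased estimator can beat an unbiased one in MSE. For normal data, \(\hat{\sigma}^2 = \frac{1}{n}\sum(X_i - \bar{X})^2\) has uniformly smaller MSE than the unbiased \(S^2\).

```python
import random

random.seed(6)
n, sigma2, reps = 5, 1.0, 200_000

mse_s2, mse_hat = 0.0, 0.0
for _ in range(reps):
    x = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    s2 = ss / (n - 1)             # unbiased estimator of sigma^2
    hat = ss / n                  # biased (maximum likelihood) estimator
    mse_s2 += (s2 - sigma2) ** 2
    mse_hat += (hat - sigma2) ** 2

mse_s2 /= reps
mse_hat /= reps
# Theory: MSE(S^2) = 2 sigma^4/(n-1) = 0.5; MSE(hat) = (2n-1) sigma^4/n^2 = 0.36.
print(mse_s2, mse_hat)
```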

7.3.2 Best Unbiased Estimators

Definition 7.3.7 An estimator \(W^*\) is a best unbiased estimator of \(\tau(\theta)\) if it satisfies \(\mathrm{E}_\theta W^* = \tau(\theta)\) for all \(\theta\) and, for any other estimator \(W\) with \(\mathrm{E}_\theta W = \tau(\theta)\), we have \(\mathrm{Var}_\theta W^* \leq \mathrm{Var}_\theta W\) for all \(\theta\). \(W^*\) is also called a uniform minimum variance unbiased estimator (UMVUE) of \(\tau(\theta)\).

Theorem 7.3.9 (Cramer-Rao Inequality): Let \(X_1, \dots, X_n\) be a sample with pdf \(f(\mathbf{x} | \theta)\), and let \(W(\mathbf{X}) = W(X_1, \dots, X_n)\) be any estimator satisfying

\(\frac{d}{d\theta} \mathrm{E}_\theta\, W(\mathbf{X}) = \int_{\mathcal{X}} \frac{\partial}{\partial \theta} \left[W(\mathbf{x})\, f(\mathbf{x} | \theta)\right] d\mathbf{x}\) and \(\mathrm{Var}_\theta\, W(\mathbf{X}) < \infty.\)

Then

\(\mathrm{Var}_\theta\, W(\mathbf{X}) \geq \frac{\left(\frac{d}{d\theta} \mathrm{E}_\theta\, W(\mathbf{X})\right)^2}{\mathrm{E}_\theta\left(\left(\frac{\partial}{\partial\theta} \log f(\mathbf{X} | \theta)\right)^2\right)}.\)

Corollary 7.3.10 (Cramer-Rao Inequality, iid case) If the assumptions of Theorem 7.3.9 are satisfied and, additionally, if \(X_1, \dots, X_n\) are iid with pdf \(f(x|\theta)\), then

\(\mathrm{Var}_\theta\, W(\mathbf{X}) \geq \frac{\left(\frac{d}{d\theta} \mathrm{E}_\theta\, W(\mathbf{X})\right)^2}{n\, \mathrm{E}_\theta\left(\left(\frac{\partial}{\partial\theta} \log f(X | \theta)\right)^2\right)}.\)

Lemma 7.3.11 If \(f(x|\theta)\) satisfies

\(\frac{d}{d\theta} \mathrm{E}_\theta\left(\frac{\partial}{\partial\theta} \log f(X | \theta)\right) = \int \frac{\partial}{\partial\theta}\left[\left(\frac{\partial}{\partial\theta} \log f(x | \theta)\right) f(x | \theta)\right] dx\)

(true for an exponential family), then

\(\mathrm{E}_\theta\left(\left(\frac{\partial}{\partial\theta} \log f(X | \theta)\right)^2\right) = -\mathrm{E}_\theta\left(\frac{\partial^2}{\partial\theta^2} \log f(X | \theta)\right).\)

Corollary 7.3.15 (Attainment): Let \(X_1, \dots, X_n\) be iid \(f(x|\theta)\), where \(f(x|\theta)\) satisfies the conditions of the Cramer-Rao Theorem. Let \(L(\theta|\mathbf{x}) = \prod_{i=1}^n f(x_i|\theta)\) denote the likelihood function. If \(W(\mathbf{X}) = W(X_1, \dots, X_n)\) is any unbiased estimator of \(\tau(\theta)\), then \(W(\mathbf{X})\) attains the Cramer-Rao Lower Bound if and only if

\(a(\theta)\left[W(\mathbf{x}) - \tau(\theta)\right] = \frac{\partial}{\partial\theta} \log L(\theta|\mathbf{x})\)

for some function \(a(\theta)\).

7.3.3 Sufficiency and Unbiasedness

Theorem 7.3.17 (Rao-Blackwell) Let W be any unbiased estimator of \(\tau(\theta)\), and let T be a sufficient statistic for \(\theta\). Define \(\phi(T) = \mathrm{E}(W|T)\). Then \(\mathrm{E}_\theta\, \phi(T) = \tau(\theta)\) and \(\mathrm{Var}_\theta\, \phi(T) \leq \mathrm{Var}_\theta\, W\) for all \(\theta\); that is, \(\phi(T)\) is a uniformly better unbiased estimator of \(\tau(\theta)\).

Theorem 7.3.19 If W is a best unbiased estimator of \(\tau(\theta)\), then W is unique.

Theorem 7.3.20 If \(\mathrm{E}_\theta W = \tau(\theta)\), W is the best unbiased estimator of \(\tau(\theta)\) if and only if W is uncorrelated with all unbiased estimators of 0.

Theorem 7.3.23 Let T be a complete sufficient statistic for a parameter \(\theta\), and let \(\phi(T)\) be any estimator based only on T. Then \(\phi(T)\) is the unique best unbiased estimator of its expected value.

7.3.4 Loss Function Optimality

8 Hypothesis Testing

8.1 Introduction

Definition 8.1.1 A hypothesis is a statement about a population parameter.

Definition 8.1.2 The two complementary hypotheses in a hypothesis testing problem are called the null hypothesis and the alternative hypothesis. They are denoted by \(H_0\) and \(H_1\), respectively.

Definition 8.1.3 A hypothesis testing procedure or hypothesis test is a rule that specifies:

  • For which sample values the decision is made to accept \(H_0\) as true.
  • For which sample values \(H_0\) is rejected and \(H_1\) is accepted as true.

The subset of the sample space for which \(H_0\) will be rejected is called the rejection region or critical region. The complement of the rejection region is called the acceptance region.

8.2 Methods of Finding Tests

8.2.1 Likelihood Ratio Tests

Definition 8.2.1 The likelihood ratio test statistic for testing \(H_0: \theta \in \Theta_0\) versus \(H_1: \theta \in \Theta_0^c\) is

\(\lambda(\mathbf{x}) = \frac{\sup_{\Theta_0} L(\theta|\mathbf{x})}{\sup_{\Theta} L(\theta|\mathbf{x})}.\)

A likelihood ratio test (LRT) is any test that has a rejection region of the form \(\{\mathbf{x} : \lambda(\mathbf{x}) \leq c\}\), where \(c\) is any number satisfying \(0 \leq c \leq 1\).
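In simple models the LRT statistic has a closed form. For an iid \(\mathrm{n}(\theta, 1)\) sample and \(H_0: \theta = \theta_0\), the numerator supremum is at \(\theta_0\) and the denominator supremum at \(\hat{\theta} = \bar{x}\), giving \(\lambda(\mathbf{x}) = \exp(-n(\bar{x} - \theta_0)^2/2)\). A sketch (the data below are illustrative):

```python
import math

def lrt_normal_mean(x, theta0):
    """LRT statistic for H0: theta = theta0 with an iid n(theta, 1) sample.

    lambda(x) = sup_{theta = theta0} L / sup_theta L = exp(-n*(xbar - theta0)**2 / 2).
    """
    n = len(x)
    xbar = sum(x) / n
    return math.exp(-n * (xbar - theta0) ** 2 / 2)

x = [0.8, 1.3, 0.9, 1.1, 1.4]      # illustrative data, xbar = 1.1
print(lrt_normal_mean(x, 1.0))      # near 1: little evidence against H0
print(lrt_normal_mean(x, 0.0))      # small lambda: evidence against H0
```

Small values of \(\lambda\) lead to rejection, matching the rejection region \(\{\mathbf{x} : \lambda(\mathbf{x}) \leq c\}\).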

8.2.2 Bayesian Tests

8.2.3 Union-Intersection and Intersection-Union Tests

8.3 Methods of Evaluating Tests

8.3.1 Error Probabilities and the Power Function

Type I and Type II Errors:

                      Accept \(H_0\)      Reject \(H_0\)
  \(H_0\) true        Correct             Type I Error
  \(H_1\) true        Type II Error       Correct

Definition 8.3.1 The power function of a hypothesis test with rejection region \(R\) is the function of \(\theta\) defined by \(\beta(\theta) = P_\theta(\mathbf{X} \in R)\).
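A small computational illustration of the power function (the binomial setup and rejection region are assumptions chosen for illustration): for \(X \sim \mathrm{binomial}(5, \theta)\) and \(R = \{x : x = 5\}\), the power function is \(\beta(\theta) = P_\theta(X = 5) = \theta^5\), computable exactly.

```python
from math import comb

def power(theta, n=5, reject=(5,)):
    """Power function beta(theta) = P_theta(X in R) for X ~ binomial(n, theta)."""
    return sum(comb(n, x) * theta ** x * (1 - theta) ** (n - x) for x in reject)

for theta in (0.2, 0.5, 0.8, 1.0):
    print(theta, power(theta))    # equals theta**5 for this rejection region
```

Evaluating \(\beta\) at points of \(\Theta_0\) gives the Type I error probabilities; evaluating it on \(\Theta_0^c\) gives the power.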

8.3.2 Most Powerful Tests

8.3.3 Sizes of Union-Intersection and Intersection-Union Tests

8.3.4 p-Values

8.3.5 Loss Function Optimality