Inference for a single mean – part 2

Chapter 19

Inference with numerical variables

  • Sample mean \(\bar{x}\)

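  • Standard error \[ SE = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}} \] Because we almost never know the population standard deviation \(\sigma\), we approximate it with the sample standard deviation \(s\), then use the t-distribution instead of the normal distribution to compensate for the extra uncertainty.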

    The t-distribution is also known as Student’s t-distribution. It was developed by W. Gosset, Head Experimental Brewer at Guinness Brewing.
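As a minimal sketch of the quantities above, the sample mean, sample standard deviation, and estimated standard error can be computed with Python's standard library. The data values are made up purely for illustration.

```python
import math
import statistics

# Hypothetical sample, invented for illustration only.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]

n = len(sample)
x_bar = statistics.mean(sample)   # sample mean
s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)             # estimated standard error, s / sqrt(n)

print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}, SE = {se:.3f}")
```

Note that `statistics.stdev` is the sample version (it estimates \(\sigma\) from data), which is the \(s\) that appears in the formula.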

Conditions for CLT

We still need to verify certain conditions to justify our model.

  • Independence (e.g. from a random sample)

    • when sampling without replacement, the population should be much larger than the sample (at least 10x)

  • Normality

    • if the sample is small, the population should be normally distributed
    • if the sample is large, the conditions on the population can be relaxed

Normality, continued

If the sample is large, the conditions on the population can be relaxed:

  • no extreme outliers
  • slight or no skew: \(n \geq 15\)
  • medium skew: \(n \geq 30\)
  • strong skew: \(n \geq 60\)
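The rule-of-thumb thresholds above can be written as a small lookup. This is only a sketch: the function name, the skew labels, and the choice to treat "slight or no skew" as one category are assumptions made for illustration, and judging the skew itself (for example from a histogram) still has to be done by eye.

```python
# Rule-of-thumb minimum sample sizes, taken from the list above.
# The keys are hypothetical labels chosen for this sketch.
MIN_N_BY_SKEW = {"slight": 15, "medium": 30, "strong": 60}

def normality_condition_ok(n, skew, extreme_outliers=False):
    """Return True if the sample-size rule of thumb is satisfied.

    n: sample size
    skew: "slight" (slight or no skew), "medium", or "strong"
    extreme_outliers: True if the sample contains extreme outliers
    """
    if extreme_outliers:
        return False
    return n >= MIN_N_BY_SKEW[skew]

print(normality_condition_ok(40, "medium"))   # n = 40 with medium skew
print(normality_condition_ok(20, "strong"))   # n = 20 with strong skew
```

A result of `False` does not rule out inference altogether; it only says that the relaxed large-sample condition is not met, so the population itself would need to be close to normal.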

Confidence Interval

\[ \mbox{CI} = \mbox{point estimate} \pm t_{df}^\ast \times SE \]

\(t_{df}^\ast\) is determined by the confidence level and the degrees of freedom, \(df = n - 1\).
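The sketch below shows one way the interval could be computed from summary statistics using only the standard library. The helper names (`t_pdf`, `t_cdf`, `t_star`, `t_confidence_interval`) and the input numbers are assumptions for illustration. The critical value is found by numerically integrating the t density and bisecting, which stands in for a t-table; in practice a statistics package such as SciPy (`scipy.stats.t.ppf`) would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_star(level, df):
    """Critical value t* leaving central area `level`, found by bisection."""
    target = 0.5 + level / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it contains t*
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(x_bar, s, n, level=0.95):
    """CI = point estimate +/- t* x SE, with df = n - 1."""
    se = s / math.sqrt(n)
    margin = t_star(level, n - 1) * se
    return x_bar - margin, x_bar + margin

# Hypothetical summary statistics, invented for illustration.
low, high = t_confidence_interval(x_bar=5.0, s=0.545, n=8, level=0.95)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

For a fixed confidence level, \(t_{df}^\ast\) is larger than the corresponding normal critical value and shrinks toward it as \(n\) grows, which is how the t-distribution compensates for estimating \(\sigma\) with \(s\).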

Hypothesis Test

Test statistic is now \(T\) (instead of \(Z\))

\[ T = \frac{ \bar{x} - \mbox{null value}}{SE} \]

\[ SE = \frac{s}{\sqrt{n}} \]

Once you have the T score, find the corresponding p-value (using technology or a table) for a t-distribution with \(df = n - 1\) degrees of freedom.
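A minimal sketch of the test, again using only the standard library, is given below. It assumes a two-sided alternative, which the slides do not specify, and the function names and input numbers are hypothetical. The p-value comes from numerically integrating the t density; a package such as SciPy (`scipy.stats.t.sf`) is the usual tool for this step.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def one_sample_t_test(x_bar, null_value, s, n):
    """Return (T, df, two-sided p-value) for a single mean."""
    se = s / math.sqrt(n)                 # SE = s / sqrt(n)
    t_score = (x_bar - null_value) / se   # T = (x_bar - null value) / SE
    df = n - 1
    # Two-sided p-value (an assumption): area in both tails beyond |T|.
    p_value = 2 * (1 - t_cdf(abs(t_score), df))
    return t_score, df, p_value

# Hypothetical summary statistics and null value, invented for illustration.
t_score, df, p_value = one_sample_t_test(x_bar=5.0, null_value=4.5, s=0.545, n=8)
print(f"T = {t_score:.3f}, df = {df}, p-value = {p_value:.4f}")
```

For a one-sided alternative, the p-value would be the area in a single tail, that is, `1 - t_cdf(t_score, df)` or `t_cdf(t_score, df)` depending on the direction.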