Hypothesis Testing and Statistical Analysis
Formulating Hypotheses
In statistical analysis, the first step is often formulating hypotheses. These are statements about population parameters that we aim to test using sample data.
- Null Hypothesis (H₀): This is the default assumption, typically stating that there's no effect or no difference between groups.
- Alternative Hypothesis (H₁): This is what we're testing for, often suggesting a difference or effect exists.
Example: for a study on a new teaching method:
- H₀: The new method has no effect on test scores.
- H₁: The new method improves test scores.
Note: The alternative hypothesis can be one-tailed (specifying a direction) or two-tailed (any difference). In the teaching-method example, H₁ is one-tailed because it claims an improvement.
Significance Levels and p-values
- Significance Level (α): The probability of rejecting H₀ when it's actually true (a Type I error). Common choices are 0.05, 0.01, and 0.001.
- p-value: The probability of obtaining results at least as extreme as the observed results, assuming H₀ is true.
If the p-value is less than α, reject H₀; otherwise, fail to reject H₀.
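The decision rule can be written as a small function. This is only a minimal sketch: the name `decide` and the sample p-values are illustrative assumptions, not something taken from the text.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 when the p-value is below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Hypothetical p-values, chosen only to show both outcomes.
print(decide(0.03))              # below alpha = 0.05
print(decide(0.03, alpha=0.01))  # not below alpha = 0.01
```

The same p-value can lead to different decisions at different significance levels, which is why α should be fixed before looking at the data.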
Chi-Squared Tests
Expected and Observed Frequencies
- Observed Frequencies: The actual counts in your data.
- Expected Frequencies: What you'd expect if H₀ were true.
Chi-Squared Test for Independence
This test determines if there's a significant relationship between two categorical variables.
- Create a contingency table of observed frequencies.
- Calculate the expected frequency for each cell: E = (row total × column total) / grand total.
- Calculate the chi-squared statistic: $$ \chi^2 = \sum \frac{(O - E)^2}{E} $$ where O is the observed frequency and E is the expected frequency of a cell, and the sum runs over all cells.
- Determine degrees of freedom: (rows - 1) × (columns - 1)
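- Compare $\chi^2$ to the critical value for the chosen α, or compare the p-value to α. Reject H₀ if $\chi^2$ exceeds the critical value (equivalently, if the p-value is less than α).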
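The steps above can be sketched in plain Python. The function name `chi_squared_independence` and the 2×2 table of counts are hypothetical, made up for illustration rather than taken from the text.

```python
def chi_squared_independence(observed):
    """Return (chi2, df, expected) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum (O - E)^2 / E over every cell.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected


# Hypothetical 2x2 table of counts (not data from the text).
table = [[30, 20],
         [20, 30]]
chi2, df, expected = chi_squared_independence(table)
print(chi2, df, expected)
```

For this hypothetical table every expected count is 25, so each cell contributes (5²)/25 = 1 and the statistic is 4.0 with 1 degree of freedom.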
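Example: testing whether gender and preference for math are independent:
Calculated $\chi^2 = 2.78$ with $\mathrm{df} = 1$. For $\alpha = 0.05$, the critical value is $3.84$.
Since $2.78 < 3.84$, we fail to reject $\mathrm{H}_0$: the data do not provide sufficient evidence that gender and preference for math are related.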
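The same conclusion can be checked with a p-value. When df = 1, a chi-squared variable is the square of a standard normal variable, so the upper-tail probability reduces to `erfc(sqrt(χ²/2))` and needs only the standard library. This is a sketch under that assumption: the helper name `chi2_p_value_df1` is hypothetical, and the shortcut does not apply to other degrees of freedom.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-squared statistic with exactly 1 degree of freedom."""
    if chi2 < 0:
        raise ValueError("chi2 must be non-negative")
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for Z standard normal.
    return math.erfc(math.sqrt(chi2 / 2))


alpha = 0.05
p = chi2_p_value_df1(2.78)  # statistic from the example above
print(round(p, 3), p < alpha)
```

The p-value is greater than 0.05, which agrees with the critical-value comparison: fail to reject H₀.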