Scatter Diagrams and Correlation
Scatter diagrams, also known as scatter plots, are graphical representations of bivariate data. They provide a visual way to examine the relationship between two variables.
Creating Scatter Diagrams
To create a scatter diagram:
- Plot each data pair $(x, y)$ as a point on a coordinate plane
- Label the axes with the variable names and units
- Choose appropriate scales for both axes
For instance, to study the relationship between study time and test scores:
- X-axis: Hours studied
- Y-axis: Test score (out of 100)
- Each point represents a student's study time and corresponding test score
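The plotting steps above can be sketched in a few lines of Python. This is a rough, text-only illustration rather than a polished chart: the six data pairs are made up for the example, and `text_scatter` is a hypothetical helper, not a library function. It picks a scale for each axis from the smallest and largest values, then marks each $(x, y)$ pair on a character grid.

```python
# Hypothetical data: (hours studied, test score out of 100) for six students.
data = [(1, 52), (2, 58), (3, 65), (4, 70), (5, 78), (6, 85)]


def text_scatter(points, width=30, height=10):
    """Return a rough text-only scatter diagram of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    # Empty plotting area: `height` rows of `width` blank cells.
    grid = [[" "] * width for _ in range(height)]

    for x, y in points:
        # Scale each value to a column/row index; `or 1` guards a zero range.
        col = round((x - x_min) / ((x_max - x_min) or 1) * (width - 1))
        row = round((y - y_min) / ((y_max - y_min) or 1) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot

    lines = ["|" + "".join(cells) for cells in grid]
    lines.append("+" + "-" * width)
    # Axis labels with variable names and the range each axis covers.
    lines.append(f"x: hours studied ({x_min} to {x_max}), "
                 f"y: test score ({y_min} to {y_max})")
    return "\n".join(lines)


print(text_scatter(data))
```

With this data the points rise from the lower left to the upper right of the grid. A plotting library would normally replace the character grid; the scaling and labelling steps stay the same.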
Interpreting Scatter Diagrams
When analyzing scatter diagrams, consider:
- Direction of correlation:
  - Positive: As x increases, y tends to increase
  - Negative: As x increases, y tends to decrease
  - No correlation: No clear pattern
- Strength of correlation:
  - Strong: Points closely follow a pattern
  - Weak: Points loosely follow a pattern
  - No correlation: Points appear randomly scattered
- Form of relationship:
  - Linear: Points roughly follow a straight line
  - Non-linear: Points follow a curve or other pattern
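Direction and strength can also be summarized numerically. One common measure, not introduced above and used here only as an illustration, is Pearson's correlation coefficient $r$, which ranges from $-1$ to $1$ and captures linear association only. The sketch below computes it for made-up study-time data. The cut-offs of 0.7 and 0.3 in the hypothetical `describe` helper are arbitrary assumptions, not standard thresholds.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired samples xs, ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.7, weak=0.3):
    """Label r in words; the cut-offs are illustrative assumptions."""
    if abs(r) < weak:
        return "no clear linear correlation"
    direction = "positive" if r > 0 else "negative"
    strength = "strong" if abs(r) >= strong else "weak"
    return f"{strength} {direction}"


# Hypothetical data: hours studied and test scores for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 65, 70, 78, 85]

r = pearson_r(hours, scores)
print(round(r, 3), describe(r))
```

A value of $r$ near zero does not rule out a non-linear relationship, so the scatter diagram should still be inspected for curved patterns.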
It's crucial to remember that correlation does not imply causation. Two variables may be correlated without one directly causing the other, for example when both are influenced by a third variable.
Linear Regression: Equation of y on x
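Linear regression finds the best-fitting straight line through the data points. The regression line of y on x is written $y = a + bx$ and is used to estimate y for a given value of x.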
Line of Best Fit
The line of best fit, also called the regression line, is the line that minimizes the sum of the squared vertical distances (residuals) between the data points and the line.
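As a minimal sketch of the least-squares calculation, write the line as $y = a + bx$. The slope is $b = S_{xy} / S_{xx}$ and the intercept is $a = \bar{y} - b\bar{x}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $S_{xx} = \sum (x_i - \bar{x})^2$. The function name `regression_y_on_x` and the data are hypothetical, chosen to match the study-time example.

```python
def regression_y_on_x(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the means
    return a, b


# Hypothetical data: hours studied and test scores for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 58, 65, 70, 78, 85]

a, b = regression_y_on_x(hours, scores)
print(f"score = {a:.2f} + {b:.2f} * hours")

# Estimate the score for a student who studies 4.5 hours.
estimate = a + b * 4.5
print(f"estimated score after 4.5 hours: {estimate:.1f}")

# Residuals are the vertical distances the fit minimizes (in squared sum).
residuals = [y - (a + b * x) for x, y in zip(hours, scores)]
```

Estimates are most trustworthy for x values inside the range of the data. Here, 4.5 hours lies between the smallest and largest observed study times.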