The correlation coefficient is a statistical measure that shows the strength and direction of a linear relationship between two variables. Denoted as r, it ranges from -1 to +1. A positive value means that variables move together, while a negative value shows they move in opposite directions.
The concept of correlation was first explored in the late 19th century by Sir Francis Galton and later formalised by Karl Pearson, whose name is still linked to the widely used Pearson correlation coefficient. Since then, it has become a cornerstone of statistics, applied in fields ranging from science and healthcare to finance and business management.
Formula (Pearson’s r):
\(
r = \dfrac{\sum (x_i – \bar{x})(y_i – \bar{y})}
{\sqrt{\sum (x_i – \bar{x})^{2} \cdot \sum (y_i – \bar{y})^{2}}}
\)
Where:
Cautions
The correlation coefficient is widely used because it is simple, versatile, and easy to interpret. When applied with care, it provides valuable insights into how variables interact, helping researchers and decision-makers make better predictions and avoid misleading conclusions.