Interpreting the Pearson Coefficient
In a separate article, we introduced correlation and the Pearson coefficient. This article looks in more detail at how to interpret the Pearson coefficient and, in particular, its p-value.
First, a reminder: the Pearson coefficient aims to quantify the relationship that might exist between two variables on a scatter plot. The coefficient ranges from -1.0 to +1.0, where:
-1.0 is a perfect inverse (negative) linear relationship
0 indicates no linear relationship
+1.0 is a perfect direct (positive) linear relationship
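To make the scale concrete, here is a minimal Python sketch that computes the sample Pearson coefficient from its standard definition (the co-variation of the two variables divided by the square root of the product of their individual variations). The function name `pearson_r` and the data values are illustrative assumptions, not the data behind this article's example.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from each mean.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y rises (or falls) in a perfectly straight line with x.
x = [1, 2, 3, 4, 5]
r_direct = pearson_r(x, [2, 4, 6, 8, 10])    # at the +1.0 end of the scale
r_inverse = pearson_r(x, [10, 8, 6, 4, 2])   # at the -1.0 end of the scale
print(r_direct, r_inverse)
```

Real data rarely sits exactly on a line, so in practice the coefficient falls somewhere between the two extremes.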
You might think that’s the end of the matter but, as with many things in Six Sigma, it’s actually a little more complicated! Before you go any further, you must also assess whether the correlation is statistically significant.
Why? Because with small sample sizes (and we only have 5 data points in this example!) there is a real chance that your data points will fall in such a way that a correlation appears to exist, even when it doesn’t.
So, to assess the statistical significance of your correlation, you need to look at the p-value that is calculated alongside the Pearson coefficient, which can be interpreted as follows:
– If the p-value is low (generally less than 0.05), then your correlation is statistically significant, and you can use the calculated Pearson coefficient.
– If the p-value is not low (generally higher than 0.05), then your correlation is not statistically significant (it might have happened just by chance) and you should not rely upon your Pearson coefficient.
In our example above, the p-value is 0.3 (not statistically significant) which reflects the very small sample size (n=5). So, we should ignore the Pearson coefficient for now – it’s suggesting a correlation that might not even exist!
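The effect of a very small sample can be sketched in Python. The data below are hypothetical (they are not the five points from our example), and the p-value here comes from an exact permutation test, which is a different method from the p-value that statistical software typically reports alongside the Pearson coefficient, so the numbers will not match exactly. The idea is the same though: ask how often a correlation at least this strong turns up when the y-values are shuffled at random.

```python
import itertools
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys):
    """Two-sided exact permutation p-value (an assumption of this sketch,
    not the test used in the article). Only practical for small samples."""
    r_obs = abs(pearson_r(xs, ys))
    total = 0
    as_extreme = 0
    for shuffled in itertools.permutations(ys):
        total += 1
        # Small tolerance so ties are not lost to floating-point rounding.
        if abs(pearson_r(xs, shuffled)) >= r_obs - 1e-12:
            as_extreme += 1
    return as_extreme / total

# Hypothetical sample of n = 5 points.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
p = permutation_p_value(x, y)
print(f"r = {r:.2f}, p = {p:.3f}")  # prints: r = 0.80, p = 0.133
print("significant" if p < 0.05 else "not significant at the 0.05 level")
```

Even a coefficient of 0.80 is not statistically significant here, because with only five points a pattern this strong turns up by chance fairly often.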
Confused? Here’s a summary:
- The Pearson coefficient helps to quantify a correlation.
- The p-value helps to assess whether a correlation is real (statistically significant).
- The Pearson coefficient and p-value should be interpreted together, not individually.
Values of the Pearson Correlation
David M. Lane
Prerequisites: Introduction to Bivariate Data
The Pearson product-moment correlation coefficient is a measure of the strength of the linear relationship between two variables. It is referred to as Pearson's correlation or simply as the correlation coefficient. If the relationship between the variables is not linear, then the correlation coefficient does not adequately represent the strength of the relationship between the variables.
The symbol for Pearson's correlation is "ρ" when it is measured in the population and "r" when it is measured in a sample. Because we will be dealing almost exclusively with samples, we will use r to represent Pearson's correlation unless otherwise noted.
Pearson's r can range from -1 to 1. An r of -1 indicates a perfect negative linear relationship between variables, an r of 0 indicates no linear relationship between variables, and an r of 1 indicates a perfect positive linear relationship between variables. Figure 1 shows a scatter plot for which r = 1.
Figure 1. A perfect positive linear relationship, r = 1.
Figure 2. A perfect negative linear relationship, r = -1.
Figure 3. A scatter plot for which r = 0. Notice that there is no relationship between X and Y.
Figure 4. Scatter plot of spousal ages, r = 0.97.
Figure 5. Scatter plot of Grip Strength and Arm Strength, r = 0.63.
The correlation between grip strength and arm strength depicted in Figure 5 (also described in the introductory section) is 0.63.