Scatter plot, correlation and Pearson’s r are related topics and are explained here with the help of simple examples. Learn how to create scatter plot and find co-efficient of correlation (Pearson’s r) in Excel and Minitab. This is the foundation before you learn more complicated and widely used Regression and Logistic Regression analysis.

Scatter Plot

Scatter Diagram is a basic graphic tool that illustrates the relationship between two variables. The dots on the scatter plot represent data points.

• If we want to associate ‘Y’ with a single ‘X’, we can use scatter plot to ascertain the correlation
• It is used to visually determine whether a potential relationship exists between an input and an outcome

Correlation is the strength of association between two continuous variables. Correlation is explained here with examples and how to calculate correlation coefficient (also known as Pearson correlation coefficient). So the next step from scatter diagram is correlation.

While scatter diagram shows the graphical representation, it doesn’t tell us the strength of relationship between the two variable. But we can calculate the strength of relationship by calculating correlation coefficient.

How to measure Correlation/How much is the Correlation

Let us understand Correlation Coefficient, now we will call it or know it by ‘r’. In marketing research we are often interested in knowing the strength of association between two continuous variables, as in the following situations:

• – How strongly are sales related to advertising expenditures?
• – Is there an association between market share and size of the sales force?

In situations like these, correlation coefficient r, is the most widely used statistic, summarizing the association between two continuous variables X and Y. It indicates the degree to which variation in X, is related to the variation in Y. Because it was originally proposed by Karl Pearson, it is also known as the Pearson correlation coefficient.

• ‘r’ indicates the extent to which two variables are related
• It can range from -1.0 to +1.0, A positive correlation coefficient indicates a positive relationship, a negative coefficient indicates an inverse relationship
• Higher the absolute value of ‘r’, stronger the correlation between ‘Y’ & ‘X‘

Correlation in Minitab

It is very easy to calculate correlation coefficient r in Excel. You have to keep Y in one column and X in another column, same as Minitab.

Correlation coefficient r, also know as Pearson product moment coefficient of correlation. It is calculated by the following formula:

Let us take an example, in the table below “X” is study time in hrs and “Y” is test score. We want tom check if there is any association between study time and test score. We will use the formula mentioned above.

I hope you have understood the above concept and if you want to learn more such tools then go for a Six Sigma course from Simplilearn. The course is aligned to IASSC and ASQ exam, integrates lean and DMAIC methodologies using case studies and real-life examples.

There is another good online Six Sigma Green Belt course from Coursera. This course is from University System of Georgia and is well recognized.

If you want to learn new age data science techniques, then one good starting point is Data Science course from Simplilearn. Data Science is emerging very fast and early movers will always have advantage.