There may be complex and unknown relationships between the variables in your dataset.
It is important to discover and quantify the degree to which variables in your dataset depend on each other. This knowledge can help you better prepare your data to meet the expectations of machine learning algorithms, such as linear regression, whose performance can degrade when input variables are strongly interdependent.
In this tutorial, you will discover that correlation is a statistical summary of the relationship between variables, and you will learn how to calculate it for different types of variables and relationships.
After completing this tutorial, you will know:
- How to calculate a covariance matrix to summarize the linear relationship between two or more variables.
- How to calculate Pearson’s correlation coefficient to summarize the linear relationship between two variables.
- How to calculate Spearman’s correlation coefficient to summarize the monotonic relationship between two variables.
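
As a preview of the three quantities listed above, here is a minimal sketch in plain Python, using only the standard library. The helper names (`covariance`, `covariance_matrix`, `pearson`, `ranks`, `spearman`) and the small sample data are assumptions made for illustration and are not taken from the tutorial. In practice you would more likely reach for library routines such as `numpy.cov`, `scipy.stats.pearsonr`, and `scipy.stats.spearmanr`.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def covariance_matrix(x, y):
    """2x2 matrix: variances on the diagonal, covariance off the diagonal."""
    return [[covariance(x, x), covariance(x, y)],
            [covariance(y, x), covariance(y, y)]]


def pearson(x, y):
    """Pearson's r: covariance normalized by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of the data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical sample: y grows monotonically with x, but not linearly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

print("covariance matrix:", covariance_matrix(x, y))
print("Pearson:", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Because `y` always increases with `x`, the ranks agree exactly and Spearman's coefficient is 1, while Pearson's coefficient is below 1 because the relationship is not a straight line.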