Computing the Correlation Coefficient
This chapter explains how to calculate the correlation coefficient r, a quantitative measure of linear association. Calculating r for a pair of variables involves transforming them to standard units, then taking the average of the product of the two variables in standard units.
Ecological correlation is the correlation coefficient calculated for averages of individuals, rather than for individuals. Ecological correlations say little about the (linear) association for individuals; generally, ecological correlations tend to overstate the strength of the association for individuals.
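The gap between ecological and individual correlations can be seen in a small sketch. The data below are made up purely for illustration (three hypothetical groups of four individuals each), and the helper name correlation is an assumption of this example, not something defined in the text. It computes r as the average product in standard units, using the SD that divides by the number of data.

```python
from statistics import fmean, pstdev

def correlation(xs, ys):
    """Average product of xs and ys after converting each to standard units."""
    mx, my = fmean(xs), fmean(ys)
    sx, sy = pstdev(xs), pstdev(ys)  # SD computed by dividing by n
    return fmean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])

# Made-up data: three groups centered at (0, 0), (1, 1), and (2, 2).
# Within each group the offsets have no linear association at all.
offsets = [(-1, 1), (1, -1), (-1, -1), (1, 1)]
groups = [[(g + dx, g + dy) for dx, dy in offsets] for g in range(3)]

# Correlation computed from all twelve individuals.
individuals = [point for group in groups for point in group]
r_individuals = correlation([x for x, _ in individuals],
                            [y for _, y in individuals])

# Ecological correlation: computed from the three group averages.
group_mean_x = [fmean([x for x, _ in group]) for group in groups]
group_mean_y = [fmean([y for _, y in group]) for group in groups]
r_ecological = correlation(group_mean_x, group_mean_y)

print(r_individuals)  # about 0.4
print(r_ecological)   # about 1.0
```

In this contrived case the group averages fall exactly on a line, so the ecological correlation is essentially 1, while the correlation for individuals is only about 0.4.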
We saw earlier that the correlation coefficient measures linear association. We know that r does not measure nonlinear association. We know that the value of r can be deceptive if the data are heteroscedastic or contain outliers. We know that r is always between −1 and +1. We know how to estimate r by eye. But we do not know how to compute r from data. In this section, we shall learn how to compute the correlation coefficient: r is the average product of X and Y, after putting X and Y on an equal footing by transforming them to standard units, that is, to the number of standard deviations above the mean.
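As a minimal sketch of that recipe, the following Python computes r as the average product of the two lists in standard units. The function names standard_units and correlation and the sample lists are assumptions made for this illustration. The SD is taken to be the one that divides by the number of data.

```python
from statistics import fmean, pstdev

def standard_units(values):
    """Number of SDs each value is above the mean of the list."""
    m = fmean(values)
    sd = pstdev(values)  # SD computed by dividing by n
    return [(v - m) / sd for v in values]

def correlation(xs, ys):
    """r: the average of the products of xs and ys in standard units."""
    zx = standard_units(xs)
    zy = standard_units(ys)
    return fmean([a * b for a, b in zip(zx, zy)])

# Illustrative lists only.
x = [1, 2, 3, 4, 5]
r_perfect = correlation(x, [2, 4, 6, 8, 10])    # points on a rising line
r_negative = correlation(x, [10, 8, 6, 4, 2])   # points on a falling line
r_partial = correlation(x, [1, 3, 2, 5, 4])     # scattered around a rising line

print(r_perfect, r_negative, r_partial)  # about 1.0, -1.0, 0.8
```

Note that the sketch breaks down if either list is constant, because its SD is then zero and standard units are undefined.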
Standard units are a way of putting different kinds of observations on the same scale. The idea is to replace a datum by the number of standard deviations it is above the mean of the data. If a datum is above the mean, its value in standard units is positive; if it is below the mean, its value in standard units is negative. A datum that is above the mean by 2.5 times the SD is 2.5 in standard units.
When a list is transformed to standard units, the mean of the new list is zero, and the SD of the new list is one: that is what it means for a set of data to be in standard units. Standard units are dimensionless. If the original list has units, the original SD has the same units. To transform a measurement to standard units, we divide the measurement (minus the mean) by the SD, which cancels the original units.
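The claim that a list in standard units has mean zero and SD one can be checked numerically. The sketch below uses a made-up list of heights in inches; the function name to_standard_units is hypothetical.

```python
from statistics import fmean, pstdev

def to_standard_units(values):
    """Subtract the mean, then divide by the SD (computed by dividing by n)."""
    m = fmean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]

# Made-up heights in inches, for illustration only.
heights = [60, 62, 64, 66, 68]
z = to_standard_units(heights)

z_mean = fmean(z)   # should be 0, up to rounding error
z_sd = pstdev(z)    # should be 1, up to rounding error

print(z)
print(z_mean, z_sd)
```

Because the inches in the numerator and denominator cancel, the entries of z are pure numbers: the tallest height is about 1.41 SDs above the mean, whatever unit the heights were recorded in.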
If we know the mean and SD of the original data, we can restore a datum that is in standard units to the original units of measurement, as follows:
(original value) = (value in standard units) × SD + mean
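In code, the restoration multiplies the value in standard units by the SD and adds the mean. The following sketch makes a round trip on a made-up list of weights in kilograms; the function names are hypothetical.

```python
from statistics import fmean, pstdev

def to_standard_units(values, mean, sd):
    """Original units -> standard units: (value - mean) / SD."""
    return [(v - mean) / sd for v in values]

def from_standard_units(z_values, mean, sd):
    """Standard units -> original units: z * SD + mean."""
    return [z * sd + mean for z in z_values]

# Made-up weights in kilograms, for illustration only.
weights = [50.0, 55.0, 65.0, 70.0]
mean = fmean(weights)
sd = pstdev(weights)  # SD computed by dividing by n

z = to_standard_units(weights, mean, sd)
restored = from_standard_units(z, mean, sd)

print(z)
print(restored)  # matches weights up to rounding error
```

The mean and SD must be those of the original list; the values in standard units alone do not determine the original scale.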
Note that both the transformation from original units to standard units and the transformation from standard units to original units are affine transformations: each multiplies the datum by a constant and adds a constant.
Source: www.stat.berkeley.edu