Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.


Exercise - Design a data set with predefined correlations

The goal of this exercise is to become familiar with the correlation coefficient. Design an artificial data set containing 20 "measurements" of 4 variables (v1..v4). The correlations of the variables v2, v3 and v4 with variable v1 should match the following target correlation coefficients ri,j to within +/- 0.1:

r1,2 = 0.0
r1,3 = 0.9
r1,4 = -0.9
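
For reference, the correlation coefficient meant here can be checked outside DataLab with a few lines of code. The following is a minimal sketch of the Pearson correlation coefficient, assuming Python with only the standard library; the function name pearson_r is a hypothetical helper and not part of DataLab.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each value from its variable's mean
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    # Sum of cross products divided by the root of the product of the sums of squares
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / math.sqrt(sxx * syy)
```

A perfectly linear increasing relationship such as pearson_r([1, 2, 3], [2, 4, 6]) gives 1, and reversing one of the sequences gives -1.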

Use DataLab for all the work. When finished, you should be able to provide the following items and answer the following question:
 

  • The data set (20 objects, 4 variables) in ASC format.
  • The cross-correlation matrix of your data set.
  • How are the variables v3 and v4 correlated with each other? Is it possible for these two variables to be uncorrelated even though both of them are highly correlated with variable v1?
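
One possible way to construct such a data set programmatically is sketched below. This is not the DataLab procedure the exercise asks for, only an illustration under stated assumptions: Python with the standard library, a fixed random seed, and hypothetical helper names (center, dot, unit, with_correlation, pearson_r). Each variable is built from v1 plus a noise component that has been made orthogonal to v1, so the correlation with v1 hits the target exactly rather than approximately. The output is plain tab-separated text; conversion to DataLab's ASC format is not shown.

```python
import math
import random

def center(x):
    """Subtract the mean so the values sum to zero."""
    m = sum(x) / len(x)
    return [xi - m for xi in x]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def unit(x):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x]

def with_correlation(u1, noise, r):
    """Build a variable whose correlation with u1 is exactly r.

    u1 must be centered and of unit length (assumption of this sketch).
    """
    c = center(noise)
    p = dot(c, u1)
    # Remove the part of the noise that is parallel to u1
    e = unit([ci - p * ui for ci, ui in zip(c, u1)])
    s = math.sqrt(1.0 - r * r)
    return [r * a + s * b for a, b in zip(u1, e)]

def pearson_r(x, y):
    dx, dy = center(x), center(y)
    return dot(dx, dy) / math.sqrt(dot(dx, dx) * dot(dy, dy))

rng = random.Random(1)  # fixed seed (arbitrary choice) for reproducibility
n = 20                  # number of "measurements"

def noise():
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

v1 = unit(center(noise()))
v2 = with_correlation(v1, noise(), 0.0)
v3 = with_correlation(v1, noise(), 0.9)
v4 = with_correlation(v1, noise(), -0.9)
variables = [v1, v2, v3, v4]

# Cross-correlation matrix of the four variables
corr = [[pearson_r(a, b) for b in variables] for a in variables]

print("v1\tv2\tv3\tv4")
for row in zip(*variables):
    print("\t".join(f"{value:.4f}" for value in row))

print()
for row in corr:
    print("\t".join(f"{value:+.3f}" for value in row))
```

Because the correlation coefficient does not change when a variable is shifted or multiplied by a positive constant, the columns can be rescaled afterwards to look like realistic measurements. Trying different seeds and inspecting the v3/v4 entry of the printed matrix is one way to approach the last question.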