Fundamentals of Statistics contains material of various lectures and courses of H. Lohninger on statistics, data analysis and chemometrics.


Population and Sample

In applied statistics, we have to differentiate strictly between populations and samples. Consider a box containing 500 balls of three different colors. This set of balls is called the population. Descriptive statistical measures, e.g. the mean or the standard deviation, or in our example the numbers of red, green and blue balls, are called parameters if they are calculated from the population. If you select, for example, 50 balls out of this box (a subset of the population), this set of 50 balls is called a sample. A descriptive measure derived from a sample, here the number of red balls in the box as estimated from the sample of 50 balls, is called an estimate (or statistic). Parameters are denoted by Greek letters, estimates by Latin letters.
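The ball example above can be sketched in a few lines of Python. This is only an illustration: the color counts (200 red, 180 green, 120 blue) and the helper names `make_population` and `estimate_count` are assumptions made for the sketch, not values from the text. The parameter is counted over the whole box, while the estimate is scaled up from a sample of 50 balls drawn without replacement.

```python
import random


def make_population(n_red=200, n_green=180, n_blue=120):
    """Build the box of balls. The color counts are assumed for illustration."""
    return ["red"] * n_red + ["green"] * n_green + ["blue"] * n_blue


def estimate_count(sample, color, population_size):
    """Estimate how many balls of `color` are in the box from a sample."""
    proportion = sample.count(color) / len(sample)
    return proportion * population_size


rng = random.Random(42)              # fixed seed so the run is repeatable
population = make_population()       # the population: all 500 balls

# Parameter: computed from the entire population.
true_red = population.count("red")

# Sample: 50 balls drawn without replacement, a subset of the population.
sample = rng.sample(population, 50)

# Estimate (statistic): computed from the sample only.
estimated_red = estimate_count(sample, "red", len(population))

print("parameter (red balls in the box):", true_red)
print("estimate from the sample of 50:  ", estimated_red)
```

Running the sketch with different seeds gives different estimates, while the parameter stays fixed. That is the practical difference between the two terms.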


Population: A population is the set of all possible states of a random variable. The size of the population may be either infinite or finite.
Sample: A sample is a subset of the population; its size is always finite.

The population, which is the basis of a statistical survey, must always be defined exactly (which is not always easy) in order to make sure that the results are comparable. The best approach is to define not only the factual prerequisites ("what has to be analysed") but also the spatial and temporal framework.

In our example of 500 balls, the size of the population is finite. In many cases, especially with real measurements, the population size is infinite. For example, if the variable of interest is the concentration of oxygen in air, measured with some analytical device, the population is the (infinite) set of all possible measurements (that is, all results that could be obtained from the analytical instrument).
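An infinite population cannot be enumerated, but it can be imitated by a process that produces a new measurement on demand. The following sketch assumes, purely for illustration, that the instrument readings scatter normally around a true value of 20.95 % oxygen by volume with a standard deviation of 0.05; the noise model and the names `measure` and `draw_sample` are hypothetical. Every sample is finite, and its mean and standard deviation are estimates of the corresponding population parameters.

```python
import random
import statistics

TRUE_MEAN = 20.95   # assumed population mean (% oxygen by volume)
NOISE_SD = 0.05     # assumed population standard deviation of the instrument


def measure(rng):
    """Return one simulated reading; the population of readings is unbounded."""
    return rng.gauss(TRUE_MEAN, NOISE_SD)


def draw_sample(rng, n):
    """Draw a finite sample of n readings from the infinite population."""
    return [measure(rng) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (5, 50, 5000):
    sample = draw_sample(rng, n)
    mean = statistics.mean(sample)    # estimate of the population mean
    sd = statistics.stdev(sample)     # estimate of the population standard deviation
    print(f"n = {n:5d}   mean = {mean:.4f}   sd = {sd:.4f}")
```

Larger samples tend to give estimates closer to the assumed parameters, but no finite sample exhausts the population.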

Another example is the number of cars driving on a particular section of a highway in the morning between 7am and 10am. The population of this variable is the set of daily counts of cars using this section during the defined time interval, for every day as long as the highway exists.