The population is the entire collection of individuals or measurements about which information is desired. A sample is a subset of the population that is selected for study. A parameter is a numerical characteristic of the population, and a statistic is a numerical characteristic of the sample. Statistics are what we use to draw inferences about the population's parameters.
The goal of inferential statistics is to use sample statistics to make inferences about population parameters.
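The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population (the integers 1 through 1000), the sample size of 50, and the function name are all hypothetical choices made for illustration. In practice the population is rarely available in full, which is why the statistic is needed.

```python
import random
import statistics


def parameter_and_statistic(population, sample_size, seed=0):
    """Return (population mean, sample mean) for one simple random sample.

    The population mean plays the role of the parameter; the sample mean
    is the statistic used to estimate it.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(population, sample_size)  # sampling without replacement
    parameter = statistics.mean(population)
    statistic = statistics.mean(sample)
    return parameter, statistic


# Hypothetical population: the integers 1..1000 (assumed for illustration only).
population = list(range(1, 1001))
parameter, statistic = parameter_and_statistic(population, sample_size=50)
print("parameter (population mean):", parameter)
print("statistic (sample mean):    ", statistic)
```

The two numbers will generally differ; a different seed gives a different sample and therefore a different statistic, while the parameter stays fixed.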
Data consisting of numerical (quantitative) variables can be further divided into two groups: discrete and continuous.
The most common type of discrete variable we will encounter is a counting variable.
Depending on how many variables we are measuring on the individuals or objects in our sample, we will have one of the following three types of data sets:
Measurements on any variable, even the same variable on the same subject, will always vary. The pattern of variation of a variable is called its distribution, which can be described both mathematically and graphically. In essence, the distribution records all possible numerical values of a variable and how often each value occurs (its frequency). The most famous example of a distribution is the bell-shaped curve. For examples of several types of distributions, see the "How are things distributed" concept lab.
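For a discrete variable, the idea of a distribution as "values and how often each occurs" can be sketched by tabulating frequencies. The data below (number of siblings reported by ten students) are made up for illustration, and the function names are hypothetical.

```python
from collections import Counter


def frequency_distribution(values):
    """Map each distinct value to the number of times it occurs, in sorted order."""
    counts = Counter(values)
    return dict(sorted(counts.items()))


def relative_frequencies(values):
    """Map each distinct value to the fraction of observations equal to it."""
    n = len(values)
    return {value: count / n for value, count in frequency_distribution(values).items()}


# Hypothetical counting variable: number of siblings for ten students.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]

for value, count in frequency_distribution(siblings).items():
    # A row of asterisks gives a crude text picture of the distribution.
    print(value, "*" * count)
```

The relative frequencies always sum to 1, which makes distributions from samples of different sizes easier to compare.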
Statistics begins with skills and principles for examining data. In this section, we will learn how to graphically display univariate data. The following principles apply:
NOTE: Observations that do not follow the regular pattern (sometimes called outliers) should not be removed from the data unless there is good reason to do so. A "good" reason could be that the outlier is a "typo" or that the equipment used to measure the observation failed. In other words, the removal of extreme observations should be based on technical or scientific knowledge rather than mere habit. Sometimes, in fact, we are interested in finding outliers. For instance, universities like to give scholarships to students who have unusually high scores on standardized tests such as the ACT or SAT.
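One way to find candidate outliers without removing anything is to flag them for inspection. The sketch below uses the common 1.5 x IQR rule of thumb; that rule is an assumption brought in for illustration and is not a criterion stated in these notes. The scores are made-up data, and the function name is hypothetical.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the observations lying more than k * IQR beyond the quartiles.

    Assumption: the 1.5 * IQR fence is a widely used rule of thumb, chosen
    here only for illustration. The data are not modified; flagged values
    are returned so they can be examined, not automatically discarded.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical test scores; one is unusually high.
scores = [21, 22, 23, 24, 25, 26, 27, 36]
print("flagged for a closer look:", flag_outliers(scores))
```

Whether a flagged value is an error to correct or an interesting case to study is a judgment that depends on knowledge of how the data were collected.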
Histograms provide a good graphical means for visualizing sample (population) distributions and for comparing more than one sample (population). Numerical summary statistics provide additional information for describing and comparing samples (populations). We will be concerned primarily with two types of numerical summary statistics,
The webmaster and author of this Math Help site is Graeme McRae.