What are some examples of categorical variables


Analysis of categorical and continuous variables



Every empirical measurement yields discrete measured values, even if in individual cases a large number of different measured values are possible.

A simple yet comprehensive classification of statistical analysis methods looks at the number of (discrete) values that the variables in question can take. This glossary is built around two types of statistical analysis techniques: techniques for analyzing categorical variables and techniques for analyzing continuous variables.

  • Categorical variables are characteristics with a limited number of values (categories).
  • Variables with a large number of values do not count as categorical variables. If their measurements are based on a continuous property, we call them continuous variables.

    Examples: Income in exact DM amounts and age in years are variables with a great many values. Both are based on a continuous property (solvency, lifetime) and should therefore be treated as continuous variables. The variables nationality and occupation also have many values: the world has more than a hundred nationalities, and the international standard classification of occupations distinguishes several thousand occupations. However, since neither variable is based on a continuous property, they cannot be treated as continuous variables; instead, the individual occupations or nationalities have to be used as the categories of a categorical variable.

This distinction by number of values leads to two different approaches to statistical data analysis. Since categorical variables have only a limited number of values, the natural approach is to model the individual categories. For variables with a large number of values, this procedure is no longer practicable. Instead, one falls back on analytical methods that model certain properties of the distribution of all values of the variable, e.g. its "center" or its "scatter". The number of values above which a variable is no longer regarded as categorical is not fixed; it depends on both substantive and practical considerations.
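The two approaches can be illustrated with a minimal Python sketch. The function names and the sample data are hypothetical and chosen only for illustration; relative frequencies, the mean, and the sample standard deviation are one possible choice for "modeling the categories" and for "center" and "scatter", not the only one.

```python
from collections import Counter
from statistics import mean, stdev


def summarize_categorical(values):
    """Describe a categorical variable by the share of each individual category."""
    counts = Counter(values)
    n = len(values)
    return {category: count / n for category, count in counts.items()}


def summarize_continuous(values):
    """Describe a continuous variable by its center (mean) and scatter (sample SD)."""
    return {"mean": mean(values), "stdev": stdev(values)}


# Hypothetical illustration data, not taken from the text
occupation = ["teacher", "nurse", "teacher", "baker"]
age_years = [23, 35, 41, 58, 62, 19, 47]

category_shares = summarize_categorical(occupation)  # one number per category
age_summary = summarize_continuous(age_years)        # two numbers for all values
```

The categorical summary grows with the number of categories, whereas the continuous summary stays at two numbers regardless of how many distinct ages occur, which is why the first approach becomes impractical for variables with many values.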

How does this distinction between analytical methods for categorical and continuous variables relate to the other classifications of variables? With regard to the distinction in mathematical statistics between "discrete and continuous random variables", categorical variables can also be referred to as discrete variables. Because discrete variables have clearly distinguishable (discrete) values, this data type lends itself to statistical modeling of the occurrence of individual values (categories). Methods of categorical data analysis are therefore based on the "distribution models" for discrete random variables. Conversely, for variables with many values, distribution models for continuous random variables are often used, provided that the property being measured is continuous. In individual cases it may be necessary to apply a so-called continuity correction.
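As an illustration of a continuity correction, the following sketch approximates a discrete binomial probability with a continuous normal distribution. The choice of the binomial case, the function names, and the parameter values are assumptions made for this example; the text itself does not specify which correction is meant.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p, continuity_correction=True):
    """Normal approximation to P(X <= k); the correction shifts k by 0.5."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    shift = 0.5 if continuity_correction else 0.0
    return normal_cdf(k + shift, mu, sigma)


# Hypothetical parameters chosen only for illustration
n, p, k = 20, 0.5, 8
exact = binomial_cdf(k, n, p)
uncorrected = normal_approx_cdf(k, n, p, continuity_correction=False)
corrected = normal_approx_cdf(k, n, p)
```

For these parameters the corrected approximation lies closer to the exact binomial probability than the uncorrected one, because the shift by 0.5 accounts for the discrete value k occupying the whole interval from k - 0.5 to k + 0.5 on the continuous scale.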

In principle, categorical variables can have either a metric or a non-metric "level of measurement"; the different levels of measurement can be taken into account through a suitable specification of the statistical analysis method. However, since non-metric variables usually have a limited number of values, they make up a large part of the categorical variables. Metric variables, on the other hand, often have many values and are therefore often treated as continuous variables. They can only be meaningfully analyzed as categorical variables if they have a limited number of values. This is the case, for example, for count variables that do not assume very high values in practice (number of children, number of roommates, number of residences, etc.), or when the range of values of the metric variable has been divided into a limited number of classes (see "classified variables").
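Dividing the range of a metric variable into classes can be sketched as follows. The class boundaries, labels, and ages are hypothetical, and ages are assumed to be recorded in whole years.

```python
from bisect import bisect_right
from collections import Counter


def classify(value, boundaries, labels):
    """Return the label of the class that contains value.

    boundaries must be sorted; labels needs one more entry than boundaries.
    A value equal to a boundary falls into the class that starts there.
    """
    return labels[bisect_right(boundaries, value)]


# Hypothetical class boundaries and labels for age in whole years
boundaries = [18, 30, 50, 65]
labels = ["under 18", "18-29", "30-49", "50-64", "65 and over"]

ages = [23, 35, 41, 58, 62, 19, 47, 17, 65]
age_class = [classify(age, boundaries, labels) for age in ages]
class_counts = Counter(age_class)  # frequency of each class
```

After classification the variable has only five values, so it can be analyzed with methods for categorical variables, at the cost of discarding the differences between ages within the same class.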



HJA 2001-10-01