Biostatistics for the Clinician

University of Texas-Houston
Health Science Center

Lesson 1.2

Variables and Measures

Lesson 1: Summary Measures of Data 1.2 - 1 UT Bullet

1.2 Variables and Measures

1.2.1 Why Important

Now let's move into some more familiar territory. When you start to measure the impact of a treatment you have to ask yourself, "What kinds of variables am I dealing with here? What are my choices of variables?"

Now, you might ask, why do I need to know about types of variables or measures? You need to know, in order to evaluate the appropriateness of the statistical techniques used, and consequently whether the conclusions derived from them are valid. In other words, you can't tell whether the results in a particular medical research study are credible unless you know what types of variables or measures have been used in obtaining the data.

Variables and Measures
Exercise 1:
You need to know the types of variable to:

No Response
Know biostatistical vocabulary
Evaluate medical research studies
Compute statistics
None of the above

1.2.2 Types of Variables

Look at the left side of Figure 1.1 below. You can see that one way to look at variables is to divide them into four categories (nominal, ordinal, interval and ratio). These refer to the levels of measure associated with the variables. By convention, the level of measure is also used to name the kind of variable, so you can speak of nominal, ordinal, interval and ratio variables.

One isn't necessarily better than another category. But, it is true you typically have more information with some than with others, and you're more used to working with some than with others.

With interval and ratio variables, for example, the values are numbers, so you can add them, divide them and compute averages. It's a little trickier sometimes with nominal and ordinal variables. But in human experiments there's no way around it: you often have to work with nominal or ordinal variables.

Figure 1.1: Types of Variables

Four Types of Variables

Look again at Figure 1.1. You can see there are four different types of measurement scales (nominal, ordinal, interval and ratio). Each of the four scales, respectively, typically provides more information about the variables being measured than those preceding it. That is the reason why the terms "nominal", "ordinal", "interval", and "ratio" are often referred to as levels of measure. Now let's look at the differences so that you can tell them apart.

Variables and Measures
Exercise 2:
How many different levels of measure for variables exist?

No Response

Nominal Variables

Where does the word "nominal" come from? It has to do with naming. Nominal comes from name, and naming is all you can do with variables measured on nominal scales (nominal variables). The important thing is that there is no measure of distance between the values. You're either married or not married; the answer is simply yes or no. So there is no question of how far apart, in a quantitative sense, those categories are. They are just names. Some other examples are sex (male, female), race (Black, Hispanic, Asian, White, other), political party (Democrat, Republican, other), blood type (A, B, AB, O), and pregnancy status (pregnant, not pregnant).
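
To make the "names only" idea concrete, here is a minimal Python sketch. The list of blood types is hypothetical data invented for illustration, and the numeric codes at the end are an arbitrary assumption. Counting categories and finding the most frequent one are meaningful for a nominal variable; averaging numeric codes is not.

```python
from collections import Counter

# Hypothetical blood types for eight patients (illustrative data only).
blood_types = ["A", "O", "B", "O", "AB", "O", "A", "O"]

# Counting how often each name occurs is meaningful for a nominal variable.
counts = Counter(blood_types)

# The mode (most frequent category) is the usual summary for nominal data.
most_common_type, most_common_count = counts.most_common(1)[0]

print(counts["O"])        # how many patients have type O
print(most_common_type)   # the most frequent blood type

# Recoding the names as numbers does not create a distance between them.
# These codes are arbitrary labels, so an "average blood type" computed
# from them would have no meaning.
arbitrary_codes = {"A": 1, "B": 2, "AB": 3, "O": 4}
```

Any other assignment of codes would serve equally well as labels, which is exactly why arithmetic on them tells you nothing.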

Variables and Measures
Exercise 3:
Can the distances between the categories of a nominal variable be measured?

No Response

Ordinal Variables

In the next kind of variable you have a little more sophistication than you can get with just names alone (see Figure 1.1). What does ordinal imply? Ordinal implies order. And, order means ranking. So the things being measured are in some order. You can have higher and lower amounts. Less than and greater than are meaningful terms with ordinal variables where they were not with nominal variables. For example, you don't rank male and female as higher and lower. But you do rank stages of cancer, for example, as higher and lower. You can rank pains as higher or lower. So, ordinal variables give you a more sophisticated level of measure - a finer tuned level of measurement. But you have now added only this one element having to do with ranking. You know that something is higher than something else, or lower than something, or more painful than something, or less painful than something.

So, ordinal scales both name and order. Some other examples of ordinal scales are rankings (e.g., football top 20 teams, pop music top 40 songs), order of finish in a race (first, second, third, etc.), cancer stage (stage I, stage II, stage III), and hypertension categories (mild, moderate, severe).
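
As a rough sketch of "name and order" in Python, the example below stores the ordering of cancer stages in a list and compares stages by position. The patient list and the helper names `rank` and `is_higher` are hypothetical, introduced only for illustration. Ranking and picking the middle value use nothing but order, so they are legitimate for an ordinal variable.

```python
# Ordered categories for a cancer-stage variable, lowest first.
STAGE_ORDER = ["stage I", "stage II", "stage III"]


def rank(stage):
    """Return the position of a stage in the ordering (0 = lowest)."""
    return STAGE_ORDER.index(stage)


def is_higher(a, b):
    """True if stage a ranks above stage b."""
    return rank(a) > rank(b)


# Hypothetical stages for five patients (illustrative data only).
patients = ["stage II", "stage I", "stage III", "stage II", "stage I"]

# Sorting by rank is meaningful because the categories have an order.
ranked = sorted(patients, key=rank)

# With an odd number of patients, the middle value of the sorted list
# is the median stage. It relies on order only, not on distances.
median_stage = ranked[len(ranked) // 2]

print(is_higher("stage III", "stage I"))
print(median_stage)
```

Notice that nothing here says how much worse stage III is than stage II. The positions 0, 1, 2 only record order, not distance.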

Variables and Measures
Exercise 4:
Nominal variables name only. Ordinal variables:

No Response
Name only
Order only
Both name and order

Interval Variables

What about interval variables (see Figure 1.1)? How are they different? Why are Celsius and Fahrenheit temperature variables called interval variables? They are called interval variables because the intervals between the numbers represent something real. This is not the case with ordinal variables.

Interval variables have the property that differences in the numbers represent real differences in the variable. Another way to say this is that equal differences in the numbers on the scale represent equal differences in the underlying variable being measured. For example, compare the difference between 36 degrees and 37 degrees with the difference between 40 degrees and 41 degrees on either the Fahrenheit or the Celsius scale. Is the difference the same? Yes. Because temperature is an interval variable, equal differences in the numbers mean that the temperature intervals are the same.

So, with interval variables you now know not only whether one value is higher than another, but that the distances between the intervals on the scales are the same. Again, you have a higher level of information. Interval scales not only name and order, but also have the property that equal intervals in the numbers measured represent real equal differences in the variables.

Examples of interval scales include the Fahrenheit and Celsius temperatures previously mentioned, and SAT, GRE, MAT, and IQ scores. In general, many of the standardized tests of the psychological, sociological and educational disciplines use interval scales. Interval measures all share the property that the value of zero is arbitrary. On the Celsius scale, for example, 0 is the freezing point of water. On the Fahrenheit scale, 0 is 32 degrees below the freezing point of water.
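
The two properties just described, equal intervals and an arbitrary zero, can be checked with a short Python sketch. The function name `celsius_to_fahrenheit` is our own; the conversion formula is the standard one. The comparison uses `math.isclose` because floating-point subtraction can leave tiny rounding differences.

```python
import math


def celsius_to_fahrenheit(c):
    """Standard conversion from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


# A one-degree Celsius step low on the scale and another one higher up.
step_low = celsius_to_fahrenheit(37) - celsius_to_fahrenheit(36)
step_high = celsius_to_fahrenheit(41) - celsius_to_fahrenheit(40)

# Equal intervals on one scale remain equal intervals on the other.
equal_intervals = math.isclose(step_low, step_high)

# The zero point is arbitrary: 0 on the Celsius scale is not 0 on the
# Fahrenheit scale, so neither zero means "no temperature".
zero_celsius_in_fahrenheit = celsius_to_fahrenheit(0)

print(equal_intervals)
print(zero_celsius_in_fahrenheit)
```

Both steps come out to about 1.8 Fahrenheit degrees, while the zero of one scale lands at 32 on the other.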

Variables and Measures
Exercise 5:
Interval variables:

No Response
Name, order & have equal intervals
Name and order only
Order only
Name only

Ratio Variables

Ratio variables have all the properties of interval variables plus a real absolute zero. That is, a value of zero represents the total absence of the variable being measured. Some examples of ratio variables are length measures in the English or metric systems, time measures in seconds, minutes, hours, etc., blood pressure measured in millimeters of mercury, age, and common measures of mass, weight, and volume (see Figure 1.1).

They are called ratio variables because ratios are meaningful with this type of variable. It makes sense to say 100 feet is twice as long as 50 feet, because length measured in feet is a ratio scale. Likewise it makes sense to say a Kelvin temperature of 100 is twice as hot as a Kelvin temperature of 50, because it represents twice as much thermal energy (unlike Fahrenheit temperatures of 100 and 50). The only difference from interval variables is that ratio variables have a true zero, so that you can actually talk about ratios. That is, a person's lung capacity can be twice somebody else's lung capacity. To make those kinds of statements you have to be able to compute meaningful ratios, and you can only do that if you have a true zero. For the purposes of most statistical tests, though, it makes little difference whether you have interval or ratio variables.
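
The Kelvin versus Fahrenheit contrast can be worked through numerically. In this sketch the function name `fahrenheit_to_kelvin` is our own and the conversion is the standard one. Dividing the raw Fahrenheit readings gives 2, but converting both readings to the Kelvin scale, which has a true zero, shows the actual ratio is much smaller.

```python
def fahrenheit_to_kelvin(f):
    """Standard conversion from degrees Fahrenheit to kelvins."""
    return (f - 32) * 5 / 9 + 273.15


# On the Kelvin scale zero is a true zero, so this ratio is meaningful:
# 100 K really is twice 50 K.
kelvin_ratio = 100.0 / 50.0

# Dividing Fahrenheit readings also gives 2, but the number is misleading
# because 0 degrees Fahrenheit is an arbitrary point, not "no heat".
naive_fahrenheit_ratio = 100.0 / 50.0

# Converting both readings to kelvins first gives the meaningful ratio.
true_ratio = fahrenheit_to_kelvin(100.0) / fahrenheit_to_kelvin(50.0)

print(kelvin_ratio)
print(naive_fahrenheit_ratio)
print(round(true_ratio, 3))
```

So 100 degrees Fahrenheit is only about 10 percent "hotter" than 50 degrees Fahrenheit in the ratio sense, not twice as hot.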

Variables and Measures
Exercise 6:
Ratio variables have:

No Response
A real 0
Equal intervals
All except "No Response" above

Qualitative vs. Quantitative Variables

Look at Figure 1.1 again. On the left-hand side you see that there are two larger classifications for the kinds of variables you have been studying: qualitative variables and quantitative variables. The four levels of measure (nominal, ordinal, interval and ratio) fall into these two larger supercategories. Interval and ratio variables are two kinds of quantitative variables, and nominal and ordinal variables are two kinds of qualitative variables.

Now, one kind of variable isn't necessarily better than another. You are probably a little more used to working with quantitative variables. Because the values are numbers, you can add them, divide them and compute averages. With qualitative variables it's not so clear-cut. It's a little trickier sometimes. But when you are working with humans there's no way around it.

Don't Dilute Your Variables

The important thing is to avoid diluting your measures. If you have interval measures, keep them at the finest level of measure you have. Don't reclassify temperature measures into categories like "High" and "Low", or "Very Cold", "Cold", "Neutral", "Hot", "Very Hot". Don't cluster or group them and make them into ordinal variables. If you do, you are throwing away information. So, if you have information at the interval level, record it at the interval level. If it's at the ordinal level, record it at that level. And, of course, if you're at the nominal level you're stuck with recording it at that level. Never collapse your measurements when you begin your experiments in a way that loses information.
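
Here is a small Python sketch of what "throwing away information" looks like. Everything in it is hypothetical: the two sets of temperature readings are invented, and the 100.4 degree cut-point in `to_category` is an assumption chosen only for illustration. Two clearly different sets of readings become indistinguishable once they are collapsed into "High" and "Low".

```python
from statistics import mean


def to_category(temp_f):
    """Collapse a Fahrenheit reading into two groups.

    The 100.4 cut-point is a hypothetical choice for this example.
    """
    return "High" if temp_f >= 100.4 else "Low"


# Hypothetical temperature readings for two groups (illustrative only).
group_a = [98.6, 99.1, 100.5, 101.0]
group_b = [97.0, 100.3, 100.4, 104.8]

# After grouping, the two sets of readings look identical.
categories_a = [to_category(t) for t in group_a]
categories_b = [to_category(t) for t in group_b]

# At the interval level the groups are clearly different.
mean_a = mean(group_a)
mean_b = mean(group_b)

print(categories_a == categories_b)
print(mean_a, mean_b)
print(max(group_a), max(group_b))
```

You can always derive the categories later from the recorded numbers, but you can never recover the numbers from the categories.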

Variables and Measures
Exercise 7:
Interval or ratio variables should not be regrouped into nominal or ordinal measures.

No Response

Parametric vs. Nonparametric

When statistical analyses are applied, the statistics must take into account the nature of the underlying measurement scale, because there are fundamental differences in the types of information imparted by the different scales (see Figure 1.1). The bottom line is this. Nominal and ordinal scales must be analyzed using what are called nonparametric, or distribution-free, statistical methods. Interval and ratio scales, on the other hand, should if at all possible be analyzed using the typically more powerful parametric statistical methods. But parametric statistics typically require that the interval or ratio variables have distributions shaped like the bell (normal) curve, along with some other assumptions. It turns out that the bell curve assumption is a reasonable one for many of the kinds of variables frequently encountered in medical practice.
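
One way to see why ordinal data call for order-based methods is the sketch below. It is an illustration under stated assumptions, not a full statistical test: the pain ratings are hypothetical, the two numeric codings are arbitrary, and `mann_whitney_u` computes only the pair-counting U statistic used by the Mann-Whitney (rank-sum) approach, with no p-value. Because U depends only on order, it stays the same when the ordinal codes are changed, whereas a difference in means, a parametric-style summary, changes with the coding.

```python
from statistics import mean


def mann_whitney_u(xs, ys):
    """Count pairs (x, y) with x > y; a tie counts as one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical pain ratings coded 1 = mild, 2 = moderate, 3 = severe.
treated = [1, 1, 2, 2, 3]
control = [2, 3, 3, 3, 2]

# A second coding that keeps the same order but different spacing.
recode = {1: 1, 2: 2, 3: 10}
treated_recoded = [recode[v] for v in treated]
control_recoded = [recode[v] for v in control]

# The order-based statistic does not depend on the coding.
u_original = mann_whitney_u(treated, control)
u_recoded = mann_whitney_u(treated_recoded, control_recoded)

# The difference in means does depend on the coding.
mean_gap_original = mean(control) - mean(treated)
mean_gap_recoded = mean(control_recoded) - mean(treated_recoded)

print(u_original, u_recoded)
print(mean_gap_original, mean_gap_recoded)
```

Since the numbers attached to ordinal categories are only ranks, a result that shifts when you respace them is not telling you anything about the patients.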

Variables and Measures
Exercise 8:
Nominal and ordinal variables require:

No Response
Parametric methods
Nonparametric methods

Independent vs. Dependent Variables

Look again at Figure 1.1, this time at the right side, and you see another way of categorizing variables. Basically you need to discriminate between outcomes like gastric ulcers, on the one hand, and other variables that may or may not affect that outcome. The variables that are the causal factors, or that you may manipulate, are called the independent variables. The outcomes of the treatments, or the responses to changes in the independent variables, are called the dependent variables, because their values presumably depend on what happens to the independent variables.

For example, the treatments you administer in an experiment constitute levels of the independent variable(s). In smoking research you might look at number of cigarettes smoked as an independent variable and incidence of lung cancer as a dependent variable. In research on atherosclerosis, you might look at dietary saturated fat or amount of vitamin E supplementation as independent variables and degree of atherosclerosis as a dependent variable. In research on comparative cancer treatments, the cancer treatments form the independent variable(s), while various measures of progression of the disease make up the dependent variables. If you wanted to look at how aspirin dosages affect the frequency of second heart attacks, the aspirin dosage would be the independent variable and the heart attack frequency would be the dependent variable.

Variables and Measures
Exercise 9:
Variables you manipulate are:

No Response
Independent variables
Dependent variables

1.2.3 C.R.A.P. Detectors

The following summarize some good general rules for the appropriate conduct of medical research and the evaluation of medical research studies.

C.R.A.P. Detectors
C.R.A.P. Detector #1.1 Dependent variables should be sensible. Ideally, they should be clinically important, but also related to the independent variable.
C.R.A.P. Detector #1.2 In general, the amount of information increases as one goes from nominal to ratio. Classifying good ratio measures into large categories is akin to throwing away data.

End Lesson 1.2
Variables and Measures
