*Research organizes a question or questions about patient characteristics,
methods of treatment or materials, and
processes used in treatment and produces accurate, valid answers to improve the quality of patient care (1). To
be able to answer questions, we must
first convert our observations into measurable quantities, also known as
"properties" or "variables" (2). This article defines different types of variables
and elaborates on their use and misuse
in published research.*

A variable, according to Webster, has
several definitions: "... able to vary or
alter, susceptible to change, having no
fixed value ..." (3). When applied to research, variables are classified as *independent* or *dependent* (4). Understanding variables, their definitions, and
how they may be manipulated and
measured is critical to making correct
inferences (4).

The researcher has control over independent variables and can choose to
alter or change them. *Dependent variables* change or react to the state of the
*independent variable* (5,6). For example, to evaluate the effect a new residual limb sock has on reducing edema,
the researcher first selects patients who
have edematous residual limbs. Next,
two commonly used residual limb
socks and a new one to be evaluated
are applied, and the circumferences or
volumes of the residual limbs are measured.

In this case the *independent variable*
is the type of sock, and the circumference or volume of the residual limbs is the dependent variable. There is control over
which sock is applied, but no control
over how circumference or volume of
the residual limb reacts to changes in
pressure presented by the socks.

For another example, suppose a prosthetic foot has been designed to be easier to walk with and to save energy. The measure of energy expenditure will be heart rate. To perform an evaluation, other commonly used prosthetic feet must be obtained. Amputee subjects should walk a measured distance at a defined speed, and their heart rates should be measured immediately before and after they walk.

This process should be repeated with
each of the prosthetic feet. In this instance
the prosthetic feet are the *independent variables* in that the researcher
has chosen the feet, while heart rate
(the subjects' reaction to walking with
the various feet) is the *dependent variable*. The distance and speed are under
the researcher's control, and the effect
of the feet on the effort of walking is
measured by heart rate alone.
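The prosthetic foot example can be sketched in a few lines of Python. This is only an illustration: the subject identifiers, foot labels, and heart rates below are hypothetical values, not data from any study. The sketch groups the dependent variable (the rise in heart rate) by the independent variable (the foot worn).

```python
from statistics import mean

# Hypothetical records: (subject, foot, heart rate before, heart rate after),
# with heart rates in beats per minute. None of these values come from a study.
trials = [
    ("S1", "foot_A", 72, 96),
    ("S1", "foot_B", 70, 90),
    ("S2", "foot_A", 80, 110),
    ("S2", "foot_B", 78, 100),
]


def mean_rise_by_foot(records):
    """Group the dependent variable (heart-rate rise) by the independent variable (foot)."""
    rises = {}
    for _subject, foot, before, after in records:
        rises.setdefault(foot, []).append(after - before)
    return {foot: mean(values) for foot, values in rises.items()}


result = mean_rise_by_foot(trials)
print(result)
```

Because heart rate is a continuous measurement, averaging the rise for each foot is a legitimate operation here.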

Variables are usually denoted by X and Y. The independent variable is identified as the X-variable and is plotted on the X-axis or abscissa of a graph, and the dependent variable is defined as the Y-variable and is plotted on the Y-axis or ordinate.

A lack of understanding or awareness of classifying variables is a common cause of errors in scientific literature published by allied health professionals (1). The type of statistical test to be applied to a data set depends on classification of the variable.

Variables are classified into two major groups: discrete and continuous (1,2,4,7). Discrete variables have finite values, such as sex, blood type, race and manual muscle test grades, and allow subjects to be grouped into mutually exclusive categories. Discrete variables are often defined by alpha characters, such as blood type (A, B, O, AB), or numerically as integers such as muscle grades (0, 1, 2, 3, 4, 5).

In both cases the distances between the values are neither equal nor defined. If the dependent variable "pain" were graded as none=0, slight=1, moderate=2, severe=3, or unbearable=4, then values between the discrete values would be undefined and meaningless.

However, continuous variables such as age, height, weight, blood pressure or force are infinitely divisible. Continuous variables are always numeric and are on a real scale with clearly defined subdivisions (4). The choice of what statistical test to apply to data depends on whether the variables are discrete or continuous. A judgment error on this point will result in flawed conclusions.

The ensuing discussion focuses on the differences and restrictions one must be aware of when defining the variables used to answer a research question.

Variables with the simplest characteristics are those in the nominal category.
Webster defines *nominal* as: "... consisting of giving a name ... in name
only, not in fact ... very small compared to expectations ... hardly worth
mention ..." (3). In the context of research, *nominal* data commonly identifies groups of two members, e.g., male
or female, left or right, yes or no, spasticity or no spasticity, young or old,
normal or abnormal, etc. There is no
rank or order, no better or worse, and
no mathematical operations that can
be applied to numbers used in grouping
in this manner (2).

Because of the prolific use of computers in managing research data and because computers prefer numbers to alpha characters, it is common to code data as 1=male, 2=female or 1=yes, 2=no. One must remember that these numbers are only codes and that the items they represent are not arithmetic quantities. There is no such thing as an "average sex" of 1.5 or an "average yes/no" of 1.5, which is what you would get if you derived the mean of the coded "sex" or "yes/no" responses. You can see the ridiculous conclusion that can be drawn by inappropriate operations on numbers that do not follow the laws of mathematics.
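The "average sex" problem is easy to reproduce. The following is a minimal sketch, assuming a small hypothetical list of coded responses; it shows that the software will happily compute a mean of the codes, while only the tally by group carries meaning.

```python
from collections import Counter
from statistics import mean

# Hypothetical coded responses: 1 = male, 2 = female. The numbers are labels only.
SEX_LABELS = {1: "male", 2: "female"}
sex_codes = [1, 2, 2, 1, 2, 1, 1, 2]

# The arithmetic runs without complaint, but the result describes nothing.
meaningless_mean = mean(sex_codes)

# Counting group membership is the valid summary for nominal data.
counts = Counter(SEX_LABELS[code] for code in sex_codes)

print(meaningless_mean)  # the "average sex"
print(dict(counts))
```

Translating the codes back to their labels before summarizing, as done here with the assumed `SEX_LABELS` lookup, makes it harder to treat them as quantities by accident.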

Grouping subjects in *nominal* categories is simply a way of separating
subjects into two groups. It also facilitates record keeping and tallying by assigning arbitrary numbers to identify
groups. This is the first level of measurement.

The next step in characterizing data
is that of the *ordinal* scale (1,8,9).
Webster defines *ordinal* as "... denoting order ... as, the ordinal numbers,
first, second, third." This is probably
the most misused of all levels of measurement (9). In an ordinal scale, a hierarchy is ascribed to data and listed from
most to least, worst to best, highest to
lowest, etc. The distance from group to
group is not defined (2,8,9).

For example, in the previously described pain scale with 5 possible levels, it is not possible to equate the difference between "no" and "slight pain" to the difference between "severe" and "unbearable pain." These descriptors are subject to a wide range of interpretations.

Many researchers err by performing arithmetic on numbers assigned to parameters that fall on a nominal or ordinal scale. For example, if the variable of blood type is being measured, it would be appropriate to group people according to blood type. Assume that the researcher, either for convenience or out of necessity for computer data entry, assigned numbers to those blood types.

It would not be appropriate to perform mathematical operations such as adding, subtracting and averaging of those numbers because it is meaningless to compute a blood-type group that is more or less than A. It is either type A, B, O or AB. There simply is no such thing as an average blood type, but there is a most common blood type. When numbers are assigned to discrete variables, care must be taken not to forget the type of data.

How often have we read about the "average" spasticity score when the original data were mild, moderate or severe (with scores of 1, 2 or 3 assigned), or about "average" muscle grades when the tested data were zero, trace, poor, fair, good, normal (with scores of 0, 1, 2, 3, 4 or 5 assigned)? In each case, numbers are assigned to discrete groups on an ordinal scale, leading to inappropriate data analysis, flawed logic and invalid inferences (9).

Unfortunately, average (incorrect) spasticity scores and muscle grades are too frequently reported. This mathematical limitation is especially important because most data are managed with statistical computer programs. Once the numerical representations of the ordinal scale groups are entered, the computer has no way of knowing the difference, and the results, however tempting, are invalid. While it would be correct to tally the counts in each group to arrive at the conclusion that "most patients had 3+ quadriceps," it is incorrect to arrive at the conclusion that "... the average quadriceps grade was 3.7."
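The difference between tallying and averaging muscle grades can be sketched as follows. The grades are hypothetical, and the use of the middle-ranked grade (`median_low`) is an assumption added here as one order-based summary; the text above recommends only the tally.

```python
from collections import Counter
from statistics import mean, median_low

# Manual muscle test grades coded 0-5; the patient data below are hypothetical.
GRADE_LABELS = {0: "zero", 1: "trace", 2: "poor", 3: "fair", 4: "good", 5: "normal"}
grades = [3, 4, 3, 5, 3, 4, 2, 3, 5, 4]

# Valid: count how many patients fall in each grade and report the most common one.
tally = Counter(grades)
most_common_grade, n_patients = tally.most_common(1)[0]

# Also order-based (an assumption of this sketch): the middle-ranked observed grade.
middle_grade = median_low(grades)

# Invalid: the computer will compute this, but the distance between grades is undefined.
invalid_average = mean(grades)

print(f"Most patients ({n_patients}) had grade {GRADE_LABELS[most_common_grade]}")
print(f"Middle-ranked grade: {GRADE_LABELS[middle_grade]}")
```

Both valid summaries return a grade that actually exists on the scale, whereas the mean produces a value that no patient could have been assigned.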

To visualize the correct variable type (discrete vs. continuous), draw a horizontal line. If you can use a ruler or some other scale to place the next value of the group, then it is continuous. If you have trouble with the exact placement of the next value on the line then it is discrete. For example, suppose the T under the line in Figure 1 represents the manual muscle test grade of "trace." How far away do you put the "F" (fair)?

Since there is no well-defined distance between manual muscle test
grades, you cannot mathematically manipulate them as though they were on a
continuous linear arithmetic scale. A
strong warning about incorrect usage
of *ordinal* scales was issued by Johnston *et al*. in discussing current methodology used to evaluate rehabilitation
outcomes (10).

Often, *ordinal* data are summed,
making them meaningless. Depending
on the categories being evaluated, it is
not uncommon to find a list such as
bathing, feeding, catheter care, wheelchair skills and walking used to evaluate rehabilitation outcomes.

If an *ordinal* scale from 1 to 10 is used
to evaluate or grade a list of patient
skills, it is not difficult to see that a
change from feeding with an orthosis to
feeding with no orthosis has no linear
relationship to a change from wheelchair to walking abilities. Worse yet, if
such scores are summed there is great
temptation to average them, which has
no mathematical validity and leads to
erroneous and often unfortunate interpretation of results affecting factors
such as rehabilitation strategies and reimbursement (8).

This is not to imply that *nominal* or
*ordinal* scales cannot be used. While it
is beyond the scope of this article to
discuss the statistics, proper use of discrete *nominal* and/or *ordinal* data is not
only acceptable but useful and necessary. Many published and commonly
used ordinal scales are routinely applied by allied health professionals to
define functional capacities (11). Examples include perceived exertion
scales, muscle tone, static balance,
walking performance, etc. (11). Misuse
in data management when applying
statistical analyses, however, is surprisingly frequent.

An example of proper treatment of ordinal data was demonstrated by Roland and Morris in their research regarding the development of a reliable and sensitive measure of disability in patients with low back pain (12). Their 24-item questionnaire was constructed from a previously published scale.

Examples of the questions to which respondents answered "yes" or "no" were, "I walk more slowly than usual because of my back," and "Because of my back pain, I am more irritable and bad tempered with people." The "yes" responses were tallied to find the total "yes" count or disability score. Thus, an individual score could range from 0 (no disability) to 24 (severe disability). This questionnaire was found to be simple, was applied to a mixed population, had excellent short-term repeatability, compared well with self-rated measures of pain, and was unrelated to age or social class (12).
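The scoring rule described above, a simple count of "yes" responses, can be sketched as below. The function name and the example responses are hypothetical; only the 24-item length and the yes/no tally come from the description.

```python
def disability_score(responses):
    """Count the "yes" answers on a 24-item yes/no questionnaire."""
    if len(responses) != 24:
        raise ValueError("expected 24 responses")
    if any(answer not in ("yes", "no") for answer in responses):
        raise ValueError('each response must be "yes" or "no"')
    return sum(1 for answer in responses if answer == "yes")


# Hypothetical respondent who answered "yes" to 7 of the 24 items.
example_responses = ["yes"] * 7 + ["no"] * 17
score = disability_score(example_responses)
print(score)
```

The score is a count of items endorsed, so it always falls between 0 and 24 and never requires assigning arbitrary numbers to graded categories.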

The next level of measurement is the
*interval scale*. According to Webster,
the definition of *interval* includes "... a
space between things ... a period of
time between any two points or
events ... the extent of difference between two qualities" (3). The interval
scale has two conditions that differentiate it from either the *nominal* or *ordinal* scales in that there is a defined unit
of measure and a zero point. However,
both are arbitrary (7). The important
characteristic of *interval* data is that the
distance between any two units is
known and can be reproduced (1).

Examples of interval data include the
measurement of temperature in degrees (Celsius or Fahrenheit) and calendar time. The *interval scale* does not
have a fixed zero, as demonstrated by the
fact that the two temperature scales
place zero at different points (2). Fresh water freezes at 0°
on the Celsius scale and at 32° on the
Fahrenheit scale. The temperature at
which salt water freezes is arbitrarily
designated as 0° on the Fahrenheit
scale (13). With interval data it is legitimate to add the data and divide by the
number of subjects to arrive at a mathematical mean for the group (2).

For example, it would be permissible to measure temperature on consecutive days, obtain a sum of the temperatures and divide by the number of days to arrive at an average or mean temperature. A temperature probe is often used to objectively define the level of pathology in the neuropathic foot (14). It would be correct to measure the temperatures of the involved and noninvolved portions of the feet, obtain the differences and calculate an average difference for a group of similarly involved patients.
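The temperature-difference calculation can be sketched as follows, assuming hypothetical paired readings in degrees Celsius for the involved and noninvolved portions of each patient's foot.

```python
from statistics import mean

# Hypothetical (involved, noninvolved) foot temperatures in degrees Celsius,
# one pair per patient.
readings = [(31.5, 29.0), (32.0, 30.0), (30.5, 29.5)]

# On an interval scale the distance between two readings is defined,
# so differences can be taken and then averaged.
differences = [involved - noninvolved for involved, noninvolved in readings]
mean_difference = mean(differences)

print(differences)
print(round(mean_difference, 2))
```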

The final continuous variable type is
referred to as *ratio* and is defined by
Webster as: "... a fixed relation in degree or number between two similar
things ... the quotient of one quantity
divided by another" (13). The major
difference between the two data classifications of interval and ratio is that ratio data have an absolute zero (2). Ratio
data are the most frequently used class
by healthcare professionals who deal
with patients' physical attributes. Ordinal and interval data are most commonly encountered by professionals
who work within psycho-social arenas.

Examples of ratio data include height (inches, centimeters, meters, etc.), weight (pounds, kilograms, newtons, etc.), velocity, distance (e.g., circumference or distance walked), volume, heart rate, force, torque, etc. With this latter classification, all mathematical operations are valid (13).
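The practical consequence of an absolute zero can be checked numerically. The sketch below is an illustration with arbitrary example values: a quotient of two temperatures changes when the unit changes (interval scale), while a quotient of two heights does not (ratio scale).

```python
import math


def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


def inches_to_centimeters(inches):
    """Convert a length in inches to centimeters."""
    return inches * 2.54


# Interval scale: whether 20 degrees is "twice" 10 degrees depends on the unit.
ratio_celsius = 20 / 10
ratio_fahrenheit = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio scale: 52 inches is twice 26 inches in any unit of length.
ratio_inches = 52 / 26
ratio_centimeters = inches_to_centimeters(52) / inches_to_centimeters(26)

print(ratio_celsius, ratio_fahrenheit)
print(math.isclose(ratio_inches, ratio_centimeters))
```

The same two temperatures give a quotient of 2 in Celsius and 1.36 in Fahrenheit, which is why statements such as "twice as warm" are not meaningful on an interval scale.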

In Figure 2, suppose the number 52 above the line is the height in inches of your subject. Where would you place the 53-inch mark on the line? Since height is a continuous variable, the 53-inch mark would be placed one inch to the right of the 52-inch mark. Understanding this concept is crucial when analyzing data and reporting results.

Depending on the question being asked and the goal of your research, you will have different types of variables and levels of measurement available (2). To enable proper analysis and meaningful results, it is important to understand the differences among them and the mathematical restrictions that apply to each.

*Brenda Rae Lunsford, MS, PT, is a visiting assistant professor in the school of physical therapy at Texas Woman's University in Houston.*

**References:**

1. Lehmkuhl D. Mixing one part common sense with each part statistics in planning the design and reporting the results of clinical research in physical therapy. Phys Ther 1987;67:12:1851-3.
2. Michels B. Evaluation and research in physical therapy. Phys Ther 1982;62:6.
3. Webster. New twentieth century dictionary. Unabridged 2nd ed. The World Publishing Co., 1962.
4. Currier DP. Elements of research in physical therapy. 2nd ed. Baltimore: Williams & Wilkins, 1984: Chapter 5.
5. Payton O. Research: The validation of clinical practice. Philadelphia: FA Davis, 1979:51-6, 81.
6. Dominowski RL. Research methods. New Jersey: Prentice Hall, 1980.
7. Afifi AA, Azen S. Statistical analysis: A computer-oriented approach. 2nd ed. New York: Academic Press, 1979:2-5.
8. Wright BD, Linacre JM. Observations are always ordinal; measurement, however, must be interval. A Special Communication. Arch Phys Med Rehabil 1989;70:857-67.
9. Merbitz C, Morris J, Grip JC. Ordinal scales and foundations of misinference. Arch Phys Med Rehabil 1989;70:308-12.
10. Johnston MV, Keith RA, Hinderer SR. Measurement standards for interdisciplinary medical rehabilitation. Arch Phys Med Rehabil 1992;73.
11. Bohannon RW. Simple clinical measures. Phys Ther 1987;67:12:1845-50.
12. Roland M, Morris R. A study of the natural history of back pain: Part I. Development of a reliable and sensitive measure of disability in low back pain. Spine 1983;8:141-4.
13. Krebs DE. Measurement theory. Phys Ther 1987;67:12:1834-9.
14. Elftman NL. Clinical management of the neuropathic limb. JPO 1991;4:1:1-12.