Research organizes a question or questions about patient characteristics, methods of treatment or materials, and processes used in treatment and produces accurate, valid answers to improve the quality of patient care (1). To answer questions, we must first convert our observations into measurable quantities, also known as "properties" or "variables" (2). This article defines different types of variables and elaborates on their use and misuse in published research.
A variable, according to Webster, has several definitions: "... able to vary or alter, susceptible to change, having no fixed value ..." (3). When applied to research, variables are classified as independent or dependent (4). Understanding variables, their definitions, and how they may be manipulated and measured is critical to making correct inferences (4).
The researcher has control over independent variables and can choose to alter or change them. Dependent variables change or react to the state of the independent variable (5,6). For example, to evaluate the effect a new residual limb sock has on reducing edema, the researcher first selects patients who have edematous residual limbs. Next, two commonly used residual limb socks and a new one to be evaluated are applied, and the circumferences or volumes of the residual limbs are measured.
In this case the independent variable is socks, and the circumference or volume of the residual limbs is the dependent variable. There is control over which sock is applied, but no control over how circumference or volume of the residual limb reacts to changes in pressure presented by the socks.
For another example, a prosthetic foot has been designed to be easier to walk with and to save energy. The measure of energy expenditure will be heart rate. To perform an evaluation, other commonly used prosthetic feet must be obtained; subject amputees should walk a measured distance at a defined speed (velocity), and their heart rates should be measured immediately before and after they walk.
This process should be repeated with each of the prosthetic feet. In this instance the prosthetic feet are the independent variables in that the researcher has chosen the feet, while heart rate (the subjects' reaction to walking with the various feet) is the dependent variable. The distance and speed are under the researcher's control, and the effect of the feet on the effort of walking is measured by heart rate alone.
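The prosthetic-foot evaluation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the foot labels and heart-rate readings are invented for the example, and the helper name `mean_heart_rate_rise` is hypothetical. The independent variable is the dictionary key (which foot was chosen), and the dependent variable is the measured heart rate.

```python
from statistics import mean

# Invented illustration values, not study data.
# Key: prosthetic foot (independent variable, chosen by the researcher).
# Value: (before, after) heart rates in beats/min for each subject
# (dependent variable, the subjects' reaction to walking).
trials = {
    "foot_A": [(72, 96), (80, 101)],
    "foot_B": [(72, 90), (80, 95)],
}

def mean_heart_rate_rise(readings):
    """Average rise in heart rate (after minus before) across subjects."""
    return mean(after - before for before, after in readings)

# One summary value of the dependent variable per level of the independent variable.
rise_by_foot = {foot: mean_heart_rate_rise(r) for foot, r in trials.items()}
print(rise_by_foot)
```

Averaging is legitimate here because heart rate is a continuous measurement with equal, defined units.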
Variables are usually denoted by X and Y. The independent variable is identified as the X-variable and is plotted on the X-axis or abscissa of a graph, and the dependent variable is defined as the Y-variable and is plotted on the Y-axis or ordinate.
A lack of understanding or awareness of classifying variables is a common cause of errors in scientific literature published by allied health professionals (1). The type of statistical test to be applied to a data set depends on classification of the variable.
Variables are classified into two major groups: discrete and continuous (1,2,4,7). Discrete variables have finite values, such as sex, blood type, race and manual muscle test grades, and allow subjects to be grouped into mutually exclusive categories. Discrete variables are often defined by alpha characters, such as blood type (A, B, O, AB), or numerically as integers such as muscle grades (0, 1, 2, 3, 4, 5).
In both cases there are distances between the values that are neither equal nor defined. If the dependent variable "pain" was graded as none=0, slight=1, moderate=2, severe=3, or unbearable=4, then values between the discrete values are undefined and meaningless.
However, continuous variables such as age, height, weight, blood pressure or force are infinitely divisible. Continuous variables are always numeric and are on a real scale with clearly defined subdivisions (4). The choice of what statistical test to apply to data depends on whether the variables are discrete or continuous. A judgment error on this point will result in flawed conclusions.
The ensuing discussion focuses on the differences and restrictions one must be aware of when defining the variables used to answer a research question.
Variables with the simplest characteristics are those in the nominal category. Webster defines nominal as: "... consisting of giving a name ... in name only, not in fact ... very small compared to expectations ... hardly worth mention ..." (3). In the context of research, nominal data commonly identifies groups of two members, e.g., male or female, left or right, yes or no, spasticity or no spasticity, young or old, normal or abnormal, etc. There is no rank or order, no better or worse, and no mathematical operations that can be applied to numbers used in grouping in this manner (2).
Because of the prolific use of computers in managing research data and because computers prefer numbers to alpha characters, it is common to code data as 1=male, 2=female or 1=yes, 2=no. One must remember that these numbers only reflect codes and that the item they represent is not arithmetic. There is no such thing as an "average sex" of 1.5 or an "average yes/no" of 1.5, which is what you would get if you derived the mean of the coded "sex" or "yes/no" responses from equally sized groups. You can see the ridiculous conclusion that can be drawn by inappropriate operation with numbers that do not follow the laws of mathematics.
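The coding pitfall is easy to reproduce. The following is a minimal sketch, assuming the 1=male, 2=female coding described above; the list of coded responses is invented for illustration. The software computes a mean without complaint, but only the tally is meaningful.

```python
from collections import Counter
from statistics import mean

# Coding scheme from the text: the numbers are labels, not quantities.
SEX_CODES = {1: "male", 2: "female"}

# Invented illustration values, not study data.
coded_responses = [1, 2, 2, 1, 2, 1]

# The computer will happily average the codes, but the result is meaningless.
misleading_mean = mean(coded_responses)

# The appropriate summary for nominal data is a count per category.
counts = Counter(SEX_CODES[code] for code in coded_responses)

print("Meaningless 'average sex':", misleading_mean)
print("Counts per group:", dict(counts))
```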
Grouping subjects in nominal categories is simply a way of separating subjects into two groups. It also facilitates record keeping and tallying by assigning arbitrary numbers to identify groups. This is the first level of measurement.
The next step in characterizing data is that of the ordinal scale (1,8,9). Webster defines ordinal as "... denoting order ... as, the ordinal numbers, first, second, third." This is probably the most misused of all levels of measurement (9). In an ordinal scale, a hierarchy is ascribed to data and listed from most to least, worst to best, highest to lowest, etc. The distance from group to group is not defined (2,8,9).
For example, in the previously described pain scale with 5 possible levels, it is not possible to equate the difference between "no" and "slight pain" to the difference between "severe" and "unbearable pain." These descriptors are subject to a wide range of interpretations.
Many researchers err by treating numbers assigned to discrete categories as though they were arithmetic quantities. For example, if the variable of blood type is being measured, it would be appropriate to group people according to blood type. Assume that the researcher, either for convenience or out of necessity for computer data entry, assigned numbers to those blood types.
It would not be appropriate to perform mathematical operations such as adding, subtracting and averaging of those numbers because it is meaningless to compute a blood-type group that is more or less than A. It is either type A, B, O or AB. There simply is no such thing as an average blood type, but there is a most common blood type. When numbers are assigned to discrete variables, care must be taken not to forget the type of data.
How often have we read about the "average" spasticity score when the original data were mild, moderate or severe (with scores of 1, 2 or 3 assigned), or the "average" muscle grade when the tested data were zero, trace, poor, fair, good, normal (with scores of 0, 1, 2, 3, 4 or 5 assigned)? In each case, numbers are assigned to discrete groups on an ordinal scale, leading to inappropriate data analysis, flawed logic and invalid inferences (9).
Unfortunately, average (incorrect) spasticity scores and muscle grades are too frequently reported. This mathematical limitation is especially important because most data are managed with statistical computer programs. Once the numerical representations of the ordinal scale groups are entered, the computer has no way of knowing the difference, and the results, however tempting, are invalid. While it would be correct to tally the counts in each group to arrive at the conclusion that "most patients had 3+ quadriceps," it is incorrect to arrive at the conclusion that "... the average quadriceps grade was 3.7."
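A tally of this kind might look like the sketch below. It assumes the 0 to 5 muscle-grade coding listed earlier; the grades themselves are invented for illustration, and plus or minus grades are left out for simplicity. The point is that the report names the most frequent category rather than an arithmetic mean of the codes.

```python
from collections import Counter

# Coding from the text: ordered labels with undefined distances between them.
GRADE_LABELS = {0: "zero", 1: "trace", 2: "poor", 3: "fair", 4: "good", 5: "normal"}

# Invented illustration values, not study data.
quadriceps_grades = [3, 4, 3, 5, 3, 4, 2]

# Count how many patients fall into each grade.
tally = Counter(quadriceps_grades)

# Report the most frequent grade by name; no averaging of the codes.
most_common_code, patient_count = tally.most_common(1)[0]
most_common_label = GRADE_LABELS[most_common_code]

print(f"Most patients ({patient_count} of {len(quadriceps_grades)}) "
      f"had a '{most_common_label}' quadriceps grade")
```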
To visualize the correct variable type (discrete vs. continuous), draw a horizontal line. If you can use a ruler or some other scale to place the next value of the group, then it is continuous. If you have trouble with the exact placement of the next value on the line then it is discrete. For example, suppose the T under the line in Figure 1 represents the manual muscle test grade of "trace." How far away do you put the "F" (fair)?
Since there is no well-defined distance between manual muscle test grades, you cannot mathematically manipulate them as though they were on a continuous linear arithmetic scale. A strong warning about incorrect usage of ordinal scales was issued by Johnston et al. in discussing current methodology used to evaluate rehabilitation outcomes (10).
Often, ordinal data are summed, making them meaningless. Depending on the categories being evaluated, it is not uncommon to find a list such as bathing, feeding, catheter care, wheelchair skills and walking used to evaluate rehabilitation outcomes.
If an ordinal scale from 1 to 10 is used to evaluate or grade a list of patient skills, it is not difficult to see that a change from feeding with an orthosis to feeding with no orthosis has no linear relationship to a change from wheelchair to walking abilities. Worse yet, if such scores are summed there is great temptation to average them, which has no mathematical validity and leads to erroneous and often unfortunate interpretation of results affecting factors such as rehabilitation strategies and reimbursement (8).
This is not to imply that nominal or ordinal scales cannot be used. While it is beyond the scope of this article to discuss the statistics, proper use of discrete nominal and/or ordinal data is not only acceptable but useful and necessary. Many published and commonly used ordinal scales are routinely applied by allied health professionals to define functional capacities (11). Examples include perceived exertion scales, muscle tone, static balance, walking performance, etc. (11). Misuse in data management when applying statistical analyses is surprisingly frequent.
An example of proper treatment of ordinal data was demonstrated by Roland and Morris in their research regarding the development of a reliable and sensitive measure of disability in patients with low back pain (12). Their 24-item questionnaire was constructed from a previously published scale.
Examples of the questions to which respondents answered "yes" or "no" were, "I walk more slowly than usual because of my back," and "because of my back pain, I am more irritable and bad tempered with people." The "yes" responses were tallied to find the total "yes" count or disability score. Thus, an individual score could range from 0 (no disability) to 24 (severe disability). This questionnaire was found to be simple, to be applicable to a mixed population, to have excellent short-term repeatability, to compare well with self-rated measures of pain, and to be unrelated to age or social class (12).
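The scoring rule described here amounts to counting "yes" answers. The following sketch shows that tally under stated assumptions: the function name `disability_score` and the representation of answers as "yes"/"no" strings are hypothetical conveniences, not part of the published questionnaire.

```python
def disability_score(responses):
    """Count the "yes" answers on a 24-item yes/no questionnaire.

    `responses` is a list of 24 strings, each "yes" or "no"
    (an assumed representation, chosen for this illustration).
    """
    if len(responses) != 24:
        raise ValueError("expected exactly 24 responses")
    if any(r not in ("yes", "no") for r in responses):
        raise ValueError('each response must be "yes" or "no"')
    # The score is a count of items, so it ranges from 0 to 24.
    return sum(1 for r in responses if r == "yes")

# Invented illustration values: 5 "yes" answers and 19 "no" answers.
example = ["yes"] * 5 + ["no"] * 19
print(disability_score(example))
```

The result is a count of endorsed items, not an average of category codes.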
The next level of measurement is the interval scale. According to Webster, the definition of interval includes "... a space between things ... a period of time between any two points or events ... the extent of difference between two qualities" (3). The interval scale has two conditions that differentiate it from either the nominal or ordinal scales in that there is a defined unit of measure and a zero point. However, both are arbitrary (7). The important characteristic of interval data is that the distance between any two units is known and can be reproduced (1).
Examples of interval data include the measurement of temperature in degrees (Celsius or Fahrenheit) and calendar time. The interval scale does not have a fixed zero, demonstrated by the fact that the two temperature scales have zero at two points on their respective scales (2). Fresh water freezes at 0° on the Celsius scale and at 32° on the Fahrenheit scale. The temperature at which salt water freezes is arbitrarily designated as 0° on the Fahrenheit scale (13). With interval data it is legitimate to add the data and divide by the number of subjects to arrive at a mathematical mean for the group (2).
For example, it would be permissible to measure temperature on consecutive days, obtain a sum of the temperatures and divide by the number of days to arrive at an average or mean temperature. A temperature probe is often used to objectively define the level of pathology in the neuropathic foot (14). It would be correct to measure the temperatures of the involved and noninvolved portions of the feet, obtain the differences and calculate an average difference for a group of similarly involved patients.
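The foot-temperature comparison can be sketched as follows. The readings are invented for illustration and are assumed to be in degrees Celsius. Because the distance between units on an interval scale is defined, both the differences and their mean are legitimate.

```python
from statistics import mean

# Invented illustration values, not study data.
# Each pair is (involved, noninvolved) foot temperature in degrees Celsius.
foot_temperatures = [
    (31.0, 29.0),
    (32.5, 29.5),
    (30.0, 29.0),
]

# Differences are meaningful on an interval scale because units are equal.
differences = [involved - noninvolved for involved, noninvolved in foot_temperatures]

# Averaging the differences across similarly involved patients is also valid.
mean_difference = mean(differences)

print("Per-patient differences:", differences)
print("Mean difference:", mean_difference)
```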
The final continuous variable type is referred to as ratio and is defined by Webster as: "... a fixed relation in degree or number between two similar things ... the quotient of one quantity divided by another" (13). The major difference between the two data classifications of interval and ratio is that ratio data has an absolute zero (2). Ratio data is the most frequently used class by healthcare professionals who deal with patients' physical attributes. Ordinal and interval data are most commonly encountered by professionals who work within psycho-social arenas.
Examples of ratio data include height (inches, centimeters, meters, etc.), weight (pounds, kilograms, newtons, etc.), velocity, distance (e.g., circumference or distance walked), volume, heart rate, force, torque, etc. With this latter classification, all mathematical operations are valid (13).
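One way to see what an absolute zero buys is to change units and check whether a quotient survives. The sketch below uses invented values and hypothetical helper names: a height ratio is the same in inches and centimeters, whereas the apparent "ratio" of two temperatures changes between Celsius and Fahrenheit because each scale places zero at an arbitrary point.

```python
import math

def inches_to_cm(inches):
    """Convert a length in inches to centimeters (ratio scale, true zero)."""
    return inches * 2.54

def celsius_to_fahrenheit(celsius):
    """Convert a temperature (interval scale, arbitrary zero)."""
    return celsius * 9 / 5 + 32

# Invented illustration values.
tall, short = 60.0, 30.0   # heights in inches
warm, cool = 20.0, 10.0    # temperatures in degrees Celsius

# Ratio data: the quotient is unchanged by a change of units.
height_ratio_in = tall / short
height_ratio_cm = inches_to_cm(tall) / inches_to_cm(short)

# Interval data: the quotient depends on where the scale puts zero.
temp_ratio_c = warm / cool
temp_ratio_f = celsius_to_fahrenheit(warm) / celsius_to_fahrenheit(cool)

print("Height ratio (in, cm):", height_ratio_in, height_ratio_cm)
print("Temperature 'ratio' (C, F):", temp_ratio_c, temp_ratio_f)
```

Statements such as "twice as tall" are therefore meaningful, while "twice as warm" in degrees Celsius or Fahrenheit is not.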
In Figure 2, suppose the number 52 above the line is the height in inches of your subject. Where would you place the 53-inch mark on the line? Since height is a continuous variable, the 53-inch mark would be placed one inch to the right of the 52-inch mark. Understanding this concept is crucial when analyzing data and reporting results.
Depending on the variable type and the goal of your research, you will have different types of variables and levels of measurement available (2). To enable proper analysis and meaningful results, it is important to understand the differences and mathematical restrictions that apply.
Brenda Rae Lunsford, MS, PT, is a visiting assistant professor in the School of Physical Therapy at Texas Woman's University in Houston.