RESEARCH FORUM--The Research Sample,
Part I: Sampling
Thomas R. Lunsford, MSE, CO
Brenda Rae Lunsford, MS, MAPT
ABSTRACT
The cost of studying an entire population to answer a specific question is usually prohibitive in terms of time, money and
resources. Therefore, a subset of subjects representative of a
given population must be selected; this is called sampling.
The concepts involved in selecting subjects to represent the
larger population are presented. Sampling errors and associated determining factors are reviewed.
Definitions of the research populations, including target
and accessible groups, are given. The inclusion and exclusion
criteria required to refine the accessible population to a researchable subgroup are explained, and an example is provided. The two types of sampling methods, probability and
nonprobability, are defined and presented with their respective types. Probability sampling includes simple random
sampling, systematic sampling, stratified sampling, cluster
sampling and disproportional sampling. Nonprobability
sampling includes convenience sampling, consecutive sampling, judgmental sampling, quota sampling and snowball
sampling.
The goals and concepts related to recruitment are reviewed with application to survey and experimental research. Three steps are suggested for obtaining an appropriate research sample: (1) clearly define the target population,
(2) define the accessible population, and (3) define the steps
and effort that will be employed to recruit subjects for study.
Introduction
The first two questions most researchers ask once a research
project has been defined are, "How many subjects will I
need to complete my study, and how will I select them?"
This article, "Part I," will attempt to address the issues
related to selecting subjects for a research project. "Part
II," which will be published in the Fall JPO, will present in
detail the factors relevant to determining sample size.
In clinical research it would be ideal to include the entire
population when conducting a study; this enables a generalization to be made about the results to the population as a
whole. In some cases this has been possible, such as when the
1976 Philadelphia Legionnaires' disease epidemic was studied. However, in most cases, the population in question is too
large or too spread out over time and distance to allow for
measuring or evaluating each member of the population.
Researchers have developed a number of techniques
in which only a small portion of the total population is sampled and the results and conclusions
are then generalized to the entire population. There are some distinct
advantages and disadvantages in using samples. Advantages include that sampling involves a smaller number of
subjects and is more time efficient, less costly and potentially more accurate (since it is more feasible to maintain
control over a smaller number of subjects). Disadvantages
include potential bias in the selection of subjects, which
may lead to error in interpretation of results and decrease
in ability to generalize the results beyond the subjects actually studied (1-3).
Cox and West describe a population as a well-defined
group of people or objects that share common characteristics (1). All immigrants from Germany or all patients with
left hemiplegia are examples of well-defined groups with
common characteristics. A population in a research study is
a group about which some information is sought. Most researchers cannot include all members of the population in
their studies and must resort to limiting the number of subjects to only a sample from the population.
A sample is a small subset of the population that has been
chosen to be studied (1,2). The sample should represent the
population and have sufficient size so a given innovative orthosis or prosthesis can be subjected to a fair statistical
analysis. Unfortunately, all samples deviate from the true
nature of the overall population by a certain amount due to
chance variations in drawing the sample's few cases from
the population's many possible members. This is called sampling error and is distinguished from non-chance variations
due to determining factors. Determining factors include
items such as biased sampling procedures, effects of independent variables, research conditions and other causal
agents or circumstances (2).
One of the most famous cases of biased sampling was the
Literary Digest poll conducted before the U.S. presidential election
of 1936 (2,3). Two million ballots were mailed out, received
back and tabulated; the results confidently predicted the
easy election of Landon (57 percent) over Roosevelt. Unfortunately, the names on the mailing list were taken from
telephone directories and lists of automobile owners. At that
time, only people of certain wealth had telephones and/or
drove cars, and there was a strong correlation between those
with wealth and a preference for Landon. The larger mass of
people without cars or telephones voted for Roosevelt, giving him the largest margin of victory in history at that time.
This large error in prediction is a prime example of the consequences that biased sampling can produce.
Many clinical studies do not achieve their intended purposes because the researcher is unable to enroll enough
subjects. Therefore, at some point in planning a study consideration should be given to sample size. While the number of subjects studied is important, even more important
in a study is that the subjects accurately represent the larger population. In contrast to the previous example where
more than two million ballots gave a biased and erroneous
result, polls taken by Gallup and Harris in 1968, in which
only 2,000 voters were sampled, predicted that Richard Nixon
would win with 41 and 43 percent of the vote, respectively. Nixon
won with 42.9 percent (2).
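The contrast between a large biased sample and a small random one can be illustrated with a short simulation. This is only a sketch: the electorate below is invented, and the 25 percent "wealthy" share and the 70/30 preference split are assumptions chosen for illustration, not historical figures.

```python
import random

rng = random.Random(1936)  # fixed seed so the sketch is repeatable

# Hypothetical electorate: each voter is (wealthy, prefers_landon).
population = []
for _ in range(100_000):
    wealthy = rng.random() < 0.25
    prefers_landon = rng.random() < (0.70 if wealthy else 0.30)
    population.append((wealthy, prefers_landon))


def landon_share(voters):
    """Fraction of the given voters who prefer Landon."""
    return sum(prefers for _, prefers in voters) / len(voters)


true_share = landon_share(population)

# Large but biased sample: drawn only from the wealthy "directory" frame.
frame = [voter for voter in population if voter[0]]
biased_sample = rng.sample(frame, 20_000)

# Small random sample drawn from the whole population.
random_sample = rng.sample(population, 2_000)

print(f"true share:    {true_share:.3f}")
print(f"biased sample: {landon_share(biased_sample):.3f}")
print(f"random sample: {landon_share(random_sample):.3f}")
```

Under these assumptions the 20,000-voter biased sample misses the true share by a wide margin, while the 2,000-voter random sample lands close to it.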
Sampling Concepts
Sampling, target population, external validity
Since it will not be practical to recruit every human with
spasticity for this study, it is necessary to define an accessible population. The accessible population is a subset of the
target population that reflects specific characteristics with
respect to age, gender, diagnosis, etc., and who are accessible for study (4).
Therefore, in the AFO footplate example, it is critical to
define or characterize the target population before a sample of subjects can be defined. For example, will all patients
with spasticity be included? Is the question to be limited to
children with diplegia, secondary to cerebral palsy, adults
post CVA or adults and children post brain injury? This
narrowing and refining of the research question is useful
since it more clearly defines the target as well as accessible
populations and has direct impact on the external validity
of the inferences to be drawn at the conclusion of the study.
Once the specific clinical and demographic characteristics of the accessible population are defined, it is important
to consider the geographic and time constraints with which
both the researcher and potential subjects will have to
contend.
Will the study intervention require more than one visit to
the clinic, laboratory or office? If so, how far can subjects be
expected to travel, and what means of transport are required
to get them there? If the research is to be conducted in a
large metropolitan area with good public transportation, repeated trips and distance may not be a problem. Transportation logistics can be an insurmountable problem if not
planned for and resolved in designing the research plan.
In the case of the AFO footplate study, one constraint
that might be placed on the accessible subjects is that they
live within 30 minutes' travel by car and are able to commit
to four visits within a one-month period of time. This leads
to the next major consideration in the sampling process,
defining inclusion and exclusion criteria of the accessible
population.
Inclusion Criteria
In the above example of specific AFOs for patients with
spasticity, it is important to consider the research question
and include factors that will enable a homogeneous selection of subjects, e.g., age, gender, diagnosis, degree of spasticity, muscle groups affected, etc. It may be determined
that the specific variables under study (footplate contours
and spasticity) are more likely to show an effect in the
growing child than in the adult. Therefore, one inclusion
criterion may be a specific population of children whose
ages encompass the growing years.
Since patterns of tone are different depending on diagnosis, it may be desirable to specify cerebral palsy and not
include other diagnoses. Also, since loads and deforming
forces on feet are different if the subject has bilateral versus unilateral involvement, it may be desirable to include
only those subjects with hemiplegia. Therefore, the inclusion criteria for this study may be children between the
ages of 2 and 15 (the growing and walking years) diagnosed with
cerebral palsy with spastic hemiplegia. Also, subjects who
live within a specific distance, have convenient and affordable transportation and are able to commit to a specific
number of visits may be included. A final inclusion criterion may be parental consent and support.
Exclusion Criteria
Exclusion criteria are applied to subjects who generally meet
the inclusion criteria but must be excluded because they cannot complete the study or possess unique characteristics that
may confound the results. For example, it may be necessary
to exclude subjects with spastic hemiplegia secondary to
cerebral palsy who were premature at birth or who have additional medical problems that may affect their outcomes. A
child with epilepsy may be taking medication that can also
affect his/her muscle tone, which could confound study results. If the ability to walk is an important dependent variable of the study, then subjects who do not walk or who have
been walking for less than one year may be excluded. Subjects who may have unreliable sources of transportation or
noncompliant parents also may need to be excluded.
An important ethical consideration is the willingness of
the subject to participate. In the instance of a study of spastic children, parental permission must be obtained or the
subject must be excluded. Also, withholding one treatment
to evaluate another may pose a difficult ethical consideration. The exclusion criteria, considering all of the above,
may result in sampling guidelines that exclude children
who are less than 2 or more than 15 years of age, are on
medication that affects muscle tone, do not walk or have
walked less than one year, have inadequate transportation
and/or whose parents will not agree to participate (see Table 1).
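The inclusion and exclusion criteria for the AFO footplate example can be read as a screening rule. The following is a minimal sketch, assuming a hypothetical `Candidate` record; the field names and the exact diagnosis string are illustrative and are not part of any published protocol.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical screening record; field names are illustrative only."""
    age_years: float
    diagnosis: str
    travel_minutes: int              # travel time to the study site by car
    can_commit_four_visits: bool     # four visits within one month
    parental_consent: bool
    on_tone_altering_medication: bool
    years_walking: float             # 0 if the child does not walk
    reliable_transportation: bool


def meets_inclusion(c: Candidate) -> bool:
    """Inclusion: age 2 to 15, CP with spastic hemiplegia, nearby, consenting."""
    return (
        2 <= c.age_years <= 15
        and c.diagnosis == "cerebral palsy, spastic hemiplegia"
        and c.travel_minutes <= 30
        and c.can_commit_four_visits
        and c.parental_consent
    )


def meets_exclusion(c: Candidate) -> bool:
    """Exclusion: confounding medication, too little walking, no transport."""
    return (
        c.on_tone_altering_medication
        or c.years_walking < 1
        or not c.reliable_transportation
    )


def is_eligible(c: Candidate) -> bool:
    """A candidate is eligible when included and not excluded."""
    return meets_inclusion(c) and not meets_exclusion(c)
```

Keeping the two checks separate mirrors the text: exclusion criteria are applied only to subjects who already meet the inclusion criteria.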
Sampling Methods
The process of defining a representative subpopulation to
study is called sampling. There are two main categories of
sampling, probability and nonprobability.
Probability Sampling
The first potential problem in any system of selection is
bias. Bias can occur easily as previously described in the
Roosevelt and Landon election of 1936, and it also may be
related to researcher preference. Patient volunteers can introduce bias since they tend to be healthier and produce results different from subjects chosen randomly. To avoid selection bias it is important to guarantee that each of the
candidates for inclusion in the study has an equal opportunity for selection. That guarantee requires subjects to be selected at random, or that randomization be employed. Randomization is important for two reasons: First, it provides a
sample that is not biased, and second, it meets the requirements for statistical validity (2). Several methods exist that
can be used to randomly select subjects.
Simple random sampling can be accomplished using an
array of random numbers (1) (see Table 2). In this table the
numbers are grouped into series of five digits. This grouping method is for ease of presentation only; the same numbers could be grouped in twos or threes. For grouping by
threes, the first column would contain 104, 803, 757, 042, etc.
How the random numbers are organized depends on the
size of the population to be studied. Once the random
numbers are organized into columns and rows, the researcher must decide where to start in the table and in what
direction to proceed.
Suppose it has been decided that there are 900 patients
(i.e., the accessible population) with spasticity from which
to draw a sample for the AFO footplate study. From this accessible population it is desired to randomly select 90 subjects for the study. The first step is to assign a number from
1 to 900 to each member of the accessible population. Next,
the starting number in the table must be selected. (An easy
way to do this is to close your eyes and place the point of a
pencil on the table.)
Assume that the number selected is 88974 (column 3,
row 3), and it has been decided to use the last three digits
to determine which subject is selected first. In this case, the
last three digits are 974. However, the accessible population
numbers range from 1 to 900, and 974 cannot be used. Arbitrarily proceeding downward, the next random number is
48237; therefore, patient 237 will become the first subject
selected for the study. The next subjects would be numbers
306, 301, 802, 308, etc., until the entire sample of 90 subjects
is selected. Obviously, a larger random number table would
be required to select 90 subjects.
It is also possible to use all digits in the random number
table. Beginning again with the five digit number 88974 and
progressing downward, the first subjects selected would be
889, 744, 823, 725, 306, 012, 802, etc., until again all 90 subjects are selected for the sample.
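The last-three-digits procedure can be sketched in a few lines of Python. Only 88974 and 48237 appear in full in the text; the later entries are entered here by their last three digits alone, which is an assumption made so the example can run. The second half shows the software equivalent, `random.sample`, which draws without replacement from the numbered population.

```python
import random


def pick_from_table(entries, population_size, sample_size):
    """Select subject numbers using the last three digits of each table entry."""
    chosen = []
    for entry in entries:
        subject = entry % 1000  # keep the last three digits
        # Skip numbers outside 1..population_size and any repeats.
        if 1 <= subject <= population_size and subject not in chosen:
            chosen.append(subject)
        if len(chosen) == sample_size:
            break
    return chosen


# 88974 and 48237 come from the text; the rest are last-three-digit stand-ins.
entries = [88974, 48237, 306, 301, 802, 308]
first_subjects = pick_from_table(entries, population_size=900, sample_size=5)
print(first_subjects)  # 974 is out of range, so selection starts at 237

# Software equivalent: 90 distinct subject numbers from 1 to 900.
rng = random.Random(2024)  # seed chosen arbitrarily for repeatability
sample = rng.sample(range(1, 901), 90)
```

A printed table and a seeded generator serve the same purpose: the selection is fixed by a process the researcher cannot steer.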
Systematic random sampling is a method frequently chosen for its simplicity because it is a periodic process (1,2,4-6). This method could be carried out by selecting the first
subject randomly as described above and then selecting
every second or third subject who comes to the office and
meets the inclusion/exclusion criteria. This method, however, is problematic in that other staff who know of the
method can manipulate patient appointments to assure inclusion. There is no advantage to this method over simple
random sampling (5).
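A minimal sketch of systematic sampling follows, assuming the accessible population is available as an ordered list. The function name and the interval of 10 are illustrative choices.

```python
import random


def systematic_sample(subjects, interval, rng):
    """Pick a random starting point, then take every interval-th subject."""
    start = rng.randrange(interval)  # random start within the first interval
    return subjects[start::interval]


subjects = list(range(1, 901))  # the 900 numbered patients
selected = systematic_sample(subjects, interval=10, rng=random.Random(7))
print(len(selected))  # 90 subjects, spaced 10 apart
```

Only the starting point is random here; once it is known, every later selection is predictable, which is the weakness the text describes.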
Stratified sampling is a method by which subjects are
grouped according to strata such as age, gender or diagnosis (left hemi vs. right hemi), etc. (1,2,4-6). Using this
method, subgroups of interest can be defined and equal
numbers of subjects sampled for each group. For example,
if there was interest in the functional outcomes for use of a
certain type of AFO footplate in patients post brain injury,
then it would be useful to define age as a subgrouping since
age often relates to the functional challenge imposed on
various orthoses. For example, a young child may engage in
a lot of crawling, jumping, running, etc., when wearing an
AFO whereas a senior citizen is more likely to walk cautiously. This permits comparison of the subgroups, such as
children (5 to 12), teens (13 to 19), adults (20 to 55) and seniors (56 and up). Using this method, subjects would be recruited randomly for each subgroup, and, although each
subgroup would have a different age range, the general inclusion/exclusion criteria would apply to each of the subgroups.
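The age strata above can be sketched as follows. This is an illustration under assumptions: subjects are represented as hypothetical `(subject_id, age)` pairs, and equal numbers are drawn at random from each stratum.

```python
import random
from collections import defaultdict


def age_stratum(age):
    """Map an age in years to the subgroup labels used in the text."""
    if 5 <= age <= 12:
        return "children"
    if 13 <= age <= 19:
        return "teens"
    if 20 <= age <= 55:
        return "adults"
    if age >= 56:
        return "seniors"
    return None  # under 5: outside every stratum


def stratified_sample(people, per_stratum, rng):
    """Group people by stratum, then draw per_stratum at random from each."""
    strata = defaultdict(list)
    for subject_id, age in people:
        label = age_stratum(age)
        if label is not None:
            strata[label].append(subject_id)
    # rng.sample raises ValueError if a stratum has fewer than per_stratum members.
    return {label: rng.sample(members, per_stratum)
            for label, members in strata.items()}


# Hypothetical accessible population with ages 5 through 74.
people = [(i, 5 + i % 70) for i in range(700)]
groups = stratified_sample(people, per_stratum=10, rng=random.Random(3))
```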
Cluster sampling is a method used to enable random
sampling to occur while limiting the time and costs that
would otherwise be required to sample from either a very
large population or one that is geographically diverse
(1,2,4,5). An example of how this might be used is as
follows.
To obtain as many subjects as possible and to eliminate
any potential bias inherent in selecting subjects from one
specific clinic, the researcher may wish to select subjects
from all of the hospitals and outpatient clinics within a given area. However, this would be too costly and time-consuming. Therefore, use of the cluster approach is appropriate. Using this method, a one- or two-level randomization
process is used. First, each hospital and outpatient clinic
that meets the inclusion criteria is identified. Second, one of
the selection methods described above is used to randomly
select a portion of those facilities. All of the available subjects from the randomly selected facilities could be included, or subjects from each of the randomly selected facilities
could themselves be randomly selected. The important element in this process is that each of the facilities and each of
the subjects have an equal opportunity to be chosen, with
no researcher or facility bias.
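The two-level process can be sketched as follows, assuming a hypothetical mapping from facility name to its list of eligible patients. Passing `None` for the second level takes every available subject from each selected facility.

```python
import random


def cluster_sample(facilities, n_facilities, n_per_facility, rng):
    """Randomly pick facilities, then all or a random subset of their patients."""
    # Level 1: random selection of facilities (sorted for repeatability).
    chosen = rng.sample(sorted(facilities), n_facilities)
    result = {}
    for name in chosen:
        patients = facilities[name]
        if n_per_facility is None or n_per_facility >= len(patients):
            result[name] = list(patients)  # take every available subject
        else:
            # Level 2: random selection of subjects within the facility.
            result[name] = rng.sample(patients, n_per_facility)
    return result


# Hypothetical data: 10 clinics with 20 eligible patients each.
facilities = {f"clinic_{i}": list(range(i * 100, i * 100 + 20)) for i in range(10)}
clusters = cluster_sample(facilities, n_facilities=3, n_per_facility=5,
                          rng=random.Random(0))
```

If facilities differ greatly in size, drawing the same number from each gives patients at small facilities a higher chance of selection; that refinement is not handled in this sketch.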
Disproportional sampling is a method that addresses the
difficulty encountered with stratified samples of unequal
size (2). Suppose, for example, it is desired to conduct a survey of the members of the American Academy of Orthotists and Prosthetists. Also suppose an educational grant
has been secured that will support study of only 200 members (subjects) and that the available population in the
Academy is 2,000 individuals. In the available population of
2,000 there are 1,700 males and 300 females. Since the 200
subjects needed for the study comprise only 10 percent of
the available population, then how many of each gender
are required? Simple proportioning suggests that 17/20 (85
percent) of the 200 be males and 3/20 (15 percent) be females. This would result in approximately 170 males and 30
females. The small number of females probably would not
provide adequate representation for drawing conclusions
about the entire membership.
One way of dealing with this situation is to use a simple
random sample and leave the proportional representation
to chance; however, unless the sample is unusually large,
the differential effect of gender will probably not be controlled (6). A disproportional sampling design will permit
random selection of Academy members of adequate size
from each category. For example, 100 males and 100 females may be selected. This sample of 200 cannot be considered random because each female has a much greater
chance (higher probability) of being chosen.
This approach creates an adequate sample size, but it
presents problems for data analysis because the characteristics of one group (in this case, the females) will be overrepresented in the sample. Fortunately, this effect can be
controlled by weighting the data so the males receive a proportionally larger mathematical representation in the
analysis of scores than the females.
Calculating proportional weights involves determining
the probability that any one male or female Academician
will be selected. Selecting 100 male Academicians involves a
probability of 100 out of 1,700, or 1 of 17 (1/17). The probability of any one female Academician being selected is 100
out of 300, or 1 of 3 (1/3). Therefore, each female has a probability of selection more than five times that of any male.
Next, the assigned weights are determined by taking the
inverse of these probabilities. The weight for male scores is
17/1 = 17, and that for females is 3/1 = 3. This means that
when the data are analyzed, each male's score will be multiplied by 17, and each female's score will be multiplied by
3. In any mathematical manipulation of the data, the total
of the males' scores would be larger than the total of the females' scores. Therefore, the proportion of each group is
differentiated in the total data set.
Because all Academy members in a group will have the
same weight, the average scores for that group will not be
affected; however, the relative contribution of these scores
to overall data interpretation will be controlled.
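The weighting arithmetic can be checked with a short sketch. The population and sample counts come from the example above; the group mean scores passed to `weighted_mean` are hypothetical values used only to show the effect of the weights.

```python
from fractions import Fraction

population = {"male": 1700, "female": 300}
sampled = {"male": 100, "female": 100}

# Probability that any one member of each group is selected.
selection_probability = {g: Fraction(sampled[g], population[g]) for g in population}

# The weight is the inverse of the selection probability.
weight = {g: 1 / selection_probability[g] for g in population}


def weighted_mean(group_means):
    """Overall mean with each group's scores scaled by its weight."""
    total_weight = sum(weight[g] * sampled[g] for g in group_means)
    weighted_total = sum(weight[g] * sampled[g] * group_means[g] for g in group_means)
    return weighted_total / total_weight


# Hypothetical mean scores: the unweighted average of these would be 3.0.
overall = weighted_mean({"male": 4.0, "female": 2.0})
print(weight["male"], weight["female"], overall)
```

With weights of 17 and 3, the 100 sampled males stand in for 1,700 members and the 100 sampled females for 300, so the overall mean moves toward the male group's mean, as it would in the full membership.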
Nonprobability Sampling
In the real world of clinical research true random sampling
is very difficult to achieve. Time, cost and ethical considerations often prohibit researchers from making the necessary arrangements and securing the necessary clearances,
for example, to obtain subjects from other facilities or
professional practices to test a hypothesis. Therefore, it is
often necessary to use other sampling techniques. These
techniques produce nonprobability samples in that the
sampling technique is not random (2,5).
With nonprobability sampling it is unlikely that the population selected will have the correct proportions because
all members of the population do not have an equal chance
of being selected. Therefore, it may not be assumed that the
sample fully represents the target, and any statement generalizing the results beyond the actual sample tested must
be stated with qualification.
Because the validity of statistical testing methods is
based on random selection of subjects, it is important when
using nonprobability sampling that random techniques be
employed to the maximum extent possible. Five nonprobability sampling techniques have evolved: convenience sampling, consecutive sampling, judgmental sampling, quota
sampling and snowball sampling.
Convenience sampling is probably the most commonly
used technique in clinical research today (1,2,4,5). With
convenience sampling, subjects are selected because of
their convenient accessibility to the researcher. These subjects are chosen simply because they are the easiest to
obtain for the study. This technique is easy, fast and usually the least expensive and troublesome. The famous sample description of "10 healthy young men" almost
certainly refers to 10 male medical, prosthetic/orthotic or therapy
students who have volunteered to be subjects for a study.
The criticism of this technique is that bias is introduced into the sample. Volunteers always are suspect because they
tend to be the healthiest, strongest, fastest, most skilled,
etc. (7). They often volunteer because they like to show off
or are competitive in nature and like to be tested. Volunteers may not be representative of the larger overall
population.
Another common example of a convenience sample occurs when subjects are selected from the clinic, facility or
educational institution at which the researcher is employed. Bias is likely to be introduced using this sampling
technique because of the methods, styles and preferences
of treatment employed at a given facility.
Consecutive sampling is a strict version of convenience
sampling where every available subject is selected, i.e., the
complete accessible population is studied. This is the best
choice of the nonprobability sampling techniques since by
studying everybody available, a good representation of the
overall population is possible in a reasonable period of
time (5).
Even though consecutive sampling does not allow randomization of the original subject pool to be studied, every
effort should be made to randomize at all other levels. For
example, assume it is desirable to test two different prosthetic feet. Once the study pool of subjects is defined, the
assignment of prosthetic feet to subjects should be random.
If all subjects will be tested with each of the feet, the order
of testing should be randomized to remove as much bias as
possible in the testing procedures.
If every subject is tested wearing foot A first and foot B
second, foot B may prove to be the best foot, due to the
learning effect. The learning effect gives an advantage to
the subsequent items (prosthetic feet in this case) tested
because the subjects become more familiar with the procedures and protocol and develop experimental skill. If foot
B were truly superior and the testing was not random, its
beneficial effect would be vulnerable to challenge because
of the learning effect. The subjects become comfortable
with the testing procedure with foot A and simply perform
better the second time around with foot B. The results and
generalizability would be flawed.
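One way to randomize the order of testing is to shuffle the subjects and alternate the two possible orders, so that half wear foot A first and half wear foot B first. The sketch below assumes hypothetical subject identifiers; this counterbalanced scheme is one option among several, not the only acceptable one.

```python
import random


def assign_test_orders(subject_ids, rng):
    """Shuffle subjects, then alternate A-first and B-first test orders."""
    orders = [("A", "B"), ("B", "A")]
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)  # which subject gets which order is left to chance
    return {sid: orders[i % 2] for i, sid in enumerate(shuffled)}


test_orders = assign_test_orders(range(1, 11), random.Random(11))
a_first = sum(1 for order in test_orders.values() if order[0] == "A")
print(a_first)  # half of the 10 subjects start with foot A
```

Because any learning effect now benefits each foot equally often, a difference between the feet is harder to attribute to test order.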
Judgmental sampling, also called purposive sampling, is
another form of convenience sampling where subjects are
handpicked from the accessible population (2). This technique leaves much to be desired because of its inherent
bias. Subjects usually are selected using judgmental sampling because the researcher believes that certain subjects
are likely to benefit or be more compliant. For example, in
the study of prosthetic feet, athletic amputees might
be selected for the more athletic foot because they are
more likely than a sedentary or geriatric patient to benefit
from that foot.
Quota sampling is a nonprobability technique used to
ensure equal representation of subjects in each layer of a
stratified sample grouping (2). For example, in the study of
the orthotic impact on spasticity using different footplate
contour designs, assume there are four different designs,
and it is desired that randomization be applied as to which
subject gets which footplate to test.
Using Table 3, one method would be to assign subjects
consecutively to footplate designs I to IV for the first four
subjects (Round 1). The next round would assign subject 1
to footplate IV, subject 2 to footplate I, subject 3 to footplate II and subject 4 to footplate III, etc. In this manner
there are equal numbers of subjects for each insert tested,
and bias is managed as long as the subjects are assigned
consecutively with no manipulation by anyone familiar or
involved with the study. This allows control over the distribution of subjects across test situations and provides some
protection from bias even though the original set of subjects was not randomly selected.
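The rotation described for Rounds 1 and 2 can be written as a small function. Table 3 is not reproduced here, so continuing the same one-step shift in later rounds is an assumption.

```python
DESIGNS = ["I", "II", "III", "IV"]


def footplate_for(round_number, position):
    """Footplate design for the subject at a 1-based position in a 1-based round."""
    # Each new round shifts the starting design back by one place.
    index = (position - round_number) % len(DESIGNS)
    return DESIGNS[index]


for round_number in (1, 2):
    row = [footplate_for(round_number, p) for p in range(1, 5)]
    print(round_number, row)
```

Over any four consecutive rounds each position receives each design exactly once, which keeps the group sizes equal.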
Snowball sampling is a technique used to identify potential subjects when appropriate candidates for study are
hard to locate (2). For example, if locating an adequate
number of amputees becomes difficult, an amputee belonging to a local support group could be recruited to assist
in locating subjects willing to participate in a study. In other words, it is possible to have assistance from patients to
help identify people with similar disabilities or conditions
to assist in identifying subjects for study. This process is
known as snowballing or chain referral (2).
Recruitment
Once the decision to use a certain sampling approach has
been made, subjects must be recruited. The goal of recruitment is to obtain a sample large enough to enable valid statistical analysis and allow subjects to be selected in such a
manner as to avoid bias (4). Errors or problems in either
of these areas can be prevented with a research design
that employs controls and a carefully planned sampling
technique.
The chosen method of recruitment usually is based on
the type of study; for example, survey data collected via
questionnaire may be obtained by a direct person-to-person interview, telephone or mailed form. Experimental
research, such as for the AFO footplate study, requires
that subjects be able to commit time and transportation
to come to the study site and repeat this effort more
than once.
There often is an inverse relationship between the ease
of recruitment effort and the success in obtaining data. In
survey research, for example, direct personal effort in recruitment often is not employed; the recruitment method
frequently consists of obtaining a mailing list and
submitting questionnaires to the accessible population via
the mail. A frequent drawback in this type of recruitment effort is a very low response rate of 50 to 60 percent (7). Another disadvantage is that the researcher
loses all control over the actual data gathering. If the low
return is anticipated and an adequate number of questionnaires is sent, then problems caused by inadequate
data may be avoided; however, this does not help the loss
of control.
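Planning for an anticipated return rate is simple arithmetic: divide the number of completed questionnaires needed by the expected response rate and round up. The sketch below uses the 50 to 60 percent range quoted above; the target of 200 completed questionnaires is a hypothetical figure.

```python
def questionnaires_to_mail(completed_needed, response_rate_percent):
    """Smallest mailing size expected to yield completed_needed returns."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-completed_needed * 100 // response_rate_percent)


print(questionnaires_to_mail(200, 50))  # 400
print(questionnaires_to_mail(200, 60))  # 334
```

Planning around the lower end of the expected range is the more cautious choice.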
Conversely, subjects are more difficult to recruit when
more effort on their part is requested. For example, when
multiple visits are required, such as in the AFO footplate
example, recruiting subjects is a bigger chore for the researcher since subjects are asked to travel to the test site
and do so on more than one occasion. However, because
the researcher applies test conditions directly to the subject, not only is it probable that all necessary data will be
obtained, but control over the experiment and the data acquisition is maintained.
Once the accessible population has been defined, every
effort should be made to obtain subjects in the manner
planned. If a systematic random sampling method has been
chosen and a large proportion of the accessible subjects refuses to participate, then a bias error is introduced into the
study. In the case of subject refusal, bias is introduced since
the reason for their refusal is often shared. For example,
several subjects may refuse because the study seems physically too difficult; when this occurs, the researcher is left
with only subjects who do not think the effort requested of
them is too difficult. This implies that the remaining subjects may be more fit or healthy than those who refused.
This is a threat to external validity and affects the researcher's ability to generalize the results to the original
target population (3).
Recruitment techniques may include personal contact,
follow-up phone calls, incentives (such as paying subjects
for their time or parking), etc. Some researchers even make
home visits to potential subjects to explain the research and
its importance; others mail advertising brochures to make
participating seem exciting and important.
Language also may present a potential difficulty with recruitment. Therefore, a brochure in the appropriate foreign
language or a staff or volunteer who can translate or interpret the foreign language may be required. Subjects may be
recruited from the facility in which the researcher works or
is familiar, or special efforts may have to be made to contact other similar facilities to engage their permission to approach their patients.
Sometimes advance work can be done to assist the
recruitment process once the study is ready to begin.
Community groups, such as local churches, YMCA, youth
organizations, patient support groups and local business
groups such as Kiwanis and Elks, may be contacted for
support in identifying potential subjects. Depending on the
community impact, these groups may even invite a researcher to address their membership to explain the importance of their project to gain acceptance and willingness to participate.
Summary
The goals of sampling are to decrease time and money
costs, to increase the amount of data and detail that can be
obtained, and to increase accuracy of data collection by
preventing errors.
To accomplish these goals it is necessary to follow these
steps:
- Clearly define the target population to which the results will be generalized. For example, the AFO footplate
study could be targeted to children in the growing years
with flexible deformities or to adults with fixed deformities.
Very specific inclusion criteria that outline the desired demographic and clinical characteristics of the desired target
population are necessary.
- An accessible population representative of the target
must be defined by additional inclusion criteria with specific characteristics regarding the geographic, social and time
frames required for this subpopulation. For example, having transportation available, being English-speaking and
not being Christian Scientists could be inclusion criteria.
Also, exclusion criteria are developed in this step to avoid
any ethical problems and eliminate characteristics that may
invalidate the results. For example, if an ethical problem
may arise in denying treatment to one of the groups, an exclusion criterion might include excluding anyone already
on a treatment protocol for the clinical problem under
study.
- The sampling process must be defined well ahead of
subject selection whether it be a random (probability) or
nonrandom (nonprobability) approach, and the researchers must adhere to a specific technique for recruitment appropriate for that approach. The recruitment effort
must be vigorous enough to assure a large enough sample
to enable statistical validity and must minimize probability
of error and bias of selection.
THOMAS R. LUNSFORD, MSE, CO, is director of the orthotic
department at the Institute for Rehabilitation and Research in
Houston, Texas, and assistant professor of physical medicine and
rehabilitation at Baylor College of Medicine in Houston.
BRENDA RAE LUNSFORD, MS, MAPT, is visiting assistant
professor at Texas Woman's University School of Physical Therapy in Houston.
References:
1. Cox RC, West WL. Fundamentals of research for health professionals, 2nd ed. Ramsco Pub. Co.; 1986:29.
2. Portney LG, Watkins MP. Foundations of clinical research: Applications to practice. East Norwalk, Conn.: Appleton and Lange; 1993.
3. Dominowski RL. Research methods. New Jersey: Prentice-Hall; 1980.
4. Hulley SB, Cummings SR. Designing clinical research: An epidemiologic approach. Baltimore: Williams and Wilkins; 1988.
5. Currier DR. Elements of research in physical therapy, 2nd ed. Baltimore: Williams and Wilkins; 1984.
6. Schlesselman JJ. Case-control studies: Design, conduct, analysis. New York: Oxford University Press; 1982.
7. Isaac S, Michael W. Handbook in research and evaluation, 2nd ed. San Diego: Edits Pub.; 1990:189.