Outcome measurement is certainly not a new concept for therapists working in the field of pediatric prosthetics. Ongoing controversy regarding the cost-benefit of providing upper-limb prostheses to children has stimulated considerable interest and research in prosthetic use evaluation for many years now. This article, intended to provide historical background material, chronicles some of the influences and evolutionary stages and processes of functional outcome measure development specific to the pediatric, upper-limb prosthetic population. (J Prosthet Orthot. 2009;21:227–231.)
Lambert and Sciora1 are believed to have conducted, in 1959, one of the earliest formal studies to investigate the use of upper-limb prostheses for children. The authors designed a questionnaire for parents and used the results to evaluate prosthetic use for 65 juvenile patients. Based on the parental responses, the child's overall prosthetic use was graded as good, fair, or poor.
Over the next few years, observational-type assessment gained popularity, and clinicians (primarily therapists and physicians) began assembling tests to explore the functional abilities of those who used upper-limb prostheses. Typically, rating scales were designed to assess an individual's performance on a selected list of common activities of daily living.
As early as 1961, occupational therapist Ann Teska2 reported her experience in attempting to devise a performance test for the New York University prosthetic research team to evaluate the new APRL-Sierra No. 1 hand. This early test consisted of five play activities, and the child's ability was rated simply as excellent, satisfactory, or unsatisfactory.
One of the better-known tools to be developed in that era was the "Prosthesis Adjustment Scale" described by Brooks and Shaperman3 in 1965. Their objective was to see whether differences in overall patient performance could be distinguished. Each child was scored as 0, 1, or 2 in each of five categories: wearing pattern, operating skill, applied use, maintenance, and acceptance. Others, such as Bergholtz,4 Paciga et al.,5 and Baron et al.,6 developed their own individual rating scales in combination with a variety of activity or task lists as methods of assessing the use of prostheses for their respective studies.
A second approach was to use timed tests to evaluate performance. One example is the instrument used by Mendez7 for the United Kingdom trial (1978–1981) of the Swedish myoelectric hand for young children. It took the form of a group of six structured tasks, selected for use from training activities utilized in the occupational therapy program. The tasks were scored on achievement within a given time, and then a separate puzzle assembly task was timed from beginning to completion. The test was not designed to be used for subject comparison but only as a competence baseline for each child.
The "Prosthetic Performance Test" designed for use in the Ontario Crippled Children's Centre (OCCC) Pre-school Training Study, 1980–1982,8 incorporated both approaches. Seven activities were carried out in a standardized fashion, and the child's accomplishment and time taken for each activity were recorded. In addition, the complete session was videotaped and an outside evaluator later reviewed the tapes and provided a rating for each individual activity. The same procedure was used for a follow-up study three years later.9 Edelstein also used a similar, twofold observational approach (timed and rating scale) for performance comparison among children fitted with myoelectric and body-powered hands in a later study.10
Unfortunately, there does not appear to have been any further investigation of the psychometric properties of the study-specific assessment tools developed in those years. With the exception of the OCCC study, there is little evidence that they were ever used again for any follow-up activity.
Following the introduction of myoelectric prostheses for young children by Rolf Sorbye11 of Sweden in 1971, and subsequent trials in the United Kingdom7 and Canada,8 myoelectric prostheses became an increasingly popular choice for parents of children with amputations or limb deficiency. Clinic teams, pressured to justify the higher costs of fitting these devices, were forced to look for more specific and objective ways to evaluate the benefit of providing myoelectric prostheses, particularly if the practice was to be started at a very young age. Although it was commonly believed that there was a positive relationship between early fitting and higher prosthetic skill and that a highly cosmetic prosthesis contributed to enhanced self-image and improved self-esteem, these correlations had not been demonstrated empirically.
Recognizing the need for a more systematic approach, David Krebs organized a one-day consensus conference in April 1984, in conjunction with the Association of Children's Prosthetic Orthotic Clinics (ACPOC) Annual Meeting in Baltimore. The purpose of the meeting was to bring together therapists experienced in pediatric upper-limb amputee rehabilitation to begin the process of developing a test battery for assessment of prosthetic prehension performance among very young children. Nineteen therapists were invited to participate. Each was asked to submit a list of bimanual, repetitive tasks that could potentially be developed into test items and ultimately included in a battery of age-appropriate functional assessments. Participants were also asked to contribute references to help develop a comprehensive bibliography.
The day started with a series of invited papers listed below:12
Prehension and Developmental Theory: Review of Normal Skills Acquisition (Clarke).
Test and Measurement Theory Relevant to ULP Performance (Krebs).
Prehension and Motor Skills Testing in Child Upper Extremity Amputees: A Literature Review (Hubbard).
Comparison of Terminal Devices and Body-Powered/Myoelectric Prosthetic Systems Available for Pre-School Upper Limb Amputees (Celikyol).
The group then evaluated each of the suggested tasks (bimanual/repetitive) and estimated their age-appropriateness. In the course of the day, three age-specific test batteries (2–3 years, 4–5 years, 6–7 years) were developed to assess pediatric, below-elbow, prosthetic functional performance for children aged 2–7 years. It was recognized that the day's effort was only a first step and that the batteries would have to be used and empirically evaluated. However, it did demonstrate what could be achieved by bringing together an experienced group of therapists with a singular purpose.
A summary of the conceptual background, the introductory papers, the Symposium proceedings, and the standardized instructions for administering and scoring the tests was later published as a text, Prehension Assessment: Prosthetic Therapy for the Upper Limb Child Amputee.12 Regrettably, despite the initial enthusiasm, there was little follow-through activity. Although a number of therapists did trial the use of the test batteries, many did not feel that the timed scoring was well suited for pediatric evaluation, and there were no further follow-up meetings or validation efforts.
It is interesting though to note how little the issues faced today have changed from the ones discussed more than 20 years ago in this book's introductory chapter: "Prosthetic treatment options for pediatric upper limb amputees have expanded in recent years much more rapidly than has useful information on the relative merits of new prosthetic approaches. Clinical decisions have become increasingly complex as a result of lack of research into available treatment methods and prescription components. In addition, no reports of normative prosthetic prehensile function data exist to aid the therapist and other team members in determining appropriate training termination points, assessing treatment efficacy, predicting a child's future prosthetic performance or diagnosing developmental delays."12
The second and probably more significant milestone of the 1980s was the development of the "UNB Test of Prosthetics Function."13 Recognizing the need for assessment standardization, a group of researchers at the University of New Brunswick (UNB) decided to take on the project of designing a test to assess a child's ability to use a prosthesis. Motivating factors were the need to be able "to determine the optimal age of fitting, the most effective type and duration of training and the most advantageous type of prosthesis for youngsters." The development involved the participation of an international group of therapists (Elizabeth Sanderson, UNB, Canada; Linda Stelzer, Detroit Institute for Children, USA; Sheila Hubbard, Hugh MacMillan Medical Centre, Canada; Wally Farrell, Newington Children's Hospital, USA; Susan Clarke and Joanna Patton, CAPP, USA; Lotti Hermansson, Orebro Regional Hospital, Sweden; Elizabeth Hardy, Rosemary Flemons and Alicia Mendez, Queen Mary's, Roehampton, UK).
This group of experienced therapists provided "data from field trials with their clients and colleagues to establish valid test items, a meaningful rating scale, inter-rater reliability and inter-subtest reliability."13 A Rating Scale for Functional Evaluation previously developed by occupational therapist Barbara O'Shea14 was used as the basis for the measure developed for this test. The UNB Test of Prosthetics Function gained relatively widespread clinical acceptance and is still being used today.15 It is an observer-rated measure with a dual rating scale used to evaluate performance and spontaneity of use. It has been used primarily by clinical therapists to assess the individual child's ability to operate their prosthesis and to determine the progress of functional training. Researchers have also used it in validation activity.16,17
Hermansson, of Sweden, then reported the development of a new instrument, the "Skill Index Ranking Scale" (SIRS)18 in 1991, which was intended to describe a child's ability to tolerate and use a prosthesis. According to this scale, the child's accomplishments with the prosthesis could be assessed according to 14 different levels of function when using an electric hand prosthesis. Each step was designed to put a higher demand on the child. Although some therapists have used it as an outcome measure, it was not designed as such but more for use as a description of stepwise development with a prosthesis.
In the 1990s, researchers became more interested in the use of self-report functional measures. Gauthier-Gagnon and Grise19 suggested that functional questionnaire responses, reflecting the amputee's actual use of the prosthesis rather than just capability, might provide results that were more meaningful in evaluating effectiveness and cost benefits of the prostheses. Two self-report measures designed specifically for use by children or adolescents with upper-limb loss had been reported in earlier studies by Paciga et al.5 and Weaver et al.,20 but these questionnaires were used only for their respective studies and were not validated or repeated. Wright and the research team at the Bloorview Kids Rehab (formerly Hugh MacMillan Rehabilitation Centre) decided to adopt this approach in 1995 in developing a new type of functional status measure, the self-rated "Prosthetic Upper Extremity Functional Index" (PUFI). This instrument was designed to evaluate the extent to which a child uses a prosthesis for common bimanual activities, the ease of task performance with and without the prosthesis, and the perceived usefulness of the prosthesis. Stages of development included reliability and aspects of construct validity testing with 30 children and teens at Bloorview,21 a multicentre validity study with 41 subjects,16 development of a computer software package version, and later on, the inclusion of the PUFI-PC in a large-scale outcomes assessment by the Shriners Hospitals and the establishment of a PUFI international database.
In April 1997, with the support of the Shriners Hospitals, the AAOS Committee on Pediatric Orthopedics organized and held a consensus conference on "The Child with a Limb Deficiency."22 For this two-day event, a group of invited participants gathered together to discuss a variety of subjects related to the care of children with limb deficiency. Each attendee presented a "state of the art" article on their individual area of expertise, and a consensus debate was held at the end of each topic area. The symposium ended with a session on outcome measurement. The Speaker, psychologist James Varni, was critical of the approach commonly taken by health care professionals to evaluate the outcome of surgical, medical, and rehab practices. He suggested that the focus of attention was too narrow in trying to prove that a particular intervention or practice was successful, according to individual beliefs or professional standards, without also considering the end result from the perspective of the child and family. Whether a clinician thought the intervention was successful or worthwhile was really immaterial unless the child and/or family thought there had been an improvement or change in the child's quality of life. He recommended that health care professionals start taking a much broader perspective, because the "need for multidimensional health-related outcomes assessment was going to become increasingly necessary in clinical decision-making policy development and program evaluation."23 He further suggested that future measures needed to address the various dimensions delineated by the World Health Organization.
Varni also worked with Pruitt to develop a series of functional status inventory measures for children with upper or lower limb deficiency: the CAPP-FSI for children aged 8–17 years24 (1996), the CAPP-FSI version for pre-school children25 (1998), and the CAPP-FSIT for toddlers26 (1999). These inventories included a list of common activities of daily living (the majority being upper-limb activities), and the respondent was asked to indicate 1) if the individual did the activity and 2) if the prosthesis was used. The same group also developed a prosthesis satisfaction scale, the CAPP-PSI.27 Unfortunately, plans for further psychometric evaluation were not completed, and it is unknown whether the measures were ever used in clinical practice.
In recent years, two newer observational-type functional measures have been produced.
The development of the "Unilateral Below-Elbow Test" (UBET)28 functional measure in 2002 presents another example of the value of a collaborative process. In this case, a group of clinicians from 10 of the Shriners Hospitals formed a Study Group to develop a method for evaluating the "functional status" of the children being reviewed in the Shriners Hospitals' multi-assessment study.29 Their aim was to develop an observational measure to evaluate function in bimanual activities for both prosthesis wearers and nonwearers. The instrument is similar in design to the UNB Test but does not have subtests and uses slightly different rating scales. Nine tasks were identified for each of the four age groups (2–4 years, 5–7 years, 8–10 years, and 11–21 years). Two scales, Completion of task and Method of Use, are used to rate performance. For the Method of Use scale, the scoring criteria vary according to whether the patient is wearing a prosthesis or not.
It was intended that the new UBET measure would be used in conjunction with the PUFI as a twofold approach to address the functional status dimension of the study. In addition, the Pediatric Quality of Life Inventory (PedsQL) generic core scale and the Pediatric Outcomes Data Collection Instrument (PODCI) were used to assess quality of life and health outcomes. This group met together with other researchers and consultants connected to the study in Portland in March 2002, to obtain feedback and to review and refine the test protocol before the implementation of the multicentre trial.
Psychometric testing of a new measure, the "Assessment of Capacity for Myoelectric Control" (ACMC), was reported by Hermansson et al.30,31 in 2005 and 2006. This instrument was designed (with the earlier SIRS measure as part of the background) to determine the capacity of an individual to operate the control system of a myoelectric prosthesis. An experienced and qualified therapist rater observes the child or adult during normal everyday activity to rate aspects of prosthetic control for gripping, holding, releasing, and coordinating movements. It is now being used in clinical practice, but therapists must first be trained and qualified to administer the test. Several therapists attended the course held in conjunction with the MEC'08 conference in Fredericton in August 2008.
Other projects known to be in progress include:
PUFI: ongoing activity includes the development of a module for children who do not use a prosthesis (the UFI), a new software version 4, and the initial stages of development of an adult version of the PUFI. The designers also hope to have a web-based version available in the near future.
Sköld, Eliasson, et al. are developing a Child Hand-use Experience Questionnaire (CHEQ) for children with unilateral, nonprosthetic hand deficiency. The measure is expected to be available on the Internet shortly (www.cheq.se).
Hermansson is working on the development (scoring criteria) and validation of the Assisting Hand Assessment (AHA) for children with unilateral hand deficiency and AHA for children and adults with and without upper-limb prostheses (amputees and children with congenital deficiency).
It is evident that a great amount of effort has been expended in this area, and it is unfortunate that so many of the outcome measures developed over the years have not been fully validated or put into routine clinical use. In too many cases, tests were developed for specific study investigations with isolated populations, and as such, the results cannot be generalized, and the tools have not been used further. Regrettably, there is still no system in place to determine empirically the positive functional and psychosocial outcomes associated with prosthetic use.
So what do we need to do? Although it is obvious that the accuracy and precision of study results must meet acceptable measurement standards, clinical teams can no longer ignore the need to assess the value of prosthetic prescription and fitting for clients. Rather than continue to find fault with the instruments available or seek to develop new ones, it is time to at least select a few that can be agreed upon and start to use them in a consistent manner. Once there is a process in place, it will be possible to identify the problems or gaps and work toward a better solution.
Fortunately, comprehensive outcome measurement reviews by Buffart et al.32,33 and Wright et al.34,35 have identified a growing number of prosthetic-specific and generic measures worthy of consideration to help assess the functional outcome. In some cases, further psychometric evaluation work is recommended. Other domains such as psychological, social, and environmental factors are less well developed and need to be addressed by appropriately qualified personnel. It is also important to insist that consistent data collection become a routine part of clinical practice, mandated if necessary, along with other requirements of professional practice. If this is to succeed, measurement tools must be practical to administer in a clinical setting, and they need to have relevance, both for the clinician and for the child/family.
There does appear to be a growing awareness that there is no simple way to determine if prosthetics are cost beneficial for children with upper-limb amputations or limb deficiencies and that it will require a multidimensional approach and longitudinal study. There is also a need for collective and collaborative investigation to obtain sufficient data to determine statistical significance and to be able to compare practices.
The direction and guidelines for future work provided by Biddiss and Chau36 are worthy of consideration. Based on an extensive review of the literature regarding upper-limb prosthesis use and abandonment, they recommended that any future research work should 1) increase generality of results, 2) focus on controlled study designs, 3) deploy formal statistical analyses, 4) conduct multifactor analyses, 5) develop and adopt standardized measurement tools, 6) provide complete descriptive documentation, and 7) capture contemporary consumer views.
Correspondence to: Sheila Hubbard, Dip P&OT, BSc(PT), 26 Stonemanse Court, Toronto, ON, Canada, M1G 3V3.
SHEILA HUBBARD, Dip P&OT, BSc(PT), is affiliated with the Bloorview Kids Rehab, Toronto, ON, Canada. The author declares no conflict of interest.