James S. Bowman
Public Personnel Management
Page 557
Copyright 1999 Information Access Company. All rights reserved. COPYRIGHT
1999 International Personnel Management Association
Few
administrative functions have attracted more attention and so successfully
resisted solution than employee evaluation. Since performance appraisal
is impossible, what actually happens is personnel appraisal. When
such hypocrisy occurs, civil service systems predicated on merit
are undermined.
This
article commences with the evolution of the appraisal function,
the root of ethical problems found in service ratings. Common types
of evaluation (with their strengths and drawbacks), who does them,
and typical rating errors are then examined. This climaxes with
a discussion of the fundamental and beguiling reason for these deficiencies.
Diagnosis completed, attention shifts to ways to improve appraisals,
which leads to a specification of the characteristics of a system
that could withstand legal, if not ethical, scrutiny. The analysis
closes by sketching future, not necessarily promising, trends.
"Personnel
Appraisal (pers'-n-el a-pra'-zel) n: given by someone who does not
want to give it to someone who does not want to get it." --
Anonymous
One's
work in modern organizations will be evaluated to assess the extent
to which individual and collective needs coincide--or conflict.
Since many decisions can hinge on these ratings, the process is
central to human resource management. Key to employee compliance,
performance improvement and system validation functions, such reviews
are mechanisms to reinforce organizational values. They provide
data on the effectiveness of recruitment, position management, training,
and compensation (where such information is most frequently used).
Clearly,
then, employee evaluation is a chief function of management. While
an effective process can benefit an agency, creating, implementing,
and maintaining it is no easy task. Programs serving multiple purposes
may do none of them particularly well. In business, for instance,
fewer than 20 percent of programs accomplish their goals, and under 10 percent
of organizations judge their appraisal systems to be effective.[1]
There is no reason to believe that the situation is any different
in government.
Personnel
appraisal, in short, is one of a manager's most difficult issues
precisely because it is both important and problematic. Few managerial
functions have attracted more attention and so successfully resisted
solution than employee evaluation.[2] Patently, personnel systems
predicated on rewarding merit are undermined when questionable
appraisal practices take place. What these widely-used and intensely-disliked
systems reveal is that instead of being a solution, they are often
part of the problem.
Not
surprisingly, paradoxes abound: people are often less certain
about "where they stand" after the appraisal than before
it; the higher one rises in a department, the lower the likelihood
that quality feedback will be received; and most employees perceive
little connection between performance and pay.[3] Despite--or perhaps
because of--the vexing, intractable nature of personnel appraisal,
political pressures to "just do it" are substantial. While
the general public knows appraisal problems from its own work experiences,
it nevertheless makes an odd assumption: since evaluations are done
successfully (somewhere) in business bureaucracies, they should
especially be used in government agencies.
This
article begins with the evolution, as eerie as it is, of the appraisal
function. Common types of appraisals, who does them, and typical,
if robust, rating errors are then examined. This climaxes with a
discussion of the fundamental and beguiling reason for these problems. Diagnosis
completed, attention then shifts to ways to design and improve evaluation
programs. This leads to a specification of the characteristics of
a system that could withstand legal scrutiny. The article closes
by sketching future trends in personnel appraisal.
Evolution
The
root of the paradoxical nature of service ratings--rarely do they
deliver in practice what is promised in theory--is a legacy of the
spoils system. Aghast at widespread looting, plunder, and corruption
during that era, good government groups, armed with scientific management
techniques like job analysis, sought to guarantee competence by
insulating employees from political influence.
In
order to assure that public employees would not be rewarded or punished
for partisan reasons, the reformers worked hard to establish merit
systems under which supervisory discretion was severely limited
and closely policed by non-partisan civil service commissions...empowered
to defend merit principles.[4]
As
the merit system evolved, the emphasis was on recruiting meritorious
people and protecting them from partisan entanglements. Less attention
was devoted to divining ways to evaluate their work; after all,
the system was designed to select competent workers in the first
place.
It
should not be surprising, then, that although concern for appraisal
has existed for a long time (Congress mandated evaluations as early
as 1842), the topic for decades "was a theoretical and administrative
backwater, ignored by scholars and practitioners alike."[5]
The dramatic growth of government during the Great Depression and
World War II, however, culminated in considerable interest in appraisal
programs so that by the 1950s many jurisdictions had adopted them.
Characteristic
of the times, an underlying faith in science to control, direct,
and measure human performance resulted in the continuing search
for, if not the perfect evaluative scheme, at least ways to improve
existing technology. Thus, many of the early systems, based on personal
traits (discussed in the next section), were widely criticized for
failing to differentiate between employees since virtually everyone
received a "satisfactory" rating.
Aimed
at correcting this, the 1978 Civil Service Reform Act sought to
evaluate employees not on subjective characteristics but on objective,
job-related performance standards. This effort, in turn, produced
its own set of problems so that in 1993 the National Performance
Review declared it to be dysfunctional to the success of governmental
programs. For its part, the NPR called for simplified, decentralized,
team-based evaluation, de-emphasizing the need for result-oriented
appraisals; this approach, as discussed below, may be no more successful
than it has been in business.
Today,
service ratings certainly remain "the most maligned area
of personnel and...seem to be tolerated only because no one can
think of any better, realistic alternatives."[6] Yet abandoning
the function altogether may not be a solution, since human beings
have always made informal or formal evaluations of others. The challenge
is to decide what to appraise in a manner that meets the needs of
the organization and the individual.
Common
Types of Appraisal
Since
there are few jobs with clear, comprehensive, objective output measures
that eliminate the need for judgment, the most widely used methods
are judgmental in nature. What differentiates them is the degree
of subjectivity that is likely in the judgments to be made. The
approaches can be readily grouped as (1) trait-, (2) behavior- and
(3) result-based systems. Recognize, however, that there is considerable
variety in available techniques, and not only are they frequently
combined with one another, but there are also different systems
that may be used for various types of employees.[7] Only the most
well-known are examined here; yet even these, albeit in differing
degrees, produce evaluations that are either deficient (not all
pertinent factors are considered) or contaminated (irrelevant considerations
are included).
Trait-Based
Systems
This
method requires judgments on the degree to which someone possesses
certain desired personal characteristics deemed important for the
job. Despite the inherent subjectivity of this format, it continues
to be practiced because human beings routinely make trait judgments
about others in daily life. The approach, although often inscrutable,
seems intuitively sensible as a result.
There
are colorful iterations of such graphic rating scales based on the
characteristics chosen, their definitions (if any), and the number
of categories (adjective or numeric) used. None, however, overcome
serious validity and reliability questions. Thus, because it is
difficult to define personality characteristics (much less the extent
to which someone has them), subordinates may become suspicious,
if not resentful, especially because this technique has little value
for the purpose of performance improvement. Human traits, after
all, are relatively stable aspects of individuals.
This
is not to suggest that vivid personal traits are unimportant in
job performance; people can hardly perform without them. Indeed,
the use of flexible, subjective criteria seems inevitable, especially
for ambiguous managerial jobs. The problem is valid measurement.
When used with accurate job descriptions and trained evaluators,
such ratings may become more credible. However, even when the traits
measured are job-related (e.g., job knowledge, dependability), a
landmark court opinion (Brito v. Zia, 1973) criticized their subjective
nature because the results were not anchored in or related to actual
work behavior.[8] Just as trait rating is no longer likely to be
used alone, neither is the narrative essay technique; in fact, in
one form or another written descriptions often supplement most appraisal
formats.
Behavior-Based
Systems
Unlike
trait-focused methods, which emphasize who a person is, behavior-oriented
procedures attempt to discern what someone actually does. The relatively
tangible, objective nature of these systems makes them more legally
defensible than personality scales. In point of fact, civil rights
legislation of the 1960s and 1970s led to the development of a number
of tools that concentrate on behavioral data, two of which are considered
here.
The
Critical Incident Technique (CIT) is used to record behaviors that
are unusually superior or inferior. It can be implemented in a responsive
and flexible manner; supervisors can be trained to pay more attention
to incidents of exceptional behavior in some performance areas at
certain times and in other areas in different periods.[9] Notably,
a critical incident log may be helpful in supporting other appraisal
methods.
Important
drawbacks, however, include its "micro-management" feature
as supervisors keep a "book" on people; mistakes, rather
than achievements, may be more apt to be recorded since employees
are supposed to be competent. Another concern is that subordinates
may engage in easily-documented activities, while hiding errors
and neglecting tasks not readily observed. Then too, valuable, steady
performers, not generally involved in spectacular events, may be
overlooked. Halachmi also notes that the record could be incomplete
or unreliable either due to the rater's knowledge or the nature
of the appraisee's job--either one of which makes comparisons between
individuals problematic. The anecdotal nature of the method, in
short, is both its strength and weakness.
The
behaviorally-anchored rating system (BARS) builds on the incident
method as well as the graphic rating scales discussed earlier. It
defines the dimensions to be evaluated in behavioral terms and uses
critical events to anchor or describe different performance levels.
When introduced in the 1960s, BARS was claimed to be a breakthrough
technology since raters could match observed activity on a scale
instead of judging it as desired or undesired.[10] Since the scales
are developed from the experience of employees, it was also thought
that user acceptance was likely. Because the system is job-related,
it remains relatively invulnerable to legal challenge.
Yet
the method is often not practical as each job category requires
its own BARS; either for economic reasons or the lack of employees
in a specific job, the approach is often infeasible. Secondly, Gomez,
et al.[11] argue that if personal attributes are a more natural
way to think about other people, then requiring supervisors to use
BARS (or for that matter any non-trait technique) is merely a sleight-of-hand
that introduces psychometric errors (discussed below). Indeed, they
cite research that finds both employers and employees prefer trait
systems. Other studies demonstrate that employers and employees
do not make much of a distinction between BARS and trait scales.[12]
Not surprisingly, there is little evidence to support the superiority
of this technique over other approaches.[13]
Finally,
most experts in both business and public administration do not find
that the potential gains in using BARS warrant the substantial investment
required in time and resources. Thus, where this technique is used,
it often plays a residual role limited to either a small number
of selected job categories and/or to the developmental function
of personnel appraisal. Overall, then, whatever else trait- and
behavior-based systems may do, they are largely silent on the question
of what an employee is to accomplish.
Results-Based
Systems
Neither
a measure of personal characteristics nor employee behaviors, outcome-oriented
approaches attempt to calibrate one's contribution to the success
of the organization. Although "results" have always been
of keen interest to administrators, management by objectives (MBO)[14]
promises to achieve substantial organization-individual goal congruence.
Introduced in the 1950s, this most common results-focused approach
establishes agency objectives, followed in cascading fashion by
derivative objectives for every department, all managers, and each
employee. These systems require specific, realistic objectives,
mutually-agreed upon goals, interim progress reviews, and comparison
between actual and expected accomplishments at the end of the rating
period.
Despite
its rationality, as well as evidence of effectiveness,[15] MBO, like
other appraisal techniques, has serious drawbacks:
- While
development of objectives may not be as technically demanding
as BARS, the process nonetheless is quite time-consuming as an
effective program takes 3-5 years to implement (accordingly, few
organizations adopt the formal hierarchical process to ensure
organization-department-manager-employee linkage).
- There
likely will be conflicting objectives, differing views on the
appropriateness of the objectives, and disagreements about the
extent to which objectives are mutually agreed upon and fulfilled.
- Because
it focuses on short-term goals, a compulsive "results-no-matter-what"
mentality can produce predictable quality and ethical problems
as anything that gets in the way of the objective gets shunted
aside (in any public or private service organization, how a job
is done often is as critical as its output).
- Not
only is establishing equally challenging objectives for all people
difficult, but also expectations that they will invariably improve
(an MBO-induced "treadmill") can lead to user acceptance
problems.
- The
technique can stifle creativity as employees may define their
job narrowly (as they "work to quota") leaving some
problems undetected and unresolved.
- Teamwork
is apt to suffer if employees become preoccupied with personal
objectives at the expense of collegiality (they may fulfill their
goals, but not be good all-round performers).
- Since
performance outcomes do not indicate how to change, the method
may not assist in the employee development function.
MBO,
nonetheless, remains a popular technique to appraise managers since
their roles are often ambiguous and it does provide a measure of
accomplishment against predetermined objectives.
Commentary:
"Man plans, God laughs." Jewish proverb
To
summarize, Exhibit 1 specifies the promise, problems, and prospects
of trait, behavior, and results approaches to appraisal. While the
intuitive appeal of trait rating is considerable, it is highly susceptible
to both contamination and deficiency errors; its future potential,
accordingly, is limited to a supplemental role in the review process
due to subjectivity and vulnerability to court challenge.
Exhibit 1. Promise, Problems, and Prospects of Person-centered Appraisal Systems

System          Promise                   Problems                                     Prospects
Trait-based     High: intuitive appeal    High: contamination and deficiency errors    Low: supplemental role
Behavior-based  High: job related         Average: susceptible to deficiency errors    Average: high technical demands
Results-based   High: face validity       Average: deficiency problems                 Average to high: emphasizes accomplishments
Systems
based on employee behavior also hold substantial promise since they
are job-related--something most judges expect. Yet they too are
likely to play a modest role in the years ahead largely because
of their susceptibility to deficiency errors and, in the case of
BARS, high technical demands coupled with limited applicability.
Results-derived approaches, like the others, have face validity,
but often suffer from a host of deficiency and implementation problems.
Still, they do emphasize actual accomplishments, as opposed to personalities
or behaviors, and therefore may survive litigation.
While
combined techniques may offer advantages, available research does
not support a clear choice among methods.[16] Since each has its
own strengths and weaknesses, selecting one to cure a problem likely
will cause another; there is no fool-proof approach. Notice too
that all three systems are backward-looking; because there is no
systematic continuous improvement process, they may be self-defeating
because they perpetuate the organizational status quo. Ironically,
the better traditional appraisals are done, the more likely that the
organization will remain the same(!). Hausser and Fay wistfully
argue that the search for the perfect instrument--a goal that has
eluded industrial psychologists for over 50 years--is now largely
regarded as futile.[17] Instead, they suggest, efforts to improve
the overall appraisal process likely will provide much larger returns
than developing (and redeveloping) seemingly better rating forms
every time a new high official takes office.
Paradoxically,
then, the technique used is decidedly not the central issue in personnel
appraisal since the type of tool does not seem to make much difference.[18]
Summarizing a National Research Council study, Nigro and Nigro report
that
The
council found no convincing evidence to support arguments that distinguishing
between behaviors and traits has much effect on rating outcomes.
It found that psychologically, supervisors form generalized evaluations
which strongly color memory for and evaluation of actual work behaviors.
It also found that there is little evidence to suggest that rating
systems based on highly job-specific dimensions produce results
that are much different from those using global or general dimensions.[19]
That
is, available evidence indicates that judgments about performance
are not necessarily correlated with results precisely because these
decisions rely on cognitive abilities, which are notoriously error-prone
(see below).[20] Not surprisingly, the choice of a tool is less
important than the fact that employees often have little confidence
in the abilities of managers to effectively implement them.[21]
The National Performance Review found, for instance, that "performance
ratings are unevenly distributed by grade, gender, occupation, geographic
location, ethnic group, and agency" (shoe size was not mentioned).[22]
Appraisal
software programs, nonetheless, promise to enable managers to select
predigested forms (or to design their own), walk them through form
completion (including tips and hints, provision of preprogrammed
phrases and prompts for examples), and verify their work with arithmetical,
logical consistency, and legal checks. However, in a balanced review
of these programs, Grote notes that they run on algorithms with
no knowledge of the organizational culture, job standards, or individual
performance--problems likely to intensify in a virtual workplace.[23]
Indeed, they make the process too easy; managers should devote real
thought to appraisals, not merely point and click. And, the software
contributes nothing to the most important part of service ratings:
the manager-employee interview (discussed in a subsequent section).
Raters
Since
common appraisal methods are judgmental in character, the question becomes: "Who
makes this judgment?" Traditionally there was one answer: the
subordinate's immediate supervisor. However, other knowledgeable
information sources include the ratee, peers, computers, and outsiders.
Based
on the belief that the employee has important insights about how
the job should be done, self-appraisals can provide valuable data,
particularly when the supervisor and employee engage in joint goal
setting. Yet these evaluations are subject to distortions including
self-congratulation or, less likely, self-incrimination. It is well
established, for instance, that many people attribute good performance
to their own efforts and blame poor performance on other factors.
These biases can be moderated if objective standards exist and the
ratee is regularly provided genuine feedback. Still, because these
evaluations tend to focus on personal growth and motivation, they
are best used for developmental, rather than administrative, purposes.
As
work in some organizations has changed from a stable set of tasks
done by one person to a more fluid ensemble of changing requirements
done by groups of employees, peer or team evaluation becomes appropriate.
In a high trust agency culture, where co-workers have access to
relevant information, such assessments can be accurate. When these
conditions do not exist, supervisors likely will be reluctant to
give up control. And, subordinates will often see these techniques
as disruptive competition that can easily be sabotaged by lenient
ratings or converted into "popularity contests." Thus,
these reviews are often most useful when done anonymously and for
developmental reasons.
The
objective of electronic monitoring, third, is to increase productivity,
improve quality, and reduce costs; it does so by continuously collecting
performance data, pinpointing problems, and providing immediate
feedback. When such monitoring provides objective performance appraisals,
employee satisfaction and improved morale may result. Today computer-generated
statistics are the basis for evaluations of millions of office workers
engaged in clerical, repetitive tasks; the virtual worksite of the
future is almost certainly going to expand the collection and use
of such information. When implemented without reasonable safeguards
(e.g., employee access to data, rights to challenge erroneous records,
rating decisions made on the basis of non-electronic as well as
electronic information), these software programs can create an "electronic
sweatshop" environment damaging creativity, morale, and health.
If employees feel helpless, manipulated and exploited, then most
techniques eventually will be circumvented.[24]
Finally,
multi-rater or 360-degree evaluation systems--those that gather
information from subordinates, peers, and citizens--by definition
provide more data than other approaches. More data may produce more
reliable, but not necessarily more valid, information. The administratively
complex nature of these systems is compounded by a lack of convergence
between the different sources. That is, managers may be confronted
with a host of seemingly conflicting opinions--all of which may
be accurate from their respective viewpoints. Still, systems that
assure respondent anonymity and encourage participant responsibility
no doubt supply some useful feedback for both improving management
processes and employee development. And, there is growing acknowledgment
of the value of the technique; the first scholarly reference to
it was in 1993, but today the term is commonly used in the field.
In short, while one's immediate supervisor is apt to play an important
role in the rating process, feedback from other sources is increasingly
seen as a way to obtain a more holistic understanding of performance.
Rating
Errors
The
use of ratings assumes that evaluators are reasonably objective
and precise. Regardless of the appraisal instrument used, though,
a large number of well-known errors occur in the process based on
(1) cognitive limitations, (2) intentional manipulation, and (3)
organization influences. When they happen--and they are difficult
to prevent--not only is the rater's judgment called into question,
but also the resulting evaluation may leave the ratee unable to
accurately judge his/her own performance.
When
confronted with large amounts of information, people generally seek
ways to simplify it. Cognitive information processing theory maintains
that appraisal is a complex memory task involving data acquisition,
storage, retrieval, and analysis. To process these data, subjective
categories are employed, which in turn produce no fewer than five
problems. Thus, compatibility ("similar to me" or liking)
error is a potent one since both compatibility and ratings are person-focused.
Indeed, most employees believe their supervisor's liking of them
influences evaluations.[25]
The
next mental shortcut is the spillover (halo or black mark) effect--i.e.,
if the ratee does one thing exceptionally well (halo) or poorly
(black mark), then that unfairly reflects on everything else. The
recency effect takes place when a major event occurs just prior
to the time of the evaluation and overshadows all other incidents.
Contrast error exists when people are rated relative to other people
instead of against performance standards. Finally, actor/observer
bias (partially alluded to earlier) occurs when subordinates as
actors often point to external factors while supervisors as observers
attribute weak performance to employees.
The
second general source of rating problems is that appraisals in many
organizations are adroitly seen as a political--not necessarily
a rational--exercise; results are intentionally manipulated to be higher
or lower than the employee deserves. The goal is not measurement
accuracy, but rather management discretion and organizational effectiveness.
Accordingly,
leniency or friendliness error (the "Santa Claus" effect)
is the consequence of a desire to: maintain good working relationships,
maximize the size of a merit raise, encourage a marginal employee,
show empathy for someone with personal problems, or avoid confrontations
(and appeals) with an aggressive worker.[26] Conversely, severity
error (the "horns" effect) may be emphasized as a way
to either send a message to a good performer that some aspect of
his/her work needs improvement or to shock an average employee into
higher performance. Over 70 percent of managers in one survey reported
that they deliberately inflated or deflated evaluations for such
reasons.[27]
Note
that the inherent conflict of interest present in supervisory evaluations
is a powerful political reason likely to make the leniency effect
prevail over other psychometric errors. That is, if all (or most)
subordinate evaluations are inflated, then the supervisor may look
like an effective manager; if the appraisals are not so inflated,
then his/her management abilities may be called into question.[28]
However, the employer has an obligation to conduct appraisals with
due care. This duty may be violated (as a result of the Santa Claus
effect) when a poor performer receives satisfactory ratings and
subsequently is subjected to attempts at termination.
Finally,
this leads to examination of a set of organizational influences
that cause at least four problems. The first is insufficient management
commitment to performance appraisal. In light of the difficulties
with various evaluation schemes, much skepticism, futility, and
even doubts about the possibility of performance appraisal exist.[29]
Investing heavily in these systems, then, does not make a lot of
sense for some administrators. The daily press of business makes
it a peripheral, not central, responsibility; it is often isolated
not only from getting the job done but also from organizational
planning and budget strategies. There are few incentives--and sometimes
genuine disincentives--to use appraisal as a management tool. Employee
evaluations are done for the sake of evaluation--an irrelevant,
once-a-year formality to complain about, complete, and forget in
the service of administrative rules.
Such
an attitude leads to the error of central tendency (if not leniency)
where nearly all are rated satisfactorily--if for no other reason
than higher or lower scores may require time-consuming documentation.
This "error" is, in turn, reinforced by the no money effect--i.e.,
there frequently are insufficient funds to distribute and/or they
are awarded on an across-the-board basis.
Overall,
cognitive, political, and organizational limitations help explain
the reasons for rater error. While some of these constraints can
be addressed in training, something more fundamental lies at the
root of personnel appraisal difficulties: human nature. Its pertinent
aspects are revealed by risk aversion, implicit personality theory,
conflicting role expectations, and personal reluctance.
Since
defending one's judgment in open court is not something most relish,
it is natural that supervisors reduce risk by being aware of all
possible pitfalls in the appraisal process. A paradox arises, however,
when playing safe through leniency may invite a legal challenge
on the grounds that appraisals did not differentiate employees by
performance.[30]
Second,
implicit personality theory suggests that people generally judge
the "whole person" based on limited data (stereotyping
based on first impressions or the halo effect); ratings then tend
to justify these global opinions rather than accurately gauge performance.
Conflicting role expectations, third, are inherent in the appraisal
process as evaluators must reconcile being a helpful coach with
acting as a critical judge. In playing these roles, administrators
(as noted earlier) also evaluate themselves; human nature suggests
that better-than-deserved ratings will occur, for one's own managerial
skills may be called into question should employees receive poor
evaluations.
Last,
appraisal systems are complicated by the understandable distaste
that people have when asked to formally evaluate others. Since there
is no such thing as infallible judgment, when administrators must
take responsibility for judging the worth of others, "it is
dangerously close to a violation of the integrity of the person."[31]
Most people, especially in light of all the other questions about
the reliability and validity of personnel appraisal, are as reluctant
to judge others as they are to be judged themselves. It is onerous,
in other words, to "play God." Little wonder, then, that
the sentiment expressed in the quotation at the outset of this article
is shared by many: "appraisal is given by someone who does
not want to give it to someone who does not want to get it."
To
summarize, since evaluation in many jobs is not amenable to objective
assessment and quantification, ratings typically incorporate non-performance
factors--for all the reasons discussed above. When this occurs it
leads to a violation of the most revered principle of this field
of HRM: appraisals evaluate performance, not the person.[32] Verisimilitude
trumps veracity.
Improving
the Process
Designing
an appraisal system requires not only establishing policies and
procedures, but also obtaining the support of the entire workforce
and its union(s). Top officials must publicly commit to the program
by devoting sufficient resources to it and by modeling appropriate
behavior. Managers, in turn, need to be convinced that the system
is relevant and operational. Employees likewise should see it as
in their interest to take it seriously. A profile (or "slice")
taskforce, representing all of these groups from different parts
of the department, can conduct a needs assessment by collecting
agency archival and employee attitudinal data. It should then revise
an existing system (or create a new one) based on the findings and
test it on a trial basis. This could be done in jurisdictions that
allow customization to agency needs (over half of state governments,
for example) or as part of a reinventing government laboratory experiment.
It is possible to finesse and marginalize formal requirements entirely
(see Exhibit 2); beating the system may be faster, more flexible,
and just as effective as formally reforming it.
Exhibit
2. Beating the System
In
one major unit of a large hospital, a charismatic department manager
decided that whatever the administration of the hospital did, he
was going to run his facilities department on the basis of TQM.
Well in advance of the hospital's annual tedious performance appraisal
drill, he gathered his troops together, reviewed the hospital's
sorry form, and then told them that what it represented was the
starting point for them to practice their kaizen--continuous improvement--skills.
"What do we need to do, given the fact that this basic form
is mandated, in order to complete it well enough to keep the personnel
monkeys off our backs but also get some good out of the process
for ourselves?" he asked his team. He funded a series of weekly
pizza meetings for a task force of facilities employees who were
charged with developing an answer to his question that everyone
supported enthusiastically.
Source:
Grote, D. The Complete Guide to Performance Appraisal (New York:
AMACOM, 1999), 351.
The
design chosen involves numerous key technical questions, many of
which were discussed earlier. These include selection of the most
useful tool(s), as well as raters, based on system objective, practicality,
and cost. Training is needed in an effort to minimize the various
kinds of errors previously examined. Yet, it is generally acknowledged
that mere awareness of these problems is unlikely to affect behavior;
instead, raters must engage in and receive feedback from role plays,
simulations, and videotaped exercises. Evaluators also need training
in interpersonal skills in order to effectively conduct appraisal
interviews.
Monitoring
performance, the period between plan approval and formal appraisal,
includes frequent positive or corrective feedback based on performance,
not personality. When done conscientiously throughout the year,
the actual evaluation will then simply confirm what has already
been discussed.[33]
Finally,
the evaluation culminates in the appraisal interview. In preparing
for the meeting, the employee may complete a self-assessment, and
managers should collect necessary information and complete, in draft
form, the rating instrument. Although a collaborative problem-solving
approach is effective, most managers use a one-way "tell and
sell" technique where they inform subordinates how they were
rated and then justify the decision.[34] No matter the approach,
supervisors should use the event to support the policies and practices
of the entire system and be trained in goal setting, communication
skills, and positive reinforcement.
Summary
and Conclusion
To
distill the foregoing argument, the characteristics a personnel
appraisal system should contain to satisfy both employers and employees--and
to survive a court challenge--are specified below. As discussed,
however, implementing this HRM function is fraught with difficulty.
Readers are invited to evaluate the extent to which the following
standards are met by agencies in their jurisdictions:
1.
The rating instruments, which should strive for simplicity, not complexity,
are derived from job analysis.
2.
Training is provided to all employees about the system and to managers
in its use.
3.
The appraisal is grounded in accurate job descriptions and the actual
ratings are based on observable performance.
4.
Evaluations are completed under standardized conditions and are
free of adverse impact.
5.
Preliminary results are shared with the ratee.
6.
Some form of upper level review, including an appeal process, exists
that prevents a single manager from controlling an employee's career.
7.
Performance counseling and corrective guidance services exist.
While
many systems may not compare favorably to such standards, recall
that the crux of the appraisal problem is not system design. Instead,
since evaluation is a matter of human judgment, the conundrum is
how the plan and the information it generates are used.
The
perennial, melancholy search for the best technique, nonetheless,
relentlessly (sometimes shamelessly) continues. As we peer into
the century ahead, personnel appraisal will become either more or
less complex. Should the long-standing preference for person-centered
evaluations persist, then organizational downsizing and workforce
changes will likely complicate appraisals. The virtual workplace--unbound
by time and space--is apt to exacerbate this situation.
Downsizing
has been a one-two punch. Personnel offices have shrunk, placing
more responsibilities on line managers; at the same time the number
of supervisors has been reduced, requiring the remaining ones to
evaluate more subordinates.[35] The potential for both system design
and implementation problems, as a result, has increased.
Several
changes in the composition of the workforce also imply a more challenging
climate for appraisals. Employees are becoming increasingly
diverse, and evaluating people of all colors and cultures is surely
more arduous than assessing a homogeneous staff. Also, the fastest-growing
part of the working population is contingent employees--temporaries,
short-term contract workers, volunteers--who, by definition, present
evaluation challenges.
Exhibit
3. Evaluating Organizations, Not Individuals
"Body
swayed to the music. O brightening glance, how can we know the dancer
from the dance?" --William Butler Yeats
Individual
appraisal is a complex issue. Even when done with great care, it
can be devastating to people and destructive to organizations. While
it may be true that management practices are seldom discarded merely
because they are dysfunctional, it is also true that the reinventing
government movement (Chapter 1) provides an opportunity to re-examine
orthodox approaches to appraisal.
The
premise of organization-centered evaluation is that quality services
are a function of the system in which they are produced. Systems
consist of people, policies, technology, supplies, and a socio-political
environment within which all operate. Note that these parameters
are beyond appraisee control; indeed, the employees themselves are
hired, tasked, and trained by the organization. A person-only assessment,
stated differently, is deficient if the goal is to comprehend all
factors affecting performance. In a well-designed management system,
virtually all employees will perform properly; a weak system will
frustrate even the finest people.
Traditional,
person-centered appraisal methods are based on a faulty, unrealistic
assumption: that individual employees are responsible for outcomes
derived from a complex system. Since an organization is a group
of people working to achieve a common goal, the managerial role
is to foster that collaboration. If the result is inadequate, then
it is management's responsibility--and no one else's.
From
a systems perspective, the causes of good or bad performance are
spread throughout the organization and its processes. Many results
in the workplace are outside the power of employees traditionally
made responsible for those outcomes. When over 90 percent of performance
problems are the consequence of the management system,[A] holding
low-level minions accountable is a way of evading responsibility;
the cause of most performance problems lies not within the individual
employee, but within the organization divined by its leaders.
Since
employees have little authority over organizational systems, relevant
appraisals should provide two kinds of feedback:
- system
performance data automatically generated from statistical process
controls (i.e., evaluation is built into the work process itself),
and
- individual
performance data--used primarily for developmental purposes--derived
from anonymous multi-rater 360-degree evaluations (focusing
on attributes such as teamwork, customer satisfaction, timeliness,
communication skills, and attendance).
The
key is to listen to customers of the process and emphasize continuous
improvement. By making the system as transparent as possible, the
focus is on non-threatening analyses of work processes and people's
contributions to those processes. Such an approach would be organizationally
valid, socially acceptable, and administratively convenient--key
criteria for any appraisal method. Importantly, it would change
the process from an often adversarial one to a more constructive
collaborative effort.
Reflecting
American individualism,[B] the field of HRM has focused on people
rather than systems. It is politically unlikely, therefore, that
organizational appraisals will supplant individual ratings (indeed,
when performance appraisals were abolished at one well-known federal
government demonstration project in California, the project was
terminated partly because productivity improved). Yet a number of
public agencies (National Oceanic and Atmospheric Administration,
Internal Revenue Service, Social Security Administration) and private
companies (Motorola, Merrill Lynch, Procter and Gamble) have modified
their approach to appraisals. To better reflect a systems perspective,
they have incorporated teamwork (in addition to individual achievements),
citizen/customer feedback (in addition to supervisory opinions),
and process improvement (in addition to results) dimensions into
their evaluations.
A more
complete "reinvention" would be to clearly state a performance
standard, and then assume that most employees will do the job for
which they were hired. Greg Boudreaux, a manager at the National
Rural Electric Cooperative, continues by saying that for the small
number who do not do their jobs, "...investigate why. Some
will need further training or management counseling. Some may be
an actual problem. But deal with those problems on a case-by-case,
and not through a generic, faulty, performance appraisal system."[C]
Indeed,
the approach described here is partly consistent with the most recent
appraisal fad: performance management. This strategy emphasizes
that managing performance (not merely doing an end-of-the-year evaluation)
is key to organizational success. Thus, performance management is
a continuing cycle of goal setting, coaching, development, and assessment.
From a systems perspective, however, it exemplifies the "wrong-problem
problem." In a triumph of hope over experience, it tries to
solve the wrong problem precisely by focusing on the individual,
not the organization.
[A]
Deming, W.E. The New Economics (Cambridge, MA: MIT/CAES).
[B]
This is an area where our myths may be more dangerous than our lies.
The lone frontiersman and the outlaw gunslinger--largely products
of Hollywood--were far less important in the American West than
farmers raising barns together and shopkeepers settling in small
towns. The myth also does not explain the wild popularity of team
sports in contemporary life.
[C]
"What TQM Says About Performance Appraisal," Compensation
and Benefits Review, May/June, 1994, 20-24.
Alternatively,
should organizations begin to shift away from person-centered appraisal
and toward organization- or process-centered appraisals, individual
evaluations may be less complex in the years ahead--or perhaps abolished
altogether (see Exhibit 3). Whether the appraisal function becomes
more or less difficult in the 21st century, it is only worth doing
if it is an integral part of the management system and if it helps
both the organization and the individual develop to full potential.
Notes
[1]
Longenecker, C. O. and S. J. Goff, "Why Performance Appraisals
Still Fail," Compensation and Benefits Review, November-December 1990,
36-41 and Schellhardt, T. D., "Annual agony: it's time to evaluate
your work and all involved are groaning," Wall Street Journal,
November 19, 1996, A1, A5.
[2]
Halachmi, A., "The practice of performance appraisal,"
in Rabin, J. et al. (eds.), Handbook of Public Personnel Administration
(New York: Marcel Dekker, Inc., 1995), 321-355.
[3]
Daley, D., Performance Appraisal in the Public Sector: Techniques
and Applications (Westport, CT: Quorum/Greenwood, 1992).
[4]
Nigro, L. G. and F.A. Nigro, The New Public Personnel Administration
(Itasca, IL: F. E. Peacock Publishers, Inc., 1994), 113.
[5]
Riley, D. D., Public Personnel Administration (New York: Harper Collins
College Publishers, 1993), 115.
[6]
Cox, R. W., J.J. Buck and B.N. Morgan, Public Administration in Theory
and Practice (Englewood Cliffs, NJ: Prentice Hall, 1994), 70.
[7]
It is neither feasible nor desirable, therefore, to discuss all
of these instruments; to do so would be to encourage the notion
that the problem of performance measurement is merely one of technique.
[8]
Despite all of these problems, the technique has obvious intuitive
appeal since traits may simply be a shorthand way of describing
a person's behavior. This may explain why some psychologists contend
that personality rating scales are not only reasonably valid and
reliable, but also that they are more acceptable to evaluators (see
Cascio, W.F., Applied Psychology in Human Resource Management, 5th
ed. (Upper Saddle River, NJ: Prentice Hall, 1998)).
[9]
Halachmi, op cit., p. 326.
[10]
Ibid., p. 330.
[11]
Gomez-Mejia, L. R., D. B. Balkin, and R.L. Cardy, Managing Human
Resources (Upper Saddle River, NJ: Prentice Hall, 1995).
[12]
Wiersma, U. and G. Latham, "Practicality of behavioral observation
scales, behavioral expectation scales, and trait scales," Personnel
Psychology, Vol. 39, 1986, 619-628.
[13]
Borman, W. C., "Job behavior, performance, and effectiveness,"
in M. D. Dunnette and L. M. Hough (eds.), Handbook of Industrial
and Organizational Psychology, vol. 2 (Palo Alto, CA: Consulting
Psychologists Press, 1991), 271-326.
[14]
Fondly known as "massive bowel obstruction," precisely
because such a rational system could, in the view of critics, never
work with human beings.
[15]
Rogers, R. and J. Hunter, "Impact of Management by Objectives
on Organizational Productivity," Journal of Applied Psychology,
Vol. 76, 1991, 322-326.
[16]
Wanguri, D. M., "A Review, An Integration, and a Critique of
Cross-disciplinary Research on Performance Appraisals, Evaluations,
and Feedback," Journal of Business Communications, Vol. 32,
1995, No. 3, 267-293 and Milkovich, G. T. and A. K. Wigdor, Pay for
Performance: Evaluating Performance Appraisal and Merit Pay (Washington,
DC: National Academy Press, 1991).
[17]
Hauser, D. and C.H. Fay, "Managing and Assessing Employee Performance,"
in H. Risher and C. H. Fay (eds.), New Strategies for Public Pay
(San Francisco, CA: Jossey-Bass), 185-206.
[18]
Cardy, R. L. and G.H. Dobbins, Performance Appraisal: Alternative
Perspectives (Cincinnati, OH: South-Western Publishing Co., 1994).
[19]
Nigro and Nigro, op cit., p. 135.
[20]
Murphy, K. R., and J.N. Cleveland, Performance Appraisal: An Organizational
Perspective (Boston: Allyn and Bacon, 1991).
[21]
Daley, op cit.
[22]
National Performance Review, From Red Tape to Results: Creating
a Government that Works Better and Costs Less (Washington, DC: U.S.
Government Printing Office, 1993), 32.
[23]
Grote, D., Complete Guide to Performance Appraisal (New York: AMACOM,
1996).
[24]
Early examples include (a) data entry personnel who, when evaluated
by the number of key strokes, pressed the space bar while making
personal calls and (b) telephone operators, when expected to fulfill
a quota in a given time period, would hang up on people with complex
problems. The National Institute for Occupational Safety and Health
estimates that two-thirds of all video display terminals are electronically
monitored (Ambrose, M.L., G.S. Alder, and T.W. Noel, "Electronic
Performance Monitoring: A Consideration of Rights," in Schminke,
M. ed., Managerial Ethics: Moral Management of People and Processes
(Mahwah, NJ: Lawrence Erlbaum Associates, Inc., 1998), 61-80).
[25]
Several comprehensive studies have found that racial and sex discrimination,
once common in evaluations, are no longer pervasive (Pulakos, E.D.,
et al., "Examination of Race and Sex Effects on Performance
Ratings," Journal of Applied Psychology, Vol. 74, 1989, 770-780
and Waldman, D. A. and B.J. Avolio, "Race Effects in Performance Evaluations:
Controlling for Ability, Education and Experience," Journal
of Applied Psychology, Vol. 76, 1991, 897-911).
[26]
Leniency (a.k.a. "grade inflation") in academe is "the
refusal by faculty members to behave like adults, that is like people
with enough integrity to disappoint other people. It is as though
some professors want to believe that everybody deserves to be first.
Everybody doesn't" (Carter, S., Integrity (New York: Basic Books,
1996), 79).
[27]
Longenecker, C. O. and D. Ludwig, "Ethical Dilemmas in Performance
Appraisals Revisited," Journal of Business Ethics, Vol. 9, 1990,
961-969.
[28]
The saying, "When you point your finger at me, remember that
your other fingers are pointing back at you," is appropriate
here.
[29]
Nigro and Nigro, op cit., pp. 114-116.
[30]
Halachmi, op cit., p. 325.
[31]
McGregor, D., "An Uneasy Look at Performance Appraisal," Harvard
Business Review, Vol. 35, 1957, 90.
[32]
The pervasiveness of this problem accounts for the use of the term
"personnel appraisal," not "performance appraisal,"
in this essay.
[33]
In the private sector, those companies that emphasized frequent
feedback outperformed those that did not in all financial and productivity
measures (Campbell, R.B. and L.M. Garfinkel, "Strategies for
Success in Measuring Performance," HR Magazine, June, 1996,
98-104).
[34]
Wexley, K., "Appraisal Interview," in Berk, R.A. (ed.),
Performance Assessment (Baltimore: Johns Hopkins University Press,
1986), 167-185.
[35]
U.S. Merit Systems Protection Board, Federal Supervisors and Strategic
Human Resources Management (Washington, DC: The Board, 1998).
[*]
Adapted from Evan Berman, James S. Bowman, Montgomery VanWart, and
Jon West (2000). Human Resource Management: Paradoxes, Processes,
and Problems (Thousand Oaks, CA: Sage). Copyright, James S. Bowman.
James
S. Bowman
Askew School of Public Administration and Politics
Florida State University
620 Bellamy Building
Tallahassee, FL 32306-2032
James
S. Bowman is professor of public administration at Florida State
University and editor of Public Integrity, a new quarterly journal
sponsored by three leading professional associations. "Human
Resource Management: Paradoxes, Processes, and Problems" (Sage,
2000) is his latest co-authored work. A past National Association
of Schools of Public Affairs and Administration Fellow, as well
as a Kellogg Foundation Fellow, Bowman serves on all editorial boards.