Beyond BLS

Beyond BLS briefly summarizes articles, reports, working papers, and other works published outside BLS on broad topics of interest to MLR readers.

December 2016

Appraising the performance of performance appraisals

Summary written by: Peter C. Fisk

The overwhelming majority of U.S. workers receive performance appraisals from their employers, but what do performance appraisals actually do? That is the question asked and answered by Peter Cappelli and Martin Conyon in a recent working paper, “What do performance appraisals do?” (National Bureau of Economic Research, Working Paper 22400, July 2016).

The authors note that despite the relative ubiquity of performance appraisals, the economics literature is nearly devoid of research on the subject, including such fundamental points as why appraisals are used and how they actually affect employment outcomes and wages.

The study performs regression analysis on panel data for managerial employees from a single, large, publicly traded U.S. firm between 2001 and 2007. The authors identify advantages and potential disadvantages of this approach. For instance, it inherently controls for cross-firm heterogeneity but might not yield results that can be readily generalized. The paper observes that, although the business under study is far larger than most, its individual stores are comparable to other retail establishments, and store managers constitute 96 percent of the observations in the study. The data include appraisal scores, employment outcomes, and various demographic attributes. Top executives were excluded because they evidently did not receive performance appraisals.

Cappelli and Conyon acknowledge common criticisms of performance appraisals. One frequent criticism, supported by previous studies, is that supervisors are reluctant to flag substandard performance for fear of creating conflict in the workplace. This aversion, as the criticism goes, tends to yield a biased score distribution that Cappelli and Conyon dub the “Lake Wobegon” effect—where no worker is below average. Among other variations on the leniency-bias theme, critics have asserted that as supervisors develop personal relationships with employees, they become increasingly predisposed to overlooking substandard performance. In a broader sense, the authors note that the appraisal system is widely perceived as unpopular and dysfunctional, and some critics have called for scrapping it altogether.

The paper discusses at length the interplay between objective and subjective elements of performance appraisals, the extent to which appraisals are used for finalizing compensation for the previous period’s work versus providing incentives for future performance, and how the appraisal system helps define the very nature of the employer–employee relationship.

Among its key findings, the paper concludes that performance appraisal scores are indeed functional and informative, and that they are positively related to several key employment outcomes: merit pay and bonuses, promotions, demotions and dismissals, and quits. Moreover, the authors state: “Perhaps most important, we find evidence that employers reward improvements in performance and that they reward different levels of performance differentially, consistent with the view that performance appraisals are not simply a means of settling up subjective aspects of prior performance. Instead, they are an adaptation to the unique, open-ended nature of employment relationships where improvements in performance matter and where employers exercise discretion in rewards that may be used to surprise subordinates.”

Additionally, contrary to another common criticism of the performance appraisal system, the paper finds that scores for individuals vary considerably over time, suggesting that initial human capital may not be as significant a factor in improved performance as one might suppose. One implication of this finding, the authors note, is an argument against dismissing employees for having scores near the bottom of the appraisal distribution in a given year, as the current score might not be a reliable predictor of future performance.