Are We Really Measuring and Rewarding What We Want?

Article by Dr. John E. Kello

A few years back the president of the Association for Psychological Science published a column titled “The Publication Arms Race”. Dr. Lisa Feldman Barrett’s central point is that we professors are rewarded mainly for our publications, and mainly for the quantity of those publications. We assume (and hope) that those publications will contribute to our science and practice. But the entry ticket to a faculty position at a college or university, and the key criterion for subsequent contract renewal, tenure, promotion, and other professional honors and awards (research grants, titled/endowed professorships), is the number of published articles in the “right” journals. And the criteria have become inflated (hence the arms-race catchphrase). In the academy we reward publication quantity, while hoping for other outcomes as well. Dr. Barrett laments that the numbers game may drive academicians to do work that gets published but doesn’t really advance our knowledge, and to bypass research that might advance knowledge but is less likely to get published.

The article inspired me to reflect on a classic paper written by Dr. Steven Kerr in 1975 with the intriguing title, “On the Folly of Rewarding A, While Hoping for B”. Dr. Kerr captures the essence of Dr. Barrett’s lament; indeed, one of his examples was drawn directly from academia. Kerr cites the common university practice of rewarding research while hoping that the professors will also be excellent teachers (a hope no doubt shared by students and their tuition-paying parents). The open secret in academia is that Dr. Kerr and Dr. Barrett are right. It certainly does happen that some professors produce research of quality as well as quantity and are also excellent teachers. But again, the incentives are awarded for the volume of publications in the right journals, not necessarily for the quality of the contributions, nor for what happens in the classroom.

These examples drawn from the academic world are representative of the broader, pervasive problem of measuring, tracking, and rewarding the wrong things, in hopes that we are thereby getting the right things. As Dr. Kerr said so beautifully, we are surprisingly prone to reward A while hoping for B.

A personal example. Do you know how your credit score is determined? I once had the shocking experience of helping my daughter buy a car, only to find that her credit score was better than mine. Lest you think I am a deadbeat, constantly on the verge of defaulting on bills and declaring bankruptcy, please accept my assurances that I was then, and am now, an excellent credit risk. But by the measures applied at the time to determine my credit score, and thereby my creditworthiness, I was not. My daughter’s income at the time was a fraction of mine, and she had only been employed for a couple of years. No matter. You can’t be late paying a bill, regardless of the reason. That’s a ding. You can’t use a large portion of your credit card limit, even if the limit is low (by your own choice) and you pay the balance in full each month, never using the revolving charge option. Another ding. You can’t cancel credit cards, even if you signed up for one in a store that day to get the instant 10% off and then cancelled it (as my wife did over and over and over).

A late or missed payment is taken as an indicator that you can’t pay, never mind that you misplaced the bill and double-paid the next month. Keeping a low credit card limit (say $5k) and using $2-3k of it each month is taken as an indicator that you are close to maxing out your credit, not that you saw no reason to accept the credit card company’s offers to raise your limit. Cancelling credit cards, no matter the reason, is yet another ding; it is taken as an indicator that you can’t keep up with the payments on them. And cancelling five of them!? Oh my!
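The real scoring algorithm is proprietary, of course, but a toy sketch makes the A-versus-B logic concrete. In the hypothetical Python below, every input is an easily counted proxy (A), not the thing the lender actually cares about, namely whether the borrower reliably repays (B). All thresholds and penalties are invented for illustration:

```python
def toy_credit_score(late_payments: int, utilization: float,
                     cards_cancelled: int) -> int:
    """A deliberately naive score. Every input is an easy-to-count
    proxy (A), not actual repayment reliability (B). All thresholds
    and penalties here are invented for illustration."""
    score = 800
    # A late payment is read as "can't pay," even if the bill was
    # simply misplaced and double-paid the next month.
    score -= 60 * late_payments
    # Utilization above 30% is read as "close to maxing out," even if
    # the limit is low by choice and the balance is paid in full.
    if utilization > 0.30:
        score -= 80
    # Every cancelled card is a ding, even a store card opened only
    # for the instant 10% discount.
    score -= 25 * cards_cancelled
    return max(score, 300)

# My story: one misplaced bill, $2.5k of a $5k limit (paid in full),
# five cancelled store cards. A reliable payer scores like a risk.
print(toy_credit_score(late_payments=1, utilization=0.5, cards_cancelled=5))  # 535
```

Feed it my story and a thoroughly reliable payer comes out looking like a poor risk, because the proxies, not the repayment behavior, are what gets measured.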

If my then 24-year-old daughter had not earned such a great credit score herself, and thereby qualified for a favorable rate on her car loan, I could have written a check for the total amount then and there and had her pay me back over time (though they probably would have worried that my check would bounce).

Back to academia. Do you know how U.S. News & World Report determines its annual rankings of colleges and universities? Even a cursory look at the criteria and their weightings shows some A-versus-B factors. “Faculty Resources” counts for 20%. How is it measured? Answer: class size, faculty salary, percentage of faculty with the highest degree in their fields, student-faculty ratio, and the proportion of faculty who are full-time. Schools with smaller classes, which pay their faculty more, which have a higher percentage of full-time faculty with the highest degrees, and which on average have fewer students per faculty member score best in this category. Set aside the concern that at least some of those numbers can be very misleading (e.g., class size can be artificially lowered if an independent study course with one student enrolled is counted as a class; that independent study and one class with 100 students combine to yield an average class size of 50.5). More critically, do any of those “faculty resources” measures actually tell you anything about the quality of the faculty as teachers, the presumed primary mission of professors?
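To make that arithmetic concrete, here is a minimal sketch using only the two hypothetical classes from the example above. It contrasts the per-class average a school might report with the class size the average student actually sits in:

```python
# Hypothetical enrollments from the example above: one independent
# study with a single student, and one lecture with 100 students.
class_sizes = [1, 100]

# The per-class average a school could report to the rankings.
mean_per_class = sum(class_sizes) / len(class_sizes)   # (1 + 100) / 2 = 50.5

# The class size the average student actually experiences:
# weight each class by the number of students sitting in it.
total_students = sum(class_sizes)
mean_per_student = sum(s * s for s in class_sizes) / total_students  # 10001 / 101 = ~99.0

print(f"average class size, per class:   {mean_per_class}")         # 50.5
print(f"average class size, per student: {mean_per_student:.1f}")   # 99.0
```

Same two classes, but while the reported average is 50.5, 100 of the 101 students are sitting in the class of 100. The measure rewards how the classes are counted (A), not what students experience (B).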

“Peer Assessment”, i.e., reputational rank, is another 20% factor. If “top academics” say a college is great… well, it must be. How’s the teaching?

“Outcomes” is a catch-all 50% factor that comprises measures of enrolling, retaining, and graduating students of different backgrounds with manageable debt and salaries exceeding those of high school graduates. As desirable as all of those outcomes are, how’s the teaching?

None of the factors making up the other 10% relate directly or even indirectly to teaching excellence. You can look it up.

One of the constant problems with any organizational program is the tendency to assess it in a check-the-box, by-the-numbers way. We want to improve the quality of our product, so we launch Six Sigma, Lean, or similar programs. The goal is continuous quality improvement. What do we measure? Too often, the number of trained black belts and green belts we have, the number of projects we are running, the number of meetings they hold, the number of suggestions they make, and on and on. Are those quantitative measures telling us anything about quality?

I have seen behavior-based safety programs fall into the same numbers trap… how many observation teams we have, how many observations per week or month they make, how many training classes we held, etc. Are those numbers giving us a safer workplace? I have seen some well-intentioned near-miss programs focus so much on the number of incidents collected (sometimes with a required minimum) that they have little to say about the seriousness of the incidents or whether corrective actions were subsequently taken.

As we are reminded by Drs. Barrett and Kerr, if you want B, be careful to actually measure and reward B, not A.
