January 09, 2008

Who gets promoted? Mr. 3.76 or Ms. 3.79?

If you haven't already signed up for NY Times consumer technology writer David Pogue's "Circuits" column, I recommend it.  His industry insights and fun, straightforward style make for a good read.

However, little did I know David Pogue also has insights to share with regard to HR and HR technology.  I opened up today's column to read the following:

"I can’t tell you how much I love the Internet Movie Database (imdb.com)[...] The wisdom of the masses on IMDB -- thousands of people’s collective grade for a movie on a 1-to-10 scale -- very rarely misses.

(Hint: It’s not really a 1-to-10 scale. By the time you average together all the scores from a huge number of people with different tastes, the scale gets compressed. On IMDB, an average movie usually gets around a 7. Anything that averages an 8 score is sensational; below 6, it’s a turkey.)"

"What on earth does this have to do with HR technology?", you might ask.  What a great question!

As I've mentioned before on this blog, I see technology as the ultimate enabler.  Most anything that the latest and greatest Talent Management system can offer, you could do on paper, in Excel, or through some other means if you were working with a small enough group of people. But when you're supporting 40,000, even a green screen starts to sound like a pretty attractive alternative to Microsoft Word.

But what happens when you bring together and compare performance scores from such a huge pool of people, and then try to make decisions based on that data?  Answer: Exactly what David Pogue describes above.

To illustrate, let's suppose your performance review process assesses employees' goals and competencies.  On average, we could say your employees will be evaluated against 10 different factors (probably a conservative estimate) on a scale of 1 to 5, with 5 being awesome. 

Well, first off, nobody is terrible at everything (and even if they are, who will have the heart to say so?), so the lowest overall scores across your company will probably sit somewhere in the range of 2.0-2.5 (remember, "3" is supposedly average). 

On the other end of the continuum, you have more hope.  After all, who wouldn't want to give their star player all 5's?  Still, you're probably looking at overall highs in the 4.0-4.5 range.  So suddenly your scale has shrunk from 1.0-5.0 to around 2.0-4.5, and 70% of your 40,000 employees fall somewhere between 3.2 and 3.6.
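To see the compression in action, here is a minimal simulation sketch in Python. The rating frequencies in `RATING_WEIGHTS` are made up for illustration, the ratings are drawn independently for each factor (which real reviews certainly are not), and the function names are hypothetical. The only point is that averaging 10 ratings squeezes the overall scores into a much narrower band than the 1-to-5 scale suggests.

```python
import random
import statistics

# Hypothetical share of 1s, 2s, 3s, 4s and 5s that managers hand out.
# These weights are an assumption for illustration, not real data.
RATINGS = [1, 2, 3, 4, 5]
RATING_WEIGHTS = [0.02, 0.10, 0.45, 0.33, 0.10]


def simulate_overall_scores(n_employees=40_000, n_factors=10, seed=2008):
    """Return one overall score (mean of n_factors ratings) per employee."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_employees):
        ratings = rng.choices(RATINGS, weights=RATING_WEIGHTS, k=n_factors)
        scores.append(sum(ratings) / n_factors)
    return scores


def share_between(scores, low, high):
    """Fraction of scores that fall in the closed interval [low, high]."""
    return sum(low <= s <= high for s in scores) / len(scores)


if __name__ == "__main__":
    scores = simulate_overall_scores()
    print("lowest overall score: ", min(scores))
    print("highest overall score:", max(scores))
    print("standard deviation:   ", round(statistics.pstdev(scores), 2))
    print("share within 3.2-3.6: ", round(share_between(scores, 3.2, 3.6), 2))
```

Swapping in rating frequencies from your own review data would give a more honest picture of how narrow your real scale is.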

Lucky for you, your hot new technology solution does a good job of hiding this clutter behind sexy org charts and matrices when rolling up the data.

But what does that mean for the decisions you're making about your talent?  You may be able to separate some of the wheat from the chaff with this data, but when it comes to making more individual talent decisions, is a difference of 0.03 in an overall performance score meaningful?  Or for that matter, is it legally defensible?

The answer to both questions is "Probably not." 
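A rough back-of-the-envelope check makes the point. The sketch below treats each overall score as the mean of 10 independent, equally noisy ratings and assumes a spread of 0.8 points per rating. Both are assumptions chosen for illustration, and the two helper functions are hypothetical, not a formal significance test.

```python
import math


def difference_in_standard_errors(score_a, score_b, rating_sd=0.8, n_factors=10):
    """Express the gap between two overall scores in standard errors.

    Assumes each score is the mean of n_factors independent ratings that
    share the same standard deviation (rating_sd). Both are illustrative
    assumptions.
    """
    se_one_score = rating_sd / math.sqrt(n_factors)
    se_difference = math.sqrt(2) * se_one_score
    return (score_b - score_a) / se_difference


def smallest_notable_gap(rating_sd=0.8, n_factors=10, z=1.96):
    """Gap needed to reach z standard errors under the same assumptions."""
    return z * math.sqrt(2) * rating_sd / math.sqrt(n_factors)


if __name__ == "__main__":
    print(difference_in_standard_errors(3.76, 3.79))
    print(smallest_notable_gap())
```

Under these assumptions the 0.03 gap between Mr. 3.76 and Ms. 3.79 is less than a tenth of one standard error, while a gap would need to be roughly 0.7 points before it cleared the conventional 1.96 threshold.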

So what's the solution?  Hint: it's not expanding your scale to 1.0-10.0.  That's like manipulating one axis of a chart to make insignificant differences look gargantuan (below).

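[Figure "Blog_example_5": a chart with one axis manipulated so that insignificant differences look large; image not available]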

A better approach is to have a process for collecting performance data that you know yields objective results.  Using behaviorally anchored rating scales (BARS), which you can think of as SMART goals for competencies, is one method.

Then, you've got to do what David Pogue has casually done with IMDB: analyze the data to find the thresholds and figure out what kinds of differences are really meaningful. 
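As a starting point, that kind of analysis can be as simple as looking at where scores actually fall. The sketch below uses Python's `statistics.quantiles` to pull decile cut points out of a list of overall scores. The sample scores, the function name, and the choice of the 10th, 50th, and 90th percentiles are placeholders for illustration, not recommendations.

```python
import statistics


def score_thresholds(scores):
    """Return the 10th percentile, median and 90th percentile of scores."""
    deciles = statistics.quantiles(scores, n=10)  # nine cut points
    return {
        "bottom_10pct": deciles[0],
        "median": deciles[4],
        "top_10pct": deciles[8],
    }


if __name__ == "__main__":
    # Placeholder data: evenly spread scores from 2.0 to 4.5.
    sample_scores = [round(2.0 + 0.1 * i, 1) for i in range(26)]
    print(score_thresholds(sample_scores))
```

Cut points like these describe the distribution, but whether a gap between two people is meaningful still depends on how noisy the underlying ratings are.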

If you really want talent data that you can confidently (and legally) report from and use for making important talent decisions, I just don't see much alternative.  The problem is that there isn't an HR technology vendor out there whose solution can do it for you, and I really don't see any heading in that direction.

Comments

Yes, statistical significance is a foreign concept in much of corporate America. Plenty of scorecards abound that rank-order plants, people, etc., leading the recipients to believe that being ranked #3 is different from and better than being ranked #4, when statistically they are the same. I've often said I want to write a book and call it "Pretend Science", a series of anecdotes about Six Sigma and other tools misapplied with negative and sometimes funny results.

Anyway, I didn't mean to pick apart your analogy... your point is well made about striving for objective, observable assessment criteria. This allows the rating scale to be more a measure of frequency or consistency, as opposed to a good-bad scale.

Thanks for the dialog.

-Dave Polacheck

Thanks for your comment, Dave. Good stuff. We are in near total agreement. Though the analogy isn't perfect, I think it is telling. And your description of the way this data has been used in your experience resonates with what I've seen and heard elsewhere. Still, I think the fact that performance data is increasingly hidden behind fancy graphical reporting mechanisms, without first being analyzed to discover what's statistically significant, poses some unique challenges that aren't currently being addressed by the solutions on the market.

Intriguing thoughts, and we've all seen the "regression to the mean" of our rating scales (yes, I'm using that statistical term out of context). However, there is a significant difference between lots of people rating the same set of movies -- according to The Wisdom of Crowds they've probably got it right -- and lots of managers rating many different employees, generally leaving us with only a single set of ratings for each employee.

IMO, we need to take our performance management processes as the subjective measurement systems that they are. I've participated in and led these processes as an HR partner in a 3k-employee company, a 30k-employee company and a 300k-employee company, and I've yet to see truly meaningful and on-target action taken purely on the basis of the ratings. The most effective processes I've seen allow for the gathering of data, but use that data more as a catalyst for thoughtful, challenging dialog about each level of the organization -- how the team is performing, who's ready for a bigger role, who's struggling, etc. -- from the shop floor engineer and remote field sales rep to the senior business leadership teams. Through these discussions, you achieve as much calibration of ratings as is probably possible in a big organization, and I've found that the quality and consistency of these organizational reviews are probably a decent predictor of the success of the company over time.

Disclaimer

  • The opinions in this blog are my own, and do not necessarily reflect the views of PDI.