Saturday, September 20, 2008

Measuring Things

Everyone loves to measure things. Eric told me the other day of a story he heard at NFJS about how Henry Ford publicly measured employees' performance producing I-beams. The mere act of measuring publicly increased the number of I-beams produced (I looked around on the internet but couldn't find any references).

Recently at work we started playing the Hudson Continuous Integration Game. In this "game" there is a public record of points. Your check-ins can net you points or lose you points. The rules that we play with are:

  • -10 points for breaking a build
  • 0 points for breaking a build that was already broken
  • +1 point for a build with no failures (unstable builds give no points)
  • -1 point for each new test failure
  • +1 point for each new test that passes
  • +3 points for removing a warning or TODO, or fixing FindBugs errors
  • -3 points for checking in a warning or TODO, or creating FindBugs errors

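The rules above are easy to express as a simple scoring function. This is just a sketch of the point arithmetic as we play it, not the plugin's actual implementation (the event names here are made up for illustration):

```python
# Point values for each rule, as described above.
RULES = {
    "broke_build": -10,
    "broke_already_broken_build": 0,
    "clean_build": 1,
    "new_test_failure": -1,
    "new_test_pass": 1,
    "fixed_warning_todo_findbugs": 3,
    "added_warning_todo_findbugs": -3,
}

def score_checkin(events):
    """Sum the points for a check-in given (event, count) pairs."""
    return sum(RULES[event] * count for event, count in events)

# Example: a clean build that adds 2 passing tests but checks in 1 TODO
print(score_checkin([("clean_build", 1),
                     ("new_test_pass", 2),
                     ("added_warning_todo_findbugs", 1)]))  # → 0
```

So a check-in can be a net wash: the TODO penalty cancels out the credit for the clean build and the new tests, which is exactly the tension the TODO rule creates.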
Each month we reset the scores. This is the third month we're doing this. The top three get prizes (a toy from the dollar store to display proudly on their desk); the loser also gets a toy, the cockroach of shame. We're halfway through this month, and the top three all have 200+ points (at the moment I'm #2), then the point count drops off considerably. I believe #4 has 100 points, and the person in last place has -1 (note: 25% of the people playing are full-time developers, the other 75% are scientists who do a little development, but everyone plays).

This is far from a perfect measurement of performance, but I have to tell you the fear of public ridicule for having low points (or just my competitive nature) has certainly made me go right back and fix any FindBugs errors, and implement TODOs rather than just leave them there. It's kind of neat on a personal level, but it has had an effect on our team as well. It's encouraged other people to clean up their warnings and fix easy problems, and it's started a lot of discussions about good coding practice (I think this has been the most valuable thing it's done). The most controversial rule is losing 3 points for checking in a TODO.

There are a number of people who feel that checking in a TODO shouldn't lose you any points, that it will encourage people to just not mark things as TODO when they should be. I can totally see this point. On the other hand, no matter how much I hate to lose points, if I have 2-3 things in a month that really are TODOs and I don't have time to implement the feature right then, I'm okay losing 6-9 points: I created work by checking in, so I should get dinged. In my eyes this encourages people not to check in if they're going to create work for other people.

The whole process has been very fun. And it's started a number of conversations about development with people who weren't talking about it so much. I'm currently measuring the success of the game by how much people are talking about it. This month that measurement is at 104, I hope next month the success of the game is 150.


redsolo said...

Great to hear that it actually works and does not turn the competition into a blood bath.

Any ideas on how to extend the game plugin?

Ash said...

I agree, the level of talking is a huge bonus. I especially like how bad code is exposed (you have no excuse for leaving unused imports, etc, around).

I have toyed with the idea of writing a script/screen scraper to compile statistics over time: we could have graphs showing points over time. It would also be useful to get an overview of how people got their points -- i.e. all for tests, for fixing bugs, etc. A means of configuring the rules (aside from compiling a custom version) would be cool.

We just need to stop being lazy and start contributing, this plugin is great!
