Who Is +1? Coder VAR
If you follow baseball statistics or read baseball sportswriters like those at ESPN.com, you may have heard about VORP or WAR over the last few years. These intense-sounding acronyms stand for Value Over Replacement Player (VORP) and Wins Above Replacement (WAR). While the way these are calculated for baseball players is somewhat arcane (in fact, people use at least 3 different methods to calculate WAR), the concept itself is simple and persuasive, which has led to the increased popularity of these statistics.
The general concept behind VORP and WAR is that a good way to identify the value of a player is to rate them relative to an average replacement player. For example, let’s say you want to rate the 3rd baseman for your favorite baseball team. You could look at his offensive and defensive statistics, and compare the details to 3rd basemen on other teams. But VORP and WAR attempt to create a single number that rates your 3rd baseman against an “average” replacement 3rd baseman. An above-average VORP/WAR means that your player contributes more than an average replacement.
This is based on common statistical techniques (various comparisons to average). A similar idea can be useful as a way to categorize coders and their skills. I call this Value Above Replacement, or VAR. Unlike baseball’s VORP or WAR, which seek to create a single metric to rate baseball players, the concept of VAR can be applied to any metric. The formula is based on using standard deviation, as follows:
For any given metric X –
- Calculate X for every coder
- Calculate the average (mean) of X across all coders
- Calculate the population standard deviation of X across all coders
- Calculate VAR X for each coder as VAR X = ((X of coder) - (average X)) / (standard deviation X), then truncate the result to a whole number by dropping the fractional part (so 1.2 becomes 1 and -0.3 becomes 0)
For any given metric, this shows you how many “standard deviations” each coder is from the average. If you are familiar with normal distributions or bell curves, this applies the same concept. Someone who is in the top 3% for a given area (measured by metric X) might be a +2 for example, meaning that the person is 2 standard deviations above average (and, conversely, someone who is in the bottom 3% might be a -2).
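The steps above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation. The function name `value_above_replacement` is made up for this example, "truncate" is interpreted as dropping the fractional part (rounding toward zero), and the case where every coder has the same value (a standard deviation of zero) is handled by returning 0 for everyone, which is an assumption the steps above do not cover.

```python
import math
from statistics import mean, pstdev


def value_above_replacement(values):
    """Return the VAR (truncated standard deviations from the mean) for each value.

    `values` holds metric X for every coder, in a fixed order.
    The function name is hypothetical, chosen for this sketch.
    """
    avg = mean(values)
    sd = pstdev(values)  # population standard deviation, not the sample one
    if sd == 0:
        # Assumption: if everyone has the same value, nobody is above or below average.
        return [0 for _ in values]
    # math.trunc drops the fractional part, so 1.2 -> 1 and -0.3 -> 0.
    return [math.trunc((x - avg) / sd) for x in values]
```

Because the result is truncated rather than rounded, a coder has to be a full standard deviation away from the average before leaving the 0 group.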
Those of you whose teachers “graded on the curve” in school might be recoiling at the thought of applying this to coders. But as I’ve mentioned in other writing on metrics, this isn’t a grading system. Metrics are best used as a categorization system that helps you more objectively identify or confirm the strengths and weaknesses of individuals and teams. In this regard, VAR can be extremely helpful as a way to focus on the most meaningful data, namely the distribution and categorization of contributions and skills.
For example, let’s say you have 7 coders on a software team, and you measure their productivity by looking at the number of tasks each coder completes and the complexity of each task (this metric is called Points in Codermetrics). For a one-month period, you might have data like the following:
- Coder A Productivity = 24 tasks completed x average task complexity 2 = 48
- Coder B Productivity = 20 tasks completed x average task complexity 2 = 40
- Coder C Productivity = 26 tasks completed x average task complexity 1 = 26
- Coder D Productivity = 38 tasks completed x average task complexity 1 = 38
- Coder E Productivity = 17 tasks completed x average task complexity 3 = 51
- Coder F Productivity = 22 tasks completed x average task complexity 2 = 44
- Coder G Productivity = 15 tasks completed x average task complexity 3 = 45
For this group, then, the average productivity for the month is 42 and the standard deviation is 8 (both rounded). With these values, you can calculate the Productivity VAR for each coder. I’ve created an example Google Docs spreadsheet, which you can access here, and it has also been posted in Shared Resources. Below is a screenshot showing the calculated values for this group.
This provides a good example of how Coder VAR helps. If you just look at the Productivity metrics, as highlighted in the pie chart, it isn’t that easy to see how the values are grouped, and which values stand out as separate from the others. With Productivity VAR, you can easily see that there are three groups, one high (Coder E at +1), one low (Coder C at -2), and one in the middle (everyone else at 0).
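For readers who would rather check the numbers in code than in a spreadsheet, here is a minimal Python sketch that recomputes Productivity VAR for the seven coders listed above. The variable names are made up for this example, and the calculation uses the unrounded mean and standard deviation rather than the rounded 42 and 8.

```python
import math
from statistics import mean, pstdev

# (tasks completed, average task complexity) for the month, from the list above
team = {
    "A": (24, 2),
    "B": (20, 2),
    "C": (26, 1),
    "D": (38, 1),
    "E": (17, 3),
    "F": (22, 2),
    "G": (15, 3),
}

# Productivity (Points) = tasks completed x average task complexity
points = {coder: tasks * complexity for coder, (tasks, complexity) in team.items()}

values = list(points.values())
avg = mean(values)
sd = pstdev(values)  # population standard deviation

# Truncate toward zero to get the whole-number VAR for each coder
var_by_coder = {coder: math.trunc((p - avg) / sd) for coder, p in points.items()}

# Collect coders under each VAR value to make the groups visible
groups = {}
for coder, var in var_by_coder.items():
    groups.setdefault(var, []).append(coder)

for var in sorted(groups, reverse=True):
    print(f"VAR {var:+d}: coders {', '.join(groups[var])}")
```

With this data the grouping is the same whether you use the rounded values or the unrounded ones: Coder E lands at +1, Coder C at -2, and the other five coders at 0.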
In studying codermetrics for your software team, this is often the kind of information that can be extremely useful. How many (if any) coders are above-average or below-average in a specific area? What specific areas of strength might be lost if someone left and was replaced by an average coder? Do areas of strength or weakness correlate with coders’ level of experience, and what areas of weakness might be improved?
Coder VAR also provides a useful way to discuss and convey key findings from your metrics. This is part of what has driven the popularity of WAR in baseball. It’s easier to understand if you say “Coder E is plus one for Productivity” or “Coder B’s Productivity is average,” than if you say “Coder E’s Productivity was fifty-one” or “Coder B’s Productivity is forty.” Or if you are looking to hire a new coder, it might be useful to know and discuss that you are looking for “a coder whose Productivity is plus one.”
There are some limitations to be aware of when using VAR. For example, VAR draws a line that separates values that may in fact be close, such as Coder A (Productivity 48 and Productivity VAR 0) and Coder E (Productivity 51 and Productivity VAR 1) in the data above. This points to the proper use of VAR, which is as a method of categorization, not a method of detailed analysis and certainly not a method of grading. Also, you should be careful about VAR analysis of coders who are known to have different levels of experience or who have very different roles on your team. Coder VAR is best used as an analysis of coders who are generally similar in experience and roles, and therefore you might want to analyze VAR separately for your senior and junior coders, for example.
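To make the first limitation concrete, the sketch below prints the score before truncation next to the truncated VAR for Coder A and Coder E, using the Productivity values from the example data. The variable names are made up for this illustration, and the scores are computed from the unrounded mean and standard deviation.

```python
import math
from statistics import mean, pstdev

# Productivity (Points) per coder, from the example data
points = {"A": 48, "B": 40, "C": 26, "D": 38, "E": 51, "F": 44, "G": 45}

values = list(points.values())
avg = mean(values)
sd = pstdev(values)  # population standard deviation

# Standard deviations from the average, before any truncation
raw_scores = {coder: (p - avg) / sd for coder, p in points.items()}

for coder in ("A", "E"):
    raw = raw_scores[coder]
    print(f"Coder {coder}: Points {points[coder]}, "
          f"score before truncation {raw:.2f}, VAR {math.trunc(raw):+d}")
```

The two coders are well under half a standard deviation apart before truncation, yet they fall on opposite sides of the +1 line, which is why VAR is better read as a rough category than as a precise ranking.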
The biggest limitation of Coder VAR is that it is relative to your population, so it is limited if your population (the number of coders analyzed) is small and isolated. For example, if you have a team with three senior coders, then it can be somewhat useful to look at their Productivity VAR. But it would also be helpful to know how the coders’ Productivity compares to other senior coders, either on other teams in your organization, or in other organizations. Maybe your senior coders are all similarly productive (Productivity VAR at or near 0 within the team) but maybe they are highly productive compared to other senior coders (Productivity VAR +1 or +2 when compared across teams). This is a general problem with codermetrics, namely the lack of normalized data that we can all share, and something I hope to address more in the future. For now, those in larger organizations can address this by measuring across teams and applying techniques to establish normalized baselines (something also discussed in my book).
As with other metrics, however, if you are aware of the limitations then Coder VAR can still be very useful. It can help you to increase your understanding of team dynamics, to identify and analyze the characteristics of successful teams, and to plan ways to improve your software team.