Practical Software Measurement

The Problem of Measuring Software Productivity

So, just why do we want to measure software productivity (without using the root word “productive” in the answer)?  I believe it comes down to the desire to evaluate an inherently complex process numerically, so that quantitative comparisons can provide a basis for decision making:

  • Determine whether output per unit of labor or cost is increasing or decreasing
  • Benchmark against “the industry” or “the competition”
  • Identify practices that promote or impede increased output and better quality

I’m sure there are many others that could be added to the list.

Issues

Traditionally, software productivity has been measured as a ratio between units of output and units of effort.  Simple productivity measures worked fairly well for well-defined, repetitive manufacturing processes, where a 10% increase in input reliably translates into a comparable increase in output.  There are serious problems, however, with applying simple productivity measures to complex, non-repetitive design processes like software development.

First, just what are those units of output?  The first measure to be used – and one that is still used – was logical source lines of code (SLOC).  Software projects, after all, still produce code.  However, not all lines of code are equal.  Programs written in COBOL, PL/I, or basic assembly language require far more code than ones written in fourth-generation languages (4GLs), even when they deliver equivalent functionality.  Productivity measured this way actually penalizes efficiency: it equates size with the quantity of code rather than the amount of functionality delivered, so a terser implementation looks less productive – hardly a valid economic measure.  Add the question of how to account for generated code and reusable objects, and the use of source lines of code for measuring productivity becomes anything but straightforward.
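
The distortion is easy to see with a small sketch.  The figures below are entirely hypothetical: two projects that deliver the same functionality (measured in function points) with the same effort, one in a verbose language and one in a terse one.  The SLOC-based ratio makes the verbose project look five times more productive, while the functionality-based ratio correctly shows them as equal.

```python
# Illustrative (made-up) figures: two projects delivering identical
# functionality, written in languages of very different verbosity.
projects = {
    "cobol_project": {"sloc": 50_000, "function_points": 400, "person_months": 40},
    "4gl_project":   {"sloc": 10_000, "function_points": 400, "person_months": 40},
}

results = {}
for name, p in projects.items():
    results[name] = {
        # SLOC per person-month rewards verbosity...
        "sloc_per_pm": p["sloc"] / p["person_months"],
        # ...while function points per person-month measures delivered value.
        "fp_per_pm": p["function_points"] / p["person_months"],
    }

for name, r in results.items():
    print(f"{name}: {r['sloc_per_pm']:.0f} SLOC/PM, {r['fp_per_pm']:.1f} FP/PM")
```

Both projects delivered 10 FP per person-month, yet by the SLOC measure the COBOL project appears 1,250 SLOC/PM “productive” against the 4GL project’s 250.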

Function point analysis (FPA) recognized the problems with SLOC and focused on measuring the functionality a software project delivers by quantifying its inputs, outputs, and data design requirements.  For input/output (I/O)-intensive systems this works fairly well.  For real-time, middleware, and computationally complex systems, however, FPA encounters a problem similar to SLOC’s.  These systems, while generally more difficult to design and implement than business systems, normally create far fewer of the artifacts measured by FPA.  As a result, their size measured in function points is smaller, and much of the work is not captured at all.  Consequently, the business and economic value of such projects is understated.
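
To make the counting concrete, here is a simplified sketch of an unadjusted IFPUG-style count using the “average” complexity weights (a real count assigns low/average/high weights per component, and the project profiles below are hypothetical).  Note how a compute-heavy engine with few inputs, outputs, and files scores far lower than a business system, regardless of how hard its internals were to build.

```python
# IFPUG "average" complexity weights for the five component types.
AVG_WEIGHTS = {
    "external_inputs": 4,
    "external_outputs": 5,
    "external_inquiries": 4,
    "internal_logical_files": 10,
    "external_interface_files": 7,
}

def unadjusted_function_points(counts):
    """Weighted sum of the five IFPUG component counts."""
    return sum(AVG_WEIGHTS[k] * counts.get(k, 0) for k in AVG_WEIGHTS)

# Hypothetical profiles: an I/O-rich business system vs. a
# computationally complex engine of comparable difficulty.
business = {"external_inputs": 30, "external_outputs": 25,
            "external_inquiries": 20, "internal_logical_files": 10,
            "external_interface_files": 5}
engine = {"external_inputs": 3, "external_outputs": 2,
          "external_inquiries": 1, "internal_logical_files": 2,
          "external_interface_files": 1}

print(unadjusted_function_points(business))  # 460
print(unadjusted_function_points(engine))    # 53
```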

The second issue revolves around the other half of the ratio: effort.  The fact is that effort is not collected consistently between organizations and, often, not even within the same organization.  Ideally, the actual effort hours dedicated to the software project should be counted.  However, many organizations obtain effort from the time-tracking system used for billing, in which case the reported effort is often overstated (or understated in overtime situations).  “Hours billed or paid for” is not always the same as “hours actually worked.”  Additionally, teams on fixed-price contracts have little incentive to capture or report actual effort hours accurately; when these projects experience difficulties, overtime hours may be underreported, if reported at all.

The third issue concerns the very nature of ratio-based effort/cost productivity measures.  Because they focus on the effort or cost to produce X units of deliverable software, they completely ignore one of the most critical business drivers in software development: schedule.  The relationship between schedule and effort is non-linear.  When a project’s schedule is compressed below the optimum, the amount of effort required to create the same quantity of output rises dramatically (a model of this is illustrated in The Laws of Software Staffing).  A productivity measure that does not account for the impact of schedule is missing what may be the most important factor influencing it.
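
The non-linearity can be sketched with a Putnam-style software equation of the form size = productivity × effort^(1/3) × time^(4/3), solved for effort.  The size and productivity constants below are arbitrary placeholders, not calibrated values; the point is the shape of the curve, under which effort scales as the inverse fourth power of schedule.

```python
def effort_for_schedule(size, productivity, months):
    """Putnam-style software equation solved for effort:
        size = productivity * effort**(1/3) * months**(4/3)
    =>  effort = (size / (productivity * months**(4/3))) ** 3
    Units and constants here are illustrative, not calibrated."""
    return (size / (productivity * months ** (4.0 / 3.0))) ** 3

nominal = effort_for_schedule(size=100_000, productivity=5_000, months=12)
compressed = effort_for_schedule(size=100_000, productivity=5_000, months=10)

ratio = compressed / nominal
print(f"Compressing 12 months to 10 multiplies effort by {ratio:.2f}")
```

Under this model, shaving two months off a twelve-month schedule (a 17% compression) roughly doubles the effort: (12/10)^4 ≈ 2.07.  A simple output/effort ratio would record that as a productivity collapse, when the real cause was schedule pressure.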

The fourth issue that productivity measures have to address is far less well known: project size influences productivity.  Whether using a ratio measure or one like Quantitative Software Management’s (QSM) Productivity Index, productivity tends to increase with size (examples that illustrate this may be found in the webinar, Function Point Analysis Over the Years).  As a result, only projects of similar size can be compared to each other using ratio-based productivity.

The fifth issue is the real elephant in the room: nobody knows what all of the factors that affect productivity are.  Moreover, how these factors interact with and impact one another is also unknown, and it certainly varies from project to project.  For instance, how is productivity influenced by having an experienced, skilled team while at the same time having poorly defined requirements and limited access to the customer?  QSM’s solution to this quandary is the Productivity Index (PI), which captures the combined impact of all of these factors (size, effort, schedule, environment) and calculates an overall measure of project efficiency based on:

  • How much software the project developed
  • How much effort was expended in developing that software
  • Project duration

We have found that when organizations have some discipline in their development processes, the PIs of their projects follow a consistent pattern: one that can be used to estimate future projects and to measure organizational productivity trends over time.  The PI is elegant in its simplicity.
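
A simplified sketch of the idea: rearrange the same software-equation form so that productivity is what you solve for, given the three observed quantities above.  QSM maps this continuous value onto its discrete PI scale using its own calibration (and measures effort in skill-adjusted person-years), so the raw number below is only a stand-in for the real index; the project figures are hypothetical.

```python
def process_productivity(size, effort, months):
    """Software-equation form rearranged for productivity:
        size = productivity * effort**(1/3) * months**(4/3)
    =>  productivity = size / (effort**(1/3) * months**(4/3))
    QSM converts a value like this to its discrete Productivity
    Index via a proprietary calibration; units here are arbitrary."""
    return size / (effort ** (1.0 / 3.0) * months ** (4.0 / 3.0))

# Two hypothetical projects with identical effort and duration:
baseline = process_productivity(size=50_000, effort=27, months=8)
improved = process_productivity(size=100_000, effort=27, months=8)

# Delivering twice the size in the same time and effort doubles
# the computed process productivity.
print(f"baseline={baseline:.1f}, improved={improved:.1f}")
```

Because size, effort, and duration all appear in one equation, a project cannot game the measure by trading one for another: compressing the schedule or padding the team shows up directly in the index.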

What to do?

It would be nice if there were a simple solution to the problems of measuring software productivity, but there isn’t.  However, there are steps that can be taken to reduce the measurement challenges.  A common theme throughout this paper is that valid comparisons must be made on an “apples to apples” basis.  Here are a few things that can be done.

  1. Collect data from your organization’s projects.  Even if you do not have enforced standards in place, these projects were developed in a common company culture and share many similarities.
  2. Establish standards for software size and effort.
  3. Make the standards for software size and effort as simple and easy to follow as possible – or they won’t be followed!
  4. Use similar projects for productivity benchmarks.  Projects can be grouped into application domains based on the type of software they develop. (SLIM-Estimate breaks software projects down into domains where like-to-like comparisons can be made.)
  5. For every project, determine its most important driver.  When measuring productivity or making comparisons, use projects with the same driver.  The principal drivers are cost/effort, schedule, team size, and quality.  These shape how a project is planned and developed and have a significant impact on productivity.
    a.  Projects that optimize (minimize) cost/effort have smaller team sizes and better quality.  However, they take longer to complete.
    b.  Projects that optimize schedule cost more, have larger teams, and poorer quality.
    c.  Projects that optimize team size cost less, have higher quality, and complete in about the same time as those with a larger team size.
    d.  Projects that optimize quality take longer to complete and may cost more.

Perfection is the enemy of the possible.  The purpose of productivity analysis is to provide insight into the organization so that decision making has a sound basis.  Processes and standards need to be good enough to provide useful information – which will probably not be perfect – but also simple and practical enough to gain widespread acceptance and use throughout the organization. A perfect process that no one understands and everyone hates is worse than a practical process that can be successfully and repeatably used across the enterprise.
