Several weeks ago I read an interesting study on finding bugs in giant software programs:
The efficiency of software development projects is largely determined by the way coders spot and correct errors.
But identifying bugs efficiently can be a tricky business, when the various components of a program can contain millions of lines of code. Now Michele Marchesi from the University of Calgiari and a few pals have come up with a deceptively simple way of efficiently allocating resources to error correction.
...Marchesi and pals have analysed a database of java programs called Eclipse and found that the size of these programs follows a log normal distribution. In other words, the database and by extension, any large project, is made up of lots of small programs but only a few big ones.
So how are errors distributed among these programs? It would be easy to assume that the errors are evenly distributed per 1000 lines of code, regardless of the size of the program.
Not so say Marchesi and co. Their study of the Eclipse database indicates that errors are much more likely in big programs. In fact, in their study, the top 20 per cent of the largest programs contained over 60 per cent of the bugs.
That points to a clear strategy for identifying the most errors as quickly as possible in a software project: just focus on the biggest programs.
Nicole Tedesco adds her thoughts: