Electronic Design
Requiem for a Bug – Verifying Software: Testing and Static Analysis

Requiem for a Bug – Verifying Software: Testing and Static Analysis

This article offers two approaches toward addressing the problem of software verification at increasing levels of sophistication.

What do you do to make your code behave as intended? And are the methods you chose as effective in finding problems as you thought? In this article, the first of a two-part series, I will write a piece of code in Ada and then try to find bugs with two different methods:

• Testing  by writing a simple unit test

• Static analysis using the CodePeer tool

Download this article in .PDF format
This file type includes high resolution graphics and schematics when applicable.

These approaches address the problem of software verification at increasing levels of sophistication. Although the example chosen is small, it will show what is and isn’t achievable with the various verification techniques.

The sample program that will illustrate the different methods is a binary search for a particular value in a sorted array; the search function returns the index of the value (if present) and zero otherwise. The same value may occur more than once in the array; in such a situation, the search function is to return the index of one match.

Binary search is a well-known algorithm to locate a given value in a sorted array. Instead of searching through the entire array from the start, a binary search will compare the array value in the middle of the array with the given value. If the former is greater, searching will continue to the left of that middle value; otherwise, it will continue to the right. In this manner, the binary search will split the search space in half at each iteration, until it either finds the value or concludes that it can't be in the array. A standard linear search has O(n) execution time, while a binary search is O(log2(n))

Binary search is a very basic algorithm, but it is surprisingly easy to make mistakes when programming it. This has been pointed out by great minds such as Donald Knuth1 and Jon Bentley2. If you’re interested in further information, I invite you to read a recent “Stack Overflow” discussion3, but only after reading my articles.

Here’s my first try at implementing the binary search algorithm:

type Integer_Array is array (Positive range <>) of Integer;
   -- Positive is the subset of Integer values in the range 1 .. Integer'Last
   -- An object of type Integer_Array has a Positive lower bound and any upper bound 

function Search (A : Integer_Array; Value : Integer) return Natural is
   -- Natural is the subset of Integer values in the range 0 .. Integer'Last
  Left  : Positive := A'First; -- Index of lower bound
  Right : Positive := A’Last;  -- Index of upper bound
  Split : Positive;
   begin
  while Left <= Right loop
     Split := (Left + Right) / 2;
     if A (Split) = Value then
        return Split;
     elsif A (Split) < Value then
        Left := Split + 1;
     else
        Right := Split - 1;
     end if;
  end loop;
  return 0;
   end Search;

This implementation assumes that the array indices are positive numbers (but not necessarily starting at 1). However, the result of the function is a Natural value, which includes zero. This allows the function to return the special value zero, which cannot be a valid index in the array, to indicate that the value was not found.

The algorithm maintains two indices—Left and Right. These indicate the zone in which the algorithm is currently searching. Anything outside this range has been ruled out. At each step, the algorithm computes the middle point between Left and Right, compares the value at that index with the given value, and either returns the result or rules out the left or right area from the zone to be searched. Once the zone to be searched becomes empty (the exit condition of the loop), the algorithm can return zero, indicating that the value is not there.

That looks reasonable, right? We’ll try testing first to see if there are any problems. I tested this function in the following way, which looked pretty exhaustive to me:

I created an array of 10 random integers between 1 and 100, sorted in ascending order. Then, I used the Search function above with this array as the first parameter, and all integers between 1 and 100 as the second parameter. For each of these values, I checked whether the result was the expected one: 0 if the value is not in the array; otherwise, it returns an index such that A (I) = Value.

In fact, my function crashed on the first test, with search value 1! Why is that? I forgot to take into account the case where Value is already smaller than the first element of the array. In this case, the algorithm will set the variable "Right" to smaller and smaller values, until it tries to set it to zero. This will ultimately fail with a runtime check because Right is constrained to be Positive, which excludes zero.

The solution is to protect against this case specifically by adding the following check before the while loop:

if A (Left) > Value then
    return 0;
  end if;

My seemingly exhaustive test found no more problems.

Shown is a screenshot of a CodePeer session with CodePeer messages and computed preconditions.

Then I ran the static analysis tool CodePeer, which examines Ada code for potential run-time errors and logic bugs. CodePeer reports problematic constructs in source code, and computes preconditions for functions, i.e., conditions that need to be met by the caller. A screenshot of the CodePeer report for our example appears in the figure. In this case, it is very interesting to look at the computed precondition for our Search function:

   --    A(A'First)'Initialized
   --    A'Last >= 1
   --    A'First <= A'Last

It seems to require that the array should be non-empty. And indeed my code will fail when passing it an empty array (e.g., when the lower bound is 1 and the upper bound is 0). My simple-minded testing strategy did not cover this case, because it only tested with a fixed array of size 10. The correction is very easy; we simply add an appropriate test at the beginning of the function. Also, since Right might not be Positive (i.e., when the array is empty), either its declaration needs to be changed or its initialization needs to be moved. We chose the latter, and for consistency, we also moved the initialization of Left.

Here is a modified version of the program that includes corrections to the two errors that we found:

function Search (A : Integer_Array; Value : Integer) return Natural is
   -- Natural is the subset of Integer values in the range 0 .. Integer'Last
  Left, Right, Split : Positive;
   begin
  if A'Length = 0 then
     return 0;
  end if;
  Left  := A'First;
  Right := A'Last;

  if A (Left) > Value then
     return 0;
  end if;

  while Left <= Right loop
     Split := (Left + Right) / 2;
     if A (Split) = Value then
        return Split;
     elsif A (Split) < Value then
        Left := Split + 1;
     else
        Right := Split - 1;
     end if;
  end loop;
  return 0;
   end Search;

With its default settings, CodePeer did not identify any additional bugs in this code. Is the program now correct? We’ll answer this issue in Part 2 of this series, when we use SPARK 2014 and formal methods to augment our verification approach.

Download this article in .PDF format
This file type includes high resolution graphics and schematics when applicable.

 

References:

1. D. Knuth, The Art of Computer Programming, Volume 3: (2nd ed.) Sorting and Searching; Addison Wesley Longman; 1998

2. J. Bentley, Programming Pearls (Second Edition); Addison-Wesley; 2000.

3. http://stackoverflow.com/questions/504335/what-are-the-pitfalls-in-implementing-binary-search

TAGS: Dev Tools EDA
Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish