Interview: David Cok Explains Annotations For All

GrammaTech is a leading research firm specializing in innovative new techniques for preventing damaging cyber-attacks, with a commercial products division that manufactures CodeSonar, software that provides static analysis for source and binary code.

GrammaTech’s research division undertakes projects for U.S. government agencies and other contractors, including NASA, the NSF, and many branches of the Department of Defense. One of their recent projects from the National Science Foundation resulted in a website that they announced today, as part of an initiative called Annotations for All.In many ways, this initiative is similar to the contract functionality found in Ada 2012 (see “Ada 2012: The Joy of Contracts”).

I talked with David Cok, the Vice President of Technology at GrammaTech, to find out more about this initiative, and learn more about why they’re partnering with the NSF in the name of software annotations.

Wong: What is Annotations for All?

Cok:Annotations for All is an initiative to harness the power of the software development community at large in order to grow the practice of using software annotations, static analysis, and model-checking tools. We originally just called the project “Crowdsourcing Annotations” because at the project’s core, we are hoping to encourage developers to contribute annotations to widely-used libraries, which will ultimately make them safer to use.

Wong: How do annotations make code safer?

Cok: Many of the exploitable vulnerabilities in code come from within third-party libraries, or from using them improperly, as opposed to coming from the code that has been written by developers in-house, which is usually much more thoroughly scrutinized before deployment. With the addition of reliable annotations to accompany these libraries, static analysis tools can better check to make sure a program will not fail.

Wong: So what exactly are software annotations? How are they used?

Cok: Annotations are written by a programmer and provide information about what a program is supposed to do and what it is not supposed to do. There are annotation languages that correspond to specific programming languages, as well as tools that help developers write and check software annotations. The word “specifications” is also used interchangeably with “annotations” – which is logical because annotations specify what a program is supposed to do, as well as any expected side effects. Here is a simple example of an annotated C header for a program that swaps the values at two pointers, using ACSL:

â¨/*@ requires \valid(p) && \valid(q); // both pointer must be to valid memory
â¨ assigns *p, *q; // Only *p and *q are changed by this routine
â¨ ensures *p == \old(*q) && *q == \old(*p);  // values are swapped compared to the pre-stateâ¨ 
*/â¨

void swap (int * p, int * q);

A corresponding example for Java (using the Java Modeling Langauge) is (Java references are always to valid memory):

â¨/*@ ensures a.val = \old(b.val) && b.val == \old(a.val);
â¨ assignable a.val, b.val;â¨
*/

â¨void swap_val_field (Val a, Val b);  // Here Val is a class with a val field.

A more important aspect is lack of buffer overflow. ACSL expresses that this way:

â¨â¨/*@ requires \valid(dest + (0..n) && \valid_read(src+(0..n)) && n >= 0; // reads and writes won't overflow
assigns dest[0..n] ; // only the dest array is changed by this routine
ensures (\forall integer i ; 0 <= i < n ==> dest[i] == \old(src[i]) ); // the dest array receives the expected valuesâ¨
*/

â¨void copy(int * dest, int * src, int n);â¨â¨

Note that the specification implies a certain behavior if the arrays src and dest overlap in memory. An implementation may or may not implement a copy consistent with the spec.

Wong: What about static analysis tools? Can’t those tools check to make sure code is functioning properly?

Cok:Well, static analysis tools can use these annotations to check code more fully. A static analysis tool, without annotations, can determine that a program won’t fail – but that doesn’t necessarily mean that the program will operate within the scope of the programmer’s intentions. By using a machine-readable annotation language, a developer can tell a static analysis tool to not only check that the software won’t fail, but check that it is in fact operating within his or her scope of intent. Furthermore, static analysis tools operating on source code must make assumptions about the library routines the program calls for which source code is not available (or analyze the binary itself). Annotations make those assumptions explicit and more precise. They help make the static analyses more accurate and faster.

Wong: Why is GrammaTech undertaking this project with the NSF?

Cok: It’s pretty simple, really. The overarching goal of all of our research is to make code – and the resulting software – safer once deployed on devices. We rely on software to keep us safe, as it flies our airplanes, administers correct dosages of medicine, keeps us connected throughout the world, etc… In today’s world, safety is both freedom from physical harm and freedom from security breaches. The better the technology is for keeping the software safe, the more we can depend on the software itself to keep us and our surrounding world safe from device failures and hacking attempts. So it’s pretty natural for us to be interested in growing the adoption of software annotations because ultimately, they help the tools we develop function at their highest level.

Wong: How can developers participate in Annotations for All?

Cok: Great question – the best way developers can contribute to the project is by contributing annotations for open-source libraries. This will have the most widespread benefit. Other opportunities to support the initiative include helping us test annotation tools or simply sharing any expert knowledge you might have. You can go to the contribute page on the Annotations for All website to learn more, to contribute, or just to let us know if you’re interested in the opportunity to contribute in the future.