Consumer demand for increasing functionality in devices has resulted in multi-core processors requiring more sophisticated cache control. Why? The presence of a large number of data-processing agents sharing a memory resource on a system-on-chip (SoC) requires that the agents maintain some type of locally cached data to reduce the data transportation cost. This, in turn, leads to the requirement for cache coherency to allow agents to cache data during processing and then make it available to the next processing agent. Today’s caches need to be intelligent in the sense that instead of invalidating lots of data that might turn out stale, the cache controllers need to invalidate on a line-by-line basis in order to ensure that anybody reading an address gets the latest value written.
How can one accomplish this? We will explore this question in this article using a unique, existing formal technology that can provide a method for modeling and verifying cache-coherent protocols, with an ARM AMBA AXI Coherency Extensions (ACE) protocol. We will demonstrate how to develop and verify system- and interface-level VIP for the ARM ACE protocol.
Cache-Coherent Protocol Specifications
The first step in creating the protocol is a painstaking approach to developing the protocol specification. Cache-coherent protocol specifications generally define and describe several key elements of the underlying protocol:
- The system components,
- The granularity of coherency, that is, the size of a cache line,
- The access rights that agents have to cache lines, that is, cache-line states,
- The transactions that operate on cache lines,
- The channels or networks the transactions travel on,
- The effect of the transactions on cache-line states, and
- A description of an illustrative subset of the transaction flows in the system.
Cache-Coherent Protocol Verification
After creating the specification of the protocol, successful verification of the protocol is an equally challenging task. Often, considerable energy goes into the verification of cache-coherent protocols with several forms of verification.
Documenting the protocol is the first form of verification and is effective because it forces careful consideration of various aspects of the protocol on the part of the specification writer. Contemplation of the flows often reveals things that need different handling by the protocol. In addition, creating action tables can highlight holes in the protocol and uncover ambiguities.
Moreover, peer reviews are also very common and naturally occur since there are so many consumers of the specification. The peer-review process often leads to interesting whiteboard discussions that promote clarification from the architect. The architect then incorporates the necessary updates into the specification to remove ambiguities.
The final level of verification is to build a model of the protocol, create high-level properties describing cache coherence, and run simulations against the properties or attempt to prove the properties with a formal verification tool. Note that there are many challenges inherent at this level of verification.
These challenges include:
- Creating a machine-readable specification,
- Ensuring protocol model soundness,
- Debugging the protocol model,
- Verifying the protocol model,
- Using the model for protocol comprehension, and
- Creating a reusable protocol model for RTL verification.
Creating a Machine-Readable Specification
A machine-readable, cache-coherent specification can be created by choosing a modeling language (such as Murphi, TLA, or Verilog), which then goes through the specification, creating the stated rules, capturing the necessary states, and coding the necessary state transitions at the specified moments. However, this process has its shortcomings for several reasons.
For one, the language chosen is generally unlike anything in the specification; for example, the specification is not likely to define variables or represent the rules in explicit if-then-else statements. Instead, there are words, tables, and drawings. Thus, the translation itself will likely lead to some errors. In addition, because of the necessary translation, it may be difficult to adapt the model when the specification is changed or enhanced. Further, modeling languages generally do not have good methods for capturing the clarifying comments that a specification might have, nor are they able to attach the comments to the rules or actions of the protocol.
As noted previously, specifications are heavily table-based. These tables concisely describe the legal input activity when the protocol is in a particular state and then describe the actions to take. A table-based specification can more clearly give a big-picture view of the operations. Of the two options, it is clear that a table-based, machine-readable input format is often preferred.
One means of creating a model for the ARM ACE protocol is with Jasper Design Automation’s ActiveModel table-based entry format. The format is similar to those used by many cache-coherent protocol specifications (Fig. 1). The current state and input activity appears in columns on the left-hand side of the table (in green) and the next-state transitions and other activity appears in columns on the right-hand side of the table (in yellow). The rows of the tables represent the various possible combinations of input activity and current state followed by the appropriate actions for each combination. This format is not only more like the tables that architects write, but it can be the actual tables the architects write, which eliminates one potential source of errors.
Ensuring Protocol Model Soundness
The next step is to question the soundness of the protocol model you have created. For example, if the properties you are verifying are not failing, how do you know that the model is sufficiently exercising all states of interest? In a typical coverage-driven verification flow for an RTL design, coverage points ensure that the input stimulus can get the design into all the interesting states. Likewise, protocol models return dividends when they instrumented with coverage points to ensure exploration of interesting states.
Recall that the rows of a generic action table represent the various possible combinations of input activity and current state followed by the appropriate actions for each combination. An advantage of Jasper’s ActiveModel technology is that, in combination with these tables, it removes the burden of manually creating cover properties by automatically generating covers for every row of every table. You can then use these row covers to determine whether you can actually reach the specified row. If a row cover is unreachable, the protocol may be over-specifying some cases that are not possible. Removing these over-specified cases adds clarity to the specification.
One can also automatically generate a full check with this technology. The full check ensures that at least one row of the table is always being referenced. This check identifies cases where the specification does not address the actions necessary for all the possible states the protocol can get into. The check flags these cases as holes in the specification and indicates that the specification must be re-examined and modified accordingly.
Moreover, these tables can also automatically generate parallel checks. The advantage of parallel checks ensures that each row-pair combination in the table does not overlap. An overlap is typically a coding problem with the table; eliminating the overlaps is essential for creating an unambiguous specification.
Debugging the Protocol Model
Debugging a cache-coherent protocol model typically is not much different than debugging any RTL-based design. A checker fails, a waveform shows the failure, and a typical process ensues of tracing back from the failure until the bug turns up. This process is generally adequate, but it has room for improvement. For example, to understand a failure trace from a protocol model, it is often beneficial to have a high-level view of what is occurring. In this situation, a link back from the property failure to the relevant tables in the specification is helpful. ActiveModel creates this link from a failing property to the appropriate rows in the table (Fig. 2). With the table in view, one gains a better, high-level grasp of what is occurring (or what is not occurring).
Another debugging difficulty with protocol-model failures is the fact that often many transactions must occur in parallel to create the appropriate sequence that violates the check. Ideally, if one could eliminate the unimportant transactions from the waveform showing the violation, the violation could be debugged more efficiently. A technology for doing so, appropriately called QuietTrace, is available when using ActiveModel. An example shows two waveforms, each showing all the row covers that were active in the violation trace (Fig. 3). The first appears without quieting of the waveform; it shows 50 firings of the row covers. The second is the equivalent trace obtained by QuietTrace technology. It shows only 13 firings of the covers. The 13 row covers are all likely contributing to the violation, drastically improving debug efficiency.
Verifying the Protocol Model
A cache-coherent protocol specification describes the rules that, when followed, maintain coherency among the caches in the system. These rules are the assumptions made about the operations of the components of the system. To confirm that adhering to these rules does indeed keep the caches coherent, you need high-level properties that define coherency.
Recall that high-level cache coherence means that a read to a coherent memory location reads the last data written to the same memory location (provided no other writes occur). In addition, all agents must observe writes to a coherent memory location in the same order. It is possible to construct these statements as high-level properties (assertions) for verification.
In addition, one can express common invariants about the caches in a system. For example, according to the ACE protocol, two caches cannot have the same cache line in a dirty state, and two caches cannot have the same cache line in a unique state.
A protocol model verified using simulation alone could only find bugs. A protocol model verified using formal techniques could find all those same bugs and prove the high-level properties. Protocol models are generally smaller than most RTL designs undergoing formal verification and are generally better understood (since there is a specification). These factors make protocol models an ideal fit for formal verification.
We performed formal verification on the ACE ActiveModel-based protocol model following the above methodologies and checklist. The formal verification process proved to be effective at both clarifying the specification and ensuring the correctness of the protocol model, used subsequently to verify RTL. Users of the protocol can now use the model to aid in protocol comprehension.
Using the Model for Protocol Comprehension
Another goal when creating an executable protocol model is to use the model to help users of the protocol, such as RTL designers and verifiers, gain quick comprehension of the protocol. The protocol specification, although rich with valuable information, is a static document. The transaction flows described or drawn in this document represent a small subset of the allowable transaction flows in a system. Consumers of the specification regularly wonder about some variations of the described transaction flows or what-if scenarios when viewing transaction flow drawings. The answers are probably somewhere in the words of the specification. Whereas the waveforms, drawings, and transaction flow descriptions are concise representations that allow for quick comprehension, they lack the depth of the specification’s text. The protocol model offers a way to arrive at the answers in a meaningful way that does not require the user to mine the specification’s text to find the relevant nuggets.
To generate the desired transaction flow, a user typically has a target in mind. For example, a user trying to understand the ACE protocol may want to see a case in which a ReadShared snoop hits a UniqueClean line. In a simulation model, this flow may be difficult to generate. The user would need to know the sequence of inputs needed to hit the condition or the user could write a coverage point and try to let the coverage point be hit through a random simulation. Both of these options are too time consuming to be practical. Thus, it is rare to use simulation models for protocol comprehension.
On the other hand, this use model (visualizing cases to build comprehension) is a perfect application for the ACE ActiveModel. A user can simply go to the ActiveModel table showing a master reacting to snoops and ask the tool to show a scenario for a row where a ReadShared snoop hits a UniqueClean line. An example shows visualization of a row indicating that a ReadShared snoop is hitting a UC line (Fig. 4). The tool quickly generates the nine-cycle waveform showing the desired flow (Fig. 5).
Notice that the row covers that were hit along the way are displayed in the waveform. We can see that a tool tip appears when we use the mouse to hover over any of the row covers on the cycle it was hit (Fig. 5, again). The automatically generated tool tip text gives a short description of what is occurring at that time. This description came directly from the comments entered into the spreadsheet next to each row.
The initial waveform produced is often just a starting point. Once the user understands the waveform, further questions may arise. For example, in the above scenario it might be interesting to know whether WasUnique and DataTransfer can be active on the snoop response. To ask this question, a user simply adds another constraint to the waveform by 1) selecting the necessary signals, 2) clicking an “Add Constraint” button, 3) modifying the expression to indicate the desired values, and 4) replotting the waveform.
We can examine two steps of this process (Fig. 6). The upper waveform shows the “Add Constraint” dialog in which we modify the default expression. The lower waveform shows the results after replotting the waveform. The replotted waveform confirms that WasUnique and DataTransfer can be active on the snoop response.
In addition, as mentioned in the section on debugging a protocol model, Jasper ActiveModel technology provides the QuietTrace feature. This feature modifies the generated waveform to reduce the irrelevant activity and make it easier to comprehend the target activity.
Creating a Reusable Protocol Model for RTL Verification
As previously shown, the protocol model can be used to verify the protocol and to aid in protocol comprehension, and if a protocol model is developed to verify the protocol, it can be used to verify RTL. The challenge is in trying to map the abstract protocol model states and transactions to the corresponding RTL states and transactions. Not only is the mapping difficult, the RTL is constantly changing and, thus, potentially breaking the map.
An alternative is to create a protocol model in which we write all the protocol rules at the interface level. Then, the RTL can connect directly to the protocol model with no mapping. However, there are drawbacks to this method. For one, the model is more difficult to write because it needs to be less abstract and must deal with some of the bit-level activity at the interfaces. In addition, an interface-based model is not making its relevant state transitions at the same time as the RTL. The state transitions within the RTL model are happening based on the ordering points internal to the design; for example, at the end of the L2 cache pipeline. Therefore, the interface-based model’s states must be slightly abstracted versions of the actual cache states to handle cases where the interface-based model cannot determine the exact ordering a design will choose to handle transactions.
Because the ACE ActiveModel is interface-based, you can use it to verify RTL by connecting to the ACE interfaces of a system, a capability with major implications. Remember, we used the rules of the ACE protocol as assumptions when we verified the protocol model against high-level properties about cache coherence. Now, we use these protocol rules as assertions/checks when attaching the model to RTL. This implies that the assertions/checks are verifying that the RTL is cache coherent. In addition, the row covers generated by ActiveModel, which are a high-quality set of properties, help to check the quality of the verification environment.
Cache-Coherent Protocol Verification of RTL
When verifying RTL designs whose protocol does not support cache coherence, it is sufficient to create VIP that monitors only one protocol interface and verify the design against the VIP. The single-interface VIP tracks enough states to ensure that transactions in flight are operating correctly. For example, a single-interface VIP tracks all the reads that are outstanding and ensures that all read responses are only for outstanding reads. The single-interface VIP also monitors, in detail, all the bit-level activity occurring on the interface and ensures that it adheres to the specification.
However, when verifying designs whose protocol supports cache coherence, not only does the VIP need to continue to monitor the single-interface activity, but it must also monitor system-level activity. For example, if two agents in a system both request access rights to a cache line that will allow the agents to modify the data, then the ordering agent in the system must not give both agents access rights at the same time. In order for the VIP to check this requirement, it must track the access rights given to both agents. Thus, for completeness, the VIP can no longer just monitor a single interface in the system. We need a system-level piece of VIP to accomplish this.
Additional challenges exist in the VIP that generates protocol transactions for cache-coherent protocols. Some of these include:
- Modeling all the transactions a protocol component may generate;
- Modeling all the freedoms of the protocol in transaction responses, for example, snoop responses; and
- Modeling all the cache-line conflict cases where multiple agents are operating on the same cache line.
As has been discussed, the components of the protocol may only support a subset of the available cache-line states. Thus, there is a variety of transactions that different components might issue for what is essentially the same internal operation. An earlier example mentioned that in the ARM ACE protocol, a component that supports all the cache-line states might issue a ReadShared when the internal operation is a load; whereas, a component that supports a smaller set of the cache-line states might issue a ReadClean when the internal operation is a load. The challenge for the VIP developer is that the model must be able to represent both types of components.
A similar challenge exists when considering all the possible responses a component can make to a transaction. The protocol specification will often indicate all the allowable responses and then make a recommendation on which response to use. In the ARM ACE protocol, when a ReadShared snoop hits on a cache line in the SD state in a master’s cache, there are several responses allowed by the protocol. The master being snooped could:
- Transition to the I state, assert PassDirty, and deassert IsShared;
- Transition to the SC state, assert PassDirty, and assert IsShared; or
- Stay in the SD state, deassert PassDirty, and assert IsShared.
Here again, we see that the challenge for the VIP developer is to generate all the allowable responses.
Some of the most complex pieces of RTL designs are the pieces of logic handling the conflict cases when there are multiple requestors of the same cache line. Therefore, it is important for VIP to be able to generate scenarios that will get the design into cases where these conflicts occur. A classic example of such a conflict: an agent is snooped for a cache line when it already has a write-back transaction outstanding for that line. Generating these types of cache-line conflict scenarios is essential for effective verification.
For simulation-based verification, the checkers and the generators are distinct VIP components. For example, we need a set of properties to check an interface that a design drives, and we need a model for random generation of traffic on an interface that is an input to the design. For formal-based verification, the VIP components are the same. In formal verification, we use the properties of the protocol as checks (assertions) when a design is driving an interface, and we use them as generators (constraints) when the interface is an input to the design. To clarify, a formal verification tool is not actually generating transactions; however, it is considering all possible transactions based on the properties written for the rules for the protocol.
We must verify three elements for successful cache-coherent protocol verification: 1) the RTL design that implements a protocol component, 2) the VIP that checks a protocol component, and 3) the VIP that generates transactions into a protocol component. Often, we verify these elements simultaneously by connecting them to each other. For example, in a simulation environment, it is possible to connect the checker VIP to the generator VIP. A comparable formal environment would be one where we connect two instances of the VIP back-to-back with one using the properties as assertions (checkers) and the other as constraints (generators).
Another, more thorough, verification option is to independently develop two similar pieces of VIP and then formally verify them against each other. Essentially, this is an equivalence checking exercise. For example, for the ARM ACE protocol, ARM engineers and Jasper engineers independently developed single-interface VIP for ACE (and for AXI). We targeted the ARM VIP for simulation environments while we targeted the Jasper VIP to have maximal performance during formal analysis. We then formally verified the two independently developed pieces of VIP in several back-to-back configurations (Fig. 7). In this formal process, we used the properties in each piece of VIP as both constraints (generators) and assertions (checkers). This verification process identifies holes in the property set (missing checks) as well as properties that are too strong. The latter presents the potential for over-constraint situations in formal analysis.
The end goal is to ensure that we build our RTL designs while maintaining the cache coherency attributes of the system. To that end, we build verification environments with a combination of VIP components and RTL components. The verification target in these environments is typically the RTL. However, especially for the new VIPs, we also test VIPs for correctness. In an example, the cache-coherent interconnect RTL component has two ACE interfaces for connection to master components and three AXI interfaces for connection to slave components (Fig. 8). In the verification environment, the VIP models take the place of actual RTL components. In this environment, an ACE system-level VIP is also connected to both ACE interfaces to ensure proper system-level behavior.
Cache-coherent protocols and the RTL implementations of the components of the protocols provide unique verification challenges. Initially, we must model and verify the protocol to demonstrate that it adheres to high-level, cache-coherence rules. A large audience, such as RTL designers and verifiers, must also understand the protocol. To enable high-quality verification of RTL protocol components, a sound methodology is imperative when developing VIP for the protocol.
The Jasper ActiveModel technology provided an effective means of modeling the ARM ACE protocol. We not only used the resulting model to verify the protocol, but it also became a platform for querying the protocol to understand transaction flows and freedoms in the protocol. Finally, the ACE ActiveModel became the system-level VIP used to verify cache-coherent ACE components alongside the ACE interface-level VIP.
Jasper and ARM have collaborated to perform formal verification in environments of this nature on both an ARM cache-coherent interconnect and cache-coherent processor.