Lies, Damned Lies, and Coverage
Mark Litterick
Verilab GmbH, Germany (www.verilab.com)
Email: [email protected]
Abstract- Functional coverage is a key metric for establishing the overall completeness of a verification process;
however, empirical evidence suggests that coverage models are often inaccurate, misleading and incomplete. This paper
proposes that such coverage defects are extremely common, and since coverage analysis tends to focus on holes, or missing
coverage, rather than on the accuracy of what is already reported, this represents a significant risk to the overall
quality of the verification process. After a general introduction to the problem, the paper discusses practical examples and
proposes pragmatic solutions for minimizing the risk and improving quality. Finally, we demonstrate a novel application
of the UCIS API to cross-reference different aspects of functional coverage in order to validate correctness of the model
under some circumstances.
I. INTRODUCTION
Functional coverage is a key metric in establishing the overall completeness of a verification process and is
especially relevant for modern constrained-random environments. Functional coverage models are manually
developed against verification requirements and implemented using cover points and cover groups within the
verification components in the chosen language. During the course of a project, the evolving coverage score is
then analyzed to determine the completeness of the task, and any holes in the coverage results are addressed by altering
the stimulus.
The problem is that many teams do not rigorously analyze the accuracy of coverage hits in the model as part of
the methodology. This paper addresses that false-positive problem from a pragmatic viewpoint based on observations
across many projects, in many companies, using various verification languages. This paper does not provide a silver
bullet to solve all these problems, but rather raises awareness with a comprehensive discussion and practical
examples, provides pragmatic methodology guidelines and discusses possible tool and methodology enhancements
for the future - all of which should add up to improved quality and more honest functional coverage. The concepts
presented are language independent, but any terminology used is taken from SystemVerilog and the Universal
Verification Methodology (UVM) [1] in order to minimize confusion in the text.
II. FUNCTIONAL COVERAGE
Functional coverage is essential for constrained-random verification (in order to determine what stimulus was
actually generated and checked) and also beneficial for directed testing (to validate whether the tests achieved their
intended goal) [2]. Functional coverage data from multiple runs, with different seeds, and from multiple different tests is
collated into a single database to give an overall measure of verification status. Although it is a key metric regarding the
scope of stimulus and the quality of the testbench, it is important to note that all aspects of functional behavior also need
to be checked - something that often gets forgotten in the drive for coverage completion. Coverage alone is not
enough.
Functional coverage is manually specified to identify what the verification team thinks is important to the success
of the project, and should not be confused with low-level code-coverage. Code coverage only identifies which lines
of code in the Device-Under-Test (DUT) were activated by the verification environment, but does not identify if
these lines of code were exercised correctly or at an appropriate time, and cannot identify missing code or features.
Code coverage has the benefit of being automatically measured (although typically manually filtered). It is assumed
that both code and functional coverage are used in the reader’s environments.
This paper assumes a working knowledge of functional coverage implementation using high-level verification
languages and a corresponding methodology such as UVM. The focus of this paper is on determining how accurate,
or otherwise, the functional coverage implementation actually is.
1) Field Ranges
Correctly specifying ranges for the cover points in transaction fields is really the last line of defense against
poorly scoped stimulus. If the bins for key values, such as minimum and maximum, are not separate from other
ranges, then we can easily run full regressions with good coverage scores and still miss the critical cases. We have
observed this in many applications and it can be a killer problem. For example, a graphics application failed to stress
the arc-drawing algorithm with length or radius parameters of zero; when the coverage was reviewed and the
software team consulted, these bins were included explicitly, forcing the stimulus to be opened up and resulting in several
key bugs being uncovered in the DUT.
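As an illustration only, the following sketch (with a hypothetical transaction class and an assumed 8-bit radius field) shows how the critical corner values are kept in explicit bins so they cannot hide inside a broad range:

class arc_coverage;
  bit [7:0] radius; // assume an 8-bit radius field for illustration

  covergroup arc_cg;
    cp_radius : coverpoint radius {
      bins zero = {0};          // critical corner case kept in its own bin
      bins low  = {[1:15]};
      bins mid  = {[16:254]};
      bins max  = {255};        // explicit maximum value bin
    }
  endgroup

  function new();
    arc_cg = new();
  endfunction
endclass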
2) Conditional Fields
Another common mistake in transaction coverage is to cover all transaction field values on the same sample event
without additional filtering. Many applications have conditional fields that are only valid in specific transaction
types (e.g. extension fields, or message parameters), and these should be qualified using conditional constructs in the
cover statement (e.g. only count the value if some additional condition is valid); otherwise the coverage results are
falsely positive.
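A minimal sketch, using hypothetical field names, of how a conditional field can be qualified with an iff guard so that it is only counted when the enclosing transaction type makes it valid:

class txn_coverage;
  typedef enum {NORMAL, EXTENDED} kind_e;
  kind_e    kind;      // transaction type
  bit [3:0] ext_param; // only meaningful for EXTENDED transactions

  covergroup txn_cg;
    cp_kind : coverpoint kind;
    // only count the extension parameter when it is actually valid
    cp_ext  : coverpoint ext_param iff (kind == EXTENDED);
  endgroup

  function new();
    txn_cg = new();
  endfunction
endclass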
3) Configuration Objects
It is misleading to cover configuration object fields when the configuration is set or changed since the actual value
may or may not be used. Instead these field values should only be covered when a dependent operation occurs in the
DUT (e.g. a transaction occurs on an interface, or a transformation algorithm is executed). This might require
decomposition of the configuration object fields into more than one coverage group to allow for independent
sampling events (e.g. separate transmit and receive events).
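For illustration, a sketch (with hypothetical configuration field names) that splits the configuration coverage into separate covergroups with explicit sample() calls, so the values are only covered when a dependent transmit or receive operation is actually observed:

class cfg_coverage;
  bit [1:0] tx_mode; // hypothetical configuration fields
  bit [1:0] rx_mode;

  covergroup tx_cfg_cg;
    cp_tx_mode : coverpoint tx_mode;
  endgroup

  covergroup rx_cfg_cg;
    cp_rx_mode : coverpoint rx_mode;
  endgroup

  function new();
    tx_cfg_cg = new();
    rx_cfg_cg = new();
  endfunction

  // called by a passive monitor when a transmit transaction is observed
  function void sample_tx();
    tx_cfg_cg.sample();
  endfunction

  // called by a passive monitor when a receive transaction is observed
  function void sample_rx();
    rx_cfg_cg.sample();
  endfunction
endclass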
5) Error Injection
Functional coverage related to error injection (i.e. stimulus with deliberate protocol violations, e.g. CRC error) is
often poorly represented in the coverage model results. Part of the reason for this might lie in the fact that passive
observers like monitors cannot always distinguish between the specific causes of a corrupted transaction when error
injection is active (e.g. a coding error could be misinterpreted as a content error). This is one of the few cases where
we would recommend supplementing the passive coverage model (if required) with some stimulus coverage from
the driver responsible for doing the actual error injection, based on sequence item flags passed down from the
sequencer.
6) Irrelevant Data
At any particular DUT abstraction (e.g. block, sub-system or full-chip), capturing too much irrelevant data in the
coverage database is misleading because it looks like lots of interesting stuff is happening when this might not be the
case. If this data is not really relevant to the job at hand, then it gives falsely encouraging results and, furthermore,
can hide the real missing interesting coverage goals - exaggeration, in effect, is also a kind of lie. For example, in a
network-on-chip application very many types of packet can be transported by the bus fabric; however, if the goal is
to validate full functionality of a router subsystem, then only packet information relevant to the router should be
covered (e.g. some of the header fields used for routing decisions, flow-class information, overall packet length,
etc.). The router does not care about many other aspects of packet content (e.g. payload) and this coverage should be
suppressed from the model.
We would recommend against the methodology whereby everything is sampled in the coverage database, to such a
degree that the raw coverage is overwhelming, and then only a few aspects are picked out for annotation to a
verification plan for requirements tracking. In practice this approach gives poor results and it is hard to measure the
absolute quality - a better approach is to keep the raw coverage lean and pertinent, and then to map the appropriate
aspects of the coverage to relevant sections of the verification plan for tracking using forward or backward
annotation.
B. Temporal Coverage
In this context temporal coverage relates to the timing of events between different aspects of functional coverage
and not just the accumulation of data within a particular transaction associated with a single protocol interface.
2) Reset Conditions
Typically we wish to cover reset activation when the DUT is in different states (e.g. processing a transaction, idle,
waking-up, going-to-sleep, etc.). It is very common to observe non-zero reset coverage even though only an initial
reset has been applied - this is misleading since the DUT was not in any meaningful operational state when the initial
reset was applied. It is better to exclude the initial reset event from this coverage and make sure only subsequent
reset events are counted.
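A sketch of one possible implementation (with hypothetical state names), where the monitor counts reset assertions and skips the initial one before sampling the DUT-state coverage:

class reset_coverage;
  typedef enum {IDLE, ACTIVE, WAKING, SLEEPING} state_e;
  state_e      state;
  int unsigned reset_count;

  covergroup reset_cg;
    cp_state_at_reset : coverpoint state;
  endgroup

  function new();
    reset_cg = new();
  endfunction

  // called by the monitor on every observed reset assertion
  function void sample_reset(state_e current_state);
    reset_count++;
    if (reset_count > 1) begin // exclude the initial reset
      state = current_state;
      reset_cg.sample();
    end
  endfunction
endclass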
3) Temporal Relationships
It is not uncommon to see an entire functional coverage model for an environment based exclusively on
transaction-based coverage. In any design with more than one interface, or any internal storage stages (e.g. pipelines,
buffers, FIFOs, synchronizers, etc.) this is very unlikely to cover all the appropriate verification concerns for such a
DUT. We also need to cover important temporal relationships (e.g. relative transaction timing on separate interfaces,
transaction timing relative to power-down requests, occurrence of flush requests relative to processing state, etc.).
This coverage tends to be quite distributed and will often require dedicated cross-interface monitoring at the
environment level to track relative state and timing events for the various interfaces - it is also hard to implement
and review, but extremely valuable nonetheless.
4) Covering Checks
Functional coverage of all checks operational in the system is a requirement. Some checks, like assertions in the
interfaces, are automatically covered as part of the unified coverage reports - but this is typically not enough. For
accurate coverage we also need to know under what conditions the checks were successful and this typically
requires extra information to be recorded to know we have covered all cases (e.g. an assertion that says “A must
cause B or C” is a valid check, but needs two cover points to be defined to know that “A caused B” and “A caused
C”).
Transaction content checks in a monitor and relationship checks in a scoreboard also need explicitly crafted
coverage. For example if the environment is capable of dropping a packet with a certain class of error, then the
scoreboard model must be aware of this to prevent mismatches, but we also need to functionally cover the
occurrence of all such valid filtering conditions. For some reason these conditions are often omitted from the
coverage model, even though this is critical behavior that we need to know was stimulated and checked. The concept
is not so complicated; ask yourself “can you tell, from the coverage results, if every check that was deemed
important actually happened, how often, and under what conditions?”
5) Sub-Transaction Events
Many protocols require additional coverage for intermediate events that occur while the interface is trying to decode
traffic. Sometimes these important events do not even result in a transaction being published by the monitors (e.g. if
two competing masters back off during arbitration), and therefore need to be recorded by some other means. For
example, in the I2C protocol we need to provide coverage which identifies that the DUT backed off during
different phases of an arbitration contest (when an external master wins), and we also need to identify whether the external
master was addressing the DUT at the time.
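A sketch of how such sub-transaction coverage could be recorded directly by the protocol monitor (the phase names here are illustrative, not taken from the I2C specification):

class i2c_arb_coverage;
  typedef enum {ADDRESS, READ_WRITE_BIT, DATA} phase_e;
  phase_e backoff_phase; // phase in which the DUT backed off
  bit     dut_addressed; // the winning external master addressed the DUT

  covergroup arb_cg;
    cp_phase     : coverpoint backoff_phase;
    cp_addressed : coverpoint dut_addressed;
    x_phase_addr : cross cp_phase, cp_addressed;
  endgroup

  function new();
    arb_cg = new();
  endfunction

  // called by the monitor when it observes the DUT losing arbitration
  function void sample_backoff(phase_e phase, bit addressed);
    backoff_phase = phase;
    dut_addressed = addressed;
    arb_cg.sample();
  endfunction
endclass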
Here are a few things to look out for when trying to determine register-related coverage goals:
• did we use all relevant values and ranges in control and configuration?
• did we read all appropriate status responses from the DUT?
• did we validate all the reset values from the registers?
• did we access all register addresses?
• did we attempt all access types for each register?
• did we prove all appropriate access policies for the register fields?
1) Register Writes
It is extremely misleading to cover the field values for control and configuration registers when a write operation
is performed. This can lead to very high coverage scores even in tests that exercise little or no actual DUT functionality
(for example, we can write ten different values to a configuration register but use only one of them). Instead, these
field values should be covered when they are actually used by the DUT (e.g. to send a transaction, or execute an
algorithm). In such cases the coverage implementation can be provided by the register model but triggered by a
passive monitor when an appropriate event occurs in the environment.
2) Register Reads
Covering field values for status register reads is also misleading, since some of these results come from reset
conditions and not DUT operations. It is more appropriate to record field values for a status register read only when
the value has changed from its reset value, or from the previously read value. Meta-data can be used to track the
previous state of the field value.
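One possible sketch (assuming a hypothetical 8-bit field with a reset value of zero), where the previously observed value is kept as meta-data so that repeated or reset-value reads are not counted:

class status_read_coverage;
  bit [7:0] value;              // value being covered
  bit [7:0] prev_value = 8'h00; // meta-data: previous value, initialized to the assumed reset value

  covergroup status_cg;
    cp_status : coverpoint value;
  endgroup

  function new();
    status_cg = new();
  endfunction

  // called by a passive register monitor on every observed status read
  function void sample_read(bit [7:0] read_value);
    if (read_value != prev_value) begin // only cover values that have changed
      value = read_value;
      status_cg.sample();
    end
    prev_value = read_value;
  endfunction
endclass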
3) Reset Values
In order to validate that the reset values of the registers in the DUT match those in the model, we really need to
cover read accesses to registers when no other operation has been performed on the register after reset. In real life
this can be difficult to attain, especially for volatile fields (that are updated by RTL), and this coverage is often
omitted from the model. If we add some meta-data to a register field to track the updates from bus operations or
RTL (by extending the active-monitoring capability) we can more accurately cover reads of reset status (and
separately check the values).
5) Access Rights
The access rights of a register in a particular address map can put additional restrictions on the types of operations
allowed on the register (and therefore the enclosed fields). Both legal and illegal access types must be covered for
each register in the address map range (e.g. we need to cover attempted writes to read-only registers). This coverage
is often omitted, and therefore the corresponding access rights and protection could escape validation in the
verification flow.
6) Access Policies
Irrespective of the overall access rights in the address map, each register field can have specific access policy
applied (e.g. write-only, read-only, read-to-clear, write-one-to-clear, etc.). We need to ensure that all possible
accesses have been attempted on these fields and where appropriate take into account the access data values. For
example with a read-only register field, we need to cover both read and write attempts to the field, and the write
must have a value that does not match the field content. Likewise for a write-one-to-clear field, we need to cover the
attempted write of both one and zero.
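A sketch of field-level access-policy coverage for the two examples above (names and widths are illustrative): for a read-only field we cover reads as well as write attempts whose data differs from the field content, and for a write-one-to-clear field we cover writes of both one and zero:

class field_access_coverage;
  typedef enum {READ, WRITE} access_e;
  access_e access;        // observed access type on the field
  bit      wdata_differs; // write data does not match the read-only field content
  bit      w1c_wdata;     // bit value written to the write-one-to-clear field

  covergroup ro_field_cg;
    cp_access         : coverpoint access; // cover both read and write attempts
    cp_write_mismatch : coverpoint wdata_differs iff (access == WRITE) {
      bins differs = {1}; // attempted write with a value != field content
    }
  endgroup

  covergroup w1c_field_cg;
    cp_wdata : coverpoint w1c_wdata iff (access == WRITE); // covers writes of 0 and 1
  endgroup

  function new();
    ro_field_cg = new();
    w1c_field_cg = new();
  endfunction
endclass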
Additional cross coverage may be required if the access policy is dynamic (e.g. in order to implement a hardware
firewall), in which case we need to cover both legal and illegal attempts, with corresponding data if appropriate, in
each of the available modes.
V. LIE DETECTORS
Unfortunately there are no silver bullets for validating the correctness and completeness of the functional
coverage model in the context of the different verification abstraction levels; this represents a significant risk
considering the importance of the coverage results in the verification flow. This section provides some pragmatic
guidance for improving the methodology, including suggestions on how to prepare the coverage model and how to go
about analyzing the coverage accuracy of implemented code. In addition, we look at a novel approach that was tried out
to automate some aspects of this and propose how it might be extended in the future as tool developers turn their
attention to this significant problem.
A. Reviews
1) Coverage Architecture and Plan Review
During verification architecture development, prior to the implementation of the coverage code, it is necessary to
plan the coverage model with enough detail that a skilled reviewer can identify inaccuracies and omissions. A full
discussion on the process and requirements for functional coverage planning is outside the scope of this paper but is
discussed in [2] and [3]; the key thing to consider in this context is that the plan needs to be reviewed in detail since
this is the best chance of detecting missing or poorly scoped coverage points. For each component of the verification
environment we should create a coverage table that details:
• what is covered (items and ranges)
• when the sampling event occurs (temporal event)
• under what conditions the sampling is allowed (logical condition)
All aspects of coverage should be considered including transaction fields, configuration settings, control, status,
temporal relationships, checks, etc. The goal here is to be concise and complete, but not get bogged down in code
syntax. Prior to implementation, interested parties should review the coverage plan in order to detect what is missing,
irrelevant or incorrect. The conditional and temporal aspects of this analysis are extremely important - they are
often omitted by less experienced teams, resulting in a coverage plan that is very focused on transaction content and
inadequate for most applications. Leaving the details of the conditional and temporal aspects to the implementation
stage is not recommended, since they are much harder to review there and more error prone.
It is important to cross-reference all aspects of verification environment operation in the analysis of the detailed
behavior. It is not just the successful hit of particular coverage points that we are concerned with in this analysis, but
the absolute score for each and every bin - including the distribution of type and instance-based coverage results. For
the current example we should cross-reference coverage with the following:
Log-files tell us what transactions were observed at each interface:
• are all transaction fields correctly reported in coverage?
• what about derived values like packet length and meta-data like delays?
• is there too much data recorded for non-relevant fields (like payload content in this case)?
• is the error correctly reported in transaction coverage?
Waves illustrate the relative timing of the transactions on the various interfaces:
• are the relationships of traffic on the interfaces (e.g. overlaps) recorded in coverage?
• are there any significant transaction ordering events to report (e.g. overtaking)?
• are the delays correctly measured and reported?
• does the coverage reflect how many packets are in flight (inside DUT)?
• does the coverage reflect how many slices are being processed at a particular time?
• are the observed clock relationships correctly reported by coverage?
Checks are performed by assertions (protocol), monitors (content) and scoreboards (relationships):
• which checks were performed and are they all reflected in coverage?
• do the observed assertion scores match the scoreboard and transaction coverage?
• does coverage reflect that the scoreboard model also filtered-out a packet?
• is the effect of error detection included in register or protocol signal coverage?
Such a thorough analysis of detailed coverage scores against observed scenarios tends to quickly find some major
defects in the coverage model including omissions - often we can confirm the stimulus and checks are valid, but
cannot honestly say the occurrence of the specific conditions is well reflected in the coverage. Successful conclusion
of this analysis process also provides improved confidence in the overall verification environment and coverage
accuracy in particular.
C. Automation
With current technology there is very little scope for automatically deciding what types of functional coverage are
appropriate for most high-level protocol scenarios. Of course, once decided, there is scope for automatically
generating the coverage implementation and adapting it to DUT parameters, but that is not the focus for this paper.
A more interesting and relevant question in the context of this paper is “can we automate the validation of
functional coverage correctness and completeness in a given coverage model?” Essentially this should be possible,
to a degree, since all we need is a rule-based application of the same cross checks that were applied during the
manual analysis stage mentioned previously. The author is not currently aware of any commercial tool that does this,
but in order to demonstrate a proof of concept we experimented with an ad-hoc approach to validating the coverage
using the Unified Coverage Interoperability Standard (UCIS) [4] and PyUCIS [5]. UCIS is an open industry
standard that provides an Application Programming Interface (API) to enable sharing of coverage data across
different verification tools and vendors.
In order to experiment with this novel approach, we chose to look at an OCP verification environment that has
configuration-aware assertions (enabled, or not, based on profile configuration) and class-based transaction
coverage. Using the UCIS we can access and compare:
• assertion and class-based coverage scores
• scores for different assertions in an interface
• different aspects of class-based coverage
In theory there is a strong relationship between the assertion results and the observed functional coverage for the
transactions (e.g. if we get N valid protocol assertions passing, then we should not have a score of more than N for
any aspect of transaction coverage). Likewise different assertion scores can be compared (e.g. assertions for
different phases of the protocol - request, data-handshake, response - should have appropriate coverage scores based
on observed full transaction types). Transaction based coverage can also be compared between components and
layers in the environment (e.g. the number of protocol errors reported in the transaction coverage can be compared
with the checker coverage from the scoreboard).
The UCISDB stores hierarchical information (scope) and coverage counts (coveritem). In order to access the
required bin count information we need to open the UCISDB and search through the scopes iteratively until we
match the required coveritem name; then we can extract the coverage count. Note, in the following code snippets:
# ucis_* methods are the actual UCIS API methods wrapped with SWIG into Python code.
# pyucis_scope_itr is a Pythonic iterator built from the UCIS API methods:
#   - ucis_ScopeIterate, ucis_ScopeScan, ucis_FreeIterator
# pyucis_cover_itr is a Pythonic iterator built from the UCIS API methods:
#   - ucis_CoverIterate, ucis_CoverScan, ucis_FreeIterator
We implemented additional Python helper methods to find the UCIS scope from a specified path description,
provided as a string or compiled regular expression (pyucis_find_scope), and to extract the coverage count from
the UCISDB for the corresponding coveritem (pyucis_get_count, pyucis_get_cov_count), as shown in the following
code. These helper methods are also available via PyUCIS [5].
In our environment we execute a Python script as a post-processing step after single simulations or complete
regressions; the script first opens the UCISDB and then applies the application-specific rules associating the
coverpoints, which are hard-coded into the user script. For example, in OCP we can check that the class-based
command coverage matches the number of passes for the command hold assertion (MCmd must remain stable for
the duration of the request phase):
if (pyucis_get_count(db, "/tb_top/ocp_if/checker/assert_request_hold_MCmd")
        != pyucis_get_count(db, "/vlab_ocp_pkg/vlab_ocp_monitor/cg_req/cp_cmd")):
    print("ERROR: ...")
For OCP many of the coverage checks are more complicated due to the profile configuration, and also because some
assertions fire more than once per phase. For example, the burstlength configuration field identifies whether
MBurstLength is present; if it is, then the assertion checks its value on every clock during the request phase. We can
(only) check that the class-based coverage never reports a higher score for burst length values than the total assertion
score if burstlength was ever active (cp_burstlength bin "1" was hit) in the profile configuration (otherwise the
cp_burst_length coverage is omitted from the database).
In this proof of concept stage, we identified some 40 rules (not all of which were implemented) for this OCP
environment with 60 assertions and 5 covergroups containing 30 coverpoints. In the process of coding and
validating the rules for associating the different coverage scores, and trying them out on several regression runs with
different profile (configuration) settings, we were able to detect:
• incorrect assertion coding resulting in higher than expected scores
• misleading functional coverage for some aspects of transaction content
It is necessary to bear in mind that the association of the assertions with the coverage classes used a manual rule-
based approach, and also that this level of cross-checking is only one aspect of coverage validation for items that are
actually implemented. In effect, validating coverage using the UCIS only provides us with a sanity check for the
available coverage; nevertheless, this technique could also be extended to provide some level of unit testing for the
coverage and assertion aspects of reusable verification components. In addition, the technique demonstrates potential
for automation of the coverage validation in future environments if it were extended to make use of generic formal
algorithms.
VI. RESULTS
All of the examples illustrated in the paper were taken from real projects. The methodology and discussion alone should
raise awareness and improve the quality and accuracy of the reader’s coverage models. Results from the automatic
cross-check approach using the UCIS are presented in the context of one project, but the concept itself is open to
further development by the verification community and tool providers.
ACKNOWLEDGMENT
The author would like to thank André Winkelmann (Verilab) for implementing the UCIS crosschecks and the
additional PyUCIS helper methods, and Gordon McGregor (Nitero) for the original implementation of PyUCIS.
REFERENCES
[1] UVM (Universal Verification Methodology), Accellera, www.accellera.org
[2] G. Allan et al, Coverage Cookbook, Mentor Graphics, https://verificationacademy.com
[3] J. Sprott, P. Marriott and M. Graham, Navigating The Functional Coverage Black Hole, DVCon US 2015
[4] UCIS (Unified Coverage Interoperability Standard), Accellera, www.accellera.org
[5] PyUCIS (SWIG/Python bindings for the UCIS API), Verilab, https://bitbucket.org/verilab/pyucis
[6] A. Yehia, UCIS Applications: Improving Verification Productivity, Simulation Throughput, and Coverage Closure Process, DVCon 2013