How Non Data Quality Can Cost Money WWW - Morefromtom
September 2005
Introduction
Viewed from a high level, poor quality data can affect a company’s bottom
line in two ways: first, through the cost of scrap and rework, and second,
through missed opportunities.
Process failure
For example, capturing erroneous customer data such as address, contact
information, or account details. This leads to:
- Irrecoverable costs; e.g. premiums sent in vain to non-existing customer
addresses.
- Liability and exposure costs; for instance, credit risk losses when data
quality problems lead to credit being erroneously offered to a customer who
is not considered creditworthy on the basis of self-supplied information.
- Recovery costs for unhappy customers; time spent handling complaints.
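
As a rough illustration of the irrecoverable-cost item above, the following
Python sketch flags customer records whose address fields are obviously
incomplete before a mailing goes out, and estimates the money that would
otherwise be spent in vain. The record layout, the is_mailable rule, the sample
records, and the cost per premium are all hypothetical assumptions made for
illustration; they are not taken from any real customer file.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    # Hypothetical record layout; real customer files will differ.
    name: str
    street: str
    postal_code: str
    city: str


def is_mailable(c: Customer) -> bool:
    """Very coarse check: every address field must be non-blank."""
    return all(field.strip() for field in (c.street, c.postal_code, c.city))


def wasted_mailing_cost(customers: List[Customer], cost_per_premium: float) -> float:
    """Cost of premiums that would be sent to records failing the check."""
    unmailable = [c for c in customers if not is_mailable(c)]
    return len(unmailable) * cost_per_premium


# Made-up sample records, for illustration only.
customers = [
    Customer("A. Jansen", "Main Street 1", "1234 AB", "Utrecht"),
    Customer("B. de Vries", "", "5678 CD", "Leiden"),         # street missing
    Customer("C. Bakker", "Station Road 9", "   ", "Delft"),  # blank postal code
]

# Assumed cost of 2.50 per premium sent.
print(wasted_mailing_cost(customers, cost_per_premium=2.50))
```

A check this simple only catches blank fields. A well-formed address that does
not exist would still pass, so the estimate should be read as a lower bound.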
unnecessary insofar as this is caused by faulty initial data entry. For example,
when a mailing is based on a problematic customer file, dedicated scripts
need to be run to deal with the (known!) errors in the address fields. This
process has to be repeated for every mailing. Since such customer files are
often shared across departments and systems, changes at the source need to
be negotiated with all end users of these data.
- Data cleansing software costs; ETL data quality software (like Vality,
Ascential, etc.) typically has very high license costs. The tradeoff is
against scarce labor doing the same tasks ‘by hand’. Purchase may
sometimes prove remarkably economical when set against the (often unseen)
labor costs of manually improving data quality.
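
The tradeoff between license costs and manual labor can be made concrete with
a simple break-even calculation. The Python sketch below is only an
illustration: the function names and every figure in it (hours per mailing,
number of mailings, hourly rate, license and maintenance costs) are
hypothetical assumptions, not data from any vendor or organization.

```python
from typing import Optional


def yearly_manual_cost(hours_per_mailing: float,
                       mailings_per_year: int,
                       hourly_rate: float) -> float:
    """Labor cost per year of cleaning the customer file by hand."""
    return hours_per_mailing * mailings_per_year * hourly_rate


def break_even_years(license_cost: float,
                     yearly_maintenance: float,
                     yearly_labor_saved: float) -> Optional[float]:
    """Years until the one-off license cost is recouped.

    Returns None when the yearly savings do not exceed the yearly
    maintenance fee, i.e. when the purchase never pays for itself.
    """
    net_saving = yearly_labor_saved - yearly_maintenance
    if net_saving <= 0:
        return None
    return license_cost / net_saving


# Assumed figures: 40 hours of cleanup per mailing, 24 mailings a year,
# an internal rate of 75 per hour.
labor = yearly_manual_cost(hours_per_mailing=40, mailings_per_year=24, hourly_rate=75)

# Assumed license cost of 150,000 with 22,000 yearly maintenance.
years = break_even_years(license_cost=150_000,
                         yearly_maintenance=22_000,
                         yearly_labor_saved=labor)
print(labor, years)
```

The calculation assumes the software removes all of the manual work, which is
optimistic. Its main use is to make otherwise unseen labor hours visible next
to a license quote.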
improvement. This topic will need to stay high on management’s agenda to
create lasting improvement.
Conclusion
Problems in data quality often go unnoticed. They can be a source of both
process inefficiencies (timeliness) and operational costs (direct and
indirect losses). In neither case is it readily apparent that improvement is
possible by enhancing data quality.
The cost of poor quality data is usually hidden: lack of data quality is not
obvious to those not deliberately looking for it, and quantifying the costs
isn’t always easy. What makes the indirect costs so pernicious is that the
relation between data quality problems and their consequences is non-obvious,
and the consequences often surface only after a substantial time delay.
Therefore, the connection between downstream consequences and poor quality
data is often not made, and the problems are not attributed to their true cause.
The cause of many downstream costs (i.e. poor data quality) can easily remain
largely hidden, and therefore insufficiently subject to management attention
and intervention. Also, progress after improvement efforts is gradual,
relatively slow, and in large part ‘cultural’, and therefore difficult to
monitor and track.
Data quality is far more than accurate data entry. It stems from monitoring
downstream data usage, maintaining comprehensive and up-to-date metadata,
and nurturing a corporate culture of naturally doing things right at the
first attempt. Only then will knowledge workers learn to expect data quality,
and enforce it because it’s the natural thing to do. Letting data quality slide
promotes a culture of negligence and of disdain for one’s most precious
asset: customer information.
The case for accurate source data is further underlined when one realizes that
the source in and of itself does little more than support primary processes,
which is fine. However, the greater value to the organization comes from
enhancing these data, that is, from deriving new information from source data.
The investment in improving information quality is recouped several times over,
both in decreased costs and in the improved value of information for
accomplishing strategic business goals.
Resources
Larry P. English (1999) Improving Data Warehouse and Business Information
Quality: Methods for Reducing Costs and Increasing Profits. Wiley,
ISBN 0-471-25383-9
Sid Adelman, Larissa Moss & Majid Abai (2005) Data Strategy.
Addison-Wesley, ISBN 0-321-24099-5