Data Quality: Upstream vs. Downstream
A new Forrester report
Data quality problems cost US businesses more than $600 billion a year, according to The Data Warehousing Institute (2002). How does that happen? How can companies lose this much money without realizing there is a way to avoid it?
The major problem usually lies in the source the data is received from and the way it is processed before it enters the database/warehouse (if processed at all). Then there’s also the relevance of data, its accuracy, completeness, and consistency.
We keep wondering why data quality checks still exist as a procedure performed once in a while, rather than as part of the front-end process. Why do most companies start worrying about the quality of their data only when it is already dirty and in use? Why doesn’t it occur to them that data quality needs to be thought through before the data is actually captured?
A recent Forrester paper titled “It’s Time to Invest in Upstream Data Quality” digs into this topic and explains why organizations do not pay enough attention to data quality. (The study was led by principal analyst Rob Karel.) The research firm suggests that when companies “realize short-term data cleanup ROI immediately, it’s hard to justify front-end investments that may take years.”
At the same time, Forrester analysts say, IT budget planning committees tend to avoid existing data quality (DQ) products that integrate downstream data hygiene rules into front-end processes, citing the solutions’ cost and complexity.
“The result? I&KM (Information and Knowledge Management) pros quickly reach diminishing return on data quality investments, requiring even more investments later on to catch up with missed opportunities like verifying customer contact information, standardizing product data, and eliminating duplicate records.”
To break this cycle, data quality initiatives and source system audits should be carried out upstream. Data quality matters from the earliest stages of data capture: those stages determine how your data turns out and whether it pays off later on.
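As a rough illustration of what checking data upstream can look like, here is a minimal Python sketch that validates and de-duplicates customer contact records at the moment of capture, before anything is stored. It is not taken from the Forrester paper: the `CustomerIntake` class, the `capture` method, and the deliberately loose email pattern are all hypothetical names and rules chosen for the example.

```python
import re

# Assumption: a deliberately loose pattern for illustration only;
# real-world email validation is considerably more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_email(raw):
    """Trim whitespace and lowercase so equivalent addresses compare equal."""
    return raw.strip().lower()


class CustomerIntake:
    """Hypothetical capture-time gate: validate and de-duplicate on entry."""

    def __init__(self):
        self.records = {}  # normalized email -> accepted record

    def capture(self, name, email):
        """Return (accepted, reason). Only clean records are stored."""
        cleaned_name = " ".join(name.split())  # collapse stray whitespace
        cleaned_email = normalize_email(email)
        if not cleaned_name:
            return False, "missing name"
        if not EMAIL_PATTERN.match(cleaned_email):
            return False, "invalid email"
        if cleaned_email in self.records:
            return False, "duplicate"
        self.records[cleaned_email] = {"name": cleaned_name, "email": cleaned_email}
        return True, "ok"


intake = CustomerIntake()
results = [
    intake.capture("Ann  Lee", " Ann.Lee@Example.com "),
    intake.capture("Ann Lee", "ann.lee@example.com"),
    intake.capture("Bob", "bob-at-example.com"),
]
print(results)
```

In this sketch only the first record is stored; the second is rejected as a duplicate of the normalized address and the third as an invalid email. The same kinds of rules that a downstream cleanup job would apply later are simply enforced at the point of entry.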
Start small
Still, why do enterprise-wide data quality initiatives so often end in disappointment? According to another Forrester report, “A Truism For Trusted Data: Think Big, Start Small,” this happens because managers try to implement a trusted enterprise-wide system all at once. (The document was released earlier this year.)
That’s why experts from Forrester recommend thinking globally but starting small. In other words, they advise considering a bottom-up approach that defines quantitative and qualitative ROI for only those few select functional organizations that can best articulate and measure the business impact poor quality data has on processes.
Here is a short overview of the steps Forrester recommends for implementing data quality initiatives effectively:
1. First of all, those responsible should start by defining what they mean by “data quality” and “trusted data.” As Jill Dyché once mentioned, it’s time to understand the following seldom-understood truth: there are different levels of “acceptability” for data. According to her, the key is to understand the company’s business requirements and then drill down from them to data requirements. That will show conclusively what “good enough” really means for the company.
Forrester defines “data quality” as “data used by business stakeholders to support their processes, decisions, or regulatory requirements with no reservations as to the data’s relevance, accuracy, integrity, and other previously agreed upon definitions of quality.”
The research company reminds us that the definition of data quality must come directly from business stakeholders, because they are the ones who understand the company’s business requirements and can therefore set the standards for its data quality.
2. Then, Forrester analysts insist on building a business case that starts small.
“Scoping and prioritization based on the business processes within the organization that are most critically affected by poor data quality is the key to defining the business case that will get your trusted data initiative off the ground,” experts say.
3. As word of the value of the data quality project gets around, other organizations, such as those responsible for order management and fulfillment, may also want to implement data quality improvements within their environments.
“Eventually, the tide will turn and these business stakeholders will sign up and support an enterprise-class solution to solving their data quality problems, but for that to happen, value must be demonstrated,” Forrester analysts conclude.
So, the bottom line: think about data quality upstream and start small, fostering early success and helping it spread throughout the company. What could be simpler?
Further reading
- Neglecting the Quality of Data Leads to Failed CRM Projects
- Poor Data Quality Can Have Long-Term Effects
- 5 Things to Watch Out for in Data Warehousing