Data Quality Management for Better Big Data Analytics

Published on 31 July 15
For businesses that have begun using it, business analytics has become a valuable means of uncovering insights that aid decision making in many areas. For business analytics to produce reliable results, the underlying data has to be available in the right volumes (for statistically valid results) and at the right time. It must also be of the right quality.

There's no doubt that most medium and large businesses capture and store significant amounts of data, which is processed by their core business support systems and business intelligence (BI) systems and maintained by their IT departments. Even so, there may be hesitation about how to start an analytics initiative. One reason may be the lack of a clear data quality management system, one that provides confidence that data is owned, managed, controlled, and reliable, and can be made available when required.

Although the whole exercise of setting up and implementing a data quality management system is too large to describe here, the following provides a very high-level perspective on how to approach the problem.

Identify existing data assets and new requirements

The first phase in taking control of all the data in the organization is to inventory it and know more about it. Typical questions to answer at this stage are:

  • What types of data are there? How rich is each one?
  • Who owns (or maintains) the data? Who consumes it? Who makes it available?
  • Where and how is the data stored? And for how long?
  • Is the data quality level known?
  • How and where is data captured?
  • Why is this data maintained?
  • Are new data requirements known, and what are they?
  • Is metadata available?
  • What types of data are secure, and how does this security work?
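
The inventory questions above could be captured as one record per data asset, so that unanswered questions become visible gaps. The following is only a minimal sketch: the `DataAsset` class, its field names, and the `crm_contacts` example are hypothetical illustrations, not part of any particular tool or of the approach described in this article.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class DataAsset:
    """One row of a hypothetical data inventory; field names are illustrative."""
    name: str
    data_type: Optional[str] = None          # What type of data is it?
    owner: Optional[str] = None              # Who owns or maintains it?
    consumers: List[str] = field(default_factory=list)  # Who consumes it?
    storage_location: Optional[str] = None   # Where and how is it stored?
    retention_period: Optional[str] = None   # For how long is it kept?
    quality_level: Optional[str] = None      # Is the quality level known?
    capture_method: Optional[str] = None     # How and where is it captured?
    purpose: Optional[str] = None            # Why is it maintained?
    has_metadata: Optional[bool] = None      # Is metadata available?
    security_controls: Optional[str] = None  # How is it secured?


def unanswered(asset: DataAsset) -> List[str]:
    """Return the names of inventory fields that are still unknown."""
    missing = []
    for f in fields(asset):
        value = getattr(asset, f.name)
        if value is None or value == []:
            missing.append(f.name)
    return missing


# Hypothetical example: a partially documented asset.
crm = DataAsset(
    name="crm_contacts",
    data_type="customer master data",
    owner="Sales Operations",
    consumers=["Marketing", "BI team"],
    storage_location="CRM database",
)
print(unanswered(crm))
```

Listing the unanswered fields per asset gives a simple to-do list for the inventory phase; a real inventory would likely live in a catalogue or spreadsheet rather than in code.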

Strategise and Plan

In the second phase, the objective is to define the strategy of the data management system, along with its highest-level components:

  • Goals: These should support the business goals.
  • Organization: What is the data management organization structure? Who will own it, and who will be responsible for the various aspects of data management, such as security, stewardship, ageing, and retention?
  • Scope of the data management system: Identify inclusions, i.e., what types of data fall within the scope of the data management system? Related processes and standards would also be included. What are the high-level objectives for each inclusion?
  • Prepare a roadmap for implementation: Prioritise the various areas and types of work. Plan for implementation in terms of projects, teams, and schedules.

Define and Implement New Data Quality Management System

  • Governance: What is the governance organization structure? Who are the authorities, and what are the authorization processes? What are the touchpoints and controls with vendors? What standards does the data management system follow, and how is data quality assured?
  • Processes and Technologies: Data management requires a holistic set of standard processes and guidelines that address how data should be sourced, handled, stored, and accessed, so that any user can be confident that they always have the right version, drawn from a single source. These processes must be defined with reference to the technologies used for these purposes. At a minimum, these processes and technologies should address the following:

      1. Sourcing and capture
      2. Cleansing and Integration
      3. Quality checks
      4. Duplicate control
      5. Versioning
      6. Storage Management
      7. Archiving
      8. Storage
      9. MDM (Master Data Management)
      10. Security and Compliance
      11. Development of new systems for handling data
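
To make two of the items above concrete, quality checks and duplicate control, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the record schema (`customer_id`, `name`, `email`), the deliberately loose email pattern, and the rule of treating records with the same normalized email as duplicates. Real rules would come from the standards defined under governance.

```python
import re
from typing import Dict, List, Tuple

Record = Dict[str, str]

REQUIRED_FIELDS = ("customer_id", "name", "email")  # hypothetical schema
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately loose


def quality_issues(record: Record) -> List[str]:
    """Return a list of human-readable problems found in one record."""
    issues = []
    for name in REQUIRED_FIELDS:
        if not record.get(name, "").strip():
            issues.append(f"missing {name}")
    email = record.get("email", "").strip()
    if email and not EMAIL_PATTERN.match(email):
        issues.append("invalid email")
    return issues


def split_duplicates(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Keep the first record per normalized email; return (kept, duplicates)."""
    seen = set()
    kept, duplicates = [], []
    for record in records:
        key = record.get("email", "").strip().lower()
        if key and key in seen:
            duplicates.append(record)
        else:
            if key:
                seen.add(key)
            kept.append(record)
    return kept, duplicates


# Made-up sample data for illustration only.
records = [
    {"customer_id": "1", "name": "Ana", "email": "ana@example.com"},
    {"customer_id": "2", "name": "Ana B.", "email": "ANA@example.com "},
    {"customer_id": "3", "name": "", "email": "not-an-email"},
]

kept, duplicates = split_duplicates(records)
for record in kept:
    print(record["customer_id"], quality_issues(record))
print("duplicates:", [r["customer_id"] for r in duplicates])
```

In this sketch the second record is flagged as a duplicate of the first because their emails match after trimming and lower-casing, and the third record is kept but reported with a missing name and an invalid email. Matching on a single normalized field is a simplification; master data management tools typically use richer matching rules.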


















View Comments (2)
  1. 05 August 15

    Donal, thanks for your input! I do agree that if a formal data management system is not already in place, then the early stages of a Big Data project might well provide some good experience that can be used in setting up a data management system. But even if there is no immediate plan for Big Data, it's helpful to have a management system with formal controls in place for enterprise data. It would certainly be useful for analytics initiatives that are required to start off with whatever "non-Big" data is available.

  2. 05 August 15

    Mario, these are valid questions to ask for any data quality management initiative you might undertake. There is nothing specific here about Big Data. In my experience, the first Big Data project is often a discovery one, and by its very nature would follow a different set of processes. Note that over time, the process you outlined above might be introduced as trusted insights from the discovery phase are acted upon.
