Business Intuition: A Key Input in Big Data Analytics

Published on 24 March 15

I've sometimes been asked why it is that there's so much more emphasis on the involvement of the business in analytics projects, and why it's any different from the need to involve the business in IT or software development projects.

The question is a valid one. After all, no business software serves its users well unless it is designed to meet their needs. To find out what those needs are, developers have to speak to the business to understand its viewpoint and the problems it needs solved.

In big data analytics, the business also needs to be closely involved, but in a much more integral way. To start with, analytics is used in the hope that it will guide business choices and decisions, and answer questions by uncovering new insights. This is analogous to providing business objectives and requirements to a software developer. Beyond this, however, the business needs to be far more involved, and perhaps integrated into the analytics team, because insight discovery requires a much deeper application of business domain knowledge and intuition at every stage than software development does.

This can be illustrated by considering how big data stored in Hadoop needs to be cleaned and integrated. In conventional enterprise data stores and data warehouses, data is captured, stored and retrieved in a structured manner according to predefined data models. Field widths are fixed, data dictionaries describe how data should be formatted and what nomenclature should be used, and tables are usually normalized. The data is stored in regular columns and rows, identified and retrieved using primary keys, and correlated and joined using foreign keys. When data needs to be merged (using ETL, for example), a business perspective is needed to identify what data should be merged and how it should be reformatted. Beyond that, the bulk of the exercise is a technical one that is taken care of by developers: data is retrieved and merged according to keys and values.
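To make the "merged according to keys and values" point concrete, here is a minimal sketch in Python. The table contents, field names and the `join_orders_to_customers` helper are all invented for illustration; a real warehouse would do this with SQL joins or an ETL tool.

```python
# Hypothetical sample data: customers keyed by a primary key (customer_id),
# orders carrying that same value as a foreign key.
customers = {
    101: {"name": "A. Tan", "city": "Singapore"},
    102: {"name": "B. Lee", "city": "Kuala Lumpur"},
}
orders = [
    {"order_id": 1, "customer_id": 101, "amount": 250.0},
    {"order_id": 2, "customer_id": 102, "amount": 75.5},
    {"order_id": 3, "customer_id": 101, "amount": 120.0},
]


def join_orders_to_customers(orders, customers):
    """Merge each order with its customer row by following the foreign key."""
    merged = []
    for order in orders:
        customer = customers.get(order["customer_id"])
        if customer is None:
            continue  # no matching primary key, so the order is skipped
        merged.append({**order, **customer})
    return merged


merged_rows = join_orders_to_customers(orders, customers)
for row in merged_rows:
    print(row)
```

Once the business has said which tables belong together, the merge itself is mechanical: the key decides which rows match, and no judgment is needed.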

With big data stores in Hadoop, however, the exercise is not as straightforward. Data may be available in different forms and from different sources. It may be structured or unstructured. Data sources may be social media, log files, devices, streams, or emails. There may not be any primary keys or foreign keys with which to conveniently extract particular records. Data formats for the same element (e.g., names) may vary widely. The only way to begin putting the various elements and "records" of data together is by using a business perspective.

As an example, location data may not exist neatly in a database column named "Location", but may need to be inferred or derived using indirect methods. An IP address, for example, could reveal a location in some cases. The use of certain words in social media text or emails could be construed as having a positive or a negative connotation (for example, the word "awful" in the sentence "service was awful"). Other data could be used to detect buying patterns. These patterns and intuitive correlations about how to put data elements together are not ones that a technical IT team would typically be in a position to make; a business subject matter expert needs to think of them and investigate them. Once the technologists integrate the data as suggested, a business specialist must validate it further to determine which "records" make sense for the purpose of analytics and which should be discarded.
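The two kinds of inference mentioned above, location from an IP address and connotation from word choice, can be sketched as follows. This is only an illustration under stated assumptions: the network-to-location table, the word lists and the function names are hypothetical, the IP ranges are reserved documentation ranges, and real projects would rely on a geolocation database and a proper sentiment model. The point is that the rules themselves come from business knowledge.

```python
import ipaddress

# Hypothetical lookup table a business expert might supply.
# The networks are reserved documentation ranges, not real customer networks.
NETWORK_TO_LOCATION = {
    ipaddress.ip_network("203.0.113.0/24"): "Singapore",
    ipaddress.ip_network("198.51.100.0/24"): "London",
}

# Hypothetical word lists; in practice these depend on the business domain.
NEGATIVE_WORDS = {"awful", "terrible", "rude"}
POSITIVE_WORDS = {"great", "excellent", "friendly"}


def infer_location(ip_text):
    """Return a location for an IP address, or None if it cannot be inferred."""
    try:
        address = ipaddress.ip_address(ip_text)
    except ValueError:
        return None  # not a valid IP address
    for network, location in NETWORK_TO_LOCATION.items():
        if address in network:
            return location
    return None


def infer_sentiment(text):
    """Classify text by counting distinct matches against the word lists."""
    words = {word.strip(".,!?\"'").lower() for word in text.split()}
    negative = len(words & NEGATIVE_WORDS)
    positive = len(words & POSITIVE_WORDS)
    if negative > positive:
        return "negative"
    if positive > negative:
        return "positive"
    return "neutral"


print(infer_location("203.0.113.7"))
print(infer_sentiment("The service was awful."))
```

Records for which neither function yields a usable answer are the ones a business specialist would need to review before deciding whether to keep or discard them.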

Returning to the case of IT, once the requirements are known, the developers can go away, do their work and come back with the output. In analytics, however, a representative of the business needs to drive, provide input and direction, evaluate, validate and help maintain perspective at almost every step of the way. They have to be present and participate constantly, which is why the importance of their involvement cannot be overstated.

This blog is listed under Development & Implementations and Data & Information Management Community
