Missing the Analytics Forest for the Trees

Published on 27 May 15

Mario Lewis

Having listened to a few industry acquaintances recently as they shared their observations of the whole Big Data phenomenon, and then having read Gartner Inc.'s summary of its 2015 Hadoop Adoption Study, I came to the conclusion that the state of play in this field is somewhat akin to where the adoption of the cloud and cloud-based services was just a few short years ago. At the time, there was a lot of hype about the cloud. Everyone was talking about it, software companies were evangelizing it, and most CIOs were watching it with keen interest. The unspoken realization was that while everyone had heard about the capital investment savings, operational cost improvements and scaling opportunities the cloud potentially presented, most were still not very sure what it actually consisted of, or how it was to be implemented.

To some extent, Big Data seems to be in a very similar space today, going by what I've heard and read about its uptake. Its adoption still has a way to go, although it is definitely growing. The reasons for this, as unearthed by the study, seem understandable given the facts. Mining Big Data usually means having to implement Hadoop in order to store and work with large volumes of unstructured data. While the concept of Hadoop is one thing, actually putting together all the various components of a Big Data technology stack and working with them until pre-processed data is ready for analytics is still not all that easily done, relatively speaking. There is a shortage of technology professionals, as evidenced by the study, and those companies that have tried it don't appear to be evangelizing it much just yet.

Clearly there's more work and waiting to be done, but it's all moving in the right direction as far as I can tell. For example, while a new generation of Hadoop technologists is in training and practicing its new skills as we speak, companies such as Platfora, Altiscale and others in the Big Data landscape have come out with offerings that aim to make it quicker to implement Hadoop and extract meaningful business output from it.

But there was one analysis of the survey findings that confused me: the low number of users of Hadoop implementations, relative to the cost of cluster hardware and associated software, seems to be a dissuading factor for its uptake. A number of reasons are cited to explain the low number of users. I haven't read the detailed study report, but I would ask why the number of users was being associated with the investment in Hadoop implementation. Hadoop is not an end-user application; it's a component in an analytics technology stack. Shouldn't the comparison have been between the investment and the business gain that could ultimately be attributed to decisions made on the basis of Big Data analytics insights? For example, a supercomputer that analyses volumes of data to predict the weather would probably have very few direct users (weather experts), but their predictions could be used to prepare an entire nation for the impacts of severe weather.

This is what makes me ask whether many potential analytics users are missing out on its benefits because they are worried about the difficulty of implementing Hadoop, or because they don't fully understand it just yet. I'm sure that with the passage of time this gap will be filled. In the meantime their focus should remain on the end application, which is business analytics. Analytics does not necessarily need Big Data (or large volumes of unstructured data). It just needs enough data to make statistical models and correlations valid, and that data may well be in the large volumes of structured data that are already being captured in existing enterprise systems. But that's a topic for another day.
