An article from McKinsey & Company, a leading management consulting and research firm, declares: “By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep [data] analytical skills as well as [a shortage of] 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.” Might this be your shop?
Many companies today are scrambling to assemble an IT infrastructure to support data analytics. But before they can even begin, they have to figure out what kind of analytics the organization will want to deploy. Big data is just one of many possibilities, and the infrastructure that works for some types of data analytics won’t work for others.
Just off the top of his head, this blogger can list a dozen types of data analytics in play: OLAP, business intelligence (BI), business analytics, predictive analytics, real-time analytics, big data analytics, social analytics, web analytics, click stream analytics, mobile analytics, brand/reputation analysis, and competitive intelligence. You probably have a few of these already.
As advanced analytics picks up momentum, data center managers will be left trying to cobble together an appropriate IT infrastructure for whatever flavors of analytics the organization intends to pursue. Unless you have a very generous budget, you can’t do it all.
For example, big data is unbelievably hot right now, so maybe it makes sense to build an infrastructure to support big data analytics. But predictive analytics, the up-and-coming superstar of business analytics, is an equally hot capability because it can counter fraud or boost online conversion immediately, while the criminal or customer is still online.
BI, however, has been the analytics workhorse for many organizations for a decade or more, along with OLAP, and companies already have a working infrastructure for that. It consists of a data warehouse with relational databases and common query, reporting, and cubing tools. The IT infrastructure, for the most part, already is in place and working.
On the other hand, if top management now wants big data analytics, real-time analytics, or predictive analytics, you may need a different information architecture and design, different tools, and possibly even different underlying technologies. Big data, for example, relies on Hadoop, a batch-oriented processing framework that does not natively use SQL. (Vendors are making a valiant effort to graft a SQL-like interface onto Hadoop, with varying degrees of success.)
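To make the batch point concrete, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps that a Hadoop-style job goes through. It is not Hadoop code and uses no Hadoop API; the log format, the sample records, and every function name are assumptions made up for illustration. The point is simply that the answer arrives only after the whole data set has been read, and that the logic is written as code rather than as a SQL query.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical click log, one record per line: "user_id<TAB>page".
LOG_LINES = [
    "u1\t/home",
    "u2\t/cart",
    "u1\t/cart",
    "u3\t/home",
    "u1\t/checkout",
]


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (user_id, 1) pair for every log record."""
    for line in lines:
        user_id, _page = line.split("\t")
        yield user_id, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values for each key to get clicks per user."""
    return {key: sum(values) for key, values in grouped.items()}


def run_batch_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run the full pass; no result exists until every record has been processed."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(run_batch_job(LOG_LINES))
```

On a real cluster the same three steps would be spread across many machines and disks, which is where the batch latency comes from.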
Real-time analytics is just that, real-time, which makes it basically the opposite of Hadoop. It works best using in-memory data and logic processing to return the results of analytic queries in seconds or even microseconds. Data will be stored on flash storage or in large amounts of cache memory, as close to the processing as it can get.
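As a rough illustration of the in-memory approach, the sketch below keeps a running total of recent transaction amounts in memory and updates it as each event arrives, so the current answer is always available without rescanning stored data. The class name, the 60-second window, and the sample amounts are all assumptions chosen for the example; a production real-time analytics engine would be considerably more involved.

```python
from collections import deque
from typing import Deque, Tuple


class SlidingWindowCounter:
    """Running total of event amounts over the most recent window, held in memory."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[Tuple[float, float]] = deque()  # (timestamp, amount)
        self._total = 0.0

    def add(self, timestamp: float, amount: float) -> float:
        """Record one event and return the updated total for the window."""
        self._events.append((timestamp, amount))
        self._total += amount
        self._evict(timestamp)
        return self._total

    def _evict(self, now: float) -> None:
        """Drop events that are a full window or more older than the newest one."""
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_amount = self._events.popleft()
            self._total -= old_amount


if __name__ == "__main__":
    counter = SlidingWindowCounter(window_seconds=60)
    for ts, amount in [(0, 100.0), (30, 50.0), (70, 25.0)]:
        print(ts, counter.add(ts, amount))
```

Each update touches only the newest event and any expired ones, which is what lets a check such as a hypothetical fraud threshold run while the customer is still online.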
An information architecture that is optimized for big data’s unstructured batch data cannot also be used for real-time analytics. And the traditional BI data warehouse infrastructure probably isn’t optimized for either of them. The solution calls for either extending your existing data management infrastructure to encompass the latest analytics that management wants or designing and building yet another IT data infrastructure. Over the past year, however, the cloud has emerged as another place where organizations can run analytics, provided cloud providers can overcome the latencies inherent in the cloud.