Forrester: Big Data on Cutting Edge

October 2, 2011

Big data is not the typical data mining or business intelligence application. There are new examples of applications on the cutting edge, as described in the NY Times article by Steve Lohr, “Sorting Reality from Hype.”

A Forrester Research report, published Friday, provides some leavening perspective on the big data phenomenon. The report is based on a survey of 60 Forrester clients who are using or experimenting with big data computing. It tries to define big data, assess its current applications and offer tips for corporate managers.

The takeaway points, from reading the report and an interview with a co-author, include: big data is an applied science project in most companies, and a major potential constraint is not the cost of the computing technology but the skilled people needed to carry out these projects: the data scientists.

The report concludes that big data is a real and significant trend. “Big data technology, while early-stage, is not vapor-ware,” the authors write.

The science-project nature of big data to date is highlighted, I thought, by six examples of innovators described at the start of the report. There is Google, of course. And IBM’s Watson, which defeated two human “Jeopardy” champions earlier this year, is cited. But two of the other examples — remote sensors collecting data on premature babies in a hospital ward at the University of Ontario, and a smart-grid project at the Tennessee Valley Authority — are also collaborations with IBM Research (which is not mentioned).

So yes, there are cutting-edge innovators with big data, but not a lot, it seems.

Boris Evelson, a Forrester analyst and coauthor of the report, with Brian Hopkins, explained that big-data computing differs fundamentally from using other data-analysis tools, like business intelligence and data warehousing software.

“With the other technology, you need to model something first,” Mr. Evelson said. “But what if you don’t know the questions? Big data is all about exploration without preconceived notions.”

Indeed, big data is about finding patterns in the proverbial noise of vast, unstructured data sets.

The big data tools, Mr. Evelson noted, are not themselves costly. Much of the software is based on open-source Hadoop, a framework for handling diverse data and probing it with distributed, parallel-processing computing clusters. There are commercial versions from companies including Cloudera, IBM, EMC and Hortonworks. And business intelligence software makers, like MicroStrategy, are integrating their offerings with big data tools. And there are cloud-based services emerging for big data applications.
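For readers curious what "probing data with parallel processing" looks like in practice, here is a minimal sketch of the map, shuffle and reduce pattern that Hadoop popularized. It is plain Python running on one machine, not Hadoop's actual API, and every function name in it (`map_chunk`, `shuffle`, `reduce_counts`, `word_count`) is a hypothetical one chosen for illustration. The sample log lines are made up as well.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks, workers=4):
    """Run the map step over the chunks with a pool of workers, then reduce.

    A real cluster would spread the chunks across many machines; a thread
    pool stands in for that here (an assumption made to keep this runnable).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    # Made-up log lines standing in for a large unstructured data set.
    logs = ["error disk full", "warning disk slow", "error network down"]
    print(word_count(logs))
```

The point of the pattern is that each chunk is processed independently in the map step, which is what lets the work be split across a cluster, and that no schema or model of the data has to be defined up front.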

Yet if the tools are comparatively low-cost, the skills needed are specialized and technical. Exploring for patterns in the data is not yet for the corporate rank and file.

“In big data today,” Mr. Evelson said, “it’s all about programming. You need Java programmers, computational statisticians and mathematicians.”
