Tapping into the power of Big Data

Treating it differently from your core enterprise data is essential.

Like most corporations, the Walt Disney Co. is swimming in a rising sea of Big Data: information collected from business operations, customers, transactions, and the like; unstructured information created by social media and other Web repositories, including the Disney home page itself and sites for its theme parks, movies, books, and music; plus the sites of its many big business units, including ESPN and ABC.

“In any given year, we probably generate more data than the Walt Disney Co. did in its first 80 years of existence,” observes Bud Albers, executive vice president and CTO of the Disney Technology Shared Services Group. “The challenge becomes what do you do with it all?”

Albers and his team are in the early stages of answering their own question with an economical cluster-computing architecture built on a set of cost-effective and scalable technologies anchored by Apache Hadoop, an open-source, Java-based framework for distributed storage and processing. Its file system is modeled on the Google File System, and the project is developed under the Apache Software Foundation. These still-emerging technologies allow Disney analysts to explore multiple terabytes of information without the lengthy time requirements or high cost of traditional business intelligence (BI) systems.

This issue of the Technology Forecast examines how Apache Hadoop and these related technologies can derive business value from Big Data by supporting a new kind of exploratory analytics unlike traditional BI. These software technologies and their hardware cluster platform make it feasible not only to look for the needle in the haystack, but also to look for new haystacks. This kind of analysis demands an attitude of exploration—and the ability to generate value from data that hasn’t been scrubbed or fully modeled into relational tables.

Using Disney and other examples, this first article introduces the idea of exploratory BI for Big Data. The second article examines Hadoop clusters and technologies that support them, and the third article looks at steps CIOs can take now to exploit the future benefits. We begin with a closer look at Disney’s still-nascent but illustrative effort.

"In any given year, we probably generate more data than the Walt Disney Co. did in its first 80 years of existence." —Bud Albers of Disney