Creating a cost-effective Big Data strategy

Disney's Bud Albers, Scott Thompson, and Matt Estes outline an agile approach that leverages open-source and cloud technologies.

Interview conducted by Galen Gruman and Alan Morrison

Bud Albers joined what is now the Disney Technology Shared Services Group two years ago as executive vice president and CTO. His management team includes Scott Thompson, vice president of architecture, and Matt Estes, principal data architect. The Technology Shared Services Group, located in Seattle, has a heritage dating back to the late 1990s, when Disney acquired Starwave and Infoseek.

The group supports all the Disney businesses ($38 billion in annual revenue), managing the company’s portfolio of Web properties. These include sites for the studio, the store, and the parks; ESPN; ABC; and a number of local television stations in major cities.

In this interview, Albers, Thompson, and Estes discuss how they’re expanding Disney’s Web data analysis footprint without incurring additional cost by implementing a Hadoop cluster. Albers and team freed up budget for this cluster by virtualizing servers and eliminating other redundancies.


PwC: Disney is such a diverse company, and yet there is clearly a lot of potential for synergy and cross-fertilization. How do you approach these opportunities from a data perspective?

BA: We try to understand the best way to work with and provide services to the consumer over the long term. We have some businesses that are very data intensive, and we have others that are less so because of the nature of their consumer audience. One of the challenges is always how to serve both kinds of businesses in ways that make sense. The sell-to relationships extend from the studio out to the distribution groups and the theater chains. If you’re selling to millions, you’re trying to understand the different audiences and how they connect.

One of the things I’ve been telling my folks from a data perspective is that you don’t send terabytes one way to be mated with a spreadsheet on the other side, right? We’re thinking through those kinds of pieces and trying to figure out how we move down a path. The net is that working with all these businesses gives us a diverse set of requirements, as you might imagine. We’re trying to stay ahead of where all the businesses are.
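Albers’ point about not shipping terabytes to meet a spreadsheet is the data-locality principle behind platforms such as the Hadoop cluster his group built: move the computation to where the data lives and send back only the small result. The following is a minimal, hypothetical sketch of that idea in Python; the log format, function names, and page-view counting task are illustrative assumptions, not Disney’s actual pipeline.

    from collections import Counter

    # Hypothetical sketch: the heavy aggregation runs where the data lives
    # (for example, inside a Hadoop cluster), and only a compact summary is
    # sent onward, never the raw terabytes.

    def map_phase(log_lines):
        """Runs next to the data: emit a (page, 1) pair per raw log record."""
        for line in log_lines:
            page = line.split()[0]  # assumed log format: "<page> <user> ..."
            yield page, 1

    def reduce_phase(pairs):
        """Also runs cluster-side: collapse the pairs into per-page totals."""
        totals = Counter()
        for page, count in pairs:
            totals[page] += count
        return totals  # kilobytes of summary, not terabytes of raw logs

    raw_logs = ["/home u1", "/espn u2", "/home u3"]  # stand-in for huge logs
    summary = reduce_phase(map_phase(raw_logs))
    print(summary)  # only this aggregate crosses the wire to the analyst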

In that respect, the questions I’m asking are, how do we get more agile, and how do we do it in a way that handles all the data we have? We must consider all of the new form factors being developed, all of which will generate lots of data. A big question is, how do we handle this data in a way that makes cost sense for the business and provides us an increased level of agility?

We hope to do in other areas what we’ve done with content distribution networks [CDNs]. We’ve had a tremendous amount of success in the CDN marketplace by standardizing, by staying in the middle of the road and not going to Akamai’s proprietary extensions, and by creating a dynamic marketplace. If we get a new episode of Lost, we can start streaming it, and I can be streaming 80 percent on Akamai and 20 percent on Level 3. Then we can decide to shift it, and I’ll give 80 percent to Limelight and 20 percent to Level 3. We can do that dynamically.
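The dynamic reallocation Albers describes amounts to weighted traffic splitting across CDNs. Here is a minimal, hypothetical sketch of that mechanism in Python; only the CDN names and percentages come from the interview, while the function name and the weighted random selection are assumptions for illustration, not Disney’s actual routing system.

    import random

    # Hypothetical illustration of weighted CDN allocation. The CDN names
    # and percentages come from the interview; the selection mechanism is
    # an assumption, not Disney's actual routing logic.

    def route_request(weights):
        """Pick a CDN for one streaming request, proportionally to its weight."""
        cdns = list(weights)
        return random.choices(cdns, weights=[weights[c] for c in cdns], k=1)[0]

    # Initial split: 80 percent on Akamai, 20 percent on Level 3.
    allocation = {"Akamai": 0.80, "Level 3": 0.20}
    print(route_request(allocation))

    # Rebalancing is just a new weight table; no CDN-specific code changes.
    allocation = {"Limelight": 0.80, "Level 3": 0.20}
    print(route_request(allocation))

Because the weights live in a single table, shifting traffic from Akamai to Limelight is a configuration change rather than a code change, which is what makes this kind of marketplace dynamic.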