As director of global online and strategic advertising sales for FT.com, the online face of the Financial Times, Jon Slade says he “looks at the 6 billion ad impressions [that FT.com offers] each year and works out which one is worth the most for any particular client who might buy.” This activity previously required labor-intensive extraction methods from a multitude of databases and spreadsheets. Slade made the process much faster and vastly more effective after working with Metamarkets, a company that offers a cloud-based, in-memory analytics service called Druid.
“Before, the sales team would send an e-mail to ad operations for an inventory forecast, and it could take a minimum of eight working hours and as long as two business days to get an answer,” Slade says. Now, with a direct interface to the data, it takes a mere eight seconds, freeing up the ad operations team to focus on more strategic issues. The parallel processing, in-memory technology, the interface, and many other enhancements led to better business results, including double-digit growth in ad yields and 15 to 20 percent accuracy improvement in the metrics for its ad impression supply.
The technology trends behind FT.com’s improvements in advertising operations—more accessible data; faster, less-expensive computing; new software tools; and improved user interfaces—are driving a new era in analytics use at large companies around the world, in which enterprises make decisions with a precision comparable to scientific insight. The new analytics uses a rigorous scientific method, including hypothesis formation and testing, with science-oriented statistical packages and visualization tools. It is spawning business unit “data scientists” who are replacing the centralized analytics units of the past. These trends will accelerate, and business leaders who embrace the new analytics will be able to create cultures of inquiry that lead to better decisions throughout their enterprises. (See Figure 1.)
Figure 1: How better customer analytics capabilities are affecting enterprises
This issue of the Technology Forecast explores the impact of the new analytics and this culture of inquiry. This first article examines the essential ingredients of the new analytics, using several examples. The other articles in this issue focus on the technologies behind these capabilities (see the article, “The art and science of new analytics technology,” on page 30) and identify the main elements of a CIO strategic framework for effectively taking advantage of the full range of analytics capabilities (see the article, “How CIOs can build the foundation for a data science culture,” on page 58).
Basic computing trends are providing the momentum for a third wave in analytics that PwC calls the new analytics. Processing power and memory keep increasing, the ability to leverage massive parallelization continues to expand in the cloud, and the cost per processed bit keeps falling.
FT.com benefited from all of these trends. Slade needs multiple computer screens on his desk just to keep up. His job requires a deep understanding of the readership and which advertising suits them best. Ad impressions—appearances of ads on web pages—are the currency of high-volume media industry websites. The impressions need to be priced based on the reader segments most likely to see them and click through. Chief executives in France, for example, would be a reader segment FT.com would value highly.
“The trail of data that users create when they look at content on a website like ours is huge,” Slade says. “The real challenge has been trying to understand what information is useful to us and what we do about it.”
FT.com’s analytics capabilities were a challenge, too. “The way that data was held—the demographics data, the behavior data, the pricing, the available inventory—was across lots of different databases and spreadsheets,” Slade says. “We needed an almost witchcraft-like algorithm to provide answers to ‘How many impressions do I have?’ and ‘How much should I charge?’ It was an extremely labor-intensive process.”
FT.com saw a possible solution when it first talked to Metamarkets about an initial concept, which evolved as they collaborated. Using Metamarkets’ analytics platform, FT.com could quickly iterate and investigate numerous questions to improve its decision-making capabilities. “Because our technology is optimized for the cloud, we can harness the processing power of tens, hundreds, or thousands of servers depending on our customers’ data and their specific needs,” states Mike Driscoll, CEO of Metamarkets. “We can ask questions over billions of rows of data in milliseconds. That kind of speed combined with data science and visualization helps business users understand and consume information on top of big data sets.”
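To make the idea of a direct inventory query concrete, here is a minimal Python sketch of an in-memory aggregation that answers "How many impressions do I have?" for one reader segment. The field names (country, role, impressions) and the sample figures are illustrative assumptions, not FT.com data, and the code is not a representation of how the Metamarkets platform works internally.

```python
from collections import defaultdict

# Hypothetical impression log rows; field names and numbers are illustrative only.
IMPRESSION_LOG = [
    {"country": "FR", "role": "chief executive", "impressions": 1200},
    {"country": "FR", "role": "analyst", "impressions": 5400},
    {"country": "UK", "role": "chief executive", "impressions": 3100},
    {"country": "FR", "role": "chief executive", "impressions": 800},
]


def impressions_by_segment(rows):
    """Sum impressions for each (country, role) reader segment in one pass."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["country"], row["role"])] += row["impressions"]
    return dict(totals)


segment_totals = impressions_by_segment(IMPRESSION_LOG)
print(segment_totals[("FR", "chief executive")])  # 2000
```

Because the whole log sits in memory and is scanned once, a question like this can be re-asked with a different segment definition without waiting on a separate forecasting request.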
Decades ago, in the first wave of analytics, small groups of specialists managed computer systems, and even smaller groups of specialists looked for answers in the data. Businesspeople typically needed to ask the specialists to query and analyze the data. As enterprise data grew, collected from enterprise resource planning (ERP) systems and other sources, IT stored the more structured data in warehouses so analysts could assess it in an integrated form. When business units began to ask for reports from collections of data relevant to them, data marts were born, but IT still controlled all the sources.
The second wave of analytics saw variations of centralized top-down data collection, reporting, and analysis. In the 1980s, grassroots decentralization began to counter that trend as the PC era ushered in spreadsheets and other methods that quickly gained widespread use—and often a reputation for misuse. Data warehouses and marts continue to store a wealth of helpful data.
In both waves, the challenge for centralized analytics was to respond to business needs when the business units themselves weren’t sure what findings they wanted or clues they were seeking.
The third wave does that by giving access and tools to those who act on the findings. New analytics taps the expertise of the broad business ecosystem to address the lack of responsiveness from central analytics units. (See Figure 2.) Speed, storage, and scale improvements, with the help of cloud co-creation, have made this decentralized analytics possible. The decentralized analytics innovation has evolved faster than the centralized variety, and PwC expects this trend to continue.
Figure 2: The three waves of analytics and the impact of decentralization
“In the middle of looking at some data, you can change your mind about what question you’re asking. You need to be able to head toward that new question on the fly,” says Jock Mackinlay, director of visual analysis at Tableau Software, one of the vendors of the new visualization front ends for analytics. “No automated system is going to keep up with the stream of human thought.”
Big data techniques—including NoSQL¹ and in-memory databases, advanced statistical packages (from SPSS and SAS to open source offerings such as R), visualization tools that put interactive graphics in the control of business unit specialists, and more intuitive user interfaces—are crucial to the new analytics. They make it possible for many people in the workforce to do some basic exploration. They allow business unit data scientists to use larger data sets and to iterate more as they test hypotheses, refine questions, and find better answers to business problems.
E. & J. Gallo Winery, one of the world’s largest producers and distributors of wines, recognizes the need to precisely identify its customers for two reasons: some local and state regulations mandate restrictions on alcohol distribution, and marketing brands to individuals requires knowing customer preferences.
“The majority of all wine is consumed within four hours and five miles of being purchased, so this makes it critical that we know which products need to be marketed and distributed by specific destination,” says Kent Kushar, Gallo’s CIO.
Gallo knows exactly how its products move through distributors, but tracking beyond them is less clear. Some distributors are state liquor control boards, which supply the wine products to retail outlets and other end customers. Some sales are through military post exchanges, and in some cases there are restrictions and regulations because they are offshore.
Gallo has a large compliance department to help it manage the regulatory environment in which its products are sold, but the company also wants to learn more about the customers who eventually buy and consume those products, and to gather information from them that can help it create new products suited to local tastes.
Gallo sometimes cannot obtain point-of-sale data from retailers to match what goes out with what is sold. Syndicated data, from sources such as Information Resources, Inc. (IRI), serves as the matching link between distribution and actual consumption. This results in the accumulation of more than 1 GB of data each day as source information for compliance and marketing.
Years ago, Gallo’s senior management understood that customer analytics would be increasingly important. The company’s most recent investments are extensions of what it wanted to do 25 years ago but was limited by availability of data and tools. Since 1998, Gallo IT has been working on advanced data warehouses, analytics tools, and visualization. Gallo was an early adopter of visualization tools and created IT subgroups within brand marketing to leverage the information gathered.
The success of these early efforts has spurred Gallo to invest even more in analytics. “We went from step function growth to logarithmic growth of analytics; we recently reinvested heavily in new appliances, a new system architecture, new ETL [extract, transform, and load] tools, and new ways our SQL calls were written; and we began to coalesce unstructured data with our traditional structured consumer data,” says Kushar.
“Recognizing the power of these capabilities has resulted in our taking a 10-year horizon approach to analytics,” he adds. “Our successes with analytics to date have changed the way we think about and use analytics.”
The result is that Gallo no longer relies on a single instance database, but has created several large purpose-specific databases. “We have also created new service level agreements for our internal customers that give them faster access and more timely analytics and reporting,” Kushar says. Internal customers for Gallo IT include supply chain, sales, finance, distribution, and the web presence design team.
Data scientists are nonspecialists who follow a scientific method of iterative and recursive analysis with a practical result in mind. Even without formal training, some business users in finance, marketing, operations, human capital, or other departments already have the skills, experience, and mind-set to be data scientists. Others can be trained. The teaching of the discipline is an obvious new focus for the CIO. (See the article, “How CIOs can build the foundation for a data science culture,” on page 58.)
Visualization tools have been especially useful for Ingram Micro, a technology products distributor, which uses them to choose optimal warehouse locations around the globe. Warehouse location is a strategic decision, and Ingram Micro can run many what-if scenarios before it decides. One business result is shorter-term warehouse leases that give Ingram Micro more flexibility as supply chain requirements shift due to cost and time.
“Ensuring we are at the efficient frontier for our distribution is essential in this fast-paced and tight-margin business,” says Jonathan Chihorek, vice president of global supply chain systems at Ingram Micro. “Because of the complexity, size, and cost consequences of these warehouse location decisions, we run extensive models of where best to locate our distribution centers at least once a year, and often twice a year.”
Modeling has become easier thanks to mixed-integer linear programming optimization tools that crunch large and diverse data sets encompassing many factors. “A major improvement came from the use of fast 64-bit processors and solid-state drives that reduced scenario run times from six to eight hours down to a fraction of that,” Chihorek says. “Another breakthrough for us has been improved visualization tools, such as spider and bathtub diagrams that help our analysts choose the efficient frontier curve from a complex array of data sets that otherwise look like lists of numbers.”
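The kind of what-if scenario described here can be illustrated with a toy warehouse location problem in Python. This is only a sketch: it uses brute-force enumeration of site combinations rather than a real optimization solver, and every site name, cost, and demand figure is a made-up assumption, not Ingram Micro data.

```python
from itertools import combinations

# Hypothetical data: annual fixed cost per candidate warehouse site and
# per-unit shipping cost from each site to each customer region.
FIXED_COST = {"A": 35, "B": 40, "C": 25}
SHIP_COST = {
    "A": {"north": 4, "south": 9, "west": 7},
    "B": {"north": 8, "south": 3, "west": 6},
    "C": {"north": 7, "south": 8, "west": 2},
}
DEMAND = {"north": 10, "south": 12, "west": 8}


def total_cost(open_sites):
    """Fixed costs plus shipping, with each region served by its cheapest open site."""
    fixed = sum(FIXED_COST[site] for site in open_sites)
    shipping = sum(
        units * min(SHIP_COST[site][region] for site in open_sites)
        for region, units in DEMAND.items()
    )
    return fixed + shipping


def best_plan(max_sites):
    """Enumerate every non-empty set of up to max_sites sites and keep the cheapest."""
    candidates = (
        subset
        for k in range(1, max_sites + 1)
        for subset in combinations(sorted(FIXED_COST), k)
    )
    return min(candidates, key=total_cost)


plan = best_plan(max_sites=3)
print(plan, total_cost(plan))  # ('B', 'C') 187
```

Enumeration works for three candidate sites but grows exponentially with the number of sites, which is why problems of realistic size are handed to dedicated optimization tools and faster hardware.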
Analytics tools were once the province of experts. They weren’t intuitive, and they took a long time to learn. Those who were able to use them tended to have deep backgrounds in mathematics, statistical analysis, or some scientific discipline. Only companies with dedicated teams of specialists could make use of these tools. Over time, academia and the business software community have collaborated to make analytics tools more user-friendly and more accessible to people who aren’t steeped in the mathematical expressions needed to query and get good answers from data.
Products from QlikTech, Tableau Software, and others immerse users in fully graphical environments because most people gain understanding more quickly from visual displays of numbers rather than from tables. “We allow users to get quickly to a graphical view of the data,” says Tableau Software’s Mackinlay. “To begin with, they’re using drag and drop for the fields in the various blended data sources they’re working with. The software interprets the drag and drop as algebraic expressions, and that gets compiled into a query database. But users don’t need to know all that. They just need to know that they suddenly get to see their data in a visual form.”
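The translation Mackinlay describes, from dragged fields to a database query, can be sketched in a few lines of Python. This is a simplified illustration and not Tableau's actual engine: the `build_query` helper, the `sales` table, and its columns are hypothetical, and the example uses Python's built-in `sqlite3` module as a stand-in database.

```python
import sqlite3


def build_query(table, dimensions, measure):
    """Turn dragged fields into a GROUP BY query (identifiers are assumed trusted)."""
    dims = ", ".join(dimensions)
    return (
        f"SELECT {dims}, SUM({measure}) AS total "
        f"FROM {table} GROUP BY {dims} ORDER BY {dims}"
    )


# Hypothetical sales data held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EMEA", "Q1", 100.0), ("EMEA", "Q1", 50.0), ("APAC", "Q2", 75.0)],
)

# The user "drags" region and quarter as dimensions and revenue as the measure.
query = build_query("sales", ["region", "quarter"], "revenue")
rows = conn.execute(query).fetchall()
print(query)
print(rows)
```

The user only chooses fields; the query text is generated and executed behind the scenes, and the resulting rows are what a charting layer would draw.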
Tableau Software itself is a prime example of how these tools are changing the enterprise. “Inside Tableau we use Tableau everywhere, from the receptionist who’s keeping track of conference room utilization to the salespeople who are monitoring their pipelines,” Mackinlay says.
These tools are also enabling more finance, marketing, and operational executives to become data scientists, because they help them navigate the data thickets.
The huge quantities of data in the cloud and the availability of enormous low-cost processing power can help enterprises analyze various business problems—including efforts to understand customers better, especially through social media. These external clouds augment data that business units already have direct access to internally.
Ingram Micro uses large, diverse data sets for warehouse location modeling, Chihorek says. Among them: size, weight, and other physical attributes of products; geographic patterns of consumers and anticipated demand for product categories; inbound and outbound transportation hubs, lead times, and costs; warehouse lease and operating costs, including utilities; and labor costs—to name a few.
Social media can also augment internal data for enterprises willing to learn how to use it. Some companies ignore social media because so much of the conversation seems trivial, but they miss opportunities.
Consider a North American apparel maker that was repositioning a brand of shoes and boots. The manufacturer was mining conventional business data for insights about brand status, but it had not conducted any significant analysis of social media conversations about its products, according to Josée Latendresse, who runs Latendresse Groupe Conseil, which was advising the company on its repositioning effort. “We were neglecting the wealth of information that we could find via social media,” she says.
To expand the analysis, Latendresse brought in technology and expertise from Nexalogy Environics, a company that analyzes the interest graph implied in online conversations—that is, the connections between people, places, and things. (See “Transforming collaboration with social tools,” Technology Forecast 2011, Issue 3, for more on interest graphs.) Nexalogy Environics studied millions of correlations in the interest graph and selected fewer than 1,000 relevant conversations from 90,000 that mentioned the products. In the process, Nexalogy Environics substantially increased the “signal” and reduced the “noise” in the social media about the manufacturer. (See Figure 3.)
Figure 3: Improving the signal-to-noise ratio in social media monitoring
What Nexalogy Environics discovered suggested the next step for the brand repositioning. “The company wasn’t marketing to people who were blogging about its stuff,” says Claude Théoret, president of Nexalogy Environics. The shoes and boots were designed for specific industrial purposes, but the blogging influencers noted their fashion appeal and their utility when riding off-road on all-terrain vehicles and in other recreational settings. “That’s a whole market segment the company hadn’t discovered.”
Latendresse used the analysis to help the company expand and refine its intelligence process more generally. “The key step,” she says, “is to define the questions that you want to have answered. You will definitely be surprised, because the system will reveal customer attitudes you didn’t anticipate.”
Following the social media analysis (SMA), Latendresse saw the retailer and its user focus groups in a new light. The analysis “had more complete results than the focus groups did,” she says. “You could use the focus groups afterward to validate the information evident in the SMA.” The revised intelligence development process now places focus groups closer to the end of the cycle. (See Figure 4.)
Figure 4: Adding social media analysis techniques suggests other changes to the BI process
Third parties such as Nexalogy Environics are among the first to take advantage of cloud analytics. Enterprises like the apparel maker may have good data collection methods but have overlooked opportunities to mine data in the cloud, especially social media. As cloud capabilities evolve, enterprises are learning to conduct more iteration, to question more assumptions, and to discover what else they can learn from data they already have.
One way to start with new analytics is to rally the workforce around a single core metric, especially when that core metric is informed by other metrics generated with the help of effective modeling. The core metric and the model that helps everyone understand it can steep the culture in the language, methods, and tools around the process of attaining that goal.
A telecom provider illustrates the point. The carrier was concerned about big peaks in churn—customers moving to another carrier—but hadn’t methodically mined the whole range of its call detail records to understand the issue. Big data analysis methods made a large-scale, iterative analysis possible. The carrier partnered with Dataspora, a consulting firm run by Driscoll before he founded Metamarkets. (See Figure 5.)²
Figure 5: The benefits of big data analytics: A carrier example
“We analyzed 14 billion call data records,” Driscoll recalls, “and built a high-frequency call graph of customers who were calling each other. We found that if two subscribers who were friends spoke more than once for more than two minutes in a given month and the first subscriber cancelled their contract in October, then the second subscriber became 500 percent more likely to cancel their contract in November.”
Data mining on that scale required distributed computing across hundreds of servers and repeated hypothesis testing. The carrier assumed that dropped calls might be one reason why clusters of subscribers were cancelling contracts, but the Dataspora analysis disproved that notion, finding no correlation between dropped calls and cancellation.
“There were a few steps we took. One was to get access to all the data and next do some engineering to build a social graph and other features that might be meaningful, but we also disproved some other hypotheses,” Driscoll says. Watching what people actually did confirmed that circles of friends were cancelling in waves, which led to the peaks in churn. Intense focus on the key metric illustrated to the carrier and its workforce the power of new analytics.
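The hypothesis test Driscoll describes can be sketched on a toy scale in Python: build a graph of "friends" from call records, then compare next-month cancellation rates for subscribers who did and did not have a friend cancel. The subscriber names and call records below are invented for illustration, the thresholds follow the quote (more than one call longer than two minutes in a month), and the code is an assumption about the general approach, not Dataspora's actual method.

```python
from collections import defaultdict

# Hypothetical call records for one month: (caller, callee, minutes).
CALLS = [
    ("ann", "bob", 3.0), ("ann", "bob", 5.0),  # two long calls: friends
    ("cat", "dan", 4.0), ("dan", "cat", 2.5),  # two long calls: friends
    ("eve", "fay", 1.0), ("eve", "fay", 6.0),  # only one long call: not friends
    ("gus", "hal", 3.0), ("gus", "hal", 4.0),  # two long calls: friends
]
SUBSCRIBERS = {"ann", "bob", "cat", "dan", "eve", "fay", "gus", "hal"}
CANCELLED_OCT = {"ann", "cat"}
CANCELLED_NOV = {"bob", "dan", "fay"}


def friend_graph(calls, min_minutes=2.0, min_calls=2):
    """Link two subscribers who had at least min_calls calls longer than min_minutes."""
    long_calls = defaultdict(int)
    for a, b, minutes in calls:
        if a != b and minutes > min_minutes:
            long_calls[frozenset((a, b))] += 1
    graph = defaultdict(set)
    for pair, count in long_calls.items():
        if count >= min_calls:
            a, b = tuple(pair)
            graph[a].add(b)
            graph[b].add(a)
    return graph


def churn_rates(graph, subscribers, cancelled_first, cancelled_next):
    """Next-month cancellation rate with vs. without a friend who already cancelled."""
    def rate(group):
        return len(group & cancelled_next) / len(group) if group else 0.0

    still_active = subscribers - cancelled_first
    exposed = {s for s in still_active if graph[s] & cancelled_first}
    return rate(exposed), rate(still_active - exposed)


graph = friend_graph(CALLS)
exposed_rate, baseline_rate = churn_rates(graph, SUBSCRIBERS, CANCELLED_OCT, CANCELLED_NOV)
print(exposed_rate, baseline_rate)  # 1.0 0.25
```

On real data the same comparison would run over billions of records and many candidate features, which is where distributed computing and repeated hypothesis testing come in.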
The more pervasive the online environment, the more common the sharing of information becomes. Whether an enterprise is a gaming or an e-commerce company that can instrument its own digital environment, or a smart grid utility that generates, slices, dices, and shares energy consumption analytics for its customers and partners, better analytics are going direct to the customer as well as other stakeholders. And they’re being embedded where users can more easily find them.
For example, energy utilities preparing for the smart grid are starting to invite the help of customers by putting better data and more broadly shared operational and customer analytics at the center of a co-created energy efficiency collaboration.
Some of the data in the E. & J. Gallo Winery information architecture is for production and quality control, not just customer analytics. More recently, Gallo has adopted complex event processing methods on the source information, so it can look at successes and failures early in its manufacturing execution system, sales order management, and the accounting system that front ends the general ledger.
Information and information flow are the lifeblood of Gallo, but it is clearly a team effort to make the best use of the information.
Mining the information for patterns and insights in specific situations requires the team. A key goal is what Gallo refers to as demand sensing—to determine the stimulus that creates demand by brand and by product. This is not just a computer task, but is heavily based on human intervention to determine what the data reveal (for underlying trends of specific brands by location), or to conduct R&D in a test market, or to listen to the web platforms.
These insights inform a specific design for “smart shelving,” which is the placement of products by geography and location within the store. Gallo offers a virtual wine shelf design schematic to retailers, which helps the retailer design the exact details of how wine will be displayed—by brand, by type, and by price. Gallo’s wine shelf design schematic will help the retailer optimize sales, not just for Gallo brands but for all wine offerings.
Before Gallo’s wine shelf design schematic, wine sales were not a major source of retail profits for grocery stores, but now they are the first or second highest profit generators in those stores. “Because of information models such as the wine shelf design schematic, Gallo has been the wine category captain for some grocery stores for 11 years in a row so far,” says Kent Kushar, CIO of Gallo.
Saul Zambrano, senior director of customer energy solutions at Pacific Gas & Electric (PG&E), an early installer of smart meters, points out that policymakers are encouraging more third-party access to the usage data from the meters. “One of the big policy pushes at the regulatory level is to create platforms where third parties can—assuming all privacy guidelines are met—access this data to build business models they can drive into the marketplace,” says Zambrano. “Grid management and energy management will be supplied by both the utilities and third parties.”
Zambrano emphasizes the importance of customer participation to the energy efficiency push. The issue he raises is the extent to which blended operational and customer data can benefit the larger ecosystem, by involving millions of residential and business customers. “Through the power of information and presentation, you can start to show customers different ways that they can become stewards of energy,” he says.
As a highly regulated business, the utility industry has many obstacles to overcome to get to the point where smart grids begin to reach their potential, but the vision is clear:
This new kind of data sharing could be a chance to stimulate an energy efficiency competition that’s never existed between homeowners and between business property owners. It is also an example of how broadening access to new analytics can help create a culture of inquiry throughout the extended enterprise.
This article has explored how enterprises are embracing the big data, tools, and science of new analytics along a path that can lead them to a broader culture of inquiry, in which improved visualization and user interfaces make it possible to spread ad hoc analytics capabilities to every user role. This culture of inquiry appears likely to become the age of the data scientists—workers who combine a creative ability to generate useful hypotheses with the savvy to simulate and model a business as it’s changing.
It’s logical that utilities are instrumenting their environments as a step toward smart grids. The data they’re generating can be overwhelming, but that data will also enable the analytics needed to reduce energy consumption to meet efficiency and environmental goals. It’s also logical that enterprises are starting to hunt for more effective ways to filter social media conversations, as apparel makers have found. The return on investment for finding a new market segment can be the difference between long-term viability and stagnation or worse.
Tackling the new kinds of data being generated is not the only analytics task ahead. Like the technology distributor, enterprises in all industries have concerns about scaling the analytics for data they’re accustomed to having and now have more of. Publishers can serve readers better and optimize ad sales revenue by tuning their engines for timing, pricing, and pinpointing ad campaigns. Telecom carriers can mine all customer data more effectively to be able to reduce the expense of churn and improve margins.
What all of these examples suggest is a greater need to immerse the extended workforce—employees, partners, and customers—in the data and analytical methods they need. Without a view into everyday customer behavior, there’s no leverage for employees to influence company direction when markets shift and there are no insights into improving customer satisfaction. Computing speed, storage, and scale make those insights possible, and it is up to management to take advantage of what is becoming a co-creative work environment in all industries—to create a culture of inquiry.
Of course, managing culture change is a much bigger challenge than simply rolling out more powerful analytics software. It is best to have several starting points and to continue to find ways to emphasize the value of analytics in new scenarios. One way to raise awareness about the power of new analytics comes from articulating the results in a visual form that everyone can understand. Another is to enable the broader workforce to work with the data themselves and to ask them to develop and share the results of their own analyses. Still another approach would be to designate, train, and compensate the more enthusiastic users in all units—finance, product groups, supply chain, human resources, and so forth—as data scientists. Table 1 presents examples of approaches to fostering a culture of inquiry.
Table 1: Key elements of a culture of inquiry
The arc of all the trends explored in this article is leading enterprises toward establishing these cultures of inquiry, in which decisions can be informed by an analytical precision comparable to scientific insight. New market opportunities, an energized workforce with a stake in helping to achieve a better understanding of customer needs, and reduced risk are just some of the benefits of a culture of inquiry. Enterprises that understand the trends described here and capitalize on them will be able to improve how they attract and retain customers.
1 See “Making sense of Big Data,” Technology Forecast 2010, Issue 3, http://www.pwc.com/us/en/technology-forecast/2010/issue3/index.jhtml, for more information on Hadoop and other NoSQL databases.
2 For more best practices on methods to address churn, see Curing customer churn, PwC white paper, http://www.pwc.com/us/en/increasing-it-effectiveness/publications/curing-customer-churn.jhtml, accessed April 5, 2012.