How CIOs can build the foundation for a data science culture

 
Helping to establish a new culture of inquiry can be a way for these executives to reclaim a leadership role in information.
By Bud Mathaisel and Galen Gruman


Introduction

The new analytics requires that CIOs and IT organizations find new ways to engage with their business partners. For all the strategic opportunities new analytics offers the enterprise, it also threatens the relevance of the CIO. The threat comes from the fact that the CIO’s business partners are being sold data analytics services and software outside normal IT procurement channels, which cuts out of the process the very experts who can add real value.

Perhaps the vendors’ user-centric view is based on the premise that only users in functional areas can understand which data and conclusions from its analysis are meaningful. Perhaps the CIO and IT have not demonstrated the value they can offer, or they have dwelled too much on controlling security or costs to the detriment of showing the value IT can add. Or perhaps only the user groups have the funding to explore new analytics.

Whatever the reasons, CIOs must rise above them and find ways to provide important capabilities for new analytics while enjoying the thrill of analytics discovery, if only vicariously. The IT organization can become the go-to group, and the CIO can become the true information leader. Although it is a challenge, the new analytics is also an opportunity because it is something within the CIO’s scope of responsibility more than nearly any other development in information technology.

The new analytics needs to be treated as a long-term collaboration between IT and business partners, similar to the relationship PwC has advocated [1] for the general consumerization-of-IT phenomenon brought on by mobility, social media, and cloud services. This tight collaboration can be a win for the business and for the CIO. The new analytics is a chance for the CIO to shine, reclaim the “I” leadership in CIO, and provide a solid footing for a new culture of inquiry.


The many ways for CIOs to be new analytics leaders

In businesses that provide information products or services—such as healthcare, finance, and some utilities—there is a clear added value from having the CIO directly contribute to the use of new analytics. Consider Edwards Lifesciences, where hemodynamic (blood circulation) modeling has benefited from the convergence of new data with new tools to which the CIO contributes. New digitally enabled medical devices, which are capable of generating a continuous flow of data, provide the opportunity to measure, analyze, establish pattern boundaries, and suggest diagnoses.

“In addition, a personal opportunity arises because I get to present our newest product, the EV1000, directly to our customers alongside our business team,” says Ashwin Rangan, CIO of Edwards Lifesciences. Rangan leverages his understanding of the underlying technologies, and, as CIO, he helps provision the necessary information infrastructure. As CIO, he also has credibility with customers when he talks to them about the information capabilities of Edwards’ products.

For CIOs whose businesses are not in information products or services, there’s still a reason to engage in the new analytics beyond the traditional areas of enablement and of governance, risk, and compliance (GRC). That reason is to establish long-term relationships with the business partners. In this partnership, the business users decide which analytics are meaningful, and the IT professionals consult with them on the methods involved, including provisioning the data and tools. These CIOs may be less visible outside the enterprise, but they have a crucial role to play internally to jointly explore opportunities for analytics that yield useful results.

E. & J. Gallo Winery takes this approach. Its senior management understood the need for detailed customer analytics. “IT has partnered successfully with Gallo’s marketing, sales, R&D, and distribution to leverage the capabilities of information from multiple sources. IT is not the focus of the analytics; the business is,” says Kent Kushar, Gallo’s CIO. “After working together with the business partners for years, Gallo’s IT recently reinvested heavily in updated infrastructure and began to coalesce unstructured data with the traditional structured consumer data.” (See “How the E. & J. Gallo Winery matches outbound shipments to retail customers” on page 11.)

Regardless of the CIO’s relationship with the business, many technical investments IT makes are the foundation for new analytics. A CIO can often leverage this traditional role to lead new analytics from behind the scenes. But doing even that—rather than leading from the front as an advocate for business-valuable analytics—demands new skills, new data architectures, and new tools from IT.

At Ingram Micro, a technology distributor, CIO Mario Leone views a well-integrated IT architecture as a critical service to business partners to support the company’s diverse and dynamic sales model and what Ingram Micro calls the “frontier” analysis of distribution logistics. “IT designs the modular and scalable backplane architecture to deliver real-time and relevant analytics,” he says. On one side of the backplane are multiple data sources, primarily delivered through partner interactions; on the flip side of the backplane are analytics tools and capabilities, including such new features as pattern recognition, optimization, and visualization. Taken together, the flow of multiple data streams from different points and advanced tools for business users can permit more sophisticated and iterative analyses that give greater insight into product mix offerings, changing customer buying patterns, and electronic channel delivery preferences. The backplane is the point where those data converge into a coherent repository. (See Figure 1.)
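The backplane idea can be illustrated with a minimal sketch. This is not Ingram Micro's implementation; the `Backplane` class, the source names, and the sample records are all hypothetical, chosen only to show data sources feeding one side of a shared repository while analytics tools plug into the other.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Record = Dict[str, object]


class Backplane:
    """Hypothetical convergence point: data sources on one side, tools on the other."""

    def __init__(self) -> None:
        self.repository: List[Record] = []  # the shared, coherent repository
        self.tools: Dict[str, Callable[[List[Record]], object]] = {}

    def ingest(self, source: str, records: List[Record]) -> None:
        # Tag each record with its origin so analyses can tell the streams apart.
        for record in records:
            self.repository.append({**record, "source": source})

    def register_tool(self, name: str, tool: Callable[[List[Record]], object]) -> None:
        self.tools[name] = tool

    def run(self, name: str) -> object:
        # Every tool sees the same converged data, whatever its source.
        return self.tools[name](self.repository)


def units_by_product(records: List[Record]) -> Dict[str, int]:
    """A toy 'product mix' analysis: total units per product across all sources."""
    totals: Dict[str, int] = defaultdict(int)
    for record in records:
        totals[str(record["product"])] += int(record["units"])
    return dict(totals)


backplane = Backplane()
backplane.ingest("partner_portal", [{"product": "router", "units": 40}])
backplane.ingest("web_orders", [{"product": "router", "units": 15},
                                {"product": "switch", "units": 7}])
backplane.register_tool("product_mix", units_by_product)
product_mix = backplane.run("product_mix")
print(product_mix)
```

The design point is that adding a new data source or a new tool touches only one side of the backplane; neither side needs to know the details of the other.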

Figure 1
A CIO’s situationally specific roles

Given these multiple ways for CIOs to engage in the new analytics—and the self-interest for doing so—the next issue is how to do it. After interviewing leading CIOs and other industry experts, PwC offers the following recommendations.


Enable the data scientist

One course of action is to strategically plan and provision the data and infrastructure for the new sources of data and new tools (discussed in the next section). However, the bigger challenge is to unlock the productive capability of the users. This challenge poses several questions:

  • How can CIOs do this without knowing in advance which users will harvest the capabilities?
  • Analytics capabilities have been pursued for a long time, but several hurdles have hindered the attainment of the goal (such as difficult-to-use tools, limited data, and too much dependence on IT professionals). CIOs must ask: which of these impediments are eased by the new capabilities and which remain?
  • As analytics moves more broadly through the organization, there may be too few people trained to analyze and present data-driven conclusions. Who will be fastest up the learning curve of what to analyze, of how to obtain and process data, and of how to discover useful insights?

What the enterprise needs is the data scientist—actually, several of them. A data scientist follows a scientific method of iterative and recursive analysis, with a practical result in mind. Examples are easy to identify: an outcome that improves revenue, profitability, operations or supply chain efficiency, R&D, financing, business strategy, the use of human capital, and so forth. There is no sure way of knowing in advance where or when this insight will arrive, so it cannot be tackled in assembly line fashion with predetermined outcomes.

The analytic approach involves trial and error and accepts that there will be dead ends, although a data scientist can even draw a useful conclusion—“this doesn’t work”—from a dead end. Even without formal training, some business users have the suitable skills, experience, and mind-set. Others need to be trained and encouraged to think like a scientist but behave like a—choose the function—financial analyst, marketer, sales analyst, operations quality analyst, or whatever. When it comes to repurposing parts of the workforce, it’s important to anticipate obstacles or frequent objections and consider ways to overcome them. (See Table 1.)

Table 1
Barriers to adoption of analytics and ways to address them


Josée Latendresse of Latendresse Groupe Conseil says one of her clients, an apparel manufacturer based in Quebec, has been hiring PhDs to serve in this function. “They were able to know the factors and get very, very fine analysis of the information,” she says.

Gallo has tasked statisticians in IT, R&D, sales, and supply chain to determine what information to analyze, the questions to ask, the hypotheses to test, and where to go after that, Kushar says.

The CIO has the opportunity to help identify the skills needed and then help train and support data scientists, who may not reside in IT. CIOs should work with the leaders of each business function to answer the questions: Where would information insights pay the highest dividends? Who are the likely candidates in their functions to be given access to these capabilities, as well as the training and support?

Many can gain or sharpen analytic skills. The CIO is in the best position to ensure that the skills are developed and honed.

The CIO must first provision the tools and data, but data analytics requires the CIO and IT team to assume more responsibility for the effectiveness of the resources than in the past. Kushar says Gallo has a team within IT dedicated to managing and spreading the use of business intelligence tools, training, and processes.

When major systems were deployed in the past, CIOs did their best to train users and support them, but CIOs only indirectly took responsibility for the users’ effectiveness. In data analytics, the responsibility is more direct: the investments are not worth making unless IT steps up to enhance the users’ performance. Training should be comprehensive and go beyond teaching the tools to helping users establish a hypothesis, iteratively discover and look for insights from results that don’t match the hypothesis, understand the limitations of the data, and share the results with others (crowdsourcing, for example) who may see things the user does not.
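As one illustration of what "establish a hypothesis and test it" can look like in practice, here is a minimal sketch of a permutation test. The article does not prescribe any particular method; this technique, the function name, and the sales figures are assumptions made for the example. The hypothesis is that a promotion raised weekly sales, and the test asks how often a gap this large would appear if the promotion labels were assigned at random.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Share of random relabelings whose mean gap is at least the observed gap."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        shuffled_a = pooled[:len(group_a)]
        shuffled_b = pooled[len(group_a):]
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            hits += 1
    return hits / trials


# Made-up weekly unit sales, for illustration only.
with_promotion = [120, 135, 128, 140, 132, 138]
without_promotion = [118, 122, 115, 125, 119, 121]

p_value = permutation_p_value(with_promotion, without_promotion)
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the gap is unlikely to be chance alone. A large one is the "this doesn't work" dead end, which is still a usable finding. Either way, the limits of the data (six weeks per group here) belong in the conclusion the user shares.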


Training should encompass multiple tools, since part of what enables discovery is the proper pairing of tool, person, and problem; these pairings vary from problem to problem and person to person. You want a toolset to handle a range of analytics, not a single tool that works only in limited domains and for specific modes of thinking.

The CIO could also establish and reinforce a culture of information inquiry by getting involved in data analysis trials. This involvement lends direct and moral support to some of the most important people in the organization. For CIOs, the bottom line is to care for the infrastructure but focus more on the actual use of information services. Advanced analytics is adding insight and power to those services.


Renew the IT infrastructure for the new analytics

As with all IT investments, CIOs are accountable for the payback from analytics. For decades, much time and money have been spent on data architectures; identification of “interesting” data; collecting, filtering, storing, archiving, securing, processing, and reporting data; training users; and the associated software and hardware in pursuit of the unique insights that would translate to improved marketing, increased sales, improved customer relationships, and more effective business operations.

Because most enterprises have been frustrated by the lack of clear payoffs from large investments in data analysis, they may be tempted to treat the new analytics as not really new. This would be a mistake. As with most developments in IT, there is something old, something new, something borrowed, and possibly something blue in the new analytics. Not everything is new, but that doesn’t justify treating the new analytics as more of the same. In fact, doing so indicates that your adoption of the new analytics is merely applying new tools and perhaps personnel to your existing activities. It’s not the tool per se that solves problems or finds insights—it’s the people who are able to explore openly and freely and to think outside the box, aided by various tools. So don’t just re-create or refurbish the existing box.

Even if the CIO is skeptical and believes analytics is in a major hype cycle, there is still reason to engage. At the very least, the new analytics extends IT’s prior initiatives; for example, the new analytics makes possible the kind of analytics your company has needed for decades to enhance business decisions, such as complex, real-time events management, or it makes possible new, disruptive business opportunities, such as the on-location promotion of sales to mobile shoppers.

Given limited resources, a portfolio approach is warranted. The portfolio should encompass many groups in the enterprise and the many functions they perform. It also should encompass the convergence of multiple data sources and multiple tools. If you follow Ingram Micro’s backplane approach, you get the data convergence side of the backplane from the combination of traditional information sources with new data sources. Traditional information sources include structured transaction data from enterprise resource planning (ERP) and customer relationship management (CRM) systems; new data sources include textual information from social media, clickstream transactions, web logs, radio frequency identification (RFID) sensors, and other forms of unstructured and/or disparate information.

The analytics tools side of the backplane arises from the broad availability of new tools and infrastructure, such as mobile devices; improved in-memory systems; better user interfaces for search; significantly improved visualization technologies; improved pattern recognition, optimization, and analytics software; and the use of the cloud for storing and processing. (See the article, “The art and science of new analytics technology,” on page 30.)

Understanding what remains the same and what is new is a key to profiting from the new analytics. Even for what remains the same, additional investments are required.


Develop the new analytics strategic plan

As always, the CIO should start with a strategic plan. Gallo’s Kushar refers to the analytics-specific plan as a strategic plan for the “enterprise information fabric,” a reference to all the crossover threads that form an identifiable pattern. An important component of this fabric is the identification of the uses and users that have the highest potential for payback. Places to look for such payback include areas where the company has struggled, where traditional or nontraditional competition is making inroads, and where the data has not been available or granular enough until now.

The strategic plan must include the data scientist talent required and the technologies in which investments need to be made, such as hardware and software, user tools, structured and unstructured data sources, reporting and visualization capabilities, and higher-capacity networks for moving larger volumes of data. The strategic planning process brings several benefits: it updates IT’s knowledge of emerging capabilities as well as traditional and new vendors, and it indirectly informs prospective vendors that the CIO and IT are not to be bypassed. Once the vendor channels are known to be open, the vendors will come.

Criteria for selecting tools may vary by organization, but the fundamentals are the same. Tools must efficiently handle larger volumes within acceptable response times, be friendly to users and IT support teams, be sound technically, meet security standards, and be affordable.

The new appliances and tools could each cost several million dollars, and millions more to support. The good news is that some of the tools and infrastructure can be rented through the cloud and tested until the concepts and super-users have demonstrated their potential. (See the interview with Mike Driscoll on page 20.) “All of this doesn’t have to be done in-house with expensive computing platforms,” says Edwards’ Rangan. “You can throw it in the cloud … without investing in tremendous capital-intensive equipment.”

With an approved strategy, CIOs can begin to update the IT internal capabilities. At a minimum, IT must first provision the new data, tools, and infrastructure, and then ensure the IT team is up to speed on the new tools and capabilities. Gallo’s IT organization, for example, recently reinvested heavily in new appliances; system architecture; extract, transform, and load (ETL) tools; and ways in which SQL calls were written, and then began to coalesce unstructured data with the traditional structured consumer data.
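To make "coalescing unstructured data with structured data" concrete, here is a minimal sketch. It does not reflect Gallo's systems; the product names, shipment figures, comments, and the `count_mentions` helper are all invented for illustration. It counts product mentions in free-text comments and joins those counts to a structured shipment table on the product name.

```python
import re
from collections import Counter

# Structured side: hypothetical cases shipped, keyed by product name.
shipments = {"merlot": 1200, "chardonnay": 950, "zinfandel": 400}

# Unstructured side: hypothetical free-text comments, e.g. from social media.
comments = [
    "Loved the Merlot at dinner last night!",
    "The chardonnay was fine, but the merlot was better.",
    "Can't find the Zinfandel in stores anymore.",
]


def count_mentions(texts, products):
    """Count how many comments mention each product (case-insensitive, whole word)."""
    counts = Counter()
    for text in texts:
        words = set(re.findall(r"[a-z]+", text.lower()))
        for product in products:
            if product in words:
                counts[product] += 1
    return counts


mentions = count_mentions(comments, shipments)

# Join the two views on product name into one combined record per product.
combined = [
    {"product": name, "cases_shipped": cases, "mentions": mentions[name]}
    for name, cases in shipments.items()
]
for row in combined:
    print(row)
```

Real text sources need far more cleaning than a whole-word match (misspellings, nicknames, sarcasm), which is part of why the ETL tooling and IT skills described above matter.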


Provision data, tools, and infrastructure

The talent, toolset, and infrastructure are prerequisites for data analytics. In the new analytics, CIOs and their business partners are changing or extending the following:

  • Data sources to include the traditional enterprise structured information in core systems such as ERP, CRM, manufacturing execution systems, and supply chain, plus newer sources such as syndicated data (point of sale, Nielsen, and so on) and unstructured data from social media and other sources—all without compromising the integrity of the production systems or their data and while managing data archives efficiently.
  • Appliances to include faster processing and better in-memory caching. In-memory caching improves cycle time significantly, enabling information insights to follow human thought patterns closer to their native speeds.
  • Software to include newer data management, analysis, reporting, and visualization tools—likely multiple tools, each tuned to a specific capability.
  • Data architectures and flexible metadata to accommodate multiple streams of multiple types of data stored in multiple databases. In this environment, a single database architecture is unworkable.
  • A cloud computing strategy that factors in the requirements of newly expanded analytics capability and how best to tap external as well as internal resources. Service-level expectations should be established for customers to ensure that these expanded sources of relevant data are always online and available in real time.
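The cycle-time benefit of in-memory caching mentioned in the list above can be sketched in a few lines. This is a toy, single-process illustration using Python's standard `functools.lru_cache`, not a model of an analytics appliance; the `read_from_disk` function and its return values are hypothetical stand-ins for a slow backing store.

```python
from functools import lru_cache

stats = {"disk_reads": 0}  # counts trips to the (simulated) slow backing store


def read_from_disk(customer_id):
    """Hypothetical slow lookup; a real one would query a database or file."""
    stats["disk_reads"] += 1
    return (customer_id, customer_id * 3)  # (id, made-up lifetime order count)


@lru_cache(maxsize=1024)
def customer_summary(customer_id):
    # The first call for an id goes to the backing store; repeats come from memory.
    return read_from_disk(customer_id)


first = customer_summary(42)
second = customer_summary(42)  # served from the in-memory cache
print(stats["disk_reads"])
```

When an analyst iterates on the same slice of data, repeated questions are answered from memory rather than from storage, which is what keeps the tool in step with the user's train of thought.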

The adoption of new analytics is an opportunity for IT to augment or update the business’s current capabilities. According to Kushar, Gallo IT’s latest investments are extensions of what Gallo wanted to do 25 years ago but could not due to limited availability of data and tools.

Of course, each change requires a new response from IT, and each raises the perpetual dilemma of how to be selective with investments (to conserve funds) while being as broad and heterogeneous as possible so a larger population can create analytic insights, which could come from almost anywhere.


Update IT capabilities: Leverage the cloud’s capacity

With a strategic plan in place and the tools provisioned, the next prerequisite is to ensure that the IT organization is ready to perform its new or extended job. One part of this preparation is the research on tools the team needs to undertake with vendors, consultancies, and researchers.

The CIO should consider some organizational investments to add to the core human resources in IT, because once the business users get traction, IT must be prepared to meet the increased demands for technical support. IT will need new skills and capabilities that include:

  • Broader access to all relevant types of data, including data from transaction systems and new sources
  • Broader use of nontraditional resources, such as big data analytics services
  • Possible creation of specialized databases and data warehouses
  • Competence in new tools and techniques, such as database appliances, column-oriented and row-oriented databases, compression techniques, and NoSQL frameworks
  • Support in the use of tools for reporting and visualization
  • Updated approaches for mobile access to data and analytic results
  • New rules and approaches to data security
  • Expanded help desk services

Without a parallel investment in IT skills, investments in tools and infrastructure could lie fallow, causing frustrated users to seek outside help. For example, without advanced compression and processing techniques, performance becomes a significant problem as databases grow larger and more varied. That’s an IT challenge that users would not anticipate, but it could result in a poor experience that leads them to third parties that have solved the issue (even if the users never knew what the issue was).
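One reason column-oriented storage and compression go together can be shown with a small sketch. Run-length encoding is used here only as a simple, well-known example of a compression technique; the article does not name a specific one, and the sample rows are invented. When rows are stored column by column, a low-cardinality column such as region becomes long runs of repeated values that collapse to a few (value, count) pairs.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# Made-up sales rows (region, product, units), already sorted by region.
rows = [
    ("east", "router", 12),
    ("east", "switch", 7),
    ("east", "router", 9),
    ("west", "router", 40),
    ("west", "switch", 15),
    ("west", "switch", 3),
]

# Column-oriented layout: one list per column instead of one tuple per row.
regions, products, units = (list(column) for column in zip(*rows))

encoded_regions = rle_encode(regions)
print(encoded_regions)
```

In a row-oriented layout the region values are interleaved with the other fields, so the same runs never form. Choices like this are invisible to business users but determine whether queries stay fast as the data grows.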

Most of the IT staff will welcome the opportunities to learn new tools and help support new capabilities, even if the first reaction might be to fret over any extra work. CIOs must lead this evolution by being a source for innovation and trends in analytics, encouraging adoption, having the courage to make the investments, demonstrating trust in IT teams and users, and ensuring that execution matches the strategy.


Conclusion

Data analytics is no longer an obscure science for specialists in the ivory tower. More and more analytics power is available to more people. Thanks to these new analytics, business users have been unchained from prior restrictions, and finding answers is easier, faster, and less costly. Developing insightful, actionable analytics is a necessary skill for every knowledge worker, researcher, consumer, teacher, and student. This need is driven by a world in which faster insight is treasured, and insight often needs to arrive in real time to be most effective. Real-time data that changes quickly creates demand for real-time analytic insights and leaves little tolerance for insights from last quarter, last month, last week, or even yesterday.

Enabling the productive use of information tools is not a new obligation for the CIO, but the new analytics extends that obligation—in some cases, hyperextends it. Fulfilling that obligation requires the CIO to partner with human resources, sales, and other functional groups to establish the analytics credentials for knowledge workers and to take responsibility for their success. The CIO becomes a teacher and role model for the increasing number of data engineers, both the formal and informal ones.

Certainly, IT must do its part to plan and provision the raw enabling capabilities and handle GRC, but more than ever, data analytics is the opportunity for the CIO to move out of the data center and into the front office. It is the chance for the CIO to demonstrate information leadership.


1 The consumerization of IT: The next-generation CIO, PwC white paper, November 2011, http://www.pwc.com/us/en/technology-innovation-center/consumerization-information-technology-transforming-cio-role.jhtml, accessed February 1, 2012.