Modeling to automate data center operations

Photo: Kirill Sheynkman is the founder, president, and CEO of Elastra

Kirill Sheynkman of Elastra discusses how modeling IT environments in the data center will enable enterprises to migrate legacy workloads to the cloud.

Interview conducted by Vinod Baya and Bo Parker

Kirill Sheynkman is the founder, president, and CEO of Elastra. A database software executive, he spent the early part of his career in the database sales and technology organizations of Oracle and Microsoft. Sheynkman was co-founder, president, and CEO of two successful startups: Stanford Technology Group, which created the world’s first relational online analytical processing (OLAP) engine (acquired by Informix/IBM in 1995), and Plumtree Software, which created the world’s first corporate portal for heterogeneous information (acquired by BEA Systems in 2005). Between startups, Sheynkman was entrepreneur in residence at Sequoia Capital. He holds a degree in electrical engineering and computer science from Stanford University and an MBA from the Haas School of Business at the University of California, Berkeley.

In this interview, Sheynkman shares how semantic modeling techniques are emerging to bring automation to the data center and agility to IT operations.


PwC: Kirill, can you please tell us about your company, Elastra?

KS: Elastra is about two years old. In a sentence, we are focused on the next generation of data center automation and IT service management.

The way things typically work in IT departments is that application architects draw something on a whiteboard or in Visio or, even more sophisticated, in Rational Rose. They create a design, talk and comment about it, and then hand it off to another group that deploys that design and makes it real. That process tends to be lengthy and error prone. Iterations take a long time, and there is also a disconnect in the thinking, in the perspective on how applications work.

For example, application architects think about scalability: Do we want horizontal or vertical scalability for this application, given that we expect X demand? Hardware people or data center architects think about what kind of servers they need, the horsepower and flexibility of those servers, and what happens when one goes down. It’s a very different perspective. So, at Elastra, we said we could take something that used to be just designs and drawings and actually make those into real applications.

To approach the problem, we started with the idea of capturing the data center architecture and infrastructure in a model, a data model. We are using semantic technology to model the data center—not just settings, not just packages, not just computers and networks, but the life cycles, the processes, the dependencies, the procedures for backing up and restoring, and so on. The model also includes the policies that are in use, geographic information about where things are located, and so on.
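To make that concrete, here is a minimal sketch in Python, assuming a simple triple-style representation; the component names, predicates, and query helper are invented for illustration and are not Elastra's actual schema:

    # Hypothetical triple-style slice of a data center model: not just
    # machines and settings, but dependencies, procedures, and policies.
    model = [
        # (subject, predicate, object)
        ("app_db",     "type",         "oracle_database"),
        ("app_db",     "runs_on",      "server_42"),
        ("app_server", "depends_on",   "app_db"),
        ("app_db",     "backup_proc",  "nightly_rman_backup"),
        ("app_db",     "restore_proc", "rman_restore"),
        ("server_42",  "located_in",   "eu_west_datacenter"),
        ("app_db",     "policy",       "data_must_stay_in_eu"),
    ]

    def query(predicate, obj=None):
        """Return subjects that match a predicate (and, optionally, an object)."""
        return [s for s, p, o in model
                if p == predicate and (obj is None or o == obj)]

    # What depends on the database, and what has a backup procedure?
    print(query("depends_on", "app_db"))  # ['app_server']
    print(query("backup_proc"))           # ['app_db']

Because facts like these live in one queryable model rather than in scripts, an automation engine can reason over dependencies and policies before it acts.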

It’s an interesting approach, because there have been CMDBs [configuration management databases] and things like that for managing configurations. There are package management systems for managing packages and patches. But when it actually came down to making the thing run, it boiled down to scripts or run books or something like that. It was still code that needed to be versioned and run.

PwC: How does having a model-based approach help?

KS: If we could capture [model] everything, then one, we could automate the entire data center operation via a software server, and two, we could do interesting things. For example, if we know the life cycle of deploying a system, we can write an optimizer that says this is the optimal way of deploying this thing, without ever touching the script. We can version it, and we can modify these things on the fly. We can use software to define an allocation policy—how resources are deployed. And we can use software to map a design to an allocation policy depending on the situation.
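As a toy illustration of mapping one design to different allocation policies (the policy names and packing heuristics here are assumptions for the sketch, not Elastra's optimizer):

    # Hypothetical: the same design deployed under two allocation policies.
    # A design lists components with resource needs; a policy decides placement.
    design = [("web", 2), ("app", 4), ("db", 8)]   # (component, required cores)
    servers = {"s1": 16, "s2": 16}                 # server -> free cores

    def consolidate(design, servers):
        """Pack components onto as few servers as possible (cost-optimized)."""
        free, placement = dict(servers), {}
        for name, cores in sorted(design, key=lambda c: -c[1]):
            host = next(s for s, f in sorted(free.items()) if f >= cores)
            placement[name] = host
            free[host] -= cores
        return placement

    def spread(design, servers):
        """Place each component on the emptiest server (availability-optimized)."""
        free, placement = dict(servers), {}
        for name, cores in design:
            host = max(free, key=free.get)
            placement[name] = host
            free[host] -= cores
        return placement

    print(consolidate(design, servers))  # everything fits on s1
    print(spread(design, servers))       # components land on both s1 and s2

The design never changes; only the policy that maps it onto resources does, which is the point of keeping the two separate in the model.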

Here’s another way to look at our opportunity: HR has an automation system, the sales force has an automation system, and finance has always had one. Supply chain management has had an automation system. IT has not. There are bits and pieces, but there’s nothing that holistically looks at IT as a process that creates a product. You have contracts and licenses with suppliers, who provide components that need to be composed. Those components need to be put through the assembly line in the right order to come out with the result at the end.

PwC: How will the IT organization benefit from your products?

KS: Here’s an example. I can give you a design that’s a document on a USB [universal serial bus] key chain, and you can run it. It can be very, very complex, and you can run it in many different ways, in many different setups. So, say you’re a consulting company and you have a repository of designs and practices that you have implemented, such as how to deploy a scalable data warehouse that involves a particular kind of data. Launch five of them at different sizes and look at how they perform. Everything is created just in time.

PwC: How does cloud computing fit with what you are trying to do?

KS: We are focused on automated deployments of systems, using resource pools that are called clouds. There are a couple of characteristics of cloud that create the fit. One is the notion of disassociating the hardware resource pool from the [application] software requirements. That’s the first thing that cloud forces you to do. You no longer think in terms of, “I have these five machines, and we’re going to put application A on this machine and B on this machine.” I think of them as a pool of processing, a pool of storage, a pool of networking resources that I can use to do things. The other characteristic is that these resources are networked by design. You can’t have a cloud without a network, so the resources are connected and can be orchestrated in a variety of combinations.


“HR has an automation system, the sales force has an automation system, and finance has always had one. Supply chain management has had an automation system. IT has not.”


Big IT organizations look at companies like Google with admiration, not just because of Google’s market cap, but because Google has been able to build a very scalable data center, where a huge engineering organization doesn’t really care about hardware or specific machines. What do you mean, requisition a machine to do a proof of concept? I have a pool of compute, and I have a pool of talent, and I can actually create things on my own schedule.

So cloud is conducive to that kind of thinking. It’s abstraction of the machine. Abstraction of the machine is what cloud gave us, so we said, well, the next interesting thing for enterprise IT is the abstraction of the application. Abstraction of application deployment and design—from the enterprise architect who is concerned with policy and governance, down to the application architect who is concerned with what pieces are needed to meet the requirements that were given.

PwC: We are interested in the question of how IT enables business agility. How can IT be more responsive to business changes, and what opportunity do you see in this regard for yourselves?

KS: I’ll break it up into two pieces. The first piece is about being more responsive to changes. Anytime you can do things faster and go live faster, there’s value to it. We have an ROI [return on investment] calculator on our Web site that looks at costs in terms of time saved. You can tweak the parameters and say, “If I can save 15 percent in my development time versus my testing time on a project, what does this really translate to based on the cash flows that I anticipate from this project? What does this get me in terms of realizing that project sooner?”
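As a rough, back-of-the-envelope version of that calculation (the numbers are invented for illustration; this is not Elastra's actual calculator):

    # Hypothetical sketch of the time-saved ROI question: if development
    # time shrinks by 15 percent, how much sooner do project cash flows
    # start, and what is that acceleration worth?
    dev_months     = 6         # planned development time
    time_saved_pct = 0.15      # 15% development-time saving
    monthly_cash   = 200_000   # cash flow the project generates once live

    months_saved = dev_months * time_saved_pct   # 0.9 months sooner
    earlier_cash = months_saved * monthly_cash   # cash realized earlier

    print(f"Go-live {months_saved:.1f} months sooner, "
          f"worth roughly ${earlier_cash:,.0f} in earlier cash flow")
    # -> Go-live 0.9 months sooner, worth roughly $180,000 in earlier cash flow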

For the second part, our opportunity is that IT is seen as a bottleneck, but IT is a bottleneck sometimes for the right reason. The governance role that IT plays and the policies that IT puts in place are very significant. So the question was: How can we make the business agile and let it innovate, while at the same time making sure that IT keeps the house in order?

And that’s what governance is all about. It’s about delegating responsibility while maintaining control. With early clouds—like Amazon.com’s, for example, and others—you see engineers and maybe even business units whip out their credit cards and start running applications. The cool thing about it is, “Wow, we got it up in two days, and it’s running, and it’s our wiki, and we’re doing all this nice stuff.” That’s wonderful. You want that kind of reaction, because they’re doing it for a reason.

However, IT cringes. Why? Because if the information goes out where it shouldn’t go, who’s going to support and maintain it? If people start relying on it, how is it going to fit into the overall architecture? So everyone from the CIO down doesn’t like that. The alternative, we think, is not to ban the notion that you can get things done on demand. The idea is to merge policy and control with the flexibility of an on-demand offering, public or private, and a cloud format for your data center, along with an application design process that’s meshed with the two.

PwC: Can you provide an example of how you are working with customers now and solving these problems?

KS: I’ll give you an example of a large software company in the San Francisco Bay Area. They’re moving an entire data center from an old one to a new one that they built using the latest and greatest technologies. They need to move a significant chunk of the operations of one of their major business units to the new data center. We need to put it in place so that the design can evolve from the old data center to the new one and we can speed up the iterations.

How do you do that? How do you move thousands of machines and systems coherently? They’re building a model of how things interrelate, what the process of the movement is going to be, and they’re automatically deploying it in the new data center. What do we bring to the table for them? They can actually get it done in time using our solutions. The model is being used to drive the migration. Think of the model of the old data center as a design, as a drawing. Now run this drawing on the new data center and take advantage of the capabilities that data center has.


“We have a matchmaking algorithm that says, ‘These are the requirements for these software components, and this is what this cloud or this set of hardware resources provides.’”



PwC: Can you tell us how the model works? What are some of the key characteristics?

KS: We have a matchmaking algorithm that says, “These are the requirements for these software components, and this is what this cloud or this set of hardware resources provides.” So it’s requirements and capabilities.

For example, say I want a highly scaled or redundant database—one that really scales. Let’s say Oracle RAC [Real Application Clusters] fits the bill. Now, where do I run it? Getting into technical detail, Oracle RAC needs three NICs [network interface cards] to function. That’s what you have to have on the machine. Well, Amazon.com, for example, doesn’t let you do that yet. You cannot run it there, so you find a different solution to the same problem. The search path from the model lets you run it on a VMware virtualized data center, which lets you have three NICs; then you need to provision them and hook them up together. Say Amazon.com gets that capability tomorrow—well, then you can use either one.
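A minimal sketch of that requirements-versus-capabilities matching, in Python; the capability keys and the matching helper are assumptions for illustration, not Elastra's actual algorithm:

    # Hypothetical matchmaking: a component runs wherever every one of its
    # requirements is met by the target's advertised capabilities.
    REQUIREMENTS = {
        "oracle_rac": {"nics": 3, "shared_storage": True},
        "web_tier":   {"nics": 1, "shared_storage": False},
    }
    CAPABILITIES = {
        "amazon_ec2": {"nics": 1, "shared_storage": False},
        "vmware_dc":  {"nics": 4, "shared_storage": True},
    }

    def satisfies(cap, req):
        """True if the target meets or exceeds every constraint."""
        return (cap["nics"] >= req["nics"]
                and (cap["shared_storage"] or not req["shared_storage"]))

    def matching_clouds(component):
        req = REQUIREMENTS[component]
        return [cloud for cloud, cap in CAPABILITIES.items()
                if satisfies(cap, req)]

    print(matching_clouds("oracle_rac"))  # ['vmware_dc']; EC2 lacks three NICs
    print(matching_clouds("web_tier"))    # ['amazon_ec2', 'vmware_dc']

If Amazon added multi-NIC support, updating its capability entry would be enough for the database to match there too, with no change to the requirement side.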

PwC: What innovation did you have to do to develop the models needed? What are some of the key characteristics of the models?

KS: At the very beginning, we started with the notion of building an ontology for the whole system. It’s a very difficult process, because there are so many things to track and capture and think about. We went with an ontology-based approach because it lets different viewpoints and different sets of knowledge be readily incorporated. It would be very difficult if we said, “We’re going to have an XML [Extensible Markup Language] schema that describes the data center.” That would be extremely difficult, and it would break down at the first customer site.

The model needs to be dynamic. It needs to be distributed. It needs to have rule bases and to acknowledge that the way I see things is not the way you see things, and the way this customer deals with information is not the way another customer deals with information. So we started with the notion of accommodating such differences. As we have been building the product and talking to prospects and customers, we augment the model, and these changes are easy to make because of the technologies we’ve chosen.

We also went with the notion of a loose schema, which is what you need to do the reasoning and the processing. I have spent my entire life in the relational database world, so it was not an easy thing. But our chief architect and I said, “OK, this really does make sense for this particular problem.” And perhaps that’s one of the reasons why people haven’t been able to tackle it before.
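To illustrate the contrast (a toy sketch, not Elastra's data model; the predicates are invented):

    # Hypothetical contrast between a rigid schema and a loose one.
    # With fixed columns, each new customer viewpoint forces a schema change;
    # with loose triple-style facts, new predicates are just new data.

    # Rigid: every record must fit these columns.
    rigid_row = {"host": "server_42", "os": "linux", "ram_gb": 32}

    # Loose: facts accumulate per customer viewpoint without breaking others.
    facts = [
        ("server_42", "os", "linux"),
        ("server_42", "ram_gb", 32),
        ("server_42", "compliance_zone", "pci"),    # one customer's concern
        ("server_42", "rack_position", "r12-u07"),  # another customer's concern
    ]

    def describe(subject):
        """Collect everything known about a subject, whatever its shape."""
        return {p: o for s, p, o in facts if s == subject}

    print(describe("server_42"))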

I think a lot of innovation will take place in how these networked systems talk to each other, how they communicate, how they exchange their capabilities, and how they share a workload among themselves. We’re only scratching the surface. On automating the processes, right now we basically have three allocation mechanisms and a couple of rudimentary planning engines. But to actually use real inference rules and so on to automate the IT process further, that’s very interesting. As in automated manufacturing, your tolerances become better and your error rates become smaller. That’s where the innovation takes place. That’s what we would expect with IT processes.