Overcoming transformation obstacles with semantic wikis


In this interview, Michael Lang and Brooke Stevenson discuss linkages between data interoperability and enterprise architecture, and how Semantic Web standards are helping the US Department of Defense create a single view of the architectures in place.

Interview conducted by Alan Morrison and Bo Parker

Michael Lang is co-founder and chairman of Revelytix. Brooke Stevenson is president of Spry Enterprises.

PwC: How did Revelytix get started?

ML: About eight or nine years ago, I founded a company named MetaMatrix. It’s since been sold to Red Hat. MetaMatrix was an early attempt to provide a model-driven solution for data integration issues. In building MetaMatrix, we built a pretty sophisticated modeling framework based on the OMG [Object Management Group] MetaObject Facility specification.

Our largest customer base became the US government, totally by accident. Our backgrounds were in the financial services industry, and we built the product thinking that we’d solve problems for the financial services industry. But the NSA [US National Security Agency] became our first customer, and then DoD [US Department of Defense] bought a bunch of the software. We wound up selling quite a bit to financial services, but quite a bit more to the government.

With that product, I think we had a fair amount of success building models of domains to facilitate the integration of disparate data sources. We would build models of data sources and models of domains, wire them together, and let you query the domain model to get information from independent data sources. And in that exercise, well, in building that company and working with our customers, it became clear that the MetaObject Facility was not a rich enough language to model everything we wanted to model. So the NSA pointed us in the direction of OWL, the Web Ontology Language.
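The pattern Lang describes (model each source, model the domain, wire the two together, then query the domain model) can be sketched in a few lines of Python. This is only an illustration and not MetaMatrix's actual design; the source names, field names, and the `query_domain` helper are all hypothetical.

```python
# Two hypothetical, independent data sources that name the same things differently.
SOURCES = {
    "hr_db": [
        {"emp_id": 1, "surname": "Ng"},
        {"emp_id": 2, "surname": "Ortiz"},
    ],
    "payroll_file": [
        {"EmployeeNo": 1, "GrossPay": 5200},
        {"EmployeeNo": 2, "GrossPay": 6100},
    ],
}

# The "wiring": each source model maps domain attributes to its own field names.
MAPPINGS = {
    "hr_db": {"id": "emp_id", "last_name": "surname"},
    "payroll_file": {"id": "EmployeeNo", "salary": "GrossPay"},
}


def query_domain(attributes):
    """Answer a query phrased in domain terms by pulling from every mapped source."""
    merged = {}  # domain id -> record expressed in domain attributes
    for source_name, rows in SOURCES.items():
        mapping = MAPPINGS[source_name]
        for row in rows:
            record = merged.setdefault(row[mapping["id"]], {})
            for domain_attr, source_field in mapping.items():
                record[domain_attr] = row[source_field]
    # Return only the requested attributes, ordered by domain id.
    return [
        {attr: merged[key].get(attr) for attr in attributes}
        for key in sorted(merged)
    ]


if __name__ == "__main__":
    print(query_domain(["last_name", "salary"]))
```

The caller asks for `last_name` and `salary` without knowing which source holds which field; that knowledge lives only in the mappings.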

We founded Revelytix four years ago to build a collaborative ontology editor based on the W3C [World Wide Web Consortium] standards OWL and the Resource Description Framework [RDF], and it’s been hosted on the Web as a free tool for the last three years at Knoodl.com [http://knoodl.com]. We have applied semantic modeling to a bunch of different problems.


Brooke started using it for a project at the US Army about a year ago, and in January 2009, the Army funded a pilot project that combined OWL modeling techniques with business process modeling techniques using the Business Process Modeling Notation [BPMN] to facilitate a kind of enterprise architecture analysis that they were incapable of doing at the time. That project has since been funded for a production version. So we built them a pilot in January, and we are now building a full production version.

The Business Transformation Agency [BTA] funded this production version. The Business Transformation Agency is an agency within DoD that sits above the services [Army, Air Force, Marine Corps, and Navy] and reports to the OSD, the Office of the Secretary of Defense. The Business Transformation Agency’s mission is to transform the way DoD does IT.

PwC: Is this a way for the DoD to get all of its various systems on the same page, regardless of which agency or which branch of the service?

BS: Yes. The senior-level sponsor for our project right now is Dennis Wisnosky, who is the CTO and chief architect for DoD. He has a bit more of a business visionary focus, but he works for the deputy chief management officer in OSD. He wrote one of the only really thorough books on DoDAF [Department of Defense Architecture Framework], and he helped create an information model for managing the data that you would put into DoDAF, which has traditionally just been views of data.

To advance the strategy with DoDAF, to help get better systems interoperability, and to make sure we describe things using a more common methodology, Dennis is having his team publish some guidance documents for DoDAF. DoDAF traditionally has been the opposite of prescriptive. It’s been a very open framework that just generally guides what descriptions you should provide in the requirements definition and acquisition life cycle or in systems development.

But now that we need to deal with this cross-program interoperability problem, they’ve realized that they need to be more prescriptive to get some common modeling techniques and patterns used across all the systems.

PwC: That’s the same point at which TOGAF [The Open Group Architecture Framework] has found itself at release 9.0. It is getting more prescriptive as it gets more mature. It’s basically gone from telling you what you should produce to moving more in the direction of telling you how to produce it.

BS: That’s exactly where they are going with DoDAF. Our approach provides a strategy for putting into production a set of tools and technologies and an architecture team to help facilitate, and the governance processes to help oversee that transition.

PwC: So how are organizations like DoD approaching transformation nowadays? What modeling techniques are they using, and what problems are they confronting, including the semantics problems?

ML: We look at the fundamental problem that you’re describing as one of description. The reason that enterprises can’t achieve the goals that DoD is trying to achieve—and it’s basically the same one you are articulating—is that access and capabilities and other sorts of things are not described in a way that’s useful to achieve the mission. I’d say there are three legs to the mission: first is interoperability, and part of interoperability is discoverability; another is integration; and the third would be analysis.


What we have provided for DoD is a road map showing that if you describe things with enough precision and in a language that’s executable, you can achieve all three goals—interoperation, integration, and a different class of analysis—with one technique. The power of that technique derives from the ability to describe things with a different sort of precision than is used today. And, really, our entire solution is about how to describe things.

PwC: And that extends from the more granular EAI [enterprise application integration] kinds of concerns up to organizational design and capability integration?

ML: Yes. It’s the most granular description of the most arcane message format that your enterprise has up to the governance model that the board of directors insists be used to operate the company at the high level.

PwC: Implicit in what you are saying, then, is that there’s a missing aspect of corporate or government performance?

ML: Look around any large enterprise. Governments are not at all unique in this. You’ll find that they spend lots of time and money describing things. Those descriptions are in UDDI [Universal Description, Discovery, and Integration] repositories, they’re in specialized metadata repositories, they’re in data models, they’re in Word documents, and they’re in spreadsheets. Those descriptions are all over the place, and hardly any of them are useful at runtime.

Three years ago, we got started in DoD by putting forth a best practice that we call community-based vocabulary development, so that a domain (say, human resources, acquisition, or any domain within an enterprise) would have a community that could build a vocabulary that described their domain to any degree of abstraction or precision that it wished, using a single technique—the Web Ontology Language.

For the last three years, we went from fighting furious wars with the XML [Extensible Markup Language] schema people … [laughter] … I’m glad you got a kick out of that, because it wasn’t funny at the time. I would say that today we have a complete and total victory. All of the domain modeling at DoD now is being done in OWL.

PwC: So you have brought the whole DoD along, is that what you’re saying?

ML: I believe that’s the case, yes.

PwC: That’s quite an accomplishment.

ML: Now the next step is to make those artifacts useful. You have some of these domain categories available now, so how do you use them to drive analysis? Brooke figured out how to apply that technique in the domain of architecture, and it’s one of the places that DoD wants to apply this technique aggressively.

BS: The senior leadership in DoD—the three stars and the four stars—are driven to transform the way that they do business. Consider acquisitions, for example. Instead of thinking about the acquisition of these big monolithic systems, they plan to think about the acquisition of capabilities expressed as services, and the way that they build out those capabilities once they’re headed down the acquisition path.

The problem is the big bureaucracy underneath that senior leadership level. To make that transition happen, the first thing they need to figure out is how to adjust the way they define requirements and the way they run the acquisition process to realize those requirements. If they can transform that part of DoD, then the rest will follow naturally.

PwC: It seems like the acquisition aspect is the critical piece, then. Each agency and branch has its own habit of acquisition that it’s developed over the decades. You’re suggesting that the transformation would be to get acquisition to occur in a fashion that incorporates the learning that you’re imparting and that the highest levels of the DoD are on board with.

BS: That’s exactly true. But you can’t change everything about the way they do business, because there are huge organizations and massive policies in place. What you can change for the acquisition people, or the people who do portfolio management and analysis, is the way that they analyze what they’re going to invest in and how they’re going to meet requirements. And so that’s where the world of enterprise architecture comes in.

Enterprise architecture, if it’s functioning properly, should not just describe a system, but should describe all of the data that you need to do analysis. If you formally capture those descriptions, refocus the way you do enterprise architecture work, and collect that data that they are using for investment analysis and capability gap analysis, then you’re transforming the whole way that they establish requirements and do analysis to make the appropriate investments.

PwC: Does this net out to using OWL to discipline the description process so that the semantics—the wobbliness of the semantics that have allowed the same description to be interpreted multiple ways—is ratcheted down and people strive to a common understanding of what the services are supposed to do?

BS: That’s exactly right. The only addition I’ll make there is that OWL is the underlying description framework that we use for everything. To help solve the description requirements challenge in a way that is natural to the business analyst or the mission analyst community—the users—we also use business process modeling. We’ve started with the BPMN standard from the OMG, but we use OWL to capture all the data in those BPMN models as well so that we can relate it to all of the other information artifacts that are relevant.

That is another key standard for us, because getting collaborative consensus and a formal way of describing requirements brings a lot of different parties together. The Business Process Modeling Notation gives you a nice set of semantics to do that.


PwC: We are also trying to understand how enterprise transformation and enterprise architecture frameworks contribute to value creation. We’re looking at complex adaptive systems and evolution models to try to understand how that aspect of value also needs to be taken into account when you’re considering a transformation, because you may inadvertently destroy value if you are not aware of the emergent value creation aspects of your organization. Have you looked at that?

ML: Well, we thought about it. We coined the term “emergent analytics” about a year ago. I think it’s not exactly what you’re talking about, but it’s a concept that can be realized from using RDF and OWL. We haven’t actually put this into operation anywhere, so it’s still conceptual at this point.

Several years ago, the primary driver for us to move to RDF as an information model was extensibility. All of the information models presently used are extensible only if you’ll accept brittleness. If you want to extend them beyond some point, then you have to rebuild everything built on them around the new extended version of the model. RDF and OWL have a unique property. You can extend the information model arbitrarily—and I mean the word “arbitrarily”—without breaking anything.
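The additive behavior Lang is pointing to can be illustrated without any RDF tooling by treating the graph as a plain set of (subject, predicate, object) tuples. The sketch below uses made-up names and shows only one narrow thing: adding statements with predicates nobody planned for leaves an existing query's answer unchanged. It is not a demonstration of OWL itself, and the broader "without breaking anything" claim is Lang's, not something this toy proves.

```python
# A toy graph: a set of (subject, predicate, object) statements.
# All names here are illustrative assumptions.
graph = {
    ("payroll_service", "supports", "human_resources"),
    ("travel_service", "supports", "logistics"),
}


def services_supporting(domain, statements):
    """An 'existing application': it only knows about the 'supports' predicate."""
    return sorted(s for (s, p, o) in statements if p == "supports" and o == domain)


before = services_supporting("human_resources", graph)

# Extend the model with predicates and things nobody planned for.
graph |= {
    ("payroll_service", "hostedAt", "datacenter_east"),
    ("datacenter_east", "locatedIn", "Virginia"),
}

after = services_supporting("human_resources", graph)

# The old query still works and returns the same answer.
print(before == after)
```

Because the old query matches only the predicate it knows, the new statements are simply invisible to it; nothing has to be migrated or rebuilt.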

We hope to put the capability in place at DoD soon so that people who know nothing about each other, and practically nothing about the information model they are interacting with, can make assertions. These assertions can be facts or concepts or other sorts of things, but principally facts and concepts. Essentially, they will be able to just dump these assertions into this extensible graph of information. And if you do that on a large enough scale, information that you didn’t anticipate emerges from the graph.

If you are familiar with complexity theory and things like that, this is all part of what you would expect from this sort of an information model. Now, as I’ve said, we have not put this approach into play, but we’ve done enough work with these technologies to believe that there isn’t a reason in the world why it wouldn’t work. So it’s able to let any community of any arbitrary size make any assertions they want to make, and allow new information and new types of analysis to emerge from the graph.
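One way to picture information emerging from the graph: two communities that know nothing about each other each contribute statements, and a question that neither could answer alone becomes answerable once the statements share a graph. Lang notes that the approach had not yet been put into operation, so the following is purely a conceptual sketch with invented names and data.

```python
# Statements contributed independently by two hypothetical communities.
acquisition_community = {
    ("legacy_system_a", "scheduledToRetire", "2012"),
}
operations_community = {
    ("logistics_service", "dependsOn", "legacy_system_a"),
    ("training_service", "dependsOn", "system_b"),
}


def services_at_risk(statements):
    """Find services that depend on something scheduled to retire."""
    retiring = {s for (s, p, _) in statements if p == "scheduledToRetire"}
    return sorted(s for (s, p, o) in statements if p == "dependsOn" and o in retiring)


# Neither community's statements answer the question alone...
alone_a = services_at_risk(acquisition_community)
alone_b = services_at_risk(operations_community)

# ...but the merged graph does.
merged = services_at_risk(acquisition_community | operations_community)

print(alone_a, alone_b, merged)
```

Merging is just set union here, which is why neither community needs to coordinate with the other before contributing.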


PwC: There is a long history of requirements, analysis, and transferring requirements into code. What are the big differences here relative to more traditional approaches?

ML: The biggest difference is the number of people that participate in the descriptive activity. If any organization thinks it will be able to describe things to the degree of precision that we are talking to you about today, with small groups of engineers, that organization will not transform itself, ever.

To be transformative with this, you have to involve a very large number of people who know very specific things about parts of their domain. They might be the only people who know those things. The odds of an engineer getting that kind of description about something are zero. There is no possibility he is going to get it.

If you don’t involve very large communities in describing things, you can never transform the way you do business. DoD has latched onto this, and they know this is the case. The trick was that we convinced them to use OWL. After they came to that conclusion—OWL is the only technology available to achieve that goal—then it became easy.

PwC: I think we agree with you on that. OWL has distinct advantages that other techniques don’t seem to have, but there are, of course, lots of critics. I’m sure you’ve confronted a lot of them who would say that OWL is overly complicated and that the people in the organization who need to do the description are never going to get their arms around it.

ML: In my view, this is where there is an enormous disconnect between the Semantic Web crowd and us. We don’t have any requirement whatsoever in what we described to you for inference and reasoning. The thing that makes OWL complicated is the inferencing and reasoning requirement.

Basically, all we are using is RDF Schema. Don’t tell me that any person at any skill level in an organization can’t make a simple assertion, an assertion like “this is a part of that,” or one that says “this has this color” or “this has that function.” That’s all we are asking these people to say.

They are saying those things right now in Excel spreadsheets, they are saying them in Word documents, they are saying them in e-mail messages, and they are making those same statements in 19 different technologies. We give them a user interface that lets them make these simple assertions. But when I say assertion, I mean a simple sentence with a subject, predicate, and object. They make a simple assertion about something they know about.
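A subject-predicate-object assertion of the kind Lang describes is small enough to write down directly. The sketch below models one as a Python named tuple; the example statements and the `facts_about` helper are invented for illustration and do not reflect the Knoodl data model.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    """One simple sentence: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# "This is a part of that", "this has this color", "this has that function".
assertions = {
    Assertion("rotor_blade", "isPartOf", "helicopter"),
    Assertion("rotor_blade", "hasColor", "gray"),
    Assertion("radio", "hasFunction", "communication"),
}


def facts_about(subject, statements):
    """Collect everything that has been asserted about one subject."""
    return sorted((a.predicate, a.obj) for a in statements if a.subject == subject)


print(facts_about("rotor_blade", assertions))
```

Each contributor only needs to supply three short strings per statement, which is the level of effort Lang argues anyone in the organization can manage.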

PwC: And the wiki that you just showed us is their point of entry for that, correct?

ML: Yes.

BS: The other aspect of the picture we’re describing is that there are now a whole bunch of communities working on this ontology development. They have ontologists who are engineering the OWL file, and they have subject matter experts who are stating facts or making assertions that fit into the ontology. The two important tricks are governing the collaborations between those two kinds of people (engineers and subject matter experts) and defining your use case for that ontology up front.

In one example, we are using the ontology to do service discovery and portfolio management. So our ontology architecture is totally driven by those two things.

If you think for some reason up front that you have to do inferencing but you don’t really know what you are going to do, then that does make the ontology development a lot more complex. And most of the time that’s overkill.

PwC: So in essence you are saying start small, develop this initial capability, and just use the more ambitious inferencing when you really need it and don’t do the overkill.

BS: Right. That’s the great thing about the OWL and RDF standards being inherently extensible. You can add that all in later, but still do a lot of really valuable things early on with a simpler use case.