Challenges
By Ericka Chickowski | Posted 2008-08-05
The future of data management, integration and search could lie in semantic web technology. Baseline is arming readers with information on semantics technology by examining the niche, the opportunities and challenges it may present to business leaders, IT management and end users in the next few years.
Many experts, however, do believe that it will take a little longer than overnight for semantics to make a real difference within most mainstream organizations.
“If I were to put a number of years (it depends on the general economic situation and a number of factors), I would definitely say it’s less than five years, probably less than three years before we see mainstream adoption,” Polikoff says. Gilbane Group’s Moulton is less enthusiastic about semantic technology’s near-term prospects. “We’re talking a decade or more for it to really work well. It’s like voice recognition. It’s just kind of creeping along and creeping along and it’s getting a lot better, but it’s still not everywhere. It doesn’t always work really well,” she says. “It’s the interface that’s the real issue. It’s not the technology. These are design problems more than technology problems.”
Many other obstacles must be conquered for semantic technology to really be picked up by the average enterprise. Foremost is the issue of developing ontologies.
“Ontologies really have to get built up. They get built up in two ways. One is through humans creating them for these sophisticated applications, and there are government agencies and professionals who do this,” Moulton says. “And the other way is through machines culling context and by learning how language is being used.”
A number of ontology languages and standards have already been created to help organizations and tools developers build up ontologies in a uniform manner. But some observers are critical of the current semantic ecosystem and language structure. Giannandrea of Metaweb believes these languages are unnecessarily complex. “It’s great that the schema can be malleable and contributed to,” he says. “But it’s not OK if, in order to do that, you’ve got to know ontology languages. There’s nothing fundamentally wrong with these ontology languages, and I understand the value. It’s just that you’re being asked to buy more than you need in order to get the benefits. You’ve got these markup languages like RDF or N3 or OWL, and you’ve basically got to buy into a whole tool chain before you can get these up.”
Metaweb is championing better integration with existing APIs and markup languages, and the company is using a wiki-style base of volunteers to build up a web of connections between public information in order to bypass the complication of these ontology languages. Limitations in most organizations’ database infrastructure create the biggest roadblock to the use of semantics within the enterprise, Giannandrea says.
“While there are some business standards being created for invoicing, and this and that transaction, the majority of the semantic meaning the companies are capturing within their databases is locked up in the database,” Giannandrea says. “If I have a personnel database, and it has people’s names, date of birth and managerial position, the schema for that and the meaning of those terms for a field value is basically unique to that database.”
A whole industry has been created to aid database cross-referencing and schema matching because that’s what it takes anytime an organization would like to connect two databases together.
“We talk to many CIOs who say this is complete madness. We have these tools that let the data flow back and forth, but we don’t know that the field might have the same meaning,” Giannandrea says. To create numerous connections between the data, there is a need for what’s called a “triple predicate-based” database.
“You’d have Arnold Schwarzenegger in the system and Maria Shriver in the system, and you’re going to represent that they’re married by adding ‘Arnold is married to Maria. Maria is married to Arnold.’ These sorts of triple predicate-based systems have existed since the 1960s, and there’s general agreement that if you’re going to represent structured but open-ended human knowledge, you’ve got to use them,” he says.
“The problem is that most relational databases are not appropriate for storing and operating on these scales. Column database stores, which are used for data warehousing, are also not appropriate for queries on these relational stores. So, the problem is, you need a new kind of database. And that’s a little bit of a problem if you’re an enterprise and have all of your data in a relational data store. People are beginning to recognize that. There are a lot of database researchers, and well-known people in the field are beginning to write about this.”
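For readers who have not seen one, the triple-based representation Giannandrea describes can be pictured as a collection of subject-predicate-object statements. The Python sketch below is only an illustration of that idea, not Metaweb's design or any product's API: the `TripleStore` class, its `add` and `match` methods, and the predicate names `married_to` and `occupation` are all invented for this example, and a real system would need indexing and persistence that this toy version ignores.

```python
class TripleStore:
    """Toy in-memory store of (subject, predicate, object) facts.

    Hypothetical illustration only; not modeled on any real product.
    """

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        # Each fact is one triple. No table schema has to be declared first.
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()

# The marriage from the quote, stated once in each direction.
store.add("Arnold Schwarzenegger", "married_to", "Maria Shriver")
store.add("Maria Shriver", "married_to", "Arnold Schwarzenegger")

# A new kind of fact is just another triple with a new predicate name.
store.add("Maria Shriver", "occupation", "journalist")

# Who is Arnold married to?
for _, _, spouse in store.match("Arnold Schwarzenegger", "married_to"):
    print(spouse)

# Everything the store knows about Maria.
for triple in store.match(subject="Maria Shriver"):
    print(triple)
```

The point of the sketch is the open-endedness: adding the `occupation` fact required no change to a table definition, which is the property a fixed relational schema lacks. It says nothing about how such a store would perform at enterprise scale, which is the harder problem raised in the quote.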
The vision of semantics is great; it just needs to be simplified in its execution, Giannandrea says. “You have to make this stuff acceptable,” he says. “We think the underlying idea is fantastic. Our computers need to be able to understand the concepts so that they can then do more for you. That’s a great vision. It’s just that the current tool chain is a little too academic and based too much in the realm of artificial intelligence.”