Knowledge bases as Web page backbones

position paper to the W3AI workshop at the 5th WWW conference

INRIA Rhône-Alpes, 655 avenue de l'Europe, 38330 Montbonnot Saint-Martin (France) Jerome.Euzenat@inrialpes.fr

Knowledge bases and hypermedia

Knowledge bases and hypermedia have long been combined (see Gaines, 1990; Rechenmann, 1993). There are several reasons for combining them:

The knowledge in a knowledge base cannot be seen in isolation from the other knowledge sources available in firms or laboratories: bibliographic references, full-text papers, experimental data and programs. This relationship must be established in one way or another. For anyone who wants to explain, annotate or at least document a knowledge base, hypermedia is now the main medium;
The success of the World-Wide Web makes it possible to publish a knowledge base. Even users with little computer knowledge can thus access it at low cost: one server can serve many users on cheap workstations;
Another, deeper, reason is the conviction of several researchers that one major interest of a knowledge base is that it can be consulted as an encyclopaedia. This idea is related to the knowledge medium concept (Stefik, 1986). It has been further investigated by Gaines (1990) and Rechenmann (1993).

The Sherpa project, at INRIA Rhône-Alpes, Grenoble, designs tools and models for knowledge representation. This work blends together four kinds of knowledge through corresponding representation units: object-based representation; task models; behavioural equations; hypertext and lexicon.

Our earlier Shirka system was already able to link formal knowledge with pictures and texts through a proprietary hypermedia system. A knowledge base on the E. coli genome was built with it (Perrière&, 1990). We are currently developing a new system called Tropes (with its HTTP server counterpart HyTropes) and are involved in building a knowledge base on the fruit fly (D. melanogaster) genome.

This paper describes the advantages of the Web as a hypermedia management system coupled with a knowledge base and, above all, the advantages of generating Web pages from a knowledge base rather than writing them by hand or deriving them from documents.

The knowledge hypermedium

As already noted by various authors (see ISF, LINKS database, WebMap), an object-based representation system is itself a web of related objects. The similarity between Web pages and objects is thus quite obvious, and the mapping from one to the other is straightforward. The WWW mode of browsing is therefore a natural interface to object systems.

Knowledge bases can be used as Web servers whose skeleton is the structure of formal knowledge (mainly in the object-based formalism) and whose flesh consists of pieces of text and images tied to the objects. Turning a knowledge base system into a Web server is easily achieved by connecting it to a port and transforming each object reference into a URL. If the knowledge base is already documented by Web pages, these pages remain linked to, or are integrated into, the pages corresponding to the objects.
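The mapping from object references to URLs can be illustrated with a minimal sketch. It is not the HyTropes implementation: the toy knowledge base, the "/object/" URL scheme and the function names are all assumptions made for the example.

```python
from html import escape
from urllib.parse import quote

# Hypothetical toy knowledge base: object name -> slots.
# A slot value that names another object is treated as a reference.
KB = {
    "house-12": {"class": "house", "rooms": 3, "located-in": "grenoble"},
    "grenoble": {"class": "city", "country": "France"},
}

def object_url(name, base="/object/"):
    """Map an object reference to a URL path (assumed scheme)."""
    return base + quote(name, safe="")

def render_object(name, kb=KB):
    """Generate the HTML page of one object; references become links."""
    rows = []
    for slot, value in kb[name].items():
        if isinstance(value, str) and value in kb:
            # A link is emitted only when the target object exists,
            # so a generated page cannot contain a dangling link.
            cell = '<a href="%s">%s</a>' % (object_url(value), escape(value))
        else:
            cell = escape(str(value))
        rows.append("<li>%s: %s</li>" % (escape(slot), cell))
    return "<h1>%s</h1>\n<ul>\n%s\n</ul>" % (escape(name), "\n".join(rows))
```

Calling render_object("house-12") yields a page whose "located-in" slot links to the page of the "grenoble" object; hand-written documentation could be appended to the same page.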

The main advantage of this approach over the earlier proprietary hypertext systems is that the content of the knowledge base becomes available to a wide and untrained audience. In this respect HyTropes contributes to the knowledge medium idea promoted by Mark Stefik. A further advantage lies in the consistency of the base: there are no dangling links, since the skeleton is generated automatically and the formal information is supposed to be sound.

Intelligence-added Web servers

World-wide availability and safety are precious contributions, but they are quite limited compared with what knowledge bases make possible. From a knowledge base server it is indeed possible to build complex queries grounded on the formal knowledge (see figure). For instance, a user looking for an apartment in a real estate knowledge base can first select a filter form from the "house" concept, ask the lexicon for the meaning of the slot/word "F3" and decide to fill in the form with the corresponding criteria; the user can then select one of the objects given as answers and look at the ground plan and a picture of the house together with the usual precise information. This combines the advantages of a very structured server with the freedom of ordinary servers. Moreover, the answer is computed by a semantically grounded method rather than by a simple full-text search.
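A filtering query of this kind can be sketched as follows. This is only an illustration under stated assumptions: the objects, slot names, lexicon gloss and function names are hypothetical and do not reproduce the Tropes filtering mechanism.

```python
# Hypothetical lexicon and objects, for illustration only.
LEXICON = {"F3": "apartment with three main rooms"}  # illustrative gloss
HOUSES = [
    {"name": "h1", "type": "F3", "rent": 650, "city": "Grenoble"},
    {"name": "h2", "type": "F2", "rent": 480, "city": "Grenoble"},
    {"name": "h3", "type": "F3", "rent": 900, "city": "Lyon"},
]

def matches(obj, criteria):
    """True if every filled-in slot of the filter form is satisfied."""
    for slot, constraint in criteria.items():
        if slot not in obj:
            return False
        value = obj[slot]
        if isinstance(constraint, tuple):
            # A (low, high) pair is read as an inclusive interval.
            low, high = constraint
            if not (low <= value <= high):
                return False
        elif value != constraint:
            return False
    return True

def filter_objects(objects, criteria):
    """Return the names of the objects that satisfy the filter."""
    return [obj["name"] for obj in objects if matches(obj, criteria)]
```

For example, after looking up LEXICON["F3"], a user could submit filter_objects(HOUSES, {"type": "F3", "rent": (0, 700)}) and then follow the link to each object returned. The criteria are slot constraints on typed objects, which is what distinguishes such a query from a full-text search.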

HyTropes makes filtering queries available, and we are currently implementing the classification aspect (the mechanism is simpler, but it requires more work on displaying classification results on trees; see technicalities below). In fact, the whole Tropes API is intended to be made accessible through URLs. The next step will therefore consist in publishing the URL rules (far simpler than the URLs generated by the FORM tag) in order to enable any other application to:

properly query a knowledge base;
generate Web pages linked with any knowledge base.
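To give an idea of what such URL rules could look like, here is a minimal sketch. The actual HyTropes rules are not given here, so the path layout (base, operation, concept) and the function names below are purely hypothetical.

```python
from urllib.parse import parse_qsl, quote, unquote, urlencode, urlsplit

def build_query_url(base, operation, concept, criteria):
    """Encode a knowledge base request as a URL (assumed layout:
    /<base>/<operation>/<concept>?<slot>=<value>&...)."""
    path = "/".join(quote(part, safe="") for part in (base, operation, concept))
    return "/" + path + "?" + urlencode(sorted(criteria.items()))

def parse_query_url(url):
    """Decode such a URL back into the request it stands for."""
    parts = urlsplit(url)
    base, operation, concept = (
        unquote(part) for part in parts.path.strip("/").split("/")
    )
    return {
        "base": base,
        "operation": operation,
        "concept": concept,
        "criteria": dict(parse_qsl(parts.query)),
    }
```

With rules of this kind published, another application could generate a link to a query simply by calling build_query_url, and the server could dispatch on the decoded operation.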

The queries can be as complex as required: it will be possible not only to browse but also to build and modify knowledge bases. This raises problems of concurrent access and user support. Since 1995, we have therefore been working on the cooperative construction of knowledge bases that express the consensus among a community of geographically distributed people. Each researcher has a knowledge base from which knowledge can be isolated and submitted to the consensual knowledge base. The latter then contacts the other members of the group for acceptance, rejection or comments on the submitted piece of knowledge. This requires formal comparison and merging of contributions from several knowledge sources, and a robust protocol for the task. A complete protocol has been designed for these activities, and the research on knowledge base comparison is ongoing.
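The submission cycle can be pictured with a toy sketch. It is not the protocol mentioned above: the unanimity rule, the class name and the verdict labels are assumptions chosen only to make the accept/reject/comment cycle concrete.

```python
class Submission:
    """A piece of knowledge submitted to the consensual base (toy model)."""

    VERDICTS = ("accept", "reject", "comment")

    def __init__(self, author, statement, members):
        self.author = author
        self.statement = statement
        # Every group member except the author is asked for an opinion.
        self.pending = set(members) - {author}
        self.rejected_by = set()
        self.comments = []

    def reply(self, member, verdict, comment=None):
        """Record the answer of one member."""
        if member not in self.pending:
            raise ValueError("no answer expected from %r" % member)
        if verdict not in self.VERDICTS:
            raise ValueError("unknown verdict %r" % verdict)
        if comment:
            self.comments.append((member, comment))
        if verdict == "comment":
            return  # a comment alone leaves the member pending
        self.pending.discard(member)
        if verdict == "reject":
            self.rejected_by.add(member)

    def status(self):
        """Assumed rule: accepted only if nobody rejects and all answered."""
        if self.rejected_by:
            return "rejected"
        if self.pending:
            return "pending"
        return "accepted"
```

A submission thus stays "pending" until every other member has answered; a real protocol would also have to handle revision after comments and the formal comparison of the submitted knowledge with the consensual base.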

Conclusions

Some people have long thought that knowledge bases are interesting as formal repositories of knowledge rather than as problem solvers. Knowledge-based Web pages can be read as hypermedia documents and also interrogated for problem solving (Gaines and Shaw, 1992). The success of the World-Wide Web is an opportunity to test this idea on a large scale. The Web is an ideal medium for the diffusion of knowledge, but we claim that the formal representation of knowledge is also a very important issue for the Web. Indeed, the answers currently returned by the various Web worms are so numerous and so often irrelevant that a formal organisation of knowledge will soon be unavoidable. This should help the user and the server to share some content (as required by others).

Technicalities

The technical aspects of the available Tropes server are presented in the HyTropes home page together with a simple demonstration of the program and the sources.

What classical knowledge base browsers lack is a graph display of hierarchical data (e.g. class trees or graphs). An extension of Thomas Koch's TreeTool, written in Java, is used to display our own class hierarchies. More development on the colour and behaviour of the nodes is required.
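As a plain-text stand-in for such a graphical display, a class tree can be rendered by indentation. The sketch below is unrelated to TreeTool; the hierarchy and the function name are hypothetical.

```python
def render_tree(children, root, depth=0):
    """Return the lines of an indented view of a class tree.

    `children` maps each class name to the list of its direct subclasses.
    """
    lines = ["  " * depth + root]
    for child in sorted(children.get(root, [])):
        lines.extend(render_tree(children, child, depth + 1))
    return lines

# Hypothetical class hierarchy, for illustration only.
HIERARCHY = {"building": ["office", "house"], "house": ["F3"]}
```

Printing "\n".join(render_tree(HIERARCHY, "building")) lists each class under its superclass; a classification result could be shown in the same view by marking the classes an object may belong to.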