\title{Hyper-G: Information---the Next Generation}
\author{Klaus Schmaranz}
%\inst (Graz University of Technology, Austria\\
%kschmar@iicm.tu-graz.ac.at)}

\begin{Article}

\begin{abstract}
The first part of the paper deals with the problems and shortcomings of electronic publishing today that have so far prevented a real breakthrough. The following sections discuss the use of Hyper-G, the first second-generation Web system, for distributing electronic publications on the Web as well as on \acro{CD}-\acro{ROM}. It is shown that the object-oriented database architecture of the Hyper-G server overcomes most of the problems of today's solutions. The last section contains a short description of journals and books that already use this new technology. Please note that throughout the paper the name Hyper-G is used to describe the technology of the system. The server itself is available as a product under the name HyperWave.
\end{abstract}

\section{Introduction}

When speaking about portable documents, one quickly realizes that portability has several facets. Primarily, portability means that documents can be transported on the Web. This implies that the chosen document format has to be compatible with a range of hardware and operating system platforms. To this extent, choosing the right format ensures that interesting information is usable by the majority of Web users.

Although a well-chosen document format makes sure that documents are usable for readers, this is only the first step towards really portable information. Documents on the Web only make sense if interested readers can find the information they desire. Thus easy location of documents has to be considered part of portability too. Besides easy retrieval, transmission speed is a sensitive point.
Readers who have to wait annoyingly long for documents or have to deal with broken connections are very likely to lose interest in electronic publications altogether.

Considering that at the moment an estimated 300 million users from different cultures and speaking different languages have access to the Internet, multilinguality of documents becomes more and more important. Although today's methods cannot translate documents into other languages on the fly, a lot can still be done. Papers can be published in more than one language, and mechanisms can be implemented that let the user look up unknown terms. Dictionaries as well as glossaries can be made available for specialized areas.

Electronic publishing naturally does not only mean publishing documents on the Web. There are also many potential readers of electronic publications who still have no Internet access. To keep costs low, the system chosen for distribution must be able to support both Web and \acro{CD}-\acro{ROM}. Otherwise the amount of work doubles because the two versions have to be prepared separately.

A look at electronic documents on the Web shows that at the moment most of the Web sites serving electronic publications are run by universities, and only a few are operated by publishing companies, on an evaluation basis and free of charge. However, in the long term electronic publishing will surely be driven by publishing companies, which means that charging mechanisms as well as user and group access management are highly important topics.

In the following sections you will find a detailed discussion of the features of Hyper-G, the first `second-generation' Web server [Maurer 96], that make electronic publishing easier and more effective than ever before. This includes navigation issues that help readers find their way through the electronic jungle.
The standpoint of publishing companies concerning billing and access rights is also considered, to make sure that the system fits their needs. Additionally, the turnaround time from submission to appearance of a paper and the cost effectiveness of this process will be discussed.

\section{Using Hyper-G for Electronic Publishing}

Before discussing new publishing paradigms let us have a closer look at Hyper-G. Using Hyper-G for electronic publishing solves many of the problems mentioned above and opens the way for completely new electronic publishing paradigms. Hyper-G automatically supports hybrid Web and \acro{CD}-\acro{ROM} publication without additional effort, as has been successfully demonstrated over the last two years with \acro{J}.\acro{UCS}, the Journal of Universal Computer Science published by Springer-Verlag [Maurer 94].

The kernel of the Hyper-G server is an object-oriented distributed network database with a separate link database. Information structure as well as document meta information are a basic part of the concept [Kappe 91]. This makes it possible to present the user with a seamless, world-wide structured information space across server boundaries. The document structure lets readers locate interesting papers easily, since all the information can be structured by topic, by journal or by any other criteria that are important.

Structure in Hyper-G is achieved by the use of collections, which can themselves hold collections or documents. This concept makes it possible to build structured information trees in which the contents of a collection need not even reside on the same server as the collection itself. Collections or documents can be members of arbitrarily many collections without being physically copied. The result is that different structures can be applied to the same dataset and users can choose the most logical structure for their purposes.
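
The collection concept described above can be illustrated with a small sketch. It is not Hyper-G code: the class names, the method names and the example titles are assumptions made purely for illustration. The point it shows is that one document can be a member of several collections by reference, so that two different structures share the same dataset without copies.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A leaf object; the title stands in for real meta information."""
    title: str


@dataclass
class Collection:
    """A container holding references to documents or other collections."""
    title: str
    members: list = field(default_factory=list)

    def add(self, member):
        # Store a reference only; the member is never copied.
        self.members.append(member)

    def documents(self):
        """Return every document reachable from here, each one only once."""
        seen, found, stack = set(), [], [self]
        while stack:
            node = stack.pop()
            if id(node) in seen:
                continue  # already reached through another collection
            seen.add(id(node))
            if isinstance(node, Collection):
                stack.extend(reversed(node.members))
            else:
                found.append(node)
        return found


# Hypothetical example: one paper filed by topic and by journal volume.
paper = Document("On Second Generation Hypermedia")
by_topic = Collection("Hypermedia")
by_journal = Collection("Journal, Volume 1")
by_topic.add(paper)
by_journal.add(paper)
root = Collection("Root", [by_topic, by_journal])

print([d.title for d in root.documents()])
```

Because membership is a reference, a reader can reach the paper through either collection, and a traversal from the root still reports it only once.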
Document meta information such as author, title, keywords, creation date, modification date, expiry date and many more helps readers get as much information as possible. Naturally document meta information is searchable, and full text searches may be performed as well. The scope of a search is user definable and can range from one small part of one server to the whole content of all servers worldwide in one single operation. Even when searching multiple servers it is not necessary to know the server addresses. More than that: meta information can be applied not only to documents but also to hyperlinks. This means that links can have types, such as annotation links, inline links, version links for documents where multiple versions exist, and many more.

Hyper-G servers do not provide read access only; write access is also possible. Read and write access to documents is controlled on the basis of user and group access rights, and billing is integrated in the server.

All links in Hyper-G servers are stable [Andrews 95], which means that dangling links (links pointing to nowhere) are impossible. Whenever a document is moved from one location to another, even across server boundaries, all the links pointing to that document automatically point to the new location. This kind of stability is achieved using \acro{URN}s (Uniform Resource Names) instead of \acro{URL}s [Berners-Lee et al.\ 94]. \acro{URN}s can naturally be mapped to \acro{URL}s when accessing a Hyper-G server with a standard WWW or Gopher client. If a document is deleted, all the links pointing to it remain open and are hidden. Whenever the document reappears the links are closed again.

Not only are all links stable, they are also bidirectional. When accessing a document on a Hyper-G server, readers see not only the outgoing links of the document but also all links pointing to the document from the outside.
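
The behaviour of stable, typed and bidirectional links can be sketched with a toy link store that is kept separate from the documents. This is only an illustration under stated assumptions, not the Hyper-G implementation: the class and method names, the name strings and the location strings are all hypothetical. Links refer to documents by name, and a separate table maps each name to its current location.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str              # name of the document the link starts in
    target: str              # name of the document the link points to
    kind: str = "reference"  # link type, e.g. "annotation" or "version"


class LinkDatabase:
    """Toy link store; links are held outside the documents."""

    def __init__(self):
        self.location = {}                  # name -> current location
        self._outgoing = defaultdict(list)  # name -> links leaving it
        self._incoming = defaultdict(list)  # name -> links arriving at it

    def place(self, name, location):
        # Insert or move a document; the stored links are not touched.
        self.location[name] = location

    def remove(self, name):
        # Delete a document; links to it are kept but no longer resolve.
        self.location.pop(name, None)

    def add_link(self, link):
        self._outgoing[link.source].append(link)
        self._incoming[link.target].append(link)

    def outgoing(self, name):
        return list(self._outgoing.get(name, []))

    def incoming(self, name):
        return list(self._incoming.get(name, []))

    def resolve(self, link):
        """Current location of the target, or None while it is deleted."""
        return self.location.get(link.target)


# Hypothetical names and locations, for illustration only.
db = LinkDatabase()
db.place("name:paper", "server-a/papers/1")
db.place("name:note", "server-a/notes/7")
note = Link("name:note", "name:paper", kind="annotation")
db.add_link(note)

db.place("name:paper", "server-b/archive/1")  # move across servers
print(db.resolve(note))                       # new location, link unchanged
print(db.incoming("name:paper"))              # the link seen from its target
```

Moving a document only updates the location table, so no link has to be rewritten, and the incoming index is what lets a reader see the links that point to a document.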
Links can be followed in the reverse direction, which would not be possible using first-generation Web servers. This very special feature is achieved by the separate link database of Hyper-G servers. If links were embedded in the documents themselves as is the case with first generation Web servers this would not be possible. The separate link database has another advantage: it makes every document hyperlinkable even if the document format does not allow links [Maurer 96]. Amongst other structuring elements Hyper-G supports the concept of clusters. A cluster contains several documents that are related to each other and therefore should be viewed together. As an example out of chemistry 3\acro{D} molecular models could be clustered together with an explanatory text. In this case the user would get the 3\acro{D} model in one window together with the explanatory text in another window. Clusters are also used to serve multilingual documents. Documents in different languages are clustered together and the user then gets the document matching his language preferences. In first-generation Web systems the only possibility to have multilingual documents is to let the user choose the language on the entry page and then follow different paths through the server for different languages. This approach causes a lot of work for server operators and the readers have no chance to change the language while reading. With Hyper-G only one path through the server has to be maintained and the readers can switch between multiple languages on the fly [Andrews 94]. Versioning of documents is also supported using clusters. Different versions of a document are clustered together and the reader can switch back and forth between different versions on the fly. To support the server operator and keep maintenance of multiple document versions easy a special parser is available. 
When a document is updated, this parser automatically tries to find the positions of all hyperlinks of the old version in the new one.

Other special features of Hyper-G are glossaries and automated glossary hyperlink creation. A glossary in Hyper-G is defined as an arbitrary collection of explanatory documents that are classified by their titles and keywords. Hyperlinks to glossary items in a document are then created automatically by a special parser that searches for the glossary items in the text of the document. Automated creation of glossary links is at the moment supported for \acro{HTML} and \acro{HTF} documents; \acro{PDF} and PostScript support are under development.

To make creation of referential hyperlinks easier a {\it Vocative Hyperlink Creation Language} (\acro{VHCL}) has been implemented. This language supports the description of document context and of potential hyperlinks in that context. As an example, typical phrases like ``see also page \emph{nn}'' would be recognized by the program and a link to page \emph{nn} would be created automatically. Journals normally have well-defined citation rules, making it easy to write a \acro{VHCL} program that recognizes citations and automatically creates inter-document as well as intra-document hyperlinks. As is the case with glossary links, this feature is at the moment implemented for \acro{HTML} and \acro{HTF}; \acro{PDF} as well as PostScript support are under development.

\section{Providing Quick Access}

As has been mentioned earlier, it is extremely important to provide quick access to information; otherwise the information would be worthless. Since the Internet is neither very reliable nor fast for long-distance data transfer, it is necessary to mirror documents to several servers world-wide. Readers are then able to choose the server that is closest to them in the network sense.
Two mechanisms are implemented in Hyper-G that make long-distance transmissions effective. First, a cache is implemented that works as all the well-known proxy servers do. Although caching can help a lot it is surely not enough, because the lifetime of documents in the cache can be rather short depending on the traffic. For this reason a second mechanism called replication is implemented. Replication means that documents from one server can be mirrored to other Hyper-G servers and the replicated documents know about the original.
%The benefit of replication is that users are no longer
%forced to know about mirrored documents but get the local document
%automatically instead of the remote one. As an example a user could be
%connected to a Hyper-G server in the USA and finds an interesting
%document in Austria. If a replica of the document exists on the server
%in the States the user automatically gets the replica instead of
%downloading the original from Graz. It is not necessary that the user
%knows about the replica, everything is done automatically.
This functionality is one of the benefits of using \acro{URN}s instead of \acro{URL}s; it would be impossible to implement it for first-generation systems using \acro{URL}s.

Besides caching and replication, readers can actively make use of write access to their personal home collection. Instead of defining bookmarks on their local computer they can insert references into their home collection on their Hyper-G server. The benefit is that their ``bookmarks'' are then accessible from wherever they connect to the server,
%It is not necessary to sit on their desktop machine to access them
which is especially ideal for people who are travelling a lot.

\section{New Publishing Paradigms}

Having a closer look at the way electronic publishing is done today one will mostly find \acro{HTML} or \acro{PDF} documents that are very similar to their paper based counterparts.
Often a search engine is provided to make it easier to locate interesting papers, but all other benefits of publishing electronically are mostly neglected. For the reader of electronic publications hardly any value is added compared to paper based articles. Worse still, with \acro{HTML} documents the possibility of making high quality printouts for archival purposes is lost. This is surely not enough to make electronic publishing on the Web a success.

The preceding sections discussed the special features of Hyper-G. Utilizing them allows completely new electronic publishing paradigms that are no longer driven by technical demands and shortcomings of certain document formats like \acro{HTML}. Instead authors can concentrate on the content rather than the document format and choose a format convenient for them without losing important hypernavigation features.

As an example, a paper about new chemical structures could consist of 3\acro{D} models of molecules that are clickable. The hyperlinks could then lead to spectrum images, which in turn are linked to additional text based explanations in, for example, \acro{PDF} [Adobe 93]. A video of an experiment, naturally again with hyperlinks to explanations, completes the presentation. All the documents in this example carry meta information like keywords and can therefore easily be located in a search.

Acceptance of electronic publications is highly dependent on their quality. For electronic publishing, quality does not only mean high quality contents, which can be assured by an appropriate refereeing process. Stability of electronic publications is at least as important. Technically it is easy to change electronic papers after publication, but this is unacceptable. Instead Hyper-G's annotation and versioning mechanisms can be used to alert the reader to new results or errata. In this case the paper is not changed at all; additional information is only attached to it.
Therefore all citations of the paper that existed for the original version are still valid and the reader can choose to browse annotations and newer versions of the paper on demand. Annotations in Hyper-G are hyperlinks pointing to the document that is annotated. Since Hyper-G's links are bidirectional the reader simply follows an annotation link backwards to read the annotation. The use of \acro{URN}s in a link database instead of \acro{URL}s embedded in documents guarantees that the annotation links are stable. This means that an annotated document can be moved around in the server or even from one server to another without generating annotations that point to nowhere. All links that pointed to the document before are then pointing to the document at its new location. Being able to examine the neighbourhood of a paper makes it possible to find other interesting papers on the same topic that very likely are difficult if not impossible to locate if only unidirectional links were possible as is the case with first-generation Web servers. \section{Turnaround Time and Cost Effectiveness} One of the most time consuming processes of electronic publishing is refereeing. Up to now refereeing normally means that a paper copy of the submitted paper is sent to the referees who send a corrected paper copy back. The annotated version is then sent to the author. If there are misunderstandings between referee and author the document is usually sent back and forth several times. A much faster turnaround time can be achieved if refereeing, corrections and clarifying misunderstandings can be parallelized. The logical way is to do refereeing electronically. Using Hyper-G's electronic annotations this task can be performed easily: papers are inserted into the Hyper-G server with read access only for the referees and if the referees agree also for the author. The referees then comment on the papers using the annotation mechanism. 
If desired, annotations can also be made readable for the author, so the author is able to react immediately to the referees' comments. Moreover, the author could also annotate the referees' comments to clarify misunderstandings. Naturally the author as well as the referees remain anonymous [Maurer 95].

This kind of refereeing shortens the time needed for the whole process significantly because authors are able to make corrections to their papers and clear up misunderstandings while refereeing is still in progress. There is no longer a need to send papers back and forth between referees and authors. Naturally it would be too optimistic to think that refereeing can usually be done within days instead of weeks. Nothing can be done about referees who are too busy, but most of the time it helps a lot to do everything in parallel instead of sequentially. An additional benefit of this kind of refereeing is that the whole process from submission through refereeing to publication is automatically documented and can be stored for archival purposes.

In general electronic publications are considered to be cheaper than paper based publications. This is true if the electronic and the paper based version of a paper have the same contents and the electronic version only provides full text search as added value. It is not true if one wants to utilize all the additional features that lie in the electronic nature of the medium. In this case the final step of inserting the paper into the server is not only the insertion itself. Hyperlinks and structure also have to be prepared, and possibly a version of the paper that is split into single sections, or versions of the paper in different formats. Naturally there can be other electronic specialities like Java scripts or 3\acro{D} navigation rooms and many more. All that is very time consuming if it has to be done by hand.
The whole process is critical in terms of cost effectiveness and should be automated to the highest extent possible. Using the special tools that Hyper-G provides, such as the {\it Table of Content Generator}, the {\it Glossary Hyperlink Generator} and the {\it Vocative Hyperlink Creation Language}, automates the additional work to the maximum extent possible. The steps to be performed at insertion are limited to running the tools and checking their results. All reference and glossary link creation is done automatically.

\section{Current Electronic Publications With Hyper-G Technology}

The first electronic journal based on Hyper-G was \acro{J}.\acro{UCS}, the Journal of Universal Computer Science, published by Springer-Verlag. It is a monthly journal covering all areas of computer science; in addition to the Web version, Springer provides a yearly \acro{CD}-\acro{ROM} and a printed version. Papers in \acro{J}.\acro{UCS} appear in two parallel formats: hypertext and hyperlinked PostScript. \acro{PDF} is planned for 1997.

Springer also publish Few Body Systems (\acro{FBS}), one of the most reputable journals in physics (started in January 1995). Academic Press distribute the Journal of Network and Computer Applications (\acro{JNCA}), formerly \acro{JMCA} and \acro{JMA} (started in January 1996). \emph{Datenstrukturen} by Ottmann and Widmayer, the German bible of data structures, uses hyperlinked PostScript. Meyer's Lexikon, one of the most comprehensive German encyclopedias, is electronically available on a Hyper-G server on an n-user license basis. Addison-Wesley publish some 30 books electronically on the Web using Hyper-G.

\begin{thebibliography}{99999}

\bibitem[Adobe 93]{AdobePDF} Adobe Systems Inc.: Portable Document Format Reference Manual, Addison-Wesley (1993).

\bibitem[Andrews 95]{AndKapMauSch95} Andrews, K., Kappe, F., Maurer, H. and Schmaranz, K.: On Second Generation Network Hypermedia Systems, Proc. \acro{ED}-\acro{MEDIA} '95, (1995), 69--74.

\bibitem[Berners-Lee et al.\ 94]{BLCaiLuoNieSec94} Berners-Lee, T., Cailliau, R., Luotonen, A., Nielsen, H. and Secret, A.: The World-Wide Web, Communications of the \acro{ACM} 37, 8 (1994), 76--82.

\bibitem[Andrews 94]{AndKap94} Andrews, K. and Kappe, F.: Soaring Through Hyperspace: A Snapshot of Hyper-G and its Harmony Client, Proc. of Eurographics Symposium on Multimedia/Hypermedia in Open Distributed Environments, Graz (1994).

\bibitem[Kappe 91]{KapMauTom91} Kappe, F., Maurer, H. and Tomek, I.: Hyper-G -- Specification of Requirements, Proc. Conference on Intelligent Systems (\acro{CIS}) '91, (1991), 257--272.

\bibitem[Maurer 94]{MauSch94} Maurer, H. and Schmaranz, K.: \acro{J}.\acro{UCS} -- The Next Generation in Electronic Journal Publishing, Computer Networks and \acro{ISDN} Systems, Computer Networks for Research in Europe, Vol.\ 26 Suppl.\ 2, 3, (1994), 63--69.

\bibitem[Maurer 95]{MauSch95} Maurer, H. and Schmaranz, K.: \acro{J}.\acro{UCS} and Extensions as Paradigm for Electronic Publishing, Proceedings \acro{DAGS}'95, Boston Massachusetts, (1995).

\bibitem[Maurer 96]{Mau96} Maurer, H. ed.: HyperWave -- The Next Generation Web Solution, Addison-Wesley, (1996).

\end{thebibliography}

\end{Article}
\endinput
%\end{document}
% Dipl. Ing. Klaus Schmaranz
% Institute for Information Processing and Computer Supported New Media (IICM)
% Graz University of Technology, Graz/Austria/Europe/Earth/Milky-Way/Universe
% email: kschmar@iicm.tu-graz.ac.at, phone: +43/316/873-5611, fax: +43/316/824394