Addressing on the Internet and limitations of location-dependent references
Addressing on the Internet
The primary addressing mechanism on the Internet is based on hyperlinks, a special type of cross-reference used for navigation in the World Wide Web (WWW). Hyperlinks are essential elements of markup languages, which describe the logical and physical structure of documents. Examples include the Hypertext Markup Language (HTML) and the Extensible Markup Language (XML) of the World Wide Web Consortium (W3C), which describe and define the content, structure, and semantics of documents.
The HTML 4.01 specification, for example, defines a hyperlink as a link between two resources, with a starting point and a target. The target description of a hyperlink must include a globally unique address. At present, Uniform Resource Locators (URLs), which specify the location of a document, are predominantly used for this purpose.
However, hypertexts lose consistency and navigation functionality if their links are unstable. Links become unstable because they depend on the storage location of the digital objects they point to.
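The location dependency of a URL can be made visible by splitting it into its components. The following is a minimal sketch using Python's standard `urllib.parse` module; the function name `location_parts` and both example URLs are hypothetical and serve only as illustration.

```python
from urllib.parse import urlsplit


def location_parts(url: str) -> dict:
    """Return the URL components that tie a link to a storage location."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # access protocol
        "host": parts.hostname,   # server that stores the object
        "path": parts.path,       # position of the object on that server
    }


# Hypothetical example: the same document before and after a server move.
old = location_parts("http://server-a.example.org/theses/2003/thesis42.pdf")
new = location_parts("http://repository.example.org/docs/thesis42.pdf")

# Host and path both changed, so a link to the old URL no longer reaches
# the document, although the document itself is unchanged.
changed = [key for key in old if old[key] != new[key]]
```

Nothing in either URL identifies the document itself; every component describes where and how it is stored.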
Alternative approaches that provide location-independent addressing by separating link targets from any direct reference to the storage location have already been developed. One example is the ISO standard HyTime (Hypermedia/Time-based Structuring Language). These alternatives, however, have not had a widespread impact.
Associated XML standards for resource linking, such as the XML Linking Language (XLink) or the W3C's HLink (Link recognition for the XHTML Family), extend the semantics of references and the functionality of HTML links. These standards also include approaches to location-independent addressing, but they are application-specific and provide no mechanism for fully location-independent addressing.
Time stability of location-dependent links: applied methods and their limitations
To compensate for the disadvantages of location-dependent links, several methods can be applied to keep links stable over time, for example:
- URLs from which the server dynamically determines the storage location, for example by means of CGI scripts or databases,
- a specific web server configuration that redirects old addresses to new ones through "aliases" or "redirects",
- assignment algorithms that create a stable URL naming structure,
- periodic URL checks by the information provider.
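The redirect approach from the list above can be sketched as a lookup table that maps old addresses to new ones. This is a simplified illustration in Python, not an actual web server configuration; the function `resolve_redirects`, the table contents, and the URLs are all hypothetical.

```python
def resolve_redirects(url: str, redirects: dict[str, str], max_hops: int = 10) -> str:
    """Follow old-to-new address mappings until no further redirect exists.

    Raises ValueError if the mappings form a loop or if more than
    max_hops redirects would be needed.
    """
    seen = {url}
    for _ in range(max_hops + 1):
        target = redirects.get(url)
        if target is None:
            return url  # no redirect registered: this is the current address
        if target in seen:
            raise ValueError(f"redirect loop at {target}")
        seen.add(target)
        url = target
    raise ValueError("too many redirects")


# Hypothetical redirect table after two successive server moves.
redirect_table = {
    "http://old.example.org/doc1.pdf": "http://mid.example.org/doc1.pdf",
    "http://mid.example.org/doc1.pdf": "http://new.example.org/docs/doc1.pdf",
}

current = resolve_redirects("http://old.example.org/doc1.pdf", redirect_table)
```

The sketch also hints at the limitation: the table works only as long as someone maintains every entry, and each further move lengthens the chain.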
The methods described offer only medium-term solutions for specific situations.
- Even addressing methods intended as long-term solutions are affected by technical changes, such as a change of server, database, or software, and by changes in responsibility when institutions extend or reduce their collection priorities. Location-dependent addressing methods cannot compensate for such changes.
- Periodic URL checks must lead to consistent URL maintenance. That is, URLs that a link check identifies as incorrect must be corrected in all referencing systems, such as catalogues, bibliographies, and portals. This is very time-consuming.
- Furthermore, URLs can become temporarily unavailable because of unstable server connections, network problems, and similar causes. The user then receives no information about any available copies of the document, because URLs do not provide this service.
- Electronic documents can change location as a result of changes in distributed business processes. URLs cannot identify and address them reliably.
For example, an electronic publication is located on a publisher's server at location A. It is submitted for archiving to an institution at location B, where the format and content of the document are processed. Additional information required for archiving is supplied at location C. The document is finally made available to the user at location D.
- A further problem is that URLs do not allow electronic documents to be cited independently of their location.
For example, a doctoral candidate wants to include a persistent, citable identifier in the document BEFORE the publication is stored on a document server, e.g. that of a university library. With a URL, the storage location would have to be known before the business process has actually determined it.
Reliable, persistent referencing of digital content therefore requires location-independent identification and addressing mechanisms. In the long term, persistent referencing also requires institutional support embedded in an international infrastructure.
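The idea of location-independent addressing can be illustrated with a small resolver that keeps the identifier separate from the storage locations. The following Python sketch is an assumption-laden illustration, not a description of any existing resolving service: the class `IdentifierRegistry`, its methods, the identifier `pid:example-42`, and the URLs are all hypothetical. It shows two properties discussed above: an identifier can be reserved before any storage location is known, and several copies can be registered so that an available one is returned.

```python
from typing import Callable, Optional


class IdentifierRegistry:
    """Hypothetical registry mapping persistent identifiers to locations."""

    def __init__(self) -> None:
        self._locations: dict[str, list[str]] = {}

    def register(self, identifier: str) -> None:
        """Reserve an identifier before any storage location is known."""
        self._locations.setdefault(identifier, [])

    def add_location(self, identifier: str, url: str) -> None:
        """Attach a storage location (or a further copy) to an identifier."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        if url not in self._locations[identifier]:
            self._locations[identifier].append(url)

    def resolve(self, identifier: str,
                is_available: Callable[[str], bool]) -> Optional[str]:
        """Return the first registered location that is currently available."""
        for url in self._locations.get(identifier, []):
            if is_available(url):
                return url
        return None


registry = IdentifierRegistry()

# The author cites the identifier before the document is stored anywhere.
registry.register("pid:example-42")

# Later, the business process determines the locations.
registry.add_location("pid:example-42", "http://library.example.org/42.pdf")
registry.add_location("pid:example-42", "http://archive.example.org/copy/42.pdf")

# Simulated availability check: the first server is temporarily down.
down = {"http://library.example.org/42.pdf"}
location = registry.resolve("pid:example-42", lambda url: url not in down)
```

The citation in the document never changes; only the registry entries do. Keeping such a registry reliable over decades is an organisational task, which is why institutional support is needed.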