Updated: 09 JUN 99



Third Metadata Workshop

Luxembourg, 12 April 1999





Table of contents

  1. Introduction
  2. Specific objectives of the third workshop
  3. Participation
  4. Structure of the workshop
  5. Report on the workshop sessions
  6. Conclusions
  7. Glossary

The full report of the workshop is available for download in Word format (56 KB).


Executive summary

On 12 April 1999, the third Metadata Workshop and Concertation Meeting, organised by the European Commission DGXIII/E2, took place in Luxembourg.

The workshop was attended by 48 participants from organisations across Europe, as well as representatives of several European Commission services.

This workshop, which is part of an on-going concertation activity started in 1997, had four objectives:

  1. to present recent developments around the Dublin Core metadata element set and look at future directions
  2. to present RDF and XML and look at the practical consequences for metadata implementation
  3. to look at issues related to unique identifiers for electronic resources
  4. to discuss metadata issues related to long-term availability of resources.

The workshop was conducted in four sessions, reflecting the objectives above.

The major conclusions of the workshop can be summarised as follows:

  • For electronic documents and resources produced today, there is a pressing need for tools and systems to create and maintain metadata. Further research in this area is necessary.
  • The matter is complex, as requirements of different types have to be met, e.g. electronic commerce and long-term preservation of resources.
  • There is a need for a highest common denominator across domains and services. It is not yet clear what the specification for this is, and co-operation between many actors is necessary.
  • For projects under the Fourth Framework Programme it is necessary to pay attention to the developments.
  • For projects under the Fifth Framework Programme, the scope needs to be widened to include other domains, especially museums and archives where these issues are also important. A wide participation in the debate would increase the general applicability and interoperability of the solutions.
  • Under the Fifth Framework Programme, clustering of activities in the area of metadata systems and services is encouraged.
  • The following recommendations can be formulated:
    • Continuation of concertation is necessary
    • Initiatives need to take a focused, practical approach
    • The needs of the European citizen need to be taken into account
    • Performance criteria and impact of initiatives need to be measured
    • Issues around cost and quality need to be further addressed
    • Support for multilinguality needs to be enhanced
    • Further involvement of commercial parties is necessary

For further information, including PowerPoint presentations, see the Workshop's Web site at:

http://www.echo.lu/libraries/en/metadata/metadata3.html

For more information concerning the first two workshops, see:

http://www.echo.lu/libraries/en/metadata.html

http://www.echo.lu/libraries/en/metadata2.html

For more information on the Libraries sector of the Telematics Applications Programme, see the Web site at:

http://www.echo.lu/libraries/en/libraries.html



1. Introduction

This document is the report of the third Workshop on Metadata, which took place in the Conference room in the EuroForum building, rue Robert Stumper, zone Cloche d'Or, in Luxembourg on Monday 12 April 1999 from 9:00 to 16:30.

The European Commission DGXIII, Unit E2, is organising a series of workshops on the issue of metadata. Intended participation is from libraries sector projects within the Telematics Applications Programme and from projects in other TAP sectors and other programmes, both EU and national. The primary objectives of the workshops are:

To establish a platform for co-ordination between projects concerned with metadata in a broad sense.

Under the current Framework Programme for RTD there are a number of projects concerned with metadata as such or with descriptions and descriptors of electronic documents. These projects will come across the same issues and problems and will benefit from concertation, as this will allow them to compare their concepts and approaches with others.

To make a wider European community aware of developments in the standards arena and stimulate feedback from the projects to the standards.

Developments in metadata in the Internet, specifically in Dublin Core, are moving fast. Some European organisations invest in participating in the Dublin Core workshops but not all have easy access to this activity. By inviting Dublin Core workshop participants to present the developments in the proposed workshops, a wider European audience can be informed on this subject. At the same time, models and experiences from the projects can be fed back into the standards arena.

2. Specific objectives of the third workshop

The specific objectives of this third workshop, held in Luxembourg on 12 April 1999, were as follows.

  • to present recent developments around the Dublin Core metadata element set and look at future directions
  • to look at issues related to unique identifiers for electronic resources
  • to present RDF and XML and look at the practical consequences for metadata implementation
  • to discuss metadata issues related to long-term availability of resources

3. Participation

48 persons representing projects from the Telematics programme, national projects and various Commission services attended the workshop.

The list of participants is attached as Appendix 3.

4. Structure of the workshop

This third workshop was organised on a single day and contained four sessions.

The programme of the workshop is attached in Appendix 1. Printouts of the presentations, with short biographical notes on the presenters, are attached in Appendix 2.



5. Report on the workshop sessions

5.1 Introduction session

Axel Szauer (European Commission, Deputy Head of Unit DGXIII/E2) welcomed the participants. He noted that this third workshop is the last in a series of concertation meetings under the Fourth Framework Programme.

Makx Dekkers (PricewaterhouseCoopers TechServ for Libraries team) gave an overview of the two previous metadata workshops, highlighting the objectives of the workshop series and the conclusions of the earlier meetings.
The first Metadata Workshop took place in Luxembourg on 1 and 2 December 1997 and contained a tutorial on metadata, project presentations and break-out sessions on various areas related to metadata (creation, harvesting, retrieval). The conclusions were:

  • There needs to be clarity about version control and maintenance of Dublin Core. The Dublin Core group will be asked to give a clear statement about this.
  • Further pilot projects should be started to further develop experience, test out the issues and help realise a critical mass of Dublin Core metadata.
  • The interest and requirements existing in Europe warrant the establishment of a European group of implementers discussing the practical issues of implementing metadata in general and Dublin Core in particular.
  • The liaison with other groups concerned with metadata, such as the CEN/ISSS working group on Metadata for Multimedia Information (MMI), should be established to ensure applicability and interoperability of metadata as widely as possible and cover the needs of a wide range of communities.

The second Metadata Workshop, which took place in Luxembourg on 26 June 1998, contained a session on technical issues (creation tools, educational applications, controlled vocabulary, multilingual issues) and a session on strategic issues related to standardisation. The conclusions were:

  • The strategic discussions highlighted that establishing widely accepted agreements is essential for the success of metadata;
  • It is necessary that consensus on agreements for metadata is achieved across domains (e.g. libraries, museums, education, business, etc.);
  • Agreements and standards need to be maintained over time in a clear and open way with participation of all interested parties (especially user communities) to guarantee stability over time;
  • Formal and informal bodies involved in the standardisation of metadata sets (Dublin Core community, CEN, ISO) need to find effective ways of co-operation to ensure maximum acceptance of agreements and to avoid overlapping activities;
  • Further metadata workshops organised by the European Commission are considered to be valuable platforms for co-ordination and exchange of experience.

A number of events were highlighted that occurred after the second workshop:

  • The establishment of an explicit governance structure for Dublin Core: the Dublin Core Directorate with Policy and Technical Advisory Committees;
  • The preparation by the CEN/ISSS Workshop on Metadata for Multimedia Information (MMI) of documents describing a Model and a Framework for Metadata;
  • The introduction of work items based on the Dublin Core agreements into standardisation bodies: ISO, NISO and CEN;
  • Involvement of the rights holders community in the further development of Dublin Core.

These events address some of the issues that were identified in the Luxembourg workshops.



5.2 Developments in metadata standardisation

Godfrey Rust (Data Definitions), representing the Info2000 INDECS project, presented his views on the development of metadata standardisation from the perspective of the rights holders' community. He presented the metadata model developed for the project, based on the concept that ‘People make Stuff’, ‘People do Deals about Stuff’ and ‘Stuff is used by People’. He pointed out that we are beginning to understand some of the issues, such as the convergence of different types of material into complex multimedia information, the Web-based convergence of culture and commerce, the fact that metadata is of a ‘make once, use many’ nature, and that persistence requires absolute uniqueness. He identified the problem that there are many overlapping standards and not enough unique identification. He mentioned some existing metadata standards (e.g. EDIFACT, MARC) and identifier schemes, and listed a wide range of emerging standards (e.g. MPEG7, Dublin Core, DOI), indicating that there is an explosion of activity in different sectors - the commercial print, audio and audiovisual sectors as well as libraries, archives and education - developing sets and schemas for digital resources. He argued that 1999 is a critical year for the development of metadata standards, with the potential for convergence rather than competition.
In his view, existing standards are there mainly for specific uses, specific products, specific territories, and for local access to physical objects with simple rights associated to their use. New standards would need to be applicable to many uses, multimedia resources on the Web, and for distributed access to digital resources with complex structures and multiple rights associated to them.
In the last part of his presentation, he explained the objectives and the approach of the INDECS project, a project co-funded by the European Commission under the Info2000 programme, addressing interoperability of data in E-Commerce systems, and bringing together partners from a wide range of backgrounds (book sector, record industry, multimedia rights, music rights, audiovisual rights, literary rights, database provider).
After Rust's presentation, the following issues were discussed:

  • There is a need for ‘cross-walks’ between various emerging metadata sets to avoid enormous problems with interoperability in the future
  • Content providers will be interested in making appropriate use of metadata. Misuse should be prevented through cross-domain agreements.
  • Discovery is an important objective for description of resources, but not the only one.
  • The INDECS model's focus on Events is an interesting new perspective on the future of resource description; there should, however, be a backward mapping to legacy methods, such as MARC-based cataloguing.
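The ‘cross-walks’ called for above can be illustrated with a minimal sketch. The field mappings below are simplified examples chosen for illustration, not a normative Dublin Core-to-MARC crosswalk:

```python
# Minimal illustration of a metadata crosswalk: translating a record
# from one element set to another via a mapping table.
# The mappings below are simplified examples, not a normative crosswalk.

DC_TO_MARC = {
    "title":   "245",  # MARC title statement (simplified)
    "creator": "100",  # MARC main entry, personal name (simplified)
    "date":    "260",  # MARC publication information (simplified)
}

def crosswalk(record, mapping):
    """Convert a record keyed by one element set into another,
    dropping elements with no known equivalent."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}

dc_record = {"title": "Report on Metadata", "creator": "Dekkers, M.",
             "language": "en"}  # 'language' has no mapping here and is dropped

marc_record = crosswalk(dc_record, DC_TO_MARC)
print(marc_record)  # {'245': 'Report on Metadata', '100': 'Dekkers, M.'}
```

The sketch also shows why crosswalks are lossy: any element without an agreed equivalent simply disappears, which is one reason cross-domain agreement on mappings matters.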

5.3 Metadata and identifiers

Norman Paskin (International DOI Foundation) introduced the Digital Object Identifier (DOI) as a unique identifier of a piece of digital content and as a resolution system to get to that content. He stressed that the DOI effort is led by the content world, not the technologists. Currently, the DOI links to a URL; in the future, it is intended to develop into a name for all types of creations. It will support various types of services in the area of electronic commerce. This requires interoperable metadata to be associated with the DOI.
After recalling the requirements for names as defined by the Internet Engineering Task Force (IETF) (global scope, uniqueness, persistence, scalability, legacy support, extensibility and independence) and the requirements for metadata as identified by INDECS, he listed the metadata requirements in the DOI context (functional granularity and unique identification).
He stated that the value of the DOI initiative should be demonstrated through a ‘killer application’. As an example he described the benefits of the DOI for reference linking. In the Web environment, references are more than just traditional article citations: a reference link needs to support multiple manifestations where complex rights might be involved, and automation is required to ensure persistence and interoperability. For this purpose, simple URLs are not sufficient. The DOI could provide a solution because it is immediately ‘implementable and automatable’, gives a managed, persistent name and is extensible to various resource types. It paves the way for multiple resolution to manifestations of works, and will make it possible to resolve to local copies of the referred material.
The metadata associated with the DOI will consist of a common kernel, extended for specific genres; Paskin illustrated this with a list of kernel elements and a list of additional elements for a ‘journal work’.
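The ‘common kernel plus genre-specific elements’ structure can be sketched as follows. The element names are illustrative assumptions, not the actual DOI kernel specification:

```python
# Sketch of a 'common kernel plus genre-specific additions' metadata
# record for a DOI. Element names are illustrative, not the official
# DOI kernel element set.

def make_doi_record(doi, kernel, genre_elements=None):
    """Combine the common kernel with optional genre-specific elements."""
    record = {"doi": doi}
    record.update(kernel)
    record.update(genre_elements or {})
    return record

# Hypothetical kernel for a journal article (genre: 'journal work').
kernel = {"title": "An Article", "primary_agent": "A. Author",
          "genre": "journal work"}
# Hypothetical additional elements specific to the 'journal work' genre.
journal_extras = {"journal_title": "Journal of Examples",
                  "volume": "12", "issue": "3"}

record = make_doi_record("10.1000/xyz123", kernel, journal_extras)
print(record["genre"], record["volume"])  # journal work 12
```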
Some of the issues that need to be considered are:

  • Metadata must be ‘simple but deep’
  • Registration of metadata
  • Demonstrable applications

Finally, he presented the DOI vision that the DOI could serve a function similar to that of the UPC bar code, providing a unifying digital identifier service in an infrastructure for Intellectual Property Commerce.

After Paskin's presentation, the following topics were discussed:

  • Not all of the content to be identified is commercial. For non-commercial material the cost/benefit ratio would be completely different.
  • If a global metadatabase is required, who would maintain it and where?

Juha Hakala (Helsinki University Library) reported on the status of the Uniform Resource Name (URN) work in the IETF. This activity originated in the realisation that electronic resources need to be uniquely identified to ensure efficient retrieval and long-term preservation. He indicated that the DOI is fully compliant with the URN. The standardisation of URN is almost complete: RFCs (the IETF equivalent of a standard) are available for the URN syntax (RFC 2141), URN resolution (RFCs 2168 and 2169) and URN resolution services (RFC 2483). URNs require defined namespaces; a draft RFC on namespace definition was submitted to the IETF in March 1999. The namespace "NBN" has been reserved for national bibliography numbers, and ISBN and ISSN centres plan to begin registration of their namespaces once the RFC has been approved. Namespace maintenance poses some problems: national maintenance would require national agencies, and the Internet Assigned Numbers Authority (IANA), which is responsible for global maintenance, would need additional personnel resources for this task.
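The URN syntax defined in RFC 2141 (the string "urn:", a namespace identifier, a colon, and a namespace-specific string) can be sketched with a small parser. The example URN values below are illustrative, not registered identifiers:

```python
import re

# Simplified parser for the URN syntax of RFC 2141:
#   urn:<NID>:<NSS>
# where NID is the namespace identifier (e.g. "nbn", "isbn") and NSS is
# the namespace-specific string. This is a sketch: the NID rule below
# (1-32 chars, alphanumeric plus hyphen, leading alphanumeric) follows
# RFC 2141, but the full grammar has more detailed character rules
# for the NSS.

URN_RE = re.compile(r"^urn:([a-z0-9][a-z0-9-]{0,31}):(.+)$", re.IGNORECASE)

def parse_urn(urn):
    """Split a URN into its namespace identifier and namespace-specific
    string, raising ValueError on malformed input."""
    m = URN_RE.match(urn)
    if not m:
        raise ValueError("not a valid URN: %r" % urn)
    return {"nid": m.group(1).lower(), "nss": m.group(2)}

# Illustrative examples: an NBN-style URN and an ISBN-style URN.
print(parse_urn("urn:nbn:fi-fe19991001"))
print(parse_urn("URN:ISBN:951-0-18435-7"))
```

Because the NID identifies the namespace, a resolver can dispatch on it, e.g. routing "nbn" URNs to a national agency; this is the decentralisation the NBN namespace allows.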
The URN resolution is separate from the assignment of identifiers and is foreseen to use, initially, the Domain Name Services (DNS) and ‘Trivial HTTP’. It remains to be seen if a truly global service is possible. Three types of identifiers can be used:

  • ‘dumb’ codes where only one resolution service can be established, comparable to the ISSN;
  • ‘semi-intelligent’ codes where a limited number of resolution services can be used, such as with the NBN namespace which allows decentralisation to national level;
  • ‘intelligent’ codes where extensive decentralisation can be achieved, like the mechanisms underlying the ISBN.

Real implementations should allow ‘cascading’ resolution, e.g. from a publisher site to legal deposit collections.
Hakala summarised the current status of implementation, pointing out that resolution via Web indices only allows URN to URL resolution and that for browsers, there are only experimental plug-ins and Java applets. The DNS already supports URN resolution through BIND.
He listed a number of challenges for the future of URN:

  • finalisation of URN standardisation
  • registration of legacy identifiers as URNs
  • involvement of application developers
  • legal and contractual framework for delivery of archived Web documents
  • extension of the URN system, e.g. to URN-based commercial systems
  • tools and principles for long-term preservation

In the discussion, it was noted that issues around the cost factors for URNs have not yet been addressed.

Leif Andresen (Danish National Library Authority) first described the INDOREG project. This project aims to find a solution for the registration of Internet documents at the national bibliographic level. It prescribes the same rules for static publications as for print material, and establishes new rules for dynamic material. It recommends using Dublin Core as a self-registration format and provides a PURL server for persistent resolution. A report on the project will be published in May 1999.
In summer 1997, the Danish Ministry of Research and Information Technology and the Danish State Information Service published a standard describing how government publications shall be encoded and described. This applies to all new publications issued by Danish ministries, government offices and agencies. Metadata must be supplied, in a format that was inspired by Dublin Core.
A new law on legal deposit has been in force in Denmark since January 1998, requiring producers of static documents on the Internet to register these documents using an electronic application form. Three institutions share responsibility for the registration:

  • The Danish Library Centre for Indoreg documents
  • The Danish State Information Service for government publications
  • The Royal Library for legal deposit

An application form committee ensures common agreement on the application forms. The current definition is based on Dublin Core. A technical description and guidelines have been published supporting the generation of metadata as an integrated part of the creation of a document.
For identification of a resource, the use of a unique identifier is mandated. It can be based on unique identifiers included in the document (ISBN, DOI, ISSN etc.), but if there is none of these, a Danish URN is assigned through a common Royal Library/Danish Library Centre numbering machine.
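Self-registration with Dublin Core typically means embedding the metadata in the document itself, for example as HTML META tags. A generator for such tags might look like the following sketch; the element values and the URN are illustrative, not actual Danish registrations:

```python
# Sketch of generating Dublin Core metadata as HTML <meta> tags for
# embedding in a document ('self-registration'). All values below are
# illustrative examples, not actual registered metadata.
from html import escape

def dc_meta_tags(elements):
    """Render a dict of Dublin Core elements as HTML META tags using
    the 'DC.' name prefix convention."""
    lines = []
    for name, value in elements.items():
        lines.append('<meta name="DC.%s" content="%s">'
                     % (name, escape(value, quote=True)))
    return "\n".join(lines)

tags = dc_meta_tags({
    "Title": "Ministry Report 1999",
    "Creator": "Danish State Information Service",
    "Identifier": "URN:NBN:dk-example-1999",  # illustrative URN
})
print(tags)
```

Generating the tags at document-creation time, as the Danish guidelines envisage, keeps the metadata with the document and lets harvesters collect it without a separate registry lookup.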

In the discussion, the following issues were addressed:

  • To help creators of metadata, tools should be available.
  • There is an important issue around the quality assurance of metadata as well as ensuring authenticity.
  • The cost of creating metadata should be offset by benefits for the creators.



5.4 RDF and XML

Dan Brickley (Bristol University) presented a paper providing a broad overview of the scope, capabilities and likely applications of the W3C's Resource Description Framework (RDF), including its relationship to other metadata projects such as PICS, DSig, IMS, the Warwick Framework and the Dublin Core element set. He described RDF as a framework for describing resources, building on Uniform Resource Identifiers (URIs) and the Extensible Markup Language (XML). Target applications for RDF are areas such as resource discovery, privacy, site maps, content rating and intellectual property rights. RDF aims to provide a means to make statements about properties of Web resources. He gave examples of the use of RDF for Dublin Core metadata.
The Resource Description Framework includes a number of components: a formal data model, an XML-based concrete syntax, and a (proposed) Schema language for describing RDF vocabularies. The RDF model uses a graphical description model with nodes (for resources), arcs (for properties) and values. This model is useful because it is arbitrarily extensible, allows the use of vocabularies and provides a common infrastructure (syntax, query, editors) even if there is no agreement on semantics.
The RDF Model and Syntax specification is now a Recommendation of the World Wide Web Consortium (W3C); the Schema specification is a Proposed Recommendation. Future work includes the development of RDF Query. He also indicated that tools for the use of RDF are now emerging.
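The RDF data model described above, with nodes for resources, arcs for properties and values at the ends of arcs, amounts to a graph of (resource, property, value) statements. A minimal sketch, with illustrative resource and property names:

```python
# Minimal sketch of the RDF data model: a graph of
# (resource, property, value) statements. The property names below use
# a Dublin-Core-style "dc:" abbreviation purely for illustration.

class Graph:
    def __init__(self):
        self.statements = set()

    def add(self, resource, prop, value):
        """Assert one statement (an arc from resource to value)."""
        self.statements.add((resource, prop, value))

    def values(self, resource, prop):
        """All values of a given property for a given resource."""
        return {v for (r, p, v) in self.statements
                if r == resource and p == prop}

g = Graph()
doc = "http://www.example.org/report"  # illustrative resource URI
g.add(doc, "dc:title", "Third Metadata Workshop Report")
g.add(doc, "dc:creator", "European Commission DGXIII/E2")

print(g.values(doc, "dc:title"))  # {'Third Metadata Workshop Report'}
```

The point made in the presentation is visible here: the graph infrastructure (storage, query) works for any vocabulary, even before communities agree on what the property names mean.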
In the discussion, it was identified that tools and guidelines need to be available before wide use can be made of RDF.

Winfried Mühl (Göttingen Digitization Center) presented the work of the GDZ. Funded by the German Research Foundation (DFG) since spring 1997, the Göttingen Digitization Center (GDZ) at the State and University Library of Lower Saxony (Germany) has as its main practical goal the building of a digital library from collections of old book materials.
Another main task of the GDZ is to support other projects with related objectives with know-how and technical recommendations. To this end it evaluates tools and techniques in the relevant fields of scanning, imaging, document management and accessibility via the Internet. Since autumn 1997 the GDZ has been producing digital documents as sets of images. To provide selective access, metadata are used to describe these documents at both the book and the structural level.
In searching for a suitable metadata format, several demands had to be met: the GDZ expects extensive exchange of metadata and digital documents with related projects; as yet unknown document structures will presumably have to be implemented and handled; and users should be well supplied with tools for browsing and editing the digital documents. In view of all this, the GDZ has chosen RDF/XML as its default metadata format. It is expected to support interoperability between different platforms well, it is extensible to new meta-information, and it will presumably become widespread and supported by many software tools. At present the new RDBMS-driven document management system (DMS) AGORA is being established as the central administration tool for the GDZ's digital library.
The default data interface for import and export supports an RDF/XML version developed by the GDZ in July 1998. A running prototype is already using this format. Several tools have been developed at the GDZ to generate RDF/XML files from online resources. Editors such as the AGORA DocViewer can be used to interactively generate metadata that gives structure to flat image sets from scratch.
Mühl presented the concept and some details of the GDZ RDF/XML version as well as the tools used to handle it.
He identified a number of open issues that have emerged from the work at the GDZ, especially the migration of the current solution to conform to the present RDF Recommendation and the mapping of the GDZ RDF to Dublin Core semantics. Tools that have been developed at the GDZ could help others. The generation of unique production numbers for the digital documents was one of the outstanding technical issues that he mentioned.

In the discussion, it was again identified that tools are of the utmost importance.

5.5 Metadata for preservation

Michael Day (UKOLN) talked about the issue of metadata for preservation from the perspective of the Cedars project. He noted that it is becoming increasingly recognised that metadata has an important role in the ongoing management of digital resources, including their preservation. Preservation metadata can be seen as a specialised form of administrative metadata that can support the long-term preservation of the information content of digital resources by recording technical data about their hardware and software dependencies. Metadata can also help preserve the authenticity of digital resources by the application of cryptographic techniques and by recording relevant contextual information. Metadata will also be needed to help manage the complex intellectual property rights issues that surround digital preservation. The paper outlined functional and practical issues relating to preservation metadata from the perspective of Cedars, a Consortium of University Research Libraries (CURL) project funded by JISC under its Electronic Libraries (eLib) programme. The aims of this project are to promote awareness and identify appropriate strategies for collection management and long-term preservation, based on a realistic sample of current digital resources. Day identified four roles for metadata:

  • Technical metadata
  • Rights management metadata
  • Intellectual preservation metadata
  • Resource discovery metadata

He discussed in some detail the model for an Open Archival Information System (OAIS) being developed by the Consultative Committee for Space Data Systems and its application in the Cedars project.
He concluded that digital preservation is increasingly becoming an important issue for libraries, archives and other organisations and that the creation and maintenance of relevant metadata can contribute to the solving of some digital preservation problems. He specifically pointed to the OAIS reference model as providing a common framework of terms and concepts.

Titia van der Werf (Royal Library, the Netherlands) described the work of the Nedlib project in the area of preservation. She first gave a quick overview of the project, which aims to develop basic tools for use by deposit libraries to ensure that electronic publications can be used now and in the future. One of the elements of the project is to define a data-flow and map this into a data-model for a deposit system for electronic publications (DSEP). In this environment, metadata needs to be gathered, produced and recorded. A DSEP requires a number of different types of metadata: for cataloguing, for installation and de-installation, for access and for preservation. The preservation metadata needs to take account of differing requirements according to which preservation strategy is followed (emulation, migration, preservation of hardware and/or software). NEDLIB has opted for the OAIS Reference Model. This means that the metadata-for-preservation elements identified in NEDLIB need to be mapped to the different OAIS concepts of information objects, such as Representation Information and Preservation Description Information.
Van der Werf presented the various options for preservation strategies. In the migration strategy, where resources are converted from one platform to another to follow technical progress, the metadata needs to be changed every time a migration cycle takes place; this means that the amount of metadata associated with the resource grows over time. In the emulation strategy, where resources are left unchanged but (obsolete) hardware is emulated in software, the metadata is stable and is only needed when an emulator is required.
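The contrast between the two strategies can be sketched: under migration, each conversion cycle appends an entry to the resource's preservation metadata, so the record grows over time. The field names and formats below are illustrative assumptions, not NEDLIB's actual data model:

```python
# Sketch of preservation metadata under a migration strategy: every
# migration cycle appends a new entry to the resource's history, so the
# metadata record grows over time. Field names and format names are
# illustrative, not NEDLIB's actual data model.

def migrate(record, source_format, target_format, year):
    """Record one migration cycle in the preservation metadata."""
    record.setdefault("migration_history", []).append({
        "from": source_format, "to": target_format, "year": year,
    })
    record["current_format"] = target_format
    return record

record = {"identifier": "urn:nbn:example-001",   # illustrative URN
          "current_format": "WordPerfect 5.1"}
migrate(record, "WordPerfect 5.1", "Word 97", 1999)
migrate(record, "Word 97", "XML", 2005)

print(len(record["migration_history"]), record["current_format"])  # 2 XML
```

Under an emulation strategy, by contrast, the record would stay fixed: only a description of the original hardware and software environment is needed, to be consulted when an emulator has to be built.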
In the demonstration phase of the Nedlib project, the emulation strategy will be implemented using material from a number of publishers.

In the discussion session, questions were asked on the potential scalability of the solutions. That is an issue that has not yet been addressed. The advantages and drawbacks of the migration and emulation strategies were briefly discussed.



6. Conclusions

In the closing session, Ian Pigott of the European Commission DGXIII, Unit E2, summarised the conclusions of the workshop.

  • For electronic documents and resources produced today, there is a pressing need for tools and systems to create and maintain metadata. Further research in this area is necessary.
  • The matter is complex, as requirements of different types have to be met, e.g. electronic commerce and long-term preservation of resources.
  • There is a need for a highest common denominator across domains and services. It is not yet clear what the specification for this is, and co-operation between many actors is necessary.
  • For projects under the Fourth Framework Programme it is necessary to pay attention to the developments.
  • For projects under the Fifth Framework Programme, the scope needs to be widened to include other domains, especially museums and archives where these issues are also important. A wide participation in the debate would increase the general applicability and interoperability of the solutions.
  • Under the Fifth Framework Programme, clustering of activities in the area of metadata systems and services is encouraged.
  • The following recommendations can be formulated:
    • Continuation of concertation is necessary
    • Initiatives need to take a focused, practical approach
    • The needs of the European citizen need to be taken into account
    • Performance criteria and impact of initiatives need to be measured
    • Issues around cost and quality need to be further addressed
    • Support for multilinguality needs to be enhanced
    • Further involvement of commercial parties is necessary



European Commission 
DGXIII Telematics for Libraries 
Contact: Concha Fernandez de la Puente
e-mail: concha.fpuente@lux.dg13.cec.be
