
Telematics for Libraries - Projects


Updated: 15 JUN 99

Project Number and Title: 1047 - MARC Optical Recognition
Programme/Action line: FP 3 / IV
Call Topic(s): Theme 17 - New bibliographic record products and services applying internationally recognised standards
Start: December 1992
End: October 1994
Project Duration in Months: 24
Keywords: OCR/ICR; retrospective conversion; library catalogues; structure recognition; character recognition

Project description
The project's key goal was to evaluate the feasibility of OCR/ICR as an approach to the retrospective conversion of printed library catalogues, through:
  • development of a prototype tool;
  • integration of prototype into a production environment;
  • test and assessment of methods under real conditions.
The retrospective conversion of library catalogues depends equally on character conversion of the data and on coding of the data's structure. Previous work investigated OCR, but with only limited automatic treatment of structure and formatting. Taking a printed national bibliography as its source records, the project used state-of-the-art OCR/ICR tools and integrated them with an ODA-based approach to structure recognition in order to generate high-quality, UNIMARC-formatted records.
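As a rough illustration of the final step, the sketch below maps an already-recognised catalogue entry onto a few UNIMARC fields (200 title and statement of responsibility, 210 publication, 700 personal name). It is not the project's software: the dictionary keys, the sample entry and the simplified one-line text output are assumptions made for the example, and indicators and the ISO 2709 exchange format are left out.

```python
# Hypothetical sketch: map a structured entry to simplified UNIMARC field lines.
# Field tags/subfields used: 200 $a $f, 210 $a $c $d, 700 $a $b.
# The input keys ("title", "surname", ...) are invented for this example.

def to_unimarc_lines(entry):
    """Return a list of 'TAG $xvalue$yvalue' strings, ordered by tag."""
    layout = [
        ("200", [("a", "title"), ("f", "responsibility")]),
        ("210", [("a", "place"), ("c", "publisher"), ("d", "year")]),
        ("700", [("a", "surname"), ("b", "forename")]),
    ]
    lines = []
    for tag, subfields in layout:
        # Keep only the subfields for which the entry actually has a value.
        parts = [f"${code}{entry[key]}" for code, key in subfields if entry.get(key)]
        if parts:
            lines.append(f"{tag} " + "".join(parts))
    return lines


# Invented sample entry, not taken from any real bibliography.
sample = {
    "title": "Example title",
    "responsibility": "A. Author",
    "place": "Bruxelles",
    "publisher": "Example Press",
    "year": "1973",
    "surname": "Author",
    "forename": "A.",
}

for line in to_unimarc_lines(sample):
    print(line)
```

Missing elements simply produce no subfield, so an entry without an author yields no 700 line.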
Technical approach
The project (MORE) was divided into three phases: specification, development and evaluation. Within these phases, tasks were scheduled over seven workpackages:
  • Technical specifications;
  • Dictionaries;
  • Structure recognition;
  • Character recognition;
  • Testing & acceptance of software;
  • Prototype;
  • Production test.
The system converts printed catalogues directly into machine-readable form via OCR. The character and structure recognition tools can be configured to process any catalogue with a sufficiently homogeneous structure.
When errors or other exceptions occur, the image of the original document is displayed with the problem highlighted, together with the best-estimate solution and alternatives. Verified data is converted to high-quality, UNIMARC-formatted records.
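The exception handling described above can be sketched as a simple triage step: rank the candidate readings, accept the best one if it is trusted enough, and otherwise flag it for operator review alongside the alternatives. This is a minimal sketch under assumptions: the `Candidate` type, the confidence scores and the 0.9 threshold are hypothetical and do not come from the project documentation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    confidence: float  # hypothetical recogniser score in [0, 1]


def triage(candidates, threshold=0.9):
    """Return (best_text, needs_review, alternatives) for one recognised item."""
    if not candidates:
        raise ValueError("at least one candidate reading is required")
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    best, rest = ranked[0], ranked[1:]
    # Below the threshold the operator is shown the best guess plus alternatives.
    needs_review = best.confidence < threshold
    return best.text, needs_review, [c.text for c in rest]


# Invented readings for a single word.
readings = [
    Candidate("Bruxelies", 0.30),
    Candidate("Bruxelles", 0.62),
    Candidate("Bruxe11es", 0.08),
]
best, needs_review, alternatives = triage(readings)
print(best, needs_review, alternatives)
```

In a real workstation the flagged item would be shown next to the highlighted region of the page image; here only the decision logic is illustrated.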
The prototype was tested under production conditions using the 'Bibliographie de Belgique 1973', selected because its records pre-dated current layout standards. Nevertheless, the success of the tests clearly demonstrated the viability and potential of the method.
Key issues
The main technical issues explored were:
  • Role and use of dictionaries, both generic and application-specific;
  • Analysis and modelling of library catalogue data structures;
  • Integration of structure and character recognition tools.
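To illustrate the first issue, the role of dictionaries, the sketch below checks an OCR reading against an application-specific word list and proposes close matches. It uses Python's standard `difflib` as a stand-in for whatever matching the project actually used; the word list, the sample misreading and the cutoff value are assumptions.

```python
import difflib

# Hypothetical application-specific dictionary (places of publication).
PLACES = ["Bruxelles", "Antwerpen", "Gent", "Namur"]


def suggest(word, dictionary, max_suggestions=3, cutoff=0.6):
    """Return the word itself if known, else close dictionary matches (best first)."""
    if word in dictionary:
        return [word]
    return difflib.get_close_matches(word, dictionary, n=max_suggestions, cutoff=cutoff)


# "11" misread for "ll" is a typical OCR confusion.
print(suggest("Bruxe11es", PLACES))
print(suggest("Gent", PLACES))
```

A specific dictionary of this kind keeps the candidate set small, which is one plausible reason to derive dictionaries from the catalogue being converted rather than rely on a generic one alone.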
Impact and results
The project will permit the extension and application of existing techniques to other domains of document processing in library catalogues.
The results include:
  • Specifications of record structure analysis and recognition;
  • Prototype workstation for OCR/ICR and structure recognition of printed library catalogue records;
  • Sample conversions of printed national bibliographic records;
  • Report on feasibility and cost-effectiveness of the approach.
Input accuracy, targeted at 99.8%, is comparable to double-keying standards. Input speed, however, is much greater, the treatment of errors is more immediate and informative, and document handling is largely eliminated.
The method is technically and commercially feasible for a catalogue conversion system. As such, it would be expected to at least halve human involvement in the process.
The production-tested prototype can be adopted as a commercial-grade workstation for retrospective conversion (RECON) of printed library catalogues.
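To give a feel for what a 99.8% target implies, the short calculation below assumes the figure is a per-character accuracy and that errors are independent; neither assumption is stated in the project description, and the 300-character record length is purely illustrative.

```python
def expected_errors(chars, accuracy):
    """Expected number of wrong characters in a text of the given length."""
    return chars * (1.0 - accuracy)


def p_error_free(chars, accuracy):
    """Probability that every character is correct, assuming independent errors."""
    return accuracy ** chars


RECORD_LENGTH = 300  # hypothetical average record length in characters
TARGET = 0.998       # accuracy target quoted in the text

print(expected_errors(RECORD_LENGTH, TARGET))
print(p_error_free(RECORD_LENGTH, TARGET))
```

Under these assumptions a typical record would contain well under one error on average, yet a substantial share of records would still contain at least one, which is why fast review of flagged exceptions matters.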
Software design and specification documents are deliverables of the project but have restricted availability.
Other published reports cover:
  • An evaluation of the prototype;
  • An evaluation of tests on the 'Bibliographie de Belgique 1973'.


Name of Institution/Organisation: Jouve S.I.
Title, First Name, Name: Marie-Elise Fréon
Address: 18, rue Saint Denis, BP 414-01
Postal Code / City: F - 75025 PARIS CEDEX 01
Country: FR
Tel: +33-1 44 76 86 20
Fax: +33-1 44 76 86 39

Other Partners

Name of Institution/Organisation           Country   Role
Centre de Recherche Informatique, Nancy    FR        P
Bibliothèque Royale Albert 1er             BE        P
