University of California
SOPAG Electronic Resources Cataloging Task Force


 Report and Discussion of Recommendations
Cataloging Approach 

The Task Force has been directed to build upon work conducted by TFER1, adopting the single record approach recommended by that group. The guidelines employ CONSER's "single" record concept, delineating the digital counterpart of a print journal in cataloging records for the print. The guidelines adopt decision points that dictate cataloging of the digital version itself when the content of the digital differs substantively from the print. In this respect, the resulting product consists of records representing both single and separate record approaches. (For the minority view, see Separate Records: the Minority View.)

The Task Force recognizes that any plan to create records for serial titles at a central site and distribute them for integration into local databases encounters the complexities inherent in attempting to merge serial records. Making use of centrally constructed single record cataloging necessitates great care to ensure that records to be overlaid are properly identified, matched and modified.  Each library needs to consider records in the existing catalog reflecting former latest entry cataloging practices, brief records entered during retrospective conversion projects, incoming records for which there is not an exact one-to-one correspondence with existing records, and records to be overlaid that lack matching keys.  Given these considerations, it is unlikely that a simple overlay process can be employed to load records in local catalogs and it is likely that some degree of local cataloging attention will be needed. For the most part, the Task Force thinks that work to integrate centrally created records for CDL titles locally can be conducted by a paraprofessional working from records provided by the central cataloging agency.  We also anticipate that the task of integrating these records locally will be greatest during the first year and will be substantially reduced in subsequent years. 
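The matching concerns described above can be illustrated with a minimal sketch in Python. It assumes, hypothetically, that each record is a dictionary that may carry `oclc` and `issn` match keys; the key names, the queue names and the routing rules are assumptions made for illustration and are not part of any campus loader.

```python
MATCH_KEYS = ("oclc", "issn")  # hypothetical match keys, for illustration only


def plan_overlay(incoming, catalog):
    """Sort incoming records into overlay, unmatched and manual-review queues."""
    plan = {"overlay": [], "unmatched": [], "review": []}
    for rec in incoming:
        keys = [(k, rec[k]) for k in MATCH_KEYS if rec.get(k)]
        if not keys:
            # No matching key at all: a cataloger must identify the target.
            plan["review"].append(rec)
            continue
        matches = [c for c in catalog if any(c.get(k) == v for k, v in keys)]
        if len(matches) == 1:
            # Exact one-to-one correspondence: safe to overlay.
            plan["overlay"].append((rec, matches[0]))
        elif not matches:
            plan["unmatched"].append(rec)
        else:
            # More than one candidate, e.g. latest entry and successive entry records.
            plan["review"].append(rec)
    return plan
```

Only records with exactly one match are queued for automatic overlay in this sketch; everything else is left for local cataloging attention, which mirrors the expectation that a simple overlay process will not suffice.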

The present national cataloging guidelines that employ the single record approach to reflect both print and electronic versions of a resource also provide for the option of creating separate records describing the digital resource fully.  The Task Force recognizes that campuses wishing or needing to have separate records for electronic resources in their databases as a complement to the single record approach can do so. In this instance, local catalogs can carry records for print versions on which comparable digital resources are also described, as well as separate records which describe the digital resources fully. 

In summary, the Task Force recommends adopting the single record approach for Tier 1 and Tier 2 titles with decision points employed to catalog digital versions directly in certain instances, either at the point of initial cataloging or in the course of catalog maintenance.  Such an approach means catalog users will enjoy integrated access to like materials in the present MELVYL system, which requires that holdings be linked to a single bibliographic record to present a merged display.  Local loading of complementary cataloging for digital versions means campuses that wish to employ such records can do so.  Campuses could elect to share such records. 

In determining decision points directing digital cataloging, the Task Force considered those defined and employed by other agencies, including the Library of Congress.  Because the guidelines entail cataloging for digital resources under a variety of circumstances, the guidelines explicate practices for constructing both types of records, those cataloging digital versions directly and those delineating digital versions in cataloging  records for print.   These recommended guidelines focus most specifically on serials cataloging practices, conforming to those encompassed by CONSER guidelines.  The preparation of separate records for serials is also provided for. 

For specific practices related to the preparation of  separate records for monographs, catalogers are pointed to Cataloging Internet Resources : A Manual and Practical Guide, edited by Nancy Olson <http://www.purl.org/oclc/cataloging-internet>. 

Additional detail for specific CDL cataloging practices for other materials will need to be addressed in the future and added to the guidelines.  Some specific types are: special collections related to Finding Aids of the Online Archive of California (OAC); government documents; and maps and imagery.  This is not an exhaustive list.  The Task Force recommends that standards for the MARC core collection level record established by the OAC Metadata Standards Working Group be enfolded in cataloging guidelines for CDL resources.  Evolving standards for the cataloging of government documents and maps and images, developed by those with expertise in cataloging these materials, should be considered and added to the guidelines in the future. 

Aggregator Databases 

Aggregator databases are large databases that aggregate, or bring together, the full-text of journal articles. In most cases, the journals have print equivalents. The aggregators fall into two broad categories: those that aggregate publications of a single or small group of publishers and those that are subject-based and include articles from a vast array of publishers. 

Nearly all the CDL ejournals are publisher-based aggregators, such as Academic Press, American Mathematical Society, Wiley, etc. Most, but not all, retain the concept of "issues" so the user can locate a particular volume and date.  Illustrations,  tables and photographs are usually included.  We believe this type of publication is well-suited to the single-record approach. 

Subject-based aggregators are usually presented as a searchable database, such as the UMI full-text databases (e.g., ABI/Inform) or Lexis-Nexis. The concept of "issues" no longer exists. Since there is no chief source for the component journals, a bibliographic description cannot be prepared according to the traditional library model. Usually these databases contain full-text or selected articles, with illustrations, brief articles, and other features omitted. 

The Program for Cooperative Cataloging has formed the Task Group on Journals in Aggregator Databases. Its charge is to conduct a demonstration project using vendor-supplied records for electronic journals embedded in an aggregation. The goal is to create record sets that libraries could buy, similar to the Major Microform Project. The current plan is for the vendor to supply brief in-analytic records for the titles. The project will evaluate content, creation, maintenance, and record distribution. Naturally these titles will be cataloged separately from the print but data elements may be included (e.g., OCLC numbers of the print) so that individual libraries may merge them with their print records should they choose to do so. (For more information, see the task group’s web site: http://lcweb.loc.gov/catdir/pcc/aggregatortg.html.)

The Task Force recognizes the importance of this national effort and will continue to monitor its activities.  Nonetheless we found that the group is not far enough along to help us meet our needs at this time. 

Centralized Cataloging of CDL titles 

The Task Force recommends that a single site serve as the Centralized Cataloging Agency (CCA).  This agency could assume responsibility for cataloging the major types of CDL materials.  Specialized types of materials could be cataloged either by the CCA, or by another campus with particular focused expertise, as determined in the future. 

A number of advantages attend centralized cataloging: 

  • Ensuring representation of all CDL-licensed materials in the Melvyl Catalog in a  timely fashion 
  • Maintaining the currency of subscriber and coverage data
  • Assigning PURLs and maintaining the currency of web locations as  addresses change 
  • Eliminating redundancy of effort 
  • Leveraging the knowledge and expertise of a particular campus for the benefit of all
  • Ensuring uniform interpretation and application of guidelines and decision points 
  • Ensuring that the master CDL record has all the data elements needed for the CDL Directory 
  • Supplying cataloging to the campuses in a timely fashion 
  • Maintaining the currency of cataloging description 
  • Coordinating communication with the CDL and among the campus cataloging units 
  • Facilitating the centralized distribution and re-distribution of records

It would be most efficient to select an existing campus cataloging unit to serve as the CCA.  The Task Force suggests that the following criteria be used in selecting the CCA: 
 
  • Willingness and commitment to provide high quality cataloging that supports access to CDL resources in a timely manner; 
  • Administrative commitment to provide resources for CDL cataloging and maintenance  on a long-term basis; 
  • High level of expertise and experience cataloging electronic resources; 
  • Staffing levels sufficient to catalog and distribute CDL records in a timely  manner; 
  • Thorough knowledge of evolving national and international cataloging standards and practices; 
  • OCLC Enhance status, for the purpose of enhancing OCLC master records with PURLs; 
  • Commitment to maintain up-to-date URLs, holdings data, and subscription information; 
  • Willingness to engage in continual process improvement.
 
Relationship of the Bibliographic File to the CDL Directory 

The Task Force has been asked to address the "inclusion of data in the bibliographic record that will aid in identifying and retrieving electronic resources available within a particular discipline outside of the catalog and periodicals database, e.g., the CDL Directory". 

This component of the charge engendered the Task Force’s study of the relationship of descriptive data elements and controlled access points carried in MARC records, which are the foundation of library catalogs, with those data elements utilized by the CDL Directory.  MARC records are a rich source of data about all types of library resources including detailed descriptive information about the origin, nature and extent of a given resource, its relationship to other resources, classification and subject terms and consistent access points for responsible agencies and individuals.  Traditional library catalogs and Web lists or directories, such as the CDL Directory or local campus Web lists of resources, can and should be complementary.  The idea, already routinely applied at the University of California, San Diego, and at other institutions around the country, is to extend the functionality of the online catalog and derive other types of access from the data-rich MARC records.  Since the data elements included in MARC-based online catalogs and Web-based directories have a high degree of overlap, the Task Force recommends that the MARC records created for electronic resources be mined via an automated process to derive the data elements required for the CDL Directory. 

The Task Force has discussed the data needs of the CDL Directory with its Web Design Coordinator and has studied the documentation provided for those wishing to contribute records to the CDL Directory (available at http://www.cdlib.org/libstaff/system_services/projects/cdl_directory/).  The Task Force and the CDL Web Design Coordinator believe there is significant value in developing mechanisms for deriving multiple products from the same set of data elements.  All of the data elements currently included in the CDL Directory can be carried in the MARC record, in addition to the data elements required for the Melvyl Catalog.  The Task Force has developed a preliminary data map between MARC fields and CDL Directory database fields, which can be the foundation of an automated data exchange and extraction process (see Metadata Map Between the CDL Directory and MARC Records).
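The idea of deriving Directory data from MARC records by way of a data map can be sketched as follows. The map shown here is hypothetical and is not the Task Force's Metadata Map: the Directory field names on the left are invented for illustration, and a MARC record is assumed to be a simple list of (tag, subfields) pairs. The tags used (245 title, 022 ISSN, 856 electronic location, 650 subject) are standard MARC fields.

```python
# Hypothetical map: Directory field name -> (MARC tag, subfield code).
FIELD_MAP = {
    "title": ("245", "a"),
    "issn": ("022", "a"),
    "url": ("856", "u"),
    "subjects": ("650", "a"),
}


def derive_directory_entry(marc_fields):
    """Build a Directory-style entry from a list of (tag, subfields) pairs."""
    entry = {}
    for name, (tag, code) in FIELD_MAP.items():
        # Collect every occurrence, since fields such as 650 and 856 repeat.
        values = [subs[code] for t, subs in marc_fields if t == tag and code in subs]
        if values:
            entry[name] = values
    return entry
```

Because the map is data rather than code, a change to the Directory's structure would, under these assumptions, require only a change to the table.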

Currently, many CDL Directory records are created by copying data elements from records in the Melvyl Catalog or other campus OPAC and rekeying them into the records format (metadata scheme) devised for the Directory and by adding in some data elements that are unique to the Directory.  If the records in the MARC-based online catalog are kept up to date, products derived from the catalog would also be up to date.  Data would only need to be entered once and could be manipulated to produce a variety of products, whether an online catalog or a Web-based directory.  Ideally, the Web-based products would then become as dynamic as the online catalog and would not require time-consuming rekeying and reloading of data to generate Web displays.  Such a procedure could also be employed effectively for other projects, such as the Online Archive of California (OAC).  An automated data-exchange mechanism would not preclude the continued creation of records for either the Melvyl Catalog or the CDL Directory using only the data elements required for the specific type of access needed in those cases where only one type of access might be desired.  However, the Task Force recommends using the MARC record as the foundation when multiple types of access are desired because it is an international standard for communicating bibliographic data, provides for consistent forms of access for the same person or body (controlled headings), and can accommodate the descriptive data needs of the CDL Directory. 

There is work underway at the Library of Congress and other institutions to develop reliable data exchange software between different metadata schemes.  Since August 1998 the Network Development and MARC Standards Office has made freeware and beta versions of conversion utilities available for testing (see http://lcweb.loc.gov/marc/marcdtd/marcdtdbeta.html).  They have provided a detailed manual for the utilities that was most recently updated in February 1999.  The Task Force believes that this and other utilities can and should be investigated by the CDL so that the MARC records created for CDL resources can be utilized effectively in multiple end products, specifically the Melvyl Catalog and the CDL Directory. 

The Task Force’s specific recommendations are: 

1)  MARC records created for electronic resources licensed by the CDL and 
     the campuses include the unique data elements required for the Web-based 
     products and that this data be carried in consistently defined fields. 

2)  The MARC record be the primary metadata scheme employed when multiple 
     types of access, such as the Melvyl Catalog and the CDL Directory, are 
     desired for the same resource and that other metadata be derived from the 
     MARC records. 

3)  Future development efforts regarding the structure of the CDL Directory 
     include a representative from the Centralized Cataloging Agency so that 
     data elements can be appropriately defined and carried in all MARC records. 

4)  The CDL assign staff to investigate the employment of available utilities for 
     the exchange of data between MARC and other metadata schemes and 
     automate that process as soon as possible. 

Centralized URL Resolution 

The Task Force recommends that a centralized PURL (Persistent Uniform Resource Locator) server or another means of URL resolution be established in association with the centralized cataloging service for digital resources.  A centralized server will permit all campuses to carry a single URL in cataloging records, the same address referenced by the CDL Directory.  The address can be assigned one time by the CCA and distributed to the campuses in 856 fields of the MARC records.  It can also serve the CDL Directory.  A single address will preclude duplicative effort otherwise required by the campuses.  It will result in a single intelligible display in the MELVYL catalog. 

The Task Force recommends that the PURL server be hosted by the CDL as the agency with appropriate technical capability.  Use of the centrally located server for assignment and maintenance of CDL resource addresses would expand the application envisioned for the Online Archive of California (OAC), for which CDL Technologies is presently investigating the establishment of such a server. 

As a component of cataloging guidelines, the Task Force recommends initial assignment of PURLs in MARC records by the CCA to be conveyed by means of distributed MARC records, carried in OCLC records, and in each campus's local catalog and directory.  The CCA would assume overall responsibility for assigning PURLs and maintaining currency.  In addition, each campus  would be authorized for upkeep of PURLs so that corrections can be made as broken or inappropriate links are encountered.  The Task Force understands from the Director of CDL Technologies that there are no technical constraints that should prevent distributed maintenance. 
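The arrangement described in this section, in which a persistent address is assigned once and its target is corrected as locations change, can be sketched with a minimal in-memory model. This is an illustration of the concept only and does not represent the interface of any actual PURL server software; the class name, method names and example addresses are hypothetical.

```python
class PurlTable:
    """In-memory stand-in for a PURL server's redirect table."""

    def __init__(self):
        self._targets = {}

    def assign(self, purl, target):
        """Initial assignment, as the CCA would do when cataloging."""
        if purl in self._targets:
            raise ValueError(f"{purl} is already assigned")
        self._targets[purl] = target

    def update(self, purl, target):
        """Correct the target of an existing PURL when a link breaks."""
        if purl not in self._targets:
            raise KeyError(purl)
        self._targets[purl] = target

    def resolve(self, purl):
        """Return the current location for a persistent address."""
        return self._targets[purl]


table = PurlTable()
table.assign("http://purl.example.org/cdl/journal-a", "http://publisher.example.com/a")
# The publisher moves the journal; only the table changes, not the 856 fields.
table.update("http://purl.example.org/cdl/journal-a", "http://publisher.example.com/new/a")
```

The point of the sketch is that the address carried in catalog records and in the CDL Directory never changes; maintenance touches a single table entry, whether performed by the CCA or by an authorized campus.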

The Task Force believes a centralized service could present a model enabling measurement of the resources required for ongoing resolution of corrupted URLs, so that the CDL could gauge the impact of expanding that service in future to more fully serve the campuses. 

Record Production and Distribution 

In the course of its work, the Task Force considered several different methods of producing and distributing cataloging records. The following process, reflecting input by CDL-T and information gathered by the Task Force’s Survey on Record Distribution sent to the UC Heads of Cataloging, is recommended as the most direct, efficient and cost-effective method.

The Task Force recommends file distribution by means of FTP file transfer as the most straightforward and efficient means of transferring records to the CDL and the campuses. 

Processing and distribution steps would include: 
 

1) The CCA constructs or updates bibliographic records in the OCLC database as part of 
     its usual local cataloging process.  In the instance of serials cataloging, CONSER 
     records are enhanced.  In the instances of computer files and books, existing records 
     would be enhanced by the addition of PURLs. 

2) Cataloging records are exported from OCLC into the CCA's local system. 

3) CDL-specific data elements are added to records. 

4) CDL records are output from the CCA's local system to CDL-T as a unique file. 

5) CDL-T loads files into the Melvyl Catalog and the California Periodicals Database. 

6)  CDL exports records into similarly composed USMARC files, stripping local elements 
     and content designation associated with the campus's local system. 

7)  CDL-exported files are placed at an FTP site for three months for retrieval by the 
     campuses.
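Step 6 above, the stripping of local elements before redistribution, can be sketched as a simple filter. The sketch assumes a record is a list of (tag, subfields) pairs and, as a default, treats the 9XX tags as local, since MARC reserves that range for local use. Which fields a given local system actually adds is not specified here, so the rule is passed in as a parameter.

```python
def strip_local_fields(record, is_local=lambda tag: tag.startswith("9")):
    """Return a copy of the record without locally defined fields.

    `is_local` is a hypothetical rule; the default drops 9XX tags only.
    """
    return [(tag, subs) for tag, subs in record if not is_local(tag)]


record = [
    ("245", {"a": "Journal of Examples"}),
    ("856", {"u": "http://purl.example.org/cdl/1"}),
    ("949", {"a": "local processing note"}),
]
export_copy = strip_local_fields(record)
```

The function returns a new list rather than modifying the record in place, so the CCA's local copy keeps its local data while the exported copy does not.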

Detailed specifications on coding of CDL location data, processing routines and record distribution are outlined in the appendix entitled CDL Technical Documents.

The Task Force recognizes that the CDL and the campuses will continue to adhere to contractual obligations regarding OCLC participation and has forwarded related questions to UC Heads of Technical Services (HOTS) for consideration. 

The CCA would maintain regular communication with the CDL Acquisitions Unit to obtain and update information about licensed resources. The CCA would reflect in MARC records the updated information about resources supplied by the CDL and would convey that information to the campuses by means of revised records and a defined notification process.  The Task Force also recommends that a two-way communication mechanism be established between the CCA and campus cataloging units to enable timely sharing of information on CDL cataloging details.  The CCA could  use this means to announce progress on major packages, request feedback on cataloging questions and report on other operational issues that might arise.  Campuses will have the same need to query and communicate with the CCA.  We recommend that the communication mechanism be formalized and that campus cataloging contacts be designated.  Communications could be via a CDL cataloging listserv, web site, or some other means. 

Record Maintenance 

A principal responsibility of the CCA will be the maintenance of records for Internet resources. The types of updates the CCA will make include, but are not limited to: 
 

  • Keep the hotlinks current
  • Update coverage (extent) information as publishers add more issues
  • Update subscriber data as libraries select or deselect vendor packages
  • Update access data when passwords, IP access, or public keys are implemented or changed
  • Update system requirements, navigational paths, and other critical access information
  • Recatalog when titles change, print version is replaced by online version, etc.
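The first update type above, keeping hotlinks current, lends itself to a periodic automated check. The following minimal sketch is not the validation routine of any PURL software; it assumes a caller-supplied function, here called `get_status`, that returns an HTTP status code for a URL or `None` when the site cannot be reached.

```python
def find_broken_links(urls, get_status):
    """Return the URLs whose status suggests the hotlink needs investigation.

    `get_status` is any callable mapping a URL to an HTTP status code,
    or to None on connection failure (a hypothetical interface).
    """
    broken = []
    for url in urls:
        status = get_status(url)
        # Treat 2xx (success) and 3xx (redirect) as healthy.
        if status is None or not 200 <= status < 400:
            broken.append(url)
    return broken
```

Passing the status function in as a parameter keeps the sketch independent of any particular network library, and the resulting list is what staff would then investigate online or with the publisher.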
 
The methodology for keeping information current will include: 
 
  • Respond to queries/complaints from selectors, public services, and other campuses 
  • Input PURLs and run validation checks with the PURL software 
  • Investigate unsuccessful hotlinks online or with the publisher 
  • Communicate regularly with CDL Acquisitions 
  • Subscribe to CDLInfo, CDLAlert and publisher announcement lists 
  • Monitor "What's new" pages and publisher web sites 
  • Monitor CDL web site, especially Collections section and "What's new" 
  • Maintain a "tickler file" of titles for which change is anticipated (e.g., print title has changed; license signed, not yet available) 
  • Periodically look at the sites for changes in title, coverage, etc. 

Melvyl Record Processing and Display 

The Task Force recommends that CDL process records received from the CCA using standard Melvyl system normalization, indexing and loading programs.  CCA records for monographic databases and web sites would be loaded into the Melvyl Catalog and those for serials into the California Periodicals Database.  In working with our CDL-T liaison, we strove to identify a record loading process which minimizes the need for special processing and programming with the recognition that a major development investment in the current Melvyl system technology is not desirable at this time. Some programming will be needed, however, to establish a new CDL input stream, associated CDL location code and holdings segment, and to output FTP files to local campuses. 

The Task Force also identified four modest Melvyl system enhancements which we feel are important to create a clearer presentation of CDL resources and provide search capabilities equal to other materials in the union catalog.  These enhancements are described below. 
 

1. AT CDL Search Limit 
The Melvyl processing scheme presented in this report will result in a new CDL location in the Melvyl holdings display for all cataloged CDL titles. Each location currently defined in the Melvyl Catalog is associated with an AT limit which allows users to isolate these materials.  Once the CDL location is established in the Melvyl Catalog and the California Periodicals Database, it will be natural for users to desire and expect the ability to limit to CDL materials.  The desire to limit to remote electronic resources has been expressed in several forums since electronic fulltext materials began appearing in Melvyl, and was a desirable function identified by the CAT/PE Regeneration Task Force. 

While an AT CDL limit would be more limited than a form limit for all Internet materials, it is our judgment that it would provide UC users with a powerful means to isolate remote resources in Melvyl.  It is our understanding that creating another AT limit for a defined location does not require a large development expenditure and, if this is the case, we strongly recommend that an AT CDL limit be established. 

2. CDL/CCA Base Record  
The single record technique relies on the addition of fields and notes to the existing record to inform users about the availability of the electronic resource.  It is therefore critical that these additions display to users in the Melvyl Catalog.  Under the current Base Record algorithm, the record from the Library of Congress automatically becomes the Base Record for public display.  URLs from other sources are brought out for display but other note fields (such as the 530) are not. 

The Library of Congress record was given automatic Base Record status originally as a method of enhancing the bibliographic description for serials, many of which were brief or incomplete in the early days of the CALLS database.  Today most of the serials cataloging copy used in the  OCLC database originates at CONSER and the quality of most records is much higher.  Cataloging produced by the CCA will be based on CONSER standards, enhanced by the CCA/CDL additions for electronic access.  In addition, CCA/CDL records will be subject to ongoing updating to reflect significant changes in UC access and holdings.  We feel that UC users would be best served if the CCA/CDL record were weighted to become the Base Record and that no compromise or loss in bibliographic detail would result from this decision. 
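The weighting recommended above can be illustrated with a short sketch. It does not describe the actual Melvyl Base Record algorithm, whose details are not given here; the source labels and weight values are hypothetical and serve only to show how ranking the CCA/CDL record above the Library of Congress record would change the selection.

```python
# Hypothetical weights; any source not listed defaults to 1.
SOURCE_WEIGHT = {"CCA": 3, "LC": 2}


def choose_base_record(records):
    """Pick the record whose source carries the highest weight.

    On a tie, max() keeps the first such record in the list.
    """
    return max(records, key=lambda r: SOURCE_WEIGHT.get(r["source"], 1))
```

Under these assumed weights, a cluster containing both a Library of Congress record and a CCA/CDL record would display the CCA/CDL record, with its 530 note and other additions, as the Base Record.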

3. Indexing 776$x  
The use of the single record approach raises the question of where to put the ISSN for the electronic version when one exists.  The 022 field is repeatable and some libraries have put the ISSN for the electronic title in a second 022.  The Library of Congress prefers to use the 776$x (Other Physical Form) field for the second ISSN even when using a single record and recommends that other libraries do the same.  We have chosen to follow the Library of Congress' approach in the CDL Cataloging Guidelines in the interest of adopting as standard an overall approach as possible.  The 776$x is currently not indexed in the Melvyl Catalog.  This means that ISSNs in this field will not link through the hooks-to-holdings link with Melvyl databases.  We therefore recommend that the 776$x be added to the ISSN search index in the Melvyl Catalog.  This will present users with notification of holdings in all formats, including electronic, when they are searching in linked databases.  We note, however, that the information on electronic holdings will not be complete because the A&I link displays the location/holdings segment only and this will not contain the URL. Users will still need to access the bibliographic record to see the display of actual electronic locations (URLs).  This is not a perfect solution but would at least notify users that they may be able to find the text of the article through the CDL.  An additional note could be supplied such as "See Electronic Location in bibliographic record in the California Periodicals Database" if that is thought useful. 
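The effect of adding the 776$x to the ISSN search index can be sketched as follows. The sketch assumes a catalog held as a dictionary of record identifiers, each mapped to a list of (tag, subfields) pairs; this structure, the function name and the sample ISSNs are placeholders for illustration and do not reflect the Melvyl indexing programs.

```python
from collections import defaultdict

# Fields contributing to the ISSN index: 022$a today, plus the recommended 776$x.
ISSN_SOURCES = (("022", "a"), ("776", "x"))


def build_issn_index(records, sources=ISSN_SOURCES):
    """Map each ISSN to the set of record ids in which it appears."""
    index = defaultdict(set)
    for rec_id, fields in records.items():
        for tag, subs in fields:
            for src_tag, code in sources:
                if tag == src_tag and code in subs:
                    index[subs[code]].add(rec_id)
    return dict(index)


catalog = {
    "rec1": [("022", {"a": "1234-5679"}), ("776", {"x": "2345-6789"})],
    "rec2": [("022", {"a": "1111-2222"})],
}
index = build_issn_index(catalog)
```

With only the 022 indexed, a search on the electronic ISSN (the 776$x value in `rec1`) would find nothing; with both sources indexed, the same search leads to the single record carrying print and electronic holdings.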

4. 856 Display  
Although the charge to the Task Force does not include recommending changes to the display of URLs in the 856 field, the need to improve the display was raised as a critical issue many times both among Task Force members and through comments from our campus colleagues.  We did not develop a proposal for changing URL displays, but trust that a proposal from CDL will be forthcoming soon.  In order for the Melvyl Catalog to serve as an effective gateway to electronic resources, the electronic locations (URLs) must be presented to users in a clear, well-organized display.  The current display frustrates rather than facilitates access by presenting URLs in an undifferentiated string of characters.  Catalog users cannot determine which URL to select for access from their campus and must click at random to access Internet resources.  Campus librarians cannot see from the public display which URL their library has submitted.  We wish to emphasize the importance of a clear URL display to the overall success of adding CDL materials to the Melvyl catalog.  Much of the value of our cataloging effort will be lost without a clearer public display of URLs. 

  
Cost Models 

The Task Force has identified cost factors associated with centralized cataloging and record distribution.  Cost estimates are currently being assigned to the cost factors and compensation models developed.  The cost factors and cost estimates (work in progress) are presented in spreadsheet form under Cost Factors and Estimates.

The Task Force recommends adoption of the centralized cataloging agency (CCA) model as a cost-effective method of providing high quality cataloging that supports timely access to CDL Tier 1 and Tier 2 titles. Adoption of the CCA model would in large part eliminate the redundancy of having each campus separately catalog CDL resources for inclusion in campus online catalogs and the Melvyl Catalog.   The ongoing costs associated with the higher level intellectual activity needed to catalog CDL resources and maintain active links to those resources (primarily library assistant III to librarian) would be a shared cost rather than separately absorbed by each campus. Ongoing costs at the campus level would primarily be assigned to record receipt and processing at a lower cost.  The start-up costs and processing costs will vary from campus to campus depending on local systems requirements. 

The CCA would relieve campuses of the need to commit significant levels of higher level cataloging staff to process and maintain CDL Tier 1 and Tier 2 titles.  Centralized cataloging of CDL resources would provide campuses with much needed flexibility in allocating limited staff resources to other technical processing priorities.

