SPATIAL DATA MEETS DUBLIN CORE:

THE ALEXANDRIA DIGITAL LIBRARY

by Mary Lynette Larsgaard

Assistant Head, Map and Imagery Laboratory, Davidson Library/Investigator, Alexandria Digital Library

University of California, Santa Barbara

December 1996

Note: This is a merge of two presentations, one at the UKOLN/OCLC Warwick Conference, University of Warwick, England, April 1-3, 1996, and one at the CNI/OCLC Conference, OCLC Inc., Dublin OH, September 23-25, 1996.

Introduction

The Alexandria Digital Library (ADL) is one of six Digital Library Initiative (DLI) projects funded by the U.S. National Science Foundation (NSF), the Advanced Research Projects Agency (ARPA), and the National Aeronautics and Space Administration (NASA). Fairly quickly, ADL discovered that users of the ADL Catalog needed a default set of fields to be displayed for each bibliographic record satisfying a query; for most users, displaying the full set of fields is both unnecessary and confusing. At a metadata workshop held in Santa Barbara last November, ADL members had an opportunity to learn about the Dublin Core. While the Dublin Core was originally intended for resource description, or metadata records, for networked electronic information objects, ADL has extended the Core to metadata for all forms of spatial data, to use as a brief-record set of elements displayed to general users. In addition, ADL has discovered that many first-time and one-time metadata creators need a brief-record workform, with only the basic fields; ADL staff therefore created the Portable Ingest File, which is available over the Web in the Public Documents/Metadata portion of the ADL homepage, at:

http://alexandria.sdc.ucsb.edu

ADL made extensive use of the Content Standards for Digital Geospatial Metadata, a 1994 publication by the U.S. Federal Geographic Data Committee (Reston VA); see the following Website to obtain a copy of this document, which includes a description of fields:

http://www.fgdc.gov

For the relationship between USMARC and Dublin Core fields, see the excellent document by Rebecca Guenther (Library of Congress), "Mapping the Dublin Core Metadata Elements to USMARC" (Discussion Paper 86) at:

gopher://marvel.loc.gov:70/00.listarch/usmarc/dp86.doc

(or you may go in through: http://lcweb.loc.gov/marc/)

Dublin Core and ADL Equivalencies

In a spatial-data query situation, the potential user either:

a. knows what kind of information is needed but does not know the citations for the specific items (this is by far the more common); or

b. has a specific item in mind, or works by a specific author.

The second case has been well dealt with for many years, both within the traditional library community and within the bibliography community, and will not be discussed here. In the first case, the information the map librarian needs in order to make sure the user gets the item(s) the user needs is as follows, and generally in this order:

  1. geographic location: the vast majority of the time, this is by place name; microscopically few users know the latitude and longitude of the area in mind;
  2. themes or subjects: for example, geology; railroads; vegetation; etc.;
  3. date of data: while frequently the response to this is, "the most current you have," often users need historic, repetitive data, or data for a very specific date or dates;
  4. level of detail/scale/resolution: that is, does the user need to see a relatively small area in considerable detail, a relatively large area with little detail, etc.;
  5. format: hardcopy or digital; if hardcopy, what size (e.g., 8 ½" x 11", to be put into a report; 4' x 5', to display on a wall during a class presentation); if digital, what file type (e.g., raster or vector; Arc/Info coverages, .lan - ERDAS - files; .tiff; etc.).
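
The five criteria above can be read as successive filters on a catalog. The following is a minimal sketch in Python, not ADL's implementation: the class names, the tiny gazetteer, the sample items, and the approximate bounding box for Santa Barbara are all invented for illustration. It assumes that a place name is first translated into a bounding box, and that level of detail is expressed as a map-scale denominator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Footprint:
    """Bounding box in decimal degrees (hypothetical helper type)."""
    west: float
    east: float
    south: float
    north: float

    def overlaps(self, other: "Footprint") -> bool:
        # Two boxes overlap unless one lies wholly to one side of the other.
        # Simplification: boxes crossing the 180-degree meridian are not handled.
        return not (
            self.east < other.west or other.east < self.west
            or self.north < other.south or other.north < self.south
        )


@dataclass
class Item:
    title: str
    footprint: Footprint
    themes: set
    year: int
    scale_denominator: int  # e.g. 24000 for a 1:24,000 map
    fmt: str                # "hardcopy" or "digital"


@dataclass
class Query:
    area: Footprint                              # 1. geographic location
    theme: Optional[str] = None                  # 2. theme or subject
    years: Optional[tuple] = None                # 3. date of data, (first, last)
    max_scale_denominator: Optional[int] = None  # 4. level of detail
    fmt: Optional[str] = None                    # 5. format

    def matches(self, item: Item) -> bool:
        # Apply the criteria in the order listed; unset criteria are skipped.
        if not self.area.overlaps(item.footprint):
            return False
        if self.theme is not None and self.theme not in item.themes:
            return False
        if self.years is not None and not (self.years[0] <= item.year <= self.years[1]):
            return False
        if (self.max_scale_denominator is not None
                and item.scale_denominator > self.max_scale_denominator):
            return False
        if self.fmt is not None and item.fmt != self.fmt:
            return False
        return True


# Invented gazetteer entry: a rough box around Santa Barbara, for illustration only.
GAZETTEER = {"Santa Barbara": Footprint(-119.9, -119.6, 34.3, 34.5)}

# Invented sample records.
CATALOG = [
    Item("Sample geologic map", Footprint(-120.0, -119.5, 34.25, 34.6),
         {"geology"}, 1986, 24000, "hardcopy"),
    Item("Sample railroad map", Footprint(-120.0, -119.5, 34.25, 34.6),
         {"railroads"}, 1910, 62500, "hardcopy"),
    Item("Sample statewide geology image", Footprint(-124.5, -114.0, 32.5, 42.0),
         {"geology"}, 1994, 750000, "digital"),
]


def search(query: Query, catalog: list) -> list:
    """Return the titles of all items satisfying the query."""
    return [item.title for item in catalog if query.matches(item)]


query = Query(area=GAZETTEER["Santa Barbara"], theme="geology",
              max_scale_denominator=100000)
print(search(query, CATALOG))
```

Here the place name is resolved to coordinates before the search begins, which reflects the observation that users think in place names while the records carry bounding coordinates.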

The following list of fields gives Dublin Core field names, ADL field names, and - in parentheses - USMARC field numbers; for USMARC, a dollar sign is used as a subfield indicator. Since ADL needs to provide access to hard-copy items, it has been necessary to add some fields; for these, there is no Dublin Core field-name equivalent.

Dublin Core Field Name    Alexandria Digital Library Field Name

Author                    Originator (720)
                          Main entry, personal name (100$a); main entry, dates associated with name (100$d)
                          Main entry, corporate name (110$a); main entry, subordinate unit (110$b)

Title                     Title (245$a)

                          Place of publication (260$a)

Publisher                 Publisher (260$b)

Date                      Publication date (260$c)

                          TECHNICAL INFORMATION

                          For all formats:
                            Access constraints (506$a)
                            Use constraints (540$a)

                          For digital materials:
                            Native dataset environment (538)
                            Direct spatial reference method (352$a)
                            Raster object type (352$b)
                            Processing and enhancements (590$u)*
                            Horizontal positional accuracy value (514$g)
                            Map-projection name (342$a)

                          For hardcopy maps:
                            Constant ratio, linear horizontal scale (034$b)
                            Map-projection name (255$b)

                          For remote-sensing images:
                            Sensor (590$o)*
                            Percent cloud cover (514$m)

                          For satellite images:
                            Path (590$l)*
                            Row (590$m)*
                            Spectral bands (590$w)

Form                      Extent (300$a)
                          Other physical details (300$b)
                          Dimensions (300$c)
                          Accompanying material (300$e)

ObjectType                Genre keyword (650$v)
                          Genre keyword (651$v)

Relation                  Series name (440$a); issue identifier (440$v)
                          Multilevel descriptor identifier (no USMARC equivalent)
                          Control number of parent record (772$w)
                          Control number of host record (773$w)
                          Edition (250)
                          Control number of record of other edition (775$w)
                          Control number of record of other physical form (776$w)
                          Control number of record with nonspecific relationship (787$w)

Coverage                  Beginning date (045$b)
                          Ending date (045$b)
                          West-bounding coordinate (034$d)
                          East-bounding coordinate (034$e)
                          North-bounding coordinate (034$f)
                          South-bounding coordinate (034$g)
                          Place keyword (651$a; 650$z)
                          Geographic name (651$a; 650$z); geographic subdivision (651$z; 650$z)

Language                  Primary language (coded; 008/35-37)
                          Multiple languages (coded; 041)
                          Language note (in natural language; 500)

Subject                   Theme keyword (650$x)
                          Stratum keyword (650$x)
                          Temporal keyword (650$y)

Other agent               Personal name, secondary entry (700$a); dates associated with name (700$d)
                          Corporate name, secondary entry (710$a); subordinate unit (710$b)

Identifier                Contact organization (270$q); address (270$a); city (270$b); state or province (270$c); postal code (270$d); electronic-mail address (270$m); fax telephone number (270$l); voice-telephone number (270$k)
                          Local call number (099)
                          URL (856$z)
                          Control number (001)

Source                    This is a composite field, named Source Information. It includes all the fields in tables s_cit_info (source citation information), s_src_info (source information), and s_src_time_period_cont (time period covered by source) (786, 787)

*Field number assigned by ADL; not in standard USMARC format.
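
One way to put such a crosswalk to work is to hold it as a lookup table and use it to cut a full record down to a brief display. The sketch below is illustrative only, not ADL's software: it copies just a few rows of the list above, and the function names, the idea of keying a record by strings such as "245$a", and the sample record values are assumptions made for the example.

```python
import re

# Excerpt of the list above: Dublin Core element -> (ADL field name, USMARC reference).
# Only a handful of rows are copied here; this is not the complete crosswalk.
CROSSWALK = {
    "Author": [("Originator", "720"),
               ("Main entry, personal name", "100$a"),
               ("Main entry, corporate name", "110$a")],
    "Title": [("Title", "245$a")],
    "Publisher": [("Publisher", "260$b")],
    "Date": [("Publication date", "260$c")],
    "Coverage": [("West-bounding coordinate", "034$d"),
                 ("East-bounding coordinate", "034$e"),
                 ("North-bounding coordinate", "034$f"),
                 ("South-bounding coordinate", "034$g")],
}

# Three-digit tag, optionally followed by "$" and a one-character subfield code.
# Character-position references such as "008/35-37" are deliberately not covered.
MARC_REF = re.compile(r"^(\d{3})(?:\$([a-z0-9]))?$")


def parse_marc_ref(ref):
    """Split '260$c' into ('260', 'c'); a bare tag such as '720' gives ('720', None)."""
    match = MARC_REF.match(ref)
    if match is None:
        raise ValueError(f"not a tag or tag$subfield reference: {ref!r}")
    return match.group(1), match.group(2)


def brief_record(full_record):
    """Group the values of a full record under their Dublin Core elements.

    `full_record` is assumed to be a dict keyed by USMARC reference strings.
    Fields with no Dublin Core equivalent in CROSSWALK are left out.
    """
    brief = {}
    for element, fields in CROSSWALK.items():
        values = [full_record[ref] for _name, ref in fields if ref in full_record]
        if values:
            brief[element] = values
    return brief


# Invented sample record; the values are placeholders, not real catalog data.
sample = {
    "245$a": "Sample map of Santa Barbara",
    "260$b": "Sample Publisher",
    "260$c": "1996",
    "034$d": "W1200000",
    "506$a": "None",
}

print(parse_marc_ref("260$c"))
print(brief_record(sample))
```

In this sketch the access-constraints value (506$a) drops out of the brief view because that field has no Dublin Core equivalent in the excerpt; the full record would still carry it for metadata creators.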

Conclusion

A perennial problem with catalogs - first with hard-copy and now with online catalogs - is that they attempt to serve two very different groups of users. The first are the general users, who are seeking information to fill a need and are interested in metadata only to the extent that it enables them to get that information as speedily and with as little work as possible. The second are the metadata creators, who need sufficiently detailed metadata to be positive that an item to be cataloged really is a unique item that requires a unique metadata record. Using the Dublin Core set of fields for the first group is extremely useful, in that it enables users to make a yes-no decision from this small set of fields alone. Such a set of fields is desperately needed for cataloging the Web, and we hope soon to see it commonly implemented in the description of these resources.
