MULTILEVEL DESCRIPTION, MULTILEVEL INHERITANCE, RELATIONS/LINKS: CONTENT AND CARRIER

by

Mary Lynette Larsgaard

Alexandria Digital Library/Davidson Library

University of California, Santa Barbara

Version 0.6

September 22, 1996


Introduction

"The major thrust of cataloging should therefore be in building conceptual models of works and their agents, by which people usually try to find such works .... in principle, MARC is independent of AACR2 and of any other set of cataloging rules."

Heaney, Michael. 1995. Object-oriented cataloging. Information technology and libraries (9/95):15-53.

"0.24. It is a cardinal principle of the use of part I that the description of a physical item should be based in the first instance on the chapter dealing with the class of materials to which that item belongs. .... In short, the starting point for description is the physical form of the item in hand, not the original or any previous form in which the work has been published."

Anglo-American cataloguing rules. 1988. 2d ed. rev. Chicago: American Library Association.

These two quotations form the basis of the following paper, which agrees completely with the first and disagrees equally with the second, taking instead the stance that intellectual content is of primary interest to the user and physical form important but secondary.

The Relationships

Call them what you will - cartographic materials or spatial data - about 80 percent of the time these items have some sort of link or relationship with others of their ilk. What AACR1 and USMARC allowed for showing the nature of these links was certainly thorough but unfortunately very time-consuming. Then, with AACR2, effective January 1, 1981, multilevel description for parent-child relationships became an option, which by implication allowed other relationships to be expressed within the bibliographic record.

Unfortunately, USMARC - not only the best but the most heavily used machine-readable format for bibliographic description in U.S. libraries - did not. As has been generally true in the last fifty years for U.S. libraries, AACR proposes, the Library of Congress (LC) - and by extension USMARC - disposes. That is, when it comes to a matter of the current edition of AACR stating a rule, and LC stating that it will follow a different practice, many libraries unswervingly follow LC, for many sheerly practical reasons having to do with the economics of accepting existing LC, and LC-following libraries', copy as is. USMARC is intended for use as a communications format for bibliographic description, but not for any one given set of cataloging rules, although for historical reasons, AACR in its various editions is very fully provided for.

USMARC did not provide for linking relationships as specified in AACR2 until relatively recently. There are many linking relationships possible, and they are often in reciprocal, paired fields. This paper does not discuss the links downward, from what might generally be called parent to what might generally be called child, but only the links from child to parent. The paper is an outcome of the cataloging done for the Alexandria Digital Library. During the initial stages of that work, a database computer engineer asked me how often users would need to know the names, etc., of all sheets belonging to a series (e.g., all 57,000 sheets in the U.S. Geological Survey 1:24,000-scale series). On hearing my response, "Seldom," the engineer stated that the reciprocal fields were not needed: for the few times such information was requested, software could search the linking-field $w's in all child records (e.g., 772$w, the control number of the map-series parent) for those containing the given control number.
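The engineer's suggestion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: records are modeled as plain dictionaries keyed by USMARC tag (not any real MARC library), the function name children_of is hypothetical, and the control numbers and titles are invented.

```python
# Hypothetical, simplified records: each is a dict keyed by USMARC tag.
# "001" holds the record's own control number; a linking field such as
# "772" holds a dict of subfields, of which only $w (the control number
# of the record pointed to) is filled in.
records = [
    {"001": "series-24k", "245": "Topographic series 1:24,000"},
    {"001": "sb-1952", "245": "Santa Barbara quadrangle, 1952",
     "772": {"w": "series-24k"}},
    {"001": "sb-1967", "245": "Santa Barbara quadrangle, 1967",
     "772": {"w": "series-24k"}},
    {"001": "atlas-1", "245": "An unrelated atlas"},
]


def children_of(parent_control_number, records, link_tag="772"):
    """Return control numbers of records whose link_tag $w names the parent."""
    found = []
    for record in records:
        link = record.get(link_tag)
        if link is not None and link.get("w") == parent_control_number:
            found.append(record["001"])
    return found


print(children_of("series-24k", records))  # ['sb-1952', 'sb-1967']
```

No reciprocal parent-to-child field is stored; the list of sheets is computed only on the rare occasions when someone asks for it. A production system would presumably index $w rather than scan every record.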

The fields upon which this paper is focused are as follows:

772 link to parent from child

773 link to host from part

775 link to prior edition from newer edition

[Query: does one link always to the first edition but not between different editions? Or are all editions linked to each other? And are multiple links possible - that is, in the case of a new edition of a U.S. Geological Survey topographic quadrangle, from one edition to the earlier editions, and also to the parent record? Or are links in chains - that is, newest edition to earlier editions, and from earliest edition to parent?]

776 link to "original" physical form from other forms

[Query: same query as for 775]

787 nonspecific relationship

These fields are of especial interest to catalogers who want to give users the best possible access but have no interest in the reinputting of information which classic methods of cataloging require. We thus have the interesting situation of USMARC making possible cataloging practices which, except for parent-child, are not specifically mentioned in AACR (latest edition).

Following is a table giving the AACR2R method of making clear a relationship, with the USMARC method(s) in the second column.

I. Parent/child

Unfortunately, USMARC lumps some very different relationships together under 772, including supplement and whole/part. For the purposes of this discussion, the focus is on the parent/child relationship, which is an excellent expression of the most frequent of relationships within the spatial-data world, that of map series (series as a whole/individual sheet) and of remote-sensing images (e.g., air-photo flights; satellite-image missions).

AACR2R                                     USMARC

I. Parent/child
     i. series                             4xx, 8xx
     ii. multilevel description            772

II. Host/part
     i. "in" analytics                     [fill in these fields]
     ii. multilevel description            773

III. Other edition
     i. Make full record.                  250
     ii. multilevel description            775

IV. Other physical form
     i. Make full record.                  533 (details of reproduction); 534 (details of original)
     ii. multilevel description            776

V. Other unspecified relationship/complex relationship
     i. Make full record.                  [e.g.?] 580
     ii. multilevel description            787

Several questions immediately come to mind here about linkages:

a. are there some situations - such as large-scale topographic series - where one would use 772, and others - such as monographic series, where the child is much more loosely tied to the parent, with far fewer fields in common from child to child, in fact only the author/title of the parent - where using 4xx/8xx is the best procedure?

b. how many linkages does one make in such situations as map series, where many different editions of a single sheet exist? are the links from more current editions back to the first edition, but not to the parent record? are links to parent record and to previous editions? and so on. (1)

c. similarly, when a map sheet is scanned into digital form, is the link from the record for the scanned item to the record of the hard-copy original? is there also a link to the parent record for the map series?

As was briefly indicated above, the parent/child relationship is especially important for spatial data, since the vast majority of all materials participate in that sort of relationship; for example, such items as:

a. map series; e.g., U.S. Geological Survey 1:24,000-scale topographic sheets; a total of about 57,000 sheets;

b. air-photo flights; aerial photographs most frequently are part of flights, with ca. 200-300 frames per roll;

c. satellite images; e.g., the first Landsat satellite was launched in 1972, so by now there are literally millions of images;

and most recently:

d. geographic information systems: 2 different types of relationships:

i. tiles (analogous to sheets in a map series)

ii. layers of thematic information that cover the entire area; each thematic layer will cover (or should cover) the entire area, rather as in a national atlas - each sheet covers the same geographic area and shows a different subject. There is rarely much common metadata between coverages, since each generally has different lineage. It would also be difficult to devise a concept that survives from one GIS software to another. It will be interesting to see how this system works when one applies it to digital data, and specifically to vector data.

Approaches

What is needed in USMARC is a multilevel-description field that expresses what role or roles any given record has. There are several ways to do this:

1. the national map collection of Canada, which is part of the National Archives of Canada, uses (in GEAC) the following scheme:

001 parent record

002 subrecord

003 both parent and subrecord

null monographic item

(Parker, Velma. 1990. Multilevel cataloging/description for cartographic materials. Western Association of Map Libraries information bulletin 21:86-96.)

2. the Alexandria Digital Library currently uses:

root; node; leaf

A change that is being considered is to use numbers:

- either:   null   monographic item
            0      parent
            1-n    child

- or:       0      not compound
            1      parent
            2      child at second level
            3      child at third level
            (and so on)
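The second numbering scheme need not be keyed by hand; it could be derived from the child-to-parent links themselves. The following Python sketch rests on several assumptions: parent_of is a hypothetical dictionary mapping a child's control number to its parent's (as taken from, e.g., 772$w), the control numbers are invented, and role_code is an illustrative name rather than anything in use at the Alexandria Digital Library.

```python
def role_code(control_number, parent_of):
    """Return 0 for a non-compound record, 1 for a top-level parent,
    2 for a child at the second level, 3 for the third level, and so on."""
    has_parent = control_number in parent_of
    has_children = control_number in parent_of.values()
    if not has_parent and not has_children:
        return 0  # monographic item: takes part in no parent/child link

    level = 1
    seen = {control_number}
    current = control_number
    while current in parent_of:  # climb one link at a time toward the top
        current = parent_of[current]
        if current in seen:
            raise ValueError("circular parent link at " + current)
        seen.add(current)
        level += 1
    return level


# Hypothetical links: sheet -> 1:24,000 series -> all topographic series.
parent_of = {"series-24k": "usgs-topo", "sb-1952": "series-24k"}

print(role_code("usgs-topo", parent_of))   # 1
print(role_code("series-24k", parent_of))  # 2
print(role_code("sb-1952", parent_of))     # 3
print(role_code("atlas-1", parent_of))     # 0
```

Under this scheme a mid-level record such as the 1:24,000 series is coded by its depth alone (2), even though it is itself a parent; the Canadian scheme above instead gives such records a separate "both parent and subrecord" value.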

Currently a document dealing with this matter generally is up for review:

Functional requirements for bibliographic records, draft report for world-wide review. May 1996. Frankfurt am Main : Deutsche Bibliothek [for] IFLA Universal Bibliographic Control and International MARC Programme.

This report - which is recommended by the IFLA Study Group on the Functional Requirements for Bibliographic Records - has as its base the idea that there are three tiers involved:

1. the work;

2. the manifestation; and

3. the copy.

AACR in its various editions has "developed in a world where a document was relatively stable over time ..." (Dorman, David. June 7, 1996. Re: edition statements. Saratoga Springs, NY: Skidmore College. email message ID 31B88E99.430B@skidmore.edu). The IFLA report attempts to deal with a world where that is no longer true for many "documents."

Host/part is perhaps the least common of these relationships in the spatial-data arena, but it is nonetheless much needed. For many years, map librarians have needed quick access to maps bound in other publications, e.g.:

a. individual plates in atlases [cite Nancy Edstrom etc. on tables of contents for atlases]

b. maps in periodicals: The American Geographical Society's index to maps in periodicals is certainly extremely useful.

The other-editions situation very frequently occurs in topographic series, and to a far lesser extent with some monographic maps, such as the geologic map of Africa issued in different editions by Unesco, or the maps of states at 1:500,000 issued by the U.S. Geological Survey.[get citations]

There has already been considerable discussion about cataloging an item that appears in several different physical forms. What is needed here is a briefer way to catalog; I would suggest using the 776 fields to give us what Barbara Tillett calls a machine version of the old printed-card dashed-on entry. [cite Barbara's Dublin-core-meeting paper]

What happened in these dashed-on entries was this. Let us say, for example, that we have a topographic map series of [use one of those AMS map series records], with an index. The catalog record would look like this:

United States. Army Map Service.

...

--- Index. ...

The "dashed-on" part is an analog of the bibliography in which one has one author with more than one publication listed; in this case, the author's name is shown in all entries except the first as a brief underlining. What makes this such an important option for catalogers is that one may have one intellectual entity in several different forms - e.g., monographic book; CD; VCR tape. It is not useful to library users to offer them lengthy sets of citations for what is the same intellectual content on different carriers. This is especially important given that most users are not interested in looking at any more than about 35 citations. The problem here is that AACR2R states in 0.24 that an item is to be cataloged first of all by working with the chapter dealing with the carrier, before intellectual content is considered. This is not in accord with user requirements. It is extremely rare for a user to say, "I need a CD and I don't care what's on it." Certainly carrier is important to the user, but it is secondary. Content comes first.
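To make the content-before-carrier point concrete, here is a rough Python sketch of how a result list might be collapsed so that the several carriers of one intellectual entity appear as a single citation. It assumes, purely for illustration, that each other-form record carries a 776$w pointing directly at the record for the "original" form, that records are plain dictionaries, and that the control numbers and titles shown are invented; group_by_content is a hypothetical name.

```python
def group_by_content(results):
    """Group control numbers under the record their 776 $w points to.

    A record with no 776 link is treated as its own group, so the
    "original" form and its other physical forms land together.
    """
    groups = {}
    for record in results:
        link = record.get("776")
        key = link["w"] if link else record["001"]
        groups.setdefault(key, []).append(record["001"])
    return groups


results = [
    {"001": "sb-1952", "245": "Santa Barbara quadrangle, 1952"},
    {"001": "sb-1952-fiche", "245": "Santa Barbara quadrangle, 1952",
     "776": {"w": "sb-1952"}},
    {"001": "sb-1952-scan", "245": "Santa Barbara quadrangle, 1952",
     "776": {"w": "sb-1952"}},
    {"001": "goleta-1950", "245": "Goleta quadrangle, 1950"},
]

groups = group_by_content(results)
print(len(results), "records collapse to", len(groups), "citations")
```

The user would then see one entry per intellectual entity, with the carriers listed beneath it in the manner of the dashed-on entry.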

Other physical form is increasingly important for spatial data with the continually growing presence of spatial data in digital form. There are a couple of different approaches that one might take here:

a. describe a number of different physical formats in the same record;

b. describe the different physical formats locally, e.g., in NOTIS holdings (HLD) screens;

c. describe each separately and fully;

d. describe each separately but by using linking fields do not repeat any information in a child that appears in a parent.

In the past, microform was the chief situation of this type, with some 35mm-slide, facsimile and photocopy occurrences. More and more users say to the map librarian, "I need x spatial data in digital form." The most recent major event in this area in the United States is DRGs, Digital Raster Graphics; these are scanned versions of several U.S. map series: the 1:24,000-scale, 7.5' topographic quadrangles, the 1:100,000-scale sheets and the 1:250,000-scale sheets. And how are we to deal with the matter of one digital file that is available in different file formats, e.g., GIF and JPEG?

Other unspecified relationship [so far, I haven't come up with one of these for cartographic material - suggestions?]

Getting A Bit More Specific

In Appendix A are given several different kinds of examples:

a. map from a volume;

b. different editions of one USGS 1:24,000-scale sheet;

c. the same spatial-data item cataloged first according to AACR2R and second using the alternate method that USMARC makes possible.

All are USMARC-tagged; as a part of my death-to-all-coded-fields initiative, USMARC tags 1xx-8xx are emphasized. [Possibility for future: Scans of portions of some of the maps, to illustrate whence the "bibliographic" information came.]

The USMARC method makes for observably shorter records. Note especially that in the USMARC-method records only the $w (control number of the record being pointed to) is used in the linking field; putting in all the other subfields is extra, repetitious work for the cataloger.

It is essential to keep in mind that these USMARC-method records are ONLY what the cataloger inputs. What the general user sees is another matter altogether, and a difficult question it is, to figure out how we are going to display this complex situation to the user in a clear, unambiguous way. The three obvious options are:

a. parent

- child

(this is like the old dashed-on method)

b. child

- parent

c. child record that is an agglomeration of parent and child fields, and that looks almost exactly like the record of the AACR2R version. See example __ in Appendix A of a record for a sheet of the 1:24,000-scale USGS map series.

[see appendix for examples]
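Option c can be sketched in Python as follows. The sketch assumes records modeled as plain dictionaries keyed by USMARC tag, invented control numbers and field values, and one particular precedence rule (the child's own fields always win, and parent fields only fill gaps); meld is a hypothetical name, and a real system would also have to cope with repeatable fields.

```python
def meld(child, parent, link_tag="772"):
    """Build a display record from a child and its parent.

    Parent fields fill in whatever the child lacks; a field present in
    the child is never overwritten. The linking field itself is dropped,
    since it is meant for the software rather than for the user.
    """
    display = dict(parent)       # start with everything the parent says
    display.update(child)        # child values win wherever both have a tag
    display.pop(link_tag, None)  # hide the machine-only link
    return display


parent = {
    "001": "series-24k",
    "110": "Geological Survey (U.S.)",
    "245": "Topographic series 1:24,000",
    "255": "Scale 1:24,000",
}
child = {
    "001": "sb-1952",
    "245": "Santa Barbara quadrangle",
    "260": "1952",
    "772": {"w": "series-24k"},
}

for tag, value in sorted(meld(child, parent).items()):
    print(tag, value)
```

The cataloger inputs only the short child record; the agglomerated record exists only at display time.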

Parkinson's Law and the general cussedness of life make one suspect that the favorite method of users will be whatever takes computer engineers longest to code. Another look is enough to convince us, or at least strongly to persuade us, that the way to keep the cataloger's inputting minimal is to use templates. This happens, for example, when a cataloger working on OCLC, faced with cataloging a new edition of a title, calls up the previous edition, types "new" at the command line, and works with a record that has 1xx-8xx intact, with only the 001-099 fields blanked out. This will probably have the same general effect as the USMARC-method examples in Appendix A, without computer engineers having to write code that melds a child and a parent record, without overwriting fields, into a record like the example given previously.
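The template approach amounts to very little code. The Python sketch below imitates only the effect described above and is not a description of how OCLC implements its command; records are again assumed to be plain dictionaries keyed by three-digit USMARC tag, the sample values are invented, and new_from is a hypothetical name.

```python
import copy


def new_from(previous):
    """Copy a record for a new edition, dropping the 001-099 fields.

    Every field tagged 100 or higher is carried over unchanged, ready
    for the cataloger to edit; the control and coded fields are left
    out so that they can be supplied afresh.
    """
    return {
        tag: copy.deepcopy(value)
        for tag, value in previous.items()
        if int(tag) >= 100
    }


previous_edition = {
    "001": "sb-1952",
    "008": "coded data for the 1952 edition",
    "245": "Santa Barbara quadrangle",
    "250": "1952 ed.",
    "650": "Topographic maps",
}

draft = new_from(previous_edition)
draft["250"] = "1967 ed."      # the cataloger changes only what differs
print(sorted(draft))           # ['245', '250', '650']
```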

Yet another possibility is not to have any note fields in the child records, and instead to have any necessary notes (especially noting variances, e.g., "Phrases at head of title differ") in the parent record. In any case, it will be important to use explicit field labels (not in libraryese) noting roles and relationships, with the default display being minimal-level/brief cataloging fields. Also, the different relationships may need to be shown in different ways.

The overwhelming importance to cartographic materials of being able to link and show relationships is depicted in the following worst possible case (see Appendix A for records):

a. parent record - all USGS topographic series

b. parent record - 1:24K USGS topographic map series

c. parent record - orthophotoquads

d. individual quadrangle: Santa Barbara 1952

e. individual quadrangle: SB 1967

f. individual quadrangle: SB 1988

g. individual quadrangle: SB orthophotoquad 1976

h. individual quadrangle: SB orthophotoquad 1980

i. ??Digital Elevation Model for SB 1:24,000??

j. ??DLG for 1:100K??

Let's start out with relatively easy matters first, and then link together the entire set.

Host/part

PART RECORD

This item is physically a part of: HOST RECORD

Note: this does not link with the rest.

Source

DIGITAL ELEVATION MODEL

Source/lineage: melded record

Parent/child

MELDED RECORD

or

PARENT

- CHILD

or

This CHILD RECORD ("in" analytic approach)

is a part of: PARENT RECORD

Other editions

FIRST EDITION

Other edition: RECORD

Other edition: RECORD

Other edition: RECORD

etc.

Other physical forms

HARDCOPY TOPOGRAPHIC SHEET

Scanned into raster digital form: RECORD

file extension: .gif

file extension: .tiff

file extension: .jpeg

Microform: RECORD

Facsimile: RECORD

All together:

Parent
  - child
      - other physical form:
          - facsimile
          - microform
          - scanned
              - gif
              - jpeg
              - etc.
      - other edition
          - as above
      - other edition

Child as source
  - link to Digital Elevation Model
  - link to Digital Line Graph
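A combined display of this kind could be generated from the child-to-parent links alone. The Python sketch below is one possible approach, offered under loud assumptions: links are given as hypothetical (source, tag, target) triples taken from the $w of each linking field, the control numbers are invented, later editions and other physical forms are assumed to point at the first edition (one of the open questions raised earlier), and there is no guard against circular links.

```python
# Plain-language labels for the linking fields discussed in this paper.
LABELS = {"772": "child", "775": "other edition", "776": "other physical form"}

# Hypothetical links: (record holding the link, linking tag, record pointed to).
links = [
    ("sb-1952", "772", "series-24k"),
    ("sb-1967", "775", "sb-1952"),
    ("sb-1952-scan", "776", "sb-1952"),
]


def invert(links):
    """Turn upward links into a downward index: target -> [(label, source)]."""
    index = {}
    for source, tag, target in links:
        index.setdefault(target, []).append((LABELS[tag], source))
    return index


def render(control_number, index, depth=0, label=None):
    """Return indented display lines for a record and everything linked to it."""
    text = control_number if label is None else "- " + label + ": " + control_number
    lines = ["  " * depth + text]
    for child_label, child in sorted(index.get(control_number, [])):
        lines.extend(render(child, index, depth + 1, child_label))
    return lines


print("\n".join(render("series-24k", invert(links))))
```

For the sample links this prints the series on the first line, the 1952 sheet beneath it as a child, and the 1967 edition and the scanned form beneath the sheet, each with an explicit label rather than a tag number.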


Ending Thought

We have two major goals in mind:

a. present complex relationships (e.g., multiple versions) to users simply and clearly, giving them only as much information as they need;

b. simplify the creation of catalog records by catalogers

(Martin, Giles. December 17, 1995. Re: field 856 considerations. New South Wales: University of Newcastle, Quality Control Section, University of Newcastle Libraries. Message ID Pine.3.89.9512181017.A6847-0100000@dewey.newcastle.edu.au)



FOOTNOTES

(1) The Library of Congress is exploring several different methods that have been used at LC for whole/part and parent/child relationships, to try to standardize for maps, mss, photos, etc. Kay Giles is in charge of this effort.
