Moving From MARC to XML - Part Three
Handling of Authority Metadata

One of the serious problems inherent in MARC cataloging is the handling and linking of name headings that exist in multiple forms and multiple scripts.  It is not unusual for an author to have more than one form of name, and an internationally known author may be referred to by different names, in different languages, in different parts of the world.  In this section, we discuss the problems encountered when marking up such names in metadata and propose how XML can help to resolve them.

Authority Metadata

In MARC cataloging, only one name form is allowed in the bibliographic metadata.  The selected form is known as the established heading, and all other forms and scripts have to be relegated to variants.  Information about the established form and the variant forms is defined in authority metadata.  The following is a Library of Congress authority record for the Hong Kong based novelist and journalist 金庸:

001 oca00560270 
....
100  1    |a Jin, Yong,|d1924-
400  1    |aChin, Yung,|d1924-
400  1    |aZha, Liangyong,|d1924-
400  1    |a Cha, Louis,|d1924-
400  1    |a Cha, Liang-yung,|d1924-
400  0    |a Kim-Dung,|d1924-
400  1    |a Kim, Dung,|d1924-
400  0    |aJinyong,|d1924-
400  1    |a Yong, Jin,|d1924-
400  1    |a Kin, Yō,|d1924-
....

This record suffers from the following problems:

We can resolve these problems by adopting the concepts as discussed in Part Two.  Using Model A and Model B in a UCS/Unicode environment, we will have:

Model A

001        oca00560270 
....
100  1    |6880-01|a Jin, Yong,|d1924-
880  1    |6100-01|a金庸,|d1924-
400  1    |aChin, Yung,|d1924-
400  1    |6880-02|a Zha, Liangyong,|d1924-
880  1    |6400-02|a查良鏞,|d1924-
400  1    |a Cha, Louis,|d1924-
400  1    |a Cha, Liang-yung,|d1924-
....

Model B

001        oca00560270 
....
100  1    |a Jin, Yong,|d1924-
400  1    |aChin, Yung,|d1924-
400  1    |a Zha, Liangyong,|d1924-
400  1    |a 查良鏞,|d1924-
400  1    |a Cha, Louis,|d1924-
400  1    |a Cha, Liang-yung,|d1924-
700  1    |a 金庸,|d1924-
....

Notice that by adopting the parallel field concept of tag 880 and subfield 6 in Model A, we are able to

In Model B above, tag 700 (Established Linking Entry) is used to make 金庸 equivalent to the established heading "Jin, Yong" (note that you can also put 金庸 in tag 100 and "Jin, Yong" in tag 700).  However, Model B fails to maintain the parallel linking capability of non-established name forms.  For example, 查良鏞 and "Zha, Liangyong" are unrelated.  The implication of this deficiency is that you can fall back from Model A to Model B, but you cannot go from Model B to Model A, because the pairing information is no longer there.  See Part Two for a more detailed analysis of these two models.
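The one-way fallback from Model A to Model B can be illustrated with a minimal Python sketch.  The tuple layout used for a field and the function name model_a_to_model_b are assumptions made for illustration; they are not part of MARC or of any library.  The rule that the parallel form of the 100 heading moves to tag 700 is taken from the Model B listing above.

```python
# Hypothetical in-memory form of a MARC field:
# (tag, first indicator, [(subfield code, value), ...])
MODEL_A = [
    ("100", "1", [("6", "880-01"), ("a", "Jin, Yong,"), ("d", "1924-")]),
    ("880", "1", [("6", "100-01"), ("a", "金庸,"), ("d", "1924-")]),
    ("400", "1", [("a", "Chin, Yung,"), ("d", "1924-")]),
    ("400", "1", [("6", "880-02"), ("a", "Zha, Liangyong,"), ("d", "1924-")]),
    ("880", "1", [("6", "400-02"), ("a", "查良鏞,"), ("d", "1924-")]),
]


def model_a_to_model_b(fields):
    """Flatten Model A (880 fields paired via subfield 6) into Model B."""
    result = []
    for tag, ind1, subfields in fields:
        # Subfield 6 holds the link, e.g. "100-01" on an 880 field.
        link = next((value for code, value in subfields if code == "6"), None)
        # Model B has no subfield 6, so the pairing is dropped here.
        plain = [(code, value) for code, value in subfields if code != "6"]
        if tag == "880" and link is not None:
            linked_tag = link.split("-")[0]
            # Assumption taken from the Model B listing: the parallel form
            # of the 100 heading goes to 700; others take their partner's tag.
            tag = "700" if linked_tag == "100" else linked_tag
        result.append((tag, ind1, plain))
    # sorted() is stable, so fields with the same tag keep their order.
    return sorted(result, key=lambda field: field[0])


for tag, ind1, subfields in model_a_to_model_b(MODEL_A):
    print(tag, ind1, "".join(f"|{code}{value}" for code, value in subfields))
```

Once subfield 6 is dropped, nothing in the output records that 查良鏞 and "Zha, Liangyong" belonged together, which is why the reverse conversion cannot be automated.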

Choosing between Model A and Model B is controversial.  However, if we use XML to mark up the authority metadata, this difficulty can be bypassed entirely.  For example, we can have:

<?xml version="1.0" encoding="UTF-8"?>
<marc name="authority" cdate="19980625" udate="19980625" rcn="ABrG">
....
<fd id="0" name="001" ind1="" ind2="" label="Control Number">
    <sf name="">oca00560270</sf>
</fd>

<fd id="1.1" script="cjk.chinese" name="100" ind1="1" ind2="b" label="Author">
    <sf name="a">金庸,</sf><sf name="d">1924-</sf>
</fd>

<fd id="1.2" script="latin.pinyin" name="100" ind1="1" ind2="b" label="Author">
    <sf name="a">Jin, Yong,</sf><sf name="d">1924-</sf>
</fd>

<fd id="1.3" script="latin.wadegiles" name="100" ind1="1" ind2="b" label="Author">
    <sf name="a">Chin, Yung,</sf><sf name="d">1924-</sf>
</fd>

<fd id="2.1" script="cjk.chinese" name="400" ind1="1" ind2="b" label="See From Author">
    <sf name="a">查良鏞,</sf><sf name="d">1924-</sf>
</fd>

<fd id="2.2" script="latin.pinyin" name="400" ind1="1" ind2="b" label="See From Author">
    <sf name="a">Zha, Liangyong,</sf><sf name="d">1924-</sf>
</fd>

<fd id="2.3" script="latin.wadegiles" name="400" ind1="1" ind2="b" label="See From Author">
    <sf name="a">Cha, Liang-yung,</sf><sf name="d">1924-</sf>
</fd>

<fd id="3" script="latin.english" name="400" ind1="1" ind2="b" label="See From Author">
    <sf name="a">Cha, Louis,</sf><sf name="d">1924-</sf>
</fd>
....
</marc>

Using XSL stylesheets, a library automation system should be able to transform this XML markup to either Model A or Model B without any problem.
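As a rough illustration of such a transformation, the following sketch derives Model B from a shortened copy of the XML markup.  It uses Python's standard ElementTree module rather than an XSL stylesheet, purely to keep the example self-contained.  The function name to_model_b, the established_script parameter and the mapping rule (the established script keeps tag 100, a CJK form of the heading goes to 700, any other form goes to 400) are assumptions inferred from the Model B listing above, not a definitive conversion.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the authority markup shown above.
AUTHORITY_XML = """
<marc name="authority">
  <fd id="1.1" script="cjk.chinese" name="100" ind1="1" ind2="b">
    <sf name="a">金庸,</sf><sf name="d">1924-</sf>
  </fd>
  <fd id="1.2" script="latin.pinyin" name="100" ind1="1" ind2="b">
    <sf name="a">Jin, Yong,</sf><sf name="d">1924-</sf>
  </fd>
  <fd id="1.3" script="latin.wadegiles" name="100" ind1="1" ind2="b">
    <sf name="a">Chin, Yung,</sf><sf name="d">1924-</sf>
  </fd>
  <fd id="2.1" script="cjk.chinese" name="400" ind1="1" ind2="b">
    <sf name="a">查良鏞,</sf><sf name="d">1924-</sf>
  </fd>
  <fd id="2.2" script="latin.pinyin" name="400" ind1="1" ind2="b">
    <sf name="a">Zha, Liangyong,</sf><sf name="d">1924-</sf>
  </fd>
</marc>
"""


def to_model_b(xml_text, established_script="latin.pinyin"):
    """Render the XML authority markup as Model B style field lines."""
    rows = []
    for fd in ET.fromstring(xml_text).iter("fd"):
        tag = fd.get("name")
        script = fd.get("script", "")
        if tag == "100" and script != established_script:
            # Assumed rule: CJK form of the heading -> 700, other forms -> 400.
            tag = "700" if script.startswith("cjk") else "400"
        subfields = "".join(
            f"|{sf.get('name')}{sf.text or ''}" for sf in fd.iter("sf")
        )
        rows.append((tag, f"{tag}  {fd.get('ind1')}    {subfields}"))
    rows.sort(key=lambda row: row[0])  # stable: document order within a tag
    return [line for _, line in rows]


for line in to_model_b(AUTHORITY_XML):
    print(line)
```

Because every form carries its own script attribute, calling to_model_b with a different established_script would promote that form to tag 100 without touching the source markup.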

Linking from Bibliographic to Authority Metadata Repository

It was found that many of the problems related to searching and display of names that have multi-forms and multi-scripts originate from the restriction of having the established name hard-coded in the bibliographic metadata.  Why do we need to be bound by this restriction?

If we can have a repository of authority metadata in XML format, and if each of these XML documents is accessible via a URL, then instead of hard-coding the established name in the bibliographic metadata we can replace it with a link to the authority repository.  We can accomplish this by introducing the XInclude, XLink and XPointer technologies to the metadata.

By replacing the established name with an <xi:include> element, for example,

<fd id="5.1" script="cjk.chinese" name="100" ind1="1" ind2="b" label="Author">
    <xi:include href="http://xxxx/oca00560270.xml#xpointer(id('1.1')/sf)"/>
</fd>

it is possible to include the desired name from a name authority metadata repository in the bibliographic metadata.  XInclude is not yet supported in web browsers.  However, we can use XLink and XPointer to achieve the same results.  Explore how this works in the following example:

[NOTE: You need to use Internet Explorer 5.x or above to view the following XML links.  If you have problems viewing the third link in Internet Explorer for Windows, check that MSXML Parser Version 3.0 is installed and that it was installed in replace mode.]

  1. display sample bibliographic metadata in XML format
  2. display the same XML document with XLink and XPointer markup
  3. display the same XML document with XLink and XPointer markup, plus XSL stylesheet
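The inclusion step described above can also be sketched outside the browser.  The following Python example is a minimal sketch, not an XInclude or XPointer implementation: it only understands pointers of the exact shape used in this article, and it matches the id attribute literally.  The REPOSITORY dictionary stands in for the authority repository so that no network access is needed, and the xmlns:xi declaration on the bibliographic record is an assumption, since the fragment above does not show one.

```python
import copy
import re
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

# Stand-in for the authority repository: document URL -> XML text.
REPOSITORY = {
    "http://xxxx/oca00560270.xml": """
<marc name="authority">
  <fd id="1.1" script="cjk.chinese" name="100" ind1="1" ind2="b">
    <sf name="a">金庸,</sf><sf name="d">1924-</sf>
  </fd>
  <fd id="1.2" script="latin.pinyin" name="100" ind1="1" ind2="b">
    <sf name="a">Jin, Yong,</sf><sf name="d">1924-</sf>
  </fd>
</marc>
""",
}

BIBLIOGRAPHIC_XML = """
<marc name="bibliographic" xmlns:xi="http://www.w3.org/2001/XInclude">
  <fd id="5.1" script="cjk.chinese" name="100" ind1="1" ind2="b">
    <xi:include href="http://xxxx/oca00560270.xml#xpointer(id('1.1')/sf)"/>
  </fd>
</marc>
"""

# Only the pointer shape used in this article is recognised.
POINTER = re.compile(r"^(?P<url>[^#]+)#xpointer\(id\('(?P<id>[^']+)'\)/sf\)$")


def resolve_includes(bib_xml, repository):
    """Replace each recognised xi:include with the <sf> elements it points to."""
    root = ET.fromstring(bib_xml)
    for fd in root.iter("fd"):
        children = []
        for child in list(fd):
            match = POINTER.match(child.get("href", ""))
            if child.tag != XI_INCLUDE or match is None:
                children.append(child)  # keep anything we cannot resolve
                continue
            authority = ET.fromstring(repository[match["url"]])
            target = next(
                f for f in authority.iter("fd") if f.get("id") == match["id"]
            )
            children.extend(copy.deepcopy(sf) for sf in target.findall("sf"))
        fd[:] = children
    return root


resolved = resolve_includes(BIBLIOGRAPHIC_XML, REPOSITORY)
print(ET.tostring(resolved, encoding="unicode"))
```

Pointing the same include at id 1.2 instead of 1.1 would pull in the pinyin form, so the bibliographic record itself never has to store the name.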

By adopting this repository and linking concept, many other issues related to authority control can also be resolved.  With the Internet as an infrastructure for distributed repositories, it is no longer necessary for each library to duplicate and maintain a copy of the authority metadata locally.  Library automation systems should be able to just point to the repository and retrieve the most up-to-date version of the metadata as needed.  It is suggested that any cataloging agency or consortium building a name authority metadata repository should seriously consider adopting XML as its basic format and equipping the repository with this linking capability.

 

K.T. Lam (lblkt@ust.hk)
Created: 18 April 2001.
Last Revised: 25 July 2001.