Gregory H. Leazer

Bibliographic Families and FRBR Work Sets

Technicalities 41(6) Nov/Dec 2021: 9-13

The previous entry in this column began with the question “What is a work?” In it I addressed the possibility of a political basis in the formation of work sets in Functional Requirements for Bibliographic Records (FRBR).1 At its core, the argument was that what goes into the set and what is left out (that is, what is considered an expression, manifestation or item of a previously established work according to FRBR and its scions) might vary from one community to another. That variability may not be distributed equally and randomly across user communities, but instead could vary with levels of interest or expertise in various kinds of bibliographic conditions, such as by discipline or educational attainment, and thus should be considered political. One kind of distinction might be the lumpers versus the dividers, as we have seen in biological taxonomy; some people might be more inclined to have large and inclusive groups, while others make smaller groups of finer distinction. The definition of a work, and the criteria of membership, will always be a matter of debate. And because we are classifying textual materials, interpretative subjectivities will always be an element in our labor, subjectivities that will fall along political dimensions.

I think there are four tasks when we consider a group of texts (books, films, etc.) as a work:

• What are the members of the set (i.e., what is in the set and what is out)?
• How do we describe the set as a collectivity?
• How do we organize internally the members of the set?
• How do we connect the set externally to other works or entities?

The last column was an attempt to address that first question; our work here today will be to address the second.

Works, Bibliographic Families and Work-Sets

So let us make a technical distinction in our vocabulary here, in the hope that it is clarifying. The term “work” is a bit of a muddle, and can be used to indicate “the” progenitor work, or, variably, the progenitor work and its descendants. Under the first theory, the descendants are at least potentially individual works, although inter-related in various ways. Under the second theory a descendant is not its own work but merely a separate subordinate thing, such as an “expression” according to FRBR. Under the second theory we have the potential that two distinct texts, potentially without a single word in common, are actually the same work. That is, under the second theory a work is not a particular but a kind or category, whose progenitor “stands in” for the other things that are not the same thing but are of the same kind. Or maybe a “work” is both the particular progenitor work and the category of descendant texts.

So, our technical distinction. Patrick Wilson, the Sage of the Bay, distinguished between the progenitor work and what he called its family,2 which, following others, we will call a bibliographic family. For example, Wilson distinguishes between Hamlet and Hamlet-family.

The production of a work is clearly not the writing down of all the members of the family, but is rather the starting of a family, the composing of one or more texts that are the ancestors of later members of the family.3

There is still present in Wilson some notion that the family members are of lower status than the original work, and it is perhaps a little too wishful of me to think that Wilson is describing a bibliographic family as a progenitor work and its descendant works, as opposed to its descendant texts or expressions or editions or clones or whatever lower status designation we can invent for it. For us, here today, we are not going to make an ontological distinction. Let us for the sake of conceptual simplicity call all the descendants works in their own right, just as children are humans in their own right. And we can assemble those family members into a set, which we also call a family, with the understanding that any one family is constantly changing to include others by birth and marriage, and is variously affected by death and divorce. For bibliographic families we have no members that die, and the conditions of inception are not merely the transmission of genetic material from two sources, but include a wide range of influence and transmission. But in both cases we can subset the family into narrower sub-components as well.

Work-Set as Collocation

We stated above that the question addressed by this essay was “How do we describe the set as a collectivity?” With some new terminology we can restate that question as “How should we model a bibliographic family as a work-set?” The work-set (with a hyphen) is our term for the representation of the bibliographic family in the catalog. The relationship of bibliographic family to work-set is basically the same as that of the world to a sign, like a city to a map. I like the term “work-set” in part because it recognizes the set-based approach that Svenonius uses and justifies as the foundation for her approach to organizing bibliographic objects.4

Per FRBR and RDA: Resource Description and Access, work-sets are formed simply by collocating individual bibliographic records, along with their corresponding expressions, manifestations and items. Not only is this consistent with Svenonius’ set-theoretic approach to the organization of information generally, but it is also consistent with the previous century’s approach to cataloging using the card catalog: to physically assemble the cards that contained bibliographic records and order them into a composite set. Collocation meant physically assembling the records together into a sequence. As Allyson Carlyle said in 1996, in words that are still relevant today, “[t]he collocation standard … was developed to increase the comprehensibility of retrieval sets. It stipulates that relevant author, work, and subject records be arranged and displayed together, one after another and without interruption by irrelevant records.”5 For the last few decades we have attempted to translate that structure into our online catalogs so that when you locate one member of the work-set, you are ineluctably led to the other members. Probably the first thing we learned as MLIS students in our cataloging courses was that term used above, “collocate,” as the way of assembling records to express groups of documents on the basis of their authorship, topic, or textual genealogy, as we do with works.

It’s the story of a Man Named Brady

There are at least a couple of problems with this approach. First, to assemble individual records into a set is basically an enumerative definition. If you describe a family simply by listing its members (e.g., the Brady family is Mike, Carol, Greg, Peter, Bobby, Marcia, Cindy and…oh yeah, Jan), then you are using an enumerative definition. It is a basic approach in set theory: list the members. We can decide whether Alice or the pets are in the set or not, and by what criteria. The strength of enumerative set definitions is their clarity: here are the members. Quoting Carlyle again, “the aim of collocated displays is to provide users with an overview, or picture, of the entire content of a retrieved record set.”6

But there is no real explanation as to the meaning or the significance of the set, like a description or history. Are we really providing an overview of the record set, i.e., the work-set? An overview of the Brady Bunch might include the lyrics of that insipid opening theme song. We know that the Brady family was formed when Mike married Carol, joining Mike’s three sons with Carol’s three daughters (all who had hair of gold) into a single family. We know Mike is an architect, that the children vie with each other in various ways, especially Jan who is outshone by her older sister, etc., etc. We also know that they are a fictional TV family created by Sherwood Schwartz. Much of this is known by Americans my age who watched The Brady Bunch daily after school despite our parents’ pleas. Defining a set exclusively by enumeration leaves out important overview information. The decision to include Alice, her role and by what criteria, are also missing in an enumerative definition. We describe the individual members; we do not describe the collective unit except through a series of unexplained inclusions and exclusions.
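The contrast between listing and describing can be put in set terms. What follows is only a minimal sketch in Python, not a model taken from FRBR or RDA: the class names `Person` and `DescribedSet`, their fields, and the `household` and `role` attributes are all hypothetical, invented for illustration. An enumerative definition is nothing but the list of members; a described set also carries an overview and a stated criterion, so the decision about Alice is recorded rather than left implicit.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Enumerative definition: the set is nothing more than its listed members.
brady_enumerated = frozenset(
    {"Mike", "Carol", "Greg", "Peter", "Bobby", "Marcia", "Jan", "Cindy"}
)


@dataclass(frozen=True)
class Person:
    name: str
    household: str  # hypothetical attribute, used only as a membership test
    role: str


@dataclass(frozen=True)
class DescribedSet:
    """A set that records an overview and an explicit membership criterion."""
    label: str
    overview: str
    criterion_note: str                  # the criterion, stated for readers
    criterion: Callable[[Person], bool]  # the same criterion, as a test

    def members(self, candidates: Iterable[Person]) -> frozenset:
        return frozenset(p.name for p in candidates if self.criterion(p))


people = [Person(n, "Brady", "parent") for n in ("Mike", "Carol")]
people += [
    Person(n, "Brady", "child")
    for n in ("Greg", "Peter", "Bobby", "Marcia", "Jan", "Cindy")
]
people += [
    Person("Alice", "Brady", "housekeeper"),
    Person("Ginger", "Castaways", "castaway"),
]

family = DescribedSet(
    label="Brady family",
    overview="A family formed when Mike married Carol; a fictional TV "
             "family created by Sherwood Schwartz.",
    criterion_note="Parents and children only; the housekeeper is excluded.",
    criterion=lambda p: p.household == "Brady" and p.role in {"parent", "child"},
)

household = DescribedSet(
    label="Brady household",
    overview="Everyone who lives in the Brady house.",
    criterion_note="Anyone in the household, including the housekeeper.",
    criterion=lambda p: p.household == "Brady",
)

for s in (family, household):
    print(s.label, "|", s.criterion_note, "|", sorted(s.members(people)))
```

The two described sets differ only in whether Alice is included, and each one says why. The bare enumeration has the same members as the first, but offers no overview and no reason.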

Another problem with work-sets in the catalog is setting borders with other sets. In a list of people, for example, how do we indicate the transition from one group to another? Ginger, Mary Ann, Thurston, Mike, Carol, Greg: the user has to recognize where we moved from one set into another unless we effectively identify the sets. I am open to counter-arguments, but I do not think the demarcation of sets is done effectively in card catalogs, in online catalogs, or in the expression of topic sets by assembling books together on the shelf. This has long been a problem in book classification, where the user has to see that HX1234 is a different topic from HX123.5. These are not effective set labels for people scanning books on the shelf, and do not get me started on thin books that have their label on the front of the book, turned away from the user’s scanning eye.

Can You Spot the Difference?

In card catalogs the issue was solved in theory by the change in access points. Svenonius called these “work languages”7 and they were formed typically by the concatenation of the main entry and the title field as the first element of the description. When standardized titles were required because of variations in the title proper amongst editions, a uniform title was interposed between the main entry and title elements to bring the various editions together into a collocated unit. Because I am being pedantic, I should give examples:

1.       Melville, Herman, 1819-1891. Bartleby the Scrivener.
2.       Melville, Herman, 1819-1891. Billy Budd, foretopman.
3.       Melville, Herman, 1819-1891. Billy Budd (Motion picture : 1962)
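To make the concatenation concrete, here is a minimal sketch in Python of how such headings bring editions together when cards are filed. It is an illustration under stated assumptions, not an implementation of any actual filing rules: the `Record` type, the `filing_key` and `file_cards` functions, and the simple case-insensitive character sort are all hypothetical. The sample uses Moby-Dick because its first British edition appeared under the title The Whale, so sorting on the title proper alone scatters the two editions.

```python
from typing import NamedTuple, Optional

MELVILLE = "Melville, Herman, 1819-1891."


class Record(NamedTuple):
    main_entry: str
    title_proper: str
    uniform_title: Optional[str] = None  # interposed only when titles vary


def filing_key(r: Record, use_uniform_title: bool = True) -> tuple:
    """Build a sort key: main entry, then collocating title, then title proper."""
    collocating = r.title_proper
    if use_uniform_title and r.uniform_title:
        collocating = r.uniform_title
    # Simplified filing: a plain case-insensitive comparison (an assumption).
    return (r.main_entry.casefold(), collocating.casefold(),
            r.title_proper.casefold())


def file_cards(records, use_uniform_title: bool = True) -> list:
    return sorted(records, key=lambda r: filing_key(r, use_uniform_title))


cards = [
    Record(MELVILLE, "The Whale", uniform_title="Moby Dick"),
    Record(MELVILLE, "Omoo"),
    Record(MELVILLE, "Moby-Dick; or, The Whale", uniform_title="Moby Dick"),
    Record(MELVILLE, "Bartleby the Scrivener"),
    Record(MELVILLE, "Typee"),
]

for label, flag in (("title proper only", False), ("with uniform title", True)):
    print(label)
    for card in file_cards(cards, use_uniform_title=flag):
        print("   ", card.main_entry, card.uniform_title or "", card.title_proper)
```

Filed on the title proper alone, Omoo falls between the two editions of Moby-Dick; with the uniform title interposed, the two editions sit side by side.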

In the card catalog the elements of these headings were spread over the top of the card and across multiple lines using funny International Standard Bibliographic Description (ISBD) punctuation. I am not sure that users, even expert users, recognized what was going on: that the pieces of data were concatenated to form a single composite data entry, that the move from example #1 to example #2 was a transition from one work to another, or what the specific relationship was when the user flipped from card #2 to card #3. In the computer-based catalog, in theory one was supposed to create a query that hit, for example, all the Billy Budds and only the Billy Budds, or, in information retrieval parlance, achieved perfect recall and precision. But as Allyson Carlyle pointed out, this has proven to be quite difficult to do in practice. She found that work-sets were frequently interrupted by records not belonging to the work-set, and that descriptions belonging to the work-set strayed beyond the collocated sequence because of poor authority work. So the problem is actually Greg, Ginger, Mary Ann, Mike, Carol, Snoopy, Thurston. Or, in her words (without the Peanuts Gang or the Brady Bunch):

The results of this research show that online catalog displays sometimes scatter records relevant to a query among irrelevant records and that multiple-field Boolean matching, in particular, contributes to this scatter…. Although collocation is one of the standards governing catalog design, this standard is obviously far from being operative in current online catalogs.8
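The scatter Carlyle describes can be checked mechanically. The sketch below, in Python, is an illustration only and is not her method: the function names `is_collocated` and `precision_recall` are hypothetical, and the display is assumed to contain no duplicate entries. The first function tests whether the members of a set that appear in a display form one uninterrupted run, which is the collocation standard as she states it; the second computes the usual precision and recall of a retrieved list against a set.

```python
def is_collocated(display, work_set) -> bool:
    """True if the members of work_set found in display form one unbroken run."""
    positions = [i for i, entry in enumerate(display) if entry in work_set]
    if not positions:
        return True  # nothing displayed, so nothing is interrupted
    return positions[-1] - positions[0] + 1 == len(positions)


def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved list against a relevant set."""
    retrieved_set = set(retrieved)
    hits = retrieved_set & set(relevant)
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


brady = {"Mike", "Carol", "Greg", "Peter", "Bobby", "Marcia", "Jan", "Cindy"}
# Only the castaways named in the text, not the full cast.
castaways_named = {"Ginger", "Mary Ann", "Thurston"}

scattered = ["Greg", "Ginger", "Mary Ann", "Mike", "Carol", "Snoopy", "Thurston"]
ordered = ["Ginger", "Mary Ann", "Thurston", "Mike", "Carol", "Greg"]

print(is_collocated(scattered, brady))            # Bradys interrupted
print(is_collocated(scattered, castaways_named))  # castaways interrupted
print(is_collocated(ordered, brady))              # one unbroken run
print(precision_recall(scattered, brady))         # 3 of 7 shown, 3 of 8 found
```

Note that a display can be perfectly collocated and still have poor recall, as the ordered list does for the Bradys. Collocation concerns the arrangement of what was retrieved, while recall concerns what was left out.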

While Carlyle’s work is somewhat dated, and online catalogs have improved somewhat, I believe her results would generally hold up. I don’t think we have ever given the collocation approach its full test, either in card or digital format, because we have never truly successfully implemented it.

Describing Works, Really!

So let us ask our opening question one more time: how do we best describe a bibliographic family? Not just listing the members, but an actual description? Historically there has been no specific record for the work, just a series of individual bibliographic descriptions assembled into a unit. Even under RDA the work record is really just an authority record with no real explanation of the work. Where do we describe the work? Take, for example, The Lord of the Rings. What do you know about it, beyond its bibliographic details? It’s a trilogy, set in Middle-earth, a major work of English fantasy, they made some films, vastly popular in print and in film, featuring good and evil wizards, fantastic beasts, inspired by various European mythologies. You probably know a lot about it, but very little of it is represented in the catalog. Think of an entry in Wikipedia, which, for example, says:

The Lord of the Rings is an epic high fantasy novel by the English author and scholar J. R. R. Tolkien. Set in Middle-earth, a place like Earth at some distant time in the past, the story began as a sequel to Tolkien’s 1937 children’s book The Hobbit, but eventually developed into a much larger work. Written in stages between 1937 and 1949, The Lord of the Rings is one of the best-selling books ever written, with over 150 million copies sold…9

And of course the Wikipedia entry continues to describe the plot, characters and themes, its relationships with other works (especially film productions), its critical reception, and its impact on popular culture. Even the publication history goes beyond an enumeration of actual publications and describes Tolkien’s relations with publishers and the various compromises he worked out with them on details related to publication.

“This Group Must Somehow Form a Family”

How could we do this? We need to recognize that the fundamental basis of the catalog is the work and the bibliographic family. Given all the sweat and tears the Program for Cooperative Cataloging (PCC) and others have been pouring into “identity management,”10 it is funny how we do not really strive to identify, much less describe, works and authors. We need to expand our notion of the authority record from an abstract string expression to a shared and distributed database of records about people, works and bibliographic families. I would straight-up model those records on Wikipedia, even partner with them on their content and their processes. The PCC says they want to expand partnerships and sources of metadata, explore linked data structures, and transform their metadata products. Let us hold them to it.

But the PCC’s most important strategic direction is their most recent: “Incorporate Diversity, Equity, and Inclusion (DEI) principles to every aspect of PCC operations.” Heck, let us incorporate those principles into not just our operations, but into our products. The PCC has set themselves up to be leaders in cataloging, but their strategic directions tend to be focused inward. They say catalogers are introverts. Let us DEI the whole show! Let us DEI all our products from subject thesauri to authority records to catalog systems! A commitment to DEI, I believe, puts us on a path of greater collaboration, not just amongst libraries, but also amongst other kinds of cultural institutions, and lets us learn from their work practices. But ultimately the catalog is a place of discovery, where we learn about others. In addition to justice and participation, DEI in a cataloging context means helping others discover the fruits of other cultures, by showing them what books and films and music exist from other people and other places. One way to accomplish that is to build real tools of discovery and exploration. And we could substantially improve that effort if we wrote expanded descriptions that introduced others to the works of other cultures, where users need the most orientation, instead of relying on enumerative lists of publications as our attempt to model bibliographic families.

Works Cited

1.       IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. Munich: K.G. Saur Verlag, 1998.

2.       Wilson, Patrick. Two Kinds of Power: An Essay on Bibliographical Control. Berkeley: University of California Press, 1968, p. 9.

3.       Wilson, p. 9.

4.       Svenonius, Elaine. The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press, 2000, p. 37.

5.       Carlyle, Allyson. “Ordering Author and Work Records: An Evaluation of Collocation in Online Catalog Displays.” Journal of the American Society for Information Science 47, no. 7 (1996), p. 540.

6.       Carlyle, p. 540.

7.       Svenonius, p. 87ff.

8.       Carlyle, p. 553.

9.       Wikipedia. “The Lord of the Rings.” Accessed Sept. 14, 2021. https://en.wikipedia.org/wiki/The_Lord_of_the_Rings.

10.    Program for Cooperative Cataloging. “PCC (Program for Cooperative Cataloging) Strategic Directions, January 2018-December 2021 (Extended to December 2022).” Accessed Sept. 12, 2021. https://www.loc.gov/aba/pcc/about/PCC-Strategic-Directions-2018-2022.pdf

GREGORY H LEAZER

Inordinate Maps of Knowledge from the Bibliographers Guild