Article Summary for Lecture #14 – Schuitema

Schuitema, Joan E. “The Future of Cooperative Cataloging: Curve, Fork, or Impasse?”

In this article, Schuitema offers a thoughtful and timely evaluation of how technology and society are immediately affecting the practice of cataloging. She argues that the days of MARC and AACR2 are numbered and that new metadata schemas must be implemented to account for digital formats and patron expectations. She addresses the historical development of cooperative cataloging and further outlines the reasons why the future is uncertain for professional catalogers.

According to Schuitema, the same issues plague each successive generation of catalogers. She specifically details the printed card cooperative era, as well as the LC card distribution program. Based on her historical evaluation, common cataloging problems emerge. The library, research, and user need for standardization conflicts with the expensive and time-consuming application of rules for standard structure. Ever-increasing publisher output exceeds the ability to catalog it. Evolving skill sets to match technological advances demand that catalogers either keep up or face job loss. Finally, libraries are always searching for a panacea to alleviate cataloging problems, constantly bemoaning the lack of tools that would make cataloging better and cheaper right now.

Schuitema stresses the difference between past problems and the current cataloging crisis. Today, budget constraints mean catalogers face a streamlined environment combining acquisitions and cataloging. Bibliographic description is a costly and time-consuming value-added service provided by catalogers, whereas vendors can provide it at a quicker rate. Users no longer view the library catalog as their primary source of information access, instead turning to the one-size-fits-all search engine interface. More resources, whether journals, books, blogs, or art, are either born digital or also offered digitally, and they require different types of metadata for content description. Yet metadata is not equal in quality, and it is not the sole domain of the professional cataloger, because anyone can create metadata. Good, professional metadata is still needed to find and access important information, so a contradiction exists: professionals are losing control over the organization process even as their expertise remains essential.

Schuitema applies business and psychological perspectives to the current cataloging problem, claiming that the skill sets of traditional cataloging—which values detail, precision, consistency, and stability—do not necessarily match the emerging cataloging environment. She says that professional and organizational values are both changing. While cooperative cataloging historically changed over the long-term, organizational parameters now experience the same rapid change as individual professionals.

The author concludes that cooperative cataloging is currently at a fork in the road, split between traditional pre-coordinated organization following standards on the one hand and, on the other, new tools and practices that deconstruct the discovery of information. Schuitema asserts that neither choice alone is best; instead, old and new cataloging approaches should be integrated because their shared ultimate goal is to aid users. In other words, catalogers should not be forced into a Solomon-like choice between forsaking the old ways and never looking at new methods.

Schuitema’s article was very informative and I appreciated the consideration she gave the argument. She reflects upon the field as it is today—in a state of flux. I was interested in her suggestion that the LIS profession should be open to looking at outside organizations or settings to find new or better practices. In my opinion, the cataloging debate is analogous to the debate over U.S. constitutional theory—should our set of legal rules reflect global laws or stay the same? Are originalists or revisionists correct? I thought Schuitema’s argument was weakened because she deviated from the cooperative cataloging theme and never really answered Tillett’s question of “what it is we’re trying to accomplish by joining forces.” Schuitema essentially argues that catalogers are no longer regarded as the authority for the organization of knowledge. Catalogers will have to learn new skills for a dynamic environment and its evolving infrastructure. Since digital production will outpace print, new metadata schemas for a wide variety of digital content carriers will be critical to cataloging’s sustainability. Only by actively engaging with new cataloging methods can the profession grow and perhaps redefine itself through a merger of old and new.

Article Summary for Lecture #11 – Barite

Barite, Mario Guido. “The Notion of ‘Category’: Its Implications in Subject Analysis and in the Construction and Evaluation of Indexing Languages.”

All too often, the LIS field contains ambiguous jargon and “category” is such a term. According to Barite, the notion of category is ambiguous because it is used interchangeably with other terms like characteristic, it lacks a fixed definition, and it necessarily has to be conditional. In this article, Barite attempts to define category within a classification context and also as a practical tool for catalogers and indexers.

Although most catalogers and indexers employ categories to make their classification language or schema more user-friendly, Barite states that articles about categories are not overly abundant in today’s LIS literature. Over time, “category” has acquired different meanings, mainly from philosophers such as Aristotle seeking to categorize all things. However, Ranganathan pioneered the idea of applying category to the classification of knowledge in an organizational system. Ranganathan’s notion of category is critical to the LIS field, as he posited that categories are the basic building blocks of any organizational system.

Barite’s definition of category expounds upon that of Ranganathan. First, Barite identifies categories as “extremely general abstract expressions” that are distinguishable in all objects. Categories are used to label similar properties among objects, so that single or multiple objects can be analyzed in the physical world. As for LIS, Barite argues categories are the basic building blocks of organizational systems because they make it possible to analyze and relate concepts—which in turn represent human knowledge. Categories enable classification to be either broad or narrow in an organizational system. Hence, Barite asserts that structuring concepts as manifested in objects is categories’ primary use in the LIS field. For example, categories enable hierarchical ranking of subject terms, so a book about Japanese bullet trains could be classified as “Train-bullet-Japan.”
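The faceted structuring behind a string like “Train-bullet-Japan” can be sketched in a few lines of code. This is an illustrative toy, not Barite’s notation: the facet names (“topic,” “subtype,” “place”) and the ordering are my own assumptions.

```python
# Toy sketch of categories as facets: each category contributes one term,
# and a fixed category order yields a hierarchical subject string.
# Facet names ("topic", "subtype", "place") are illustrative assumptions.

def subject_string(facets: dict, order: list) -> str:
    """Join facet values in a fixed category order into one subject string."""
    return "-".join(facets[f] for f in order if f in facets)

book = {"topic": "Train", "subtype": "bullet", "place": "Japan"}
print(subject_string(book, ["topic", "subtype", "place"]))  # Train-bullet-Japan
```

The point of the sketch is Barite’s claim that categories, not individual terms, are the building blocks: changing the facet order or the set of facets changes the whole classification without touching the underlying concepts.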

There would be no need for categorization without objects, the documentary units being analyzed, and categorization could not be performed without a human analyst, who determines the basis for categories of objects—albeit subjectively. Unsurprisingly, Barite notes that categorization is a complicated and problematic process because all objects can be analyzed from countless perspectives; the Civil War, for instance, could be analyzed chronologically, geographically, or culturally. Moreover, a consensus on a certain object’s categorization may never be reached within a field because people hold strongly differing viewpoints. Categories face a daunting balancing act: they must be comprehensive in scope, yet they are limited to a specific type of analysis and by their nature prohibit cross-category analysis. While categories imply differentiation, they are also an effective means of generalization in multiple disciplines. Yet an implicit rule exists: the more categories within a system, the less general each category can be. Categories within a given system are mostly static, but Barite contends that classification flexibility is feasible through a changeable number of object characteristics per category.

Barite has dramatically streamlined the notion of category into that of an external tool used to differentiate and identify characteristics of objects inside of an organizational system. I think his article illustrates that categories were the original relevance ranking system, long before Google’s ranking algorithm. His examples made a highly theoretical argument easier to understand, such as categories in classification being analogous to Poe’s purloined letter—there, but easily ignored. I feel his argument is slightly weakened by not restating just how much catalogers and indexers will have to interact with categories in the course of their LIS careers. Nevertheless, I believe he makes an important point that categories themselves can change over time and as a result of shifting cultural norms. Barite’s categories and Bates’ invisible substrate both authoritatively argue that an understanding of the implicit library organizational system is necessary for LIS professionals.

Article Summary for Lecture #10 – Arms

Arms, William Y. “Automated Digital Libraries: How Effectively Can Computers Be Used for the Skilled Tasks of Professional Librarianship?”

In this article, Arms explores the potential widespread implementation of automated digital libraries. He argues that they deserve discussion because of the exorbitant operating costs and limited user accessibility associated with conventional physical research libraries. Arms asserts that digital libraries are relatively inexpensive to operate and sustain in comparison to research libraries. According to Arms, digital libraries are cost effective provided that open access resources are either identical to or an acceptable substitute for physical materials. Hence, he contends reduced staffing is the key to low-cost digital libraries. He predicts automated digital libraries will be the future of information access, creating an environment where all library tasks from cataloging to reference are performed automatically. The main question surrounding automated digital libraries is their effectiveness: in the future, could they provide users with good service and a successful research experience?

Arms says automated digital libraries “will provide users with equivalent services that are fundamentally different in the way that they are delivered.” For example, he contrasts search engines and OPACs. Cataloging’s biggest advantage is authority control and higher quality results since search engines lack authority control, arbitrarily decide what records to index, and frequently duplicate results. Yet search engines have a much greater scope than catalogs and offer information immediacy. Ultimately, Arms argues the significant difference between the two “depends on what the user wants to achieve.” He also claims that future value will be dependent upon the user audience’s discipline-specific research needs.

How will sophisticated automated digital libraries become possible? Arms says brute force computing will build the infrastructure, using simple algorithms that run at high speeds to match the user’s query. He cites Moore’s Law, which predicts that computing power will grow roughly 10,000-fold every twenty years, as evidence that the necessary infrastructure for automated digital libraries will eventually be in place. Arms also notes that the development and realization of automated digital libraries is occurring outside of traditional library settings, as with Google’s ranking algorithm. Reference linking is another piece of infrastructure critical to the future of automated digital libraries: an in-document reference can now be linked to the actual digital object it represents, permitting citation analysis and a network of contextualized resources.
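The 10,000-fold figure follows from one common statement of Moore’s Law, that computing power doubles roughly every eighteen months; a quick back-of-the-envelope check (my arithmetic, not a calculation from the article):

```python
# Back-of-the-envelope check of the 10,000-fold-in-twenty-years figure,
# assuming one doubling of computing power every 18 months (1.5 years).
doublings = 20 / 1.5          # about 13.3 doublings in twenty years
growth = 2 ** doublings       # roughly 10,000-fold
print(round(doublings, 1), round(growth))
```

Thirteen-plus doublings multiply out to a little over 10,000, which is why Arms can treat brute-force computing as an inevitability rather than a gamble.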

Certainly, Arms illustrates that automated digital libraries are proficient with the technical search process, but he admits they are inadequate for navigation, choice, and judgment. Automation cannot simply disintermediate the value-added services that reference librarians currently perform; in other words, users of automated digital libraries would not be able to carry out that navigation and judgment themselves. Arms concedes that even the most advanced computers are incapable of replicating human judgment when answering queries about abstract concepts, such as whether something is a “bad idea” in a subjective context.

In summary, Arms believes professional librarians are unrivaled in theory—when time and money are unlimited. Outside of a perfect world, automated digital libraries have the advantage of lower costs and potentially greater access. Catalogs will never be comprehensive because of cost, whereas automated digital libraries combine greater scope with open access information to offer cheaper and more widespread access to legal, medical, scholarly, and scientific resources. Even though automated digital libraries would lack precision and judgment, Arms feels that they would be the best means of access for most people most of the time.

I liked Arms’ analogy of automated digital libraries as the “Model T Ford of information.” I understood his rationale for their probable adoption in the future given their cost-saving potential and accessibility in underserved areas. However, he grossly overestimates both the availability and extent of open access materials, especially for most scholarly research needs. Moreover, Arms avoids the question of return on investment: are the judgment and precision problems associated with automated digital libraries worth the presumed savings? Arms does not seem to account for the fact that people will expect librarians as backup when they are frustrated in their searches or simply do not want to do the research themselves. He also rules out the likelihood that traditional and digital libraries will work in tandem in the future rather than being segregated. Finally, the article is ultimately an intangible argument, a best-guess estimation rather than a demonstrated case.

Article Summary for Lecture #9 – Rotenberg

Rotenberg, Ellen, and Ann Kushmerick. “The Author Challenge: Identification of Self in the Scholarly Literature.”

Rotenberg and Kushmerick highlight multiple problems with accurate author identification in the scholarly literature—mainly within journals and databases—and then detail tool-based solutions. They argue that, in the last two decades, the increasingly global nature of publication, steadily growing publication output, and alternative forms of publication, like open access and online-only editions, have made managing proper author attribution and scholarly output history much more difficult. As more authors are added to the literature, the need to disambiguate them grows. Most often, identification problems occur when researchers share the same name or publish under different names over the course of their careers—for example, Joan K. Smith versus Joan Smith-Jones. Rotenberg and Kushmerick also point out that the globalization of authorship frequently results in mismanagement of international names in English-based databases. The identification process is further complicated by collaborative authorship, such as an article “authored” by 50 researchers.

Why is this such a problem? Because proper author name identification and correct attribution of published output is vital to career advancement, scholarly status in the field, and funding opportunities. Location by author is one of the three most common search techniques, and it is imperative that students and researchers alike are able to successfully search by author name. Additional metadata such as who authors are, where they work, and what they publish would aid both ends of the attribution spectrum—student researchers and faculty authors. Rotenberg and Kushmerick determined that researchers would voluntarily use a unique ID number to separate themselves from other authors. Identifiers have been utilized by some government agencies, such as the National Institutes of Health, and are generally regarded as an effective method to lessen ambiguity. However, identifiers have provoked many questions, including centralized versus multiple ID registration systems, oversight and financial responsibilities, and privacy and copyright issues.

Rotenberg and Kushmerick outline Thomson Reuters’ solutions to the problem of unique author identification. ResearcherID is a free online network that allows researchers to link their publications to a unique ID and, if desired, supply additional biographical information for their profile. The ID remains static throughout the researcher’s career; it does not change if the author’s name or institutional affiliation changes. Scholars can choose what personal information they share on their profile, such as research interests, and some information can be marked private. ResearcherID also permits further disambiguation through links to collaborators, citation networks, and geographic location in relation to scholarly output. Some universities have used ResearcherID institution-wide as part of their tenure evaluation process, creating profiles for all faculty members. Undertaking a more collective approach than ResearcherID, the Discovery Logic tool acquires author information from databases, but also seeks information from patents, research grant applications, and publishing ventures.
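The logic of a static identifier can be shown in a small sketch. This is an illustrative toy, not Thomson Reuters’ actual system: the ID format, names, and publication titles below are invented.

```python
# Illustrative sketch of why a static identifier disambiguates authors:
# publications are keyed to an ID rather than a name string, so all
# registered name variants resolve to one profile. Names and IDs invented.

profiles = {
    "A-1234-2009": {"names": {"Joan K. Smith", "Joan Smith-Jones"},
                    "publications": []},
}

def attribute(pub_title, author_name):
    """Attach a publication to the profile whose name variants match."""
    for rid, profile in profiles.items():
        if author_name in profile["names"]:
            profile["publications"].append(pub_title)
            return rid
    return None  # no registered variant matched; still ambiguous

attribute("Metadata in Flux", "Joan Smith-Jones")
attribute("Cataloging Today", "Joan K. Smith")
print(profiles["A-1234-2009"]["publications"])  # both papers, one author
```

Because the lookup key is the ID rather than the name, a name change mid-career only adds a new entry to the variant set; the publication history itself never fragments.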

This article was easy to understand, yet I felt that a disclaimer of “sponsored by Thomson Reuters” should have been added at the beginning. It was decidedly an advertisement for the company’s products as the only solution to the problem of author identification. ResearcherID is certainly an innovative program, but an overview of the broader marketplace of technical solutions would have been helpful. I appreciated that the authors pointed out that proper, comprehensive attribution is a problem concerning everyone, whether author, publisher, editor, librarian, student, or grant administrator. I agree that any type of research ID should be centralized; otherwise, chaos is inevitable across diverse systems. However, I felt the article somewhat dismissed the enormous privacy concerns inherent in identification tools, especially if the researcher’s affiliated institution automatically created a profile for each scholar with no opt-out feature. Finally, I think this article shows the increasingly business-oriented mindset of academic institutions. These tools are meant to maximize organizational efficiency and improve the bottom line just as much as they are intended for author identification.

Article Summary for Lecture #8 – Gross

Gross, Tina, Arlene G. Taylor, and Daniel N. Joudrey. “Still a Lot to Lose: The Role of Controlled Vocabulary in Keyword Searching.”

This article is a sequel to an earlier study by the same authors, which documented in a controlled research experiment that over a third of hits found by users in catalogs using keyword searches would be lost without subject headings. Returning to that study and also addressing criticism of the original research, the authors sought to see, first, if their results still held true in an enriched database and, further, with searches not limited to English. The authors frame this article within the larger debate about controlled vocabulary’s role in today’s information environment. As they explain, the debate’s parameters are defined by two factors: user difficulty using OPACs due to the inherent complexity of subject searching and increasing user acclimation to Google-like keyword searching. Those two factors have prompted many in the LIS field to assert “the lack of importance of controlled vocabulary in the catalog” and motivated some LIS thinkers to propose that subject headings are obsolete given the predominance of keyword searching by the majority of end users.

After evaluating approximately two decades of scholarly debate about keywords versus subject headings, the authors contend that “these studies have consistently shown that human-supplied controlled vocabulary has added around one third or more of the words that make keyword searching successful.” They cite Lois Mai Chan, who said that controlled vocabulary enables mapping, synonymy and homonymy control, and greater relevancy and efficiency. Some literature showed a high percentage of titles did not correspond to their respective subject headings, while other studies disclosed that the subject heading is the only place where the keyword is found in many searches. Other literature argued controlled vocabularies like LCSH and MeSH should be limited to name, uniform title, date, and place because of their ineffectiveness for topical subject searches, and should be replaced by enriched metadata such as tables of contents or free-text searching. Those critics point to users’ ability to search whole books instead of records, as on Amazon. However, the authors also cite Yee, who asserted that scholarly research needs more detailed and extensive searches than keyword searches can provide. Indeed, “scholars miss much of the relevant information if the system has been designed only for quick retrievals.” The authors likewise mention Hayman and Lothian, who confirmed that keywords carry multiple meanings even within the same database and cannot distinguish among those meanings, which leads to ambiguity. Buckland, as well as Blair and Carlson, verified that keyword searching misses many items because word connotations change over time and because primary historical sources, such as diaries or letters, contain regional word usage, abbreviations, and alternate spellings.

Finally, the literature review focused on the two debate parameters. Beall discussed the problems with relying solely on keyword searching: synonyms, variant spellings, word forms, homonyms, obsolete terms, uncontrolled personal names, aboutness issues, inability to sort, non-textual resources, abstract topics, and paired topics. On the other hand, Calhoun insisted that users dislike LCSH and prefer keyword searching because they are unequipped for subject searching. She recommended stopping “the attempt to do comprehensive subject analysis manually with LCSH in favor of subject keywords,” especially since “automated enriched metadata such as TOCs can supply additional keywords for searching.”

Keeping the debate parameters and criticism of their previous study in mind, the authors’ new research again asked what percentage of records retrieved by a keyword search had a keyword only in a subject heading field and, therefore, would be irretrievable without controlled vocabulary. In addition, they asked how retrieval percentages were affected when the catalog contained enriched metadata such as TOCs and summary notes, and when the results were in multiple languages. The authors employed essentially the same search process as in the prior study. The new results showed that, overall, for approximately 1 out of 5 successful keyword searches, half or more of the retrieved hits would not be found without subject headings. When search results originated from catalogs with both enriched metadata and resources in multiple languages, the mean percentage of retrieved hits that would be irretrievable without subject headings was 27%, with a median of 17.6%, which is 11.1 percentage points lower than in the past study. Relevancy remained the unknown quantity in the study.
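The study’s core measurement can be reconstructed as a toy example. The records below are invented for illustration; the logic is simply: of the hits a keyword search returns, how many match only in the subject heading field and so would vanish without controlled vocabulary?

```python
# Toy reconstruction of the study's core measurement, with invented
# records: how many keyword-search hits match ONLY in the subject
# heading field and would be lost without controlled vocabulary?

records = [
    {"title": "Cookery of the Southwest", "subject": "Cooking, American"},
    {"title": "Regional Cooking",         "subject": "Cooking, American"},
    {"title": "Desert Recipes",           "subject": "Cooking, American"},
]

def hits(keyword, fields):
    """Return records containing the keyword in any of the given fields."""
    return [r for r in records
            if any(keyword.lower() in r[f].lower() for f in fields)]

all_hits = hits("cooking", ["title", "subject"])
title_only = hits("cooking", ["title"])
lost = len(all_hits) - len(title_only)
print(f"{lost} of {len(all_hits)} hits lost without subject headings")
```

Here two of the three hits match only via the subject heading, which is exactly the kind of loss percentage the authors measured across real catalogs.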

I think this article illustrates a searching dichotomy and highlights the principle of least effort. The authors point out that “successful keyword searching relies on controlled vocabulary as part of a system,” yet they also cite the OCLC “quality statement,” which says end users want Google and expect their searches to return the best results. I found that a contradiction exists between the “average user” and the needs of the specialized researcher. After reading this article, I also recognized a disturbing trend of emphasizing cost alone, when the authors demonstrate that subject headings are actually more cost effective than wasted search time, particularly in business settings. I agreed with the authors’ question about enriched metadata: how much is too much? It seems like paying more to purchase more confusion. I see the value in controlled vocabulary because it yields a broader search area yet more relevant results. Moreover, some items could be hidden without subject headings, notably images and music. Based on the results of both studies, I believe a hybrid approach combining keyword searches and subject headings should be considered best practice, allowing users to retrieve resources otherwise lost to the search problems listed by Beall and benefiting every user with more extensive results.

Article Summary for Lecture #7 – Naun

Naun, Chew Chiat. “Objectivity and Subject Access in the Print Library.”

Naun considers how the print environment has directly influenced librarians’ subject cataloging philosophies. Historically, the prohibitive costs associated with the publishing industry meant that libraries operated for the benefit of their entire community, providing all patrons with equal access to information. And just as physical access is meant to be impartial, so too is subject representation: librarians objectively designate subject terminology, encompassing the broadest spectrum of a subject without bias. For example, LCSH adopted the term “blacks” in place of “negroes.” In this article, Naun argues that organization is guided by democratic impulses when librarians try to maintain impartiality in subject indexing.

Works in a library collection have to be organized so that they can be found through a variety of user conceptions of a subject. This cataloging practice is tied to the library’s role as a democratic institution and also to librarians’ professional beliefs. Thus, Naun claims subject labeling in cataloging is another facet of the access mission that permeates the role of librarianship.

Naun contends that “any attempt to capture what a document is about requires a frame of reference that may encompass a host of interests, assumptions, and values.” To illustrate, the UNESCO thesaurus maps users searching for “handicapped” to “disabled persons.” User conceptions of a subject index term’s “aboutness” are dependent upon their immediate search needs. This subjectivity poses a problem for librarians, who are supposed to be viewpoint neutral. Cutter’s rule that subject entry be based upon what the majority of users would look for recognizes the problem with subject access. Indeed, Cutter argued user behavior should determine indexing practice, rather than indexer or cataloger judgments.

Naun also affirms that, for users’ benefit, index entries must necessarily be limited because of time constraints. Users need efficient search results; hence, Naun says controlled subject terms are essential since access has to “work for real people in the real world.” Moreover, limitless subject terms would nullify the collocation objective, as related subjects would not sit in proximity to one another.

Naun evaluates whether automated indexing systems can maintain subject term objectivity. Such systems are mathematically based, searching the full text of documents in a collection and analyzing the frequency of keywords to find relevant subject content. Essentially, records are compared for likeness of content. Naun implies that automated indexing lacks the nuance of manual indexing.
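A minimal sketch of the frequency-based indexing Naun describes: count content words in a document’s full text and take the most frequent as candidate subject terms. The stopword list and sample text here are my own inventions, not from the article.

```python
# Minimal sketch of frequency-based automated indexing: count content
# words in the full text and take the most frequent as candidate subject
# terms. Stopword list and sample text are illustrative.

from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "in", "on", "is", "to"}

def candidate_terms(text, n=3):
    """Return the n most frequent non-stopword tokens as subject candidates."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    return [w for w, _ in Counter(words).most_common(n)]

doc = ("the railway network in japan depends on the bullet train "
       "and the bullet train network is a model railway system")
print(candidate_terms(doc))
```

The sketch also makes Naun’s objection concrete: pure frequency treats every token alike, with no sense of synonymy, connotation, or “aboutness,” which is exactly the nuance a human indexer supplies.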

Seemingly, automated full-text searching solves the objectivity problem because it eliminates the human indexer who might be biased. On the other hand, Naun says natural language “inevitably has biases of its own.” Subject terms added to a controlled vocabulary still carry meaning and can be loaded with negative connotations—“terrorist” or “dictator,” for example. Naun advises that biased terms can be replaced with neutral terms, such as “developing nations” instead of “underdeveloped countries.” Naun suggests that classification systems can pull together diverse viewpoints on a subject, somewhat countering bias as collocation incorporates “the intent of the author, and the interest of the readers that the author is seeking to reach.”

According to Naun, objectivity equals mediation, neither highlighting nor ignoring a particular viewpoint. This goal in subject representation correlates to the larger role of the library in society. As Cutter said, the needs of most users must be met, meaning that common language is employed while remaining respectful of competing notions of subject terms. People have different interpretations of the same work, but subject terms strive to fulfill the needs of “a consensus of a potentially diverse community of users.” For example, a search for “domestics” would be directed by a see reference to “household employees.” Overall, subject term searches should retrieve results that neither favor certain resources nor omit relevant materials. Naun also proposes that librarians must monitor subject terms’ meanings over time and revise terms appropriately.

I learned that intellectual access is just as important as physical access in a library. This was an excellent article because it clearly shows that organization of information constructs the infrastructure ensuring all opinions are accessible. Also, I appreciated that Naun pointed out librarians are vital to maintaining democracy and impartiality in an increasingly partisan and stratified society. Meaning can be read in many ways and I was reassured that, even with something outwardly as inconsequential as subject terms, librarians are committed to the “open exchange of ideas as an unconditional good.”

Article Summary for Lecture #6 – Wajenberg

Wajenberg, Arnold S. “A Cataloger’s View of Authorship.”

In this article, Wajenberg addresses ambiguity surrounding the term “author” in descriptive cataloging. He traces evolving definitions of authorship and concurrently describes challenges encountered by catalogers trying to definitively establish the author of a certain work. Ultimately, he proffers his own definition of authorship and details its application. Somewhat predictably, Wajenberg advocates a traditional item-in-hand approach based upon his own cataloging experience.

Cutter was the first cataloging theorist to specifically define the notion of author. His definition ranged from the narrow perspective of “the person who writes a book” to a broader concept “applied to him who is the cause of the book’s existence” and further extending to “bodies of men.” While seemingly excluding female authors, Cutter allowed for both primary and corporate authorship and most succeeding cataloging codes have imitated Cutter. Yet, Wajenberg acknowledges current catalogers are still vexed by authorship, especially when the obvious author is subsumed as an access point.

Lubetzky also attempted to theoretically define authorship, stating that “the author is simply the person who produces a work, whatever the character of the work.” He was directly reacting to what he perceived as a flawed definition of authorship issued by AACR in 1967 that revolved around “chief responsibility for the intellectual or artistic content of a work.” However, Lubetzky’s definition came with its own problem—the word “produce.” For example, Wajenberg indicates that a stenographer might produce a document from dictation, but obviously would not be the author of the document.

In contrast, Carpenter emphasizes the problems of multiple and diffuse authorship, inherent when catalogers try to define authorship through origination of a work. Carpenter asks how Sophocles can be listed as the author of the English translation of Antigone since English was nonexistent when Sophocles originally produced the work. He also cites the example of a movie, which has numerous cataloging access points that could fulfill the origination definition. In correlation to Carpenter, Wajenberg points out that scientific reports usually have multiple authors listed, but only a few actually participated in writing the work.

Finally, Wajenberg profiles a software program called “Racter,” which wrote a book of prose. In this case, a computer program is the author of a work. Yet, no appropriate author heading exists for this example because a computer is neither a personal name nor a corporate entity.

Given the complex problems catalogers confront, Wajenberg asserts that catalogers should strictly focus on the bibliographic universe, which exists primarily in the form of the physical object itself. In other words, he advocates an item-in-hand approach to descriptive cataloging. Wajenberg defines authorship as follows: “an author of a work is a person identified as an author in items containing the work, and/or in secondary literature that mentions the work.” Indeed, his definition draws on catalogers’ expertise “with such conventional usages and layouts of such sources as title pages.” Thus, the name on the title page of a book, such as “Shakespeare’s Hamlet,” would be presumed to identify the author. Wajenberg argues that catalogers have secondary literature for authentication of authorship, but will rarely need to consult information beyond the item itself. In effect, Wajenberg offers a definition by attribution, either intrinsic or extrinsic. He claims that a definition by attribution would make cataloging a completely objective process based upon the bibliographic record or object in hand, using the example of spirit communication to illustrate that personal beliefs should not intrude upon the concept of authorship.

I think that Wajenberg constructs a strong case for an unbiased definition of authorship through attribution of the work. Nevertheless, Wajenberg concludes that his definition cannot resolve all cataloging issues, especially when attribution is unclear, and instead hopes he has alleviated some cataloging frustration. In my opinion, Wajenberg could be criticized for oversimplification, unwillingness to address diverse library materials, and avoidance of corporate authorship. However, I believe Wajenberg is correct in stating that catalogers should always expect to do some bibliographic investigation as part of their job when dealing with problems presented by authorship—the issue will never disappear. Wajenberg has proposed a straightforward method for determining authorship and, although he claims that he approached the topic from a cataloger’s perspective, he produced a definition catering to user needs—which should be the end goal of all cataloging.

Article Summary for Lecture #5 – Schottlaender

Schottlaender, Brian. “Why Metadata? Why Me? Why Now?”

Schottlaender seeks to connect the library cataloging community to the broader context for metadata use, such as the internet, government, and art world. He tries to explain why metadata is necessary in today’s information world, with one reason being information indefiniteness. He also contends that specialized information resources require management of their individual component parts—hence data about data. In turn, metadata has to be supported by schema, or sets of rules such as MARC, to maintain essential access to fluid and complex information. With the proliferation of information in a variety of contexts, rising expectations surround metadata.

First, Schottlaender provides several definitions of metadata. For example, one description was a “cloud of collateral information around a data object.” He defines metadata as “structured data that describe the characteristics of a resource” and recognizes an “inherent relationship between data and their metadata.” In other words, if you have data of any sizeable amount, you must have “uber-data” to describe, identify, access, and manage that data. Metadata is akin to a big filing system, but in order for the filing system of data to work, you need organizational rules—therefore schema.

Schottlaender then proceeds to explain schema, which are the standards for encoding information, and he focuses on three types: encoding, metadata, and architectural schema. Encoding schema comprise markup languages such as MARC and HTML, whereas metadata schema range from descriptive rules like AACR2 to Dublin Core, which specifically describes “document-like objects” in an online environment. Lastly, the architectural schema discussed include RDF and the Warwick Framework. Schottlaender makes the point that each type of schema supports the goal of metadata. For example, SGML is a highly structured method for dealing with complex packaged resources; AACR2 creates bibliographic access points like titles and related works; identifiers include ISBNs and URLs; and Warwick Framework software accepts diverse data to act as a “comprehensive infrastructure for network resource description.” Schottlaender affirms that “multiple schema are at work managing different types of objects.”
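Dublin Core’s simple element set is concrete enough to sketch in code. The following Python fragment (an illustrative sketch of my own, not drawn from Schottlaender’s article; the record values are invented) builds a minimal Dublin Core-style record as structured data and serializes it to XML with the standard library, showing how a metadata schema names the fields while an encoding schema carries them:

```python
# Illustrative sketch: a minimal Dublin Core-style record as structured
# data, serialized to XML with Python's standard library. Element names
# follow the Dublin Core element set; the sample values are invented.
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def to_dc_xml(record: dict) -> str:
    """Serialize a dict of Dublin Core elements to an XML string."""
    ET.register_namespace("dc", DC_NS)  # use the conventional "dc" prefix
    root = ET.Element("metadata")
    for element, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# A hypothetical record describing a "document-like object".
record = {
    "title": "Why Metadata? Why Me? Why Now?",
    "creator": "Schottlaender, Brian",
    "type": "Text",
    "format": "text/html",
}

print(to_dc_xml(record))
```

The dict is the descriptive layer (which characteristics to record); the XML serialization is the encoding layer—two cooperating schema in Schottlaender’s sense.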

Next, Schottlaender addresses the relationship between the library cataloging community and metadata. Metadata concerns content management and cataloging involves ordering relationships among content. Furthermore, cataloging pertains to standards, controlled vocabulary, and systematic description and classification. In the online environment and other non-library cataloging communities, metadata’s usefulness is now being seriously considered and catalogers have the experience to offer input about standards and designing effective systems. Schottlaender argues that cataloging and broader metadata applications have correlating goals: to help users access, find, and choose information.

Schottlaender observes that almost all schema are library-based, such as AACR2 and LCSH. Yet, he says that outside of the library cataloging community, the “utility and desirability of content standards” have become progressively more apparent, especially in the online environment. To cope with massive amounts of data, catalogers are being consulted by the metadata community and cooperation is growing, such as Dublin Core building on FRBR’s model. Schottlaender predicts further cooperation because of the cataloging goal to identify unique information and because of catalogers’ greater experience with rights management, which metadata communities need given the commercial and legal implications of the online environment.

Schottlaender concludes that challenges abound for cataloging and metadata communities to jointly tackle, most of which involve the online environment. One such challenge is the fact that most online documents lack permanence, or are not stable over time or anchored to a particular location. Having controlled vocabulary to allow compatibility across different content communities is another challenge. However, the most significant challenge is interoperability, whether in language or structure, between different systems to effectively communicate or exchange data.

From this article, I learned that metadata is unavoidable and that metadata use outside of the library cataloging community demands cataloging standards and catalogers’ proficiency. Still, the reader is left wondering exactly what metadata is after some ten pages. Also, Schottlaender’s title reads like an existential crisis, and he implies that is precisely what the proliferation of data has provoked. His article presents a somewhat biased opinion that non-library communities are in desperate need of professional LIS assistance to deal with data, and it seems to suggest that any metadata innovation from outside the library cataloging community would be entirely futile.

Article Summary for Lecture #4 – Creider

Creider, Laurence S. “Cataloging, Reception, and the Boundaries of a ‘Work.’”

Theorists inside and outside of library science have offered multiple notions of what a “work” is, but have not achieved consensus. The idea of a work and its different expressions is important because of Cutter’s choice objective and, hence, the fulfillment of user needs. FRBR’s advent brought into focus the issue of identifying not just what an institution holds by a particular author, but also which versions and formats are available.

Creider explains that, although the definition of a work and its boundaries is “fuzzy,” theorists agree that a work is intangible. For example, the FRBR authors describe a work as “no single material object one can point to,” a “distinct intellectual or artistic creation.” However, Creider’s goal is to clarify the ongoing problem of cataloging “distinct” works. In other words, what criteria determine when a work changes so substantially from the original that it is no longer a different expression but has become a new work? This is an important dividing line because it affects cataloging decisions, such as authority control, and shapes user choice, notably through access points.

Currently, catalogers treat changes to the medium of a work as part of a subset of the original, not as a new work. Boundaries of a work are somewhat governed by cataloging rules, but Creider asserts that “these rules are inadequate and result in many gray areas.” Why is a revised edition a new expression in FRBR, but a new work in AACR2? Movie versions of plays are rarely designated as new works, instead being classified as a change of genre. Creider highlights a play translated from Italian to German, with different scenes, as typifying boundary debates.

Catalogers have proposed some principles concerning the boundaries of a work. Creider cites Patrick Wilson, who suggested that any textual alteration constitutes a new work; yet, Wilson’s proposal is not tenable since it disregards translated editions. The FRBR authors concluded that a new work is derived from individual effort, but their definition eliminated the influence of contemporary social factors from the understanding of a work. In contrast, Creider concludes that boundaries are mostly defined by the reception of a work. Accordingly, a work is a mental construct existing independently and concurrently in the minds of users, including authors, editors, publishers, catalogers, and readers. To select which work they need, users interact with the reception of a work, whether in class, conversation, or social media.

Creider substantiates his argument about reception with examples of medieval European manuscripts, St. Anastasius’ biography and Gregory of Tours’ historical manuscript being his major examples. St. Anastasius’ life and martyrdom were originally written in Greek, crudely translated into Latin, and finally amended with “literary” revisions. These versions revised the original text over time, and scholars study them as different works. On the other hand, Gregory of Tours’ Histories presents an extreme case in the boundaries of a work. Copyists and historians replaced his manuscript’s title with versions they preferred, and his theme of ecclesiastical centrality in French history was altered in favor of a cultural story of Frankish kings’ dynastic accomplishments. Nevertheless, scholars consider the multiple editions of Gregory’s Histories to be one work. Creider’s examples show that “catalogers cannot take a quantitative approach or rely on authorial intent to determine the boundaries of a work.”

Creider’s main point is that “the state of the reception of the text” is the best way to determine when a work diverges. His most important assertion is that boundaries are not easily distinguished using only a single paradigm. Cataloging is an ongoing historical process, in which the definition of a work is evolving and subjective. I found it notable that Creider says works must be re-examined as new versions emerge, as people’s views change, and as scholars issue new opinions. My only criticism of Creider’s article is that he debated various perspectives, yet concluded something already discernible—the theoretical idea of a work will always remain elusive.

Article Summary for Lecture #3 – Russell

Russell, Beth M. “Hidden Wisdom and Unseen Treasure: Revisiting Cataloging in Medieval Libraries.”

According to Russell, medieval libraries are overlooked in cataloging history, with innovations of the nineteenth and twentieth centuries being library literature’s predominant focus. She argues that cataloging history should be seen as more of a linear progression, with medieval cataloging included because of the problems that catalogers faced and the solutions they devised. Indeed, Russell claims medieval catalogers are comparable to modern catalogers given the similar organizational challenges confronted by both groups.

Overall, Russell evaluates how medieval libraries attempted to catalog materials. Most cataloging occurred in a monastery or cathedral, before shifting into the university in the later medieval period. Neither single-system cataloging standards nor general cataloging theories existed; therefore, medieval cataloging was centered at the institutional level and catalogs were unique to their specific locations. As Russell states, “medieval librarians were driven by utilitarian needs to develop cataloging practices that would work in their particular situations.” The primary goal was to provide access to local titles, but over time and with increasing collections and growing research needs, catalogs increased in sophistication.

Russell discusses different medieval cataloging methods created to organize or classify titles. The inventory catalog’s main function was to identify titles in a collection rather than to provide detailed descriptions. Physical storage location was a key component of cataloging; for example, liturgical books were kept near the chapel. University libraries, including the Sorbonne, divided books based upon whether they were for limited or wide circulation, housing them in separate rooms with individual access keys. Medieval catalogers could precisely indicate location through shelf lists and by assigning letters to volumes. Sometimes collections were physically described, varying from “big and pretty” in a Cambridge catalog to Cistercian documents detailing book material and binding. Russell draws a parallel between medieval and current cataloging, noting that some later medieval catalogs contained the opening words of titles in their collections to distinguish between multiple copies of the same text.

As library collections grew and became increasingly secular at universities, catalogs evolved to better meet access needs. Russell suggests that medieval catalogs correlate to modern counterparts due to their adoption of alphabetically-based subject cataloging, although the medieval notion of subject categories was restricted. Medieval cataloging was not without problems, namely composite volumes bound together because of shared authors, corresponding subjects, or like value. Some catalogs listed only the first title in composite volumes, while others listed each individual text within. In contrast, the Sorbonne’s cataloger created a master analytic catalog, making a table that charted bound-volume contents, including titles and opening lines, thus enabling users to locate individual texts. Russell feels “this is a strong argument against the claim that later catalogs were more sophisticated than earlier documents.” She concludes that modern catalogers can gain examples of innovative practice from medieval catalogers.

Russell’s article taught me that, even in medieval times, users were at the center of the catalog format. Medieval catalogers needed to know “how to let users of the library know what was in the books” of their collections. Interestingly, medieval cataloging problems, such as multiple editions, persist into the present. I now appreciate the level of organization and ease of access I experience when in a library. Yet, Russell presents her argument with narrow evidence, especially given the few libraries referenced in the article. Admittedly, researching medieval catalogs is hindered by a lack of surviving sources. One divergence she did not address is that users of medieval libraries were privileged persons in their respective societies, so the concept of access was markedly different from modern libraries’ user policies. Finally, Russell claims that revisiting medieval cataloging is her own special insight, and I was curious whether other scholarship corroborates her thesis.