Generative-AI Summarization

Ann Blair's book Too Much To Know overflows with techniques that pre- and early-modern scholars used to deal with information overload. [1] One of the more frequently used techniques is summarization.

With the advent of generative-AI, it is almost trivial to create more-than-plausible summaries of documents. The linked Python script is an example. [2] Given the path to a plain text file, the script will load a configured large language model, vectorize the given plain text file, compare the two, and output a three-sentence summary.

I enhanced the script to work in batch mode, and I have used it to summarize several collections of items:

  * each chapter in each book written by Jane Austen [3]
  * 250 journal articles on the topic of rheumatoid arthritis [4]
  * another 250 journal articles on the topic of climate change [5]
  * 130 articles on the topic of cataloging [6]

For any given document, no summary is 100% correct; everybody will summarize a document differently. That said, the results of this automated process look pretty good to me. Moreover, each list of summaries addresses difficult-to-answer questions such as:

  * how can Jane Austen's works be characterized?
  * what is rheumatoid arthritis and what are some of its treatments?
  * how is climate change being manifested across the globe?
  * how has the practice of cataloging changed over time?

The lists of summaries may be deemed information overload in and of themselves, and one might consider summarizing the summaries. That exercise is left to the reader.

I believe libraries and librarians ought to learn how to exploit generative-AI for summarization purposes. Just as the migration from printed cards to MARC transformed how libraries hosted catalogs, migrating from hand-crafted summaries to computed summaries will transform how information overload is managed.

[1] Blair, Ann. 2010. Too Much to Know: Managing Scholarly Information Before the Modern Age. New Haven, Conn.: Yale University Press.

--
Eric Lease Morgan
June 27, 2024
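
The general shape of the process described in this post (vectorize a text, compare, and output three sentences) can be illustrated without a language model at all. What follows is a minimal sketch in Python and is not the linked script: it swaps in plain word-count vectors and cosine similarity for the large language model, and it selects existing sentences rather than generating new ones. The function names `vectorize`, `cosine`, and `summarize` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Return a bag-of-words vector (word -> count) for the given text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Return the cosine similarity of two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def summarize(text, size=3):
    """Return the `size` sentences most similar to the document as a whole.

    This is an extractive stand-in (an assumption of this sketch), not a
    generative summary: sentences are copied from the text, never rewritten.
    """
    # naive sentence splitting on terminal punctuation followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    document = vectorize(text)

    # rank sentence positions by similarity to the whole document
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(vectorize(sentences[i]), document),
        reverse=True,
    )

    # keep the best-scoring sentences, restored to their original order
    keep = sorted(ranked[:size])
    return " ".join(sentences[i] for i in keep)
```

Calling `summarize(open(path).read())` on each plain text file in a directory would give a crude batch mode, and passing a concatenated list of summaries back to `summarize` is one way to try summarizing the summaries.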