Seizing the Digital Humanities Moment: Envisioning a Medieval Middle East History Source Book as an Open-Access Database

On 24 November 2014, Steve Tamari (Southern Illinois University Edwardsville) convened round table [R3670] at the annual meeting of the Middle East Studies Association (MESA) in Washington, DC: “Is There a Need for a New Primary Source Reader of Pre-Modern Civilizations?”  Below follows the précis which I had submitted as one of the round table participants.  I was the only participant to argue for a digital Open-Access textbook.

I am proposing to develop a source book of Middle East history between 600 and 1800 CE as an Open-Access (OA) digital database.  In the US the majority of today’s undergraduates are digital natives.  They may prefer printed books for some of their reading, but most of them will have grown up with digital devices and social media.  Taken together with the rise of Massive Open Online Courses (MOOCs) and the Digital Humanities (DH) initiatives of grant-giving organizations, it thus seems possible to launch this new textbook on the Internet as an OA database.  Although its creation and medium-term maintenance pose their own technical and financial challenges, such an OA database would be accessible to students outside the US, and the creation of its contents could be organized as a collaborative transnational project, whose participants will have to decide which of the available platforms (e.g., Wiki, WordPress, Omeka, Flickr) would be most appropriate and cost-efficient for the project.

Organized as a database, a source book would offer the opportunity to cover the complexities of premodern Muslim societies in a more adequate fashion.  A printed book’s space limitations make it almost impossible to challenge the traditional views of Middle East history, whether they are Sunni, Shiʿi, Arab, Iranian, or Turkish, and to expand introductory courses on Middle East history, politics and economics so that culture and the arts and sciences will also be covered.  In contrast, the storage capacities of a digital database would make it possible to accommodate the full diversity of the preserved sources, such as documents pertaining to Zoroastrians, Jews, and Christians as well as Kharijis, Zaydis, Ismailis, Alevis, Babis, Bahais, and Ahmadis.  Moreover, only an OA digital database can take full advantage of the OA depositories of digitized manuscripts and printed books, art objects, audio files, or coins already available on the Internet (e.g., Women’s Worlds in Qajar Iran, Eastern Art Online, Refaiya Library).

To develop this textbook as an OA resource would allow scholars to gain a modest degree of independence from the pressures of commercial academic publishers, while remaining mindful of the steadily increasing costs of a US college degree.  Since textbooks are an important source of revenue, publishers vigorously defend their copyright claims whenever they negotiate the classroom uses of copyrighted material with colleges and universities.  Moreover, the database’s diverse contents will help instructors to regularly vary reading assignments.

Amended, 6 March 2022

PS 1 – In October 2021, I posted this 2014 précis here on my blog, as I was reviewing the resource collections of the Invisible East programme at the University of Oxford in preparation for an ultimately unsuccessful job interview for a fourteen-month position as research associate.  Arezou Azad (University of Oxford) directs the programme.  Its most general goal is to initiate a paradigm shift in research about the history of Iran, Afghanistan and Central Asia between the eighth and the thirteenth century CE by highlighting the diversity of extant and accessible written sources in New Persian, Judeo-Persian, Arabic, Bactrian, Sogdian, Khotanese and Middle Persian.  In order to effect this paradigm shift, Dr. Azad and her team are committed to the creation of an Open-Access digital corpus.  This forthcoming collection of written witnesses is not mentioned in the posted project description, and so far few concrete details have been published, while the work is under way.

PS 2 – In 2021 two important articles about the digital sources of historical research – both digital surrogates and born-digital – were published in the American Historical Review (AHR): Itza A. Carbajal and Michelle Caswell, “Critical Digital Archives: A Review from Archival Studies,” AHR 126.3, pp. 1102-1120; https://academic.oup.com/ahr/article/126/3/1102/6421763; and Joseph L. Locke and Ben Wright, “History Can be Open Source: Democratic Dreams and the Rise of Digital History,” AHR 126.4, pp. 1485-1511; https://academic.oup.com/ahr/article/126/4/1485/6525105.  Locke and Wright are tenured professors of American history who review the ethical and financial challenges that impact the capacity of digital humanists to create Open-Access teaching resources for digital history at the contemporary American neoliberal university.  While the two men explicitly restrict their survey to the United States (p. 1486 n.3), their analysis of the tension between the desire for Open-Access pedagogical resources and the persistent challenges of how to properly recognize and fairly remunerate the indispensable – yet mostly invisible – labor of their creation is relevant to Digital Humanities projects in general.  In contrast, Carbajal and Caswell are archival practitioners who are caring for archives of Latin American and South-Asian American minorities, respectively, while being a doctoral student and a tenured professor of information studies (pp. 1104-1105).  Their article surveys the archival practices that accompany the current flowering of so-called digital archives projects.  The two women identify seven key themes in the theory and practice of digital records management in archival studies in order to increase historians’ awareness of the complexities that deeply affect the digitally available sources of historical research.

Updated, 6 March 2022 

“All You Can Do with Catalogs”

In 2015 the Forum Transregionale Studien (TraFo) in Berlin awarded Paola Molino, at that time Alexander von Humboldt Postdoctoral Fellow at the Ludwig Maximilians Universität (LMU) in Munich, a grant for the organization of an exploratory workshop on information management in early modern societies.  While working on her application, Paola Molino had invited Martina Siebert, Guy Burak, and me to join her as co-convenors.  The workshop was held in Berlin on 6 October 2016 in the Staatsbibliothek (SBB), and on 7 October 2016 in the rooms of the TraFo.

In February 2017 Paola Molino submitted her official final report about the workshop to the TraFo.  Her version was written with the co-convenors, with contributions by Anne MacKinney, and is available here.  The following text includes sections from earlier interim drafts, and is therefore more detailed.

This project began with a serendipitous crossing of the paths of four scholars working on the transmission of knowledge and the history of science in European, Middle Eastern, and East Asian societies.  All of us have extensive experience with libraries (as readers, catalogers, and librarians) and hence quickly found common ground in our abiding interest in the composition of finding aids between 1400 and 1800 CE.  In western Europe, during the early modern era, the transformation of feudal societies into territorial states prompted the ruling elites to invest in the construction of imperial libraries and archives, whose design projected transregional connections and supranational ambitions to the world at large.  Although new cataloging principles emerged for the collections housed within these new physical spaces, they did not explicitly break with the already recognized knowledge traditions, and rather attempted to integrate the established authoritative epistemes into new classificatory regimes.  These finding aids are fascinating objects in their own right: as artifacts they are primarily paper tools and, yet, their written contents can also be understood as a graphic representation of ideas.  Therefore, we decided to focus our exploratory workshop on the catalogues themselves.  One of our goals was to cross over the institutional barriers of memory institutions such as libraries, archives, and museums, as they often generate a confrontational relationship between readers and librarians.  We invited colleagues with a wide range of expertise to reflect on the roles of finding aids within the history of their own academic disciplines.
The transformation of concepts of knowledge, from fifteenth-century Humanism to eighteenth-century Enlightenment and nineteenth-century Positivism, has already received significant scholarly attention, and it has been studied from the bottom up through tracking the interpersonal transmission of knowledge, and from the top down by analyzing how imperial institutions, such as academies and universities, supported the diffusion of knowledge.  Against this backdrop, the workshop pursued the nexus between the catalogued items (whether written texts or material artifacts) and the concrete, practical power of a catalogue.  How were finding aids employed as instruments for transforming amassed holdings into a collection’s apparent order?  Conversely, how were cataloging ventures expressions of a ruler’s sophistication through the effective control of precious, rare assets?  In the daily business of doing research, catalogues are usually experienced as humble tools and inevitable intermediaries operating as transparent, and thus seemingly neutral, interfaces between readers and written texts.  We wanted to use the exploratory workshop for comparing finding aids in different cultural traditions in order to open fresh views of these very familiar resources, as if they had suddenly changed into unexplored territories.

The workshop comprised five sessions.  We were joined by fifteen established scholars and around two dozen registered guests.  In addition, we included four lightning talks by Sebastian Felten, Celeste Gianni, Anne MacKinney, and Julian zur Lage, since they are currently working on research projects related to the history of information management in a transregional perspective.  On the first day the workshop was held in the Simón Bolívar Lecture Hall, generously made available by the Berlin Staatsbibliothek.  Since the hands-on examination of a catalog’s handwritten or printed copy is an indispensable part of research on its intellectual history, we are grateful that the Staatsbibliothek allowed us to draw on its rich collections for a show-and-tell.  For the second day we convened in the rooms of the Forum Transregionale Studien.

The workshop opened with a session on the epistemology of catalogues, chaired by Nur Sobers-Khan, a curator at the British Library and a historian of Turco-Persian societies after 1500.  Paola Molino, Islam Dayeh, and Martina Siebert investigated how the construction of libraries and the design of their research facilities developed in conjunction with the organization of finding aids.  Molino focused on early modern Europe, Dayeh examined Arabic finding aids from the Arab world before 1500, and Siebert surveyed the development of Chinese bibliography between the first and the nineteenth century.  The speakers agreed that the refinement of classification schemes went hand in hand with a growing demand for the systematization of knowledge.  Particular attention was given to the technical terminology of classification schemes vis-à-vis the various purposes of bibliographical information, and to the appreciation of finding aids as intellectual achievements in their own right.  In the discussion, we explored the possibility of a methodology for the study of finding aids as sources for a transregional history of knowledge.  What is the impact of ideology on classification schemes?  To what degree are cataloging ventures driven by the universal human experience of loss and the complementary desire to prevent the destruction of cultural heritage?  What is the relationship between technological change in the reproduction of written language (e.g., manuscript books, blockprinted books, books printed with moveable letters), levels of book production, and approaches to the compilation of bibliographical information?

The show-and-tell highlighted some of the important Latin, Arabic, and Japanese finding aids in the Staatsbibliothek’s collections.  Ursula Winter presented the holograph of the Catalogus manuscriptorum by Johann Raue (1610–1679), the first librarian of Berlin’s Kurfürstlicher Bibliothek (Electoral Library, est. 1661).  In 1668, after Friedrich Wilhelm of Brandenburg (r. 1640–1688) had opened his Electoral Library to outside readers, Raue compiled the first catalogue of the new library’s manuscript holdings, arranging these codices according to how they were shelved within the library.  Raue’s Catalogus illustrated the possible interdependence between library architecture and a catalogue’s systematic arrangement (cf. the use of so-called shelf lists as strictly internal methods of inventory control).  Christoph Rauch and Dagmar Riedel explored how bibliographical information was transmitted in Muslim societies by contrasting two Arabic manuscript copies (dated 1724 and c.1840 respectively) of the Kashf al-ẓunūn ʿan asāmī al-kutub wa’l-funūn (“The removal of doubts from the titles of books and the scholarly disciplines”) with an Arabic fragment (dated 14th or 15th cent.) of the Wafayāt al-aʿyān (“Death dates of notables”).  The Wafayāt by Ibn Khallikān (1211–1282) is a bio-bibliographical dictionary and the Kashf by Katib Çelebi (1609–1657) a title catalogue in alphabetical order, but neither the Wafayāt nor the Kashf was designed as a finding aid for the holdings of a particular library.  Exploring the affinities between catalogues, anthologies, and book collections, Ronny Vollandt showed an Arabic manuscript (dated 1325) with an anthology of prophetic books from the Old Testament, al-Jawhar al-muḍīy fī’l-sittat-ʿashar al-nabī (“The essential content of the sixteen prophets”), and Christian Dunkel explained a private collection of Japanese bookseller catalogues.

The second session investigated catalogues as means to the mastery of knowledge, and featured presentations by Christian Jacob, Seth Kimmel, and Alberto Cevolini.  Arndt Brendecke, a historian of early modern Europe, presided over this session.  Drawing on Kantian epistemology, Jacob highlighted the power of catalogues.  He argued that knowledge is always bound to specific historical circumstances, so that the organization of finding aids reflects concrete human practices of the transmission of knowledge.  Comparing finding aids and maps, Jacob suggested that insights gleaned from research on maps can be employed to advance our understanding of information management through catalogues.  Kimmel used the ultimately failed project of a grand universal library, which the Spanish cartographer, explorer, and bibliophile Hernando Colón (1488–1539) had pursued in Seville, to explore tensions between the Humanist ideal of universal knowledge and Spain’s politics of conquest in the Americas.  In contrast, Cevolini focused on a mechanical indexing device for the storage of written notes and excerpts, known as the “ark of studies” and designed by the otherwise obscure Thomas Harrison (1595–1649) in the midst of the English Civil Wars.  Cevolini described the “ark” as an external memory, and interpreted it as a disruptive invention which showed how new cognitive habits were accompanied by new organizational strategies.  Approaching the “ark” from the perspective of the sociology of knowledge, Cevolini argued that from the 1450s onwards, after the invention of letterpress printing in western Europe, readers had to confront a dramatic information overload because of steadily increasing levels of book production.  In the discussion, Cevolini’s interpretation of the “ark” was challenged for its rather negative view of information management in manuscript cultures and its complementary teleological belief in the inevitable progress of technological change.

The second day opened with a session dedicated to the cataloging of books, handwritten or printed, in Arabic, Hebrew, and Persian.  The presentations by Christoph Rauch, Emile Schrijver, and Francis Richard, who all have worked as catalogers and librarians, combined an examination of the historical development of cataloging standards with observations about the impact of digitization on the access to books in the twenty-first century.  The session was chaired by Guy Burak, a librarian at New York University and a legal historian of the Ottoman empire.  Rauch used the history of the Berlin Staatsbibliothek’s Arabic manuscript collection to highlight the importance of scholarly expertise for the cataloging of texts in Semitic languages which were not widely taught at nineteenth-century German universities.  While Rauch presented the cataloging history of a state-owned collection, Schrijver explored the challenges posed by cataloging the books of a religious minority, and surveyed how the history of Hebrew bibliography reflects the precarious life of the Jewish diaspora in western Europe.  Because of the hearty embrace of digitization for the preservation of Jewish Schriftkultur, Schrijver examined how digital surrogates are changing the roles of both libraries and catalogs.  Since readers increasingly rely on global online catalogs in order to access books as digital surrogates in global online collections, such as those of the National Library of Israel, what will happen to the relationship between a library’s spatial organization and the systematics of its catalogs?  Richard’s presentation took as its starting point the cataloging practices in Muslim societies since the tenth and eleventh centuries.  Although there is much evidence for vibrant library traditions in Turkey, Iran, and India, very few catalogs of historical library collections have come down to us.
Richard observed that the librarian’s personal responsibility for a collection under his care might have worked as a disincentive for the compilation of publicly available finding aids, since a catalog can also be used to control the work of the librarian.  At the same time, Richard was sceptical about the current practice of ‘digitize first, catalog later’, arguing that digital surrogates of uncataloged books are effectively inaccessible as no catalog can be searched for unidentified items.  The discussion was dominated by questions about digital screens as today’s omnipresent interface between readers, catalogs, and books, since some well-funded western libraries are encouraging readers to set up online accounts in order to create their own digital collection of the depository’s holdings.  Does the access to the contents of books through digital surrogates imply changing ideas of who owns the physical artifacts and consequently pays for their cataloging?  What is the reader’s responsibility for the physical artifact if she is engaging only with its digital surrogate as downloaded onto her own computer?  We also observed that digital surrogates are accompanied by their own access barriers, since readers need a working internet connection in order to benefit from Open Access depositories such as Gallica.

The fourth session approached catalogs from the micro perspective of individual sample entries, and juxtaposed the British cataloging of Persian literature with the Ottoman cataloging of North African literature.  It was chaired by Ronny Vollandt, a Semitist and a specialist of biblical manuscripts.  Nilanjan Sarkar’s case study was the entry on a manuscript copy of the Fatāwā-yi jahāndārī (“Imperial legal opinions”) in the highly regarded and still indispensable Catalogue of Persian Manuscripts in the Library of the India Office (1903) by Hermann Ethé (1844–1917).  Although Ethé was a very accomplished scholar of Persian literature, he did not recognize that the Fatāwā is a work of advice literature which originated in Delhi around 1350, and wrongly identified a work of belles-lettres as an anthology of historical legal opinions.  Sarkar examined to what degree Ethé’s cataloging error reflected British colonial attitudes to the knowledge traditions of pre-colonial Muslim India.  Guy Burak and Dagmar Riedel used the entry on the Dalāʾil al-khayrāt (“Signs of good deeds”) in the aforementioned Kashf al-ẓunūn to demonstrate that scholars inside and outside Muslim societies approached this alphabetic title catalog as a work of pragmatic literature which everyone could adapt and correct in accordance with their own particular needs.  In different manuscript copies and printed versions of the Kashf, the entries on the Dalāʾil, which is a widely used prayerbook by the North African Sufi Ibn Jazūlī (1404–1465), vary considerably.  These variances can nonetheless seem insignificant, since this prayerbook is so well known.  In the discussion we returned to the point, made by Christian Jacob during the second session, that catalogs are never neutral collections of facts as their production cannot be independent from the ideological commitments of their compilers.  But we also explored the importance of errors and misreadings for the transregional diffusion of knowledge.

The global historian Sebastian Conrad chaired the workshop’s fifth and final session on catalogs of books related to East Asian societies. Michael Facius, Florence Hsia, and Joachim Kurtz discussed synchronicity in knowledge management, and challenged the evidence of transregional influence and interdependence in order to probe the nature of knowledge circulation.  Facius analyzed how the library of the Tokugawa Shogunate (1603–1867) served as an important node in the knowledge networks of early modern Japan.  He examined the relationship between the catalogs of the Shogunate Library and the Nagasaki commissariat’s control of the import of books in Chinese and other foreign languages.  Hsia used the historical development of sinological archives in early modern Europe to pursue the sociological dimensions of list-making.  She examined in particular the challenges posed by the task of cataloging Chinese texts within the Jesuit tradition of bio-bibliographies, and the efforts of Thomas Hyde (1636–1703) to identify the Chinese books held at the Bodleian Library in Oxford.  Joachim Kurtz took the torrent of publications translated into Chinese between 1895 and 1911 as an indicator and a factor in the drastic remaking of China’s intellectual landscape in the waning years of the Qing dynasty (1644–1912).  These catalogs were compiled by publishers as well as scholars and reformists, and range from thinly veiled advertisements to analytical reviews of new branches of learning.  Taken together, they provide ample evidence for changing intellectual emphases, new epistemic ideals, and consequential taxonomic shifts that hastened the demise of China’s imperial order with the end of the Qing dynasty.

In sum, we organized the workshop in order to examine catalogs as intellectual enterprises and material artifacts within a transregional framework.  Its starting point was a gesture of inversion, since usually catalogs are consulted for reference purposes, and not studied in their own right.  The workshop’s focus on the comparative analysis of catalogs from a wide range of European, Middle Eastern, and East Asian societies allowed us to explore similarities and differences in their compilation, while being mindful of the dynamics between catalogers and readers.  The intellectual generosity of all participants ensured stimulating debates that revealed the potential of not yet explored sources and yielded numerous new ideas for future research projects.  Venturing beyond the comfort zone of one’s own discipline is always a challenge, and we deeply appreciate that the Forum Transregionale Studien gave us the unique opportunity to take this risk.

Working with Islamic Manuscripts in the Best of All Possible Worlds

From the last decades of the eighteenth century and for at least a century and a half, Britain and France dominated Orientalism as a discipline.  The great philological discoveries in comparative grammar made by Jones, Franz Bopp, Jakob Grimm and others were originally indebted to manuscripts brought from the East to Paris and London.  Almost without exception, every Orientalist began his career as a philologist.

Edward W. Said, Orientalism, 1979

Historiographical debates, when they stray beyond the internal logic of the field, generally discuss the social or political relevance of new paradigms or approaches, but rarely do they examine the extent to which our scholarship may be shaped by the institutional makeup of our profession.

Nicolas Barreyre et al., “‘Brokering’ or ‘Going Native’: Professional Structures and Intellectual Trajectories for European Historians of the United States,” American Historical Review 119.3 (2014)

In the historiography of Middle Eastern and Islamic Studies we concentrate on Orientalism and Islamophobia, since self-critique is even harder when a scholarly discipline feels unfairly singled out and criticized.  We are vocal in our critique of Orientalist scholarship which produced the Western mirage of the timeless Orient in the nineteenth century.  But we are reluctant to provide further ammunition to those who are already gunning for us, since we are continually confronted with the question of why on earth anyone would study a civilization or a religion that is responsible for … (and everyone will draw on their own experience for the completion of this sentence): terrorism, oppression of women, religious fanaticism, etc.  While much research in Middle Eastern and Islamic Studies necessarily focuses on contemporary Muslim societies, I find it curious that in the last decade manuscripts and books in Arabic script have begun to attract much more attention.  In recent years, François Déroche, Adam Gacek, and Jan Just Witkam have regularly offered five-day introductions to Islamic codicology in Europe and North America.  Historians and literary critics have published studies about Islamic book culture, drawing on statements preserved in literary sources and paratexts, such as ownership statements and reading certificates, though rarely connecting the literary evidence with the material evidence of the manuscripts and printed books themselves (e.g., the special issue of JAIS 2012 on “The Book in Fact and Fiction in Pre-Modern Arabic Literature”).  In research on the history of science, technology, and medicine, the trend is still to explore how a certain intellectual milestone was first reached in Muslim societies before anyone in Christian Europe managed to do so (e.g., the project on “Scientific Traditions in Islamic Societies: Intellectual, Institutional, Religious, and Social Contexts,” McGill University).
The follow-up question of what happened to all these grand ideas after their initial conception seems much less popular (e.g., the project of Sonja Brentjes and Jürgen Renn on “Islamicate Transformations of Knowledge,” Max-Planck-Institut für Wissenschaftsgeschichte).  Moreover, there is little interest in harnessing the advances in Humanities computing to improve access to this material evidence through the creation of digital catalogs.  For the time being, we cannot match literary works with identified copies, whether these are accessible, alleged to be extant, or assumed to be lost, as there is neither a complete inventory of documented works written in a language that uses Arabic script (cf. Leuven Database of Ancient Books), nor a catalog of known copies of manuscripts in Arabic script (cf. Universal Short Title Catalog).

As regards the role of the Digital Humanities in research on the Islamic book, we seem largely content to limit their application to publishing on the Internet, though primarily as digitized books or articles, and not as born-digital publications.  While many scholars in Middle Eastern and Islamic Studies are maintaining personal websites and weblogs, employing Computer Science to answer research questions needs to be distinguished from digital publishing on the Internet.  Significant resources continue to be dedicated to the production of digital surrogates, and the number of digitized manuscripts and printed books in Arabic script, many of which are available for free on the Internet, is steadily increasing.  It is rarely noted, though, that free online access to the digital surrogates of insufficiently cataloged manuscripts and printed books does not automatically make their contents available.  The proud press releases are usually very reticent about the indispensable cataloging, which has become a little appreciated and largely ignored activity since Edward Said first associated manuscripts with philology.  Nonetheless, the consequences of insufficient cataloging in combination with poor bibliographical reference works are severe and far-reaching.  As long as we have at best some random bits of information about some works and their extant copies, we have a very limited grasp of how the works to which we happen to have access are related to the intellectual life of any particular period of Middle Eastern history between the seventh century CE and the present.  For example, there is no research on the best practices for assessing survival bias in any corpus of manuscripts or printed books in Arabic script.

Against this backdrop, it seems rather unlikely that in the foreseeable future scholars in Middle Eastern and Islamic Studies will obtain the institutional resources to embark on even one of these cataloging projects, be it the inventory of works or the inventory of their copies, however urgently they are needed.  Their coordination will demand not only expertise in Middle Eastern and Islamic Studies but also experience with large-scale Digital Humanities projects and the development of a global network of participating institutions in order to guarantee their financial viability.  The funding mechanisms for research in the Humanities in general and in Middle Eastern and Islamic Studies in particular provide few incentives for embarking on such a complex project which will primarily benefit future generations.  Aside from the practical challenge that even the most generous grant cycle will be unable to accommodate a decades-long project, whoever finally manages to embark on either project will probably not live long enough to see it reach maturity.

This dispiriting situation raises the practical question of how to design meaningful research projects that make the most creative use of the already available resources and digital tools.  What seems feasible are clearly limited studies that examine Islamic books in synchronic and diachronic contexts.  Synchronic projects would focus on book production in order to establish criteria for the cataloging of both the literary works and their material support, whether they are manuscripts or printed books, while diachronic projects would trace the circulation and reception of a range of literary works in Arabic and Persian from the Abbasid era (750-1258 CE) to the present.  Both types of project necessitate the codicological analysis of manuscripts and bibliographical research on printed books, so that the project outcomes should combine the publication of a study, whether a book or an article, with an online depository for the accumulated codicological and bibliographical data.  To establish these new publication standards for research about manuscripts and printed books in Arabic script could perhaps even serve as the first baby step towards the organization of an inventory of either works or copies, if scholars working on related subjects agree to contribute their codicological and bibliographical data to a shared Open-Access depository (cf. Open Context which organizes the review, documentation, and Open-Access publication of primary data in cultural heritage related fields).

PS – On 7 July 2014, Nur Sobers-Khan and Ursula Sims-Williams published their post about “A Newly Digitised Unpublished Catalogue of Persian Manuscripts” on the Asian and African Studies Blog of the British Library (BL).  The draft of the never completed third volume of the Catalogue of the India Office Library’s Persian manuscript collection, written by C. A. Storey (1888-1968), Reuben Levy (1891-1966), and A. J. Arberry (1905-1969), is now available as a digital surrogate on the BL’s website (Mss Eur E207/1-38).  The unit of digitization is the individual page, and it is impossible to use full-text search for finding information about particular works or specific copies.  In their blog post, the authors explain which indices are available and how the catalog’s 38 separate folders can be browsed by topic or searched by call number.

Partial subject index to folders 5-9, History, by C. A. Storey.
London, British Library, Mss Eur E207/5, fol. 1a,
available at: https://www.bl.uk/manuscripts/Viewer.aspx?ref=mss_eur_e207!5_f001r.
Screen capture, 9 July 2014.

C. A. Storey’s notes about MS pers. India Office Islamic 3739.
London, British Library Mss Eur E207/8, fol. 75a,
available at: https://www.bl.uk/manuscripts/Viewer.aspx?ref=mss_eur_e207!8_f075r.
Screen capture, 9 July 2014.

That the British Library chose instead to obtain a grant for the creation of 3,778 digital images suggests that British manuscript curators did not consider it feasible to integrate these draft descriptions into Fihrist, the British union catalog for manuscripts in Arabic script.

Updated, 9 July 2014

Corrected, 6 August 2014

The Digitization of Books in Arabic Script and the Digital Divide in Muslim Societies

How could future initiatives for the digitization of manuscripts and printed books in Arabic script respond to the practical and ethical challenges posed by the digital divide between rich and poor in Muslim societies in Eurasia and Africa?  Despite the naturalization of e-texts in Arabic script among those who have managed to cross over, the current uses of digitization in Muslim societies do not address this digital divide.

It is well publicized in the mainstream media in Europe and North America that poverty and underdevelopment in many Muslim societies continue to be exacerbated by bad governance as well as political instability, religious and ethnic violence, civil wars, and occupations by foreign powers.  While the importance of digital literacy beyond the sophisticated uses of smart phones is increasingly stressed in Europe and North America, the digital divide in Muslim societies is rarely noticed.  Its invisibility to outsiders seems to follow from the fact that the western perception of Muslim societies is dominated by the actions of either westernized elites or Islamist terrorists, and both groups are committed Internet users.  Since 2009 the news about democratic protest movements in Iran, Tunisia, Egypt, or Turkey has been associated with their savvy employment of social media, in particular Facebook, YouTube, and Twitter.  At the same time, the broad surveillance of all forms of digital communication by organizations such as the NSA is still justified by the observation that al-Qaeda and other Islamist movements too rely on the Internet to organize their followers.  But across the Middle East, South Asia and North Africa the engagement with social media and the Digital Humanities is limited to small and highly privileged segments of the population.  Only a minority of students manages to gain access to prestigious institutions of higher learning such as the American University of Beirut (AUB) where earlier this year the Faculty of Arts and Sciences organized a first Digital Humanities workshop.  Unfortunately, this workshop was hosted by AUB’s Department of English, and not by its Department of Arabic and Near Eastern Languages.

Independent of the uses of digital media and the Internet in the political discourse, in the second decade of the twenty-first century, the digitization of manuscripts and printed books in Arabic script has been smoothly integrated into the pragmatic traditions of Islamic bookmaking that for centuries focused on facilitating the access to written texts by whatever means necessary.  For Islamic civilization combines the reverence for written texts, which originated with the revelation of the Quran to the Prophet Muhammad in the early seventh century CE, with strong oral traditions.  Consequently, the adaptation of digitization to bookmaking was not hampered by theoretical concerns for the ontological differences between books such as the nineteenth-century manuscript copies of thirteenth-century manuscript originals, lithographs, typeset books, microfilms, or digital surrogates: they are all texts.  Historicist awareness for the authentic material artefact and its facsimile or forgery is as irrelevant as legal concerns about copyright law and best practices within the Digital Humanities: as long as the text itself seemingly does not change, it does not matter in which medium a book is reproduced and can be read (see the report of David Hirsch (UCLA) about his 2012 workshop for Iraqi librarians in the TARII Newsletter 8/1 (2013): 22-23).  Nor is there any debate about the carbon footprint of digital hardware and software and about the technical problems of the secure long-term preservation of e-texts in societies where many citizens are struggling with access to electricity.

Since the late 1990s the number of websites that offer free access to Arabic, Persian, Ottoman, or Urdu literatures – delivered in a range of formats, though with a slight preference for downloadable pdf-files – has been steadily increasing (see the list of Textual Databases on the resource website of the Digital Islamic Humanities Project at Brown University).  In addition, foundations such as the Imam Zayd Cultural Foundation and the Iran Heritage Foundation (IHF), as well as philanthropists like Yousef Jameel are underwriting the digitization of illustrated manuscripts in Arabic script, together with the digitization of other Islamic or Middle Eastern artefacts, in public and private collections in Europe and North America, thereby reclaiming these material objects as their cultural heritage.  It depends on the mission of the respective private sponsor to which degree these digital surrogates are also intended as means to the end of giving a boost to particular religious or national goals through pretty pictures on computer screens (see for example the Persian Manuscript Digitization Project at the British Library).

The extent to which the reading of e-texts has become the new normal among those with access to small personal computers or smart phones can be gauged by the lavish indices that have become a distinctive feature of academic books published in print in Muslim societies.  Considering the amazing power of relatively straightforward full-text search engines for text files, it is now customary to find in scholarly books specialized indices for personal names, tribal names, place names, Quran verses, first lines of classical poetry, and so forth.

It seems to me that as long as scholars who specialize in Middle Eastern, North African or South Asian Studies remain on the sidelines as the happy consumers of digital surrogates – which are, admittedly, great time-savers – digitization will not receive the critical attention which is urgently needed to address the practical question whether digitization is really the best and most responsible use of limited financial resources in order to improve access to the written texts of the Islamic civilization within the Muslim societies themselves.

PS.  On June 4, 2013, Sarah Zakzouk published an announcement on the blog Muftah about the Media and Digital Literacy Academy of Beirut (MDLAB) at the AUB.  The MDLAB is an extension of AUB’s Media and Digital Literacy University, and will focus on digital media literacy in Arabic.  In August 2013 it will hold its first session for fifty media scholars and students from Iraq, Jordan, Lebanon, Palestine, and Syria.  The working language of the MDLAB is Arabic, but for the August session the MDLAB has also invited communications scholars from Europe and North America, and they will teach in English.

Updated, 28 July 2013

PPS.  In early July 2013, a slightly different version of this essay was submitted to The First University of Lethbridge, Global Outlook::Digital Humanities, Digital Studies/Le champ numérique Global Digital Humanities Essay Prize.  The results were announced on 1 December 2013: 53 essays or abstracts in seven languages were entered into the competition, and the jury awarded four first and five second prizes; the essay’s older version was among the 16 submissions which received an honourable mention.

Updated, 1 December 2013

Prosopography and Social Networks in the Digital Age

On 17-18 May 2013, Will Hanley of Florida State University (FSU) led the First Workshop for PROSOP, which was held at Brown University.  The workshop was supported by a start-up grant from the National Endowment for the Humanities (NEH) and by Brown’s Middle East Studies program.  Will is a Middle East historian, and his research has, for example, explored Egyptian legal records from the late nineteenth and early twentieth century CE.  He is also the administrator of the ArchivesWiki of the American Historical Association (AHA), and his hands-on experience with this crowdsourced Wiki is informing his plans for this new Digital Humanities project.

I had applied to the PROSOP workshop because research on manuscripts and printed books in Arabic can yield significant prosopographic knowledge that is not limited to the names of authors.  Many books preserve paratexts such as ownership notes, statements about endowments (Arabic sing. waqf), certificates of transmission (Arabic sing. ijāzah), marginal notes (Arabic sing. ḥāshiya), or study and reading notes.  The paratexts reveal the names of people related to one specific copy of a written text, such as
•       author of a commentary on a specific work
•       author of an abridgement or epitome of a specific work
•       person who rewrites, revises or edits a specific work
•       scribe of a specific copy
•       illuminator of a specific copy
•       binder of a specific copy
•       publisher of a specific copy
•       owner of a specific copy (e.g., institution, dealer, private person)
•       reader of a specific copy
These names can be examined as concrete historical evidence for the production, circulation, and uses of manuscripts and printed books in Arabic script, providing insight into book production and the book trade as one aspect of the transmission of knowledge in Muslim societies.  Considering the overall scarcity of archival sources for the history of premodern Muslim societies, the systematic study of paratexts has the potential to dramatically increase our understanding of the social, intellectual and economic history of Muslim societies.  But two formidable obstacles continue to impede the study of paratexts, since few Middle East historians and literary critics are trained in the quantitative research methods commonly applied in the Social Sciences.  The first obstacle is the methodological evaluation of an assembled corpus of manuscripts and printed books as a statistically valid sample for both quantitative and qualitative analyses.  The second obstacle is the technical skill needed for a meaningful organization of the raw prosopographic data gleaned from paratexts.  Consequently, Stefan Leder’s collection of ijāzah from medieval Damascus (Les certificats d’audition à Damas 550–750 h./1155–1349, 2 vols. Damascus: Institut français d’Etudes arabes & Deutsches Archäologisches Institut, 1996-2000) did not initiate further publications of paratexts from manuscripts and printed books in Arabic script, even though in Middle Eastern and Islamic Studies it has been long recognized that paratexts are unique sources of historical evidence.  (For more about Islamic books as sources of prosopography, see the notes of my workshop presentation.)
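The roles listed above suggest one way in which raw prosopographic data from paratexts might be organized.  What follows is only a minimal sketch in Python, not an established cataloging schema: the class names, the field names, and the placeholder identifiers ("copy-A", "Person X") are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """Roles a person can hold in relation to a work or a specific copy."""
    COMMENTATOR = "author of a commentary"
    ABRIDGER = "author of an abridgement or epitome"
    REVISER = "person who rewrites, revises or edits"
    SCRIBE = "scribe"
    ILLUMINATOR = "illuminator"
    BINDER = "binder"
    PUBLISHER = "publisher"
    OWNER = "owner"
    READER = "reader"


@dataclass(frozen=True)
class ParatextEntry:
    """One name gleaned from one paratext; all field names are hypothetical."""
    copy_id: str        # identifier of the manuscript or printed copy
    person: str         # name as it appears in the paratext
    role: Role
    paratext_type: str  # e.g. "ownership note", "waqf", "ijāzah"


def names_by_role(entries, copy_id):
    """Group the names attached to one copy by the role they played."""
    grouped = defaultdict(list)
    for entry in entries:
        if entry.copy_id == copy_id:
            grouped[entry.role].append(entry.person)
    return dict(grouped)


# Placeholder records, invented purely to show the shape of the data.
SAMPLE = [
    ParatextEntry("copy-A", "Person X", Role.SCRIBE, "colophon"),
    ParatextEntry("copy-A", "Person Y", Role.OWNER, "ownership note"),
    ParatextEntry("copy-A", "Person Z", Role.OWNER, "waqf"),
    ParatextEntry("copy-B", "Person X", Role.READER, "reading note"),
]
```

Calling names_by_role(SAMPLE, "copy-A") groups the scribe and the two owners of that copy, while the same Person X reappears as a reader of "copy-B", which is the kind of link between copies that a systematic study of paratexts would try to capture.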

For the PROSOP workshop Will had brought together scholars with very different approaches to prosopography and a wide range of experiences with computer based research and the Digital Humanities.  The goal of PROSOP is the aggregation of datasets generated by microhistorical research so that the aggregated datasets can be subjected to macrohistorical analysis (see this 2010 poster illustrating Will’s vision for PROSOP).  The need for aggregation reflects the insight that every local event has international and transnational dimensions because all human beings are affected by violent conflicts and trade, whether this impact is consciously recognized (e.g., military engagements, commodity prices, climate change, epidemics) or not.  Will himself is right now working with computer scientists on a PROSOP prototype that will provide a website with a template which contributors can adapt to the needs of their specific prosopographic datasets.  The site’s search engine will execute global searches across all uploaded datasets.  In order to allow for flexibility in such a globally conceived data collection it will be necessary to avoid fixed category requirements, and Will expects that PROSOP will employ the Linked Data framework provided by the Semantic Web.
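To make the idea of a data collection without fixed category requirements more concrete, here is a minimal sketch in plain Python of statements stored as subject-predicate-object triples, the basic shape of Linked Data.  It does not use any RDF library and it does not describe PROSOP’s actual design; the dataset names, predicates, and values are invented for illustration.

```python
from collections import defaultdict

# A statement is (subject, predicate, object).  Each dataset may use its own
# predicates, so no contributor is forced into a shared set of columns.
# All identifiers below are hypothetical.
Triple = tuple[str, str, str]

datasets: dict[str, list[Triple]] = {
    "court_records": [
        ("person:1", "name", "Example Name"),
        ("person:1", "appeared_in_court", "Place A, Year 1"),
    ],
    "manuscript_notes": [
        ("person:2", "name", "Example Name"),
        ("person:2", "owned_copy", "copy-A"),
    ],
}


def global_search(value: str) -> dict[str, list[Triple]]:
    """Return every statement, in any dataset, whose object matches `value`."""
    hits: dict[str, list[Triple]] = defaultdict(list)
    for source, triples in datasets.items():
        for triple in triples:
            # casefold() makes the comparison case-insensitive
            if triple[2].casefold() == value.casefold():
                hits[source].append(triple)
    return dict(hits)
```

A search for "example name" returns one statement from each of the two datasets, even though the datasets otherwise share no predicates.  Whether the two records refer to the same person remains a judgment for the researcher; the sketch only shows how a global search can run across differently structured datasets.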

Will had structured the workshop as a series of presentations about different types of prosopographic datasets.  Most of our discussion therefore focused on how the website design and the technical requirements of the database template and its variable fields could be organized in order to accommodate our own idiosyncratic datasets and research needs.  With the hope that the debate will continue and that PROSOP will flourish, here are some reflections about PROSOP’s organizational challenges – just my two cents.

PROSOP’s Mode of Operation

Our own reasons for contributing prosopographic datasets to PROSOP indicated that we were interested in submitting datasets to a website that would serve two different purposes: the first is the safe depository for prosopographic research data which are no longer needed for our current work, and the second is an aggregated database whose big data collection promises synergy and serendipity.  Accordingly, the PROSOP website should have concise how-to pages for submitting and extracting datasets (cf. the Wiki “Contributing to Wikipedia”), as well as for searching PROSOP and for citing from its datasets and search results.

Will is passionate about his commitment that PROSOP be open to all, with no professional barriers to the submission of prosopographic datasets.  PROSOP will accommodate the research of bona fide historians and social scientists, as well as the work of genealogists who conduct their historical research as autodidacts and amateurs.  This debate was oddly self-referential, as we were discussing the social structure of digital data sharing in order to build a digital repository for social network research data.  Will distinguished between data sharing as collaboration among academic peers (e.g., Prosopography of the Byzantine World) and general-audience crowdsourcing, favoring non-commercial general-audience crowdsourcing over strictly academic data sharing (e.g., Open Context).  But in fields such as Anglo-Saxon literature and cuneiform studies, the interpretation of relatively scarce and arcane documents demands a high degree of scholarly expertise which in turn exerts a tyranny of quality over any collaborative project.  Nonetheless, a site open to all is bound to raise considerable anxiety, not only among contributing academics but also among those individuals and organizations whose funding will keep the project running, about the reliability of the submitted datasets.  During our discussion the Americanists were most eager to keep PROSOP accessible to researchers outside academia, as for them the painstaking genealogical research of autodidacts and amateurs is an enormously valuable resource (see Gordon S. Wood, “In Quest of Blood Lines: Review of François Weil’s Family Trees: A History of Genealogy in America,” New York Review of Books, 23 May 2013), even if much of this extramural research primarily generates fuzzy data (see Peter Hajek, “Fuzzy Logic,” in Stanford Encyclopedia of Philosophy, 2002, rev. 2010).

The two crucial issues for PROSOP’s mode of operation are the recruitment of collaborators and the site’s concrete uses, while the functional questions of how datasets are entered, stored, and extracted can be treated separately as a technical challenge.  Throughout the workshop we did not worry much about how to win active participants from all walks of life; after all, we ourselves were willing to give PROSOP a chance.  Will, however, had thought hard about the issue, which he addressed by highlighting the concrete scholarly benefits of data sharing.  My own sense of the situation is that the acceptance of and engagement with the project will depend not only on the site’s research utility but also on PROSOP’s association with professional organizations and its institutional ties, since both will directly impact the project’s social prestige in the academic community.  (For more thoughts about the social history of knowledge production, see my 2010 conference paper about the Encyclopaedia Iranica).

Will envisions PROSOP as a project without any top-down quality control so that possible contributors without formal credentials would not be scared off or censored.  But even a bottom-up project such as Wikipedia has a rating system for entries, and Will therefore insisted that all datasets in PROSOP will receive a “confidence score.”  The group accepted as practical and efficient the device of a straightforward questionnaire with control questions which would allow for a modicum of critical evaluation.  The questionnaire would ensure that contributors describe and evaluate datasets prior to their submission to PROSOP.  I found it salient that among a group of Humanities scholars there was a clear preference for a computer’s judgment calls.  Most of us had no problem with surrendering the final judgment of our datasets to an algorithm that would calculate the results of the questionnaire as a dataset’s numerical confidence score.  While I am still surprised by this trust in an algorithm, the preference may reflect the perception that a computer is less fallible and more transparent in its decisions than a (human) editor.
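How such an algorithm might turn questionnaire answers into a number was not settled at the workshop.  The following Python sketch shows one simple possibility, a weighted share of “yes” answers; the questions, their weights, and the 0-100 scale are all assumptions of mine, not PROSOP’s actual questionnaire or scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlQuestion:
    """One yes/no questionnaire item; keys, texts and weights are hypothetical."""
    key: str
    text: str
    weight: float


# Illustrative questions only; they are not taken from PROSOP.
QUESTIONS = [
    ControlQuestion("sources_cited", "Does every record cite its source?", 3.0),
    ControlQuestion("dates_verified", "Were dates checked against a second source?", 2.0),
    ControlQuestion("names_normalized", "Are name variants recorded consistently?", 1.0),
]


def confidence_score(answers: dict[str, bool]) -> float:
    """Return the weighted share of 'yes' answers, scaled to 0-100.

    Unanswered questions count as 'no', so gaps in the questionnaire
    lower the score instead of silently inflating it.
    """
    total = sum(q.weight for q in QUESTIONS)
    earned = sum(q.weight for q in QUESTIONS if answers.get(q.key, False))
    return round(100 * earned / total, 1)
```

The sketch also makes visible where the human judgment calls remain: someone has to choose the questions and decide how much each one counts, so the transparency of the resulting score depends on publishing those choices alongside it.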

PROSOP’s Professional Associations

In order to provide Will’s vision of a continually growing international website with additional support, it seems to me that PROSOP would benefit from being already in this early stage more closely linked to professional associations, even if these associations would come at the price of an additional layer of administrative duties.  As a project that Will single-handedly started in the USA it would seem logical to approach the professional organizations of American historians, archivists, and librarians, while reaching out to the Library of Congress (LoC) and the National Archives and Records Administration (NARA).  The conversation opener could be the fact that Will has received the blessings of an NEH start-up grant for PROSOP, while the concrete matter at hand would be the formal establishment of an advisory board, or something similar (cf. the division of labor among the collaborators of the Social Network and Archival Context Project).  The AHA may be of particular importance to PROSOP since the AHA does not limit its membership to academics.  How many of the workshop participants were, for example, AHA members in good standing?  In addition, the AHA has been actively engaged in fostering Digital History for more than a decade, and among its members are historians from other countries and continents.

PROSOP’s Institutional Ties

At the moment, PROSOP has a freestanding website at http://www.prosop.org.  In order to guarantee the secure storage of datasets contributed to PROSOP it seems necessary to plan already during the development phase for secure and regular backups of the site’s continually growing contents as well as for mechanisms that will allow for the uploading of datasets, the downloading of the template, and the extraction of individual datasets.  The secure storage of the uploaded datasets will be one of the incentives for contributing to PROSOP, but the secure storage presents a technical challenge because PROSOP will not merely aggregate a huge collection of individual files saved as text, PDF, or spreadsheet.  Since successful collaborative Digital Humanities projects such as the Stanford Encyclopedia of Philosophy (http://plato.stanford.edu/) or the Social Network and Archival Context Project (http://socialarchive.iath.virginia.edu/) are hosted on university servers, the question arises whether PROSOP would also benefit from an explicit institutional link with FSU where Will is a professor in the History Department.

PROSOP’s Model of Financing

PROSOP’s future depends on its financial viability.  Irrespective of where the website be hosted on the Internet, the maintenance of an actively growing website, which is designed as a digitally-born resource, is cost- and labor-intensive.  Columbia University Libraries, for example, only accepts active, web-based research projects, if a project has its own endowment dedicated to covering all costs associated with hosting the associated websites.  Aside from the daily maintenance costs which range from electricity to salaries for technicians, money will be needed for a separate research and development (R&D) team so that there will be regular updates to PROSOP’s underlying technology and visible web interface.  While the NEH stipulates that its projects are available as Open Access resources since they have received financial support from the US government, it may be worthwhile to explore not only the options of fundraising for a dedicated endowment and of an institutional sponsorship program (cf. the Stanford Encyclopedia of Philosophy International Association) but also the possibility of cost-recovery for nonprofit institutions (e.g., the secure storage of prosopographic research data; cf. that in the US the libraries of nonprofit colleges and universities rely on cost-recovery to give students and faculty affordable access to very expensive services such as InterLibraryLoan).

PROSOP’s Copyright and Licensing

The current version of the PROSOP website does not have any statement about the site’s copyright and licenses.  Considering the importance of copyright laws for education and research in the US, it may help with the further development of PROSOP if at least the most basic copyright and license issues are addressed while the first PROSOP prototype is still under development.  It is my understanding that Will’s contract with FSU as well as the stipulations of his NEH grant are relevant for determining the copyright of the website and the database design.  But I would otherwise expect that Creative Commons licenses should be able to solve most of the copyright issues related to the prosopographic datasets.  It may provide an additional incentive for the collaboration with PROSOP if the website has a section which explains, for example, the intellectual property rights of a researcher’s own datasets, or the legal status of prosopographic research based on archival documents and artifacts that are not in the public domain.  Since the goal of PROSOP is the aggregation of the greatest number of available prosopographic datasets, it may not be possible, though, for collaborators to freely choose a particular Creative Commons license for their individual datasets.  In any case, PROSOP’s section about copyright and licensing should be clearly linked to the how-to page about citing from PROSOP’s search results and datasets.

PROSOP’s Design

The debates about the design of PROSOP’s interface and database template were particularly fascinating because they revealed the extent to which knowledge production and knowledge transmission are culturally determined.  Some favored a user-friendly simple interface design, while others asked for truth in advertising, insisting that no glossy layout be used to hide the nitty-gritty complexity of a serious dataset template.  There was, however, agreement that there should be as few clicks as possible between the homepage and the search form or a particular dataset.

Within the group there were still some proponents for developing PROSOP as a relational database, even though Will and his computer science collaborators have already rejected this option.  Another recurrent theme in the discussion was the question whether a person’s name or a person’s association with a specific place and time would be the primary categories for organizing the prosopographic datasets.  This question strikes me as particularly important, since Will expects that PROSOP will allow for the spatial mapping of search results.

Since I myself have no practical experience with the setting up of databases, I have no specific wishlist for PROSOP’s database design.  But I am very much looking forward to the first PROSOP prototype going live so that I can start using its database template for my research on the production and trade of manuscripts and printed books in Arabic script.

PROSOP and the Ethics of Humanities Research in the Digital Age

Most of our discussion was taken up with very concrete questions about the quantitative and qualitative analysis of prosopographic research data.  Conversely, we had little time and energy left for a more general reflection on PROSOP within the concrete political and social realities of the second decade of the twenty-first century.  Of course, Will’s decision that PROSOP will not rely on relational database design is based on his philosophical rejection of essentialist categories in historical research.  My most general expectation is that PROSOP will manage to remain as transparent as possible about its organization, its funding, and its collaborators.  In addition to an active outreach to genealogists and scholars outside North America and Europe, I would find it particularly important that the future development of PROSOP take into account its carbon footprint and the digital divide inside and outside the US.

How Digitization Has Changed the Cataloging of Islamic Books

Once the [micro-] films are made, there is seldom any need for the scholar to go back to the books and documents themselves.

Richard D. Altick, The Scholar Adventurers, 1950, chap. VII

The Major had told him one day that in five years’ time no one would read any more.  Later, archaeologists would ponder on, argue about, what books had been for.  ‘It’ll all be telly; visual aids.’  ‘Then why are more books published every year?’ Ludo had asked, annoyed with him as usual. ‘Show me the figures, laddie.  Show me the figures.’

Elizabeth Taylor, Mrs Palfrey at the Claremont, 1971, chap. 9

The total number of all extant and accessible manuscripts in Arabic script is not known (Adam Gacek, Arabic Manuscripts: A Vademecum for Readers, Leiden: Brill, 2009, p. x).  Scholars whose research focuses on the history of the book in Muslim societies are of course aware of this fact, and there are at least two rough estimates on the table.  Geoffrey Roper (“The History of the Book in the Muslim World,” in The Oxford Companion to the Book, eds. Michael F. Suarez and H. R. Woudhuysen, 2 vols., Oxford: Oxford University Press, 2010, vol. 1, p. 323) has recently suggested that “more than 3 million MS texts in Arabic script” have been preserved in accessible collections worldwide, while the number of inaccessible manuscripts in private collections is anybody’s guess.  Roper pulled this number out of his hat, providing no explanation whatsoever as to how he arrived at it: What (catalogs and internal shelf-lists?) did who (staff or outside researchers?) analyze to determine the holdings of manuscripts in Arabic script all around the world?  What was counted (parchment and paper? fragments and complete codices?) and what was excluded (papyri and archival documents on paper?)?  But Roper’s number is important, because he was the general editor of the World Survey of Islamic Manuscripts (5 vols., London: Furqan, 1991-1994).  Moreover, his number is comparable to François Déroche’s estimate of about 4 million extant manuscripts in Arabic script (oral communication, Christoph Rauch, 5 January 2010), though I do not know in which context Déroche has suggested this number.
Since an estimate in the millions has an accordingly wide margin of error (for example, 1 percent of 1,000,000 is 10,000), the numbers put forward by Roper and Déroche serve, depending on one’s point of view, as rhetorical sleights of hand or effective didactic devices by adding a fleeting sense of fact-based mastery to the much more low-key observation that there are lots and lots of manuscripts in Arabic script dispersed in public and private collections worldwide.  I find it remarkable, though, that the estimates by Roper and Déroche leave room for optimism.  In the times of big data, collecting cataloging metadata for manuscript holdings in the low single-digit millions should be manageable.  Indeed, these holdings in Arabic script seem rather modest if compared to the estimate of more than 30 million Indian manuscripts, written in Sanskrit or vernacular Indic languages and preserved in India alone (Sheldon Pollock, “Literary Culture and Manuscript Culture in Precolonial India,” in Literary Cultures and the Material Book, eds. Simon Eliot et al., London: British Library, 2007, p. 87; for a discussion of the empirical data in incunable research, see Joseph A. Dane, The Myth of Print Culture: Essays on Evidence, Textuality, and Bibliographic Method, Toronto: University of Toronto Press, 2003, in particular chapter 2 on “Twenty Million Incunables Can’t Be Wrong,” pp. 32-56).

Specialists of Middle Eastern and Islamic Studies rarely discuss this situation and its impact on all aspects of their research.  We take immense pride in the riches of the Islamic manuscript tradition, and yet, we lament about the primary sources actually available for research (R. Stephen Humphreys, Islamic History: A Framework for Inquiry, rev. ed., Princeton: Princeton University Press, 1991, p. 25).  Although autograph manuscripts are very rare in Middle Eastern and Islamic Studies, there is no agreed upon process of ratiocination―comparable, for example, to the distinctions between Folio, Good Quarto, and Bad Quarto in Shakespeare Studies―for the compilation of a manuscript corpus that will allow for the preparation of a scholarly edition, whenever a complete census of all known and extant manuscript copies is neither feasible nor possible.  In subfields, such as Graeco-Arabic Studies and Papyrology, scholars draw on the editorial theory and practice as developed by Classical Philology.  But, in general, research based on manuscripts and printed books in Arabic script remains curiously disconnected from fundamental questions about the material evidence yielded by paleographical, codicological, and bibliographical analysis.

At the beginning of the twenty-first century, the number of Islamic manuscripts seems as uncountable as the number of Muslims, be it in the US or all around the world, as there is no central organization which can claim authority over the preservation of Islam’s cultural heritage.  The international diffusion of manuscripts and printed books in Arabic script reflects the ethnic, linguistic, and denominational diversity of the worldwide Muslim community, and explains why it is so difficult to track Islamic holdings in public and private collections.  Muslims, in contrast, for example, to members of the Roman-Catholic Church, do not belong to a faith tradition that unites its believers within a strictly top-down and binding hierarchy.  Since the Eurasian, African, and Asian nation states which are ruled by Muslim elites or have a Muslim majority population are currently confronted with much more pressing political and economic problems, it is rather unlikely that an Islamic counterpart to UNESCO will be established any time soon.  The preservation of books and their cataloging must take a backseat when people live in dire poverty and their lives are threatened by sectarian and ethnic violence.

The current state of cataloging manuscripts in Arabic script mirrors this complex political situation.  There is a tacit agreement, especially among western scholars of Middle Eastern and Islamic Studies, to bravely soldier on with individual research projects without sounding a clarion call for concerted action, as such a call will inevitably raise the specter of Orientalism.  Professional academic organizations, in particular MELCom International, MELA and TIMA, have made the cataloging of Islamic books, whether manuscripts or printed books, a focus of their work, even though they are fighting an uphill battle.  Decades of political pressure on the Humanities in Europe and North America have severely reduced government funding for basic research which does not promise immediate benefits to taxpayers outside the enchanted reading rooms of academia.  It is nice to know which books are in the library.  But a library catalog does not carry the prestige of original research; nor does it help with paying for new acquisitions, salaries, and building maintenance.

The severe funding shortages faced by private and public institutions have created an opening for wealthy individuals who have made the preservation and accessibility of this or that part of Islam’s cultural heritage their responsibility.  The non-profit foundations of their choice exert significant influence over the cataloging of manuscripts and printed books in Arabic script.  Since private and public institutions have to make strategic decisions about acceptable funding, sometimes, pecunia olet.  The strategic necessity to design feasible projects that can successfully compete for as much acceptable outside funding as possible does not ensure that the most deserving holdings receive the funding needed for their cataloging, as well as for preservation and digitization.  As long as the fierce competition for limited funding pits institutions against each other, the absolute merit of an Islamic book collection is less important than an institution’s ability to offer acceptable donors and grant-making agencies the best match for their funding priorities.

In theory, institutions with Islamic manuscripts and printed books have access to experts who determine the merit of uncataloged holdings in Arabic script and select items for cataloging, as well as for preservation and digitization.  Even though it seems logical to catalog all known and extant manuscripts and rare printed books before making decisions on those which should, and still can, be digitized, the creation of digital surrogates that are instantaneously made available on the internet irrespective of the quality of their cataloging is often considered a much better investment of limited financial resources.  There are obvious benefits to facilitating the digital access to the texts of manuscripts and rare printed books.  Digital surrogates are so readily accepted by scholars, because their primary function is that of any other book in any other format or medium: to preserve and display written language.  In addition, pretty digital surrogates offer an immediate esthetic gratification on computer screens that is out of reach for highly technical cataloging in a digital database.  No one gets excited about access to correct and detailed metadata for manuscripts and printed books, even though poorly cataloged holdings are effectively lost to scholarship.  I was told by a Columbia University librarian that it would be impossible to obtain funding for the descriptive cataloging of Columbia’s rare Persian lithographs since these printed books already have records, however faulty and incomplete, in Columbia’s online catalog CLIO and are therefore considered cataloged.

The popular perception of digitization is all about convenience in the service of increased scholarly productivity, since fewer library trips mean less time needed for drudgery and legwork which in turn should increase the time available for working on publications.  We happily delegate to our colleagues in Information Science and the Digital Humanities all worries about the long-term preservation of digital surrogates and their long-term interoperability with future electronic databases, portals, or platforms.  It is not uncommon on Middle East and Islamic Studies listservices that scholars look for e-books of works in Arabic script, specifying that they would prefer e-books with full-text search.  Yet I have never noticed any discussion of TEI and other mark-up languages on these listservices, even though full-text search demands a fully encoded text.  After all it would be ungrateful to complain about the steadily growing number of digitized Islamic manuscripts and printed books available for free on the internet.

Appearances, however, can be deceptive.  The easy one-click access to previously rare texts in Arabic script on our computer screens is not cost-neutral.  On the contrary, it is accompanied by three serious disadvantages.  The first is that digital surrogates seem to diminish the intellectual merit of the original artifacts’ descriptive cataloging, since the texts themselves can now be read on the internet.  A digital text’s direct accessibility makes the material artifact that allowed for its transmission and preservation invisible, as there is no longer any physical obstacle between the reader and the text.  The immediacy of digital surrogates effectively puts an end to the hands-on experience of material books as historical evidence of intellectual practice (David McKitterick, Print, Manuscript and the Search for Order: 1450–1830, Cambridge: Cambridge University Press, 2003, pp. 18-19).  The Hathi Trust Digital Library, for example, allows its members to download PDF files of digitized works in the public domain.  But the PDF file itself will only preserve information about the holding library, without revealing the actual call number (see, for example, the nineteenth-century MS pers. of Vāmiq va ʿAẕrā, University of Michigan Library, Isl. Ms. 1043, cf. the record at Hathi Trust Digital Library at: http://hdl.handle.net/2027/mdp.39015079131705 and the most recent record with comments on the Islamic Manuscripts at Michigan website).  The fact that the creators of this academic digital library consider call numbers dispensable suggests that the digital surrogate is seen as a complete replacement of the original book, making any further interaction with the material artifact itself unnecessary.  For I do not know of any library where it is possible to request a book without knowing its call number.

The second disadvantage of the easy one-click access to previously rare texts on the internet is the haphazard approach to the cataloging and the digitization of Islam’s cultural heritage.  The funding priorities of acceptable donors and grant-making agencies determine feasibility, while the competition for outside funding pits institutions against each other.  North American and European depositories favor institutional independence when courting donors and applying to grant-making agencies, and focus, very sensibly, on clearly circumscribed projects.  For small-scale projects with their own dedicated Islamic manuscript portals (examples are the digitization initiatives at the Walters Art Museum Baltimore, Harvard University Library, or the Universitätsbibliothek Leipzig) are more likely to be successfully completed within their grant periods.  In contrast, large libraries, such as the Bibliothèque nationale de France in Paris (for single pages, see its Banques d’Images), the Bodleian Libraries (for single pages, see their Masterpieces of the non-Western book), or the Bayerische Staatsbibliothek in Munich (for complete books, see its Münchener Digitalisierungszentrum), include manuscripts and printed books in Arabic script while they are digitizing their most important rare holdings.  Whenever Islamic holdings are included in such comprehensive and long-term digitization projects, the quality and the accessibility of their metadata will determine whether search engines can retrieve the Islamic holdings from these vast online collections of digital surrogates.

Occasionally, the initiative of a private donor seems to force a decision about which database will receive the digital surrogates of Islamic manuscripts.  In May 2012, the Museum für Islamische Kunst in Berlin announced that it will digitize and catalog its collection within an Islamic Art Online portal because Yousef Jameel has provided the funding.  This digitization project will include the museum’s manuscripts in Arabic script, and the earlier plan of digitizing the museum’s Islamic books in cooperation with the digitization project Orient-Digital of the Orientabteilung der Staatsbibliothek zu Berlin has been abandoned (email, Julia Gonnella, 13 June 2012).  The situation in Berlin is quite curious since both the Museum für Islamische Kunst and the Staatsbibliothek belong to the Stiftung Preussischer Kulturbesitz.

The haphazard approach to the cataloging and the digitization of Islam’s cultural heritage also reflects that for private donors it is now almost impossible to envision the digital cataloging of artifacts not available as digital surrogates.  Since so many manuscripts and printed books have already been digitized, there is enormous pressure on institutions to forge ahead with the digitization of their holdings, as completely as possible.  The example of the Collaboration in Cataloging Project of the University of Michigan Library shows that it is possible to obtain funding for the digitization of uncataloged manuscripts in Arabic script.  Indeed, the undigitized book has become a problem, if not a serious offense.  It is therefore only logical that in the British Library digitization and cataloging are going hand in hand when private foundations contribute funding to particular projects.  In 2011 the British Library embarked on the creation of a digital archive for its Persian manuscripts, and this summer the Iran-centered project was supplemented with a digital archive for the British Library holdings concerning the Gulf region.  This development is noteworthy because in a parallel move British academic libraries have banded together to establish Fihrist, a digital union catalog for manuscripts in Arabic script in British libraries.  It remains to be seen whether other Western countries will follow suit and emulate the Fihrist model.  I suspect that the development of financing models of academic publishing will determine how Islamic manuscript catalogs will be published in the future.  In Germany, for example, the project of the Katalogisierung der orientalischen Handschriften in Deutschland (KOHD), which is now envisioned to be completed in 2015, continues to receive funding for issuing the Verzeichnis der orientalischen Handschriften in Deutschland (VOHD) as printed hardcovers (email, Tilman Seidensticker, 17 May 2012).
Who are the intended audiences of these very expensive German books?  For decades German has been losing ground as an academic lingua franca, and only research libraries with generous acquisition budgets can afford standing subscriptions to the VOHD.  But be this as it may, the KOHD sticks to publishing the results of its research in print, as there is no comparable funding available for the creation of digital metadata records derived from the detailed German descriptions of undigitized Oriental manuscripts.

The pragmatic preference for clearly circumscribed independent cataloging and digitization projects explains why so few specialists bother with keeping track of all the independent databases that contain digital surrogates of manuscripts and printed books in Arabic script.  The fierce competition for outside funding provides little incentive for institutional cooperation, and may be a contributing factor as to why there are not yet widely accepted best practices for how to make the digital surrogates of Islamic manuscripts and printed books, as well as their cataloging records, available on the internet.  In December 2010, Klaus Graf wondered on his blog Archivalia why he could not find a list of databases with digitized Islamic manuscripts anywhere on the internet; Peter Magierski is now keeping such a regularly updated list of open access databases on his blog AMIR.  It remains to be seen whether the decision of grant-making agencies, such as the Humboldt Foundation, NEH, DFG, the Carnegie Corporation of New York, or the Doris Duke Foundation, to prioritize projects that necessitate domestic and international cooperation between institutions will provide an incentive to scholars in Middle Eastern and Islamic Studies to invent new models for how to coordinate the cataloging of and access to Islamic holdings in the Digital Age.

The third disadvantage of easy one-click access to previously rare texts on the internet is that the competition for funding favors holdings which can be presented as exceptional to donors and grant-making agencies.  It is of course unfair to fault any institution for drawing on the importance or artistic value of its holdings in order to attract outside funding.  The digitization of Zaydī manuscripts in private collections in Yemen, in connection with the digitization of Zaydī manuscripts in Princeton University Library and the Staatsbibliothek zu Berlin, is an example of a successful international project that received funding from several sources, as there is a compelling need to preserve cultural heritage threatened by political conflict.  But significance, like beauty, rests always in the eye of the beholder.  The focus on a particularly endangered group of manuscripts in Arabic script makes it harder to contextualize those holdings, which are now distinguished by having received a substantial grant.  Every book refers to other books, and not even the most exceptional book was produced in a vacuum.  What will happen to those Yemeni manuscripts that cannot be classified as Zaydī?  Since every book is a commodity within a society’s system of book production, how is that which has been preserved related to that which was originally produced?  In every literate society there are many more cheap books than livres d’artiste in circulation, and yet expensive books and other collectibles are much more likely to survive.

At this point my considerations have come full circle.  As long as specialists of Middle Eastern and Islamic Studies have only very rough estimates for the total number of all extant and accessible manuscripts in Arabic script, it is impossible to gain a better understanding of how the bias of survival has shaped, as well as distorted, the available sources of Islamic history.  The international dispersion of Islamic manuscripts and rare printed books makes it very difficult to keep track of these holdings and to organize their cataloging.  Unfortunately, the great attraction of pretty digital surrogates further complicates all efforts to raise money for the little-valued, but much more urgently needed, cataloging of all known books in Arabic script.

PS.  The Iran Heritage Foundation (IHF) has just posted a YouTube fundraising video, providing some figures for its digitization project in the British Library.  The British Library’s collection of more than 11,000 Persian manuscripts is the largest collection in the Western World, and about 1,370 of these manuscripts are currently cataloged in the British online catalog Fihrist.  In the course of the IHF project, the British Library expects to completely digitize another 40 to 50 Persian manuscripts, while adding as many metadata records as possible to Fihrist.

Updated, 13 January 2013.

Working with Manuscripts in the Digital Age

The standing of Islamic manuscripts as the most important resource for research about all aspects of Islamic civilization is widely recognized.  Walid Saleh describes the medieval Muslim Middle East as “one of the most bookish of pre-modern cultures” (Formation of the Classical Tafsīr Tradition, Leiden: Brill, 2004, p. 207), and Tilman Seidensticker observes that “the medium of the manuscript was intrinsic to the Islamic-Arabic culture” (in Manuscript Cultures, ed. Jörg B. Quenzer, Hamburg: SFB 950 Manuskriptkulturen Asien, Afrika und Europa, 2011, p. 78).  Scholars and institutions worldwide have heartily embraced digitization to facilitate access to the texts of manuscripts, as well as rare printed books, since the field of Middle Eastern and Islamic Studies is still a discipline focused on the study of written texts.  The use of digitized sources has almost become best practice, and we routinely complain if sources are not digitally available with a good full-text search.  It is therefore noteworthy that the transformation of a three-dimensional physical object into a two-dimensional image on a screen has not ushered in a debate on whether the medium in which we encounter written texts impacts our understanding of their meaning.

One of the unintended side effects of the vigorously championed digitization of Islamic books is the proliferation of a seemingly decorative use of manuscript pages on academic websites and publications, since the widespread use of digitization has made it so much easier to obtain affordable high-quality scans.  I hasten to add that it is of course not particular to Middle Eastern and Islamic Studies to treat beautiful manuscript pages as eye candy.  Moreover, I myself am guilty as charged, though on this blog I will provide identifying information about all featured images (NB – for the blog’s masthead, please see this page).  But I suspect that the use of undocumented images as illustrations most likely reflects a learned lack of interest in the materiality of written texts.  As long as graduate education in Middle Eastern and Islamic Studies is centered on teaching scholars how to base their arguments on the meaning of words only, the text’s embodiment in any particular medium is perceived as secondary, and illustrations, as nice as they may be, are accidental.  This logocentric attitude explains why we have moved with relative ease from books on paper to microfilms and e-books.

The following two examples of undocumented manuscript pages illustrate that in Middle Eastern and Islamic Studies our scholarly appreciation of Islamic manuscripts has not initiated a turn to bibliography or material history.  Despite the immense potential of digital media for the study of images, it is the word that stands at the center of contemporary research in Middle Eastern and Islamic Studies.

In Yemen, one of the poorest Arab countries, the preservation of public and private manuscript collections presents a serious challenge, and digitization has long been used to address this challenge.  In 2011, Sabine Schmidtke and Jan Thiele of the Research Unit of the Intellectual History of the Islamicate World (Institut für Islamwissenschaft, Freie Universität Berlin) published an English-Arabic pamphlet about their department’s Yemen Manuscript Digitization Project.  The cover of the English version shows part of a rubricated table of contents, set into a red frame, with a note on the margin.

As I wanted to know more about the formal manuscript to which this page belongs, I emailed Sabine Schmidtke and promptly received from Jan Thiele a very kind note with the available bibliographical details:  The illustrated leaf belongs to an undated copy of Taysīr al-maṭālib min Amālī Abī Ṭālib by Jaʿfar b. Aḥmad al-Buhlūlī (d. 1177 or 1178), written by Jābir b. Fatḥ Allāh al-Ghaffārī.  The work is preserved as part of a miscellany, which includes another work dated 1029 (began 8 Dec. 1619).  Although the miscellany is uncatalogued and its current owner unknown, the miscellany can be consulted, as it has been digitized by the Imam Zayd b. Ali Cultural Foundation (CD 450:3).  It is intriguing that Schmidtke and Thiele chose for the cover of a printed pamphlet a manuscript that at the moment is only accessible as a digital copy.   Their decision may first and foremost reflect that the work of Jaʿfar b. Aḥmad al-Buhlūlī is important to the department’s research project on theological rationalism.  But what is the ontological status of a digital manuscript copy, for which any knowledge about its original’s size, paper, ink etc. can no longer be ascertained?

The second example concerns the 2008 website of the research project on the Rational Sciences in Islam (Institute of Islamic Studies, McGill University).  A very beautiful illustration of two kinds of kabīkaj plant (Lat. ranunculus asiaticus) – and the word kabīkaj is clearly legible at the top of the right column – is prominently displayed on the homepage and the related three project pages.  The illustration (MS arab., fol. 277a) belongs to a mid-thirteenth-century fragment of the Kitāb al-adwiyah al-mufradah by Abū Jaʿfar Aḥmad b. Muḥammad al-Ghāfiqī (d. 1165), which is owned by McGill’s Osler Library of the History of Medicine.  In 1989, Adam Gacek published the manuscript’s description in “Arabic Calligraphy and the ‘Herbal’ of al-Ghâfiqî” (Fontanus 2, pp. 49-51 and figs. 8-9).  Pharmacology is not directly related to philosophy and the mathematical sciences which are at the heart of the McGill research project.  Yet the kabīkaj presents a fascinating case of the rational sciences in premodern Islam.  What is the status of material evidence for any research on medieval Islam?  As Gacek had shown in an earlier article about “The Use of ‘kabīkaj’ in Arabic Manuscripts” (Manuscripts of the Middle East 1, 1986, pp. 49-53), the kabīkaj plant and the jinn Kabīkaj who protects books from pests are clearly related.  But Gacek’s research on invocations of the Kabīkaj has nonetheless been adduced to argue that the jinn Kabīkaj has been an Orientalist misreading; for example in the description of an Arabic manuscript (dated 1202/began 13 Oct. 1785) of the Kitāb tanbīh al-hādī wa’l-muhtadī by Ḥamīd al-Dīn al-Kirmānī (fl. 1020) in the Institute of Ismaili Studies.  In a final twist to this reflection on working with manuscripts in the Digital Age, the title page with the invocation “yā Kabīkaj,” though explicitly mentioned in the description, is not among the four pages shown on its website.

PS.  On February 15, 2012 Tim Parks published “E-books Can’t Burn” on the blog of the New York Review of Books.  Parks’ paean to the many benefits of e-books has generated a lively debate on how the medium in which literature is read and enjoyed is related to its meaning and understanding.

Updated, 21 February 2012.