The New York Public Library is developing and executing a visionary strategy for digitization, digital preservation, and access. The Library stewards rapidly growing collections of digitized and born-digital collection items, and it works to make these collections easier to discover, sample, use, and reuse in more creative ways.

Read our Digital Research Strategy 2021–24 below or download the strategy as a PDF. Plus, explore our current and past projects.

Digital Research Strategy 2021–24: Vision

The New York Public Library’s Stephen A. Schwarzman Building was erected on the site of the Croton Distributing Reservoir. This vast four-acre structure provided the city with drinking water throughout the nineteenth century. That history remains a part of the Library today: the physical vestiges of the reservoir are embedded in the Library’s foundation, visible in the building’s walls.

The reservoir’s purpose informs our present-day work. It pooled a vital resource, 20 million gallons of water, and provided it freely to New Yorkers. NYPL has gathered an equally critical resource, a body of knowledge spanning 50 million research collection items, for the benefit of the public. But access to this asset has been limited by our very edifice. Archives like ours have functioned as reservoirs without conduits, sites of pilgrimage for those seeking the resource they hold but with limited means of sharing that resource outside their walls.

It is only now that digital tools allow us to share our collections more freely. The ways in which people interact with and access information have fundamentally changed to prioritize digital experiences. The global pandemic has only intensified trends that existed long before society was shuttered. We believe now that we must bring forth a world in which access to the collections we steward is no longer predicated on physical presence in our research libraries.

NYPL has long championed the digital revolution in libraries. We have transferred collections from the stacks to server racks, harnessed crowdsourcing to extract machine-readable data, and led the development of e-reading platforms. This work has transformed our collecting strategies, preservation efforts, and access services. Our Digital Research Strategy 2021–24 presents a set of ambitious priorities that NYPL will pursue to continue and quicken this digital transformation: to seek radical access, to democratize our collections, to deepen engagement, and to reinforce systems and infrastructure.

We will continue to care about and for paper and petabytes in the coming years, but as we move forward we will prioritize digital delivery as our primary method of access. Being “digital-first” will open our collections to global audiences, allowing us to reach millions affordably and without restrictions of time, space, or a patron’s abilities. It will make material easier to discover, sample, use, and reuse in more creative ways than are possible now. With this approach, we can fully embrace our mission as both a public library and a research archive.

Descriptive Practices

Make material available more quickly by refining descriptive practice

Description democratizes access. Our patrons can find library resources because they are cataloged, and they can only find our digital collections when they are adequately described. We are working to remove obstacles to descriptive processes, making time-based media available online at a pace not previously possible. These changes will benefit all our descriptive workflows and shape a more open and engaged culture around description with our research centers. Work here includes finding more efficient ways to update archival description in our digital collections access platforms, new automated communications that will allow us to prioritize new description, and new forms of communication and collaboration among our curatorial divisions and production teams.

Remote Access

Create secure distanced access to rights-restricted items

We steward one of the world’s greatest collections of time-based media, from Lou Reed’s demo recordings to Sonny Rollins’s rehearsal tapes and from Martha Graham’s dances to rare newsreels and films documenting the 1960s Civil Rights Movement. While we can provide remote access to paper and still image collections through our services, we have no means of providing access to the vast majority of our audio and moving image collections at a distance because of copyright law.

This challenge is not unique to NYPL; many academic libraries struggle with serving time-based media to remote users. We are working with our peers to develop a legal framework and leverage new technologies that together will provide patrons with secure access while protecting the rights of rights holders. We will create a new “Virtual Reading Room” that allows limited access to users with bona fide research needs—providing a window to collections that have heretofore been available only at our research centers.

At-Risk Digitization

Renew strategy for undigitized unique at-risk holdings and for new acquisitions

In 2014, NYPL launched its Audio and Moving Image (“AMI”) Initiative, a large-scale effort to preserve, digitize, and create access to time-based media that is at risk of permanent loss through media degradation and technology obsolescence. As of December 2020, the Library had identified more than 400,000 rare, unique, or at-risk items, inventoried 315,000 of them, and digitized over 200,000.

We have established a new team to manage pre- and post-production activities and expanded our in-house conservation and digitization capabilities for new media formats. Parallel transfer workflows will support the Library’s long-term commitment to AMI preservation. Critical investments will support the continued evolution of the AMI Initiative from project-based to programmatic services as we look to maintain our ability to conserve and digitize new acquisitions of at-risk media for years to come.

Copyright Parsing

Identify the copyright status of collection items by transcribing and parsing the Catalog of Copyright Entries

Since 2017, we have received grants to transcribe the class A registrations from the 1923–77 Catalog of Copyright Entries, the US Copyright Office’s published list of copyright registrations. By transcribing the 87,920 pages of these records, and making that data set publicly available and user-friendly, we can identify and make freely available thousands of works that are out of copyright and therefore in the public domain.
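
As a rough illustration of what parsing a transcribed entry can involve, here is a minimal Python sketch. The entry layout, the sample line, and the names ENTRY_PATTERN and parse_entry are all invented for illustration; they are assumptions, not the Catalog’s actual format or the Library’s actual tooling.

```python
import re
from typing import Optional

# Hypothetical, simplified entry layout (an assumption for illustration):
#   "AUTHOR. Title. © DDMonYY; A<number>."
ENTRY_PATTERN = re.compile(
    r"^(?P<author>[^.]+)\.\s+"                     # author, up to the first period
    r"(?P<title>.+?)\.\s+"                         # title, up to the period before ©
    r"©\s*(?P<date>\d{1,2}[A-Za-z]{3}\d{2});\s*"   # date such as 12Mar25
    r"(?P<registration>A\d+)\.$"                   # class A registration number
)


def parse_entry(line: str) -> Optional[dict]:
    """Split one transcribed line into named fields, or return None if it does not fit."""
    match = ENTRY_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# An invented sample line, not a real registration.
sample = "DOE, JANE. An invented title. © 12Mar25; A823456."
record = parse_entry(sample)
print(record)
```

Lines that do not fit the assumed layout return None, so they can be set aside for human review rather than silently misread.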

Diversity, Equity, and Inclusion

Prioritize digitization that champions diversity, equity, and inclusion

Digitization is a choice and a responsibility, through which we can shine a spotlight on missing voices and narratives that celebrate the diversity in our holdings. It is also a uniquely powerful tool by which we can influence the collections that are accessed, used, and drawn upon to create new knowledge. Mindful of this potential, NYPL is committing all discretionary digitization capacity—our capacity beyond that which is committed to public orders and programmatic needs—to missing and underrepresented voices. For the next three years, we will focus specifically on African American, African Diaspora, and African experiences and culture.

Access-Level Digitization

Support a digital-first collections strategy for print collections by developing access-level digitization

Our current digitization practices for print collections are built to serve high-value, high-prestige special collections materials. While this expert approach to imaging is necessary for everything from digital preservation of highly fragile materials to high-resolution capture for commercial reproduction, it does not meet the needs of a more typical library patron looking for expansive and fast off-site access to research collections, particularly printed volumes. Building on the Scan & Deliver service developed during the pandemic, we will pilot expanded access-level services for general print collections, building new workflows that prioritize faster, scalable, high-quality image capture and delivering materials to patrons worldwide for free. This investment will allow us to meet the needs of more patrons, serve more materials, and embrace accessibility.

Enhanced Discovery Capability

Enhance discovery, description, and accessibility of print materials by implementing optical character and handwritten text recognition

We will continue to use advanced technologies to improve the patron experience. This includes implementing optical character recognition and handwritten text recognition to enhance the discovery, description, and accessibility of print materials. Removing the barrier between information trapped in an image format and machine readability will help democratize our collections, making the information they contain easier for all to discover and, for the first time, giving our patrons the ability to search within the volumes we have digitized.
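
To illustrate why machine-readable text matters for discovery, here is a minimal Python sketch of searching within a digitized volume once recognized text exists. It assumes the recognition step has already produced plain text for each page; the sample pages and the names build_index and search are illustrative assumptions and do not describe the Library’s actual systems.

```python
import re
from collections import defaultdict


def build_index(pages: dict[int, str]) -> dict[str, set[int]]:
    """Map each lowercase word to the set of page numbers it appears on."""
    index = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_number)
    return index


def search(index: dict[str, set[int]], query: str) -> list[int]:
    """Return the pages that contain every word in the query, in page order."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Invented sample text standing in for recognized page text.
ocr_pages = {
    1: "The Croton Reservoir supplied water to the city.",
    2: "The library opened on the site of the reservoir.",
    3: "Readers consult collections in the reading room.",
}
index = build_index(ocr_pages)
print(search(index, "reservoir"))         # [1, 2]
print(search(index, "croton reservoir"))  # [1]
```

Without recognized text, the same pages are only images, and neither query could return a result.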

User Research

Undertake primary research to better understand users and to inform planning

We are undertaking primary research to ensure our efforts and attention are focused on meeting all our users’ needs and desires. This work focuses not only on researchers using our tools to further their scholarship, but also on all patrons accessing our collections, whether for a creative project or to pursue their curiosity. This work also focuses on ensuring that our platforms meet the needs of the diverse public that we serve.

Digital Collections

Redevelop Digital Collections as an accessible platform that engages the curious

Our Digital Collections portal launched in 2005 as our Digital Gallery, a place where patrons could see a selection of images that had been digitized by the Library. Over the years, the gallery grew in scale and morphed in function: it now serves over 900,000 digitized collection items and is a digital reading room for our digitized content.

This platform is in need of modernization. The ability to wander and discover serendipitously is challenged by the layout of the item pages and the sheer quantity of items now available. Tools are required that can allow curators to contextualize collection items. Better bridges to our catalog and archives portal are required, as are faceted and complex search functions so users can make sense of our ever-growing digitized collections. Improvements are also needed to ensure that the platform meets the accessibility needs of all our patrons.

Projects & Experiments

Launch projects and support experimentation that showcases the reuse and remixing of digital collections

We have a deep history of creating and supporting digital humanities explorations of our public domain collections. We will continue providing support for projects that show the many ways our digital collections can be utilized, in order to encourage more people to engage creatively. This includes collaborations with partner universities and developing future scholarships for digital humanities study at the Library.

Staff Development

Facilitate staff development, including training staff on our Digital Collections API

In order to empower our patrons to access and utilize our collections, we also need to empower our staff to understand our Digital Collections platform and tools so that they can provide reference assistance to patrons who wish to do the same. We are creating new training and documentation that will allow staff members to engage more deeply with our digital collections platform and API (application programming interface), building their comfort and familiarity with the increasingly digital nature of our collections.

Communications

Introduce the public to our strategy through a proactive communication plan

We want to keep you informed! We have created this strategy to share the work we are doing now and our plans over the coming years. We will share updates as we make progress on our goals. We will also seek forums to discuss projects at key inflection points, ensuring that our communication is two-way and that we make opportunities to hear from you.

Digital Preservation

Improve preservation, security, and access by delivering new repository software

Over the past decade, our digital storage and preservation needs have rapidly changed. We have developed and deployed systems to ensure that the millions of files held in our repository are safely stowed and easily accessible. As part of that evolution, we now look to implement new digital preservation software so we may improve our ability to provide a secure and trustworthy digital preservation environment for all digital files. Once implemented, this new system will provide better support for and access to all digitized and born-digital formats.

Digital Storage

Evolve our digital storage plan and expand storage to meet needs

Our digital collections are growing rapidly as we acquire new collections and digitize and preserve analog formats. We will continue to invest in expanded storage to ensure that our work is not limited by our ability to hold our collections. We currently manage ten petabytes of content and we expect to grow our storage by at least one petabyte per year in the coming years. In addition, we plan to shift our storage to prioritize cloud-based archiving, creating a future-oriented storage environment that provides improved access and security.
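
As a back-of-the-envelope illustration of these figures, the Python sketch below projects storage under the assumption of purely linear growth at the stated minimum of one petabyte per year. The function name project_storage and the three-year horizon are illustrative assumptions; actual growth may be faster.

```python
def project_storage(current_pb: float, growth_pb_per_year: float, years: int) -> list[float]:
    """Return projected storage (in petabytes) at the end of each year, assuming linear growth."""
    return [current_pb + growth_pb_per_year * year for year in range(1, years + 1)]


# Ten petabytes today, growing by at least one petabyte per year.
projection = project_storage(10, 1, 3)
print(projection)  # [11, 12, 13]
```

Because one petabyte per year is a floor rather than a forecast, these values are best read as lower bounds.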

Born-Digital Collections

Implement preservation and access pathways for born-digital materials

We are implementing preservation and access pathways for born-digital materials, collection objects that are digital at acquisition and will never have a physical form. As these will continue to make up an ever-increasing volume of our new collections, it is essential that we improve our born-digital media ingest process and create seamless means of access. Our current focus is on web-based and email archives, as well as born-digital time-based media.

Governance & Oversight

Enhance our governance and oversight, and ratify our first digital preservation policy

As digital assets make up more and more of our collections, our sophistication and processes around digital preservation must continue to evolve. We have long stewarded digitized and born-digital collections, but we face an inflection point as we focus on our digital transformation. We are creating new ways to ensure that stakeholders across our production teams and curatorial divisions are consulted and informed in our preservation work, and that it is explicit who is responsible and accountable for each part of that work. As part of this effort, we will publish our first digital preservation policy, a standalone policy focused on our standards and workflows that ensures appropriate stewardship of our collections.