Data Library Sunset

November 11, 2025

Sunset of the IRI Data Library

After three decades serving climate data and information to the worldwide research and humanitarian communities, the IRI Data Library (IRIDL) will soon undergo major changes.

For more than a decade, operation of the IRIDL has been funded primarily by international development grants to the IRI. Such grants have declined in recent years, and will soon no longer be sufficient to continue employing the existing IRIDL staff, which is already significantly smaller than it was at its peak. We anticipate that by April 2026 the IRIDL will no longer be staffed sufficiently to continue operating in its current form.

In the time that remains, the IRIDL team has two main goals. Our primary focus is on ensuring that IRI’s remaining funded projects continue to meet their commitments. To this end, we are exploring various options for a smaller, simpler data infrastructure that will be sustainable at the current level of funding. Secondarily, to the extent possible without endangering our funded commitments, we will do what we can to help current users of the IRIDL find other ways to satisfy their data needs.

Sustainable infrastructure

Operating the existing Data Library requires a team of several people with different specialized sets of skills. We are exploring various options to reduce the complexity of the system so that it can be operated by a smaller and less specialized staff. Ideas we are exploring include the following:

  • Replace the redundant architecture of the current IRIDL (two separate, self-sufficient installations in different buildings, each with a distributed filesystem) with a single server. This will make the system simpler to administer, at the cost of increased downtime during maintenance and hardware failures.
  • Eliminate software written in the Ingrid programming language. Ingrid was invented for the IRIDL and is not widely known, and expertise in it has been lost to attrition over the years. We are evaluating open source data-serving software such as Hyrax and TDS (the THREDDS Data Server), as well as alternative formats such as Zarr and Icechunk that clients can read directly from cloud storage. No definitive technical decisions have yet been made in this area.

An instance of this simplified architecture will be created at Columbia Climate School’s Center for Climate Systems Research (CCSR) to host data from the NMME, SubC, and S2S projects, thanks to continuing funding from NOAA and DoD. We hope to make the new data service available for limited beta testing by December 2025 and for public access by January 2026.

A new version of PyCPT (IRI climate forecasting software) will be prepared that can download data from sources other than the IRIDL, likely including the new data service at CCSR.

If you are currently engaged with IRI on another funded project, the data, applications, and support funded by that project will continue to be available to you privately. Users of other datasets should begin exploring other sources for the data they need.

The IRIDL Maprooms website will no longer be maintained. No replacement for the maprooms is planned.

If you have expertise, time, or funds to contribute towards building or hosting a more sustainable Data Library, we invite you to contact us at help@iri.columbia.edu.

Helping users transition

In order to help users of the IRIDL rebuild their data workflows with different tools, we will publish IRIDL code and data, and answer as many transition-related help requests as we can.

In the coming months, we plan to make the following resources available:

  • A spreadsheet listing upstream or alternative download sources for each dataset currently served by the IRIDL, and identifying datasets that are no longer available elsewhere. For datasets that will be “orphaned” by the loss of the IRIDL, we will help transfer data to anyone who wishes to take over hosting them.
  • Source code for the IRIDL’s data catalog. The data catalog consists of executable instructions (in the Ingrid programming language) that map different providers’ idiosyncratic data formats, directory and file naming conventions, and metadata schemas into the IRIDL’s standard representation, as well as metadata such as revision history and pointers to documentation.
  • Source code for scripts that are used to download data from upstream providers.
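
For readers unfamiliar with what a catalog entry does, the mapping described above can be illustrated with a rough Python sketch. This is not IRIDL code and does not reflect how Ingrid works internally; the class, field names, provider, file naming pattern, and variable names are all hypothetical and chosen only to show the idea of translating one provider's conventions into a common representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical record of how one provider's files map to a common form."""

    provider: str
    path_template: str       # provider-specific directory and file naming
    variable_renames: dict   # provider variable name -> standard name
    unit_renames: dict       # provider unit spelling -> standard spelling

    def file_path(self, year: int, month: int) -> str:
        """Build the provider's file path for a given year and month."""
        return self.path_template.format(year=year, month=month)

    def standardize(self, metadata: dict) -> dict:
        """Translate provider metadata into the common representation."""
        name = metadata["name"]
        units = metadata["units"]
        return {
            # Unknown names and units pass through unchanged.
            "name": self.variable_renames.get(name, name),
            "units": self.unit_renames.get(units, units),
            "source": self.provider,
        }


# A made-up provider, used purely for illustration.
entry = CatalogEntry(
    provider="example-provider",
    path_template="precip/{year:04d}/prcp_{year:04d}{month:02d}.nc",
    variable_renames={"prcp": "precipitation"},
    unit_renames={"mm/d": "mm/day"},
)

print(entry.file_path(2024, 3))
print(entry.standardize({"name": "prcp", "units": "mm/d"}))
```

Users rebuilding a workflow with other tools would likely need to recreate mappings of this kind for each dataset they depend on, which is why the catalog source may be a useful reference even for those who never run Ingrid.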

Users of Maprooms can already find the underlying data and analysis information through links on each maproom page. Once the resources above are available, they can be used to reproduce a workflow of interest.

As this information becomes available, the IRI help desk (help@iri.columbia.edu) will increasingly prioritize helping users find alternatives, rather than helping them use resources that will soon be lost.

To receive updates on the status of the sunset plan, please watch this page and/or visit https://iri.columbia.edu/subscribe to subscribe to the Data Library mailing list.

Regretfully,
The IRI Data Library team