Power Tool Helps Climate, Public Health Researchers Drill into Data

by Brian Kahn

Health and climate are intrinsically linked, yet they rarely operate on the same scales. The flu doesn’t last all winter and malaria outbreaks don’t happen with the first drops of rain, yet deciphering the relationships of these and other infectious diseases with climate factors is vitally important to public health professionals. This is critical everywhere but especially in the developing world, where resource-strapped public health systems can easily become overwhelmed by outbreaks and epidemics.

“Our climate is changing and we need to understand what impacts these changes will have on climate-sensitive diseases,” said John del Corral, a senior staff associate at the International Research Institute for Climate and Society.

A handful of factors stand in the way of climate and health professionals working in complete harmony. External variables, different time and space scales and the sometimes indirect effect of climate all stand as impediments. For the latter two at least, refining data and finding ways to get data from both fields to mesh represent a key avenue to connecting the two fields so practitioners can actively begin to answer questions and solve problems.

“They need a tool to analyze this connection,” said del Corral. That’s exactly what he and his colleagues at IRI set out to build.

Thankfully, del Corral didn’t have to start completely from scratch. Fast, nimble and adaptable, IRI’s Climate Data Library has been around for 20 years. The Data Library is more than just a repository of over 600 datasets from across the earth sciences; it’s a tool to parse, download and visualize those data. Updates, innovations and an ever-growing array of datasets over those 20 years have made it even stronger and more widely applicable.

The Data Library and its workflow language, Ingrid, were initially conceived of as a tool mostly for climate scientists to analyze different climate datasets. However, because IRI’s mission is to take innovations in climate science to help solve real-world problems, over the years funding from the National Oceanic and Atmospheric Administration and, more recently, the U.S. Agency for International Development has helped IRI scientists expand the Data Library to load and analyze data on public health, agriculture and hydrology, among others.

Doing that was no small feat. Though data is often thought of as tidy, its collection and organization are anything but.

“Even within a discipline, data arrives in a wide variety of formats and pieces convenient to the provider,” said Benno Blumenthal, creator and project lead for the IRI’s Data Library. “We gather those pieces into datasets that make conceptual sense for the user and allow those data to be co-analyzed with data from very different sources. Cross-disciplinary data are even more diverse, but we bring them together into a common framework.”

In other words, the Data Library is much like a physical library of books, with datasets coming in different shapes, sizes, lengths, topics and publication dates. Walking through the stacks of the Data Library would be a bewildering experience for most users, given that it contains hundreds of terabytes of information. Unlike books, data frequently needs to be transformed before the user has what they need. Ingrid facilitates that.

Meet Ingrid

The Data Library has a card catalogue of sorts to help users navigate a trip through its rows and rows of datasets. Search and browse functions quickly get users to the datasets they most want. Using Ingrid, researchers have set the Data Library up to translate those terabytes of numbers into interactive maps and graphs.

Ingrid helps analysis by hiding the technical detail, which allows the user to think in terms of manipulations of datasets instead of manipulations of files of numbers. This feature, coupled with the ability to more freely manipulate data in time and space, offers an upgrade compared to the GIS software traditionally used to analyze these types of data. In addition, by having the user specify the desired result in analysis terms rather than specific programmatic steps, Ingrid is free to arrange the computation efficiently. It can, for example, let users access only the data they want from a particular set, requiring far fewer calculations than combing through the whole dataset. This means quick access.

Speed and power are particularly important in developing countries where computing power and bandwidth are at a premium.
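
The idea of describing the result first and reading only the data it needs can be illustrated with a short Python sketch. This is not Ingrid and not the Data Library's API; the `LazyGrid` class, the `fake_fetch` function and the made-up values are all hypothetical, chosen only to show why selecting a region before computing keeps the number of reads small.

```python
class LazyGrid:
    """Hypothetical gridded dataset that reads a cell only when asked to."""

    def __init__(self, lats, lons, fetch):
        self.lats = list(lats)
        self.lons = list(lons)
        self.fetch = fetch  # fetch(lat, lon) -> float, stands in for a slow read

    def select(self, lat_min, lat_max, lon_min, lon_max):
        """Describe a sub-region; no data is read at this point."""
        lats = [v for v in self.lats if lat_min <= v <= lat_max]
        lons = [v for v in self.lons if lon_min <= v <= lon_max]
        return LazyGrid(lats, lons, self.fetch)

    def mean(self):
        """Read only the selected cells and return their simple average."""
        values = [self.fetch(lat, lon) for lat in self.lats for lon in self.lons]
        if not values:
            raise ValueError("selection is empty")
        return sum(values) / len(values)


read_log = []  # records every cell that is actually read


def fake_fetch(lat, lon):
    """Assumed stand-in for reading one cell from storage."""
    read_log.append((lat, lon))
    return float(lat + lon)  # made-up values, not real climate data


# A coarse 10-degree global grid (assumption), then a box around East Africa.
world = LazyGrid(range(-90, 91, 10), range(-180, 180, 10), fake_fetch)
region_mean = world.select(0, 20, 30, 50).mean()
total_cells = len(world.lats) * len(world.lons)
print(region_mean, len(read_log), "of", total_cells, "cells read")
```

In this toy setup, averaging over the selected box touches 9 of the grid's 684 cells. On a slow connection, that difference is the kind of saving the paragraph above describes.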

Out of a Gridded State of Mind

With a system in place to access and sort climate data, del Corral says the next challenge for his team was to translate the data into actionable public health information. One of the key steps is getting out of a gridded state of mind. Though collecting climate data can be messy, once it has been tidied up, it’s often visualized in large gridded squares.

However, public health isn’t confined to simple grids. It lives in a contoured world of communities composed of different demographics, sizes, and shapes. A further confounding factor is that mosquitoes and other vectors that carry diseases are equally ungridded in their travels and interactions with human communities. For example, Anopheles gambiae, a malaria-transmitting mosquito, usually travels 1-3 kilometers in a day but its path will rarely be linear.

Rather than trying to snap these factors to a climate-model-sized grid, which tends to be too coarse to make local decisions, del Corral, Blumenthal and other colleagues used specialized techniques to take the climate data from its large square confines and fit it over finer spatial scales.
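
The article does not say which techniques the team used. As one common and simple way to estimate values between the centers of coarse grid cells, the sketch below uses bilinear interpolation. It is an illustrative assumption, not IRI's method, and the function name, coordinates and rainfall numbers are invented for the example.

```python
from bisect import bisect_right


def bilinear(grid, lats, lons, lat, lon):
    """Estimate the value at (lat, lon) from a coarse grid.

    grid[i][j] is the value at (lats[i], lons[j]); lats and lons must be
    sorted in ascending order and contain at least two points each.
    """
    if not (lats[0] <= lat <= lats[-1] and lons[0] <= lon <= lons[-1]):
        raise ValueError("point lies outside the grid")

    # Index of the lower-left corner of the cell containing the point.
    i = min(max(bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect_right(lons, lon) - 1, 0), len(lons) - 2)

    # Fractional position of the point inside that cell (0 to 1).
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])

    # Blend along longitude on both edges, then along latitude.
    south = grid[i][j] * (1 - tx) + grid[i][j + 1] * tx
    north = grid[i + 1][j] * (1 - tx) + grid[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


# Invented rainfall values (mm) on a 10-degree grid, for illustration only.
coarse_lats = [0.0, 10.0]
coarse_lons = [30.0, 40.0]
rain = [[10.0, 20.0],
        [30.0, 40.0]]

center = bilinear(rain, coarse_lats, coarse_lons, 5.0, 35.0)
corner = bilinear(rain, coarse_lats, coarse_lons, 10.0, 40.0)
print(center, corner)
```

Interpolation of this kind only smooths what the coarse grid already contains. Sharper local detail has to come from additional observations, such as the satellite and on-the-ground data Mantilla describes.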

Gilma Mantilla, a senior staff associate and public-health expert at IRI, explained it while sitting in her office. Taking out a composition book, she opened it to a blank page and drew a square. “Imagine this is Ethiopia,” she said, tapping the center of the red-outlined box. “Now imagine these two areas are where malaria outbreaks could occur according to climate data we have,” she continued, drawing two smaller squares inside her squared Ethiopia.

“Using better data analysis and including satellite observations in addition to on-the-ground observations we can now pinpoint what those areas look like even more accurately,” she said, drawing two squares smaller still. “Improving the resolution of the data means improving how and where we intervene.”

More successful interventions are ultimately at the heart of why del Corral, Mantilla, and their colleagues worked to put the Data Library’s powers in the hands of public health professionals. Mantilla sees this as just the beginning. “The Data Library is very powerful and can do everything from very simple to very complex analysis,” she said, sitting back at her desk. “We can add social, environmental, and biological variables among others as health professionals become more comfortable using it.”

And with that, pinpointing interventions even more precisely can become a reality.