Which Precipitation Dataset to Choose for Hydrological Studies of the Terrestrial Water Cycle?
Publication Type
Journal Article
Date Issued
2025-09-01
Language
English
Author(s)
Thomson, Johanna Ruth
Sun, Qiaohong
Volume
106
Issue
9
Start Page
E2000
End Page
E2016
Citation
Bulletin of the American Meteorological Society 106 (9): E2000-E2016 (2025)
Publisher DOI
Scopus ID
ISSN
0003-0007
Precipitation is a critical component of the terrestrial hydrological cycle. It plays a crucial role in shaping climate patterns and ecosystem dynamics. The quest for accurate measurements of global precipitation started over 150 years ago. Comprehensive evaluations of the estimates were rare until the 1970s, mainly due to the challenges of acquiring and maintaining up-to-date datasets. Nowadays, data availability is no longer an issue. However, in the face of the seemingly ever-growing number of available datasets, determining which one to rely upon poses a new obstacle, especially in the absence of global ground truth, further complicated by the interdependencies among datasets. Here, we identify the genealogy of multiple precipitation datasets and define multiplicity artifacts. Then, we compute an evaluation reference benchmark free of multiplicity artifacts to identify the dataset that best represents the artifact-free ensemble over different terrestrial spatial domains. These include countries, IPCC assessment report reference regions, major world river basins, land-cover types, elevation zones, biome categories, and Köppen–Geiger climate classes. It should be noted that the datasets assessed herein had a monthly temporal scale, and our findings might not apply to the study of climate extreme events regardless of the terrestrial spatial domain. We repeatedly found GPM IMERG Final v07 to emerge as the most representative dataset over multiple domains. Furthermore, we found that the dataset’s representativeness is largely influenced by how spatial domains are defined rather than their scale.
SIGNIFICANCE STATEMENT: Over the past decades, we have amassed a vast array of precipitation datasets. While offering significant opportunities, this abundance of data creates a new challenge: which precipitation dataset should we use? In this work, we address this challenge by developing a method to identify and mitigate the impact of overlapping or dependent data sources, ensuring more reliable comparisons across diverse spatial domains. This is important because our results can help researchers select more reliable datasets and emphasize the need for a deeper understanding of the data used, thus avoiding misleading conclusions. Such efforts contribute to advancing hydrological sciences and the broader nexus fields, where reliable precipitation data serve as a cornerstone for modeling and projecting climate change impacts.
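The general idea described in the abstract (trace dataset genealogy, build a reference from members that do not duplicate one another, then see which dataset sits closest to that reference) can be illustrated with a minimal Python sketch. This is not the authors' method: the record does not give their definitions or metrics. The sketch assumes, purely for illustration, that a multiplicity artifact arises when a dataset and a product derived from it are both in the ensemble, that the reference is a month-by-month median, and that closeness is a mean absolute difference. The dataset names, the genealogy, and the values are all hypothetical.

```python
from statistics import median


def independent_members(derived_from):
    """Keep datasets that are not derived from another dataset in the ensemble.

    `derived_from` maps each dataset name to the names of its source datasets
    (a hypothetical genealogy, not taken from the paper).
    """
    names = set(derived_from)
    return sorted(n for n, parents in derived_from.items()
                  if not (set(parents) & names))


def reference_series(series, members):
    """Month-by-month median across the chosen members (assumed reference)."""
    return [median(vals) for vals in zip(*(series[m] for m in members))]


def rank_by_mean_abs_diff(series, reference):
    """Rank every dataset by mean absolute difference from the reference."""
    scores = {
        name: sum(abs(v - r) for v, r in zip(vals, reference)) / len(reference)
        for name, vals in series.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1])


# Toy monthly precipitation totals (mm) for one spatial domain; made-up values.
series = {
    "A": [50.0, 60.0, 70.0],
    "B": [54.0, 66.0, 74.0],
    "C": [40.0, 50.0, 60.0],
    "A_derived": [51.0, 61.0, 71.0],  # hypothetical product built on "A"
}
derived_from = {"A": [], "B": [], "C": [], "A_derived": ["A"]}

members = independent_members(derived_from)      # "A_derived" is excluded
reference = reference_series(series, members)
ranking = rank_by_mean_abs_diff(series, reference)

for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setup the derived product is left out of the reference so that its source is not counted twice, but it is still ranked against the reference along with every other dataset. Repeating the ranking per spatial domain (country, river basin, climate class, and so on) would mirror the domain-by-domain comparison the abstract describes.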
Subjects
Climate
Data assimilation
Databases
Hydrology
Precipitation
Reanalysis data
DDC Class
600: Technology