2025

A Comprehensive Dataset of Surface Water Quality Spanning 1940-2023 for Empirical and ML Adopted Research

Md Rajaul Karim, MM Mahbubul Syeed, Ashifur Rahman, Khondkar Ayaz Rabbani, Kaniz Fatema, Razib Hayat Khan, Md Shakhawat Hossain, Mohammad Faisal Uddin

Scientific Data (Nature), Vol. 12 (1)

Abstract

Assessment and monitoring of surface water quality are essential for food security, public health, and ecosystem protection. Although water quality monitoring is a well-established practice, little effort has been made to offer a comprehensive and harmonized dataset for surface water at the global scale. This study presents a comprehensive surface water quality dataset that preserves the spatio-temporal variability, integrity, consistency, and depth of the data to facilitate empirical and data-driven evaluation, prediction, and forecasting. The dataset is assembled from a range of sources, including regional and global water quality databases, water management organizations, and individual research projects from five countries: the USA, Canada, Ireland, England, and China. The resulting dataset consists of 2.82 million measurements of eight water quality parameters spanning 1940 to 2023. This dataset can support meta-analysis of water quality models and can facilitate machine learning (ML) based, data- and model-driven investigation of the spatial and temporal drivers and patterns of surface water quality at cross-regional to global scales.

Citation

Md Rajaul Karim, MM Mahbubul Syeed, Ashifur Rahman, Khondkar Ayaz Rabbani, Kaniz Fatema, Razib Hayat Khan, Md Shakhawat Hossain, Mohammad Faisal Uddin. "A Comprehensive Dataset of Surface Water Quality Spanning 1940-2023 for Empirical and ML Adopted Research." Scientific Data (Nature) 12.1 (2025).

BibTeX

@article{pub15_2025,
  title={A Comprehensive Dataset of Surface Water Quality Spanning 1940-2023 for Empirical and ML Adopted Research},
  author={Md Rajaul Karim and MM Mahbubul Syeed and Ashifur Rahman and Khondkar Ayaz Rabbani and Kaniz Fatema and Razib Hayat Khan and Md Shakhawat Hossain and Mohammad Faisal Uddin},
  journal={Scientific Data (Nature)},
  volume={12},
  number={1},
  year={2025},
  doi={10.1038/s41597-025-04715-4}
}
Publication Details
Type:
Year: 2025
Journal: Scientific Data (Nature)
Volume: 12
Issue: 1

Related Publications

SwiftMSeg: lightweight multi-scale local–global context modeling with transformer for medical image segmentation
2026

Jahid Hasan Rony, Md Shakhawat Hossain & Fazlul Hasan Siddiqui

LGGC-Net: a local-global graph and color attention-based lightweight CNN for skin cancer classification
2026

Md Aminur Sarker, Md Alamgir Kabir, Md Shakhawat Hossain

SiNuS: A Comprehensive Dataset for Singular Nuclei Segmentation for HER2 Grading of Breast Cancer
2026

Md Shakhawat Hossain, Md Sahilur Rahman, Munim Ahmed