Explainable artificial intelligence for the interpretation of ensemble learning performance in algal bloom estimation

© 2024 Water Environment Federation.

Bibliographic details
Published in: Water environment research : a research publication of the Water Environment Federation. - 1998. - 96(2024), 10, dated 09 Oct., page e11140
Main author: Park, Jungsu (author)
Other authors: Seong, Byeongchan, Park, Yeonjeong, Lee, Woo Hyoung, Heo, Tae-Young
Format: Online article
Language: English
Published: 2024
Access to the parent work: Water environment research : a research publication of the Water Environment Federation
Subjects: Journal Article; algal bloom; ensemble machine learning; explainable artificial intelligence; measurement frequency; water quality; watershed management; Chlorophyll A; YF5Q9EJC8Y
Description
Abstract:
Chlorophyll-a (Chl-a) concentrations, a key indicator of algal blooms, were estimated using the XGBoost machine learning model with 23 input variables, including water quality and meteorological factors. Model performance was evaluated using three indices: root mean square error (RMSE), RMSE-observation standard deviation ratio (RSR), and Nash-Sutcliffe efficiency (NSE). Nine datasets were created by averaging 1-hour data to cover time intervals ranging from 1 hour to 1 month. For the datasets with relatively high observation frequencies (1-24 h), model performance remained stable, with an RSR between 0.61 and 0.65. However, performance declined significantly for the datasets with weekly and monthly intervals. Shapley value (SHAP) analysis, an explainable artificial intelligence (XAI) method, was then applied to provide a quantitative understanding of how environmental factors in the watershed affect the model's performance and to enhance the practical applicability of the model in the field. The number of input variables used for model construction was increased sequentially from 1 to 23, starting with the variable with the highest SHAP value and ending with the one with the lowest. Performance plateaued once five or more variables were included, demonstrating that stable performance could be achieved using only a small number of variables, including relatively easily measured data collected by real-time sensors, such as pH, dissolved oxygen, and turbidity. This result highlights the practicality of employing machine learning models and real-time sensor-based measurements for effective on-site water quality management.
PRACTITIONER POINTS: XAI quantifies the effects of environmental factors on algal bloom prediction models. The effects of input variable frequency and seasonality were analyzed using XAI. XAI analysis on key variables ensures cost-effective model development.
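The three performance indices named in the abstract, and the averaging of 1-hour data into coarser intervals, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the commonly used definitions in which RSR is RMSE divided by the population standard deviation of the observations and NSE is one minus the ratio of squared error to observation variance; under those assumed definitions NSE = 1 - RSR^2. The function names and the sample values are hypothetical and chosen only for illustration.

```python
import math


def block_mean(values, window):
    """Average consecutive, non-overlapping blocks of `window` samples.

    Assumption of this sketch: a trailing partial block is dropped.
    """
    n_blocks = len(values) // window
    return [
        sum(values[i * window:(i + 1) * window]) / window
        for i in range(n_blocks)
    ]


def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def rsr(obs, pred):
    """RMSE divided by the population standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    return rmse(obs, pred) / sd_obs


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - (squared error / observation variance)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Made-up hourly values, averaged into 2-hour blocks (illustration only).
hourly_obs = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
obs = block_mean(hourly_obs, 2)
pred = [4.0, 7.0, 10.0]  # hypothetical model output for the averaged series

print(rmse(obs, pred), rsr(obs, pred), nse(obs, pred))
```

Lower RMSE and RSR indicate a better fit, while an NSE closer to 1 does. Because RSR and NSE are normalized by the spread of the observations, they allow comparison across datasets whose variance changes as the averaging interval grows.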
Description: Date Completed 09.10.2024
Date Revised 09.10.2024
published: Print
Citation Status MEDLINE
ISSN:1554-7531
DOI:10.1002/wer.11140