Published April 4, 2025 | Version v1
Journal article Open

Leveraging data mining, active learning, and domain adaptation for efficient discovery of advanced oxygen evolution electrocatalysts

  • 1. University of Chicago
  • 2. North China Electric Power University
  • 3. Nanjing University
  • 4. The Hong Kong University of Science and Technology

Description

Developing advanced catalysts for the acidic oxygen evolution reaction (OER) is crucial for sustainable hydrogen production. This study presents a multistage machine learning (ML) approach to streamline the discovery and optimization of complex multimetallic catalysts. Our method integrates data mining, active learning, and domain adaptation throughout the materials discovery process. Unlike traditional trial-and-error methods, this approach systematically narrows the exploration space using domain knowledge, with minimal reliance on subjective intuition. The active learning module then efficiently refines elemental composition and synthesis conditions through iterative experimental feedback. The process culminated in the discovery of a promising Ru-Mn-Ca-Pr oxide catalyst. Our workflow also enhances theoretical simulations with a domain adaptation strategy, providing deeper mechanistic insights aligned with experimental findings. By leveraging diverse data sources and multiple ML strategies, we demonstrate an efficient pathway for electrocatalyst discovery and optimization. This comprehensive, data-driven approach represents a paradigm shift, and potentially a benchmark, in electrocatalyst research.
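The description mentions active learning with iterative experimental feedback but does not spell out the loop. As a rough illustration only, and not the authors' implementation, the sketch below shows one common way such a step can work: query by committee, where several models are trained on resampled data and the next experiment is the candidate they disagree on most. The one-dimensional input, the straight-line models, and the names `fit_line`, `train_committee`, and `pick_next` are all assumptions made for this example.

```python
import random
import statistics


def fit_line(points):
    """Least-squares fit of y = a*x + b to a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Degenerate resample (all x identical): fall back to a flat line.
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return a, mean_y - a * mean_x


def train_committee(labeled, n_members=5, seed=0):
    """Fit one line per bootstrap resample of the labeled data."""
    rng = random.Random(seed)
    return [
        fit_line([rng.choice(labeled) for _ in labeled])
        for _ in range(n_members)
    ]


def pick_next(committee, candidates):
    """Return the candidate where committee predictions spread the most."""
    def disagreement(x):
        preds = [a * x + b for a, b in committee]
        return statistics.pstdev(preds)
    return max(candidates, key=disagreement)


# Hypothetical data: x is a single composition variable, y a measured
# performance metric. Neither comes from the paper.
labeled = [(0.0, 1.0), (0.2, 1.5), (0.4, 1.1), (0.6, 1.9), (0.8, 1.6)]
committee = train_committee(labeled)
next_x = pick_next(committee, [0.1, 0.5, 0.9])
```

In a real loop the selected candidate would be synthesized and measured, appended to `labeled`, and the committee retrained before the next selection.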

Data availability

All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. In line with the principles of open access and knowledge-sharing in the ML community, the following are publicly accessible on the Dryad repository (https://doi.org/10.5061/dryad.nk98sf83g) and mirrored on GitHub (https://github.com/ruiding-uchicago/DASH) for interested readers to review in detail: all ML training and data mining scripts used in this study; the datasets extracted from the literature for data mining and initial ML committee training; high-throughput experimental and DFT computational data; characterization results of the samples; and additional data mining results and minor detailed discussions (Supplementary Notes 1 to 8).

Files

sciadv.adr9038.pdf

Files (74.1 MB)

Name                      Size      Checksum
Article                   10.9 MB   md5:96e60903b1d79298f16d43d8ba34b500
Supplementary materials   63.2 MB   md5:92a6031097c904d884e007efdfb24f71
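
The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library. The dictionary keys are descriptive labels rather than file names from this record, and the helper names `md5_of` and `matches` are assumptions for this example.

```python
import hashlib

# Checksums copied from the file list on this record.
EXPECTED_MD5 = {
    "article": "96e60903b1d79298f16d43d8ba34b500",
    "supplementary": "92a6031097c904d884e007efdfb24f71",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_md5):
    """True if the file at `path` has the expected MD5 digest."""
    return md5_of(path) == expected_md5.lower()
```

For example, `matches("sciadv.adr9038.pdf", EXPECTED_MD5["article"])` would check the article PDF, assuming it was saved under that name in the current directory.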

Additional details

Identifiers

DOI
10.1126/sciadv.adr9038
Other
oai:uchicago.tind.io:14848

Funding

Joint Fund of the National Natural Science Foundation of China and the Karst Science Research Center of Guizhou Province
U23B2075
Joint Fund of the National Natural Science Foundation of China and the Karst Science Research Center of Guizhou Province
52272039
Joint Fund of the National Natural Science Foundation of China and the Karst Science Research Center of Guizhou Province
51972168
The National Key Research and Development Program of China
2021YFB4000100
Hong Kong Research Grant Council
C6011-20GF
Hong Kong Research Grant Council
JLFS/P-602/24
Guangzhou Science and Technology Bureau
2024A03J0609
Unknown funder
The Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship
The Research Grant Council of Hong Kong Special Region
16308420

UChicago Information

Division(s)
Physical Sciences Division, Pritzker School of Molecular Engineering
Department(s)
Computer Science