Published April 4, 2025 | Version v1
Journal article
Open
Leveraging data mining, active learning, and domain adaptation for efficient discovery of advanced oxygen evolution electrocatalysts
Creators
- 1. University of Chicago
- 2. North China Electric Power University
- 3. Nanjing University
- 4. The Hong Kong University of Science and Technology
Description
Developing advanced catalysts for the acidic oxygen evolution reaction (OER) is crucial for sustainable hydrogen production. This study presents a multistage machine learning (ML) approach to streamline the discovery and optimization of complex multimetallic catalysts. Our method integrates data mining, active learning, and domain adaptation throughout the materials discovery process. Unlike traditional trial-and-error methods, this approach systematically narrows the exploration space using domain knowledge, with minimal reliance on subjective intuition. The active learning module then efficiently refines element composition and synthesis conditions through iterative experimental feedback. The process culminated in the discovery of a promising Ru-Mn-Ca-Pr oxide catalyst. Our workflow also enhances theoretical simulations with a domain adaptation strategy, providing deeper mechanistic insights aligned with experimental findings. By leveraging diverse data sources and multiple ML strategies, we demonstrate an efficient pathway for electrocatalyst discovery and optimization. This comprehensive, data-driven approach represents a paradigm shift and a potential benchmark in electrocatalyst research.
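For readers unfamiliar with committee-based active learning, the iterative loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the nearest-neighbour committee members, the disagreement-only selection rule, the composition tuples, and the `toy_measure` function standing in for an experiment are all assumptions made for the example, and every number is synthetic. The actual scripts are in the repositories listed under Data availability.

```python
import random
import statistics


def knn_predict(train, x, k=3):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return statistics.fmean(y for _, y in nearest)


def committee_predictions(train, x, n_members=5, seed=0):
    """Predict x with several members, each fit on a bootstrap resample."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_members):
        sample = [rng.choice(train) for _ in train]  # bootstrap resample
        preds.append(knn_predict(sample, x))
    return preds


def select_next(train, candidates, n_members=5, seed=0):
    """Pick the candidate on which the committee disagrees the most."""
    def disagreement(x):
        return statistics.pvariance(
            committee_predictions(train, x, n_members, seed)
        )
    return max(candidates, key=disagreement)


def active_learning_loop(train, candidates, measure, n_rounds=3):
    """Query, 'measure', and add one candidate per round."""
    train, pool = list(train), list(candidates)
    for round_idx in range(n_rounds):
        if not pool:
            break
        x = select_next(train, pool, seed=round_idx)
        pool.remove(x)
        train.append((x, measure(x)))  # experimental feedback step
    return train, pool


def toy_measure(x):
    """Hypothetical stand-in for an experiment (synthetic, not real data)."""
    return (x[0] - 0.6) ** 2 + (x[1] - 0.3) ** 2


# Synthetic two-component "compositions", for illustration only.
initial_points = [(0.1, 0.1), (0.9, 0.1), (0.5, 0.5), (0.2, 0.8)]
initial = [(x, toy_measure(x)) for x in initial_points]
candidates = [(i / 4, j / 4) for i in range(5) for j in range(5)
              if (i / 4, j / 4) not in initial_points]

final_train, remaining = active_learning_loop(initial, candidates, toy_measure)
```

Selecting purely on committee disagreement favours exploration; a practical acquisition rule would likely also weigh the predicted performance of each candidate.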
Data availability
All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. In line with the principles of open access and knowledge-sharing in the ML community, all ML training and data mining scripts used in this study, datasets extracted from the literature for data mining and initial ML committee training, high-throughput experimental/DFT computational data, characterization results of the samples, and other supplementary data mining results and trivial detailed discussions (Supplementary Notes 1 to 8) are publicly accessible on the Dryad repository (https://doi.org/10.5061/dryad.nk98sf83g) and mirrored on GitHub (https://github.com/ruiding-uchicago/DASH) for interested readers to review in detail.
sciadv.adr9038.pdf
Files (74.1 MB)
| Name | MD5 | Size |
|---|---|---|
| Article | 96e60903b1d79298f16d43d8ba34b500 | 10.9 MB |
| Supplementary materials | 92a6031097c904d884e007efdfb24f71 | 63.2 MB |
Additional details
Identifiers
- DOI: 10.1126/sciadv.adr9038
- Other: oai:uchicago.tind.io:14848
Funding
- Joint Fund of the National Natural Science Foundation of China and the Karst Science Research Center of Guizhou Province: U23B2075
- Joint Fund of the National Natural Science Foundation of China and the Karst Science Research Center of Guizhou Province: 52272039
- Joint Fund of the National Natural Science Foundation of China and the Karst Science Research Center of Guizhou Province: 51972168
- The National Key Research and Development Program of China: 2021YFB4000100
- Hong Kong Research Grant Council: C6011-20GF
- Hong Kong Research Grant Council: JLFS/P-602/24
- Guangzhou Science and Technology Bureau: 2024A03J0609
- Unknown funder: The Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship
- The Research Grant Council of Hong Kong Special Region: 16308420