Probing out-of-distribution generalization in machine learning for materials
Abstract Scientific machine learning (ML) aims to develop generalizable models, yet assessments of generalizability often rely on heuristics. Here, we demonstrate in the materials science setting that heuristic evaluations lead to biased conclusions of ML generalizability and benefits of neural scaling, through evaluations of out-of-distribution (OOD) tasks involving unseen chemistry or structural symmetries. Surprisingly, many tasks demonstrate good performance across models, including boosted trees. However, analysis of the materials representation space shows that most test data reside within regions well-covered by training data, while poorly-performing tasks involve data outside the training domain. For these challenging tasks, increasing training size or time yields limited or adverse effects, contrary to traditional neural scaling trends. Our findings highlight that most OOD tests reflect interpolation, not true extrapolation, leading to overestimations of generalizability and scaling benefits. This emphasizes the need for rigorously challenging OOD benchmarks.
Main Authors: | Kangming Li, Andre Niyongabo Rubungo, Xiangyun Lei, Daniel Persaud, Kamal Choudhary, Brian DeCost, Adji Bousso Dieng, Jason Hattrick-Simpers |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Communications Materials |
Online Access: | https://doi.org/10.1038/s43246-024-00731-w |
author | Kangming Li; Andre Niyongabo Rubungo; Xiangyun Lei; Daniel Persaud; Kamal Choudhary; Brian DeCost; Adji Bousso Dieng; Jason Hattrick-Simpers |
author_facet | Kangming Li; Andre Niyongabo Rubungo; Xiangyun Lei; Daniel Persaud; Kamal Choudhary; Brian DeCost; Adji Bousso Dieng; Jason Hattrick-Simpers |
author_sort | Kangming Li |
collection | DOAJ |
description | Abstract Scientific machine learning (ML) aims to develop generalizable models, yet assessments of generalizability often rely on heuristics. Here, we demonstrate in the materials science setting that heuristic evaluations lead to biased conclusions of ML generalizability and benefits of neural scaling, through evaluations of out-of-distribution (OOD) tasks involving unseen chemistry or structural symmetries. Surprisingly, many tasks demonstrate good performance across models, including boosted trees. However, analysis of the materials representation space shows that most test data reside within regions well-covered by training data, while poorly-performing tasks involve data outside the training domain. For these challenging tasks, increasing training size or time yields limited or adverse effects, contrary to traditional neural scaling trends. Our findings highlight that most OOD tests reflect interpolation, not true extrapolation, leading to overestimations of generalizability and scaling benefits. This emphasizes the need for rigorously challenging OOD benchmarks. |
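The abstract's central diagnostic, checking whether nominally OOD test data actually fall inside the region covered by training data in a materials representation space, can be illustrated with a short sketch. This is not the authors' code or their exact analysis: the random feature vectors (standing in for a real featurization), the simulated held-out element, and the 95th-percentile nearest-neighbor threshold are all illustrative assumptions.

```python
# Minimal sketch (not the paper's code): build an "unseen chemistry" OOD split,
# then test whether the held-out points lie outside the region the training
# set covers in representation space.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Toy stand-in for a featurized materials dataset: each row is one material's
# representation vector; `has_element` marks materials containing the element
# to hold out (e.g., all Pb-containing compounds).
n, dim = 2000, 32
X = rng.normal(size=(n, dim))
has_element = rng.random(n) < 0.1

# Leave-one-chemistry-out split: the training set never sees that element.
X_train, X_test = X[~has_element], X[has_element]

# Coverage check: compare each test point's distance to its nearest training
# point against typical nearest-neighbor distances inside the training set.
nn = NearestNeighbors(n_neighbors=2).fit(X_train)
d_train = nn.kneighbors(X_train)[0][:, 1]   # column 0 is the self-distance
d_test = nn.kneighbors(X_test, n_neighbors=1)[0][:, 0]

# Call a test point "interpolation-like" if it is no farther from the training
# set than the 95th percentile of train-to-train nearest-neighbor distances.
threshold = np.quantile(d_train, 0.95)
frac_interp = (d_test <= threshold).mean()
print(f"{frac_interp:.0%} of 'OOD' test points fall inside training coverage")
```

With these toy features the held-out "chemistry" is statistically indistinguishable from the training data, so nearly all test points land inside the coverage threshold, the same interpolation-masquerading-as-extrapolation effect the abstract describes for most heuristic OOD splits.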
format | Article |
id | doaj-art-ddb837a99cba47ecb9f059773865b66c |
institution | Kabale University |
issn | 2662-4443 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Communications Materials |
spelling | doaj-art-ddb837a99cba47ecb9f059773865b66c; 2025-01-12T12:32:47Z; eng; Nature Portfolio; Communications Materials; 2662-4443; 2025-01-01; 6(1): 1-10; 10.1038/s43246-024-00731-w. Probing out-of-distribution generalization in machine learning for materials. Kangming Li (Department of Materials Science and Engineering, University of Toronto); Andre Niyongabo Rubungo (Vertaix, Department of Computer Science, Princeton University); Xiangyun Lei (Toyota Research Institute); Daniel Persaud (Department of Materials Science and Engineering, University of Toronto); Kamal Choudhary (Material Measurement Laboratory, National Institute of Standards and Technology); Brian DeCost (Material Measurement Laboratory, National Institute of Standards and Technology); Adji Bousso Dieng (Vertaix, Department of Computer Science, Princeton University); Jason Hattrick-Simpers (Department of Materials Science and Engineering, University of Toronto). https://doi.org/10.1038/s43246-024-00731-w |
spellingShingle | Kangming Li; Andre Niyongabo Rubungo; Xiangyun Lei; Daniel Persaud; Kamal Choudhary; Brian DeCost; Adji Bousso Dieng; Jason Hattrick-Simpers. Probing out-of-distribution generalization in machine learning for materials. Communications Materials |
title | Probing out-of-distribution generalization in machine learning for materials |
title_full | Probing out-of-distribution generalization in machine learning for materials |
title_fullStr | Probing out-of-distribution generalization in machine learning for materials |
title_full_unstemmed | Probing out-of-distribution generalization in machine learning for materials |
title_short | Probing out-of-distribution generalization in machine learning for materials |
title_sort | probing out of distribution generalization in machine learning for materials |
url | https://doi.org/10.1038/s43246-024-00731-w |
work_keys_str_mv | AT kangmingli probingoutofdistributiongeneralizationinmachinelearningformaterials AT andreniyongaborubungo probingoutofdistributiongeneralizationinmachinelearningformaterials AT xiangyunlei probingoutofdistributiongeneralizationinmachinelearningformaterials AT danielpersaud probingoutofdistributiongeneralizationinmachinelearningformaterials AT kamalchoudhary probingoutofdistributiongeneralizationinmachinelearningformaterials AT briandecost probingoutofdistributiongeneralizationinmachinelearningformaterials AT adjiboussodieng probingoutofdistributiongeneralizationinmachinelearningformaterials AT jasonhattricksimpers probingoutofdistributiongeneralizationinmachinelearningformaterials |