Companies working in the fields of aerospace, energy and computing are constantly looking for new materials to improve performance. But to understand how those materials will actually behave once they’re inside a rocket or on computer chips, companies must first create the material and then test it. This is because even the most powerful simulation techniques struggle to model the complex chemical arrangements in most of today’s solid materials. The problem adds cost and time to materials innovation.
Now a team of MIT researchers has created a way to accurately model the behavior of metals, regardless of the complexity of their chemical arrangement. At the heart of the approach are machine-learning models that make simulations of materials faster and more accurate. The researchers improved the models by building training datasets that capture the diversity of atomic environments in chemically disordered materials.
In a new paper in Science Advances, the researchers showed that their approach could be used to accurately predict material properties for a diverse group of metal alloys under different conditions. They also showed how this approach could be used to develop new materials, especially in scenarios where experimentation is costly.
“The focus of the paper is metal alloys, which is the area I work in, but it can be adapted to other types of materials, such as semiconductors,” says senior author Rodrigo Freitas, the TDK Career Development Professor in Materials Science and Engineering at MIT. “It’s not specific to any one application – you can use this approach to create new durable steels, new materials for aerospace and much more. That’s what makes it exciting.”
Joining Freitas on the paper are first author Killian Sheriff PhD ’26; MIT PhD students Daniel Xiao and Yifan Cao; and Lewis R. Owen, a senior lecturer at the University of Sheffield.
Modeling metals
A material’s physical properties are mostly determined by the internal arrangement of its chemical elements. Even if two materials have the same mixture of chemical elements, different chemical arrangements can make the difference between a brittle material and a material that deforms without breaking.
Capturing that difference requires atom-by-atom simulations. To do this, researchers rely on models that describe how atoms interact with each other. Over the past two decades, machine learning has become the most accurate way to create those models. Such models work well when the chemical arrangement inside the material follows a highly ordered pattern, but this is not the case for most solid materials, whose atomic chemical arrangement is disordered and varies from region to region.
“The real challenge in our field is modeling these chemically disordered phases,” says Freitas. “Chemical disorder means that there is a huge variety of local chemical environments, which are hard for machine-learning models to learn. This is a problem because almost every metal we use in practice is chemically disordered.”
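To make the idea of a "huge variety of local chemical environments" concrete, here is a minimal toy sketch. It is not the researchers' method: the one-dimensional periodic chain, the two element labels "A" and "B", and the helper `local_environments` are all assumptions chosen for illustration. It simply counts how many distinct neighborhoods appear in a perfectly ordered arrangement versus a random one.

```python
import random

def local_environments(chain, radius=2):
    """Return the set of distinct neighborhoods on a periodic chain.

    Each neighborhood is the center atom plus `radius` atoms on each side.
    (Hypothetical helper for illustration only.)
    """
    n = len(chain)
    return {
        tuple(chain[(i + d) % n] for d in range(-radius, radius + 1))
        for i in range(n)
    }

# Chemically ordered toy alloy: strictly alternating A and B.
ordered = ["A", "B"] * 50

# Chemically disordered toy alloy: each site chosen at random (fixed seed).
rng = random.Random(0)
disordered = [rng.choice("AB") for _ in range(100)]

n_ordered = len(local_environments(ordered))
n_disordered = len(local_environments(disordered))

print("distinct environments, ordered:   ", n_ordered)
print("distinct environments, disordered:", n_disordered)
```

In this toy setting the ordered chain has only two distinct neighborhoods, while the disordered chain can have up to 32 (every combination of five sites). A model trained only on the ordered pattern would never see most of them.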
The problem comes from the lack of representative training data for those atom-by-atom simulations. The current leading approach to creating such data works by brute force, often requiring over 100,000 hours of computation to create training data for a single material. Yet, it does not transfer well when researchers change the composition of the material.
In previous work, the Freitas group developed a way to measure the chemical complexity of solids by analyzing the frequency and spacing of small groups of atoms. For this study, the researchers used that ability to create a better training dataset. They used information theory, a mathematical framework for quantifying information, to generate a training dataset that captures the wide variety of local chemical environments inside disordered materials. The method works by swapping atoms in samples to reduce repetition and expose the model to chemical environments that might otherwise be missed.
“We continued to optimize the training set so that it captured as many different local environments as possible,” says Freitas. “If the same type of environment appeared multiple times, we replaced redundant examples with examples that the model had not seen before. This makes the training set more informative because each example adds something new.”
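The idea of replacing redundant examples with unseen ones can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the procedure from the paper: environments are reduced to simple string labels, Shannon entropy of the label counts is assumed as the diversity score, and `diversify` is a hypothetical greedy routine that swaps a training example for a pool candidate whenever the swap raises that score.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy (in bits) of the distribution of environment labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def diversify(train, pool):
    """Greedily swap training examples with pool candidates.

    A swap is accepted only if it strictly increases the entropy of the
    training set, so the loop stops once no single swap helps.
    (Hypothetical routine; the entropy objective is an assumption.)
    """
    train, pool = list(train), list(pool)
    improved = True
    while improved:
        improved = False
        base = shannon_entropy(train)
        for i in range(len(train)):
            for j in range(len(pool)):
                trial = train[:i] + [pool[j]] + train[i + 1:]
                if shannon_entropy(trial) > base + 1e-12:
                    train[i], pool[j] = pool[j], train[i]
                    improved = True
                    break
            if improved:
                break
    return train

# Toy data: each string stands for one type of local chemical environment.
initial_train = ["AAA"] * 6 + ["AAB"] * 2        # highly redundant
candidate_pool = ["ABB", "BBB", "ABA", "AAA"]    # includes unseen types

optimized_train = diversify(initial_train, candidate_pool)

print("types before:", len(set(initial_train)))
print("types after: ", len(set(optimized_train)))
```

The training set keeps the same size, but repeated "AAA" examples are traded for environment types it did not contain before, which is the sense in which each example "adds something new."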
When trained on the researchers’ dataset, the model predicted physical properties more accurately than models trained using random sampling or another popular sampling method.
“The starting point for all these atom-by-atom simulations is: Are you able to accurately describe the chemical bonds between atoms?” Freitas explains. “If not, it can still teach you about materials in general, but it doesn’t tell you what will happen to specific materials in the real world. This approach makes the simulations higher fidelity, in terms of their chemistry, to better reflect what is happening with the materials.”
The researchers applied their technique to create a machine-learning training dataset for a group of chemically diverse metal alloys. Using a set of machine-learning models, they showed that models trained on their dataset are more accurate than much larger models built by companies like Google and Microsoft.
“We got to a point where we were confident that it worked without using these expensive brute-force methods,” says Freitas. “I said to Killian, ‘This is a nice paper. But if you can show that simulations with these models can now accurately predict useful material properties, then it would make a very nice paper.’ Killian took it seriously and tested it as extensively as possible.”
Sheriff worked with Xiao and Cao to test the approach on different alloys and properties. The team also used Owen’s experimental data to compare the simulations against actual measurements of atomic ordering in the alloys.
From laboratory to industry
This method works partly by capturing hidden patterns in the sample data. The researchers describe the pattern in the paper as a “subtle energetic bias toward certain local chemical configurations.”
Those small energetic differences matter because they determine which phases form in the alloy, how those phases change with temperature and composition, and ultimately what properties the material will have. As a test, Daniel Xiao led simulations showing that the team’s models could predict phase diagrams that closely match experimental data. Phase diagrams map which phases are stable at different temperatures and chemical compositions, and they are a central tool for designing and processing alloys.
“Phase diagrams are one of the main ways that people connect material modeling to actual processing decisions,” says Freitas. “If you’re welding, casting, or heat-treating an alloy, you need to know which phases are likely to form under different conditions. Our goal is to make these types of predictions accurate enough, and accessible enough, that they become part of the way people design materials.”
The researchers are now using the approach to study how changing the composition of an alloy affects its mechanical properties and radiation tolerance, with the goal of designing materials that remain strong and damage-tolerant in harsh environments. They’re also working to make the method easier to use with the tools and workflows that engineers already rely on.
“Industries won’t change the way they work if what you’re creating doesn’t fit into their existing operating processes,” says Freitas. “The goal is to make these predictions useful in places where physical decisions are actually made.”
The research was supported by the U.S. Air Force Office of Scientific Research.